Remove the internal TOMLChar wrapper #492
After the bulk run-scans, the parser only built a `TOMLChar` (a `str` subclass) at run boundaries and used a handful of its `is_*()` helpers. Drop the class entirely: `Source` now yields plain `str` characters and detects end-of-input positionally (`_idx >= len` / `Source.end()`) instead of an identity sentinel, and the remaining character-class checks use module-level frozensets. A real NUL byte is still rejected as an invalid control char and is never mistaken for end-of-input, since EOF is now positional rather than a sentinel comparison. No behaviour change (972 tests incl. the toml-test conformance submodule; plus an 11.5k-input adversarial differential over EOF/truncation, real-NUL placement, empty/whitespace and structural fuzz — output and error-type byte-identical to master). Removes the per-character object construction and method dispatch (~1.1-1.18x over the previous step).
Done: dropped the duplicate raw-string constants and kept only frozensets, renamed without the

Note on CI: all unit-test jobs (every OS × Python) + pre-commit + the
What
After the bulk run-scans (#490/#491), the parser only constructs a `TOMLChar` (a `str` subclass) at run boundaries and uses a handful of its `is_*()` helpers. This removes the class entirely:

- `Source` yields plain `str` characters; `inc()`/`advance_*` read `self[i]` directly.
- End-of-input is detected positionally (`_idx >= len` / `Source.end()`) instead of an identity sentinel.
- The remaining character-class checks use module-level frozensets.

A real NUL byte is still rejected as an invalid control char and is never mistaken for end-of-input, since EOF is now positional rather than a value/identity comparison.
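To illustrate why positional end-of-input keeps a real NUL distinct from EOF, here is a minimal sketch. It is not the parser's actual code: `MiniSource`, `classify`, and the constant names are hypothetical, and the character sets are simplified assumptions rather than the project's real definitions.

```python
# Hypothetical sketch; names and character sets are illustrative assumptions.
SPACES = frozenset(" \t")
NEWLINES = frozenset("\n\r")
# C0 controls (minus tab and line endings) plus DEL; NUL is a member.
CONTROL_CHARS = (frozenset(chr(c) for c in range(0x20)) - SPACES - NEWLINES) | {"\x7f"}


class MiniSource:
    """Cursor over a plain str; end-of-input is decided by position only."""

    def __init__(self, text: str) -> None:
        self._text = text
        self._idx = 0

    def end(self) -> bool:
        # Positional check: no sentinel character is ever compared against.
        return self._idx >= len(self._text)

    @property
    def current(self) -> str:
        # Plain str character, or "" once the cursor is past the input.
        return "" if self.end() else self._text[self._idx]

    def inc(self) -> bool:
        """Advance one character; return False when nothing is left to read."""
        if self.end():
            return False
        self._idx += 1
        return not self.end()


def classify(src: MiniSource) -> str:
    """Classify the current position using frozenset membership tests."""
    if src.end():
        return "eof"
    ch = src.current
    if ch in SPACES:
        return "space"
    if ch in NEWLINES:
        return "newline"
    if ch in CONTROL_CHARS:
        return "control"
    return "other"
```

Because `end()` looks only at the index, a `"\x00"` in the input is classified as a control character like any other, and EOF is reported only once the cursor has actually passed the last character.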
Benchmarks
Median, interleaved A/B vs `master` (includes #489–#491):

The removal itself adds ~1.1–1.18× over #491. No regression on any shape.
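For readers unfamiliar with the method, an interleaved A/B median can be sketched roughly as follows. This is an assumed harness, not the script used for the numbers above; `interleaved_median` and `speedup` are hypothetical helper names, and the two callables stand in for the baseline and candidate parsers.

```python
import statistics
import time
from typing import Callable, Tuple


def interleaved_median(
    baseline: Callable[[str], object],
    candidate: Callable[[str], object],
    data: str,
    rounds: int = 15,
) -> Tuple[float, float]:
    """Time both callables alternately and return their median durations.

    Alternating A and B within each round means slow drift (thermal
    throttling, background load) affects both sides roughly equally.
    """
    times_a, times_b = [], []
    for _ in range(rounds):
        for fn, bucket in ((baseline, times_a), (candidate, times_b)):
            start = time.perf_counter()
            fn(data)
            bucket.append(time.perf_counter() - start)
    return statistics.median(times_a), statistics.median(times_b)


def speedup(baseline_s: float, candidate_s: float) -> float:
    """Ratio above 1.0 means the candidate is faster than the baseline."""
    return baseline_s / candidate_s
```

The median is used rather than the mean so that a few outlier runs do not dominate the comparison.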
Tests
Full suite passes (972, incl. the toml-test conformance submodule). On top of that, an 11.5k-input adversarial differential — EOF/truncation at every prefix length, real-NUL placement in every position, empty/whitespace/BOM, and structural fuzz — is byte-identical in output and exception type to
`master`. No public API change (`TOMLChar` was not exported).
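The shape of such a differential check can be sketched as below. This is an illustrative assumption, not the harness used here: `outcome`, `prefixes`, `nul_placements`, and `differential` are hypothetical helpers, and `strict_parse` / `sentinel_parse` are toy stand-ins for a reference and a candidate parser.

```python
from typing import Callable, Iterable, List, Tuple


def outcome(parse: Callable[[str], object], text: str) -> Tuple[str, object]:
    """Normalize a parse attempt to (status, output or exception type name)."""
    try:
        return ("ok", parse(text))
    except Exception as exc:
        return ("error", type(exc).__name__)


def prefixes(doc: str) -> List[str]:
    """Truncate the document at every prefix length, including empty."""
    return [doc[:i] for i in range(len(doc) + 1)]


def nul_placements(doc: str) -> List[str]:
    """Insert a real NUL character at every position in the document."""
    return [doc[:i] + "\x00" + doc[i:] for i in range(len(doc) + 1)]


def differential(
    reference: Callable[[str], object],
    candidate: Callable[[str], object],
    inputs: Iterable[str],
) -> List[str]:
    """Return the inputs on which output or exception type differs."""
    return [t for t in inputs if outcome(reference, t) != outcome(candidate, t)]


# Toy parsers for demonstration only.
def strict_parse(text: str) -> str:
    if "\x00" in text:
        raise ValueError("invalid control character")
    return text.strip()


def sentinel_parse(text: str) -> str:
    # Deliberately buggy: treats NUL as end-of-input instead of rejecting it.
    return text.split("\x00", 1)[0].strip()
```

With these toy parsers, `differential(strict_parse, sentinel_parse, nul_placements("a = 1"))` flags every NUL-bearing input, which is the class of regression a sentinel-based EOF check could introduce and a positional one avoids.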