
feat(isthmus)!: make unquoted identifier casing configurable in ConverterProvider #983

Open
nielspardon wants to merge 1 commit into substrait-io:main from nielspardon:feat/configurable-unquoted-casing
@nielspardon nielspardon commented Jul 2, 2026

Summary

Adds constructor-based configuration of unquoted SQL identifier casing to ConverterProvider, so that isthmus consumers can control how unquoted identifiers are cased during parsing. The default remains Casing.TO_UPPER (no behaviour change).

Previously the only way to change this was to subclass ConverterProvider and override getSqlParserConfig() — as IsthmusEntryPoint already did with an anonymous class. That workaround is now replaced by a first-class constructor parameter.
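
To make the effect of the setting concrete, here is a minimal, self-contained sketch of what unquoted-identifier casing means. It is not the isthmus or Calcite implementation: the class UnquotedCasingSketch, its nested Casing enum, and the normalize helper are hypothetical stand-ins, and leaving quoted identifiers untouched is a simplifying assumption.

```java
import java.util.Locale;

/** Hypothetical stand-in for identifier normalisation; not the isthmus or Calcite code. */
final class UnquotedCasingSketch {

  /** Stand-in enum with the three casing values referred to in this PR. */
  enum Casing { UNCHANGED, TO_UPPER, TO_LOWER }

  /**
   * Applies the configured casing to an unquoted identifier.
   * Assumption: quoted identifiers are returned exactly as written.
   */
  static String normalize(String identifier, boolean quoted, Casing unquotedCasing) {
    if (quoted) {
      return identifier;
    }
    switch (unquotedCasing) {
      case TO_UPPER:
        return identifier.toUpperCase(Locale.ROOT);
      case TO_LOWER:
        return identifier.toLowerCase(Locale.ROOT);
      default:
        return identifier;
    }
  }

  public static void main(String[] args) {
    // Default behaviour described above: unquoted names are upper-cased.
    System.out.println(normalize("employees", false, Casing.TO_UPPER)); // EMPLOYEES
    // With UNCHANGED the name is kept as written.
    System.out.println(normalize("employees", false, Casing.UNCHANGED)); // employees
  }
}
```

In this sketch the casing only decides the name that ends up in the plan, which is why the default of TO_UPPER keeps existing plans unchanged.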

Breaking change

The SqlParser.Config-based parsing entry points (public in v0.94.0) have been removed in favour of ConverterProvider overloads:

  • SubstraitSqlStatementParser.parseStatements(String, SqlParser.Config)
  • SubstraitSqlToCalcite.convertQueries(String, CatalogReader, SqlParser.Config)
  • SubstraitSqlToCalcite.convertQueries(String, CatalogReader, SqlValidator, RelOptCluster, SqlParser.Config)

Callers should pass a ConverterProvider (constructed with the desired Casing, or subclassed with an overridden getSqlParserConfig() for fully custom parser settings) instead of a SqlParser.Config.

Changes

ConverterProvider

  • unquotedCasing is a new final field, consistent with executionBehavior
  • getUnquotedCasing() — getter
  • getSqlParserConfig() reads unquotedCasing instead of hard-coding Casing.TO_UPPER
  • New constructors: ConverterProvider(Casing) and ConverterProvider(extensions, typeFactory, Casing) for the common cases; the existing 7-arg all-components constructor gains Casing as an 8th parameter. All narrower constructors default to Casing.TO_UPPER.
  • ConverterProvider.DEFAULT — a shared constant for the default (all-system-defaults) provider, used at every call site that previously wrote new ConverterProvider().
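
The constructor and constant layout described in these bullets can be sketched roughly as follows. This is a reduced model, not the actual class: ProviderSketch, its nested Casing enum, and describeParserConfig are hypothetical names, and the real ConverterProvider carries more components than the single field shown here.

```java
/** Hypothetical, reduced model of the constructor layout; not the real ConverterProvider. */
final class ProviderSketch {

  /** Stand-in enum with the three casing values referred to in this PR. */
  enum Casing { UNCHANGED, TO_UPPER, TO_LOWER }

  /** Shared instance for call sites that want every default. */
  static final ProviderSketch DEFAULT = new ProviderSketch();

  /** Final field, set once at construction time. */
  private final Casing unquotedCasing;

  /** Narrower constructor: falls back to TO_UPPER. */
  ProviderSketch() {
    this(Casing.TO_UPPER);
  }

  /** Wider constructor: the caller picks the casing. */
  ProviderSketch(Casing unquotedCasing) {
    this.unquotedCasing = unquotedCasing;
  }

  Casing getUnquotedCasing() {
    return unquotedCasing;
  }

  /** Stand-in for getSqlParserConfig(): derived from the field rather than hard-coded. */
  String describeParserConfig() {
    return "unquotedCasing=" + unquotedCasing;
  }

  public static void main(String[] args) {
    System.out.println(DEFAULT.describeParserConfig()); // unquotedCasing=TO_UPPER
    System.out.println(new ProviderSketch(Casing.UNCHANGED).describeParserConfig()); // unquotedCasing=UNCHANGED
  }
}
```

Because the field is final and the narrower constructor delegates to the wider one, an instance such as DEFAULT can be shared safely in this model.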

Propagation through the pipeline

The casing setting is applied consistently across both CREATE TABLE parsing and query parsing, so that the table name stored in a NamedScan matches the configured casing end-to-end.

| Class | Change |
| --- | --- |
| SubstraitSqlStatementParser | parseStatements(String, ConverterProvider) now owns the SqlParser instantiation directly; the SqlParser.Config overload is removed. Callers needing fully custom config should subclass ConverterProvider and override getSqlParserConfig() |
| SubstraitSqlToCalcite | New convertQueries(sql, catalog, ConverterProvider) and convertQueries(sql, catalog, ConverterProvider, operatorTable) overloads; all internal overloads now route through ConverterProvider; the SqlParser.Config overloads are removed |
| SubstraitCreateStatementParser | New processCreateStatements(ConverterProvider, sql) and processCreateStatementsToCatalog(ConverterProvider, ...) overloads |
| SqlToSubstrait | convert(sql, catalog) now uses the ConverterProvider path; convert(sql, catalog, SqlDialect) is @Deprecated: the SqlDialect argument is ignored and it simply delegates to convert(sql, catalog) |
| SqlExpressionToSubstrait | Uses processCreateStatements(converterProvider, tableDef) |
| SubstraitToSql | No-arg constructor uses ConverterProvider.DEFAULT |
| IsthmusEntryPoint | Uses new ConverterProvider(unquotedCasing); anonymous ConverterProvider subclass removed |
| FromSql (example) | Replaces the deprecated convert(sql, catalog, SqlDialect) call with a single ConverterProvider(Casing.UNCHANGED) shared across both the schema build and the query conversion, preserving the lower-case identifiers as written |
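
Why one casing has to reach both the CREATE TABLE path and the query path can be illustrated with a small model. This is only a sketch under stated assumptions: CasingPropagationSketch and its helpers are hypothetical, the catalog is reduced to a set of names, and name lookup is assumed to be case-sensitive.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical model of schema registration and query lookup; not the isthmus pipeline. */
final class CasingPropagationSketch {

  /** Stand-in enum with the three casing values referred to in this PR. */
  enum Casing { UNCHANGED, TO_UPPER, TO_LOWER }

  static String normalize(String identifier, Casing casing) {
    switch (casing) {
      case TO_UPPER:
        return identifier.toUpperCase(Locale.ROOT);
      case TO_LOWER:
        return identifier.toLowerCase(Locale.ROOT);
      default:
        return identifier;
    }
  }

  /**
   * Registers a table under the casing used for the DDL, then looks it up
   * under the casing used for the query. Assumption: lookup is case-sensitive.
   */
  static boolean resolves(String tableName, Casing ddlCasing, Casing queryCasing) {
    Set<String> catalog = new HashSet<>();
    catalog.add(normalize(tableName, ddlCasing));
    return catalog.contains(normalize(tableName, queryCasing));
  }

  public static void main(String[] args) {
    // Same casing on both paths: the table is found.
    System.out.println(resolves("employees", Casing.UNCHANGED, Casing.UNCHANGED)); // true
    // Different casing per path: EMPLOYEES is registered, employees is looked up.
    System.out.println(resolves("employees", Casing.TO_UPPER, Casing.UNCHANGED)); // false
  }
}
```

Sharing a single provider across both steps, as the FromSql example does, rules out the mismatched case by construction.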

Test

UnquotedCasingTest verifies:

  • The default casing is TO_UPPER and is reflected in getSqlParserConfig()
  • new ConverterProvider(Casing) sets the casing correctly for all three Casing values
  • End-to-end: with TO_UPPER a plan built from CREATE TABLE employees … / SELECT … FROM employees produces a NamedScan with name EMPLOYEES; with UNCHANGED it produces employees

Existing tests DdlToSubstraitConversionTest and DdlToSubstraitConversionWithOptimizationTest are updated to use the ConverterProvider API instead of the removed SqlParser.Config overloads.

Notes

For consumers who need a fully custom parser configuration beyond what ConverterProvider exposes (e.g. a different parser factory), the supported extension point is subclassing ConverterProvider and overriding getSqlParserConfig(). The SqlParser.Config overloads on SubstraitSqlStatementParser and SubstraitSqlToCalcite have been removed since they are entirely superseded by this.
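
The extension point described in this note follows a familiar pattern, sketched below with stand-in types. Nothing here is the real API: ExtensionPointSketch, ParserConfig, Provider, and the parserFactory string are hypothetical, and the real SqlParser.Config has many more settings.

```java
/** Hypothetical model of the subclass-and-override extension point; not the real classes. */
final class ExtensionPointSketch {

  /** Stand-in enum with the three casing values referred to in this PR. */
  enum Casing { UNCHANGED, TO_UPPER, TO_LOWER }

  /** Hypothetical, much-reduced parser configuration. */
  static final class ParserConfig {
    final Casing unquotedCasing;
    final String parserFactory;

    ParserConfig(Casing unquotedCasing, String parserFactory) {
      this.unquotedCasing = unquotedCasing;
      this.parserFactory = parserFactory;
    }
  }

  /** Stand-in provider: by default the config is derived from the casing alone. */
  static class Provider {
    private final Casing unquotedCasing;

    Provider(Casing unquotedCasing) {
      this.unquotedCasing = unquotedCasing;
    }

    ParserConfig getSqlParserConfig() {
      return new ParserConfig(unquotedCasing, "default-factory");
    }
  }

  /** Entry points take the provider, so an overridden config is picked up without extra overloads. */
  static String describeParse(Provider provider) {
    ParserConfig config = provider.getSqlParserConfig();
    return config.parserFactory + "/" + config.unquotedCasing;
  }

  /** A provider with a fully custom config, built by overriding the hook. */
  static Provider customProvider() {
    return new Provider(Casing.TO_LOWER) {
      @Override
      ParserConfig getSqlParserConfig() {
        return new ParserConfig(Casing.TO_LOWER, "custom-factory");
      }
    };
  }

  public static void main(String[] args) {
    System.out.println(describeParse(new Provider(Casing.TO_UPPER))); // default-factory/TO_UPPER
    System.out.println(describeParse(customProvider())); // custom-factory/TO_LOWER
  }
}
```

In this model a separate config parameter on the entry point would be redundant, since everything it could carry is reachable through the provider.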

@nielspardon nielspardon force-pushed the feat/configurable-unquoted-casing branch 15 times, most recently from a3f5945 to 2285cd1 on July 3, 2026 08:39
…rterProvider

Add constructor-based configuration of unquoted SQL identifier casing to
ConverterProvider, so that isthmus consumers can control how unquoted
identifiers are cased during parsing. The default remains Casing.TO_UPPER
(no behaviour change).

Previously the only way to change this was to subclass ConverterProvider
and override getSqlParserConfig() — as IsthmusEntryPoint already did with
an anonymous class. That workaround is now replaced by a first-class
constructor parameter.

Changes to ConverterProvider:
- unquotedCasing is a new final field, consistent with executionBehavior
- getUnquotedCasing() getter
- getSqlParserConfig() reads unquotedCasing instead of hard-coding TO_UPPER
- new ConverterProvider(Casing) and ConverterProvider(extensions, typeFactory, Casing)
  for the common cases; the existing 7-arg constructor gains Casing as an 8th
  parameter; all narrower constructors default to Casing.TO_UPPER

Propagation through the pipeline — casing is applied consistently across
both CREATE TABLE parsing and query parsing:
- SubstraitSqlToCalcite: new convertQueries(sql, catalog, ConverterProvider,
  operatorTable) overload passes getSqlParserConfig() to the statement parser
- SqlToSubstrait: convert(sql, catalog) uses the ConverterProvider overload;
  the legacy convert(sql, catalog, SqlDialect) overload is deprecated and now
  delegates to it (the SqlDialect argument is ignored — casing is controlled
  by the ConverterProvider)
- SubstraitCreateStatementParser: new processCreateStatements(ConverterProvider, sql)
  and processCreateStatementsToCatalog(ConverterProvider, ...) overloads;
  SqlParser.Config stays an internal detail
- SqlExpressionToSubstrait: uses processCreateStatements(converterProvider, tableDef)
- IsthmusEntryPoint: uses new ConverterProvider(unquotedCasing);
  anonymous ConverterProvider subclass removed

Examples:
- FromSql: replaced the deprecated convert(sql, catalog, SqlDialect) call with
  a shared ConverterProvider(Casing.UNCHANGED) used for both schema and query
  parsing, preserving the lower-case identifiers as written

BREAKING CHANGE: The SqlParser.Config-based parsing entry points have been
removed in favour of ConverterProvider overloads:
- SubstraitSqlStatementParser.parseStatements(String, SqlParser.Config)
- SubstraitSqlToCalcite.convertQueries(String, CatalogReader, SqlParser.Config)
- SubstraitSqlToCalcite.convertQueries(String, CatalogReader, SqlValidator,
  RelOptCluster, SqlParser.Config)
Callers should pass a ConverterProvider (configured for the desired casing, or
subclassed with an overridden getSqlParserConfig() for fully custom parser
settings) instead of a SqlParser.Config.

@nielspardon nielspardon force-pushed the feat/configurable-unquoted-casing branch from 2285cd1 to 283d871 on July 3, 2026 10:02
@nielspardon nielspardon changed the title from "feat(isthmus): make unquoted identifier casing configurable in ConverterProvider" to "feat(isthmus)!: make unquoted identifier casing configurable in ConverterProvider" on Jul 3, 2026
@nielspardon nielspardon marked this pull request as ready for review July 3, 2026 10:06