feat(isthmus)!: make unquoted identifier casing configurable in ConverterProvider#983
Open
nielspardon wants to merge 1 commit into
Open
Conversation
a3f5945 to
2285cd1
Compare
…rterProvider Add constructor-based configuration of unquoted SQL identifier casing to ConverterProvider, so that isthmus consumers can control how unquoted identifiers are cased during parsing. The default remains Casing.TO_UPPER (no behaviour change). Previously the only way to change this was to subclass ConverterProvider and override getSqlParserConfig() — as IsthmusEntryPoint already did with an anonymous class. That workaround is now replaced by a first-class constructor parameter. Changes to ConverterProvider: - unquotedCasing is a new final field, consistent with executionBehavior - getUnquotedCasing() getter - getSqlParserConfig() reads unquotedCasing instead of hard-coding TO_UPPER - new ConverterProvider(Casing) and ConverterProvider(extensions, typeFactory, Casing) for the common cases; the existing 7-arg constructor gains Casing as an 8th parameter; all narrower constructors default to Casing.TO_UPPER Propagation through the pipeline — casing is applied consistently across both CREATE TABLE parsing and query parsing: - SubstraitSqlToCalcite: new convertQueries(sql, catalog, ConverterProvider, operatorTable) overload passes getSqlParserConfig() to the statement parser - SqlToSubstrait: convert(sql, catalog) uses the ConverterProvider overload; the legacy convert(sql, catalog, SqlDialect) overload is deprecated and now delegates to it (the SqlDialect argument is ignored — casing is controlled by the ConverterProvider) - SubstraitCreateStatementParser: new processCreateStatements(ConverterProvider, sql) and processCreateStatementsToCatalog(ConverterProvider, ...) overloads; SqlParser.Config stays an internal detail - SqlExpressionToSubstrait: uses processCreateStatements(converterProvider, tableDef) - IsthmusEntryPoint: uses new ConverterProvider(unquotedCasing); anonymous ConverterProvider subclass removed Examples: - FromSql: replaced the deprecated convert(sql, catalog, SqlDialect) call with a shared ConverterProvider(Casing.UNCHANGED) used for both schema and query parsing, preserving the lower-case identifiers as written BREAKING CHANGE: The SqlParser.Config-based parsing entry points have been removed in favour of ConverterProvider overloads: - SubstraitSqlStatementParser.parseStatements(String, SqlParser.Config) - SubstraitSqlToCalcite.convertQueries(String, CatalogReader, SqlParser.Config) - SubstraitSqlToCalcite.convertQueries(String, CatalogReader, SqlValidator, RelOptCluster, SqlParser.Config) Callers should pass a ConverterProvider (configured for the desired casing, or subclassed with an overridden getSqlParserConfig() for fully custom parser settings) instead of a SqlParser.Config.
2285cd1 to
283d871
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds constructor-based configuration of unquoted SQL identifier casing to
ConverterProvider, so that isthmus consumers can control how unquoted identifiers are cased during parsing. The default remainsCasing.TO_UPPER(no behaviour change).Previously the only way to change this was to subclass
ConverterProviderand overridegetSqlParserConfig()— asIsthmusEntryPointalready did with an anonymous class. That workaround is now replaced by a first-class constructor parameter.Breaking change
The
SqlParser.Config-based parsing entry points (public inv0.94.0) have been removed in favour ofConverterProvideroverloads:SubstraitSqlStatementParser.parseStatements(String, SqlParser.Config)SubstraitSqlToCalcite.convertQueries(String, CatalogReader, SqlParser.Config)SubstraitSqlToCalcite.convertQueries(String, CatalogReader, SqlValidator, RelOptCluster, SqlParser.Config)Callers should pass a
ConverterProvider(constructed with the desiredCasing, or subclassed with an overriddengetSqlParserConfig()for fully custom parser settings) instead of aSqlParser.Config.Changes
ConverterProviderunquotedCasingis a newfinalfield, consistent withexecutionBehaviorgetUnquotedCasing()— gettergetSqlParserConfig()readsunquotedCasinginstead of hard-codingCasing.TO_UPPERConverterProvider(Casing)andConverterProvider(extensions, typeFactory, Casing)for the common cases; the existing 7-arg all-components constructor gainsCasingas an 8th parameter. All narrower constructors default toCasing.TO_UPPER.ConverterProvider.DEFAULT— a shared constant for the default (all-system-defaults) provider, used at every call site that previously wrotenew ConverterProvider().Propagation through the pipeline
The casing setting is applied consistently across both CREATE TABLE parsing and query parsing, so that the table name stored in a
NamedScanmatches the configured casing end-to-end.SubstraitSqlStatementParserparseStatements(String, ConverterProvider)now owns theSqlParserinstantiation directly; theSqlParser.Configoverload is removed — callers needing fully custom config should subclassConverterProviderand overridegetSqlParserConfig()SubstraitSqlToCalciteconvertQueries(sql, catalog, ConverterProvider)andconvertQueries(sql, catalog, ConverterProvider, operatorTable)overloads; all internal overloads now route throughConverterProvider; theSqlParser.Configoverloads are removedSubstraitCreateStatementParserprocessCreateStatements(ConverterProvider, sql)andprocessCreateStatementsToCatalog(ConverterProvider, ...)overloadsSqlToSubstraitconvert(sql, catalog)now uses theConverterProviderpath;convert(sql, catalog, SqlDialect)is@Deprecated— theSqlDialectargument is ignored and it simply delegates toconvert(sql, catalog)SqlExpressionToSubstraitprocessCreateStatements(converterProvider, tableDef)SubstraitToSqlConverterProvider.DEFAULTIsthmusEntryPointnew ConverterProvider(unquotedCasing); anonymousConverterProvidersubclass removedFromSql(example)convert(sql, catalog, SqlDialect)call with a singleConverterProvider(Casing.UNCHANGED)shared across both the schema build and the query conversion, preserving the lower-case identifiers as writtenTest
UnquotedCasingTestverifies:TO_UPPERand is reflected ingetSqlParserConfig()new ConverterProvider(Casing)sets the casing correctly for all threeCasingvaluesTO_UPPERa plan built fromCREATE TABLE employees … / SELECT … FROM employeesproduces aNamedScanwith nameEMPLOYEES; withUNCHANGEDit producesemployeesExisting tests
DdlToSubstraitConversionTestandDdlToSubstraitConversionWithOptimizationTestare updated to use theConverterProviderAPI instead of the removedSqlParser.Configoverloads.Notes
For consumers who need a fully custom parser configuration beyond what
ConverterProviderexposes (e.g. a different parser factory), the supported extension point is subclassingConverterProviderand overridinggetSqlParserConfig(). TheSqlParser.Configoverloads onSubstraitSqlStatementParserandSubstraitSqlToCalcitehave been removed since they are entirely superseded by this.