
Ignore 412s for Cosmos Spark ItemPatch with filter predicate #49700

Open
tvaron3 wants to merge 4 commits into Azure:main from tvaron3:tvaron3-spark-itempatch-ignore-412


@tvaron3 tvaron3 commented Jul 2, 2026

Member

Summary

Implements the feature requested in #49594. Adds an opt-in config option spark.cosmos.write.patch.filterPredicateIgnorePreconditionFailures (boolean, default false) for the Cosmos DB Spark connector.

When enabled with the ItemPatch / ItemPatchIfExists write strategy together with a conditional patch filter (spark.cosmos.write.patch.filter), an HTTP 412 Precondition Failed — which the Cosmos service returns for documents excluded by the filter predicate — is treated as a successful no-op skip instead of failing the whole (bulk or point) write. This mirrors the existing graceful-skip behavior of ItemOverwriteIfNotModified / ItemDeleteIfNotModified. Default false preserves the current fail-fast behavior.
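A minimal sketch of how the options described above might be assembled when writing from PySpark. The three keys are the option names mentioned in this PR; the `patch_write_options` helper, the batch id, and the exact filter text (including the `FROM c WHERE` prefix and the `c.` alias) are illustrative assumptions, and account, database, and container settings are omitted.

```python
def patch_write_options(batch_id: int) -> dict:
    """Build the patch-related write options for one batch (illustrative helper)."""
    return {
        # Patch strategy; ItemPatchIfExists is covered by the same flag.
        "spark.cosmos.write.strategy": "ItemPatch",
        # Conditional patch filter. The exact predicate syntax shown here is an
        # assumption; adapt it to the filter your job already uses.
        "spark.cosmos.write.patch.filter": (
            "FROM c WHERE NOT IS_DEFINED(c.last_batch_id) "
            f"OR c.last_batch_id < {batch_id}"
        ),
        # New opt-in flag from this PR (default false).
        "spark.cosmos.write.patch.filterPredicateIgnorePreconditionFailures": "true",
    }


options = patch_write_options(batch_id=42)  # 42 is a hypothetical batch id

# With a live SparkSession and DataFrame `df`, the options would be passed as:
# df.write.format("cosmos.oltp").options(**options).mode("append").save()
```

Leaving the flag out, or setting it to `"false"`, keeps the current fail-fast behavior.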

Customer scenario

Kafka -> Spark -> Cosmos ingestion using server-side increment patch operations guarded by an idempotency filter (e.g. NOT IS_DEFINED(last_batch_id) OR last_batch_id < <batchId>). On replays, the filter legitimately excludes already-applied documents, producing 412s that today fail the entire batch. This flag lets those 412s be skipped so idempotent replays succeed.
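To illustrate why a replay produces a 412, here is a small in-memory model of the scenario. It is not connector or service code: `filter_matches` and `patch_increment` are hypothetical stand-ins for the server-side filter evaluation and increment patch, and the `count` field is invented for the example.

```python
OK = 200
PRECONDITION_FAILED = 412


def filter_matches(doc: dict, batch_id: int) -> bool:
    """Model of: NOT IS_DEFINED(last_batch_id) OR last_batch_id < batch_id."""
    return "last_batch_id" not in doc or doc["last_batch_id"] < batch_id


def patch_increment(doc: dict, amount: int, batch_id: int) -> int:
    """Apply the increment only if the filter admits the document."""
    if not filter_matches(doc, batch_id):
        # The document is excluded by the filter predicate.
        return PRECONDITION_FAILED
    doc["count"] = doc.get("count", 0) + amount
    doc["last_batch_id"] = batch_id
    return OK


doc = {"id": "a", "count": 10}
first = patch_increment(doc, 5, batch_id=7)   # first delivery of batch 7
replay = patch_increment(doc, 5, batch_id=7)  # replay of the same batch
```

In this model the first delivery returns 200 and raises `count` to 15, while the replay returns 412 and leaves the document unchanged. That 412 is the response the new flag allows the writer to skip.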

Changes

  • CosmosConfig.scala: new config name constant, registration in the known-config list, a CosmosConfigEntry[Boolean] (default false), a new filterPredicateIgnorePreconditionFailures field on CosmosPatchConfigs, and parsing in parseWriteConfig for the ItemPatch/ItemPatchIfExists branch.
  • BulkWriter.scala: shouldIgnore now skips a 412 for ItemPatch/ItemPatchIfExists when the flag is enabled and a filter predicate is configured (the filter is read None-safely). The existing ItemPatchIfExists not-found skip is preserved.
  • PointWriter.scala: patchWithRetry adds a catch case that skips a 412 (logs the skip, tracks a 0-count op, returns) under the same gate. A 412 is not classified as transient, so the ordering of the catch cases is safe.
  • Tests: CosmosConfigSpec parse test; BulkWriterITest and PointWriterITest skip-on-412 scenarios; new filterPredicateIgnorePreconditionFailures param on CosmosPatchTestHelper.
  • Docs & changelog: configuration-reference.md row; "Features Added" entry in all 6 released Spark modules that build from the shared source tree.
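The gate shared by the bulk and point paths can be modeled roughly as follows. This is a standalone Python sketch of the conditions listed above, not the Scala code in BulkWriter.scala or PointWriter.scala; the function name, parameter names, and the use of 404 for "not found" are assumptions made for illustration.

```python
from typing import Optional

PATCH_STRATEGIES = {"ItemPatch", "ItemPatchIfExists"}


def should_ignore(
    status_code: int,
    strategy: str,
    patch_filter: Optional[str],
    ignore_precondition_failures: bool,
) -> bool:
    """Return True when a failed write should be treated as a no-op skip."""
    # Existing behavior: ItemPatchIfExists skips documents that do not exist.
    if strategy == "ItemPatchIfExists" and status_code == 404:
        return True
    # New behavior: skip a 412 only when all three conditions hold.
    return (
        status_code == 412
        and strategy in PATCH_STRATEGIES
        and patch_filter is not None  # no filter configured means no skip
        and ignore_precondition_failures
    )


skipped = should_ignore(412, "ItemPatch", "hypothetical filter", True)
fail_fast = should_ignore(412, "ItemPatch", "hypothetical filter", False)
```

With the flag off, or with no filter configured, a 412 is not ignored, which matches the default fail-fast behavior described in the summary.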

Verification

  • Module recompiles; scalastyle clean.
  • CosmosConfigSpec passes (56/56, including the new parse test).
  • The two new integration tests require the Cosmos emulator / a live account and were not run in this environment.

Fixes #49594

tvaron3 and others added 4 commits July 1, 2026 12:34
Adds an opt-in config `spark.cosmos.write.patch.filterPredicateIgnorePreconditionFailures`
(default false) that treats a 412 Precondition Failed as a successful no-op skip when
using the ItemPatch/ItemPatchIfExists write strategy together with a conditional
spark.cosmos.write.patch.filter, in both bulk and point write paths. This mirrors the
existing graceful-skip behavior of ItemOverwriteIfNotModified/ItemDeleteIfNotModified.

Fixes Azure#49594

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
@github-actions github-actions Bot added the Cosmos label Jul 2, 2026

tvaron3 commented Jul 2, 2026

Member Author

/azp run java - cosmos - spark

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@tvaron3 tvaron3 marked this pull request as ready for review July 2, 2026 20:10
@tvaron3 tvaron3 requested review from a team and kirankumarkolli as code owners July 2, 2026 20:10
Copilot AI review requested due to automatic review settings July 2, 2026 20:10
@tvaron3 tvaron3 requested a review from a team as a code owner July 2, 2026 20:10

Copilot AI left a comment

Contributor

Pull request overview

This PR adds an opt-in write configuration for the Cosmos DB Spark connector to treat 412 Precondition Failed responses as successful no-op skips when using ItemPatch / ItemPatchIfExists with a conditional patch filter, aligning behavior with existing “skip” semantics for other conditional write strategies.

Changes:

  • Introduces spark.cosmos.write.patch.filterPredicateIgnorePreconditionFailures (default false) and wires it through write config parsing into CosmosPatchConfigs.
  • Updates bulk and point write paths to ignore 412 only when (a) patch strategy is used, (b) a filter predicate is configured, and (c) the new flag is enabled.
  • Adds unit/integration tests for config parsing and the skip-on-412 behavior, plus documentation and changelog updates across Spark artifacts.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| sdk/cosmos/azure-cosmos-spark_3/src/main/scala/com/azure/cosmos/spark/CosmosConfig.scala | Adds the new config key, registers it, extends CosmosPatchConfigs, and parses the flag for patch write strategies. |
| sdk/cosmos/azure-cosmos-spark_3/src/main/scala/com/azure/cosmos/spark/BulkWriter.scala | Extends shouldIgnore to skip 412 for ItemPatch/ItemPatchIfExists when the new flag is enabled and a filter predicate is present. |
| sdk/cosmos/azure-cosmos-spark_3/src/main/scala/com/azure/cosmos/spark/PointWriter.scala | Adds a targeted 412 skip branch in patchWithRetry under the same guard conditions as bulk. |
| sdk/cosmos/azure-cosmos-spark_3/src/test/scala/com/azure/cosmos/spark/CosmosConfigSpec.scala | Adds a unit test to validate parsing/defaulting of the new patch config flag. |
| sdk/cosmos/azure-cosmos-spark_3/src/test/scala/com/azure/cosmos/spark/BulkWriterITest.scala | Adds an integration test covering skip-on-412 behavior for bulk patch with an always-false filter when flag is enabled. |
| sdk/cosmos/azure-cosmos-spark_3/src/test/scala/com/azure/cosmos/spark/PointWriterITest.scala | Adds an integration test covering skip-on-412 behavior for point patch with an always-false filter when flag is enabled. |
| sdk/cosmos/azure-cosmos-spark_3/src/test/scala/com/azure/cosmos/spark/utils/CosmosPatchTestHelper.scala | Extends patch writer helpers to accept and pass through the new flag. |
| sdk/cosmos/azure-cosmos-spark_3/docs/configuration-reference.md | Documents the new configuration option and its applicability constraints. |
| sdk/cosmos/azure-cosmos-spark_3-3_2-12/CHANGELOG.md | Adds a “Features Added” changelog entry describing the new config flag. |
| sdk/cosmos/azure-cosmos-spark_3-4_2-12/CHANGELOG.md | Adds a “Features Added” changelog entry describing the new config flag. |
| sdk/cosmos/azure-cosmos-spark_3-5_2-12/CHANGELOG.md | Adds a “Features Added” changelog entry describing the new config flag. |
| sdk/cosmos/azure-cosmos-spark_3-5_2-13/CHANGELOG.md | Adds a “Features Added” changelog entry describing the new config flag. |
| sdk/cosmos/azure-cosmos-spark_4-0_2-13/CHANGELOG.md | Adds a “Features Added” changelog entry describing the new config flag. |
| sdk/cosmos/azure-cosmos-spark_4-1_2-13/CHANGELOG.md | Adds a “Features Added” changelog entry describing the new config flag. |

Development

Successfully merging this pull request may close these issues.

[FEATURE REQ] Allow ignoring 412s in Spark connector when using ItemPatch
