Skip to content

engine: retry-aware ingest status — keep 'parsing' while the queue retries (HAL-321)#47

Merged
hallelx2 merged 1 commit into
mainfrom
halleluyaholudele/hal-321-ingest-retry-aware
Jun 20, 2026
Merged

engine: retry-aware ingest status — keep 'parsing' while the queue retries (HAL-321)#47
hallelx2 merged 1 commit into
mainfrom
halleluyaholudele/hal-321-ingest-retry-aware

Conversation

@hallelx2

@hallelx2 hallelx2 commented Jun 19, 2026

Copy link
Copy Markdown
Owner

Under heavy concurrent bulk ingestion, transient per-attempt failures (parse-timeout, not-yet-visible source) that River recovers on retry were being surfaced as terminal failed, so polling clients (and the benchmark harness) bailed prematurely.

  • queue.Job now carries Attempt/MaxAttempts (populated by the River worker; nil-row guarded for unit tests).
  • Pipeline.fail keeps the document parsing while retries remain and only marks failed on the final attempt — and now always logs the failure (it was previously silent, which hid the issue).
  • Ingest jobs cap retries at 5 so a genuinely-bad document fails in reasonable time.

Verified: 10 concurrent uploads — transient failures now log transient failure, will retry and recover to ready instead of flipping to failed. Build/vet clean; ingest + queue + api tests pass.

Closes HAL-321

Summary by Sourcery

Ensure ingest jobs treat transient queue retries as non-terminal failures and only mark documents failed on the final attempt.

New Features:

  • Allow ingest callers to cap retries per job (set to 5 for document ingest requests) to bound time-to-fail for genuinely bad documents.

Bug Fixes:

  • Keep documents in parsing status across non-final ingest attempts so transient failures do not surface as terminal failed states to polling clients.
  • Guard River job attempt metadata access so unit tests without JobRow continue to work correctly.

Enhancements:

  • Propagate attempt and max-attempt metadata from the River queue into ingest job handling to distinguish transient from terminal failures.
  • Always log ingest failures, including transient ones that will be retried, with document and stage context.

Summary by CodeRabbit

  • Bug Fixes

    • Document ingestion now includes explicit retry limits and improved failure handling to distinguish transient from terminal errors.
  • Refactor

    • Queue job processing enhanced with attempt tracking for better error diagnostics.

…ngest retries (HAL-321)

Under heavy concurrent bulk ingestion, individual attempts hit transient
failures (parse-timeout, not-yet-visible source) that River recovers on retry —
but the pipeline marked the document 'failed' on each miss, so polling clients
(and the benchmark) gave up prematurely. Now:
- queue.Job carries Attempt/MaxAttempts (populated by the River worker);
- Pipeline.fail keeps the doc 'parsing' while retries remain and only marks
  'failed' on the final attempt — and now always LOGS the failure (was silent);
- ingest jobs cap retries at 5 so a genuinely-bad doc fails in reasonable time.
@sourcery-ai

sourcery-ai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Reviewer's Guide

Makes ingest status retry-aware by propagating attempt metadata from River to ingest jobs, keeping documents in parsing while retries remain, logging failures explicitly, and capping ingest retries to 5.

Sequence diagram for retry-aware ingest failure handling

sequenceDiagram
    actor Client
    participant APIServer as APIServer
    participant Queue as Queue
    participant RiverWorker as envelopeWorker
    participant Pipeline as Pipeline
    participant Store as docPersister

    Client->>APIServer: handleIngestDocument
    APIServer->>Queue: Enqueue(Job{Kind:KindIngestDocument, MaxRetries:5})

    loop each_attempt
        Queue->>RiverWorker: Work(ctx, river.Job)
        RiverWorker->>Pipeline: Handler()(ctx, Job{Attempt, MaxAttempts})
        Pipeline->>Pipeline: Run(withLastAttempt(ctx, last))
        Pipeline-->>Pipeline: fail(ctx, store, id, stage, cause)
        alt not isLastAttempt(ctx)
            Pipeline->>Store: SetDocumentStatus(StatusParsing, "")
            Pipeline-->>Queue: return error (will retry)
        else isLastAttempt(ctx)
            Pipeline->>Store: SetDocumentStatus(StatusFailed, msg)
            Pipeline-->>Queue: return error (terminal)
        end
    end
Loading

File-Level Changes

Change Details Files
Propagate queue attempt metadata into ingest pipeline via context so handlers can distinguish transient (will-retry) failures from terminal ones.
  • Add context key helpers withLastAttempt/isLastAttempt in ingest pipeline, defaulting to treating missing metadata as last attempt for existing callers/tests.
  • Update Pipeline.Handler to derive last-attempt boolean from queue.Job.Attempt/MaxAttempts and wrap the handler context with this flag before calling Run.
  • Extend queue.Job with Attempt and MaxAttempts fields used at dispatch time (non-serialized) so handlers can inspect retry state.
  • Populate Attempt and MaxAttempts in the River envelopeWorker, guarding against nil JobRow in unit tests.
pkg/ingest/ingest.go
pkg/queue/queue.go
pkg/queue/river.go
Change ingest failure handling to keep documents in parsing on non-final attempts and to always log failures, only marking as failed on the last attempt.
  • Modify Pipeline.fail to branch on isLastAttempt: on non-final attempts, log a warning about transient failure and reset status to StatusParsing without a terminal error message.
  • Keep using a fresh background-based context with timeout for status updates to avoid being affected by request timeouts/cancellations.
  • On final attempts, log an error for the terminal failure and mark the document StatusFailed with a detailed error message, logging if the status write fails.
pkg/ingest/ingest.go
Limit ingest document jobs to a small, explicit retry budget to avoid long backoff chains for genuinely bad documents.
  • Set MaxRetries=5 on ingest-document enqueue requests so transient issues get a few retries but bad documents fail within a bounded time window.
  • Document the interaction between capped retries and Pipeline.fail’s last-attempt logic in the enqueue call comments.
internal/api/server.go

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bfc8017b-bee5-4fa3-89a6-a4a551d0ec61

📥 Commits

Reviewing files that changed from the base of the PR and between 0f11b8f and 5408484.

📒 Files selected for processing (4)
  • internal/api/server.go
  • pkg/ingest/ingest.go
  • pkg/queue/queue.go
  • pkg/queue/river.go

📝 Walkthrough

Walkthrough

The change adds Attempt and MaxAttempts fields to the queue Job struct, populates them from River's JobRow in the envelope worker, and threads them into the ingest pipeline via context. Pipeline.fail now resets document status to StatusParsing on non-final attempts instead of marking failure. The API enqueue call explicitly caps retries at 5.

Changes

Retry-aware ingest failure handling

Layer / File(s) Summary
Job attempt fields and River population
pkg/queue/queue.go, pkg/queue/river.go
Adds non-serializable Attempt and MaxAttempts fields to Job; envelopeWorker.Work populates them from job.JobRow when non-nil, defaulting to 0 for unit-test environments.
Pipeline retry context, Handler wiring, and fail behavior
pkg/ingest/ingest.go, internal/api/server.go
Adds isLastAttempt context helpers; Pipeline.Handler computes and injects the flag; Pipeline.fail resets document status to StatusParsing on transient attempts and marks failure only on the terminal attempt; ingest jobs are enqueued with MaxRetries: 5.

Sequence Diagram(s)

sequenceDiagram
    participant API as handleIngestDocument
    participant River as River Queue
    participant Worker as envelopeWorker
    participant Pipeline as Pipeline.Handler/Run
    participant DB as Document Status (DB)

    API->>River: Enqueue(KindIngestDocument, MaxRetries=5)
    River->>Worker: dispatch job (JobRow.Attempt, MaxAttempts)
    Worker->>Pipeline: Job{Attempt, MaxAttempts, ...}
    Pipeline->>Pipeline: inject isLastAttempt into ctx
    alt parse/ingest error
        Pipeline->>Pipeline: Pipeline.fail(ctx, err)
        alt isLastAttempt = false
            Pipeline->>DB: reset status → StatusParsing
        else isLastAttempt = true
            Pipeline->>DB: write StatusFailed
        end
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 Hop, hop, retry — but not forever more,
Five chances to parse, then we close the door.
On the last attempt, the failure is real,
But transient troubles? Just reset and heal.
The rabbit keeps count with Attempt in hand,
And leaves no doc stuck in "parsing" land! 🌿

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'engine: retry-aware ingest status — keep 'parsing' while the queue retries (HAL-321)' accurately summarizes the main change: implementing retry-aware status tracking to keep documents in 'parsing' state during transient failures rather than marking them failed.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch halleluyaholudele/hal-321-ingest-retry-aware

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've reviewed your changes and they look great!


Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@hallelx2 hallelx2 merged commit b81db7e into main Jun 20, 2026
8 checks passed
@hallelx2 hallelx2 deleted the halleluyaholudele/hal-321-ingest-retry-aware branch June 20, 2026 07:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant