engine: retry-aware ingest status, keep 'parsing' while the queue retries (HAL-321) #47
…ngest retries (HAL-321)

Under heavy concurrent bulk ingestion, individual attempts hit transient failures (parse-timeout, not-yet-visible source) that River recovers on retry, but the pipeline marked the document 'failed' on each miss, so polling clients (and the benchmark) gave up prematurely. Now:
- queue.Job carries Attempt/MaxAttempts (populated by the River worker);
- Pipeline.fail keeps the doc 'parsing' while retries remain and only marks 'failed' on the final attempt, and it now always logs the failure (it was silent before);
- ingest jobs cap retries at 5 so a genuinely bad doc fails in reasonable time.
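The failure handling described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `statusStore` interface, the `memStore` type, the `errParseTimeout` value, and the `lastAttempt` boolean parameter are all hypothetical stand-ins (the real `Pipeline.fail` appears to take a context and a store), and the log wording for the final attempt is assumed.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// Status mirrors the status names used in this PR; the string type is an assumption.
type Status string

const (
	StatusParsing Status = "parsing"
	StatusFailed  Status = "failed"
)

// statusStore is a hypothetical stand-in for the document persister.
type statusStore interface {
	SetDocumentStatus(id string, s Status, msg string) error
}

// memStore is an in-memory store used only for this illustration.
type memStore struct {
	status map[string]Status
	msg    map[string]string
}

func newMemStore() *memStore {
	return &memStore{status: map[string]Status{}, msg: map[string]string{}}
}

func (m *memStore) SetDocumentStatus(id string, s Status, msg string) error {
	m.status[id] = s
	m.msg[id] = msg
	return nil
}

// errParseTimeout is a hypothetical transient error used by the demo.
var errParseTimeout = errors.New("parse-timeout")

// fail logs every failure, keeps the document in 'parsing' while retries
// remain, and marks it 'failed' only on the final attempt. It always returns
// an error so the queue still counts the attempt as failed.
func fail(store statusStore, id, stage string, cause error, lastAttempt bool) error {
	err := fmt.Errorf("%s: %w", stage, cause)
	status, msg := StatusParsing, ""
	if lastAttempt {
		status, msg = StatusFailed, err.Error()
		log.Printf("ingest %s: failed on final attempt: %v", id, err)
	} else {
		log.Printf("ingest %s: transient failure, will retry: %v", id, err)
	}
	if serr := store.SetDocumentStatus(id, status, msg); serr != nil {
		log.Printf("ingest %s: could not persist status: %v", id, serr)
	}
	return err
}

func main() {
	store := newMemStore()
	_ = fail(store, "doc-1", "parse", errParseTimeout, false)
	fmt.Println("after a retryable miss:", store.status["doc-1"])
	_ = fail(store, "doc-1", "parse", errParseTimeout, true)
	fmt.Println("after the final attempt:", store.status["doc-1"])
}
```

Returning the error in both branches matters: the queue can only schedule a retry if the attempt is reported as failed, so only the persisted document status differs between the two cases.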
Reviewer's Guide
Makes ingest status retry-aware by propagating attempt metadata from River to ingest jobs, keeping documents in
Sequence diagram for retry-aware ingest failure handling
sequenceDiagram
actor Client
participant APIServer as APIServer
participant Queue as Queue
participant RiverWorker as envelopeWorker
participant Pipeline as Pipeline
participant Store as docPersister
Client->>APIServer: handleIngestDocument
APIServer->>Queue: Enqueue(Job{Kind:KindIngestDocument, MaxRetries:5})
loop each_attempt
Queue->>RiverWorker: Work(ctx, river.Job)
RiverWorker->>Pipeline: Handler()(ctx, Job{Attempt, MaxAttempts})
Pipeline->>Pipeline: Run(withLastAttempt(ctx, last))
Pipeline-->>Pipeline: fail(ctx, store, id, stage, cause)
alt not isLastAttempt(ctx)
Pipeline->>Store: SetDocumentStatus(StatusParsing, "")
Pipeline-->>Queue: return error (will retry)
else isLastAttempt(ctx)
Pipeline->>Store: SetDocumentStatus(StatusFailed, msg)
Pipeline-->>Queue: return error (terminal)
end
end
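The hand-off from the queue worker to the pipeline in the diagram above might look something like the sketch below. The `jobRow` and `Job` types are hypothetical local stand-ins for River's job row and for `queue.Job`. Two things are assumptions of this sketch and not confirmed by the PR: that attempts count from 1, and that a job with no retry metadata is treated as being on its last attempt.

```go
package main

import "fmt"

// jobRow is a hypothetical stand-in for the queue library's job row; only the
// two fields mentioned in this PR are modeled.
type jobRow struct {
	Attempt     int
	MaxAttempts int
}

// Job is a hypothetical stand-in for queue.Job.
type Job struct {
	Kind        string
	Attempt     int
	MaxAttempts int
}

// toJob copies attempt metadata when a row is present. A nil row (as in unit
// tests that build jobs by hand) leaves both fields at zero.
func toJob(kind string, row *jobRow) Job {
	j := Job{Kind: kind}
	if row != nil {
		j.Attempt = row.Attempt
		j.MaxAttempts = row.MaxAttempts
	}
	return j
}

// isLast reports whether no retries remain. Treating missing metadata
// (MaxAttempts <= 0) as the last attempt is an assumption of this sketch,
// chosen so that a job without retry information still reaches a terminal status.
func (j Job) isLast() bool {
	return j.MaxAttempts <= 0 || j.Attempt >= j.MaxAttempts
}

func main() {
	fmt.Println(toJob("ingest_document", &jobRow{Attempt: 1, MaxAttempts: 5}).isLast())
	fmt.Println(toJob("ingest_document", &jobRow{Attempt: 5, MaxAttempts: 5}).isLast())
	fmt.Println(toJob("ingest_document", nil).isLast())
}
```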
No actionable comments were generated in the recent review.
Walkthrough
The change adds
Changes
Retry-aware ingest failure handling
Sequence Diagram(s)
sequenceDiagram
participant API as handleIngestDocument
participant River as River Queue
participant Worker as envelopeWorker
participant Pipeline as Pipeline.Handler/Run
participant DB as Document Status (DB)
API->>River: Enqueue(KindIngestDocument, MaxRetries=5)
River->>Worker: dispatch job (JobRow.Attempt, MaxAttempts)
Worker->>Pipeline: Job{Attempt, MaxAttempts, ...}
Pipeline->>Pipeline: inject isLastAttempt into ctx
alt parse/ingest error
Pipeline->>Pipeline: Pipeline.fail(ctx, err)
alt isLastAttempt = false
Pipeline->>DB: reset status → StatusParsing
else isLastAttempt = true
Pipeline->>DB: write StatusFailed
end
end
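The "inject isLastAttempt into ctx" step in the diagram above is a common Go pattern: a value stored under an unexported context key. The sketch below reuses the helper names from the diagrams, but the bodies are guesses. In particular, defaulting to `true` when the flag was never set is an assumption made here so that callers that bypass the queue keep the old fail-immediately behaviour.

```go
package main

import (
	"context"
	"fmt"
)

// lastAttemptKey is an unexported key type, so no other package can collide
// with or overwrite this context value.
type lastAttemptKey struct{}

// withLastAttempt returns a child context that records whether the current
// attempt is the final one.
func withLastAttempt(ctx context.Context, last bool) context.Context {
	return context.WithValue(ctx, lastAttemptKey{}, last)
}

// isLastAttempt reads the flag back. When the flag was never set it returns
// true; that default is an assumption of this sketch.
func isLastAttempt(ctx context.Context) bool {
	last, ok := ctx.Value(lastAttemptKey{}).(bool)
	if !ok {
		return true
	}
	return last
}

// demo evaluates the three cases of interest: flag unset, retries remaining,
// and final attempt.
func demo() (unset, retrying, final bool) {
	base := context.Background()
	return isLastAttempt(base),
		isLastAttempt(withLastAttempt(base, false)),
		isLastAttempt(withLastAttempt(base, true))
}

func main() {
	unset, retrying, final := demo()
	fmt.Println(unset, retrying, final)
}
```

Carrying the flag in the context keeps the signatures of the pipeline stages unchanged, at the cost of making the dependency implicit.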
Estimated code review effort: 3 (Moderate) | ~20 minutes
Pre-merge checks: 4 passed, 1 failed (1 warning)
Under heavy concurrent bulk ingestion, transient per-attempt failures (parse-timeout, not-yet-visible source) that River recovers on retry were being surfaced as terminal `failed`, so polling clients (and the benchmark harness) bailed prematurely.

- `queue.Job` now carries `Attempt`/`MaxAttempts` (populated by the River worker; nil-row guarded for unit tests).
- `Pipeline.fail` keeps the document `parsing` while retries remain and only marks `failed` on the final attempt, and it now always logs the failure (it was previously silent, which hid the issue).

Verified: with 10 concurrent uploads, transient failures now log `transient failure, will retry` and recover to `ready` instead of flipping to `failed`. Build/vet clean; ingest + queue + api tests pass.

Closes HAL-321
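To see why the old behaviour made polling clients give up early, consider a client that stops at the first terminal status. The sketch below is purely illustrative: `waitForTerminal` and `sequence` are hypothetical helpers, statuses are plain strings, and a real client would sleep between polls and query the API instead of replaying a fixed list.

```go
package main

import "fmt"

// waitForTerminal polls next until it yields a terminal status ("ready" or
// "failed") or the poll budget runs out. The bool reports whether a terminal
// status was seen.
func waitForTerminal(next func() string, maxPolls int) (string, bool) {
	for i := 0; i < maxPolls; i++ {
		switch s := next(); s {
		case "ready", "failed":
			return s, true
		}
	}
	return "", false
}

// sequence returns a status source that replays the given statuses in order
// and then keeps repeating the last one. It needs at least one status.
func sequence(statuses ...string) func() string {
	i := 0
	return func() string {
		s := statuses[i]
		if i < len(statuses)-1 {
			i++
		}
		return s
	}
}

func main() {
	// Before the change: a retryable miss surfaced as 'failed', so the client
	// stopped even though a later attempt would have succeeded.
	before, _ := waitForTerminal(sequence("parsing", "failed", "parsing", "ready"), 10)
	// After the change: the document stays 'parsing' across retries.
	after, _ := waitForTerminal(sequence("parsing", "parsing", "parsing", "ready"), 10)
	fmt.Println("before:", before)
	fmt.Println("after:", after)
}
```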
Summary by Sourcery
Ensure ingest jobs treat transient queue retries as non-terminal failures and only mark documents failed on the final attempt.