stream: reduce allocations on WHATWG streams hot paths #63876

Draft
mcollina wants to merge 1 commit into nodejs:main from mcollina:webstream-js-perf-squashed

mcollina commented Jun 12, 2026

Member

Pure-JavaScript allocation reductions on the WHATWG streams hot paths, partially based on the findings of #63872: reused promise-reaction closures per controller (pull/write), a buffered fast path in the async iterator, queueMicrotask() for non-object start results, arity-specialized algorithm wrappers, shared nil state records, and removal of several dead per-instance allocations. No observable behavior change: WPT streams/compression/encoding results are identical to main (same subtests passing, same 8 expected failures by name).

Benchmark results (benchmark/compare.js --runs 10, both binaries built the same day from the same toolchain):

                                                                       confidence improvement accuracy (*)    (**)    (***)
webstreams/creation.js kind='ReadableStream.tee' n=50000                      ***     29.74 %       ±4.57%  ±6.28%   ±8.58%
webstreams/creation.js kind='ReadableStream' n=50000                          ***     21.46 %       ±6.77%  ±9.51%  ±13.48%
webstreams/creation.js kind='ReadableStreamBYOBReader' n=50000                ***    -51.78 %       ±3.80%  ±5.21%   ±7.10%
webstreams/creation.js kind='ReadableStreamDefaultReader' n=50000              **    102.49 %      ±62.86% ±90.21% ±132.47%
webstreams/creation.js kind='TransformStream' n=50000                         ***     33.00 %       ±6.62%  ±9.22%  ±12.90%
webstreams/creation.js kind='WritableStream' n=50000                          ***    102.28 %      ±13.80% ±19.67%  ±28.56%
webstreams/js_transfer.js n=10000 payload='ReadableStream'                     **      2.97 %       ±1.97%  ±2.74%   ±3.82%
webstreams/js_transfer.js n=10000 payload='TransformStream'                   ***      6.29 %       ±1.97%  ±2.75%   ±3.87%
webstreams/js_transfer.js n=10000 payload='WritableStream'                    ***      7.66 %       ±1.58%  ±2.19%   ±3.01%
webstreams/pipe-to.js highWaterMarkW=1024 highWaterMarkR=1024 n=500000                 1.27 %       ±2.50%  ±3.48%   ±4.85%
webstreams/pipe-to.js highWaterMarkW=1024 highWaterMarkR=2048 n=500000          *      2.61 %       ±2.34%  ±3.22%   ±4.40%
webstreams/pipe-to.js highWaterMarkW=1024 highWaterMarkR=4096 n=500000         **      3.47 %       ±1.93%  ±2.70%   ±3.77%
webstreams/pipe-to.js highWaterMarkW=1024 highWaterMarkR=512 n=500000         ***      3.13 %       ±1.61%  ±2.21%   ±3.03%
webstreams/pipe-to.js highWaterMarkW=2048 highWaterMarkR=1024 n=500000        ***      4.46 %       ±2.23%  ±3.05%   ±4.16%
webstreams/pipe-to.js highWaterMarkW=2048 highWaterMarkR=2048 n=500000        ***      3.71 %       ±1.89%  ±2.60%   ±3.54%
webstreams/pipe-to.js highWaterMarkW=2048 highWaterMarkR=4096 n=500000        ***      4.33 %       ±1.65%  ±2.27%   ±3.09%
webstreams/pipe-to.js highWaterMarkW=2048 highWaterMarkR=512 n=500000         ***      3.80 %       ±1.81%  ±2.48%   ±3.38%
webstreams/pipe-to.js highWaterMarkW=4096 highWaterMarkR=1024 n=500000                 1.20 %       ±1.53%  ±2.10%   ±2.88%
webstreams/pipe-to.js highWaterMarkW=4096 highWaterMarkR=2048 n=500000         **      4.01 %       ±2.18%  ±3.01%   ±4.15%
webstreams/pipe-to.js highWaterMarkW=4096 highWaterMarkR=4096 n=500000        ***      4.17 %       ±2.16%  ±2.99%   ±4.15%
webstreams/pipe-to.js highWaterMarkW=4096 highWaterMarkR=512 n=500000         ***      3.34 %       ±1.51%  ±2.10%   ±2.92%
webstreams/pipe-to.js highWaterMarkW=512 highWaterMarkR=1024 n=500000         ***      2.91 %       ±1.39%  ±1.93%   ±2.67%
webstreams/pipe-to.js highWaterMarkW=512 highWaterMarkR=2048 n=500000         ***      3.50 %       ±1.86%  ±2.56%   ±3.49%
webstreams/pipe-to.js highWaterMarkW=512 highWaterMarkR=4096 n=500000           *      2.82 %       ±2.25%  ±3.13%   ±4.36%
webstreams/pipe-to.js highWaterMarkW=512 highWaterMarkR=512 n=500000          ***      4.30 %       ±1.45%  ±2.00%   ±2.74%
webstreams/readable-async-iterator.js n=100000                                ***     38.07 %       ±5.69%  ±7.82%  ±10.69%
webstreams/readable-read-buffered.js bufferSize=1 n=100000                             2.94 %       ±6.05%  ±8.29%  ±11.30%
webstreams/readable-read-buffered.js bufferSize=10 n=100000                            2.50 %       ±7.03%  ±9.76%  ±13.61%
webstreams/readable-read-buffered.js bufferSize=100 n=100000                          -2.42 %       ±6.57%  ±9.01%  ±12.29%
webstreams/readable-read-buffered.js bufferSize=1000 n=100000                         -3.28 %       ±6.07%  ±8.42%  ±11.68%
webstreams/readable-read.js type='byob' n=100000                                      -0.25 %       ±1.73%  ±2.41%   ±3.36%
webstreams/readable-read.js type='normal' n=100000                                     4.12 %       ±6.84%  ±9.41%  ±12.91%

The creation.js rows at the stock n=50000 measure only a 20-40 ms window and are unreliable; the same benchmark re-run with --set n=500000:

                                                                   confidence improvement accuracy (*)   (**)   (***)
webstreams/creation.js kind='ReadableStream.tee' n=500000                          2.31 %       ±2.49% ±3.50%  ±4.95%
webstreams/creation.js kind='ReadableStream' n=500000                     ***     13.87 %       ±2.38% ±3.26%  ±4.46%
webstreams/creation.js kind='ReadableStreamBYOBReader' n=500000           ***     12.98 %       ±4.95% ±6.82%  ±9.40%
webstreams/creation.js kind='ReadableStreamDefaultReader' n=500000         **      9.82 %       ±6.90% ±9.61% ±13.42%
webstreams/creation.js kind='TransformStream' n=500000                    ***     50.30 %       ±2.23% ±3.07%  ±4.19%
webstreams/creation.js kind='WritableStream' n=500000                     ***     97.09 %       ±6.55% ±9.18% ±12.95%

Pure-JavaScript optimizations to lib/internal/webstreams/*:

- Reuse the pull and write promise-reaction closures per controller
  instead of allocating two fresh closures per chunk. They are created
  lazily on the first pull/write so construction-only workloads never
  pay for them.
- Add the buffered fast path to the ReadableStream async iterator,
  mirroring the one ReadableStreamDefaultReader.read() already has:
  when data is queued in a default controller, dequeue directly and
  skip the read request, the deferred promise, and the promise
  chaining on the following call.
- Run the post-start step through queueMicrotask() when the start
  algorithm result is not an object (fulfillment is guaranteed and no
  .then lookup is observable), instead of wrapping it in
  new Promise((r) => r(result)) plus a two-closure reaction. Object
  and thenable results keep the promise path since their adoption
  timing and .then lookups are observable.
- Specialize the promise-callback wrappers for user algorithms by
  arity (0/1/2), replacing the rest-parameter + ReflectApply form that
  allocated an arguments array per invocation. The exact number of
  arguments each user callback observes is preserved.
- Share immutable nil records for the writable stream closeRequest,
  inFlightWriteRequest, inFlightCloseRequest and pendingAbortRequest
  resets; these records are only ever replaced wholesale. Push the
  PromiseWithResolvers() record directly as the write request rather
  than rebuilding an identical object.
- Remove dead per-instance allocations: the never-read close record in
  the writable stream state, the placeholder close/ready records that
  reader/writer setup unconditionally replaces, the per-stream
  () => 1 size algorithm closures, and the kControllerErrorFunction
  placeholder plus bound function (now a prototype method; byte
  streams keep their historical no-op behavior there).

Benchmark results vs main, same-day builds, benchmark/compare.js
--runs 10 (statistically significant rows):

  creation WritableStream (n=500k)              +97.1% ***
  creation TransformStream (n=500k)             +50.3% ***
  creation ReadableStream (n=500k)              +13.9% ***
  creation ReadableStreamBYOBReader (n=500k)    +13.0% ***
  creation ReadableStreamDefaultReader (n=500k)  +9.8% **
  readable-async-iterator                       +38.1% ***
  pipe-to (14 of 16 hwm configs)          +2.6..+4.5% (other 2 positive, n.s.)
  js_transfer WS / TS / RS              +7.7% / +6.3% / +3.0%
  readable-read normal/byob, read-buffered      parity (n.s.)

WPT streams/compression/encoding results are identical to main
(1403/338/3822 subtests passed, same 8 expected failures by name),
and all webstreams-related parallel tests pass.

Assisted-by: Claude Fable 5 <noreply@anthropic.com>
nodejs-github-bot added the needs-ci and web streams labels Jun 12, 2026
mcollina requested review from MattiasBuelens, aduh95 and jasnell and removed request for aduh95 June 12, 2026 14:04

MattiasBuelens left a comment

Contributor

Looks pretty sensible to me! 👍

Comment on lines +552 to +557
// No read is in flight. Mirror the buffered fast path of
// ReadableStreamDefaultReader.read(): when data is already queued
// in a default controller, resolve immediately without allocating
// a read request. The result settles synchronously, so leaving
// state.current undefined matches the state the slow path reaches
// once its read request callbacks have settled.

I wonder if we always have to inline this... 🤔

I ported some of your previous optimizations to web-streams-polyfill, but instead of copying the code around I made defaultReader.read() create a different kind of ReadRequest if it knows that it will be resolved synchronously.

Perhaps we can do the same here, and have nextSteps create a different kind of AsyncIteratorReadRequest if it can be resolved synchronously? (I forgot to do that in my polyfill, it seems. 😅 ) Or would that risk turning readableStreamDefaultReaderRead megamorphic? (Or maybe it already is?)

Comment thread lib/internal/webstreams/readablestream.js
Comment on lines +2608 to +2613
queueMicrotask(() => {
  controller[kState].started = true;
  assert(!controller[kState].pulling);
  assert(!controller[kState].pullAgain);
  readableStreamDefaultControllerCallPullIfNeeded(controller);
});

Nit-pick: maybe pull this callback into a const, so we can reuse it for the PromisePrototypeThen below?
