fix(eval): update action for latest braintrust and zod apis by AbhiPrasad · Pull Request #103 · braintrustdata/eval-action

Abhijeet Prasad (AbhiPrasad) · 2026-06-29T20:26:04Z

Upgrade the eval action dependencies and update argument parsing for the new Zod API.

Replace the removed Braintrust core capitalize helper with a local implementation and regenerate the bundled action assets.
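For illustration, a local replacement for a removed capitalize helper could look roughly like the sketch below. This is not the code from this PR: the function name, the optional `separator` parameter, and the behavior (upper-casing only the first character of each part) are assumptions, chosen to be consistent with score names such as `Llm_calls` in the eval reports.

```typescript
// Hypothetical local stand-in for a removed `capitalize` helper.
// The signature and the optional `separator` parameter are assumptions,
// not the action's actual implementation.
function capitalize(value: string, separator?: string): string {
  // Without a separator, treat the whole string as a single part.
  const parts = separator ? value.split(separator) : [value];
  return parts
    .map((part) =>
      // Upper-case only the first character; leave empty parts untouched.
      part ? part.charAt(0).toUpperCase() + part.slice(1) : part,
    )
    .join(separator ?? "");
}

console.log(capitalize("llm_calls")); // "Llm_calls"
console.log(capitalize("prompt_tokens", "_")); // "Prompt_Tokens"
```

Keeping the helper local means the action no longer depends on an internal export that may be removed again in a later release.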

github-actions · 2026-06-30T12:52:06Z

Braintrust eval report

Say Hi Bot Python (main-1782823929)

Score	Average	Improvements	Regressions
Levenshtein	77.8% (+0pp)	-	-
Llm_calls	0 (+0)	-	-
Tool_calls	0 (+0)	-	-
Errors	0 (+0)	-	-
Llm_errors	0 (+0)	-	-
Tool_errors	0 (+0)	-	-
Prompt_tokens	0tok (+0tok)	-	-
Prompt_cached_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_5m_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_1h_tokens	0tok (+0tok)	-	-
Completion_tokens	0tok (+0tok)	-	-
Completion_reasoning_tokens	0tok (+0tok)	-	-
Total_tokens	0tok (+0tok)	-	-
Duration	0s (+0s)	-	2 🔴

github-actions · 2026-06-30T12:52:09Z

Braintrust eval report

Say Hi Bot Python (main-1782823930)

Score	Average	Improvements	Regressions
Levenshtein	77.8% (+0pp)	-	-
Llm_calls	0 (+0)	-	-
Tool_calls	0 (+0)	-	-
Errors	0 (+0)	-	-
Llm_errors	0 (+0)	-	-
Tool_errors	0 (+0)	-	-
Prompt_tokens	0tok (+0tok)	-	-
Prompt_cached_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_5m_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_1h_tokens	0tok (+0tok)	-	-
Completion_tokens	0tok (+0tok)	-	-
Completion_reasoning_tokens	0tok (+0tok)	-	-
Total_tokens	0tok (+0tok)	-	-
Duration	0s (0s)	2 🟢	-

github-actions · 2026-06-30T12:52:17Z

Braintrust eval report

Console logging (main-1782823939)

Score	Average	Improvements	Regressions
Levenshtein	82.9% (+0pp)	5 🟢	5 🔴
Llm_calls	0 (+0)	-	-
Tool_calls	0 (+0)	-	-
Errors	0 (+0)	-	-
Llm_errors	0 (+0)	-	-
Tool_errors	0 (+0)	-	-
Prompt_tokens	0tok (+0tok)	-	-
Prompt_cached_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_5m_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_1h_tokens	0tok (+0tok)	-	-
Completion_tokens	0tok (+0tok)	-	-
Completion_reasoning_tokens	0tok (+0tok)	-	-
Total_tokens	0tok (+0tok)	-	-
Duration	0s (+0s)	-	20 🔴

My Evaluation (main-1782823939)

Score	Average	Improvements	Regressions
Exact match	100% (+0pp)	-	-
Llm_calls	0 (+0)	-	-
Tool_calls	0 (+0)	-	-
Errors	0 (+0)	-	-
Llm_errors	0 (+0)	-	-
Tool_errors	0 (+0)	-	-
Prompt_tokens	10tok (+0tok)	-	-
Prompt_cached_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_5m_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_1h_tokens	0tok (+0tok)	-	-
Completion_tokens	2tok (+0tok)	-	-
Completion_reasoning_tokens	0tok (+0tok)	-	-
Total_tokens	12tok (+0tok)	-	-
Duration	0.26s (+0.17s)	-	1 🔴

Say Hi Bot (main-1782823939)

Score	Average	Improvements	Regressions
Levenshtein	82.2% (+1pp)	7 🟢	5 🔴
Llm_calls	0 (+0)	-	-
Tool_calls	0 (+0)	-	-
Errors	0 (+0)	-	-
Llm_errors	0 (+0)	-	-
Tool_errors	0 (+0)	-	-
Prompt_tokens	0tok (+0tok)	-	-
Prompt_cached_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_5m_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_1h_tokens	0tok (+0tok)	-	-
Completion_tokens	0tok (+0tok)	-	-
Completion_reasoning_tokens	0tok (+0tok)	-	-
Total_tokens	0tok (+0tok)	-	-
Duration	1s (0s)	20 🟢	-

github-actions · 2026-06-30T12:52:18Z

Braintrust eval report

Say Hi Bot (main-1782823939-280d1a7f)

Score	Average	Improvements	Regressions
Levenshtein	82.2% (+0pp)	4 🟢	3 🔴
Llm_calls	0 (+0)	-	-
Tool_calls	0 (+0)	-	-
Errors	0 (+0)	-	-
Llm_errors	0 (+0)	-	-
Tool_errors	0 (+0)	-	-
Prompt_tokens	0tok (+0tok)	-	-
Prompt_cached_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_5m_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_1h_tokens	0tok (+0tok)	-	-
Completion_tokens	0tok (+0tok)	-	-
Completion_reasoning_tokens	0tok (+0tok)	-	-
Total_tokens	0tok (+0tok)	-	-
Duration	1s (+0s)	-	19 🔴

github-actions · 2026-06-30T12:52:27Z

Braintrust eval report

Say Hi Bot (main-1782823949)

Score	Average	Improvements	Regressions
Levenshtein	82.1% (0pp)	4 🟢	4 🔴
Llm_calls	0 (+0)	-	-
Tool_calls	0 (+0)	-	-
Errors	0 (+0)	-	-
Llm_errors	0 (+0)	-	-
Tool_errors	0 (+0)	-	-
Prompt_tokens	0tok (+0tok)	-	-
Prompt_cached_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_5m_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_1h_tokens	0tok (+0tok)	-	-
Completion_tokens	0tok (+0tok)	-	-
Completion_reasoning_tokens	0tok (+0tok)	-	-
Total_tokens	0tok (+0tok)	-	-
Duration	1s (+0s)	-	5 🔴

fix(eval): update action for latest braintrust and zod apis

6ec68b3

Upgrade the eval action dependencies and update argument parsing for the new Zod API. Replace the removed Braintrust core capitalize helper with a local implementation and regenerate the bundled action assets.

Abhijeet Prasad (AbhiPrasad) requested a review from Luca Forstner (lforst) June 29, 2026 20:26

Abhijeet Prasad (AbhiPrasad) self-assigned this Jun 29, 2026

Luca Forstner (lforst) approved these changes Jun 30, 2026

View reviewed changes

Abhijeet Prasad (AbhiPrasad) merged commit 9fb6ffc into main Jun 30, 2026
8 checks passed

Abhijeet Prasad (AbhiPrasad) merged 1 commit into main from abhi-fix-eval-action-new-apis
