fix(eval): update action for latest braintrust and zod apis #103

Merged: Abhijeet Prasad (AbhiPrasad) merged 1 commit into main from abhi-fix-eval-action-new-apis on Jun 30, 2026

@AbhiPrasad

Member

Upgrade the eval action dependencies and update argument parsing for the new Zod API.

Replace the removed Braintrust core capitalize helper with a local implementation and regenerate the bundled action assets.
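As a rough illustration of what a local replacement for the removed helper could look like, here is a minimal sketch. The function name, signature, and first-letter-only behavior are assumptions made for illustration; this is not the code merged in this PR, and the original Braintrust helper may have handled more cases.

```typescript
// Hypothetical local stand-in for the removed capitalize helper.
// Assumption: only the first character is uppercased and the rest
// of the string is left untouched.
function capitalize(value: string): string {
  if (value.length === 0) {
    // Nothing to capitalize in an empty string.
    return value;
  }
  return value.charAt(0).toUpperCase() + value.slice(1);
}

// Example: formatting a raw metric name for display.
console.log(capitalize("llm_calls")); // prints "Llm_calls"
```

Keeping a helper this small inside the action avoids depending on an internal utility that an upstream package can drop between releases.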

Abhijeet Prasad (AbhiPrasad) merged commit 9fb6ffc into main on Jun 30, 2026
8 checks passed
@github-actions

github-actions Bot commented Jun 30, 2026

Braintrust eval report

Say Hi Bot Python (main-1782823929)

| Score | Average | Improvements | Regressions |
| --- | --- | --- | --- |
| Levenshtein | 77.8% (+0pp) | - | - |
| Llm_calls | 0 (+0) | - | - |
| Tool_calls | 0 (+0) | - | - |
| Errors | 0 (+0) | - | - |
| Llm_errors | 0 (+0) | - | - |
| Tool_errors | 0 (+0) | - | - |
| Prompt_tokens | 0tok (+0tok) | - | - |
| Prompt_cached_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_5m_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_1h_tokens | 0tok (+0tok) | - | - |
| Completion_tokens | 0tok (+0tok) | - | - |
| Completion_reasoning_tokens | 0tok (+0tok) | - | - |
| Total_tokens | 0tok (+0tok) | - | - |
| Duration | 0s (+0s) | - | 2 🔴 |

@github-actions

github-actions Bot commented Jun 30, 2026

Braintrust eval report

Say Hi Bot Python (main-1782823930)

| Score | Average | Improvements | Regressions |
| --- | --- | --- | --- |
| Levenshtein | 77.8% (+0pp) | - | - |
| Llm_calls | 0 (+0) | - | - |
| Tool_calls | 0 (+0) | - | - |
| Errors | 0 (+0) | - | - |
| Llm_errors | 0 (+0) | - | - |
| Tool_errors | 0 (+0) | - | - |
| Prompt_tokens | 0tok (+0tok) | - | - |
| Prompt_cached_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_5m_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_1h_tokens | 0tok (+0tok) | - | - |
| Completion_tokens | 0tok (+0tok) | - | - |
| Completion_reasoning_tokens | 0tok (+0tok) | - | - |
| Total_tokens | 0tok (+0tok) | - | - |
| Duration | 0s (0s) | 2 🟢 | - |

@github-actions

github-actions Bot commented Jun 30, 2026

Braintrust eval report

Console logging (main-1782823939)

| Score | Average | Improvements | Regressions |
| --- | --- | --- | --- |
| Levenshtein | 82.9% (+0pp) | 5 🟢 | 5 🔴 |
| Llm_calls | 0 (+0) | - | - |
| Tool_calls | 0 (+0) | - | - |
| Errors | 0 (+0) | - | - |
| Llm_errors | 0 (+0) | - | - |
| Tool_errors | 0 (+0) | - | - |
| Prompt_tokens | 0tok (+0tok) | - | - |
| Prompt_cached_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_5m_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_1h_tokens | 0tok (+0tok) | - | - |
| Completion_tokens | 0tok (+0tok) | - | - |
| Completion_reasoning_tokens | 0tok (+0tok) | - | - |
| Total_tokens | 0tok (+0tok) | - | - |
| Duration | 0s (+0s) | - | 20 🔴 |

My Evaluation (main-1782823939)

| Score | Average | Improvements | Regressions |
| --- | --- | --- | --- |
| Exact match | 100% (+0pp) | - | - |
| Llm_calls | 0 (+0) | - | - |
| Tool_calls | 0 (+0) | - | - |
| Errors | 0 (+0) | - | - |
| Llm_errors | 0 (+0) | - | - |
| Tool_errors | 0 (+0) | - | - |
| Prompt_tokens | 10tok (+0tok) | - | - |
| Prompt_cached_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_5m_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_1h_tokens | 0tok (+0tok) | - | - |
| Completion_tokens | 2tok (+0tok) | - | - |
| Completion_reasoning_tokens | 0tok (+0tok) | - | - |
| Total_tokens | 12tok (+0tok) | - | - |
| Duration | 0.26s (+0.17s) | - | 1 🔴 |

Say Hi Bot (main-1782823939)

| Score | Average | Improvements | Regressions |
| --- | --- | --- | --- |
| Levenshtein | 82.2% (+1pp) | 7 🟢 | 5 🔴 |
| Llm_calls | 0 (+0) | - | - |
| Tool_calls | 0 (+0) | - | - |
| Errors | 0 (+0) | - | - |
| Llm_errors | 0 (+0) | - | - |
| Tool_errors | 0 (+0) | - | - |
| Prompt_tokens | 0tok (+0tok) | - | - |
| Prompt_cached_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_5m_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_1h_tokens | 0tok (+0tok) | - | - |
| Completion_tokens | 0tok (+0tok) | - | - |
| Completion_reasoning_tokens | 0tok (+0tok) | - | - |
| Total_tokens | 0tok (+0tok) | - | - |
| Duration | 1s (0s) | 20 🟢 | - |

@github-actions

github-actions Bot commented Jun 30, 2026

Braintrust eval report

Say Hi Bot (main-1782823939-280d1a7f)

| Score | Average | Improvements | Regressions |
| --- | --- | --- | --- |
| Levenshtein | 82.2% (+0pp) | 4 🟢 | 3 🔴 |
| Llm_calls | 0 (+0) | - | - |
| Tool_calls | 0 (+0) | - | - |
| Errors | 0 (+0) | - | - |
| Llm_errors | 0 (+0) | - | - |
| Tool_errors | 0 (+0) | - | - |
| Prompt_tokens | 0tok (+0tok) | - | - |
| Prompt_cached_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_5m_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_1h_tokens | 0tok (+0tok) | - | - |
| Completion_tokens | 0tok (+0tok) | - | - |
| Completion_reasoning_tokens | 0tok (+0tok) | - | - |
| Total_tokens | 0tok (+0tok) | - | - |
| Duration | 1s (+0s) | - | 19 🔴 |

@github-actions

github-actions Bot commented Jun 30, 2026

Braintrust eval report

Say Hi Bot (main-1782823949)

| Score | Average | Improvements | Regressions |
| --- | --- | --- | --- |
| Levenshtein | 82.1% (0pp) | 4 🟢 | 4 🔴 |
| Llm_calls | 0 (+0) | - | - |
| Tool_calls | 0 (+0) | - | - |
| Errors | 0 (+0) | - | - |
| Llm_errors | 0 (+0) | - | - |
| Tool_errors | 0 (+0) | - | - |
| Prompt_tokens | 0tok (+0tok) | - | - |
| Prompt_cached_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_5m_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_1h_tokens | 0tok (+0tok) | - | - |
| Completion_tokens | 0tok (+0tok) | - | - |
| Completion_reasoning_tokens | 0tok (+0tok) | - | - |
| Total_tokens | 0tok (+0tok) | - | - |
| Duration | 1s (+0s) | - | 5 🔴 |
