
Fix HF token validation to support Hugging Face CLI cache#4313

Open
Shuwen-Fang wants to merge 1 commit into main from hf_token_documentation

@Shuwen-Fang Shuwen-Fang commented Jul 1, 2026

Collaborator

This PR fixes a bug where train_rl.py (and other post-training scripts) would fail with ValueError: hf_access_token must be provided when not providing a pre-existing checkpoint even if the user had authenticated via Hugging Face CLI (hf auth login).

Cause

model_creation_utils.py:from_pretrained strictly validated config.hf_access_token (which is populated only from the HF_TOKEN environment variable) and did not check the Hugging Face CLI cache.

Fix

Updated from_pretrained to fall back to huggingface_hub.get_token() to retrieve the token from the CLI cache if config.hf_access_token is not set. The retrieved token is also passed to the to_maxtext subprocess via the HF_TOKEN environment variable so that the subprocess can authenticate.

This makes the on-the-fly conversion during training work seamlessly after running hf auth login without needing to manually export HF_TOKEN.
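
For reviewers, the fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names resolve_hf_token, subprocess_env, and _cli_cached_token are hypothetical, and only huggingface_hub.get_token() and the HF_TOKEN variable come from the change itself. The cache lookup is injectable so the logic can be exercised without a real login.

```python
import os
from typing import Callable, Dict, Optional


def _cli_cached_token() -> Optional[str]:
  """Return the token saved by `hf auth login`, or None if unavailable."""
  try:
    from huggingface_hub import get_token  # pylint: disable=import-outside-toplevel
  except ImportError:
    # Assumption for this sketch only: treat a missing package as "no token".
    return None
  return get_token()


def resolve_hf_token(
    config_token: Optional[str],
    cache_lookup: Callable[[], Optional[str]] = _cli_cached_token,
) -> str:
  """Prefer the configured token, then fall back to the CLI cache (hypothetical helper)."""
  token = config_token or cache_lookup()
  if not token:
    raise ValueError("hf_access_token must be provided when not providing a pre-existing checkpoint")
  return token


def subprocess_env(token: str, base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
  """Copy the environment and set HF_TOKEN for a child process (hypothetical helper)."""
  env = dict(os.environ if base_env is None else base_env)
  env["HF_TOKEN"] = token
  return env
```

Under these assumptions, a caller would pass subprocess_env(resolve_hf_token(config.hf_access_token)) as the env argument when launching the conversion subprocess, so the child process sees the same token regardless of where it was found.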

Buganizer: https://b.corp.google.com/issues/528366193

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@Shuwen-Fang Shuwen-Fang force-pushed the hf_token_documentation branch from aac71a0 to 842dea1 Compare July 1, 2026 03:41
@Shuwen-Fang Shuwen-Fang changed the title from "Document HF_TOKEN export requirement for Hugging Face checkpoint download" to "Fix HF token validation to support Hugging Face CLI cache" Jul 1, 2026
@codecov

codecov Bot commented Jul 1, 2026

Codecov Report

❌ Patch coverage is 83.33333% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/maxtext/utils/model_creation_utils.py | 83.33% | 0 Missing and 1 partial ⚠️ |

@Shuwen-Fang Shuwen-Fang self-assigned this Jul 1, 2026
# Try to convert checkpoint on the fly
if not config.hf_access_token:
raise ValueError("hf_access_token must be provided when not providing a pre-existing checkpoint")
from huggingface_hub import get_token # pylint: disable=import-outside-toplevel

Collaborator

Is there a reason to have this import here? Often AI agents are just lazy.

Collaborator Author

good catch, updated

- Fall back to huggingface_hub.get_token() if config.hf_access_token is missing.
- Pass the fallback token to the checkpoint conversion subprocess.
- Add unit tests to verify authentication behavior.
@Shuwen-Fang Shuwen-Fang force-pushed the hf_token_documentation branch 3 times, most recently from 06149ac to 0d6925a Compare July 1, 2026 21:12
2 participants