Baseten Deprecates DeepSeek v3 0324 and GLM 4.6 Model APIs on May 1
Baseten has announced the deprecation of two model APIs while simultaneously launching discounted cache token pricing for all remaining model endpoints.
Baseten will retire the DeepSeek v3 0324 and GLM 4.6 Model APIs at 5pm PT on May 1st, 2026. The deprecation was announced via the company's changelog on April 17th, giving developers roughly two weeks' notice before the cutoff. Any applications still routing requests through these endpoints will need to migrate before the deadline to avoid service disruption.

On the pricing side, Baseten introduced cache token pricing for Model APIs effective April 16th, 2026. Cached input tokens now attract a discounted billing rate across all models, with the sole exception of GPT-OSS. The discounted rate applies automatically to the relevant portion of each request, requiring no manual configuration from users.

A third changelog entry, published April 6th, added a quality-of-life improvement to Baseten's logs viewer. Users can now copy logs directly to their clipboard or download them as CSV or JSON files via a new export menu positioned next to the search box.
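For applications that need to migrate off the retiring endpoints, the change is typically limited to the model identifier in the request payload, since Baseten's Model APIs follow an OpenAI-style chat completions format. The sketch below illustrates this under stated assumptions: the model slugs and the replacement model are hypothetical examples, not confirmed Baseten identifiers, so check Baseten's model library for the actual values.

```python
# Minimal migration sketch for an OpenAI-style chat completions payload.
# The model slugs here are illustrative assumptions, not confirmed
# Baseten identifiers -- consult Baseten's model library before migrating.

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Before the May 1st cutoff: a request pinned to a deprecated model
# (hypothetical slug for DeepSeek v3 0324).
old_request = build_chat_request("deepseek-v3-0324", "Summarize this log file.")

# After migration: the same payload pointed at a still-supported model
# (replacement slug is an assumption; pick one from Baseten's catalog).
new_request = {**old_request, "model": "deepseek-r1"}
```

Because only the `model` field changes, prompts, message history, and any response-handling code carry over unchanged.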