breakingMarch 14, 2026

Ollama updates cloud to NVIDIA B300 for Kimi K2.5 and GLM-5 on $0, $20, and $100 plans

Ollama says its cloud now runs Kimi K2.5 and GLM-5 on NVIDIA B300 hardware while keeping fixed $0, $20, and $100 plans. Try it if you want hosted open models with more predictable spend for always-on agent workloads.

LLM Serving GPU Infrastructure Cost Optimization

2 min read

Ollama updates cloud to NVIDIA B300 for Kimi K2.5 and GLM-5 on $0, $20, and $100 plans

TL;DR

Ollama says its cloud now runs Kimi K2.5 and GLM-5 on NVIDIA B300 hardware, with the company claiming "faster throughput and lower latency" while keeping tool calls reliable for integrations hardware update.
The rollout is aimed at hosted open-model use cases around agents and coding workflows: in Ollama's OpenClaw thread, Kimi K2.5 is framed as a recommended model for OpenClaw and the latest hardware is supposed to make it "much faster and more reliable."
Ollama is pairing the hardware refresh with fixed subscription tiers of $0, $20, and $100; according to pricing post, that means no usage-based overage bills if users leave Claude Code or OpenClaw running.

What changed in the cloud

ollama

@ollama

·Follow

Ollama's cloud is updated to use NVIDIA's latest data center hardware: B300 for Kimi K2.5 and GLM-5 models. This significantly improves the model performance with faster throughput and lower latency while maintaining reliable tool calls for integrations. All this works with Show more