Ollama says its cloud now runs Kimi K2.5 and GLM-5 on NVIDIA B300 hardware while keeping fixed $0, $20, and $100 plans. Try it if you want hosted open models with more predictable spend for always-on agent workloads.

The concrete change is infrastructure. Ollama says its cloud now serves Kimi K2.5 and GLM-5 on NVIDIA B300 data center hardware, and the company ties that directly to better serving characteristics: higher throughput, lower latency, and "reliable tool calls" for integrations (hardware update). It also says the upgrade works through Ollama's launch command and with its claimed base of more than 45,000 GitHub integrations (hardware update).
Ollama's follow-up post makes the intended workflow more explicit. In the OpenClaw thread, the company says Kimi K2.5 is "one of the most recommended models" for OpenClaw, adds that the model is now "much faster and more reliable" on the refreshed cloud stack, and says OpenClaw gets web search by default.
The other operational detail is pricing. Ollama says its Cloud uses fixed subscription rates of $0, $20, and $100 instead of metered overages, and positions that as protection against runaway spend when Claude Code or OpenClaw is left running for long sessions (pricing post). The linked pricing page also says higher-usage plans are coming later.
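To make the fixed-versus-metered trade-off concrete, here is a minimal sketch of the arithmetic. Only the $0, $20, and $100 plan prices come from Ollama's post. The per-million-token rate is a hypothetical placeholder, not an Ollama price or any provider's price, and the sketch ignores whatever usage limits the fixed plans carry, which the post does not detail.

```python
# Fixed monthly plan prices quoted in Ollama's pricing post (USD).
FIXED_PLANS_USD = (0, 20, 100)

# HYPOTHETICAL metered rate in USD per million tokens.
# Illustrative only; this is not a price published by Ollama or anyone else.
HYPOTHETICAL_RATE = 2.0


def metered_cost(tokens_millions: float, usd_per_million: float) -> float:
    """Bill under a purely metered plan: usage times a per-unit rate."""
    return tokens_millions * usd_per_million


def break_even_tokens_millions(plan_usd: float, usd_per_million: float) -> float:
    """Usage (in millions of tokens) at which a metered bill equals the fixed price."""
    if usd_per_million <= 0:
        raise ValueError("rate must be positive")
    return plan_usd / usd_per_million


if __name__ == "__main__":
    for plan in FIXED_PLANS_USD:
        usage = break_even_tokens_millions(plan, HYPOTHETICAL_RATE)
        print(f"${plan} plan matches a metered bill at {usage} million tokens")
```

Under these assumptions, a metered bill keeps growing past the break-even point for as long as an agent stays running, while the subscription price stays where it is. That is the "spend envelope" argument in its simplest form.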
For engineers, that makes this less a model launch than a packaging change: hosted access to Kimi K2.5 and GLM-5 now comes with a hardware refresh and a clearer spend envelope. Ollama is explicitly pitching that combination at agent-style usage, where latency, tool reliability, and predictable billing matter more than raw benchmark claims (OpenClaw thread).
Ollama's cloud is updated to use NVIDIA's latest data center hardware: B300 for Kimi K2.5 and GLM-5 models. This significantly improves the model performance with faster throughput and lower latency while maintaining reliable tool calls for integrations. All this works with ...
Predictable costs (ollama.com/pricing): Ollama's Cloud comes with fixed subscription rates at $0, $20, and $100. This means you won't wake up to surprise overage bills if you leave Claude Code or OpenClaw running. In the future, we are adding additional plans for higher ...