Miles added ROCm support for AMD Instinct clusters and reported GRPO post-training gains on Qwen3-30B-A3B, including AIME rising from 0.665 to 0.729. It matters if you are evaluating rollout-heavy RL jobs off NVIDIA and want concrete throughput and step-time numbers before porting.

Miles has added ROCm support for large-scale RL post-training on AMD Instinct systems, which LMSYS's blog post describes as an end-to-end pipeline for MI300- and MI350-class clusters. The release matters because Miles is not just a trainer: the architecture diagram in LMSYS's thread shows rollout generation and policy optimization split across separate components, coordinated by a scheduler and tied together with Megatron and SGLang.
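The rollout/optimizer split can be pictured as a producer-consumer loop. This is only a toy sketch of the pattern (a queue standing in for Miles's scheduler, and placeholder sampling and reward logic), not Miles's actual implementation:

```python
import queue
import threading

# Hypothetical sketch: a rollout worker (SGLang-style inference) fills a
# queue with sampled trajectories; the policy optimizer (Megatron-style
# trainer) drains it. In Miles a scheduler coordinates many of each
# across the cluster; here one thread of each suffices.
rollouts = queue.Queue(maxsize=8)

def rollout_worker(prompts):
    for p in prompts:
        # Stand-in for sampling a response from the current policy
        # and scoring it; real rewards come from task verifiers.
        rollouts.put({"prompt": p, "response": p[::-1], "reward": len(p)})
    rollouts.put(None)  # sentinel: no more rollouts this step

def trainer():
    batch = []
    while (item := rollouts.get()) is not None:
        batch.append(item)  # in an RL trainer this would feed a policy update
    return batch

t = threading.Thread(target=rollout_worker, args=(["ab", "cde"],))
t.start()
batch = trainer()
t.join()
```

Decoupling the two sides is what lets rollout throughput (the reported bottleneck) scale independently of optimizer step time.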
The implementation details are practical. LMSYS's repo announcement says Miles is open-sourced via the Miles GitHub repo, and the blog summary says deployment is packaged through prebuilt Docker containers for MI300X and MI350X/355X, with ROCm validated end to end. That framing fits the project's pitch that rollout generation "dominates RL compute" on these jobs, making AMD's HBM bandwidth the hardware angle behind the port rather than a generic accelerator-expansion announcement.
The main reported quality gain is on Qwen3-30B-A3B with GRPO, where LMSYS's results thread says AIME rose from 0.665 to 0.729 during training. On the systems side, the same thread reports MI300X rollout throughput of roughly 1.1-1.3k tokens/GPU/s and a mean step time of 388.5 seconds on a single 8-GPU node, using 32x8 sampling with an 8k-token response cap.
That makes this more of a reproducible infrastructure datapoint than a vague hardware claim. The blog summary says Miles also validated multi-turn agentic training on ROCm, and LMSYS's Trainium and Inferentia post places the release in a wider push to run the same serving and rollout stack across AMD GPUs and AWS Trainium/Inferentia rather than keeping SGLang tied to one silicon path.
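For readers new to GRPO: it scores each sampled response relative to the other responses for the same prompt, rather than against a learned value baseline. A minimal sketch of the group-normalized advantage, assuming the reported 32x8 sampling means 32 prompts with 8 responses each (the exact reward shaping Miles uses is not specified here):

```python
import statistics

def grpo_advantages(group_rewards, eps=1e-6):
    """Group-relative advantages as in GRPO: normalize each response's
    reward against the mean and std of its own sampling group."""
    mean = statistics.fmean(group_rewards)
    std = statistics.pstdev(group_rewards)
    return [(r - mean) / (std + eps) for r in group_rewards]

# One prompt group of 8 sampled responses; a 32x8 step would have
# 32 such groups. Rewards here are toy pass/fail scores.
rewards = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0]
advs = grpo_advantages(rewards)
```

Because the baseline is the group mean, the advantages in each group sum to zero, which is why GRPO needs many rollouts per prompt and why rollout generation dominates the compute budget.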
A pure C and Metal engine streams 209GB of MoE weights from SSD and reports tool-calling support in 4-bit mode on a laptop-class Mac. It is a concrete benchmark for teams exploring expert streaming, quantization, and page-cache tricks on consumer hardware.
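The engine itself is C and Metal, but the core SSD-streaming idea is just memory-mapping the checkpoint and slicing out one expert's weights on demand, letting the OS page cache keep hot experts resident. A toy Python sketch of that pattern, with a hypothetical flat expert layout and a tiny stand-in file (not the project's actual format):

```python
import mmap
import os
import tempfile

# Hypothetical layout: experts stored back-to-back in one weight file;
# each expert is a fixed-size blob we map and read on demand.
EXPERT_BYTES = 4096  # toy size; a real 4-bit expert shard is far larger

def load_expert(mm, idx):
    """Slice one expert's weights out of the memory map. Pages are
    faulted in from SSD on first touch; the OS page cache serves
    repeated loads of hot experts without further disk reads."""
    off = idx * EXPERT_BYTES
    return mm[off:off + EXPERT_BYTES]

# Demo with a temporary file standing in for the 209GB checkpoint.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(bytes(EXPERT_BYTES * 4))  # 4 toy "experts"
    path = f.name

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    w = load_expert(mm, 2)
    mm.close()
os.unlink(path)
```

Since only the active experts of an MoE layer are touched per token, the working set can stay far below the full 209GB, which is what makes laptop-class RAM viable.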
OpenClaw shipped version 2026.3.22 with ClawHub, OpenShell plus SSH sandboxes, side-question flows, and more search and model options, then followed with a 2026.3.23 patch. Teams get a broader plugin surface, but should patch quickly and review plugin trust boundaries as the ecosystem grows.
Cursor shipped Instant Grep, a local regex index built from n-grams, inverted indexes, and Bloom filters that drops large-repo searches from seconds to milliseconds. Faster candidate retrieval shortens the coding-agent loop, especially when ripgrep-style scans become the bottleneck.
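Cursor has not published implementation details, but the classic version of this technique is a trigram inverted index: a file can only match a literal query if it contains every trigram of that literal, so the index shrinks the set of files a regex engine must actually scan. A toy sketch of that principle (Bloom filters, which Cursor also mentions, would further compress the per-file trigram sets):

```python
from collections import defaultdict

def trigrams(s):
    """All 3-character substrings of s."""
    return {s[i:i + 3] for i in range(len(s) - 2)}

class TrigramIndex:
    """Toy inverted index mapping trigram -> set of file ids. A literal
    query can only match files containing all of its trigrams, so the
    index returns a small candidate set instead of every file."""
    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, file_id, text):
        for g in trigrams(text):
            self.postings[g].add(file_id)

    def candidates(self, literal):
        grams = trigrams(literal)
        if not grams:
            return None  # query too short to filter: full scan needed
        sets = [self.postings.get(g, set()) for g in grams]
        return set.intersection(*sets) if all(sets) else set()

idx = TrigramIndex()
idx.add(1, "fn instant_grep(query: &str)")
idx.add(2, "def slow_path(): pass")
cands = idx.candidates("instant")  # only file 1 has all trigrams
```

The regex engine then runs only over the candidate files, which is where the seconds-to-milliseconds drop comes from on large repos.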
ChatGPT now saves uploaded and generated files into an account-level Library that can be reused across conversations from the web sidebar or recent-files picker. It removes repetitive re-uploading and makes past PDFs, spreadsheets, and images part of a persistent working context.
Epoch AI says GPT-5.4 Pro elicited a publishable solution to one 2019 conjecture in its FrontierMath Open Problems set, with a formal writeup planned. Treat it as an early milestone worth reproducing, not blanket evidence that frontier models can already automate math research.