The toolkit sweeps contiguous layer ranges in GGUF and llama.cpp-style setups to test whether duplicating them can unlock better reasoning without retraining. Treat the jump as a reproducible experiment, not a settled mechanism: thread commenters dispute whether the effect reflects genuine circuits, routing behavior, or training artifacts.

Toolkit replicating Ng's RYS method to find and exploit reasoning circuits in LLMs by duplicating specific contiguous layer blocks during inference, boosting performance without training. Examples: duplicating layers 7-9 in Qwen2.5-32B improves reasoning by 17%; duplicating layers 12-14 in Devstral-24B raises logical deduction from 0.22 to 0.76 on BBH. Includes sweep.py for circuit discovery, layer_path.py for path modification, and evaluation scripts. Requires llama.cpp and GGUF models; tested on Mistral/Qwen architectures.
The release is a small research toolkit around inference-time model surgery. The toolkit page says it replicates Ng's RYS method by duplicating specific contiguous layer blocks during inference, working with llama.cpp and GGUF models and tested on Mistral- and Qwen-family architectures.
That makes the engineering contribution concrete: instead of retraining or merging checkpoints, the workflow searches for layer spans that are worth repeating at runtime. The bundled tools cover three steps — search, path modification, and eval — via sweep.py, layer_path.py, and evaluation scripts, as described on the repo page and summarized in the Show HN post.
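The search step is simple to picture. A minimal sketch of the idea, with hypothetical helper names (the actual sweep.py and layer_path.py APIs may differ): build a layer execution order that repeats one contiguous span, then score every span of a given width and rank the results.

```python
def duplicated_path(n_layers, start, end):
    """Build a layer execution order that visits the span [start, end] twice.

    The duplicated span is inserted immediately after its original position,
    e.g. n_layers=12, start=7, end=9 -> 0..9, 7, 8, 9, 10, 11.
    """
    path = list(range(n_layers))
    return path[: end + 1] + list(range(start, end + 1)) + path[end + 1 :]


def sweep(n_layers, span_len, score):
    """Try every contiguous span of span_len layers, ranked best-first.

    `score` is a placeholder for a benchmark evaluation (e.g. BBH accuracy
    with the model run along the modified path); it takes a layer path and
    returns a number.
    """
    results = []
    for start in range(n_layers - span_len + 1):
        end = start + span_len - 1
        path = duplicated_path(n_layers, start, end)
        results.append((score(path), (start, end), path))
    return sorted(results, key=lambda r: r[0], reverse=True)
```

The expensive part in practice is the `score` call, which requires a full benchmark run per candidate span; the path construction itself is trivial.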
Posted by xlayn
Useful as a concrete example of inference-time model surgery: duplicating selected contiguous layers in GGUF/llama.cpp, sweeping layer ranges, and checking benchmark impacts. The discussion also surfaces the main engineering questions: whether the effect is a genuine circuit phenomenon, a routing/looping artifact, or a training-time/architecture interaction.
The headline result is large enough to get attention. In the HN summary, the author reports that duplicating three layers in a 24B model pushed logical deduction from 0.22 to 0.76, and the linked repo page gives a second example where duplicating layers 7-9 in Qwen2.5-32B improved reasoning by 17%.
Thread discussion highlights:

- 4bpp (skepticism about mechanism): the proposed explanation "does not pass the smell test"; suggests duplicated layers may be near-identity blocks, or that training/RLHF degraded reasoning in a way duplication partially undoes.
- Lerc (looping/router interpretation): argues the result looks more like a higher-level MoE-style routing problem, with the model's thinking phase made into a looping layer and a router choosing patterns like 13,13,14,14,15,15,16.
- woadwarrior01 (prior art): compares the work to Solar 10.7B and its depth up-scaling technique, noting that the earlier approach repeated layers during continued training.
But the mechanism is very much in dispute. One commenter says the proposed explanation "does not pass the smell test," arguing duplicated layers may be "near-identity blocks" or may undo reasoning damage introduced during training or RLHF. Another suggests the pattern looks more like a routing or looping effect — "a higher-level MoE-style routing problem" with paths such as 13,13,14,14,15,15,16 — rather than evidence of a clean circuit interpretation. A third comparison points to Solar 10.7B depth up-scaling, where repeated layers appeared during continued training, which makes this look more like an inference-time variant of a known idea than a wholly new primitive.
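The looping interpretation is easy to state concretely. A minimal sketch of the interleaved pattern the commenter describes (the function name and parameters are illustrative, not from the toolkit): visit each layer in a span twice before moving on, ending on the span's final layer once.

```python
def looped_path(start, end, repeats=2):
    """Visit each layer in [start, end) `repeats` times, then layer `end` once.

    looped_path(13, 16) reproduces the pattern quoted in the thread:
    13, 13, 14, 14, 15, 15, 16.
    """
    path = []
    for layer in range(start, end):
        path.extend([layer] * repeats)
    path.append(end)
    return path
```

Under this reading, a block duplicated wholesale (7, 8, 9, 7, 8, 9) is just one fixed point in a larger space of repetition schedules a learned router might choose dynamically.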
release: OpenClaw shipped version 2026.3.22 with ClawHub, OpenShell plus SSH sandboxes, side-question flows, and more search and model options, then followed with a 2026.3.23 patch. Teams get a broader plugin surface, but should patch quickly and review plugin trust boundaries as the ecosystem grows.
release: Cursor shipped Instant Grep, a local regex index built from n-grams, inverted indexes, and Bloom filters that drops large-repo searches from seconds to milliseconds. Faster candidate retrieval shortens the coding-agent loop, especially when ripgrep-style scans become the bottleneck.
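The core trick behind such indexes is easy to sketch. A toy trigram inverted index in Python (class and method names are illustrative; a production system like Cursor's also layers Bloom filters and extracts literals from full regexes): any file containing a query literal must contain every trigram of that literal, so the index cheaply narrows the set of files that need an exact scan.

```python
from collections import defaultdict


def trigrams(s):
    """All 3-character substrings of s."""
    return {s[i : i + 3] for i in range(len(s) - 2)}


class TrigramIndex:
    """Minimal trigram inverted index: maps each 3-gram to files containing it."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.files = {}

    def add(self, name, text):
        self.files[name] = text
        for gram in trigrams(text):
            self.postings[gram].add(name)

    def candidates(self, literal):
        """Files containing every trigram of `literal` — a superset of true matches."""
        grams = trigrams(literal)
        if not grams:  # query too short to filter; fall back to all files
            return set(self.files)
        return set.intersection(*(self.postings.get(g, set()) for g in grams))

    def search(self, literal):
        # Only candidate files are scanned exactly; this is the step the
        # index accelerates on large repos.
        return {n for n in self.candidates(literal) if literal in self.files[n]}
```

The index never produces false negatives for literal queries, only false positives, which the final scan filters out.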
breaking: ChatGPT now saves uploaded and generated files into an account-level Library that can be reused across conversations from the web sidebar or recent-files picker. It removes repetitive re-uploading and makes past PDFs, spreadsheets, and images part of a persistent working context.
breaking: Epoch AI says GPT-5.4 Pro elicited a publishable solution to one 2019 conjecture in its FrontierMath Open Problems set, with a formal writeup planned. Treat it as an early milestone worth reproducing, not blanket evidence that frontier models can already automate math research.