AI Primer
DistCA claims 1.35x long-context training gains with disaggregated core attention