One model, two roles: emergent specialization in a shared recurrent Transformer
May 2026
We published a new blog post on our AI-OWLS page summarising our recent arXiv preprint, co-authored with Jucheng Shen and Barbara Su, which asks whether a shared-weight recurrent Transformer can develop multiple internal roles on its own.