Artur B. Carneiro

AI researcher at Stanford. Working on efficient reasoning, mechanistic interpretability, and making large models smaller and faster.

Writing

Scaling Sparse Attention for Long-Context Reasoning (Mar 2026)
On the Geometry of Representation Collapse in LLMs (Jan 2026)
Why Reward Models Plateau — and What To Do About It (Nov 2025)
A Primer on Mechanistic Interpretability (Aug 2025)
Notes on Diffusion Models as World Simulators (May 2025)
View all →

Projects & Papers

Sparse-Attn
Efficient sparse attention kernels for transformer inference on consumer GPUs.
GitHub
InterpLens
Toolkit for visualizing and probing internal representations of language models.
GitHub
RLBench
Lightweight benchmarking suite for reward model evaluation and comparison.
Paper
TokenFlow
Real-time tokenization playground with BPE visualization and analysis.
Demo