Sparse Autoencoders (SAEs) and Cross-Layer Transcoders (CLTs) are two approaches to interpreting the internals of transformer models. A look at what each is good for and how they differ; a minimal sketch of the two architectures follows.
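
A minimal sketch (not taken from the article) of how the two objectives differ, assuming PyTorch; the dimension names and the exact CLT wiring are simplified and illustrative:

```python
import torch.nn as nn
import torch.nn.functional as F


class SparseAutoencoder(nn.Module):
    """Reconstructs activations at ONE site (e.g. one residual stream),
    with an L1 penalty pushing the hidden features toward sparsity."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.enc = nn.Linear(d_model, d_hidden)
        self.dec = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        f = F.relu(self.enc(x))  # sparse feature activations
        return self.dec(f), f    # reconstruction of the SAME activations


class CrossLayerTranscoder(nn.Module):
    """Reads the MLP input at one layer and predicts the MLP OUTPUTS of
    that layer and the layers after it, so a single feature set can
    explain computation that is spread across layers."""

    def __init__(self, d_model: int, d_hidden: int, n_out_layers: int):
        super().__init__()
        self.enc = nn.Linear(d_model, d_hidden)
        self.decs = nn.ModuleList(
            nn.Linear(d_hidden, d_model) for _ in range(n_out_layers)
        )

    def forward(self, x):
        f = F.relu(self.enc(x))
        return [dec(f) for dec in self.decs], f  # one prediction per layer


def sae_loss(recon, target, feats, l1_coef: float = 1e-3):
    # Reconstruction error plus the sparsity penalty on feature activations.
    return F.mse_loss(recon, target) + l1_coef * feats.abs().mean()
```

The key contrast: an SAE is a dictionary for activations at a single point, while a transcoder replaces a computation (input to output), and the cross-layer variant lets one feature write to several downstream layers at once.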
A brief history of LLM Scaling Laws, from compute-optimal training and inference to scaling test-time compute, and whether Scaling Laws are coming to an end.
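
As one concrete anchor for the compute-optimal era, the parametric loss fitted by Hoffmann et al. (2022, "Chinchilla"); the constants $E, A, B, \alpha, \beta$ are fit empirically:

```latex
% N = parameters, D = training tokens, C \approx 6ND = training FLOPs.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}

% Compute-optimal training minimizes loss under a fixed compute budget:
\min_{N, D} \; L(N, D) \quad \text{s.t.} \quad 6ND = C

% which yields power-law optima
N^{*}(C) \propto C^{a}, \qquad D^{*}(C) \propto C^{b}, \qquad a \approx b \approx 0.5
```

The fitted exponents $a \approx b \approx 0.5$ are what motivated scaling parameters and data roughly equally, rather than the parameter-heavy regime suggested by earlier scaling-law work.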
An overview of the motivations and techniques for generating synthetic data for LLM post-training, as seen in the Llama 3.1, AFM, Qwen2, and Hunyuan-Large papers.
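
A hedged sketch of one recipe that recurs across these reports, rejection sampling: draw several candidate responses per prompt, score them with a reward model, and keep only the best. The `generate` and `reward` callables here are hypothetical stand-ins for real model calls:

```python
from typing import Callable


def rejection_sample(
    prompts: list[str],
    generate: Callable[[str], str],       # policy model: prompt -> response
    reward: Callable[[str, str], float],  # reward model: (prompt, response) -> score
    k: int = 8,
    threshold: float = 0.5,
) -> list[dict]:
    """Build a fine-tuning dataset from the best-of-k responses per prompt."""
    dataset = []
    for prompt in prompts:
        # Sample k candidates, then keep the highest-scoring one.
        candidates = [generate(prompt) for _ in range(k)]
        best = max(candidates, key=lambda r: reward(prompt, r))
        # Drop prompts where even the best candidate scores poorly.
        if reward(prompt, best) >= threshold:
            dataset.append({"prompt": prompt, "response": best})
    return dataset
```

The resulting prompt-response pairs feed supervised fine-tuning or preference-pair construction; the papers differ mainly in how candidates are generated, scored, and filtered.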