Hadamard and Monarch: Compressing GPT-2 Small
Hadamard rotations for low-bit quantization, and Monarch matrices for structured compression of GPT-2 small.
Hadamard rotations for low-bit quantization, and Monarch matrices for structured compression of GPT-2 small.
In the age of machines, what is still unsolved?
A compact walkthrough of bottlenecks, latent spaces, and reconstruction.
Non-greedy acceptance, residual sampling, and why q matters.
Clipping attention logits for MLA with shared rotary keys.
Approximating the serial pre-norm block.
On singular token spaces.
Small but faithful subsets of large point sets.
Play chess alone or with a friend.
Communication deficiencies in optimization.