Is EAGLE-3 Speculative Decoding Lossless?
Non-greedy acceptance, residual sampling, and why q matters.
Here are some of the things I've worked on in the past: projects, short papers, write-ups, and so on.
Clipping attention logits for MLA with shared rotary keys.
Approximating the serial pre-norm block.
On singular token spaces.
Small but faithful subsets of large point sets.
Play chess alone or with a friend.
Communication deficiencies in optimization.