Multi-armed bandit (by [ChatGPT Images 2.0](https://openai.com/index/introducing-chatgpt-images-2-0/))

Fully Annotated Guide to "The Multi-Armed Bandit Problem and Its Solutions"

The multi-armed bandit problem is a classic exploration–exploitation dilemma in reinforcement learning. Lilian Weng’s post is an excellent introduction, but some of its mathematical details and motivations can be cryptic. This article annotates the post with step-by-step explanations and supplementary notes.

 · 16 min · 3235 words · Shichao Song
Diagram of the probability transition process in speculative sampling.

How is the Speculative Decoding Algorithm Constructed?

A simple mathematical derivation of how the algorithm in the paper “Fast Inference from Transformers via Speculative Decoding” is constructed.

 · 5 min · 884 words · Shichao Song
Diagram of the basic principle of diffusion models, showing recovery of an image from noise. Generated by [Google Nano Banana (gemini-2.5-flash-image-preview)](https://www.nano-banana.ai/)

Fully Annotated Guide to "What are Diffusion Models?"

Diffusion models are the de facto standard for image generation. Lilian Weng’s “What Are Diffusion Models?” is an excellent introduction to them, but readers without a solid mathematical background may struggle with it. This article fills that gap with clear, step‑by‑step derivations and explanations.

 · 74 min · 15730 words · Shichao Song