Fully Annotated Guide to "A (Long) Peek into Reinforcement Learning"
This is a fully annotated guide to Lilian Weng’s post A (Long) Peek into Reinforcement Learning.
The multi-armed bandit problem is a classic exploration–exploitation dilemma in reinforcement learning. Lilian Weng’s post is an excellent introduction, but some mathematical details and motivations can be cryptic. This article annotates it with step-by-step explanations and supplementary notes.
Diffusion models are the de facto standard for image generation. Lilian Weng’s “What Are Diffusion Models?” is an excellent introduction to them, but readers without a solid mathematical background may struggle with it. This article fills that gap with clear, step-by-step derivations and explanations.