👋 Hi, all!

I am Shichao Song, a third-year PhD student. My current research focuses on foundation models. I blog about my research and life.

Reinforcement Learning illustration (by [Google Gemini](https://gemini.google.com/))

Fully Annotated Guide to "A (Long) Peek into Reinforcement Learning"

This is a fully annotated guide to Lilian Weng’s post "A (Long) Peek into Reinforcement Learning".

Multi-armed bandit (by [ChatGPT Images 2.0](https://openai.com/index/introducing-chatgpt-images-2-0/))

Fully Annotated Guide to "The Multi-Armed Bandit Problem and Its Solutions"

The multi-armed bandit problem is a classic exploration–exploitation dilemma in reinforcement learning. Lilian Weng’s post is an excellent introduction, but some mathematical details and motivations can be cryptic. This article annotates it with step-by-step explanations and supplementary notes.

The Ouroboros Process (by [Google Gemini](https://gemini.google.com/))

Product Requirements Document of Ouroboros

An agentic DOM workspace where an LLM has full read/write/delete privileges over its own source code and visual interface.

The Great Decoupling (by [Google Gemini](https://gemini.google.com/))

Everything is Fleeting: A Roadmap to the Post-Labor AI Economy

A speculative roadmap of how AI will dismantle the labor market and what comes after.

The fast-moving AI era and the warmth of the human heart (by [Google Gemini](https://gemini.google.com/))

What I Know So Far

A brief discussion of AI and people, from both a technical and a humanistic angle.