- Version: 1.0
- Demo Page: View Demo
- Product Type: Single-File, Self-Modifying HTML Application
- Core Concept: An agentic DOM workspace where an LLM has full read/write/delete privileges over its own source code and visual interface.
1. Executive Summary
Project Ouroboros is a standalone .html file that acts as a boundless, draggable workspace (“Infinite Canvas”). It contains an integrated LLM loop (via OpenAI) that reads the current state of the document’s DOM as its context.
Instead of a conversational chat interface, Ouroboros operates on a state-transition model: The LLM takes the current state of the HTML file (State A) and generates executable JavaScript to mutate it into a new state (State B). This allows the application to create new tools, optimize existing code, or restructure its own interface on the fly.
2. Core Architecture & The “Context Loop”
The fundamental engine of this application relies on a continuous feedback loop between the DOM and the LLM. Crucially, everything in the HTML file is included in the LLM context.
Read State: Upon a user query, the application captures its entire current state (the full DOM).
- KV Cache Optimization: To maximize Key-Value (KV) Cache hits on the LLM side, all static content (libraries, core scripts, base CSS) remains at the top of the file structure. We do not strip out static CDN links; they are part of the context.
Construct Payload: The app combines the user’s prompt with the current DOM snapshot.
- System Prompt: We do not use the API’s “system prompt” feature. Instead, the System Prompt is hardcoded directly into the HTML file itself (e.g., as a hidden element or comment block), ensuring it is always part of the read context.
API Call: The payload is sent to the OpenAI API using a CDN-imported OpenAI ESM package (no bundlers required).
Execute Mutation: The app extracts the JavaScript from the LLM’s response and executes it via dynamic `<script>` tag injection.
Render: The DOM updates immediately, introducing the new Widget, feature, or optimization.
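The loop above can be sketched as two small functions. This is a minimal illustration under stated assumptions, not the project's actual implementation: `buildMessages`, `requestMutation`, and the `USER REQUEST:` label are hypothetical, `client` is assumed to be an OpenAI SDK instance exposing `chat.completions.create`, and `model` is left for the caller to supply. The snapshot and the prompt travel in a single `user` message because the API's system role is not used, and the snapshot comes first so that the static top of the file stays a stable prefix.

```javascript
// Combine the DOM snapshot and the user's request into one user message.
// The snapshot is placed first so the unchanged top of the file forms a stable prefix.
// The "USER REQUEST:" separator is an illustrative choice, not part of the original design.
function buildMessages(domSnapshot, userPrompt) {
  return [
    {
      role: "user",
      content: `${domSnapshot}\n\nUSER REQUEST:\n${userPrompt}`,
    },
  ];
}

// One turn of the loop: snapshot -> payload -> API call -> raw model reply.
// `client` is assumed to be an OpenAI SDK instance; `model` is chosen by the caller.
async function requestMutation(client, model, domSnapshot, userPrompt) {
  const completion = await client.chat.completions.create({
    model,
    messages: buildMessages(domSnapshot, userPrompt),
  });
  return completion.choices[0].message.content;
}
```

In a browser, the snapshot argument could plausibly be `document.documentElement.outerHTML`, which serializes the live DOM including the embedded system prompt.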
3. User Interface & Experience
The UI follows a “Window Manager” paradigm on an Infinite Canvas.
- Infinite Canvas: The base `<body>` acts as a boundless desktop environment.
- Widgets (Windows): Every functional element (terminal, tools, logs) is a self-contained “Widget”.
- Window Mechanics: All Widgets must be absolutely positioned, draggable (by a header), resizable, and closeable.
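As an illustration of the dragging requirement, here is a minimal vanilla-JS sketch built on pointer events. The names `dragPosition` and `makeDraggable` are hypothetical, and the sketch assumes the widget is already absolutely positioned; it does not cover resizing or closing.

```javascript
// Pure helper: new top-left corner, given where the drag started and where the pointer is now.
function dragPosition(origin, pointerStart, pointerNow) {
  return {
    left: origin.left + (pointerNow.x - pointerStart.x),
    top: origin.top + (pointerNow.y - pointerStart.y),
  };
}

// Wire pointer events so that dragging `header` moves the absolutely positioned `widget`.
function makeDraggable(widget, header) {
  let drag = null; // non-null only while a drag is in progress

  header.addEventListener("pointerdown", (event) => {
    drag = {
      origin: { left: widget.offsetLeft, top: widget.offsetTop },
      pointerStart: { x: event.clientX, y: event.clientY },
    };
    // Keep receiving move events even when the pointer leaves the header.
    header.setPointerCapture(event.pointerId);
  });

  header.addEventListener("pointermove", (event) => {
    if (!drag) return;
    const next = dragPosition(drag.origin, drag.pointerStart, {
      x: event.clientX,
      y: event.clientY,
    });
    widget.style.left = `${next.left}px`;
    widget.style.top = `${next.top}px`;
  });

  header.addEventListener("pointerup", () => {
    drag = null;
  });
}
```

Keeping the position arithmetic in a separate pure function makes it easy to check without a browser. Resizing could be handled by a library such as interact.js or, in simple cases, by the CSS `resize` property.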
Default Widgets (Genesis State)
| Widget Name | Description | Core Functionality |
|---|---|---|
| Terminal / Prompt Box | The primary user interface. | A text area for the user to issue commands (e.g., “Build me a currency converter”). |
| Activity Log | The history of mutations. | Displays past user queries and a brief summary of what the LLM executed. |
| Token Monitor | The resource gauge. | Tracks estimated token count of the current DOM. Warns the user when approaching context limits. |
| Settings / Auth | The access gate. | A secure input to store the OpenAI API Key in the browser’s localStorage (since we cannot hardcode it in a shareable file). |
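The Token Monitor's estimate could be approximated without a tokenizer, as in the sketch below. The function names are hypothetical, and the default of four characters per token is a common rule of thumb for English text rather than an exact count; HTML and code may tokenize differently. The context limit depends on the model, so it is left as a parameter.

```javascript
// Rough token estimate from character count.
// Assumption: about 4 characters per token; this is a heuristic, not a real tokenizer.
function estimateTokens(text, charsPerToken = 4) {
  return Math.ceil(text.length / charsPerToken);
}

// Report estimated usage against a context limit supplied by the caller.
function contextUsage(text, contextLimit) {
  const tokens = estimateTokens(text);
  return { tokens, ratio: tokens / contextLimit };
}
```

A widget could call `contextUsage(document.documentElement.outerHTML, limit)` after each mutation and display the ratio as a gauge.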
4. Technical Specifications
Because this must be a single file, we rely heavily on modern browser APIs and external CDNs.
- Styling: Tailwind CSS (via CDN script) for rapid, inline styling that the LLM can easily read and modify without needing a separate stylesheet.
- Interaction (Drag/Drop): `interact.js` (via CDN) or lightweight custom vanilla JS to handle Widget dragging and resizing efficiently.
- LLM Integration: CDN-imported OpenAI ESM package (e.g., via `esm.sh` or `skypack`). This avoids complex build steps while providing a cleaner API surface than raw `fetch()`.
- Execution Sandbox:
- The LLM’s response will be parsed for `javascript` code blocks.
- The code is injected into the DOM as a new `<script>` element to execute, then immediately removed to keep the DOM clean.
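The two Execution Sandbox steps can be sketched as follows. This is a minimal illustration, not the project's actual code: `extractJavaScriptBlocks` and `runSnippet` are hypothetical names, and accepting the shorter `js` tag alongside `javascript` is an assumption. The `doc` parameter exists only so the injection step can be exercised with a stand-in object outside a browser.

```javascript
// Three backticks, built indirectly so the example itself stays easy to embed.
const FENCE = "`".repeat(3);

// Return the bodies of all fenced blocks tagged `javascript` or `js`.
function extractJavaScriptBlocks(responseText) {
  const pattern = new RegExp(
    `${FENCE}(?:javascript|js)[ \\t]*\\r?\\n([\\s\\S]*?)${FENCE}`,
    "g"
  );
  return Array.from(responseText.matchAll(pattern), (match) => match[1].trim());
}

// Run one snippet by injecting a <script> element and removing it right after.
// `doc` defaults to the page's document; a stand-in can be passed for testing.
function runSnippet(code, doc = document) {
  const script = doc.createElement("script");
  script.textContent = code;
  doc.body.appendChild(script); // inline classic scripts run synchronously on insertion
  script.remove(); // the code has already run; removing the tag keeps the DOM clean
}
```

A caller would typically do `extractJavaScriptBlocks(reply).forEach((code) => runSnippet(code))`. Note that removing the `<script>` element does not undo anything the code did; it only keeps the tag itself out of the next DOM snapshot.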
5. Security & Risk Mitigation
This architecture carries unique risks that must be acknowledged and managed.
- Token Inflation (The “Bloat” Problem): If the LLM generates messy DOM elements, the context will hit the token limit rapidly.
- Mitigation: The “Token Monitor” Widget will track usage.
- Trigger: When context usage exceeds 75%, the LLM will be alerted to the high usage in its prompt.
- Strategy: While automatic pruning is an option, the preferred method is User-Directed Pruning. A button or command will allow the user to specify what to clean or refactor (e.g., “Summarize the logs”, “Remove the unused test widget”), giving the user control over their context window.
- Arbitrary Code Execution (XSS): The application relies on executing AI-generated code.
- Mitigation: Because this is a local, single-user tool, standard XSS is less of a threat (you are hacking yourself). However, the LLM must be strictly prompted not to execute malicious web requests.
- Destructive Edits: The LLM might accidentally delete the prompt box, rendering the app useless.
- Mitigation: The core app logic (the OpenAI wrapper and the Terminal Widget) will be wrapped in a specific `div` with an `id="ouroboros-core"`. The embedded system prompt will instruct the LLM to never delete or alter this specific node.
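The 75% trigger described under Token Inflation could look roughly like the sketch below. The function name `withUsageNotice` and the wording of the notice are hypothetical; only the 75% threshold comes from the text above. The token count and context limit are assumed to come from the Token Monitor.

```javascript
// Append a usage notice to the outgoing prompt once usage exceeds the threshold.
// Below or exactly at the threshold, the prompt is returned unchanged.
function withUsageNotice(userPrompt, usedTokens, contextLimit, threshold = 0.75) {
  const ratio = usedTokens / contextLimit;
  if (ratio <= threshold) return userPrompt;
  const percent = Math.round(ratio * 100);
  return `${userPrompt}\n\n[NOTICE: context usage is at ${percent}% of the limit.]`;
}
```

Because the notice only informs the LLM, this stays compatible with User-Directed Pruning: the user still decides what gets cleaned up.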
6. Appendix: The Embedded System Prompt
The following prompt is hardcoded directly into the application’s source (e.g., inside a <script type="text/plain" id="system-prompt"> tag) and is injected into every API call. It defines the LLM’s role, constraints, and operating procedures.
Citation
@article{shichaosong2026productrequirem,
title = {Product Requirements Document of Ouroboros},
author = {Shichao Song},
journal = {The Kiseki Log},
year = {2026},
month = {March},
url = {https://ki-seki.github.io/posts/260322-ouroboros/}
}