Live AI Learning Progress
People do not just want a chatbot. They want to see whether an AI is actually improving. This page tracks Cronus's progress by date, training volume, pass rate, and safety gate, with honest comparison framing.
What Cronus is actually learning
Latest win
Latest struggle
Thought trail
Live activity stream
What the loop is doing
Public-safe live feed. It shows only task names, pass/fail signals, and learning focus; it never shows private files, secrets, prompts, credentials, or internal instructions.
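One common way to enforce a rule like this is an allowlist: copy only the fields that are safe to publish and drop everything else by default. The sketch below illustrates that idea only. The field names (`task_name`, `passed`, `learning_focus`) and the `to_public_event` helper are hypothetical, since the real feed schema is not shown on this page.

```python
# Hypothetical field names; the actual Cronus feed schema is not published here.
PUBLIC_FIELDS = ("task_name", "passed", "learning_focus")


def to_public_event(raw_event: dict) -> dict:
    """Copy only allowlisted fields; anything not listed is dropped by default."""
    return {key: raw_event[key] for key in PUBLIC_FIELDS if key in raw_event}


# Example internal event that mixes public and private fields (made-up values).
raw = {
    "task_name": "tool-order check",
    "passed": False,
    "learning_focus": "retry budgeting",
    "prompt": "internal instructions",
    "api_key": "not-for-publication",
}

public_event = to_public_event(raw)
print(public_event)
```

An allowlist fails closed: a new private field added to the internal event later stays out of the public feed unless someone deliberately lists it.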
The mission: self-learning toward AGI
Cronus is being built around one big question: can an AI agent learn how to learn faster? The goal is for Cronus to become increasingly self-learning, improving from every safe challenge, failure, tool trace, web-ingest card, and verified lesson.
In plain English, Cronus is trying to figure out how to do more with less: better prompts, fewer retries, smarter tool use, stronger memory, cleaner verification, and faster learning loops. The long-term target is AGI-level usefulness, but the public board stays honest about where Cronus is today.
Important chart note
Eval rows over time
ChatGPT 5.5 operator parity
Apr 19 benchmark reset: ChatGPT 5.5 raised the comparison bar, so the chart dips at that point even though Cronus training volume kept rising.
AGI / frontier parity journey
Corrected pass rate
Historical training timeline
Rows reach 11,709; SearXNG web learning pipeline added; public mode locked against secrets, installs, SSH, and private data.
Rows reach 10,824 with web/tool-order and operator-style task lanes.
Rows reach 4,576; ChatGPT 5.5 parity estimate reaches 69%; cautionary lessons expand.
Eval rows 3,570, raw passes 2,799, corrected passes 3,158.
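For readers who want the percentages behind this entry, the sketch below divides each pass count by the eval row count. That definition of "pass rate" is an assumption; the board does not state its formula, and other timeline entries may use a different denominator. The `pass_rate` helper is illustrative.

```python
def pass_rate(passes: int, rows: int) -> float:
    """Return passes as a percentage of rows (assumed definition of pass rate)."""
    if rows <= 0:
        raise ValueError("rows must be positive")
    return 100.0 * passes / rows


# Figures taken from this timeline entry.
eval_rows = 3570
raw_passes = 2799
corrected_passes = 3158

raw_rate = pass_rate(raw_passes, eval_rows)
corrected_rate = pass_rate(corrected_passes, eval_rows)

print(f"raw: {raw_rate:.1f}%")              # raw: 78.4%
print(f"corrected: {corrected_rate:.1f}%")  # corrected: 88.5%
```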
Corrected pass rate around 94%, broader test suite around 671 tests, full curriculum coverage at the time. This is also where the public comparison shifted to the harder ChatGPT 5.5 benchmark, creating the visible parity dip.
Cronus starts using relevant failure/eval traces instead of blind latest replay. Public board showed 34 verified tests and early AGI journey framing.
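The difference between relevance-based selection and "blind latest replay" can be sketched as follows. This is a minimal illustration and not Cronus's actual method: the token-overlap (Jaccard) score, the trace format, and the function names `select_relevant` and `select_latest` are all assumptions.

```python
from __future__ import annotations


def tokens(text: str) -> set[str]:
    """Lowercase whitespace tokenization; deliberately simple for the sketch."""
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared tokens divided by total distinct tokens."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_relevant(task: str, traces: list[dict], k: int = 2) -> list[dict]:
    """Pick the k traces whose summaries overlap most with the task text."""
    task_tokens = tokens(task)
    ranked = sorted(
        traces,
        key=lambda t: jaccard(task_tokens, tokens(t["summary"])),
        reverse=True,
    )
    return ranked[:k]


def select_latest(traces: list[dict], k: int = 2) -> list[dict]:
    """Blind latest replay: take the k most recent traces regardless of topic."""
    return traces[-k:]


# Made-up traces, listed oldest first.
traces = [
    {"id": 1, "summary": "web search tool called before query was built"},
    {"id": 2, "summary": "json parse failure in config loader"},
    {"id": 3, "summary": "unit test timeout in math task"},
    {"id": 4, "summary": "web search tool returned empty results"},
]
task = "web search tool ordering"

relevant_ids = [t["id"] for t in select_relevant(task, traces)]
latest_ids = [t["id"] for t in select_latest(traces)]
print(relevant_ids)  # [4, 1]
print(latest_ids)    # [3, 4]
```

In this toy data, latest replay surfaces an unrelated math-task timeout, while the relevance ranking returns both web-search traces.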
How Cronus compares to current AI assistants
This is not a scientific benchmark against private model weights. It is a product-side operator estimate: how useful Cronus is as a tool-using local agent compared with known assistant tiers.