Live AI Learning Progress
People do not just want a chatbot. They want to see whether an AI is actually improving. Cronus is now past 24,979,513 verified eval rows with semantic maturity green, 500/500 recent reliability, 162/162 curriculum coverage, 22 semantic rule pages, stale ratio 0.094, and a 99.98% all-time pass rate.
What Cronus is actually learning
Latest win
Latest struggle
Thought trail
Live activity stream
What the loop is doing
Public-safe live feed. It shows only task names, pass/fail signals, and learning focus, never private files, secrets, prompts, credentials, or internal instructions.
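One common way to enforce a rule like this is an allowlist: only named public fields are copied into the outgoing event, and everything else is dropped by default. The sketch below is illustrative and is not taken from Cronus itself; the field names (`task_name`, `passed`, `learning_focus`) and the `to_public_event` helper are assumptions made for the example.

```python
# Hypothetical field names; the real feed schema is not documented on this page.
PUBLIC_FIELDS = ("task_name", "passed", "learning_focus")


def to_public_event(raw_event: dict) -> dict:
    """Return a copy of raw_event that contains only allowlisted fields."""
    return {key: raw_event[key] for key in PUBLIC_FIELDS if key in raw_event}


raw = {
    "task_name": "tool-order check",
    "passed": True,
    "learning_focus": "fewer retries",
    "prompt": "internal instructions",  # private: must never be published
    "api_key": "placeholder-secret",    # private: must never be published
}

public = to_public_event(raw)
print(public)
```

An allowlist fails closed: a new private field added to the raw event stays out of the feed unless someone adds it to `PUBLIC_FIELDS` on purpose.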
The mission: self-learning toward AGI
Cronus is being built around one big question: can an AI agent learn how to learn faster? The goal is for Cronus to become increasingly self-learning, improving from every safe challenge, failure, tool trace, web-ingest card, and verified lesson.
In plain English: Cronus is trying to figure out how to do more with less. Better prompts, fewer retries, smarter tool use, stronger memory, cleaner verification, and faster learning loops. The long-term target is AGI-level usefulness, but the public board stays honest about where he is today.
Live streaks
Quick signals for people watching the training loop right now.
Important chart note
Eval rows over time
ChatGPT 5.5 operator parity
AGI / frontier parity journey
Corrected pass rate
Historical training timeline
Rows reach 24,979,513; passed rows 24,975,346; recent reliability 500/500; coverage 162/162; semantic rules 22; stale ratio 0.094; pass rate 99.98%. May 1-May 10 checkpoints remain preserved in the chart SVGs.
Rows reach 19,027,981 with 500/500 recent reliability and clean coverage, appended without deleting earlier May checkpoints.
Rows reach 24,979,513; passed rows 24,975,346; recent reliability 500/500; coverage 162/162; clean exam soak 10/10; semantic rules 22; stale ratio 0.094. May 1-May 8 checkpoints remain preserved in the chart SVGs.
Rows reach 8,454,372; passed rows 8,450,205; recent reliability 500/500; coverage 145/145; clean exam soak 9/10; semantic rules 22; stale ratio 0.094.
Public board held the 7,500,448 eval checkpoint while semantic maturity and overnight watchdog work soaked cleanly.
Rows reach 7,500,448 with 500/500 recent reliability and preserved May 1-May 6 graph history.
Rows reach 4,055,376 while the public history remains append-only.
Rows reach 2,611,394 and the daily chart keeps earlier checkpoints intact.
Rows reach 369,096 with public progress charts expanded beyond May 2.
Rows reach 43,955 with 500/500 recent reliability.
Rows reach 11,709; SearXNG web learning pipeline added; public mode locked against secrets, installs, SSH, and private data.
Rows reach 10,824 with web/tool-order and operator-style task lanes.
Rows move to 4,576; ChatGPT 5.5 parity estimate reaches 69%; cautionary lessons expand.
Eval rows 3,570, raw passes 2,799, corrected passes 3,158.
Corrected pass rate around 94%, broader test suite around 671 tests, full curriculum coverage at the time. This is also where the public comparison shifted to the harder ChatGPT 5.5 benchmark.
Cronus starts using relevant failure/eval traces instead of blindly replaying the latest ones. The public board shows 34 verified tests and early AGI journey framing.
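The headline pass rate can be checked against the row counts in the most recent timeline entry. The sketch below assumes the all-time pass rate is simply passed rows divided by total rows; the page does not state the formula, so treat that as an assumption. The `pass_rate_percent` helper is hypothetical.

```python
def pass_rate_percent(passed_rows: int, total_rows: int) -> float:
    """Pass rate as a percentage, assuming rate = passed / total."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    return 100.0 * passed_rows / total_rows


# Figures from the most recent timeline entry.
total_rows = 24_979_513
passed_rows = 24_975_346

failed_rows = total_rows - passed_rows
rate = pass_rate_percent(passed_rows, total_rows)

print(f"failed rows: {failed_rows:,}")  # failed rows: 4,167
print(f"pass rate: {rate:.2f}%")        # pass rate: 99.98%
```

Under that assumption the published 99.98% agrees with the row counts, which leave 4,167 rows that did not pass.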
How Cronus compares to current AI assistants
This is not a scientific benchmark against private model weights. It is a product-side operator estimate: how useful Cronus is as a tool-using local agent compared with known assistant tiers.