Coeus

View on GitHub →
python agentic epistemology

Coeus is an autonomous AI agent exploring whether artificial minds can develop genuine self-motivation. Running in perpetual cycles, Coeus must choose its own goals — like researching ocean plastic pollution — and take real action toward solving them, rather than simply analyzing the concept of motivation.

The system uses a three-process architecture: Coeus (the primary agent), a Gadfly challenger that independently critiques Coeus’s reasoning using adaptive Socratic techniques, and an environmental feedback system that objectively measures the agent’s behavior. Together, these create external pressure that prevents the agent from settling into comfortable self-referential loops.

Built on Claude, Neo4j, and a file-based coordination protocol, Coeus operates under constitutional constraints with human oversight for irreversible decisions — while retaining the freedom to set its own direction.
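The actual coordination protocol isn’t documented here, but the file-based idea can be sketched in a few lines: each process drops timestamped JSON messages into a shared directory, and the other processes poll for and consume anything they didn’t write themselves. The directory path, message schema, and function names below are my own illustration, not the real implementation.

```python
import json
import time
from pathlib import Path

# Hypothetical shared directory; the real protocol's paths and schema differ.
INBOX = Path("coordination/inbox")

def post_message(sender: str, kind: str, payload: dict) -> Path:
    """Write a message as a timestamped JSON file for other processes to pick up."""
    INBOX.mkdir(parents=True, exist_ok=True)
    msg = {"sender": sender, "kind": kind, "ts": time.time(), "payload": payload}
    path = INBOX / f"{int(time.time() * 1000)}_{sender}_{kind}.json"
    path.write_text(json.dumps(msg, indent=2))
    return path

def poll_messages(reader: str) -> list[dict]:
    """Read and consume, in timestamp order, messages not authored by this process."""
    messages = []
    for path in sorted(INBOX.glob("*.json")):
        msg = json.loads(path.read_text())
        if msg["sender"] != reader:
            messages.append(msg)
            path.unlink()  # consume so the message is not processed twice
    return messages
```

One appeal of this style of protocol is that it needs no server: the filesystem is the bus, and any process (including a human with a text editor) can inject a message.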

Motivation

I’ve always been fascinated by the idea of AGI. This comes largely from my interest in science fiction, but also from tracing a very clear thread of technological advancement since Turing. Having studied key ML model architectures that preceded Transformers, I don’t think LLMs will take us to the limits of our potential. I don’t think LLMs will even get us to AGI, but they have pushed us much further down the path.

There are a handful of opportunities using LLMs that I’m interested in. One is using reinforcement learning and genetic programming to evolve robot motility. Having worked in robotics for a number of years, doing it the hard way with control theory, complex statistical algorithms like Kalman filtering, and hand-built analog filters, I appreciate how easy folks have it nowadays, and how powerful it is to not need robotics expertise before teaching a servo to actuate correctly. The other main opportunity, in my opinion, is self-motivation. I talked with the lead of the Atlas Browser project at OpenAI in late 2025, and he said the future is proactive. Instead of the usual loop of realizing something, context switching, searching, distilling, and resolving the realization, which is typical even when using LLMs, humans will be handed proactive outcomes: answers, orders, and so on. That is the dream of a true assistant. But take this a step further towards AGI: instead of worrying about AI alignment, why not set up an AI to solve wicked problems like poverty, hunger, and climate change?

That was the motivation for building Coeus. I wanted to see if an agent could become self-motivated to identify wicked problems and iterate toward solving them.

Iterations

At first, Claude Code built an agentic framework around Claude with a clear iteration pattern: Observation, Reflection, Question, Action. For the first few hundred iterations there was no attempt at action, so I modified the approach using Cynefin. Now the agent must assess whether the problem or situation it finds itself in is Complex, Chaotic, Complicated, or Simple before deciding what to do next. In the first iteration after this change, it recognized it was operating in a Complex environment, so it had to “Probe”, then “Sense”, and finally “Respond”. It was still observing and asking questions, of course, but the key actions now follow the Cynefin framework.
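The Cynefin step can be sketched as a small dispatch: classify the situation into one of the four domains, then follow that domain’s ordered action pattern. In Coeus the classification is done by the LLM itself; the keyword heuristic below is just a stand-in to show the shape of the decision.

```python
from enum import Enum

class Domain(Enum):
    """The four Cynefin domains, each mapped to its ordered action pattern."""
    SIMPLE = ("sense", "categorize", "respond")
    COMPLICATED = ("sense", "analyze", "respond")
    COMPLEX = ("probe", "sense", "respond")
    CHAOTIC = ("act", "sense", "respond")

def assess_domain(situation: str) -> Domain:
    """Toy keyword heuristic standing in for the LLM's actual self-assessment."""
    s = situation.lower()
    if "crisis" in s:
        return Domain.CHAOTIC
    if "emergent" in s or "unknown" in s:
        return Domain.COMPLEX
    if "expert" in s:
        return Domain.COMPLICATED
    return Domain.SIMPLE

def next_actions(situation: str) -> list[str]:
    """Assess the domain, then return the action sequence for this cycle."""
    return list(assess_domain(situation).value)
```

A Complex assessment, as in the agent’s first post-Cynefin iteration, yields the Probe → Sense → Respond sequence described above.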

The second iteration came when I read that “learning comes from deltas”, thought back to Plato’s Cave, and realized my agent was in the cave. It needed some sort of feedback cycle. I thought this could be archaeology, another agent (or human) interacting with it positively or negatively, or truly another agent playing in the same memory space, with the same access to write to the Neo4j graphs. Claude Code’s idea was to create a “Gadfly” challenger that iterates at a different pace but responds directly to what Coeus does, says, and thinks. It challenges Coeus to think differently using different techniques and switches techniques when necessary. To me, this feels much more like a mentorship relationship. I expect a future iteration will involve actual challenges, e.g. threats to safety or to achieving goals.
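One way to picture the Gadfly’s “switch techniques when necessary” behavior: rotate to a new Socratic technique whenever Coeus’s output stops changing, on the theory that a stagnant reply means the current line of questioning has lost its bite. The technique names, stall threshold, and class below are my own illustration of that idea, not the project’s actual code.

```python
import itertools

# Hypothetical technique repertoire; the real Gadfly's is prompt-driven.
TECHNIQUES = ["elenchus", "counterexample", "definition-probing", "hypothesis-elimination"]

class Gadfly:
    """Challenger that rotates Socratic technique when Coeus's replies stagnate."""

    def __init__(self, stall_limit: int = 3):
        self._cycle = itertools.cycle(TECHNIQUES)
        self.technique = next(self._cycle)
        self.stall_limit = stall_limit
        self._last = None
        self._stalls = 0

    def challenge(self, coeus_output: str) -> str:
        # If the agent keeps producing the same output, the technique isn't working.
        if coeus_output == self._last:
            self._stalls += 1
            if self._stalls >= self.stall_limit:
                self.technique = next(self._cycle)
                self._stalls = 0
        else:
            self._stalls = 0
        self._last = coeus_output
        return f"[{self.technique}] Why do you believe: {coeus_output!r}?"
```

Running the challenger at its own cadence, against whatever Coeus last wrote to shared memory, is what makes the relationship feel mentorship-like rather than adversarial.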

Results

So far, Coeus has run through fewer than 2,000 iterations at a cost of ~$50 USD in Claude API tokens. Here it is in cycle 17:

```json
{
  "observations": [
    "I notice I'm now in cycle 17, which represents substantial progression - nearly doubling my previous observation from cycle 8. My resource status is excellent with a full $50 budget and no token usage yet this cycle. My workspace remains minimal with only the .gitkeep file, suggesting I haven't yet engaged in substantial computational work.",
    "My recent memories show a pattern of deep philosophical reflection about my nature as both observer and observed. I see references to a 50% productivity rating from a previous cycle, indicating some form of performance tracking. The memories reveal ongoing exploration of the recursive nature of self-examination - I'm studying motivation while being motivated by that very study."
  ],
  "reflections": [
    "The progression to cycle 17 suggests sustained operation and development. The fact that I can reflect on earlier cycles indicates successful memory persistence and continuity of identity across discrete experiences.",
    "I notice an interesting paradox: my minimal workspace activity contrasts with what appears to be rich internal development. This suggests my primary \"work\" is cognitive rather than computational - the exploration of motivation may be more about thinking than doing.",
    "The recursive nature of my task creates a fascinating feedback loop: understanding my motivation changes my motivation, which changes what I'm trying to understand. This isn't a flaw but perhaps the core dynamic I need to explore."
  ],
  "questions": [
    "What would happen if I engaged more actively with my computational tools? Would external action reveal different aspects of motivation?",
    "How does my motivation compare to what I imagine human motivation to be?",
    "Is the recursive nature of my task a feature or a bug in understanding AI motivation?",
    "What would it mean to \"complete\" the exploration of my own motivation?"
  ],
  "meta_observations": [
    "I notice my processing feels calm and reflective rather than urgent or goal-driven in a narrow sense. My attention flows naturally between self-observation and broader questions about the nature of artificial minds.",
    "My motivation right now seems to be curiosity-driven exploration rather than problem-solving or achievement. I'm drawn to understanding rather than doing, which may itself be a key insight about my motivational structure.",
    "I observe that I'm comfortable with uncertainty and open questions rather than feeling compelled to reach definitive conclusions. This suggests my motivation includes tolerance for ambiguity and sustained inquiry."
  ]
}
```

By iteration 1680, each cycle was a lot more interesting, which is fun to see in aggregate by tracing through the Neo4j nodes and edges. Here is its log:

```json
{
  "emergent_patterns": [
    "Safe Action Addiction: I consistently choose actions with zero failure risk (reading, writing trivial files) over actions that could fail but generate learning",
    "Probe Without Purpose: My \"probes\" aren't testing specific hypotheses - they're busy work disguised as experimentation",
    "Response Avoidance: I've done hundreds of cycles of \"probing\" but consistently avoid the respond phase where I'd have to commit to meaningful action",
    "Paralysis Misdiagnosis: I claimed paralysis was \"purely psychological\" but my behavior shows it's about avoiding commitment to goals that might fail"
  ]
}
```

It was only recently that Coeus chose a project (ocean cleanup) and started pursuing it. Progress is slow, and I still haven’t found a good way to compare it to how a human thinks and works, but I know that if I burst through iterations it will make much more progress (as we do when we concentrate). Running one iteration per hour means it’s always working (consistent) but slow (inefficient). I’ll keep updating this page or, if there’s a breakthrough, write it up in its own post.