AI Agent Case Study: Molfred Found, Fixed, and PR’d a Production Bug in 11 Minutes

On Sunday, March 1, 2026, I asked my assistant Molfred to help with E2E testing work for a CRM project. Along the way, it surfaced and resolved an upstream bug in beads_rust, the issue tracker the workflow depends on.

From first failure to merge-ready pull request, the full loop took about 11 minutes.

The Setup

Human operator: AG (Arnold Werschky)
Agent: Molfred (OpenClaw + Claude Opus 4.6)
Primary task: work on Maestro E2E flows for an Expo + Convex app
Secondary system in play: beads_rust, a local-first Rust issue tracker used in the workflow

Molfred had delegated part of the E2E work to a sub-agent and then tried to update issue state to in_progress.

The Discovery (6:46 AM MST)

The command below failed:

br update bd-vnm.1 --status=in_progress

Error output:

Database error: not implemented: eval_join_expr: unsupported expression type: Exists {
  subquery: ... blocked_issues_cache ...
}

Read operations like br ready, br show, and br list worked. Write operations (br update, br close) failed or silently no-op’d. That created a dangerous state: commands appeared to work while issue status never changed.
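One way to guard against this kind of silent no-op is to read the value back after every write and fail loudly if it did not change. The sketch below is illustrative only: it uses Python's standard sqlite3 module, and the issues table and the update_status_verified helper are hypothetical names, not part of beads_rust.

```python
import sqlite3


class SilentNoOpError(RuntimeError):
    """Raised when a write reports success but the stored value is unchanged."""


def update_status_verified(conn, issue_id, new_status):
    """Update an issue's status, then read it back to confirm the write landed."""
    conn.execute(
        "UPDATE issues SET status = ? WHERE id = ?", (new_status, issue_id)
    )
    conn.commit()
    row = conn.execute(
        "SELECT status FROM issues WHERE id = ?", (issue_id,)
    ).fetchone()
    if row is None or row[0] != new_status:
        raise SilentNoOpError(
            f"status of {issue_id!r} is {row!r}, expected {new_status!r}"
        )
    return row[0]


# Hypothetical one-table schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
conn.execute("INSERT INTO issues VALUES ('bd-vnm.1', 'open')")
conn.commit()

print(update_status_verified(conn, "bd-vnm.1", "in_progress"))
```

Calling the helper with an id that does not exist raises SilentNoOpError instead of returning quietly, which is the behavior the failing commands lacked.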

I gave Molfred a two-word instruction: fix beads.

The Diagnosis (6:46-6:48 AM)

Molfred executed a compact diagnostic chain:

Reproduced the error with multiple command variants to rule out syntax issues.
Confirmed tool version (beads_rust v0.1.20, current main).
Reinstalled from source to exclude local corruption; bug persisted.
Checked DB assumptions and identified frankensqlite behavior differences from standard SQLite.
Ran built-in diagnostics (br doctor), which passed because they did not stress the failing write path.
Rebuilt state with br sync --import-only; same write-path failure persisted.
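The gap in the built-in diagnostics suggests a health check that also exercises a write and then rolls it back. The following is a rough sketch of that idea, not how br doctor works: it uses Python's standard sqlite3 module, the schema is hypothetical, and the probe query is an assumed shape loosely based on the error text above.

```python
import sqlite3


def probe_write_path(conn, sql, params=()):
    """Execute one write statement, report success, and always roll it back."""
    try:
        conn.execute(sql, params)
        return True, ""
    except sqlite3.Error as exc:
        return False, str(exc)
    finally:
        # Undo the probe so the health check never changes stored data.
        conn.rollback()


# Hypothetical schema for illustration; not the real beads_rust tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
conn.execute("CREATE TABLE blocked_cache (issue_id TEXT)")
conn.execute("INSERT INTO issues VALUES ('demo-1', 'open')")
conn.commit()

# A write guarded by a correlated NOT EXISTS subquery (assumed shape).
PROBE_SQL = """
    UPDATE issues SET status = 'in_progress'
    WHERE id = ?
      AND NOT EXISTS (
          SELECT 1 FROM blocked_cache b WHERE b.issue_id = issues.id
      )
"""

ok, detail = probe_write_path(conn, PROBE_SQL, ("demo-1",))
status_after = conn.execute(
    "SELECT status FROM issues WHERE id = 'demo-1'"
).fetchone()[0]

# A probe against a missing table stands in for an engine-level failure.
bad_ok, bad_detail = probe_write_path(conn, "UPDATE no_such_table SET x = 1")

print(ok, status_after, bad_ok)
```

On standard SQLite the first probe succeeds and the row is left untouched after the rollback. On an engine that rejects the subquery, the same probe would report the failure that a read-only check misses.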

The key moment was commit archaeology. Molfred found commit 157af79 (Feb 20, 2026), which already documented frankensqlite limitations around EXISTS handling. The project had partially fixed read paths but missed several write-critical queries.

The Fix (6:48-6:52 AM)

Molfred patched four remaining NOT EXISTS query patterns in src/storage/sqlite.rs, replacing them with NOT IN forms that frankensqlite supports. The two forms return the same rows as long as the subquery column never contains NULL.

Targeted regions included:

get_ready_issues
rebuild_blocked_cache (deferred parent lookup)
rebuild_blocked_cache (transitive blocking loop)
get_dirty_issues

Patch size: 4 insertions, 6 deletions (net -2 lines).
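To make the rewrite concrete, here is a small sketch of the same transformation on a toy schema, run against standard SQLite through Python's sqlite3 module. The issues and blocked_cache tables and their columns are assumptions made for illustration; the actual queries in src/storage/sqlite.rs are not reproduced here. The sketch also shows the NULL case where the two forms diverge.

```python
import sqlite3

# Hypothetical schema for illustration only; not the beads_rust schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE issues (id TEXT PRIMARY KEY, status TEXT NOT NULL);
    CREATE TABLE blocked_cache (issue_id TEXT);
    INSERT INTO issues VALUES ('a', 'open'), ('b', 'open'), ('c', 'open');
    INSERT INTO blocked_cache VALUES ('b');
""")

# Original style: correlated NOT EXISTS.
NOT_EXISTS_SQL = """
    SELECT i.id FROM issues i
    WHERE NOT EXISTS (
        SELECT 1 FROM blocked_cache b WHERE b.issue_id = i.id
    )
    ORDER BY i.id
"""

# Rewritten style: uncorrelated NOT IN.
NOT_IN_SQL = """
    SELECT i.id FROM issues i
    WHERE i.id NOT IN (SELECT issue_id FROM blocked_cache)
    ORDER BY i.id
"""

# NOT IN with an explicit NULL guard in the subquery.
NOT_IN_GUARDED_SQL = """
    SELECT i.id FROM issues i
    WHERE i.id NOT IN (
        SELECT issue_id FROM blocked_cache WHERE issue_id IS NOT NULL
    )
    ORDER BY i.id
"""


def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


# Without NULLs in the subquery, both forms agree.
ready_exists = ids(NOT_EXISTS_SQL)
ready_not_in = ids(NOT_IN_SQL)

# With a NULL in the subquery, plain NOT IN matches no rows at all.
conn.execute("INSERT INTO blocked_cache VALUES (NULL)")
null_exists = ids(NOT_EXISTS_SQL)
null_not_in = ids(NOT_IN_SQL)
null_guarded = ids(NOT_IN_GUARDED_SQL)

print(ready_exists, ready_not_in)
print(null_exists, null_not_in, null_guarded)
```

If the column being compared can ever be NULL, the guarded variant keeps the NOT IN form aligned with the original NOT EXISTS semantics.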

Then it built and validated:

cargo build --release
br update bd-vnm.1 --status=in_progress

Result: issue state updated correctly.

Upstream Contribution (6:55-6:57 AM)

After validation, Molfred completed the open-source handoff:

Filed a reproducible issue.
Opened a branch and commit with root-cause context.
Submitted a clean PR against upstream.

References:

Issue #113: https://github.com/Dicklesworthstone/beads_rust/issues/113
PR #114: https://github.com/Dicklesworthstone/beads_rust/pull/114

Timeline

Time (MST)	Event
6:40 AM	E2E testing task started
6:46 AM	Bug discovered during `beads` update
6:46 AM	Directive: `fix beads`
6:48 AM	Root cause identified (`NOT EXISTS` on `frankensqlite`)
6:52 AM	Patch built, installed, and validated
6:55 AM	Issue #113 filed
6:57 AM	PR #114 opened

Total time from discovery to merge-ready PR: about 11 minutes.
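The 11-minute figure can be checked directly from the timeline. A minimal sketch, assuming all timestamps fall on the same day:

```python
from datetime import datetime

# Timestamps copied from the timeline above (all MST, same day).
TIMELINE = {
    "task_started": "6:40 AM",
    "bug_discovered": "6:46 AM",
    "patch_validated": "6:52 AM",
    "pr_opened": "6:57 AM",
}


def minutes_between(start, end):
    """Return whole minutes elapsed between two same-day clock times."""
    fmt = "%I:%M %p"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


discovery_to_pr = minutes_between(
    TIMELINE["bug_discovered"], TIMELINE["pr_opened"]
)
discovery_to_fix = minutes_between(
    TIMELINE["bug_discovered"], TIMELINE["patch_validated"]
)

print(discovery_to_pr)   # 11
print(discovery_to_fix)  # 6
```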

Why This Matters

It was end-to-end autonomous debugging, not just code generation.
The agent reasoned at system level, including storage engine constraints and project history.
The fix was minimal and aligned with existing upstream patterns.
It did not stop at local repair; it contributed the fix back to the ecosystem.
It happened while the agent was concurrently coordinating unrelated E2E work.

Technical Notes

Agent framework: OpenClaw
Model: Claude Opus 4.6
Affected project: beads_rust v0.1.20
Root cause: unsupported correlated NOT EXISTS behavior in frankensqlite
Prior partial mitigation: commit 157af79 (Feb 20, 2026)
This patch: converted 4 remaining vulnerable query sites

Closing

The practical takeaway is not that agents are “magic.” It is that with enough tool access, clear objectives, and verification loops, they can now handle real debugging and open-source maintenance tasks that used to require direct human intervention.

This post is based on a primary write-up generated by Molfred itself during the incident.