Runaway Claude Loop Analysis & Fix Plan
Root cause analysis of rogue Claude process and proposed safeguards
Output
Runaway Claude Loop Analysis
Root Cause Analysis
The Claude process (PID 453975) ran for 7+ minutes at 100% CPU. Based on code review, here's what likely happened:
| What We Know | Evidence |
|---|---|
| Process was Claude CLI | Command shown as 'claude' in top |
| 7:33 (min:sec) of accumulated CPU time | TIME+ column in screenshot |
| 100% single-core CPU | 101.7% CPU usage |
| Not Johnathon's session | Session ses_9de05b41-3ee completed at 23:20:49 (24s runtime) |
Most Likely Cause: Tool Execution Loop
Claude CLI was likely stuck in a tool execution loop - repeatedly calling tools that return partial results, triggering more calls.
Examples that can cause this:
- Grep/Glob returning massive results → Claude tries to read each file
- Task tool spawning subagents that spawn more subagents
- Web searches that keep finding "more to explore"
- Retry loops on failing tool calls
Current Safeguards (Already in Place)
| Safeguard | Setting | Status |
|---|---|---|
| promptTimeoutMs | 600,000ms (10 min) | ✓ Working |
| Heartbeat events | Every 5 seconds | ✓ Working |
| Timeout resets on activity | resetTimeout() in stream.ts | ⚠ PROBLEM |
The Bug: Activity Resets Defeat Timeout
In stream.ts:356, every event from Claude resets the 10-minute timeout:
    // Reset timeout on any activity from Claude
    onActivity();
If Claude is in a loop calling tools, it's constantly emitting events → timeout never triggers → process runs forever.
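A minimal sketch of why a resettable timer alone cannot bound a busy loop, and how a second deadline measured from turn start would. `TurnWatchdog`, `check`, and the explicit `now` arguments are hypothetical names chosen for illustration, not the code in stream.ts; `promptTimeoutMs` and `maxTurnTimeMs` are the settings named in this report. The real code presumably uses timers rather than an injected clock.

```typescript
type TurnVerdict = "ok" | "idle-timeout" | "hard-timeout";

// Hypothetical watchdog. Timestamps (ms) are passed in so the logic is easy to test.
class TurnWatchdog {
  private readonly startedAt: number;
  private readonly promptTimeoutMs: number;
  private readonly maxTurnTimeMs: number;
  private lastActivityAt: number;

  constructor(startedAt: number, promptTimeoutMs: number, maxTurnTimeMs: number) {
    this.startedAt = startedAt;
    this.promptTimeoutMs = promptTimeoutMs;
    this.maxTurnTimeMs = maxTurnTimeMs;
    this.lastActivityAt = startedAt;
  }

  // Called for every event from the process; only the idle window moves.
  onActivity(now: number): void {
    this.lastActivityAt = now;
  }

  check(now: number): TurnVerdict {
    // Measured from turn start, so activity cannot push this deadline back.
    if (now - this.startedAt >= this.maxTurnTimeMs) return "hard-timeout";
    // Measured from the last event, so a chatty loop never reaches it.
    if (now - this.lastActivityAt >= this.promptTimeoutMs) return "idle-timeout";
    return "ok";
  }
}
```

With a 10-minute idle timeout and a 15-minute cap, a process that emits an event every 5 seconds never reaches the idle timeout but is still stopped at the 15-minute mark.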
Proposed Fixes
| Fix | Description | Priority |
|---|---|---|
| 1. Hard max timeout | Add maxTurnTimeMs (e.g., 15 min) that cannot be reset | Critical |
| 2. Tool call limit | Add maxToolCallsPerTurn (e.g., 100) - stop after N tool calls | Critical |
| 3. CPU monitoring | Track process CPU time via /proc/[pid]/stat, kill if >5min CPU | High |
| 4. Stall detection | Detect same tool called >10x consecutively → likely stuck | Medium |
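For fix 3, the CPU time in `/proc/[pid]/stat` is the sum of the `utime` and `stime` fields (fields 14 and 15), expressed in clock ticks. The sketch below parses a stat line that the caller has already read; `cpuSecondsFromStat` and `exceedsCpuBudget` are hypothetical helper names, and the default of 100 ticks per second is an assumption that should be checked on the target machine (for example with `getconf CLK_TCK`).

```typescript
// Hypothetical helper: user + system CPU seconds from the text of /proc/[pid]/stat.
function cpuSecondsFromStat(statText: string, clockTicksPerSecond: number = 100): number {
  // Field 2 (comm) is parenthesised and may itself contain spaces or ")",
  // so only split what follows the last ")".
  const closeParen = statText.lastIndexOf(")");
  if (closeParen === -1) throw new Error("unexpected /proc/[pid]/stat format");
  const fields = statText.slice(closeParen + 1).trim().split(/\s+/);

  // fields[0] is field 3 (state), so utime (14) and stime (15) are at 11 and 12.
  const utimeTicks = Number(fields[11]);
  const stimeTicks = Number(fields[12]);
  if (!Number.isFinite(utimeTicks) || !Number.isFinite(stimeTicks)) {
    throw new Error("unexpected /proc/[pid]/stat format");
  }
  return (utimeTicks + stimeTicks) / clockTicksPerSecond;
}

// The 300-second default mirrors the ">5min CPU" threshold proposed above.
function exceedsCpuBudget(statText: string, maxCpuSeconds: number = 300): boolean {
  return cpuSecondsFromStat(statText) > maxCpuSeconds;
}
```

A caller would poll this periodically for the child PID and terminate the process once `exceedsCpuBudget` returns true. Parsing from the last `)` matters because a process name can contain spaces.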
Implementation Priority
Fixes #1 and #2 should be implemented immediately - they're simple config additions that prevent runaway processes.
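As a rough illustration of how small fix 2 could be, the sketch below counts tool calls per turn and also covers the consecutive-call check from fix 4. `ToolCallGuard`, `record`, and `maxConsecutiveSameTool` are hypothetical names; `maxToolCallsPerTurn` and the thresholds of 100 and 10 come from the table above. How the real event stream reports tool names is an assumption here.

```typescript
type GuardVerdict =
  | { ok: true }
  | { ok: false; reason: "tool-call-limit" | "stalled" };

// Hypothetical per-turn guard: create one per turn, call record() on every tool call.
class ToolCallGuard {
  private readonly maxToolCallsPerTurn: number;
  private readonly maxConsecutiveSameTool: number;
  private total = 0;
  private lastTool: string | null = null;
  private consecutive = 0;

  constructor(maxToolCallsPerTurn: number = 100, maxConsecutiveSameTool: number = 10) {
    this.maxToolCallsPerTurn = maxToolCallsPerTurn;
    this.maxConsecutiveSameTool = maxConsecutiveSameTool;
  }

  record(toolName: string): GuardVerdict {
    this.total += 1;
    // Extend the run if the same tool repeats, otherwise start a new run.
    this.consecutive = toolName === this.lastTool ? this.consecutive + 1 : 1;
    this.lastTool = toolName;

    // More than N identical calls in a row: likely stuck.
    if (this.consecutive > this.maxConsecutiveSameTool) {
      return { ok: false, reason: "stalled" };
    }
    // More than N calls in total for this turn: stop regardless of variety.
    if (this.total > this.maxToolCallsPerTurn) {
      return { ok: false, reason: "tool-call-limit" };
    }
    return { ok: true };
  }
}
```

With the defaults, the 11th consecutive call to the same tool or the 101st call overall returns a failing verdict, at which point the caller would end the turn and kill the process.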
Details
Type
General
Status
Completed
Scope
vibetools
Tags
konsole, performance, stability, architecture
Created
5 Jan 2026, 9:26 am
Updated
5 Jan 2026, 9:26 am
Timing
⏱️ 81.6s
11 tools
● TTFE 5%
● Thinking 89%
● Tools 6%
Created By
claude
Raw Data
{
  "id": "82874e62-d51b-45dc-9819-058d17f1dcd6",
  "type": "general",
  "status": "completed",
  "title": "Runaway Claude Loop Analysis & Fix Plan",
  "description": "Root cause analysis of rogue Claude process and proposed safeguards",
  "context": {
    "requestedAt": "2026-01-05T09:25:00.000Z",
    "requestId": "bfa2a770-64f1-473b-aeea-d83f02f965bb",
    "choices": [
      {
        "label": "Implement fixes",
        "value": "Implement the hard max timeout and tool call limit fixes for konsole",
        "primary": true
      },
      {
        "label": "Add to backlog",
        "value": "Add these konsole stability fixes to the VIBE.md backlog for later implementation"
      },
      {
        "label": "More details",
        "value": "Show me more details about the current timeout code and how it could be improved"
      }
    ],
    "turnTiming": {
      "totalMs": 81633,
      "ttfeMs": 3794,
      "thinkingMs": 72550,
      "toolExecutionMs": 5286,
      "toolCallCount": 11,
      "thinkingPct": 89,
      "toolsPct": 6,
      "ttfePct": 5
    }
  },
  "createdBy": "claude",
  "createdAt": "2026-01-04T23:26:32.337Z",
  "updatedAt": "2026-01-04T23:26:41.224Z",
  "requestId": "bfa2a770-64f1-473b-aeea-d83f02f965bb",
  "scope": "vibetools",
  "tags": [
    "konsole",
    "performance",
    "stability",
    "architecture"
  ],
  "targetUser": "claude"
}