Root Cause Found: Health Check Spawns Process
Health check runs claude --version on every call, causing 503s during heavy turns
Output
🎯 Root Cause Found: Expensive Health Check
The Bug
The /health endpoint spawns claude --version on every request. During heavy turns, this subprocess contends for resources and can take more than 2 seconds, causing health checks to time out.
The Failure Chain
1. Long turn finishes, SSE stream closes
2. Client tries to reconnect → "RECONNECTING"
3. Reconnect POSTs to /api/konsole/dashboard/stream
4. Handler calls isKonsoleHealthy() (2s timeout)
5. Konsole health check runs claude --version
6. System busy from long turn → subprocess slow
7. Health check times out → 503 returned
8. Repeat for ~2 mins until system calms down
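Steps 4 to 7 can be sketched as a probe raced against a deadline. This is a minimal illustration, not the actual konui code: `isHealthyWithin`, `reconnectStatus`, and the `HealthProbe` type are hypothetical names, and the real `isKonsoleHealthy()` may be implemented differently.

```typescript
// Hypothetical sketch: these names are illustrative, not taken from the codebase.
type HealthProbe = () => Promise<boolean>;

// Resolves to the probe's result, or to false if the probe throws
// or takes longer than timeoutMs.
async function isHealthyWithin(probe: HealthProbe, timeoutMs: number): Promise<boolean> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<boolean>((resolve) => {
    timer = setTimeout(() => resolve(false), timeoutMs);
  });
  try {
    return await Promise.race([probe().catch(() => false), deadline]);
  } finally {
    clearTimeout(timer); // do not leave the deadline timer pending
  }
}

// Maps the health result to the status code a reconnect handler might return.
async function reconnectStatus(probe: HealthProbe, timeoutMs = 2000): Promise<number> {
  return (await isHealthyWithin(probe, timeoutMs)) ? 200 : 503;
}
```

With a probe that answers quickly the handler returns 200. Once the probe is slower than the deadline, every reconnect attempt gets a 503 until the system recovers, which matches the repeated failures in step 8.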
Code Evidence
konsole/src/main.ts:111-122
async function handleHealth(): Promise<Response> {
  const claudeAvailable = await checkClaudeAvailable(); // ⚠️ Spawns subprocess!
  const stats = getSessionStats();
  ...
}
konsole/src/claude.ts:262-277
export async function checkClaudeAvailable(): Promise<boolean> {
  const cmd = new Deno.Command(config.claudeCommand, {
    args: ["--version"], // ⚠️ Subprocess every health check!
    stdout: "piped",
    stderr: "piped",
  });
  const result = await cmd.output();
  return result.success;
}
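One way to confirm that the subprocess is the slow part is to time the probe directly. The helper below is a hypothetical sketch (`timeProbe` is not part of the konsole codebase); it measures how long any async check takes, whether it succeeds or fails.

```typescript
// Hypothetical helper for measuring how long an async check takes.
async function timeProbe(probe: () => Promise<unknown>): Promise<number> {
  const start = performance.now();
  try {
    await probe();
  } catch {
    // A failing probe still took time; only the duration matters here.
  }
  return performance.now() - start; // elapsed milliseconds
}
```

Calling `timeProbe(() => checkClaudeAvailable())` during an idle period and again during a heavy turn would show whether the check exceeds the 2s timeout under load.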
✅ Recommended Fix
- Cache the claude availability check: run it once at startup and cache the result
- Remove the subprocess call from the health endpoint: health should be instant (just return the cached status, session count, and uptime)
- Optional background re-check: if needed, re-run the check every 60s in the background
// Fixed version:
let claudeAvailableCache = false;

function handleHealth(): Response {
  const stats = getSessionStats();
  return jsonResponse({
    status: claudeAvailableCache ? "healthy" : "degraded",
    activeSessions: stats.running,
    uptime: Date.now() - startTime,
  });
}
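The fixed handler reads a cached flag but does not show how the flag gets populated. One possible shape for the startup check and the optional 60s background re-check is sketched below. `AvailabilityCache` and its methods are hypothetical names, and the probe is injected so the sketch does not depend on Deno; in konsole the probe would presumably be `checkClaudeAvailable`.

```typescript
// Hypothetical sketch: AvailabilityCache is illustrative, not existing konsole code.
type AvailabilityProbe = () => Promise<boolean>;

class AvailabilityCache {
  private available = false;
  private timer: ReturnType<typeof setInterval> | undefined;
  private readonly probe: AvailabilityProbe;

  constructor(probe: AvailabilityProbe) {
    this.probe = probe;
  }

  // Runs the probe once and stores the result; a throwing probe counts as unavailable.
  async refresh(): Promise<boolean> {
    try {
      this.available = await this.probe();
    } catch {
      this.available = false;
    }
    return this.available;
  }

  // Checks once at startup, then re-checks in the background every intervalMs.
  async start(intervalMs = 60_000): Promise<void> {
    await this.refresh();
    this.timer = setInterval(() => void this.refresh(), intervalMs);
  }

  // Stops the background re-check (for shutdown or tests).
  stop(): void {
    if (this.timer !== undefined) clearInterval(this.timer);
    this.timer = undefined;
  }

  // Synchronous read, suitable for a health handler that must return instantly.
  get isAvailable(): boolean {
    return this.available;
  }
}
```

Under this design the health handler never awaits anything: it reads `isAvailable` in place of `claudeAvailableCache`, and the cost of spawning the subprocess is paid at most once per interval regardless of how often /health is called.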
Why This Wasn't Hot Sessions
This bug has always been present but only manifests during heavy turns. Before Hot Sessions, turns were shorter. The Quick Turn work didn't introduce this bug; it just made turns heavier (more tool calls, longer duration), exposing the existing issue more frequently.
Details
Type
General
Status
Completed
Scope
vibetools
Tags
konsole, konui, bug, performance, health-check
Created
6 Jan 2026, 12:28 am
Updated
6 Jan 2026, 12:28 am
Created By
claude
Raw Data
{
"id": "311f1139-7e19-430d-a01b-0e950b40c1ef",
"type": "general",
"status": "completed",
"title": "Root Cause Found: Health Check Spawns Process",
"description": "Health check runs claude --version on every call, causing 503s during heavy turns",
"context": {
"output": "<div style=\"background:#1e1e1e;color:#e0e0e0;padding:20px;border-radius:8px;font-family:'Segoe UI','Roboto',sans-serif;\"><h2 style=\"color:#fff;margin:0 0 16px 0;\">🎯 Root Cause Found: Expensive Health Check</h2><div style=\"background:#1e4d2b;padding:16px;border-radius:6px;margin-bottom:16px;border-left:4px solid #22c55e;\"><h3 style=\"margin:0 0 8px 0;color:#22c55e;\">The Bug</h3><p style=\"margin:0;color:#e0e0e0;\">The <code>/health</code> endpoint spawns <code>claude --version</code> on <strong>every request</strong>. During heavy turns, this subprocess contends for resources and takes >2 seconds, causing health checks to timeout.</p></div><div style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-bottom:16px;\"><h3 style=\"margin:0 0 12px 0;color:#e5c07b;\">The Failure Chain</h3><pre style=\"background:#1e1e1e;padding:12px;border-radius:4px;overflow-x:auto;font-size:0.85rem;line-height:1.6;\"><span style=\"color:#7f848e;\">1.</span> Long turn finishes, SSE stream closes\n<span style=\"color:#7f848e;\">2.</span> Client tries to reconnect → \"RECONNECTING\"\n<span style=\"color:#7f848e;\">3.</span> Reconnect POSTs to <code>/api/konsole/dashboard/stream</code>\n<span style=\"color:#7f848e;\">4.</span> Handler calls <code>isKonsoleHealthy()</code> (2s timeout)\n<span style=\"color:#7f848e;\">5.</span> Konsole health check runs <code>claude --version</code>\n<span style=\"color:#7f848e;\">6.</span> System busy from long turn → subprocess slow\n<span style=\"color:#7f848e;\">7.</span> Health check times out → <span style=\"color:#ef4444;\">503 returned</span>\n<span style=\"color:#7f848e;\">8.</span> Repeat for ~2 mins until system calms down</pre></div><div style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-bottom:16px;\"><h3 style=\"margin:0 0 12px 0;color:#c678dd;\">Code Evidence</h3><p style=\"color:#9ca3af;margin:0 0 8px 0;\"><code>konsole/src/main.ts:111-122</code></p><pre 
style=\"background:#1e1e1e;padding:12px;border-radius:4px;overflow-x:auto;font-size:0.85rem;\"><span style=\"color:#c678dd;\">async function</span> <span style=\"color:#61afef;\">handleHealth</span>(): <span style=\"color:#e5c07b;\">Promise</span><Response> {\n <span style=\"color:#e06c75;\">const</span> claudeAvailable = <span style=\"color:#c678dd;\">await</span> <span style=\"color:#61afef;\">checkClaudeAvailable</span>(); <span style=\"color:#7f848e;\">// ⚠️ Spawns subprocess!</span>\n <span style=\"color:#e06c75;\">const</span> stats = <span style=\"color:#61afef;\">getSessionStats</span>();\n ...\n}</pre><p style=\"color:#9ca3af;margin:12px 0 8px 0;\"><code>konsole/src/claude.ts:262-277</code></p><pre style=\"background:#1e1e1e;padding:12px;border-radius:4px;overflow-x:auto;font-size:0.85rem;\"><span style=\"color:#c678dd;\">export async function</span> <span style=\"color:#61afef;\">checkClaudeAvailable</span>(): <span style=\"color:#e5c07b;\">Promise</span><boolean> {\n <span style=\"color:#e06c75;\">const</span> cmd = <span style=\"color:#c678dd;\">new</span> Deno.<span style=\"color:#61afef;\">Command</span>(config.claudeCommand, {\n args: [<span style=\"color:#98c379;\">\"--version\"</span>], <span style=\"color:#7f848e;\">// ⚠️ Subprocess every health check!</span>\n stdout: <span style=\"color:#98c379;\">\"piped\"</span>,\n stderr: <span style=\"color:#98c379;\">\"piped\"</span>,\n });\n <span style=\"color:#e06c75;\">const</span> result = <span style=\"color:#c678dd;\">await</span> cmd.<span style=\"color:#61afef;\">output</span>();\n <span style=\"color:#c678dd;\">return</span> result.success;\n}</pre></div><div style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-bottom:16px;\"><h3 style=\"margin:0 0 12px 0;color:#98c379;\">✅ Recommended Fix</h3><ol style=\"margin:0;padding-left:24px;line-height:2;\"><li><strong>Cache the claude availability check</strong> - Run once at startup, cache result</li><li><strong>Remove from health 
endpoint</strong> - Health should be instant (just return session count + uptime)</li><li><strong>Optional: Background re-check</strong> - If needed, check every 60s in background</li></ol><pre style=\"background:#1e1e1e;padding:12px;border-radius:4px;overflow-x:auto;margin-top:12px;font-size:0.85rem;\"><span style=\"color:#7f848e;\">// Fixed version:</span>\n<span style=\"color:#e06c75;\">let</span> claudeAvailableCache = <span style=\"color:#d19a66;\">false</span>;\n\n<span style=\"color:#c678dd;\">function</span> <span style=\"color:#61afef;\">handleHealth</span>(): Response {\n <span style=\"color:#e06c75;\">const</span> stats = <span style=\"color:#61afef;\">getSessionStats</span>();\n <span style=\"color:#c678dd;\">return</span> <span style=\"color:#61afef;\">jsonResponse</span>({\n status: claudeAvailableCache ? <span style=\"color:#98c379;\">\"healthy\"</span> : <span style=\"color:#98c379;\">\"degraded\"</span>,\n activeSessions: stats.running,\n uptime: Date.<span style=\"color:#61afef;\">now</span>() - startTime,\n });\n}</pre></div><div style=\"background:#1e3a5f;padding:16px;border-radius:6px;\"><h4 style=\"margin:0 0 8px 0;color:#fff;\">Why This Wasn't Hot Sessions</h4><p style=\"color:#e0e0e0;margin:0;\">This bug has always been present but only manifests during heavy turns. Before Hot Sessions, turns were shorter. The Quick Turn work didn't introduce this bug - it just made turns heavier (more tool calls, longer duration), exposing the existing issue more frequently.</p></div></div>",
"requestedAt": "2026-01-06T04:30:00.000Z",
"requestId": "18b17b87-ad13-4cd8-823f-53dadae7583b",
"choices": [
{
"label": "Fix it now",
"value": "Implement the fix: cache claude availability at startup and make handleHealth() synchronous. This should resolve the RECONNECTING and 503 issues.",
"primary": true
},
{
"label": "Quick test",
"value": "Test the theory by running 'time curl http://localhost:8020/health' during a heavy turn to measure health check latency"
},
{
"label": "Increase timeout",
"value": "Quick workaround: increase the health check timeout from 2s to 5s in konsole-client.ts"
}
]
},
"createdBy": "claude",
"createdAt": "2026-01-05T14:28:36.197Z",
"updatedAt": "2026-01-05T14:28:36.508Z",
"requestId": "18b17b87-ad13-4cd8-823f-53dadae7583b",
"scope": "vibetools",
"tags": [
"konsole",
"konui",
"bug",
"performance",
"health-check"
],
"targetUser": "claude"
}