feat: local model reliability — SDK retries, capability probing, init skill, context compaction
Three compounding bugs prevented tool calling with llama.cpp:
- Stream parser set argsComplete on partial JSON (e.g. "{"), dropping
subsequent argument deltas — fix: use json.Valid to detect completeness
- Missing tool_choice default — llama.cpp needs explicit "auto" to
activate its GBNF grammar constraint; now set when tools are present
- Tool names in history used internal format (fs.ls) while definitions
used API format (fs_ls) — now re-sanitized in translateMessage
Additional changes:
- Disable SDK retries for local providers (500s are deterministic)
- Dynamic capability probing via /props (llama.cpp) and /api/show
(Ollama), replacing hardcoded model prefix list
- Engine respects forced arm ToolUse capability when router is active
- Bundled /init skill with Go template blocks, context-aware for local
vs cloud models, deduplication rules against CLAUDE.md
- Tool result compaction for local models — previous round results
replaced with size markers to stay within small context windows
- Text-only fallback when tool-parse errors occur on local models
- "text-only" TUI indicator when model lacks tool support
- Session ResetError for retry after stream failures
- AllowedTools per-turn filtering in engine buildRequest
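The stream-parser fix boils down to a validity check on the accumulated argument buffer. A minimal sketch of the idea (the `argsComplete` helper here is illustrative; in the real parser this is a field on the in-flight tool call):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// argsComplete reports whether a streamed tool-call argument buffer holds a
// complete JSON document. The buggy version only checked buf != "", so a
// partial prefix like "{" was marked final and later deltas were dropped.
func argsComplete(buf string) bool {
	return buf != "" && json.Valid([]byte(buf))
}

func main() {
	for _, buf := range []string{"", "{", `{"path":`, `{"path":"/"}`} {
		fmt.Printf("%q -> complete=%v\n", buf, argsComplete(buf))
	}
}
```

`json.Valid` accepts only a complete JSON document, so a prefix like `{` or `{"path":` keeps the call open and later argument deltas are still appended.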
AGENTS.md (83 lines changed)
@@ -1,67 +1,26 @@
# AGENTS.md

## 🚀 Project Overview: Gnoma Agentic Assistant
## Domain Terminology
- **Elf**: An agent instance.
- **Turn**: A complete sequence of agentic reasoning and tool execution.
- **Routing Arm**: A specific model/provider selected by the `Router` for a task.
- **Stream Event**: Discrete updates during LLM generation (e.g., `EventTextDelta`, `EventToolCallStart`, `EventToolResult`).

**Project Name:** Gnoma
**Description:** A provider-agnostic agentic coding assistant written in Go. The name is derived from the northern pygmy-owl (*Glaucidium gnoma*). This system facilitates complex, multi-step reasoning and task execution by orchestrating calls to various external Large Language Model (LLM) providers.
## Build & Test Targets
- **Run**: `make run`
- **Test (Verbose)**: `make test-v`
- **Integration Tests**: `make test-integration` (requires `//go:build integration`)

**Module Path:** `somegit.dev/Owlibou/gnoma`
## Key Dependencies
- **Mistral**: `github.com/VikingOwl91/mistral-go-sdk`
- **Anthropic**: `github.com/anthropics/anthropic-sdk-go`
- **OpenAI**: `github.com/openai/openai-go`
- **Google GenAI**: `google.golang.org/genai`
- **TUI**: `charm.land/bubbletea/v2`, `charm.land/lipgloss/v2`
- **Other**: `charm.land/bubbles/v2`, `charm.land/glamour/v2`, `github.com/pkoukk/tiktoken-go`

---

## 🛠️ Build & Testing Instructions

The standard build system uses `make` for all development and testing tasks:

* **Build Binary:** `make build` (Creates the executable in `./bin/gnoma`)
* **Run All Tests:** `make test`
* **Lint Code:** `make lint` (Uses `golangci-lint`)
* **Run Coverage Report:** `make cover`

**Architectural Note:** Changes requiring deep architectural review or boundaries must first consult the design decisions documented in `docs/essentials/INDEX.md`.

---

## 🔗 Dependencies & Providers

The system is designed for provider agnosticism and supports multiple primary backends through standardized interfaces:

* **Mistral:** Via `github.com/VikingOwl91/mistral-go-sdk`
* **Anthropic:** Via `github.com/anthropics/anthropic-sdk-go`
* **OpenAI:** Via `github.com/openai/openai-go`
* **Google:** Via `google.golang.org/genai`
* **Ollama/llama.cpp:** Handled via the OpenAI SDK structure with a custom base URL configuration.

---

## 📜 Development Conventions (Mandatory Guidelines)

Adherence to the following conventions is required for maintainability, testability, and consistency across the entire codebase.

### ⭐ Go Idioms & Style
* **Modern Go:** Adhere strictly to Go 1.26 idioms, including the use of `new(expr)`, `errors.AsType[E]`, `sync.WaitGroup.Go`, and implementing structured logging via `log/slog`.
* **Data Structures:** Use **structs with explicit type discriminants** to model discriminated unions, *not* Go interfaces.
* **Streaming:** Implement pull-based stream iterators following the pattern: `Next() / Current() / Err() / Close()`.
* **API Handling:** Utilize `json.RawMessage` for tool schemas and arguments to ensure zero-cost JSON passthrough.
* **Configuration:** Favor **Functional Options** for complex configuration structures.
* **Concurrency:** Use `golang.org/x/sync/errgroup` for managing parallel work groups.

### 🧪 Testing Philosophy
* **TDD First:** Always write tests *before* writing production code.
* **Test Style:** Employ table-driven tests extensively.
* **Contextual Testing:**
  * Use build tags (`//go:build integration`) for tests that interact with real external APIs.
  * Use `testing/synctest` for any tests requiring concurrent execution checks.
  * Use `t.TempDir()` for all file system simulations.

### 🏷️ Naming Conventions
* **Packages:** Names must be short, entirely lowercase, and contain no underscores.
* **Interfaces:** Must describe *behavior* (what it does), not *implementation* (how it is done).
* **Filenames/Types:** Should follow standard Go casing conventions.

### ⚙️ Execution & Pattern Guidelines
* **Orchestration Flow:** State management should be handled sequentially through specialized manager/worker structs (e.g., `AgentExecutor`).
* **Error Handling:** Favor structured, wrapped errors over bare `panic`/`recover`.

---
***Note:*** *This document synthesizes the core architectural constraints derived from the project structure.*
## Environment Variables
- `MISTRAL_API_KEY`: Required for Mistral provider.
- `ANTHROPIC_API_KEY`: Required for Anthropic provider.
- `OPENAI_API_KEY`: Required for OpenAI provider.
- `GOOGLE_API_KEY`: Required for Google provider.

@@ -275,6 +275,7 @@ func main() {
	localModels := router.DiscoverLocalModels(context.Background(), logger,
		cfg.Provider.Endpoints["ollama"],
		cfg.Provider.Endpoints["llamacpp"],
+		nil, // no cache for initial one-shot discovery
	)
	router.RegisterDiscoveredModels(rtr, localModels, func(provName, model string) provider.Provider {
		p, err := createProvider(provName, "", model, cfg.Provider.Endpoints[provName])
@@ -634,6 +635,7 @@ func main() {
		Args:        args,
		Cwd:         cwd,
		ProjectRoot: gnomacfg.ProjectRoot(),
+		Local:       localProviders[*providerName],
	})
	if renderErr != nil {
		fmt.Fprintf(os.Stderr, "skill %q: %v\n", name, renderErr)
@@ -790,7 +792,7 @@ func discoverActiveModel(provName string, cfg *gnomacfg.Config, logger *slog.Log
	case "llamacpp":
		models, err = router.DiscoverLlamaCpp(ctx, cfg.Provider.Endpoints["llamacpp"])
	case "ollama":
-		models, err = router.DiscoverOllama(ctx, cfg.Provider.Endpoints["ollama"])
+		models, err = router.DiscoverOllama(ctx, cfg.Provider.Endpoints["ollama"], nil)
	default:
		return ""
	}
internal/engine/buildrequest_test.go (new file, 185 lines)
@@ -0,0 +1,185 @@
package engine

import (
	"context"
	"testing"

	"somegit.dev/Owlibou/gnoma/internal/provider"
	"somegit.dev/Owlibou/gnoma/internal/router"
	"somegit.dev/Owlibou/gnoma/internal/tool"
)

func TestForcedArmSupportsTools_NoRouter(t *testing.T) {
	e := &Engine{cfg: Config{}}
	if !e.forcedArmSupportsTools() {
		t.Error("should return true when no router configured")
	}
}

func TestForcedArmSupportsTools_NoForcedArm(t *testing.T) {
	rtr := router.New(router.Config{})
	e := &Engine{cfg: Config{Router: rtr}}
	if !e.forcedArmSupportsTools() {
		t.Error("should return true when no forced arm (multi-arm routing)")
	}
}

func TestForcedArmSupportsTools_ArmWithTools(t *testing.T) {
	rtr := router.New(router.Config{})
	rtr.RegisterArm(&router.Arm{
		ID:           "llamacpp/qwen3",
		Provider:     &mockProvider{name: "llamacpp"},
		ModelName:    "qwen3",
		IsLocal:      true,
		Capabilities: provider.Capabilities{ToolUse: true},
	})
	rtr.ForceArm("llamacpp/qwen3")

	e := &Engine{cfg: Config{Router: rtr}}
	if !e.forcedArmSupportsTools() {
		t.Error("should return true when forced arm supports tools")
	}
}

func TestForcedArmSupportsTools_ArmWithoutTools(t *testing.T) {
	rtr := router.New(router.Config{})
	rtr.RegisterArm(&router.Arm{
		ID:           "llamacpp/gemma",
		Provider:     &mockProvider{name: "llamacpp"},
		ModelName:    "gemma",
		IsLocal:      true,
		Capabilities: provider.Capabilities{ToolUse: false},
	})
	rtr.ForceArm("llamacpp/gemma")

	e := &Engine{cfg: Config{Router: rtr}}
	if e.forcedArmSupportsTools() {
		t.Error("should return false when forced arm does not support tools")
	}
}

func TestBuildRequest_ForcedArmNoToolSupport_OmitsTools(t *testing.T) {
	rtr := router.New(router.Config{})
	rtr.RegisterArm(&router.Arm{
		ID:           "llamacpp/gemma",
		Provider:     &mockProvider{name: "llamacpp"},
		ModelName:    "gemma",
		IsLocal:      true,
		Capabilities: provider.Capabilities{ToolUse: false},
	})
	rtr.ForceArm("llamacpp/gemma")

	reg := tool.NewRegistry()
	reg.Register(&mockTool{name: "fs.read"})
	reg.Register(&mockTool{name: "bash"})

	e, err := New(Config{
		Provider: &mockProvider{name: "llamacpp"},
		Router:   rtr,
		Tools:    reg,
	})
	if err != nil {
		t.Fatalf("New() error = %v", err)
	}

	req := e.buildRequest(context.Background())
	if len(req.Tools) != 0 {
		t.Errorf("buildRequest() included %d tools, want 0 for arm without tool support", len(req.Tools))
	}
}

func TestBuildRequest_ForcedArmWithToolSupport_IncludesTools(t *testing.T) {
	rtr := router.New(router.Config{})
	rtr.RegisterArm(&router.Arm{
		ID:           "llamacpp/qwen3",
		Provider:     &mockProvider{name: "llamacpp"},
		ModelName:    "qwen3",
		IsLocal:      true,
		Capabilities: provider.Capabilities{ToolUse: true},
	})
	rtr.ForceArm("llamacpp/qwen3")

	reg := tool.NewRegistry()
	reg.Register(&mockTool{name: "fs.read"})
	reg.Register(&mockTool{name: "bash"})

	e, err := New(Config{
		Provider: &mockProvider{name: "llamacpp"},
		Router:   rtr,
		Tools:    reg,
	})
	if err != nil {
		t.Fatalf("New() error = %v", err)
	}

	req := e.buildRequest(context.Background())
	if len(req.Tools) != 2 {
		t.Errorf("buildRequest() included %d tools, want 2 for arm with tool support", len(req.Tools))
	}
}

func TestBuildRequest_AllowedToolsFilter(t *testing.T) {
	reg := tool.NewRegistry()
	reg.Register(&mockTool{name: "fs.ls"})
	reg.Register(&mockTool{name: "fs.read"})
	reg.Register(&mockTool{name: "fs.write"})
	reg.Register(&mockTool{name: "bash"})
	reg.Register(&mockTool{name: "agent"})

	e, err := New(Config{
		Provider: &mockProvider{name: "llamacpp"},
		Tools:    reg,
	})
	if err != nil {
		t.Fatalf("New() error = %v", err)
	}

	// Without filter: all 5 tools
	req := e.buildRequest(context.Background())
	if len(req.Tools) != 5 {
		t.Errorf("unfiltered: got %d tools, want 5", len(req.Tools))
	}

	// With filter: only fs.ls and fs.write
	e.turnOpts.AllowedTools = []string{"fs.ls", "fs.write"}
	req = e.buildRequest(context.Background())
	if len(req.Tools) != 2 {
		t.Errorf("filtered: got %d tools, want 2", len(req.Tools))
	}
	names := make(map[string]bool)
	for _, td := range req.Tools {
		names[td.Name] = true
	}
	if !names["fs.ls"] || !names["fs.write"] {
		t.Errorf("filtered tools = %v, want fs.ls and fs.write", names)
	}
}

func TestBuildRequest_MultiArmRouting_IncludesTools(t *testing.T) {
	rtr := router.New(router.Config{})
	rtr.RegisterArm(&router.Arm{
		ID:           "llamacpp/gemma",
		Provider:     &mockProvider{name: "llamacpp"},
		ModelName:    "gemma",
		IsLocal:      true,
		Capabilities: provider.Capabilities{ToolUse: false},
	})
	// No forced arm — multi-arm routing

	reg := tool.NewRegistry()
	reg.Register(&mockTool{name: "fs.read"})

	e, err := New(Config{
		Provider: &mockProvider{name: "llamacpp"},
		Router:   rtr,
		Tools:    reg,
	})
	if err != nil {
		t.Fatalf("New() error = %v", err)
	}

	req := e.buildRequest(context.Background())
	if len(req.Tools) != 1 {
		t.Errorf("buildRequest() included %d tools, want 1 for multi-arm routing (no forced arm)", len(req.Tools))
	}
}
internal/engine/compact.go (new file, 70 lines)
@@ -0,0 +1,70 @@
package engine

import (
	"fmt"

	"somegit.dev/Owlibou/gnoma/internal/message"
)

// compactPreviousToolResults replaces the content of tool results from
// already-processed rounds with a short size marker. The most recent tool
// results (after the last assistant message) are kept intact because the
// model hasn't responded to them yet.
//
// This dramatically reduces context usage in multi-round agentic loops,
// which is critical for local models with small context windows.
func compactPreviousToolResults(msgs []message.Message) []message.Message {
	// Find the last assistant message — tool results before it have been
	// processed; those after it are pending.
	lastAssistant := -1
	for i := len(msgs) - 1; i >= 0; i-- {
		if msgs[i].Role == message.RoleAssistant {
			lastAssistant = i
			break
		}
	}
	if lastAssistant <= 0 {
		return msgs
	}

	out := make([]message.Message, len(msgs))
	copy(out, msgs)
	for i := range out {
		if i >= lastAssistant {
			break
		}
		if isToolResultMessage(out[i]) {
			out[i] = compactToolResultMessage(out[i])
		}
	}
	return out
}

func isToolResultMessage(m message.Message) bool {
	return m.Role == message.RoleUser &&
		len(m.Content) > 0 &&
		m.Content[0].Type == message.ContentToolResult
}

func compactToolResultMessage(m message.Message) message.Message {
	compacted := message.Message{
		Role:    m.Role,
		Content: make([]message.Content, len(m.Content)),
	}
	for i, c := range m.Content {
		if c.Type == message.ContentToolResult && c.ToolResult != nil {
			summary := fmt.Sprintf("[prior result: %d chars]", len(c.ToolResult.Content))
			compacted.Content[i] = message.Content{
				Type: message.ContentToolResult,
				ToolResult: &message.ToolResult{
					ToolCallID: c.ToolResult.ToolCallID,
					Content:    summary,
					IsError:    c.ToolResult.IsError,
				},
			}
		} else {
			compacted.Content[i] = c
		}
	}
	return compacted
}
internal/engine/compact_test.go (new file, 158 lines)
@@ -0,0 +1,158 @@
package engine

import (
	"encoding/json"
	"testing"

	"somegit.dev/Owlibou/gnoma/internal/message"
)

func TestCompactPreviousToolResults_NoAssistant(t *testing.T) {
	msgs := []message.Message{
		{Role: message.RoleUser, Content: []message.Content{message.NewTextContent("hello")}},
	}
	got := compactPreviousToolResults(msgs)
	if len(got) != 1 || got[0].TextContent() != "hello" {
		t.Error("should return messages unchanged when no assistant message exists")
	}
}

func TestCompactPreviousToolResults_SingleRound(t *testing.T) {
	// user → assistant(tool_call) → tool_result
	// Only one round, tool result is the latest — should NOT be compacted.
	msgs := []message.Message{
		{Role: message.RoleUser, Content: []message.Content{message.NewTextContent("do /init")}},
		{Role: message.RoleAssistant, Content: []message.Content{
			message.NewToolCallContent(message.ToolCall{ID: "c1", Name: "fs.ls", Arguments: json.RawMessage(`{}`)}),
		}},
		toolResultMsg("c1", "file1.go\nfile2.go\nfile3.go\n"),
	}
	got := compactPreviousToolResults(msgs)
	// Tool result is after the last assistant message — should be intact.
	result := got[2].Content[0].ToolResult
	if result.Content == "" || len(result.Content) < 10 {
		t.Errorf("latest tool result should be intact, got %q", result.Content)
	}
}

func TestCompactPreviousToolResults_TwoRounds(t *testing.T) {
	bigContent := make([]byte, 2000)
	for i := range bigContent {
		bigContent[i] = 'x'
	}

	msgs := []message.Message{
		{Role: message.RoleUser, Content: []message.Content{message.NewTextContent("do /init")}},
		// Round 0
		{Role: message.RoleAssistant, Content: []message.Content{
			message.NewToolCallContent(message.ToolCall{ID: "c1", Name: "fs.read", Arguments: json.RawMessage(`{}`)}),
		}},
		toolResultMsg("c1", string(bigContent)), // 2000 chars — should be compacted
		// Round 1
		{Role: message.RoleAssistant, Content: []message.Content{
			message.NewToolCallContent(message.ToolCall{ID: "c2", Name: "fs.write", Arguments: json.RawMessage(`{}`)}),
		}},
		toolResultMsg("c2", "file written"), // latest — should be intact
	}

	got := compactPreviousToolResults(msgs)

	// Round 0 tool result (index 2) should be compacted
	r0 := got[2].Content[0].ToolResult
	if len(r0.Content) > 100 {
		t.Errorf("round 0 tool result should be compacted, got %d chars", len(r0.Content))
	}
	if r0.ToolCallID != "c1" {
		t.Errorf("compacted result should preserve ToolCallID, got %q", r0.ToolCallID)
	}

	// Round 1 tool result (index 4) should be intact
	r1 := got[4].Content[0].ToolResult
	if r1.Content != "file written" {
		t.Errorf("latest tool result should be intact, got %q", r1.Content)
	}
}

func TestCompactPreviousToolResults_PreservesNonToolMessages(t *testing.T) {
	msgs := []message.Message{
		{Role: message.RoleUser, Content: []message.Content{message.NewTextContent("hello")}},
		{Role: message.RoleAssistant, Content: []message.Content{
			message.NewTextContent("I'll read the file"),
			message.NewToolCallContent(message.ToolCall{ID: "c1", Name: "fs.read", Arguments: json.RawMessage(`{}`)}),
		}},
		toolResultMsg("c1", "file contents here..."),
		{Role: message.RoleAssistant, Content: []message.Content{message.NewTextContent("done")}},
	}
	got := compactPreviousToolResults(msgs)

	// User text message should be unchanged
	if got[0].TextContent() != "hello" {
		t.Errorf("user message should be unchanged, got %q", got[0].TextContent())
	}
	// Assistant text should be unchanged
	if got[1].TextContent() != "I'll read the file" {
		t.Errorf("assistant message should be unchanged, got %q", got[1].TextContent())
	}
}

func TestCompactPreviousToolResults_PreservesErrorFlag(t *testing.T) {
	msgs := []message.Message{
		{Role: message.RoleUser, Content: []message.Content{message.NewTextContent("hi")}},
		{Role: message.RoleAssistant, Content: []message.Content{
			message.NewToolCallContent(message.ToolCall{ID: "c1", Name: "fs.read", Arguments: json.RawMessage(`{}`)}),
		}},
		errorToolResultMsg("c1", "permission denied: /etc/shadow"),
		{Role: message.RoleAssistant, Content: []message.Content{message.NewTextContent("sorry")}},
	}
	got := compactPreviousToolResults(msgs)
	r := got[2].Content[0].ToolResult
	if !r.IsError {
		t.Error("compacted error result should preserve IsError=true")
	}
}

func TestCompactPreviousToolResults_DoesNotMutateOriginal(t *testing.T) {
	original := "a long tool result content that should not be modified"
	msgs := []message.Message{
		{Role: message.RoleUser, Content: []message.Content{message.NewTextContent("hi")}},
		{Role: message.RoleAssistant, Content: []message.Content{
			message.NewToolCallContent(message.ToolCall{ID: "c1", Name: "fs.read", Arguments: json.RawMessage(`{}`)}),
		}},
		toolResultMsg("c1", original),
		{Role: message.RoleAssistant, Content: []message.Content{message.NewTextContent("ok")}},
	}
	_ = compactPreviousToolResults(msgs)
	// Original message should be unchanged
	if msgs[2].Content[0].ToolResult.Content != original {
		t.Error("compaction should not mutate the original messages")
	}
}

// helpers

func toolResultMsg(toolCallID, content string) message.Message {
	return message.Message{
		Role: message.RoleUser,
		Content: []message.Content{{
			Type: message.ContentToolResult,
			ToolResult: &message.ToolResult{
				ToolCallID: toolCallID,
				Content:    content,
			},
		}},
	}
}

func errorToolResultMsg(toolCallID, content string) message.Message {
	return message.Message{
		Role: message.RoleUser,
		Content: []message.Content{{
			Type: message.ContentToolResult,
			ToolResult: &message.ToolResult{
				ToolCallID: toolCallID,
				Content:    content,
				IsError:    true,
			},
		}},
	}
}
@@ -51,7 +51,8 @@ type Turn struct {

// TurnOptions carries per-turn overrides that apply for a single Submit call.
type TurnOptions struct {
-	ToolChoice provider.ToolChoiceMode // "" = use provider default
+	ToolChoice   provider.ToolChoiceMode // "" = use provider default
+	AllowedTools []string                // if non-nil, only these tools are sent (matched by name)
}

// Engine orchestrates the conversation.

@@ -73,6 +74,55 @@ type Engine struct {
	turnOpts TurnOptions
}

+// ToolsAvailable reports whether the current model supports tool calling.
+func (e *Engine) ToolsAvailable() bool {
+	return e.forcedArmSupportsTools()
+}
+
+// forcedArmSupportsTools reports whether tool definitions should be included
+// in the request. When the router has a forced arm, it checks that arm's
+// ToolUse capability. It returns true for multi-arm routing (the feasibility
+// filter handles it) or when no router is configured.
+func (e *Engine) forcedArmSupportsTools() bool {
+	if e.cfg.Router == nil {
+		return true
+	}
+	id := e.cfg.Router.ForcedArm()
+	if id == "" {
+		return true // multi-arm routing: router handles feasibility
+	}
+	arm, ok := e.cfg.Router.LookupArm(id)
+	if !ok {
+		if e.logger != nil {
+			e.logger.Debug("forced arm not found in router, assuming tool support", "arm", id)
+		}
+		return true
+	}
+	if e.logger != nil {
+		e.logger.Debug("forced arm tool support check",
+			"arm", id,
+			"tool_use", arm.Capabilities.ToolUse,
+		)
+	}
+	return arm.Capabilities.ToolUse
+}
+
+// isLocalArm reports whether the forced arm is a local provider (Ollama, llama.cpp).
+func (e *Engine) isLocalArm() bool {
+	if e.cfg.Router == nil {
+		return false
+	}
+	id := e.cfg.Router.ForcedArm()
+	if id == "" {
+		return false
+	}
+	arm, ok := e.cfg.Router.LookupArm(id)
+	if !ok {
+		return false
+	}
+	return arm.IsLocal
+}
+
// New creates an engine.
func New(cfg Config) (*Engine, error) {
	if err := cfg.validate(); err != nil {
@@ -4,6 +4,8 @@ import (
	"context"
	"errors"
	"fmt"
+	"slices"
+	"strings"
	"sync"
	"time"

@@ -184,9 +186,13 @@ func (e *Engine) runLoop(ctx context.Context, cb Callback) (*Turn, error) {
	}
	streamEnd := time.Now()
	if err := s.Err(); err != nil {
+		e.logger.Debug("stream terminated with error",
+			"error", err,
+			"rounds", turn.Rounds,
+		)
		s.Close()
		decision.Rollback()
-		return nil, fmt.Errorf("stream error: %w", err)
+		return nil, e.annotateStreamError(err, len(req.Tools))
	}
	s.Close()

@@ -276,6 +282,11 @@ func (e *Engine) buildRequest(ctx context.Context) provider.Request {
	if e.cfg.Context != nil {
		messages = e.cfg.Context.AllMessages()
	}
+	// For local models, compact tool results from previous rounds to stay
+	// within small context windows. Cloud models keep full results.
+	if e.isLocalArm() {
+		messages = compactPreviousToolResults(messages)
+	}
	systemPrompt := e.cfg.System
	if e.cfg.Firewall != nil {
		messages = e.cfg.Firewall.ScanOutgoingMessages(messages)
@@ -290,22 +301,37 @@ func (e *Engine) buildRequest(ctx context.Context) provider.Request {
	}

	// Only include tools if the model supports them.
-	// When Router is active, skip capability gating — the router selects the arm
-	// and already knows its capabilities. Gating here would use the wrong provider.
+	// When a forced arm is set, check its ToolUse capability directly.
+	// For multi-arm routing (no forced arm), include tools and let the
+	// router's feasibility filter handle capability matching.
	caps := e.resolveCapabilities(ctx)
-	if e.cfg.Router != nil || caps == nil || caps.ToolUse {
-		// Router active, nil caps (unknown model), or model supports tools
+	includeTools := false
+	if e.cfg.Router != nil {
+		includeTools = e.forcedArmSupportsTools()
+	} else {
+		includeTools = caps == nil || caps.ToolUse
+	}
+	if includeTools {
+		allowed := e.turnOpts.AllowedTools
		for _, t := range e.cfg.Tools.All() {
			// Skip deferred tools until the model requests them
			if dt, ok := t.(tool.DeferrableTool); ok && dt.ShouldDefer() && !e.activatedTools[t.Name()] {
				continue
			}
+			// Filter to allowed tools when a restrict list is set
+			if allowed != nil && !slices.Contains(allowed, t.Name()) {
+				continue
+			}
			req.Tools = append(req.Tools, provider.ToolDefinition{
				Name:        t.Name(),
				Description: t.Description(),
				Parameters:  t.Parameters(),
			})
		}
+		e.logger.Debug("tools included in request",
+			"model", req.Model,
+			"count", len(req.Tools),
+		)
	} else {
		e.logger.Debug("tools omitted — model does not support tool use",
			"model", req.Model,
@@ -553,6 +579,10 @@ func (e *Engine) handleRequestTooLarge(ctx context.Context, origErr error, req p
func (e *Engine) retryOnTransient(ctx context.Context, firstErr error, fn func() (stream.Stream, error)) (stream.Stream, error) {
	var provErr *provider.ProviderError
	if !errors.As(firstErr, &provErr) || !provErr.Retryable {
+		e.logger.Debug("error not retryable",
+			"is_provider_error", errors.As(firstErr, &provErr),
+			"error", firstErr,
+		)
		return nil, firstErr
	}

@@ -595,3 +625,19 @@ func (e *Engine) retryOnTransient(ctx context.Context, firstErr error, fn func()
	return nil, firstErr
}

+// annotateStreamError wraps a stream error with diagnostic context when the
+// failure is a deterministic tool-parse error from a local server. The extra
+// context is visible in the TUI (slog.Debug goes to a file).
+func (e *Engine) annotateStreamError(err error, toolCount int) error {
+	var provErr *provider.ProviderError
+	if errors.As(err, &provErr) && provErr.StatusCode == 500 &&
+		strings.Contains(strings.ToLower(provErr.Message), "parse tool call") {
+		toolSupport := e.forcedArmSupportsTools()
+		return fmt.Errorf("stream error (tools_sent=%d, probe_tool_support=%v): %w\n"+
+			"hint: the model's chat template claims tool support but it generated invalid tool JSON. "+
+			"Ensure llama.cpp is started with --jinja, or try a model with better tool-calling ability",
+			toolCount, toolSupport, err)
+	}
+	return fmt.Errorf("stream error: %w", err)
+}
@@ -39,6 +39,9 @@ func NewWithStreamOptions(cfg provider.ProviderConfig, streamOpts []option.Reque
	if cfg.BaseURL != "" {
		opts = append(opts, option.WithBaseURL(cfg.BaseURL))
	}
+	if cfg.MaxRetries != nil {
+		opts = append(opts, option.WithMaxRetries(*cfg.MaxRetries))
+	}

	client := oai.NewClient(opts...)

@@ -3,6 +3,7 @@ package openai
import (
	"encoding/json"
	"errors"
+	"log/slog"

	"somegit.dev/Owlibou/gnoma/internal/message"
	"somegit.dev/Owlibou/gnoma/internal/provider"
@@ -80,7 +81,7 @@ func (s *openaiStream) Next() bool {
			id:   tc.ID,
			name: tc.Function.Name,
			args: tc.Function.Arguments,
-			argsComplete: tc.Function.Arguments != "",
+			argsComplete: tc.Function.Arguments != "" && json.Valid([]byte(tc.Function.Arguments)),
		}
		s.toolCalls[tc.Index] = existing
		s.hadToolCalls = true
@@ -193,6 +194,12 @@ func wrapSDKError(err error) error {
		return err
	}
	kind, retryable := provider.ClassifyHTTPError(apiErr.StatusCode, apiErr.Message)
+	slog.Debug("openai SDK error wrapped",
+		"status", apiErr.StatusCode,
+		"kind", kind,
+		"retryable", retryable,
+		"message", apiErr.Message,
+	)
	return &provider.ProviderError{
		Kind:     kind,
		Provider: "openai",

@@ -72,7 +72,7 @@ func translateMessage(m message.Message) []oai.ChatCompletionMessageParamUnion {
	msg.OfAssistant.ToolCalls = append(msg.OfAssistant.ToolCalls, oai.ChatCompletionMessageToolCallParam{
		ID: tc.ID,
		Function: oai.ChatCompletionMessageToolCallFunctionParam{
-			Name:      tc.Name,
+			Name:      sanitizeToolName(tc.Name),
			Arguments: string(tc.Arguments),
		},
	})
@@ -131,9 +131,13 @@ func translateRequest(req provider.Request) oai.ChatCompletionNewParams {
		IncludeUsage: param.NewOpt(true),
	}

-	if req.ToolChoice != "" && len(params.Tools) > 0 {
+	if len(params.Tools) > 0 {
+		choice := "auto"
+		if req.ToolChoice != "" {
+			choice = string(req.ToolChoice)
+		}
		params.ToolChoice = oai.ChatCompletionToolChoiceOptionUnionParam{
-			OfAuto: param.NewOpt(string(req.ToolChoice)),
+			OfAuto: param.NewOpt(choice),
		}
	}
110
internal/provider/openai/translate_test.go
Normal file
@@ -0,0 +1,110 @@
|
||||
package openai
|
||||
|
||||
import (
|
||||
"encoding/json"
|
||||
"testing"
|
||||
|
||||
"somegit.dev/Owlibou/gnoma/internal/message"
|
||||
"somegit.dev/Owlibou/gnoma/internal/provider"
|
||||
|
||||
"github.com/openai/openai-go/packages/param"
|
||||
)
|
||||
|
||||
func TestTranslateMessage_AssistantToolCallNames_Sanitized(t *testing.T) {
|
||||
msg := message.Message{
|
||||
Role: message.RoleAssistant,
|
||||
Content: []message.Content{
|
||||
message.NewTextContent("calling tools"),
|
||||
message.NewToolCallContent(message.ToolCall{
|
||||
ID: "call_1",
|
||||
Name: "fs.ls", // internal gnoma name (dot)
|
||||
Arguments: json.RawMessage(`{"path":"/"}`),
|
||||
}),
|
||||
message.NewToolCallContent(message.ToolCall{
|
||||
ID: "call_2",
|
||||
Name: "fs.read", // internal gnoma name (dot)
|
||||
Arguments: json.RawMessage(`{"path":"/tmp/x"}`),
|
||||
}),
|
||||
},
|
||||
}
|
||||
|
||||
out := translateMessage(msg)
|
||||
if len(out) != 1 {
|
||||
t.Fatalf("translateMessage returned %d messages, want 1", len(out))
|
||||
}
|
||||
|
||||
calls := out[0].OfAssistant.ToolCalls
|
||||
if len(calls) != 2 {
|
||||
t.Fatalf("got %d tool calls, want 2", len(calls))
|
||||
}
|
||||
if calls[0].Function.Name != "fs_ls" {
|
||||
t.Errorf("tool call 0 name = %q, want %q", calls[0].Function.Name, "fs_ls")
|
||||
}
|
||||
if calls[1].Function.Name != "fs_read" {
|
||||
t.Errorf("tool call 1 name = %q, want %q", calls[1].Function.Name, "fs_read")
|
||||
}
|
||||
}
|
||||
|
||||
func TestTranslateRequest_ToolChoiceDefault(t *testing.T) {
|
||||
tests := []struct {
|
||||
name string
|
||||
tools []provider.ToolDefinition
|
||||
toolChoice provider.ToolChoiceMode
|
||||
wantChoice string // "" means omitted
|
||||
}{
|
||||
{
|
||||
name: "no tools, no choice — omitted",
|
||||
tools: nil,
|
||||
toolChoice: "",
|
||||
wantChoice: "",
|
||||
},
|
||||
{
|
||||
name: "tools present, no explicit choice — defaults to auto",
|
||||
tools: []provider.ToolDefinition{
|
||||
{Name: "fs_ls", Description: "list dir", Parameters: json.RawMessage(`{"type":"object"}`)},
|
||||
},
|
||||
toolChoice: "",
|
||||
wantChoice: "auto",
|
||||
},
|
||||
{
|
||||
name: "tools present, explicit required",
|
||||
tools: []provider.ToolDefinition{
|
||||
{Name: "fs_ls", Description: "list dir", Parameters: json.RawMessage(`{"type":"object"}`)},
|
||||
},
|
||||
toolChoice: provider.ToolChoiceRequired,
|
||||
wantChoice: "required",
|
||||
},
|
||||
{
|
||||
name: "tools present, explicit none",
|
||||
tools: []provider.ToolDefinition{
|
||||
{Name: "fs_ls", Description: "list dir", Parameters: json.RawMessage(`{"type":"object"}`)},
|
||||
},
|
||||
toolChoice: provider.ToolChoiceNone,
|
||||
wantChoice: "none",
|
||||
},
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.name, func(t *testing.T) {
|
||||
req := provider.Request{
|
||||
Model: "test-model",
|
||||
Tools: tt.tools,
|
||||
ToolChoice: tt.toolChoice,
|
||||
}
|
||||
|
||||
params := translateRequest(req)
|
||||
|
||||
if tt.wantChoice == "" {
|
||||
if !param.IsOmitted(params.ToolChoice.OfAuto) {
|
||||
t.Errorf("tool_choice should be omitted, got OfAuto=%q", params.ToolChoice.OfAuto.Value)
|
||||
}
|
||||
} else {
|
||||
if param.IsOmitted(params.ToolChoice.OfAuto) {
|
||||
t.Errorf("tool_choice should be %q, but was omitted", tt.wantChoice)
|
||||
} else if params.ToolChoice.OfAuto.Value != tt.wantChoice {
|
||||
t.Errorf("tool_choice = %q, want %q", params.ToolChoice.OfAuto.Value, tt.wantChoice)
|
||||
}
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
@@ -12,6 +12,8 @@ const (
 	llamacppDefaultURL = "http://localhost:8080/v1"
 )

+func intPtr(v int) *int { return &v }
+
 // NewOllama creates a provider for a local Ollama instance.
 func NewOllama(cfg provider.ProviderConfig) (provider.Provider, error) {
 	if cfg.BaseURL == "" {
@@ -23,6 +25,9 @@ func NewOllama(cfg provider.ProviderConfig) (provider.Provider, error) {
 	if cfg.Model == "" {
 		cfg.Model = "qwen3:8b"
 	}
+	if cfg.MaxRetries == nil {
+		cfg.MaxRetries = intPtr(0) // local 500s are deterministic, not transient
+	}
 	return oaiprov.New(cfg)
 }
@@ -37,5 +42,8 @@ func NewLlamaCpp(cfg provider.ProviderConfig) (provider.Provider, error) {
 	if cfg.Model == "" {
 		cfg.Model = "default"
 	}
+	if cfg.MaxRetries == nil {
+		cfg.MaxRetries = intPtr(0) // local 500s are deterministic, not transient
+	}
 	return oaiprov.New(cfg)
 }
51
internal/provider/openaicompat/provider_test.go
Normal file
@@ -0,0 +1,51 @@
package openaicompat

import (
	"testing"

	"somegit.dev/Owlibou/gnoma/internal/provider"
)

func TestNewOllama_SetsMaxRetriesToZero(t *testing.T) {
	cfg := provider.ProviderConfig{Model: "test-model"}
	_, err := NewOllama(cfg)
	if err != nil {
		t.Fatalf("NewOllama() error = %v", err)
	}
}

func TestNewLlamaCpp_SetsMaxRetriesToZero(t *testing.T) {
	cfg := provider.ProviderConfig{Model: "test-model"}
	_, err := NewLlamaCpp(cfg)
	if err != nil {
		t.Fatalf("NewLlamaCpp() error = %v", err)
	}
}

func TestNewOllama_Defaults(t *testing.T) {
	cfg := provider.ProviderConfig{}
	p, err := NewOllama(cfg)
	if err != nil {
		t.Fatalf("NewOllama() error = %v", err)
	}
	if p.Name() != "openai" {
		t.Errorf("Name() = %q, want %q", p.Name(), "openai")
	}
	if p.DefaultModel() != "qwen3:8b" {
		t.Errorf("DefaultModel() = %q, want %q", p.DefaultModel(), "qwen3:8b")
	}
}

func TestNewLlamaCpp_Defaults(t *testing.T) {
	cfg := provider.ProviderConfig{}
	p, err := NewLlamaCpp(cfg)
	if err != nil {
		t.Fatalf("NewLlamaCpp() error = %v", err)
	}
	if p.Name() != "openai" {
		t.Errorf("Name() = %q, want %q", p.Name(), "openai")
	}
	if p.DefaultModel() != "default" {
		t.Errorf("DefaultModel() = %q, want %q", p.DefaultModel(), "default")
	}
}
@@ -7,11 +7,12 @@ import (

 // ProviderConfig is the common configuration for any provider.
 type ProviderConfig struct {
-	Name    string
-	APIKey  string
-	BaseURL string         // override for OpenAI-compat endpoints
-	Model   string         // default model for this provider
-	Options map[string]any // provider-specific options
+	Name       string
+	APIKey     string
+	BaseURL    string         // override for OpenAI-compat endpoints
+	Model      string         // default model for this provider
+	Options    map[string]any // provider-specific options
+	MaxRetries *int           // nil = SDK default; ptr(0) = no retries
 }

 // Factory creates a Provider from configuration.
@@ -6,7 +6,6 @@ import (
 	"fmt"
 	"log/slog"
 	"net/http"
-	"strings"
 	"time"

 	"somegit.dev/Owlibou/gnoma/internal/provider"
@@ -24,33 +23,12 @@ type DiscoveredModel struct {
 	ContextSize int // context window in tokens (0 = unknown, use default)
 }

-// toolSupportedModelPrefixes lists known model families that support tool/function calling.
-// This is a conservative allowlist — unknown models default to no tool support.
-var toolSupportedModelPrefixes = []string{
-	"mistral", "mixtral", "codestral",
-	"llama3", "llama-3",
-	"qwen2", "qwen-2", "qwen2.5",
-	"command-r",
-	"functionary",
-	"hermes",
-	"firefunction",
-	"nexusraven",
-	"groq-tool",
-}
-
-// inferToolSupport returns true if the model name suggests tool/function calling support.
-func inferToolSupport(modelName string) bool {
-	lower := strings.ToLower(modelName)
-	for _, prefix := range toolSupportedModelPrefixes {
-		if strings.Contains(lower, prefix) {
-			return true
-		}
-	}
-	return false
-}
-
 // DiscoverOllama polls the local Ollama instance for available models.
-func DiscoverOllama(ctx context.Context, baseURL string) ([]DiscoveredModel, error) {
+// toolCache caches /api/show probe results per model name to avoid N requests
+// per discovery cycle. Pass nil to probe every model unconditionally.
+// The caller owns the cache and should pass the same map across cycles.
+func DiscoverOllama(ctx context.Context, baseURL string, toolCache map[string]bool) ([]DiscoveredModel, error) {
 	if baseURL == "" {
 		baseURL = "http://localhost:11434"
 	}
@@ -87,17 +65,35 @@ func DiscoverOllama(ctx context.Context, baseURL string) ([]DiscoveredModel, err
 		return nil, fmt.Errorf("ollama response parse: %w", err)
 	}

+	currentModels := make(map[string]bool, len(result.Models))
 	var models []DiscoveredModel
 	for _, m := range result.Models {
+		currentModels[m.Name] = true
+		supportsTools, cached := false, false
+		if toolCache != nil {
+			supportsTools, cached = toolCache[m.Name]
+		}
+		if !cached {
+			supportsTools = probeOllamaToolSupport(ctx, baseURL, m.Name)
+			if toolCache != nil {
+				toolCache[m.Name] = supportsTools
+			}
+		}
 		models = append(models, DiscoveredModel{
 			ID:            m.Name,
 			Name:          m.Name,
 			Provider:      "ollama",
 			Size:          m.Size,
-			SupportsTools: inferToolSupport(m.Name),
+			SupportsTools: supportsTools,
 			ContextSize:   32768, // conservative default; Ollama /api/show can refine this
 		})
 	}
+	// Prune cache entries for disappeared models (may be a different quant next time).
+	for name := range toolCache {
+		if !currentModels[name] {
+			delete(toolCache, name)
+		}
+	}
 	return models, nil
 }
@@ -134,13 +130,20 @@ func DiscoverLlamaCpp(ctx context.Context, baseURL string) ([]DiscoveredModel, e
 		return nil, fmt.Errorf("llama.cpp response parse: %w", err)
 	}

+	// llama.cpp loads one model server-wide; probe once for tool support.
+	toolSupport := probeLlamaCppToolSupport(ctx, baseURL)
+	slog.Debug("llamacpp discovery probe complete",
+		"models_found", len(result.Data),
+		"tool_support", toolSupport,
+	)
+
 	var models []DiscoveredModel
 	for _, m := range result.Data {
 		models = append(models, DiscoveredModel{
 			ID:            m.ID,
 			Name:          m.ID,
 			Provider:      "llamacpp",
-			SupportsTools: inferToolSupport(m.ID),
+			SupportsTools: toolSupport,
 			ContextSize:   8192, // llama.cpp default; --ctx-size configurable
 		})
 	}
@@ -149,10 +152,11 @@ func DiscoverLlamaCpp(ctx context.Context, baseURL string) ([]DiscoveredModel, e

 // DiscoverLocalModels discovers all available local models (ollama + llama.cpp).
 // Non-blocking: failures are logged and skipped.
-func DiscoverLocalModels(ctx context.Context, logger *slog.Logger, ollamaURL, llamacppURL string) []DiscoveredModel {
+// ollamaToolCache is passed to DiscoverOllama; nil skips caching.
+func DiscoverLocalModels(ctx context.Context, logger *slog.Logger, ollamaURL, llamacppURL string, ollamaToolCache map[string]bool) []DiscoveredModel {
 	var all []DiscoveredModel

-	if models, err := DiscoverOllama(ctx, ollamaURL); err != nil {
+	if models, err := DiscoverOllama(ctx, ollamaURL, ollamaToolCache); err != nil {
 		logger.Debug("ollama discovery failed (non-fatal)", "error", err)
 	} else {
 		logger.Debug("discovered ollama models", "count", len(models))
@@ -178,6 +182,7 @@ func StartDiscoveryLoop(ctx context.Context, r *Router, logger *slog.Logger,
 	onReconcile func(ArmID),
 ) {
 	go func() {
+		ollamaToolCache := make(map[string]bool)
 		ticker := time.NewTicker(interval)
 		defer ticker.Stop()
 		for {
@@ -185,7 +190,7 @@ func StartDiscoveryLoop(ctx context.Context, r *Router, logger *slog.Logger,
 			case <-ctx.Done():
 				return
 			case <-ticker.C:
-				models := DiscoverLocalModels(ctx, logger, ollamaURL, llamacppURL)
+				models := DiscoverLocalModels(ctx, logger, ollamaURL, llamacppURL, ollamaToolCache)
 				reconcileArms(r, models, providerFactory, logger, onReconcile)
 			}
 		}
97
internal/router/probe.go
Normal file
@@ -0,0 +1,97 @@
package router

import (
	"bytes"
	"context"
	"encoding/json"
	"log/slog"
	"net/http"
	"slices"
)

// probeLlamaCppToolSupport queries the llama.cpp /props endpoint to determine
// if the loaded model supports tool calling. Returns false on any error
// (conservative: unknown = no tools).
func probeLlamaCppToolSupport(ctx context.Context, baseURL string) bool {
	ctx, cancel := context.WithTimeout(ctx, discoveryTimeout)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, "GET", baseURL+"/props", nil)
	if err != nil {
		return false
	}

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return false
	}
	defer resp.Body.Close()

	if resp.StatusCode != 200 {
		return false
	}

	var result struct {
		ChatTemplateCaps struct {
			SupportsTools     bool `json:"supports_tools"`
			SupportsToolCalls bool `json:"supports_tool_calls"`
		} `json:"chat_template_caps"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		slog.Debug("llamacpp /props decode failed", "error", err)
		return false
	}

	caps := result.ChatTemplateCaps
	supported := caps.SupportsTools && caps.SupportsToolCalls
	slog.Debug("llamacpp tool probe",
		"supports_tools", caps.SupportsTools,
		"supports_tool_calls", caps.SupportsToolCalls,
		"result", supported,
	)
	return supported
}

// probeOllamaToolSupport queries Ollama's /api/show endpoint to determine
// if a specific model supports tool calling. Returns false on any error.
func probeOllamaToolSupport(ctx context.Context, baseURL, modelName string) bool {
	ctx, cancel := context.WithTimeout(ctx, discoveryTimeout)
	defer cancel()

	body, err := json.Marshal(map[string]string{"model": modelName})
	if err != nil {
		return false
	}

	req, err := http.NewRequestWithContext(ctx, "POST", baseURL+"/api/show", bytes.NewReader(body))
	if err != nil {
		return false
	}
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return false
	}
	defer resp.Body.Close()

	if resp.StatusCode != 200 {
		return false
	}

	var result struct {
		Capabilities []string `json:"capabilities"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		slog.Debug("ollama /api/show decode failed", "model", modelName, "error", err)
		return false
	}

	supported := slices.Contains(result.Capabilities, "tools")
	slog.Debug("ollama tool probe",
		"model", modelName,
		"capabilities", result.Capabilities,
		"supports_tools", supported,
	)
	return supported
}
147
internal/router/probe_test.go
Normal file
@@ -0,0 +1,147 @@
package router

import (
	"context"
	"net/http"
	"net/http/httptest"
	"testing"
)

func TestProbeLlamaCppToolSupport_SupportsTools(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/props" {
			t.Errorf("unexpected path %q", r.URL.Path)
		}
		w.Header().Set("Content-Type", "application/json")
		w.Write([]byte(`{
			"chat_template": "...",
			"chat_template_caps": {
				"supports_tools": true,
				"supports_tool_calls": true,
				"supports_parallel_tool_calls": false,
				"supports_system_role": true
			}
		}`))
	}))
	defer srv.Close()

	got := probeLlamaCppToolSupport(context.Background(), srv.URL)
	if !got {
		t.Error("probeLlamaCppToolSupport() = false, want true for model with tool support")
	}
}

func TestProbeLlamaCppToolSupport_NoToolSupport(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.Write([]byte(`{
			"chat_template": "...",
			"chat_template_caps": {
				"supports_tools": false,
				"supports_tool_calls": false,
				"supports_system_role": true
			}
		}`))
	}))
	defer srv.Close()

	got := probeLlamaCppToolSupport(context.Background(), srv.URL)
	if got {
		t.Error("probeLlamaCppToolSupport() = true, want false for model without tool support")
	}
}

func TestProbeLlamaCppToolSupport_NoCaps(t *testing.T) {
	// Old llama.cpp version that doesn't return chat_template_caps
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.Write([]byte(`{"chat_template": "...", "total_slots": 1}`))
	}))
	defer srv.Close()

	got := probeLlamaCppToolSupport(context.Background(), srv.URL)
	if got {
		t.Error("probeLlamaCppToolSupport() = true, want false when chat_template_caps is absent")
	}
}

func TestProbeLlamaCppToolSupport_ServerDown(t *testing.T) {
	got := probeLlamaCppToolSupport(context.Background(), "http://127.0.0.1:1")
	if got {
		t.Error("probeLlamaCppToolSupport() = true, want false when server unreachable")
	}
}

func TestProbeLlamaCppToolSupport_ToolsWithoutToolCalls(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.Write([]byte(`{
			"chat_template_caps": {
				"supports_tools": true,
				"supports_tool_calls": false
			}
		}`))
	}))
	defer srv.Close()

	got := probeLlamaCppToolSupport(context.Background(), srv.URL)
	if got {
		t.Error("probeLlamaCppToolSupport() = true, want false when supports_tool_calls is false")
	}
}

func TestProbeOllamaToolSupport_HasTools(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/api/show" || r.Method != http.MethodPost {
			t.Errorf("unexpected %s %s", r.Method, r.URL.Path)
		}
		w.Header().Set("Content-Type", "application/json")
		w.Write([]byte(`{
			"details": {"family": "qwen2", "parameter_size": "7B"},
			"capabilities": ["completion", "tools"]
		}`))
	}))
	defer srv.Close()

	got := probeOllamaToolSupport(context.Background(), srv.URL, "qwen2.5:7b")
	if !got {
		t.Error("probeOllamaToolSupport() = false, want true for model with tools capability")
	}
}

func TestProbeOllamaToolSupport_NoTools(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.Write([]byte(`{
			"details": {"family": "phi", "parameter_size": "3B"},
			"capabilities": ["completion"]
		}`))
	}))
	defer srv.Close()

	got := probeOllamaToolSupport(context.Background(), srv.URL, "phi3:3b")
	if got {
		t.Error("probeOllamaToolSupport() = true, want false for model without tools capability")
	}
}

func TestProbeOllamaToolSupport_NoCapsField(t *testing.T) {
	// Old Ollama version without capabilities
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.Write([]byte(`{"details": {"family": "llama"}}`))
	}))
	defer srv.Close()

	got := probeOllamaToolSupport(context.Background(), srv.URL, "llama3:8b")
	if got {
		t.Error("probeOllamaToolSupport() = true, want false when capabilities field absent")
	}
}

func TestProbeOllamaToolSupport_ServerDown(t *testing.T) {
	got := probeOllamaToolSupport(context.Background(), "http://127.0.0.1:1", "test")
	if got {
		t.Error("probeOllamaToolSupport() = true, want false when server unreachable")
	}
}
@@ -177,6 +177,15 @@ func (s *Local) Cancel() {
 	}
 }

+func (s *Local) ResetError() {
+	s.mu.Lock()
+	defer s.mu.Unlock()
+	if s.state == StateError {
+		s.state = StateIdle
+		s.err = nil
+	}
+}
+
 func (s *Local) Close() error {
 	s.Cancel()
 	s.mu.Lock()
@@ -197,12 +206,13 @@ func (s *Local) Status() Status {
 	defer s.mu.Unlock()

 	st := Status{
-		State:      s.state,
-		Provider:   s.provider,
-		Model:      s.model,
-		TokensUsed: s.eng.Usage().TotalTokens(),
-		TurnCount:  s.turnCount,
-		TokenState: "ok",
+		State:          s.state,
+		Provider:       s.provider,
+		Model:          s.model,
+		TokensUsed:     s.eng.Usage().TotalTokens(),
+		TurnCount:      s.turnCount,
+		TokenState:     "ok",
+		ToolsAvailable: s.eng.ToolsAvailable(),
 	}

 	if w := s.eng.ContextWindow(); w != nil {
@@ -38,14 +38,15 @@ func (s SessionState) String() string {

 // Status holds observable session state.
 type Status struct {
-	State        SessionState
-	Provider     string
-	Model        string
-	TokensUsed   int64
-	TokensMax    int64
-	TokenPercent int    // 0-100
-	TokenState   string // "ok", "warning", "critical"
-	TurnCount    int
+	State          SessionState
+	Provider       string
+	Model          string
+	TokensUsed     int64
+	TokensMax      int64
+	TokenPercent   int    // 0-100
+	TokenState     string // "ok", "warning", "critical"
+	TurnCount      int
+	ToolsAvailable bool // false when model does not support tool calling
 }

 // Session is the boundary between UI and engine.
@@ -62,6 +63,9 @@ type Session interface {
 	TurnResult() (*engine.Turn, error)
 	// Cancel aborts the current turn.
 	Cancel()
+	// ResetError transitions the session from StateError back to StateIdle
+	// so a retry can be attempted. No-op if not in StateError.
+	ResetError()
 	// Close shuts down the session.
 	Close() error
 	// Status returns current session state.
@@ -51,3 +51,188 @@ func TestBundledSkills_AllParseClean(t *testing.T) {
 		}
 	}
 }

func TestBundledSkills_InitExists(t *testing.T) {
	skills, err := BundledSkills()
	if err != nil {
		t.Fatalf("BundledSkills() error: %v", err)
	}
	var init *Skill
	for _, s := range skills {
		if s.Frontmatter.Name == "init" {
			init = s
			break
		}
	}
	if init == nil {
		t.Fatal("init skill not found in bundled skills")
	}
	if init.Frontmatter.Description == "" {
		t.Error("init skill missing description")
	}
	if init.Body == "" {
		t.Error("init skill has empty body")
	}
}

func TestBundledSkills_InitRender_Local(t *testing.T) {
	skills, err := BundledSkills()
	if err != nil {
		t.Fatalf("BundledSkills() error: %v", err)
	}
	var init *Skill
	for _, s := range skills {
		if s.Frontmatter.Name == "init" {
			init = s
			break
		}
	}
	if init == nil {
		t.Fatal("init skill not found")
	}

	rendered, err := init.Render(TemplateData{
		ProjectRoot: "/tmp/myproject",
		Local:       true,
	})
	if err != nil {
		t.Fatalf("Render() error: %v", err)
	}

	// Local mode should use sequential fs_* tools, not spawn_elfs for orchestration
	if !contains(rendered, "fs_ls") {
		t.Error("local render should contain fs_ls")
	}
	if !contains(rendered, "fs_read") {
		t.Error("local render should contain fs_read")
	}
	if contains(rendered, "Use spawn_elfs") {
		t.Error("local render should NOT instruct to use spawn_elfs")
	}
	if !contains(rendered, "/tmp/myproject") {
		t.Error("local render should contain ProjectRoot")
	}
}

func TestBundledSkills_InitRender_Cloud(t *testing.T) {
	skills, err := BundledSkills()
	if err != nil {
		t.Fatalf("BundledSkills() error: %v", err)
	}
	var init *Skill
	for _, s := range skills {
		if s.Frontmatter.Name == "init" {
			init = s
			break
		}
	}
	if init == nil {
		t.Fatal("init skill not found")
	}

	rendered, err := init.Render(TemplateData{
		ProjectRoot: "/tmp/myproject",
		Local:       false,
	})
	if err != nil {
		t.Fatalf("Render() error: %v", err)
	}

	// Cloud mode should use spawn_elfs
	if !contains(rendered, "spawn_elfs") {
		t.Error("cloud render should contain spawn_elfs")
	}
	if !contains(rendered, "Elf 1") {
		t.Error("cloud render should contain Elf 1")
	}
	if !contains(rendered, "/tmp/myproject") {
		t.Error("cloud render should contain ProjectRoot")
	}
	if !contains(rendered, "creating") {
		t.Error("cloud render (no Args) should say 'creating'")
	}
}

func TestBundledSkills_InitRender_CloudUpdate(t *testing.T) {
	skills, err := BundledSkills()
	if err != nil {
		t.Fatalf("BundledSkills() error: %v", err)
	}
	var init *Skill
	for _, s := range skills {
		if s.Frontmatter.Name == "init" {
			init = s
			break
		}
	}
	if init == nil {
		t.Fatal("init skill not found")
	}

	rendered, err := init.Render(TemplateData{
		ProjectRoot: "/tmp/myproject",
		Args:        "/tmp/myproject/AGENTS.md",
		Local:       false,
	})
	if err != nil {
		t.Fatalf("Render() error: %v", err)
	}

	// Cloud update mode should have Elf 4 for review
	if !contains(rendered, "Elf 4") {
		t.Error("cloud update render should contain Elf 4")
	}
	if !contains(rendered, "updating") {
		t.Error("cloud update render should say 'updating'")
	}
	if !contains(rendered, "/tmp/myproject/AGENTS.md") {
		t.Error("cloud update render should contain existing path")
	}
}

func TestBundledSkills_InitRender_LocalUpdate(t *testing.T) {
	skills, err := BundledSkills()
	if err != nil {
		t.Fatalf("BundledSkills() error: %v", err)
	}
	var init *Skill
	for _, s := range skills {
		if s.Frontmatter.Name == "init" {
			init = s
			break
		}
	}
	if init == nil {
		t.Fatal("init skill not found")
	}

	rendered, err := init.Render(TemplateData{
		ProjectRoot: "/tmp/myproject",
		Args:        "/tmp/myproject/AGENTS.md",
		Local:       true,
	})
	if err != nil {
		t.Fatalf("Render() error: %v", err)
	}

	// Local update should mention existing file
	if !contains(rendered, "existing AGENTS.md") {
		t.Error("local update render should mention existing file")
	}
	if contains(rendered, "Use spawn_elfs") {
		t.Error("local update render should NOT instruct to use spawn_elfs")
	}
}

func contains(s, substr string) bool {
	return len(s) > 0 && len(substr) > 0 && stringContains(s, substr)
}

func stringContains(s, substr string) bool {
	for i := 0; i <= len(s)-len(substr); i++ {
		if s[i:i+len(substr)] == substr {
			return true
		}
	}
	return false
}
118
internal/skill/skills/init.md
Normal file
@@ -0,0 +1:118 @@
---
name: init
description: Generate or update AGENTS.md project documentation
whenToUse: When user runs /init to create or update project documentation
---
{{define "local-init"}}You are {{if .Args}}updating{{else}}creating{{end}} an AGENTS.md project documentation file for the project at {{.ProjectRoot}}.

Use ONLY these tools: fs_ls, fs_read, fs_glob, fs_grep, fs_write.
Do NOT use bash or spawn_elfs.
IMPORTANT: Keep context small — read only what you need, stop reading when you have enough.

STEP 1 — Read config files (these are loaded at runtime alongside AGENTS.md):
- fs_read CLAUDE.md if it exists. Note which topics it covers — AGENTS.md must NOT repeat any of them.{{if .Args}}
- fs_read the existing AGENTS.md at {{.Args}}. Keep sections that are accurate and not in CLAUDE.md. Remove duplicated or stale content.{{end}}

STEP 2 — Gather project facts (be brief — read as few files as possible):
- fs_ls on the project root.
- fs_read go.mod for dependencies not listed in CLAUDE.md.
- fs_read Makefile for non-standard targets (skip build/test/lint/cover/fmt/vet/clean/tidy/install/run).
- fs_read 1-2 source files to spot project-specific patterns. Stop once you have enough.
- Do NOT use fs_grep — it returns too much output. If you need env var names, look in main.go or config files only.

STEP 3 — Write AGENTS.md to {{.ProjectRoot}}/AGENTS.md.

RULES:
- Do NOT repeat anything from CLAUDE.md — it is loaded alongside AGENTS.md at runtime.
- Quality test: would removing this line cause an AI to make a mistake? If no, cut it.
- No emojis. Plain markdown headers. Terse directive-style bullets.
- Short code examples only where the pattern is non-obvious — cite the source file.
- Do not fabricate. Only write what you observed.

INCLUDE (only if not in CLAUDE.md): key dependencies with import paths, non-standard build targets, domain terminology, environment variables, code patterns with real examples, architectural gotchas.
EXCLUDE: anything in CLAUDE.md, standard targets, file listings, generic advice, standard language conventions.{{end}}
{{define "cloud-elfs"}}IMPORTANT: Use only fs.ls, fs.glob, fs.grep, and fs.read for all analysis. Do NOT use bash — it will be denied and will cause you to fail. Your first action must be spawn_elfs.

Use spawn_elfs to analyze the project in parallel. Spawn at least these elfs simultaneously:

- Elf 1 (task_type: "explain"): Explore project structure at {{.ProjectRoot}}.
  - Run fs.ls on root and every immediate subdirectory.
  - Read go.mod (or package.json/Cargo.toml/pyproject.toml): extract module path, Go/runtime version, and key external dependencies with exact import paths. List TUI/UI framework deps (e.g. charm.land/*, tview) separately from backend/LLM deps.
  - Read Makefile or build scripts: note targets beyond the standard (build/test/lint/fmt/vet/clean/tidy/install). Note non-standard flags, multi-step sequences, or env vars they require.
  - Read existing AI config files if present: CLAUDE.md, .cursor/rules, .cursorrules, .github/copilot-instructions.md, .gnoma/GNOMA.md. These will be loaded at runtime — do NOT copy their content into AGENTS.md. Only note what topics they cover so the synthesis step knows what to skip.
  - Build a domain glossary: read the primary type-definition files in these packages (use fs.ls to find them): internal/message, internal/engine, internal/router, internal/elf, internal/provider, internal/context, internal/security, internal/session. For each exported type, struct, or interface whose name would be ambiguous or non-obvious to an outside AI, add a one-line entry: Name → what it is in this project. Specifically look for: Arm, Turn, Elf, Accumulator, Firewall, LimitPool, TaskType, Incognito, Stream, Event, Session, Router. Do not list generic config struct fields.
  - Report: module path, runtime version, non-standard Makefile targets only (skip standard ones: build/test/lint/cover/fmt/vet/clean/tidy/install/run), full dependency list (TUI + backend separated), domain glossary.

- Elf 2 (task_type: "explain"): Discover non-standard code conventions at {{.ProjectRoot}}.
  - Use fs.glob **/*.go (or language equivalent) to find source files. Read at least 8 files spanning different packages — prefer non-trivial ones (engine, provider, tool implementations, tests).
|
||||
- Use fs.grep to locate each pattern below. NEVER use internal/tui as a source for code examples — it is application glue, not where idioms live. For each match found: read the file, then paste the relevant lines with the file path as the first comment (e.g. '// internal/foo/bar.go'). If fs.grep returns no matches outside internal/tui, omit that pattern entirely. Do NOT invent or paraphrase.
|
||||
* new(expr): fs.grep '= new(' across **/*.go, exclude internal/tui
|
||||
* errors.AsType: fs.grep 'errors.AsType' across **/*.go
|
||||
* WaitGroup.Go: fs.grep '\.Go(func' across **/*.go
|
||||
* testing/synctest: fs.grep 'synctest' across **/*.go
|
||||
* Discriminated union: fs.grep 'Content|EventType|ContentType' across internal/message, internal/stream — look for a struct with a Type field switched on by callers
|
||||
* Pull-based iterator: fs.grep 'func.*Next\(\)' across **/*.go — look for Next/Current/Err/Close pattern
|
||||
* json.RawMessage passthrough: fs.grep 'json.RawMessage' across internal/tool — find a Parameters() or Execute() signature
|
||||
* errgroup: fs.grep 'errgroup' across **/*.go
|
||||
* Channel semaphore: fs.grep 'chan struct{}' across **/*.go, look for concurrency-limiting usage
|
||||
- Error handling: fs.grep 'var Err' across **/*.go — paste a real sentinel definition. fs.grep 'fmt.Errorf' across **/*.go and look for error-wrapping calls — paste a real one. File path required on each.
|
||||
- Test conventions: fs.grep '//go:build' across **/*_test.go for build tags. fs.grep 't.Helper()' across **/*_test.go for helper convention. fs.grep 't.TempDir()' across **/*_test.go. Paste one real example each with file path.
|
||||
- Report ONLY what differs from standard language knowledge. Skip obvious conventions.
|
||||
|
||||
- Elf 3 (task_type: "explain"): Extract setup requirements and gotchas at {{.ProjectRoot}}.
|
||||
- Read README.md, CONTRIBUTING.md, docs/ contents if they exist.
|
||||
- Find required environment variables: use fs.grep to search for os.Getenv and os.LookupEnv across all .go files. List every unique variable name found and what it configures based on surrounding context. Also check .env.example if it exists.
|
||||
- Note non-obvious setup steps (token scopes, local service dependencies, build prerequisites not in the Makefile).
|
||||
- Note repo etiquette ONLY if not already covered by CLAUDE.md — skip commit format and co-signing if CLAUDE.md documents them.
|
||||
- Note architectural gotchas explicitly called out in comments or docs — skip generic advice.
|
||||
- Skip anything obvious for a project of this type.{{end}}
|
||||
{{define "synth-rules"}}After all elfs complete, you may spawn additional focused elfs with the agent tool if specific gaps need investigation.

Then synthesize and write AGENTS.md to {{.ProjectRoot}}/AGENTS.md using fs.write.

CRITICAL RULE — DO NOT DUPLICATE LOADED FILES:
CLAUDE.md (and other AI config files) are loaded directly into the AI's context at runtime.
Writing their content into AGENTS.md is pure noise — it will be read twice and adds nothing.
AGENTS.md must only contain information those files do not already cover.
If CLAUDE.md thoroughly covers a topic (e.g. Go style, commit format, provider list), skip it.

QUALITY TEST: Before writing each line — would removing this cause an AI assistant to make a mistake on this codebase? If no, cut it.

INCLUDE (only if not already in CLAUDE.md or equivalent):
- Module path and key dependencies with exact import paths (especially non-obvious or private ones)
- Build/test commands the AI cannot guess from manifest files alone (non-standard targets, flags, sequences)
- Language-version-specific idioms in use: e.g. Go 1.26 new(expr), errors.AsType, WaitGroup.Go; show code examples
- Non-standard type patterns: discriminated unions, pull-based iterators, json.RawMessage passthrough — with examples
- Domain terminology: project-specific names that differ from industry-standard meanings
- Testing quirks: build tags, helper conventions, concurrency test tools, mock policy
- Required env var names and what they configure (not "see .env.example" — list them)
- Non-obvious architectural constraints or gotchas not derivable from reading the code

EXCLUDE:
- Anything already documented in CLAUDE.md or other AI config files that will be loaded at runtime
- File-by-file directory listing (discoverable via fs.ls)
- Standard language conventions the AI already knows
- Generic advice ("write clean code", "handle errors", "use descriptive names")
- Standard Makefile/build targets (build, test, lint, cover, fmt, vet, clean, tidy, install, run) — do not list them at all, not even as a summary line; only write non-standard targets
- The "Standard Targets: ..." line itself — it adds nothing and must not appear
- Planned features not yet in code
- Vague statements ("see config files for details", "follow project conventions") — include the actual detail or nothing

Do not fabricate. Only write what was observed in files you actually read.
Format: terse directive-style bullets. Short code examples where the pattern is non-obvious. No prose paragraphs.
No emojis anywhere in the output. Use plain markdown headers.{{end}}

{{if .Local}}{{template "local-init" .}}{{else}}{{if .Args}}You are updating the AGENTS.md project documentation file for the project at {{.ProjectRoot}}.

{{template "cloud-elfs" .}}

- Elf 4 (task_type: "review"): Read the existing AGENTS.md at {{.Args}}.
  - Classify each section: accurate (keep), stale (update), missing (add), or bloat (cut — fails the quality test).
  - Specifically flag: anything duplicated from CLAUDE.md or other loaded AI config files (remove it), fabricated content (remove it), and missing language-version-specific idioms.
  - Report a structured diff: keep / update / add / remove.

{{template "synth-rules" .}}

When updating: tighten as well as correct. Remove duplication and bloat even if it was in the old version.{{else}}You are creating an AGENTS.md project documentation file for the project at {{.ProjectRoot}}.

{{template "cloud-elfs" .}}

{{template "synth-rules" .}}{{end}}{{end}}

@@ -12,6 +12,7 @@ type TemplateData struct {
	Args        string // raw user arguments after the skill name
	Cwd         string // current working directory
	ProjectRoot string // detected project root
	Local       bool   // true if using a local provider (Ollama, llama.cpp)
}

// Render executes the skill body as a Go text/template with data.

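The `Render` comment above describes executing the skill body as a Go text/template. A minimal sketch of that flow follows; the miniature skill body, the `renderSkill` helper, and its signature are illustrative assumptions, not Gnoma's actual code. It shows how the `{{define}}` / `{{if .Local}}` dispatch used by the init skill yields a different prompt per provider type:

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// TemplateData mirrors the fields the init skill body expects.
type TemplateData struct {
	Args        string
	Cwd         string
	ProjectRoot string
	Local       bool
}

// body is a miniature skill using the same define/dispatch shape as the init skill.
const body = `{{define "local-init"}}Use fs_ls and fs_read at {{.ProjectRoot}}.{{end}}
{{define "cloud-init"}}Spawn elfs at {{.ProjectRoot}}.{{end}}
{{if .Local}}{{template "local-init" .}}{{else}}{{template "cloud-init" .}}{{end}}`

// renderSkill parses and executes a skill body with data, roughly
// what a Render method presumably does.
func renderSkill(body string, data TemplateData) (string, error) {
	t, err := template.New("skill").Parse(body)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, data); err != nil {
		return "", err
	}
	// Trim the newlines left behind by the {{define}} blocks.
	return strings.TrimSpace(sb.String()), nil
}

func main() {
	for _, local := range []bool{true, false} {
		out, err := renderSkill(body, TemplateData{ProjectRoot: "/repo", Local: local})
		if err != nil {
			panic(err)
		}
		fmt.Printf("Local=%v -> %s\n", local, out)
	}
}
```

The same parsed body serves both branches, which is why the bundled skill can replace two hardcoded prompts with one file.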
@@ -22,7 +22,6 @@ import (
	"somegit.dev/Owlibou/gnoma/internal/engine"
	"somegit.dev/Owlibou/gnoma/internal/message"
	"somegit.dev/Owlibou/gnoma/internal/permission"
	"somegit.dev/Owlibou/gnoma/internal/provider"
	"somegit.dev/Owlibou/gnoma/internal/router"
	"somegit.dev/Owlibou/gnoma/internal/security"
	"somegit.dev/Owlibou/gnoma/internal/session"

@@ -479,6 +478,51 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
	m.elfOrder = nil
	m.runningTools = nil

	// If /init failed with a tool-parse error on a local model, the model can
	// generate text but not valid tool-call JSON. Retry without tools — ask the
	// model to output AGENTS.md as plain markdown text instead.
	if m.initPending && !m.initRetried && msg.err != nil &&
		strings.Contains(msg.err.Error(), "parse tool call") {
		m.initRetried = true
		m.streaming = true
		m.thinkingBuf.Reset()
		m.streamBuf.Reset()
		if m.config.Engine != nil {
			m.config.Engine.Reset()
		}
		m.messages = append(m.messages, chatMessage{
			role:    "system",
			content: "tool-call JSON failed — retrying without tools (text-only fallback)",
		})
		root := gnomacfg.ProjectRoot()
		textPrompt := fmt.Sprintf(`You are creating an AGENTS.md project documentation file for the project at %s.

You have NO tools available. Based on common Go project conventions, generate a useful AGENTS.md skeleton.

Output the complete document as markdown text, starting with a # heading. Include sections for:
- Module path (use the project directory name as a hint)
- Key dependencies (common for a Go TUI/LLM project)
- Build commands (make build/test/lint/cover)
- Code conventions
- Environment variables
- Domain terminology

Mark anything you're unsure about with TODO. Be terse — directive-style bullets, no prose.`, root)
		// Reset session from StateError so it accepts a new Send.
		m.session.ResetError()
		// Send with empty AllowedTools to suppress all tool schemas.
		opts := engine.TurnOptions{AllowedTools: []string{}}
		if err := m.session.SendWithOptions(textPrompt, opts); err != nil {
			m.messages = append(m.messages, chatMessage{role: "error", content: err.Error()})
			m.streaming = false
			m.initPending = false
		}
		// Mark as write-nudged so the disk-write logic at turnDone catches the output.
		m.initHadToolCalls = true
		m.initWriteNudged = true
		return m, m.listenForEvents()
	}

	// If /init completed with any content but no tool calls, the model described or
	// planned but didn't call spawn_elfs. Retry once with a fresh context and a
	// short direct prompt that's easier for local models to act on.

@@ -924,13 +968,30 @@ func (m Model) handleCommand(cmd string) (tea.Model, tea.Cmd) {
 	local := isLocalProvider(status.Provider)
 
-	var prompt string
-	if local {
-		prompt = localInitPrompt(root, existingPath)
-	} else {
-		prompt = initPrompt(root, existingPath)
-	}
+	var prompt string
+	if m.config.Skills != nil {
+		if sk := m.config.Skills.Get("init"); sk != nil {
+			rendered, err := sk.Render(skill.TemplateData{
+				Args:        existingPath,
+				ProjectRoot: root,
+				Cwd:         m.cwd,
+				Local:       local,
+			})
+			if err == nil {
+				prompt = rendered
+			}
+		}
+	}
+	// Fallback to hardcoded prompts if skill not found.
+	if prompt == "" {
+		if local {
+			prompt = localInitPrompt(root, existingPath)
+		} else {
+			prompt = initPrompt(root, existingPath)
+		}
+	}
 
 	m.messages = append(m.messages, chatMessage{role: "user", content: "/init"})
 
 	m.streaming = true
 	m.currentRole = "assistant"
 	m.streamBuf.Reset()

@@ -941,14 +1002,7 @@ func (m Model) handleCommand(cmd string) (tea.Model, tea.Cmd) {
	m.initRetried = false
	m.initWriteNudged = false

	// Cloud models: use spawn_elfs for parallel analysis.
	// Local models: use the simplified prompt with sequential fs_* tools.
	// Force tool_choice: required for local models so the API emits function
	// call JSON rather than narrating the tool calls as text.
	opts := engine.TurnOptions{}
	if local {
		opts.ToolChoice = provider.ToolChoiceRequired
	}
	if err := m.session.SendWithOptions(prompt, opts); err != nil {
		m.messages = append(m.messages, chatMessage{role: "error", content: err.Error()})
		m.streaming = false

@@ -1092,6 +1146,7 @@ func (m Model) handleCommand(cmd string) (tea.Model, tea.Cmd) {
		Args:        args,
		Cwd:         m.cwd,
		ProjectRoot: gnomacfg.ProjectRoot(),
		Local:       isLocalProvider(m.session.Status().Provider),
	})
	if err != nil {
		m.messages = append(m.messages, chatMessage{role: "error",

@@ -10,9 +10,9 @@ import (
	"somegit.dev/Owlibou/gnoma/internal/message"
)

// localInitPrompt builds a simplified /init prompt for local models (Ollama, llama.cpp).
// Instead of spawn_elfs (complex nested JSON that local models can't produce reliably),
// this uses sequential simple tool calls: fs_ls, fs_read, fs_glob, fs_write.
// Deprecated: localInitPrompt is the hardcoded fallback for /init on local models.
// Prefer the bundled "init" skill with Local=true. This function is retained as a
// fallback if the skill registry is unavailable.
func localInitPrompt(root, existingPath string) string {
	existing := ""
	if existingPath != "" {

@@ -40,10 +40,9 @@ Format: terse directive-style bullets. Short code examples where non-obvious.
Do not fabricate. Only write what you observed.`, root, existing, root)
}

// initPrompt builds the prompt sent to the LLM for /init.
// existingPath is the absolute path to an existing AGENTS.md, or "" if none exists.
// The 3 base elfs always run. When existingPath is set, a 4th elf reads the current file.
// The LLM is free to spawn additional elfs if it identifies gaps.
// Deprecated: initPrompt is the hardcoded fallback for /init on cloud models.
// Prefer the bundled "init" skill with Local=false. This function is retained as a
// fallback if the skill registry is unavailable.
func initPrompt(root, existingPath string) string {
	baseElfs := fmt.Sprintf(`IMPORTANT: Use only fs.ls, fs.glob, fs.grep, and fs.read for all analysis. Do NOT use bash — it will be denied and will cause you to fail. Your first action must be spawn_elfs.

@@ -520,6 +520,9 @@ func (m Model) renderStatus() string {
	if m.incognito {
		provModel += " " + sStatusIncognito.Render("🔒")
	}
	if !status.ToolsAvailable {
		provModel += " " + sStatusDim.Render("text-only")
	}
	left := sStatusHighlight.Render(provModel)

	// Center: cwd + git branch