gnoma

Author	SHA1	Message	Date
vikingowl	43ea2e562d	feat(engine): two-stage tool routing for small local arms Plan A from docs/superpowers/plans/2026-05-19-post-slm-unlock.md. Small local SLMs (<=16k context) waste ~1500 tokens per turn on the full tool catalogue. Two-stage routing replaces round-1 tools with a single synthetic select_category schema; round-2+ sends only the selected category's real tool schemas plus select_category for re-selection. - internal/tool/category.go: Category type, optional Categorized interface, CategoryOf() with meta fallback. fs.read/fs.ls -> read, fs.write/fs.edit -> write, fs.glob/fs.grep -> search, bash -> exec. - internal/engine/twostage.go: synthetic select_category tool, intercept helper, per-turn selectedCategory state under e.mu. - Engine round 1 forces ToolChoiceRequired so SLMs don't fall back to prose. State resets at the top and end of every runLoop. - Activates automatically on a forced local arm with ContextWindow <=16384, or via [router].force_two_stage TOML key. - Integration test drives a 3-round trip and asserts: round 1 emits exactly one schema (synthetic) with ToolChoiceRequired, round 2 contains only write-category schemas + select_category, real fs.write executes. Invalid-category fallback round-trips back to round-1 mode.	2026-05-19 20:53:21 +02:00
vikingowl	21da29e73e	docs(plan): capture post-SLM-unlock outstanding work New dated plan at docs/superpowers/plans/2026-05-19-post-slm-unlock.md covers the work surfaced during this session that hasn't shipped yet: Phase A — two-stage tool routing (last item from the original smallcode audit; gates on local + small-context arms; saves ~70% of schema tokens per request). Phase B — CLI agent binary override. [cli_agents] config section lets users map canonical agent names (claude / gemini / vibe) onto local aliases (claude-priv, gemini-work, etc.). Phase C — user profiles. Multiple named configs (work / private / experiment) layered over a base config.toml, switchable via --profile flag, [config].default_profile, and a /profile TUI command. Phase D — per-arm capability tags (Phase-4 prep). Per-arm Strengths []TaskType and CostWeight to make the router actually pick Opus over Gemini for Planning/SecurityReview etc., not just for cost reasons. Phase E — compound tools (deferred until SLM-arm telemetry shows which chain patterns fail). Plus an explicit drop list of things we considered and won't ship. TODO.md updated to point at the new plan and note that the original roadmap's Phase 4 is now superseded.	2026-05-19 19:31:40 +02:00
vikingowl	a9213ec382	feat(slm): Wave C — SLM classifier, MaxComplexity routing, CLI subcommands, TUI status - slm.Classifier: openaicompat → llamafile, 2s timeout + heuristic fallback, heuristic baseline blended so Priority/RequiredEffort are never zeroed, extractJSON strips markdown fences from small-model responses - router.ParseTaskType: case-insensitive string → TaskType, unknown → TaskGeneration - router.Arm.MaxComplexity: zero = no ceiling (preserves existing arm behavior); filterFeasible excludes arms when task.ComplexityScore > MaxComplexity - config.SLMSection: [slm] enabled / model_url / data_dir - openaicompat.NewLlamafile: no API key, model = "default", no retries - slm.Manager: DefaultDataDir() (XDG), Manifest() accessor - cmd/gnoma: `gnoma slm setup` / `gnoma slm status` subcommands; SLM arm registered with MaxComplexity=0.3 when enabled + set up - tui: /config shows slm status (ready/missing/not set up + base URL if running) - docs: roadmap updated to reflect llamafile pivot from Ollama	2026-05-07 16:44:32 +02:00
vikingowl	5569d4fb86	docs: consolidated roadmap, ADR-013, drop stale plans - New 7-phase roadmap (2026-05-07-gnoma-roadmap.md) covering M8 cleanup, PTY interactive shell, SLM classifier, router revisit, USP security, ELF support, and distribution - ADR-013 (002-slm-routing.md): SLM-first routing supersedes ADR-009; Thompson Sampling deferred pending SLM production data - ADR-009 status updated to "Superseded by ADR-013" - gemma-integration-analysis.md: header note that Node.js specifics (LiteRT-LM, daemon, PID) don't apply to gnoma's Go implementation - TODO.md replaced with thin pointer to roadmap + stable backlog - Deleted stale plan/spec files: m6-m7-closeout, m8-hooks-design	2026-05-07 15:06:54 +02:00
vikingowl	fef38b3502	docs: M8.1 hook system design spec	2026-04-06 02:42:34 +02:00
vikingowl	43dcc7e9de	docs: M6/M7 close-out implementation plan — 8 tasks, TDD, full file map	2026-04-05 21:33:42 +02:00
vikingowl	252ffde732	docs: M6/M7 close-out design spec — tool persistence, tokenizer, router feedback, coordinator	2026-04-05 21:22:26 +02:00
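
The two-stage tool routing described in commit 43ea2e562d can be sketched roughly as follows. This is a minimal illustration, not the code in internal/tool/category.go or internal/engine/twostage.go. The tool-to-category mapping, the select_category name and the 16384 threshold come from the commit message. The function names SchemasForRound and UseTwoStage, the constant names, and the use of plain tool-name strings in place of real schemas are assumptions made for the example.

```go
package main

import "fmt"

// Category groups tools so a small model can pick a group first.
type Category string

const (
	CategoryRead   Category = "read"
	CategoryWrite  Category = "write"
	CategorySearch Category = "search"
	CategoryExec   Category = "exec"
	CategoryMeta   Category = "meta" // fallback for tools with no explicit mapping
)

// selectCategoryTool is the synthetic tool named in the commit message.
const selectCategoryTool = "select_category"

// CategoryOf maps a tool name to its category, with a meta fallback.
func CategoryOf(name string) Category {
	switch name {
	case "fs.read", "fs.ls":
		return CategoryRead
	case "fs.write", "fs.edit":
		return CategoryWrite
	case "fs.glob", "fs.grep":
		return CategorySearch
	case "bash":
		return CategoryExec
	}
	return CategoryMeta
}

// SchemasForRound (hypothetical helper) returns the tool names to advertise.
// With no category selected (round 1) only the synthetic tool is sent.
// Otherwise the selected category's tools are sent plus select_category,
// so the model can re-select. A category that matches no tool yields the
// same list as round 1.
func SchemasForRound(all []string, selected Category) []string {
	out := []string{}
	if selected != "" {
		for _, name := range all {
			if CategoryOf(name) == selected {
				out = append(out, name)
			}
		}
	}
	return append(out, selectCategoryTool)
}

// UseTwoStage (hypothetical helper) mirrors the activation rule: a forced
// local arm with a context window of at most 16384 tokens, or an explicit
// force_two_stage override.
func UseTwoStage(forcedLocal bool, contextWindow int, forceTwoStage bool) bool {
	return forceTwoStage || (forcedLocal && contextWindow <= 16384)
}

func main() {
	tools := []string{"fs.read", "fs.ls", "fs.write", "fs.edit", "fs.glob", "fs.grep", "bash"}
	fmt.Println("round 1:", SchemasForRound(tools, ""))
	fmt.Println("round 2:", SchemasForRound(tools, CategoryWrite))
	fmt.Println("two-stage active:", UseTwoStage(true, 8192, false))
}
```

The sketch leaves out the per-turn selectedCategory state, its locking under e.mu, and the ToolChoiceRequired setting on round 1.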

7 Commits
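
The MaxComplexity ceiling and ParseTaskType behaviour described in commit a9213ec382 can be illustrated with a short sketch. It is not the router package's code. The field names Arm.MaxComplexity and ComplexityScore, the zero-means-no-ceiling rule, the case-insensitive parsing and the TaskGeneration fallback come from the commit message. The struct layouts, the arm names, the extra TaskType constants and their string spellings are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// TaskType is a coarse task label. Only TaskGeneration is named in the
// commit message; the other values and their spellings are hypothetical.
type TaskType int

const (
	TaskGeneration TaskType = iota
	TaskPlanning
	TaskSecurityReview
)

// ParseTaskType converts a string to a TaskType case-insensitively.
// Unknown input falls back to TaskGeneration.
func ParseTaskType(s string) TaskType {
	switch strings.ToLower(strings.TrimSpace(s)) {
	case "planning":
		return TaskPlanning
	case "security_review":
		return TaskSecurityReview
	}
	return TaskGeneration
}

// Arm is a simplified stand-in for a router arm.
type Arm struct {
	Name          string
	MaxComplexity float64 // zero means no ceiling
}

// Task is a simplified stand-in for a routed task.
type Task struct {
	Type            TaskType
	ComplexityScore float64
}

// filterFeasible keeps arms whose ceiling is unset or not exceeded by the task.
func filterFeasible(arms []Arm, task Task) []Arm {
	out := []Arm{}
	for _, a := range arms {
		if a.MaxComplexity != 0 && task.ComplexityScore > a.MaxComplexity {
			continue // task is too complex for this arm
		}
		out = append(out, a)
	}
	return out
}

func main() {
	arms := []Arm{
		{Name: "slm", MaxComplexity: 0.3}, // ceiling used for the SLM arm in the commit
		{Name: "cloud"},                   // zero value: no ceiling
	}
	hard := Task{Type: ParseTaskType("Planning"), ComplexityScore: 0.5}
	for _, a := range filterFeasible(arms, hard) {
		fmt.Println("feasible:", a.Name)
	}
}
```

Treating zero as "no ceiling" means arms that never set the field keep their previous behaviour, which matches the commit's note about preserving existing arm behaviour.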