Commit Graph

10 Commits

Author SHA1 Message Date
vikingowl 635dad660c feat(config): per-profile config layering with --profile flag (Phase C-1)
Adds opt-in user profiles for swapping API keys, CLI binaries, and
permission modes between contexts (work/private/experiment/...).

Profile mode engages only when ~/.config/gnoma/profiles/ exists, so
existing single-config installations are untouched. Selection order:
--profile flag → default_profile in base config → fatal error.

Layering: defaults → ~/.config/gnoma/config.toml → profiles/<name>.toml
→ <projectRoot>/.gnoma/config.toml → env. Map sections merge per-key;
[[arms]] and [[mcp_servers]] merge by id/name; [[hooks]] appends.
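
The three merge rules above can be sketched as follows. This is a minimal
illustration, not gnoma's config code: the Layer and Arm types and the merge
function are hypothetical, and it assumes that an [[arms]] entry in a later
layer replaces the earlier entry with the same id wholesale (the message does
not say whether fields are merged individually).

```go
package main

import "fmt"

// Arm is a hypothetical stand-in for one [[arms]] entry.
type Arm struct {
	ID         string
	CostWeight float64
}

// Layer is a hypothetical, cut-down config layer.
type Layer struct {
	Section map[string]string // a map section: merges per key
	Arms    []Arm             // [[arms]]: merges by ID
	Hooks   []string          // [[hooks]]: appends
}

// merge overlays over on top of base and returns the combined layer.
func merge(base, over Layer) Layer {
	out := Layer{Section: map[string]string{}}

	// Map sections: keys from the later layer win, other keys survive.
	for k, v := range base.Section {
		out.Section[k] = v
	}
	for k, v := range over.Section {
		out.Section[k] = v
	}

	// Arms: same ID replaces in place (assumption), new IDs are added.
	out.Arms = append(out.Arms, base.Arms...)
	for _, a := range over.Arms {
		replaced := false
		for i := range out.Arms {
			if out.Arms[i].ID == a.ID {
				out.Arms[i] = a
				replaced = true
				break
			}
		}
		if !replaced {
			out.Arms = append(out.Arms, a)
		}
	}

	// Hooks: later layers only ever add entries.
	out.Hooks = append(out.Hooks, base.Hooks...)
	out.Hooks = append(out.Hooks, over.Hooks...)
	return out
}

func main() {
	base := Layer{
		Section: map[string]string{"mode": "ask", "editor": "vim"},
		Arms:    []Arm{{ID: "anthropic/claude-opus-4-7", CostWeight: 1}},
		Hooks:   []string{"lint"},
	}
	profile := Layer{
		Section: map[string]string{"mode": "auto"},
		Arms:    []Arm{{ID: "anthropic/claude-opus-4-7", CostWeight: 0.3}},
		Hooks:   []string{"notify"},
	}
	fmt.Printf("%+v\n", merge(base, profile))
}
```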

Per-profile data: quality-<name>.json and sessions/<name>/ keep the
bandit and session list from cross-contaminating between profiles.

Profile names restricted to [A-Za-z0-9_-] to block --profile=../foo
path traversal into derived paths.
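
A sketch of the name check, assuming it is applied before any path is derived
from the profile name. The profilePath helper and the "profiles" directory
argument are hypothetical; only the allowed character set comes from this
commit, and rejecting the empty name is an added assumption.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
)

// profileNameRE allows only the characters named in the commit message.
// The "+" (non-empty) requirement is an assumption.
var profileNameRE = regexp.MustCompile(`^[A-Za-z0-9_-]+$`)

// profilePath validates name first, then derives the profile file path.
// Because "/" and "." are not in the allowed set, "../foo" never reaches Join.
func profilePath(profilesDir, name string) (string, error) {
	if !profileNameRE.MatchString(name) {
		return "", fmt.Errorf("invalid profile name %q: only [A-Za-z0-9_-] allowed", name)
	}
	return filepath.Join(profilesDir, name+".toml"), nil
}

func main() {
	for _, name := range []string{"work", "../foo"} {
		p, err := profilePath("profiles", name)
		fmt.Println(p, err)
	}
}
```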
2026-05-19 21:35:33 +02:00
vikingowl 0aabd19906 feat(router): per-arm strengths + cost weight (Phase D)
Plan D from docs/superpowers/plans/2026-05-19-post-slm-unlock.md
(static portion; dynamic bandit-driven promotion deferred to D-2).

Routing previously let tier ordering (CLI > local > API) dominate
selection — Opus, in tier 3, would lose to a tier-1 CLI agent for
SecurityReview even though Opus is empirically stronger at that task.
This change introduces explicit per-arm overrides:

  [[arms]]
  id = "anthropic/claude-opus-4-7"
  strengths = ["security_review", "planning"]
  cost_weight = 0.3

Strengths gate cross-tier promotion: arms matching task.Type bypass
the tier loop and compete with each other directly. Promotion is a
preference, not a pin — if no strength-tagged arm is feasible
(backoff, pool capacity, tool support), selection falls through to
the default tier order.
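
A simplified sketch of "preference, not a pin". HasStrength is named in this
commit; everything else is hypothetical: the Arm fields, the Feasible flag
(standing in for the backoff, pool capacity, and tool support checks), and this
selectBest signature. The tier fallback here is also reduced to "lowest tier
wins, score breaks ties".

```go
package main

import "fmt"

type TaskType string

// Arm is a cut-down, hypothetical view of a routing arm.
type Arm struct {
	ID        string
	Tier      int // 1 = CLI, 2 = local, 3 = API
	Strengths []TaskType
	Feasible  bool // stand-in for backoff, pool capacity, tool support
	Score     float64
}

// HasStrength reports whether the arm is tagged for the given task type.
func (a Arm) HasStrength(t TaskType) bool {
	for _, s := range a.Strengths {
		if s == t {
			return true
		}
	}
	return false
}

// selectBest tries strength-tagged arms first, then the default tier order.
func selectBest(arms []Arm, task TaskType) (Arm, bool) {
	var best Arm
	found := false

	// Promotion: feasible arms tagged for this task compete across tiers.
	for _, a := range arms {
		if a.Feasible && a.HasStrength(task) && (!found || a.Score > best.Score) {
			best, found = a, true
		}
	}
	if found {
		return best, true
	}

	// Fall through: no promoted arm is usable, so tier order decides.
	for _, a := range arms {
		if !a.Feasible {
			continue
		}
		if !found || a.Tier < best.Tier || (a.Tier == best.Tier && a.Score > best.Score) {
			best, found = a, true
		}
	}
	return best, found
}

func main() {
	arms := []Arm{
		{ID: "cli/agent", Tier: 1, Feasible: true, Score: 0.6},
		{ID: "anthropic/claude-opus-4-7", Tier: 3, Feasible: true, Score: 0.8,
			Strengths: []TaskType{"security_review", "planning"}},
	}
	a, _ := selectBest(arms, "security_review")
	fmt.Println("security_review:", a.ID)
	a, _ = selectBest(arms, "generation")
	fmt.Println("generation:", a.ID)
}
```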

CostWeight linearly dampens the cost penalty in scoreArm via
  effectiveCost = 1 + CostWeight * (cost - 1)
CostWeight=1.0 (or unset) preserves current behavior; lower values
trade cheapness for quality. The earlier draft used cost^CostWeight
which inverts direction for sub-1 local-arm costs (raising a
fraction <1 to a fractional power makes it bigger, not smaller); a
monotonicity regression test prevents that drift.
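
The linear formula in isolation, as a sketch. The function names are
hypothetical (the commit exposes the default as a ResolvedCostWeight() method),
and how the result feeds into scoreArm is not shown here. The sample values
show that a cheaper arm stays cheaper after dampening on both sides of
cost = 1, and that an unset weight leaves the cost unchanged.

```go
package main

import "fmt"

// resolvedCostWeight treats an unset (zero) weight as 1.0, per the commit.
func resolvedCostWeight(w float64) float64 {
	if w == 0 {
		return 1.0
	}
	return w
}

// effectiveCost applies effectiveCost = 1 + CostWeight * (cost - 1).
func effectiveCost(cost, costWeight float64) float64 {
	return 1 + resolvedCostWeight(costWeight)*(cost-1)
}

func main() {
	for _, cost := range []float64{0.5, 1, 3} {
		fmt.Printf("cost=%.2f unset=%.2f w=0.5 -> %.2f\n",
			cost, effectiveCost(cost, 0), effectiveCost(cost, 0.5))
	}
}
```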

- internal/router/arm.go: Strengths []TaskType, CostWeight float64,
  HasStrength(), ResolvedCostWeight() (zero → 1.0).
- internal/router/selector.go: scoreArm strength bonus const
  (strengthScoreBonus = 0.15) + linear cost dampening; selectBest
  cross-tier promotion before tier loop.
- internal/router/router.go: ArmOverride type + ApplyArmOverrides()
  returns unknown IDs; unknown strength names skipped with per-name
  warning via slog.
- internal/router/task.go: ParseTaskTypeStrict() returns ok bool;
  ParseTaskType now delegates so the two switches stay in sync.
- internal/config/config.go: ArmConfig + [[arms]] TOML wiring.
- cmd/gnoma/main.go: applies overrides after all initial arms
  register; logs a warning when an [[arms]] id has no matching
  registered arm.
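
The ParseTaskTypeStrict / ParseTaskType delegation could look roughly like
this. Only the two function names, the TaskGeneration fallback, and the
"security_review" and "planning" strings appear in the commit history; the
constant names for the other task types and the "generation" string are
assumptions, and the real switch presumably covers more types.

```go
package main

import (
	"fmt"
	"strings"
)

type TaskType int

const (
	TaskGeneration     TaskType = iota // fallback for unknown names
	TaskSecurityReview                 // hypothetical constant name
	TaskPlanning                       // hypothetical constant name
)

// ParseTaskTypeStrict holds the single switch; ok is false for unknown names.
func ParseTaskTypeStrict(s string) (TaskType, bool) {
	switch strings.ToLower(s) {
	case "generation":
		return TaskGeneration, true
	case "security_review":
		return TaskSecurityReview, true
	case "planning":
		return TaskPlanning, true
	}
	return TaskGeneration, false
}

// ParseTaskType delegates, so there is no second switch to keep in sync.
func ParseTaskType(s string) TaskType {
	t, _ := ParseTaskTypeStrict(s)
	return t
}

func main() {
	for _, name := range []string{"Planning", "security_review", "bogus"} {
		t, ok := ParseTaskTypeStrict(name)
		fmt.Println(name, t, ok, ParseTaskType(name))
	}
}
```

A caller such as the [[arms]] override path can use the strict variant to warn
about and skip unknown strength names, while lenient callers keep the
TaskGeneration default.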

Tests cover: predicate helpers, scoring direction across two arms,
linear-formula monotonicity on both sides of cost=1, cross-tier
promotion, empty-Strengths preserves tier order, promoted arm in
backoff falls through via full Router.Select path, observed-quality
tiebreak between two strength-tagged arms, ApplyArmOverrides happy
path + unknown-ID reporting + unknown-strength skipping.
2026-05-19 21:14:45 +02:00
vikingowl b331dcd61a feat(subprocess): per-agent binary override via [cli_agents] config
Plan B from docs/superpowers/plans/2026-05-19-post-slm-unlock.md.

Users with aliased CLI binaries (claude-priv, claude-work,
gemini-personal) can now point gnoma's auto-discovery at them
without renaming. The override flows through to the actual subprocess
spawn at internal/provider/subprocess/provider.go:56, so routing
through the alias is functional, not cosmetic.

Config:
  [cli_agents]
  claude = "claude-priv"   # discovery uses claude-priv instead of claude
  gemini = ""              # empty value = no override (fall back to canonical)
  # vibe is absent = canonical name used

- internal/config/config.go: CLIAgentsSection map[string]string;
  TOML [cli_agents] key.
- internal/provider/subprocess/agent.go:
  - Package-level lookPath = exec.LookPath for test injection.
  - resolveAgentBinary(canonical, override) → (path, binName, err).
    Override='' falls back to canonical. Override set but missing from
    PATH returns an error (no silent fallback, since that would mask user typos).
  - DiscoveredAgent.OverrideBinary records the override binary name
    when one was used; empty otherwise.
  - DiscoverCLIAgents(ctx, overrides) signature; warning logged when
    an override is configured but the binary isn't on PATH.
- cmd/gnoma/main.go: both call sites pass cfg.CLIAgents. The
  `gnoma providers` listing renders `claude-priv (via [cli_agents].claude)`
  when an override is in effect.
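
A sketch of the resolver and its test seam. lookPath and the
resolveAgentBinary signature follow the description above; the error wording,
returning binName alongside the error, and the fakeLookPath helper are
assumptions added for illustration.

```go
package main

import (
	"fmt"
	"os/exec"
)

// lookPath is a package-level variable so tests can swap in a fake.
var lookPath = exec.LookPath

// resolveAgentBinary picks the override when set, else the canonical name,
// and fails loudly if the chosen binary is not on PATH.
func resolveAgentBinary(canonical, override string) (path, binName string, err error) {
	binName = canonical
	if override != "" {
		binName = override
	}
	path, err = lookPath(binName)
	if err != nil {
		// No fallback to canonical here: a mistyped override should surface.
		return "", binName, fmt.Errorf("agent %q: binary %q not on PATH: %w", canonical, binName, err)
	}
	return path, binName, nil
}

// fakeLookPath is a hypothetical test helper backed by a fixed table.
func fakeLookPath(known map[string]string) func(string) (string, error) {
	return func(name string) (string, error) {
		if p, ok := known[name]; ok {
			return p, nil
		}
		return "", exec.ErrNotFound
	}
}

func main() {
	lookPath = fakeLookPath(map[string]string{
		"claude":      "/usr/bin/claude",
		"claude-priv": "/opt/bin/claude-priv",
	})
	fmt.Println(resolveAgentBinary("claude", "claude-priv"))
	fmt.Println(resolveAgentBinary("claude", ""))
	fmt.Println(resolveAgentBinary("claude", "claude-typo"))
}
```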

Tests cover: 5 resolver cases (no override, override set, empty
override falls back, override missing, canonical missing); 4
discovery cases (no overrides, override resolves alias, empty value
falls back, override missing skips agent); 2 config round-trip cases.
2026-05-19 21:02:16 +02:00
vikingowl 43ea2e562d feat(engine): two-stage tool routing for small local arms
Plan A from docs/superpowers/plans/2026-05-19-post-slm-unlock.md.

Local SLMs (<=16k context) waste ~1500 tokens per turn on the
full tool catalogue. Two-stage routing replaces round-1 tools with a
single synthetic select_category schema; round-2+ sends only the
selected category's real tool schemas plus select_category for
re-selection.

- internal/tool/category.go: Category type, optional Categorized
  interface, CategoryOf() with meta fallback. fs.read/fs.ls -> read,
  fs.write/fs.edit -> write, fs.glob/fs.grep -> search, bash -> exec.
- internal/engine/twostage.go: synthetic select_category tool,
  intercept helper, per-turn selectedCategory state under e.mu.
- Engine round 1 forces ToolChoiceRequired so SLMs don't fall back to
  prose. State resets at the top and end of every runLoop.
- Activates automatically on a forced local arm with ContextWindow
  <=16384, or via [router].force_two_stage TOML key.
- Integration test drives a 3-round trip and asserts: round 1 emits
  exactly one schema (synthetic) with ToolChoiceRequired, round 2
  contains only write-category schemas + select_category, real
  fs.write executes. Invalid-category fallback round-trips back to
  round-1 mode.
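
The schema selection per round can be sketched as below. The tool-to-category
table and the select_category name come from this commit; categoryOf,
schemasForRound, and the use of plain tool names in place of full schemas are
simplifications for illustration, and the engine state and locking are left
out.

```go
package main

import "fmt"

type Category string

const selectCategoryTool = "select_category"

// toolCategory mirrors the mapping listed in the commit message.
var toolCategory = map[string]Category{
	"fs.read": "read", "fs.ls": "read",
	"fs.write": "write", "fs.edit": "write",
	"fs.glob": "search", "fs.grep": "search",
	"bash": "exec",
}

// categoryOf falls back to "meta" for tools without a known category.
func categoryOf(tool string) Category {
	if c, ok := toolCategory[tool]; ok {
		return c
	}
	return "meta"
}

// schemasForRound returns the tool names to advertise for the next request.
// An empty selection means round 1: only the synthetic selector is sent.
func schemasForRound(selected Category, tools []string) []string {
	if selected == "" {
		return []string{selectCategoryTool}
	}
	var out []string
	for _, name := range tools {
		if categoryOf(name) == selected {
			out = append(out, name)
		}
	}
	if len(out) == 0 {
		// Unknown category: behave like round 1 again.
		return []string{selectCategoryTool}
	}
	// Keep the selector available so the model can switch category.
	return append(out, selectCategoryTool)
}

func main() {
	tools := []string{"fs.read", "fs.ls", "fs.write", "fs.edit", "fs.glob", "fs.grep", "bash"}
	fmt.Println(schemasForRound("", tools))
	fmt.Println(schemasForRound("write", tools))
	fmt.Println(schemasForRound("bogus", tools))
}
```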
2026-05-19 20:53:21 +02:00
vikingowl 21da29e73e docs(plan): capture post-SLM-unlock outstanding work
New dated plan at docs/superpowers/plans/2026-05-19-post-slm-unlock.md
covers the work surfaced during this session that hasn't shipped yet:

Phase A — two-stage tool routing (last item from the original
smallcode audit; gates on local + small-context arms; saves ~70% of
schema tokens per request).

Phase B — CLI agent binary override. [cli_agents] config section lets
users map canonical agent names (claude / gemini / vibe) onto local
aliases (claude-priv, gemini-work, etc.).

Phase C — user profiles. Multiple named configs (work / private /
experiment) layered over a base config.toml, switchable via
--profile flag, [config].default_profile, and a /profile TUI command.

Phase D — per-arm capability tags (Phase-4 prep). Per-arm Strengths
[]TaskType and CostWeight to make the router actually pick Opus over
Gemini for Planning/SecurityReview etc., not just for cost reasons.

Phase E — compound tools (deferred until SLM-arm telemetry shows
which chain patterns fail).

Plus an explicit drop list of things we considered and won't ship.
TODO.md updated to point at the new plan and note that the original
roadmap's Phase 4 is now superseded.
2026-05-19 19:31:40 +02:00
vikingowl a9213ec382 feat(slm): Wave C — SLM classifier, MaxComplexity routing, CLI subcommands, TUI status
- slm.Classifier: openaicompat → llamafile, 2s timeout + heuristic fallback,
  heuristic baseline blended so Priority/RequiredEffort are never zeroed,
  extractJSON strips markdown fences from small-model responses
- router.ParseTaskType: case-insensitive string → TaskType, unknown → TaskGeneration
- router.Arm.MaxComplexity: zero = no ceiling (preserves existing arm behavior);
  filterFeasible excludes arms when task.ComplexityScore > MaxComplexity
- config.SLMSection: [slm] enabled / model_url / data_dir
- openaicompat.NewLlamafile: no API key, model = "default", no retries
- slm.Manager: DefaultDataDir() (XDG), Manifest() accessor
- cmd/gnoma: `gnoma slm setup` / `gnoma slm status` subcommands; SLM arm
  registered with MaxComplexity=0.3 when enabled + set up
- tui: /config shows slm status (ready/missing/not set up + base URL if running)
- docs: roadmap updated to reflect llamafile pivot from Ollama
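
A sketch of the MaxComplexity ceiling in filterFeasible. The field names and
the zero-means-no-ceiling rule are from the list above; the cut-down Arm and
Task types are hypothetical, and the real filter presumably applies other
feasibility checks as well.

```go
package main

import "fmt"

// Arm and Task are hypothetical, cut-down versions of the router types.
type Arm struct {
	ID            string
	MaxComplexity float64 // zero = no ceiling
}

type Task struct {
	ComplexityScore float64
}

// filterFeasible drops arms whose ceiling is below the task's complexity.
func filterFeasible(arms []Arm, task Task) []Arm {
	var out []Arm
	for _, a := range arms {
		if a.MaxComplexity > 0 && task.ComplexityScore > a.MaxComplexity {
			continue // task is too complex for this arm
		}
		out = append(out, a)
	}
	return out
}

func main() {
	arms := []Arm{
		{ID: "slm", MaxComplexity: 0.3},
		{ID: "anthropic/claude-opus-4-7"}, // no ceiling
	}
	fmt.Println(filterFeasible(arms, Task{ComplexityScore: 0.2}))
	fmt.Println(filterFeasible(arms, Task{ComplexityScore: 0.5}))
}
```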
2026-05-07 16:44:32 +02:00
vikingowl 5569d4fb86 docs: consolidated roadmap, ADR-013, drop stale plans
- New 7-phase roadmap (2026-05-07-gnoma-roadmap.md) covering M8 cleanup,
  PTY interactive shell, SLM classifier, router revisit, USP security,
  ELF support, and distribution
- ADR-013 (002-slm-routing.md): SLM-first routing supersedes ADR-009;
  Thompson Sampling deferred pending SLM production data
- ADR-009 status updated to "Superseded by ADR-013"
- gemma-integration-analysis.md: header note that Node.js specifics
  (LiteRT-LM, daemon, PID) don't apply to gnoma's Go implementation
- TODO.md replaced with thin pointer to roadmap + stable backlog
- Deleted stale plan/spec files: m6-m7-closeout, m8-hooks-design
2026-05-07 15:06:54 +02:00
vikingowl fef38b3502 docs: M8.1 hook system design spec
2026-04-06 02:42:34 +02:00
vikingowl 43dcc7e9de docs: M6/M7 close-out implementation plan — 8 tasks, TDD, full file map
2026-04-05 21:33:42 +02:00
vikingowl 252ffde732 docs: M6/M7 close-out design spec — tool persistence, tokenizer, router feedback, coordinator
2026-04-05 21:22:26 +02:00