14 Commits

Author SHA1 Message Date
vikingowl 5170c73dac docs: refresh README/CONTRIBUTING/AGENTS/TODO, add LICENSE, drop obsolete files
Top-level docs were stale and the .gitea/ issue templates referenced a
workflow that is no longer in use.

- README: rewrite around the current feature set (SLM routing, profiles,
  plugin TOFU, SafeProvider boundary, current model defaults). Add a
  pre-built-binary install section plus Docker (ghcr.io) install path
  for users without a Go toolchain. Document the GitHub mirror.
- CONTRIBUTING: drop the dead issue-template reference, note Gitea
  upstream + GitHub mirror split, expand the package map and test-target
  table.
- AGENTS: rebuild as a domain glossary (Elf / Arm / Turn / SafeProvider /
  Incognito / Profile) plus non-obvious conventions an outside agent
  needs and would not infer from the code.
- TODO: trim completed waves into a History section, fix a broken
  link to the never-written Wave 3 plan file, surface active backlog.
- docs/essentials/INDEX: add ADR-004 (PostToolUse hook ordering) to the
  ADR list.
- LICENSE + NOTICE: adopt Apache License 2.0. Patent grant matters
  because gnoma bundles SDKs from Anthropic / OpenAI / Google / Mistral
  and ships derivative tooling that runs untrusted MCP servers.
- Delete .gitea/issue_template/ and gemma-integration-analysis.md
  (latter is obsolete per its own preamble — Node.js-specific notes
  that don't apply to the Go implementation).
2026-05-20 03:13:40 +02:00
vikingowl f8c85a26e9 docs(security): ADR-004 PostToolUse hook ordering + invariant test
Closes the last remaining 2026-05-19 audit finding by documenting the
existing transitive guarantee rather than restructuring the hook
contract.

The audit observed that PostToolUse hooks receive raw tool output
before the firewall scan runs, and proposed reordering or splitting
the event into raw-local-only and redacted-for-LLM variants. After
Wave 1 (SafeProvider boundary at every router arm + non-engine
provider consumer), the audit's threat model is closed transitively:

- Shell hooks see raw output but never reach an LLM.
- Prompt hooks route Stream calls through routerStreamer → router →
  arm.Provider, every arm.Provider is now *SafeProvider, outgoing
  messages are scanned at the boundary.
- Agent hooks spawn an elf whose engine has Firewall set;
  buildRequest scans inline.

Reordering would regress legitimate shell-hook use cases (audit,
forensic, local alert) that need raw access. Splitting the contract
forces every existing hook config to migrate and introduces a
wrong-variant footgun. Neither is justified by the residual risk.

Three changes ship with the ADR:
- ADR-004 records the decision and the conditions for re-opening it.
- Doc comments on hook.PostToolUse and the dispatcher call site in
  the engine point at the ADR.
- internal/hook/posttooluse_redaction_test.go locks in the invariant:
  a prompt PostToolUse hook firing on a secret-bearing tool result
  produces a redacted prompt at the inner provider. If this test
  fails, ADR-004's Position A is no longer correct and the audit
  finding re-opens.
2026-05-19 23:28:25 +02:00
vikingowl dc438ea181 feat(plugin): trust-on-first-use manifest pinning
Plugins are now verified against ~/.config/gnoma/plugins.pins.toml at
load time. Each plugin's plugin.json bytes are hashed (SHA-256) and:

- recorded automatically on first load (TOFU) with a prominent warning
- compared on subsequent loads
- refused with a clear error if the hash drifted, without overwriting
  the pin so the user can review and re-enrol deliberately

Pin-store I/O failures degrade to load-without-pinning rather than
locking the user out of previously-trusted plugins.

Closes audit finding C2. See ADR-003 for the decision rationale and
docs/plugins-trust.md for the end-user trust model.
2026-05-19 16:44:09 +02:00
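The trust-on-first-use rules described in dc438ea181 can be sketched roughly as follows. This is a minimal illustration, not gnoma's actual implementation: `PinStore`, `Verify`, and `ErrPinMismatch` are hypothetical names, and an in-memory map stands in for `plugins.pins.toml` so the example needs only the standard library.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// ErrPinMismatch is a hypothetical sentinel error for a drifted manifest hash.
var ErrPinMismatch = errors.New("plugin manifest hash does not match pinned value")

// PinStore is a hypothetical in-memory stand-in for plugins.pins.toml:
// plugin name -> hex-encoded SHA-256 of the plugin.json bytes.
type PinStore map[string]string

// Verify hashes the manifest bytes and applies trust-on-first-use rules.
// firstUse is true only when a new pin was recorded by this call.
func (s PinStore) Verify(name string, manifest []byte) (firstUse bool, err error) {
	sum := sha256.Sum256(manifest)
	got := hex.EncodeToString(sum[:])

	pinned, ok := s[name]
	if !ok {
		// First load: record the pin (the caller would warn the user here).
		s[name] = got
		return true, nil
	}
	if pinned != got {
		// Drift: refuse, and leave the existing pin untouched for review.
		return false, fmt.Errorf("%w: plugin %q", ErrPinMismatch, name)
	}
	return false, nil
}

func main() {
	pins := PinStore{}
	manifest := []byte(`{"name":"demo","version":"1.0.0"}`)

	first, err := pins.Verify("demo", manifest)
	fmt.Println("first load:", first, err)

	first, err = pins.Verify("demo", manifest)
	fmt.Println("second load:", first, err)

	_, err = pins.Verify("demo", []byte(`{"name":"demo","version":"6.6.6"}`))
	fmt.Println("tampered load:", err)
}
```

The sketch leaves out the degrade-to-load-without-pinning path for pin-store I/O failures, since the in-memory map cannot fail.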
vikingowl 5569d4fb86 docs: consolidated roadmap, ADR-013, drop stale plans
- New 7-phase roadmap (2026-05-07-gnoma-roadmap.md) covering M8 cleanup,
  PTY interactive shell, SLM classifier, router revisit, USP security,
  ELF support, and distribution
- ADR-013 (002-slm-routing.md): SLM-first routing supersedes ADR-009;
  Thompson Sampling deferred pending SLM production data
- ADR-009 status updated to "Superseded by ADR-013"
- gemma-integration-analysis.md: header note that Node.js specifics
  (LiteRT-LM, daemon, PID) don't apply to gnoma's Go implementation
- TODO.md replaced with thin pointer to roadmap + stable backlog
- Deleted stale plan/spec files: m6-m7-closeout, m8-hooks-design
2026-05-07 15:06:54 +02:00
vikingowl 6c47f8643b feat(m8): MCP client, tool replaceability, and plugin system
Complete the remaining M8 extensibility deliverables:

- MCP client with JSON-RPC 2.0 over stdio transport, protocol
  lifecycle (initialize/tools-list/tools-call), and process group
  management for clean shutdown
- MCP tool adapter implementing tool.Tool with mcp__{server}__{tool}
  naming convention and replace_default for swapping built-in tools
- MCP manager for multi-server orchestration with parallel startup,
  tool discovery, and registry integration
- Plugin system with plugin.json manifest (name/version/capabilities),
  directory-based discovery (global + project scopes with precedence),
  loader that merges skills/hooks/MCP configs into existing registries,
  and install/uninstall/list lifecycle manager
- Config additions: MCPServerConfig, PluginsSection with opt-in/opt-out
  enabled/disabled resolution
- TUI /plugins command for listing installed plugins
- 54 tests across internal/mcp and internal/plugin packages
2026-04-12 03:09:05 +02:00
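The `mcp__{server}__{tool}` naming convention from 6c47f8643b could be built and parsed along these lines. The function names are hypothetical, and the parser assumes server names never contain a double underscore, which the commit message does not state. `strings.CutPrefix` requires Go 1.20 or later.

```go
package main

import (
	"fmt"
	"strings"
)

const mcpPrefix = "mcp__"

// QualifiedName builds the registry name for a tool exposed by an MCP server.
func QualifiedName(server, tool string) string {
	return mcpPrefix + server + "__" + tool
}

// SplitQualifiedName reverses QualifiedName. It splits at the first "__"
// after the prefix, so it assumes server names contain no double underscore.
func SplitQualifiedName(name string) (server, tool string, ok bool) {
	rest, found := strings.CutPrefix(name, mcpPrefix)
	if !found {
		return "", "", false
	}
	server, tool, ok = strings.Cut(rest, "__")
	if !ok || server == "" || tool == "" {
		return "", "", false
	}
	return server, tool, true
}

func main() {
	name := QualifiedName("github", "create_issue")
	fmt.Println(name) // mcp__github__create_issue

	server, tool, ok := SplitQualifiedName(name)
	fmt.Println(server, tool, ok) // github create_issue true
}
```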
vikingowl 8d97c6cd39 docs: mark M8.2 skill system deliverables complete in milestones.md
2026-04-07 02:25:29 +02:00
vikingowl 24f4a739a6 docs: mark M8.1 hook system deliverables complete in milestones.md
2026-04-07 01:09:07 +02:00
vikingowl 2c0ff5ff1f docs: mark M7 deliverables complete in milestones.md
2026-04-06 00:59:16 +02:00
vikingowl abb3e3ca90 feat: spawn_elfs batch tool for guaranteed parallel elf execution
New spawn_elfs tool takes an array of tasks and spawns all elfs simultaneously.
Solves the problem of models (Mistral Small, Devstral) that serialize
tool calls instead of batching them.

Schema: {"tasks": [{"prompt": "...", "task_type": "..."}], "max_turns": 30}

Also:
- Suppress spawn_elfs tool output from chat (tree handles display)
- Update M7 milestones to reflect completed deliverables
- Add CC-inspired features to M8/M10: task notification system,
  task framework, /batch skill, coordinator mode, StreamingToolExecutor,
  git worktree isolation
2026-04-03 21:03:51 +02:00
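The `spawn_elfs` schema from abb3e3ca90 maps naturally onto a pair of Go structs, with one goroutine per task. This is a sketch under assumptions: the type and function names are hypothetical, the `run` callback stands in for whatever actually executes an elf, and the `task_type` values in `main` are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// ElfTask and SpawnElfsInput mirror the JSON schema in the commit message.
type ElfTask struct {
	Prompt   string `json:"prompt"`
	TaskType string `json:"task_type"`
}

type SpawnElfsInput struct {
	Tasks    []ElfTask `json:"tasks"`
	MaxTurns int       `json:"max_turns"`
}

// parseSpawnElfs decodes the raw tool input.
func parseSpawnElfs(raw []byte) (SpawnElfsInput, error) {
	var in SpawnElfsInput
	err := json.Unmarshal(raw, &in)
	return in, err
}

// runAll starts one goroutine per task and waits for all of them.
// Results keep the order of the input tasks.
func runAll(in SpawnElfsInput, run func(ElfTask, int) string) []string {
	results := make([]string, len(in.Tasks))
	var wg sync.WaitGroup
	for i, task := range in.Tasks {
		wg.Add(1)
		go func(i int, task ElfTask) {
			defer wg.Done()
			results[i] = run(task, in.MaxTurns) // each goroutine writes its own slot
		}(i, task)
	}
	wg.Wait()
	return results
}

func main() {
	raw := []byte(`{"tasks": [
		{"prompt": "audit deps", "task_type": "review"},
		{"prompt": "write tests", "task_type": "code"}
	], "max_turns": 30}`)

	in, err := parseSpawnElfs(raw)
	if err != nil {
		fmt.Println("bad input:", err)
		return
	}
	results := runAll(in, func(t ElfTask, maxTurns int) string {
		return fmt.Sprintf("%s (%s, max %d turns)", t.Prompt, t.TaskType, maxTurns)
	})
	for _, r := range results {
		fmt.Println(r)
	}
}
```

Because the batch arrives as a single tool call, concurrency no longer depends on the model emitting several tool calls in one turn.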
vikingowl 8e5ddb20cb feat: hybrid system inventory — dynamic PATH scan + runtime probing
No hardcoded tool lists. Scans all $PATH directories for executables
(5541 on this system), then probes known runtime patterns for version
info (23 detected: Go, Python, Node, Rust, Ruby, Perl, Java, Dart,
Deno, Bun, Lua, LuaJIT, Guile, GCC, Clang, NASM + package managers).

System prompt includes: OS, shell, runtime versions, and notable
tools (git, docker, kubectl, fzf, rg, etc.) from the full PATH scan.
Total executable count reported so the LLM knows the full scope.

Milestones updated: M6 fixed context prefix, M12 multimodality.
2026-04-03 14:36:22 +02:00
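The dynamic PATH scan from 8e5ddb20cb might look roughly like the following. This covers only the executable scan, not the runtime version probing. `scanExecutables` is a hypothetical name, Unix-style permission bits are assumed, and the first-directory-wins deduplication is an assumption modelled on shell lookup order rather than something the commit states.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// scanExecutables walks every directory in pathList (formatted like $PATH)
// and returns executable name -> full path. Earlier directories win,
// mirroring shell lookup order. Unix-style permission bits are assumed.
func scanExecutables(pathList string) map[string]string {
	found := make(map[string]string)
	for _, dir := range filepath.SplitList(pathList) {
		if dir == "" {
			continue
		}
		entries, err := os.ReadDir(dir)
		if err != nil {
			continue // missing or unreadable PATH entries are skipped
		}
		for _, entry := range entries {
			full := filepath.Join(dir, entry.Name())
			info, err := os.Stat(full) // Stat follows symlinks
			if err != nil || !info.Mode().IsRegular() || info.Mode().Perm()&0o111 == 0 {
				continue
			}
			if _, seen := found[entry.Name()]; !seen {
				found[entry.Name()] = full
			}
		}
	}
	return found
}

func main() {
	tools := scanExecutables(os.Getenv("PATH"))
	fmt.Printf("%d executables on PATH\n", len(tools))

	// Report a few notable tools if they are present.
	for _, name := range []string{"git", "docker", "rg"} {
		if path, ok := tools[name]; ok {
			fmt.Println(name, "->", path)
		}
	}
}
```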
vikingowl c54471a37b refactor: migrate mistral sdk to github.com/VikingOwl91/mistral-go-sdk
Same package, new GitHub deployment with fixed tests.
somegit.dev/vikingowl → github.com/VikingOwl91, v1.2.0 → v1.2.1
2026-04-03 12:06:59 +02:00
vikingowl 69f5dba091 feat: complete M1 — core engine with Mistral provider
Mistral provider adapter with streaming, tool calls (single-chunk
pattern), stop reason inference, model listing, capabilities, and
JSON output support.

Tool system: bash (7 security checks, shell alias harvesting for
bash/zsh/fish), file ops (read, write, edit, glob, grep, ls).
Alias harvesting collects 300+ aliases from user's shell config.

Engine agentic loop: stream → tool execution → re-query → until
done. Tool gating on model capabilities. Max turns safety limit.

CLI pipe mode: echo "prompt" | gnoma streams response to stdout.
Flags: --provider, --model, --system, --api-key, --max-turns,
--verbose, --version.

Provider interface expanded: Models(), DefaultModel(), Capabilities
(ToolUse, JSONOutput, Vision, Thinking, ContextWindow, MaxOutput),
ResponseFormat with JSON schema support.

Live verified: text streaming + tool calling with devstral-small.
117 tests across 8 packages, 10MB binary.
2026-04-03 12:01:55 +02:00
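The agentic loop from 69f5dba091 (query, run tools, re-query until done, with a max-turns limit) can be sketched as below. All types and names are hypothetical. Streaming and capability-based tool gating are left out, and history is flattened to strings to keep the example short.

```go
package main

import (
	"errors"
	"fmt"
)

// ToolCall and Response are simplified stand-ins for provider types.
type ToolCall struct{ Name, Input string }

type Response struct {
	Text      string
	ToolCalls []ToolCall
}

// Model produces the next response from the conversation so far.
type Model func(history []string) Response

// ToolRunner executes one tool call and returns its output.
type ToolRunner func(ToolCall) string

// ErrMaxTurns signals that the safety limit stopped the loop.
var ErrMaxTurns = errors.New("max turns reached")

// runLoop queries the model, executes any requested tools, appends their
// output to the history, and re-queries until the model stops calling tools.
func runLoop(model Model, runTool ToolRunner, prompt string, maxTurns int) (string, error) {
	history := []string{"user: " + prompt}
	for turn := 0; turn < maxTurns; turn++ {
		resp := model(history)
		if len(resp.ToolCalls) == 0 {
			return resp.Text, nil // no tool calls: the model is done
		}
		history = append(history, "assistant: "+resp.Text)
		for _, call := range resp.ToolCalls {
			history = append(history, "tool "+call.Name+": "+runTool(call))
		}
	}
	return "", ErrMaxTurns
}

func main() {
	// A fake model that asks for one tool call, then finishes.
	model := func(history []string) Response {
		if len(history) == 1 {
			return Response{Text: "listing files", ToolCalls: []ToolCall{{Name: "ls", Input: "."}}}
		}
		return Response{Text: "done"}
	}
	runTool := func(c ToolCall) string { return "main.go" }

	out, err := runLoop(model, runTool, "what files are here?", 5)
	fmt.Println(out, err)
}
```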
vikingowl 951ab3b970 docs: update essentials for router, security, task learning
Restructure milestones from M1-M11 to M1-M15:
- M3: Security Firewall (secret scanner, incognito mode)
- M4: Router Foundation (arm registry, pools, task classifier)
- M5: TUI with full 6 permission modes
- M6: Full compaction (truncate + LLM summarization)
- M9: Router Advanced (bandit learning, ensemble strategies)
- M11: Task Learning (pattern detection, persistent tasks)

Add ADR-007 through ADR-012 for security-as-core, router split,
Thompson Sampling, MCP replaceability, task learning, incognito.

Add risks R-010 through R-015 for router, security, feedback,
task learning, ensemble quality, shell parser.

Update architecture dependency graph with security, router,
elf, hook, skill, mcp, plugin, tasklearn packages.

Update domain model with Router, Arm, LimitPool, Firewall entities.
2026-04-03 10:47:11 +02:00
vikingowl 154d978564 docs: add project essentials (12/12 complete)
Vision, domain model, architecture, patterns, process flows,
UML diagrams, API contracts, tech stack, constraints, milestones
(M1-M11), decision log (6 ADRs), and risk register.

Key decisions: single binary, pull-based streaming, Mistral as M1
reference provider, discriminated unions, multi-provider collaboration
as core identity.
2026-04-02 18:09:07 +02:00