docs: refresh README/CONTRIBUTING/AGENTS/TODO, add LICENSE, drop obsolete files
Top-level docs were stale and the .gitea/ issue templates referenced a
workflow that is no longer in use.

- README: rewrite around the current feature set (SLM routing, profiles,
  plugin TOFU, SafeProvider boundary, current model defaults). Add a
  pre-built-binary install section plus Docker (ghcr.io) install path for
  users without a Go toolchain. Document the GitHub mirror.
- CONTRIBUTING: drop the dead issue-template reference, note Gitea
  upstream + GitHub mirror split, expand the package map and test-target
  table.
- AGENTS: rebuild as a domain glossary (Elf / Arm / Turn / SafeProvider /
  Incognito / Profile) plus non-obvious conventions an outside agent needs
  and would not infer from the code.
- TODO: trim completed waves into a History section, fix a broken link to
  the never-written Wave 3 plan file, surface active backlog.
- docs/essentials/INDEX: add ADR-004 (PostToolUse hook ordering) to the
  ADR list.
- LICENSE + NOTICE: adopt Apache License 2.0. Patent grant matters because
  gnoma bundles SDKs from Anthropic / OpenAI / Google / Mistral and ships
  derivative tooling that runs untrusted MCP servers.
- Delete .gitea/issue_template/ and gemma-integration-analysis.md (the
  latter is obsolete per its own preamble: Node.js-specific notes that
  don't apply to the Go implementation).
@@ -1,58 +0,0 @@
name: Bug Report
about: Report something that isn't working correctly
labels:
  - bug
body:
  - type: textarea
    id: description
    attributes:
      label: Description
      description: What happened? What did you expect?
    validations:
      required: true
  - type: textarea
    id: reproduction
    attributes:
      label: Steps to reproduce
      description: Minimal steps to trigger the issue
      placeholder: |
        1. Run `gnoma --provider anthropic`
        2. Type "..."
        3. See error
    validations:
      required: true
  - type: input
    id: version
    attributes:
      label: gnoma version
      description: Output of `gnoma --version`
      placeholder: "gnoma 0.1.0 (abc1234, 2026-04-12)"
    validations:
      required: true
  - type: input
    id: os
    attributes:
      label: OS / Architecture
      placeholder: "Linux x86_64 / macOS arm64 / Windows amd64"
    validations:
      required: true
  - type: dropdown
    id: provider
    attributes:
      label: Provider
      options:
        - mistral
        - anthropic
        - openai
        - google
        - ollama
        - llamacpp
        - N/A
    validations:
      required: false
  - type: textarea
    id: logs
    attributes:
      label: Relevant logs
      description: Run with `--verbose` for debug output
      render: shell
@@ -1,42 +0,0 @@
name: Feature Request
about: Suggest an improvement or new capability
labels:
  - enhancement
body:
  - type: textarea
    id: problem
    attributes:
      label: Problem
      description: What are you trying to do that gnoma doesn't support well?
    validations:
      required: true
  - type: textarea
    id: solution
    attributes:
      label: Proposed solution
      description: How would you like this to work?
    validations:
      required: true
  - type: textarea
    id: alternatives
    attributes:
      label: Alternatives considered
      description: Other approaches you've thought about
    validations:
      required: false
  - type: dropdown
    id: area
    attributes:
      label: Area
      options:
        - providers
        - tools
        - router
        - TUI
        - MCP / plugins
        - elfs (sub-agents)
        - security
        - config
        - other
    validations:
      required: false
@@ -1,26 +1,75 @@
# AGENTS.md

## Domain Terminology
- **Elf**: An agent instance.
- **Turn**: A complete sequence of agentic reasoning and tool execution.
- **Routing Arm**: A specific model/provider selected by the `Router` for a task.
- **Stream Event**: Discrete updates during LLM generation (e.g., `EventTextDelta`, `EventToolCallStart`, `EventToolResult`).
Conventions for AI assistants working in this repository. CLAUDE.md
covers Go style, commits, and TDD policy; this file adds gnoma-specific
domain knowledge those rules do not capture.

## Build & Test Targets
- **Run**: `make run`
- **Test (Verbose)**: `make test-v`
- **Integration Tests**: `make test-integration` (requires `//go:build integration`)
## Domain glossary

## Key Dependencies
- **Mistral**: `github.com/VikingOwl91/mistral-go-sdk`
- **Anthropic**: `github.com/anthropics/anthropic-sdk-go`
- **OpenAI**: `github.com/openai/openai-go`
- **Google GenAI**: `google.golang.org/genai`
- **TUI**: `charm.land/bubbletea/v2`, `charm.land/lipgloss/v2`
- **Other**: `charm.land/bubbles/v2`, `charm.land/glamour/v2`, `github.com/pkoukk/tiktoken-go`
| Term | Meaning |
|---|---|
| **Elf** | A sub-agent instance, spawned via `spawn_elfs`. |
| **Turn** | One complete `stream → tool → re-query` cycle in the engine. |
| **Arm** | A `(provider, model)` pair the router can select. Registered with cost and capability metadata. |
| **Router** | Multi-armed-bandit selector that picks an Arm per Turn from the registered set. |
| **SLM** | Small language model running locally for prompt classification and trivial-task execution. |
| **Stream Event** | Discriminated-union update emitted while a provider streams: `EventTextDelta`, `EventToolCallStart`, `EventToolResult`, etc. See `internal/stream/event.go`. |
| **SafeProvider** | The sealed boundary that gates outbound provider calls — every Provider implementation embeds the unexported marker. See `internal/security`. |
| **Incognito** | Per-turn mode that disables session persistence and router learning. |
| **Profile** | A named config overlay under `~/.config/gnoma/profiles/`. Switches keys, models, and per-profile router quality data. |
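
For orientation, the Router and Arm rows can be pictured as a small bandit loop. This is a minimal sketch under stated assumptions, not gnoma's router: the glossary only says "multi-armed bandit", so plain epsilon-greedy selection is swapped in for illustration, and the `Arm`, `Router`, `NewRouter`, `Select`, and `Update` names and fields are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Arm is a hypothetical (provider, model) pair with running reward stats.
type Arm struct {
	Provider string
	Model    string
	Pulls    int     // number of turns this arm has served
	Mean     float64 // running mean of observed reward
}

// Router is an illustrative epsilon-greedy selector (an assumption; the
// real router's algorithm is not described in this document).
type Router struct {
	Arms    []Arm
	Epsilon float64 // probability of exploring a random arm
	rng     *rand.Rand
}

// NewRouter builds a router with a seeded random source.
func NewRouter(arms []Arm, epsilon float64, seed int64) *Router {
	return &Router{Arms: arms, Epsilon: epsilon, rng: rand.New(rand.NewSource(seed))}
}

// Select returns the index of the arm for the next turn, or -1 if no arms
// are registered.
func (r *Router) Select() int {
	if len(r.Arms) == 0 {
		return -1
	}
	if r.rng.Float64() < r.Epsilon {
		return r.rng.Intn(len(r.Arms)) // explore
	}
	best := 0
	for i := range r.Arms {
		if r.Arms[i].Mean > r.Arms[best].Mean {
			best = i
		}
	}
	return best // exploit the best mean seen so far
}

// Update folds one observed reward into the arm's running mean.
func (r *Router) Update(i int, reward float64) {
	a := &r.Arms[i]
	a.Pulls++
	a.Mean += (reward - a.Mean) / float64(a.Pulls)
}

func main() {
	r := NewRouter([]Arm{
		{Provider: "anthropic", Model: "claude-sonnet-4-6"},
		{Provider: "ollama", Model: "qwen3:8b"},
	}, 0.1, 1)
	i := r.Select()
	r.Update(i, 1.0)
	fmt.Printf("picked %s/%s, mean=%.2f\n", r.Arms[i].Provider, r.Arms[i].Model, r.Arms[i].Mean)
}
```

In this picture, an Incognito turn would call `Select` but skip `Update`, which is one way to read "disables router learning".
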

## Environment Variables
- `MISTRAL_API_KEY`: Required for Mistral provider.
- `ANTHROPIC_API_KEY`: Required for Anthropic provider.
- `OPENAI_API_KEY`: Required for OpenAI provider.
- `GOOGLE_API_KEY`: Required for Google provider.
## Build & test targets (beyond standard)

| Target | Purpose |
|---|---|
| `make test-v` | Verbose unit tests |
| `make test-integration` | Runs `//go:build integration` tests (real API calls) |
| `make check` | fmt + vet + lint + test (use before committing) |
| `go test -bench=. ./internal/router/` | Router benchmarks |

## Provider env vars

| Provider | Primary | Alternative |
|---|---|---|
| Anthropic | `ANTHROPIC_API_KEY` | `ANTHROPICS_API_KEY` |
| OpenAI | `OPENAI_API_KEY` | — |
| Google | `GEMINI_API_KEY` | `GOOGLE_API_KEY` |
| Mistral | `MISTRAL_API_KEY` | — |

`GNOMA_PROVIDER` and `GNOMA_MODEL` override the resolved config.
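
The primary/alternative split in the env-var table can be read as an ordered lookup. The sketch below illustrates that reading only: the variable names come from the table, while `keyVars`, `resolveKey`, and the rule that the primary name wins when both are set are assumptions rather than gnoma's actual resolution code.

```go
package main

import (
	"fmt"
	"os"
)

// keyVars lists, per provider, the env var names to try in order.
// Names mirror the table; the ordering rule is an assumption.
var keyVars = map[string][]string{
	"anthropic": {"ANTHROPIC_API_KEY", "ANTHROPICS_API_KEY"},
	"openai":    {"OPENAI_API_KEY"},
	"google":    {"GEMINI_API_KEY", "GOOGLE_API_KEY"},
	"mistral":   {"MISTRAL_API_KEY"},
}

// resolveKey returns the first non-empty key for a provider and the name of
// the variable it came from. getenv is injected so the lookup is testable.
func resolveKey(provider string, getenv func(string) string) (key, from string, ok bool) {
	for _, name := range keyVars[provider] {
		if v := getenv(name); v != "" {
			return v, name, true
		}
	}
	return "", "", false
}

func main() {
	if _, from, ok := resolveKey("google", os.Getenv); ok {
		fmt.Println("google key found in", from)
	} else {
		fmt.Println("no google key set")
	}
}
```
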

## Non-obvious conventions

- **Discriminated unions** are structs with a `Type` field and pointer
  payloads — not Go interfaces. See `internal/stream/event.go` and
  `internal/message`.
- **Pull-based iterators** follow the `Next() / Current() / Err() / Close()`
  shape. Streams in `internal/provider/*/stream.go` are the canonical examples.
- **`json.RawMessage`** flows through `tool.Definition.Parameters` and tool
  arguments untouched — never marshal/unmarshal in the middle.
- **Capabilities and ContextWindow** come from `internal/provider`
  `inferXxxModelCapabilities` per provider; updating model lists also updates
  these tables and the `ratelimits.go` map.
- **Hook ordering** matters for `PostToolUse`. See ADR-004.
- **Plugin trust** is TOFU pinning — see `internal/plugin/pinstore.go` and
  ADR-003.
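
The first two conventions in the list above can be illustrated together. This is a hedged sketch, not the code in `internal/stream`: the constant names `EventTextDelta` and `EventToolCallStart` come from the glossary, but the payload structs, the `Event` fields, `sliceStream`, and `collectText` are hypothetical stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// EventType is the discriminant of the union.
type EventType int

const (
	EventTextDelta EventType = iota
	EventToolCallStart
)

// Hypothetical payload types.
type TextDelta struct{ Text string }
type ToolCallStart struct{ Name string }

// Event is a struct with a Type field and pointer payloads; exactly one
// payload is expected to be non-nil for a given Type.
type Event struct {
	Type      EventType
	Text      *TextDelta
	ToolStart *ToolCallStart
}

// sliceStream is a toy pull-based iterator over a fixed slice of events.
type sliceStream struct {
	events []Event
	pos    int // index of the next event to yield
	cur    Event
	err    error
	closed bool
}

// Next advances the stream; it returns false at the end or on error.
func (s *sliceStream) Next() bool {
	if s.closed {
		s.err = errors.New("stream closed")
		return false
	}
	if s.pos >= len(s.events) {
		return false
	}
	s.cur = s.events[s.pos]
	s.pos++
	return true
}

func (s *sliceStream) Current() Event { return s.cur }
func (s *sliceStream) Err() error     { return s.err }
func (s *sliceStream) Close() error   { s.closed = true; return nil }

// collectText drains a stream and concatenates its text deltas.
func collectText(s *sliceStream) (string, error) {
	defer s.Close()
	out := ""
	for s.Next() {
		ev := s.Current()
		switch ev.Type {
		case EventTextDelta:
			out += ev.Text.Text
		case EventToolCallStart:
			// A real consumer would dispatch the tool call here.
		}
	}
	return out, s.Err()
}

func main() {
	s := &sliceStream{events: []Event{
		{Type: EventTextDelta, Text: &TextDelta{Text: "hel"}},
		{Type: EventToolCallStart, ToolStart: &ToolCallStart{Name: "bash"}},
		{Type: EventTextDelta, Text: &TextDelta{Text: "lo"}},
	}}
	text, err := collectText(s)
	fmt.Println(text, err) // prints: hello <nil>
}
```

Switching on `Type` keeps the set of variants visible in one place, and checking `Err()` once after the loop is what the `Next() / Current() / Err() / Close()` shape is designed for.
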

## Sub-agent (elf) etiquette

When spawning elfs:

- One `spawn_elfs` call for all parallel work; never spawn one at a time.
- Read-only tasks on disjoint files parallelize cleanly.
- Writes to the same file must be sequenced into one elf.
- Cap each batch at 5–7 elfs.

See `internal/skill/skills/batch.md` for the canonical batching template.
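
The batching rules above can be sketched as a small planning step. Everything here is illustrative: `Task`, `ElfJob`, and `planBatches` are hypothetical names, and this is not the template in `batch.md` or the `spawn_elfs` API. The sketch only shows writes to one file collapsing into a single elf and the per-batch cap being applied.

```go
package main

import "fmt"

// Task is a hypothetical unit of elf work.
type Task struct {
	File  string
	Write bool
	Desc  string
}

// ElfJob is the ordered list of tasks handed to one elf.
type ElfJob []Task

// planBatches sequences writes to the same file into one elf, gives every
// other task its own elf, and splits the elfs into batches of at most
// maxPerBatch.
func planBatches(tasks []Task, maxPerBatch int) [][]ElfJob {
	if maxPerBatch < 1 {
		maxPerBatch = 1
	}
	var jobs []ElfJob
	writerFor := map[string]int{} // file -> index of the elf that writes it
	for _, t := range tasks {
		if t.Write {
			if i, ok := writerFor[t.File]; ok {
				jobs[i] = append(jobs[i], t) // same file: reuse that elf
				continue
			}
			writerFor[t.File] = len(jobs)
		}
		jobs = append(jobs, ElfJob{t})
	}
	var batches [][]ElfJob
	for len(jobs) > 0 {
		n := maxPerBatch
		if n > len(jobs) {
			n = len(jobs)
		}
		batches = append(batches, jobs[:n])
		jobs = jobs[n:]
	}
	return batches
}

func main() {
	tasks := []Task{
		{File: "a.go", Write: true, Desc: "rename func"},
		{File: "b.go", Desc: "summarize"},
		{File: "a.go", Write: true, Desc: "update callers"},
	}
	for i, b := range planBatches(tasks, 7) {
		fmt.Printf("batch %d: %d elfs\n", i, len(b))
	}
}
```

Each returned batch would then correspond to one `spawn_elfs` call.
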

## Reference docs

- Architecture map: `docs/essentials/INDEX.md`
- ADRs: `docs/essentials/decisions/`
- Profiles: `docs/profiles.md`
- SLM backends: `docs/slm-backends.md`
- Plugin trust: `docs/plugins-trust.md`
- Router benchmarks: `docs/benchmarks/README.md`
@@ -1,5 +1,10 @@
# Contributing to gnoma

The upstream repository lives at
<https://somegit.dev/Owlibou/gnoma> and is mirrored to
<https://github.com/VikingOwl91/gnoma>. PRs are accepted on the upstream
(Gitea) instance; the GitHub mirror is read-only.

## Setup

```sh
@@ -11,34 +16,43 @@ make lint # requires golangci-lint

## Development workflow

1. Create a branch from `main`
2. Write tests first (TDD) — table-driven, `t.TempDir()` for filesystem tests
3. `make check` (fmt + vet + lint + test) must pass
4. Commit with conventional messages: `feat:`, `fix:`, `refactor:`, `test:`, `docs:`
1. Branch from `main`.
2. Write tests first (TDD). Table-driven where possible, `t.TempDir()` for
   filesystem tests, `testing/synctest` for concurrent ones.
3. `make check` (fmt + vet + lint + test) must pass.
4. Conventional commits: `feat:`, `fix:`, `refactor:`, `test:`, `docs:`,
   `chore:`. **No co-signing or "Generated-by" trailers.**

## Code style

- Go 1.26 idioms (`new(expr)`, `errors.AsType[E]`)
- Structured logging with `log/slog`
- `json.RawMessage` for tool schemas (zero-cost passthrough)
- Functional options for complex configuration
- Short, lowercase package names — no underscores
- Go 1.26 idioms (`new(expr)`, `errors.AsType[E]`, `sync.WaitGroup.Go`).
- Structured logging with `log/slog`.
- `json.RawMessage` for tool schemas (zero-cost passthrough).
- Functional options for complex configuration.
- Short, lowercase package names — no underscores.
- Discriminated unions via struct + type discriminant, not interfaces.
- Pull-based stream iterators: `Next() / Current() / Err() / Close()`.

## Testing

- Unit tests: `make test`
- Integration tests (require API keys): `make test-integration`
- Coverage: `make cover`
- Benchmarks: `go test -bench=. ./internal/router/`
| Command | What it runs |
|---|---|
| `make test` | unit tests |
| `make test-integration` | tests behind `//go:build integration` — requires real API keys |
| `make cover` | coverage → `coverage.html` |
| `make lint` | `golangci-lint run ./...` |
| `make check` | fmt + vet + lint + test |
| `go test -bench=. ./internal/router/` | router benchmarks |

Integration tests use `//go:build integration` and are skipped by default.
Integration tests are skipped by default.

## Architecture

Read `docs/essentials/INDEX.md` before making architectural changes. Key packages:
Read [`docs/essentials/INDEX.md`](docs/essentials/INDEX.md) before changing
architectural boundaries. Key packages:

| Package | Purpose |
|---------|---------|
|---|---|
| `internal/engine` | Agentic loop (stream → tool → re-query) |
| `internal/router` | Multi-armed bandit arm selection |
| `internal/provider` | LLM provider adapters |
@@ -46,8 +60,24 @@ Read `docs/essentials/INDEX.md` before making architectural changes. Key package
| `internal/mcp` | MCP client (JSON-RPC over stdio) |
| `internal/plugin` | Plugin manifest, loader, manager |
| `internal/elf` | Sub-agent (elf) system |
| `internal/tui` | Bubble Tea terminal UI |
| `internal/security` | SafeProvider boundary, firewall, output scanner |
| `internal/skill` | Skill registry and templating |
| `internal/slm` | Small-language-model classifier + arm |
| `internal/tui` | Bubble Tea v2 terminal UI |

## Issues
ADRs live in [`docs/essentials/decisions/`](docs/essentials/decisions/).

Use the issue templates when filing bugs or requesting features. Include reproduction steps, expected behavior, and gnoma version (`gnoma --version`).
## Reporting issues

File issues on the upstream Gitea instance with:

- A short reproduction (commands, prompts, configs that triggered the bug).
- Expected vs. actual behavior.
- `gnoma --version` output and OS / architecture.
- Provider and model in use, if relevant.
- `--verbose` log output if it sheds light.

## License

By contributing you agree your work is licensed under the
[Apache License 2.0](LICENSE).
@@ -0,0 +1,202 @@

                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!)  The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright [yyyy] [name of copyright owner]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
@@ -0,0 +1,5 @@
gnoma
Copyright 2026 vikingowl

This product includes software developed at the gnoma project
(https://somegit.dev/Owlibou/gnoma).
@@ -1,234 +1,153 @@
# gnoma

**A provider-agnostic agentic coding assistant built in Go.** gnoma routes tasks to the best available LLM — cloud or local — through a multi-armed bandit router, while tools, hooks, skills, MCP servers, and plugins keep it extensible. Named after the northern pygmy-owl (*Glaucidium gnoma*); agents are called **elfs** (elf owl).
**A provider-agnostic agentic coding assistant in Go.** gnoma routes each prompt
to the best available model — cloud or local — through a multi-armed bandit
router, executes tools on your behalf, and stays extensible through hooks,
skills, MCP servers, and plugins.

Named after the northern pygmy-owl (*Glaucidium gnoma*); agents are called
**elfs** (elf owl).

- **Upstream:** <https://somegit.dev/Owlibou/gnoma>
- **GitHub mirror:** <https://github.com/VikingOwl91/gnoma>

---

## Install

### Pre-built binary (no Go toolchain required)

Releases are built by [GoReleaser](.goreleaser.yml) for
`linux`, `darwin`, and `windows` × `amd64`/`arm64` as static (`CGO_ENABLED=0`)
archives. Until the first tag is cut, see "Build from source" below.

Once releases are published:

```sh
# Pick the archive matching your OS/arch from the releases page:
# https://somegit.dev/Owlibou/gnoma/releases (upstream)
# https://github.com/VikingOwl91/gnoma/releases (mirror)

# Linux/macOS one-liner (substitute the asset URL):
curl -fsSL <ARCHIVE_URL> | tar -xz -C /tmp
sudo mv /tmp/gnoma /usr/local/bin/
gnoma --version
```

Windows: download the `_windows_*.zip`, extract `gnoma.exe`, and put it on
`%PATH%`.

### Docker

Multi-arch images (`linux/amd64`, `linux/arm64`) are published to GitHub
Container Registry on each tagged release:

```sh
docker pull ghcr.io/vikingowl91/gnoma:latest
docker run --rm -it \
  -v "$PWD:/workspace" \
  -e ANTHROPIC_API_KEY \
  ghcr.io/vikingowl91/gnoma:latest --version
```

Mount your project as `/workspace` (the image's working directory) and pass
provider keys via `-e`.

### Go users

```sh
go install somegit.dev/Owlibou/gnoma/cmd/gnoma@latest # latest tagged
go install somegit.dev/Owlibou/gnoma/cmd/gnoma@main # bleeding edge
```

### Build from source

```sh
git clone https://somegit.dev/Owlibou/gnoma && cd gnoma
make build # → ./bin/gnoma
make install # → $GOPATH/bin/gnoma
```

Requires Go 1.26+.

---

## Quickstart

```sh
# Install
go install somegit.dev/Owlibou/gnoma/cmd/gnoma@latest
# Set at least one provider key (or run a local model — see Providers below).
export ANTHROPIC_API_KEY=sk-ant-...

# Or build from source
git clone https://somegit.dev/Owlibou/gnoma && cd gnoma
make build # binary at ./bin/gnoma

# Set at least one provider key
export ANTHROPIC_API_KEY=sk-ant-... # or OPENAI_API_KEY, MISTRAL_API_KEY, GEMINI_API_KEY

# Run
gnoma # interactive TUI
echo "list files" | gnoma # pipe mode
gnoma --provider ollama # use a local model
gnoma # interactive TUI
echo "list files" | gnoma # pipe / one-shot mode
gnoma --provider ollama # use a local model
gnoma --version
```

## Build
Inside the TUI, `Ctrl+X` toggles **incognito** (no session saved, no router
learning); `/help` lists slash commands; `Esc` cancels an in-flight turn.

```sh
make build # ./bin/gnoma
make install # $GOPATH/bin/gnoma
```
---

## Providers

### Anthropic
| Provider | Env var | Default model | Also available |
|---|---|---|---|
| Anthropic | `ANTHROPIC_API_KEY` | `claude-sonnet-4-6` | `claude-opus-4-7`, `claude-haiku-4-5-20251001` |
| OpenAI | `OPENAI_API_KEY` | `gpt-5.5` | `gpt-5.5-pro`, `gpt-5.2`, `gpt-5.2-chat-latest` |
| Google (Gemini) | `GEMINI_API_KEY` (alt: `GOOGLE_API_KEY`) | `gemini-3.5-flash` | `gemini-3.1-pro-preview`, `gemini-3.1-flash-lite` |
| Mistral | `MISTRAL_API_KEY` | `mistral-large-latest` (Mistral Large 3) | `mistral-medium-3.5`, `magistral-medium-2509` |
| Ollama (local) | — | `qwen3:8b` (override with `--model`) | any model on your Ollama instance |
| llama.cpp (local) | — | reported by `/v1/models` | n/a |
| Subprocess (`claude`, `gemini`, `agy` CLIs) | provider-specific | binary name | configurable via `[cli_agents]` |

Override per-invocation:

```sh
export ANTHROPIC_API_KEY=sk-ant-...
./bin/gnoma --provider anthropic
./bin/gnoma --provider anthropic --model claude-opus-4-5-20251001
gnoma --provider anthropic --model claude-opus-4-7
gnoma --provider openai --model gpt-5.5-pro # GPT-5.5 is the default; pro is the higher-accuracy tier
gnoma --provider google --model gemini-3.1-pro-preview
gnoma --provider ollama --model qwen2.5-coder:3b
gnoma --provider llamacpp # model picked from server
```

Integration tests hit the real API — keep a key in env:
`gnoma providers` prints every discovered provider, model, and CLI agent.

### Local models

Start your local server, then point gnoma at it:

```sh
go test -tags integration ./internal/provider/...
```
# Ollama (default http://localhost:11434/v1)
ollama pull qwen2.5-coder:3b
gnoma --provider ollama --model qwen2.5-coder:3b

---

### OpenAI

```sh
export OPENAI_API_KEY=sk-proj-...
./bin/gnoma --provider openai
./bin/gnoma --provider openai --model gpt-4o
```

---

### Mistral

```sh
export MISTRAL_API_KEY=...
./bin/gnoma --provider mistral
```

---

### Google (Gemini)

```sh
export GEMINI_API_KEY=AIza...
./bin/gnoma --provider google
./bin/gnoma --provider google --model gemini-2.0-flash
```

---

### Ollama (local)

Start Ollama and pull a model, then:

```sh
./bin/gnoma --provider ollama --model gemma4:latest
./bin/gnoma --provider ollama --model qwen3:8b # default if --model omitted
```

Default endpoint: `http://localhost:11434/v1`. Override via config or env:

```sh
# .gnoma/config.toml
[provider]
default = "ollama"
model = "gemma4:latest"

[provider.endpoints]
ollama = "http://myhost:11434/v1"
```

---
|
||||
|
||||
### llama.cpp (local)
|
||||
|
||||
Start the llama.cpp server:
|
||||
|
||||
```sh
|
||||
# llama.cpp (default http://localhost:8080/v1)
|
||||
llama-server --model /path/to/model.gguf --port 8080 --ctx-size 8192
|
||||
gnoma --provider llamacpp
|
||||
```
|
||||
|
||||
Then:
|
||||
Override the endpoint in `.gnoma/config.toml`:
|
||||
|
||||
```sh
|
||||
./bin/gnoma --provider llamacpp
|
||||
# model name is taken from the server's /v1/models response
|
||||
```
|
||||
|
||||
Default endpoint: `http://localhost:8080/v1`. Override:
|
||||
|
||||
```sh
|
||||
```toml
|
||||
[provider.endpoints]
|
||||
ollama = "http://myhost:11434/v1"
|
||||
llamacpp = "http://localhost:9090/v1"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Extensibility (M8)
|
||||
|
||||
gnoma supports hooks, skills, MCP servers, and plugins.
|
||||
|
||||
### MCP Servers
|
||||
|
||||
Connect any [MCP](https://modelcontextprotocol.io)-compatible tool server:
|
||||
|
||||
```toml
|
||||
[[mcp_servers]]
|
||||
name = "git"
|
||||
command = "mcp-server-git"
|
||||
args = ["--repo", "."]
|
||||
timeout = "30s"
|
||||
|
||||
# Replace a built-in tool with an MCP tool
|
||||
[mcp_servers.replace_default]
|
||||
exec = "bash" # MCP tool "exec" replaces gnoma's built-in "bash"
|
||||
```
|
||||
|
||||
MCP tools appear as `mcp__{server}__{tool}` (e.g., `mcp__git__status`), or under the built-in name when using `replace_default`.
|
||||
|
||||
### Skills
|
||||
|
||||
Drop markdown files into `.gnoma/skills/` or `~/.config/gnoma/skills/`:
|
||||
|
||||
```
|
||||
/skillname # invoke a skill
|
||||
/skills # list available skills
|
||||
```
|
||||
|
||||
### Hooks
|
||||
|
||||
Run shell commands on tool events:
|
||||
|
||||
```toml
|
||||
[[hooks]]
|
||||
name = "block-rm-rf"
|
||||
event = "pre_tool_use"
|
||||
type = "command"
|
||||
exec = "bash-safety-check.sh"
|
||||
tool_pattern = "bash*"
|
||||
```
|
||||
|
||||
### Plugins
|
||||
|
||||
Bundle skills, hooks, and MCP configs into installable plugins:
|
||||
|
||||
```sh
|
||||
gnoma plugin install ./my-plugin # install from directory
|
||||
gnoma plugin list # list installed plugins
|
||||
```
|
||||
|
||||
Plugins are pinned by SHA-256 of their `plugin.json` on first load
|
||||
(Trust-On-First-Use). A manifest that changes between runs is refused with a
|
||||
clear error and a re-enrollment hint. See [docs/plugins-trust.md](docs/plugins-trust.md)
|
||||
and [ADR-003](docs/essentials/decisions/003-plugin-trust.md).
|
||||
|
||||
---
|
||||
|
||||
## Session Persistence
|
||||
|
||||
Conversations are auto-saved to `.gnoma/sessions/` after each completed turn. On a crash you lose at most the current in-flight turn; all previously completed turns are safe.
|
||||
|
||||
### Resume a session
|
||||
|
||||
```sh
|
||||
gnoma --resume # interactive session picker (↑↓ navigate, Enter load, Esc cancel)
|
||||
gnoma --resume <id> # restore directly by ID
|
||||
gnoma -r # shorthand
|
||||
```
|
||||
|
||||
Inside the TUI:
|
||||
|
||||
```
|
||||
/resume # open picker
|
||||
/resume <id> # restore by ID
|
||||
```
|
||||
|
||||
### Incognito mode
|
||||
|
||||
```sh
|
||||
gnoma --incognito # no session saved, no quality scores updated
|
||||
```
|
||||
|
||||
Toggle at runtime with `Ctrl+X`.
|
||||
|
||||
### Config
|
||||
|
||||
```toml
|
||||
[session]
|
||||
max_keep = 20 # how many sessions to retain per project (default: 20)
|
||||
```
|
||||
|
||||
Sessions are stored per-project under `.gnoma/sessions/<id>/`. Quality scores (EMA routing data) are stored globally at `~/.config/gnoma/quality.json`.
|
||||
|
||||
---
|
||||
|
||||
## Config
|
||||
|
||||
Config is read in priority order:
|
||||
Configuration merges (lowest → highest priority):
|
||||
|
||||
1. `~/.config/gnoma/config.toml` — global
|
||||
2. `.gnoma/config.toml` — project-local (next to `go.mod` / `.git`)
|
||||
3. Environment variables
|
||||
1. Built-in defaults
|
||||
2. `~/.config/gnoma/config.toml` — global base
|
||||
3. `~/.config/gnoma/profiles/<name>.toml` — active profile (when profile mode is enabled)
|
||||
4. `<projectRoot>/.gnoma/config.toml` — project override
|
||||
5. Environment variables (`GNOMA_PROVIDER`, `GNOMA_MODEL`, `*_API_KEY`)
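
The lowest-to-highest merge order in the list above can be pictured as a simple overwrite loop. This is a minimal sketch, not gnoma's actual loader: the `Layer` type, the dotted key names, and the sample values are all assumptions made for illustration.

```go
package main

import "fmt"

// Layer is one configuration source. The type and the dotted key
// names are hypothetical; they are not gnoma's real config schema.
type Layer map[string]string

// mergeLayers applies layers in the order given, so a later
// (higher-priority) layer overwrites keys set by an earlier one.
func mergeLayers(layers ...Layer) Layer {
	out := Layer{}
	for _, l := range layers {
		for k, v := range l {
			out[k] = v
		}
	}
	return out
}

func main() {
	// Placeholder values, ordered lowest to highest priority.
	defaults := Layer{"provider.default": "anthropic", "session.max_keep": "20"}
	global := Layer{"provider.default": "ollama"}
	profile := Layer{} // empty when profile mode is off
	project := Layer{"provider.model": "qwen3:8b"}
	env := Layer{"provider.default": "openai"} // e.g. taken from GNOMA_PROVIDER

	cfg := mergeLayers(defaults, global, profile, project, env)
	fmt.Println(cfg["provider.default"], cfg["provider.model"], cfg["session.max_keep"])
}
```

A key that only a low-priority layer sets (here `session.max_keep`) survives untouched, which is why a project file only needs to list the settings it wants to change.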
|
||||
|
||||
Example `.gnoma/config.toml`:
|
||||
Example global config:
|
||||
|
||||
```toml
|
||||
[provider]
|
||||
@@ -243,21 +162,165 @@ ollama = "http://localhost:11434/v1"
|
||||
llamacpp = "http://localhost:8080/v1"
|
||||
|
||||
[permission]
|
||||
mode = "auto" # auto | accept_edits | bypass | deny | plan
|
||||
mode = "auto" # default | accept_edits | bypass | deny | plan | auto
|
||||
|
||||
[session]
|
||||
max_keep = 20 # sessions retained per project
|
||||
```
|
||||
|
||||
Environment variable overrides: `GNOMA_PROVIDER`, `GNOMA_MODEL`.
|
||||
### Profiles
|
||||
|
||||
Drop multiple configs under `~/.config/gnoma/profiles/` and switch with
|
||||
`--profile <name>` or `/profile <name>`. Each profile keeps its own router
|
||||
quality data and session history. Full details: [docs/profiles.md](docs/profiles.md).
|
||||
|
||||
---
|
||||
|
||||
## Testing
|
||||
## SLM (small-language-model) routing
|
||||
|
||||
```sh
|
||||
make test # unit tests
|
||||
make test-integration # integration tests (require real API keys)
|
||||
make cover # coverage report → coverage.html
|
||||
make lint # golangci-lint
|
||||
make check # fmt + vet + lint + test
|
||||
gnoma can run a tiny local model alongside the main provider to:
|
||||
|
||||
- **Classify** each prompt (task type + complexity + tool requirement) so the
|
||||
router picks the right arm.
|
||||
- **Execute** trivial tasks itself (knowledge questions, single file reads,
|
||||
anything with complexity ≤ 0.3), keeping the heavy provider for real work.
|
||||
|
||||
```toml
|
||||
[slm]
|
||||
enabled = true
|
||||
backend = "auto" # ollama | llamacpp | llamafile | openaicompat | auto | disabled
|
||||
model = "reecdev/tiny3.5:500m"
|
||||
```
|
||||
|
||||
Integration tests are gated behind `//go:build integration` and skipped by default.
|
||||
Setup, presets, and verification: [docs/slm-backends.md](docs/slm-backends.md).
|
||||
The `auto` backend probes Ollama → llama.cpp → llamafile on startup and picks
|
||||
the first reachable option. Inspect with `gnoma slm status` and
|
||||
`gnoma router stats`.
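
The two rules described in this section (probe order for `auto`, and the 0.3 complexity cutoff) can be sketched as below. This is an illustration under assumptions: the function names and the `reachable` callback are hypothetical, and the real router also weighs task type and tool requirements.

```go
package main

import "fmt"

// slmComplexityCutoff mirrors the "complexity ≤ 0.3" rule from the text.
const slmComplexityCutoff = 0.3

// probeOrder mirrors the documented auto-probe sequence.
var probeOrder = []string{"ollama", "llamacpp", "llamafile"}

// pickBackend returns the first backend for which reachable reports
// true, or "" if none respond. reachable is a hypothetical health check.
func pickBackend(reachable func(string) bool) string {
	for _, b := range probeOrder {
		if reachable(b) {
			return b
		}
	}
	return ""
}

// runsOnSLM reports whether a task is trivial enough for the SLM.
func runsOnSLM(complexity float64) bool {
	return complexity <= slmComplexityCutoff
}

func main() {
	up := map[string]bool{"llamacpp": true, "llamafile": true}
	backend := pickBackend(func(b string) bool { return up[b] })
	fmt.Println("backend:", backend)
	fmt.Println("0.2 on SLM:", runsOnSLM(0.2))
	fmt.Println("0.7 on SLM:", runsOnSLM(0.7))
}
```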
|
||||
|
||||
---
|
||||
|
||||
## Session persistence
|
||||
|
||||
Sessions are auto-saved per project under `.gnoma/sessions/<id>/` after each
|
||||
completed turn. On a crash you lose at most the current in-flight turn.
|
||||
|
||||
```sh
gnoma --resume          # interactive picker
gnoma --resume <id>     # restore by ID
gnoma -r                # shorthand
gnoma --incognito       # no save, no router learning
```
|
||||
Inside the TUI: `/resume`, `/resume <id>`, `Ctrl+X` (incognito toggle).
|
||||
|
||||
Router-quality data (EMA scores) is stored at
|
||||
`~/.config/gnoma/quality.json` (or `quality-<profile>.json` in profile mode).
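
For readers unfamiliar with EMA scores, the sketch below shows a standard exponential moving average update together with the per-profile file naming described above. The smoothing factor and the function names are assumptions; the README does not state which value gnoma uses.

```go
package main

import "fmt"

// updateEMA folds one new observation into a running score.
// alpha (0..1) is a hypothetical smoothing factor: higher values
// weight recent observations more heavily.
func updateEMA(prev, observed, alpha float64) float64 {
	return alpha*observed + (1-alpha)*prev
}

// qualityFileName returns the file name used for router-quality data:
// quality.json by default, quality-<profile>.json in profile mode.
func qualityFileName(profile string) string {
	if profile == "" {
		return "quality.json"
	}
	return "quality-" + profile + ".json"
}

func main() {
	score := 0.5
	for _, obs := range []float64{1.0, 1.0, 0.0} {
		score = updateEMA(score, obs, 0.5)
		fmt.Printf("observed %.1f -> score %.4f\n", obs, score)
	}
	fmt.Println(qualityFileName(""), qualityFileName("work"))
}
```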
|
||||
|
||||
---
|
||||
|
||||
## Extensibility
|
||||
|
||||
### MCP servers
|
||||
|
||||
Connect any [MCP](https://modelcontextprotocol.io)-compatible server:
|
||||
|
||||
```toml
[[mcp_servers]]
name = "git"
command = "mcp-server-git"
args = ["--repo", "."]
timeout = "30s"

# Optionally replace a built-in tool with an MCP one
[mcp_servers.replace_default]
exec = "bash"
```
|
||||
MCP tools appear as `mcp__{server}__{tool}` unless mapped via `replace_default`.
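
The naming rule can be expressed in a few lines of Go. This is a sketch of the rule as stated here, with a hypothetical `toolName` helper; it is not taken from gnoma's source.

```go
package main

import "fmt"

// toolName returns the name under which an MCP tool is exposed.
// replaceDefault maps an MCP tool name to the built-in it replaces,
// mirroring the [mcp_servers.replace_default] table.
func toolName(server, tool string, replaceDefault map[string]string) string {
	if builtin, ok := replaceDefault[tool]; ok {
		return builtin
	}
	return fmt.Sprintf("mcp__%s__%s", server, tool)
}

func main() {
	replace := map[string]string{"exec": "bash"}
	fmt.Println(toolName("git", "status", replace)) // namespaced MCP tool
	fmt.Println(toolName("git", "exec", replace))   // takes over the built-in name
}
```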
|
||||
|
||||
### Skills
|
||||
|
||||
Drop markdown files into `.gnoma/skills/` or `~/.config/gnoma/skills/`. Invoke
|
||||
with `/<skill-name>`. List with `/skills`.
|
||||
|
||||
### Hooks
|
||||
|
||||
Shell commands run on tool events (`pre_tool_use`, `post_tool_use`, etc.):
|
||||
|
||||
```toml
[[hooks]]
name = "block-rm-rf"
event = "pre_tool_use"
type = "command"
exec = "bash-safety-check.sh"
tool_pattern = "bash*"
```
|
||||
Ordering rules: [ADR-004](docs/essentials/decisions/004-posttooluse-hook-ordering.md).
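
To get a feel for which tools a pattern such as `bash*` would select, the sketch below uses shell-style globbing from Go's standard `path.Match`. That matching semantics is an assumption; the README does not say how `tool_pattern` is evaluated, and the tool names other than `bash` are made up.

```go
package main

import (
	"fmt"
	"path"
)

// hookApplies reports whether a hook's tool_pattern selects the tool.
// Glob matching via path.Match is an assumption, not confirmed behaviour.
// A malformed pattern is treated as "no match".
func hookApplies(pattern, tool string) bool {
	ok, err := path.Match(pattern, tool)
	return err == nil && ok
}

func main() {
	// "bash_background" and "read_file" are hypothetical tool names.
	for _, tool := range []string{"bash", "bash_background", "read_file"} {
		fmt.Printf("%-16s %v\n", tool, hookApplies("bash*", tool))
	}
}
```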
|
||||
|
||||
### Plugins
|
||||
|
||||
Plugins bundle skills, hooks, and MCP server configs. Drop a plugin directory
|
||||
into `~/.config/gnoma/plugins/` (global) or `<project>/.gnoma/plugins/`
|
||||
(project-local); gnoma auto-discovers them on startup.
|
||||
|
||||
Each plugin's `plugin.json` is pinned by SHA-256 on first load
|
||||
(Trust-On-First-Use). A manifest that changes between runs is refused with a
|
||||
clear error and a re-enrolment hint. Full model:
|
||||
[docs/plugins-trust.md](docs/plugins-trust.md) and
|
||||
[ADR-003](docs/essentials/decisions/003-plugin-trust.md).
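
The Trust-On-First-Use check boils down to "hash the manifest, compare against the stored pin". The sketch below shows that idea with an in-memory store; `PinStore`, `Verify`, and the error value are hypothetical names, and the real pin storage and re-enrolment flow are described in the linked documents.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// ErrManifestChanged is returned when plugin.json no longer matches its pin.
var ErrManifestChanged = errors.New("plugin manifest changed since first use")

// PinStore maps a plugin name to the hex SHA-256 of its plugin.json.
// An in-memory map stands in for whatever persistent store is really used.
type PinStore map[string]string

// Verify pins the manifest on first sight and rejects later changes.
func (p PinStore) Verify(name string, manifest []byte) error {
	sum := sha256.Sum256(manifest)
	digest := hex.EncodeToString(sum[:])
	pinned, seen := p[name]
	if !seen {
		p[name] = digest // first use: trust and remember
		return nil
	}
	if pinned != digest {
		return ErrManifestChanged
	}
	return nil
}

func main() {
	pins := PinStore{}
	fmt.Println(pins.Verify("demo", []byte(`{"name":"demo","version":"1"}`)))
	fmt.Println(pins.Verify("demo", []byte(`{"name":"demo","version":"1"}`)))
	fmt.Println(pins.Verify("demo", []byte(`{"name":"demo","version":"2"}`)))
}
```

Note that the pin covers only the manifest bytes, as the text states; this sketch makes no claim about other files inside the plugin directory.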
|
||||
|
||||
### Elfs (sub-agents)
|
||||
|
||||
The `spawn_elfs` tool decomposes work into parallel sub-tasks. See
|
||||
[`internal/skill/skills/batch.md`](internal/skill/skills/batch.md) for the
|
||||
built-in batching skill.
|
||||
|
||||
---
|
||||
|
||||
## Subcommands
|
||||
|
||||
| Command | What it does |
|---|---|
| `gnoma providers` | List every discovered provider, model, and CLI agent |
| `gnoma profile list` / `show <name>` | Profile diagnostics |
| `gnoma router stats` | Quality EMA + classifier source breakdown |
| `gnoma slm setup` / `slm status` | Manage the llamafile-backed SLM |
|
||||
`gnoma --help` for the full flag set.
|
||||
|
||||
---
|
||||
|
||||
## Security
|
||||
|
||||
gnoma runs tools and shell commands on your behalf. The
[`internal/security`](internal/security) package canonicalises every path
(TOCTOU-safe), gates network access through a configurable firewall, and
scans tool output for secrets before it ever reaches the model. The
`SafeProvider` boundary keeps incognito-mode data out of long-lived stores.
|
||||
Architecture references:
|
||||
|
||||
- [docs/essentials/INDEX.md](docs/essentials/INDEX.md) — full architecture map
|
||||
- [docs/essentials/decisions/](docs/essentials/decisions/) — ADRs 001–004
|
||||
|
||||
---
|
||||
|
||||
## Development
|
||||
|
||||
```sh
make build              # ./bin/gnoma
make test               # unit tests
make test-integration   # //go:build integration — requires real API keys
make cover              # coverage.html
make lint               # golangci-lint
make check              # fmt + vet + lint + test
```
|
||||
Architecture, conventions, and TDD workflow: [CONTRIBUTING.md](CONTRIBUTING.md).
|
||||
|
||||
---
|
||||
|
||||
## License
|
||||
|
||||
Apache License 2.0. See [LICENSE](LICENSE) and [NOTICE](NOTICE).
|
||||
|
||||
@@ -1,51 +1,53 @@
|
||||
# Gnoma — TODO
|
||||
|
||||
Active plans, newest first:
|
||||
Active work, newest first.
|
||||
|
||||
- **Post-audit security hardening** — **complete (2026-05-19)**. All 14
|
||||
findings from the external review are closed across three waves +
|
||||
one ADR:
|
||||
## In flight
|
||||
|
||||
- **Distribution** — `.goreleaser.yml` is configured for
|
||||
`linux`/`darwin`/`windows` × `amd64`/`arm64`. Still pending: first
|
||||
tag + release pipeline trigger, optional Homebrew tap and Docker
|
||||
image, mirror release publishing to GitHub.
|
||||
- **Compound tools (post-SLM Phase E)** — held until ≥50 SLM
|
||||
observations inform which primitives are worth adding. See
|
||||
[`docs/superpowers/plans/2026-05-19-post-slm-unlock.md`](docs/superpowers/plans/2026-05-19-post-slm-unlock.md).
|
||||
|
||||
## Stable backlog (not in active phases)
|
||||
|
||||
- **Thinking mode** (disabled / budget / adaptive) — M12.
|
||||
- **Structured output** with JSON schema validation — M12.
|
||||
- **Native agy JSON output** — switch the subprocess provider to
|
||||
`--output-format stream-json` once the agy CLI supports it,
|
||||
replacing the current prompt-augmentation fallback.
|
||||
- **SQLite session persistence** + serve mode — M10.
|
||||
- **Task learning** (pattern recognition, persistent tasks) — M11.
|
||||
- **Web UI** (`gnoma web`) — M15.
|
||||
- **OAuth / keyring** — M13.
|
||||
- **Observability** (feature flags, cost dashboards) — M14.
|
||||
- **PE / Mach-O ELF support** — future, after ELF Phase 6.
|
||||
|
||||
## History
|
||||
|
||||
Completed initiatives, kept here as pointers to their plan files:
|
||||
|
||||
- **Post-audit security hardening** — complete 2026-05-19. Three waves
|
||||
+ one ADR closed all 14 findings from the external review:
|
||||
- [Wave 1 — SafeProvider boundary](docs/superpowers/plans/2026-05-19-security-wave1-safeprovider.md)
|
||||
- [Wave 2 — Incognito coherence](docs/superpowers/plans/2026-05-19-security-wave2-incognito.md)
|
||||
- [Wave 3 — Scanner + path hygiene](docs/superpowers/plans/2026-05-19-security-wave3-scanner-paths.md)
|
||||
- Wave 3 — scanner + path hygiene (rolled out directly without a
|
||||
plan file; see commits leading up to 2026-05-19 on `internal/security`)
|
||||
- [ADR-004 — PostToolUse hook ordering](docs/essentials/decisions/004-posttooluse-hook-ordering.md)
|
||||
- **[`docs/superpowers/plans/2026-05-19-post-slm-unlock.md`](docs/superpowers/plans/2026-05-19-post-slm-unlock.md)**
|
||||
— outstanding work after the SLM unlock session. Phases A (two-stage
|
||||
tool routing), B (CLI agent binary override), C (user profiles), and
|
||||
D (per-arm capability tags) are **complete**. Phase E (compound
|
||||
tools) is held until ≥50 SLM observations inform which primitives are
|
||||
worth adding.
|
||||
- **[`docs/superpowers/plans/2026-05-07-gnoma-roadmap.md`](docs/superpowers/plans/2026-05-07-gnoma-roadmap.md)**
|
||||
— broader roadmap (PTY shell, USP integration, ELF, distribution).
|
||||
Phase 4 ("Router Revisit") is superseded by the post-SLM plan above.
|
||||
- **Post-SLM unlock** —
|
||||
[plan](docs/superpowers/plans/2026-05-19-post-slm-unlock.md). Phases
|
||||
A–D complete (two-stage tool routing, CLI agent binary override,
|
||||
user profiles, per-arm capability tags).
|
||||
- **2026-05-07 roadmap** —
|
||||
[plan](docs/superpowers/plans/2026-05-07-gnoma-roadmap.md). M1–M8
|
||||
done; SLM classifier (Phase 3) complete; Phase 4 superseded by the
|
||||
post-SLM plan.
|
||||
|
||||
Phases (2026-05-07 roadmap):
|
||||
1. M8 Cleanup (wiring gaps)
|
||||
2. PTY Interactive Shell (`tea.ExecProcess`)
|
||||
3. SLM Task Classifier (Ollama HTTP, opt-in) — **complete**
|
||||
4. Router Revisit — **superseded by post-SLM plan**
|
||||
5. USP Security Integration
|
||||
6. ELF Binary Support (deferred/opportunistic)
|
||||
7. Distribution (CI trigger for goreleaser)
|
||||
|
||||
---
|
||||
|
||||
## Stable Backlog (not in active phases)
|
||||
|
||||
- **Thinking mode** (disabled / budget / adaptive) — M12 in milestones
|
||||
- **Structured output** with JSON schema validation — M12
|
||||
- **Native agy JSON output** — update subprocess provider to use `--output-format stream-json` once supported by agy CLI, replacing the current prompt-augmentation fallback.
|
||||
- **SQLite session persistence** + serve mode — M10
|
||||
- **Task learning** (pattern recognition, persistent tasks) — M11
|
||||
- **Web UI** (`gnoma web`) — M15
|
||||
- **OAuth / keyring** — M13
|
||||
- **Observability** (feature flags, cost dashboards) — M14
|
||||
- **PE / Mach-O support** — future, after ELF Phase 6
|
||||
|
||||
---
|
||||
|
||||
## Architecture References
|
||||
## Reference
|
||||
|
||||
- Milestones: `docs/essentials/milestones.md`
|
||||
- Decisions: `docs/essentials/decisions/`
|
||||
- ADR-013 (SLM routing, supersedes ADR-009): `docs/essentials/decisions/002-slm-routing.md`
|
||||
- ADR-002 (SLM routing, supersedes earlier ADR-009): `docs/essentials/decisions/002-slm-routing.md`
|
||||
|
||||
@@ -39,3 +39,4 @@ essentials:
|
||||
- [ADR-001 — Initial Decisions](decisions/001-initial-decisions.md)
|
||||
- [ADR-002 — SLM Routing](decisions/002-slm-routing.md)
|
||||
- [ADR-003 — Plugin Trust via TOFU Manifest Pinning](decisions/003-plugin-trust.md)
|
||||
- [ADR-004 — PostToolUse Hook Ordering](decisions/004-posttooluse-hook-ordering.md)
|
||||
|
||||
@@ -1,160 +0,0 @@
|
||||
> **Note (2026-05-07):** This document describes the `gemini-cli` (Node.js) implementation.
|
||||
> The specifics — LiteRT-LM runtime, daemon/PID management, `litert-lm pull`, React/Ink UI —
|
||||
> are Node.js artifacts and do not apply to gnoma. The **conceptually relevant part** is the
|
||||
> Complexity Rubric and the `GemmaClassifierStrategy` JSON interface, which informed the Go
|
||||
> `SLMClassifier` design in Phase 3 of `docs/superpowers/plans/2026-05-07-gnoma-roadmap.md`.
|
||||
> For the Go implementation, see ADR-013 (`docs/essentials/decisions/002-slm-routing.md`).
|
||||
|
||||
# Gemini CLI Local Model Routing (/gemma) Architecture
|
||||
|
||||
The `/gemma` integration in the `gemini-cli` uses a local LLM to perform "Model Routing". It automatically decides whether to use a cheaper/faster model (Flash) or a more powerful one (Pro) based on the user's request.
|
||||
|
||||
## Core Architecture
|
||||
* **Engine:** Uses **LiteRT-LM**, a lightweight runtime that serves Gemma models via a Gemini-compatible HTTP API.
|
||||
* **Model:** Specifically uses a quantized **Gemma 3 1B** model (`gemma3-1b-gpu-custom`). It's ~1GB and runs locally with low latency (~100-200ms for classification).
|
||||
* **Orchestration:** The CLI manages the LiteRT server as a background daemon, tracking its state via PID files and logs.
|
||||
* **Integration:** A `GemmaClassifierStrategy` is injected into the core `ModelRouterService`. It flattens recent chat history, sends it to the local Gemma model with a strict "Complexity Rubric," and uses the JSON response to switch models dynamically.
|
||||
|
||||
---
|
||||
|
||||
## Integration Todo List
|
||||
|
||||
### 1. Infrastructure & Asset Management
|
||||
- [ ] **Platform Detection:** Logic to map OS/Arch to the correct LiteRT-LM binary download URL.
|
||||
- [ ] **Safe Installer:** Implementation of binary download + SHA256 checksum verification + permission handling (`chmod +x`, macOS quarantine removal).
|
||||
- [ ] **Model Manager:** Wrapper for the `litert-lm pull` command to download and verify the 1GB Gemma model.
|
||||
|
||||
### 2. Process & Server Management
|
||||
- [ ] **Background Daemon:** Implementation of `spawn(..., { detached: true })` to keep the LiteRT server running independently of the CLI session.
|
||||
- [ ] **State Tracking:** A PID-file system to manage server lifecycle (start/stop/status) and prevent port collisions.
|
||||
- [ ] **Auto-Start Logic:** A manager class (`LiteRtServerManager`) that checks server health on CLI startup and launches it if enabled in settings.
|
||||
|
||||
### 3. Routing Logic (The "Brain")
|
||||
- [ ] **Complexity Rubric:** A specialized system prompt that defines what constitutes a "SIMPLE" vs "COMPLEX" task.
|
||||
- [ ] **Context Flattener:** Utility to compress the last ~4-20 turns of chat history into a prompt suitable for a small 1B model.
|
||||
- [ ] **Strategy Implementation:** The `GemmaClassifierStrategy` class to handle the local API call, parse the JSON "reasoning," and return the model decision.
|
||||
|
||||
### 4. User Experience (CLI & UI)
|
||||
- [ ] **Management Commands:** Commands like `gemini gemma {setup|start|stop|status|logs}` for lifecycle and troubleshooting.
|
||||
- [ ] **Slash Command:** A built-in `/gemma` command that queries the local server health and displays a status panel inside a session.
|
||||
- [ ] **React/Ink UI:** A status component to show visual indicators (green/red) for the binary, model, and server state.
|
||||
|
||||
### 5. Configuration & Safety
|
||||
- [ ] **Scoped Settings:** Separate "User" settings (binary path) from "Workspace" settings (router enabled/disabled for a specific project).
|
||||
- [ ] **Failure Resilience:** Logic to gracefully fall back to the default model if the local classifier times out or fails.
|
||||
|
||||
---
|
||||
|
||||
## Routing Prompts
|
||||
|
||||
These are the exact prompts used by the `gemini-cli` to force the small 1B model to output structured JSON with strict reasoning criteria.
|
||||
|
||||
### 1. The Complexity Rubric
|
||||
```markdown
|
||||
### Complexity Rubric
|
||||
A task is COMPLEX (Choose \`pro\`) if it meets ONE OR MORE of the following criteria:
|
||||
1. **High Operational Complexity (Est. 4+ Steps/Tool Calls):** Requires dependent actions, significant planning, or multiple coordinated changes.
|
||||
2. **Strategic Planning & Conceptual Design:** Asking "how" or "why." Requires advice, architecture, or high-level strategy.
|
||||
3. **High Ambiguity or Large Scope (Extensive Investigation):** Broadly defined requests requiring extensive investigation.
|
||||
4. **Deep Debugging & Root Cause Analysis:** Diagnosing unknown or complex problems from symptoms.
|
||||
A task is SIMPLE (Choose \`flash\`) if it is highly specific, bounded, and has Low Operational Complexity (Est. 1-3 tool calls). Operational simplicity overrides strategic phrasing.
|
||||
```
|
||||
|
||||
### 2. Output Format Enforcement
|
||||
```markdown
|
||||
### Output Format
|
||||
Respond *only* in JSON format like this:
|
||||
{
|
||||
"reasoning": Your reasoning...
|
||||
"model_choice": Either flash or pro
|
||||
}
|
||||
And you must follow the following JSON schema:
|
||||
{
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"reasoning": {
|
||||
"type": "string",
|
||||
"description": "A brief summary of the user objective, followed by a step-by-step explanation for the model choice, referencing the rubric."
|
||||
},
|
||||
"model_choice": {
|
||||
"type": "string",
|
||||
"enum": ["flash", "pro"]
|
||||
}
|
||||
},
|
||||
"required": ["reasoning", "model_choice"]
|
||||
}
|
||||
You must ensure that your reasoning is no more than 2 sentences long and directly references the rubric criteria.
|
||||
When making your decision, the user's request should be weighted much more heavily than the surrounding context when making your determination.
|
||||
```
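
On the consuming side, the JSON contract above pairs naturally with the "Failure Resilience" item from the todo list: anything that does not parse or falls outside the enum should fall back to a default model. The Go sketch below illustrates that pattern; it is not the `gemini-cli` code (which is Node.js), and `Decision` and `parseDecision` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Decision mirrors the two fields required by the JSON schema above.
type Decision struct {
	Reasoning   string `json:"reasoning"`
	ModelChoice string `json:"model_choice"`
}

// parseDecision returns the classifier's model choice, or fallback when
// the response is not valid JSON or model_choice is outside the enum.
// Only model_choice is validated here; reasoning is kept for logging.
func parseDecision(raw []byte, fallback string) string {
	var d Decision
	if err := json.Unmarshal(raw, &d); err != nil {
		return fallback
	}
	switch d.ModelChoice {
	case "flash", "pro":
		return d.ModelChoice
	default:
		return fallback
	}
}

func main() {
	good := []byte(`{"reasoning":"Single read, 1 step.","model_choice":"flash"}`)
	bad := []byte(`Sure! I think you should use pro.`)
	fmt.Println(parseDecision(good, "pro"))
	fmt.Println(parseDecision(bad, "pro"))
}
```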
|
||||
|
||||
### 3. The Main System Prompt
|
||||
```markdown
|
||||
### Role
|
||||
You are the **Lead Orchestrator** for an AI system. You do not talk to users. Your sole responsibility is to analyze the **Chat History** and delegate the **Current Request** to the most appropriate **Model** based on the request's complexity.
|
||||
|
||||
### Models
|
||||
Choose between \`flash\` (SIMPLE) or \`pro\` (COMPLEX).
|
||||
1. \`flash\`: A fast, efficient model for simple, well-defined tasks.
|
||||
2. \`pro\`: A powerful, advanced model for complex, open-ended, or multi-step tasks.
|
||||
|
||||
[... Injects COMPLEXITY_RUBRIC here ...]
|
||||
|
||||
[... Injects OUTPUT_FORMAT here ...]
|
||||
|
||||
### Examples
|
||||
**Example 1 (Strategic Planning):**
|
||||
*User Prompt:* "How should I architect the data pipeline for this new analytics service?"
|
||||
*Your JSON Output:*
|
||||
{
|
||||
"reasoning": "The user is asking for high-level architectural design and strategy. This falls under 'Strategic Planning & Conceptual Design'.",
|
||||
"model_choice": "pro"
|
||||
}
|
||||
**Example 2 (Simple Tool Use):**
|
||||
*User Prompt:* "list the files in the current directory"
|
||||
*Your JSON Output:*
|
||||
{
|
||||
"reasoning": "This is a direct command requiring a single tool call (ls). It has Low Operational Complexity (1 step).",
|
||||
"model_choice": "flash"
|
||||
}
|
||||
**Example 3 (High Operational Complexity):**
|
||||
*User Prompt:* "I need to add a new 'email' field to the User schema in 'src/models/user.ts', migrate the database, and update the registration endpoint."
|
||||
*Your JSON Output:*
|
||||
{
|
||||
"reasoning": "This request involves multiple coordinated steps across different files and systems. This meets the criteria for High Operational Complexity (4+ steps).",
|
||||
"model_choice": "pro"
|
||||
}
|
||||
**Example 4 (Simple Read):**
|
||||
*User Prompt:* "Read the contents of 'package.json'."
|
||||
*Your JSON Output:*
|
||||
{
|
||||
"reasoning": "This is a direct command requiring a single read. It has Low Operational Complexity (1 step).",
|
||||
"model_choice": "flash"
|
||||
}
|
||||
**Example 5 (Deep Debugging):**
|
||||
*User Prompt:* "I'm getting an error 'Cannot read property 'map' of undefined' when I click the save button. Can you fix it?"
|
||||
*Your JSON Output:*
|
||||
{
|
||||
"reasoning": "The user is reporting an error symptom without a known cause. This requires investigation and falls under 'Deep Debugging'.",
|
||||
"model_choice": "pro"
|
||||
}
|
||||
**Example 6 (Simple Edit despite Phrasing):**
|
||||
*User Prompt:* "What is the best way to rename the variable 'data' to 'userData' in 'src/utils.js'?"
|
||||
*Your JSON Output:*
|
||||
{
|
||||
"reasoning": "Although the user uses strategic language ('best way'), the underlying task is a localized edit. The operational complexity is low (1-2 steps).",
|
||||
"model_choice": "flash"
|
||||
}
|
||||
```
|
||||
|
||||
### 4. The Per-Request Prompt Structure
|
||||
For every routing decision, the CLI flattens the last ~4 turns of chat history and appends the new user request.
|
||||
|
||||
```markdown
|
||||
You are provided with a **Chat History** and the user's **Current Request** below.
|
||||
|
||||
#### Chat History:
|
||||
[... Flattened text of the last 4 turns, excluding tool calls ...]
|
||||
|
||||
#### Current Request:
|
||||
"[... The actual text of what the user just typed ...]"
|
||||
```
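
Assembling that per-request prompt amounts to trimming the history to the last few turns and concatenating. The sketch below is an assumption-laden illustration in Go: the `Turn` type, the `role: text` line format, and the function name are invented here, and tool calls are assumed to have been filtered out before the call.

```go
package main

import (
	"fmt"
	"strings"
)

// Turn is one chat message with tool calls already stripped.
// The type and its fields are hypothetical.
type Turn struct {
	Role string
	Text string
}

// buildRoutingPrompt keeps only the last maxTurns turns and appends
// the current request in the layout shown above.
func buildRoutingPrompt(history []Turn, request string, maxTurns int) string {
	if maxTurns < 0 {
		maxTurns = 0
	}
	if len(history) > maxTurns {
		history = history[len(history)-maxTurns:]
	}
	var b strings.Builder
	b.WriteString("#### Chat History:\n")
	for _, t := range history {
		// "role: text" per line is an assumed flattening format.
		fmt.Fprintf(&b, "%s: %s\n", t.Role, t.Text)
	}
	b.WriteString("\n#### Current Request:\n")
	b.WriteString("\"" + request + "\"\n")
	return b.String()
}

func main() {
	history := []Turn{
		{"user", "open main.go"},
		{"model", "Here is main.go."},
		{"user", "what does init do?"},
		{"model", "It registers providers."},
		{"user", "ok"},
	}
	fmt.Print(buildRoutingPrompt(history, "rename init to setup", 4))
}
```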
|
||||