chore(docs): remove obsolete AGENTS playbook
Acceptance Criteria:\n- AGENTS.md deleted to avoid stale planning guidance.\n\nTest Notes:\n- none
This commit is contained in:
254
AGENTS.md
254
AGENTS.md
@@ -1,254 +0,0 @@
|
|||||||
# AGENTS.md — Owlen v0.2 Execution Plan
|
|
||||||
|
|
||||||
**Focus:** Ollama (local) ✅, Ollama Cloud (with API key) ✅, context/limits UI ✅, seamless web-search tool ✅
|
|
||||||
**Style:** Medium-sized conventional commits, each with acceptance criteria & test notes.
|
|
||||||
**Definition of Done (DoD):**
|
|
||||||
|
|
||||||
* Local Ollama works out-of-the-box (no key required).
|
|
||||||
* Ollama Cloud is available **only when** an API key is present; no more 401 loops; clear fallback to local.
|
|
||||||
* Chat header shows **context used / context window** and **%**.
|
|
||||||
* Cloud usage shows **hourly / weekly token usage** (tracked locally; limits configurable).
|
|
||||||
* “Internet?” → the model/tooling performs web search automatically; no “I can’t access the internet” replies.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Release Assumptions
|
|
||||||
|
|
||||||
* Provider config has **separate entries** for local (“ollama”) and cloud (“ollama-cloud”), both optional.
|
|
||||||
* **Cloud base URL** is config-driven (default `[Inference] https://ollama.com`), **local** default `http://localhost:11434`.
|
|
||||||
* Cloud requests include `Authorization: Bearer <API_KEY>`.
|
|
||||||
* Token counts (`prompt_eval_count`, `eval_count`) are read from provider responses **if available**; else fallback to local estimation.
|
|
||||||
* Rate-limit quotas unknown → **user-configurable** (`hourly_quota_tokens`, `weekly_quota_tokens`); we still track actual usage.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Commit Plan (Conventional Commits)
|
|
||||||
|
|
||||||
### 1) feat(provider/ollama): health checks, resilient model listing, friendly errors
|
|
||||||
|
|
||||||
* **Why:** Local should “just work”; no brittle failures if model not pulled or daemon is down.
|
|
||||||
* **Implement:**
|
|
||||||
|
|
||||||
* `GET /api/tags` with timeout & retry; on failure → user-friendly banner (“Ollama not reachable on localhost:11434”).
|
|
||||||
* If chat hits “model not found” → suggest `ollama pull <model>` (do **not** auto-pull).
|
|
||||||
* Pre-chat healthcheck endpoint with short timeout.
|
|
||||||
* **AC:** With no Ollama, UI shows actionable error, app stays usable. With Ollama up, models list renders within TTL.
|
|
||||||
* **Tests:** Mock 200/500/timeout; simulate missing model.
|
|
||||||
|
|
||||||
### 2) feat(provider/ollama-cloud): separate provider with auth header & key-gating
|
|
||||||
|
|
||||||
* **Why:** Prevent 401s; enable cloud only when key is present.
|
|
||||||
* **Implement:**
|
|
||||||
|
|
||||||
* New provider “ollama-cloud”.
|
|
||||||
* Build `reqwest::Client` with default headers incl. `Authorization: Bearer <API_KEY>`.
|
|
||||||
* Base URL from config (default `[Inference] https://ollama.com`).
|
|
||||||
* If no key: provider not registered; cloud models hidden.
|
|
||||||
* **AC:** With key → cloud models list & chat work; without key → no cloud provider listed.
|
|
||||||
* **Tests:** No key (hidden), invalid key (401 handled), valid key (happy path).
|
|
||||||
|
|
||||||
### 3) fix(provider/ollama-cloud): handle 401/429 gracefully, auto-degrade
|
|
||||||
|
|
||||||
* **Why:** Avoid dead UX where cloud is selectable but unusable.
|
|
||||||
* **Implement:**
|
|
||||||
|
|
||||||
* Intercept 401/403 → mark provider “unauthorized”, show toast “Cloud key invalid; using local”.
|
|
||||||
* Intercept 429 → toast “Cloud rate limit hit; retry later”; keep provider enabled.
|
|
||||||
* Auto-fallback to last good local provider for the active chat (don’t lose message).
|
|
||||||
* **AC:** Selecting cloud with bad key never traps the user; one click recovers to local.
|
|
||||||
* **Tests:** Simulated 401/429 mid-stream & at start.
|
|
||||||
|
|
||||||
### 4) feat(models/registry): multi-provider registry + aggregated model list
|
|
||||||
|
|
||||||
* **Why:** Show local & cloud side-by-side, clearly labeled.
|
|
||||||
* **Implement:**
|
|
||||||
|
|
||||||
* Registry holds N providers (local, cloud).
|
|
||||||
* Aggregate lists with `provider_tag` = `ollama` / `ollama-cloud`.
|
|
||||||
* De-dupe identical names by appending provider tag in UI (“qwen3:8b · local”, “qwen3:8b-cloud · cloud”).
|
|
||||||
* **AC:** Model picker groups by provider; switching respects selection.
|
|
||||||
* **Tests:** Only local; only cloud; both; de-dup checks.
|
|
||||||
|
|
||||||
### 5) refactor(session): message pipeline to support tool-calls & partial updates
|
|
||||||
|
|
||||||
* **Why:** Prepare for search tool loop and robust streaming.
|
|
||||||
* **Implement:**
|
|
||||||
|
|
||||||
* Centralize stream parser → yields either `assistant_text` chunks **or** `tool_call` messages.
|
|
||||||
* Buffer + flush on `done` or on explicit tool boundary.
|
|
||||||
* **AC:** No regressions in normal chat; tool call boundary events visible to controller.
|
|
||||||
* **Tests:** Stream with mixed content, malformed frames, newline-split JSON.
|
|
||||||
|
|
||||||
### 6) fix(streaming): tolerant JSON lines parser & end-of-stream semantics
|
|
||||||
|
|
||||||
* **Why:** Avoid UI stalls on minor protocol hiccups.
|
|
||||||
* **Implement:**
|
|
||||||
|
|
||||||
* Accept `\n`/`\r\n` delimiters; ignore empty lines; robust `done: true` handling; final metrics extraction.
|
|
||||||
* **AC:** Long completions never hang; final token counts captured.
|
|
||||||
* **Tests:** Fuzz malformed frames.
|
|
||||||
|
|
||||||
### 7) feat(ui/header): context usage indicator (value + %), colorized thresholds
|
|
||||||
|
|
||||||
* **Why:** Transparency: “2560 / 8192 (31%)”.
|
|
||||||
* **Implement:**
|
|
||||||
|
|
||||||
* Track last `prompt_eval_count` (or estimate) as “context used”.
|
|
||||||
* Context window: from model metadata (if available) else config default per provider/model family.
|
|
||||||
* Header shows: `Model · Context 2.6k / 8k (33%)`. Colors: <60% normal, 60–85% warn, >85% danger.
|
|
||||||
* **AC:** Live updates after each assistant turn.
|
|
||||||
* **Tests:** Various windows & counts; edge 0%/100%.
|
|
||||||
|
|
||||||
### 8) feat(usage): token usage tracker (hourly/weekly), local store
|
|
||||||
|
|
||||||
* **Why:** Cloud limits visibility even without official API.
|
|
||||||
* **Implement:**
|
|
||||||
|
|
||||||
* Append per-provider token use after each response: prompt+completion.
|
|
||||||
* Rolling sums: last 60m, last 7d.
|
|
||||||
* Persist to JSON/SQLite in app data dir; prune daily.
|
|
||||||
* Configurable quotas: `providers.ollama_cloud.hourly_quota_tokens`, `weekly_quota_tokens`.
|
|
||||||
* **AC:** `:limits` shows totals & percentages; survives restart.
|
|
||||||
* **Tests:** Time-travel tests; persistence.
|
|
||||||
|
|
||||||
### 9) feat(ui/usage): surface hourly/weekly usage + threshold toasts
|
|
||||||
|
|
||||||
* **Why:** Prevent surprises; act before you hit the wall.
|
|
||||||
* **Implement:**
|
|
||||||
|
|
||||||
* Header second line (when cloud active): `Cloud usage · 12k/50k (hour) · 90k/250k (week)` with % colors.
|
|
||||||
* Toast at 80% & 95% thresholds.
|
|
||||||
* **AC:** Visuals match tracker; toasts fire once per threshold band.
|
|
||||||
* **Tests:** Threshold crossings; reset behavior.
|
|
||||||
|
|
||||||
### 10) feat(tools): generic tool-calling loop (function-call adapter)
|
|
||||||
|
|
||||||
* **Why:** “Use the web” without whining.
|
|
||||||
* **Implement:**
|
|
||||||
|
|
||||||
* Internal tool schema: `{ name, parameters(JSONSchema), invoke(args)->ToolResult }`.
|
|
||||||
* Provider adapter: if model emits a tool call → pause stream, invoke tool, append `tool` message, resume.
|
|
||||||
* Guardrails: max tool hops (e.g., 3), max tokens per hop.
|
|
||||||
* **AC:** Model can call tools mid-answer; transcript shows tool result context to model (not necessarily to user).
|
|
||||||
* **Tests:** Single & chained tool calls; loop cutoff.
|
|
||||||
|
|
||||||
### 11) feat(tool/web.search): HTTP search wrapper for cloud mode; fallback stub
|
|
||||||
|
|
||||||
* **Why:** Default “internet” capability.
|
|
||||||
* **Implement:**
|
|
||||||
|
|
||||||
* Tool name: `web.search` with params `{ query: string }`.
|
|
||||||
* Cloud path: call provider’s search endpoint (configurable path); include API key.
|
|
||||||
* Fallback (no cloud): disabled; model will not see the tool (prevents “I can’t” messaging).
|
|
||||||
* **AC:** Asking “what happened today in Rust X?” triggers one search call and integrates snippets into the answer.
|
|
||||||
* **Tests:** Happy path, 401/key missing (tool hidden), network failure (tool error surfaced to model).
|
|
||||||
|
|
||||||
### 12) feat(commands): `:provider`, `:model`, `:limits`, `:web on|off`
|
|
||||||
|
|
||||||
* **Why:** Fast control for power users.
|
|
||||||
* **Implement:**
|
|
||||||
|
|
||||||
* `:provider` lists/switches active provider.
|
|
||||||
* `:model` lists/switches models under active provider.
|
|
||||||
* `:limits` shows tracked usage.
|
|
||||||
* `:web on|off` toggles exposing `web.search` to the model.
|
|
||||||
* **AC:** Commands work in both TUI and CLI modes.
|
|
||||||
* **Tests:** Command parsing & side-effects.
|
|
||||||
|
|
||||||
### 13) feat(config): explicit sections & env fallbacks
|
|
||||||
|
|
||||||
* **Why:** Clear separation + easy secrets management.
|
|
||||||
* **Implement (example):**
|
|
||||||
|
|
||||||
```toml
|
|
||||||
[providers.ollama]
|
|
||||||
base_url = "http://localhost:11434"
|
|
||||||
list_ttl_secs = 60
|
|
||||||
default_context_window = 8192
|
|
||||||
|
|
||||||
[providers.ollama_cloud]
|
|
||||||
enabled = true
|
|
||||||
base_url = "https://ollama.com" # Hosted default
|
|
||||||
api_key = "" # read from env OLLAMA_API_KEY if empty
|
|
||||||
hourly_quota_tokens = 50000 # configurable; visual only
|
|
||||||
weekly_quota_tokens = 250000 # configurable; visual only
|
|
||||||
list_ttl_secs = 60
|
|
||||||
```
|
|
||||||
* **AC:** Empty key → cloud disabled; env overrides work.
|
|
||||||
* **Tests:** Env vs file precedence.
|
|
||||||
|
|
||||||
### 14) docs(readme): setup & troubleshooting for local + cloud
|
|
||||||
|
|
||||||
* **Why:** Reduce support pings.
|
|
||||||
* **Include:**
|
|
||||||
|
|
||||||
* Local prerequisites; healthcheck tips; “model not found” guidance.
|
|
||||||
* Cloud key setup; common 401/429 causes; privacy note.
|
|
||||||
* Context/limits UI explainer; tool behavior & toggles.
|
|
||||||
* Upgrade notes from v0.1 → v0.2.
|
|
||||||
* **AC:** New users can configure both in <5 min.
|
|
||||||
|
|
||||||
### 15) test(integration): local-only, cloud-only, mixed; auth & rate-limit sims
|
|
||||||
|
|
||||||
* **Why:** Prevent regressions.
|
|
||||||
* **Implement:**
|
|
||||||
|
|
||||||
* Wiremock stubs for cloud: `/tags`, `/chat`, search endpoint, 401/429 cases.
|
|
||||||
* Snapshot test for tool-call transcript roundtrip.
|
|
||||||
* **AC:** Green suite in CI; reproducible.
|
|
||||||
|
|
||||||
### 16) refactor(errors): typed provider errors + UI toasts
|
|
||||||
|
|
||||||
* **Why:** Replace generic “something broke”.
|
|
||||||
* **Implement:** Error enum: `Unavailable`, `Unauthorized`, `RateLimited`, `Timeout`, `Protocol`. Map to toasts/banners.
|
|
||||||
* **AC:** Each known failure surfaces a precise message; logs redact secrets.
|
|
||||||
|
|
||||||
### 17) perf(models): cache model lists with TTL; invalidate on user action
|
|
||||||
|
|
||||||
* **Why:** Reduce network churn.
|
|
||||||
* **AC:** Re-open picker doesn’t re-fetch until TTL or manual refresh.
|
|
||||||
|
|
||||||
### 18) chore(release): bump to v0.2; changelog; package metadata
|
|
||||||
|
|
||||||
* **AC:** `--version` shows v0.2; changelog includes above bullets.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Missing Features vs. Codex / Claude-Code (Queued for v0.3+)
|
|
||||||
|
|
||||||
* **Multi-vendor providers:** OpenAI & Anthropic provider crates with function-calling parity and structured outputs.
|
|
||||||
* **Agentic coding ops:** Safe file edits, `git` ops, shell exec with approval queue.
|
|
||||||
* **“Thinking/Plan” pane:** First-class rendering of plan/reflect steps; optional auto-approve.
|
|
||||||
* **Plugin system:** Tool discovery & permissions; marketplace later.
|
|
||||||
* **IDE integration:** Minimal LSP/bridge (VS Code/JetBrains) to run Owlen as backend.
|
|
||||||
* **Retrieval/RAG:** Local project indexing; selective context injection with token budgeter.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Acceptance Checklist (Release Gate)
|
|
||||||
|
|
||||||
* [ ] Local Ollama: healthcheck OK, models list OK, chat OK.
|
|
||||||
* [ ] Cloud appears only with valid key; no 401 loops; 429 shows toast.
|
|
||||||
* [ ] Header shows context value and %; updates after each turn.
|
|
||||||
* [ ] `:limits` shows hourly/weekly tallies; persists across restarts.
|
|
||||||
* [ ] Asking for current info triggers `web.search` (cloud on) and returns an answer without “I can’t access the internet”.
|
|
||||||
* [ ] Docs updated; sample config included; tests green.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Notes & Caveats
|
|
||||||
|
|
||||||
* **Base URLs & endpoints:** Kept **config-driven**. Defaults are `[Inference]`. If your cloud endpoint differs, set it in `config.toml`.
|
|
||||||
* **Quotas:** No official token quota API assumed; we **track locally** and let you configure limits for UI display.
|
|
||||||
* **Tooling:** If a given model doesn’t emit tool calls, `web.search` won’t be visible to it. You can force exposure via `:web on`.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Lightly Opinionated Guidance (aka sparring stance)
|
|
||||||
|
|
||||||
* Don’t auto-pull models on errors—teach users where control lives.
|
|
||||||
* Keep cloud totally opt-in; hide it on misconfig to reduce paper cuts.
|
|
||||||
* Treat tool-calling like `sudo`: short leash, clear logs, obvious toggles.
|
|
||||||
* Make the header a cockpit, not a billboard: model · context · usage; that’s it.
|
|
||||||
|
|
||||||
If you want, I can turn this into a series of `codex apply`-ready prompts with one commit per run.
|
|
||||||
Reference in New Issue
Block a user