feat: Ollama/gemma4 compat — /init flow, stream filter, safety fixes

provider/openai:
- Fix doubled tool call args (argsComplete flag): Ollama sends complete
  args in the first streaming chunk then repeats them as delta, causing
  doubled JSON and 400 errors in elfs
- Handle fs: prefix (gemma4 uses fs:grep instead of fs.grep)
- Add Reasoning field support for Ollama thinking output

cmd/gnoma:
- Early TTY detection so logger is created with correct destination
  before any component gets a reference to it (fixes slog WARN bleed
  into TUI textarea)

permission:
- Exempt spawn_elfs and agent tools from safety scanner: elf prompt text
  may legitimately mention .env/.ssh/credentials patterns and should not
  be blocked

tui/app:
- /init retry chain: no-tool-calls → spawn_elfs nudge → write nudge
  (ask for plain text output) → TUI fallback write from streamBuf
- looksLikeAgentsMD + extractMarkdownDoc: validate and clean fallback
  content before writing (reject refusals, strip narrative preambles)
- Collapse thinking output to 3 lines; ctrl+o to expand (live stream
  and committed messages)
- Stream-level filter for model pseudo-tool-call blocks: suppresses
  <<tool_code>>...</tool_code>> and <<function_call>>...<tool_call|>
  from entering streamBuf across chunk boundaries
- sanitizeAssistantText regex covers both block formats
- Reset streamFilterClose at every turn start
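
The stream-level filter in the tui/app notes has to cope with markers that are split across chunks. The following is a minimal sketch of one way to do that, not the code from this commit: the type `streamFilter`, its methods, and `partialSuffix` are hypothetical names, and only the two marker pairs are taken from the message above. The idea is to hold back any chunk tail that could be the start of a marker until the next chunk resolves it.

```go
package main

import (
	"fmt"
	"strings"
)

// blockMarkers pairs each opening marker with its closing marker.
// The two formats are the ones named in the commit message.
var blockMarkers = [][2]string{
	{"<<tool_code>>", "</tool_code>>"},
	{"<<function_call>>", "<tool_call|>"},
}

// streamFilter is a hypothetical stand-in for the TUI's filter state.
type streamFilter struct {
	closeMarker string // non-empty while inside a suppressed block
	pending     string // held-back tail that may be the start of a marker
}

// Reset clears all state; the commit resets its state at every turn start.
func (f *streamFilter) Reset() { *f = streamFilter{} }

// partialSuffix returns the length of the longest proper prefix of marker
// that is also a suffix of s.
func partialSuffix(s, marker string) int {
	limit := len(marker) - 1
	if limit > len(s) {
		limit = len(s)
	}
	for n := limit; n > 0; n-- {
		if strings.HasSuffix(s, marker[:n]) {
			return n
		}
	}
	return 0
}

// Write consumes one chunk and returns the text that is safe to display.
func (f *streamFilter) Write(chunk string) string {
	buf := f.pending + chunk
	f.pending = ""
	var out strings.Builder
	for buf != "" {
		if f.closeMarker != "" {
			// Inside a block: drop everything up to and including the close marker.
			i := strings.Index(buf, f.closeMarker)
			if i < 0 {
				n := partialSuffix(buf, f.closeMarker)
				f.pending = buf[len(buf)-n:] // keep only a possible partial close marker
				return out.String()
			}
			buf = buf[i+len(f.closeMarker):]
			f.closeMarker = ""
			continue
		}
		// Outside a block: find the earliest complete opening marker.
		first, closeM, openLen := -1, "", 0
		for _, m := range blockMarkers {
			if i := strings.Index(buf, m[0]); i >= 0 && (first < 0 || i < first) {
				first, closeM, openLen = i, m[1], len(m[0])
			}
		}
		if first < 0 {
			// No complete marker: emit all but a tail that might begin one.
			hold := 0
			for _, m := range blockMarkers {
				if n := partialSuffix(buf, m[0]); n > hold {
					hold = n
				}
			}
			out.WriteString(buf[:len(buf)-hold])
			f.pending = buf[len(buf)-hold:]
			return out.String()
		}
		out.WriteString(buf[:first])
		buf = buf[first+openLen:]
		f.closeMarker = closeM
	}
	return out.String()
}

// Flush returns any held-back text at end of stream. An unterminated
// block is discarded.
func (f *streamFilter) Flush() string {
	tail := f.pending
	if f.closeMarker != "" {
		tail = ""
	}
	f.Reset()
	return tail
}

func main() {
	f := &streamFilter{}
	var shown strings.Builder
	// Both the opening and the closing marker are split across chunks here.
	for _, chunk := range []string{"Hello <<tool", "_code>>print(1)</tool", "_code>> world"} {
		shown.WriteString(f.Write(chunk))
	}
	shown.WriteString(f.Flush())
	fmt.Printf("%q\n", shown.String())
}
```

Holding back a possible marker prefix costs at most a few characters of display latency. In exchange, no fragment of a marker ever reaches the buffer, which a per-chunk regex cannot guarantee.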
@@ -99,17 +99,19 @@ type QualityThreshold struct {
 	Target float64 // ideal
 }
 
+// DefaultThresholds are calibrated for M4 heuristic scores (range ~0–0.85).
+// M9 will replace these with bandit-derived values once quality data accumulates.
 var DefaultThresholds = map[TaskType]QualityThreshold{
-	TaskBoilerplate:    {0.50, 0.70, 0.80},
-	TaskGeneration:     {0.60, 0.75, 0.88},
-	TaskRefactor:       {0.65, 0.78, 0.90},
-	TaskReview:         {0.70, 0.82, 0.92},
-	TaskUnitTest:       {0.60, 0.75, 0.85},
-	TaskPlanning:       {0.75, 0.88, 0.95},
-	TaskOrchestration:  {0.80, 0.90, 0.96},
-	TaskSecurityReview: {0.88, 0.94, 0.99},
-	TaskDebug:          {0.65, 0.80, 0.90},
-	TaskExplain:        {0.55, 0.72, 0.85},
+	TaskBoilerplate:    {0.40, 0.55, 0.70}, // any capable arm works
+	TaskGeneration:     {0.45, 0.60, 0.75},
+	TaskRefactor:       {0.50, 0.65, 0.78},
+	TaskReview:         {0.55, 0.68, 0.80},
+	TaskUnitTest:       {0.45, 0.60, 0.75},
+	TaskPlanning:       {0.60, 0.72, 0.82},
+	TaskOrchestration:  {0.65, 0.75, 0.83},
+	TaskSecurityReview: {0.70, 0.78, 0.84}, // requires thinking or large context window
+	TaskDebug:          {0.50, 0.65, 0.78},
+	TaskExplain:        {0.40, 0.55, 0.72},
 }
 
 // ClassifyTask infers a TaskType from the user's prompt using keyword heuristics.
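
One way to read this recalibration: with heuristic scores topping out around 0.85, most of the previous targets could never be met. The sketch below checks that under stated assumptions. The local `threshold` type and the field names `Min` and `Acceptable` are hypothetical (only `Target` is visible in the diff), the 0.85 ceiling is the approximate figure from the comment, and task names are plain strings rather than the real `TaskType`.

```go
package main

import "fmt"

// threshold mirrors the three positional values in the diff. Only Target is
// named in the visible source; Min and Acceptable are hypothetical names.
type threshold struct {
	Min, Acceptable, Target float64
}

// scoreCeiling is the approximate upper bound of M4 heuristic scores,
// taken from the comment in the diff (an assumption, not a measured value).
const scoreCeiling = 0.85

// oldThresholds holds the values removed by the commit.
var oldThresholds = map[string]threshold{
	"Boilerplate":    {0.50, 0.70, 0.80},
	"Generation":     {0.60, 0.75, 0.88},
	"Refactor":       {0.65, 0.78, 0.90},
	"Review":         {0.70, 0.82, 0.92},
	"UnitTest":       {0.60, 0.75, 0.85},
	"Planning":       {0.75, 0.88, 0.95},
	"Orchestration":  {0.80, 0.90, 0.96},
	"SecurityReview": {0.88, 0.94, 0.99},
	"Debug":          {0.65, 0.80, 0.90},
	"Explain":        {0.55, 0.72, 0.85},
}

// newThresholds holds the values added by the commit.
var newThresholds = map[string]threshold{
	"Boilerplate":    {0.40, 0.55, 0.70},
	"Generation":     {0.45, 0.60, 0.75},
	"Refactor":       {0.50, 0.65, 0.78},
	"Review":         {0.55, 0.68, 0.80},
	"UnitTest":       {0.45, 0.60, 0.75},
	"Planning":       {0.60, 0.72, 0.82},
	"Orchestration":  {0.65, 0.75, 0.83},
	"SecurityReview": {0.70, 0.78, 0.84},
	"Debug":          {0.50, 0.65, 0.78},
	"Explain":        {0.40, 0.55, 0.72},
}

// unreachable counts task types whose Target exceeds the score ceiling.
func unreachable(ths map[string]threshold, ceiling float64) int {
	n := 0
	for _, th := range ths {
		if th.Target > ceiling {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println("old targets above ceiling:", unreachable(oldThresholds, scoreCeiling))
	fmt.Println("new targets above ceiling:", unreachable(newThresholds, scoreCeiling))
}
```

A check like this could serve as a unit test guarding future edits to the table, assuming the ceiling is kept in sync with the scorer.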