Commit Graph

282 Commits

Author SHA1 Message Date
6b8d2d621f fix(research): add beschreibung to prompt, auto-note on apply 2026-04-25 11:05:43 +02:00
282d59e6c1 fix(research): add beschreibung to prompt, auto-note on apply
The beschreibung field was schema-required but absent from ## Felder,
causing the LLM to always return null. Add explicit extraction instruction.

Also reword the opening line, which said "Keine Beschreibungstexte"
("no description texts") — contradicting the field we actually want.

On apply, append "KI-Recherche: DD.MM.YYYY HH:MM" to admin_notes so
there's a permanent audit trail of when research was run.
2026-04-25 11:05:27 +02:00
d7dd003a67 fix(research): convert LLM schema shapes to form-compatible types on apply 2026-04-25 11:01:36 +02:00
dd9a5ae9cc fix(research): convert LLM schema shapes to form-compatible types on apply
Researcher emits {datum_von,von,bis} for opening hours and [{name,betrag,waehrung}]
for admission info — both incompatible with the form's {day,open,close} and
AdmissionInfo shapes. Normalize on apply; extend normalizeDayName to handle
ISO YYYY-MM-DD dates the LLM produces. ResearchPanel renders both LLM and
form-native formats with dedicated table/list views.
2026-04-25 11:01:18 +02:00
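The shape conversion this commit describes can be sketched as follows. Struct and field names are taken from the commit message, not from the repo, and the lowercase German day names are an assumption about what the form expects:

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// llmHours is the shape the researcher LLM emits; formHours is the
// form-native shape. Field names follow the commit message; the real
// structs in the repo may differ.
type llmHours struct {
	DatumVon string `json:"datum_von"`
	Von      string `json:"von"`
	Bis      string `json:"bis"`
}

type formHours struct {
	Day   string `json:"day"`
	Open  string `json:"open"`
	Close string `json:"close"`
}

// germanDays maps Go weekdays to lowercase German day names (an
// assumption about the form's vocabulary).
var germanDays = map[time.Weekday]string{
	time.Monday: "montag", time.Tuesday: "dienstag", time.Wednesday: "mittwoch",
	time.Thursday: "donnerstag", time.Friday: "freitag",
	time.Saturday: "samstag", time.Sunday: "sonntag",
}

// normalizeDayName passes day names through lowercased, and resolves
// ISO YYYY-MM-DD dates (which the LLM also produces) to their weekday.
func normalizeDayName(s string) string {
	if t, err := time.Parse("2006-01-02", s); err == nil {
		return germanDays[t.Weekday()]
	}
	return strings.ToLower(s)
}

// convertHours normalizes each LLM row into the form shape on apply.
func convertHours(in []llmHours) []formHours {
	out := make([]formHours, 0, len(in))
	for _, h := range in {
		out = append(out, formHours{Day: normalizeDayName(h.DatumVon), Open: h.Von, Close: h.Bis})
	}
	return out
}

func main() {
	fmt.Println(convertHours([]llmHours{{DatumVon: "2026-04-25", Von: "10:00", Bis: "18:00"}}))
}
```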
25b682f030 fix(research): remove Grounded from LLM call — incompatible with JSONSchema in Gemini API
Gemini rejects requests that set both GoogleSearchRetrieval and
response_schema. The orchestrator already provides web content via
SearxNG + scraping, so grounding is unnecessary here.
2026-04-25 10:50:01 +02:00
eff7b7ec65 fix(ai): strip models/ prefix from Gemini model names in ListModelNames 2026-04-25 10:46:32 +02:00
016d7a0792 fix(settings): handle missing migrations gracefully, guard AI status page
factory.go: treat DB errors from GetGeminiAPIKey as "no key" and fall
back to the GEMINI_API_KEY env var instead of propagating the error
(which caused a panic/crash when migrations haven't been run yet).

gemini.go: ListModelNames returns a ProviderError when the client is
nil so that connected=false is reported correctly in GetAI instead of
the previous nil,nil→connected=true false positive.

+page.server.ts: catch fetch errors so a backend outage doesn't 500 the
whole page. +page.svelte: guard all data.ai access with {#if data.ai}
so the page renders an error banner instead of crashing on null access.
2026-04-25 10:41:25 +02:00
c6ce0f3a2d feat(discovery): auto-accept high-confidence crawl rows during crawl
When a freshly-inserted discovered_market has a matched series, konfidenz
"hoch" (≥2 sources), and both start/end dates present, Accept() is called
inline with a nil reviewer (mapped to NULL reviewed_by) so the row goes
straight to accepted without manual review.

CrawlSummary gains auto_accepted counter; slog summary logs it.
MarkAccepted / Service.Accept now take *uuid.UUID for reviewer so nil
cleanly maps to NULL in the DB column (already nullable).
2026-04-25 10:08:26 +02:00
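The auto-accept gate described above reduces to a simple predicate. This is a hypothetical sketch (function and parameter names are not the repo's actual API), shown only to make the three conditions explicit:

```go
package main

import "fmt"

// shouldAutoAccept sketches the inline-accept gate: a freshly inserted
// row qualifies only with a matched series, konfidenz "hoch" (>=2
// sources), and both start and end dates present. Names are assumed.
func shouldAutoAccept(hasSeries bool, konfidenz, start, end string) bool {
	return hasSeries && konfidenz == "hoch" && start != "" && end != ""
}

func main() {
	fmt.Println(shouldAutoAccept(true, "hoch", "2026-06-01", "2026-06-03"))
	fmt.Println(shouldAutoAccept(true, "mittel", "2026-06-01", "2026-06-03"))
}
```

Rows failing any condition stay in the queue for manual review, which is why the reviewer column must tolerate NULL for the auto path.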
3ddfd87408 feat(ai): migrate to Google Gemini 2.5 Flash-Lite, drop Mistral/Ollama
Replace the Mistral + Ollama AI stack with a single Google Gemini provider
backed by google.golang.org/genai. API key moves from env/Helm to the DB
(AES-256-GCM, key derived from JWT_SECRET via HKDF) so it can be rotated
via the admin UI without a pod restart.

New:
- pkg/crypto/secretbox — AES-256-GCM encrypt/decrypt for secrets at rest
- pkg/ai/gemini — GeminiProvider with grounding, structured output, usage
  recording, and hot-reload (Reinitialize swaps client under mutex)
- pkg/ai/usage — UsageRecorder interface + UsageEvent struct
- domain/settings/store — DB-backed settings (model, grounding toggle, key)
- domain/settings/usage — UsageRepo implementing UsageRecorder; ai_usage table
- migrations 000021 (system_settings) + 000022 (ai_usage)
- settings API: GET /ai, POST /ai/key, POST /ai/model, POST /ai/grounding,
  GET /ai/usage
- admin UI: 4-card settings page — provider status, model selector, grounding
  toggle with quota, usage rollups + recent-calls table

Removed:
- pkg/ai/ollama, mistral_provider, ratelimiter (+ tests)
- Helm AI_API_KEY, AI_PROVIDER, AI_MODEL_COMPLEX, AI_AGENT_DISCOVERY,
  AI_RATE_LIMIT_RPS env vars

Call sites set Grounded+CallType: research (true/"research"), enrich Pass B
(true/"enrich_b"), similarity (false/"similarity"). Integration test updated
to use a stub ai.Provider instead of a fake Ollama HTTP server.
2026-04-25 09:54:49 +02:00
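The encrypt-at-rest scheme (AES-256-GCM with a key HKDF-derived from JWT_SECRET) can be sketched with the standard library alone. This is a minimal illustration, not the repo's pkg/crypto/secretbox: the HKDF here is a hand-rolled single-block RFC 5869 extract+expand, and the info string is invented:

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// deriveKey is a minimal HKDF-SHA256 (RFC 5869) extract + one expand
// block — exactly 32 bytes, enough for one AES-256 key.
func deriveKey(secret, salt []byte, info string) []byte {
	ext := hmac.New(sha256.New, salt) // nil salt == zero salt per RFC 5869
	ext.Write(secret)
	prk := ext.Sum(nil) // extract
	exp := hmac.New(sha256.New, prk)
	exp.Write([]byte(info))
	exp.Write([]byte{1}) // counter of the first (and only) expand block
	return exp.Sum(nil)
}

// seal encrypts with AES-256-GCM and prepends the random nonce.
func seal(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open splits off the nonce and decrypts; tampering fails authentication.
func open(key, box []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	if len(box) < gcm.NonceSize() {
		return nil, fmt.Errorf("ciphertext too short")
	}
	n := gcm.NonceSize()
	return gcm.Open(nil, box[:n], box[n:], nil)
}

func main() {
	key := deriveKey([]byte("JWT_SECRET value"), nil, "gemini-api-key")
	box, _ := seal(key, []byte("AIza...example"))
	got, _ := open(key, box)
	fmt.Println(string(got))
}
```

Deriving the key rather than storing it means rotating JWT_SECRET invalidates the stored ciphertext, which is the usual trade-off of this design.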
80149de317 feat(discovery): auto-trigger Pass A enrichment after crawl 2026-04-25 08:42:28 +02:00
7552e5073f feat(discovery): auto-trigger Pass A enrichment after crawl
Run CrawlEnrich + Nominatim geocoding in the background immediately
after a crawl discovers new rows. Manual triggers via the
/enrichment/crawl-all endpoint remain for backfills but are no longer
needed for fresh crawls.
2026-04-25 08:42:20 +02:00
c4207865c8 feat(settings): Ollama connection status + runtime model selector
Add /admin/settings/ai endpoint (GET status + available models, POST
model switch). OllamaProvider gains SetModel/Model/ListModels with a
RWMutex so the active model can be swapped at runtime without restart.
New /admin/einstellungen page shows provider, connection badge, and a
model dropdown that calls the API on submit.
2026-04-25 08:29:38 +02:00
f13cd55393 feat(research): wire LLM output to ResearchResult, add beschreibung field
Transform raw LLM felder output into FieldSuggestion[] for the UI panel.
Skip suggestions identical to current market values. Add beschreibung to
both schemas, the Go struct, and the transformation mapping so description
is extracted during research. Fix field labels (Land, Startdatum, Enddatum)
in ResearchPanel.
2026-04-25 08:12:28 +02:00
c18babce5b fix(research): use anyOf for nullable fields in Ollama constraint schema
Ollama's llama.cpp grammar converter supports anyOf with primitive
null — use it for all nullable wert/hinweis fields instead of
type:string-only, so constrained decoding emits JSON null directly.
This also fixes the orchestrator test fixture which uses JSON null
for optional wert fields.
2026-04-24 18:18:05 +02:00
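For reference, the nullable-field shape this commit switches to looks roughly like the following (an illustrative fragment, not the project's actual schema file):

```json
{
  "wert": {
    "anyOf": [
      { "type": "string" },
      { "type": "null" }
    ]
  }
}
```

A plain `{"type": "string"}` here forces the grammar to emit a string even when no value exists; the `anyOf` with a primitive `null` branch lets constrained decoding emit a literal JSON `null` directly.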
67b2eb5d74 feat(market): in-backend research orchestrator with SearxNG + schema-validated LLM
Adds pkg/search (SearxNG impl), domain/market/research (orchestrator + embedded
German prompt and JSON schema), and reinstates POST /markets/:id/research on
top of the new pipeline. Seeds URLs from crawler provenance; falls back to
search when fewer than two distinct seed domains are known.
2026-04-24 17:06:04 +02:00
24e072b63d feat(ai): pluggable provider interface, Ollama + Mistral impls, migrate Pass2 sites
Replaces the Mistral-only ai.Client with an ai.Provider interface backed by
Ollama and Mistral implementations. Migrates enrichment + similarity callers
to ai.Provider.Chat. Research endpoint returns 501 until commit 2 reinstates
it on the new orchestrator.
2026-04-24 16:35:18 +02:00
2adb4882c7 docs(planning): add implementation plan for pluggable AI provider migration 2026-04-24 15:43:12 +02:00
020f4069b5 docs(planning): add spec for pluggable AI provider and local research orchestrator 2026-04-24 15:31:47 +02:00
7a2e81c8c9 Merge branch 'feat/reverse-geocode' into 'main'
feat(market): reverse geocoding

See merge request vikingowl/marktvogt.de!23
2026-04-24 13:01:00 +00:00
c9a2f8622f feat(market): reverse geocoding — lat/lng to address
Complements the existing forward geocoder with Nominatim's /reverse
endpoint so the admin edit form can populate the address from
coordinates (useful when a crawl gave us lat/lng but no street,
e.g. after running crawl-enrich).

Backend:
- geocode.Reverse(ctx, lat, lng) hits Nominatim /reverse with
  addressdetails=1 and accept-language=de, reuses the 1 rps mutex
  already guarding forward calls. Falls through city → town →
  village → municipality → hamlet for small places. Returns nil
  when Nominatim has no match so callers can distinguish "no hit"
  from "all-empty address."
- New DTOs ReverseGeocodeRequest/Response.
- GeocodeHandler.ReverseGeocode wired at POST /reverse-geocode
  behind the same geocodeLimit middleware as /geocode.

Frontend:
- /api/reverse-geocode SvelteKit proxy mirrors /api/geocode.
- MarketForm gets a second button next to "Koordinaten aus Adresse
  ermitteln" — "Adresse aus Koordinaten ermitteln". Writes non-empty
  street/city/zip back into the form; empty result surfaces
  "Keine Adresse gefunden."
2026-04-24 15:00:23 +02:00
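The city → town → village → municipality → hamlet fall-through can be sketched as a key-priority lookup over Nominatim's addressdetails map. A minimal sketch with assumed names (Nominatim does use these keys, but the repo's real function signature may differ):

```go
package main

import "fmt"

// cityFromAddress tries the settlement keys from largest to smallest
// place type, since Nominatim names the settlement differently
// depending on its size.
func cityFromAddress(addr map[string]string) string {
	for _, key := range []string{"city", "town", "village", "municipality", "hamlet"} {
		if v := addr[key]; v != "" {
			return v
		}
	}
	return ""
}

func main() {
	fmt.Println(cityFromAddress(map[string]string{"village": "Kaltenberg"}))
}
```

An empty result here corresponds to the "all-empty address" case; the nil-vs-empty distinction for "no hit at all" lives one level up in the caller.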
a250fddbc2 Merge branch 'fix/discovery-remove-auto-research' into 'main'
fix(discovery): stop auto-firing research on Accept

See merge request vikingowl/marktvogt.de!22
2026-04-24 12:56:43 +00:00
38834c56a3 fix(discovery): stop auto-firing research on Accept
Accepting a row triggered a background POST to
/admin/markets/<edition>/research. The intent was to "warm up" the
edit page, but the result was discarded (fire-and-forget), the edit
page only renders research from its own form action, and the
backend's 5-minute-per-market cooldown still got set — so the
operator's first manual "Mit KI recherchieren" click hit "Bitte
warte 5 Minuten zwischen Recherche-Aufrufen" instead.

Removes the auto-fire. Research runs on user click. If we want
prefetched suggestions later, that needs server-side caching + a
load-time fetch, not fire-and-forget.
2026-04-24 14:56:12 +02:00
b6d7ebd2b1 Merge branch 'fix/discovery-enrich-timeout' into 'main'
fix(discovery): enrich-all timeout + partial progress

See merge request vikingowl/marktvogt.de!21
2026-04-24 12:12:35 +00:00
9cbe654d55 fix(discovery): raise enrich-all timeout + surface partial progress
Pain: a 1400+ row pending queue can't finish crawl-enrich inside the
old 10-minute cap (Nominatim's 1 rps means ~23m minimum). Operators
saw a scary red "Crawl-enrich fehlgeschlagen: context deadline
exceeded" banner even though the pipeline is resumable.

- Introduce enrichAllTimeout constant (45m) sized for ~2700 rows per
  press; the original 10m assumed 600 rows worst-case.
- On context.DeadlineExceeded, translate to a user-facing message
  ("Zeitlimit erreicht nach N von M Zeilen. Erneut starten, um die
  verbleibenden Zeilen zu bearbeiten.") instead of raw Go error.
- Always stash the summary in handler state, even on error, so the
  UI can show partial progress (N/M processed) alongside the message.
- Service: populate DurationMs on early-return too, so the status
  endpoint's duration reflects the partial run instead of zero.

Behavior unchanged when a run finishes cleanly; the queue remains
resumable across presses as before.
2026-04-24 14:11:38 +02:00
950d01e3d4 Merge branch 'fix/discovery-accept-redirect-path' into 'main'
fix(discovery): Accept redirect 404

See merge request vikingowl/marktvogt.de!20
2026-04-24 11:59:53 +00:00
20055acd2e fix(discovery): correct redirect path after Accept
The accept action redirected to /admin/maerkte/<id>/edit, but the
route is /admin/maerkte/[id]/bearbeiten — every other admin link
uses the German segment. Reviewers hit a 404 after every Accept.
2026-04-24 13:59:25 +02:00
8528af8492 Merge branch 'feat/discovery-public-preview' into 'main'
Discovery preview modal

See merge request vikingowl/marktvogt.de!19
2026-04-24 11:51:20 +00:00
2c0154e4ce feat(web): discovery preview modal
Adds a Vorschau button to the detail drawer header that opens a
full-width modal showing an approximate public /markt/[slug] layout
for the candidate row. Lets reviewers sanity-check the user-facing
result before clicking Accept.

- DiscoveryPreview.svelte: renders title, date range, venue/PLZ/city
  location line, organizer, description, opening hours, website link
  and a Leaflet map pin (if lat/lng present). Banner calls out which
  fields (street, admission prices, title image) will come later from
  the organizer so the preview's gaps are not mistaken for bugs.
- DetailDrawer.svelte: adds previewOpen state, an eye-icon Vorschau
  button next to Accept/Reject, and an overlay at z-60 over the
  drawer. Backdrop click or ✕ closes the preview without closing the
  drawer.
2026-04-24 13:50:43 +02:00
c94289a758 chore: ignore local .claude tool state 2026-04-24 13:50:32 +02:00
ef89cc283e Merge branch 'chore/discovery-drawer-polish-and-rank-fix' into 'main'
Discovery drawer polish + rank fix

See merge request vikingowl/marktvogt.de!18
2026-04-24 11:41:03 +00:00
a2dffcb112 fix(discovery): sort source_contributions by rank on read
MergePendingSources re-aggregates the jsonb array with
ORDER BY source_name for DB determinism, but the admin UI treats
index 0 as "Rang 1 = winning source." Legacy auto-merged rows were
therefore surfacing mittelalterkalender (alphabetically first) as
Rang 1 instead of the actual rank-1 source mittelaltermarkt_online.

- Export crawler.SourceRank (was unexported rankOf) so other packages
  in the discovery domain can reference the canonical rank map.
- scanDiscoveredMarket: sort.SliceStable SourceContributions by rank
  after unmarshal. Every read path now sees contributions in rank
  order regardless of how they were persisted; legacy rows
  self-correct on next read, no migration needed.
2026-04-24 13:38:23 +02:00
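The read-path fix can be sketched like this. The rank values and the fallback for unknown sources are illustrative, not the crawler's real rank table:

```go
package main

import (
	"fmt"
	"sort"
)

// sourceRank sketches the exported rank map: lower = more trusted.
var sourceRank = map[string]int{
	"mittelaltermarkt_online": 1,
	"mittelalterkalender":     2,
}

// rankOf sends unknown sources to the back instead of rank 0.
func rankOf(name string) int {
	if r, ok := sourceRank[name]; ok {
		return r
	}
	return 1 << 30
}

type contribution struct{ SourceName string }

// sortByRank reorders contributions so index 0 is the winning source,
// regardless of the alphabetical order the DB aggregation produced.
// SliceStable keeps the persisted order among equal ranks.
func sortByRank(cs []contribution) {
	sort.SliceStable(cs, func(i, j int) bool {
		return rankOf(cs[i].SourceName) < rankOf(cs[j].SourceName)
	})
}

func main() {
	list := []contribution{{"mittelalterkalender"}, {"mittelaltermarkt_online"}}
	sortByRank(list)
	fmt.Println(list[0].SourceName)
}
```

Sorting on read is what makes legacy rows self-correct without a migration: the DB keeps its deterministic alphabetical order, and every consumer sees rank order anyway.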
0d2c9c0f7f fix(web): silence svelte 5 warnings, add missing enrichment proxy
- Wrap $state initializers that read props (MarketForm, ResearchPanel,
  maerkte +page) in untrack() so Svelte 5 stops warning about
  state_referenced_locally. Intent stays "take an initial snapshot of
  the prop" — the warning existed to make that intent explicit.
- Add enrichment/crawl-all-status/+server.ts proxy route; the admin
  discovery page was polling this path and getting 404s in a tight
  loop because the equivalent SvelteKit proxy only existed for the
  plain /crawl-status endpoint.
2026-04-24 13:38:03 +02:00
2fdd8e8222 feat(web): polish discovery admin page and drawer
Discovery drawer
- Wrap each section in a rounded card so boundaries are visible without
  parsing the uppercase headers.
- Header: N Quellen and enrichment_status become consistent pills,
  matching the existing konfidenz pill treatment.
- Enrichment: replace the inline "(llm)"/"(crawl)" trailing text with a
  color-coded badge on the label side (purple = llm, sky = crawl).
- Empty enrichment state now tells the operator how to trigger it.
- Audit timestamp uses a local-time helper so the displayed time
  matches the browser timezone (was UTC-as-local).
- Quellen list: prefix each URL with its hostname for scannability;
  long URLs truncated with full URL in the title attribute.

ContributionsPanel
- Amber border/background now only on conflict rows; every row
  previously got border-amber-100 unconditionally, which diluted the
  conflict signal. Rang 1 badge flipped to emerald so it reads as a
  positive "winner" marker, not a warning.

Discovery page
- Remove dead dateInputValue() function and the stale
  a11y_click_events_have_key_events suppression — both flagged by
  eslint after earlier refactors.
- Render crawl/enrich timestamps in the browser's local timezone via a
  new fmtLocalStamp helper; the previous .slice(0,16).replace('T',' ')
  treated the ISO UTC string as if it were local time.
2026-04-24 13:37:39 +02:00
fcebc37bcb Merge branch 'fix/discovery-route-param-collision' into 'main'
fix(discovery): route param name collision in ClassifySimilarPair

See merge request vikingowl/marktvogt.de!17
2026-04-24 11:08:50 +00:00
c69fe4c07d fix(discovery): route param name collision in ClassifySimilarPair
gin panics at startup with:
  ':aid' in new path '/api/v1/admin/discovery/queue/:aid/similar/:bid/classify'
  conflicts with existing wildcard ':id' in existing prefix
  '/api/v1/admin/discovery/queue/:id'

Gin's trie requires identical parameter names at the same prefix position.
All sibling routes use :id; the tiebreak route was registered with :aid,
crashing the server on every deploy since e0b73ac. Prod has been running
the pre-tiebreak image (52f3e4c0) the whole time because every Helm
upgrade crash-looped and rolled back.

Rename :aid to :id in both the route and the handler's c.Param read.
:bid is in a different slot and stays.
2026-04-24 13:06:08 +02:00
24675cf176 Merge branch 'refactor/discovery-eval-simplify' into 'main'
refactor(discovery-eval): share JSON helpers, trim narration, tighten signatures

See merge request vikingowl/marktvogt.de!16
2026-04-24 11:00:03 +00:00
126cc58cbf refactor(discovery-eval): share JSON helpers, trim narration, tighten signatures
- Extract readJSONFile + writeJSONAtomic in cache.go; category cache
  reuses them (saveCategoryCache is one line, loadCategoryCache uses
  the standard load-or-empty shape).
- Drop dead errMsg param from scoreCategoryResult (always "").
- Wrap writeCategoryReport errors with context for consistency.
- Wrap runSimilarityMode / runCategoryMode's 5 per-mode flags into an
  evalConfig struct so params don't drift.
- Promote validModes to a package-level var.
- Remove redundant cache = new...() fallback after load* (both load
  helpers already return a non-nil empty cache on error).
- Strip narrating / diff-referencing comments per CLAUDE.md; keep the
  one genuine WHY on normalizeCategory (divergence from normalize.Name).

Net -54 lines across 4 files; go build + go vet + tests green.
2026-04-24 12:59:06 +02:00
95d5eabdb5 Merge branch 'feat/discovery-enrichment-eval' — MR 5b category eval mode for LLM enricher 2026-04-24 12:44:42 +02:00
88d0ae9d96 feat(discovery): category eval mode for the LLM enricher
Ship 2 MR 5b. Extends discovery-eval with a second mode that grades
MistralLLMEnricher's category output against labelled ground truth.
Accuracy + per-label confusion matrix so mix-ups between similar
categories (mittelaltermarkt vs ritterfest, weihnachtsmarkt vs
kirchweih) are visible at a glance.

Usage:
  -mode similarity  — existing MR 5 path, unchanged.
  -mode category    — new: scrapes quellen URLs, asks LLM for
                       {category, opening_hours, description},
                       scores category only.

Structure
- main.go: split into runSimilarityMode + runCategoryMode. Both
  share ai.Client construction and the ctx timeout (bumped to 15min
  for category mode since scraping adds I/O). Mode dispatched on
  -mode flag; unknown modes exit 2.
- category.go: fixture / cache / run / metrics / report — parallel
  to the similarity files, not shared because the data shapes differ
  enough that generics would add more noise than they save. Cache
  key is sha256(markt_name_lower|stadt_lower|year|model); separate
  from SimilarityPairKey since that one takes two rows.
- fixtures/category.json: 10 hand-labelled DACH-market rows
  exercising the categories we expect the LLM to produce —
  mittelaltermarkt, weihnachtsmarkt, ritterfest, ritterturnier,
  handwerkermarkt, schlossfest, kirchweih. Each row lists a quelle
  URL the enricher will scrape live (first run only; cache takes
  over after).
- normalizeCategory: strips casing + German umlauts + the -märkte
  plural drift so a correctly-categorised row doesn't get scored
  wrong for cosmetic LLM output variation.

Metrics: Accuracy + per-label confusion matrix. Confusion format is
`want → predictions` with `!` markers on off-diagonal predictions —
readable in a terminal, machine-parseable in the JSON report.
Mismatches are listed at the end with want/got pairs so operators
can spot prompt failures and patch either the prompt or the fixture.

Threshold gate reads accuracy (not F1) — category is multi-class,
precision/recall don't have a single-label meaning.

Tests: normalisation edge cases (casing, umlaut, plural, trimming),
scoring drift tolerance, metrics counts + confusion matrix shape,
errors excluded from confusion, cache round-trip + model scoping,
missing/corrupt file handling.

.gitignore adds .cat-eval-cache.json and cat-eval-report.json.

Follow-ups (MR 5c / later): opening_hours and description scoring.
Both need fuzzier matching (regex structure vs LLM judge) which is
its own design problem.
2026-04-24 12:44:26 +02:00
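The cosmetic normalisation described for normalizeCategory can be sketched as below. The umlaut folding (ä→a rather than ä→ae) and the plural handling are assumptions from the examples in the message, not the repo's actual rules:

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeCategory folds casing, German umlauts, hyphens, and the
// "-märkte" plural drift so e.g. "Mittelalter-Märkte" scores the same
// as "mittelaltermarkt". A sketch; the real function may cover more.
func normalizeCategory(s string) string {
	s = strings.ToLower(strings.TrimSpace(s))
	s = strings.NewReplacer("ä", "a", "ö", "o", "ü", "u", "ß", "ss", "-", "").Replace(s)
	if strings.HasSuffix(s, "markte") {
		s = strings.TrimSuffix(s, "markte") + "markt"
	}
	return s
}

func main() {
	fmt.Println(normalizeCategory("Mittelalter-Märkte"))
}
```

Scoring compares normalized want vs normalized got, so only genuine category mix-ups (mittelaltermarkt vs ritterfest) land in the confusion matrix.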
169fa1b3c4 Merge branch 'feat/discovery-keyboard-shortcuts' — MR 8 keyboard shortcuts for queue review 2026-04-24 12:40:22 +02:00
ef6e1def3d feat(discovery): keyboard shortcuts for queue review
Ship 2 MR 8. Operator-productivity layer on top of the detail drawer:
j/k to walk rows, Enter to open, a/r to accept-reject the selection,
e/s to jump into the drawer with AI enrich / Similar already visible,
? for a help modal listing everything. Escape closes the drawer (or
the help modal if it's open).

Implementation
- selectedId $state drives a subtle indigo ring on the highlighted
  row. Follows drawerId when the drawer opens so Esc → j leaves you
  on the same row. Auto-resets to queue[0] if the selected row
  scrolls off the page (pagination / refresh).
- Global <svelte:window onkeydown> listener. isTypingTarget() bails
  out when focus is inside an input/textarea/select/contenteditable
  so typing in the drawer's edit form doesn't trigger shortcuts.
  Cmd/Ctrl/Alt combos also skipped so browser shortcuts stay intact.
- selectRelative() updates selectedId + scrolls the row into view
  (block: 'nearest') so keyboard-driven scanning through a long
  queue keeps the highlight visible.
- submitRowAction() builds + submits a hidden <form> for a/r so the
  SvelteKit action pipeline (invalidations, form result propagation)
  runs the same way a button click would.

Decisions baked in
- 'e' (AI enrich) and 's' (Similar) open the drawer rather than
  firing the LLM call directly. LLM calls cost money; keeping the
  UI explicit avoids hidden side effects from a misclick.
- Persistent '?' button bottom-right for discoverability — operators
  shouldn't have to read docs to find the help.
- Modal uses click-outside-to-dismiss + Esc + ✕ button, all three.

No backend changes. Frontend-only.
2026-04-24 12:40:12 +02:00
3516999345 Merge branch 'feat/discovery-detail-drawer' — MR 6 detail drawer replaces inline panels 2026-04-24 12:37:47 +02:00
5476578373 feat(discovery): per-row detail drawer replaces inline panels
Ship 2 MR 6. Consolidates every market-specific action that used to
expand into the queue table into a single side drawer. Queue rows
keep Accept/Reject for fast-path review; clicking anywhere else on a
row opens the drawer with the full context.

State via URL param ?drawer=<id>. F5 preserves the open row; links
like /admin/discovery?drawer=<uuid>&sort=konfidenz are shareable and
compose with existing pagination/sort state.

DetailDrawer.svelte (new) sections:
- Header: name, konfidenz, source count, Accept/Reject, close (✕)
- Identity: editable form (name, stadt, bundesland, start/end, website)
- Enrichment: full payload with per-field provenance tags + AI enrich
  button; "Noch keine Enrichment-Daten" empty state
- Quellen: URL list (link-out)
- Quellen-Vergleich: per-source contribution diff (reuses
  ContributionsPanel) — only rendered when >=2 sources
- Similar: candidates loaded lazily on drawer open; AI? tiebreak
  button per candidate shows ✓ same / ✗ diff chips with LLM reason
- Audit: discovered_at, agent_status, hinweis

+page.svelte: removed the three inline <tr> panels (Similar,
Quellen-Vergleich, expanded) and their associated state (expandedId,
similarOpenId, quellenVergleichOpenId, similarLoading, similarEntries,
similarVerdicts, similarClassifying, toggleSimilar, classifySimilar,
toggleQuellenVergleich). Row actions collapsed from 5 buttons
(Accept/Reject/Similar/AI/Quellen-Vergleich) to 2 (Accept/Reject).
The chevron glyph stays as a visual affordance but is inert — the
whole row is clickable. Buttons/forms/links inside the row stop
propagation via a closest()-based guard so fast-path Accept/Reject
don't accidentally open the drawer.

No backend changes; the drawer consumes existing queue data +
existing endpoints (similar, similar/classify, enrich).

Follow-ups: MR 8 adds keyboard shortcuts that naturally compose with
the drawer (j/k navigation, Enter opens, Esc closes).
2026-04-24 12:37:38 +02:00
6218710453 Merge branch 'feat/discovery-eval-harness' — MR 5 eval harness for AI similarity classifier 2026-04-24 12:31:11 +02:00
cf5408ab66 feat(discovery): eval harness for the AI similarity classifier
Ship 2 MR 5. Adds a CLI that measures MistralSimilarityClassifier
against a labelled fixture: precision, recall, F1, accuracy, plus a
confidence calibration table so we can tell whether "90% confident"
verdicts are actually right 90% of the time.

Usage: go run ./backend/cmd/discovery-eval -fixture ... -cache ...
-threshold 0.8 -report eval-report.json.

Structure
- main.go: arg parsing + wiring (ai.Client, classifier, cache,
  metrics). The work happens in realMain() which returns an exit code
  — keeps defers running on error paths.
- fixture.go: parses labelled pairs JSON. Fixture authors only need to
  fill in name/stadt/year; name_normalized falls back to name when
  omitted.
- cache.go: file-backed map keyed by SimilarityPairKey + model string.
  Symmetric (a,b) == (b,a). Atomic writes (temp file + rename) so a
  crashed run cannot corrupt the cache. Corrupt-file load returns an
  empty usable cache and reports the parse error.
- run.go: executes each pair through the classifier, populating the
  cache. Individual classify errors are downgraded to "not correct"
  and logged — the run always finishes so the operator sees whatever
  data is available.
- metrics.go: confusion matrix, P/R/F1/accuracy, per-confidence-bucket
  calibration ([0-0.5), [0.5-0.75), [0.75-0.9), [0.9-1.0]).
  Prints human summary + surfaces highest-confidence mismatches
  first (most actionable for prompt iteration). Optional JSON report.
- Threshold gate: -threshold N exits non-zero when F1<N. Default 0
  (gating disabled until we have a baseline F1).

Fixture: seeds 15 hand-crafted DACH-market pairs covering the edge
cases we actually care about — umlaut drift (Straßburg/Strassburg),
year difference on a recurring series, word-reordering, distinct
events at the same venue, historical proper names (Striezelmarkt),
same city with multiple distinct Christmas markets. Operator extends
over time; each pair carries a `note` explaining the case it locks.

.gitignore adds .eval-cache.json and eval-report.json — neither
should land in the repo.

Tests cover metrics edge cases (all correct, imbalanced,
no-positive-predictions-no-NaN, calibration bucket assignment,
cache accounting, empty input) and cache behaviour (round-trip,
symmetric lookup, model-scoped invalidation, missing/corrupt file
handling, atomic-write leaves no temp files).

Out of scope for MR 5: enrichment field accuracy (fuzzy text
scoring is its own problem — tracked for a follow-up), CI wiring
(needs a baseline F1 first).
2026-04-24 12:26:18 +02:00
525a20b79c Merge branch 'feat/discovery-auto-merge-crawl' — Ship 2 MRs 2–7 (enrichment foundation, crawl-enrich, LLM enrich, AI similarity, auto-merge)
Brings the full Ship 2 feature stack (except the eval harness and detail
drawer) into main. Conflicts resolved:

- repository.go: kept MR 1's sort params + queueOrderByClause builder on
  ListQueue, AND MR 7's FindPendingMatch + MergePendingSources (MR 7
  removed the old QueueHasPending). ListQueue SELECT keeps the enrichment
  columns MR 2 added.
- mock_repo_test.go: kept both MR 1's listQueueCalls capture and the
  MR 2-4 enrichment/similarity hooks.
- service_test.go: ListPendingQueuePaged uses MR 1's sort-param signature;
  NewService uses the MR 2-7 seven-arg form.
- handler_test.go: TestListQueueSortParamWhitelist's NewService call
  bumped from 4 args to 7 (nil geocoder, nil llm enricher, nil sim
  classifier).

Features landing on main:
- MR 2: enrichment schema (migration 000019), jsonb payload, enrich
  package with Merge/CacheKey/NoopLLMEnricher.
- MR 3: manual crawl-enrich-all button + async 202 status endpoint.
- MR 3b: per-row LLM enrich via scrape-then-prompt (pkg/scrape +
  MistralLLMEnricher).
- MR 4: AI similarity tiebreak (migration 000020), MistralSimilarityClassifier,
  per-candidate AI? button in the Similar panel.
- MR 7: cross-crawl auto-merge for new sources on pending queue rows
  (FindPendingMatch + MergePendingSources, AutoMerged counter).
2026-04-24 12:13:30 +02:00
28202c71df Merge branch 'feat/discovery-queue-sort' — MR 1 sortable queue, default konfidenz desc 2026-04-24 12:05:16 +02:00
c06788a63d feat(discovery): auto-merge queue rows across crawl runs
Ship 2 MR 7. Replaces the "drop on duplicate" branch of the crawl
loop with a cross-run auto-merge: when a new crawl brings a source
that a pending queue row doesn't yet carry, the new source's data
merges into the existing row instead of spawning a second entry.
Operator review burden stays bounded to one row per market even as
coverage grows across sources.

Konfidenz upgrades come for free: a row that starts with one source
at konfidenz=mittel flips to hoch the moment a second independent
source confirms the same (name, city, start_date) triple.

Repo changes
- QueueHasPending (bool) replaced by FindPendingMatch returning
  *DiscoveredMarket. Same exact-tuple lookup; now callers see the
  full match so they can merge.
- MergePendingSources appends new sources/quellen/contributions onto
  a pending row using set-union semantics. source_contributions
  dedupe by SourceName so repeat crawls don't stack duplicate entries.
  Konfidenz and hinweis are overwritten with caller-computed values.
- Idempotent: send the same delta twice, nothing changes the second
  time.

Service.Crawl flow
- On match + incoming source already on the row -> DedupedQueue.
  Same semantic as before, just more tightly scoped (same source
  re-emits an event; previously any match counted as dedup).
- On match + incoming source not yet on the row -> auto-merge path:
  compute the source/quellen/contribution delta, call
  MergePendingSources, count in summary.AutoMerged.
- The crawlerKonfidenz helper is now a thin wrapper over a shared
  konfidenzForSources(sources []string), reused by the merge path.
  Source-name constants extracted to un-hardcode the switch cases
  and the test references.

Summary + UI
- CrawlSummary gains AutoMerged int. Logged alongside the other
  counters.
- +page.svelte crawl-result grid gets an "Auto-merged" tile.

Tests
- Same-source redundant pickup -> DedupedQueue=1, no MergePendingSources
  call, no insert.
- New-source auto-merge -> AutoMerged=1, MergePendingSources called with
  exact delta (addSources=[new only], addQuellen=[new only], addContribs
  labelled with new source_name), konfidenz upgraded to hoch.
- Existing TestServiceCrawlDedupQueue renamed to
  TestServiceCrawlDedupQueue_SameSourceRedundant reflecting the
  tightened semantic.

No migration — existing text[] and jsonb columns support the union
operations via SQL.
2026-04-24 12:01:01 +02:00
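The konfidenz upgrade rule shared by the crawl and merge paths reduces to counting distinct sources. A sketch with thresholds taken from the commit text ("mittel" for one source, "hoch" from two independent sources); the real konfidenzForSources may weight sources differently:

```go
package main

import "fmt"

// konfidenzForSources counts distinct source names: one source stays
// "mittel", two or more independent confirmations upgrade to "hoch".
func konfidenzForSources(sources []string) string {
	seen := map[string]bool{}
	for _, s := range sources {
		seen[s] = true
	}
	if len(seen) >= 2 {
		return "hoch"
	}
	return "mittel"
}

func main() {
	fmt.Println(konfidenzForSources([]string{"mittelalterkalender", "mittelaltermarkt_online"}))
}
```

Deduplicating by name before counting is what makes the upgrade "for free" only on genuinely independent confirmation — a repeat crawl of the same source cannot inflate konfidenz.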
e0b73acfd6 feat(discovery): AI tiebreak for ambiguous similarity matches
Ship 2 MR 4. Adds per-pair AI-backed classification for operator use
inside the existing Similar panel: an "AI?" button next to each
candidate asks Mistral whether the two queue rows refer to the same
underlying market. Result shown inline as a green "✓ same N%" or
grey "✗ diff N%" chip with the LLM's reason on hover.

No scraping — the classifier works from (name, city, year) alone,
which is enough for the common cases (same venue on two calendars,
typos, cross-year recurrence). Call is short (usually <3s) so the
handler is synchronous, 15s deadline.

Caching
- Migration 000020 adds similarity_ai_cache keyed on a content hash
  over (normalized_name|stadt|year) for both rows, sorted for
  symmetry. Survives queue row accept/reject because the hash is
  about markt-content, not queue-row lifecycle.
- enrich.SimilarityPairKey computes the key. Classify(a,b) and
  Classify(b,a) hit the same entry. Stadt casing drift doesn't
  invalidate.
- Repo methods GetSimilarityCache / SetSimilarityCache + corresponding
  mock hooks. DefaultSimilarityCacheTTL=30d.

Mistral integration
- enrich.MistralSimilarityClassifier reuses the same aiPass2
  interface as the enricher. English system prompt asks for
  JSON-only output with {same_market, confidence 0..1, reason}.
  Confidence is clamped to [0,1] because models occasionally return
  1.2 or -0.1. Reason is a short German justification.
- NoopSimilarityClassifier returns an error — callers must check
  ai.Enabled() before deciding which binding to pass.

Service.ClassifySimilarPair loads both rows, computes the pair key,
checks the cache first, calls the classifier on a miss, writes the
cache, and returns the verdict. Rejects self-comparison (the pair
key collapses). Handler:
POST /admin/discovery/queue/:aid/similar/:bid/classify.

UI: new AI? column inside the Similar panel. Per-candidate pending
state via Set<string>, disabled button while in-flight, inline
verdict chip after response. Tooltip shows the LLM's reason.

Tests: pair-key symmetry + differentiation + casing tolerance;
Mistral classifier happy path, clamping edge cases, error
propagation, bad-JSON handling, Noop rejection. Service tests:
happy path writes cache, cache-hit skips LLM, self-comparison
rejected, classifier errors don't poison the cache.

NewService signature grows by one param
(sim enrich.SimilarityClassifier). All 14 existing callers
(routes.go + tests) updated; tests pass nil.
2026-04-24 11:04:15 +02:00
ce32f76731 feat(discovery): per-row LLM enrichment via scrape-then-prompt
Completes the manual two-pass enrichment flow: the crawl-enrich-all
button (MR 3) fills deterministic fields across the queue; this MR
adds a per-row "AI" button that scrapes the row's quellen URLs and
asks Mistral to fill category, opening_hours, description.

Flow per click:
  1. Load row, compute CacheKey(name_normalized, stadt, year).
  2. Cache hit -> skip LLM, merge cached payload onto current
     crawl-enrich base, persist, return.
  3. Miss -> scrape up to 5 quellen URLs via pkg/scrape (goquery
     text extraction, 4000-char truncation), concatenate into labeled
     blocks, call ai.Client.Pass2 with JSON response format.
  4. Parse response into Enrichment{category, opening_hours,
     description}, stamp provenance=llm + model + token counts.
  5. Cache the raw LLM payload (not the merged one) under the tuple
     key with DefaultCacheTTL=30d, so later re-crawls can layer new
     crawl-enrich bases on the same cached answer.
  6. Merge(crawl, llm) -- crawl fields survive. Persist via
     SetEnrichment(status=done). Return merged to the operator.
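The crawl-wins merge in steps 2 and 6 can be sketched as follows. Field names come from the commit; the struct and merge function are illustrative, not the real Merge implementation:

```go
package main

import "fmt"

// Enrichment holds the three LLM-fillable fields from the commit.
type Enrichment struct {
	Category     string
	OpeningHours string
	Description  string
}

// merge layers the LLM payload under the crawl-enrich base:
// crawl fields survive, the LLM only fills the gaps. This matches
// the Merge(crawl, llm) semantics described above.
func merge(crawl, llm Enrichment) Enrichment {
	out := crawl
	if out.Category == "" {
		out.Category = llm.Category
	}
	if out.OpeningHours == "" {
		out.OpeningHours = llm.OpeningHours
	}
	if out.Description == "" {
		out.Description = llm.Description
	}
	return out
}

func main() {
	crawl := Enrichment{Category: "weihnachtsmarkt"}
	llm := Enrichment{Category: "markt", Description: "Traditioneller Markt."}
	fmt.Printf("%+v\n", merge(crawl, llm))
}
```

Caching the raw LLM payload rather than the merged result keeps this layering repeatable: a later re-crawl supplies a fresh crawl base and the same cached llm side.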

ErrNoScrapedContent fails fast when zero URLs return usable text;
LLMs without grounding hallucinate, and a 400-style operator error is
better than inventing details. Individual scrape failures don't halt
the flow as long as at least one source succeeds.

pkg/scrape (new, reusable)
- Client.Fetch: HTTP GET, strip script/style/nav/footer/aside via
  goquery, gather body text, collapse whitespace, truncate.
  DefaultTimeout=10s, DefaultMaxChars=4000. User-Agent configurable.
- Tests cover noise stripping, whitespace collapsing, truncation,
  body-less fragments.
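Setting the goquery node stripping aside, the collapse-and-truncate tail of Fetch can be sketched with the standard library alone. The constant value is from the commit; the function name is hypothetical, and byte-based truncation is used here for brevity (the real code may cut on rune boundaries):

```go
package main

import (
	"fmt"
	"strings"
)

const defaultMaxChars = 4000 // DefaultMaxChars from the commit

// collapseAndTruncate normalizes runs of whitespace to single spaces,
// then cuts the result at max bytes: the last two steps the commit
// describes for Client.Fetch.
func collapseAndTruncate(text string, max int) string {
	collapsed := strings.Join(strings.Fields(text), " ")
	if len(collapsed) > max {
		return collapsed[:max]
	}
	return collapsed
}

func main() {
	raw := "  Öffnungszeiten:\n\n\t10:00 -  20:00   Uhr "
	fmt.Printf("%q\n", collapseAndTruncate(raw, defaultMaxChars))
	// "Öffnungszeiten: 10:00 - 20:00 Uhr"
}
```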

enrich.MistralLLMEnricher
- Takes ai.Client + Scraper (both injectable; tests use stubs).
- Prompt: English system instructions asking for JSON-only output
  with category/opening_hours/description in German. User prompt
  includes markt identifiers, already-filled fields (so the LLM
  doesn't waste tokens re-deriving them), and scraped blocks.
- Tests: happy path, all-scrapes-fail (-> ErrNoScrapedContent),
  partial-scrape-success, empty LLM fields yield no provenance,
  URL cap at 5.

Service.RunLLMEnrichOne + handler
POST /admin/discovery/queue/:id/enrich (sync, 30s timeout).
NewService gains an llm enrich.LLMEnricher
param; routes.go constructs a MistralLLMEnricher when ai.Client is
enabled, falls back to NoopLLMEnricher otherwise.

UI: per-row AI button next to Similar, tracks per-row pending state
via a Set<string>, disables the button while the request is in
flight, and shows an "AI..." label. On success the page is
invalidated and the row's expanded view picks up the new
category/opening_hours/description fields with llm provenance tags.
An inline error message appears on the row if the enrich action fails.
2026-04-24 10:46:28 +02:00