Commit Graph

233 Commits

Author SHA1 Message Date
e0b73acfd6 feat(discovery): AI tiebreak for ambiguous similarity matches
Ship 2 MR 4. Adds per-pair AI-backed classification for operator use
inside the existing Similar panel: an "AI?" button next to each
candidate asks Mistral whether the two queue rows refer to the same
underlying market. Result shown inline as a green "✓ same N%" or
grey "✗ diff N%" chip with the LLM's reason on hover.

No scraping — the classifier works from (name, city, year) alone,
which is enough for the common cases (same venue on two calendars,
typos, cross-year recurrence). Call is short (usually <3s) so the
handler is synchronous, 15s deadline.

Caching
- Migration 000020 adds similarity_ai_cache keyed on a content hash
  over (normalized_name|stadt|year) for both rows, sorted for
  symmetry. Survives queue row accept/reject because the hash is
  about markt-content, not queue-row lifecycle.
- enrich.SimilarityPairKey computes the key. Classify(a,b) and
  Classify(b,a) hit the same entry. Stadt casing drift doesn't
  invalidate.
- Repo methods GetSimilarityCache / SetSimilarityCache + corresponding
  mock hooks. DefaultSimilarityCacheTTL=30d.
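
The symmetry and casing tolerance described above can be sketched as a content hash over sorted, normalized sides. A minimal sketch — function name, separators, and normalization are illustrative, not the real enrich.SimilarityPairKey:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// pairKey builds a symmetric cache key over two (name, city, year)
// tuples: each side is normalized (lower-cased, trimmed), the two
// sides are sorted, and the result is hashed so the key survives
// argument order and casing drift.
func pairKey(nameA, cityA string, yearA int, nameB, cityB string, yearB int) string {
	side := func(name, city string, year int) string {
		return fmt.Sprintf("%s|%s|%d",
			strings.ToLower(strings.TrimSpace(name)),
			strings.ToLower(strings.TrimSpace(city)),
			year)
	}
	sides := []string{side(nameA, cityA, yearA), side(nameB, cityB, yearB)}
	sort.Strings(sides) // symmetry: Classify(a,b) and Classify(b,a) share a key
	sum := sha256.Sum256([]byte(strings.Join(sides, "||")))
	return hex.EncodeToString(sum[:])
}

func main() {
	k1 := pairKey("Mittelaltermarkt", "Esslingen", 2026, "MPS", "Köln", 2026)
	k2 := pairKey("MPS", "köln", 2026, "Mittelaltermarkt", "ESSLINGEN", 2026)
	fmt.Println(k1 == k2) // prints "true": order- and casing-insensitive
}
```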

Mistral integration
- enrich.MistralSimilarityClassifier reuses the same aiPass2
  interface as the enricher. English system prompt asks for
  JSON-only output with {same_market, confidence 0..1, reason}.
  Confidence clamped to [0,1] because models occasionally return
  1.2 or -0.1. Reason is short German justification.
- NoopSimilarityClassifier returns an error — callers must check
  ai.Enabled() before deciding which binding to pass.
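
The clamp mentioned above is trivial but worth pinning down; a minimal sketch (function name invented):

```go
package main

import "fmt"

// clamp01 forces a model-reported confidence into [0,1] — models
// occasionally return values like 1.2 or -0.1, which would render
// as nonsense percentages in the verdict chip.
func clamp01(v float64) float64 {
	if v < 0 {
		return 0
	}
	if v > 1 {
		return 1
	}
	return v
}

func main() {
	fmt.Println(clamp01(1.2), clamp01(-0.1), clamp01(0.85)) // prints "1 0 0.85"
}
```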

Service.ClassifySimilarPair loads both rows, computes pair key,
cache-first, calls classifier on miss, writes cache, returns
verdict. Rejects self-comparison (pair-key collapses). Handler
POST /admin/discovery/queue/:aid/similar/:bid/classify.
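
The cache-first flow with non-poisoning errors can be sketched like this — Verdict and the function bindings are illustrative stand-ins for the real repo/classifier interfaces:

```go
package main

import (
	"errors"
	"fmt"
)

// Verdict is an illustrative stand-in for the real verdict type.
type Verdict struct {
	Same       bool
	Confidence float64
	Reason     string
}

type cacheGet func(key string) (Verdict, bool)
type cacheSet func(key string, v Verdict)
type classifyFn func(a, b string) (Verdict, error)

// classifyPair: reject self-comparison, look up the pair key, call
// the classifier only on a miss, and write the cache only on success
// so classifier errors never poison it.
func classifyPair(key string, get cacheGet, set cacheSet, classify classifyFn, a, b string) (Verdict, error) {
	if a == b {
		return Verdict{}, errors.New("self-comparison rejected")
	}
	if v, ok := get(key); ok {
		return v, nil // cache hit: skip the LLM entirely
	}
	v, err := classify(a, b)
	if err != nil {
		return Verdict{}, err // miss + failure: cache stays clean
	}
	set(key, v)
	return v, nil
}

func main() {
	store := map[string]Verdict{}
	calls := 0
	get := func(k string) (Verdict, bool) { v, ok := store[k]; return v, ok }
	set := func(k string, v Verdict) { store[k] = v }
	classify := func(a, b string) (Verdict, error) {
		calls++
		return Verdict{Same: true, Confidence: 0.9, Reason: "same venue"}, nil
	}
	classifyPair("k1", get, set, classify, "row-a", "row-b")
	classifyPair("k1", get, set, classify, "row-a", "row-b")
	fmt.Println(calls) // prints "1": second call is a cache hit
}
```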

UI: new AI? column inside the Similar panel. Per-candidate pending
state via Set<string>, disabled button while in-flight, inline
verdict chip after response. Tooltip shows the LLM's reason.

Tests: pair-key symmetry + differentiation + casing tolerance;
Mistral classifier happy path, clamping edge cases, error
propagation, bad-JSON handling, Noop rejection. Service tests:
happy path writes cache, cache-hit skips LLM, self-comparison
rejected, classifier errors don't poison the cache.

NewService signature grows by one param (sim enrich.SimilarityClassifier).
All 14 existing callers (routes.go + tests) updated; tests pass nil.
2026-04-24 11:04:15 +02:00
ce32f76731 feat(discovery): per-row LLM enrichment via scrape-then-prompt
Completes the manual two-pass enrichment flow: the crawl-enrich-all
button (MR 3) fills deterministic fields across the queue; this MR
adds a per-row "AI" button that scrapes the row's quellen URLs and
asks Mistral to fill category, opening_hours, description.

Flow per click:
  1. Load row, compute CacheKey(name_normalized, stadt, year).
  2. Cache hit -> skip LLM, merge cached payload onto current
     crawl-enrich base, persist, return.
  3. Miss -> scrape up to 5 quellen URLs via pkg/scrape (goquery
     text extraction, 4000-char truncation), concatenate into labeled
     blocks, call ai.Client.Pass2 with JSON response format.
  4. Parse response into Enrichment{category, opening_hours,
     description}, stamp provenance=llm + model + token counts.
  5. Cache the raw LLM payload (not the merged one) under the tuple
     key with DefaultCacheTTL=30d, so later re-crawls can layer new
     crawl-enrich bases on the same cached answer.
  6. Merge(crawl, llm) -- crawl fields survive. Persist via
     SetEnrichment(status=done). Return merged to the operator.

ErrNoScrapedContent fails fast when zero URLs return usable text;
LLMs without grounding hallucinate, and a 400-style operator error is
better than inventing details. Individual scrape failures don't halt
the flow as long as at least one source succeeds.

pkg/scrape (new, reusable)
- Client.Fetch: HTTP GET, strip script/style/nav/footer/aside via
  goquery, gather body text, collapse whitespace, truncate.
  DefaultTimeout=10s, DefaultMaxChars=4000. User-Agent configurable.
- Tests cover noise stripping, whitespace collapsing, truncation,
  body-less fragments.
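
The post-extraction cleanup step (whitespace collapse + truncation) can be sketched with the stdlib; the goquery traversal that strips script/style/nav/footer/aside is omitted here, and byte-based truncation is a simplification:

```go
package main

import (
	"fmt"
	"strings"
)

// collapseAndTruncate: after HTML noise is stripped and body text
// gathered, runs of whitespace collapse to single spaces and the
// result is truncated to maxChars. Truncation here is byte-based
// for brevity; the real implementation may handle runes.
func collapseAndTruncate(text string, maxChars int) string {
	collapsed := strings.Join(strings.Fields(text), " ")
	if len(collapsed) > maxChars {
		return collapsed[:maxChars]
	}
	return collapsed
}

func main() {
	raw := "  Mittelaltermarkt\n\n  Esslingen \t 2026  "
	fmt.Printf("%q\n", collapseAndTruncate(raw, 4000))
	// prints "Mittelaltermarkt Esslingen 2026"
}
```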

enrich.MistralLLMEnricher
- Takes ai.Client + Scraper (both injectable; tests use stubs).
- Prompt: English system instructions asking for JSON-only output
  with category/opening_hours/description in German. User prompt
  includes markt identifiers, already-filled fields (so the LLM
  doesn't waste tokens re-deriving them), and scraped blocks.
- Tests: happy path, all-scrapes-fail (-> ErrNoScrapedContent),
  partial-scrape-success, empty LLM fields yield no provenance,
  URL cap at 5.

Service.RunLLMEnrichOne + handler POST /admin/discovery/queue/:id/enrich
(sync, 30s timeout). NewService gains llm enrich.LLMEnricher
param; routes.go constructs a MistralLLMEnricher when ai.Client is
enabled, falls back to NoopLLMEnricher otherwise.

UI: per-row AI button next to Similar, tracks per-row pending state
via a Set<string>, disables the button while the request is in
flight, and shows an "AI..." label. Success invalidates the page, and
the row's expanded view picks up the new
category/opening_hours/description fields with llm provenance tags. An
inline error message appears on the row if the enrich action fails.
2026-04-24 10:46:28 +02:00
afe9d916d6 feat(discovery): manual crawl-enrich-all button + payload display
Replaces the originally-planned async-worker design with operator-
triggered bulk runs (see memory/project_ship2_enrichment.md). Crawl-
enrichment is cheap enough to always run against the whole list but
runs only when the admin clicks — the flow stays predictable and the
crawl itself stays fast.

Endpoints
- POST /admin/discovery/enrichment/crawl-all — 202 + goroutine, mirrors
  the crawl pattern. Per-process CAS gate prevents concurrent runs.
- GET  /admin/discovery/enrichment/crawl-all-status — polled shape
  identical to /crawl-status for UI reuse.

Service RunCrawlEnrichAll iterates enrichment_status='pending' rows,
builds an enrich.Input from each, runs CrawlEnrich (consolidation +
Nominatim geocoding via the shared geocoder), and persists via
SetEnrichment(status=done). Per-row errors count toward Failed and
append to a bounded Errors slice; the pass never halts.

Enrich package refactor
- Enrichment, Sources, Provenance constants moved from discovery ->
  enrich (they are the enrich package's own types; discovery previously
  held them for historical reasons).
- CrawlEnrich now takes a narrow enrich.Input / enrich.Contribution so
  the enrich package no longer imports the parent discovery package.
  This breaks the import cycle that appeared once discovery needed to
  call enrich (the MR 2 structure only worked because no caller went
  in that direction yet).
- LLMEnricher takes an LLMRequest (primitives) instead of a
  DiscoveredMarket. NoopLLMEnricher updated; real Mistral impl lands
  in MR 3b.
- CacheKey signature switched from (DiscoveredMarket) to primitive
  (nameNormalized, stadt, year).

Service geocoder wiring: discovery.NewService gains a Geocoder param
(routes.go passes the shared Nominatim client; the interface lives in
discovery to avoid another circular edge with enrich).

UI: "Run crawl-enrich" button next to "Run crawl"; identical poll +
summary card pattern. Queue row expand shows enrichment status badge
plus the PLZ/Venue/Organizer/Lat-Lng fields inline with per-field
provenance tag.

Tests: three new service tests (happy path, per-row SetEnrichment
failure, empty-queue no-op). Existing enrich package tests updated
for the primitive input signature. All 13 test NewService call-sites
updated for the new geocoder param.
2026-04-24 10:29:58 +02:00
dcbf38f6e9 feat(discovery): enrichment foundation — schema, types, crawl-enrich, cache
Lays infrastructure for Ship 2 crawl-time enrichment. Design principles
(see memory/project_ship2_enrichment.md):
- async worker (not inline in crawl) — MR 3 wires it up
- single enrichment jsonb column, not typed columns — shape still in flux
- per-row LLM budget, global soft cap logged
- crawl-enrich runs first; LLM only fills gaps it cannot reach

Migration 000019: adds discovered_markets.enrichment{,_status,_attempts}
and enriched_at; partial index on enrichment_status for the worker's
claim query; enrichment_cache table keyed by sha256(name|city|year).

enrich package:
- crawl.go — pure consolidator over SourceContributions (PLZ, venue,
  organizer), first non-empty wins. Optional Geocoder pulls lat/lng via
  Nominatim; failures are non-fatal. Everything marked provenance=crawl.
- llm.go — LLMEnricher interface + NoopLLMEnricher. Real Mistral-backed
  impl lands in MR 3 along with the worker.
- enrich.go — Merge(base, overlay) with base-wins semantics, enforcing
  the crawl-over-llm invariant at the type level: even a confident LLM
  pass can't overwrite a crawl-populated field.
- cache.go — CacheKey() stable across re-crawls; DefaultCacheTTL=30d.
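
The base-wins invariant in Merge can be sketched like this — the struct fields and function shape are illustrative, reduced to three fields the real type carries among others:

```go
package main

import "fmt"

// Enrichment holds three illustrative fields; the real type has more.
type Enrichment struct {
	PLZ       string
	Venue     string
	Organizer string
}

// merge sketches Merge(base, overlay) with base-wins semantics: an
// overlay (LLM) value only lands where the base (crawl) field is
// empty, so even a confident LLM pass cannot overwrite crawl data.
func merge(base, overlay Enrichment) Enrichment {
	pick := func(b, o string) string {
		if b != "" {
			return b // base wins: crawl data is never overwritten
		}
		return o
	}
	return Enrichment{
		PLZ:       pick(base.PLZ, overlay.PLZ),
		Venue:     pick(base.Venue, overlay.Venue),
		Organizer: pick(base.Organizer, overlay.Organizer),
	}
}

func main() {
	crawl := Enrichment{PLZ: "73728"}
	llm := Enrichment{PLZ: "99999", Venue: "Burg Esslingen", Organizer: "Verein"}
	fmt.Printf("%+v\n", merge(crawl, llm))
	// PLZ stays 73728 (crawl); Venue/Organizer are filled from the LLM
}
```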

Repository: scan/persist the new columns, GetEnrichmentCache /
SetEnrichmentCache / SetEnrichment. The SetEnrichment UPDATE increments
attempts server-side and stamps enriched_at only for terminal states
(done|failed) — 'skipped' keeps the previous timestamp.
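
The terminal-state rule reduces to a small predicate; a minimal sketch (the shipped logic lives inside the SQL UPDATE, and this helper name is invented):

```go
package main

import "fmt"

// stampEnrichedAt mirrors the rule behind the SetEnrichment UPDATE:
// attempts is always incremented server-side, but enriched_at is
// (re)stamped only for terminal states — 'skipped' keeps whatever
// timestamp was already there.
func stampEnrichedAt(status string) bool {
	return status == "done" || status == "failed"
}

func main() {
	for _, s := range []string{"done", "failed", "skipped"} {
		fmt.Println(s, stampEnrichedAt(s))
	}
}
```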

No UI changes and no worker binary yet. Noop LLM enricher in place so
MR 3 can wire the worker without refactoring shape.
2026-04-24 09:55:38 +02:00
52f3e4c009 chore: replace personal emails with contact@marktvogt.de 2026-04-21 10:56:07 +02:00
d6b65501ec security: redact agent ID from helm values; gitignore superpowers docs
Remove Mistral agent ID from agentDiscovery comment in helm values.yaml.
Add docs/superpowers/ to .gitignore to prevent re-tracking internal AI plans.
2026-04-21 09:48:32 +02:00
9232203dd3 Merge branch 'chore/access-ttl-and-ship2-handoff' — Ship 2 handoff + TTL bump 2026-04-19 01:06:14 +02:00
b52ac7d861 docs(ship-2): handoff note + chore(helm): bump JWT access TTL 15m to 2h
Handoff captures end-of-Ship-1 state and Ship 2 scope (§4.10 expanded
product additions: crawl-time enrichment, AI-augmented similarity,
inline enrich-before-accept, detail drawer, eval harness, enrichment
cache, auto-merge during crawl, keyboard shortcuts). §4.12 tracks the
admin auth refresh-on-401 fix; pending that work, JWT_ACCESS_TTL is
bumped from 15m to 2h as interim relief.
2026-04-19 01:05:52 +02:00
95a3dfdef8 Merge branch 'fix/queue-pagination-envelope' — queue UI renders rows again
MR 6's backend + MR 7's UI had mismatched envelope assumptions. Backend
returned pagination as sibling fields to data; UI's ApiResponse<T> wrapper
only typed data, so 'body.data' (the queue) became undefined at runtime.
2026-04-19 00:46:49 +02:00
bddab60686 fix(admin): queue response uses meta envelope; UI reads total from meta
MR 6 backend returned {data, total, limit, offset} as siblings but the
shared ApiResponse<T> envelope only types the data field. The UI's load
function treated queueRes.data as a wrapper and read body.data (undefined)
as the row list. Result: empty queue in UI despite 1384 pending rows
in the DB.

Fix: backend moves total/limit/offset into meta (matches PaginationMeta
convention from web/src/lib/api/types.ts). UI casts to read the meta
slot alongside typed data.
2026-04-19 00:46:05 +02:00
b42a35c049 Merge branch 'feat/merge-conflict-display' — MR 7 per-source contributions visible
Migration 000018 adds sources text[] + source_contributions jsonb to
discovered_markets. Crawler preserves raw per-source RawEvents through
Merge() and service persists them alongside the merged row. Admin UI
gains a merged-sources chip + Datumskonflikt badge and an expandable
Quellen-Vergleich panel showing per-field comparison across sources
with conflicting values highlighted.
2026-04-19 00:28:24 +02:00
cc6c4f2efb feat(discovery): persist and display per-source contributions for merged queue rows
Migration 000018 adds sources text[] + source_contributions jsonb
columns to discovered_markets. Crawler's merger now preserves the raw
per-source RawEvents through Merge() so they can be stored alongside
the merged row. Admin UI gains two surfaces: (a) compact "merged from
source1 + source2" chip + amber Datumskonflikt badge when hinweis
flags it, (b) expandable Quellen-Vergleich panel showing a per-field
comparison table with diverging fields highlighted. Forensic visibility
into what each source said vs what the merger picked.
2026-04-19 00:27:34 +02:00
f22a141615 Merge branch 'feat/admin-queue-pagination-and-similar' — MR 6 queue UX
Queue endpoint returns {data, total, limit, offset}; admin UI exposes
prev/next + page-size + Showing X-Y of Z. Per-row Similar button
fetches MR 5's /queue/:id/similar via a SvelteKit proxy and renders
matches inline. Essential for reviewing the 1000+ row queue post-fix.
2026-04-19 00:14:52 +02:00
2acd0cdc06 feat(admin): queue pagination + per-row Show similar button
Queue endpoint now returns {data, total, limit, offset}. Admin UI
reads ?page + ?limit from URL, renders prev/next + page-size selector
+ "Showing X-Y of Z" label. Per-row Similar button fetches the MR 5
/queue/:id/similar endpoint via a new SvelteKit proxy route and
renders matches inline with score/name/city/date. Essential for
navigating the 1000+ row queue after MR 5's crawl fixes.
2026-04-18 23:59:18 +02:00
5c363944b2 Merge branch 'feat/crawl-similarity-and-fixes' — MR 5 crawler cleanup + similarity
Drops link-check from crawl path (was timing-bound, misleading counter).
Fixes suendenfrei pagination footer-link infinite loop. Adds similarity
helper with Levenshtein-based fuzzy name match + city match + date
proximity, exposed as GET /queue/:id/similar for admin duplicate review.
2026-04-18 20:05:44 +02:00
073e55c7fc feat(discovery): drop link-check from crawl path, fix suendenfrei pagination, add similarity helper
- Service.Crawl no longer link-verifies Quellen/Website for crawler
  events. Those URLs come from real HTML of trusted sources and have
  been implicitly verified at parse time. Removing this makes the
  insert phase complete in well under a minute even for 1500+ events
  and stops attributing timing-limited processing as link failures.
  LinkCheckFailed counter retained for JSON shape stability.

- Suendenfrei pagination now stops on len(events) == 0. Previously the
  site's footer <h3><a> links kept anchors.Length() > 0 indefinitely,
  sending the crawler to page-90 before the outer ctx timeout.

- New similarity helper (SimilarityScore, FindSimilar) and endpoint
  GET /api/v1/admin/discovery/queue/:id/similar. Multiplicative score
  of normalized-name Levenshtein ratio gating city-match and date-
  proximity bonuses. Prevents coincident-city/date events from being
  incorrectly flagged as near-duplicates when their names differ.
Lets admin reviewers flag near-duplicates that slip past exact-match
dedup (date typos, city variants, trailing-word swaps).
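
The multiplicative scheme can be sketched as a Levenshtein-ratio gate over additive bonuses — the weights and function shape here are illustrative, not the shipped SimilarityScore:

```go
package main

import (
	"fmt"
	"strings"
)

func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

// levenshtein: classic dynamic-programming edit distance over runes.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		cur := make([]int, len(rb)+1)
		cur[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			cur[j] = minInt(minInt(prev[j]+1, cur[j-1]+1), prev[j-1]+cost)
		}
		prev = cur
	}
	return prev[len(rb)]
}

// similarity: the normalized-name Levenshtein ratio multiplies the
// city/date bonuses, so unrelated names score low even when city and
// date coincide — the name gates everything.
func similarity(nameA, nameB string, sameCity, closeDate bool) float64 {
	la, lb := strings.ToLower(nameA), strings.ToLower(nameB)
	maxLen := len([]rune(la))
	if l := len([]rune(lb)); l > maxLen {
		maxLen = l
	}
	if maxLen == 0 {
		return 0
	}
	ratio := 1 - float64(levenshtein(la, lb))/float64(maxLen)
	bonus := 1.0
	if sameCity {
		bonus += 0.5 // illustrative weight
	}
	if closeDate {
		bonus += 0.25 // illustrative weight
	}
	score := ratio * bonus
	if score > 1 {
		score = 1
	}
	return score
}

func main() {
	fmt.Printf("%.2f\n", similarity("Mittelaltermarkt", "Mittelaltermarkt", true, true)) // 1.00
	fmt.Printf("%.2f\n", similarity("Mittelaltermarkt", "Weihnachtsmarkt", true, true))
}
```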
2026-04-18 20:05:07 +02:00
cdd43cc45a Merge branch 'feat/crawl-async' — async crawl handler, UI polls status
Gateway (NGF) ignored our HTTPRoute timeouts field (UnsupportedField).
Flipping to fire-and-forget: handler returns 202 immediately, goroutine
runs crawl with detached 5-min context, GET /admin/discovery/crawl-status
returns state, admin UI polls every 3s until running=false.

HTTP requests are now all sub-second; gateway timeout is no longer in
the crawl critical path. Concurrent-run protection via atomic.Bool
(replaces TryLock), rate limit semantics unchanged.
2026-04-18 19:25:37 +02:00
9f286b8029 feat(discovery): async crawl — 202 Accepted, status endpoint, UI polls
Handler.Crawl now spawns a goroutine with a 5-minute detached context
and returns 202 immediately. Admin UI polls the new
GET /admin/discovery/crawl-status every 3s until running=false, then
renders CrawlSummary. Bypasses the 60s nginx-gateway proxy_read_timeout
entirely — HTTP requests are all sub-second.

Concurrency: atomic.Bool guard (CompareAndSwap) replaces TryLock,
resultMu RWMutex protects the summary/error state, rateMu protects
the rate-limit check. Rate limit semantics unchanged (still applies
to admin-session path, bearer-token bypass via context flag).
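
The CompareAndSwap guard can be sketched like this — type and method names are illustrative:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runGuard: CompareAndSwap flips running from false to true exactly
// once, so overlapping triggers are rejected instead of starting a
// second concurrent crawl.
type runGuard struct {
	running atomic.Bool
}

func (g *runGuard) tryStart() bool {
	return g.running.CompareAndSwap(false, true)
}

func (g *runGuard) finish() {
	g.running.Store(false)
}

func main() {
	var g runGuard
	var started atomic.Int32
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if g.tryStart() {
				started.Add(1) // only one goroutine wins the CAS
			}
		}()
	}
	wg.Wait()
	fmt.Println(started.Load()) // prints "1"
}
```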
2026-04-18 19:24:48 +02:00
2ea8a9a6f3 Merge branch 'fix/discovery-crawl-timeout' — crawl survives gateway timeout
Gateway cut the HTTP request at 60s, which cancelled the request ctx
and cascaded into the link-verifier in Service.Crawl's insert pipeline.
Every merged event was then dropped as LinkCheckFailed, resulting in
zero new queue rows despite the crawler parsing ~1500 events.

Fix is three parts: HTTPRoute timeout 300s for /crawl*, insert-phase
context detached from the HTTP request ctx, and a CrawlSummary INFO
log line for diagnosability.
2026-04-18 18:40:30 +02:00
f6e4e5c29f fix(discovery): crawl survives gateway timeout and long-running runs
- HTTPRoute: add 300s request+backendRequest timeout rule for
  /api/v1/admin/discovery/crawl; default rule unchanged. nginx-gateway's
  60s default was cutting the connection mid-crawl.
- Service.Crawl: detach insert pipeline from HTTP request context with
  a 3-minute internal timeout. Previously a canceled request ctx
  cascaded into the link-verifier, failing every URL check and
  counting every merged event as LinkCheckFailed. Inserts now complete
  even if the gateway cut the connection.
- Log CrawlSummary at INFO on completion so outcomes are visible in
  backend logs without needing the HTTP response body.
- New test: TestServiceCrawlDetachesInsertContextFromRequestCtx.
2026-04-18 18:39:21 +02:00
2bb5156c0b Merge branch 'feat/discovery-crawler-mr2' — Ship 1 MR 2 cutover
Deletes the Mistral Pass 0 code path from discovery, flips the k8s
CronJob to the crawler endpoint on a daily schedule, and adds a
Run crawl button to the admin UI that renders CrawlSummary.

Net change: ~-900 lines / +150 lines. Mistral remains wired for Pass 1
and Pass 2 research — only Pass 0 discovery is replaced by the deterministic
5-source Go crawler.
2026-04-18 17:49:08 +02:00
ba453a910f chore(helm): daily discovery cron hits /crawl endpoint 2026-04-18 17:46:39 +02:00
3add4fb7ad refactor(discovery): remove Mistral Pass 0 path; /crawl is canonical
Deletes agent_client.go, agent_client_test.go, and the discovery-compare
diagnostic CLI. Removes Tick/PickBuckets/processOneBucket/processBucketResponse
from Service; renames NewServiceWithCrawler to NewService. Drops BatchSize,
ForwardMonths, AgentDiscovery config fields and their env reads. PickStaleBuckets
and UpdateBucketQueried removed from Repository interface (no callers). Stats
hardcodes forwardMonths=12. /tick route removed; /crawl is now the only machine
path, still protected by requireTickToken middleware.
2026-04-18 17:42:30 +02:00
a729412478 feat(admin): add Run crawl button and CrawlSummary rendering to discovery page 2026-04-18 17:29:05 +02:00
4c7c3dcb37 Merge branch 'feat/discovery-crawler' — DACH discovery crawler MR 1
Replaces Mistral Pass 0 with a deterministic 5-source Go crawler
(marktkalendarium.de, mittelalterkalender.info, festival-alarm.com,
mittelaltermarkt.online Tribe REST, suendenfrei.tv). Pass 1/2 enrichment
paths unchanged. Existing Mistral Tick path preserved alongside; cutover
gated on coverage verification via cmd/discovery-compare.

Spec: docs/superpowers/specs/2026-04-18-dach-discovery-crawler-design.md
Plan: docs/superpowers/plans/2026-04-18-dach-discovery-crawler.md
2026-04-18 17:03:27 +02:00
7c8a8c6419 fix(discovery): review follow-ups — konfidenz signal, end-date default, determinism, rate-limit=0
- Service.Crawl derives Konfidenz from merged source count + rank instead of
  hardcoded "mittel". Two+ sources -> "hoch"; single curated source ->
  "mittel"; single suendenfrei (prose regex) -> "niedrig".
- New AgentStatus constant "crawler" replaces "bestaetigt" for crawler rows
  so the validator's agent-specific rules don't fire on them and operators
  can filter the queue by origin. Added Konfidenz* and AgentStatus*
  constants to model.go.
- Default EndDatum to StartDatum when a source reports a single date
  (festival_alarm one-day events, suendenfrei lines without a "bis" range).
  Avoids Service.Accept rejecting nil-EndDatum rows.
- Sort PerSource names before assembling raw events for merge — makes
  merged output order deterministic across runs.
- NewHandler: manualRateLimitPerHour <= 0 now explicitly disables the
  rate limit (previously silently floored to 1/hour). Documented behavior
  for all three cases in a constructor comment.
- Added four new tests for Service.Crawl failure/quality paths:
  LinkCheckFailed, DedupedQueue, EndDatum default, multi-source Konfidenz.
- Documented the substring-match approximation in
  cmd/discovery-compare/main.go's groupCrawlerByBucket — diagnostic-only,
  not safe for production routing.
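
The Konfidenz derivation from the first bullet can be sketched as a small rule function — the signature is illustrative; the real code works from merged source metadata:

```go
package main

import "fmt"

// deriveKonfidenz: two or more merged sources -> "hoch"; the single
// prose-regex source (suendenfrei) -> "niedrig"; any other single
// curated source -> "mittel".
func deriveKonfidenz(sourceCount int, onlySource string) string {
	switch {
	case sourceCount >= 2:
		return "hoch"
	case onlySource == "suendenfrei":
		return "niedrig"
	default:
		return "mittel"
	}
}

func main() {
	fmt.Println(deriveKonfidenz(2, ""))                 // hoch
	fmt.Println(deriveKonfidenz(1, "marktkalendarium")) // mittel
	fmt.Println(deriveKonfidenz(1, "suendenfrei"))      // niedrig
}
```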
2026-04-18 16:35:26 +02:00
c5a4bc441c feat(cmd): discovery-compare CLI for pre-cutover coverage verification 2026-04-18 16:08:48 +02:00
0bed4401fe feat(config): crawler user-agent and manual rate-limit knobs 2026-04-18 15:50:21 +02:00
91cd4d89b3 feat(discovery): POST /admin/discovery/crawl with mutex and rate limit
Exposes Service.Crawl via two HTTP routes: a bearer-token path that
bypasses the manual rate limit, and an admin-session path subject to a
configurable per-hour cap. A sync.Mutex blocks concurrent runs.
Includes handler tests for mutex reentry and rate limit enforcement.
2026-04-18 15:22:24 +02:00
b3289bc6e6 feat(discovery): Service.Crawl — orchestrate crawler through existing pipeline
Extract normalize helpers into discovery/normalize subpackage to break
the otherwise circular import (discovery/crawler → discovery → crawler).
NormalizeName/NormalizeCity in discovery become thin wrappers; merger.go
switches to discovery/normalize directly.

Adds crawlerRunner interface, NewServiceWithCrawler constructor, CrawlSummary/
SourceSummary types, and Service.Crawl which wires the crawler output through
link-verify, dedup, validation, and insert — same pipeline as processBucketResponse
but without a bucket context (BucketID is nil on crawler-produced rows).
2026-04-18 15:03:02 +02:00
20176dd51f refactor(discovery): validator accepts *Bucket, skips bucket checks when nil 2026-04-18 14:43:07 +02:00
310673940e feat(discovery): migration 000017 — nullable bucket_id; model uses *uuid.UUID 2026-04-18 14:30:54 +02:00
507052e375 feat(discovery/crawler): source config and RunAll orchestrator 2026-04-18 14:09:22 +02:00
c013f6bc54 feat(discovery/crawler): cross-source merger with source-rank tiebreaks 2026-04-18 13:40:21 +02:00
3aed982e1c feat(discovery/crawler): log unparseable suendenfrei entries at INFO 2026-04-18 13:33:51 +02:00
2163621415 feat(discovery/crawler): suendenfrei.tv parser 2026-04-18 13:09:47 +02:00
94aa261c90 refactor(discovery/crawler): hoist land constants; document Tribe date format assumption 2026-04-18 13:04:28 +02:00
1cc7de0bb6 feat(discovery/crawler): mittelaltermarkt.online Tribe REST client 2026-04-18 12:53:32 +02:00
a55bb7e15b docs(discovery/crawler): clarify unused year param in parseDateAttr 2026-04-18 12:48:38 +02:00
93efb90967 feat(discovery/crawler): festival-alarm.com parser 2026-04-18 12:36:01 +02:00
91c058105e feat(discovery/crawler): mittelalterkalender.info parser 2026-04-18 12:24:49 +02:00
e6ec97c09d feat(discovery/crawler): marktkalendarium.de parser 2026-04-18 12:12:13 +02:00
57120beac0 feat(discovery/crawler): polite HTTP fetcher with retry and 429 backoff 2026-04-18 12:02:47 +02:00
31fea6fa3c test(discovery/crawler): add PLZ boundary + range coverage cases 2026-04-18 12:00:42 +02:00
eed76f1e76 docs(discovery/crawler): align PLZ helper comment with implementation 2026-04-18 11:57:34 +02:00
4694804331 feat(discovery/crawler): PLZ-to-land inference helper 2026-04-18 11:54:17 +02:00
e359d06d13 test(discovery/crawler): capture golden fixtures from five sources 2026-04-18 11:45:53 +02:00
5135f0a3be feat(discovery/crawler): scaffold subpackage with Source interface and RawEvent types 2026-04-18 11:36:07 +02:00
adf417b731 fix(research): 429-aware error handling for Pass 1/2
Pass 1 and Pass 2 now detect Mistral web_search rate limits (shared with
the Pass 0 CronJob) and return a proper HTTP 429 with Retry-After: 60
instead of a generic 500 "AI research failed". Pass 2 is enrichment-only,
so rate-limits there fall through with pass1 results intact.

- pkg/ai: new shared IsRateLimit helper + DefaultRetryAfterSeconds=60.
  discovery/service.go drops its local copy and imports the shared one.
- apierror.TooManyRequests now accepts an optional custom message so the
  response body can include "try again in ~60s".
- market/research.go: respondRateLimited helper sets Retry-After,
  downgrades the log line from ERROR to WARN (rate-limits are expected
  state, not a fault), and returns 429 with a structured rate_limited
  code the admin UI can key off of.
2026-04-18 10:33:13 +02:00
8e8bb8d4c3 Merge branch 'feature/discovery-validator' into 'main'
feat(discovery): validator — catches agent self-contradictions before insert

See merge request vikingowl/marktvogt.de!14
2026-04-18 08:05:20 +00:00