Compare commits

18 Commits

| SHA1 |
|------|
| 3a4aabff1d |
| d94d5ba03a |
| 75770b1bd8 |
| edd7c94507 |
| 6868027a1c |
| 1063bec248 |
| cf4981f3b2 |
| 7cc0df2c78 |
| e19b6330e9 |
| c194a4e0e9 |
| 04c3018360 |
| 2699f1cd5c |
| 9f313e6599 |
| 802db229a6 |
| 14b566ce2a |
| 7ef29aba37 |
| 3c8d811cdc |
| 5cab71dd78 |
**.github/workflows/release.yml** (vendored · 25 changes)
```diff
@@ -17,11 +17,30 @@ jobs:
         with:
           fetch-depth: 0
 
+      - name: Wait for Gitea release
+        run: sleep 60
+
+      - name: Fetch release notes from Gitea
+        id: gitea_notes
+        env:
+          TAG_NAME: ${{ github.ref_name }}
+        run: |
+          NOTES=$(curl -s "https://somegit.dev/api/v1/repos/vikingowl/vessel/releases/tags/${TAG_NAME}" | jq -r '.body // empty')
+          if [ -n "$NOTES" ]; then
+            echo "found=true" >> $GITHUB_OUTPUT
+            {
+              echo "notes<<EOF"
+              echo "$NOTES"
+              echo "EOF"
+            } >> $GITHUB_OUTPUT
+          else
+            echo "found=false" >> $GITHUB_OUTPUT
+            echo "notes=See the [full release notes on Gitea](https://somegit.dev/vikingowl/vessel/releases/tag/${TAG_NAME}) for detailed information." >> $GITHUB_OUTPUT
+          fi
+
       - name: Create GitHub Release
         uses: softprops/action-gh-release@v2
         with:
-          generate_release_notes: true
-          body: |
-            See the [full release notes on Gitea](https://somegit.dev/vikingowl/vessel/releases) for detailed information.
+          body: ${{ steps.gitea_notes.outputs.notes }}
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```
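The `gitea_notes` step boils down to one GET against the Gitea releases API plus extraction of the `body` field, exactly what the `curl ... | jq -r '.body // empty'` pipeline does. A minimal Go sketch of the same lookup, handy for testing the endpoint outside CI (the URL and `body` field come from the step above; the helper name and fallback handling are illustrative):

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// fetchReleaseNotes mirrors the workflow's curl | jq pipeline: it asks Gitea
// for the release attached to a tag and returns its body text.
func fetchReleaseNotes(tag string) (string, error) {
	url := "https://somegit.dev/api/v1/repos/vikingowl/vessel/releases/tags/" + tag
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("gitea returned %s", resp.Status)
	}
	// Only the body field matters here, matching jq's '.body // empty'.
	var release struct {
		Body string `json:"body"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&release); err != nil {
		return "", err
	}
	return release.Body, nil
}

func main() {
	notes, err := fetchReleaseNotes("v0.4.11")
	if err != nil || notes == "" {
		// Same fallback idea as the workflow's found=false branch.
		fmt.Println("no notes found; link to the Gitea release page instead")
		return
	}
	fmt.Println(notes)
}
```

The `sleep 60` before this step gives Gitea time to publish the release object, since the workflow fires on the tag push rather than on release creation.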
**.gitignore** (vendored · 9 changes)
```diff
@@ -36,3 +36,12 @@ docker-compose.override.yml
 
 # Claude Code project instructions (local only)
 CLAUDE.md
+
+# Dev artifacts
+dev.env
+backend/vessel-backend
+data/
+backend/data-dev/
+
+# Generated files
+frontend/static/pdf.worker.min.mjs
```
**README.md** (426 changes)
```diff
@@ -9,34 +9,23 @@
 </p>
 
 <p align="center">
 <a href="#why-vessel">Why Vessel</a> •
 <a href="#features">Features</a> •
 <a href="#screenshots">Screenshots</a> •
 <a href="#quick-start">Quick Start</a> •
 <a href="#installation">Installation</a> •
-<a href="#roadmap">Roadmap</a>
+<a href="https://github.com/VikingOwl91/vessel/wiki">Documentation</a> •
+<a href="#contributing">Contributing</a>
 </p>
 
-<p align="center">
-<img src="https://img.shields.io/badge/SvelteKit-5.0-FF3E00?style=flat-square&logo=svelte&logoColor=white" alt="SvelteKit 5">
-<img src="https://img.shields.io/badge/Svelte-5.16-FF3E00?style=flat-square&logo=svelte&logoColor=white" alt="Svelte 5">
-<img src="https://img.shields.io/badge/Go-1.24-00ADD8?style=flat-square&logo=go&logoColor=white" alt="Go 1.24">
-<img src="https://img.shields.io/badge/TypeScript-5.7-3178C6?style=flat-square&logo=typescript&logoColor=white" alt="TypeScript">
-<img src="https://img.shields.io/badge/Tailwind-3.4-06B6D4?style=flat-square&logo=tailwindcss&logoColor=white" alt="Tailwind CSS">
-<img src="https://img.shields.io/badge/Docker-Ready-2496ED?style=flat-square&logo=docker&logoColor=white" alt="Docker">
-</p>
 
-<p align="center">
-<img src="https://img.shields.io/badge/license-GPL--3.0-blue?style=flat-square" alt="License GPL-3.0">
-<img src="https://img.shields.io/badge/PRs-welcome-brightgreen?style=flat-square" alt="PRs Welcome">
-</p>
 
 ---
 
 ## Why Vessel
 
 Vessel and [open-webui](https://github.com/open-webui/open-webui) solve different problems.
 
 **Vessel** is intentionally focused on:
 
 - A clean, local-first UI for **Ollama**
```
```diff
@@ -44,52 +33,35 @@ Vessel and [open-webui](https://github.com/open-webui/open-webui) solve differen
 - Low visual and cognitive overhead
 - Doing a small set of things well
 
-It exists for users who want a UI that is fast and uncluttered, makes browsing and managing Ollama models simple, and stays out of the way once set up.
-
-**open-webui** aims to be a feature-rich, extensible frontend supporting many runtimes, integrations, and workflows. That flexibility is powerful — but it comes with more complexity in setup, UI, and maintenance.
-
-### In short
-
-- If you want a **universal, highly configurable platform** → open-webui is a great choice
-- If you want a **small, focused UI for local Ollama usage** → Vessel is built for that
 
-Vessel deliberately avoids becoming a platform. Its scope is narrow by design.
+If you want a **universal, highly configurable platform** → [open-webui](https://github.com/open-webui/open-webui) is a great choice.
+If you want a **small, focused UI for local Ollama usage** → Vessel is built for that.
 
 ---
 
 ## Features
 
-### Core Chat Experience
-- **Real-time streaming** — Watch responses appear token by token
-- **Conversation history** — All chats stored locally in IndexedDB
-- **Message editing** — Edit any message and regenerate responses with branching
-- **Branch navigation** — Explore different response paths from edited messages
-- **Markdown rendering** — Full GFM support with tables, lists, and formatting
-- **Syntax highlighting** — Beautiful code blocks powered by Shiki with 100+ languages
-- **Dark/Light mode** — Seamless theme switching with system preference detection
+### Chat
+- Real-time streaming responses
+- Message editing with branch navigation
+- Markdown rendering with syntax highlighting
+- Dark/Light themes
 
-### Built-in Tools (Function Calling)
-Vessel includes five powerful tools that models can invoke automatically:
+### Tools
+- **5 built-in tools**: web search, URL fetching, calculator, location, time
+- **Custom tools**: Create your own in JavaScript, Python, or HTTP
+- Test tools before saving with the built-in testing panel
 
-| Tool | Description |
-|------|-------------|
-| **Web Search** | Search the internet for current information, news, weather, prices |
-| **Fetch URL** | Read and extract content from any webpage |
-| **Calculator** | Safe math expression parser with functions (sqrt, sin, cos, log, etc.) |
-| **Get Location** | Detect user location via GPS or IP for local queries |
-| **Get Time** | Current date/time with timezone support |
+### Models
+- Browse and pull models from ollama.com
+- Create custom models with embedded system prompts
+- Track model updates
 
-### Model Management
-- **Model browser** — Browse, search, and pull models from Ollama registry
-- **Live status** — See which models are currently loaded in memory
-- **Quick switch** — Change models mid-conversation
-- **Model metadata** — View parameters, quantization, and capabilities
+### Prompts
+- Save and organize system prompts
+- Assign default prompts to specific models
+- Capability-based auto-selection (vision, code, tools, thinking)
 
-### Developer Experience
-- **Beautiful code generation** — Syntax-highlighted output for any language
-- **Copy code blocks** — One-click copy with visual feedback
-- **Scroll to bottom** — Smart auto-scroll with manual override
-- **Keyboard shortcuts** — Navigate efficiently with hotkeys
+📖 **[Full documentation on the Wiki →](https://github.com/VikingOwl91/vessel/wiki)**
 
 ---
 
```
```diff
@@ -98,33 +70,22 @@ Vessel includes five powerful tools that models can invoke automatically:
 <table>
 <tr>
 <td align="center" width="50%">
-<img src="screenshots/hero-dark.png" alt="Chat Interface - Dark Mode">
-<br>
-<em>Clean, modern chat interface</em>
+<img src="screenshots/hero-dark.png" alt="Chat Interface">
+<br><em>Clean chat interface</em>
 </td>
 <td align="center" width="50%">
 <img src="screenshots/code-generation.png" alt="Code Generation">
-<br>
-<em>Syntax-highlighted code output</em>
+<br><em>Syntax-highlighted code</em>
 </td>
 </tr>
 <tr>
 <td align="center" width="50%">
-<img src="screenshots/web-search.png" alt="Web Search Results">
-<br>
-<em>Integrated web search with styled results</em>
+<img src="screenshots/web-search.png" alt="Web Search">
+<br><em>Integrated web search</em>
 </td>
-<td align="center" width="50%">
-<img src="screenshots/light-mode.png" alt="Light Mode">
-<br>
-<em>Light theme for daytime use</em>
-</td>
-</tr>
-<tr>
 <td align="center" colspan="2">
-<img src="screenshots/model-browser.png" alt="Model Browser" width="50%">
-<br>
-<em>Browse and manage Ollama models</em>
+<img src="screenshots/model-browser.png" alt="Model Browser">
+<br><em>Model browser</em>
 </td>
 </tr>
 </table>
```
````diff
@@ -136,330 +97,107 @@ Vessel includes five powerful tools that models can invoke automatically:
 ### Prerequisites
 
 - [Docker](https://docs.docker.com/get-docker/) and Docker Compose
-- [Ollama](https://ollama.com/download) installed and running locally
+- [Ollama](https://ollama.com/download) running locally
 
-#### Ollama Configuration
+### Configure Ollama
 
-Ollama must listen on all interfaces for Docker containers to connect. Configure it by setting `OLLAMA_HOST=0.0.0.0`:
+Ollama must listen on all interfaces for Docker to connect:
 
-**Option A: Using systemd (Linux, recommended)**
 ```bash
+# Option A: systemd (Linux)
 sudo systemctl edit ollama
-```
-
-Add these lines:
-```ini
-[Service]
-Environment="OLLAMA_HOST=0.0.0.0"
-```
-
-Then restart:
-```bash
-sudo systemctl daemon-reload
+# Add: Environment="OLLAMA_HOST=0.0.0.0"
 sudo systemctl restart ollama
 ```
 
-**Option B: Manual start**
 ```bash
+# Option B: Manual
 OLLAMA_HOST=0.0.0.0 ollama serve
 ```
 
-### One-Line Install
+### Install
 
 ```bash
+# One-line install
 curl -fsSL https://somegit.dev/vikingowl/vessel/raw/main/install.sh | bash
 ```
 
-### Or Clone and Run
-
 ```bash
-git clone https://somegit.dev/vikingowl/vessel.git
+# Or clone and run
+git clone https://github.com/VikingOwl91/vessel.git
 cd vessel
 ./install.sh
 ```
 
-The installer will:
-- Check for Docker, Docker Compose, and Ollama
-- Start the frontend and backend services
-- Optionally pull a starter model (llama3.2)
+Open **http://localhost:7842** in your browser.
 
-Once running, open **http://localhost:7842** in your browser.
+### Update / Uninstall
+
+```bash
+./install.sh --update    # Update to latest
+./install.sh --uninstall # Remove
+```
+
+📖 **[Detailed installation guide →](https://github.com/VikingOwl91/vessel/wiki/Getting-Started)**
 
 ---
 
-## Installation
+## Documentation
 
-### Option 1: Install Script (Recommended)
+Full documentation is available on the **[GitHub Wiki](https://github.com/VikingOwl91/vessel/wiki)**:
 
-The install script handles everything automatically:
-
-```bash
-./install.sh             # Install and start
-./install.sh --update    # Update to latest version
-./install.sh --uninstall # Remove installation
-```
-
-**Requirements:**
-- Ollama must be installed and running locally
-- Docker and Docker Compose
-- Linux or macOS
-
-### Option 2: Docker Compose (Manual)
-
-```bash
-# Make sure Ollama is running first
-ollama serve
-
-# Start Vessel
-docker compose up -d
-```
-
-### Option 3: Manual Setup (Development)
-
-#### Prerequisites
-- [Node.js](https://nodejs.org/) 20+
-- [Go](https://go.dev/) 1.24+
-- [Ollama](https://ollama.com/) running locally
-
-#### Frontend
-
-```bash
-cd frontend
-npm install
-npm run dev
-```
-
-Frontend runs on `http://localhost:5173`
-
-#### Backend
-
-```bash
-cd backend
-go mod tidy
-go run cmd/server/main.go -port 9090
-```
-
-Backend API runs on `http://localhost:9090`
-
----
-
-## Configuration
-
-### Environment Variables
-
-#### Frontend
-| Variable | Default | Description |
-|----------|---------|-------------|
-| `OLLAMA_API_URL` | `http://localhost:11434` | Ollama API endpoint |
-| `BACKEND_URL` | `http://localhost:9090` | Vessel backend API |
-
-#### Backend
-| Variable | Default | Description |
-|----------|---------|-------------|
-| `OLLAMA_URL` | `http://localhost:11434` | Ollama API endpoint |
-| `PORT` | `8080` | Backend server port |
-| `GIN_MODE` | `debug` | Gin mode (`debug`, `release`) |
-
-### Docker Compose Override
-
-Create `docker-compose.override.yml` for local customizations:
-
-```yaml
-services:
-  frontend:
-    environment:
-      - CUSTOM_VAR=value
-    ports:
-      - "3000:3000" # Different port
-```
-
----
-
-## Architecture
-
-```
-vessel/
-├── frontend/                # SvelteKit 5 application
-│   ├── src/
-│   │   ├── lib/
-│   │   │   ├── components/  # UI components
-│   │   │   ├── stores/      # Svelte 5 runes state
-│   │   │   ├── tools/       # Built-in tool definitions
-│   │   │   ├── storage/     # IndexedDB (Dexie)
-│   │   │   └── api/         # API clients
-│   │   └── routes/          # SvelteKit routes
-│   └── Dockerfile
-│
-├── backend/                 # Go API server
-│   ├── cmd/server/          # Entry point
-│   └── internal/
-│       ├── api/             # HTTP handlers
-│       │   ├── fetcher.go   # URL fetching with wget/curl/chromedp
-│       │   ├── search.go    # Web search via DuckDuckGo
-│       │   └── routes.go    # Route definitions
-│       ├── database/        # SQLite storage
-│       └── models/          # Data models
-│
-├── docker-compose.yml       # Production setup
-└── docker-compose.dev.yml   # Development with hot reload
-```
-
----
-
-## Tech Stack
-
-### Frontend
-- **[SvelteKit 5](https://kit.svelte.dev/)** — Full-stack framework
-- **[Svelte 5](https://svelte.dev/)** — Runes-based reactivity
-- **[TypeScript](https://www.typescriptlang.org/)** — Type safety
-- **[Tailwind CSS](https://tailwindcss.com/)** — Utility-first styling
-- **[Skeleton UI](https://skeleton.dev/)** — Component library
-- **[Shiki](https://shiki.matsu.io/)** — Syntax highlighting
-- **[Dexie](https://dexie.org/)** — IndexedDB wrapper
-- **[Marked](https://marked.js.org/)** — Markdown parser
-- **[DOMPurify](https://github.com/cure53/DOMPurify)** — XSS sanitization
-
-### Backend
-- **[Go 1.24](https://go.dev/)** — Fast, compiled backend
-- **[Gin](https://gin-gonic.com/)** — HTTP framework
-- **[SQLite](https://sqlite.org/)** — Embedded database
-- **[chromedp](https://github.com/chromedp/chromedp)** — Headless browser
-
----
-
-## Development
-
-### Running Tests
-
-```bash
-# Frontend unit tests
-cd frontend
-npm run test
-
-# With coverage
-npm run test:coverage
-
-# Watch mode
-npm run test:watch
-```
-
-### Type Checking
-
-```bash
-cd frontend
-npm run check
-```
-
-### Development Mode
-
-Use the dev compose file for hot reloading:
-
-```bash
-docker compose -f docker-compose.dev.yml up
-```
-
----
-
-## API Reference
-
-### Backend Endpoints
-
-| Method | Endpoint | Description |
-|--------|----------|-------------|
-| `POST` | `/api/v1/proxy/search` | Web search via DuckDuckGo |
-| `POST` | `/api/v1/proxy/fetch` | Fetch URL content |
-| `GET` | `/api/v1/location` | Get user location from IP |
-| `GET` | `/api/v1/models/registry` | Browse Ollama model registry |
-| `GET` | `/api/v1/models/search` | Search models |
-| `POST` | `/api/v1/chats/sync` | Sync conversations |
-
-### Ollama Proxy
-
-All requests to `/ollama/*` are proxied to the Ollama API, enabling CORS.
+| Guide | Description |
+|-------|-------------|
+| [Getting Started](https://github.com/VikingOwl91/vessel/wiki/Getting-Started) | Installation and configuration |
+| [Custom Tools](https://github.com/VikingOwl91/vessel/wiki/Custom-Tools) | Create JavaScript, Python, or HTTP tools |
+| [System Prompts](https://github.com/VikingOwl91/vessel/wiki/System-Prompts) | Manage prompts with model defaults |
+| [Custom Models](https://github.com/VikingOwl91/vessel/wiki/Custom-Models) | Create models with embedded prompts |
+| [Built-in Tools](https://github.com/VikingOwl91/vessel/wiki/Built-in-Tools) | Reference for web search, calculator, etc. |
+| [API Reference](https://github.com/VikingOwl91/vessel/wiki/API-Reference) | Backend endpoints |
+| [Development](https://github.com/VikingOwl91/vessel/wiki/Development) | Contributing and architecture |
+| [Troubleshooting](https://github.com/VikingOwl91/vessel/wiki/Troubleshooting) | Common issues and solutions |
 
 ---
 
 ## Roadmap
 
 Vessel is intentionally focused on being a **clean, local-first UI for Ollama**.
-The roadmap prioritizes **usability, clarity, and low friction** over feature breadth.
+Vessel prioritizes **usability and simplicity** over feature breadth.
 
-### Core UX Improvements (Near-term)
+**Completed:**
+- [x] Model browser with filtering and update detection
+- [x] Custom tools (JavaScript, Python, HTTP)
+- [x] System prompt library with model-specific defaults
+- [x] Custom model creation with embedded prompts
 
-These improve the existing experience without expanding scope.
-
-- [ ] Improve model browser & search
-  - better filtering (size, tags, quantization)
-  - clearer metadata presentation
+**Planned:**
 - [ ] Keyboard-first workflows
-  - model switching
-  - prompt navigation
-- [ ] UX polish & stability
-  - error handling
-  - loading / offline states
-  - small performance improvements
-
-### Local Ecosystem Quality-of-Life (Opt-in)
-
-Still local-first, still focused — but easing onboarding and workflows.
-
 - [ ] Docker-based Ollama support
-  *(for systems without native Ollama installs)*
+- [ ] UX polish and stability improvements
 - [ ] Optional voice input/output
-  *(accessibility & convenience, not a core requirement)*
 - [ ] Presets for common workflows
-  *(model + tool combinations, kept simple)*
 
-### Experimental / Explicitly Optional
+**Non-Goals:**
+- Multi-user systems
+- Cloud sync
+- Plugin ecosystems
+- Support for every LLM runtime
 
-These are **explorations**, not promises. They are intentionally separated to avoid scope creep.
-
-- [ ] Image generation support
-  *(only if it can be cleanly isolated from the core UI)*
-- [ ] Hugging Face integration
-  *(evaluated carefully to avoid bloating the local-first experience)*
-
-### Non-Goals (By Design)
-
-Vessel intentionally avoids becoming a platform.
-
-- Multi-user / account-based systems
-- Cloud sync or hosted services
-- Large plugin ecosystems
-- "Universal" support for every LLM runtime
-
 If a feature meaningfully compromises simplicity, it likely doesn't belong in core Vessel.
 
-### Philosophy
-
-> Do one thing well.
-> Keep the UI out of the way.
-> Prefer clarity over configurability.
+> *Do one thing well. Keep the UI out of the way.*
 
 ---
 
 ## Contributing
 
-Contributions are welcome! Please feel free to submit a Pull Request.
-
-> Issues and feature requests are tracked on GitHub:
-> https://github.com/VikingOwl91/vessel/issues
+Contributions are welcome!
 
 1. Fork the repository
 2. Create your feature branch (`git checkout -b feature/amazing-feature`)
-3. Commit your changes (`git commit -m 'Add some amazing feature'`)
-4. Push to the branch (`git push origin feature/amazing-feature`)
-5. Open a Pull Request
+3. Commit your changes
+4. Push and open a Pull Request
 
+📖 **[Development guide →](https://github.com/VikingOwl91/vessel/wiki/Development)**
+
+**Issues:** [github.com/VikingOwl91/vessel/issues](https://github.com/VikingOwl91/vessel/issues)
+
 ---
 
 ## License
 
-Copyright (C) 2026 VikingOwl
-
-This project is licensed under the GNU General Public License v3.0 - see the [LICENSE](LICENSE) file for details.
-
----
-
+GPL-3.0 — See [LICENSE](LICENSE) for details.
+
 <p align="center">
 Made with <a href="https://ollama.com">Ollama</a> and <a href="https://svelte.dev">Svelte</a>
````
```diff
@@ -18,7 +18,7 @@ import (
 )
 
 // Version is set at build time via -ldflags, or defaults to dev
-var Version = "0.4.2"
+var Version = "0.4.11"
 
 func getEnvOrDefault(key, defaultValue string) string {
 	if value := os.Getenv(key); value != "" {
```
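The comment on `Version` refers to Go's link-time variable injection: `-ldflags "-X ..."` can overwrite any package-level string when the binary is linked. A hedged sketch of the mechanism (the actual release build command is not part of this diff):

```go
package main

import "fmt"

// Version defaults to "dev" and is meant to be overwritten at link time, e.g.:
//
//	go build -ldflags "-X main.Version=0.4.11" ./cmd/server
//
// (Illustrative command; this diff only bumps the checked-in default.)
var Version = "dev"

func main() {
	fmt.Println("vessel backend version:", Version)
}
```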
```diff
@@ -70,6 +70,7 @@ type ScrapedModel struct {
 	PullCount    int64
 	Tags         []string
 	Capabilities []string
+	UpdatedAt    string // Relative time like "2 weeks ago" converted to RFC3339
 }
 
 // scrapeOllamaLibrary fetches the model list from ollama.com/library
@@ -168,6 +169,14 @@ func parseLibraryHTML(html string) ([]ScrapedModel, error) {
 			capabilities = append(capabilities, "cloud")
 		}
 
+		// Extract updated time from <span x-test-updated>2 weeks ago</span>
+		updatedPattern := regexp.MustCompile(`<span[^>]*x-test-updated[^>]*>([^<]+)</span>`)
+		updatedAt := ""
+		if um := updatedPattern.FindStringSubmatch(cardContent); len(um) > 1 {
+			relativeTime := strings.TrimSpace(um[1])
+			updatedAt = parseRelativeTime(relativeTime)
+		}
+
 		models[slug] = &ScrapedModel{
 			Slug: slug,
 			Name: slug,
@@ -176,6 +185,7 @@ func parseLibraryHTML(html string) ([]ScrapedModel, error) {
 			PullCount:    pullCount,
 			Tags:         tags,
 			Capabilities: capabilities,
+			UpdatedAt:    updatedAt,
 		}
 	}
 
@@ -211,6 +221,52 @@ func decodeHTMLEntities(s string) string {
 	return s
 }
 
+// parseRelativeTime converts relative time strings like "2 weeks ago" to RFC3339 timestamps
+func parseRelativeTime(s string) string {
+	s = strings.ToLower(strings.TrimSpace(s))
+	if s == "" {
+		return ""
+	}
+
+	now := time.Now()
+
+	// Parse patterns like "2 weeks ago", "1 month ago", "3 days ago"
+	pattern := regexp.MustCompile(`(\d+)\s*(second|minute|hour|day|week|month|year)s?\s*ago`)
+	matches := pattern.FindStringSubmatch(s)
+	if len(matches) < 3 {
+		return ""
+	}
+
+	num, err := strconv.Atoi(matches[1])
+	if err != nil {
+		return ""
+	}
+
+	unit := matches[2]
+	var duration time.Duration
+
+	switch unit {
+	case "second":
+		duration = time.Duration(num) * time.Second
+	case "minute":
+		duration = time.Duration(num) * time.Minute
+	case "hour":
+		duration = time.Duration(num) * time.Hour
+	case "day":
+		duration = time.Duration(num) * 24 * time.Hour
+	case "week":
+		duration = time.Duration(num) * 7 * 24 * time.Hour
+	case "month":
+		duration = time.Duration(num) * 30 * 24 * time.Hour
+	case "year":
+		duration = time.Duration(num) * 365 * 24 * time.Hour
+	default:
+		return ""
+	}
+
+	return now.Add(-duration).Format(time.RFC3339)
+}
+
 // extractDescription tries to find the description for a model
 func extractDescription(html, slug string) string {
 	// Look for text after the model link that looks like a description
```
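Note the fixed 30-day month and 365-day year arms: `parseRelativeTime` is an approximation by design, and anything that does not match `<n> <unit> ago` maps to the empty string. A small illustrative driver (a sketch; assumes the function above is in the same package):

```go
package main

import "fmt"

func main() {
	// parseRelativeTime is defined in the hunk above.
	for _, s := range []string{"2 weeks ago", "1 month ago", "just now"} {
		fmt.Printf("%q -> %q\n", s, parseRelativeTime(s))
	}
	// "2 weeks ago" -> now minus exactly 14 days, formatted as RFC3339
	// "1 month ago" -> now minus 30 days (fixed approximation)
	// "just now"    -> "" (no match; downstream COALESCE keeps the old value)
}
```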
```diff
@@ -417,15 +473,16 @@ func (s *ModelRegistryService) SyncModels(ctx context.Context, fetchDetails bool
 		modelType := inferModelType(model.Slug)
 
 		_, err := s.db.ExecContext(ctx, `
-			INSERT INTO remote_models (slug, name, description, model_type, url, pull_count, tags, capabilities, scraped_at)
-			VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
+			INSERT INTO remote_models (slug, name, description, model_type, url, pull_count, tags, capabilities, ollama_updated_at, scraped_at)
+			VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
 			ON CONFLICT(slug) DO UPDATE SET
 				description = COALESCE(NULLIF(excluded.description, ''), remote_models.description),
 				model_type = excluded.model_type,
 				pull_count = excluded.pull_count,
 				capabilities = excluded.capabilities,
+				ollama_updated_at = COALESCE(excluded.ollama_updated_at, remote_models.ollama_updated_at),
 				scraped_at = excluded.scraped_at
-		`, model.Slug, model.Name, model.Description, modelType, model.URL, model.PullCount, string(tagsJSON), string(capsJSON), now)
+		`, model.Slug, model.Name, model.Description, modelType, model.URL, model.PullCount, string(tagsJSON), string(capsJSON), model.UpdatedAt, now)
 
 		if err != nil {
 			log.Printf("Failed to upsert model %s: %v", model.Slug, err)
@@ -434,6 +491,55 @@ func (s *ModelRegistryService) SyncModels(ctx context.Context, fetchDetails bool
 		count++
 	}
 
+	// If fetchDetails is true and we have an Ollama client, update capabilities
+	// for installed models using the actual /api/show response (more accurate than scraped data)
+	if fetchDetails && s.ollamaClient != nil {
+		installedModels, err := s.ollamaClient.List(ctx)
+		if err != nil {
+			log.Printf("Warning: failed to list installed models for capability sync: %v", err)
+		} else {
+			log.Printf("Syncing capabilities for %d installed models", len(installedModels.Models))
+
+			for _, installed := range installedModels.Models {
+				select {
+				case <-ctx.Done():
+					return count, ctx.Err()
+				default:
+				}
+
+				// Extract base model name (e.g., "deepseek-r1" from "deepseek-r1:14b")
+				modelName := installed.Model
+				baseName := strings.Split(modelName, ":")[0]
+
+				// Fetch real capabilities from Ollama
+				details, err := s.fetchModelDetails(ctx, modelName)
+				if err != nil {
+					log.Printf("Warning: failed to fetch details for %s: %v", modelName, err)
+					continue
+				}
+
+				// Extract capabilities from the actual Ollama response
+				capabilities := []string{}
+				if details.Capabilities != nil {
+					for _, cap := range details.Capabilities {
+						capabilities = append(capabilities, string(cap))
+					}
+				}
+				capsJSON, _ := json.Marshal(capabilities)
+
+				// Update capabilities for the base model name
+				_, err = s.db.ExecContext(ctx, `
+					UPDATE remote_models SET capabilities = ? WHERE slug = ?
+				`, string(capsJSON), baseName)
+				if err != nil {
+					log.Printf("Warning: failed to update capabilities for %s: %v", baseName, err)
+				} else {
+					log.Printf("Updated capabilities for %s: %v", baseName, capabilities)
+				}
+			}
+		}
+	}
+
 	return count, nil
 }
 
@@ -572,6 +678,106 @@ func formatParamCount(n int64) string {
 	return fmt.Sprintf("%d", n)
 }
 
+// parseParamSizeToFloat extracts numeric value from parameter size strings like "8b", "70b", "1.5b"
+// Returns value in billions (e.g., "8b" -> 8.0, "70b" -> 70.0, "500m" -> 0.5)
+func parseParamSizeToFloat(s string) float64 {
+	s = strings.ToLower(strings.TrimSpace(s))
+	if s == "" {
+		return 0
+	}
+
+	// Handle suffix
+	multiplier := 1.0
+	if strings.HasSuffix(s, "b") {
+		s = strings.TrimSuffix(s, "b")
+	} else if strings.HasSuffix(s, "m") {
+		s = strings.TrimSuffix(s, "m")
+		multiplier = 0.001 // Convert millions to billions
+	}
+
+	if f, err := strconv.ParseFloat(s, 64); err == nil {
+		return f * multiplier
+	}
+	return 0
+}
+
+// getSizeRange returns the size range category for a given parameter size
+// small: ≤3B, medium: 4-13B, large: 14-70B, xlarge: >70B
+func getSizeRange(paramSize string) string {
+	size := parseParamSizeToFloat(paramSize)
+	if size <= 0 {
+		return ""
+	}
+	if size <= 3 {
+		return "small"
+	}
+	if size <= 13 {
+		return "medium"
+	}
+	if size <= 70 {
+		return "large"
+	}
+	return "xlarge"
+}
+
+// modelMatchesSizeRanges checks if any of the model's tags fall within the requested size ranges
+// A model matches if at least one of its tags is in any of the requested ranges
+func modelMatchesSizeRanges(tags []string, sizeRanges []string) bool {
+	if len(tags) == 0 || len(sizeRanges) == 0 {
+		return false
+	}
+	for _, tag := range tags {
+		tagRange := getSizeRange(tag)
+		if tagRange == "" {
+			continue
+		}
+		for _, sr := range sizeRanges {
+			if sr == tagRange {
+				return true
+			}
+		}
+	}
+	return false
+}
+
+// getContextRange returns the context range category for a given context length
+// standard: ≤8K, extended: 8K-32K, large: 32K-128K, unlimited: >128K
+func getContextRange(ctxLen int64) string {
+	if ctxLen <= 0 {
+		return ""
+	}
+	if ctxLen <= 8192 {
+		return "standard"
+	}
+	if ctxLen <= 32768 {
+		return "extended"
+	}
+	if ctxLen <= 131072 {
+		return "large"
+	}
+	return "unlimited"
+}
+
+// extractFamily extracts the model family from slug (e.g., "llama3.2" -> "llama", "qwen2.5" -> "qwen")
+func extractFamily(slug string) string {
+	// Remove namespace prefix for community models
+	if idx := strings.LastIndex(slug, "/"); idx != -1 {
+		slug = slug[idx+1:]
+	}
+	// Extract letters before any digits
+	family := ""
+	for _, r := range slug {
+		if r >= '0' && r <= '9' {
+			break
+		}
+		if r == '-' || r == '_' || r == '.' {
+			break
+		}
+		family += string(r)
+	}
+	return strings.ToLower(family)
+}
+
 // GetModel retrieves a single model from the database
 func (s *ModelRegistryService) GetModel(ctx context.Context, slug string) (*RemoteModel, error) {
 	row := s.db.QueryRowContext(ctx, `
```
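These helpers turn free-form metadata into coarse filter buckets: sizes from tag strings, context windows from token counts, families from slug prefixes. A few worked values (a hypothetical snippet; assumes the functions above are in the same package):

```go
package main

import "fmt"

func main() {
	// Size buckets come from tag strings such as "8b" or "70b".
	fmt.Println(getSizeRange("1.5b")) // "small"  (≤3B)
	fmt.Println(getSizeRange("8b"))   // "medium" (4-13B)
	fmt.Println(getSizeRange("70b"))  // "large"  (14-70B, inclusive upper bound)

	// A model matches a size filter if ANY tag lands in ANY requested range.
	tags := []string{"7b", "70b"}
	fmt.Println(modelMatchesSizeRanges(tags, []string{"small"})) // false
	fmt.Println(modelMatchesSizeRanges(tags, []string{"large"})) // true (via "70b")

	// Context buckets are absolute token counts.
	fmt.Println(getContextRange(8192))   // "standard"
	fmt.Println(getContextRange(131072)) // "large"

	// Family strips the namespace, then keeps letters up to the first digit or separator.
	fmt.Println(extractFamily("library/llama3.2")) // "llama"
}
```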
```diff
@@ -584,40 +790,65 @@ func (s *ModelRegistryService) GetModel(ctx context.Context, slug string) (*Remo
 	return scanRemoteModel(row)
 }
 
+// ModelSearchParams holds all search/filter parameters
+type ModelSearchParams struct {
+	Query         string
+	ModelType     string
+	Capabilities  []string
+	SizeRanges    []string // small, medium, large, xlarge
+	ContextRanges []string // standard, extended, large, unlimited
+	Family        string
+	SortBy        string
+	Limit         int
+	Offset        int
+}
+
 // SearchModels searches for models in the database
 func (s *ModelRegistryService) SearchModels(ctx context.Context, query string, modelType string, capabilities []string, sortBy string, limit, offset int) ([]RemoteModel, int, error) {
+	return s.SearchModelsAdvanced(ctx, ModelSearchParams{
+		Query:        query,
+		ModelType:    modelType,
+		Capabilities: capabilities,
+		SortBy:       sortBy,
+		Limit:        limit,
+		Offset:       offset,
+	})
+}
+
+// SearchModelsAdvanced searches for models with all filter options
+func (s *ModelRegistryService) SearchModelsAdvanced(ctx context.Context, params ModelSearchParams) ([]RemoteModel, int, error) {
 	// Build query
 	baseQuery := `FROM remote_models WHERE 1=1`
 	args := []any{}
 
-	if query != "" {
+	if params.Query != "" {
 		baseQuery += ` AND (slug LIKE ? OR name LIKE ? OR description LIKE ?)`
-		q := "%" + query + "%"
+		q := "%" + params.Query + "%"
 		args = append(args, q, q, q)
 	}
 
-	if modelType != "" {
+	if params.ModelType != "" {
 		baseQuery += ` AND model_type = ?`
-		args = append(args, modelType)
+		args = append(args, params.ModelType)
 	}
 
 	// Filter by capabilities (JSON array contains)
-	for _, cap := range capabilities {
+	for _, cap := range params.Capabilities {
 		// Use JSON contains for SQLite - capabilities column stores JSON array like ["vision","code"]
 		baseQuery += ` AND capabilities LIKE ?`
 		args = append(args, `%"`+cap+`"%`)
 	}
 
-	// Get total count
-	var total int
-	countQuery := "SELECT COUNT(*) " + baseQuery
-	if err := s.db.QueryRowContext(ctx, countQuery, args...).Scan(&total); err != nil {
-		return nil, 0, err
+	// Filter by family (extracted from slug)
+	if params.Family != "" {
+		// Match slugs that start with the family name
+		baseQuery += ` AND (slug LIKE ? OR slug LIKE ?)`
+		args = append(args, params.Family+"%", "%/"+params.Family+"%")
 	}
 
 	// Build ORDER BY clause based on sort parameter
 	orderBy := "pull_count DESC" // default: most popular
-	switch sortBy {
+	switch params.SortBy {
 	case "name_asc":
 		orderBy = "name ASC"
 	case "name_desc":
@@ -630,12 +861,25 @@ func (s *ModelRegistryService) SearchModels(ctx context.Context, query string, m
 		orderBy = "ollama_updated_at DESC NULLS LAST, scraped_at DESC"
 	}
 
-	// Get models
-	selectQuery := `SELECT slug, name, description, model_type, architecture, parameter_size,
-		context_length, embedding_length, quantization, capabilities, default_params,
-		license, pull_count, tags, tag_sizes, ollama_updated_at, details_fetched_at, scraped_at, url ` +
-		baseQuery + ` ORDER BY ` + orderBy + ` LIMIT ? OFFSET ?`
-	args = append(args, limit, offset)
+	// For size/context filtering, we need to fetch all matching models first
+	// then filter and paginate in memory (these filters require computed values)
+	needsPostFilter := len(params.SizeRanges) > 0 || len(params.ContextRanges) > 0
+
+	var selectQuery string
+	if needsPostFilter {
+		// Fetch all (no limit/offset) for post-filtering
+		selectQuery = `SELECT slug, name, description, model_type, architecture, parameter_size,
+			context_length, embedding_length, quantization, capabilities, default_params,
+			license, pull_count, tags, tag_sizes, ollama_updated_at, details_fetched_at, scraped_at, url ` +
+			baseQuery + ` ORDER BY ` + orderBy
+	} else {
+		// Direct pagination
+		selectQuery = `SELECT slug, name, description, model_type, architecture, parameter_size,
+			context_length, embedding_length, quantization, capabilities, default_params,
+			license, pull_count, tags, tag_sizes, ollama_updated_at, details_fetched_at, scraped_at, url ` +
+			baseQuery + ` ORDER BY ` + orderBy + ` LIMIT ? OFFSET ?`
+		args = append(args, params.Limit, params.Offset)
+	}
 
 	rows, err := s.db.QueryContext(ctx, selectQuery, args...)
 	if err != nil {
@@ -649,10 +893,64 @@ func (s *ModelRegistryService) SearchModels(ctx context.Context, query string, m
 		if err != nil {
 			return nil, 0, err
 		}
 
+		// Apply size range filter based on tags
+		if len(params.SizeRanges) > 0 {
+			if !modelMatchesSizeRanges(m.Tags, params.SizeRanges) {
+				continue // Skip models without matching size tags
+			}
+		}
+
+		// Apply context range filter
+		if len(params.ContextRanges) > 0 {
+			modelCtxRange := getContextRange(m.ContextLength)
+			if modelCtxRange == "" {
+				continue // Skip models without context info
+			}
+			found := false
+			for _, cr := range params.ContextRanges {
+				if cr == modelCtxRange {
+					found = true
+					break
+				}
+			}
+			if !found {
+				continue
+			}
+		}
+
 		models = append(models, *m)
 	}
 
-	return models, total, rows.Err()
+	if err := rows.Err(); err != nil {
+		return nil, 0, err
+	}
+
+	// Get total after filtering
+	total := len(models)
+
+	// Apply pagination for post-filtered results
+	if needsPostFilter {
+		if params.Offset >= len(models) {
+			models = []RemoteModel{}
+		} else {
+			end := params.Offset + params.Limit
+			if end > len(models) {
+				end = len(models)
+			}
+			models = models[params.Offset:end]
+		}
+	} else {
+		// Get total count from DB for non-post-filtered queries
+		countQuery := "SELECT COUNT(*) " + baseQuery
+		// Remove the limit/offset args we added
+		countArgs := args[:len(args)-2]
+		if err := s.db.QueryRowContext(ctx, countQuery, countArgs...).Scan(&total); err != nil {
+			return nil, 0, err
+		}
+	}
+
+	return models, total, nil
 }
 
 // GetSyncStatus returns info about when models were last synced
```
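The fetch-all path exists because size and context buckets are computed in Go, not SQL: if the database applied `LIMIT`/`OFFSET` first and the in-memory filter then discarded rows, pages would silently shrink and later pages would skip matches. The slicing rule it uses is the standard clamped window, sketched here in isolation (illustrative types and values):

```go
package main

import "fmt"

// paginate applies the same clamped offset/limit slicing as the post-filter path.
func paginate(filtered []string, offset, limit int) []string {
	if offset >= len(filtered) {
		return nil // past the end: empty page, like models = []RemoteModel{}
	}
	end := offset + limit
	if end > len(filtered) {
		end = len(filtered) // clamp a partial final page
	}
	return filtered[offset:end]
}

func main() {
	// Pretend five models survived the size/context filters.
	filtered := []string{"llama3.2", "qwen2.5", "mistral", "phi4", "gemma2"}
	fmt.Println(paginate(filtered, 0, 2)) // [llama3.2 qwen2.5]
	fmt.Println(paginate(filtered, 4, 2)) // [gemma2] (clamped)
	fmt.Println(paginate(filtered, 9, 2)) // [] (offset past end)
}
```

The trade-off is memory: every candidate row is materialized before slicing, which is acceptable for a registry cache of this size.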
```diff
@@ -764,31 +1062,53 @@ func scanRemoteModelRows(rows *sql.Rows) (*RemoteModel, error) {
 // ListRemoteModelsHandler returns a handler for listing/searching remote models
 func (s *ModelRegistryService) ListRemoteModelsHandler() gin.HandlerFunc {
 	return func(c *gin.Context) {
-		query := c.Query("search")
-		modelType := c.Query("type")
-		sortBy := c.Query("sort") // name_asc, name_desc, pulls_asc, pulls_desc, updated_desc
-		limit := 50
-		offset := 0
+		params := ModelSearchParams{
+			Query:     c.Query("search"),
+			ModelType: c.Query("type"),
+			SortBy:    c.Query("sort"), // name_asc, name_desc, pulls_asc, pulls_desc, updated_desc
+			Family:    c.Query("family"),
+			Limit:     50,
+			Offset:    0,
+		}
 
 		if l, err := strconv.Atoi(c.Query("limit")); err == nil && l > 0 && l <= 200 {
-			limit = l
+			params.Limit = l
 		}
 		if o, err := strconv.Atoi(c.Query("offset")); err == nil && o >= 0 {
-			offset = o
+			params.Offset = o
 		}
 
 		// Parse capabilities filter (comma-separated)
-		var capabilities []string
 		if caps := c.Query("capabilities"); caps != "" {
 			for _, cap := range strings.Split(caps, ",") {
 				cap = strings.TrimSpace(cap)
 				if cap != "" {
-					capabilities = append(capabilities, cap)
+					params.Capabilities = append(params.Capabilities, cap)
 				}
 			}
 		}
 
-		models, total, err := s.SearchModels(c.Request.Context(), query, modelType, capabilities, sortBy, limit, offset)
+		// Parse size range filter (comma-separated: small,medium,large,xlarge)
+		if sizes := c.Query("sizeRange"); sizes != "" {
+			for _, sz := range strings.Split(sizes, ",") {
+				sz = strings.TrimSpace(strings.ToLower(sz))
+				if sz == "small" || sz == "medium" || sz == "large" || sz == "xlarge" {
+					params.SizeRanges = append(params.SizeRanges, sz)
+				}
+			}
+		}
+
+		// Parse context range filter (comma-separated: standard,extended,large,unlimited)
+		if ctx := c.Query("contextRange"); ctx != "" {
+			for _, cr := range strings.Split(ctx, ",") {
+				cr = strings.TrimSpace(strings.ToLower(cr))
+				if cr == "standard" || cr == "extended" || cr == "large" || cr == "unlimited" {
+					params.ContextRanges = append(params.ContextRanges, cr)
+				}
+			}
+		}
+
+		models, total, err := s.SearchModelsAdvanced(c.Request.Context(), params)
 		if err != nil {
 			c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()})
 			return
@@ -797,8 +1117,8 @@ func (s *ModelRegistryService) ListRemoteModelsHandler() gin.HandlerFunc {
 		c.JSON(http.StatusOK, gin.H{
 			"models": models,
 			"total":  total,
-			"limit":  limit,
-			"offset": offset,
+			"limit":  params.Limit,
+			"offset": params.Offset,
 		})
 	}
 }
```
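From the client's point of view, the new filters are just extra query parameters; the handler lowercases each bucket name and silently drops unknown values. An illustrative request builder (assumes the `models` group is mounted at `/api/v1/models`, consistent with the routes hunk further down; host and parameter values are made up):

```go
package main

import (
	"fmt"
	"net/url"
)

func main() {
	// Ask for popular small/medium llama-family models with longer context.
	q := url.Values{}
	q.Set("search", "llama")
	q.Set("family", "llama")
	q.Set("sizeRange", "small,medium")      // becomes params.SizeRanges
	q.Set("contextRange", "extended,large") // becomes params.ContextRanges
	q.Set("sort", "pulls_desc")
	q.Set("limit", "20")

	u := url.URL{
		Scheme:   "http",
		Host:     "localhost:9090",
		Path:     "/api/v1/models/remote",
		RawQuery: q.Encode(),
	}
	fmt.Println(u.String())
	// e.g. http://localhost:9090/api/v1/models/remote?contextRange=extended%2Clarge&family=llama&limit=20&...
}
```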
```diff
@@ -1138,3 +1458,36 @@ func (s *ModelRegistryService) GetLocalFamiliesHandler() gin.HandlerFunc {
 		c.JSON(http.StatusOK, gin.H{"families": families})
 	}
 }
+
+// GetRemoteFamiliesHandler returns unique model families from remote models
+// Useful for populating filter dropdowns
+func (s *ModelRegistryService) GetRemoteFamiliesHandler() gin.HandlerFunc {
+	return func(c *gin.Context) {
+		rows, err := s.db.QueryContext(c.Request.Context(), `SELECT DISTINCT slug FROM remote_models`)
+		if err != nil {
+			c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()})
+			return
+		}
+		defer rows.Close()
+
+		familySet := make(map[string]bool)
+		for rows.Next() {
+			var slug string
+			if err := rows.Scan(&slug); err != nil {
+				continue
+			}
+			family := extractFamily(slug)
+			if family != "" {
+				familySet[family] = true
+			}
+		}
+
+		families := make([]string, 0, len(familySet))
+		for f := range familySet {
+			families = append(families, f)
+		}
+		sort.Strings(families)
+
+		c.JSON(http.StatusOK, gin.H{"families": families})
+	}
+}
```
```diff
@@ -336,6 +336,56 @@ func (s *OllamaService) CopyModelHandler() gin.HandlerFunc {
 	}
 }
 
+// CreateModelHandler handles custom model creation with progress streaming
+// Creates a new model derived from an existing one with a custom system prompt
+func (s *OllamaService) CreateModelHandler() gin.HandlerFunc {
+	return func(c *gin.Context) {
+		var req api.CreateRequest
+		if err := c.ShouldBindJSON(&req); err != nil {
+			c.JSON(http.StatusBadRequest, gin.H{"error": "invalid request: " + err.Error()})
+			return
+		}
+
+		c.Header("Content-Type", "application/x-ndjson")
+		c.Header("Cache-Control", "no-cache")
+		c.Header("Connection", "keep-alive")
+
+		ctx := c.Request.Context()
+		flusher, ok := c.Writer.(http.Flusher)
+		if !ok {
+			c.JSON(http.StatusInternalServerError, gin.H{"error": "streaming not supported"})
+			return
+		}
+
+		err := s.client.Create(ctx, &req, func(resp api.ProgressResponse) error {
+			select {
+			case <-ctx.Done():
+				return ctx.Err()
+			default:
+			}
+
+			data, err := json.Marshal(resp)
+			if err != nil {
+				return err
+			}
+
+			_, err = c.Writer.Write(append(data, '\n'))
+			if err != nil {
+				return err
+			}
+			flusher.Flush()
+			return nil
+		})
+
+		if err != nil && err != context.Canceled {
+			errResp := gin.H{"error": err.Error()}
+			data, _ := json.Marshal(errResp)
+			c.Writer.Write(append(data, '\n'))
+			flusher.Flush()
+		}
+	}
+}
+
 // VersionHandler returns Ollama version
 func (s *OllamaService) VersionHandler() gin.HandlerFunc {
 	return func(c *gin.Context) {
```
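Because the handler emits `application/x-ndjson`, a client reads one JSON object per line until EOF. A minimal consumer sketch (the `/ollama/api/create` path comes from the routes hunk below; the request body fields follow Ollama's create API and are illustrative):

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

func main() {
	// Hypothetical request: derive a model from an existing one with an
	// embedded system prompt, matching what CreateModelHandler streams back.
	body := `{"model":"my-reviewer","from":"llama3.2","system":"You review Go code."}`
	resp, err := http.Post("http://localhost:9090/ollama/api/create",
		"application/json", strings.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Each line is one api.ProgressResponse (or a trailing {"error": ...}).
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		var progress struct {
			Status string `json:"status"`
			Error  string `json:"error"`
		}
		if err := json.Unmarshal(scanner.Bytes(), &progress); err != nil {
			continue // skip malformed lines
		}
		if progress.Error != "" {
			fmt.Println("create failed:", progress.Error)
			return
		}
		fmt.Println("progress:", progress.Status)
	}
}
```

The per-callback `select` on `ctx.Done()` is what lets a closed browser tab abort the create instead of streaming into the void.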
```diff
@@ -83,6 +83,8 @@ func SetupRoutes(r *gin.Engine, db *sql.DB, ollamaURL string, appVersion string)
 		// === Remote Models (from ollama.com cache) ===
 		// List/search remote models (from cache)
 		models.GET("/remote", modelRegistry.ListRemoteModelsHandler())
+		// Get unique model families for filter dropdowns
+		models.GET("/remote/families", modelRegistry.GetRemoteFamiliesHandler())
 		// Get single model details
 		models.GET("/remote/:slug", modelRegistry.GetRemoteModelHandler())
 		// Fetch detailed info from Ollama (requires model to be pulled)
@@ -103,6 +105,7 @@ func SetupRoutes(r *gin.Engine, db *sql.DB, ollamaURL string, appVersion string)
 		ollama.GET("/api/tags", ollamaService.ListModelsHandler())
 		ollama.POST("/api/show", ollamaService.ShowModelHandler())
 		ollama.POST("/api/pull", ollamaService.PullModelHandler())
+		ollama.POST("/api/create", ollamaService.CreateModelHandler())
 		ollama.DELETE("/api/delete", ollamaService.DeleteModelHandler())
 		ollama.POST("/api/copy", ollamaService.CopyModelHandler())
```
```diff
@@ -123,5 +123,17 @@ func RunMigrations(db *sql.DB) error {
 		}
 	}
 
+	// Add system_prompt_id column to chats table if it doesn't exist
+	err = db.QueryRow(`SELECT COUNT(*) FROM pragma_table_info('chats') WHERE name='system_prompt_id'`).Scan(&count)
+	if err != nil {
+		return fmt.Errorf("failed to check system_prompt_id column: %w", err)
+	}
+	if count == 0 {
+		_, err = db.Exec(`ALTER TABLE chats ADD COLUMN system_prompt_id TEXT`)
+		if err != nil {
+			return fmt.Errorf("failed to add system_prompt_id column: %w", err)
+		}
+	}
+
 	return nil
 }
```
```diff
@@ -10,15 +10,16 @@ import (
 
 // Chat represents a chat conversation
 type Chat struct {
-	ID          string    `json:"id"`
-	Title       string    `json:"title"`
-	Model       string    `json:"model"`
-	Pinned      bool      `json:"pinned"`
-	Archived    bool      `json:"archived"`
-	CreatedAt   time.Time `json:"created_at"`
-	UpdatedAt   time.Time `json:"updated_at"`
-	SyncVersion int64     `json:"sync_version"`
-	Messages    []Message `json:"messages,omitempty"`
+	ID             string    `json:"id"`
+	Title          string    `json:"title"`
+	Model          string    `json:"model"`
+	Pinned         bool      `json:"pinned"`
+	Archived       bool      `json:"archived"`
+	SystemPromptID *string   `json:"system_prompt_id,omitempty"`
+	CreatedAt      time.Time `json:"created_at"`
+	UpdatedAt      time.Time `json:"updated_at"`
+	SyncVersion    int64     `json:"sync_version"`
+	Messages       []Message `json:"messages,omitempty"`
 }
 
 // Message represents a chat message
@@ -54,9 +55,9 @@ func CreateChat(db *sql.DB, chat *Chat) error {
 	chat.SyncVersion = 1
 
 	_, err := db.Exec(`
-		INSERT INTO chats (id, title, model, pinned, archived, created_at, updated_at, sync_version)
-		VALUES (?, ?, ?, ?, ?, ?, ?, ?)`,
-		chat.ID, chat.Title, chat.Model, chat.Pinned, chat.Archived,
+		INSERT INTO chats (id, title, model, pinned, archived, system_prompt_id, created_at, updated_at, sync_version)
+		VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)`,
+		chat.ID, chat.Title, chat.Model, chat.Pinned, chat.Archived, chat.SystemPromptID,
 		chat.CreatedAt.Format(time.RFC3339), chat.UpdatedAt.Format(time.RFC3339), chat.SyncVersion,
 	)
 	if err != nil {
@@ -70,11 +71,12 @@ func GetChat(db *sql.DB, id string) (*Chat, error) {
 	chat := &Chat{}
 	var createdAt, updatedAt string
 	var pinned, archived int
+	var systemPromptID sql.NullString
 
 	err := db.QueryRow(`
-		SELECT id, title, model, pinned, archived, created_at, updated_at, sync_version
+		SELECT id, title, model, pinned, archived, system_prompt_id, created_at, updated_at, sync_version
 		FROM chats WHERE id = ?`, id).Scan(
-		&chat.ID, &chat.Title, &chat.Model, &pinned, &archived,
+		&chat.ID, &chat.Title, &chat.Model, &pinned, &archived, &systemPromptID,
 		&createdAt, &updatedAt, &chat.SyncVersion,
 	)
 	if err == sql.ErrNoRows {
@@ -86,6 +88,9 @@ func GetChat(db *sql.DB, id string) (*Chat, error) {
 
 	chat.Pinned = pinned == 1
 	chat.Archived = archived == 1
+	if systemPromptID.Valid {
+		chat.SystemPromptID = &systemPromptID.String
+	}
 	chat.CreatedAt, _ = time.Parse(time.RFC3339, createdAt)
 	chat.UpdatedAt, _ = time.Parse(time.RFC3339, updatedAt)
 
@@ -102,7 +107,7 @@ func GetChat(db *sql.DB, id string) (*Chat, error) {
 // ListChats retrieves all chats ordered by updated_at
 func ListChats(db *sql.DB, includeArchived bool) ([]Chat, error) {
 	query := `
-		SELECT id, title, model, pinned, archived, created_at, updated_at, sync_version
+		SELECT id, title, model, pinned, archived, system_prompt_id, created_at, updated_at, sync_version
 		FROM chats`
 	if !includeArchived {
 		query += " WHERE archived = 0"
@@ -120,14 +125,18 @@ func ListChats(db *sql.DB, includeArchived bool) ([]Chat, error) {
 		var chat Chat
 		var createdAt, updatedAt string
 		var pinned, archived int
+		var systemPromptID sql.NullString
 
-		if err := rows.Scan(&chat.ID, &chat.Title, &chat.Model, &pinned, &archived,
+		if err := rows.Scan(&chat.ID, &chat.Title, &chat.Model, &pinned, &archived, &systemPromptID,
 			&createdAt, &updatedAt, &chat.SyncVersion); err != nil {
 			return nil, fmt.Errorf("failed to scan chat: %w", err)
 		}
 
 		chat.Pinned = pinned == 1
 		chat.Archived = archived == 1
+		if systemPromptID.Valid {
+			chat.SystemPromptID = &systemPromptID.String
+		}
 		chat.CreatedAt, _ = time.Parse(time.RFC3339, createdAt)
 		chat.UpdatedAt, _ = time.Parse(time.RFC3339, updatedAt)
 		chats = append(chats, chat)
@@ -142,10 +151,10 @@ func UpdateChat(db *sql.DB, chat *Chat) error {
 	chat.SyncVersion++
 
 	result, err := db.Exec(`
-		UPDATE chats SET title = ?, model = ?, pinned = ?, archived = ?,
+		UPDATE chats SET title = ?, model = ?, pinned = ?, archived = ?, system_prompt_id = ?,
 			updated_at = ?, sync_version = ?
 		WHERE id = ?`,
-		chat.Title, chat.Model, chat.Pinned, chat.Archived,
+		chat.Title, chat.Model, chat.Pinned, chat.Archived, chat.SystemPromptID,
 		chat.UpdatedAt.Format(time.RFC3339), chat.SyncVersion, chat.ID,
 	)
 	if err != nil {
@@ -234,7 +243,7 @@ func GetMessagesByChatID(db *sql.DB, chatID string) ([]Message, error) {
 // GetChangedChats retrieves chats changed since a given sync version
 func GetChangedChats(db *sql.DB, sinceVersion int64) ([]Chat, error) {
 	rows, err := db.Query(`
-		SELECT id, title, model, pinned, archived, created_at, updated_at, sync_version
+		SELECT id, title, model, pinned, archived, system_prompt_id, created_at, updated_at, sync_version
 		FROM chats WHERE sync_version > ? ORDER BY sync_version ASC`, sinceVersion)
 	if err != nil {
 		return nil, fmt.Errorf("failed to get changed chats: %w", err)
@@ -246,14 +255,18 @@ func GetChangedChats(db *sql.DB, sinceVersion int64) ([]Chat, error) {
 		var chat Chat
 		var createdAt, updatedAt string
 		var pinned, archived int
+		var systemPromptID sql.NullString
 
-		if err := rows.Scan(&chat.ID, &chat.Title, &chat.Model, &pinned, &archived,
+		if err := rows.Scan(&chat.ID, &chat.Title, &chat.Model, &pinned, &archived, &systemPromptID,
 			&createdAt, &updatedAt, &chat.SyncVersion); err != nil {
 			return nil, fmt.Errorf("failed to scan chat: %w", err)
 		}
 
 		chat.Pinned = pinned == 1
 		chat.Archived = archived == 1
+		if systemPromptID.Valid {
+			chat.SystemPromptID = &systemPromptID.String
+		}
 		chat.CreatedAt, _ = time.Parse(time.RFC3339, createdAt)
 		chat.UpdatedAt, _ = time.Parse(time.RFC3339, updatedAt)
 
@@ -285,13 +298,14 @@ const (
 
 // GroupedChat represents a chat in a grouped list (without messages for efficiency)
 type GroupedChat struct {
-	ID        string    `json:"id"`
-	Title     string    `json:"title"`
-	Model     string    `json:"model"`
-	Pinned    bool      `json:"pinned"`
-	Archived  bool      `json:"archived"`
-	CreatedAt time.Time `json:"created_at"`
-	UpdatedAt time.Time `json:"updated_at"`
+	ID             string    `json:"id"`
+	Title          string    `json:"title"`
+	Model          string    `json:"model"`
+	Pinned         bool      `json:"pinned"`
+	Archived       bool      `json:"archived"`
+	SystemPromptID *string   `json:"system_prompt_id,omitempty"`
+	CreatedAt      time.Time `json:"created_at"`
+	UpdatedAt      time.Time `json:"updated_at"`
 }
 
 // ChatGroup represents a group of chats with a date label
@@ -349,7 +363,7 @@ func getDateGroup(t time.Time, now time.Time) DateGroup {
 func ListChatsGrouped(db *sql.DB, search string, includeArchived bool, limit, offset int) (*GroupedChatsResponse, error) {
 	// Build query with optional search filter
 	query := `
-		SELECT id, title, model, pinned, archived, created_at, updated_at
+		SELECT id, title, model, pinned, archived, system_prompt_id, created_at, updated_at
 		FROM chats
 		WHERE 1=1`
 	args := []interface{}{}
@@ -390,14 +404,18 @@ func ListChatsGrouped(db *sql.DB, search string, includeArchived bool, limit, of
 		var chat GroupedChat
 		var createdAt, updatedAt string
 		var pinned, archived int
+		var systemPromptID sql.NullString
 
-		if err := rows.Scan(&chat.ID, &chat.Title, &chat.Model, &pinned, &archived,
+		if err := rows.Scan(&chat.ID, &chat.Title, &chat.Model, &pinned, &archived, &systemPromptID,
 			&createdAt, &updatedAt); err != nil {
 			return nil, fmt.Errorf("failed to scan chat: %w", err)
 		}
 
 		chat.Pinned = pinned == 1
 		chat.Archived = archived == 1
+		if systemPromptID.Valid {
+			chat.SystemPromptID = &systemPromptID.String
+		}
 		chat.CreatedAt, _ = time.Parse(time.RFC3339, createdAt)
 		chat.UpdatedAt, _ = time.Parse(time.RFC3339, updatedAt)
 		chats = append(chats, chat)
```
```diff
@@ -12,6 +12,10 @@ RUN npm ci
 # Copy source code
 COPY . .
 
+# Copy PDF.js worker to static directory for local serving
+# This avoids CDN dependency and CORS issues with ESM modules
+RUN cp node_modules/pdfjs-dist/build/pdf.worker.min.mjs static/
+
 # Build the application
 RUN npm run build
```
**frontend/package-lock.json** (generated · 32 changes)
```diff
@@ -1,12 +1,12 @@
 {
   "name": "vessel",
-  "version": "0.3.0",
+  "version": "0.4.8",
   "lockfileVersion": 3,
   "requires": true,
   "packages": {
     "": {
       "name": "vessel",
-      "version": "0.3.0",
+      "version": "0.4.8",
       "dependencies": {
         "@codemirror/lang-javascript": "^6.2.3",
         "@codemirror/lang-json": "^6.0.1",
@@ -15,6 +15,8 @@
         "@skeletonlabs/skeleton": "^2.10.0",
         "@skeletonlabs/tw-plugin": "^0.4.0",
         "@sveltejs/adapter-node": "^5.4.0",
+        "@tanstack/svelte-virtual": "^3.13.15",
+        "@tanstack/virtual-core": "^3.13.15",
         "@types/dompurify": "^3.0.5",
         "codemirror": "^6.0.1",
         "dexie": "^4.0.10",
@@ -1739,6 +1741,32 @@
         "tailwindcss": ">=3.0.0 || insiders || >=4.0.0-alpha.20 || >=4.0.0-beta.1"
       }
     },
+    "node_modules/@tanstack/svelte-virtual": {
+      "version": "3.13.15",
+      "resolved": "https://registry.npmjs.org/@tanstack/svelte-virtual/-/svelte-virtual-3.13.15.tgz",
+      "integrity": "sha512-3PPLI3hsyT70zSZhBkSIZXIarlN+GjFNKeKr2Wk1UR7EuEVtXgNlB/Zk0sYtaeJ4CvGvldQNakOvbdETnWAgeA==",
+      "license": "MIT",
+      "dependencies": {
+        "@tanstack/virtual-core": "3.13.15"
+      },
+      "funding": {
+        "type": "github",
+        "url": "https://github.com/sponsors/tannerlinsley"
+      },
+      "peerDependencies": {
+        "svelte": "^3.48.0 || ^4.0.0 || ^5.0.0"
+      }
+    },
+    "node_modules/@tanstack/virtual-core": {
+      "version": "3.13.15",
+      "resolved": "https://registry.npmjs.org/@tanstack/virtual-core/-/virtual-core-3.13.15.tgz",
+      "integrity": "sha512-8cG3acM2cSIm3h8WxboHARAhQAJbYUhvmadvnN8uz8aziDwrbYb9KiARni+uY2qrLh49ycn+poGoxvtIAKhjog==",
+      "license": "MIT",
+      "funding": {
+        "type": "github",
+        "url": "https://github.com/sponsors/tannerlinsley"
+      }
+    },
     "node_modules/@testing-library/dom": {
       "version": "10.4.1",
       "dev": true,
```
@@ -1,6 +1,6 @@
{
	"name": "vessel",
	"version": "0.4.2",
	"version": "0.4.11",
	"private": true,
	"type": "module",
	"scripts": {
@@ -11,7 +11,8 @@
		"check:watch": "svelte-kit sync && svelte-check --tsconfig ./tsconfig.json --watch",
		"test": "vitest run",
		"test:watch": "vitest",
		"test:coverage": "vitest run --coverage"
		"test:coverage": "vitest run --coverage",
		"postinstall": "cp node_modules/pdfjs-dist/build/pdf.worker.min.mjs static/ 2>/dev/null || true"
	},
	"devDependencies": {
		"@sveltejs/adapter-auto": "^4.0.0",
@@ -37,10 +38,12 @@
		"@codemirror/lang-python": "^6.1.7",
		"@codemirror/theme-one-dark": "^6.1.2",
		"@skeletonlabs/skeleton": "^2.10.0",
		"codemirror": "^6.0.1",
		"@skeletonlabs/tw-plugin": "^0.4.0",
		"@sveltejs/adapter-node": "^5.4.0",
		"@tanstack/svelte-virtual": "^3.13.15",
		"@tanstack/virtual-core": "^3.13.15",
		"@types/dompurify": "^3.0.5",
		"codemirror": "^6.0.1",
		"dexie": "^4.0.10",
		"dompurify": "^3.2.0",
		"marked": "^15.0.0",
@@ -49,11 +49,20 @@ export interface SyncStatus {
/** Sort options for model list */
export type ModelSortOption = 'name_asc' | 'name_desc' | 'pulls_asc' | 'pulls_desc' | 'updated_desc';

/** Size range filter options */
export type SizeRange = 'small' | 'medium' | 'large' | 'xlarge';

/** Context length range filter options */
export type ContextRange = 'standard' | 'extended' | 'large' | 'unlimited';

/** Search/filter options */
export interface ModelSearchOptions {
	search?: string;
	type?: 'official' | 'community';
	capabilities?: string[];
	sizeRanges?: SizeRange[];
	contextRanges?: ContextRange[];
	family?: string;
	sort?: ModelSortOption;
	limit?: number;
	offset?: number;
@@ -73,6 +82,13 @@ export async function fetchRemoteModels(options: ModelSearchOptions = {}): Promi
	if (options.capabilities && options.capabilities.length > 0) {
		params.set('capabilities', options.capabilities.join(','));
	}
	if (options.sizeRanges && options.sizeRanges.length > 0) {
		params.set('sizeRange', options.sizeRanges.join(','));
	}
	if (options.contextRanges && options.contextRanges.length > 0) {
		params.set('contextRange', options.contextRanges.join(','));
	}
	if (options.family) params.set('family', options.family);
	if (options.sort) params.set('sort', options.sort);
	if (options.limit) params.set('limit', String(options.limit));
	if (options.offset) params.set('offset', String(options.offset));
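A quick usage sketch of the new filters, grounded in the `ModelSearchOptions` interface above (the import path is an assumption for illustration):

```typescript
import { fetchRemoteModels } from '$lib/services/models'; // path assumed

// Multi-value filters are comma-joined on the wire:
// ?search=llama&sizeRange=small,medium&contextRange=extended&sort=pulls_desc&limit=20
const page = await fetchRemoteModels({
	search: 'llama',
	sizeRanges: ['small', 'medium'],
	contextRanges: ['extended'],
	sort: 'pulls_desc',
	limit: 20
});
```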
@@ -87,6 +103,20 @@ export async function fetchRemoteModels(options: ModelSearchOptions = {}): Promi
	return response.json();
}

/**
 * Get unique model families for filter dropdowns (remote models)
 */
export async function fetchRemoteFamilies(): Promise<string[]> {
	const response = await fetch(`${API_BASE}/remote/families`);

	if (!response.ok) {
		throw new Error(`Failed to fetch families: ${response.statusText}`);
	}

	const data = await response.json();
	return data.families;
}

/**
 * Get a single remote model by slug
 */
@@ -135,9 +165,11 @@ export async function fetchTagSizes(slug: string): Promise<RemoteModel> {

/**
 * Sync models from ollama.com
 * @param fetchDetails - If true, also fetches real capabilities from Ollama for installed models
 */
export async function syncModels(): Promise<SyncResponse> {
	const response = await fetch(`${API_BASE}/remote/sync`, {
export async function syncModels(fetchDetails: boolean = true): Promise<SyncResponse> {
	const url = fetchDetails ? `${API_BASE}/remote/sync?details=true` : `${API_BASE}/remote/sync`;
	const response = await fetch(url, {
		method: 'POST'
	});
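The `fetchDetails` flag only changes the query string; both calls POST to the same endpoint. Usage, per the signature above:

```typescript
// Default: also fetch real capabilities for installed models
await syncModels();      // POST {API_BASE}/remote/sync?details=true

// Faster, metadata-only sync
await syncModels(false); // POST {API_BASE}/remote/sync
```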
@@ -330,6 +330,7 @@ class SyncManager {
	updatedAt: new Date(backendChat.updated_at).getTime(),
	isPinned: backendChat.pinned,
	isArchived: backendChat.archived,
	systemPromptId: backendChat.system_prompt_id ?? null,
	messageCount: backendChat.messages?.length ?? existing?.messageCount ?? 0,
	syncVersion: backendChat.sync_version
};
@@ -378,6 +379,7 @@ class SyncManager {
	model: conv.model,
	pinned: conv.isPinned,
	archived: conv.isArchived,
	system_prompt_id: conv.systemPromptId ?? undefined,
	created_at: new Date(conv.createdAt).toISOString(),
	updated_at: new Date(conv.updatedAt).toISOString(),
	sync_version: conv.syncVersion ?? 1

@@ -9,6 +9,7 @@ export interface BackendChat {
	model: string;
	pinned: boolean;
	archived: boolean;
	system_prompt_id?: string | null;
	created_at: string;
	updated_at: string;
	sync_version: number;
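The Go side serializes `SystemPromptID` as `*string` with `omitempty`, so the field arrives as a string, `null`, or absent; the sync code above collapses the last two cases with `?? null` on read and maps `null` back to `undefined` on write so `JSON.stringify` drops it. A small sketch of that normalization (types taken from the `BackendChat` interface above):

```typescript
function readPromptId(chat: { system_prompt_id?: string | null }): string | null {
	// Absent and null both become null locally
	return chat.system_prompt_id ?? null;
}

function writePromptId(local: string | null): string | undefined {
	// undefined is omitted by JSON.stringify, mirroring Go's omitempty
	return local ?? undefined;
}
```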
@@ -5,6 +5,7 @@
 */

import { chatState, modelsState, conversationsState, toolsState, promptsState, toastState } from '$lib/stores';
import { resolveSystemPrompt } from '$lib/services/prompt-resolution.js';
import { serverConversationsState } from '$lib/stores/server-conversations.svelte';
import { streamingMetricsState } from '$lib/stores/streaming-metrics.svelte';
import { ollamaClient } from '$lib/ollama';
@@ -22,7 +23,7 @@
import { runToolCalls, formatToolResultsForChat, getFunctionModel, USE_FUNCTION_MODEL } from '$lib/tools';
import type { OllamaMessage, OllamaToolCall, OllamaToolDefinition } from '$lib/ollama';
import type { Conversation } from '$lib/types/conversation';
import MessageList from './MessageList.svelte';
import VirtualMessageList from './VirtualMessageList.svelte';
import ChatInput from './ChatInput.svelte';
import EmptyState from './EmptyState.svelte';
import ContextUsageBar from './ContextUsageBar.svelte';
@@ -182,6 +183,15 @@
	}
});

// Sync custom context limit with settings
$effect(() => {
	if (settingsState.useCustomParameters) {
		contextManager.setCustomContextLimit(settingsState.num_ctx);
	} else {
		contextManager.setCustomContextLimit(null);
	}
});

// Update context manager when messages change
$effect(() => {
	contextManager.updateMessages(chatState.visibleMessages);
@@ -262,6 +272,49 @@
	}
}

/**
 * Handle automatic compaction of older messages
 * Called after assistant response completes when auto-compact is enabled
 */
async function handleAutoCompact(): Promise<void> {
	// Check if auto-compact should be triggered
	if (!contextManager.shouldAutoCompact()) return;

	const selectedModel = modelsState.selectedId;
	if (!selectedModel || isSummarizing) return;

	const messages = chatState.visibleMessages;
	const preserveCount = contextManager.getAutoCompactPreserveCount();
	const { toSummarize } = selectMessagesForSummarization(messages, 0, preserveCount);

	if (toSummarize.length < 2) return;

	isSummarizing = true;

	try {
		// Generate summary using the LLM
		const summary = await generateSummary(toSummarize, selectedModel);

		// Mark original messages as summarized
		const messageIdsToSummarize = toSummarize.map((node) => node.id);
		chatState.markAsSummarized(messageIdsToSummarize);

		// Insert the summary message (inline indicator will be shown by MessageList)
		chatState.insertSummaryMessage(summary);

		// Force context recalculation
		contextManager.updateMessages(chatState.visibleMessages, true);

		// Subtle notification for auto-compact (inline indicator is the primary feedback)
		console.log(`[Auto-compact] Summarized ${toSummarize.length} messages`);
	} catch (error) {
		console.error('[Auto-compact] Failed:', error);
		// Silent failure for auto-compact - don't interrupt user flow
	} finally {
		isSummarizing = false;
	}
}

// =========================================================================
// Context Full Modal Handlers
// =========================================================================
@@ -410,33 +463,25 @@
let messages = getMessagesForApi();
const tools = getToolsForApi();

// Build system prompt from active prompt + thinking + RAG context
// Build system prompt from resolution service + RAG context
const systemParts: string[] = [];

// Wait for prompts to be loaded
await promptsState.ready();
// Resolve system prompt using priority chain:
// 1. Per-conversation prompt
// 2. New chat selection
// 3. Model-prompt mapping
// 4. Model-embedded prompt (from Modelfile)
// 5. Capability-matched prompt
// 6. Global active prompt
// 7. None
const resolvedPrompt = await resolveSystemPrompt(
	model,
	conversation?.systemPromptId,
	newChatPromptId
);

// Priority: per-conversation prompt > new chat prompt > global active prompt > none
let promptContent: string | null = null;
if (conversation?.systemPromptId) {
	// Use per-conversation prompt
	const conversationPrompt = promptsState.get(conversation.systemPromptId);
	if (conversationPrompt) {
		promptContent = conversationPrompt.content;
	}
} else if (newChatPromptId) {
	// Use new chat selected prompt (before conversation is created)
	const newChatPrompt = promptsState.get(newChatPromptId);
	if (newChatPrompt) {
		promptContent = newChatPrompt.content;
	}
} else if (promptsState.activePrompt) {
	// Fall back to global active prompt
	promptContent = promptsState.activePrompt.content;
}

if (promptContent) {
	systemParts.push(promptContent);
if (resolvedPrompt.content) {
	systemParts.push(resolvedPrompt.content);
}

// RAG: Retrieve relevant context for the last user message
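The seven-step chain in the comments above is the contract of `resolveSystemPrompt`. A minimal sketch of how such a first-match resolver can be structured, not the actual service implementation (the lookup steps are hypothetical stand-ins):

```typescript
type PromptSource =
	| 'per-conversation' | 'new-chat-selection' | 'model-mapping'
	| 'model-embedded' | 'capability-match' | 'global-active' | 'none';

interface ResolvedPrompt {
	content: string | null;
	source: PromptSource;
	promptName?: string;
}

// Each step either produces a prompt or null; the first hit wins.
type Lookup = () => Promise<{ content: string; promptName?: string } | null>;

async function resolveFirstMatch(
	steps: Array<[PromptSource, Lookup]>
): Promise<ResolvedPrompt> {
	for (const [source, lookup] of steps) {
		const hit = await lookup();
		if (hit) return { ...hit, source };
	}
	// Step 7: nothing matched
	return { content: null, source: 'none' };
}
```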
@@ -540,6 +585,9 @@
		conversationsState.update(conversationId, {});
	}
}

// Check for auto-compact after response completes
await handleAutoCompact();
},
onError: (error) => {
	console.error('Streaming error:', error);
@@ -817,7 +865,7 @@
<div class="flex h-full flex-col bg-theme-primary">
	{#if hasMessages}
		<div class="flex-1 overflow-hidden">
			<MessageList
			<VirtualMessageList
				onRegenerate={handleRegenerate}
				onEditMessage={handleEditMessage}
				showThinking={thinkingEnabled}
@@ -877,10 +925,12 @@
<SystemPromptSelector
	conversationId={conversation.id}
	currentPromptId={conversation.systemPromptId}
	modelName={modelsState.selectedId ?? undefined}
/>
{:else if mode === 'new'}
<SystemPromptSelector
	currentPromptId={newChatPromptId}
	modelName={modelsState.selectedId ?? undefined}
	onSelect={(promptId) => (newChatPromptId = promptId)}
/>
{/if}

@@ -7,6 +7,7 @@
import { chatState } from '$lib/stores';
import type { MessageNode, BranchInfo } from '$lib/types';
import MessageItem from './MessageItem.svelte';
import SummarizationIndicator from './SummarizationIndicator.svelte';

interface Props {
	onRegenerate?: () => void;
@@ -208,6 +209,10 @@
>
	<div class="mx-auto max-w-4xl px-4 py-6">
		{#each chatState.visibleMessages as node, index (node.id)}
			<!-- Show summarization indicator before summary messages -->
			{#if node.message.isSummary}
				<SummarizationIndicator />
			{/if}
			<MessageItem
				{node}
				branchInfo={getBranchInfo(node)}
@@ -0,0 +1,17 @@
<script lang="ts">
	/**
	 * SummarizationIndicator - Visual marker showing where conversation was summarized
	 * Displayed in the message list to indicate context compaction occurred
	 */
</script>

<div class="flex items-center gap-3 py-4" role="separator" aria-label="Conversation summarized">
	<div class="flex-1 border-t border-dashed border-emerald-500/30"></div>
	<div class="flex items-center gap-2 text-xs text-emerald-500">
		<svg xmlns="http://www.w3.org/2000/svg" class="h-3.5 w-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
			<path stroke-linecap="round" stroke-linejoin="round" d="M4 7v10c0 2.21 3.582 4 8 4s8-1.79 8-4V7M4 7c0 2.21 3.582 4 8 4s8-1.79 8-4M4 7c0-2.21 3.582-4 8-4s8 1.79 8 4" />
		</svg>
		<span>Earlier messages summarized</span>
	</div>
	<div class="flex-1 border-t border-dashed border-emerald-500/30"></div>
</div>
@@ -1,59 +1,131 @@
<script lang="ts">
	/**
	 * SystemPromptSelector - Dropdown to select a system prompt for the current conversation
	 * Allows per-conversation prompt assignment with quick preview
	 * In 'new' mode (no conversationId), uses onSelect callback for local state management
	 * Now model-aware: shows embedded prompts and resolved source indicators
	 */
	import { promptsState, conversationsState, toastState } from '$lib/stores';
	import { updateSystemPrompt } from '$lib/storage';
	import { modelInfoService } from '$lib/services/model-info-service.js';
	import { modelPromptMappingsState } from '$lib/stores/model-prompt-mappings.svelte.js';
	import {
		resolveSystemPrompt,
		getPromptSourceLabel,
		type PromptSource
	} from '$lib/services/prompt-resolution.js';

	interface Props {
		conversationId?: string | null;
		currentPromptId?: string | null;
		/** Model name for model-aware prompt resolution */
		modelName?: string;
		/** Callback for 'new' mode - called when prompt is selected without a conversation */
		onSelect?: (promptId: string | null) => void;
	}

	let { conversationId = null, currentPromptId = null, onSelect }: Props = $props();
	let { conversationId = null, currentPromptId = null, modelName = '', onSelect }: Props = $props();

	// UI state
	let isOpen = $state(false);
	let dropdownElement: HTMLDivElement | null = $state(null);

	// Model info state
	let hasEmbeddedPrompt = $state(false);
	let modelCapabilities = $state<string[]>([]);
	let resolvedSource = $state<PromptSource>('none');
	let resolvedPromptName = $state<string | undefined>(undefined);

	// Available prompts from store
	const prompts = $derived(promptsState.prompts);

	// Current prompt for this conversation
	// Current prompt for this conversation (explicit override)
	const currentPrompt = $derived(
		currentPromptId ? prompts.find((p) => p.id === currentPromptId) : null
	);

	// Display text for the button
	const buttonText = $derived(currentPrompt?.name ?? 'No system prompt');
	// Check if there's a model-prompt mapping
	const hasModelMapping = $derived(modelName ? modelPromptMappingsState.hasMapping(modelName) : false);

	// Display text for the button
	const buttonText = $derived.by(() => {
		if (currentPrompt) return currentPrompt.name;
		if (resolvedPromptName && resolvedSource !== 'none') return resolvedPromptName;
		return 'No system prompt';
	});

	// Source badge color
	const sourceBadgeClass = $derived.by(() => {
		switch (resolvedSource) {
			case 'per-conversation':
			case 'new-chat-selection':
				return 'bg-violet-500/20 text-violet-300';
			case 'model-mapping':
				return 'bg-blue-500/20 text-blue-300';
			case 'model-embedded':
				return 'bg-amber-500/20 text-amber-300';
			case 'capability-match':
				return 'bg-emerald-500/20 text-emerald-300';
			case 'global-active':
				return 'bg-slate-500/20 text-slate-300';
			default:
				return 'bg-slate-500/20 text-slate-400';
		}
	});

	// Load model info when modelName changes
	$effect(() => {
		if (modelName) {
			loadModelInfo();
		}
	});

	// Resolve prompt when relevant state changes
	$effect(() => {
		// Depend on these values to trigger re-resolution
		const _promptId = currentPromptId;
		const _model = modelName;
		if (modelName) {
			resolveCurrentPrompt();
		}
	});

	async function loadModelInfo(): Promise<void> {
		if (!modelName) return;
		try {
			const info = await modelInfoService.getModelInfo(modelName);
			hasEmbeddedPrompt = info.systemPrompt !== null;
			modelCapabilities = info.capabilities;
		} catch {
			hasEmbeddedPrompt = false;
			modelCapabilities = [];
		}
	}

	async function resolveCurrentPrompt(): Promise<void> {
		if (!modelName) return;
		try {
			const resolved = await resolveSystemPrompt(modelName, currentPromptId, null);
			resolvedSource = resolved.source;
			resolvedPromptName = resolved.promptName;
		} catch {
			resolvedSource = 'none';
			resolvedPromptName = undefined;
		}
	}

	/**
	 * Toggle dropdown
	 */
	function toggleDropdown(): void {
		isOpen = !isOpen;
	}

	/**
	 * Close dropdown
	 */
	function closeDropdown(): void {
		isOpen = false;
	}

	/**
	 * Handle prompt selection
	 */
	async function handleSelect(promptId: string | null): Promise<void> {
		// In 'new' mode (no conversation), use the callback
		if (!conversationId) {
			onSelect?.(promptId);
			const promptName = promptId ? prompts.find((p) => p.id === promptId)?.name : null;
			toastState.success(promptName ? `Using "${promptName}"` : 'System prompt cleared');
			toastState.success(promptName ? `Using "${promptName}"` : 'Using model default');
			closeDropdown();
			return;
		}
@@ -61,10 +133,9 @@
		// Update in storage for existing conversation
		const result = await updateSystemPrompt(conversationId, promptId);
		if (result.success) {
			// Update in memory
			conversationsState.setSystemPrompt(conversationId, promptId);
			const promptName = promptId ? prompts.find((p) => p.id === promptId)?.name : null;
			toastState.success(promptName ? `Using "${promptName}"` : 'System prompt cleared');
			toastState.success(promptName ? `Using "${promptName}"` : 'Using model default');
		} else {
			toastState.error('Failed to update system prompt');
		}
@@ -72,18 +143,12 @@
		closeDropdown();
	}

	/**
	 * Handle click outside to close
	 */
	function handleClickOutside(event: MouseEvent): void {
		if (dropdownElement && !dropdownElement.contains(event.target as Node)) {
			closeDropdown();
		}
	}

	/**
	 * Handle escape key
	 */
	function handleKeydown(event: KeyboardEvent): void {
		if (event.key === 'Escape' && isOpen) {
			closeDropdown();
@@ -98,24 +163,40 @@
	<button
		type="button"
		onclick={toggleDropdown}
		class="flex items-center gap-1.5 rounded-lg px-2.5 py-1.5 text-xs font-medium transition-colors {currentPrompt
			? 'bg-violet-500/20 text-violet-300 hover:bg-violet-500/30'
		class="flex items-center gap-1.5 rounded-lg px-2.5 py-1.5 text-xs font-medium transition-colors {resolvedSource !== 'none'
			? sourceBadgeClass
			: 'text-theme-muted hover:bg-theme-secondary hover:text-theme-secondary'}"
		title={currentPrompt ? `System prompt: ${currentPrompt.name}` : 'Set system prompt'}
		title={resolvedPromptName ? `System prompt: ${resolvedPromptName}` : 'Set system prompt'}
	>
		<svg
			xmlns="http://www.w3.org/2000/svg"
			viewBox="0 0 20 20"
			fill="currentColor"
			class="h-3.5 w-3.5"
		>
			<path
				fill-rule="evenodd"
				d="M18 10a8 8 0 1 1-16 0 8 8 0 0 1 16 0Zm-8-5a.75.75 0 0 1 .75.75v4.5a.75.75 0 0 1-1.5 0v-4.5A.75.75 0 0 1 10 5Zm0 10a1 1 0 1 0 0-2 1 1 0 0 0 0 2Z"
				clip-rule="evenodd"
			/>
		</svg>
		<!-- Icon based on source -->
		{#if resolvedSource === 'model-embedded'}
			<!-- Chip/CPU icon for embedded -->
			<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 20 20" fill="currentColor" class="h-3.5 w-3.5">
				<path d="M14 6H6v8h8V6Z" />
				<path fill-rule="evenodd" d="M9.25 3V1.75a.75.75 0 0 1 1.5 0V3h1.5V1.75a.75.75 0 0 1 1.5 0V3h.5A2.75 2.75 0 0 1 17 5.75v.5h1.25a.75.75 0 0 1 0 1.5H17v1.5h1.25a.75.75 0 0 1 0 1.5H17v1.5h1.25a.75.75 0 0 1 0 1.5H17v.5A2.75 2.75 0 0 1 14.25 17h-.5v1.25a.75.75 0 0 1-1.5 0V17h-1.5v1.25a.75.75 0 0 1-1.5 0V17h-1.5v1.25a.75.75 0 0 1-1.5 0V17h-.5A2.75 2.75 0 0 1 3 14.25v-.5H1.75a.75.75 0 0 1 0-1.5H3v-1.5H1.75a.75.75 0 0 1 0-1.5H3v-1.5H1.75a.75.75 0 0 1 0-1.5H3v-.5A2.75 2.75 0 0 1 5.75 3h.5V1.75a.75.75 0 0 1 1.5 0V3h1.5ZM4.5 5.75c0-.69.56-1.25 1.25-1.25h8.5c.69 0 1.25.56 1.25 1.25v8.5c0 .69-.56 1.25-1.25 1.25h-8.5c-.69 0-1.25-.56-1.25-1.25v-8.5Z" clip-rule="evenodd" />
			</svg>
		{:else}
			<!-- Default info icon -->
			<svg
				xmlns="http://www.w3.org/2000/svg"
				viewBox="0 0 20 20"
				fill="currentColor"
				class="h-3.5 w-3.5"
			>
				<path
					fill-rule="evenodd"
					d="M18 10a8 8 0 1 1-16 0 8 8 0 0 1 16 0Zm-8-5a.75.75 0 0 1 .75.75v4.5a.75.75 0 0 1-1.5 0v-4.5A.75.75 0 0 1 10 5Zm0 10a1 1 0 1 0 0-2 1 1 0 0 0 0 2Z"
					clip-rule="evenodd"
				/>
			</svg>
		{/if}
		<span class="max-w-[120px] truncate">{buttonText}</span>
		<!-- Source indicator badge -->
		{#if resolvedSource !== 'none' && resolvedSource !== 'per-conversation' && resolvedSource !== 'new-chat-selection'}
			<span class="rounded px-1 py-0.5 text-[10px] opacity-75">
				{getPromptSourceLabel(resolvedSource)}
			</span>
		{/if}
		<svg
			xmlns="http://www.w3.org/2000/svg"
			viewBox="0 0 20 20"
@@ -133,9 +214,12 @@
	<!-- Dropdown menu -->
	{#if isOpen}
		<div
			class="absolute left-0 top-full z-50 mt-1 w-64 rounded-lg border border-theme bg-theme-secondary py-1 shadow-xl"
			class="absolute left-0 top-full z-50 mt-1 w-72 rounded-lg border border-theme bg-theme-secondary py-1 shadow-xl"
		>
			<!-- No prompt option -->
			<!-- Model default section -->
			<div class="px-3 py-1.5 text-xs font-medium text-theme-muted uppercase tracking-wide">
				Model Default
			</div>
			<button
				type="button"
				onclick={() => handleSelect(null)}
@@ -143,7 +227,21 @@
					? 'bg-theme-tertiary/50 text-theme-primary'
					: 'text-theme-secondary'}"
			>
				<span class="flex-1">No system prompt</span>
				<div class="flex-1">
					<div class="flex items-center gap-2">
						<span>Use model default</span>
						{#if hasEmbeddedPrompt}
							<span class="rounded bg-amber-500/20 px-1.5 py-0.5 text-[10px] text-amber-300">
								Has embedded prompt
							</span>
						{/if}
					</div>
					{#if !currentPromptId && resolvedSource !== 'none'}
						<div class="mt-0.5 text-xs text-theme-muted">
							Currently: {resolvedPromptName ?? 'None'}
						</div>
					{/if}
				</div>
				{#if !currentPromptId}
					<svg
						xmlns="http://www.w3.org/2000/svg"
@@ -162,6 +260,9 @@

			{#if prompts.length > 0}
				<div class="my-1 border-t border-theme"></div>
				<div class="px-3 py-1.5 text-xs font-medium text-theme-muted uppercase tracking-wide">
					Your Prompts
				</div>

				<!-- Available prompts -->
				{#each prompts as prompt}
@@ -205,12 +306,26 @@
				</button>
			{/each}
			{:else}
				<div class="my-1 border-t border-theme"></div>
				<div class="px-3 py-2 text-xs text-theme-muted">
					No prompts available. <a href="/prompts" class="text-violet-400 hover:underline"
						>Create one</a
					>
				</div>
			{/if}

			<!-- Link to model defaults settings -->
			<div class="mt-1 border-t border-theme"></div>
			<a
				href="/settings#model-prompts"
				class="flex items-center gap-2 px-3 py-2 text-xs text-theme-muted hover:bg-theme-tertiary hover:text-theme-secondary"
				onclick={closeDropdown}
			>
				<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 20 20" fill="currentColor" class="h-3.5 w-3.5">
					<path fill-rule="evenodd" d="M8.34 1.804A1 1 0 0 1 9.32 1h1.36a1 1 0 0 1 .98.804l.295 1.473c.497.144.971.342 1.416.587l1.25-.834a1 1 0 0 1 1.262.125l.962.962a1 1 0 0 1 .125 1.262l-.834 1.25c.245.445.443.919.587 1.416l1.473.295a1 1 0 0 1 .804.98v1.36a1 1 0 0 1-.804.98l-1.473.295a6.95 6.95 0 0 1-.587 1.416l.834 1.25a1 1 0 0 1-.125 1.262l-.962.962a1 1 0 0 1-1.262.125l-1.25-.834a6.953 6.953 0 0 1-1.416.587l-.295 1.473a1 1 0 0 1-.98.804H9.32a1 1 0 0 1-.98-.804l-.295-1.473a6.957 6.957 0 0 1-1.416-.587l-1.25.834a1 1 0 0 1-1.262-.125l-.962-.962a1 1 0 0 1-.125-1.262l.834-1.25a6.957 6.957 0 0 1-.587-1.416l-1.473-.295A1 1 0 0 1 1 10.68V9.32a1 1 0 0 1 .804-.98l1.473-.295c.144-.497.342-.971.587-1.416l-.834-1.25a1 1 0 0 1 .125-1.262l.962-.962A1 1 0 0 1 5.38 3.03l1.25.834a6.957 6.957 0 0 1 1.416-.587l.294-1.473ZM13 10a3 3 0 1 1-6 0 3 3 0 0 1 6 0Z" clip-rule="evenodd" />
				</svg>
				Configure model defaults
			</a>
		</div>
	{/if}
</div>
314 frontend/src/lib/components/chat/VirtualMessageList.svelte (new file)
@@ -0,0 +1,314 @@
<script lang="ts">
	/**
	 * VirtualMessageList - Virtualized message list for large conversations
	 * Only renders visible messages for performance with long chats
	 *
	 * Uses @tanstack/svelte-virtual for virtualization.
	 * Falls back to regular rendering if virtualization fails.
	 */

	import { createVirtualizer } from '@tanstack/svelte-virtual';
	import { chatState } from '$lib/stores';
	import type { MessageNode, BranchInfo } from '$lib/types';
	import MessageItem from './MessageItem.svelte';
	import SummarizationIndicator from './SummarizationIndicator.svelte';
	import { onMount } from 'svelte';

	interface Props {
		onRegenerate?: () => void;
		onEditMessage?: (messageId: string, newContent: string) => void;
		showThinking?: boolean;
	}

	const { onRegenerate, onEditMessage, showThinking = true }: Props = $props();

	// Container reference
	let scrollContainer: HTMLDivElement | null = $state(null);

	// Track if component is mounted (scroll container available)
	let isMounted = $state(false);

	// Track user scroll state
	let userScrolledAway = $state(false);
	let autoScrollEnabled = $state(true);
	let wasStreaming = false;

	// Height cache for measured items (message ID -> height)
	const heightCache = new Map<string, number>();

	// Default estimated height for messages
	const DEFAULT_ITEM_HEIGHT = 150;

	// Threshold for scroll detection
	const SCROLL_THRESHOLD = 100;

	// Get visible messages
	const messages = $derived(chatState.visibleMessages);

	// Set mounted after component mounts
	onMount(() => {
		isMounted = true;
	});

	// Create virtualizer - only functional after mount when scrollContainer exists
	const virtualizer = createVirtualizer({
		get count() {
			return messages.length;
		},
		getScrollElement: () => scrollContainer,
		estimateSize: (index: number) => {
			const msg = messages[index];
			if (!msg) return DEFAULT_ITEM_HEIGHT;
			return heightCache.get(msg.id) ?? DEFAULT_ITEM_HEIGHT;
		},
		overscan: 5,
	});

	// Get virtual items with fallback
	const virtualItems = $derived.by(() => {
		if (!isMounted || !scrollContainer) {
			return [];
		}
		return $virtualizer.getVirtualItems();
	});

	// Check if we should use fallback (non-virtual) rendering
	const useFallback = $derived(
		messages.length > 0 && virtualItems.length === 0 && isMounted
	);

	// Track conversation changes to clear cache
	let lastConversationId: string | null = null;
	$effect(() => {
		const currentId = chatState.conversationId;
		if (currentId !== lastConversationId) {
			heightCache.clear();
			lastConversationId = currentId;
		}
	});

	// Force measure after mount and when scroll container becomes available
	$effect(() => {
		if (isMounted && scrollContainer && messages.length > 0) {
			// Use setTimeout to ensure DOM is fully ready
			setTimeout(() => {
				$virtualizer.measure();
			}, 0);
		}
	});

	// Handle streaming scroll behavior
	$effect(() => {
		const isStreaming = chatState.isStreaming;

		if (isStreaming && !wasStreaming) {
			autoScrollEnabled = true;
			if (!userScrolledAway && scrollContainer) {
				requestAnimationFrame(() => {
					if (useFallback) {
						scrollContainer?.scrollTo({ top: scrollContainer.scrollHeight });
					} else {
						$virtualizer.scrollToIndex(messages.length - 1, { align: 'end' });
					}
				});
			}
		}

		wasStreaming = isStreaming;
	});

	// Scroll to bottom during streaming
	$effect(() => {
		const buffer = chatState.streamBuffer;
		const isStreaming = chatState.isStreaming;

		if (isStreaming && buffer && autoScrollEnabled && scrollContainer) {
			requestAnimationFrame(() => {
				if (useFallback) {
					scrollContainer?.scrollTo({ top: scrollContainer.scrollHeight });
				} else {
					$virtualizer.scrollToIndex(messages.length - 1, { align: 'end' });
				}
			});
		}
	});

	// Scroll when new messages are added
	let previousMessageCount = 0;
	$effect(() => {
		const currentCount = messages.length;

		if (currentCount > previousMessageCount && currentCount > 0 && scrollContainer) {
			autoScrollEnabled = true;
			userScrolledAway = false;
			requestAnimationFrame(() => {
				if (useFallback) {
					scrollContainer?.scrollTo({ top: scrollContainer.scrollHeight });
				} else {
					$virtualizer.scrollToIndex(currentCount - 1, { align: 'end' });
				}
			});
		}

		previousMessageCount = currentCount;
	});

	// Handle scroll events
	function handleScroll(): void {
		if (!scrollContainer) return;

		const { scrollTop, scrollHeight, clientHeight } = scrollContainer;
		userScrolledAway = scrollHeight - scrollTop - clientHeight > SCROLL_THRESHOLD;

		if (userScrolledAway && chatState.isStreaming) {
			autoScrollEnabled = false;
		}
	}

	// Scroll to bottom button handler
	function scrollToBottom(): void {
		if (!scrollContainer) return;

		if (useFallback) {
			scrollContainer.scrollTo({ top: scrollContainer.scrollHeight, behavior: 'smooth' });
		} else if (messages.length > 0) {
			$virtualizer.scrollToIndex(messages.length - 1, { align: 'end', behavior: 'smooth' });
		}
	}

	// Measure item height after render (for virtualized mode)
	function measureItem(node: HTMLElement, index: number) {
		const msg = messages[index];
		if (!msg) return { destroy: () => {} };

		const resizeObserver = new ResizeObserver((entries) => {
			for (const entry of entries) {
				const height = entry.contentRect.height;
				if (height > 0 && heightCache.get(msg.id) !== height) {
					heightCache.set(msg.id, height);
					$virtualizer.measure();
				}
			}
		});

		resizeObserver.observe(node);

		// Initial measurement
		const height = node.getBoundingClientRect().height;
		if (height > 0) {
			heightCache.set(msg.id, height);
		}

		return {
			destroy() {
				resizeObserver.disconnect();
			}
		};
	}

	// Get branch info for a message
	function getBranchInfo(node: MessageNode): BranchInfo | null {
		const info = chatState.getBranchInfo(node.id);
		if (info && info.totalCount > 1) {
			return info;
		}
		return null;
	}

	// Handle branch switch
	function handleBranchSwitch(messageId: string, direction: 'prev' | 'next'): void {
		chatState.switchBranch(messageId, direction);
	}

	// Check if message is streaming
	function isStreamingMessage(node: MessageNode): boolean {
		return chatState.isStreaming && chatState.streamingMessageId === node.id;
	}

	// Check if message is last
	function isLastMessage(index: number): boolean {
		return index === messages.length - 1;
	}

	// Show scroll button
	const showScrollButton = $derived(userScrolledAway && messages.length > 0);
</script>

<div class="relative h-full">
	<div
		bind:this={scrollContainer}
		onscroll={handleScroll}
		class="h-full overflow-y-auto"
		role="log"
		aria-live="polite"
		aria-label="Chat messages"
	>
		<div class="mx-auto max-w-4xl px-4 py-6">
			{#if useFallback}
				<!-- Fallback: Regular rendering when virtualization isn't working -->
				{#each messages as node, index (node.id)}
					{#if node.message.isSummary}
						<SummarizationIndicator />
					{/if}
					<MessageItem
						{node}
						branchInfo={getBranchInfo(node)}
						isStreaming={isStreamingMessage(node)}
						isLast={isLastMessage(index)}
						{showThinking}
						onBranchSwitch={(direction) => handleBranchSwitch(node.id, direction)}
						onRegenerate={onRegenerate}
						onEdit={(newContent) => onEditMessage?.(node.id, newContent)}
					/>
				{/each}
			{:else}
				<!-- Virtualized rendering -->
				<div
					style="height: {$virtualizer.getTotalSize()}px; width: 100%; position: relative;"
				>
					{#each virtualItems as virtualRow (virtualRow.key)}
						{@const node = messages[virtualRow.index]}
						{@const index = virtualRow.index}
						{#if node}
							<div
								style="position: absolute; top: 0; left: 0; width: 100%; transform: translateY({virtualRow.start}px);"
								use:measureItem={index}
							>
								{#if node.message.isSummary}
									<SummarizationIndicator />
								{/if}
								<MessageItem
									{node}
									branchInfo={getBranchInfo(node)}
									isStreaming={isStreamingMessage(node)}
									isLast={isLastMessage(index)}
									{showThinking}
									onBranchSwitch={(direction) => handleBranchSwitch(node.id, direction)}
									onRegenerate={onRegenerate}
									onEdit={(newContent) => onEditMessage?.(node.id, newContent)}
								/>
							</div>
						{/if}
					{/each}
				</div>
			{/if}
		</div>
	</div>

	<!-- Scroll to bottom button -->
	{#if showScrollButton}
		<button
			type="button"
			onclick={scrollToBottom}
			class="absolute bottom-4 left-1/2 -translate-x-1/2 rounded-full bg-theme-tertiary px-4 py-2 text-sm text-theme-secondary shadow-lg transition-all hover:bg-theme-secondary"
			aria-label="Scroll to latest message"
		>
			<span class="flex items-center gap-2">
				<svg xmlns="http://www.w3.org/2000/svg" class="h-4 w-4" viewBox="0 0 20 20" fill="currentColor">
					<path fill-rule="evenodd" d="M10 18a1 1 0 01-.707-.293l-5-5a1 1 0 011.414-1.414L10 15.586l4.293-4.293a1 1 0 011.414 1.414l-5 5A1 1 0 0110 18z" clip-rule="evenodd" />
				</svg>
				Jump to latest
			</span>
		</button>
	{/if}
</div>
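The `use:measureItem` directive above relies on Svelte's action contract: a function that receives the mounted DOM node plus an optional parameter and may return a `destroy()` cleanup hook. A stripped-down sketch of the same pattern (hypothetical `trackHeight` action, for illustration only):

```typescript
import type { Action } from 'svelte/action';

// Logs an element's height whenever it resizes; disconnects on unmount.
const trackHeight: Action<HTMLElement, string> = (node, label) => {
	const observer = new ResizeObserver(() => {
		console.log(`${label}: ${node.getBoundingClientRect().height}px`);
	});
	observer.observe(node);
	return {
		destroy() {
			observer.disconnect();
		}
	};
};
```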
@@ -8,13 +8,9 @@
import SidenavHeader from './SidenavHeader.svelte';
import SidenavSearch from './SidenavSearch.svelte';
import ConversationList from './ConversationList.svelte';
import { SettingsModal } from '$lib/components/shared';

// Check if a path is active
const isActive = (path: string) => $page.url.pathname === path;

// Settings modal state
let settingsOpen = $state(false);
</script>

<!-- Overlay for mobile (closes sidenav when clicking outside) -->
@@ -137,11 +133,10 @@
<span>Prompts</span>
</a>

<!-- Settings button -->
<button
	type="button"
	onclick={() => (settingsOpen = true)}
	class="flex w-full items-center gap-3 rounded-lg px-3 py-2 text-sm text-theme-muted transition-colors hover:bg-theme-hover hover:text-theme-primary"
<!-- Settings link -->
<a
	href="/settings"
	class="flex w-full items-center gap-3 rounded-lg px-3 py-2 text-sm transition-colors {isActive('/settings') ? 'bg-gray-500/20 text-gray-600 dark:bg-gray-700/30 dark:text-gray-300' : 'text-theme-muted hover:bg-theme-hover hover:text-theme-primary'}"
>
	<svg
		xmlns="http://www.w3.org/2000/svg"
@@ -159,10 +154,7 @@
	<path stroke-linecap="round" stroke-linejoin="round" d="M15 12a3 3 0 1 1-6 0 3 3 0 0 1 6 0Z" />
	</svg>
	<span>Settings</span>
</button>
</a>
</div>
</div>
</aside>

<!-- Settings Modal -->
<SettingsModal isOpen={settingsOpen} onClose={() => (settingsOpen = false)} />
@@ -12,6 +12,28 @@

let { model, onSelect }: Props = $props();

/**
 * Format a date as relative time (e.g., "2d ago", "3w ago")
 */
function formatRelativeTime(date: string | Date | undefined): string {
	if (!date) return '';
	const now = Date.now();
	const then = new Date(date).getTime();
	const diff = now - then;

	const minutes = Math.floor(diff / 60000);
	const hours = Math.floor(diff / 3600000);
	const days = Math.floor(diff / 86400000);
	const weeks = Math.floor(days / 7);
	const months = Math.floor(days / 30);

	if (minutes < 60) return `${minutes}m ago`;
	if (hours < 24) return `${hours}h ago`;
	if (days < 7) return `${days}d ago`;
	if (weeks < 4) return `${weeks}w ago`;
	return `${months}mo ago`;
}
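For reference, a couple of sample inputs and the strings the function above produces:

```typescript
// 3 days back: minutes/hours thresholds fail, days < 7 matches
formatRelativeTime(new Date(Date.now() - 3 * 86400000));  // "3d ago"

// 70 days back: weeks = 10 fails the < 4 check, so months (70 / 30 = 2) wins
formatRelativeTime(new Date(Date.now() - 70 * 86400000)); // "2mo ago"
```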
// Capability badges config (matches ollama.com capabilities)
const capabilityBadges: Record<string, { icon: string; color: string; label: string }> = {
	vision: { icon: '👁', color: 'bg-purple-900/50 text-purple-300', label: 'Vision' },
@@ -92,6 +114,16 @@
	<span>{formatContextLength(model.contextLength)}</span>
</div>
{/if}

<!-- Last Updated -->
{#if model.ollamaUpdatedAt}
	<div class="flex items-center gap-1" title="Last updated on Ollama: {new Date(model.ollamaUpdatedAt).toLocaleDateString()}">
		<svg xmlns="http://www.w3.org/2000/svg" class="h-3.5 w-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
			<path stroke-linecap="round" stroke-linejoin="round" d="M12 8v4l3 3m6-3a9 9 0 11-18 0 9 9 0 0118 0z" />
		</svg>
		<span>{formatRelativeTime(model.ollamaUpdatedAt)}</span>
	</div>
{/if}
</div>

<!-- Size Tags -->
309 frontend/src/lib/components/models/ModelEditorDialog.svelte (new file)
@@ -0,0 +1,309 @@
<script lang="ts">
	/**
	 * ModelEditorDialog - Dialog for creating/editing custom Ollama models
	 * Supports two modes: create (new model) and edit (update system prompt)
	 */

	import { modelsState, promptsState } from '$lib/stores';
	import { modelCreationState, type ModelEditorMode } from '$lib/stores/model-creation.svelte.js';
	import { modelInfoService } from '$lib/services/model-info-service.js';

	interface Props {
		/** Whether the dialog is open */
		isOpen: boolean;
		/** Mode: create or edit */
		mode: ModelEditorMode;
		/** For edit mode: the model being edited */
		editingModel?: string;
		/** For edit mode: the current system prompt */
		currentSystemPrompt?: string;
		/** For edit mode: the base model (parent) */
		baseModel?: string;
		/** Callback when dialog is closed */
		onClose: () => void;
	}

	let { isOpen, mode, editingModel, currentSystemPrompt, baseModel, onClose }: Props = $props();

	// Form state
	let modelName = $state('');
	let selectedBaseModel = $state('');
	let systemPrompt = $state('');
	let usePromptLibrary = $state(false);
	let selectedPromptId = $state<string | null>(null);

	// Initialize form when opening
	$effect(() => {
		if (isOpen) {
			if (mode === 'edit' && editingModel) {
				modelName = editingModel;
				selectedBaseModel = baseModel || '';
				systemPrompt = currentSystemPrompt || '';
			} else {
				modelName = '';
				selectedBaseModel = modelsState.chatModels[0]?.name || '';
				systemPrompt = '';
			}
			usePromptLibrary = false;
			selectedPromptId = null;
			modelCreationState.reset();
		}
	});

	// Get system prompt content (either from textarea or prompt library)
	const effectiveSystemPrompt = $derived(
		usePromptLibrary && selectedPromptId
			? promptsState.get(selectedPromptId)?.content || ''
			: systemPrompt
	);

	// Validation
	const isValid = $derived(
		modelName.trim().length > 0 &&
		(mode === 'edit' || selectedBaseModel.length > 0) &&
		effectiveSystemPrompt.trim().length > 0
	);

	async function handleSubmit(event: Event): Promise<void> {
		event.preventDefault();
		if (!isValid || modelCreationState.isCreating) return;

		const base = mode === 'edit' ? (baseModel || editingModel || '') : selectedBaseModel;
		const success = mode === 'edit'
			? await modelCreationState.update(modelName, base, effectiveSystemPrompt)
			: await modelCreationState.create(modelName, base, effectiveSystemPrompt);

		if (success) {
			// Close after short delay to show success status
			setTimeout(() => {
				onClose();
			}, 500);
		}
	}

	function handleBackdropClick(event: MouseEvent): void {
		if (event.target === event.currentTarget && !modelCreationState.isCreating) {
			onClose();
		}
	}

	function handleKeydown(event: KeyboardEvent): void {
		if (event.key === 'Escape' && !modelCreationState.isCreating) {
			onClose();
		}
	}

	function handleCancel(): void {
		if (modelCreationState.isCreating) {
			modelCreationState.cancel();
		} else {
			onClose();
		}
	}
</script>

<svelte:window onkeydown={handleKeydown} />

{#if isOpen}
	<!-- Backdrop -->
	<div
		class="fixed inset-0 z-50 flex items-center justify-center bg-black/60 backdrop-blur-sm p-4"
		onclick={handleBackdropClick}
		role="dialog"
		aria-modal="true"
		aria-labelledby="model-editor-title"
	>
		<!-- Dialog -->
		<div class="w-full max-w-lg rounded-xl bg-theme-secondary shadow-xl">
			<div class="border-b border-theme px-6 py-4">
				<h2 id="model-editor-title" class="text-lg font-semibold text-theme-primary">
					{mode === 'edit' ? 'Edit Model System Prompt' : 'Create Custom Model'}
				</h2>
				{#if mode === 'edit'}
					<p class="mt-1 text-xs text-theme-muted">
						This will re-create the model with the new system prompt
					</p>
				{/if}
			</div>

			{#if modelCreationState.isCreating}
				<!-- Progress view -->
				<div class="p-6">
					<div class="flex flex-col items-center justify-center py-8">
						<div class="h-10 w-10 animate-spin rounded-full border-3 border-theme-subtle border-t-violet-500 mb-4"></div>
						<p class="text-sm text-theme-secondary mb-2">
							{mode === 'edit' ? 'Updating model...' : 'Creating model...'}
						</p>
						<p class="text-xs text-theme-muted text-center max-w-xs">
							{modelCreationState.status}
						</p>
					</div>

					<div class="flex justify-center">
						<button
							type="button"
							onclick={handleCancel}
							class="rounded-lg px-4 py-2 text-sm text-red-400 hover:bg-red-900/20"
						>
							Cancel
						</button>
					</div>
				</div>
			{:else if modelCreationState.error}
				<!-- Error view -->
				<div class="p-6">
					<div class="rounded-lg bg-red-900/20 border border-red-500/30 p-4 mb-4">
						<p class="text-sm text-red-400">{modelCreationState.error}</p>
					</div>
					<div class="flex justify-end gap-3">
						<button
							type="button"
							onclick={() => modelCreationState.reset()}
							class="rounded-lg px-4 py-2 text-sm text-theme-secondary hover:bg-theme-tertiary"
						>
							Try Again
						</button>
						<button
							type="button"
							onclick={onClose}
							class="rounded-lg bg-theme-tertiary px-4 py-2 text-sm text-theme-secondary hover:bg-theme-hover"
						>
							Close
						</button>
					</div>
				</div>
			{:else}
				<!-- Form view -->
				<form onsubmit={handleSubmit} class="p-6">
					<div class="space-y-4">
						{#if mode === 'create'}
							<!-- Base model selection -->
							<div>
								<label for="base-model" class="mb-1 block text-sm font-medium text-theme-secondary">
									Base Model <span class="text-red-400">*</span>
								</label>
								<select
									id="base-model"
									bind:value={selectedBaseModel}
									class="w-full rounded-lg border border-theme-subtle bg-theme-tertiary px-3 py-2 text-theme-primary focus:border-violet-500 focus:outline-none focus:ring-1 focus:ring-violet-500"
								>
									{#each modelsState.chatModels as model (model.name)}
										<option value={model.name}>{model.name}</option>
									{/each}
								</select>
								<p class="mt-1 text-xs text-theme-muted">
									The model to derive from
								</p>
							</div>
						{/if}

						<!-- Model name -->
						<div>
							<label for="model-name" class="mb-1 block text-sm font-medium text-theme-secondary">
								Model Name <span class="text-red-400">*</span>
							</label>
							<input
								id="model-name"
								type="text"
								bind:value={modelName}
								placeholder="e.g., my-coding-assistant"
								disabled={mode === 'edit'}
								class="w-full rounded-lg border border-theme-subtle bg-theme-tertiary px-3 py-2 text-theme-primary placeholder-theme-muted focus:border-violet-500 focus:outline-none focus:ring-1 focus:ring-violet-500 disabled:opacity-60"
								autocomplete="off"
								autocorrect="off"
								autocapitalize="off"
								spellcheck="false"
							/>
							{#if mode === 'create'}
								<p class="mt-1 text-xs text-theme-muted">
									Use lowercase letters, numbers, and hyphens
								</p>
							{/if}
						</div>

						<!-- System prompt source toggle -->
						<div class="flex items-center gap-4">
							<button
								type="button"
								onclick={() => usePromptLibrary = false}
								class="text-sm {!usePromptLibrary ? 'text-violet-400 font-medium' : 'text-theme-muted hover:text-theme-secondary'}"
							>
								Write prompt
							</button>
							<span class="text-theme-muted">|</span>
							<button
								type="button"
								onclick={() => usePromptLibrary = true}
								class="text-sm {usePromptLibrary ? 'text-violet-400 font-medium' : 'text-theme-muted hover:text-theme-secondary'}"
							>
								Use from library
							</button>
						</div>

						{#if usePromptLibrary}
							<!-- Prompt library selector -->
							<div>
								<label for="prompt-library" class="mb-1 block text-sm font-medium text-theme-secondary">
									Select Prompt <span class="text-red-400">*</span>
								</label>
								<select
									id="prompt-library"
									bind:value={selectedPromptId}
									class="w-full rounded-lg border border-theme-subtle bg-theme-tertiary px-3 py-2 text-theme-primary focus:border-violet-500 focus:outline-none focus:ring-1 focus:ring-violet-500"
								>
									<option value={null}>-- Select a prompt --</option>
									{#each promptsState.prompts as prompt (prompt.id)}
										<option value={prompt.id}>{prompt.name}</option>
									{/each}
								</select>
								{#if selectedPromptId}
									{@const selectedPrompt = promptsState.get(selectedPromptId)}
									{#if selectedPrompt}
										<div class="mt-2 rounded-lg bg-theme-tertiary p-3 text-xs text-theme-muted max-h-32 overflow-y-auto">
											{selectedPrompt.content}
										</div>
									{/if}
								{/if}
							</div>
						{:else}
							<!-- System prompt textarea -->
							<div>
								<label for="system-prompt" class="mb-1 block text-sm font-medium text-theme-secondary">
									System Prompt <span class="text-red-400">*</span>
								</label>
								<textarea
									id="system-prompt"
									bind:value={systemPrompt}
									placeholder="You are a helpful assistant that..."
									rows="6"
									class="w-full resize-none rounded-lg border border-theme-subtle bg-theme-tertiary px-3 py-2 font-mono text-sm text-theme-primary placeholder-theme-muted focus:border-violet-500 focus:outline-none focus:ring-1 focus:ring-violet-500"
								></textarea>
								<p class="mt-1 text-xs text-theme-muted">
									{systemPrompt.length} characters
								</p>
							</div>
						{/if}
					</div>

					<!-- Actions -->
					<div class="mt-6 flex justify-end gap-3">
						<button
							type="button"
							onclick={handleCancel}
							class="rounded-lg px-4 py-2 text-sm text-theme-secondary hover:bg-theme-tertiary"
						>
							Cancel
						</button>
						<button
							type="submit"
							disabled={!isValid}
							class="rounded-lg bg-violet-600 px-4 py-2 text-sm font-medium text-white hover:bg-violet-500 disabled:cursor-not-allowed disabled:opacity-50"
						>
							{mode === 'edit' ? 'Update Model' : 'Create Model'}
						</button>
					</div>
				</form>
			{/if}
		</div>
	</div>
{/if}
@@ -1,203 +0,0 @@
<script lang="ts">
	/**
	 * SettingsModal - Application settings dialog
	 * Handles theme, model defaults, and other preferences
	 */

	import { modelsState, uiState } from '$lib/stores';
	import { getPrimaryModifierDisplay } from '$lib/utils';

	interface Props {
		isOpen: boolean;
		onClose: () => void;
	}

	const { isOpen, onClose }: Props = $props();

	// Settings state (mirrors global state for editing)
	let defaultModel = $state<string | null>(null);

	// Sync with global state when modal opens
	$effect(() => {
		if (isOpen) {
			defaultModel = modelsState.selectedId;
		}
	});

	/**
	 * Save settings and close modal
	 */
	function handleSave(): void {
		if (defaultModel) {
			modelsState.select(defaultModel);
		}
		onClose();
	}

	/**
	 * Handle backdrop click
	 */
	function handleBackdropClick(event: MouseEvent): void {
		if (event.target === event.currentTarget) {
			onClose();
		}
	}

	/**
	 * Handle escape key
	 */
	function handleKeydown(event: KeyboardEvent): void {
		if (event.key === 'Escape') {
			onClose();
		}
	}

	const modifierKey = getPrimaryModifierDisplay();
</script>

{#if isOpen}
	<!-- Backdrop -->
	<!-- svelte-ignore a11y_no_static_element_interactions -->
	<div
		class="fixed inset-0 z-50 flex items-center justify-center bg-black/60 backdrop-blur-sm"
		onclick={handleBackdropClick}
		onkeydown={handleKeydown}
	>
		<!-- Modal -->
		<div
			class="w-full max-w-lg rounded-xl bg-theme-secondary shadow-2xl"
			role="dialog"
			aria-modal="true"
			aria-labelledby="settings-title"
		>
			<!-- Header -->
			<div class="flex items-center justify-between border-b border-theme px-6 py-4">
				<h2 id="settings-title" class="text-lg font-semibold text-theme-primary">Settings</h2>
				<button
					type="button"
					onclick={onClose}
					class="rounded-lg p-1.5 text-theme-muted hover:bg-theme-tertiary hover:text-theme-secondary"
					aria-label="Close settings"
				>
					<svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5" viewBox="0 0 20 20" fill="currentColor">
						<path fill-rule="evenodd" d="M4.293 4.293a1 1 0 011.414 0L10 8.586l4.293-4.293a1 1 0 111.414 1.414L11.414 10l4.293 4.293a1 1 0 01-1.414 1.414L10 11.414l-4.293 4.293a1 1 0 01-1.414-1.414L8.586 10 4.293 5.707a1 1 0 010-1.414z" clip-rule="evenodd" />
					</svg>
				</button>
			</div>

			<!-- Content -->
			<div class="space-y-6 p-6">
				<!-- Appearance Section -->
				<section>
					<h3 class="mb-3 text-sm font-medium uppercase tracking-wide text-theme-muted">Appearance</h3>
					<div class="space-y-4">
						<div class="flex items-center justify-between">
							<div>
								<p class="text-sm font-medium text-theme-secondary">Dark Mode</p>
								<p class="text-xs text-theme-muted">Toggle between light and dark theme</p>
							</div>
							<button
								type="button"
								onclick={() => uiState.toggleDarkMode()}
								class="relative inline-flex h-6 w-11 flex-shrink-0 cursor-pointer rounded-full border-2 border-transparent transition-colors duration-200 ease-in-out focus:outline-none focus:ring-2 focus:ring-emerald-500 focus:ring-offset-2 focus:ring-offset-theme {uiState.darkMode ? 'bg-emerald-600' : 'bg-theme-tertiary'}"
								role="switch"
								aria-checked={uiState.darkMode}
							>
								<span
									class="pointer-events-none inline-block h-5 w-5 transform rounded-full bg-white shadow ring-0 transition duration-200 ease-in-out {uiState.darkMode ? 'translate-x-5' : 'translate-x-0'}"
								></span>
							</button>
						</div>
						<div class="flex items-center justify-between">
							<div>
								<p class="text-sm font-medium text-theme-secondary">Use System Theme</p>
								<p class="text-xs text-theme-muted">Match your OS light/dark preference</p>
							</div>
							<button
								type="button"
								onclick={() => uiState.useSystemTheme()}
								class="rounded-lg bg-theme-tertiary px-3 py-1.5 text-xs font-medium text-theme-secondary transition-colors hover:bg-theme-tertiary"
							>
								Sync with System
							</button>
						</div>
					</div>
				</section>

				<!-- Model Section -->
				<section>
					<h3 class="mb-3 text-sm font-medium uppercase tracking-wide text-theme-muted">Default Model</h3>
					<div class="space-y-4">
						<div>
							<select
								bind:value={defaultModel}
								class="w-full rounded-lg border border-theme-subtle bg-theme-tertiary px-3 py-2 text-theme-secondary focus:border-emerald-500 focus:outline-none focus:ring-1 focus:ring-emerald-500"
							>
								{#each modelsState.chatModels as model}
									<option value={model.name}>{model.name}</option>
								{/each}
							</select>
							<p class="mt-1 text-sm text-theme-muted">Model used for new conversations</p>
						</div>
					</div>
				</section>

				<!-- Keyboard Shortcuts Section -->
				<section>
					<h3 class="mb-3 text-sm font-medium uppercase tracking-wide text-theme-muted">Keyboard Shortcuts</h3>
					<div class="space-y-2 text-sm">
						<div class="flex justify-between text-theme-secondary">
							<span>New Chat</span>
							<kbd class="rounded bg-theme-tertiary px-2 py-0.5 font-mono text-theme-muted">{modifierKey}+N</kbd>
						</div>
						<div class="flex justify-between text-theme-secondary">
							<span>Search</span>
							<kbd class="rounded bg-theme-tertiary px-2 py-0.5 font-mono text-theme-muted">{modifierKey}+K</kbd>
						</div>
						<div class="flex justify-between text-theme-secondary">
							<span>Toggle Sidebar</span>
							<kbd class="rounded bg-theme-tertiary px-2 py-0.5 font-mono text-theme-muted">{modifierKey}+B</kbd>
						</div>
						<div class="flex justify-between text-theme-secondary">
							<span>Send Message</span>
							<kbd class="rounded bg-theme-tertiary px-2 py-0.5 font-mono text-theme-muted">Enter</kbd>
						</div>
						<div class="flex justify-between text-theme-secondary">
							<span>New Line</span>
							<kbd class="rounded bg-theme-tertiary px-2 py-0.5 font-mono text-theme-muted">Shift+Enter</kbd>
						</div>
					</div>
				</section>

				<!-- About Section -->
				<section>
					<h3 class="mb-3 text-sm font-medium uppercase tracking-wide text-theme-muted">About</h3>
					<div class="rounded-lg bg-theme-tertiary/50 p-4">
						<p class="font-medium text-theme-secondary">Vessel</p>
						<p class="mt-1 text-sm text-theme-muted">
							A modern interface for local AI with chat, tools, and memory management.
						</p>
					</div>
				</section>
			</div>

			<!-- Footer -->
			<div class="flex justify-end gap-3 border-t border-theme px-6 py-4">
				<button
					type="button"
					onclick={onClose}
					class="rounded-lg px-4 py-2 text-sm font-medium text-theme-secondary hover:bg-theme-tertiary"
				>
					Cancel
				</button>
				<button
					type="button"
					onclick={handleSave}
					class="rounded-lg bg-emerald-600 px-4 py-2 text-sm font-medium text-white hover:bg-emerald-500"
				>
					Save Changes
				</button>
			</div>
		</div>
	</div>
{/if}
@@ -10,6 +10,5 @@ export { default as ToastContainer } from './ToastContainer.svelte';
|
||||
export { default as Skeleton } from './Skeleton.svelte';
|
||||
export { default as MessageSkeleton } from './MessageSkeleton.svelte';
|
||||
export { default as ErrorBoundary } from './ErrorBoundary.svelte';
|
||||
export { default as SettingsModal } from './SettingsModal.svelte';
|
||||
export { default as ShortcutsModal } from './ShortcutsModal.svelte';
|
||||
export { default as SearchModal } from './SearchModal.svelte';
|
||||
|
||||
@@ -9,6 +9,7 @@ import type { MessageNode } from '$lib/types/chat.js';
import type { ContextUsage, TokenEstimate, MessageWithTokens } from './types.js';
import { estimateMessageTokens, estimateFormatOverhead, formatTokenCount } from './tokenizer.js';
import { getModelContextLimit, formatContextSize } from './model-limits.js';
+import { settingsState } from '$lib/stores/settings.svelte.js';

/** Warning threshold as percentage of context (0.85 = 85%) */
const WARNING_THRESHOLD = 0.85;
@@ -24,8 +25,14 @@ class ContextManager {
	/** Current model name */
	currentModel = $state<string>('');

-	/** Maximum context length for current model */
-	maxTokens = $state<number>(4096);
+	/** Maximum context length for current model (from model lookup) */
+	modelMaxTokens = $state<number>(4096);
+
+	/** Custom context limit override (from user settings) */
+	customMaxTokens = $state<number | null>(null);
+
+	/** Effective max tokens (custom override or model default) */
+	maxTokens = $derived(this.customMaxTokens ?? this.modelMaxTokens);

	/**
	 * Cached token estimates for messages (id -> estimate)
@@ -94,7 +101,15 @@ class ContextManager {
	 */
	setModel(modelName: string): void {
		this.currentModel = modelName;
-		this.maxTokens = getModelContextLimit(modelName);
+		this.modelMaxTokens = getModelContextLimit(modelName);
	}

+	/**
+	 * Set custom context limit override
+	 * Pass null to clear and use model default
+	 */
+	setCustomContextLimit(tokens: number | null): void {
+		this.customMaxTokens = tokens;
+	}
+
	/**
@@ -238,6 +253,43 @@ class ContextManager {
		this.tokenCache.clear();
		this.messagesWithTokens = [];
	}

+	/**
+	 * Check if auto-compact should be triggered
+	 * Returns true if:
+	 * - Auto-compact is enabled in settings
+	 * - Context usage exceeds the configured threshold
+	 * - There are enough messages to summarize
+	 */
+	shouldAutoCompact(): boolean {
+		// Check if auto-compact is enabled
+		if (!settingsState.autoCompactEnabled) {
+			return false;
+		}
+
+		// Check context usage against threshold
+		const threshold = settingsState.autoCompactThreshold;
+		if (this.contextUsage.percentage < threshold) {
+			return false;
+		}
+
+		// Check if there are enough messages to summarize
+		// Need at least preserveCount + 2 messages to have anything to summarize
+		const preserveCount = settingsState.autoCompactPreserveCount;
+		const minMessages = preserveCount + 2;
+		if (this.messagesWithTokens.length < minMessages) {
+			return false;
+		}
+
+		return true;
+	}
+
+	/**
+	 * Get the number of recent messages to preserve during auto-compact
+	 */
+	getAutoCompactPreserveCount(): number {
+		return settingsState.autoCompactPreserveCount;
+	}
}

/** Singleton context manager instance */
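A minimal usage sketch of the new override and auto-compact hooks. The singleton's export name is cut off in this hunk, so `contextManager` below is an assumed name:

// Assumed: the singleton above is exported as `contextManager`.
contextManager.setModel('llama3.2:8b');     // modelMaxTokens <- lookup table
contextManager.setCustomContextLimit(8192); // maxTokens ($derived) is now 8192
contextManager.setCustomContextLimit(null); // back to the model default

if (contextManager.shouldAutoCompact()) {
	// Summarize older messages, keeping the configured number of recent ones.
	const keep = contextManager.getAutoCompactPreserveCount();
}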
@@ -79,18 +79,22 @@ export async function generateSummary(
/**
 * Determine which messages should be summarized
 * Returns indices of messages to summarize (older messages) and messages to keep
+ * @param messages - All messages in the conversation
+ * @param targetFreeTokens - Not currently used (preserved for API compatibility)
+ * @param preserveCount - Number of recent messages to keep (defaults to PRESERVE_RECENT_MESSAGES)
 */
export function selectMessagesForSummarization(
	messages: MessageNode[],
-	targetFreeTokens: number
+	targetFreeTokens: number,
+	preserveCount: number = PRESERVE_RECENT_MESSAGES
): { toSummarize: MessageNode[]; toKeep: MessageNode[] } {
-	if (messages.length <= PRESERVE_RECENT_MESSAGES) {
+	if (messages.length <= preserveCount) {
		return { toSummarize: [], toKeep: messages };
	}

	// Calculate how many messages to summarize
	// Keep the recent ones, summarize the rest
-	const cutoffIndex = Math.max(0, messages.length - PRESERVE_RECENT_MESSAGES);
+	const cutoffIndex = Math.max(0, messages.length - preserveCount);

	// Filter out system messages from summarization (they should stay)
	const toSummarize: MessageNode[] = [];
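For example, with the new parameter (a sketch; `messages` is an ordered MessageNode[]):

const { toSummarize, toKeep } = selectMessagesForSummarization(messages, 0, 6);
// 10 messages, preserveCount 6 -> cutoffIndex = 4: the first four non-system
// messages become summarization candidates; the most recent six are kept verbatim.
// 6 or fewer messages -> toSummarize is [] and toKeep is the full array.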
@@ -20,6 +20,8 @@ import type {
	OllamaGenerateRequest,
	OllamaGenerateResponse,
	OllamaPullProgress,
+	OllamaCreateRequest,
+	OllamaCreateProgress,
	JsonSchema
} from './types.js';
import {
@@ -214,6 +216,75 @@ export class OllamaClient {
		}
	}

+	/**
+	 * Creates a custom model with an embedded system prompt
+	 * POST /api/create with streaming progress
+	 * @param request Create request with model name, base model, and system prompt
+	 * @param onProgress Callback for progress updates
+	 * @param signal Optional abort signal
+	 */
+	async createModel(
+		request: OllamaCreateRequest,
+		onProgress: (progress: OllamaCreateProgress) => void,
+		signal?: AbortSignal
+	): Promise<void> {
+		const url = `${this.config.baseUrl}/api/create`;
+
+		const response = await this.fetchFn(url, {
+			method: 'POST',
+			headers: { 'Content-Type': 'application/json' },
+			body: JSON.stringify({ ...request, stream: true }),
+			signal
+		});
+
+		if (!response.ok) {
+			throw await createErrorFromResponse(response, '/api/create');
+		}
+
+		if (!response.body) {
+			throw new Error('No response body for create stream');
+		}
+
+		const reader = response.body.getReader();
+		const decoder = new TextDecoder();
+		let buffer = '';
+
+		try {
+			while (true) {
+				const { done, value } = await reader.read();
+
+				if (done) break;
+
+				buffer += decoder.decode(value, { stream: true });
+
+				// Process complete lines (the stream is newline-delimited JSON)
+				let newlineIndex: number;
+				while ((newlineIndex = buffer.indexOf('\n')) !== -1) {
+					const line = buffer.slice(0, newlineIndex).trim();
+					buffer = buffer.slice(newlineIndex + 1);
+
+					if (!line) continue;
+
+					let progress: OllamaCreateProgress;
+					try {
+						progress = JSON.parse(line) as OllamaCreateProgress;
+					} catch {
+						// Malformed chunk: warn and keep reading instead of killing the stream
+						console.warn('[Ollama] Failed to parse create progress:', line);
+						continue;
+					}
+					// Check for error in response
+					if ('error' in progress) {
+						throw new Error((progress as { error: string }).error);
+					}
+					onProgress(progress);
+				}
+			}
+		} finally {
+			reader.releaseLock();
+		}
+	}
+
	// ==========================================================================
	// Chat Completion
	// ==========================================================================
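A hedged usage sketch of the new endpoint wrapper, using the `ollamaClient` singleton imported elsewhere in this changeset (model names illustrative):

const controller = new AbortController();
await ollamaClient.createModel(
	{ model: 'my-assistant', from: 'llama3.2:8b', system: 'You are terse.' },
	(progress) => console.log('[create]', progress.status),
	controller.signal
);
// Statuses arrive one NDJSON line at a time, e.g. "creating new layer",
// "writing manifest", and finally "success"; controller.abort() cancels.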
124
frontend/src/lib/ollama/modelfile-parser.ts
Normal file
@@ -0,0 +1,124 @@
/**
 * Parser for Ollama Modelfile format
 * Extracts system prompts and other directives from modelfile strings
 *
 * Modelfile format reference: https://github.com/ollama/ollama/blob/main/docs/modelfile.md
 */

/**
 * Parse the SYSTEM directive from an Ollama modelfile string.
 *
 * Handles multiple formats:
 * - Multi-line with triple quotes: SYSTEM """...""" or SYSTEM '''...'''
 * - Single-line with quotes: SYSTEM "..." or SYSTEM '...'
 * - Unquoted single-line: SYSTEM Your prompt here
 *
 * @param modelfile - Raw modelfile string from Ollama /api/show
 * @returns Extracted system prompt or null if none found
 */
export function parseSystemPromptFromModelfile(modelfile: string): string | null {
	if (!modelfile) {
		return null;
	}

	// Pattern 1: Multi-line with triple double quotes
	// SYSTEM """
	// Your multi-line prompt
	// """
	const tripleDoubleQuoteMatch = modelfile.match(/SYSTEM\s+"""([\s\S]*?)"""/i);
	if (tripleDoubleQuoteMatch) {
		return tripleDoubleQuoteMatch[1].trim();
	}

	// Pattern 2: Multi-line with triple single quotes
	// SYSTEM '''
	// Your multi-line prompt
	// '''
	const tripleSingleQuoteMatch = modelfile.match(/SYSTEM\s+'''([\s\S]*?)'''/i);
	if (tripleSingleQuoteMatch) {
		return tripleSingleQuoteMatch[1].trim();
	}

	// Pattern 3: Single-line with double quotes
	// SYSTEM "Your prompt here"
	const doubleQuoteMatch = modelfile.match(/SYSTEM\s+"([^"]+)"/i);
	if (doubleQuoteMatch) {
		return doubleQuoteMatch[1].trim();
	}

	// Pattern 4: Single-line with single quotes
	// SYSTEM 'Your prompt here'
	const singleQuoteMatch = modelfile.match(/SYSTEM\s+'([^']+)'/i);
	if (singleQuoteMatch) {
		return singleQuoteMatch[1].trim();
	}

	// Pattern 5: Unquoted single-line (less common, stops at newline)
	// SYSTEM Your prompt here
	const unquotedMatch = modelfile.match(/^SYSTEM\s+([^\n"']+)$/im);
	if (unquotedMatch) {
		return unquotedMatch[1].trim();
	}

	return null;
}

/**
 * Parse the TEMPLATE directive from a modelfile.
 * Templates define how messages are formatted for the model.
 *
 * @param modelfile - Raw modelfile string
 * @returns Template string or null if none found
 */
export function parseTemplateFromModelfile(modelfile: string): string | null {
	if (!modelfile) {
		return null;
	}

	// Multi-line template with triple quotes
	const tripleQuoteMatch = modelfile.match(/TEMPLATE\s+"""([\s\S]*?)"""/i);
	if (tripleQuoteMatch) {
		return tripleQuoteMatch[1];
	}

	// Single-line template
	const singleLineMatch = modelfile.match(/TEMPLATE\s+"([^"]+)"/i);
	if (singleLineMatch) {
		return singleLineMatch[1];
	}

	return null;
}

/**
 * Parse PARAMETER directives from a modelfile.
 * Returns a map of parameter names to values.
 *
 * @param modelfile - Raw modelfile string
 * @returns Object with parameter name-value pairs
 */
export function parseParametersFromModelfile(modelfile: string): Record<string, string> {
	if (!modelfile) {
		return {};
	}

	const params: Record<string, string> = {};

	// Use matchAll to find all PARAMETER lines
	const matches = modelfile.matchAll(/^PARAMETER\s+(\w+)\s+(.+)$/gim);
	for (const match of matches) {
		params[match[1].toLowerCase()] = match[2].trim();
	}

	return params;
}

/**
 * Check if a modelfile has a SYSTEM directive defined.
 *
 * @param modelfile - Raw modelfile string
 * @returns true if SYSTEM directive exists
 */
export function hasSystemPrompt(modelfile: string): boolean {
	return parseSystemPromptFromModelfile(modelfile) !== null;
}
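A quick sketch of the parser against a representative modelfile (contents illustrative):

const modelfile = `FROM llama3.2:8b
PARAMETER temperature 0.7
SYSTEM """
You are a concise assistant.
"""`;

parseSystemPromptFromModelfile(modelfile); // "You are a concise assistant."
parseParametersFromModelfile(modelfile);   // { temperature: "0.7" }
hasSystemPrompt(modelfile);                // true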
@@ -80,6 +80,28 @@ export interface OllamaDeleteRequest {
	name: string;
}

+// ============================================================================
+// Model Create Types
+// ============================================================================
+
+/** Request body for POST /api/create */
+export interface OllamaCreateRequest {
+	/** Name for the new model */
+	model: string;
+	/** Base model to derive from (e.g., "llama3.2:8b") */
+	from: string;
+	/** System prompt to embed in the model */
+	system?: string;
+	/** Whether to stream progress (default: true) */
+	stream?: boolean;
+}
+
+/** Progress chunk from POST /api/create streaming response */
+export interface OllamaCreateProgress {
+	/** Status message (e.g., "creating new layer", "writing manifest", "success") */
+	status: string;
+}
+
// ============================================================================
// Message Types
// ============================================================================
204
frontend/src/lib/services/model-info-service.ts
Normal file
@@ -0,0 +1,204 @@
/**
 * Model Info Service
 *
 * Fetches and caches model information from Ollama, including:
 * - Embedded system prompts (from Modelfile SYSTEM directive)
 * - Model capabilities (vision, code, thinking, tools, etc.)
 *
 * Uses IndexedDB for persistent caching with configurable TTL.
 */

import { ollamaClient } from '$lib/ollama/client.js';
import { parseSystemPromptFromModelfile } from '$lib/ollama/modelfile-parser.js';
import type { OllamaCapability } from '$lib/ollama/types.js';
import { db, type StoredModelSystemPrompt } from '$lib/storage/db.js';

/** Cache TTL in milliseconds (1 hour) */
const CACHE_TTL_MS = 60 * 60 * 1000;

/** Model info returned by the service */
export interface ModelInfo {
	modelName: string;
	systemPrompt: string | null;
	capabilities: OllamaCapability[];
	extractedAt: number;
}

/**
 * Service for fetching and caching model information.
 * Singleton pattern with in-flight request deduplication.
 */
class ModelInfoService {
	/** Track in-flight fetches to prevent duplicate requests */
	private fetchingModels = new Map<string, Promise<ModelInfo>>();

	/**
	 * Get model info, fetching from Ollama if not cached or expired.
	 *
	 * @param modelName - Ollama model name (e.g., "llama3.2:8b")
	 * @param forceRefresh - Skip cache and fetch fresh data
	 * @returns Model info including embedded system prompt and capabilities
	 */
	async getModelInfo(modelName: string, forceRefresh = false): Promise<ModelInfo> {
		// Check cache first (unless force refresh)
		if (!forceRefresh) {
			const cached = await this.getCached(modelName);
			if (cached && Date.now() - cached.extractedAt < CACHE_TTL_MS) {
				return {
					modelName: cached.modelName,
					systemPrompt: cached.systemPrompt,
					capabilities: cached.capabilities as OllamaCapability[],
					extractedAt: cached.extractedAt
				};
			}
		}

		// Check if already fetching this model (deduplication)
		const existingFetch = this.fetchingModels.get(modelName);
		if (existingFetch) {
			return existingFetch;
		}

		// Create new fetch promise
		const fetchPromise = this.fetchAndCache(modelName);
		this.fetchingModels.set(modelName, fetchPromise);

		try {
			return await fetchPromise;
		} finally {
			this.fetchingModels.delete(modelName);
		}
	}

	/**
	 * Fetch model info from Ollama and cache it.
	 */
	private async fetchAndCache(modelName: string): Promise<ModelInfo> {
		try {
			const response = await ollamaClient.showModel(modelName);
			const systemPrompt = parseSystemPromptFromModelfile(response.modelfile);
			const capabilities = (response.capabilities ?? []) as OllamaCapability[];
			const extractedAt = Date.now();

			const record: StoredModelSystemPrompt = {
				modelName,
				systemPrompt,
				capabilities,
				extractedAt
			};

			// Cache in IndexedDB
			await db.modelSystemPrompts.put(record);

			return {
				modelName,
				systemPrompt,
				capabilities,
				extractedAt
			};
		} catch (error) {
			console.error(`[ModelInfoService] Failed to fetch info for ${modelName}:`, error);

			// Return cached data if available (even if expired)
			const cached = await this.getCached(modelName);
			if (cached) {
				return {
					modelName: cached.modelName,
					systemPrompt: cached.systemPrompt,
					capabilities: cached.capabilities as OllamaCapability[],
					extractedAt: cached.extractedAt
				};
			}

			// Return empty info if no cache
			return {
				modelName,
				systemPrompt: null,
				capabilities: [],
				extractedAt: 0
			};
		}
	}

	/**
	 * Get cached model info from IndexedDB.
	 */
	private async getCached(modelName: string): Promise<StoredModelSystemPrompt | undefined> {
		try {
			return await db.modelSystemPrompts.get(modelName);
		} catch (error) {
			console.error(`[ModelInfoService] Cache read error for ${modelName}:`, error);
			return undefined;
		}
	}

	/**
	 * Check if a model has an embedded system prompt.
	 *
	 * @param modelName - Ollama model name
	 * @returns true if model has embedded system prompt
	 */
	async hasEmbeddedPrompt(modelName: string): Promise<boolean> {
		const info = await this.getModelInfo(modelName);
		return info.systemPrompt !== null;
	}

	/**
	 * Get the embedded system prompt for a model.
	 *
	 * @param modelName - Ollama model name
	 * @returns Embedded system prompt or null
	 */
	async getEmbeddedPrompt(modelName: string): Promise<string | null> {
		const info = await this.getModelInfo(modelName);
		return info.systemPrompt;
	}

	/**
	 * Get capabilities for a model.
	 *
	 * @param modelName - Ollama model name
	 * @returns Array of capability strings
	 */
	async getCapabilities(modelName: string): Promise<OllamaCapability[]> {
		const info = await this.getModelInfo(modelName);
		return info.capabilities;
	}

	/**
	 * Pre-fetch info for multiple models in parallel.
	 * Useful for warming the cache on app startup.
	 *
	 * @param modelNames - Array of model names to fetch
	 */
	async prefetchModels(modelNames: string[]): Promise<void> {
		await Promise.allSettled(modelNames.map((name) => this.getModelInfo(name)));
	}

	/**
	 * Clear cached info for a model.
	 *
	 * @param modelName - Ollama model name
	 */
	async clearCache(modelName: string): Promise<void> {
		try {
			await db.modelSystemPrompts.delete(modelName);
		} catch (error) {
			console.error(`[ModelInfoService] Failed to clear cache for ${modelName}:`, error);
		}
	}

	/**
	 * Clear all cached model info.
	 */
	async clearAllCache(): Promise<void> {
		try {
			await db.modelSystemPrompts.clear();
		} catch (error) {
			console.error('[ModelInfoService] Failed to clear all cache:', error);
		}
	}
}

/** Singleton instance */
export const modelInfoService = new ModelInfoService();
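Usage sketch (model names illustrative): the first call hits Ollama's /api/show; repeat calls within the one-hour TTL are served from IndexedDB, and concurrent callers share one in-flight request.

const info = await modelInfoService.getModelInfo('llama3.2:8b');
if (info.systemPrompt) {
	console.log('Embedded prompt:', info.systemPrompt);
}

// Warm the cache on startup:
await modelInfoService.prefetchModels(['llama3.2:8b', 'qwen2.5-coder:7b']);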
195
frontend/src/lib/services/prompt-resolution.ts
Normal file
@@ -0,0 +1,195 @@
/**
 * Prompt Resolution Service
 *
 * Determines which system prompt to use for a chat based on priority:
 * 1. Per-conversation prompt (explicit user override)
 * 2. New chat prompt selection (before conversation exists)
 * 3. Model-prompt mapping (user configured default for model)
 * 4. Model-embedded prompt (from Ollama Modelfile SYSTEM directive)
 * 5. Capability-matched prompt (user prompt targeting model capabilities)
 * 6. Global active prompt
 * 7. No prompt
 */

import { promptsState, type Prompt } from '$lib/stores/prompts.svelte.js';
import { modelPromptMappingsState } from '$lib/stores/model-prompt-mappings.svelte.js';
import { modelInfoService } from '$lib/services/model-info-service.js';
import type { OllamaCapability } from '$lib/ollama/types.js';

/** Source of the resolved prompt */
export type PromptSource =
	| 'per-conversation'
	| 'new-chat-selection'
	| 'model-mapping'
	| 'model-embedded'
	| 'capability-match'
	| 'global-active'
	| 'none';

/** Result of prompt resolution */
export interface ResolvedPrompt {
	/** The system prompt content to use */
	content: string;
	/** Where this prompt came from */
	source: PromptSource;
	/** Name of the prompt (for display) */
	promptName?: string;
	/** Matched capability (if source is capability-match) */
	matchedCapability?: OllamaCapability;
}

/** Priority order for capability matching */
const CAPABILITY_PRIORITY: OllamaCapability[] = ['code', 'vision', 'thinking', 'tools'];

/**
 * Find a user prompt that targets specific capabilities.
 *
 * @param capabilities - Model capabilities to match against
 * @param prompts - Available user prompts
 * @returns Matched prompt and capability, or null
 */
function findCapabilityMatchedPrompt(
	capabilities: OllamaCapability[],
	prompts: Prompt[]
): { prompt: Prompt; capability: OllamaCapability } | null {
	for (const capability of CAPABILITY_PRIORITY) {
		if (!capabilities.includes(capability)) continue;

		// Find a prompt targeting this capability
		const match = prompts.find(
			(p) => (p as Prompt & { targetCapabilities?: string[] }).targetCapabilities?.includes(capability)
		);
		if (match) {
			return { prompt: match, capability };
		}
	}
	return null;
}

/**
 * Resolve which system prompt to use for a chat.
 *
 * Priority order:
 * 1. Per-conversation prompt (explicit user override)
 * 2. New chat prompt selection (before conversation exists)
 * 3. Model-prompt mapping (user configured default for model)
 * 4. Model-embedded prompt (from Ollama Modelfile)
 * 5. Capability-matched prompt
 * 6. Global active prompt
 * 7. No prompt
 *
 * @param modelName - Ollama model name (e.g., "llama3.2:8b")
 * @param conversationPromptId - Per-conversation prompt ID (if set)
 * @param newChatPromptId - New chat selection (before conversation created)
 * @returns Resolved prompt with content and source
 */
export async function resolveSystemPrompt(
	modelName: string,
	conversationPromptId?: string | null,
	newChatPromptId?: string | null
): Promise<ResolvedPrompt> {
	// Ensure stores are loaded
	await promptsState.ready();
	await modelPromptMappingsState.ready();

	// 1. Per-conversation prompt (highest priority)
	if (conversationPromptId) {
		const prompt = promptsState.get(conversationPromptId);
		if (prompt) {
			return {
				content: prompt.content,
				source: 'per-conversation',
				promptName: prompt.name
			};
		}
	}

	// 2. New chat prompt selection (before conversation exists)
	if (newChatPromptId) {
		const prompt = promptsState.get(newChatPromptId);
		if (prompt) {
			return {
				content: prompt.content,
				source: 'new-chat-selection',
				promptName: prompt.name
			};
		}
	}

	// 3. User-configured model-prompt mapping
	const mappedPromptId = modelPromptMappingsState.getMapping(modelName);
	if (mappedPromptId) {
		const prompt = promptsState.get(mappedPromptId);
		if (prompt) {
			return {
				content: prompt.content,
				source: 'model-mapping',
				promptName: prompt.name
			};
		}
	}

	// 4. Model-embedded prompt (from Ollama Modelfile SYSTEM directive)
	const modelInfo = await modelInfoService.getModelInfo(modelName);
	if (modelInfo.systemPrompt) {
		return {
			content: modelInfo.systemPrompt,
			source: 'model-embedded',
			promptName: `${modelName} (embedded)`
		};
	}

	// 5. Capability-matched prompt
	if (modelInfo.capabilities.length > 0) {
		const capabilityMatch = findCapabilityMatchedPrompt(modelInfo.capabilities, promptsState.prompts);
		if (capabilityMatch) {
			return {
				content: capabilityMatch.prompt.content,
				source: 'capability-match',
				promptName: capabilityMatch.prompt.name,
				matchedCapability: capabilityMatch.capability
			};
		}
	}

	// 6. Global active prompt
	const activePrompt = promptsState.activePrompt;
	if (activePrompt) {
		return {
			content: activePrompt.content,
			source: 'global-active',
			promptName: activePrompt.name
		};
	}

	// 7. No prompt
	return {
		content: '',
		source: 'none'
	};
}

/**
 * Get a human-readable description of a prompt source.
 *
 * @param source - Prompt source type
 * @returns Display string for the source
 */
export function getPromptSourceLabel(source: PromptSource): string {
	switch (source) {
		case 'per-conversation':
			return 'Custom (this chat)';
		case 'new-chat-selection':
			return 'Selected prompt';
		case 'model-mapping':
			return 'Model default';
		case 'model-embedded':
			return 'Model built-in';
		case 'capability-match':
			return 'Auto-matched';
		case 'global-active':
			return 'Global default';
		case 'none':
			return 'None';
	}
}
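A sketch of resolution at send time (the `conversation` object and message assembly here are assumptions for illustration):

const resolved = await resolveSystemPrompt(modelName, conversation?.systemPromptId, null);
if (resolved.source !== 'none') {
	chatMessages.unshift({ role: 'system', content: resolved.content });
}
console.log(getPromptSourceLabel(resolved.source)); // e.g. "Model built-in"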
@@ -137,6 +137,36 @@ export interface StoredPrompt {
	isDefault: boolean;
	createdAt: number;
	updatedAt: number;
+	/** Capabilities this prompt is optimized for (for auto-matching) */
+	targetCapabilities?: string[];
}

+/**
+ * Cached model info including embedded system prompt (from Ollama /api/show)
+ */
+export interface StoredModelSystemPrompt {
+	/** Model name (e.g., "llama3.2:8b") - Primary key */
+	modelName: string;
+	/** System prompt extracted from modelfile, null if none */
+	systemPrompt: string | null;
+	/** Model capabilities (vision, code, thinking, tools, etc.) */
+	capabilities: string[];
+	/** Timestamp when this info was fetched */
+	extractedAt: number;
+}
+
+/**
+ * User-configured model-to-prompt mapping
+ * Allows users to set default prompts for specific models
+ */
+export interface StoredModelPromptMapping {
+	id: string;
+	/** Ollama model name (e.g., "llama3.2:8b") */
+	modelName: string;
+	/** Reference to StoredPrompt.id */
+	promptId: string;
+	createdAt: number;
+	updatedAt: number;
+}
+
/**
@@ -151,6 +181,8 @@ class OllamaDatabase extends Dexie {
	documents!: Table<StoredDocument>;
	chunks!: Table<StoredChunk>;
	prompts!: Table<StoredPrompt>;
+	modelSystemPrompts!: Table<StoredModelSystemPrompt>;
+	modelPromptMappings!: Table<StoredModelPromptMapping>;

	constructor() {
		super('vessel');
@@ -203,6 +235,22 @@ class OllamaDatabase extends Dexie {
			chunks: 'id, documentId',
			prompts: 'id, name, isDefault, updatedAt'
		});
+
+		// Version 5: Model-specific system prompts
+		// Adds: cached model info (with embedded prompts) and user model-prompt mappings
+		this.version(5).stores({
+			conversations: 'id, updatedAt, isPinned, isArchived, systemPromptId',
+			messages: 'id, conversationId, parentId, createdAt',
+			attachments: 'id, messageId',
+			syncQueue: 'id, entityType, createdAt',
+			documents: 'id, name, createdAt, updatedAt',
+			chunks: 'id, documentId',
+			prompts: 'id, name, isDefault, updatedAt',
+			// Cached model info from Ollama /api/show (includes embedded system prompts)
+			modelSystemPrompts: 'modelName',
+			// User-configured model-to-prompt mappings
+			modelPromptMappings: 'id, modelName, promptId'
+		});
	}
}
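With the version 5 schema, both new tables support direct lookups via standard Dexie calls (model name illustrative):

// modelSystemPrompts uses modelName as its primary key:
const cached = await db.modelSystemPrompts.get('llama3.2:8b');

// modelPromptMappings has id as primary key plus a modelName index:
const mapping = await db.modelPromptMappings.where('modelName').equals('llama3.2:8b').first();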
125
frontend/src/lib/storage/model-prompt-mappings.ts
Normal file
@@ -0,0 +1,125 @@
/**
 * Storage operations for model-prompt mappings.
 *
 * Allows users to configure default system prompts for specific models.
 * When a model is used, its mapped prompt takes priority over the global default.
 */

import {
	db,
	generateId,
	withErrorHandling,
	type StorageResult,
	type StoredModelPromptMapping
} from './db.js';

// Re-export the type for consumers
export type { StoredModelPromptMapping };

/**
 * Get the prompt mapping for a specific model.
 *
 * @param modelName - Ollama model name (e.g., "llama3.2:8b")
 * @returns The mapping if found, null otherwise
 */
export async function getModelPromptMapping(
	modelName: string
): Promise<StorageResult<StoredModelPromptMapping | null>> {
	return withErrorHandling(async () => {
		const mapping = await db.modelPromptMappings.where('modelName').equals(modelName).first();
		return mapping ?? null;
	});
}

/**
 * Get all model-prompt mappings.
 *
 * @returns Array of all mappings
 */
export async function getAllModelPromptMappings(): Promise<
	StorageResult<StoredModelPromptMapping[]>
> {
	return withErrorHandling(async () => {
		return db.modelPromptMappings.toArray();
	});
}

/**
 * Set or update the prompt mapping for a model.
 * Pass null for promptId to remove the mapping.
 *
 * @param modelName - Ollama model name
 * @param promptId - Prompt ID to map to, or null to remove mapping
 */
export async function setModelPromptMapping(
	modelName: string,
	promptId: string | null
): Promise<StorageResult<void>> {
	return withErrorHandling(async () => {
		if (promptId === null) {
			// Remove mapping
			await db.modelPromptMappings.where('modelName').equals(modelName).delete();
		} else {
			// Upsert mapping
			const existing = await db.modelPromptMappings.where('modelName').equals(modelName).first();

			const now = Date.now();
			if (existing) {
				await db.modelPromptMappings.update(existing.id, {
					promptId,
					updatedAt: now
				});
			} else {
				await db.modelPromptMappings.add({
					id: generateId(),
					modelName,
					promptId,
					createdAt: now,
					updatedAt: now
				});
			}
		}
	});
}

/**
 * Remove the prompt mapping for a model.
 *
 * @param modelName - Ollama model name
 */
export async function removeModelPromptMapping(modelName: string): Promise<StorageResult<void>> {
	return setModelPromptMapping(modelName, null);
}

/**
 * Get mappings for multiple models at once.
 * Useful for batch operations.
 *
 * @param modelNames - Array of model names
 * @returns Map of model name to prompt ID
 */
export async function getModelPromptMappingsBatch(
	modelNames: string[]
): Promise<StorageResult<Map<string, string>>> {
	return withErrorHandling(async () => {
		const mappings = await db.modelPromptMappings
			.where('modelName')
			.anyOf(modelNames)
			.toArray();

		const result = new Map<string, string>();
		for (const mapping of mappings) {
			result.set(mapping.modelName, mapping.promptId);
		}
		return result;
	});
}

/**
 * Clear all model-prompt mappings.
 */
export async function clearAllModelPromptMappings(): Promise<StorageResult<void>> {
	return withErrorHandling(async () => {
		await db.modelPromptMappings.clear();
	});
}
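A typical round trip through this module (IDs illustrative):

await setModelPromptMapping('llama3.2:8b', 'prompt-123'); // insert or update
const result = await getModelPromptMapping('llama3.2:8b');
if (result.success && result.data) {
	console.log(result.data.promptId); // "prompt-123"
}
await setModelPromptMapping('llama3.2:8b', null); // same as removeModelPromptMapping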
@@ -42,6 +42,7 @@ export async function createPrompt(data: {
	content: string;
	description?: string;
	isDefault?: boolean;
+	targetCapabilities?: string[];
}): Promise<StorageResult<StoredPrompt>> {
	return withErrorHandling(async () => {
		const now = Date.now();
@@ -51,6 +52,7 @@ export async function createPrompt(data: {
			content: data.content,
			description: data.description ?? '',
			isDefault: data.isDefault ?? false,
+			targetCapabilities: data.targetCapabilities,
			createdAt: now,
			updatedAt: now
		};

@@ -9,6 +9,7 @@ export { UIState, uiState } from './ui.svelte.js';
export { ToastState, toastState } from './toast.svelte.js';
export { toolsState } from './tools.svelte.js';
export { promptsState } from './prompts.svelte.js';
export { SettingsState, settingsState } from './settings.svelte.js';
export type { Prompt } from './prompts.svelte.js';
export { VersionState, versionState } from './version.svelte.js';

@@ -146,7 +146,8 @@ class LocalModelsState {
			const response = await checkForUpdates();

			this.updatesAvailable = response.updatesAvailable;
-			this.modelsWithUpdates = new Set(response.updates.map(m => m.name));
+			// Handle null/undefined updates array from API
+			this.modelsWithUpdates = new Set((response.updates ?? []).map(m => m.name));

			return response;
		} catch (err) {
120
frontend/src/lib/stores/model-creation.svelte.ts
Normal file
@@ -0,0 +1,120 @@
/**
 * Model creation/editing state management using Svelte 5 runes
 * Handles creating custom Ollama models with embedded system prompts
 */

import { ollamaClient } from '$lib/ollama';
import type { OllamaCreateProgress } from '$lib/ollama/types.js';
import { modelsState } from './models.svelte.js';
import { modelInfoService } from '$lib/services/model-info-service.js';

/** Mode of the model editor */
export type ModelEditorMode = 'create' | 'edit';

/** Model creation state class with reactive properties */
class ModelCreationState {
	/** Whether a creation/update operation is in progress */
	isCreating = $state(false);

	/** Current status message from Ollama */
	status = $state('');

	/** Error message if creation failed */
	error = $state<string | null>(null);

	/** Abort controller for cancelling operations */
	private abortController: AbortController | null = null;

	/**
	 * Create a new custom model with an embedded system prompt
	 * @param modelName Name for the new model
	 * @param baseModel Base model to derive from (e.g., "llama3.2:8b")
	 * @param systemPrompt System prompt to embed
	 * @returns true if successful, false otherwise
	 */
	async create(
		modelName: string,
		baseModel: string,
		systemPrompt: string
	): Promise<boolean> {
		if (this.isCreating) return false;

		this.isCreating = true;
		this.status = 'Initializing...';
		this.error = null;
		this.abortController = new AbortController();

		try {
			await ollamaClient.createModel(
				{
					model: modelName,
					from: baseModel,
					system: systemPrompt
				},
				(progress: OllamaCreateProgress) => {
					this.status = progress.status;
				},
				this.abortController.signal
			);

			// Refresh models list to show the new model
			await modelsState.refresh();

			// Clear the model info cache for this model so it gets fresh info
			modelInfoService.clearCache(modelName);

			this.status = 'Success!';
			return true;
		} catch (err) {
			if (err instanceof Error && err.name === 'AbortError') {
				this.error = 'Operation cancelled';
			} else {
				this.error = err instanceof Error ? err.message : 'Failed to create model';
			}
			return false;
		} finally {
			this.isCreating = false;
			this.abortController = null;
		}
	}

	/**
	 * Update an existing model's system prompt
	 * Note: This re-creates the model with the new prompt (Ollama limitation)
	 * @param modelName Name of the existing model
	 * @param baseModel Base model (usually the model's parent or itself)
	 * @param systemPrompt New system prompt
	 * @returns true if successful, false otherwise
	 */
	async update(
		modelName: string,
		baseModel: string,
		systemPrompt: string
	): Promise<boolean> {
		// Updating is the same as creating with the same name
		// Ollama will overwrite the existing model
		return this.create(modelName, baseModel, systemPrompt);
	}

	/**
	 * Cancel the current operation
	 */
	cancel(): void {
		if (this.abortController) {
			this.abortController.abort();
		}
	}

	/**
	 * Reset the state
	 */
	reset(): void {
		this.isCreating = false;
		this.status = '';
		this.error = null;
		this.abortController = null;
	}
}

/** Singleton model creation state instance */
export const modelCreationState = new ModelCreationState();
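From a component, creation is one awaited call with reactive progress (a sketch; names illustrative):

const ok = await modelCreationState.create('my-assistant', 'llama3.2:8b', 'You are terse.');
// While running, modelCreationState.status mirrors Ollama's streaming status,
// and modelCreationState.cancel() aborts via the stored AbortController.
if (!ok) console.error(modelCreationState.error);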
150
frontend/src/lib/stores/model-prompt-mappings.svelte.ts
Normal file
@@ -0,0 +1,150 @@
/**
 * Model-prompt mappings state management using Svelte 5 runes.
 *
 * Manages user-configured default prompts for specific models.
 * When a model is used, its mapped prompt takes priority over the global default.
 */

import {
	getAllModelPromptMappings,
	setModelPromptMapping,
	removeModelPromptMapping,
	type StoredModelPromptMapping
} from '$lib/storage/model-prompt-mappings.js';

/**
 * Model-prompt mappings state class with reactive properties.
 */
class ModelPromptMappingsState {
	/** Map of model name to prompt ID */
	mappings = $state<Map<string, string>>(new Map());

	/** Loading state */
	isLoading = $state(false);

	/** Error state */
	error = $state<string | null>(null);

	/** Promise that resolves when initial load is complete */
	private _readyPromise: Promise<void> | null = null;
	private _readyResolve: (() => void) | null = null;

	constructor() {
		// Create ready promise
		this._readyPromise = new Promise((resolve) => {
			this._readyResolve = resolve;
		});

		// Load mappings on initialization (client-side only)
		if (typeof window !== 'undefined') {
			this.load();
		}
	}

	/**
	 * Wait for initial load to complete.
	 */
	async ready(): Promise<void> {
		return this._readyPromise ?? Promise.resolve();
	}

	/**
	 * Load all mappings from storage.
	 */
	async load(): Promise<void> {
		this.isLoading = true;
		this.error = null;

		try {
			const result = await getAllModelPromptMappings();
			if (result.success) {
				this.mappings = new Map(result.data.map((m) => [m.modelName, m.promptId]));
			} else {
				this.error = result.error;
			}
		} catch (err) {
			this.error = err instanceof Error ? err.message : 'Failed to load model-prompt mappings';
		} finally {
			this.isLoading = false;
			this._readyResolve?.();
		}
	}

	/**
	 * Get the prompt ID mapped to a model.
	 *
	 * @param modelName - Ollama model name
	 * @returns Prompt ID or undefined if not mapped
	 */
	getMapping(modelName: string): string | undefined {
		return this.mappings.get(modelName);
	}

	/**
	 * Check if a model has a prompt mapping.
	 *
	 * @param modelName - Ollama model name
	 * @returns true if model has a mapping
	 */
	hasMapping(modelName: string): boolean {
		return this.mappings.has(modelName);
	}

	/**
	 * Set or update the prompt mapping for a model.
	 *
	 * @param modelName - Ollama model name
	 * @param promptId - Prompt ID to map to
	 * @returns true if successful
	 */
	async setMapping(modelName: string, promptId: string): Promise<boolean> {
		const result = await setModelPromptMapping(modelName, promptId);
		if (result.success) {
			// Update local state
			const newMap = new Map(this.mappings);
			newMap.set(modelName, promptId);
			this.mappings = newMap;
			return true;
		}
		this.error = result.error;
		return false;
	}

	/**
	 * Remove the prompt mapping for a model.
	 *
	 * @param modelName - Ollama model name
	 * @returns true if successful
	 */
	async removeMapping(modelName: string): Promise<boolean> {
		const result = await removeModelPromptMapping(modelName);
		if (result.success) {
			// Update local state
			const newMap = new Map(this.mappings);
			newMap.delete(modelName);
			this.mappings = newMap;
			return true;
		}
		this.error = result.error;
		return false;
	}

	/**
	 * Get all mappings as an array.
	 *
	 * @returns Array of [modelName, promptId] pairs
	 */
	getAllMappings(): Array<[string, string]> {
		return Array.from(this.mappings.entries());
	}

	/**
	 * Get number of configured mappings.
	 */
	get count(): number {
		return this.mappings.size;
	}
}

/** Singleton instance */
export const modelPromptMappingsState = new ModelPromptMappingsState();
@@ -5,12 +5,15 @@
import {
	fetchRemoteModels,
+	fetchRemoteFamilies,
	getSyncStatus,
	syncModels,
	type RemoteModel,
	type SyncStatus,
	type ModelSearchOptions,
-	type ModelSortOption
+	type ModelSortOption,
+	type SizeRange,
+	type ContextRange
} from '$lib/api/model-registry';

/** Store state */
@@ -25,6 +28,10 @@ class ModelRegistryState {
	searchQuery = $state('');
	modelType = $state<'official' | 'community' | ''>('');
	selectedCapabilities = $state<string[]>([]);
+	selectedSizeRanges = $state<SizeRange[]>([]);
+	selectedContextRanges = $state<ContextRange[]>([]);
+	selectedFamily = $state<string>('');
+	availableFamilies = $state<string[]>([]);
	sortBy = $state<ModelSortOption>('pulls_desc');
	currentPage = $state(0);
	pageSize = $state(24);
@@ -69,6 +76,18 @@ class ModelRegistryState {
			options.capabilities = this.selectedCapabilities;
		}

+		if (this.selectedSizeRanges.length > 0) {
+			options.sizeRanges = this.selectedSizeRanges;
+		}
+
+		if (this.selectedContextRanges.length > 0) {
+			options.contextRanges = this.selectedContextRanges;
+		}
+
+		if (this.selectedFamily) {
+			options.family = this.selectedFamily;
+		}
+
		const response = await fetchRemoteModels(options);
		this.models = response.models;
		this.total = response.total;
@@ -119,6 +138,68 @@ class ModelRegistryState {
		return this.selectedCapabilities.includes(capability);
	}

+	/**
+	 * Toggle a size range filter
+	 */
+	async toggleSizeRange(size: SizeRange): Promise<void> {
+		const index = this.selectedSizeRanges.indexOf(size);
+		if (index === -1) {
+			this.selectedSizeRanges = [...this.selectedSizeRanges, size];
+		} else {
+			this.selectedSizeRanges = this.selectedSizeRanges.filter((s) => s !== size);
+		}
+		this.currentPage = 0;
+		await this.loadModels();
+	}
+
+	/**
+	 * Check if a size range is selected
+	 */
+	hasSizeRange(size: SizeRange): boolean {
+		return this.selectedSizeRanges.includes(size);
+	}
+
+	/**
+	 * Toggle a context range filter
+	 */
+	async toggleContextRange(range: ContextRange): Promise<void> {
+		const index = this.selectedContextRanges.indexOf(range);
+		if (index === -1) {
+			this.selectedContextRanges = [...this.selectedContextRanges, range];
+		} else {
+			this.selectedContextRanges = this.selectedContextRanges.filter((r) => r !== range);
+		}
+		this.currentPage = 0;
+		await this.loadModels();
+	}
+
+	/**
+	 * Check if a context range is selected
+	 */
+	hasContextRange(range: ContextRange): boolean {
+		return this.selectedContextRanges.includes(range);
+	}
+
+	/**
+	 * Set family filter
+	 */
+	async setFamily(family: string): Promise<void> {
+		this.selectedFamily = family;
+		this.currentPage = 0;
+		await this.loadModels();
+	}
+
+	/**
+	 * Load available families for filter dropdown
+	 */
+	async loadFamilies(): Promise<void> {
+		try {
+			this.availableFamilies = await fetchRemoteFamilies();
+		} catch (err) {
+			console.error('Failed to load families:', err);
+		}
+	}
+
	/**
	 * Set sort order
	 */
@@ -200,6 +281,9 @@ class ModelRegistryState {
		this.searchQuery = '';
		this.modelType = '';
		this.selectedCapabilities = [];
+		this.selectedSizeRanges = [];
+		this.selectedContextRanges = [];
+		this.selectedFamily = '';
		this.sortBy = 'pulls_desc';
		this.currentPage = 0;
		await this.loadModels();
@@ -209,7 +293,7 @@ class ModelRegistryState {
	 * Initialize the store
	 */
	async init(): Promise<void> {
-		await Promise.all([this.loadSyncStatus(), this.loadModels()]);
+		await Promise.all([this.loadSyncStatus(), this.loadModels(), this.loadFamilies()]);
	}
}
@@ -21,6 +21,7 @@ export interface Prompt {
	content: string;
	description: string;
	isDefault: boolean;
+	targetCapabilities?: string[];
	createdAt: Date;
	updatedAt: Date;
}
@@ -127,6 +128,7 @@ class PromptsState {
		content: string;
		description?: string;
		isDefault?: boolean;
+		targetCapabilities?: string[];
	}): Promise<Prompt | null> {
		try {
			const result = await createPrompt(data);
@@ -158,7 +160,7 @@ class PromptsState {
	 */
	async update(
		id: string,
-		updates: Partial<{ name: string; content: string; description: string; isDefault: boolean }>
+		updates: Partial<{ name: string; content: string; description: string; isDefault: boolean; targetCapabilities: string[] }>
	): Promise<boolean> {
		try {
			const result = await updatePrompt(id, updates);
@@ -6,9 +6,12 @@
import {
	type ModelParameters,
	type ChatSettings,
+	type AutoCompactSettings,
	DEFAULT_MODEL_PARAMETERS,
	DEFAULT_CHAT_SETTINGS,
-	PARAMETER_RANGES
+	DEFAULT_AUTO_COMPACT_SETTINGS,
+	PARAMETER_RANGES,
+	AUTO_COMPACT_RANGES
} from '$lib/types/settings';
import type { ModelDefaults } from './models.svelte';

@@ -30,6 +33,11 @@ export class SettingsState {
	// Panel visibility
	isPanelOpen = $state(false);

+	// Auto-compact settings
+	autoCompactEnabled = $state(DEFAULT_AUTO_COMPACT_SETTINGS.enabled);
+	autoCompactThreshold = $state(DEFAULT_AUTO_COMPACT_SETTINGS.threshold);
+	autoCompactPreserveCount = $state(DEFAULT_AUTO_COMPACT_SETTINGS.preserveCount);
+
	// Derived: Current model parameters object
	modelParameters = $derived.by((): ModelParameters => ({
		temperature: this.temperature,
@@ -141,6 +149,32 @@ export class SettingsState {
		this.saveToStorage();
	}

+	/**
+	 * Toggle auto-compact enabled state
+	 */
+	toggleAutoCompact(): void {
+		this.autoCompactEnabled = !this.autoCompactEnabled;
+		this.saveToStorage();
+	}
+
+	/**
+	 * Update auto-compact threshold
+	 */
+	updateAutoCompactThreshold(value: number): void {
+		const range = AUTO_COMPACT_RANGES.threshold;
+		this.autoCompactThreshold = Math.max(range.min, Math.min(range.max, value));
+		this.saveToStorage();
+	}
+
+	/**
+	 * Update auto-compact preserve count
+	 */
+	updateAutoCompactPreserveCount(value: number): void {
+		const range = AUTO_COMPACT_RANGES.preserveCount;
+		this.autoCompactPreserveCount = Math.max(range.min, Math.min(range.max, Math.round(value)));
+		this.saveToStorage();
+	}
+
	/**
	 * Load settings from localStorage
	 */
@@ -151,11 +185,17 @@ export class SettingsState {

		const settings: ChatSettings = JSON.parse(stored);

+		// Model parameters
		this.useCustomParameters = settings.useCustomParameters ?? false;
		this.temperature = settings.modelParameters?.temperature ?? DEFAULT_MODEL_PARAMETERS.temperature;
		this.top_k = settings.modelParameters?.top_k ?? DEFAULT_MODEL_PARAMETERS.top_k;
		this.top_p = settings.modelParameters?.top_p ?? DEFAULT_MODEL_PARAMETERS.top_p;
		this.num_ctx = settings.modelParameters?.num_ctx ?? DEFAULT_MODEL_PARAMETERS.num_ctx;
+
+		// Auto-compact settings
+		this.autoCompactEnabled = settings.autoCompact?.enabled ?? DEFAULT_AUTO_COMPACT_SETTINGS.enabled;
+		this.autoCompactThreshold = settings.autoCompact?.threshold ?? DEFAULT_AUTO_COMPACT_SETTINGS.threshold;
+		this.autoCompactPreserveCount = settings.autoCompact?.preserveCount ?? DEFAULT_AUTO_COMPACT_SETTINGS.preserveCount;
	} catch (error) {
		console.warn('[Settings] Failed to load from localStorage:', error);
	}
@@ -168,7 +208,12 @@ export class SettingsState {
	try {
		const settings: ChatSettings = {
			useCustomParameters: this.useCustomParameters,
-			modelParameters: this.modelParameters
+			modelParameters: this.modelParameters,
+			autoCompact: {
+				enabled: this.autoCompactEnabled,
+				threshold: this.autoCompactThreshold,
+				preserveCount: this.autoCompactPreserveCount
+			}
		};

		localStorage.setItem(STORAGE_KEY, JSON.stringify(settings));
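After this change the payload written to localStorage gains an autoCompact block; with defaults it serializes roughly as follows (a sketch reusing the file's own STORAGE_KEY and default values):

const persisted: ChatSettings = {
	useCustomParameters: false,
	modelParameters: { ...DEFAULT_MODEL_PARAMETERS },
	autoCompact: { enabled: false, threshold: 70, preserveCount: 6 }
};
localStorage.setItem(STORAGE_KEY, JSON.stringify(persisted));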
@@ -110,8 +110,25 @@ class ToolsState {
		return [];
	}

-	const definitions = toolRegistry.getDefinitions();
-	return definitions.filter(def => this.isToolEnabled(def.function.name));
+	// Get enabled builtin tools
+	const builtinDefs = toolRegistry.getDefinitions();
+	const enabled = builtinDefs.filter(def => this.isToolEnabled(def.function.name));
+
+	// Add enabled custom tools
+	for (const custom of this.customTools) {
+		if (custom.enabled && this.isToolEnabled(custom.name)) {
+			enabled.push({
+				type: 'function',
+				function: {
+					name: custom.name,
+					description: custom.description,
+					parameters: custom.parameters
+				}
+			});
+		}
+	}
+
+	return enabled;
}

/**
@@ -292,7 +292,9 @@ class MathParser {
const mathParser = new MathParser();

const calculateHandler: BuiltinToolHandler<CalculateArgs> = (args) => {
-  const { expression, precision = 10 } = args;
+  const { expression } = args;
+  // Coerce to number - Ollama models sometimes output numbers as strings
+  const precision = Number(args.precision) || 10;

  try {
    const result = mathParser.parse(expression);

@@ -423,7 +425,10 @@ async function fetchViaProxy(url: string, maxLength: number, timeout: number): P
}

const fetchUrlHandler: BuiltinToolHandler<FetchUrlArgs> = async (args) => {
-  const { url, extract = 'text', maxLength = 50000, timeout = 30 } = args;
+  const { url, extract = 'text' } = args;
+  // Coerce to numbers - Ollama models sometimes output numbers as strings
+  const maxLength = Number(args.maxLength) || 50000;
+  const timeout = Number(args.timeout) || 30;

  try {
    const parsedUrl = new URL(url);

@@ -683,7 +688,10 @@ const webSearchDefinition: ToolDefinition = {
};

const webSearchHandler: BuiltinToolHandler<WebSearchArgs> = async (args) => {
-  const { query, maxResults = 5, site, freshness, region, timeout } = args;
+  const { query, site, freshness, region } = args;
+  // Coerce to numbers - Ollama models sometimes output numbers as strings
+  const maxResults = Number(args.maxResults) || 5;
+  const timeout = Number(args.timeout) || undefined;

  if (!query || query.trim() === '') {
    return { error: 'Search query is required' };
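The same coercion idiom now appears in all three handlers. A shared helper could centralize it; this is a hypothetical sketch, not part of the change (the diff keeps the inline Number(...) calls):

// Hypothetical helper mirroring `Number(x) || fallback`: NaN, 0, empty and
// non-numeric strings all fall back to the default, same as the inline form.
function coerceNumber(value: unknown, fallback: number): number {
  return Number(value) || fallback;
}

// e.g. const precision = coerceNumber(args.precision, 10);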
@@ -77,6 +77,37 @@ export const PARAMETER_DESCRIPTIONS: Record<keyof ModelParameters, string> = {
  num_ctx: 'Context window size in tokens. Larger uses more memory.'
};

+/**
+ * Auto-compact settings for automatic context management
+ */
+export interface AutoCompactSettings {
+  /** Whether auto-compact is enabled */
+  enabled: boolean;
+
+  /** Context usage threshold (percentage) to trigger auto-compact */
+  threshold: number;
+
+  /** Number of recent messages to preserve when compacting */
+  preserveCount: number;
+}
+
+/**
+ * Default auto-compact settings
+ */
+export const DEFAULT_AUTO_COMPACT_SETTINGS: AutoCompactSettings = {
+  enabled: false,
+  threshold: 70,
+  preserveCount: 6
+};
+
+/**
+ * Auto-compact parameter ranges for UI
+ */
+export const AUTO_COMPACT_RANGES = {
+  threshold: { min: 50, max: 90, step: 5 },
+  preserveCount: { min: 2, max: 20, step: 1 }
+} as const;
+
/**
 * Chat settings including model parameters
 */
@@ -86,6 +117,9 @@ export interface ChatSettings {

  /** Custom model parameters (used when useCustomParameters is true) */
  modelParameters: ModelParameters;

+  /** Auto-compact settings for context management */
+  autoCompact?: AutoCompactSettings;
}

/**
@@ -93,5 +127,6 @@ export interface ChatSettings {
 */
export const DEFAULT_CHAT_SETTINGS: ChatSettings = {
  useCustomParameters: false,
-  modelParameters: { ...DEFAULT_MODEL_PARAMETERS }
+  modelParameters: { ...DEFAULT_MODEL_PARAMETERS },
+  autoCompact: { ...DEFAULT_AUTO_COMPACT_SETTINGS }
};
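A minimal sketch of how these values plug together, assuming usage is measured in tokens against num_ctx (the actual trigger logic is not part of this diff):

// Sketch only — the real check lives elsewhere in the auto-compact feature
function shouldAutoCompact(usedTokens: number, numCtx: number, s: AutoCompactSettings): boolean {
  if (!s.enabled || numCtx <= 0) return false;
  return (usedTokens / numCtx) * 100 >= s.threshold;
}

With the defaults above (threshold 70), a 4096-token context would start compacting at about 2867 used tokens, keeping the newest `preserveCount` (6) messages intact.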
@@ -150,8 +150,18 @@ async function loadPdfJs(): Promise<typeof import('pdfjs-dist')> {
  try {
    pdfjsLib = await import('pdfjs-dist');

-    // Set worker source using CDN for reliability
-    pdfjsLib.GlobalWorkerOptions.workerSrc = `https://cdnjs.cloudflare.com/ajax/libs/pdf.js/${pdfjsLib.version}/pdf.worker.min.mjs`;
+    // Use locally bundled worker (copied to static/ during build)
+    // Falls back to CDN if local worker isn't available
+    const localWorkerPath = '/pdf.worker.min.mjs';
+    const cdnWorkerPath = `https://cdnjs.cloudflare.com/ajax/libs/pdf.js/${pdfjsLib.version}/pdf.worker.min.mjs`;
+
+    // Try local first, with CDN fallback
+    try {
+      const response = await fetch(localWorkerPath, { method: 'HEAD' });
+      pdfjsLib.GlobalWorkerOptions.workerSrc = response.ok ? localWorkerPath : cdnWorkerPath;
+    } catch {
+      pdfjsLib.GlobalWorkerOptions.workerSrc = cdnWorkerPath;
+    }

    return pdfjsLib;
  } catch (error) {
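The "copied to static/ during build" step is not part of this diff; a sketch of what it might look like as a small Node helper (the pdfjs-dist build path is an assumption about the package layout):

// copy-pdf-worker.ts — hypothetical build helper, run before the frontend build
import { copyFileSync } from 'node:fs';

copyFileSync(
  'node_modules/pdfjs-dist/build/pdf.worker.min.mjs', // assumed package path
  'static/pdf.worker.min.mjs'                         // the generated file ignored by git
);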
@@ -7,6 +7,7 @@

  import { onMount } from 'svelte';
  import { chatState, conversationsState, modelsState, toolsState, promptsState } from '$lib/stores';
+  import { resolveSystemPrompt } from '$lib/services/prompt-resolution.js';
  import { streamingMetricsState } from '$lib/stores/streaming-metrics.svelte';
  import { settingsState } from '$lib/stores/settings.svelte';
  import { createConversation as createStoredConversation, addMessage as addStoredMessage, updateConversation } from '$lib/storage';

@@ -132,14 +133,13 @@
      images
    }];

-    // Build system prompt from active prompt + RAG context
+    // Build system prompt from resolution service + RAG context
    const systemParts: string[] = [];

-    // Wait for prompts to be loaded, then add system prompt if active
-    await promptsState.ready();
-    const activePrompt = promptsState.activePrompt;
-    if (activePrompt) {
-      systemParts.push(activePrompt.content);
+    // Resolve system prompt using priority chain (model-aware)
+    const resolvedPrompt = await resolveSystemPrompt(model, null, null);
+    if (resolvedPrompt.content) {
+      systemParts.push(resolvedPrompt.content);
    }

    // Add RAG context if available
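resolveSystemPrompt's internals are not shown in this diff. Based on the settings page later in this change, a plausible sketch of the priority chain — the exact order and the fallthrough behavior are assumptions:

// Hypothetical outline: per-model mapping first, then the user's active
// prompt; capability-targeted and embedded prompts would come after these.
async function resolveSystemPromptSketch(model: string): Promise<{ content: string | null }> {
  await promptsState.ready();
  const mappedId = modelPromptMappingsState.getMapping(model);
  const mapped = promptsState.prompts.find((p) => p.id === mappedId);
  if (mapped) return { content: mapped.content };
  if (promptsState.activePrompt) return { content: promptsState.activePrompt.content };
  return { content: null };
}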
@@ -11,7 +11,10 @@
  import { modelOperationsState } from '$lib/stores/model-operations.svelte';
  import { ModelCard } from '$lib/components/models';
  import PullModelDialog from '$lib/components/models/PullModelDialog.svelte';
+  import ModelEditorDialog from '$lib/components/models/ModelEditorDialog.svelte';
  import { fetchTagSizes, type RemoteModel } from '$lib/api/model-registry';
+  import { modelInfoService, type ModelInfo } from '$lib/services/model-info-service';
+  import type { ModelEditorMode } from '$lib/stores/model-creation.svelte';

  // Search debounce
  let searchInput = $state('');

@@ -40,12 +43,14 @@
  let pullProgress = $state<{ status: string; completed?: number; total?: number } | null>(null);
  let pullError = $state<string | null>(null);
  let loadingSizes = $state(false);
+  let capabilitiesVerified = $state(false); // True if capabilities come from Ollama (installed model)

  async function handleSelectModel(model: RemoteModel): Promise<void> {
    selectedModel = model;
    selectedTag = model.tags[0] || '';
    pullProgress = null;
    pullError = null;
+    capabilitiesVerified = false;

    // Fetch tag sizes if not already loaded
    if (!model.tagSizes || Object.keys(model.tagSizes).length === 0) {

@@ -60,6 +65,21 @@
        loadingSizes = false;
      }
    }

+    // Try to fetch real capabilities from Ollama if model is installed locally
+    // This overrides scraped capabilities from ollama.com with accurate runtime data
+    try {
+      const realCapabilities = await modelsState.fetchCapabilities(model.slug);
+      // fetchCapabilities returns empty array on error, but we check hasCapability to confirm model exists
+      if (modelsState.hasCapability(model.slug, 'completion') || realCapabilities.length > 0) {
+        // Model is installed - use real capabilities from Ollama
+        selectedModel = { ...selectedModel!, capabilities: realCapabilities };
+        capabilitiesVerified = true;
+      }
+    } catch {
+      // Model not installed locally - keep scraped capabilities
+      capabilitiesVerified = false;
+    }
  }

  function closeDetails(): void {
@@ -167,6 +187,70 @@
  let deleting = $state(false);
  let deleteError = $state<string | null>(null);

+  // Model editor dialog state
+  let modelEditorOpen = $state(false);
+  let modelEditorMode = $state<ModelEditorMode>('create');
+  let editingModelName = $state<string | undefined>(undefined);
+  let editingSystemPrompt = $state<string | undefined>(undefined);
+  let editingBaseModel = $state<string | undefined>(undefined);
+
+  // Cache for model info (to know which models have embedded prompts)
+  let modelInfoCache = $state<Map<string, ModelInfo>>(new Map());
+
+  function openCreateDialog(): void {
+    modelEditorMode = 'create';
+    editingModelName = undefined;
+    editingSystemPrompt = undefined;
+    editingBaseModel = undefined;
+    modelEditorOpen = true;
+  }
+
+  async function openEditDialog(modelName: string): Promise<void> {
+    // Fetch model info to get the current system prompt and base model
+    const info = await modelInfoService.getModelInfo(modelName);
+    if (!info.systemPrompt) {
+      // No embedded prompt - shouldn't happen if we only show edit for models with prompts
+      return;
+    }
+
+    // Get base model from family - if a model has an embedded prompt, its parent is typically
+    // the family base model (e.g., llama3.2). For now, use the model name itself as fallback.
+    const localModel = localModelsState.models.find((m) => m.name === modelName);
+    const baseModel = localModel?.family || modelName;
+
+    modelEditorMode = 'edit';
+    editingModelName = modelName;
+    editingSystemPrompt = info.systemPrompt;
+    editingBaseModel = baseModel;
+    modelEditorOpen = true;
+  }
+
+  function closeModelEditor(): void {
+    modelEditorOpen = false;
+    // Refresh models list after closing (in case a model was created/updated)
+    localModelsState.refresh();
+  }
+
+  // Fetch model info for all local models to determine which have embedded prompts
+  async function fetchModelInfoForLocalModels(): Promise<void> {
+    const newCache = new Map<string, ModelInfo>();
+    for (const model of localModelsState.models) {
+      try {
+        const info = await modelInfoService.getModelInfo(model.name);
+        newCache.set(model.name, info);
+      } catch {
+        // Ignore errors - model might not be accessible
+      }
+    }
+    modelInfoCache = newCache;
+  }
+
+  // Check if a model has an embedded prompt (and thus can be edited)
+  function hasEmbeddedPrompt(modelName: string): boolean {
+    const info = modelInfoCache.get(modelName);
+    return info?.systemPrompt !== null && info?.systemPrompt !== undefined && info.systemPrompt.length > 0;
+  }
+
  // Delete a local model
  async function deleteModel(modelName: string): Promise<void> {
    if (deleting) return;

@@ -214,12 +298,22 @@
    }, 300);
  }

+  // Fetch model info when local models change
+  $effect(() => {
+    if (localModelsState.models.length > 0) {
+      fetchModelInfoForLocalModels();
+    }
+  });
+
  // Initialize on mount
  onMount(() => {
    // Initialize stores (backend handles heavy operations)
    localModelsState.init();
    modelRegistry.init();
-    modelsState.refresh();
+    modelsState.refresh().then(() => {
+      // Fetch capabilities for all installed models
+      modelsState.fetchAllCapabilities();
+    });
  });
</script>

@@ -265,6 +359,17 @@
          {/if}
        </button>
      {:else}
+        <!-- Create Custom Model Button -->
+        <button
+          type="button"
+          onclick={openCreateDialog}
+          class="flex items-center gap-2 rounded-lg bg-violet-600 px-4 py-2 text-sm font-medium text-theme-primary transition-colors hover:bg-violet-500"
+        >
+          <svg xmlns="http://www.w3.org/2000/svg" class="h-4 w-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
+            <path stroke-linecap="round" stroke-linejoin="round" d="M12 4v16m8-8H4" />
+          </svg>
+          <span>Create Custom</span>
+        </button>
        <!-- Pull Model Button -->
        <button
          type="button"

@@ -476,6 +581,7 @@
    {:else}
      <div class="space-y-2">
        {#each localModelsState.models as model (model.name)}
+          {@const caps = modelsState.getCapabilities(model.name) ?? []}
          <div class="group rounded-lg border border-theme bg-theme-secondary p-4 transition-colors hover:border-theme-subtle">
            <div class="flex items-center justify-between">
              <div class="flex-1">

@@ -489,6 +595,11 @@
                    Update
                  </span>
                {/if}
+                {#if hasEmbeddedPrompt(model.name)}
+                  <span class="rounded bg-violet-900/50 px-2 py-0.5 text-xs text-violet-300" title="Custom model with embedded system prompt">
+                    Custom
+                  </span>
+                {/if}
              </div>
              <div class="mt-1 flex items-center gap-4 text-xs text-theme-muted">
                <span>{formatBytes(model.size)}</span>

@@ -496,6 +607,36 @@
                <span>Parameters: {model.parameterSize}</span>
                <span>Quantization: {model.quantizationLevel}</span>
              </div>
+              <!-- Capabilities (from Ollama runtime - verified) -->
+              {#if caps.length > 0}
+                <div class="mt-2 flex flex-wrap gap-1.5">
+                  {#if caps.includes('vision')}
+                    <span class="inline-flex items-center gap-1 rounded px-1.5 py-0.5 text-xs bg-purple-900/50 text-purple-300">
+                      <span>👁</span><span>Vision</span>
+                    </span>
+                  {/if}
+                  {#if caps.includes('tools')}
+                    <span class="inline-flex items-center gap-1 rounded px-1.5 py-0.5 text-xs bg-blue-900/50 text-blue-300">
+                      <span>🔧</span><span>Tools</span>
+                    </span>
+                  {/if}
+                  {#if caps.includes('thinking')}
+                    <span class="inline-flex items-center gap-1 rounded px-1.5 py-0.5 text-xs bg-pink-900/50 text-pink-300">
+                      <span>🧠</span><span>Thinking</span>
+                    </span>
+                  {/if}
+                  {#if caps.includes('embedding')}
+                    <span class="inline-flex items-center gap-1 rounded px-1.5 py-0.5 text-xs bg-amber-900/50 text-amber-300">
+                      <span>📊</span><span>Embedding</span>
+                    </span>
+                  {/if}
+                  {#if caps.includes('code')}
+                    <span class="inline-flex items-center gap-1 rounded px-1.5 py-0.5 text-xs bg-emerald-900/50 text-emerald-300">
+                      <span>💻</span><span>Code</span>
+                    </span>
+                  {/if}
+                </div>
+              {/if}
            </div>
            <div class="flex items-center gap-2">
              {#if deleteConfirm === model.name}

@@ -517,6 +658,18 @@
                  No
                </button>
              {:else}
+                {#if hasEmbeddedPrompt(model.name)}
+                  <button
+                    type="button"
+                    onclick={() => openEditDialog(model.name)}
+                    class="rounded p-2 text-theme-muted opacity-0 transition-opacity hover:bg-theme-tertiary hover:text-violet-400 group-hover:opacity-100"
+                    title="Edit system prompt"
+                  >
+                    <svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
+                      <path stroke-linecap="round" stroke-linejoin="round" d="M11 5H6a2 2 0 00-2 2v11a2 2 0 002 2h11a2 2 0 002-2v-5m-1.414-9.414a2 2 0 112.828 2.828L11.828 15H9v-2.828l8.586-8.586z" />
+                    </svg>
+                  </button>
+                {/if}
                <button
                  type="button"
                  onclick={() => deleteConfirm = model.name}

@@ -641,7 +794,7 @@
  </div>

  <!-- Capability Filters (matches ollama.com capabilities) -->
-  <div class="mb-6 flex flex-wrap items-center gap-2">
+  <div class="mb-4 flex flex-wrap items-center gap-2">
    <span class="text-sm text-theme-muted">Capabilities:</span>
    <button
      type="button"

@@ -694,13 +847,81 @@
      <span>Cloud</span>
    </button>

-    {#if modelRegistry.selectedCapabilities.length > 0 || modelRegistry.modelType || modelRegistry.searchQuery || modelRegistry.sortBy !== 'pulls_desc'}
+    <!-- Capability info notice -->
+    <span class="ml-2 text-xs text-theme-muted" title="Capability data is sourced from ollama.com and may not be accurate. Actual capabilities are verified once a model is installed locally.">
+      <svg xmlns="http://www.w3.org/2000/svg" class="inline h-3.5 w-3.5 opacity-60" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
+        <path stroke-linecap="round" stroke-linejoin="round" d="M13 16h-1v-4h-1m1-4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
+      </svg>
+      <span class="opacity-60">from ollama.com</span>
+    </span>
  </div>

+  <!-- Size Range Filters -->
+  <div class="mb-4 flex flex-wrap items-center gap-2">
+    <span class="text-sm text-theme-muted">Size:</span>
+    <button
+      type="button"
+      onclick={() => modelRegistry.toggleSizeRange('small')}
+      class="rounded-full px-3 py-1 text-sm transition-colors {modelRegistry.hasSizeRange('small')
+        ? 'bg-emerald-600 text-theme-primary'
+        : 'bg-theme-secondary text-theme-muted hover:bg-theme-tertiary hover:text-theme-primary'}"
+    >
+      ≤3B
+    </button>
+    <button
+      type="button"
+      onclick={() => modelRegistry.toggleSizeRange('medium')}
+      class="rounded-full px-3 py-1 text-sm transition-colors {modelRegistry.hasSizeRange('medium')
+        ? 'bg-emerald-600 text-theme-primary'
+        : 'bg-theme-secondary text-theme-muted hover:bg-theme-tertiary hover:text-theme-primary'}"
+    >
+      4-13B
+    </button>
+    <button
+      type="button"
+      onclick={() => modelRegistry.toggleSizeRange('large')}
+      class="rounded-full px-3 py-1 text-sm transition-colors {modelRegistry.hasSizeRange('large')
+        ? 'bg-emerald-600 text-theme-primary'
+        : 'bg-theme-secondary text-theme-muted hover:bg-theme-tertiary hover:text-theme-primary'}"
+    >
+      14-70B
+    </button>
+    <button
+      type="button"
+      onclick={() => modelRegistry.toggleSizeRange('xlarge')}
+      class="rounded-full px-3 py-1 text-sm transition-colors {modelRegistry.hasSizeRange('xlarge')
+        ? 'bg-emerald-600 text-theme-primary'
+        : 'bg-theme-secondary text-theme-muted hover:bg-theme-tertiary hover:text-theme-primary'}"
+    >
+      >70B
+    </button>
+  </div>
+
+  <!-- Family Filter + Clear All -->
+  <div class="mb-6 flex flex-wrap items-center gap-4">
+    {#if modelRegistry.availableFamilies.length > 0}
+      <div class="flex items-center gap-2">
+        <span class="text-sm text-theme-muted">Family:</span>
+        <select
+          value={modelRegistry.selectedFamily}
+          onchange={(e) => modelRegistry.setFamily((e.target as HTMLSelectElement).value)}
+          class="rounded-lg border border-theme bg-theme-secondary px-3 py-1.5 text-sm text-theme-primary focus:border-blue-500 focus:outline-none focus:ring-1 focus:ring-blue-500"
+        >
+          <option value="">All Families</option>
+          {#each modelRegistry.availableFamilies as family}
+            <option value={family}>{family}</option>
+          {/each}
+        </select>
+      </div>
+    {/if}
+
+    {#if modelRegistry.selectedCapabilities.length > 0 || modelRegistry.selectedSizeRanges.length > 0 || modelRegistry.selectedFamily || modelRegistry.modelType || modelRegistry.searchQuery || modelRegistry.sortBy !== 'pulls_desc'}
      <button
        type="button"
        onclick={() => { modelRegistry.clearFilters(); searchInput = ''; }}
-        class="ml-2 text-sm text-theme-muted hover:text-theme-primary"
+        class="text-sm text-theme-muted hover:text-theme-primary"
      >
-        Clear filters
+        Clear all filters
      </button>
    {/if}
  </div>
@@ -826,14 +1047,40 @@
  {/if}

  <!-- Capabilities -->
-  {#if selectedModel.capabilities.length > 0}
+  {#if selectedModel.capabilities.length > 0 || !capabilitiesVerified}
    <div class="mb-6">
-      <h3 class="mb-2 text-sm font-medium text-theme-secondary">Capabilities</h3>
-      <div class="flex flex-wrap gap-2">
-        {#each selectedModel.capabilities as cap}
-          <span class="rounded bg-theme-tertiary px-2 py-1 text-xs text-theme-secondary">{cap}</span>
-        {/each}
-      </div>
+      <h3 class="mb-2 flex items-center gap-2 text-sm font-medium text-theme-secondary">
+        <span>Capabilities</span>
+        {#if capabilitiesVerified}
+          <span class="inline-flex items-center gap-1 rounded bg-green-900/30 px-1.5 py-0.5 text-xs text-green-400" title="Capabilities verified from installed model">
+            <svg xmlns="http://www.w3.org/2000/svg" class="h-3 w-3" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
+              <path stroke-linecap="round" stroke-linejoin="round" d="M5 13l4 4L19 7" />
+            </svg>
+            verified
+          </span>
+        {:else}
+          <span class="inline-flex items-center gap-1 rounded bg-amber-900/30 px-1.5 py-0.5 text-xs text-amber-400" title="Capabilities sourced from ollama.com - install model for verified data">
+            <svg xmlns="http://www.w3.org/2000/svg" class="h-3 w-3" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
+              <path stroke-linecap="round" stroke-linejoin="round" d="M12 9v2m0 4h.01m-6.938 4h13.856c1.54 0 2.502-1.667 1.732-3L13.732 4c-.77-1.333-2.694-1.333-3.464 0L3.34 16c-.77 1.333.192 3 1.732 3z" />
+            </svg>
+            unverified
+          </span>
+        {/if}
+      </h3>
+      {#if selectedModel.capabilities.length > 0}
+        <div class="flex flex-wrap gap-2">
+          {#each selectedModel.capabilities as cap}
+            <span class="rounded bg-theme-tertiary px-2 py-1 text-xs text-theme-secondary">{cap}</span>
+          {/each}
+        </div>
+      {:else}
+        <p class="text-xs text-theme-muted">No capabilities reported</p>
+      {/if}
+      {#if !capabilitiesVerified}
+        <p class="mt-2 text-xs text-theme-muted">
+          Install model to verify actual capabilities
+        </p>
+      {/if}
    </div>
  {/if}

@@ -1031,6 +1278,16 @@
<!-- Pull Model Dialog -->
<PullModelDialog />

+<!-- Model Editor Dialog (Create/Edit) -->
+<ModelEditorDialog
+  isOpen={modelEditorOpen}
+  mode={modelEditorMode}
+  editingModel={editingModelName}
+  currentSystemPrompt={editingSystemPrompt}
+  baseModel={editingBaseModel}
+  onClose={closeModelEditor}
+/>
+
<!-- Active Pulls Progress (fixed bottom bar) -->
{#if modelOperationsState.activePulls.size > 0}
  <div class="fixed bottom-0 left-0 right-0 z-40 border-t border-theme bg-theme-secondary/95 p-4 backdrop-blur-sm">
@@ -15,14 +15,24 @@
  let formDescription = $state('');
  let formContent = $state('');
  let formIsDefault = $state(false);
+  let formTargetCapabilities = $state<string[]>([]);
  let isSaving = $state(false);

+  // Available capabilities for targeting
+  const CAPABILITIES = [
+    { id: 'code', label: 'Code', description: 'Auto-use with coding models' },
+    { id: 'vision', label: 'Vision', description: 'Auto-use with vision models' },
+    { id: 'thinking', label: 'Thinking', description: 'Auto-use with reasoning models' },
+    { id: 'tools', label: 'Tools', description: 'Auto-use with tool-capable models' }
+  ] as const;
+
  function openCreateEditor(): void {
    editingPrompt = null;
    formName = '';
    formDescription = '';
    formContent = '';
    formIsDefault = false;
+    formTargetCapabilities = [];
    showEditor = true;
  }

@@ -32,6 +42,7 @@
    formDescription = prompt.description;
    formContent = prompt.content;
    formIsDefault = prompt.isDefault;
+    formTargetCapabilities = prompt.targetCapabilities ?? [];
    showEditor = true;
  }

@@ -45,19 +56,22 @@

    isSaving = true;
    try {
+      const capabilities = formTargetCapabilities.length > 0 ? formTargetCapabilities : undefined;
      if (editingPrompt) {
        await promptsState.update(editingPrompt.id, {
          name: formName.trim(),
          description: formDescription.trim(),
          content: formContent,
-          isDefault: formIsDefault
+          isDefault: formIsDefault,
+          targetCapabilities: capabilities ?? []
        });
      } else {
        await promptsState.add({
          name: formName.trim(),
          description: formDescription.trim(),
          content: formContent,
-          isDefault: formIsDefault
+          isDefault: formIsDefault,
+          targetCapabilities: capabilities
        });
      }
      closeEditor();

@@ -66,6 +80,14 @@
    }
  }

+  function toggleCapability(capId: string): void {
+    if (formTargetCapabilities.includes(capId)) {
+      formTargetCapabilities = formTargetCapabilities.filter(c => c !== capId);
+    } else {
+      formTargetCapabilities = [...formTargetCapabilities, capId];
+    }
+  }
+
  async function handleDelete(prompt: Prompt): Promise<void> {
    if (confirm(`Delete "${prompt.name}"? This cannot be undone.`)) {
      await promptsState.remove(prompt.id);

@@ -166,7 +188,7 @@
    >
      <div class="flex items-start justify-between gap-4">
        <div class="min-w-0 flex-1">
-          <div class="flex items-center gap-2">
+          <div class="flex flex-wrap items-center gap-2">
            <h3 class="font-medium text-theme-primary">{prompt.name}</h3>
            {#if prompt.isDefault}
              <span class="rounded bg-blue-900 px-2 py-0.5 text-xs text-blue-300">

@@ -178,6 +200,13 @@
                active
              </span>
            {/if}
+            {#if prompt.targetCapabilities && prompt.targetCapabilities.length > 0}
+              {#each prompt.targetCapabilities as cap (cap)}
+                <span class="rounded bg-purple-900/50 px-2 py-0.5 text-xs text-purple-300">
+                  {cap}
+                </span>
+              {/each}
+            {/if}
          </div>
          {#if prompt.description}
            <p class="mt-1 text-sm text-theme-muted">{prompt.description}</p>

@@ -259,8 +288,9 @@
      (e.g., code reviewer, writing helper) or to enforce specific response formats.
    </p>
    <p class="mt-2 text-sm text-theme-muted">
-      <strong class="text-theme-secondary">Default prompt:</strong> Automatically used for all new chats.
+      <strong class="text-theme-secondary">Default prompt:</strong> Used for all new chats unless overridden.
      <strong class="text-theme-secondary">Active prompt:</strong> Currently selected for your session.
+      <strong class="text-theme-secondary">Capability targeting:</strong> Auto-matches prompts to models with specific capabilities (code, vision, thinking, tools).
    </p>
  </section>
</div>

@@ -353,6 +383,28 @@
      Set as default for new chats
    </label>
  </div>

+  <!-- Capability targeting -->
+  <div>
+    <label class="mb-2 block text-sm font-medium text-theme-secondary">
+      Auto-use for model types
+    </label>
+    <p class="mb-3 text-xs text-theme-muted">
+      When a model has these capabilities and no other prompt is selected, this prompt will be used automatically.
+    </p>
+    <div class="flex flex-wrap gap-2">
+      {#each CAPABILITIES as cap (cap.id)}
+        <button
+          type="button"
+          onclick={() => toggleCapability(cap.id)}
+          class="rounded-lg border px-3 py-1.5 text-sm transition-colors {formTargetCapabilities.includes(cap.id) ? 'border-blue-500 bg-blue-500/20 text-blue-300' : 'border-theme-subtle bg-theme-tertiary text-theme-muted hover:border-theme hover:text-theme-secondary'}"
+          title={cap.description}
+        >
+          {cap.label}
+        </button>
+      {/each}
+    </div>
+  </div>
</div>

<!-- Actions -->
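A sketch of how targetCapabilities could be matched at resolution time — an assumption, since the actual matching lives in the prompt-resolution service that this change only imports:

// Hypothetical matcher: first prompt whose target capabilities overlap
// with the model's verified capabilities wins.
function findCapabilityPrompt(modelCaps: string[], prompts: Prompt[]): Prompt | undefined {
  return prompts.find((p) => (p.targetCapabilities ?? []).some((cap) => modelCaps.includes(cap)));
}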
492 frontend/src/routes/settings/+page.svelte (new file)
@@ -0,0 +1,492 @@
<script lang="ts">
  /**
   * Settings page
   * Comprehensive settings for appearance, models, memory, and more
   */

  import { onMount } from 'svelte';
  import { modelsState, uiState, settingsState, promptsState } from '$lib/stores';
  import { modelPromptMappingsState } from '$lib/stores/model-prompt-mappings.svelte.js';
  import { modelInfoService, type ModelInfo } from '$lib/services/model-info-service.js';
  import { getPrimaryModifierDisplay } from '$lib/utils';
  import { PARAMETER_RANGES, PARAMETER_LABELS, PARAMETER_DESCRIPTIONS, AUTO_COMPACT_RANGES } from '$lib/types/settings';

  const modifierKey = getPrimaryModifierDisplay();

  // Model info cache for the settings page
  let modelInfoCache = $state<Map<string, ModelInfo>>(new Map());
  let isLoadingModelInfo = $state(false);

  // Load model info for all available models
  onMount(async () => {
    isLoadingModelInfo = true;
    try {
      const models = modelsState.chatModels;
      const infos = await Promise.all(
        models.map(async (model) => {
          const info = await modelInfoService.getModelInfo(model.name);
          return [model.name, info] as [string, ModelInfo];
        })
      );
      modelInfoCache = new Map(infos);
    } finally {
      isLoadingModelInfo = false;
    }
  });

  // Handle prompt selection for a model
  async function handleModelPromptChange(modelName: string, promptId: string | null): Promise<void> {
    await modelPromptMappingsState.setMapping(modelName, promptId);
  }

  // Get the currently mapped prompt ID for a model
  function getMappedPromptId(modelName: string): string | null {
    return modelPromptMappingsState.getMapping(modelName);
  }

  // Local state for default model selection
  let defaultModel = $state<string | null>(modelsState.selectedId);

  // Save default model when it changes
  function handleModelChange(): void {
    if (defaultModel) {
      modelsState.select(defaultModel);
    }
  }

  // Get current model defaults for reset functionality
  const currentModelDefaults = $derived(
    modelsState.selectedId ? modelsState.getModelDefaults(modelsState.selectedId) : undefined
  );
</script>

<div class="h-full overflow-y-auto bg-theme-primary p-6">
|
||||
<div class="mx-auto max-w-4xl">
|
||||
<!-- Header -->
|
||||
<div class="mb-8">
|
||||
<h1 class="text-2xl font-bold text-theme-primary">Settings</h1>
|
||||
<p class="mt-1 text-sm text-theme-muted">
|
||||
Configure appearance, model defaults, and behavior
|
||||
</p>
|
||||
</div>
|
||||
|
||||
<!-- Appearance Section -->
|
||||
<section class="mb-8">
|
||||
<h2 class="mb-4 flex items-center gap-2 text-lg font-semibold text-theme-primary">
|
||||
<svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-purple-400" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
|
||||
<path stroke-linecap="round" stroke-linejoin="round" d="M7 21a4 4 0 01-4-4V5a2 2 0 012-2h4a2 2 0 012 2v12a4 4 0 01-4 4zm0 0h12a2 2 0 002-2v-4a2 2 0 00-2-2h-2.343M11 7.343l1.657-1.657a2 2 0 012.828 0l2.829 2.829a2 2 0 010 2.828l-8.486 8.485M7 17h.01" />
|
||||
</svg>
|
||||
Appearance
|
||||
</h2>
|
||||
|
||||
<div class="rounded-lg border border-theme bg-theme-secondary p-4 space-y-4">
|
||||
<!-- Dark Mode Toggle -->
|
||||
<div class="flex items-center justify-between">
|
||||
<div>
|
||||
<p class="text-sm font-medium text-theme-secondary">Dark Mode</p>
|
||||
<p class="text-xs text-theme-muted">Toggle between light and dark theme</p>
|
||||
</div>
|
||||
<button
|
||||
type="button"
|
||||
onclick={() => uiState.toggleDarkMode()}
|
||||
class="relative inline-flex h-6 w-11 flex-shrink-0 cursor-pointer rounded-full border-2 border-transparent transition-colors duration-200 ease-in-out focus:outline-none focus:ring-2 focus:ring-purple-500 focus:ring-offset-2 focus:ring-offset-theme {uiState.darkMode ? 'bg-purple-600' : 'bg-theme-tertiary'}"
|
||||
role="switch"
|
||||
aria-checked={uiState.darkMode}
|
||||
>
|
||||
<span
|
||||
class="pointer-events-none inline-block h-5 w-5 transform rounded-full bg-white shadow ring-0 transition duration-200 ease-in-out {uiState.darkMode ? 'translate-x-5' : 'translate-x-0'}"
|
||||
></span>
|
||||
</button>
|
||||
</div>
|
||||
|
||||
<!-- System Theme Sync -->
|
||||
<div class="flex items-center justify-between">
|
||||
<div>
|
||||
<p class="text-sm font-medium text-theme-secondary">Use System Theme</p>
|
||||
<p class="text-xs text-theme-muted">Match your OS light/dark preference</p>
|
||||
</div>
|
||||
<button
|
||||
type="button"
|
||||
onclick={() => uiState.useSystemTheme()}
|
||||
class="rounded-lg bg-theme-tertiary px-3 py-1.5 text-xs font-medium text-theme-secondary transition-colors hover:bg-theme-hover"
|
||||
>
|
||||
Sync with System
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
</section>
|
||||
|
||||
<!-- Chat Defaults Section -->
|
||||
<section class="mb-8">
|
||||
<h2 class="mb-4 flex items-center gap-2 text-lg font-semibold text-theme-primary">
|
||||
<svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-cyan-400" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
|
||||
<path stroke-linecap="round" stroke-linejoin="round" d="M8 12h.01M12 12h.01M16 12h.01M21 12c0 4.418-4.03 8-9 8a9.863 9.863 0 01-4.255-.949L3 20l1.395-3.72C3.512 15.042 3 13.574 3 12c0-4.418 4.03-8 9-8s9 3.582 9 8z" />
|
||||
</svg>
|
||||
Chat Defaults
|
||||
</h2>
|
||||
|
||||
<div class="rounded-lg border border-theme bg-theme-secondary p-4">
|
||||
<div>
|
||||
<label for="default-model" class="text-sm font-medium text-theme-secondary">Default Model</label>
|
||||
<p class="text-xs text-theme-muted mb-2">Model used for new conversations</p>
|
||||
<select
|
||||
id="default-model"
|
||||
bind:value={defaultModel}
|
||||
onchange={handleModelChange}
|
||||
class="w-full rounded-lg border border-theme-subtle bg-theme-tertiary px-3 py-2 text-theme-secondary focus:border-cyan-500 focus:outline-none focus:ring-1 focus:ring-cyan-500"
|
||||
>
|
||||
{#each modelsState.chatModels as model}
|
||||
<option value={model.name}>{model.name}</option>
|
||||
{/each}
|
||||
</select>
|
||||
</div>
|
||||
</div>
|
||||
</section>
|
||||
|
||||
<!-- Model-Prompt Defaults Section -->
|
||||
<section class="mb-8">
|
||||
<h2 class="mb-4 flex items-center gap-2 text-lg font-semibold text-theme-primary">
|
||||
<svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-violet-400" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
|
||||
<path stroke-linecap="round" stroke-linejoin="round" d="M19.5 14.25v-2.625a3.375 3.375 0 00-3.375-3.375h-1.5A1.125 1.125 0 0113.5 7.125v-1.5a3.375 3.375 0 00-3.375-3.375H8.25m0 12.75h7.5m-7.5 3H12M10.5 2.25H5.625c-.621 0-1.125.504-1.125 1.125v17.25c0 .621.504 1.125 1.125 1.125h12.75c.621 0 1.125-.504 1.125-1.125V11.25a9 9 0 00-9-9z" />
|
||||
</svg>
|
||||
Model-Prompt Defaults
|
||||
</h2>
|
||||
|
||||
<div class="rounded-lg border border-theme bg-theme-secondary p-4">
|
||||
<p class="text-sm text-theme-muted mb-4">
|
||||
Set default system prompts for specific models. When no other prompt is selected, the model's default will be used automatically.
|
||||
</p>
|
||||
|
||||
{#if isLoadingModelInfo}
|
||||
<div class="flex items-center justify-center py-8">
|
||||
<div class="h-6 w-6 animate-spin rounded-full border-2 border-theme-subtle border-t-violet-500"></div>
|
||||
<span class="ml-2 text-sm text-theme-muted">Loading model info...</span>
|
||||
</div>
|
||||
{:else if modelsState.chatModels.length === 0}
|
||||
<p class="text-sm text-theme-muted py-4 text-center">
|
||||
No models available. Make sure Ollama is running.
|
||||
</p>
|
||||
{:else}
|
||||
<div class="space-y-3">
|
||||
{#each modelsState.chatModels as model (model.name)}
|
||||
{@const modelInfo = modelInfoCache.get(model.name)}
|
||||
{@const mappedPromptId = getMappedPromptId(model.name)}
|
||||
<div class="rounded-lg border border-theme-subtle bg-theme-tertiary p-3">
|
||||
<div class="flex items-start justify-between gap-4">
|
||||
<div class="min-w-0 flex-1">
|
||||
<div class="flex flex-wrap items-center gap-2">
|
||||
<span class="font-medium text-theme-primary text-sm">{model.name}</span>
|
||||
{#if modelInfo?.capabilities && modelInfo.capabilities.length > 0}
|
||||
{#each modelInfo.capabilities as cap (cap)}
|
||||
<span class="rounded bg-violet-900/50 px-1.5 py-0.5 text-xs text-violet-300">
|
||||
{cap}
|
||||
</span>
|
||||
{/each}
|
||||
{/if}
|
||||
{#if modelInfo?.systemPrompt}
|
||||
<span class="rounded bg-amber-900/50 px-1.5 py-0.5 text-xs text-amber-300" title="This model has a built-in system prompt">
|
||||
embedded
|
||||
</span>
|
||||
{/if}
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<select
|
||||
value={mappedPromptId ?? ''}
|
||||
onchange={(e) => {
|
||||
const value = e.currentTarget.value;
|
||||
handleModelPromptChange(model.name, value === '' ? null : value);
|
||||
}}
|
||||
class="rounded-lg border border-theme-subtle bg-theme-secondary px-2 py-1 text-sm text-theme-secondary focus:border-violet-500 focus:outline-none focus:ring-1 focus:ring-violet-500"
|
||||
>
|
||||
<option value="">
|
||||
{modelInfo?.systemPrompt ? 'Use embedded prompt' : 'No default'}
|
||||
</option>
|
||||
{#each promptsState.prompts as prompt (prompt.id)}
|
||||
<option value={prompt.id}>{prompt.name}</option>
|
||||
{/each}
|
||||
</select>
|
||||
</div>
|
||||
|
||||
{#if modelInfo?.systemPrompt}
|
||||
<p class="mt-2 text-xs text-theme-muted line-clamp-2">
|
||||
<span class="font-medium text-amber-400">Embedded:</span> {modelInfo.systemPrompt}
|
||||
</p>
|
||||
{/if}
|
||||
</div>
|
||||
{/each}
|
||||
</div>
|
||||
{/if}
|
||||
</div>
|
||||
</section>
|
||||
|
||||
    <!-- Model Parameters Section -->
    <section class="mb-8">
      <h2 class="mb-4 flex items-center gap-2 text-lg font-semibold text-theme-primary">
        <svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-orange-400" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
          <path stroke-linecap="round" stroke-linejoin="round" d="M12 6V4m0 2a2 2 0 100 4m0-4a2 2 0 110 4m-6 8a2 2 0 100-4m0 4a2 2 0 110-4m0 4v2m0-6V4m6 6v10m6-2a2 2 0 100-4m0 4a2 2 0 110-4m0 4v2m0-6V4" />
        </svg>
        Model Parameters
      </h2>

      <div class="rounded-lg border border-theme bg-theme-secondary p-4 space-y-4">
        <!-- Use Custom Parameters Toggle -->
        <div class="flex items-center justify-between pb-4 border-b border-theme">
          <div>
            <p class="text-sm font-medium text-theme-secondary">Use Custom Parameters</p>
            <p class="text-xs text-theme-muted">Override model defaults with custom values</p>
          </div>
          <button
            type="button"
            onclick={() => settingsState.toggleCustomParameters(currentModelDefaults)}
            class="relative inline-flex h-6 w-11 flex-shrink-0 cursor-pointer rounded-full border-2 border-transparent transition-colors duration-200 ease-in-out focus:outline-none focus:ring-2 focus:ring-orange-500 focus:ring-offset-2 focus:ring-offset-theme {settingsState.useCustomParameters ? 'bg-orange-600' : 'bg-theme-tertiary'}"
            role="switch"
            aria-checked={settingsState.useCustomParameters}
          >
            <span
              class="pointer-events-none inline-block h-5 w-5 transform rounded-full bg-white shadow ring-0 transition duration-200 ease-in-out {settingsState.useCustomParameters ? 'translate-x-5' : 'translate-x-0'}"
            ></span>
          </button>
        </div>

        {#if settingsState.useCustomParameters}
          <!-- Temperature -->
          <div>
            <div class="flex items-center justify-between mb-1">
              <label for="temperature" class="text-sm font-medium text-theme-secondary">{PARAMETER_LABELS.temperature}</label>
              <span class="text-sm text-theme-muted">{settingsState.temperature.toFixed(2)}</span>
            </div>
            <p class="text-xs text-theme-muted mb-2">{PARAMETER_DESCRIPTIONS.temperature}</p>
            <input
              id="temperature"
              type="range"
              min={PARAMETER_RANGES.temperature.min}
              max={PARAMETER_RANGES.temperature.max}
              step={PARAMETER_RANGES.temperature.step}
              value={settingsState.temperature}
              oninput={(e) => settingsState.updateParameter('temperature', parseFloat(e.currentTarget.value))}
              class="w-full accent-orange-500"
            />
          </div>

          <!-- Top K -->
          <div>
            <div class="flex items-center justify-between mb-1">
              <label for="top_k" class="text-sm font-medium text-theme-secondary">{PARAMETER_LABELS.top_k}</label>
              <span class="text-sm text-theme-muted">{settingsState.top_k}</span>
            </div>
            <p class="text-xs text-theme-muted mb-2">{PARAMETER_DESCRIPTIONS.top_k}</p>
            <input
              id="top_k"
              type="range"
              min={PARAMETER_RANGES.top_k.min}
              max={PARAMETER_RANGES.top_k.max}
              step={PARAMETER_RANGES.top_k.step}
              value={settingsState.top_k}
              oninput={(e) => settingsState.updateParameter('top_k', parseInt(e.currentTarget.value))}
              class="w-full accent-orange-500"
            />
          </div>

          <!-- Top P -->
          <div>
            <div class="flex items-center justify-between mb-1">
              <label for="top_p" class="text-sm font-medium text-theme-secondary">{PARAMETER_LABELS.top_p}</label>
              <span class="text-sm text-theme-muted">{settingsState.top_p.toFixed(2)}</span>
            </div>
            <p class="text-xs text-theme-muted mb-2">{PARAMETER_DESCRIPTIONS.top_p}</p>
            <input
              id="top_p"
              type="range"
              min={PARAMETER_RANGES.top_p.min}
              max={PARAMETER_RANGES.top_p.max}
              step={PARAMETER_RANGES.top_p.step}
              value={settingsState.top_p}
              oninput={(e) => settingsState.updateParameter('top_p', parseFloat(e.currentTarget.value))}
              class="w-full accent-orange-500"
            />
          </div>

          <!-- Context Length -->
          <div>
            <div class="flex items-center justify-between mb-1">
              <label for="num_ctx" class="text-sm font-medium text-theme-secondary">{PARAMETER_LABELS.num_ctx}</label>
              <span class="text-sm text-theme-muted">{settingsState.num_ctx.toLocaleString()}</span>
            </div>
            <p class="text-xs text-theme-muted mb-2">{PARAMETER_DESCRIPTIONS.num_ctx}</p>
            <input
              id="num_ctx"
              type="range"
              min={PARAMETER_RANGES.num_ctx.min}
              max={PARAMETER_RANGES.num_ctx.max}
              step={PARAMETER_RANGES.num_ctx.step}
              value={settingsState.num_ctx}
              oninput={(e) => settingsState.updateParameter('num_ctx', parseInt(e.currentTarget.value))}
              class="w-full accent-orange-500"
            />
          </div>

          <!-- Reset Button -->
          <div class="pt-2">
            <button
              type="button"
              onclick={() => settingsState.resetToDefaults(currentModelDefaults)}
              class="text-sm text-orange-400 hover:text-orange-300 transition-colors"
            >
              Reset to model defaults
            </button>
          </div>
        {:else}
          <p class="text-sm text-theme-muted py-2">
            Using model defaults. Enable custom parameters to adjust temperature, sampling, and context length.
          </p>
        {/if}
      </div>
    </section>

    <!-- Memory Management Section -->
    <section class="mb-8">
      <h2 class="mb-4 flex items-center gap-2 text-lg font-semibold text-theme-primary">
        <svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-emerald-400" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
          <path stroke-linecap="round" stroke-linejoin="round" d="M4 7v10c0 2.21 3.582 4 8 4s8-1.79 8-4V7M4 7c0 2.21 3.582 4 8 4s8-1.79 8-4M4 7c0-2.21 3.582-4 8-4s8 1.79 8 4m0 5c0 2.21-3.582 4-8 4s-8-1.79-8-4" />
        </svg>
        Memory Management
      </h2>

      <div class="rounded-lg border border-theme bg-theme-secondary p-4 space-y-4">
        <!-- Auto-Compact Toggle -->
        <div class="flex items-center justify-between pb-4 border-b border-theme">
          <div>
            <p class="text-sm font-medium text-theme-secondary">Auto-Compact</p>
            <p class="text-xs text-theme-muted">Automatically summarize older messages when context usage is high</p>
          </div>
          <button
            type="button"
            onclick={() => settingsState.toggleAutoCompact()}
            class="relative inline-flex h-6 w-11 flex-shrink-0 cursor-pointer rounded-full border-2 border-transparent transition-colors duration-200 ease-in-out focus:outline-none focus:ring-2 focus:ring-emerald-500 focus:ring-offset-2 focus:ring-offset-theme {settingsState.autoCompactEnabled ? 'bg-emerald-600' : 'bg-theme-tertiary'}"
            role="switch"
            aria-checked={settingsState.autoCompactEnabled}
          >
            <span
              class="pointer-events-none inline-block h-5 w-5 transform rounded-full bg-white shadow ring-0 transition duration-200 ease-in-out {settingsState.autoCompactEnabled ? 'translate-x-5' : 'translate-x-0'}"
            ></span>
          </button>
        </div>

        {#if settingsState.autoCompactEnabled}
          <!-- Threshold Slider -->
          <div>
            <div class="flex items-center justify-between mb-1">
              <label for="compact-threshold" class="text-sm font-medium text-theme-secondary">Context Threshold</label>
              <span class="text-sm text-theme-muted">{settingsState.autoCompactThreshold}%</span>
            </div>
            <p class="text-xs text-theme-muted mb-2">Trigger compaction when context usage exceeds this percentage</p>
            <input
              id="compact-threshold"
              type="range"
              min={AUTO_COMPACT_RANGES.threshold.min}
              max={AUTO_COMPACT_RANGES.threshold.max}
              step={AUTO_COMPACT_RANGES.threshold.step}
              value={settingsState.autoCompactThreshold}
              oninput={(e) => settingsState.updateAutoCompactThreshold(parseInt(e.currentTarget.value))}
              class="w-full accent-emerald-500"
            />
            <div class="flex justify-between text-xs text-theme-muted mt-1">
              <span>{AUTO_COMPACT_RANGES.threshold.min}%</span>
              <span>{AUTO_COMPACT_RANGES.threshold.max}%</span>
            </div>
          </div>

          <!-- Preserve Count -->
          <div>
            <div class="flex items-center justify-between mb-1">
              <label for="preserve-count" class="text-sm font-medium text-theme-secondary">Messages to Preserve</label>
              <span class="text-sm text-theme-muted">{settingsState.autoCompactPreserveCount}</span>
            </div>
            <p class="text-xs text-theme-muted mb-2">Number of recent messages to keep intact (not summarized)</p>
            <input
              id="preserve-count"
              type="range"
              min={AUTO_COMPACT_RANGES.preserveCount.min}
              max={AUTO_COMPACT_RANGES.preserveCount.max}
              step={AUTO_COMPACT_RANGES.preserveCount.step}
              value={settingsState.autoCompactPreserveCount}
              oninput={(e) => settingsState.updateAutoCompactPreserveCount(parseInt(e.currentTarget.value))}
              class="w-full accent-emerald-500"
            />
            <div class="flex justify-between text-xs text-theme-muted mt-1">
              <span>{AUTO_COMPACT_RANGES.preserveCount.min}</span>
              <span>{AUTO_COMPACT_RANGES.preserveCount.max}</span>
            </div>
          </div>
        {:else}
          <p class="text-sm text-theme-muted py-2">
            Enable auto-compact to automatically manage context usage. When enabled, older messages
            will be summarized when context usage exceeds your threshold.
          </p>
        {/if}
      </div>
    </section>

    <!-- Keyboard Shortcuts Section -->
    <section class="mb-8">
      <h2 class="mb-4 flex items-center gap-2 text-lg font-semibold text-theme-primary">
        <svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-blue-400" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
          <path stroke-linecap="round" stroke-linejoin="round" d="M12 6.253v13m0-13C10.832 5.477 9.246 5 7.5 5S4.168 5.477 3 6.253v13C4.168 18.477 5.754 18 7.5 18s3.332.477 4.5 1.253m0-13C13.168 5.477 14.754 5 16.5 5c1.747 0 3.332.477 4.5 1.253v13C19.832 18.477 18.247 18 16.5 18c-1.746 0-3.332.477-4.5 1.253" />
        </svg>
        Keyboard Shortcuts
      </h2>

      <div class="rounded-lg border border-theme bg-theme-secondary p-4">
        <div class="space-y-3">
          <div class="flex justify-between items-center">
            <span class="text-sm text-theme-secondary">New Chat</span>
            <kbd class="rounded bg-theme-tertiary px-2 py-1 font-mono text-xs text-theme-muted">{modifierKey}+N</kbd>
          </div>
          <div class="flex justify-between items-center">
            <span class="text-sm text-theme-secondary">Search</span>
            <kbd class="rounded bg-theme-tertiary px-2 py-1 font-mono text-xs text-theme-muted">{modifierKey}+K</kbd>
          </div>
          <div class="flex justify-between items-center">
            <span class="text-sm text-theme-secondary">Toggle Sidebar</span>
            <kbd class="rounded bg-theme-tertiary px-2 py-1 font-mono text-xs text-theme-muted">{modifierKey}+B</kbd>
          </div>
          <div class="flex justify-between items-center">
            <span class="text-sm text-theme-secondary">Send Message</span>
            <kbd class="rounded bg-theme-tertiary px-2 py-1 font-mono text-xs text-theme-muted">Enter</kbd>
          </div>
          <div class="flex justify-between items-center">
            <span class="text-sm text-theme-secondary">New Line</span>
            <kbd class="rounded bg-theme-tertiary px-2 py-1 font-mono text-xs text-theme-muted">Shift+Enter</kbd>
          </div>
        </div>
      </div>
    </section>

    <!-- About Section -->
    <section>
      <h2 class="mb-4 flex items-center gap-2 text-lg font-semibold text-theme-primary">
        <svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-gray-400" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
          <path stroke-linecap="round" stroke-linejoin="round" d="M13 16h-1v-4h-1m1-4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
        </svg>
        About
      </h2>

      <div class="rounded-lg border border-theme bg-theme-secondary p-4">
        <div class="flex items-center gap-4">
          <div class="rounded-lg bg-theme-tertiary p-3">
            <svg xmlns="http://www.w3.org/2000/svg" class="h-8 w-8 text-emerald-400" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="1.5">
              <path stroke-linecap="round" stroke-linejoin="round" d="M20.25 6.375c0 2.278-3.694 4.125-8.25 4.125S3.75 8.653 3.75 6.375m16.5 0c0-2.278-3.694-4.125-8.25-4.125S3.75 4.097 3.75 6.375m16.5 0v11.25c0 2.278-3.694 4.125-8.25 4.125s-8.25-1.847-8.25-4.125V6.375m16.5 0v3.75m-16.5-3.75v3.75m16.5 0v3.75C20.25 16.153 16.556 18 12 18s-8.25-1.847-8.25-4.125v-3.75m16.5 0c0 2.278-3.694 4.125-8.25 4.125s-8.25-1.847-8.25-4.125" />
            </svg>
          </div>
          <div>
            <h3 class="font-semibold text-theme-primary">Vessel</h3>
            <p class="text-sm text-theme-muted">
              A modern interface for local AI with chat, tools, and memory management.
            </p>
          </div>
        </div>
      </div>
    </section>
  </div>
</div>

@@ -4,11 +4,12 @@ import { defineConfig } from 'vite';
// Use environment variable or default to localhost (works with host network mode)
const ollamaUrl = process.env.OLLAMA_API_URL || 'http://localhost:11434';
const backendUrl = process.env.BACKEND_URL || 'http://localhost:9090';
+const devPort = parseInt(process.env.DEV_PORT || '7842', 10);

export default defineConfig({
  plugins: [sveltekit()],
  server: {
-    port: 7842,
+    port: devPort,
    proxy: {
      // Backend health check
      '/health': {