WebLLM Cache & Logo Storage Fix

Shipping a feature is the start, not the finish. Two issues surfaced within 24 hours of the WebLLM launch that would have made the feature impractical in real use. Both are fixed.

No duplicate model downloads

The initial WebLLM implementation had a critical omission: it re-downloaded the AI model on every page load, whether or not a cached copy existed. For a model that weighs 1–2 GB, this made the offline feature essentially unusable, because opening the app triggered a download of a gigabyte or more every single time.

The fix: before initiating any download, the OfflineService checks the browser's origin private file system (OPFS) for a cached model. If one exists, it loads from cache instead of downloading. A user who has loaded the model once does not download it again unless they explicitly clear their browser data or request a different model version.
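The check described above can be sketched roughly as follows. This is a minimal illustration, not the actual OfflineService code: the names isModelCached and resolveModelSource, the file name used in the usage note, and the rule that an empty file counts as a cache miss are all assumptions. The directory handle is passed in as a parameter so the logic can be exercised outside a browser.

```typescript
// Minimal structural types, so the sketch does not depend on DOM typings.
// A real FileSystemDirectoryHandle from the OPFS satisfies this shape.
interface FileHandleLike {
  getFile(): Promise<{ size: number }>;
}
interface DirectoryHandleLike {
  getFileHandle(name: string, options?: { create?: boolean }): Promise<FileHandleLike>;
}

type ModelSource = "cache" | "network";

// Returns true when a non-empty file with the given name exists in the directory.
// Assumption: a zero-byte file (e.g. an interrupted write) is treated as a miss.
async function isModelCached(root: DirectoryHandleLike, fileName: string): Promise<boolean> {
  try {
    // Called without { create: true }, so the promise rejects if the file is missing.
    const handle = await root.getFileHandle(fileName);
    const file = await handle.getFile();
    return file.size > 0;
  } catch {
    // A missing file, or any other lookup failure, is treated as a cache miss.
    return false;
  }
}

// Runs the (hypothetical) download callback only when the cache check misses.
async function resolveModelSource(
  root: DirectoryHandleLike,
  fileName: string,
  download: () => Promise<void>,
): Promise<ModelSource> {
  if (await isModelCached(root, fileName)) {
    return "cache";
  }
  await download();
  return "network";
}
```

In a browser, the root handle would come from navigator.storage.getDirectory(), and the download callback would wrap whatever actually fetches the model and writes it to the OPFS. A call might look like resolveModelSource(root, "model.bin", downloadModel), where both "model.bin" and downloadModel are placeholder names.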

The cache check is fast: it is a local file lookup in the OPFS, not a network request. From the user's perspective, the model either loads quickly from cache or triggers a one-time download with a progress indicator. The behavior is now what it always should have been.

Logo storage: database paths, not data URLs

Room logos were being stored as base64 data URLs directly in the room settings object. This had two problems. First, a data URL can easily exceed 100 KB per image, which bloated every socket event that included room metadata. Second, a data URL has no address of its own: the image is always inline, re-transmitted with every payload that carries it, and never cacheable by the browser as a separate resource.
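To put a number on the size problem: base64 encodes every 3 bytes of input as 4 characters, so the encoded image is about a third larger than the file itself. The helper below is only an illustration of that arithmetic (dataUrlLength is a made-up name, and the 75,000-byte image is an example value, not a measurement from the app).

```typescript
// Length in characters of a base64 data URL for a payload of `byteLength` bytes.
// Padded base64 emits 4 characters for every 3 input bytes, rounded up.
function dataUrlLength(byteLength: number, mimeType: string): number {
  const prefix = `data:${mimeType};base64,`;
  const encoded = 4 * Math.ceil(byteLength / 3);
  return prefix.length + encoded;
}

// A 75,000-byte PNG already crosses the 100,000-character mark once encoded.
const exampleLength = dataUrlLength(75000, "image/png"); // 100022
```

A stored path such as /uploads/room-logo-abc123.png is a few dozen characters no matter how large the image is.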

Logos are now uploaded via the existing file upload endpoint (POST /api/upload), and the resulting relative path (e.g. /uploads/room-logo-abc123.png) is stored in the database. The data URL is used only as a local client-side preview while the upload is in flight. Once the upload completes, the relative path replaces the data URL in the room record.
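The flow above, sketched under some assumptions: the LogoDeps interface, setRoomLogo, and every function name in it are hypothetical, and the response shape of POST /api/upload is not specified here, so the upload step is modeled as a function that simply resolves to the relative path. Dependencies are injected so the sequence can be run without a browser or server.

```typescript
// Hypothetical dependencies, injected so the flow can run outside a browser.
interface LogoDeps<F> {
  readAsDataUrl(file: F): Promise<string>;    // e.g. FileReader.readAsDataURL in a browser
  upload(file: F): Promise<string>;           // e.g. POST /api/upload, resolving to the relative path
  showPreview(src: string): void;             // local UI only, never persisted
  saveLogoPath(path: string): Promise<void>;  // writes the path to the room record
}

function isDataUrl(value: string): boolean {
  return value.startsWith("data:");
}

// Shows a data URL preview, uploads the file, then persists only the relative path.
async function setRoomLogo<F>(file: F, deps: LogoDeps<F>): Promise<string> {
  // 1. Local preview while the upload is in flight. The data URL stays on the client.
  deps.showPreview(await deps.readAsDataUrl(file));

  // 2. Upload and obtain the path the server assigned to the file.
  const path = await deps.upload(file);

  // 3. Guard: refuse to persist anything that is not a relative path.
  if (isDataUrl(path) || !path.startsWith("/")) {
    throw new Error("upload did not return a relative path");
  }

  // 4. Persist the path, then point the UI at the stable URL instead of the preview.
  await deps.saveLogoPath(path);
  deps.showPreview(path);
  return path;
}
```

The guard in step 3 is a design choice for the sketch rather than something the post describes: it makes it hard for a data URL to slip back into the room record through a later code change.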

This change makes logos behave like any other static asset: served once, cached by the browser, and referenced by a stable URL regardless of the backend's current deployment address.

Why it matters

Re-downloading a 1–2 GB model on every page load would have made WebLLM unusable in practice: not just slow, but actively hostile to users on metered connections. The cache check is what makes offline AI a real feature rather than a demo. The logo storage change is about correctness: data URLs don't belong in a database column, and keeping them there would have caused scaling problems as more rooms added custom branding.