Commit cac3cb0
🤖 feat: add google:gemini-3-flash-preview model (#1216)
Add `google:gemini-3-flash-preview` as a known model with Flash-specific thinking levels.

### Changes

- Add `GEMINI_3_FLASH` to `knownModels.ts` with alias `gemini-3-flash`
- Update thinking policy: Flash supports `off/low/medium/high` (vs Pro's `low/high`)
- Update `providerOptions.ts` to pass `medium` directly for Flash (Pro maps `medium` → `high`)
- Refresh `models.json` from LiteLLM (includes gemini-3-flash-preview pricing)
- Fix `modelStats.ts` type to allow `null` for `max_output_tokens` (LiteLLM compat)

---

<details>
<summary>📋 Implementation Plan</summary>

# Plan: Add `google:gemini-3-flash-preview` to Known Models

## Summary

Add a new Google Gemini 3 Flash Preview model entry and update the thinking policy to reflect Flash-specific thinking levels.

**Estimated LoC change:** +25 lines (production code only, excluding auto-fetched JSON)

## Changes

### 1. Add model definition to `src/common/constants/knownModels.ts`

Insert a new `GEMINI_3_FLASH` entry after the existing `GEMINI_3_PRO` definition:

```typescript
GEMINI_3_FLASH: {
  provider: "google",
  providerModelId: "gemini-3-flash-preview",
  aliases: ["gemini-3-flash"],
  tokenizerOverride: "google/gemini-2.5-pro",
},
```

<details>
<summary>Why these values?</summary>

- **provider:** `"google"` — matches the existing Gemini model pattern
- **providerModelId:** `"gemini-3-flash-preview"` — the exact model ID requested
- **aliases:** `["gemini-3-flash"]` — convenient shorthand without the `-preview` suffix
- **tokenizerOverride:** `"google/gemini-2.5-pro"` — reuses the same tokenizer as `GEMINI_3_PRO`, since `ai-tokenizer` likely doesn't have Gemini 3-specific tokenizers yet
- **warm:** Not set (defaults to false) — Flash models are typically secondary, not preloaded

</details>

### 2. Update thinking policy in `src/browser/utils/thinking/policy.ts`

Per [Google's Gemini 3 docs](https://ai.google.dev/gemini-api/docs/gemini-3#rest), thinking levels differ between Pro and Flash:

| Model | Thinking Levels |
|-------|-----------------|
| Gemini 3 Pro | `low`, `high` |
| Gemini 3 Flash | `minimal`, `low`, `medium`, `high` |

**Note:** Google's `minimal` maps to mux's `off` (no/minimal thinking).

Update the `gemini-3` check in `getThinkingPolicyForModel()` to differentiate Flash from Pro:

```typescript
// Gemini 3 Flash supports 4 levels: off (minimal), low, medium, high
if (withoutProviderNamespace.includes("gemini-3-flash")) {
  return ["off", "low", "medium", "high"];
}
// Gemini 3 Pro only supports "low" and "high" reasoning levels
if (withoutProviderNamespace.includes("gemini-3")) {
  return ["low", "high"];
}
```

The Flash check must come **before** the generic `gemini-3` check, since the Flash model ID contains `gemini-3` as a substring.

### 3. Refresh `src/common/utils/tokens/models.json` from LiteLLM

LiteLLM already includes `gemini-3-flash-preview`. Just run the existing update script:

```bash
bun scripts/update_models.ts
```

This fetches the latest model data from [LiteLLM's model_prices_and_context_window.json](https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json).

### 4. Update `src/common/utils/ai/providerOptions.ts` for Flash-specific thinking levels

The current code treats all `gemini-3` models the same, mapping `medium` → `high`. But Flash actually supports `medium` natively:

| Model | Google API levels |
|-------|-------------------|
| Gemini 3 Pro | `low`, `high` |
| Gemini 3 Flash | `minimal`, `low`, `medium`, `high` |

Update the Google provider options to differentiate Flash from Pro:

```typescript
if (isGemini3) {
  const isFlash = modelString.includes("gemini-3-flash");
  if (isFlash) {
    // Flash supports: minimal (maps to off), low, medium, high
    // When off, we don't set thinkingConfig at all (API defaults to minimal)
    thinkingConfig.thinkingLevel = effectiveThinking === "xhigh" ? "high" : effectiveThinking;
  } else {
    // Pro only supports: low, high - map medium/xhigh to high
    thinkingConfig.thinkingLevel =
      effectiveThinking === "medium" || effectiveThinking === "xhigh"
        ? "high"
        : effectiveThinking;
  }
}
```

### 5. Update tests in `src/browser/utils/thinking/policy.test.ts`

Add a test case for the new Flash model:

```typescript
expect(getThinkingPolicyForModel("google:gemini-3-flash-preview")).toEqual([
  "off",
  "low",
  "medium",
  "high",
]);
```

## Systems That Work Automatically

| System | Reason |
|--------|--------|
| **Model display** | Generic Gemini parsing in `modelDisplay.ts` will format as "Gemini 3 Flash Preview" |
| **Tokenizer** | Override specified; falls back to the gemini-2.5-pro tokenizer |

## Verification

After implementation:

1. `make typecheck` — ensures type correctness
2. `bun test src/browser/utils/thinking/policy.test.ts` — verify the thinking policy
3. Verify the model appears in the `KNOWN_MODEL_OPTIONS` export

</details>

---

_Generated with `mux` • Model: `anthropic:claude-opus-4-5` • Thinking: `high`_

Signed-off-by: Thomas Kosiewski <tk@coder.com>
1 parent f51ad35 commit cac3cb0

File tree

7 files changed: +1074, -107 lines

docs/models.mdx

Lines changed: 14 additions & 13 deletions

```diff
@@ -15,19 +15,20 @@ mux ships with a curated set of first-class models that we keep up to date with

 {/* BEGIN KNOWN_MODELS_TABLE */}

-| Model | ID | Aliases | Default |
-| -------------------- | --------------------------- | ---------------------------------------- | ------- |
-| Opus 4.5 | anthropic:claude-opus-4-5 | `opus` ||
-| Sonnet 4.5 | anthropic:claude-sonnet-4-5 | `sonnet` | |
-| Haiku 4.5 | anthropic:claude-haiku-4-5 | `haiku` | |
-| GPT-5.2 | openai:gpt-5.2 | `gpt` | |
-| GPT-5.2 Pro | openai:gpt-5.2-pro | `gpt-pro` | |
-| GPT-5.1 Codex | openai:gpt-5.1-codex | `codex` | |
-| GPT-5.1 Codex Mini | openai:gpt-5.1-codex-mini | `codex-mini` | |
-| GPT-5.1 Codex Max | openai:gpt-5.1-codex-max | `codex-max` | |
-| Gemini 3 Pro Preview | google:gemini-3-pro-preview | `gemini-3`, `gemini-3-pro` | |
-| Grok 4 1 Fast | xai:grok-4-1-fast | `grok`, `grok-4`, `grok-4.1`, `grok-4-1` | |
-| Grok Code Fast 1 | xai:grok-code-fast-1 | `grok-code` | |
+| Model | ID | Aliases | Default |
+| ---------------------- | ----------------------------- | ---------------------------------------- | ------- |
+| Opus 4.5 | anthropic:claude-opus-4-5 | `opus` ||
+| Sonnet 4.5 | anthropic:claude-sonnet-4-5 | `sonnet` | |
+| Haiku 4.5 | anthropic:claude-haiku-4-5 | `haiku` | |
+| GPT-5.2 | openai:gpt-5.2 | `gpt` | |
+| GPT-5.2 Pro | openai:gpt-5.2-pro | `gpt-pro` | |
+| GPT-5.1 Codex | openai:gpt-5.1-codex | `codex` | |
+| GPT-5.1 Codex Mini | openai:gpt-5.1-codex-mini | `codex-mini` | |
+| GPT-5.1 Codex Max | openai:gpt-5.1-codex-max | `codex-max` | |
+| Gemini 3 Pro Preview | google:gemini-3-pro-preview | `gemini-3`, `gemini-3-pro` | |
+| Gemini 3 Flash Preview | google:gemini-3-flash-preview | `gemini-3-flash` | |
+| Grok 4 1 Fast | xai:grok-4-1-fast | `grok`, `grok-4`, `grok-4.1`, `grok-4-1` | |
+| Grok Code Fast 1 | xai:grok-code-fast-1 | `grok-code` | |

 {/* END KNOWN_MODELS_TABLE */}
```

src/browser/utils/thinking/policy.test.ts

Lines changed: 10 additions & 1 deletion

```diff
@@ -159,10 +159,19 @@ describe("getThinkingPolicyForModel", () => {
     ]);
   });

-  test("returns low/high for Gemini 3", () => {
+  test("returns low/high for Gemini 3 Pro", () => {
     expect(getThinkingPolicyForModel("google:gemini-3-pro-preview")).toEqual(["low", "high"]);
   });

+  test("returns off/low/medium/high for Gemini 3 Flash", () => {
+    expect(getThinkingPolicyForModel("google:gemini-3-flash-preview")).toEqual([
+      "off",
+      "low",
+      "medium",
+      "high",
+    ]);
+  });
+
   test("returns all levels for other providers", () => {
     expect(getThinkingPolicyForModel("anthropic:claude-opus-4")).toEqual([
       "off",
```

src/browser/utils/thinking/policy.ts

Lines changed: 5 additions & 0 deletions

```diff
@@ -66,6 +66,11 @@ export function getThinkingPolicyForModel(modelString: string): ThinkingPolicy {
     return ["high"];
   }

+  // Gemini 3 Flash supports 4 levels: off (minimal), low, medium, high
+  if (withoutProviderNamespace.includes("gemini-3-flash")) {
+    return ["off", "low", "medium", "high"];
+  }
+
   // Gemini 3 Pro only supports "low" and "high" reasoning levels
   if (withoutProviderNamespace.includes("gemini-3")) {
     return ["low", "high"];
```
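The branch ordering in this hunk can be checked with a standalone sketch. This is an illustrative re-implementation, not the real `policy.ts` (the namespace-stripping line and the fallback branch are assumptions): the Flash check must run first, because `gemini-3-flash-preview` also contains `gemini-3` as a substring.

```typescript
type ThinkingLevel = "off" | "low" | "medium" | "high";

function thinkingPolicySketch(modelString: string): ThinkingLevel[] {
  // Drop an optional "provider:" prefix, e.g. "google:gemini-3-flash-preview"
  const withoutProviderNamespace = modelString.split(":").pop() ?? modelString;

  // Gemini 3 Flash supports 4 levels: off (minimal), low, medium, high
  if (withoutProviderNamespace.includes("gemini-3-flash")) {
    return ["off", "low", "medium", "high"];
  }
  // Gemini 3 Pro only supports "low" and "high" reasoning levels
  if (withoutProviderNamespace.includes("gemini-3")) {
    return ["low", "high"];
  }
  // Fallback for everything else: all levels allowed
  return ["off", "low", "medium", "high"];
}
```

Swapping the two `includes` checks would send `gemini-3-flash-preview` into the Pro branch and silently drop `off` and `medium`.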

src/common/constants/knownModels.ts

Lines changed: 6 additions & 0 deletions

```diff
@@ -83,6 +83,12 @@ const MODEL_DEFINITIONS = {
     aliases: ["gemini-3", "gemini-3-pro"],
     tokenizerOverride: "google/gemini-2.5-pro",
   },
+  GEMINI_3_FLASH: {
+    provider: "google",
+    providerModelId: "gemini-3-flash-preview",
+    aliases: ["gemini-3-flash"],
+    tokenizerOverride: "google/gemini-2.5-pro",
+  },
   GROK_4_1: {
     provider: "xai",
     providerModelId: "grok-4-1-fast",
```
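For intuition, here is how an entry of this shape might be resolved from an alias to a full `provider:id` model string. The entry mirrors the diff above, but `resolveAlias` and `SKETCH_DEFINITIONS` are hypothetical names, not part of `knownModels.ts`.

```typescript
// Hypothetical lookup over known-model definitions (shape mirrors the diff above)
const SKETCH_DEFINITIONS: Record<
  string,
  { provider: string; providerModelId: string; aliases: string[] }
> = {
  GEMINI_3_FLASH: {
    provider: "google",
    providerModelId: "gemini-3-flash-preview",
    aliases: ["gemini-3-flash"],
  },
};

function resolveAlias(nameOrAlias: string): string | undefined {
  for (const def of Object.values(SKETCH_DEFINITIONS)) {
    // Match either the exact provider model ID or any registered alias
    if (def.providerModelId === nameOrAlias || def.aliases.includes(nameOrAlias)) {
      return `${def.provider}:${def.providerModelId}`;
    }
  }
  return undefined;
}
```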

src/common/utils/ai/providerOptions.ts

Lines changed: 12 additions & 5 deletions

```diff
@@ -264,11 +264,18 @@ export function buildProviderOptions(
     };

     if (isGemini3) {
-      // Gemini 3 uses thinkingLevel (low/high) - map medium/xhigh to supported values
-      thinkingConfig.thinkingLevel =
-        effectiveThinking === "medium" || effectiveThinking === "xhigh"
-          ? "high"
-          : effectiveThinking;
+      const isFlash = modelString.includes("gemini-3-flash");
+      if (isFlash) {
+        // Flash supports: minimal (maps to off), low, medium, high
+        // When off, we don't set thinkingConfig at all (API defaults to minimal)
+        thinkingConfig.thinkingLevel = effectiveThinking === "xhigh" ? "high" : effectiveThinking;
+      } else {
+        // Pro only supports: low, high - map medium/xhigh to high
+        thinkingConfig.thinkingLevel =
+          effectiveThinking === "medium" || effectiveThinking === "xhigh"
+            ? "high"
+            : effectiveThinking;
+      }
     } else {
       // Gemini 2.5 uses thinkingBudget
       const budget = GEMINI_THINKING_BUDGETS[effectiveThinking];
```
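The Pro/Flash clamping in this hunk can be expressed as a pure function. This is an illustrative sketch, not the real `providerOptions.ts` API (the real code assigns onto `thinkingConfig` inline rather than calling a helper).

```typescript
type ThinkingSetting = "off" | "low" | "medium" | "high" | "xhigh";

function mapGemini3ThinkingLevel(modelString: string, effectiveThinking: ThinkingSetting): string {
  if (modelString.includes("gemini-3-flash")) {
    // Flash accepts low/medium/high natively, so only xhigh needs clamping
    return effectiveThinking === "xhigh" ? "high" : effectiveThinking;
  }
  // Pro accepts only low/high: both medium and xhigh clamp to high
  return effectiveThinking === "medium" || effectiveThinking === "xhigh"
    ? "high"
    : effectiveThinking;
}
```

The visible behavior change of this commit is the middle case: `medium` now passes through unchanged for Flash instead of being promoted to `high`.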

src/common/utils/tokens/modelStats.ts

Lines changed: 2 additions & 2 deletions

```diff
@@ -12,8 +12,8 @@ export interface ModelStats {
 }

 interface RawModelData {
-  max_input_tokens?: number | string;
-  max_output_tokens?: number | string;
+  max_input_tokens?: number | string | null;
+  max_output_tokens?: number | string | null;
   input_cost_per_token?: number;
   output_cost_per_token?: number;
   cache_creation_input_token_cost?: number;
```
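The type widening above exists because LiteLLM's JSON can contain entries like `"max_output_tokens": null`, which a consumer has to treat as "unknown" rather than coerce to a number. A minimal sketch of null-safe normalization, assuming a hypothetical `toTokenCount` helper (not the real `modelStats.ts` code):

```typescript
// Normalize a LiteLLM token-limit field: null/undefined and unparsable
// strings become undefined ("unknown"); numeric strings are parsed.
function toTokenCount(raw: number | string | null | undefined): number | undefined {
  if (raw === null || raw === undefined) return undefined;
  const n = typeof raw === "string" ? Number(raw) : raw;
  return Number.isFinite(n) ? n : undefined;
}
```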
