feat: add multimodal embedding support for gemini-embedding-2-preview by ajroetker · Pull Request #7 · antflydb/antfly

ajroetker · 2026-03-18T00:07:14Z

Summary

Add gemini-embedding-2-preview to the known model registry with multimodal capabilities (images, audio, video, PDFs, 3072 dims, fusion support)
Introduce a `multimodal: true` config escape hatch so users can declare any future model as multimodal without waiting for a registry update
Update the Gemini provider (google_gemini.go) to route multimodal content through the genai SDK's NewPartFromBytes/NewContentFromParts path
Wire GetConfigCapabilities() through all providers (OpenAI, Bedrock, Ollama, Cohere, OpenRouter, Termite)
Guard the legacy Vertex provider's multimodal path to only allow multimodalembedding models, with a clear error directing users to use provider: "gemini" with GOOGLE_GENAI_USE_VERTEXAI=1 for other models
Fix copy-paste "ollama" error message in GenaiGoogleImpl.GetModels()
Simplify: merge config_capabilities.go into capabilities.go, reuse allText() in vertex.go, clean up NewTermiteClient signature
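
The registry-plus-override behavior described above can be sketched as follows. This is a minimal illustration, not the code from this PR: `Capabilities`, `ModelConfig`, `knownModels`, and `ResolveCapabilities` are hypothetical names, and the real `GetConfigCapabilities()` signature may differ.

```go
package main

import "fmt"

// Capabilities describes what a model can embed. All names in this sketch
// are hypothetical and are not taken from the antfly source.
type Capabilities struct {
	Multimodal bool
	Dimensions int
	Fusion     bool // text and media can be embedded together as one vector
}

// knownModels stands in for the known model registry.
var knownModels = map[string]Capabilities{
	"gemini-embedding-2-preview": {Multimodal: true, Dimensions: 3072, Fusion: true},
}

// ModelConfig mirrors a user config; Multimodal corresponds to `multimodal: true`.
type ModelConfig struct {
	Model      string
	Multimodal bool
}

// ResolveCapabilities starts from the registry entry, if any, and then
// applies the config override so unregistered models can opt in.
func ResolveCapabilities(cfg ModelConfig) Capabilities {
	caps := knownModels[cfg.Model] // zero value (text only) when the model is unknown
	if cfg.Multimodal {
		caps.Multimodal = true
	}
	return caps
}

func main() {
	fmt.Println(ResolveCapabilities(ModelConfig{Model: "gemini-embedding-2-preview"}).Multimodal)
	fmt.Println(ResolveCapabilities(ModelConfig{Model: "some-future-model"}).Multimodal)
	fmt.Println(ResolveCapabilities(ModelConfig{Model: "some-future-model", Multimodal: true}).Multimodal)
}
```

In this sketch the override can only turn multimodal support on, never off, so a registry entry stays authoritative for models that are already known.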

Test plan

All lib/embeddings/... tests pass locally
Live integration tests pass with GEMINI_API_KEY set (text, image, fused text+image)
Live tests skip cleanly without credentials (CI-safe)
Pure logic tests always pass (capabilities resolution, config override, multimodal config)
CI passes

Add gemini-embedding-2-preview to the known model registry with multimodal capabilities (images, audio, video, PDFs, 3072 dims, fusion support). Introduce `multimodal: true` config escape hatch so users can declare any future model as multimodal without waiting for a registry update. The GenaiGoogleImpl (gemini provider) now routes multimodal content through the genai SDK's NewPartFromBytes/NewContentFromParts path.

Also:
- Wire GetConfigCapabilities() through all providers (OpenAI, Bedrock, Ollama, Cohere, OpenRouter, Termite)
- Use capabilities-based routing in vertex provider instead of hardcoded model name check
- Guard vertex provider's multimodal path to only allow multimodalembedding models (legacy Prediction API), with clear error pointing to gemini provider for other models
- Fix copy-paste "ollama" error message in GenaiGoogleImpl.GetModels()
- Simplify: merge config_capabilities.go into capabilities.go, reuse allText() in vertex.go, clean up NewTermiteClient signature
- Add integration tests (skip without credentials) and capability tests

ajroetker added 2 commits March 17, 2026 17:05

fix: update bleve to v2.5.8-antfly002 and zapx to v17.0.2-antfly004

baa8563

Picks up fix for nil map panic in zapx invertedIndexCache.Clear() and synonymIndexCache.Clear() during concurrent shard split operations.

ajroetker merged commit 300abf0 into main Mar 18, 2026
5 of 6 checks passed

ajroetker deleted the feat/multimodal-capability-override branch March 18, 2026 03:47

ajroetker merged 2 commits into main from feat/multimodal-capability-override
