Feature Request / Improvement
Background
Modern data architectures increasingly rely on vector embeddings for AI/ML workflows, semantic search, and recommendation systems. Major databases now support vector types:
- PostgreSQL (pgvector extension)
- Oracle 23c (VECTOR type)
- Elasticsearch (dense_vector field type)
- MySQL 9.0 (VECTOR type)
Problem Statement
Trino currently cannot:
- Read vector columns from sources that support them (Elasticsearch
dense_vector, Oracle 23c VECTOR)
- Store or pass vector data between connectors
- Perform basic vector operations needed for ML pipelines
This forces users to maintain separate query paths for vector operations, breaking Trino's "query anything, anywhere" value proposition.
Proposed Solution
Add a native VECTOR or ARRAY<DOUBLE> data type with:
Phase 1: Basic Support
Phase 2: Basic Operations
Phase 3: Advanced (Future)
Use Cases
- Unified Analytics: Join vector search results from Elasticsearch with business data in PostgreSQL
- ML Feature Engineering: Transform and aggregate embedding vectors across multiple data sources
- Cross-Source Vector Search: Compare embeddings stored in different systems
- Migration Simplification: Query legacy vector columns during database migrations
Example Query (Desired)
-- Join vector search results with business data
SELECT
e.product_id,
e.embedding,
p.product_name,
vector_cosine_similarity(e.embedding, CAST([0.1, 0.2, ...] AS VECTOR)) AS similarity
FROM elasticsearch.products.embeddings e
JOIN postgresql.public.products p ON e.product_id = p.id
WHERE similarity > 0.8
ORDER BY similarity DESC
LIMIT 10;
Feature Request / Improvement
Background
Modern data architectures increasingly rely on vector embeddings for AI/ML workflows, semantic search, and recommendation systems. Major databases now support vector types:
Problem Statement
Trino currently cannot:
dense_vector, Oracle 23cVECTOR)This forces users to maintain separate query paths for vector operations, breaking Trino's "query anything, anywhere" value proposition.
Proposed Solution
Add a native
VECTORorARRAY<DOUBLE>data type with:Phase 1: Basic Support
Phase 2: Basic Operations
vector_cosine_similarity(vector1, vector2)vector_euclidean_distance(vector1, vector2)vector_dot_product(vector1, vector2)Phase 3: Advanced (Future)
Use Cases
Example Query (Desired)