
feat: add opensearch and typesense support#11678

Open
jpaodev wants to merge 3 commits into danny-avila:dev from jpaodev:feat-add-search-providers

Conversation


@jpaodev jpaodev commented Feb 7, 2026

Pull Request Template

⚠️ Before Submitting a PR, Please Review:

  • Please ensure that you have thoroughly read and understood the Contributing Docs before submitting your Pull Request.

⚠️ Documentation Updates Notice:

  • Kindly note that documentation updates are managed in this repository: librechat.ai

Summary

This PR introduces a modular search provider abstraction that enables LibreChat to use OpenSearch and Typesense as alternative search backends alongside the existing MeiliSearch integration.

It adds support for OpenSearch and Typesense, both of which are alternatives to MeiliSearch.

Initial motivation: MeiliSearch cannot run in multi-replica mode, which is a problem for these reasons: 1) high-availability deployments are not possible; 2) the application cannot keep running during node maintenance, e.g. because a pod disruption budget that always keeps 1 pod available would make maintenance impossible; 3) problems with volume attachments can take the application down completely (love it). Single-node OpenSearch, on the other hand, seems to start perfectly fine with the adapted LibreChat values.yaml, a fully hardened security context, and the DHI image of OpenSearch (DHI = Docker Hardened Image; I try to use DHI images wherever possible).

I tested everything and it all looks good. However, it looks like I won't be able to use OpenSearch myself (yayyy 🙃): my security settings don't allow escalated privileges, and OpenSearch apparently requires them.

Note that I did not want to include too many providers and thought that starting with these 2 alternatives to Meili would be a solid option. This PR therefore partly addresses #6712.

  • OpenSearch is Apache-2.0 licensed and supports multi-replica deployments, making it well suited to production Kubernetes environments.
  • Typesense is a fast, typo-tolerant search engine with built-in Raft-based clustering, included as an optional, cleanly separated provider. It is GPL-3.0 licensed.
  • The existing MeiliSearch integration remains the default and is fully backward compatible.

What's included

Search Provider Abstraction Layer

  • SearchProvider interface defining a standard contract for all search backends (health check, index management, document CRUD, search with filters/sort/pagination)
  • SearchProviderFactory with auto-detection from environment variables and explicit SEARCH_PROVIDER override
  • Singleton caching with resetSearchProvider() for testing
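As a rough illustration of the detection and caching behaviour listed above, the sketch below resolves a provider name from an environment map. It is not the code from this PR: the names `detectProviderFromEnv`, `getProviderName`, `resetProviderName`, and `EnvMap` are hypothetical, and the priority order among auto-detected hosts (OpenSearch, then Typesense, then MeiliSearch) is an assumption. Only the environment variable names come from the PR.

```typescript
// Hypothetical sketch; only the env var names are taken from the PR.
type ProviderName = 'meilisearch' | 'opensearch' | 'typesense';
type EnvMap = Record<string, string | undefined>;

const KNOWN_PROVIDERS: readonly ProviderName[] = ['meilisearch', 'opensearch', 'typesense'];

function detectProviderFromEnv(env: EnvMap): ProviderName {
  // 1. An explicit SEARCH_PROVIDER wins; matching is case-insensitive.
  const explicit = env.SEARCH_PROVIDER?.trim().toLowerCase();
  if (explicit) {
    if (!KNOWN_PROVIDERS.includes(explicit as ProviderName)) {
      throw new Error(`Unknown SEARCH_PROVIDER: ${env.SEARCH_PROVIDER}`);
    }
    return explicit as ProviderName;
  }
  // 2. Otherwise infer from whichever host variable is set (order is an assumption).
  if (env.OPENSEARCH_HOST) return 'opensearch';
  if (env.TYPESENSE_HOST) return 'typesense';
  // 3. MeiliSearch stays the fallback default.
  return 'meilisearch';
}

// Singleton-style cache with a reset hook so tests can start from a clean state.
let cachedProvider: ProviderName | undefined;

function getProviderName(env: EnvMap): ProviderName {
  if (cachedProvider === undefined) {
    cachedProvider = detectProviderFromEnv(env);
  }
  return cachedProvider;
}

function resetProviderName(): void {
  cachedProvider = undefined;
}
```

Because the result is cached, a test that changes the environment would need to call the reset function before asking for the provider again.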

Provider Implementations

  • MeiliSearchProvider — wraps the existing MeiliSearch client behind the new interface
  • OpenSearchProvider — full implementation using OpenSearch REST API (opensearchproject/opensearch:3.4.0) with HTTP Basic auth, TLS support, MeiliSearch-style filter translation to OpenSearch DSL
  • TypesenseProvider — full implementation using Typesense REST API (typesense/typesense:30.1) with JSONL bulk import, collection schema management, and filter translation
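To make the filter translation and JSONL import mentioned above more concrete, here is a minimal sketch. It is an illustration under assumptions, not the PR's implementation: the helper names are hypothetical, only quoted equality clauses joined by `AND` are handled, the OpenSearch fields are assumed to be keyword-mapped so that `term` queries match, and Typesense values containing special characters are not escaped.

```typescript
// Hypothetical helpers; the PR's real translation code may differ.
interface Equality {
  field: string;
  value: string;
}

// Parse a MeiliSearch-style filter such as: user = "abc" AND conversationId = "x1"
function parseEqualityFilter(filter: string): Equality[] {
  return filter.split(/\s+AND\s+/).map((part) => {
    const match = part.trim().match(/^(\w+)\s*=\s*(["'])(.*)\2$/);
    if (!match) {
      throw new Error(`Unsupported filter clause: ${part}`);
    }
    return { field: match[1], value: match[3] };
  });
}

// OpenSearch query DSL: exact matches go into bool.filter as term queries.
function toOpenSearchQuery(filter: string) {
  return {
    bool: {
      filter: parseEqualityFilter(filter).map(({ field, value }) => ({
        term: { [field]: value },
      })),
    },
  };
}

// Typesense filter_by: `field:=value` for an exact match, clauses joined with &&.
function toTypesenseFilter(filter: string): string {
  return parseEqualityFilter(filter)
    .map(({ field, value }) => `${field}:=${value}`)
    .join(' && ');
}

// JSONL for bulk import: one JSON document per line.
function toJsonl(docs: object[]): string {
  return docs.map((doc) => JSON.stringify(doc)).join('\n');
}
```

A value that itself contains the text ` AND ` would be split incorrectly by this sketch, which is one reason a real translator needs a proper tokenizer.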

Generic Mongoose Plugin

  • mongoSearch.ts — a generic Mongoose plugin that works with any SearchProvider, replacing the need for provider-specific plugins
  • convo.ts and message.ts updated to use the generic plugin for OpenSearch/Typesense while preserving the original mongoMeili plugin for MeiliSearch

Helm Chart Updates

  • values.yaml — OpenSearch subchart (3.4.0) with DHI image switching support; Typesense section for external instance connection; secretKeyRef documentation for all sensitive credentials - NOTE:
  • configmap-env.yaml — auto-wiring for all three providers with every env var individually overridable via librechat.configEnv (supports remote/custom-hosted backends); secrets (OPENSEARCH_PASSWORD, TYPESENSE_API_KEY) kept out of ConfigMap
  • checks.yaml — mutual exclusivity validation ensuring only one search backend is enabled (moved from _checks.yaml which was never rendered by Helm)
  • NOTES.txt — search backend status in Helm install notes
  • External/remote deployment documentation for both OpenSearch and Typesense via configEnv overrides

Docker Compose

  • OpenSearch and Typesense services added as commented-out blocks in docker-compose.override.yml.example with clear switching instructions

Environment Variables

  • .env.example updated with all new env vars: SEARCH_PROVIDER, OPENSEARCH_HOST, OPENSEARCH_USERNAME, OPENSEARCH_PASSWORD, OPENSEARCH_INSECURE, TYPESENSE_HOST, TYPESENSE_API_KEY

Unit Tests — 90 new tests

  • searchProviderFactory.spec.ts (28 tests) — detection, factory instantiation, caching, isSearchEnabled, case-insensitivity, explicit overrides
  • openSearchProvider.spec.ts (32 tests) — constructor, health check, index/document CRUD, search with filter/sort/pagination translation, error handling
  • typesenseProvider.spec.ts (30 tests) — constructor, health check, collection CRUD, JSONL import, search with Typesense filter translation, error handling

Helm Template Validation Script — 13 scenarios

  • helm/test-search-backends.sh — automated validation covering default MeiliSearch, OpenSearch subchart, Typesense, external connections, configEnv overrides, security plugin detection (http/https), and mutual exclusivity checks - NOTE: Not added, can be added upon request, but I thought maybe it's not desired to have this in the helm folder

Backward Compatibility

  • No breaking changes. Existing MeiliSearch deployments should continue working as-is
  • When no SEARCH_PROVIDER is set, the factory auto-detects from environment variables with MeiliSearch as the fallback default.

Change Type

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Testing

Unit Tests

# Run all new search provider tests
cd packages/data-schemas
npx jest --testPathPatterns="searchProviderFactory.spec" --no-coverage
npx jest --testPathPatterns="openSearchProvider.spec" --no-coverage
npx jest --testPathPatterns="typesenseProvider.spec" --no-coverage

All 90 tests pass.

Helm Template Validation

# Run the automated Helm template test suite (13 scenarios) - Note: Not pushed
./helm/test-search-backends.sh

Local Integration Testing (Docker Compose)

docker-compose.override.yml for testing opensearch:

services:

# # USE LIBRECHAT CONFIG FILE
  api:
    volumes:
    - type: bind
      source: ./librechat.yaml
      target: /app/librechat.yaml

# # LOCAL BUILD
#   api:
    image: librechat
    build:
      context: .
      target: node


# DISABLE MEILISEARCH
  meilisearch:
    profiles:
      - donotstart

# USE OPENSEARCH INSTEAD OF MEILISEARCH
# 1. Disable MeiliSearch above (add it to the "donotstart" profile)
# 2. Set in .env: SEARCH=true, SEARCH_PROVIDER=opensearch, OPENSEARCH_HOST=http://opensearch-node1:9200, OPENSEARCH_INSECURE=true
#    (plain http, because DISABLE_SECURITY_PLUGIN=true below turns off TLS on the node)
  opensearch-node1:
    image: opensearchproject/opensearch:3.4.0
    container_name: opensearch-node1
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node1
      - discovery.type=single-node
      - bootstrap.memory_lock=true
      - DISABLE_SECURITY_PLUGIN=true
      - DISABLE_INSTALL_DEMO_CONFIG=true
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - opensearch-data:/usr/share/opensearch/data
    ports:
      - 9200:9200



volumes:
  opensearch-data:
  # typesense-data:

docker-compose.override.yml for testing typesense:

services:

# # USE LIBRECHAT CONFIG FILE
  api:
    volumes:
    - type: bind
      source: ./librechat.yaml
      target: /app/librechat.yaml

# # LOCAL BUILD
#   api:
    image: librechat
    build:
      context: .
      target: node


# DISABLE MEILISEARCH
  meilisearch:
    profiles:
      - donotstart


# # USE TYPESENSE INSTEAD OF MEILISEARCH
# # 1. Disable MeiliSearch above (add it to the "donotstart" profile)
# # 2. Set in .env: SEARCH=true, SEARCH_PROVIDER=typesense, TYPESENSE_HOST=http://typesense:8108, TYPESENSE_API_KEY=xyz
  typesense:
    image: typesense/typesense:30.1
    container_name: typesense
    environment:
      - TYPESENSE_DATA_DIR=/data
      - TYPESENSE_API_KEY=xyz
      - TYPESENSE_ENABLE_CORS=true
    volumes:
      - typesense-data:/data
    ports:
      - 8108:8108


volumes:
  # opensearch-data:
  typesense-data:

Set in .env:

SEARCH=true
SEARCH_PROVIDER=typesense # or opensearch
TYPESENSE_HOST=http://typesense:8108
TYPESENSE_API_KEY=xyz
OPENSEARCH_HOST=http://opensearch-node1:9200
OPENSEARCH_INSECURE=true

Run docker compose up
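Once the containers are up, it can help to confirm that the chosen backend is reachable before starting a sync. The sketch below only builds the health-check request for each backend; the helper name and types are hypothetical. It assumes the published ports from the compose files above, and it sends no auth header to OpenSearch because the security plugin is disabled in this test setup.

```typescript
// Hypothetical helper: describes the health-check request for each backend.
type Provider = 'meilisearch' | 'opensearch' | 'typesense';

interface HealthRequest {
  url: string;
  headers: Record<string, string>;
}

function healthCheckRequest(provider: Provider, host: string, apiKey?: string): HealthRequest {
  const base = host.replace(/\/+$/, ''); // drop trailing slashes
  if (provider === 'opensearch') {
    // Cluster health endpoint; no auth header in this security-disabled setup.
    return { url: `${base}/_cluster/health`, headers: {} };
  }
  if (provider === 'typesense') {
    // The API key header is sent only when a key is provided.
    const headers: Record<string, string> = apiKey ? { 'X-TYPESENSE-API-KEY': apiKey } : {};
    return { url: `${base}/health`, headers };
  }
  // MeiliSearch
  return { url: `${base}/health`, headers: {} };
}
```

From the host, `healthCheckRequest('typesense', 'http://localhost:8108', 'xyz')` yields the URL and header to pass to `fetch` or `curl`.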

Manual Testing — OpenSearch

[Screenshot: opensearch-testing-07-02-2025]

Manual Testing — Typesense

[Screenshots: 2026-02-07 at 20:38:51 and 2026-02-07 at 20:38:58]

Test Configuration:

  • Node.js v22.14.0 (local), v20.20.0 (Docker)
  • Jest (project default)
  • Helm CLI for template validation
  • macOS

Checklist

  • My code adheres to this project's style guidelines
  • I have performed a self-review of my own code
  • I have commented in any complex areas of my code
  • I have made pertinent documentation changes
  • My changes do not introduce new warnings
  • I have written tests demonstrating that my changes are effective or that my feature works
  • Local unit tests pass with my changes
  • Any changes dependent on mine have been merged and published in downstream modules.
  • A pull request for updating the documentation has been submitted.

Author

jpaodev commented Feb 7, 2026

[Screenshot: 2026-02-07 at 20:28:56]

Here is a sample screenshot from testing.

Author

jpaodev commented Feb 7, 2026

OpenSearch was also tested successfully in a Kubernetes deployment (single-pod OpenSearch) with these values:

opensearch:
  enabled: true
  # singleNode: false
  # replicas: 2
  singleNode: true
  replicas: 1
  # ---- Image Configuration (DHI) ----
  global:
    dockerRegistry: ""
  image:
    repository: "dhi.io/opensearch"
    tag: "3.4.0"
    pullPolicy: "IfNotPresent"

  imagePullSecrets:
    - name: registry-pull-secret

  # OpenSearch Java memory settings
  opensearchJavaOpts: "-Xmx512M -Xms512M"

  # ---- Persistence ----
  persistence:
    enabled: true
    size: 8Gi
    enableInitChown: false  # Disable chown init container (requires root)
    # storageClass: ""

  # ---- Security Plugin: DISABLED for initial testing ----
  # The OpenSearch security plugin requires TLS certificates.
  # For initial deployment, disable it. K8s-level securityContext and
  # readOnlyRootFilesystem remain fully enforced (Gatekeeper).
  # To enable later: remove DISABLE_SECURITY_PLUGIN, provide TLS certs via
  # secretMounts, set OPENSEARCH_INITIAL_ADMIN_PASSWORD, and update configEnv.
  extraEnvs:
    - name: DISABLE_SECURITY_PLUGIN
      value: "true"
    - name: DISABLE_INSTALL_DEMO_CONFIG
      value: "true"
  # # For production with security plugin enabled:
  # extraEnvs:
  #   - name: DISABLE_INSTALL_DEMO_CONFIG
  #     value: "true"
  #   - name: OPENSEARCH_INITIAL_ADMIN_PASSWORD
  #     valueFrom:
  #       secretKeyRef:
  #         name: opensearch-credentials
  #         key: admin-password
  # opensearchUsername: "admin"

  # ---- OpenSearch Config ----
  # Minimal config with security plugin disabled.
  # When enabling the security plugin later, add the plugins.security block
  # with TLS cert paths and admin DN configuration.
  config:
    opensearch.yml: |
      cluster.name: opensearch-cluster
      network.host: 0.0.0.0

  # ---- Resources ----
  resources:
    requests:
      cpu: "500m"
      memory: "512Mi"
    limits:
      cpu: "1000m"
      memory: "1Gi"

  # ---- Pod Security Context ----
  podSecurityContext:
    fsGroup: 1000
    fsGroupChangePolicy: OnRootMismatch
    supplementalGroups: [1000]

  # ---- Container Security Context ----
  # readOnlyRootFilesystem: true requires emptyDir mounts for writable paths.
  # /usr/share/opensearch/data is already mounted from PVC.
  # /usr/share/opensearch/config is mounted from config-emptydir by the chart.
  # We add emptyDir for /tmp and /usr/share/opensearch/logs below via extraVolumes.
  securityContext:
    runAsUser: 1000
    runAsGroup: 1000
    runAsNonRoot: true
    privileged: false
    allowPrivilegeEscalation: false
    readOnlyRootFilesystem: true
    capabilities:
      drop:
        - ALL
    seccompProfile:
      type: RuntimeDefault

  # ---- Extra Volumes for readOnlyRootFilesystem ----
  # OpenSearch writes opensearch.keystore.tmp to /usr/share/opensearch/config at startup.
  # We mount an emptyDir over the config dir and use an init container to seed it.
  extraVolumes:
    - name: tmp-dir
      emptyDir: {}
    - name: opensearch-logs
      emptyDir: {}
    - name: opensearch-config-writable
      emptyDir: {}

  extraVolumeMounts:
    - name: tmp-dir
      mountPath: /tmp
    - name: opensearch-logs
      mountPath: /usr/share/opensearch/logs
    - name: opensearch-config-writable
      mountPath: /usr/share/opensearch/config

  # Init container to seed the writable config dir from the read-only image
  extraInitContainers:
    - name: init-config-dir
      image: "dhi.io/opensearch:3.4.0"
      imagePullPolicy: IfNotPresent
      command:
        - sh
        - -c
        - |
          cp -a /usr/share/opensearch/config/* /tmp/opensearch-config/ 2>/dev/null || true
          cp -a /usr/share/opensearch/config/.[!.]* /tmp/opensearch-config/ 2>/dev/null || true
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
        runAsNonRoot: true
        privileged: false
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
        seccompProfile:
          type: RuntimeDefault
      resources:
        requests:
          cpu: "50m"
          memory: "64Mi"
        limits:
          cpu: "100m"
          memory: "128Mi"
      volumeMounts:
        - name: opensearch-config-writable
          mountPath: /tmp/opensearch-config

  # ---- RBAC ----
  rbac:
    create: false
    automountServiceAccountToken: false

  # ---- Network Policy ----
  networkPolicy:
    create: true
    http:
      enabled: true

  # ---- Probes ----
  startupProbe:
    tcpSocket:
      port: 9200
    initialDelaySeconds: 5
    periodSeconds: 10
    timeoutSeconds: 3
    failureThreshold: 30

  readinessProbe:
    tcpSocket:
      port: 9200
    periodSeconds: 5
    timeoutSeconds: 3
    failureThreshold: 3

  # ---- Sysctls ----
  # vm.max_map_count=262144 is required for multi-node OpenSearch.
  # GKE Autopilot nodes typically have this set by default since late 2023.
  # If not, sysctl.enabled adds it as a pod-level sysctl (no privileged container).
  # Try with sysctl.enabled: false first. If OpenSearch fails with
  # "max virtual memory areas vm.max_map_count too low", set sysctl.enabled: true.
  # sysctlInit (privileged init container) is NOT compatible with GKE Autopilot Warden.
  sysctl:
    enabled: false
  sysctlInit:
    enabled: false

Author

jpaodev commented Feb 8, 2026

Typesense tested on k8s in cluster mode:

LibreChat values.yaml:

global:
  librechat:

    env:
      - name: TYPESENSE_API_KEY
        valueFrom:
          secretKeyRef:
            name: typesense-credentials
            key: typesense-api-key

librechat:
  configEnv:
    SEARCH: "true"
    SEARCH_PROVIDER: typesense
    TYPESENSE_HOST: "http://librechat-test-cluster-svc.librechat.svc.cluster.local:8108"

typesense:
  enabled: true

Tested with: https://github.com/akyriako/typesense-operator (3 pods cluster mode)

Sync messages rename:

2026-02-08T04:57:09.759Z info: [indexSync] Starting index synchronization check...
2026-02-08T04:57:10.018Z debug: [SEARCH_SYNC:search-index-sync-typesense] Creating initial flow state
2026-02-08T04:57:10.132Z info: [indexSync] Requesting message sync progress...

I also noticed during testing that while the results show up fine in the middle of the screen, the chats no longer show on the left side. I remember this differently: as far as I know, the chats also used to show up "filtered" on the left side, but maybe I misremember.

Edit:
Angrily asking Gemini about the opensearch vm.max_map_count topic helped it seems:

If you are on GKE 1.32 or later, you use a ComputeClass. This is the modern way to tell Autopilot "I need this specific node setting for these specific workloads."
Reference: https://docs.cloud.google.com/kubernetes-engine/docs/reference/crds/computeclass

With this, other people and I should be able to test (and run) multi-node OpenSearch safely as well.

Contributor

ablizorukov commented Feb 16, 2026

Hi @jpaodev, great work here!

- Reset flags command

Just wanted to point out that there are a few more places where MeiliSearch is still used as the single search engine.
For example, there is a command reset-meili-sync, which refers to the file config/reset-meili-sync.js. It probably makes sense to modify it as well.

- Separate Mongoose model from Search Index model (probably worth considering in the next PRs)

Another thing: it would be really good to decouple the Mongoose document model from the search index.

There are a few issues with the Plugin implementation:

  • Right now the Mongoose model contains the _meiliIndex flag, and the code that fills the Meili index relies on it.
  • Using search as a Plugin triggers search index initialization twice, once per model (message and convo).
  • With the separation it would be easier to fine-tune index settings; there are more properties to use and tweak.

P.S. I started implementing similar search provider logic myself and planned to add Elasticsearch.

Contributor

Copilot AI left a comment


Pull request overview

This pull request introduces a comprehensive search provider abstraction layer that enables LibreChat to use OpenSearch and Typesense as alternative search backends alongside the existing MeiliSearch integration. The implementation maintains full backward compatibility with existing MeiliSearch deployments while providing a clean, extensible architecture for adding new search providers.

Changes:

  • Added a provider-agnostic SearchProvider interface with factory pattern for instantiating search backends based on environment configuration
  • Implemented OpenSearchProvider and TypesenseProvider with full REST API integration, including filter/sort translation and bulk operations
  • Created a generic mongoSearch Mongoose plugin that works with any SearchProvider, replacing provider-specific plugins for non-MeiliSearch backends
  • Updated Helm charts with OpenSearch subchart (3.4.0), Typesense configuration, mutual exclusivity validation, and security-aware configuration
  • Enhanced Docker Compose and environment variable examples with clear switching instructions for all three providers
  • Added 90 comprehensive unit tests covering factory logic, provider instantiation, and CRUD operations

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 14 comments.

| File | Description |
| --- | --- |
| packages/data-schemas/src/models/plugins/search/searchProvider.ts | Core SearchProvider interface defining the contract for all search backends |
| packages/data-schemas/src/models/plugins/search/searchProviderFactory.ts | Factory with auto-detection from environment variables and singleton caching |
| packages/data-schemas/src/models/plugins/search/meiliSearchProvider.ts | Wrapper for existing MeiliSearch client behind the new interface |
| packages/data-schemas/src/models/plugins/search/openSearchProvider.ts | Full OpenSearch REST API implementation with HTTP Basic auth and TLS support |
| packages/data-schemas/src/models/plugins/search/typesenseProvider.ts | Full Typesense REST API implementation with JSONL bulk import and collection schema management |
| packages/data-schemas/src/models/plugins/mongoSearch.ts | Generic Mongoose plugin working with any SearchProvider |
| packages/data-schemas/src/models/message.ts | Updated to use generic plugin for OpenSearch/Typesense while preserving mongoMeili for MeiliSearch |
| packages/data-schemas/src/models/convo.ts | Updated to use generic plugin for OpenSearch/Typesense while preserving mongoMeili for MeiliSearch |
| helm/librechat/values.yaml | OpenSearch subchart with DHI image support, Typesense external connection config |
| helm/librechat/templates/configmap-env.yaml | Auto-wiring for all providers with individual env var overrides |
| helm/librechat/templates/checks.yaml | Mutual exclusivity validation for search backends |
| docker-compose.yml | OpenSearch and Typesense service definitions as commented-out alternatives |
| .env.example | Comprehensive documentation of all search provider environment variables |
| api/server/routes/search.js | Health check endpoint supporting all providers |
| api/db/indexSync.js | Sync operations extended to support all providers with backward compatibility |


Comment on lines +50 to +97
# ---- OpenSearch (alternative to MeiliSearch) ----
# To use OpenSearch instead of MeiliSearch:
# 1. Comment out the 'meilisearch' service above
# 2. Uncomment the 'opensearch' service below
# 3. In .env, set:
#      SEARCH_PROVIDER=opensearch
#      OPENSEARCH_HOST=http://opensearch:9200
#      OPENSEARCH_INSECURE=true
# 4. In the 'api' service environment, replace MEILI_HOST with:
#      OPENSEARCH_HOST=http://opensearch:9200
# 5. Comment out MEILI_HOST in the 'api' service environment
# opensearch:
#   container_name: chat-opensearch
#   image: opensearchproject/opensearch:3.4.0
#   restart: always
#   environment:
#     - discovery.type=single-node
#     - DISABLE_SECURITY_PLUGIN=true
#     - DISABLE_INSTALL_DEMO_CONFIG=true
#     - OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m
#   volumes:
#     - opensearch_data:/usr/share/opensearch/data
#   ports:
#     - "9200:9200"

# ---- Typesense (alternative to MeiliSearch) ----
# To use Typesense instead of MeiliSearch:
# 1. Comment out the 'meilisearch' service above
# 2. Uncomment the 'typesense' service below
# 3. In .env, set:
#      SEARCH_PROVIDER=typesense
#      TYPESENSE_HOST=http://typesense:8108
#      TYPESENSE_API_KEY=xyz (match the --api-key below)
# 4. In the 'api' service environment, replace MEILI_HOST with:
#      TYPESENSE_HOST=http://typesense:8108
# 5. Comment out MEILI_HOST in the 'api' service environment
# typesense:
#   container_name: chat-typesense
#   image: typesense/typesense:30.1
#   restart: always
#   environment:
#     - TYPESENSE_DATA_DIR=/data
#     - TYPESENSE_API_KEY=${TYPESENSE_API_KEY:-xyz}
#   volumes:
#     - typesense_data:/data
#   ports:
#     - "8108:8108"
#   command: "--data-dir /data --api-key=${TYPESENSE_API_KEY:-xyz} --enable-cors"
Copilot AI Feb 16, 2026

The comments indicate that environment variables should be modified in the 'api' service, but those instructions appear inconsistent with the actual docker-compose structure. The docker-compose.yml file shows that MEILI_HOST is already in the api service environment (line 20), but the comments suggest adding OPENSEARCH_HOST or TYPESENSE_HOST there.

However, for docker-compose to work with the new providers, users would need to:

  1. Add the new env vars to the api service's environment section (not just .env file)
  2. Comment out or remove MEILI_HOST from the api service environment

Consider updating the comments to be clearer about these steps, or better yet, provide working examples in the docker-compose.override.yml.example file that users can uncomment.

Comment on lines +18 to +22
const providerType = detectSearchProvider ? detectSearchProvider() : null;

if (providerType && providerType !== 'meilisearch') {
  // Use generic search provider (OpenSearch, etc.)
  const provider = getSearchProvider ? getSearchProvider() : null;
Copilot AI Feb 16, 2026

The defensive checks detectSearchProvider ? detectSearchProvider() : null and getSearchProvider ? getSearchProvider() : null suggest uncertainty about whether these functions are available. However, these are imported at the top of the file from @librechat/data-schemas.

If the imports could fail or be undefined, this should be handled at the module level with proper error handling. If the imports are guaranteed to work (which they should be), these ternary checks are unnecessary and add confusion. Consider either:

  1. Removing the ternary checks if the imports are reliable
  2. Adding proper error handling at the import level if there's a legitimate concern about module availability
  3. Adding a comment explaining why these checks are needed
Suggested change
- const providerType = detectSearchProvider ? detectSearchProvider() : null;
- if (providerType && providerType !== 'meilisearch') {
-   // Use generic search provider (OpenSearch, etc.)
-   const provider = getSearchProvider ? getSearchProvider() : null;
+ const providerType = detectSearchProvider();
+ if (providerType && providerType !== 'meilisearch') {
+   // Use generic search provider (OpenSearch, etc.)
+   const provider = getSearchProvider();

Comment on lines +506 to +507
const [field, order] = s.split(':');
return `${field}:${order || 'asc'}`;
Copilot AI Feb 16, 2026

The sort parameter parsing at line 506 using split(':') could fail if a field name contains a colon character. While this is unlikely in practice, it's a potential edge case. Additionally, if the sort string doesn't contain a colon, the split will return an array with one element, making order undefined, which then defaults to 'asc'. This behavior is intentional based on the code, but field names with colons would break.

Consider using a more robust parsing approach, such as:

  • Using a regex that matches the last colon: /^(.+):(.+)$/
  • Or documenting that field names cannot contain colons

This same issue exists in OpenSearchProvider at line 372.

Suggested change
- const [field, order] = s.split(':');
- return `${field}:${order || 'asc'}`;
+ const match = s.match(/^(.+?):(asc|desc)$/i);
+ const field = match ? match[1] : s;
+ const order = match ? match[2].toLowerCase() : 'asc';
+ return `${field}:${order}`;
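The difference between the two parsing approaches is easiest to see side by side. The sketch below wraps each variant from the suggestion above in a standalone function; the function names are hypothetical and the inputs are made-up examples.

```typescript
// Original approach: split on every colon and keep the first two parts.
function parseSortBySplit(s: string): string {
  const [field, order] = s.split(':');
  return `${field}:${order || 'asc'}`;
}

// Suggested approach: only a trailing :asc or :desc counts as the sort order.
function parseSortByRegex(s: string): string {
  const match = s.match(/^(.+?):(asc|desc)$/i);
  const field = match ? match[1] : s;
  const order = match ? match[2].toLowerCase() : 'asc';
  return `${field}:${order}`;
}

// Same result for the common cases:
//   parseSortBySplit('createdAt:desc') -> 'createdAt:desc'
//   parseSortByRegex('createdAt:desc') -> 'createdAt:desc'
//   parseSortBySplit('createdAt')      -> 'createdAt:asc'
//   parseSortByRegex('createdAt')      -> 'createdAt:asc'
// Different result when the field name contains a colon:
//   parseSortBySplit('meta:score:desc') -> 'meta:score'       (order is lost)
//   parseSortByRegex('meta:score:desc') -> 'meta:score:desc'
```

The regex variant also normalizes the order to lower case, so `createdAt:DESC` becomes `createdAt:desc`, whereas the split variant passes the upper-case order through unchanged.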

break;
}

const idsToDelete = searchResult.hits.filter((hit) => !hit.user).map((hit) => hit.id);
Copilot AI Feb 16, 2026

The function deleteDocumentsWithoutUserFieldGeneric at line 277 assumes that documents have an id field (hit.id). However, this may not be consistent across all providers:

  • OpenSearch stores documents with _id which is mapped to the primary key field in the _source
  • Typesense uses id as the primary key
  • MeiliSearch uses the configured primary key (e.g., messageId, conversationId)

This could cause the deletion to fail or delete the wrong documents if hit.id is undefined or has a different meaning than expected. Consider using the primaryKey parameter to properly identify documents, similar to how it's done in the MeiliSearch version (deleteDocumentsWithoutUserField at line 44).
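One way to read this suggestion is sketched below: the IDs to delete are looked up through a `primaryKey` argument instead of a hard-coded `hit.id`. The function name `idsWithoutUser` and the `Hit` type are hypothetical, and how each provider exposes the primary key in its hits is an assumption to be checked against the real provider code.

```typescript
type Hit = Record<string, unknown>;

// Collect the primary-key values of hits that have no `user` field.
// Hits without a usable primary-key value are skipped rather than deleted blindly.
function idsWithoutUser(hits: Hit[], primaryKey: string): string[] {
  return hits
    .filter((hit) => !hit['user'])
    .map((hit) => hit[primaryKey])
    .filter((id): id is string => typeof id === 'string' && id.length > 0);
}

// Example: only the second hit lacks a user and carries a messageId.
const exampleHits: Hit[] = [
  { messageId: 'm1', user: 'u1' },
  { messageId: 'm2' },
  { id: 'x' },
];
const toDelete = idsWithoutUser(exampleHits, 'messageId'); // ['m2']
```

Skipping hits that lack the primary key avoids passing `undefined` into a delete call, at the cost of leaving such documents in the index until they can be identified another way.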

Comment on lines +81 to +83
} catch {
  // undici not available; fetch will use default TLS settings
  logger.debug('[OpenSearchProvider] undici not available for insecure TLS');
Copilot AI Feb 16, 2026

The undici dependency is used for insecure TLS support but it's not declared in package.json. This creates a hidden runtime dependency that may fail silently if undici is not available. Consider either:

  1. Adding undici as a peer dependency or optional dependency in package.json
  2. Documenting in README/installation guide that undici is required for OPENSEARCH_INSECURE=true functionality
  3. Providing a clearer error message when insecure mode is requested but undici is unavailable

Currently, the catch block only logs a debug message, which means users may not realize why insecure mode isn't working.

Suggested change
- } catch {
-   // undici not available; fetch will use default TLS settings
-   logger.debug('[OpenSearchProvider] undici not available for insecure TLS');
+ } catch (error) {
+   // undici not available; fetch will use default TLS settings
+   logger.warn(
+     '[OpenSearchProvider] Insecure TLS requested but "undici" is not available. ' +
+       'Falling back to default TLS verification. To enable insecure TLS, install the "undici" package.',
+     { error },
+   );

Comment on lines +384 to +386
const hits: SearchHit[] = (hitsData?.hits || []).map((hit) => ({
  ...hit._source,
}));
Copilot AI Feb 16, 2026

The search() method returns hits without including the document ID from OpenSearch's _id field (line 384-386), while getDocuments() does include it as _opensearch_id (line 340). This inconsistency means that search results won't have an ID field that can be used for deletion operations.

This is problematic because:

  1. The deleteDocumentsWithoutUserFieldGeneric function in indexSync.js expects hit.id to be available
  2. Users of the search API may expect consistent behavior between search() and getDocuments()

Consider adding the primary key field (or _opensearch_id) to search results, similar to what's done in getDocuments(). The hits should include the document's primary key value so they can be used for deletion or updates.

Suggested change
- const hits: SearchHit[] = (hitsData?.hits || []).map((hit) => ({
-   ...hit._source,
- }));
+ const hits: SearchHit[] = (hitsData?.hits || []).map((hit) => {
+   const source = hit._source || ({} as SearchHit);
+   const result: SearchHit = {
+     ...source,
+     _opensearch_id: hit._id,
+   } as SearchHit;
+   if (!(source as any).id) {
+     (result as any).id = hit._id;
+   }
+   return result;
+ });

Comment on lines +544 to +546
const match = part.trim().match(/^(\w+)\s*=\s*["'](.+?)["']$/);
if (match) {
  const [, field, value] = match;
Copilot AI Feb 16, 2026

The regex pattern /^(\w+)\s*=\s*["'](.+?)["']$/ uses a non-greedy match (.+?) which will stop at the first closing quote. This means filters with values containing quotes (e.g., user = "John's Team") will fail to parse correctly or may produce incorrect results.

Consider escaping quotes in values or using a more robust parsing approach that handles edge cases like:

  • Values with embedded quotes: user = "John's Team"
  • Values with special characters: name = "test&value"
  • Empty values: field = ""

This is a similar issue in both OpenSearch (line 461) and Typesense (line 544) providers.

Suggested change
- const match = part.trim().match(/^(\w+)\s*=\s*["'](.+?)["']$/);
- if (match) {
-   const [, field, value] = match;
+ const match = part.trim().match(/^(\w+)\s*=\s*(["'])(.*)\2$/);
+ if (match) {
+   const [, field, , value] = match;
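A quick way to compare the two patterns is to run both against the edge cases listed in the comment. The sketch below does that with two hypothetical wrapper functions. One observation worth verifying against the real code: because the original pattern is anchored with `$`, its lazy group appears to extend up to the final quote, so an embedded apostrophe alone does not break it. The visible differences are empty values and mismatched quotes.

```typescript
const originalPattern = /^(\w+)\s*=\s*["'](.+?)["']$/;
const suggestedPattern = /^(\w+)\s*=\s*(["'])(.*)\2$/;

// Returns the captured value, or null when the clause does not match.
function valueWithOriginal(clause: string): string | null {
  const m = clause.trim().match(originalPattern);
  return m ? m[2] : null;
}

function valueWithSuggested(clause: string): string | null {
  const m = clause.trim().match(suggestedPattern);
  return m ? m[3] : null;
}

// Embedded apostrophe: both patterns capture the full value.
//   valueWithOriginal(`user = "John's Team"`)  -> "John's Team"
//   valueWithSuggested(`user = "John's Team"`) -> "John's Team"
// Empty value: only the suggested pattern matches.
//   valueWithOriginal('field = ""')  -> null
//   valueWithSuggested('field = ""') -> ''
// Mismatched quotes: only the original pattern accepts them.
//   valueWithOriginal(`user = "abc'`)  -> 'abc'
//   valueWithSuggested(`user = "abc'`) -> null
```

The backreference `\2` in the suggested pattern is what forces the closing quote to be the same character as the opening one.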

Comment on lines +32 to +34
{{- if not (dig "configEnv" "OPENSEARCH_INSECURE" "" .Values.librechat) }}
OPENSEARCH_INSECURE: "true"
{{- end }}

Copilot AI Feb 16, 2026

When the security plugin is enabled (DISABLE_SECURITY_PLUGIN is not "true"), the protocol in OPENSEARCH_HOST is correctly set to "https", but OPENSEARCH_INSECURE is still hardcoded to "true" (line 33). This seems inconsistent: if security is not disabled, shouldn't OPENSEARCH_INSECURE default to "false"?

The current logic:

  • If DISABLE_SECURITY_PLUGIN=true → use http protocol
  • If DISABLE_SECURITY_PLUGIN≠true → use https protocol
  • But OPENSEARCH_INSECURE defaults to "true" in both cases, unless it is set explicitly in configEnv

This means that even with security enabled (https), the client skips certificate validation by default, which is probably not the intended behavior for production. Consider making the OPENSEARCH_INSECURE default conditional on the security plugin state.

Suggested change
- {{- if not (dig "configEnv" "OPENSEARCH_INSECURE" "" .Values.librechat) }}
- OPENSEARCH_INSECURE: "true"
- {{- end }}
+ {{- if not (dig "configEnv" "OPENSEARCH_INSECURE" "" .Values.librechat) }}
+ {{- $securityDisabled := false }}
+ {{- range (.Values.opensearch.extraEnvs | default list) }}
+ {{- if and (eq .name "DISABLE_SECURITY_PLUGIN") (eq .value "true") }}
+ {{- $securityDisabled = true }}
+ {{- end }}
+ {{- end }}
+ {{- if $securityDisabled }}
+ OPENSEARCH_INSECURE: "true"
+ {{- else }}
+ OPENSEARCH_INSECURE: "false"
+ {{- end }}
+ {{- end }}

}
// Typesense supports batch delete via filter_by
// For ID-based deletion, we use individual deletes or filter
const filterBy = `id: [${documentIds.join(',')}]`;

Copilot AI Feb 16, 2026

The filter syntax for batch deletion uses id: [${documentIds.join(',')}] but this doesn't properly escape or quote the individual IDs. According to Typesense documentation, the filter_by syntax for an array of values should be id: [value1, value2] where string values need to be properly escaped if they contain special characters.

If document IDs contain commas, spaces, or other special characters, this filter will fail or produce incorrect results. Consider properly escaping or quoting each ID in the array, e.g., id: [${documentIds.map(id => JSON.stringify(id)).join(',')}].

Suggested change
- const filterBy = `id: [${documentIds.join(',')}]`;
+ const filterBy = `id: [${documentIds.map((id) => JSON.stringify(id)).join(',')}]`;
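
As an alternative to JSON.stringify, the Typesense documentation describes wrapping filter values in backticks when they contain special characters such as commas. The following is a minimal sketch of that approach, assuming a hypothetical buildIdFilter helper (not part of the PR); the quoting rule should be verified against the Typesense version in use:

```typescript
// Hypothetical helper, not taken from the PR: builds a filter_by clause for a list of IDs.
function buildIdFilter(documentIds: string[]): string {
  const quoted = documentIds.map((id) => {
    if (id.includes('`')) {
      // Assumption: a backtick inside a backtick-quoted value cannot be escaped,
      // so such IDs are rejected rather than sent in a malformed filter.
      throw new Error(`Unsupported character in document id: ${id}`);
    }
    return '`' + id + '`';
  });
  return `id: [${quoted.join(',')}]`;
}

// An ID containing a comma stays a single value inside the array.
console.log(buildIdFilter(['a,b', 'c']));
```

Rejecting IDs that contain a backtick is a deliberate choice in this sketch: failing loudly is safer than deleting the wrong documents through a filter that parses differently than intended.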

Comment on lines +55 to +96
private async request(
  method: string,
  path: string,
  body?: unknown,
): Promise<{ status: number; data: Record<string, unknown> }> {
  const url = `${this.node}${path}`;
  const headers: Record<string, string> = {
    'Content-Type': 'application/json',
    Authorization: this.authHeader,
  };

  const fetchOptions: RequestInit & { dispatcher?: unknown } = {
    method,
    headers,
    body: body !== undefined ? JSON.stringify(body) : undefined,
  };

  // Support insecure TLS via undici dispatcher when available
  if (this.insecure) {
    try {
      // Use require() so Rollup externalizes this instead of code-splitting
      // eslint-disable-next-line @typescript-eslint/no-var-requires
      const { Agent } = require('undici') as { Agent: new (opts: Record<string, unknown>) => unknown };
      fetchOptions.dispatcher = new Agent({
        connect: { rejectUnauthorized: false },
      });
    } catch {
      // undici not available; fetch will use default TLS settings
      logger.debug('[OpenSearchProvider] undici not available for insecure TLS');
    }
  }

  const response = await fetch(url, fetchOptions);
  const text = await response.text();
  let data: Record<string, unknown> = {};
  try {
    data = JSON.parse(text) as Record<string, unknown>;
  } catch {
    data = { raw: text };
  }
  return { status: response.status, data };
}

Copilot AI Feb 16, 2026

The OpenSearchProvider doesn't implement connection timeouts for fetch requests, unlike TypesenseProvider which has configurable connectionTimeoutMs. This means OpenSearch requests could hang indefinitely if the server doesn't respond, potentially blocking operations.

Consider adding timeout support similar to TypesenseProvider:

  1. Add a connectionTimeoutMs option to OpenSearchProviderOptions
  2. Implement AbortController with setTimeout in the request() and bulkRequest() methods
  3. Default to a reasonable timeout (e.g., 30000ms for regular requests, 60000ms for bulk operations)
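
The steps above can be sketched with a small wrapper around AbortController. This is a minimal sketch, assuming a hypothetical withTimeout helper (not part of the PR); the connectionTimeoutMs name mirrors the option suggested above and is likewise an assumption for OpenSearchProvider:

```typescript
// Hypothetical helper, not taken from the PR: runs a signal-aware call and
// aborts it if it has not settled within timeoutMs.
async function withTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await run(controller.signal);
  } finally {
    // Always clear the timer so a finished request does not keep the process alive.
    clearTimeout(timer);
  }
}

// Inside request(), the fetch call might then look like this (sketch only):
// const text = await withTimeout(async (signal) => {
//   const response = await fetch(url, { ...fetchOptions, signal });
//   return response.text();
// }, this.connectionTimeoutMs ?? 30000);
```

Reading the body inside the wrapped callback keeps the timeout active for the whole response. If only the fetch call is wrapped, the timer is cleared once headers arrive and a stalled body could still hang.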

@Lazyshot

I just wanted to add that I appreciate this work as we have been having issues maintaining our meilisearch instance at our scale, so having an alternative provider would be much appreciated.

@jpaodev
Author

jpaodev commented Feb 18, 2026

I just wanted to add that I appreciate this work as we have been having issues maintaining our meilisearch instance at our scale, so having an alternative provider would be much appreciated.

Hey there,
I'm absolutely looking forward to finishing up the PR based on the recommendations. In general everything was looking good already, but it seems I missed a couple of spots; I have yet to evaluate how important those are.

Overall I can recommend looking into https://typesense.org/docs/guide/install-typesense.html#kubernetes - I think that's most likely the "easiest" option to get a multi-replica cluster without fuss. Beware though that Typesense seems to be a bit "special" as well, but it should be viable!

If anyone has further wishes or would like to narrow down changes, let me know!

4 participants