feat: add opensearch and typesense support #11678
jpaodev wants to merge 3 commits into danny-avila:dev
OpenSearch was also tested successfully in a Kubernetes deployment (single-pod OpenSearch) with these values:
opensearch:
  enabled: true
  # singleNode: false
  # replicas: 2
  singleNode: true
  replicas: 1
  # ---- Image Configuration (DHI) ----
  global:
    dockerRegistry: ""
  image:
    repository: "dhi.io/opensearch"
    tag: "3.4.0"
    pullPolicy: "IfNotPresent"
  imagePullSecrets:
    - name: registry-pull-secret
  # OpenSearch Java memory settings
  opensearchJavaOpts: "-Xmx512M -Xms512M"
  # ---- Persistence ----
  persistence:
    enabled: true
    size: 8Gi
    enableInitChown: false # Disable chown init container (requires root)
    # storageClass: ""
  # ---- Security Plugin: DISABLED for initial testing ----
  # The OpenSearch security plugin requires TLS certificates.
  # For initial deployment, disable it. K8s-level securityContext and
  # readOnlyRootFilesystem remain fully enforced (Gatekeeper).
  # To enable later: remove DISABLE_SECURITY_PLUGIN, provide TLS certs via
  # secretMounts, set OPENSEARCH_INITIAL_ADMIN_PASSWORD, and update configEnv.
  extraEnvs:
    - name: DISABLE_SECURITY_PLUGIN
      value: "true"
    - name: DISABLE_INSTALL_DEMO_CONFIG
      value: "true"
  # # For production with security plugin enabled:
  # extraEnvs:
  #   - name: DISABLE_INSTALL_DEMO_CONFIG
  #     value: "true"
  #   - name: OPENSEARCH_INITIAL_ADMIN_PASSWORD
  #     valueFrom:
  #       secretKeyRef:
  #         name: opensearch-credentials
  #         key: admin-password
  # opensearchUsername: "admin"
  # ---- OpenSearch Config ----
  # Minimal config with security plugin disabled.
  # When enabling the security plugin later, add the plugins.security block
  # with TLS cert paths and admin DN configuration.
  config:
    opensearch.yml: |
      cluster.name: opensearch-cluster
      network.host: 0.0.0.0
  # ---- Resources ----
  resources:
    requests:
      cpu: "500m"
      memory: "512Mi"
    limits:
      cpu: "1000m"
      memory: "1Gi"
  # ---- Pod Security Context ----
  podSecurityContext:
    fsGroup: 1000
    fsGroupChangePolicy: OnRootMismatch
    supplementalGroups: [1000]
  # ---- Container Security Context ----
  # readOnlyRootFilesystem: true requires emptyDir mounts for writable paths.
  # /usr/share/opensearch/data is already mounted from PVC.
  # /usr/share/opensearch/config is mounted from config-emptydir by the chart.
  # We add emptyDir for /tmp and /usr/share/opensearch/logs below via extraVolumes.
  securityContext:
    runAsUser: 1000
    runAsGroup: 1000
    runAsNonRoot: true
    privileged: false
    allowPrivilegeEscalation: false
    readOnlyRootFilesystem: true
    capabilities:
      drop:
        - ALL
    seccompProfile:
      type: RuntimeDefault
  # ---- Extra Volumes for readOnlyRootFilesystem ----
  # OpenSearch writes opensearch.keystore.tmp to /usr/share/opensearch/config at startup.
  # We mount an emptyDir over the config dir and use an init container to seed it.
  extraVolumes:
    - name: tmp-dir
      emptyDir: {}
    - name: opensearch-logs
      emptyDir: {}
    - name: opensearch-config-writable
      emptyDir: {}
  extraVolumeMounts:
    - name: tmp-dir
      mountPath: /tmp
    - name: opensearch-logs
      mountPath: /usr/share/opensearch/logs
    - name: opensearch-config-writable
      mountPath: /usr/share/opensearch/config
  # Init container to seed the writable config dir from the read-only image
  extraInitContainers:
    - name: init-config-dir
      image: "dhi.io/opensearch:3.4.0"
      imagePullPolicy: IfNotPresent
      command:
        - sh
        - -c
        - |
          cp -a /usr/share/opensearch/config/* /tmp/opensearch-config/ 2>/dev/null || true
          cp -a /usr/share/opensearch/config/.[!.]* /tmp/opensearch-config/ 2>/dev/null || true
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
        runAsNonRoot: true
        privileged: false
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
        seccompProfile:
          type: RuntimeDefault
      resources:
        requests:
          cpu: "50m"
          memory: "64Mi"
        limits:
          cpu: "100m"
          memory: "128Mi"
      volumeMounts:
        - name: opensearch-config-writable
          mountPath: /tmp/opensearch-config
  # ---- RBAC ----
  rbac:
    create: false
    automountServiceAccountToken: false
  # ---- Network Policy ----
  networkPolicy:
    create: true
    http:
      enabled: true
  # ---- Probes ----
  startupProbe:
    tcpSocket:
      port: 9200
    initialDelaySeconds: 5
    periodSeconds: 10
    timeoutSeconds: 3
    failureThreshold: 30
  readinessProbe:
    tcpSocket:
      port: 9200
    periodSeconds: 5
    timeoutSeconds: 3
    failureThreshold: 3
  # ---- Sysctls ----
  # vm.max_map_count=262144 is required for multi-node OpenSearch.
  # GKE Autopilot nodes typically have this set by default since late 2023.
  # If not, sysctl.enabled adds it as a pod-level sysctl (no privileged container).
  # Try with sysctl.enabled: false first. If OpenSearch fails with
  # "max virtual memory areas vm.max_map_count too low", set sysctl.enabled: true.
  # sysctlInit (privileged init container) is NOT compatible with GKE Autopilot Warden.
  sysctl:
    enabled: false
  sysctlInit:
    enabled: false
Typesense was tested on k8s in cluster mode. LibreChat values:
global:
  librechat:
    env:
      - name: TYPESENSE_API_KEY
        valueFrom:
          secretKeyRef:
            name: typesense-credentials
            key: typesense-api-key
librechat:
  configEnv:
    SEARCH: "true"
    SEARCH_PROVIDER: typesense
    TYPESENSE_HOST: "http://librechat-test-cluster-svc.librechat.svc.cluster.local:8108"
typesense:
  enabled: true
Tested with: https://github.com/akyriako/typesense-operator (3-pod cluster mode)
Sync messages rename:
During testing I also noticed that, while the results show up perfectly in the middle of the screen, the chats no longer show on the left side. I remember this differently: as far as I know, the chats on the left side were also shown "filtered", but maybe I misremember.
Edit: With this, other people and I should be able to test (and run) multi-node OpenSearch as well (safely).
Hi @jpaodev, great work here!
- Reset flags command
Just wanted to point out that there are a few more places where MeiliSearch is still used as the single search engine.
- Separate Mongoose model from Search Index model (probably worth considering in the next PRs)
Another thing: it would be really good to decouple the Mongoose (Mongo) document model from the search index. There are a few issues with the Plugin implementation:
p.s. I started implementing similar search provider logic myself and planned to add Elasticsearch.
Pull request overview
This pull request introduces a comprehensive search provider abstraction layer that enables LibreChat to use OpenSearch and Typesense as alternative search backends alongside the existing MeiliSearch integration. The implementation maintains full backward compatibility with existing MeiliSearch deployments while providing a clean, extensible architecture for adding new search providers.
Changes:
- Added a provider-agnostic SearchProvider interface with factory pattern for instantiating search backends based on environment configuration
- Implemented OpenSearchProvider and TypesenseProvider with full REST API integration, including filter/sort translation and bulk operations
- Created a generic mongoSearch Mongoose plugin that works with any SearchProvider, replacing provider-specific plugins for non-MeiliSearch backends
- Updated Helm charts with OpenSearch subchart (3.4.0), Typesense configuration, mutual exclusivity validation, and security-aware configuration
- Enhanced Docker Compose and environment variable examples with clear switching instructions for all three providers
- Added 90 comprehensive unit tests covering factory logic, provider instantiation, and CRUD operations
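For readers unfamiliar with the pattern, the interface-plus-cached-factory arrangement listed above might look roughly like this. It is a minimal sketch under stated assumptions, not the PR's code: the method names (`healthCheck`, `search`), the `StubProvider` class, and the `getProvider`/`resetProvider` helpers are hypothetical stand-ins; only the `SearchProvider` name and the idea of a resettable singleton come from the PR.

```typescript
// Hypothetical sketch; method names and shapes are assumptions, not the PR's API.
type ProviderType = 'meilisearch' | 'opensearch' | 'typesense';

interface SearchProvider {
  readonly type: ProviderType;
  healthCheck(): Promise<boolean>;
  search(index: string, query: string): Promise<Array<Record<string, unknown>>>;
}

// Stand-in backend so the sketch runs without a search server.
class StubProvider implements SearchProvider {
  readonly type: ProviderType;

  constructor(type: ProviderType) {
    this.type = type;
  }

  async healthCheck(): Promise<boolean> {
    return true;
  }

  async search(): Promise<Array<Record<string, unknown>>> {
    return [];
  }
}

let cached: SearchProvider | null = null;

// Returns one shared provider instance, created on first use.
function getProvider(type: ProviderType): SearchProvider {
  if (cached === null) {
    cached = new StubProvider(type);
  }
  return cached;
}

// Clears the cached instance, e.g. between unit tests.
function resetProvider(): void {
  cached = null;
}
```

Because the instance is cached, a second call with a different type still returns the first provider until `resetProvider()` is called, which is why a reset hook is useful in tests.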
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| packages/data-schemas/src/models/plugins/search/searchProvider.ts | Core SearchProvider interface defining the contract for all search backends |
| packages/data-schemas/src/models/plugins/search/searchProviderFactory.ts | Factory with auto-detection from environment variables and singleton caching |
| packages/data-schemas/src/models/plugins/search/meiliSearchProvider.ts | Wrapper for existing MeiliSearch client behind the new interface |
| packages/data-schemas/src/models/plugins/search/openSearchProvider.ts | Full OpenSearch REST API implementation with HTTP Basic auth and TLS support |
| packages/data-schemas/src/models/plugins/search/typesenseProvider.ts | Full Typesense REST API implementation with JSONL bulk import and collection schema management |
| packages/data-schemas/src/models/plugins/mongoSearch.ts | Generic Mongoose plugin working with any SearchProvider |
| packages/data-schemas/src/models/message.ts | Updated to use generic plugin for OpenSearch/Typesense while preserving mongoMeili for MeiliSearch |
| packages/data-schemas/src/models/convo.ts | Updated to use generic plugin for OpenSearch/Typesense while preserving mongoMeili for MeiliSearch |
| helm/librechat/values.yaml | OpenSearch subchart with DHI image support, Typesense external connection config |
| helm/librechat/templates/configmap-env.yaml | Auto-wiring for all providers with individual env var overrides |
| helm/librechat/templates/checks.yaml | Mutual exclusivity validation for search backends |
| docker-compose.yml | OpenSearch and Typesense service definitions as commented-out alternatives |
| .env.example | Comprehensive documentation of all search provider environment variables |
| api/server/routes/search.js | Health check endpoint supporting all providers |
| api/db/indexSync.js | Sync operations extended to support all providers with backward compatibility |
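The auto-detection that the factory row in the table describes could be sketched as follows. This is an illustration only: the function name `detectBackend`, the precedence between `OPENSEARCH_HOST` and `TYPESENSE_HOST`, and the handling of unknown `SEARCH_PROVIDER` values are assumptions; the PR only states that detection is case-insensitive, that an explicit `SEARCH_PROVIDER` overrides detection, and that MeiliSearch is the fallback.

```typescript
// Hypothetical sketch of env-based detection; the precedence order is an assumption.
type SearchBackend = 'meilisearch' | 'opensearch' | 'typesense';

const KNOWN_BACKENDS: readonly SearchBackend[] = ['meilisearch', 'opensearch', 'typesense'];

function detectBackend(env: Record<string, string | undefined>): SearchBackend {
  // An explicit, case-insensitive SEARCH_PROVIDER wins when it names a known backend.
  const explicit = env['SEARCH_PROVIDER']?.trim().toLowerCase();
  if (explicit) {
    const match = KNOWN_BACKENDS.find((name) => name === explicit);
    if (match) {
      return match;
    }
  }
  // Otherwise infer the backend from whichever host variable is present.
  if (env['OPENSEARCH_HOST']) {
    return 'opensearch';
  }
  if (env['TYPESENSE_HOST']) {
    return 'typesense';
  }
  // Existing deployments keep working: MeiliSearch is the fallback.
  return 'meilisearch';
}
```

Passing the environment in as a parameter (rather than reading `process.env` directly) keeps the function easy to unit-test.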
| # ---- OpenSearch (alternative to MeiliSearch) ---- | ||
| # To use OpenSearch instead of MeiliSearch: | ||
| # 1. Comment out the 'meilisearch' service above | ||
| # 2. Uncomment the 'opensearch' service below | ||
| # 3. In .env, set: | ||
| # SEARCH_PROVIDER=opensearch | ||
| # OPENSEARCH_HOST=http://opensearch:9200 | ||
| # OPENSEARCH_INSECURE=true | ||
| # 4. In the 'api' service environment, replace MEILI_HOST with: | ||
| # OPENSEARCH_HOST=http://opensearch:9200 | ||
| # 5. Comment out MEILI_HOST in the 'api' service environment | ||
| # opensearch: | ||
| # container_name: chat-opensearch | ||
| # image: opensearchproject/opensearch:3.4.0 | ||
| # restart: always | ||
| # environment: | ||
| # - discovery.type=single-node | ||
| # - DISABLE_SECURITY_PLUGIN=true | ||
| # - DISABLE_INSTALL_DEMO_CONFIG=true | ||
| # - OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m | ||
| # volumes: | ||
| # - opensearch_data:/usr/share/opensearch/data | ||
| # ports: | ||
| # - "9200:9200" | ||
| | ||
| # ---- Typesense (alternative to MeiliSearch) ---- | ||
| # To use Typesense instead of MeiliSearch: | ||
| # 1. Comment out the 'meilisearch' service above | ||
| # 2. Uncomment the 'typesense' service below | ||
| # 3. In .env, set: | ||
| # SEARCH_PROVIDER=typesense | ||
| # TYPESENSE_HOST=http://typesense:8108 | ||
| # TYPESENSE_API_KEY=xyz (match the --api-key below) | ||
| # 4. In the 'api' service environment, replace MEILI_HOST with: | ||
| # TYPESENSE_HOST=http://typesense:8108 | ||
| # 5. Comment out MEILI_HOST in the 'api' service environment | ||
| # typesense: | ||
| # container_name: chat-typesense | ||
| # image: typesense/typesense:30.1 | ||
| # restart: always | ||
| # environment: | ||
| # - TYPESENSE_DATA_DIR=/data | ||
| # - TYPESENSE_API_KEY=${TYPESENSE_API_KEY:-xyz} | ||
| # volumes: | ||
| # - typesense_data:/data | ||
| # ports: | ||
| # - "8108:8108" | ||
| # command: "--data-dir /data --api-key=${TYPESENSE_API_KEY:-xyz} --enable-cors" |
The comments indicate that environment variables should be modified in the 'api' service, but those instructions appear inconsistent with the actual docker-compose structure. The docker-compose.yml file shows that MEILI_HOST is already in the api service environment (line 20), but the comments suggest adding OPENSEARCH_HOST or TYPESENSE_HOST there.
However, for docker-compose to work with the new providers, users would need to:
- Add the new env vars to the api service's environment section (not just .env file)
- Comment out or remove MEILI_HOST from the api service environment
Consider updating the comments to be clearer about these steps, or better yet, provide working examples in the docker-compose.override.yml.example file that users can uncomment.
| const providerType = detectSearchProvider ? detectSearchProvider() : null; | ||
| | ||
| if (providerType && providerType !== 'meilisearch') { | ||
| // Use generic search provider (OpenSearch, etc.) | ||
| const provider = getSearchProvider ? getSearchProvider() : null; |
The defensive checks detectSearchProvider ? detectSearchProvider() : null and getSearchProvider ? getSearchProvider() : null suggest uncertainty about whether these functions are available. However, these are imported at the top of the file from @librechat/data-schemas.
If the imports could fail or be undefined, this should be handled at the module level with proper error handling. If the imports are guaranteed to work (which they should be), these ternary checks are unnecessary and add confusion. Consider either:
- Removing the ternary checks if the imports are reliable
- Adding proper error handling at the import level if there's a legitimate concern about module availability
- Adding a comment explaining why these checks are needed
Suggested change
| - const providerType = detectSearchProvider ? detectSearchProvider() : null; | |
| - if (providerType && providerType !== 'meilisearch') { | |
| - // Use generic search provider (OpenSearch, etc.) | |
| - const provider = getSearchProvider ? getSearchProvider() : null; | |
| + const providerType = detectSearchProvider(); | |
| + if (providerType && providerType !== 'meilisearch') { | |
| + // Use generic search provider (OpenSearch, etc.) | |
| + const provider = getSearchProvider(); | |
| const [field, order] = s.split(':'); | ||
| return `${field}:${order || 'asc'}`; |
The sort parameter parsing at line 506 using split(':') could fail if a field name contains a colon character. While this is unlikely in practice, it's a potential edge case. Additionally, if the sort string doesn't contain a colon, the split will return an array with one element, making order undefined, which then defaults to 'asc'. This behavior is intentional based on the code, but field names with colons would break.
Consider using a more robust parsing approach, such as:
- Using a regex that matches the last colon: /^(.+):(.+)$/
- Or documenting that field names cannot contain colons
This same issue exists in OpenSearchProvider at line 372.
Suggested change
| - const [field, order] = s.split(':'); | |
| - return `${field}:${order || 'asc'}`; | |
| + const match = s.match(/^(.+?):(asc|desc)$/i); | |
| + const field = match ? match[1] : s; | |
| + const order = match ? match[2].toLowerCase() : 'asc'; | |
| + return `${field}:${order}`; | |
| break; | ||
| } | ||
| | ||
| const idsToDelete = searchResult.hits.filter((hit) => !hit.user).map((hit) => hit.id); |
The function deleteDocumentsWithoutUserFieldGeneric at line 277 assumes that documents have an id field (hit.id). However, this may not be consistent across all providers:
- OpenSearch stores documents with _id which is mapped to the primary key field in the _source
- Typesense uses id as the primary key
- MeiliSearch uses the configured primary key (e.g., messageId, conversationId)
This could cause the deletion to fail or delete the wrong documents if hit.id is undefined or has a different meaning than expected. Consider using the primaryKey parameter to properly identify documents, similar to how it's done in the MeiliSearch version (deleteDocumentsWithoutUserField at line 44).
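One way to act on this is to resolve each hit's identifier through the configured primary key before deleting. The following is a rough sketch, not the PR's implementation: the `Hit` type, the `idsWithoutUser` name, and the fallback to `id` when the primary-key field is absent are all assumptions made for illustration.

```typescript
// Hypothetical shape of a search hit; real providers may return richer objects.
type Hit = Record<string, unknown>;

// Collects identifiers of hits that lack a `user` field, preferring the
// configured primary key and falling back to `id` (an assumed fallback).
function idsWithoutUser(hits: Hit[], primaryKey: string): string[] {
  const ids: string[] = [];
  for (const hit of hits) {
    if (hit['user']) {
      continue; // The document already has an owner, so keep it.
    }
    const key = hit[primaryKey] ?? hit['id'];
    if (typeof key === 'string' && key.length > 0) {
      ids.push(key);
    }
    // Hits without any usable identifier are skipped rather than guessed at.
  }
  return ids;
}
```

Skipping hits that have no usable identifier avoids sending `undefined` to the delete call, at the cost of leaving those documents in the index.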
| } catch { | ||
| // undici not available; fetch will use default TLS settings | ||
| logger.debug('[OpenSearchProvider] undici not available for insecure TLS'); |
The undici dependency is used for insecure TLS support but it's not declared in package.json. This creates a hidden runtime dependency that may fail silently if undici is not available. Consider either:
- Adding undici as a peer dependency or optional dependency in package.json
- Documenting in README/installation guide that undici is required for OPENSEARCH_INSECURE=true functionality
- Providing a clearer error message when insecure mode is requested but undici is unavailable
Currently, the catch block only logs a debug message, which means users may not realize why insecure mode isn't working.
Suggested change
| - } catch { | |
| - // undici not available; fetch will use default TLS settings | |
| - logger.debug('[OpenSearchProvider] undici not available for insecure TLS'); | |
| + } catch (error) { | |
| + // undici not available; fetch will use default TLS settings | |
| + logger.warn( | |
| + '[OpenSearchProvider] Insecure TLS requested but "undici" is not available. ' + | |
| + 'Falling back to default TLS verification. To enable insecure TLS, install the "undici" package.', | |
| + { error }, | |
| + ); | |
| const hits: SearchHit[] = (hitsData?.hits || []).map((hit) => ({ | ||
| ...hit._source, | ||
| })); |
The search() method returns hits without including the document ID from OpenSearch's _id field (line 384-386), while getDocuments() does include it as _opensearch_id (line 340). This inconsistency means that search results won't have an ID field that can be used for deletion operations.
This is problematic because:
- The deleteDocumentsWithoutUserFieldGeneric function in indexSync.js expects hit.id to be available
- Users of the search API may expect consistent behavior between search() and getDocuments()
Consider adding the primary key field (or _opensearch_id) to search results, similar to what's done in getDocuments(). The hits should include the document's primary key value so they can be used for deletion or updates.
Suggested change
| - const hits: SearchHit[] = (hitsData?.hits || []).map((hit) => ({ | |
| - ...hit._source, | |
| - })); | |
| + const hits: SearchHit[] = (hitsData?.hits || []).map((hit) => { | |
| + const source = hit._source || ({} as SearchHit); | |
| + const result: SearchHit = { | |
| + ...source, | |
| + _opensearch_id: hit._id, | |
| + } as SearchHit; | |
| + if (!(source as any).id) { | |
| + (result as any).id = hit._id; | |
| + } | |
| + return result; | |
| + }); | |
| const match = part.trim().match(/^(\w+)\s*=\s*["'](.+?)["']$/); | ||
| if (match) { | ||
| const [, field, value] = match; |
The regex pattern /^(\w+)\s*=\s*["'](.+?)["']$/ uses a non-greedy match (.+?) which will stop at the first closing quote. This means filters with values containing quotes (e.g., user = "John's Team") will fail to parse correctly or may produce incorrect results.
Consider escaping quotes in values or using a more robust parsing approach that handles edge cases like:
- Values with embedded quotes: user = "John's Team"
- Values with special characters: name = "test&value"
- Empty values: field = ""
This is a similar issue in both OpenSearch (line 461) and Typesense (line 544) providers.
Suggested change
| - const match = part.trim().match(/^(\w+)\s*=\s*["'](.+?)["']$/); | |
| - if (match) { | |
| - const [, field, value] = match; | |
| + const match = part.trim().match(/^(\w+)\s*=\s*(["'])(.*)\2$/); | |
| + if (match) { | |
| + const [, field, , value] = match; | |
| {{- if not (dig "configEnv" "OPENSEARCH_INSECURE" "" .Values.librechat) }} | ||
| OPENSEARCH_INSECURE: "true" | ||
| {{- end }} |
When the security plugin is enabled (DISABLE_SECURITY_PLUGIN is not "true"), the protocol is correctly set to "https" in OPENSEARCH_HOST, but OPENSEARCH_INSECURE is still hardcoded to "true" (line 33). This seems inconsistent: if security is not disabled, shouldn't OPENSEARCH_INSECURE default to "false"?
The current logic:
- If DISABLE_SECURITY_PLUGIN=true → use http protocol
- If DISABLE_SECURITY_PLUGIN≠true → use https protocol
- But OPENSEARCH_INSECURE is always set to "true" regardless
This means even with security enabled (https), the client is configured to skip certificate validation, which may not be the intended behavior for production. Consider making OPENSEARCH_INSECURE conditional based on the security plugin state.
Suggested change
| - {{- if not (dig "configEnv" "OPENSEARCH_INSECURE" "" .Values.librechat) }} | |
| - OPENSEARCH_INSECURE: "true" | |
| - {{- end }} | |
| + {{- if not (dig "configEnv" "OPENSEARCH_INSECURE" "" .Values.librechat) }} | |
| + {{- $securityDisabled := false }} | |
| + {{- range (.Values.opensearch.extraEnvs | default list) }} | |
| + {{- if and (eq .name "DISABLE_SECURITY_PLUGIN") (eq .value "true") }} | |
| + {{- $securityDisabled = true }} | |
| + {{- end }} | |
| + {{- end }} | |
| + {{- if $securityDisabled }} | |
| + OPENSEARCH_INSECURE: "true" | |
| + {{- else }} | |
| + OPENSEARCH_INSECURE: "false" | |
| + {{- end }} | |
| + {{- end }} | |
| } | ||
| // Typesense supports batch delete via filter_by | ||
| // For ID-based deletion, we use individual deletes or filter | ||
| const filterBy = `id: [${documentIds.join(',')}]`; |
The filter syntax for batch deletion uses id: [${documentIds.join(',')}] but this doesn't properly escape or quote the individual IDs. According to Typesense documentation, the filter_by syntax for an array of values should be id: [value1, value2] where string values need to be properly escaped if they contain special characters.
If document IDs contain commas, spaces, or other special characters, this filter will fail or produce incorrect results. Consider properly escaping or quoting each ID in the array, e.g., id: [${documentIds.map(id => JSON.stringify(id)).join(',')}].
Suggested change
| - const filterBy = `id: [${documentIds.join(',')}]`; | |
| + const filterBy = `id: [${documentIds.map((id) => JSON.stringify(id)).join(',')}]`; | |
| private async request( | ||
| method: string, | ||
| path: string, | ||
| body?: unknown, | ||
| ): Promise<{ status: number; data: Record<string, unknown> }> { | ||
| const url = `${this.node}${path}`; | ||
| const headers: Record<string, string> = { | ||
| 'Content-Type': 'application/json', | ||
| Authorization: this.authHeader, | ||
| }; | ||
| | ||
| const fetchOptions: RequestInit & { dispatcher?: unknown } = { | ||
| method, | ||
| headers, | ||
| body: body !== undefined ? JSON.stringify(body) : undefined, | ||
| }; | ||
| | ||
| // Support insecure TLS via undici dispatcher when available | ||
| if (this.insecure) { | ||
| try { | ||
| // Use require() so Rollup externalizes this instead of code-splitting | ||
| // eslint-disable-next-line @typescript-eslint/no-var-requires | ||
| const { Agent } = require('undici') as { Agent: new (opts: Record<string, unknown>) => unknown }; | ||
| fetchOptions.dispatcher = new Agent({ | ||
| connect: { rejectUnauthorized: false }, | ||
| }); | ||
| } catch { | ||
| // undici not available; fetch will use default TLS settings | ||
| logger.debug('[OpenSearchProvider] undici not available for insecure TLS'); | ||
| } | ||
| } | ||
| | ||
| const response = await fetch(url, fetchOptions); | ||
| const text = await response.text(); | ||
| let data: Record<string, unknown> = {}; | ||
| try { | ||
| data = JSON.parse(text) as Record<string, unknown>; | ||
| } catch { | ||
| data = { raw: text }; | ||
| } | ||
| return { status: response.status, data }; | ||
| } |
The OpenSearchProvider doesn't implement connection timeouts for fetch requests, unlike TypesenseProvider which has configurable connectionTimeoutMs. This means OpenSearch requests could hang indefinitely if the server doesn't respond, potentially blocking operations.
Consider adding timeout support similar to TypesenseProvider:
- Add a connectionTimeoutMs option to OpenSearchProviderOptions
- Implement AbortController with setTimeout in the request() and bulkRequest() methods
- Default to a reasonable timeout (e.g., 30000ms for regular requests, 60000ms for bulk operations)
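The AbortController approach listed above could be sketched like this. It is a minimal illustration rather than the provider's code: `fetchWithTimeout` and its `fetchImpl` parameter are hypothetical names (the latter exists only so the helper can be exercised without a network), and the 30000 ms default simply mirrors the figure mentioned in the comment.

```typescript
// Hypothetical helper: aborts the request if no response arrives within timeoutMs.
async function fetchWithTimeout(
  url: string,
  init: RequestInit = {},
  timeoutMs: number = 30000,
  fetchImpl: typeof fetch = fetch,
): Promise<Response> {
  const controller = new AbortController();
  // Abort the in-flight request once the deadline passes.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { ...init, signal: controller.signal });
  } finally {
    // Always clear the timer so it cannot fire after the call has settled.
    clearTimeout(timer);
  }
}
```

Note that the timer is cleared as soon as the response headers arrive, so in this sketch a slow body read via `response.text()` is not covered by the timeout.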
I just wanted to add that I appreciate this work as we have been having issues maintaining our meilisearch instance at our scale, so having an alternative provider would be much appreciated.
Hey there, overall I can recommend looking into https://typesense.org/docs/guide/install-typesense.html#kubernetes - I think that's most likely the "easiest" option to get a multi-replica cluster without fuss - beware though that Typesense seems to be a bit "special" as well, but it should be viable! If anyone has further wishes or would like to narrow down changes, let me know!
Pull Request Template
Summary
This PR introduces a modular search provider abstraction that enables LibreChat to use OpenSearch and Typesense as alternative search backends alongside the existing MeiliSearch integration.
Adds support for opensearch and typesense, both being alternatives to meilisearch.
Initial motivation: MeiliSearch cannot run in multi-replica mode, which is a problem for these reasons: 1) high-availability deployments are not possible; 2) the application cannot keep running during node maintenance, e.g. because the pod disruption budget can't be set to always keep 1 pod (maintenance wouldn't be possible); 3) problems with volume attachments can fry the application completely (love it).
It seems, though, that in single-node mode it starts perfectly fine with the adapted LibreChat values.yaml and a fully hardened security context + DHI image of OpenSearch (DHI = Docker Hardened Image; I always try to use the DHI images wherever possible).
So I tested everything and everything looks great; however, it looks like I won't be able to use OpenSearch (yayyy 🙃), because my security settings don't allow escalated privileges and those are apparently required for OpenSearch.
Note that I did not want to include too many providers and thought starting out with those 2 alternatives to Meili might be a solid option. This PR therefore partly addresses #6712
What's included
Search Provider Abstraction Layer
- SearchProvider interface defining a standard contract for all search backends (health check, index management, document CRUD, search with filters/sort/pagination)
- SearchProviderFactory with auto-detection from environment variables and explicit SEARCH_PROVIDER override
- resetSearchProvider() for testing
Provider Implementations
- OpenSearch (opensearchproject/opensearch:3.4.0) with HTTP Basic auth, TLS support, MeiliSearch-style filter translation to OpenSearch DSL
- Typesense (typesense/typesense:30.1) with JSONL bulk import, collection schema management, and filter translation
Generic Mongoose Plugin
- mongoSearch.ts - a generic Mongoose plugin that works with any SearchProvider, replacing the need for provider-specific plugins
- convo.ts and message.ts updated to use the generic plugin for OpenSearch/Typesense while preserving the original mongoMeili plugin for MeiliSearch
Helm Chart Updates
- values.yaml - OpenSearch subchart (3.4.0) with DHI image switching support; Typesense section for external instance connection; secretKeyRef documentation for all sensitive credentials - NOTE:
- configmap-env.yaml - auto-wiring for all three providers with every env var individually overridable via librechat.configEnv (supports remote/custom-hosted backends); secrets (OPENSEARCH_PASSWORD, TYPESENSE_API_KEY) kept out of ConfigMap
- checks.yaml - mutual exclusivity validation ensuring only one search backend is enabled (moved from _checks.yaml, which was never rendered by Helm)
- NOTES.txt - search backend status in Helm install notes
- configEnv overrides
Docker Compose
- docker-compose.override.yml.example with clear switching instructions
Environment Variables
- .env.example updated with all new env vars: SEARCH_PROVIDER, OPENSEARCH_HOST, OPENSEARCH_USERNAME, OPENSEARCH_PASSWORD, OPENSEARCH_INSECURE, TYPESENSE_HOST, TYPESENSE_API_KEY
Unit Tests - 90 new tests
- searchProviderFactory.spec.ts (28 tests) - detection, factory instantiation, caching, isSearchEnabled, case-insensitivity, explicit overrides
- openSearchProvider.spec.ts (32 tests) - constructor, health check, index/document CRUD, search with filter/sort/pagination translation, error handling
- typesenseProvider.spec.ts (30 tests) - constructor, health check, collection CRUD, JSONL import, search with Typesense filter translation, error handling
Helm Template Validation Script - 13 scenarios
- helm/test-search-backends.sh - automated validation covering default MeiliSearch, OpenSearch subchart, Typesense, external connections, configEnv overrides, security plugin detection (http/https), and mutual exclusivity checks - NOTE: Not added, can be added upon request, but I thought maybe it's not desired to have this in the helm folder
Backward Compatibility
- If SEARCH_PROVIDER is not set, the factory auto-detects from environment variables with MeiliSearch as the fallback default.
Change Type
Testing
Unit Tests
All 90 tests pass.
Helm Template Validation
# Run the automated Helm template test suite (13 scenarios) - Note: Not pushed
./helm/test-search-backends.sh
Local Integration Testing (Docker Compose)
docker-compose.override.yml for testing opensearch:
docker-compose.override.yml for testing typesense:
Set in .env:
Run docker compose up
Manual Testing - OpenSearch
Manual Testing - Typesense
Test Configuration:
Checklist