# Trace-Based Testing for Chatbot RAG Application

*The instructions and tests below have been used with models hosted by OpenAI.
We plan to add tests for additional model configurations in the future.*

## Introduction to Trace Testing

Trace-based testing is a modern approach to testing distributed systems that leverages the distributed traces flowing through your applications. In a complex system like the chatbot RAG application, traditional testing approaches often fall short because they cannot effectively verify the interactions between microservices, databases, and external APIs.

Tracetest is an open-source tool that enables you to create, run, and maintain integration tests using distributed traces, with support for OpenTelemetry and observability backends such as Elastic APM. It allows you to:

- Validate the flow of requests through your entire system
- Assert on specific spans within a trace
- Test complex scenarios involving multiple services

For more information about Tracetest, visit the [official documentation](https://docs.tracetest.io/).
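
To give a flavor of what this looks like in practice, a Tracetest test is defined declaratively in YAML (the test used here lives under `resources/`) and executed against a running Tracetest server. The commands below are only a sketch: the bundled `run-tests.sh` script described later drives everything for you, and CLI flag names can vary between Tracetest versions.

```bash
# Sketch only: flag names can vary between Tracetest CLI versions, and the
# bundled run-tests.sh script handles all of this for you.
tracetest configure --server-url http://localhost:11633
tracetest run test --file resources/openai-chatbot-test.yaml
```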

## Setup

The chatbot RAG application test setup uses Docker to create a testing environment that includes:

1. A Tracetest server for executing and managing tests
2. An Elasticsearch cluster for storing traces, logs, and application data
3. An OpenTelemetry collector for processing and routing telemetry data
4. The chatbot RAG application itself

The setup leverages several Docker Compose files to combine the test environment with the local Elastic Stack (from [docker/docker-compose-elastic.yml](../../../../docker/docker-compose-elastic.yml)) and the chatbot RAG application (from [example-apps/chatbot-rag-app/docker-compose.yml](../../docker-compose.yml)). To bring up current versions of all the moving parts, we layer in overrides maintained in this directory (a sketch of how the files fit together follows below):

- `docker-compose.test.yml` - base Tracetest configuration
- `docker-compose.test.override.yml` - test-specific Tracetest configuration
- `elastic-stack.override.yml` - test-specific configuration for Elasticsearch and the OpenTelemetry Collector
- `chatbot-rag.override.yml` - configuration for the chatbot application in test mode

All services are connected through a shared Docker network to enable communication between components.
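
Purely as an illustration of how these files fit together, the layering looks roughly like the commands below. The `run-tests.sh` script covered in the Running the Tests section performs this wiring for you; the network name here is a placeholder and the exact invocation in the script may differ.

```bash
# Rough sketch of the compose layering; run-tests.sh is the source of truth
# and "elastic-net" is only a placeholder network name.
docker network create elastic-net || true

# Elastic Stack (Elasticsearch and OpenTelemetry Collector) with test overrides
docker compose -f ../../../../docker/docker-compose-elastic.yml -f elastic-stack.override.yml up -d

# Chatbot RAG application in test mode
docker compose -f ../../docker-compose.yml -f chatbot-rag.override.yml up -d --build

# Tracetest server and its test-specific configuration
docker compose -f docker-compose.test.yml -f docker-compose.test.override.yml up -d
```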

## Environment Configuration

Before running tests, you need to prepare a `.env.test` file with the necessary environment variables. This file configures the behavior of the chatbot application during testing (the same configuration as described in [the application's directory](../../README.md)).

Create a `.env.test` file in the `test/tracetest` directory with the following content to reproduce the environment we're testing with:

```bash
# Location of the application routes
FLASK_APP=api/app.py
# Ensure print statements appear as they happen
PYTHONUNBUFFERED=1

# How you connect to Elasticsearch: change details to your instance
ELASTICSEARCH_URL=http://elasticsearch:9200
ELASTICSEARCH_USER=elastic
ELASTICSEARCH_PASSWORD=elastic

# The names of the Elasticsearch indexes
ES_INDEX=workplace-app-docs
ES_INDEX_CHAT_HISTORY=workplace-app-docs-chat-history

# OpenAI Configuration
LLM_TYPE=openai
OPENAI_API_KEY=
CHAT_MODEL=gpt-4o-mini

# Set to false to record logs, traces and metrics
OTEL_SDK_DISABLED=false

# Assign the service name that shows up in Kibana
OTEL_SERVICE_NAME=chatbot-rag-app

# OpenTelemetry configuration
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true

# Performance tuning
OTEL_METRIC_EXPORT_INTERVAL=3000
OTEL_BSP_SCHEDULE_DELAY=3000
OTEL_EXPERIMENTAL_RESOURCE_DETECTORS=process_runtime,os,otel,telemetry_distro
```
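
If you want to catch a missing key before spinning everything up, an optional check such as the one below can be run from the `test/tracetest` directory. It is not part of the provided tooling, just a convenience:

```bash
# Optional convenience check (not part of the provided tooling):
# warn if OPENAI_API_KEY was left empty in .env.test.
if grep -qE '^OPENAI_API_KEY=[[:space:]]*$' .env.test; then
  echo "OPENAI_API_KEY is empty in .env.test; add your key before running the tests." >&2
fi
```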

> Note: Make sure to set `OPENAI_API_KEY` to your actual OpenAI API key.

## Running the Tests

To run the trace-based tests for the chatbot RAG application, follow these steps:

1. Navigate to the test directory:

   ```bash
   cd example-apps/chatbot-rag-app/test/tracetest
   ```

2. Execute the test script:

   ```bash
   ./run-tests.sh
   ```

   To automatically clean up resources after the tests complete (or if they fail), use the `--with-cleanup` flag:

   ```bash
   ./run-tests.sh --with-cleanup
   ```

The script performs the following operations:

- Creates a shared Docker network for all services
- Sets up the Tracetest server
- Starts the Elastic Stack (Elasticsearch and OpenTelemetry Collector)
- Builds and starts the chatbot RAG application
- Executes the trace tests defined in `resources/openai-chatbot-test.yaml`
- If `--with-cleanup` is provided, automatically cleans up all resources when the script exits, whether normally or due to an error (a sketch of this pattern follows below)
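
The cleanup behavior follows the usual shell pattern of registering a teardown function on exit. The sketch below shows that pattern under the assumption that teardown means bringing the compose stacks down; the actual `run-tests.sh` may differ in detail.

```bash
#!/usr/bin/env bash
# Sketch of the cleanup-on-exit pattern; the real run-tests.sh may differ.
set -euo pipefail

cleanup() {
  # Hypothetical teardown: bring down the stacks described above.
  docker compose -f docker-compose.test.yml -f docker-compose.test.override.yml down -v || true
  docker compose -f ../../docker-compose.yml -f chatbot-rag.override.yml down -v || true
  docker compose -f ../../../../docker/docker-compose-elastic.yml -f elastic-stack.override.yml down -v || true
}

# With --with-cleanup, run the teardown whether the script exits normally or on error.
if [[ "${1:-}" == "--with-cleanup" ]]; then
  trap cleanup EXIT
fi
```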

The example test sends a question about the work-from-home policy to the LLM via the API and validates several aspects of the application (an illustrative version of that request is sketched after this list):

- Successful interaction with the LLM (in the initial setup, `gpt-4o-mini` via the OpenAI API)
- Proper search operations in Elasticsearch for RAG functionality
- Correct updating of chat history
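
For reference, the request the test issues looks roughly like the call below. The endpoint path, port, and payload shape are assumptions about the locally running app, not taken from the test file; the authoritative request and assertions live in `resources/openai-chatbot-test.yaml`.

```bash
# Illustrative only: endpoint, port, and payload shape are assumptions about the
# local chatbot app; see resources/openai-chatbot-test.yaml for the real request.
curl -s -X POST http://localhost:4000/api/chat \
  -H 'Content-Type: application/json' \
  -d '{"question": "What is the work from home policy?", "session_id": "test-session-1"}'
```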