Snowflake

The Snowflake AI Data Cloud provides:

This guide walks you through how to use your Snowflake account for LLMs and/or for retrieval.

Connect to your Snowflake account

The sample application can authenticate with a programmatic access token (PAT):

Log in to your Snowflake account, e.g., https://<account_identifier>.snowflakecomputing.com/
Click on your user, then "Settings", then "Authentication"
Under "Programmatic access tokens" click "Generate new token"
Set SNOWFLAKE_ACCOUNT_URL and SNOWFLAKE_PAT in the .env file (as README.md suggests).
(Optional) Set SNOWFLAKE_EMBEDDING_MODEL to an embedding model available in your Snowflake account.
(Optional) Set SNOWFLAKE_CORTEX_SEARCH_SERVICE to the fully qualified name of the Cortex Search Service to use for retrieval.
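
Before running the repository's connectivity check, it can help to confirm that the variables above are visible to Python. The following is a minimal sketch, not the repository's own script: the function name `check_snowflake_env` is hypothetical, and it assumes the values from .env have already been loaded into the process environment (for example by exporting them in your shell).

```python
import os

# Variable names taken from the steps above.
REQUIRED = ("SNOWFLAKE_ACCOUNT_URL", "SNOWFLAKE_PAT")
OPTIONAL = ("SNOWFLAKE_EMBEDDING_MODEL", "SNOWFLAKE_CORTEX_SEARCH_SERVICE")


def check_snowflake_env(env=None):
    """Return (missing_required, unset_optional) lists of variable names.

    `env` defaults to os.environ; pass a dict to check other values.
    A variable that is empty or whitespace-only counts as unset.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name, "").strip()]
    unset = [name for name in OPTIONAL if not env.get(name, "").strip()]
    return missing, unset


if __name__ == "__main__":
    missing, unset = check_snowflake_env()
    if missing:
        print("Missing required settings:", ", ".join(missing))
    else:
        print("Required Snowflake settings are present.")
    if unset:
        print("Optional settings not set:", ", ".join(unset))
```

This only checks that the values exist; it does not verify that the token or account URL are valid.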

Run from the 'python' directory:

python code/python/testing/check_connectivity.py

You'll see a three-line report showing whether configuration has been set correctly for Snowflake services.

Edit config_llm.yaml and change preferred_endpoint at the top to preferred_endpoint: snowflake
(Optional) Adjust the models to use by setting snowflake.models.high or snowflake.models.low in config_llm.yaml to any of the models available to your Snowflake account.

Edit config_retrieval.yaml and change write_endpoint at the top to write_endpoint: snowflake_cortex_search_1
(Optional) To populate a Cortex Search Service with the SciFi Movies dataset included in this repository:
a. Install the Snowflake CLI and configure your connection. Make sure to set role, database, and schema in the connections.toml file.
b. Run the snowflake.sql script to index the SciFi Movies data (Cortex Search will automatically vectorize and also build a keyword index) using the snow command, for example:
```
snow sql \
    -f ../code/python/retrieval_providers/utils/snowflake.sql \
    -D "DATA_DIR=$(git rev-parse --show-toplevel)/data" \
    -D WAREHOUSE=<name of the warehouse in your Snowflake account to use for compute> \
    -c <name of the configured connection>
```
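
Once a Cortex Search Service exists, one way to check it outside the sample application is to query it over REST with the same PAT. The sketch below only builds the request. It is an illustration under assumptions, not the application's retrieval code: the helper `build_search_request` is hypothetical, the endpoint path reflects our understanding of Snowflake's Cortex Search REST API and should be verified against the Snowflake documentation, and the service name is assumed to be a simple DATABASE.SCHEMA.SERVICE string without quoted identifiers.

```python
import json
import urllib.request


def build_search_request(account_url, pat, service_fqn, query, columns, limit=5):
    """Build (but do not send) a POST request that queries a Cortex Search Service.

    Assumes `service_fqn` looks like DATABASE.SCHEMA.SERVICE with no dots
    inside quoted identifiers.
    """
    parts = service_fqn.split(".")
    if len(parts) != 3:
        raise ValueError("expected a name of the form DATABASE.SCHEMA.SERVICE")
    database, schema, service = parts

    # Assumed endpoint layout; verify against the Snowflake REST API reference.
    url = (
        f"{account_url.rstrip('/')}/api/v2/databases/{database}"
        f"/schemas/{schema}/cortex-search-services/{service}:query"
    )
    body = json.dumps({"query": query, "columns": columns, "limit": limit})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {pat}",  # the PAT is sent as a bearer token
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )
```

To send the request, pass the result to `urllib.request.urlopen` and decode the JSON response. The `columns` argument must name columns that the service actually exposes; the right values depend on how snowflake.sql defines the service.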