| 1 | +# Pinecone Troubleshooting Guide |
| 2 | + |
| 3 | +> **Prerequisites**: See [PINECONE.md](./PINECONE.md) for universal concepts and setup. |
| 4 | + |
| 5 | +This guide covers common troubleshooting issues that apply across all programming languages. For language-specific troubleshooting and code examples, see the appropriate language guide: |
| 6 | + |
| 7 | +- **Python**: [PINECONE-python.md](./PINECONE-python.md#troubleshooting) |
| 8 | +- **TypeScript/Node.js**: [PINECONE-typescript.md](./PINECONE-typescript.md#troubleshooting) |
| 9 | +- **Go**: [PINECONE-go.md](./PINECONE-go.md#troubleshooting) |
| 10 | +- **Java**: [PINECONE-java.md](./PINECONE-java.md#troubleshooting) |
| 11 | + |
| 12 | +## Common Problems & Solutions |
| 13 | + |
| 14 | +### Problem: `describeIndexStats()` returns 0 records after upsert |
| 15 | + |
| 16 | +**Cause**: Eventual consistency. The records have been accepted but are not indexed yet. |
| 17 | + |
| 18 | +**Solution**: |
| 19 | + |
| 20 | +1. Wait 10+ seconds minimum after upserting records |
| 21 | +2. Check if records were actually upserted (no errors thrown) |
| 22 | +3. Use a polling pattern for production code (see language-specific guides for examples) |
| 23 | + |
| 24 | +**How to verify**: |
| 25 | + |
| 26 | +- Wait 10-20 seconds after upserting |
| 27 | +- Check stats again with `describeIndexStats()` |
| 28 | +- Records should appear in stats within 10-20 seconds |
| 29 | + |
| 30 | +### Problem: Search returns no results |
| 31 | + |
| 32 | +**Cause**: Usually one of these: |
| 33 | + |
| 34 | +1. Field name mismatch (using wrong field in `--field_map`) |
| 35 | +2. Records not indexed yet (eventual consistency) |
| 36 | +3. Empty namespace or wrong namespace name |
| 37 | +4. Filtering too aggressively |
| 38 | + |
| 39 | +**Solutions**: |
| 40 | + |
| 41 | +1. **Verify field_map matches your records**: |
| 42 | + |
| 43 | + - If you created index with `--field_map text=content` |
| 44 | + - Make sure records have `content` field, not `text` |
| 45 | + - The field name in records must match the right side of `field_map` |
| 46 | + |
| 47 | +2. **Check if records exist**: |
| 48 | + |
| 49 | + - List records using `listPaginated()` or equivalent |
| 50 | + - Verify records are in the correct namespace |
| 51 | + - See language-specific guides for code examples |
| 52 | + |
| 53 | +3. **Try simple search without filters**: |
| 54 | + |
| 55 | + - Start with a basic search query without any filters |
| 56 | + - Gradually add filters to isolate the issue |
| 57 | + - See language-specific guides for code examples |
| 58 | + |
| 59 | +4. **Wait for indexing**: |
| 60 | + - Records become searchable 5-10 seconds after upsert |
| 61 | + - Use polling pattern in production code |
| 62 | + - See [Indexing Delays & Eventual Consistency](#indexing-delays--eventual-consistency) below |
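A field-name mismatch can be ruled out before upserting by scanning a sample of records. The sketch below is illustrative only: `findRecordsMissingField` is a hypothetical helper, not part of any Pinecone SDK, and it assumes records are plain objects with an `_id` key and an index created with `--field_map text=content`.

```typescript
// Hypothetical record shape (an assumption): a string ID plus arbitrary flat fields.
type TextRecord = { _id: string; [field: string]: unknown };

// Returns the IDs of records whose mapped text field is missing, non-string, or empty.
function findRecordsMissingField(records: TextRecord[], mappedField: string): string[] {
  const missing: string[] = [];
  for (const record of records) {
    const value = record[mappedField];
    if (typeof value !== "string" || value.trim().length === 0) {
      missing.push(record._id);
    }
  }
  return missing;
}

// With `--field_map text=content`, every record needs a `content` field.
const sampleRecords: TextRecord[] = [
  { _id: "rec1", content: "Vector databases store embeddings." },
  { _id: "rec2", text: "This record uses the wrong field name." },
];
const missingIds = findRecordsMissingField(sampleRecords, "content");
console.log(missingIds); // rec2 is flagged because it has `text` instead of `content`
```

Running a check like this on the first batch is usually enough to catch the mismatch, since records tend to share one shape.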
| 63 | + |
| 64 | +### Problem: Metadata too large error |
| 65 | + |
| 66 | +**Cause**: Metadata exceeds 40KB limit per record |
| 67 | + |
| 68 | +**Solution**: |
| 69 | + |
| 70 | +- Check metadata size (must be under 40KB) |
| 71 | +- Remove unnecessary metadata fields |
| 72 | +- Flatten nested objects (nested structures not allowed) |
| 73 | +- Store large data elsewhere and reference it in metadata |
| 74 | + |
| 75 | +**Best practices**: |
| 76 | + |
| 77 | +- Keep metadata minimal (IDs, categories, dates, etc.) |
| 78 | +- Store large content in the record fields, not metadata |
| 79 | +- Use flat structure only (no nested objects) |
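Metadata size can be estimated client-side before upserting. This is a rough sketch under two assumptions: that 40KB means 40 * 1024 bytes, and that the size is close to the UTF-8 byte length of the JSON-serialized metadata. How the service actually measures size may differ, so treat the result as an approximation. All function names here are hypothetical.

```typescript
// Assumption: 40KB is taken as 40 * 1024 bytes; confirm the exact figure in the official docs.
const MAX_METADATA_BYTES = 40 * 1024;

// Counts UTF-8 bytes without relying on Node-only or browser-only APIs.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) {
      bytes += 1;
    } else if (code < 0x800) {
      bytes += 2;
    } else if (code >= 0xd800 && code <= 0xdbff) {
      bytes += 4; // a surrogate pair encodes one 4-byte code point
      i++; // skip the low surrogate
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

// Approximates metadata size as the byte length of its JSON serialization.
function metadataSizeBytes(metadata: Record<string, unknown>): number {
  return utf8ByteLength(JSON.stringify(metadata));
}

function isMetadataWithinLimit(metadata: Record<string, unknown>): boolean {
  return metadataSizeBytes(metadata) <= MAX_METADATA_BYTES;
}

const smallMetadata = { category: "docs", year: 2024 };
const oversizedMetadata = { body: new Array(50001).join("x") }; // a 50,000-character string
console.log(isMetadataWithinLimit(smallMetadata), isMetadataWithinLimit(oversizedMetadata));
```

A record that fails this check is a candidate for moving the large value out of metadata and keeping only a reference to it.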
| 80 | + |
| 81 | +### Problem: Batch too large error |
| 82 | + |
| 83 | +**Cause**: Exceeding batch size limits |
| 84 | + |
| 85 | +**Solution**: |
| 86 | + |
| 87 | +- **Text records**: MAX 96 records per batch, 2MB total |
| 88 | +- **Vector records**: MAX 1000 records per batch, 2MB total |
| 89 | +- Split large batches into smaller chunks |
| 90 | +- See language-specific guides for batch processing examples |
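Splitting by record count can be sketched with a small generic helper. `chunkIntoBatches` and `BATCH_LIMITS` are hypothetical names; the limits are the ones quoted above. The helper only enforces the record count, so the 2MB total still has to be checked separately.

```typescript
// Per-batch record limits quoted in this guide.
const BATCH_LIMITS = { text: 96, vector: 1000 };

// Splits `items` into consecutive batches of at most `batchSize` elements.
function chunkIntoBatches<T>(items: T[], batchSize: number): T[][] {
  if (!(batchSize >= 1) || Math.floor(batchSize) !== batchSize) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    batches.push(items.slice(start, start + batchSize));
  }
  return batches;
}

// 200 text records become three batches.
const textRecords: { _id: string; content: string }[] = [];
for (let i = 0; i < 200; i++) {
  textRecords.push({ _id: "rec" + i, content: "text " + i });
}
const textBatches = chunkIntoBatches(textRecords, BATCH_LIMITS.text);
console.log(textBatches.map((batch) => batch.length)); // batch sizes: 96, 96, 8
```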
| 91 | + |
| 92 | +### Problem: Nested metadata error |
| 93 | + |
| 94 | +**Cause**: Metadata contains nested objects or arrays of objects |
| 95 | + |
| 96 | +**Solution**: |
| 97 | + |
| 98 | +- Flatten all metadata to a single level |
| 99 | +- Use string lists instead of object arrays |
| 100 | +- Convert nested structures to flat key-value pairs |
| 101 | + |
| 102 | +**Example**: |
| 103 | + |
| 104 | +```typescript |
| 105 | +// ❌ WRONG - nested objects not allowed |
| 106 | +{ |
| 107 | + user: { name: "John", id: 123 }, |
| 108 | + tags: [{ type: "urgent" }] |
| 109 | +} |
| 110 | + |
| 111 | +// ✅ CORRECT - flat structure only |
| 112 | +{ |
| 113 | + user_name: "John", |
| 114 | + user_id: 123, |
| 115 | + tags: ["urgent", "important"] // String lists OK |
| 116 | +} |
| 117 | +``` |
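Flattening can also be automated. The helper below is one possible convention, not a Pinecone API: it joins nested keys with `_`, drops `null` and `undefined` values, and turns every list into a string list (object elements are serialized to JSON strings). `flattenMetadata` is a hypothetical name, and the sketch assumes the input contains only plain objects, arrays, and primitives.

```typescript
type FlatValue = string | number | boolean | string[];

// Recursively flattens plain nested objects into a single-level key/value map.
function flattenMetadata(
  input: Record<string, unknown>,
  prefix = ""
): Record<string, FlatValue> {
  const flat: Record<string, FlatValue> = {};
  for (const key of Object.keys(input)) {
    const value = input[key];
    const flatKey = prefix ? prefix + "_" + key : key;
    if (value === null || value === undefined) {
      continue; // assumption: drop empty values instead of storing them
    } else if (Array.isArray(value)) {
      // Lists become string lists; object elements are serialized to JSON strings.
      flat[flatKey] = value.map((item) =>
        typeof item === "object" && item !== null ? JSON.stringify(item) : String(item)
      );
    } else if (typeof value === "object") {
      const nested = flattenMetadata(value as Record<string, unknown>, flatKey);
      for (const nestedKey of Object.keys(nested)) {
        flat[nestedKey] = nested[nestedKey];
      }
    } else if (typeof value === "string" || typeof value === "number" || typeof value === "boolean") {
      flat[flatKey] = value;
    } else {
      flat[flatKey] = String(value);
    }
  }
  return flat;
}

const flattened = flattenMetadata({
  user: { name: "John", id: 123 },
  tags: ["urgent", "important"],
});
console.log(flattened); // keys: user_name, user_id, tags
```

Key collisions are possible with this naming scheme (for example `user_name` alongside `user: { name }`), so a real implementation may want to detect them.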
| 118 | + |
| 119 | +### Problem: Rate limit (429) errors |
| 120 | + |
| 121 | +**Cause**: Too many requests too quickly |
| 122 | + |
| 123 | +**Solution**: |
| 124 | + |
| 125 | +- Implement exponential backoff retry logic |
| 126 | +- Reduce request rate |
| 127 | +- Add delays between batch operations |
| 128 | +- See language-specific guides for retry pattern examples |
| 129 | + |
| 130 | +**Best practices**: |
| 131 | + |
| 132 | +- Only retry on transient errors (429, 5xx) |
| 133 | +- Don't retry on client errors (4xx except 429) |
| 134 | +- Use exponential backoff with max retries |
| 135 | +- Cap delay at reasonable maximum (e.g., 60 seconds) |
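The retry rules above can be sketched as a small wrapper. This is a minimal illustration, not an SDK feature: `withRetry`, `backoffDelayMs`, and `isRetryableStatus` are hypothetical names, and the code assumes a failed request throws an error object carrying a numeric HTTP `status` property, which may not match how your client reports errors. It omits jitter for brevity.

```typescript
// Retry only transient failures: 429 and 5xx.
function isRetryableStatus(status: number | undefined): boolean {
  return status === 429 || (status !== undefined && status >= 500 && status <= 599);
}

// Delay doubles with each attempt (base, 2x base, 4x base, ...) up to a cap.
function backoffDelayMs(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(maxDelayMs, baseDelayMs * Math.pow(2, attempt));
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(() => resolve(), ms));
}

// Runs `operation`, retrying transient failures up to `maxRetries` times.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxRetries = 5,
  baseDelayMs = 1000,
  maxDelayMs = 60000
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Assumption: the thrown error exposes the HTTP status as `status`.
      const status = (error as { status?: number } | null)?.status;
      if (attempt >= maxRetries || !isRetryableStatus(status)) {
        throw error; // client errors and exhausted retries propagate unchanged
      }
      await sleep(backoffDelayMs(attempt, baseDelayMs, maxDelayMs));
    }
  }
}
```

A call site would wrap a single request, for example `await withRetry(() => upsertOneBatch(batch))`, where `upsertOneBatch` stands in for whatever function performs the request.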
| 136 | + |
| 137 | +### Problem: Search results are not relevant |
| 138 | + |
| 139 | +**Cause**: Usually one of these: |
| 140 | + |
| 141 | +1. Not using reranking |
| 142 | +2. Query doesn't match field_map configuration |
| 143 | +3. Insufficient or poor quality data |
| 144 | + |
| 145 | +**Solution**: |
| 146 | + |
| 147 | +1. **Always use reranking in production**: |
| 148 | + |
| 149 | + - Use `bge-reranker-v2-m3` model |
| 150 | + - Retrieve about twice as many candidates as you need (2x the final result count), then rerank them down to the final count |
| 151 | + - See language-specific guides for reranking examples |
| 152 | + |
| 153 | +2. **Verify query format**: |
| 154 | + |
| 155 | + - Query text should match the field type in `field_map` |
| 156 | + - For `--field_map text=content`, use text queries |
| 157 | + - Ensure embedding model matches your use case |
| 158 | + |
| 159 | +3. **Improve data quality**: |
| 160 | + - Ensure records have sufficient context |
| 161 | + - Add more relevant records to the index |
| 162 | + - Consider data preprocessing and cleaning |
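The "retrieve 2x, then rerank" rule can be captured in a small request builder. Treat this strictly as a sketch: the object shape (`query.topK`, `query.inputs.text`, `rerank.model`, `rerank.topN`, `rerank.rankFields`) reflects an assumption about what `searchRecords()` accepts and should be checked against the SDK reference for your version. `buildRerankedSearch` is a hypothetical helper, and `content` is the example field from `--field_map text=content`.

```typescript
// Assumed request shape for `searchRecords()`; verify against the SDK reference.
interface RerankedSearchRequest {
  query: { topK: number; inputs: { text: string } };
  rerank: { model: string; topN: number; rankFields: string[] };
}

// Retrieves twice as many candidates as needed, then reranks down to `finalCount`.
function buildRerankedSearch(
  queryText: string,
  finalCount: number,
  textField: string
): RerankedSearchRequest {
  return {
    query: { topK: finalCount * 2, inputs: { text: queryText } },
    rerank: { model: "bge-reranker-v2-m3", topN: finalCount, rankFields: [textField] },
  };
}

const rerankRequest = buildRerankedSearch("how do I rotate an API key?", 5, "content");
console.log(JSON.stringify(rerankRequest, null, 2));
```

Keeping the 2x multiplier in one place makes it easy to tune later if reranking more candidates turns out to be worth the extra latency.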
| 163 | + |
| 164 | +### Problem: Index not found (404 errors) |
| 165 | + |
| 166 | +**Cause**: Index doesn't exist or wrong index name |
| 167 | + |
| 168 | +**Solution**: |
| 169 | + |
| 170 | +1. Verify index exists: `pc index list` (CLI) |
| 171 | +2. Check index name spelling (case-sensitive) |
| 172 | +3. Verify you're in the correct project/organization |
| 173 | +4. Create index if it doesn't exist (use CLI, not SDK) |
| 174 | + |
| 175 | +### Problem: API key authentication failed |
| 176 | + |
| 177 | +**Cause**: Invalid or expired API key |
| 178 | + |
| 179 | +**Solution**: |
| 180 | + |
| 181 | +1. Verify API key is set correctly in environment variable |
| 182 | +2. Check API key hasn't expired or been revoked |
| 183 | +3. Ensure API key has proper permissions |
| 184 | +4. Generate new API key if needed: [Pinecone Console](https://app.pinecone.io/) |
| 185 | + |
| 186 | +**Best practices**: |
| 187 | + |
| 188 | +- Never hardcode API keys in source code |
| 189 | +- Use environment variables for API keys |
| 190 | +- Add `.env` files to `.gitignore` |
| 191 | +- Rotate keys if exposed |
| 192 | + |
| 193 | +### Problem: Namespace isolation issues |
| 194 | + |
| 195 | +**Cause**: Not using namespaces or wrong namespace name |
| 196 | + |
| 197 | +**Solution**: |
| 198 | + |
| 199 | +- Always use namespaces for data isolation |
| 200 | +- Verify namespace names match exactly (case-sensitive) |
| 201 | +- Use consistent namespace naming strategy |
| 202 | +- See [PINECONE.md](./PINECONE.md#data-operations) for namespace best practices |
| 203 | + |
| 204 | +## Indexing Delays & Eventual Consistency |
| 205 | + |
| 206 | +Pinecone uses **eventual consistency**. This means records don't immediately appear in searches or stats after upserting. |
| 207 | + |
| 208 | +### Realistic Timing Expectations |
| 209 | + |
| 210 | +| Operation | Time | Notes | |
| 211 | +| ------------------ | ------------- | ------------------------------------------- | |
| 212 | +| Record stored | 1-3 seconds | Data is persisted | |
| 213 | +| Records searchable | 5-10 seconds | Can find via `searchRecords()` | |
| 214 | +| Stats updated | 10-20 seconds | `describeIndexStats()` shows accurate count | |
| 215 | +| Indexes ready | 30-60 seconds | New indexes enter "Ready" state | |
| 216 | + |
| 217 | +### Correct Wait Pattern |
| 218 | + |
| 219 | +After upserting records: |
| 220 | + |
| 221 | +1. **Minimum wait**: Allow 10 seconds before searching; records typically become searchable within 5-10 seconds |
| 222 | +2. **Stats wait**: 10-20 seconds for stats to update |
| 223 | +3. **Production pattern**: Use polling with `describeIndexStats()` to verify readiness |
| 224 | + |
| 225 | +See language-specific guides for polling pattern code examples. |
| 226 | + |
| 227 | +### Production Pattern: Polling for Readiness |
| 228 | + |
| 229 | +For production code, implement a polling function that: |
| 230 | + |
| 231 | +1. Checks `describeIndexStats()` periodically (every 5 seconds) |
| 232 | +2. Compares record count with expected count |
| 233 | +3. Times out after reasonable maximum (e.g., 5 minutes) |
| 234 | +4. Returns when records are fully indexed |
| 235 | + |
| 236 | +See language-specific guides for complete polling implementation examples. |
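The four steps above can be sketched independently of any SDK by passing in the function that fetches the count. `waitForRecordCount` and `getRecordCount` are hypothetical names; the defaults (5-second interval, 5-minute timeout) come from the list above.

```typescript
// Polls until the reported record count reaches `expectedCount` or the timeout expires.
async function waitForRecordCount(
  getRecordCount: () => Promise<number>,
  expectedCount: number,
  pollIntervalMs = 5000,
  timeoutMs = 300000
): Promise<number> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const count = await getRecordCount();
    if (count >= expectedCount) {
      return count; // records are fully indexed
    }
    if (Date.now() + pollIntervalMs > deadline) {
      throw new Error(`Timed out waiting for indexing: saw ${count} of ${expectedCount} records`);
    }
    await new Promise<void>((resolve) => setTimeout(() => resolve(), pollIntervalMs));
  }
}

// Stub standing in for a real stats call: reports 0 twice, then 3.
const stubCounts = [0, 0, 3];
let stubCalls = 0;
const stubGetRecordCount = async () => stubCounts[Math.min(stubCalls++, stubCounts.length - 1)];

waitForRecordCount(stubGetRecordCount, 3, 10, 1000).then((count) =>
  console.log("indexed records:", count)
);
```

In real code, `getRecordCount` would wrap `describeIndexStats()` and return the record count for the namespace that was written to; the exact response fields are covered in the language-specific guides.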
| 237 | + |
| 238 | +## Quick Reference Table |
| 239 | + |
| 240 | +| Issue | Solution | Section | |
| 241 | +| -------------------------------- | ----------------------------------------------------- | ----------------------------------------------- | |
| 242 | +| `describeIndexStats()` returns 0 | Wait 10+ seconds, use polling pattern | [Above](#indexing-delays--eventual-consistency) | |
| 243 | +| Search returns no results | Check field_map, namespace, wait for indexing | [Above](#problem-search-returns-no-results) | |
| 244 | +| Metadata too large | Reduce to <40KB, flatten nested objects | [Above](#problem-metadata-too-large-error) | |
| 245 | +| Batch too large | Split to 96 (text) or 1000 (vector) per batch | [Above](#problem-batch-too-large-error) | |
| 246 | +| Rate limit (429) errors | Implement exponential backoff retry | [Above](#problem-rate-limit-429-errors) | |
| 247 | +| Nested metadata error | Flatten all metadata - no nested objects | [Above](#problem-nested-metadata-error) | |
| 248 | +| Index not found | Verify index exists with `pc index list` | [Above](#problem-index-not-found-404-errors) | |
| 249 | +| API key authentication failed | Verify key in environment variable, check permissions | [Above](#problem-api-key-authentication-failed) | |
| 250 | + |
| 251 | +## Getting More Help |
| 252 | + |
| 253 | +If you're still experiencing issues: |
| 254 | + |
| 255 | +1. **Check language-specific guides** for code examples and patterns |
| 256 | +2. **Review official documentation**: [https://docs.pinecone.io/](https://docs.pinecone.io/) |
| 257 | +3. **Search documentation**: Use the search feature on [docs.pinecone.io](https://docs.pinecone.io/) to find relevant guides |
| 258 | +4. **Error Handling**: See error handling patterns in language-specific guides and the [Production guides](https://docs.pinecone.io/guides/production/) section |
| 259 | + |
| 260 | +--- |
| 261 | + |
| 262 | +**Remember**: Always use namespaces, always rerank, always handle errors with retry logic, and account for eventual consistency delays. |