An AI-powered application that combines image management with natural language interaction. Built with FastAPI, ChromaDB, and Google's Gemini model, the bot can understand, store, and retrieve images through conversation.
- Smart Image Gallery
  - Upload and manage images
  - Automatic metadata generation
  - Tag-based organization
  - Newest-first display order
- AI-Powered Search
  - Natural language image search
  - Image similarity search
  - Combined text and image search
  - Semantic understanding
- Conversational Interface
  - Natural language interaction
  - Context-aware responses
  - Image-based discussions
  - Memory of past interactions
- Advanced Image Processing
  - Automatic description generation
  - Smart tagging
  - Vector embeddings
  - Similarity matching
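To make "vector embeddings" and "similarity matching" concrete, here is a minimal sketch of ranking stored image vectors against a query vector by cosine similarity. It is an illustration only, not the project's code: the function names `cosine_similarity` and `top_k` are hypothetical, and the toy 4-dimensional vectors stand in for real CLIP embeddings (512 dimensions for ViT-B-32). In the application itself this lookup is presumably delegated to ChromaDB.

```python
import numpy as np


def cosine_similarity(query: np.ndarray, matrix: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and every row of `matrix`."""
    query_unit = query / np.linalg.norm(query)
    rows_unit = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return rows_unit @ query_unit


def top_k(query: np.ndarray, matrix: np.ndarray, ids: list, k: int = 3) -> list:
    """Return the k (id, score) pairs with the highest similarity, best first."""
    scores = cosine_similarity(query, matrix)
    order = np.argsort(scores)[::-1][:k]
    return [(ids[i], float(scores[i])) for i in order]


# Toy data: hypothetical file names and hand-written vectors, not real embeddings.
image_ids = ["cat.jpg", "dog.jpg", "car.jpg"]
embeddings = np.array([
    [0.9, 0.1, 0.0, 0.0],
    [0.7, 0.3, 0.1, 0.0],
    [0.0, 0.0, 0.2, 0.9],
])
query = np.array([1.0, 0.0, 0.0, 0.0])

results = top_k(query, embeddings, image_ids, k=2)
for image_id, score in results:
    print(f"{image_id}: {score:.3f}")
```

Normalizing both sides first means the dot product equals the cosine of the angle between vectors, so vector length does not affect the ranking.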
- Backend Framework: FastAPI
- Database: ChromaDB (Vector Database)
- AI Models:
  - Google Gemini 1.5
  - CLIP (ViT-B-32)
- Frontend:
  - HTML5/CSS3
  - JavaScript
- Image Processing: PIL/Pillow
- Vector Operations: NumPy, scikit-learn
- Retrieval-Augmented Generation (RAG), powered by LangChain, for improved search results
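The RAG idea mentioned above can be sketched without any framework: retrieve the most relevant image records first, then place them in the prompt so the language model answers from stored data rather than from guesswork. The sketch below shows only the prompt-assembly step. `RetrievedImage` and `build_rag_prompt` are hypothetical names, and the record fields are assumptions based on the descriptions and tags this README says are generated; the project's actual LangChain chain may be structured differently.

```python
from dataclasses import dataclass


@dataclass
class RetrievedImage:
    """One search hit. The field names are assumptions for illustration."""
    image_id: str
    description: str
    tags: list


def build_rag_prompt(question: str, retrieved: list) -> str:
    """Combine retrieved image records and the user question into one prompt."""
    context_lines = [
        f"- {img.image_id}: {img.description} (tags: {', '.join(img.tags)})"
        for img in retrieved
    ]
    context = "\n".join(context_lines) if context_lines else "(no matching images)"
    return (
        "Answer using only the image records below.\n\n"
        f"Image records:\n{context}\n\n"
        f"Question: {question}"
    )


hits = [
    RetrievedImage("cat.jpg", "A tabby cat sleeping on a sofa", ["cat", "indoor"]),
    RetrievedImage("dog.jpg", "A dog running on a beach", ["dog", "beach"]),
]
prompt = build_rag_prompt("Which of my images show pets indoors?", hits)
print(prompt)
```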
- Python 3.9+
- Google API Key
- 4GB+ RAM
- Storage space for image database
- Clone the Repository
    git clone [your-repository-url]
    cd memory-bot-ai
- Create Virtual Environment
    python -m venv venv
    source venv/bin/activate   # Linux/Mac
    # or
    venv\Scripts\activate      # Windows
- Install Dependencies
    pip install -r requirements.txt
- Environment Setup
  Create a .env file in the root directory:
    GOOGLE_API_KEY=your_api_key_here
    ENVIRONMENT=development
    PORT=8000
    HOST=0.0.0.0
- Start the Server
    python -m uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
- Access the Application
  - Main Interface: http://localhost:8000
  - API Documentation: http://localhost:8000/docs
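As a rough illustration of how the values from the Environment Setup step might be consumed, the sketch below reads them from environment variables into a typed settings object. `Settings` and `load_settings` are hypothetical names and this is not the project's actual `config.py`. It also assumes the variables are already in the process environment; reading the `.env` file itself needs a loader such as python-dotenv, and this README does not say whether the project uses one.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    google_api_key: str
    environment: str
    host: str
    port: int


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of variables (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("GOOGLE_API_KEY", "")
    if not api_key:
        # Fail early: nothing that calls Gemini can work without a key.
        raise RuntimeError("GOOGLE_API_KEY is not set")
    return Settings(
        google_api_key=api_key,
        environment=env.get("ENVIRONMENT", "development"),
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),  # env values are strings; convert once here
    )


# Passing a plain dict keeps the example independent of the real environment.
settings = load_settings({"GOOGLE_API_KEY": "your_api_key_here", "PORT": "8000"})
print(settings.host, settings.port, settings.environment)
```

Failing at startup when the key is missing tends to be easier to diagnose than an authentication error on the first chat request.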
- Navigate to the Upload page
- Select image(s) to upload
- Wait for automatic processing
- Review generated descriptions and tags
- Browse images in newest-first order
- Use filters and tags for organization
- Click images for detailed view
- Delete unwanted images
- Text Search: "Show me images of cats"
- Image Search: Upload a similar image
- Combined Search: Use both text and image
- Ask about specific images
- Query the image database
- Get image descriptions
- Natural conversation about images
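One plausible way to implement the Combined Search item above is to score each image twice, once against the text query and once against the query image, and then blend the two scores with a weight. The sketch below shows that blending step only. `fuse_scores`, the `text_weight` parameter, and the sample scores are all hypothetical; the README does not say how the project combines the two signals.

```python
def fuse_scores(text_scores: dict, image_scores: dict, text_weight: float = 0.5) -> list:
    """Blend per-image text and image similarity scores into one ranking.

    An image missing from one of the inputs contributes 0.0 for that signal.
    Returns (image_id, fused_score) pairs sorted best first.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be between 0 and 1")
    image_weight = 1.0 - text_weight
    all_ids = set(text_scores) | set(image_scores)
    fused = {
        image_id: text_weight * text_scores.get(image_id, 0.0)
        + image_weight * image_scores.get(image_id, 0.0)
        for image_id in all_ids
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Made-up similarity scores for illustration.
text_scores = {"cat.jpg": 0.8, "dog.jpg": 0.4}
image_scores = {"cat.jpg": 0.6, "car.jpg": 0.9}

ranked = fuse_scores(text_scores, image_scores, text_weight=0.5)
for image_id, score in ranked:
    print(f"{image_id}: {score:.2f}")
```

Raising `text_weight` favors what the user typed; lowering it favors visual similarity to the uploaded image.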
POST /upload - Upload new images
GET /gallery - Retrieve image gallery
DELETE /image/{id} - Remove an image
POST /search/text - Text-based search
POST /search/image - Image similarity search
POST /search/multimodal - Combined search
POST /chat - Chat with the bot
GET /chat/history - Retrieve chat history
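The endpoints above can be called from any HTTP client. The sketch below builds a request for `POST /search/text` with Python's standard library. The JSON body fields (`query`, `limit`) are assumptions, since this README does not document the request schema; the interactive docs at http://localhost:8000/docs show the real one. The request is only constructed here, so the snippet runs without a server.

```python
import json
from urllib import request

BASE_URL = "http://localhost:8000"


def build_text_search_request(query: str, limit: int = 5) -> request.Request:
    """Build (but do not send) a JSON POST request for the text search endpoint."""
    # Assumed payload shape; check /docs for the actual field names.
    payload = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return request.Request(
        url=f"{BASE_URL}/search/text",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_text_search_request("Show me images of cats")
print(req.get_method(), req.full_url)

# With the server running, the request could be sent like this:
#   with request.urlopen(req) as resp:
#       print(json.load(resp))
```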
memory-bot-ai/
├── app/
│   ├── main.py                  # FastAPI application entry point
│   ├── config.py                # Configuration settings
│   ├── constants.py             # Global constants and messages
│   ├── services/
│   │   ├── chromadb.py          # Vector database operations
│   │   ├── embeddings.py        # Image/text embeddings
│   │   ├── gemini.py            # AI model integration
│   │   └── llm_langchain.py     # Language model chain
│   ├── routes/
│   │   └── app_route.py         # API endpoints
│   ├── templates/
│   │   ├── chat.html            # Chat page
│   │   ├── gallery.html         # Image gallery page
│   │   ├── home.html            # Home page
│   │   └── upload.html          # Image upload page
│   └── static/
│       ├── css/
│       │   └── style.css        # Main stylesheet
│       ├── chromadb/            # Vector database storage
│       ├── uploads/             # Uploaded image storage
│       └── images/              # Processed images
└── requirements.txt             # Project dependencies
- Google Gemini AI for natural language processing
- OpenAI's CLIP model for image understanding
- ChromaDB for vector storage
- FastAPI community for the excellent framework