HA Live - Voice Assistant for Home Assistant

Bring natural, conversational AI to your Home Assistant setup with low-latency, streaming voice interactions.

HA Live is an open-source Android app that bridges Google's Gemini Live API with Home Assistant's Model Context Protocol (MCP) server, giving you a powerful voice assistant that can control your smart home through natural conversation. Think of it as having a deeply integrated AI assistant that actually understands and can control your entire Home Assistant ecosystem.

What Makes HA Live Special?

True Streaming Conversations: Uses Gemini Live's real-time, bidirectional streaming for natural, interruptible conversations—no more waiting for the AI to finish speaking
Direct Home Assistant Integration: Connects to Home Assistant's MCP server to access all your entities, services, and automations as native AI tools
Real-Time Video Streaming: Share your phone camera or Home Assistant cameras with Gemini for visual context during conversations
Multiple Personalities: Create unlimited conversation profiles with different prompts, voices, models, and tool access
Wake Word Detection: Built-in "Lizzy H" wake word support (foreground only, privacy-first)
Contextual Awareness: Inject live Home Assistant state and Jinja2 templates into every conversation

How It Works

HA Live acts as a bridge between two powerful systems:

You (voice + optional video) → HA Live (Android) → Gemini Live API
                                      ↓
                          Home Assistant MCP Server (/mcp_server/sse)
                                      ↓
                          Your Smart Home (lights, sensors, cameras, etc.)

When you interact with HA Live:

Your voice is streamed to Gemini Live for real-time processing
Optionally, video from your phone camera or HA cameras is streamed alongside
Gemini Live receives a list of available "tools" from your Home Assistant setup
When Gemini decides to control your home, it calls these tools
HA Live translates tool calls into JSON-RPC requests to your HA MCP server
Home Assistant executes the action and returns results
Gemini confirms the action back to you via voice

All of this happens in real-time with sub-second latency, making conversations feel natural and responsive.
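
As an illustration of the translation step above, the following minimal Python sketch wraps a tool call in an MCP `tools/call` JSON-RPC 2.0 request. It is not the app's actual Kotlin code: the function name `build_tool_call_request` and the id counter are assumptions made for this example, and the tool name and arguments are borrowed from the debug log sample later in this README.

```python
import itertools
import json

# Monotonic request ids; JSON-RPC uses the id to match a response to its request.
_next_id = itertools.count(1)


def build_tool_call_request(tool_name: str, arguments: dict) -> str:
    """Wrap a model-issued tool call in an MCP `tools/call` JSON-RPC 2.0 request.

    Hypothetical helper for illustration; HA Live's real implementation may differ.
    """
    payload = {
        "jsonrpc": "2.0",
        "id": next(_next_id),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(payload)


# Example values taken from the debug log sample in this README.
print(build_tool_call_request("light.turn_on", {"entity_id": "light.living_room"}))
```

The server's reply carries the same `id`, which is how a client can pair each result with the tool call that triggered it.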

Key Features

Conversation Profiles

Create multiple profiles for different use cases:

Custom System Prompts: Define how the AI should behave
Personality Traits: Formal assistant, casual friend, technical expert—you choose
Background Info: Use Jinja2 templates to inject dynamic context (e.g., {{ now() }}, {{ states('sensor.temperature') }})
Model Selection: Choose from available Gemini models (e.g., gemini-2.0-flash-exp)
Voice Options: Select from multiple voice styles (Aoede, Leda, etc.)
Tool Filtering: Grant access to ALL tools or create a whitelist for specific profiles
Auto-Start: Automatically begin conversations when opening the app
Initial Messages: Send a message to the agent as soon as the session starts
Model Camera Access: Configure which HA cameras the AI can request to view

Video Streaming

Share visual context with Gemini during conversations:

Phone Cameras: Stream from front or back camera
Home Assistant Cameras: Stream from any HA camera entity
Configurable Quality: Choose resolution (256×256, 512×512, 1024×1024) and frame rate (0.2–1 FPS)
AI Camera Control: Let the AI request access to specific HA cameras during conversation
Live Preview: See what you're sharing in a preview window

Advanced Configuration

Live Context Injection: Automatically fetch current Home Assistant state before each conversation
Template Rendering: Background info supports full Jinja2 syntax via Home Assistant's template API
Transcription Logging: See real-time speech-to-text for debugging and monitoring
Profile Import/Export: Share profiles with the community via JSON

Wake Word Detection

Powered by OpenWakeWord's ONNX models
"Lizzy H" wake phrase
Foreground-only (battery efficient, privacy-conscious)
Works while app is active, pauses during conversations

Session Management

Fresh MCP connection per conversation (always up-to-date tools)
Clean session lifecycle (no stale state between chats)
Graceful error handling (template errors fail fast, context errors degrade gracefully)
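
The error-handling policy in the last point can be sketched roughly as follows. This is an illustrative Python sketch, not the app's Kotlin code: `prepare_context`, `TemplateError`, and the two callables are hypothetical names. The assumption is that a broken template should abort session start, while a failed live-context fetch should only drop that optional context.

```python
from __future__ import annotations

from typing import Callable, Optional


class TemplateError(Exception):
    """Raised when background-info template rendering fails (hypothetical type)."""


def prepare_context(
    render_template: Callable[[], str],
    fetch_live_context: Callable[[], str],
) -> tuple[str, Optional[str]]:
    # Fail fast: a broken template would produce a wrong prompt, so abort the session.
    try:
        background = render_template()
    except Exception as exc:
        raise TemplateError(f"template rendering failed: {exc}") from exc

    # Degrade gracefully: live context is optional, so continue without it.
    try:
        live_context: Optional[str] = fetch_live_context()
    except Exception:
        live_context = None

    return background, live_context
```

A caller would surface `TemplateError` to the user before any audio is streamed, and treat a `None` live context as "start the conversation without the state overview".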

Requirements

Android Device: API 26+ (Android 8.0 Oreo or newer)
Home Assistant:
- Version with MCP server support
- MCP SSE endpoint enabled (/mcp_server/sse)
Gemini API Key: Get one from Google AI Studio (or use a shared key via the optional HACS integration)
Network: Both devices on the same network (or Home Assistant accessible remotely via HTTPS)

Installation

1. Install the App

Option A: Build from Source

git clone https://github.com/mrsheepuk/ha-live.git
cd ha-live/app
./gradlew assembleRelease
# Install the APK from app/app/build/outputs/apk/release/

Option B: Download APK

Download the latest APK from the Releases page.

2. Set Up Home Assistant

Ensure the MCP server integration is enabled in your Home Assistant. If it is not listed under Settings → Devices & Services, add the "Model Context Protocol Server" integration there.

The app authenticates via OAuth—you'll log in with your Home Assistant credentials during setup.

3. Get a Gemini API Key

Go to Google AI Studio
Create a new API key (or use existing)
Copy the API key for app setup

4. First-Run Onboarding

When you first launch HA Live, you'll go through setup:

Step 1: Enter Home Assistant URL

Enter your Home Assistant URL (e.g., http://192.168.1.100:8123 or https://home.example.com)

Step 2: Log in to Home Assistant

A browser opens for you to log in with your Home Assistant credentials
Authorize HA Live to access your Home Assistant

Step 3: Configure Gemini API

If the HA Live Config integration is installed, you can use the shared household API key
Otherwise, paste your own Gemini API key

Step 4: Complete

Tap "Start Using HA Live"

Configuration Guide

Creating Your First Profile

After onboarding, you'll have a default profile. To customize or create new ones:

Tap the menu (three dots) → "Manage Profiles"
Tap "+" to create a new profile or tap an existing one to edit

System Prompt: Core instructions for the AI

You are a helpful home automation assistant. You can control lights,
check sensors, and manage automations. Be concise and action-oriented.

Personality: How the AI should communicate

Friendly but professional. Use casual language but stay focused on
getting tasks done efficiently.

Background Info: Dynamic context using Jinja2 templates

Current time: {{ now().strftime('%I:%M %p') }}
Outside temperature: {{ states('sensor.outdoor_temperature') }}°F
Living room occupied: {{ states('binary_sensor.living_room_motion') }}

Live Context: Enable to fetch a complete Home Assistant state overview at session start

Tool Filtering:

ALL: Grant access to all Home Assistant tools (recommended for general use)
SELECTED: Whitelist specific tools (useful for restricted profiles, e.g., "kids profile" with limited access)
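
The two filtering modes can be sketched as below. This is a minimal Python illustration, not the app's Kotlin code: the function name `filter_tools` is an assumption, as is the idea that each tool is a dict with a `name` key (which is how MCP tool listings identify tools).

```python
from __future__ import annotations

from typing import Iterable


def filter_tools(tools: list[dict], mode: str, selected: Iterable[str] = ()) -> list[dict]:
    """Return the tools a profile may expose to the model (hypothetical helper)."""
    if mode == "ALL":
        # Every tool reported by Home Assistant is passed through.
        return list(tools)
    if mode == "SELECTED":
        # Only whitelisted tool names survive; unknown names are simply ignored.
        allowed = set(selected)
        return [tool for tool in tools if tool["name"] in allowed]
    raise ValueError(f"unknown tool filter mode: {mode!r}")


# Hypothetical tool list for a restricted "kids profile".
available = [{"name": "light.turn_on"}, {"name": "lock.unlock"}]
print(filter_tools(available, "SELECTED", ["light.turn_on"]))
```

Filtering before the session starts means the model never sees the excluded tools at all, rather than being refused when it tries to call them.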

Video Configuration

During Active Chat: Tap the video button to open the camera source menu. Options include:

Off: Disable video streaming
Phone Camera (Front/Back): Use device camera
Home Assistant Cameras: Any HA camera entity available in your setup

Camera Settings (Settings → Camera Settings):

Resolution: 256×256 (low bandwidth), 512×512 (balanced), 1024×1024 (high quality)
Frame Rate: 0.2 FPS (1 per 5s), 0.5 FPS (1 per 2s), 1 FPS (smooth)
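
The frame-rate labels follow from simple arithmetic: the gap between frames is the reciprocal of the rate. A small Python sketch of that conversion (the function name is an assumption made for this example):

```python
def frame_interval_seconds(fps: float) -> float:
    """Seconds between consecutive frames at the given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1.0 / fps


# The three frame rates offered in Camera Settings.
for fps in (0.2, 0.5, 1.0):
    print(f"{fps} FPS: one frame every {frame_interval_seconds(fps):g} s")
```

Lower rates send fewer frames per minute, which is why they pair naturally with the low-bandwidth resolution setting.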

AI Camera Access (Profile Editor → "Cameras AI Can Access"):

Configure which HA cameras the AI model can request to view
When enabled, the AI can call StartWatchingCamera to view specific cameras
The user is prompted if the AI wants to switch away from an active phone camera

Wake Word Configuration

Quick Toggle: Tap the "Wake Word" chip on the main screen to enable/disable detection.

Advanced Settings (Settings → Wake Word → "Configure"):

Threshold (0.3-0.8): Sensitivity control—lower = more sensitive, higher = fewer false positives
Thread Count: CPU threads for model execution (1, 2, 4, or 8)
Execution Mode: Sequential (lower latency) or Parallel (multi-core utilization)
Optimization Level: ONNX Runtime optimization (Basic recommended, Extended/All for maximum performance)
Test Mode: Live score display with visual threshold marker to tune sensitivity
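
To make the threshold trade-off concrete, here is a minimal Python sketch. It assumes the wake word model emits one confidence score between 0 and 1 per audio frame and that detection is a plain comparison against the threshold; both are simplifying assumptions, and `detect_wake_word` is a hypothetical name rather than the app's actual code.

```python
from __future__ import annotations

from typing import Optional, Sequence


def detect_wake_word(scores: Sequence[float], threshold: float = 0.5) -> Optional[int]:
    """Return the index of the first frame whose score reaches the threshold."""
    if not 0.3 <= threshold <= 0.8:
        # Mirrors the range exposed in the settings screen.
        raise ValueError("threshold must be between 0.3 and 0.8")
    for index, score in enumerate(scores):
        if score >= threshold:
            return index
    return None


# Hypothetical per-frame scores while someone says the wake phrase.
scores = [0.10, 0.35, 0.60, 0.90]
print(detect_wake_word(scores, threshold=0.3))
print(detect_wake_word(scores, threshold=0.8))
```

With the same scores, the lower threshold fires on an earlier, weaker frame, while the higher threshold waits for a strong match. Test Mode lets you tune this balance against real audio.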

Note: Models are bundled with the app (~10MB) and copied to device storage on first launch.

Profile Management Tips

Export Single Profile: In Manage Profiles, tap a profile's menu → "Export" → Save as .haprofile file
Export All Profiles: In Manage Profiles, menu → "Export All Profiles" → Save as .haprofile file
Import Profiles: In Manage Profiles, menu → "Import Profiles" → Select .haprofile file
Quick Switch: Tap the dropdown on the main screen to change active profile
Auto-Start: Enable in profile settings to start conversations immediately on app launch

Shared Configuration (Optional)

If multiple people in your household use HA Live, you can install the HA Live Config integration in Home Assistant to share configuration across all devices.

What It Provides

Shared Gemini API Key: Configure one API key in Home Assistant—all household members use it automatically
Shared Profiles: Create profiles in Home Assistant that sync to everyone's devices (great for household-wide "Kitchen Helper" or "Default Assistant" profiles)

When To Use It

Single user? You don't need this—local configuration works fine
Multiple household members? Install the integration so everyone gets the same profiles and you only manage one API key

Installation

Install via HACS (Home Assistant Community Store):

Open HACS in Home Assistant
Go to Integrations → Custom repositories
Add: https://github.com/mrsheepuk/ha-live-config
Search for "HA Live Config" and install
Restart Home Assistant
Go to Settings → Devices & Services → Add Integration → HA Live Config

How It Works

The app automatically detects if the integration is installed
During setup, you'll be offered the choice to use shared or local configuration
Shared profiles appear alongside local profiles in the app
Changes to shared profiles sync to all household devices
You can still create local-only profiles for personal use

Advanced Features

Jinja2 Template Support

Background info templates are rendered via Home Assistant's /api/template endpoint, giving you access to:

Time: {{ now() }}, {{ today_at('17:00') }}
States: {{ states('entity.id') }}, {{ states.sensor }}
Attributes: {{ state_attr('climate.bedroom', 'temperature') }}
Custom: Any Jinja2 function available in HA

Templates are re-rendered fresh at the start of each conversation.
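
For reference, a template can be rendered outside the app with a single POST to the same endpoint. The Python sketch below uses only the standard library; the helper names are assumptions made for this example, and the base URL and token are placeholders you would replace with your own instance and a valid Home Assistant access token.

```python
import json
import urllib.request


def build_template_request(base_url: str, token: str, template: str) -> urllib.request.Request:
    """Build a POST to Home Assistant's /api/template endpoint (hypothetical helper)."""
    body = json.dumps({"template": template}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/template",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def render_template(base_url: str, token: str, template: str) -> str:
    """Send the request and return the rendered text (requires a reachable instance)."""
    request = build_template_request(base_url, token, template)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")
```

Calling `render_template("http://192.168.1.100:8123", "<token>", "{{ states('sensor.outdoor_temperature') }}")` against a live instance is a quick way to check a template before pasting it into a profile's Background Info.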

Local Tools

In addition to Home Assistant tools, HA Live provides built-in tools:

EndConversation: Allows Gemini to gracefully end the session when appropriate
StartWatchingCamera: Lets Gemini request to view a Home Assistant camera (if configured in profile)
StopWatchingCamera: Stops viewing the current camera

Debug Logs (Tool Call Logging)

Access detailed tool execution logs via the menu → "Debug Logs":

[✓] 12:34:57 - SUCCESS
Tool: light.turn_on
Params: {"entity_id": "light.living_room"}
Result: {"success": true}

[✓] 12:35:02 - SUCCESS
Tool: GetLiveContext
Params: {}
Result: Live Context: Living room lights are on...

Shows all tool calls, system events, initialization steps, and errors.

Transcription Logs (Speech-to-Text)

When enabled in profile settings, a collapsible section on the main screen shows real-time speech-to-text:

User: "Turn on the living room lights"
Model: "Okay, turning on the living room lights"
Model (thought): "I should use the light.turn_on service"

Toggle the header to expand/collapse the transcription view.

Architecture

HA Live uses a modular architecture:

Core Components

ConversationService Interface: Abstracts the Gemini Live API connection
DirectConversationService: WebSocket-based Gemini Live API implementation
ConversationServiceFactory: Creates the conversation service

MCP Integration

McpClientManager: Server-Sent Events (SSE) client for Home Assistant MCP server
AppToolExecutor: Wraps MCP client, adds logging and local tool support
SessionPreparer: Handles tool fetching, filtering, and session initialization

Audio & Video Pipeline

MicrophoneHelper: Audio capture with echo cancellation and pre-buffering
CameraHelper: CameraX-based device camera capture with frame processing
VideoSource Interface: Abstracts video sources (device cameras, HA cameras)
CameraSourceManager: Manages available video sources and creates instances
HACameraSource: Fetches snapshots from Home Assistant camera entities

Session Lifecycle

App launch: Load config, initialize API client, start wake word (if enabled)
Start chat: Create MCP connection, fetch tools, fetch HA cameras, apply filtering, render templates
Active session: Stream audio (and optionally video), handle tool calls, provide transcription
End chat: Cleanup MCP connection, stop video capture, play end beep, resume wake word

File Organization

app/src/main/java/uk/co/mrsheep/halive/
├── core/                    # Configuration and data models
│   ├── HAConfig.kt         # Home Assistant credentials
│   ├── GeminiConfig.kt     # Gemini API key storage
│   ├── CameraConfig.kt     # Camera resolution/frame rate settings
│   ├── Profile.kt          # Profile data model
│   └── ProfileService.kt   # Profile management (local + remote)
├── services/
│   ├── audio/              # Audio capture
│   │   └── MicrophoneHelper.kt  # Microphone with echo cancellation
│   ├── camera/             # Video capture
│   │   ├── VideoSource.kt       # Video source interface
│   │   ├── CameraHelper.kt      # Device camera capture (CameraX)
│   │   ├── HACameraSource.kt    # HA camera snapshot fetching
│   │   └── CameraSourceManager.kt  # Source management
│   ├── conversation/       # Provider interface
│   ├── geminidirect/       # Gemini Live API implementation
│   ├── mcp/                # MCP client
│   ├── LiveSessionService.kt  # Foreground service for sessions
│   ├── WakeWordService.kt  # Wake word detection
│   └── SessionPreparer.kt  # Session initialization
└── ui/                     # Activities and ViewModels

Contributing

Contributions are welcome! Areas where help is especially appreciated:

Additional wake word models (beyond "Lizzy H")
UI/UX improvements for profile management
Background mode support (Android background restrictions are tricky)
Documentation and example profiles

Development Setup

Clone the repository
Open in Android Studio (Hedgehog or newer)
Sync Gradle dependencies
Set up a Home Assistant test instance
Create a Gemini API key
Build and run on device or emulator (API 26+)

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Home Assistant: For the amazing MCP server implementation
Google Gemini Team: For the Gemini Live API
OpenWakeWord: For the ONNX wake word models
Claude (Sonnet 4.5): For the tireless implementation efforts, marshalling teams of Haiku subagents to build most of the features in this app
The Home Assistant Community: For inspiration and feedback

Community & Support

Issues: GitHub Issues
Discussions: GitHub Discussions
Home Assistant Forum: [Coming Soon]

Features Implemented

✅ Gemini Live API via WebSocket
✅ Real-time audio streaming with echo cancellation
✅ Real-time video streaming (phone cameras + HA cameras)
✅ AI-controlled camera viewing
✅ Multiple conversation profiles
✅ Tool filtering (whitelist mode per profile)
✅ Wake word detection (foreground only)
✅ Jinja2 template rendering for background info
✅ Live context fetching on session start
✅ Real-time transcription logging
✅ Auto-start chat on app launch
✅ Initial message to agent
✅ Profile export/import
✅ Tool call logging with timestamps
✅ Local tools (EndConversation, StartWatchingCamera, StopWatchingCamera)
✅ Audio feedback (beeps for ready/end)
✅ OAuth authentication with Home Assistant
✅ Shared configuration via HACS integration (optional)

Roadmap

Background mode support (system-wide wake word)
Custom wake word training
Multi-language support
Response caching for common queries
Integration with Home Assistant conversation agents
Widget support for quick access
Wear OS companion app

Made with love for the Home Assistant community

If you find HA Live useful, consider starring the repo and sharing it with other HA enthusiasts!
