Commit 97b82d7
Add configurable model cache timeout for automatic memory management (#8693)
## Summary
Adds `model_cache_keep_alive_min` config option (minutes, default 5) to
automatically clear model cache after inactivity. Addresses memory
contention when running InvokeAI alongside other GPU applications like
Ollama.
**Implementation:**
- **Config**: New `model_cache_keep_alive_min` field in
`InvokeAIAppConfig` with 5-minute default
- **ModelCache**: Activity tracking on get/lock/unlock/put operations,
threading.Timer for scheduled clearing
- **Thread safety**: Double-check pattern handles race conditions,
daemon threads for clean shutdown
- **Integration**: ModelManagerService passes config to cache, calls
shutdown() on stop
- **Logging**: Smart timeout logging that only shows messages when
unlocked models are actually cleared
- **Tests**: Comprehensive unit tests with properly configured mock
logger
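The timer-based clearing described above can be sketched roughly as follows. This is a minimal illustration and not the PR's actual code: the class name `IdleTimer`, its methods, and the generation counter used for the double-check are all assumptions made for the example. Only `threading.Timer`, daemon threads, activity tracking, and a `shutdown()` hook come from the description.

```python
import threading
from typing import Callable, Optional


class IdleTimer:
    """Hypothetical helper: runs `on_timeout` after `keep_alive_min` idle minutes."""

    def __init__(self, keep_alive_min: float, on_timeout: Callable[[], None]) -> None:
        self._timeout_s = keep_alive_min * 60.0
        self._on_timeout = on_timeout
        self._lock = threading.Lock()
        self._timer: Optional[threading.Timer] = None
        self._generation = 0  # bumped on every activity; identifies the newest timer

    def record_activity(self) -> None:
        """Call on every get/lock/unlock/put; restarts the countdown."""
        if self._timeout_s <= 0:  # 0 disables the timeout entirely
            return
        with self._lock:
            self._generation += 1
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(
                self._timeout_s, self._fire, args=(self._generation,)
            )
            self._timer.daemon = True  # a pending timer never blocks interpreter exit
            self._timer.start()

    def _fire(self, generation: int) -> None:
        with self._lock:
            # Double-check: if activity arrived after this timer was scheduled,
            # a newer timer owns the countdown and this one must do nothing.
            if generation != self._generation:
                return
            self._timer = None
        self._on_timeout()

    def shutdown(self) -> None:
        """Cancel any pending timer and invalidate one that is already firing."""
        with self._lock:
            self._generation += 1
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None
```

A cache would call `record_activity()` from each of its public operations and `shutdown()` when the owning service stops. The generation check matters because `Timer.cancel()` has no effect once the timer's callback has started running.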
**Usage:**
```yaml
# invokeai.yaml (set the key once; the two lines below are alternatives)
model_cache_keep_alive_min: 10 # Clear after 10 minutes idle
# model_cache_keep_alive_min: 0 # Set to 0 for indefinite caching (old behavior)
```
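Because the option is expressed in minutes while Python timers take seconds, a conversion step is needed somewhere. The helper below is a hypothetical sketch of that step: the function name, the `None` return for a disabled timeout, and the rejection of negative values are assumptions, not something the PR confirms.

```python
from typing import Optional


def keep_alive_seconds(keep_alive_min: float) -> Optional[float]:
    """Convert the configured minutes into a timer interval in seconds.

    Returns None when the timeout is disabled (a value of 0), which this
    sketch uses as the signal to never schedule a clear.
    """
    if keep_alive_min < 0:
        # Assumed validation; the source does not say how negatives are handled.
        raise ValueError("model_cache_keep_alive_min must be >= 0")
    if keep_alive_min == 0:
        return None
    return keep_alive_min * 60.0
```

Fractional values are allowed, so a setting of `0.1` corresponds to roughly six seconds, which is convenient for manual testing.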
**Key Behavior:**
- **Default timeout**: 5 minutes - models are automatically cleared
after 5 minutes of inactivity
- Clearing uses the same logic as the "Clear Model Cache" button (`make_room` with
1000GB)
- Only clears **unlocked** models (respects models actively in use
during generation)
- Timeout message only appears when models are actually cleared
- Debug logging available for timeout events when no action is taken
- Prevents misleading log entries during active generation
- Users can set to 0 to restore indefinite caching behavior
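The "only clear unlocked models, and only log when something was cleared" behavior can be illustrated with a toy cache. Everything here (`CacheEntry`, `TinyCache`, `on_keep_alive_timeout`, and the debug message text) is hypothetical and stands in for the real `ModelCache`. Only the info-level message format is taken from the QA instructions in this PR.

```python
import logging
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CacheEntry:
    size_bytes: int
    locked: bool = False  # True while a generation is actively using the model


@dataclass
class TinyCache:
    """Toy stand-in for the model cache, keyed by model name."""

    entries: Dict[str, CacheEntry] = field(default_factory=dict)
    logger: logging.Logger = field(
        default_factory=lambda: logging.getLogger("tiny_cache")
    )

    def on_keep_alive_timeout(self) -> int:
        """Drop every unlocked entry and return how many were removed."""
        unlocked = [key for key, entry in self.entries.items() if not entry.locked]
        if not unlocked:
            # Nothing to do: stay quiet at the default log level.
            self.logger.debug("Keep-alive timeout reached, nothing to clear")
            return 0
        self.logger.info(f"Clearing {len(unlocked)} unlocked model(s) from cache")
        for key in unlocked:
            del self.entries[key]
        return len(unlocked)
```

Counting the unlocked entries before logging is what keeps the info-level message out of the log while every cached model is locked by a running generation.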
## Related Issues / Discussions
Addresses enhancement request for automatic model unloading from memory
after inactivity period.
## QA Instructions
1. **Test default behavior (5-minute timeout)**:
   - Start InvokeAI without explicit config
   - Run a generation
   - Wait 6 minutes with no activity
   - Check logs for "Clearing X unlocked model(s) from cache" message
   - Verify cache is empty
2. **Test custom timeout**:
   - Set `model_cache_keep_alive_min: 0.1` (6 seconds) in config
   - Load a model (run generation)
   - Wait 7+ seconds with no activity
   - Check logs for "Clearing X unlocked model(s) from cache" message
   - Verify cache is empty
3. **Test no timeout (old behavior)**:
   - Set `model_cache_keep_alive_min: 0` in config
   - Run generations and wait extended periods
   - Verify models remain cached indefinitely
4. **Test during active use**:
   - Run continuous generations with any timeout setting
   - Verify no timeout messages appear during active use (models are locked)
   - After generation completes, wait for timeout and verify unlocked models are cleared
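The manual checks above can also be approximated in an automated test that uses a very short timeout and a mock logger. The sketch below does not reproduce the PR's test suite: `ExpiringCache` and its `cleared` event are hypothetical stand-ins, and only the log message format comes from the steps above.

```python
import threading
from unittest.mock import MagicMock


class ExpiringCache:
    """Hypothetical cache that clears itself after `timeout_s` idle seconds."""

    def __init__(self, timeout_s: float, logger) -> None:
        self._timeout_s = timeout_s
        self._logger = logger
        self._models: dict = {}
        self._lock = threading.Lock()
        self._timer = None
        self.cleared = threading.Event()  # test hook, set after the timeout runs

    def put(self, key: str, model: object) -> None:
        with self._lock:
            self._models[key] = model
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self._timeout_s, self._on_timeout)
            self._timer.daemon = True
            self._timer.start()

    def _on_timeout(self) -> None:
        with self._lock:
            count = len(self._models)
            if count:
                self._logger.info(f"Clearing {count} unlocked model(s) from cache")
                self._models.clear()
        self.cleared.set()

    def __len__(self) -> int:
        with self._lock:
            return len(self._models)


def test_cache_clears_after_idle_timeout() -> None:
    logger = MagicMock()
    cache = ExpiringCache(timeout_s=0.05, logger=logger)
    cache.put("model-a", object())
    # Wait on the event instead of sleeping for a fixed interval.
    assert cache.cleared.wait(timeout=5.0)
    assert len(cache) == 0
    logger.info.assert_called_once_with("Clearing 1 unlocked model(s) from cache")
```

Waiting on an event rather than sleeping keeps the test fast and avoids flakiness on slow machines.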
## Merge Plan
N/A - Additive change with sensible defaults. The 5-minute default
enables automatic memory management while remaining practical for
typical workflows.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [ ] _❗Changes to a redux slice have a corresponding migration_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
<!-- START COPILOT ORIGINAL PROMPT -->
<details>
<summary>Original prompt</summary>
>
> ----
>
> *This section details on the original issue you should resolve*
>
> <issue_title>[enhancement]: option to unload from memory
</issue_title>
> <issue_description>### Is there an existing issue for this?
>
> - [X] I have searched the existing issues
>
> ### Contact Details
>
> ### What should this feature add?
>
> a command line option to unload model from RAM after a defined period
of time
>
> ### Alternatives
>
> Running as a container and using Sablier to shut down the container
after some time. This has the downside that if traffic isn't seen through
the web interface, the container will be shut down even if jobs are running.
>
> ### Additional Content
>
> _No response_</issue_description>
>
> ## Comments on the Issue (you are @copilot in this section)
>
> <comments>
> <comment_new><author>@lstein</author><body>
> I am reopening this issue. I'm running ollama and invoke on the same
server and I find their memory requirements are frequently clashing. It
would be helpful to offer users the option to have the model cache
automatically cleared after a fixed amount of inactivity. I would
suggest the following:
>
> 1. Introduce a new config file option `model_cache_keep_alive` which
specifies, in minutes, how long to keep a model in cache between
generations. The default is 0, which means to keep the model in cache
indefinitely, as is currently the case.
> 2. If no model generations occur within the timeout period, the model
cache is cleared using the same backend code as the "Clear Model Cache"
button in the queue tab.
>
> I'm going to assign this to GitHub copilot, partly to test how well it
can manage the Invoke code base. </body></comment_new>
> </comments>
>
</details>
<!-- START COPILOT CODING AGENT SUFFIX -->
- Fixes #6856
5 files changed
Lines changed: 247 additions & 3 deletions
File tree
- invokeai
  - app/services
    - config
    - model_manager
  - backend/model_manager/load/model_cache
  - frontend/web/src/services/api
- tests/backend/model_manager/load/model_cache
Lines changed: 7 additions & 0 deletions
Lines changed: 104 additions & 2 deletions