
Switch to FLM 0.9.35 #1233

Open
superm1 wants to merge 29 commits into main from npu-json
Conversation

@superm1 (Member) commented Feb 28, 2026

A variety of things need to happen when FLM 0.9.35 releases. This branch tracks them all.

  1. Drop the Linux beta flag
  2. Use --json output from FLM
  3. Set new FLM minimum version to 0.9.35
  4. Update Lemonade release to 10.0.0.

Remaining items we're waiting on before merging:

  • flm validate --json needs to return values on Windows with the NPU driver; the Lemonade code that looks for the driver version should then be dropped in favor of the JSON output on Windows
  • Need to make sure that all labels are carried over properly
  • Need to test the upgrade paths (what happens with this branch when an older FLM version is installed)
  • Need to update Windows CI runners to a newer FLM version (maybe leave one at the old version and make sure we fail gracefully?)
  • Need to add FLM to Linux CI runners
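
Once `flm validate --json` reports driver details on Windows, Lemonade's driver-version probe reduces to parsing that output. A minimal Python sketch; the top-level `npu_driver_version` key is a guess, since the real FLM 0.9.35 schema isn't shown here:

```python
import json

def parse_npu_driver_version(raw: str):
    """Extract the NPU driver version from `flm validate --json` output.

    Assumes a hypothetical top-level "npu_driver_version" key; the
    real FLM 0.9.35 schema may name or nest this differently.
    Returns None when the key is absent.
    """
    info = json.loads(raw)
    return info.get("npu_driver_version")
```

In practice the raw string would come from invoking `flm validate --json` via a subprocess and checking its exit code first.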

@jeremyfowers (Member) left a comment

Does this change currently set the names in Lemonade to be the exact FLM name, like qwen3:8b?

Need to think about how users will know this is an NPU model. In the past we had -NPU, -Hybrid, and -FLM suffixes so they were aware.

This is important because:

  1. People assume ollama-style model names mean llamacpp GGUFs on GPUs, which is not the case here
  2. The NPU has unique restrictions (only 1 LLM on NPU at a time) and performance limitations (less memory bandwidth than GPU)

So people should be aware of what they're loading. lemonade-server run qwen3:8b doesn't carry that information.

@jeremyfowers (Member)

Also, if we were ever to add a translation table for ollama model names for use with the ollama API, that would collide with the FLM model names.

@jeremyfowers (Member)

jeremyfowers commented Feb 28, 2026

Another musing... it might finally be time to drop "recipes" at the top level.

If we support Qwen3-8B on cpu, gpu, and npu, why not just show the model in the model list ONCE and just offer the user a choice of device?

edit: one limitation of this strategy is that GPU and NPU use different model files. We don't want to trigger a lot of random giant downloads just because someone played with the backend menu.

@superm1 (Member, Author)

superm1 commented Feb 28, 2026

The problem with offering a list of devices is third-party clients; think of it running in openwebui, for example.

So I think we really do want to keep suffixes.

@jeremyfowers (Member)

> The problem with offering a list of devices is third party clients. Like think running in openwebui.
>
> So I think we really do want to keep suffixes.

So convert their colons to dashes and append -flm?

qwen3:8b becomes qwen3-8b-flm

@superm1 (Member, Author)

superm1 commented Feb 28, 2026

>> The problem with offering a list of devices is third party clients. Like think running in openwebui.
>>
>> So I think we really do want to keep suffixes.
>
> So convert their colons to dashes and append -flm?
>
> qwen3:8b becomes qwen3-8b-flm

ya
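
The naming convention agreed above is a one-line transform. A sketch, assuming the mapping is applied uniformly to every FLM model name:

```python
def flm_to_lemonade_name(flm_name: str) -> str:
    """Map an FLM model name to a Lemonade-visible name.

    Per the discussion above: colons become dashes and a "-flm"
    suffix marks the backend, e.g. "qwen3:8b" -> "qwen3-8b-flm".
    """
    return flm_name.replace(":", "-") + "-flm"
```

The suffix keeps the NPU-backed models distinguishable in third-party clients and avoids colliding with any future ollama-name translation table.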

@jeremyfowers (Member)

Reminder to self: ensure this doesn’t accidentally break the gpt-oss-20b filtering on windows.

@superm1 superm1 force-pushed the npu-json branch 2 times, most recently from 508bf59 to 73ff1c1 Compare March 4, 2026 16:24
@superm1 superm1 changed the title from "Use --json output for flm" to "Switch to FLM 0.9.35" Mar 4, 2026
@superm1 (Member, Author)

superm1 commented Mar 4, 2026

I've updated this PR's opening post to reflect the ongoing TODO list. CI failures are expected until that list is fully covered.

@superm1 superm1 force-pushed the npu-json branch 3 times, most recently from 60c63c7 to 1827a15 Compare March 5, 2026 17:29
jeremyfowers and others added 3 commits March 5, 2026 15:49
Deduplicate the backend row markup between BackendManager and
ModelManager into a single BackendRow component. Also adds FLM
discoverability: the model manager now shows recipe categories
with install/upgrade prompts when the backend is supported but
not yet installed.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Consolidate FLM detection into 5 ordered states (unsupported, installable,
update_required, action_required, installed) determined by system-info cache.
Remove 4 redundant NPU checks, 2 version checkers, 2 validate wrappers,
and 3 independent check-then-install dances. All code now reads FLM status
from SystemInfoCache::get_flm_status() instead of discovering it independently.

Key changes:
- Add FlmStatus struct and invalidate_recipes() to SystemInfoCache
- Split system-info cache into hardware (cached once) and recipes (invalidatable)
- Fix is_recipe_installed() for FLM with correct ordered checks
- Treat unknown FLM version as update_required (not action_required)
- Remove ModelInvalidatedException, FLMCheckException, model invalidation logic
- Simplify FastFlowLMServer: remove check(), validate(), check_npu_available()
- Add LEMONADE_MOCK_FLM_PATH env var for testability
- Add comprehensive FLM status tests (26 tests: 6 scenarios x 5 API actions)
- Add flm-status to CI test matrix

Co-Authored-By: Claude Opus 4.6 <[email protected]>
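
The five ordered states from this commit can be sketched as follows. The real implementation is the C++ `FlmStatus` struct read from `SystemInfoCache::get_flm_status()`; this Python mock of the decision order is an assumption based on the commit message (in particular, an unknown version maps to update_required):

```python
from enum import Enum

class FlmStatus(Enum):
    UNSUPPORTED = "unsupported"
    INSTALLABLE = "installable"
    UPDATE_REQUIRED = "update_required"
    ACTION_REQUIRED = "action_required"
    INSTALLED = "installed"

MIN_FLM_VERSION = (0, 9, 35)  # new minimum set by this PR

def flm_status(has_npu, flm_version, npu_ready=True):
    """Ordered status checks mirroring the 5 states in the commit.

    `flm_version` is None when FLM isn't installed, otherwise a
    tuple like (0, 9, 35); an unparseable version should be passed
    as (0, 0, 0) so it lands in update_required, per the commit.
    The exact C++ logic may differ; this is a sketch.
    """
    if not has_npu:
        return FlmStatus.UNSUPPORTED
    if flm_version is None:
        return FlmStatus.INSTALLABLE
    if flm_version < MIN_FLM_VERSION:
        return FlmStatus.UPDATE_REQUIRED
    if not npu_ready:
        return FlmStatus.ACTION_REQUIRED
    return FlmStatus.INSTALLED
```

Because the checks are strictly ordered, each caller gets one unambiguous answer instead of re-running its own NPU and version probes.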
jeremyfowers and others added 12 commits March 6, 2026 13:29
- Make FlmStatus::is_ready a method instead of a redundant field
- Add FlmStatus::error_string() to eliminate 3x duplicated error formatting
- Remove double build_recipes_info() call on first /system-info request
- Consolidate triplicated #ifdef blocks in FLM branch of build_recipes_info()
- Add _mock_flm() context manager to tests, removing 16x temp dir boilerplate

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Add allowlist-based path validation to prevent command injection via
environment variables (LEMONADE_MOCK_FLM_PATH, PATH). Addresses CodeQL
alert about user-derived input reaching popen().

Co-Authored-By: Claude Opus 4.6 <[email protected]>
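
The allowlist idea can be sketched like this; the directory list, the executable name check, and the normalization details are illustrative assumptions, not the actual C++ check:

```python
import os

# Hypothetical allowlist of directories FLM may legitimately live in;
# the real check's directories will differ per platform.
ALLOWED_DIRS = ("/usr/bin", "/usr/local/bin")

def is_safe_executable_path(path: str) -> bool:
    """Reject env-derived paths outside an allowlist before they
    reach popen().

    Normalizes first so "/usr/bin/../../tmp/flm" can't slip through.
    """
    norm = os.path.normpath(path)
    return (os.path.dirname(norm) in ALLOWED_DIRS
            and os.path.basename(norm) == "flm")
```

Validating the full path at the popen() call site (rather than only at lookup time) is what lets CodeQL see the sanitizer in the same compilation unit as the sink.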
# Conflicts:
#	src/app/src/renderer/BackendManager.tsx
Co-Authored-By: Claude Opus 4.6 <[email protected]>
…ogic

- FLM install button opens setup instructions iframe instead of showing error
- Server returns action URL from /install for backends requiring manual setup
- Consolidate backend install/uninstall into shared useBackendInstall hook
- Remove redundant recipes cache from BackendManager (SystemInfoCache is single source)
- Invalidate model cache after backend install/uninstall so FLM models appear
- Models list auto-refreshes on backendsUpdated event

Co-Authored-By: Claude Opus 4.6 <[email protected]>
The local branch consolidated install/copy logic into ConnectedBackendRow
and useBackendInstall hook, making the inline handlers from origin redundant.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Export is_safe_executable_path() as a public API and call it at each
popen site in model_manager.cpp and system_info.cpp. CodeQL traces
data flow per compilation unit, so the sanitizer must be visible
where the dangerous sink (popen) is called.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Eliminates CodeQL taint-flow alerts by removing the test-specific
LEMONADE_FLM_DIR env var from find_flm_executable(). Tests now mock
FLM by placing scripts named `flm` in a temp dir prepended to PATH.
Mock-FLM scenarios (update_required, action_required, installed) are
skipped on Windows where SearchPathA can't be overridden via PATH.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
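
The PATH-based mocking described here might look roughly like the following; the real `_mock_flm()` helper's interface isn't shown in the PR, so the names and details are assumptions (and, as the commit notes, this technique doesn't work on Windows, where SearchPathA ignores the override):

```python
import os
import shutil
import tempfile
from contextlib import contextmanager

@contextmanager
def mock_flm(json_payload: str):
    """Put a fake `flm` script first on PATH for the test's duration.

    The script just echoes a canned JSON payload, standing in for
    `flm validate --json`. POSIX-only sketch.
    """
    d = tempfile.mkdtemp()
    script = os.path.join(d, "flm")
    with open(script, "w") as f:
        f.write("#!/bin/sh\necho '%s'\n" % json_payload)
    os.chmod(script, 0o755)  # must be executable to win PATH lookup
    old = os.environ.get("PATH", "")
    os.environ["PATH"] = d + os.pathsep + old
    try:
        yield script
    finally:
        os.environ["PATH"] = old
        shutil.rmtree(d)
```

Anything resolving `flm` through PATH inside the `with` block finds the mock; the original environment is restored on exit even if the test fails.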
…nditions

- Install tests: expect 200 with action URL on Linux (server returns
  setup page URL instead of auto-installing FLM)
- unknown_version_load: accept "requires" as alternative to "unknown"
  in FLM hint (guards against inter-test race conditions)
- Improve server lifecycle: wait for port release between tests to
  prevent stale server responses

Co-Authored-By: Claude Opus 4.6 <[email protected]>
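
Waiting for port release between tests can be done by polling bind(); a sketch, since the suite's actual helper isn't shown:

```python
import socket
import time

def wait_for_port_release(port: int, timeout: float = 10.0) -> bool:
    """Poll until `port` can be bound again, or time out.

    SO_REUSEADDR lets the probe bind past TIME_WAIT remnants;
    an active listener from a stale server still makes bind() fail,
    which is the condition we're waiting out.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            try:
                s.bind(("127.0.0.1", port))
                return True  # port is free again
            except OSError:
                time.sleep(0.1)
    return False
```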
superm1 added 3 commits March 6, 2026 18:37
This information first lands upstream in 7.1.  Effectively hide the
NPU sensor if it's missing.
2 participants