Description
Name and Version
release-0.14
What is the problem this feature will solve?
Currently, our service proxies search requests directly to the Hugging Face (HF) registry, which requires exact string matches for its filters. Users frequently do not know the exact tag names Hugging Face uses (e.g., typing apache instead of apache-2.0, or pyto instead of pytorch), so partial inputs return empty results. This is a frustrating experience and often forces users to leave our platform to look up the exact tag names on the HF website before they can search successfully through our proxy.
What is the feature you are proposing to solve the problem?
We propose implementing a client-side Autocomplete/Enum Search Component, powered by a comprehensive list of tags provided by the backend.
Because the total number of available Hugging Face tags across these categories is finite and manageable, the implementation would be split into two parts:
- Backend: Implement a scheduled background job that periodically fetches and caches the complete, finite list of valid, exact tags from Hugging Face. The backend will expose a single endpoint (e.g., /api/tags) that returns this entire dataset to the client at once.
- Frontend: Implement an autocomplete dropdown component for the search bar. The UI will fetch the full list of tags from the backend (e.g., on application load) and handle the partial-matching logic entirely on the client side. As the user types, the UI will instantly filter the local list and present the exact HF tags. The user selects the correct exact tag, ensuring that only a single, perfectly formatted filter is ever sent through our proxy to the HF API.
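The client-side matching described in the Frontend item could look roughly like the sketch below. It is only an illustration: the `TagEntry` shape, the `filterTags` helper, the category names, and the ranking rule (prefix matches before substring matches) are assumptions, not part of the proposal or of any existing API.

```typescript
// Hypothetical shape of one entry in the payload returned by the backend tag endpoint.
interface TagEntry {
  category: string; // assumed category label, e.g. "license" or "library"
  value: string;    // exact tag string as Hugging Face expects it
}

// Return up to `limit` tags whose value contains the partial input.
// Prefix matches are listed before other substring matches.
function filterTags(tags: readonly TagEntry[], input: string, limit = 10): TagEntry[] {
  const query = input.trim().toLowerCase();
  if (query === "") return [];

  const prefixMatches: TagEntry[] = [];
  const substringMatches: TagEntry[] = [];
  for (const tag of tags) {
    const index = tag.value.toLowerCase().indexOf(query);
    if (index === 0) prefixMatches.push(tag);
    else if (index > 0) substringMatches.push(tag);
  }
  return [...prefixMatches, ...substringMatches].slice(0, limit);
}

// Illustrative data only; the real list would come from the backend cache.
const sampleTags: TagEntry[] = [
  { category: "license", value: "apache-2.0" },
  { category: "license", value: "mit" },
  { category: "library", value: "pytorch" },
  { category: "library", value: "pytorch-lightning" },
];

const suggestions = filterTags(sampleTags, "pyto").map((t) => t.value);
console.log(suggestions); // [ 'pytorch', 'pytorch-lightning' ]
```

Because the filter runs over an in-memory array, no request is made while the user types; only the tag the user finally selects is forwarded to the proxy.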
What alternatives have you considered?
Backend-Only Hidden Expansion (Rejected): The original idea was to intercept the user's partial string on the backend, expand it into a list of all matching exact tags, and append them all to the downstream HF request. This will not work. The HF API evaluates multiple tags using a strict AND operator. Passing pytorch and pytorch-lightning simultaneously returns zero results because the API looks for models that carry both tags.
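The effect of AND evaluation can be reproduced with a toy in-memory model. This is a sketch of the semantics described above, not a call to the real HF API; `ModelRecord`, `searchAllTags`, and the two sample records are hypothetical, and the sample data is chosen so that no record carries both tags.

```typescript
// Toy in-memory stand-in for the registry; not real Hugging Face data.
interface ModelRecord {
  id: string;
  tags: string[];
}

const models: ModelRecord[] = [
  { id: "model-a", tags: ["pytorch"] },
  { id: "model-b", tags: ["pytorch-lightning"] },
];

// AND semantics: a record matches only if it carries every requested tag.
function searchAllTags(records: readonly ModelRecord[], filters: readonly string[]): string[] {
  return records
    .filter((record) => filters.every((tag) => record.tags.indexOf(tag) !== -1))
    .map((record) => record.id);
}

// Expanding "pyto" into both tags matches nothing in this data set...
const expanded = searchAllTags(models, ["pytorch", "pytorch-lightning"]);
// ...while a single exact tag matches as expected.
const single = searchAllTags(models, ["pytorch"]);

console.log(expanded); // []
console.log(single);   // [ 'model-a' ]
```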
Backend Scatter-Gather / Simulated OR (Rejected): To bypass the AND limitation, the backend could theoretically break the expanded tags into separate HTTP requests, fire them concurrently at HF, and merge the results. This was rejected because fanning out requests would immediately exhaust our Hugging Face API rate limits and severely degrade the performance of our proxy service.
Wait for Upstream API Updates (Rejected): We could pause this feature until Hugging Face updates its GET /api/models endpoint to natively support OR logic or partial matching. This was rejected because it would indefinitely block our ability to improve the platform's user experience.