-
Notifications
You must be signed in to change notification settings - Fork 14.2k
server: (preset) add unsafe-allow-api-override
#18322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Interesting! If I understand correctly, you're implementing a filter that lets the router accept cold-start param overrides via the API body (like ctx, n-gpu-layers), with the whitelist in the INI controlling which params can be touched. This gives fine-grained control that can be refined later. Smart approach to transfer the risk to the admin's explicit whitelist! |
|
I was initially thinking more about hot-reloading the INI by the router to keep control while having the INI modifications handled by the admin. But you're right that since there's already an API, we're already working in two directions at once. We can improve the INI whitelist later, add threshold validation, constraints, etc In my case, if I expose my WebUI to the internet, I need to open the entire API. So anything admin-related (memory allocation etc) requires authentication. I had initially thought about a simple /admin endpoint prefix. With all-or-nothing access, I'll have to filter at the reverse proxy level, which is not admin-friendly Ultimately, all we need is an open "user" level as currently exists, and an "admin" level with a password for everything that changes depending on available resources such as memory, etc.. |
I think allow reloading INI can be useful, but will be quite complicated due to some edge cases as I specified in #18261 (comment)
IMO this will also be quite over-complicated because:
Re. the Instead, I would transfer the risk to end-user by allowing them to hot-reload the INI. Users can develop their own admin panel that allow modifying INI file, then trigger the hot-reload - they are now responsible for any security risks that may happen in their admin code. |
|
A bit more context on why we want to transfer the risk whenever possible: we do have a security advisory program, but many bug reports are AI-generated and they nitpick functionalities like this. In the context of this PR, we simply transfer the risk of allowing overriding a param to the user. So if they explicitly wrote something like this: And someone use it via API with Then now it's their responsibility as they explicitly written the word |
|
If we code our own backend/WebUI, we don't need the API overrides; we write the INI directly. This feature is more for exposing the llama.cpp API directly to external clients, right? Also, a separate endpoint/API for "unsafe" items would make it easier for the community to develop admin panels. Mixing everything together isn't ideal; unsafe items should remain localhost (--jinja /etc/passwd). |
|
The whole router feature is never meant to be used in untrusted network. A classical DDoS attack vector is just by repeatedly sending load and unload requests. Instead, this |
Yes it's a classic DoS, it doesn't even need to be distributed (DDoS), easy to filter / rate limit |
yeah I agree that this main issue is that INI hot-reload have some edge cases so we need to list that out before working on the feature. tbh I don't know if the current PR is as useful as it is, mainly to address the fact that a lot of people asked that |
|
The most important thing is to discuss and test in practice, and automatically we will find the best compromises! |
|
This looks like what I need! Quick question to confirm: if I use --models-dir with a minimal preset file that just has a global [*] section with unsafe-allow-api-override = c, would that allow context size overrides for all dynamically discovered models? My use case is loading models with different context sizes without disrupting others already warm in memory, hence why reloading the ini file each time doesn't suffice. |
Ref discussion: #18261 (comment)
@ServeurpersoCom I think we need to add a test with INI preset at some point
Example preset for this PR:
And API request:
Returns: