
Break down main function in llama-server #13425


Open · wants to merge 1 commit into base: master
Conversation

ericcurtin (Collaborator)

The llama-server main function is getting meaty; this just breaks it down into smaller functions.
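As an illustration of the kind of decomposition being proposed, here is a minimal, hypothetical sketch of a monolithic server main() split into small single-purpose functions. None of these names (server_params, parse_cli, validate) are taken from server.cpp; they are assumptions for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch: names are illustrative, not llama-server's actual code.
struct server_params {
    std::string host = "127.0.0.1";
    int port = 8080;
};

// Each startup stage gets its own small function instead of living inline
// in main(). Here: CLI parsing...
static server_params parse_cli(const std::vector<std::string> &args) {
    server_params p;
    for (size_t i = 0; i + 1 < args.size(); ++i) {
        if (args[i] == "--port") p.port = std::stoi(args[i + 1]);
        if (args[i] == "--host") p.host = args[i + 1];
    }
    return p;
}

// ...and parameter validation, each testable in isolation.
static bool validate(const server_params &p) {
    return p.port > 0 && p.port < 65536 && !p.host.empty();
}
```

With that split, main() reduces to a short sequence of calls (parse, validate, start), and each piece can be unit-tested without spinning up the whole server.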

@ericcurtin ericcurtin requested a review from ngxson as a code owner May 10, 2025 12:08
@ericcurtin ericcurtin marked this pull request as draft May 10, 2025 12:08
@ericcurtin (Collaborator, Author)

Incomplete

llama-server main function is getting meaty, just breaking it
down into smaller functions.

Signed-off-by: Eric Curtin <[email protected]>
@ericcurtin ericcurtin marked this pull request as ready for review May 10, 2025 12:32
ngxson (Collaborator) commented May 10, 2025

Before going further, I think it's better to discuss a plan rather than diving into the code.

While working on #13400 (comment), I also thought about refactoring server.cpp into small components; it should be done in a way that makes it easy to route requests to multiple models on the same server instance.

For now, the simplest task is of course to abstract out the creation of the HTTP server. A second task could be to move all the HTTP handlers to a completely separate file. The main component, server_context, may also need to be moved to a dedicated file.
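The separation described above can be sketched roughly as follows. This is a hypothetical illustration, not llama.cpp's actual API: the HTTP layer only knows about paths and callbacks, while the handlers, which could live in their own file, capture whatever state they need from a server_context.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical sketch of the proposed split; none of these names are
// taken from server.cpp.
using handler_fn = std::function<std::string(const std::string & /*body*/)>;

// The HTTP layer: only routes and callbacks, no inference logic.
class http_server {
    std::map<std::string, handler_fn> routes;
public:
    void get(const std::string &path, handler_fn h) {
        routes[path] = std::move(h);
    }
    std::string dispatch(const std::string &path, const std::string &body) const {
        auto it = routes.find(path);
        return it == routes.end() ? "404" : it->second(body);
    }
};

// The main component's state, which could move to a dedicated file.
struct server_context {
    std::string model_name = "demo";
};

// Handler registration, separated from both the HTTP layer and main().
static void register_routes(http_server &srv, server_context &ctx) {
    srv.get("/health", [](const std::string &) { return std::string("ok"); });
    srv.get("/model", [&ctx](const std::string &) { return ctx.model_name; });
}
```

Because the router maps paths to callbacks rather than hard-coding handlers, extending it to key routes on a model name as well (for multi-model routing on one server instance) would be a localized change to dispatch rather than a rewrite of the handlers.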
