Why is TorchServe No Longer Actively Maintained? #3396

Open
ily666666 opened this issue Mar 3, 2025 · 10 comments
@ily666666

Hello, I noticed that the TorchServe GitHub page has been marked as 'Limited Maintenance,' indicating that the project is no longer actively maintained. Could you share the reasons behind this decision? Is it related to the development direction of the PyTorch ecosystem? Additionally, are there any recommended alternative tools or solutions for deploying PyTorch models? 
Thank you for your response!
@zagorulkinde

Could you clarify your decision? What do you plan to use in the future?

@sapphire008

If torchserve is no longer the way to serve PyTorch models, what else is out there?

@michaeltinsley

The best like-for-like replacement right now is probably NVIDIA Triton with the PyTorch backend, I think.
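
Roughly, the port is: export your model to TorchScript and drop it into Triton's model-repository layout. A minimal sketch (the model, shapes, and paths below are just placeholders):

```python
# Minimal sketch: export a PyTorch model to TorchScript and lay it out the
# way Triton's PyTorch (libtorch) backend expects:
#   model_repository/<model_name>/<version>/model.pt
import os

import torch
import torchvision

model = torchvision.models.resnet18(weights=None).eval()  # placeholder model
scripted = torch.jit.trace(model, torch.randn(1, 3, 224, 224))

version_dir = "model_repository/resnet18/1"
os.makedirs(version_dir, exist_ok=True)
scripted.save(os.path.join(version_dir, "model.pt"))

# A config.pbtxt beside the version directory declares the input/output
# tensors, then `tritonserver --model-repository=model_repository` serves it.
```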

@aniketmaurya

Developer of LitServe here. LitServe has a similar API and on-par performance, so it's super easy to port your application.
https://lightning.ai/docs/litserve/home/benchmarks
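
To give an idea of the porting effort, here is a minimal sketch along the lines of the LitServe docs (the model is a stand-in; load your own in `setup`):

```python
import litserve as ls

class SimpleLitAPI(ls.LitAPI):
    def setup(self, device):
        # Load your real model here; a stand-in function for the sketch.
        self.model = lambda x: x ** 2

    def decode_request(self, request):
        # Pull the model input out of the request payload.
        return request["input"]

    def predict(self, x):
        return self.model(x)

    def encode_response(self, output):
        return {"output": output}

if __name__ == "__main__":
    server = ls.LitServer(SimpleLitAPI(), accelerator="auto")
    server.run(port=8000)
```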

@whplh

whplh commented Mar 13, 2025

Just want to share a list of resources to go from here...

Ray Serve (https://docs.ray.io/en/latest/serve/index.html)
Triton Inference Server ( https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html )
MLflow (https://mlflow.org/)
BentoML (https://www.bentoml.com/)
KServe (https://kserve.github.io/website/latest/)
Seldon (https://www.seldon.io/solutions/seldon-mlserver/)
Cortex (https://www.cortex.dev/)
ForestFlow (https://github.com/ForestFlow/ForestFlow)
TensorFlow Serving (https://www.tensorflow.org/tfx/guide/serving)
DeepDetect (https://www.deepdetect.com/overview/introduction)
Multi Model Server (MMS) (https://github.com/awslabs/multi-model-server)

So far I've looked at BentoML and Ray Serve.
Here are my thoughts:

  • BentoML seems to have a paywalled offering, BentoML Cloud, which you cannot self-host. But it has nice features like model management and a Triton Inference Server interface.
  • Ray Serve also has a Triton inference runtime and model management.

In LitServe I cannot find any model management (?).

If you're also coming from TorchServe and, like me =), like open source, the packaging, and the management API, feel free to share your experiences/research, or correct or add anything; I'm still searching/researching.

Cheers.

@geodavic

Does anyone have any recommendations for alternative frameworks that allow per-model user-provided code like torchserve's handler.py?
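
For context, I mean the pattern where each model ships its own handler module, roughly like this (a bare-bones sketch against TorchServe's `BaseHandler`; the parsing and output logic are placeholders):

```python
# handler.py — packaged into the model archive (.mar), so every model can
# carry its own request-handling code.
import torch
from ts.torch_handler.base_handler import BaseHandler

class MyModelHandler(BaseHandler):
    """TorchServe calls preprocess -> inference -> postprocess per request batch."""

    def preprocess(self, data):
        # data is a list of requests; each carries its payload under "data" or "body".
        rows = [row.get("data") or row.get("body") for row in data]
        return torch.tensor(rows, dtype=torch.float32)  # placeholder parsing

    def postprocess(self, inference_output):
        # Return one JSON-serializable result per request in the batch.
        return inference_output.tolist()
```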

@drjasonharrison

drjasonharrison commented Mar 19, 2025

The "no longer actively maintained" notice should include a date. Especially true for the documentation at pytorch.org/serve

Without having to dig I should be able to determine how recently the project has been abandoned. Thankfully google found this repo and the list of releases has the latest release in 2024-09

@bhimrazy

bhimrazy commented Mar 24, 2025

> Does anyone have any recommendations for alternative frameworks that allow per-model user-provided code like torchserve's handler.py?

Hi @geodavic, I’d recommend LitServe as a great alternative. As a contributor, I can say it offers a user-friendly interface for serving models with excellent performance. Feel free to try it out and let me know if you have any questions! 😊

@yuzhichang

@bhimrazy What I miss from TorchServe is its support for separate endpoints and different GPU configurations for multiple models. According to https://lightning.ai/docs/litserve/features/multiple-endpoints#multiple-routes, LitServe doesn't support that.

@bhimrazy

Hi @yuzhichang,
By design, LitServe is kept simple yet performant.

Btw, you can easily configure devices, GPUs, and workers while setting up the LitServer (see: LitServer Devices).
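
For example (a minimal sketch; `MyLitAPI` and the values are placeholders):

```python
import litserve as ls

server = ls.LitServer(
    MyLitAPI(),            # your ls.LitAPI subclass (placeholder name)
    accelerator="gpu",     # "cpu", "gpu", or "auto"
    devices=2,             # how many GPUs to spread workers across
    workers_per_device=2,  # parallel worker processes per device
)
server.run(port=8000)
```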

For multiple endpoints, I’d suggest creating a Docker image for each endpoint and serving them that way. If you’d like to share any thoughts or use cases on multiple endpoints, feel free to add them to this issue: #271. 😊
