Poor performance for large (~100MB) payloads #443
I set this up to run using hypercorn (with and without uvloop) by changing the else clause in the main() function to:

```python
# Uncomment as appropriate

# Hypercorn:
from hypercorn.asyncio import serve, Config
config = Config()
config.bind = ["localhost:8000"]  # As an example configuration setting
asyncio.run(serve(app, config))

# Hypercorn + uvloop:
# import uvloop
# from hypercorn.asyncio import serve, Config
# config = Config()
# config.bind = ["localhost:8000"]  # As an example configuration setting
# asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
# loop = asyncio.new_event_loop()
# asyncio.set_event_loop(loop)
# loop.run_until_complete(serve(app, config))

# Uvicorn:
# import uvicorn
# uvicorn.run(app)
```

where uvicorn handles the request in ~16-17s, hypercorn without uvloop handles it in ~1m6s, and hypercorn with uvloop handles it even slower at ~1m13s 👀. So maybe this is a starlette issue after all... Either way, any insight or suggested lines of investigation would be appreciated!
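For reference, a minimal client-side timing sketch (an assumption about the setup, not the original benchmark script) that posts a ~100MB JSON payload to whichever server is bound to localhost:8000:

```python
# Client-side timing sketch (assumed, not the original benchmark script).
import json
import time

import requests

# Roughly 100MB of JSON built from a repeated dummy record (the payload shape
# is an assumption, not the payload used above).
payload = json.dumps({"data": ["x" * 1000] * 100_000})

start = time.perf_counter()
response = requests.post(
    "http://localhost:8000/",
    data=payload,
    headers={"Content-Type": "application/json"},
)
print(f"status={response.status_code}, elapsed={time.perf_counter() - start:.2f}s")
```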
Hey @dmontagu, nice investigation! I'll try to have a deeper look a bit later this week. It's an interesting case, even if it's not the most common scenario -- some cloud providers and services will simply return 413 for a payload of that size -- but it's definitely worth looking into. On the other hand, I'm a bit in doubt whether the test client provides a "fair" time; I have a sneaking suspicion it may reuse already-allocated memory for the payload, avoiding an expensive operation. Also, just to clarify: are we talking only about the JSON payload?
I am actually more interested in the multi-part speed, since I already make use of the file submission capabilities for large data transfers in some of my projects, but I have only noticed the issue (and tested) for the JSON body. I haven't noticed any serious issues in my projects due to this (although perhaps some of my endpoints could be more responsive...). It's just that the ~40x slowdown I saw in fastapi/fastapi#360 scared me, and I was hoping to get in front of potentially serious performance issues. (More generally, from an advocacy perspective, I would rather just prevent situations where someone could start arguing that starlette is 50 times slower than flask...)

I wouldn't mind if the end conclusion was that streaming / file submissions currently have no performance issues, and that bodies should be limited in size for performance. (I would be especially fine with this if it was due to a tradeoff that made handling small bodies more performant.) I just want to understand what is going on, and fix it if something has a problem.

It wouldn't surprise me if the TestClient was "cheating" a little bit, but the flask server in the fastapi issue I linked above was able to handle the exact same request being sent to fastapi only about 10% slower than the ASGI client. That example involved a ~175MB payload; on my computer it took ~1.3 seconds via ASGI, ~1.4 seconds via flask (running in a separate process, listening on a local port), and ~52 seconds via fastapi (and fastapi was doing basically nothing on top of starlette). If necessary, I can port the code in my script above to work with flask for comparison, but I'd rather not if possible 😬 -- I have very little experience with flask and just relied on the implementation provided by @MarlieChiller.

I agree that these payloads are unreasonably large, but even in the high single-digit megabyte range the slowdown is substantial (just not a factor of 30+ 😄) -- I think I was still seeing a ~2x slowdown (it is easy to test by changing the number of …)
@gvbgduh I got to the bottom of it -- it was in fact the body accumulation in `starlette.requests.Request.body`. That line was causing quadratic scaling for request building, which was becoming very significant with large request sizes. I did testing with smaller payloads, and even at the 5MB mark changing this caused the server to handle the request ~15-20% faster. I opened encode/starlette#653 to fix it.
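For anyone skimming later, here's a minimal sketch of the difference, assuming the hotspot was repeated bytes concatenation while assembling the body (illustrative only -- not the actual starlette source or the exact change in encode/starlette#653):

```python
# Illustrative only: repeated bytes concatenation scales quadratically, while
# collecting chunks and joining once scales linearly.
from typing import Iterable


def assemble_body_quadratic(chunks: Iterable[bytes]) -> bytes:
    body = b""
    for chunk in chunks:
        body += chunk  # copies everything received so far on every iteration
    return body


def assemble_body_linear(chunks: Iterable[bytes]) -> bytes:
    parts = []
    for chunk in chunks:
        parts.append(chunk)  # O(1) per chunk
    return b"".join(parts)  # single final pass over the data


if __name__ == "__main__":
    import time

    chunks = [b"x" * 65_536 for _ in range(1_600)]  # ~100MB in 64KB chunks
    for fn in (assemble_body_quadratic, assemble_body_linear):
        start = time.perf_counter()
        fn(chunks)
        print(f"{fn.__name__}: {time.perf_counter() - start:.2f}s")
```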
(Cross posted to encode/starlette#651; I'm not sure where the issue lies, but I currently think it is either starlette or uvicorn.)
I'm noticing especially slow handling of large request bodies when running in uvicorn, and I'm trying to get to the bottom of it.
If this kind of performance is expected for payloads of this size for any reason, please let me know.
The script below posts a large payload (~100MB in size) to a `starlette.applications.Starlette` endpoint, which just returns a success response. Running via the starlette `TestClient`, I get a response in ~0.65 seconds; running via uvicorn it takes ~17.5 seconds (or ~27x slower). (I'll note that this discrepancy becomes much smaller as the size of the payload decreases -- I think it was about 3-5x for 10MB, and not really significant below 1MB.)

I was able to get speeds comparable to the `TestClient` run using a flask implementation. I also get similar slowdowns when running via gunicorn with a uvicorn worker (I haven't tested other servers; not sure if there are recommended alternatives).
This script can perform three actions:

- perform the request against a running uvicorn server, if executed with the argument `--uvicorn-test` (requires the server to have been previously started)
- perform the request via the `TestClient`, if executed with the argument `--asgi-test`
- otherwise, serve the app (the `else` clause of `main()` mentioned in the hypercorn comment above)
The script performs only a single request, but the speed difference is very consistently this extreme.
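The script itself is collapsed in the rendered issue and isn't reproduced here; below is a minimal sketch of the kind of app and endpoint being described (the route path, handler name, and response shape are assumptions, not the original code):

```python
# Minimal sketch of the kind of app being benchmarked (assumed shape, not the
# collapsed script from the issue).
from starlette.applications import Starlette
from starlette.requests import Request
from starlette.responses import JSONResponse
from starlette.routing import Route


async def receive(request: Request) -> JSONResponse:
    body = await request.body()  # reads the full (large) request body
    return JSONResponse({"received_bytes": len(body)})


app = Starlette(routes=[Route("/", receive, methods=["POST"])])

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app)
```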
I ran cProfile over the server while the slow response (to a single request) was being generated, and by far the entry that stood out was `body`, where `body` here is a reference to the `starlette.requests.Request.body` method. Nothing else was remotely close in the Own Time column. (Only `uvloop.loop.Loop.run_until_complete` was more than 1%, and I think that was just downtime while waiting for me to trigger the request.)

This was originally an issue posted to fastapi (fastapi/fastapi#360), but it seems to be an issue with either uvicorn or starlette. (I am going to cross post this issue to uvicorn as well.) In that issue, a (more complex) script was posted comparing the performance to that of flask; flask was able to achieve similar performance to what I can get using the ASGI TestClient.
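To reproduce the profiling, one approach (a sketch; the original profiling setup isn't shown here, and `server_script.py` is a placeholder name) is to run the server under cProfile, send the large request, stop the server, and then inspect the dump sorted by `tottime`, which corresponds to the Own Time column mentioned above:

```python
# Run the server under the profiler, send the request, then stop it (Ctrl-C):
#
#   python -m cProfile -o server.prof server_script.py
#
# Then inspect the saved profile:
import pstats

stats = pstats.Stats("server.prof")
stats.sort_stats("tottime").print_stats(15)  # top 15 entries by own time
```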