
Poor performance for large (~100MB) payloads #651

Closed
dmontagu opened this issue Oct 1, 2019 · 4 comments


dmontagu commented Oct 1, 2019

I'm noticing especially slow handling of large request bodies when running in uvicorn, and I'm trying to get to the bottom of it.

If this kind of performance is expected for payloads of this size, please let me know.


The script below posts a large payload (~100MB) to a starlette.applications.Starlette endpoint, which just returns a success response. Running via the starlette TestClient, I get a response in ~0.65 seconds; running via uvicorn it takes ~17.5 seconds (~27x slower).

(I'll note that this discrepancy becomes much smaller as the size of the payload decreases -- I think it was about 3-5x for 10MB, and not really significant below 1MB.)

I was able to get speeds comparable to the TestClient run using a Flask implementation. I also see similar slowdowns when running via gunicorn with a uvicorn worker (I haven't tested other servers; I'm not sure if there are recommended alternatives).

import sys
from datetime import datetime

import requests
import uvicorn
from requests import Session
from starlette.applications import Starlette
from starlette.requests import Request
from starlette.responses import Response
from starlette.testclient import TestClient

app = Starlette()


@app.route("/", methods=["POST"])
async def endpoint(request: Request):
    payload = await request.json()
    assert isinstance(payload, dict)
    return Response("success")


def _speed_test(session: Session, url: str):
    payload = {"payload": "a" * 100_000_000}
    start = datetime.utcnow()
    response = session.post(url=url, json=payload)
    elapsed = datetime.utcnow() - start
    assert response.status_code == 200
    assert response.content == b"success"
    print(elapsed)


def asgi_test():
    client = TestClient(app)
    _speed_test(client, "/")


def uvicorn_test():
    session = requests.Session()
    _speed_test(session, "http://127.0.0.1:8000/")


def main():
    if "--asgi-test" in sys.argv:
        asgi_test()
        # 0:00:00.650825
    elif "--uvicorn-test" in sys.argv:
        uvicorn_test()
        # 0:00:17.502396
        # cProfile:
        # Name   Call Count   Time (ms)  Own Time (ms)
        # body      391         16670        16649
    else:
        uvicorn.run(app)


if __name__ == "__main__":
    main()

This script can perform three actions:

  • Start the uvicorn server if run without arguments
  • Hit the uvicorn server if run with the argument --uvicorn-test (requires the server to have been previously started)
  • Use starlette's TestClient if executed with the argument --asgi-test

The script performs only a single request, but the speed difference is very consistently this extreme.


I ran cProfile over the server while the slow response (to a single request) was being generated, and by far the line that stood out was:

Name   Call Count   Time (ms)  Own Time (ms)
body      391         16670        16649

where body here is a reference to the starlette.requests.Request.body method. Nothing else was remotely close in the Own Time column. (Only uvloop.loop.Loop.run_until_complete was more than 1%, and I think that was just downtime while waiting for me to trigger the request.)
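For context on where that time goes: an ASGI request body arrives as a stream of "http.request" messages, and the framework has to stitch the chunks back together. Below is a minimal sketch of that accumulation loop (a hypothetical read_body helper and a fake receive callable for illustration, not Starlette's actual code):

```python
import asyncio


async def read_body(receive) -> bytes:
    # Minimal sketch of assembling a request body from ASGI
    # "http.request" messages. Hypothetical helper, not Starlette's code.
    body = b""
    while True:
        message = await receive()
        body += message.get("body", b"")  # naive accumulation
        if not message.get("more_body", False):
            return body


def make_receive(chunks):
    # Fake `receive` callable that replays a fixed list of ASGI messages,
    # setting more_body=False on the final chunk.
    messages = [
        {"type": "http.request", "body": c, "more_body": i < len(chunks) - 1}
        for i, c in enumerate(chunks)
    ]
    it = iter(messages)

    async def receive():
        return next(it)

    return receive


print(asyncio.run(read_body(make_receive([b"abc", b"def", b"ghi"]))))  # b'abcdefghi'
```

With a ~100MB body split into a few hundred chunks (the 391 calls in the profile above), this loop is exactly where all the time would be spent.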


This was originally posted as a FastAPI issue (fastapi/fastapi#360), but it seems to be an issue with either uvicorn or starlette. (I am going to cross-post it to uvicorn as well.) In that issue, a (more complex) script was posted comparing performance against Flask; Flask achieved performance similar to what I get using the ASGI TestClient.


dmontagu commented Oct 1, 2019

I set this up to run using hypercorn (with and without uvloop) by changing the else clause in the main() function to:

        # Uncomment as appropriate

        # Hypercorn:
        import asyncio

        from hypercorn.asyncio import serve, Config

        config = Config()
        config.bind = ["localhost:8000"]  # As an example configuration setting
        asyncio.run(serve(app, config))

        # Hypercorn + uvloop:
        # import asyncio
        # import uvloop
        # from hypercorn.asyncio import serve, Config
        # config = Config()
        # config.bind = ["localhost:8000"]  # As an example configuration setting
        # asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
        # loop = asyncio.new_event_loop()
        # asyncio.set_event_loop(loop)
        # loop.run_until_complete(serve(app, config))

        # Uvicorn:
        # import uvicorn
        # uvicorn.run(app)

With these variants, uvicorn handles the request in ~16-17s, hypercorn without uvloop takes ~1m6s, and hypercorn with uvloop is even slower at ~1m13s 👀. So maybe this is an issue with the starlette body method?

Any insight or suggested lines of investigation would be appreciated!


dmontagu commented Oct 2, 2019

I believe the issue is this line:

body += chunk

I think this is related to the quadratic-scaling problem that arises when building bytestrings via repeated +=.

I'm going to investigate; if appending the chunks to a list and calling b"".join() on it solves the problem, I'll open a PR.

Edit: this was indeed the issue. Even for payloads as small as 5MB, in my testing, the proposed change caused the server to handle the request ~15-20% faster.
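For anyone curious, the two accumulation patterns look like this (a standalone sketch, not Starlette's actual code). Repeated += on a bytes object allocates a new buffer and copies everything accumulated so far on every chunk, so n chunks cost O(n²) total bytes copied; list-append plus a single b"".join() copies each byte exactly once.

```python
def concat_naive(chunks):
    # Each += copies the entire accumulated buffer into a new bytes
    # object: quadratic total work in the number of chunks.
    body = b""
    for chunk in chunks:
        body += chunk
    return body


def concat_join(chunks):
    # Appending to a list is O(1) per chunk; b"".join() then makes a
    # single pass that copies each byte exactly once.
    parts = []
    for chunk in chunks:
        parts.append(chunk)
    return b"".join(parts)


chunks = [b"x" * 1024] * 100  # 100 chunks of 1 KiB each
assert concat_naive(chunks) == concat_join(chunks)
```

Both produce identical output; only the copying behavior differs, which is why the gap grows so dramatically with payload size.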

@tomchristie
Member

Closed via #653


gvbgduh commented Oct 2, 2019

awesome work @dmontagu!
