
I have a Flask service that receives GET requests, and I want to scale the QPS on that endpoint (on a single machine/container). Should I use a python ThreadPoolExecutor or ProcessPoolExecutor, or something else? The GET request just retrieves small pieces of data from a cache backed by a DB. Is there anything specific to Flask that should be taken into account?

asked Nov 19, 2024 at 18:19 by Frank
  • You probably want to start by measuring the thing you want to improve - what's the current QPS, and have you identified any obvious bottlenecks in the current request pipeline? – Mathias R. Jessen Commented Nov 19, 2024 at 18:27
  • Remember that ProcessPoolExecutor requires non-trivial state synchronization; it really depends on how much state is being shared and how much work is being done in the workers. – Ahmed AEK Commented Nov 19, 2024 at 18:31
  • @MathiasR.Jessen Very good point, you are right. I'll get on to that. – Frank Commented Nov 19, 2024 at 18:34
  • @AhmedAEK The threads really just each do a simple `select ... from db where user_id = xyz` each time the /GET endpoint is reached. – Frank Commented Nov 19, 2024 at 18:34

1 Answer


Neither.

Flask serves one request per worker (or more, depending on the worker type) - the way you set it up, whether with gunicorn, uWSGI, or another WSGI/ASGI server, is what determines the number of parallel requests your app can process.

Inside your app, you don't change anything - your views will be called as independent processes, independent threads, or independent async tasks, depending on how you set up your server - that is where you have to tinker with the configuration.
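For example, with gunicorn the concurrency lives entirely in the server configuration, not in the Flask code. A minimal sketch of a `gunicorn.conf.py`, assuming your Flask app object is named `app` inside `app.py` (the worker and thread counts below are illustrative starting points, to be tuned against measured QPS):

```python
# gunicorn.conf.py -- parallelism is configured here, not inside Flask.
# The "gthread" worker class gives each worker process a pool of threads,
# which suits I/O-bound views like small cache/DB lookups.
worker_class = "gthread"
workers = 4          # independent worker processes (often ~number of cores)
threads = 8          # threads per worker, so up to 32 concurrent requests
bind = "0.0.0.0:8000"
```

You would then start the service with `gunicorn -c gunicorn.conf.py app:app`, and the views themselves stay plain synchronous functions.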

Using another concurrency strategy would only make sense if each request involved calculations and data fetching that could themselves be parallelized, inside the same request.
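That in-request case is the one place a ThreadPoolExecutor would help inside a view. A minimal sketch, with hypothetical `fetch_profile` and `fetch_orders` functions standing in for two independent I/O calls that a single request needs:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fetchers standing in for independent cache/DB lookups.
def fetch_profile(user_id):
    return {"id": user_id, "name": "alice"}

def fetch_orders(user_id):
    return [{"order": 1}, {"order": 2}]

# One shared pool for the process; don't create a new pool per request.
executor = ThreadPoolExecutor(max_workers=4)

def get_user_page(user_id):
    # The two lookups are independent, so run them concurrently
    # and block only until both results are ready.
    profile_future = executor.submit(fetch_profile, user_id)
    orders_future = executor.submit(fetch_orders, user_id)
    return {
        "profile": profile_future.result(),
        "orders": orders_future.result(),
    }

print(get_user_page(42))
```

The body of `get_user_page` is what would go inside a Flask view function; for a single small `select` per request, as in the question, there is nothing to parallelize and this adds only overhead.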

Check how your deployment is configured, and pick the best option for you (all things being equal, pick the easiest one): https://flask.palletsprojects.com/en/stable/deploying/ (also, I would not recommend "mod_wsgi" among those options - it is super complex and old tech)

Tags: python | Correct way to parallelize request processing in Flask - Stack Overflow