apache - What could cause strange delays while sending delays from a python aiohttp server? - Stack Overflow


We have a service with this architecture:

HTTPS requests come into an A10 load balancer that does L4 load balancing.
Behind it are 2 backend servers running Apache, which terminates the TLS connection.
In Apache there is a ProxyPass rule that talks to an HTTP service on localhost.
This local service uses gunicorn and is implemented in Python using aiohttp (Python 3.11.2 on Debian Linux ("bookworm"), running as a VMware VM).

Now we have some cases where users reported timeouts. Both the Python application and Apache write logs, and so we've traced some of the timeouts to a weird delay. The Python application logs a message right before it passes its response to gunicorn (and thus to Apache), and sometimes the time between this log entry and the Apache access log entry is really long, like 300s.

This is a somewhat rare occurrence: on a day where we process about 880k requests, only 17 have a delay of 30s or more, which makes this kinda hard to debug.
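
A minimal sketch of this kind of log comparison, assuming both logs start each line with an ISO 8601 timestamp followed by a shared request ID. The real log formats are not shown here, and the names `parse_line` and `slow_requests` are made up for illustration:

```python
from datetime import datetime


def parse_line(line):
    """Return (request_id, timestamp) from a line like
    '2025-03-13T14:00:00 r1 ...' (assumed format, not the real one)."""
    ts, request_id = line.split()[:2]
    return request_id, datetime.fromisoformat(ts)


def slow_requests(app_lines, apache_lines, threshold_s=30.0):
    """List (request_id, delay_in_seconds) for requests where the Apache
    entry is at least threshold_s later than the application entry."""
    app_times = dict(parse_line(l) for l in app_lines if l.strip())
    slow = []
    for line in apache_lines:
        if not line.strip():
            continue
        request_id, finished = parse_line(line)
        handed_over = app_times.get(request_id)
        if handed_over is None:
            continue  # no matching application log entry
        delay = (finished - handed_over).total_seconds()
        if delay >= threshold_s:
            slow.append((request_id, delay))
    return slow
```

Feeding it one day of both logs would yield the handful of request IDs worth looking at more closely.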

Capturing all the network traffic in a huge .pcap file and then sifting through it is kinda hopeless: there is far too much data, and the responses from Apache are encrypted, which makes it really hard to trace request IDs in the pcap files.

Most of the requests with delays have a response size of a few kilobytes to a few megabytes, though very rarely we also see slow requests with a response body under 1 kB, which should fit into a single TCP packet.

Does anybody have a good hypothesis where this delay could come from, and how I could debug it?


asked Mar 13 at 14:52, edited Mar 13 at 14:58, by moritz

What are the requests? What is the load when the delays occur (time of day)? Do the delays occur across multiple days? Can a delay be recreated, e.g. by replaying requests? Are there any hardware issues, such as dodgy line cards? Is this hosted locally or with a cloud provider? Consider adding more debugging. – ticktalk, Mar 17 at 22:50
Are you able to get access to the payloads that were subject to delay? Are other requests at the same time being processed fine? Can you reproduce it in a lab environment somehow? – Mo_, Mar 18 at 16:26
The nature of the requests doesn't really matter, since they are handled by yet another service. According to Prometheus, the load is never high. This happens on 4 different VMware VMs in two different clusters and locations, so a dodgy hardware problem is unlikely. The problem appears in QA too, but not reproducibly. Replaying requests doesn't reproduce the problem. Other requests seem to be processed fine (need to look in the logs to be sure). – moritz, Mar 19 at 13:49


1 Answer

Sporadic response delays of 30–300s like these typically stem from clients that read slowly or stall, or from rare TCP stalls (packet loss and retransmissions). Your aiohttp app logs “done” as soon as it hands the response to Gunicorn, while Apache writes its access log entry only once it has finished sending the response to the client. A slow or flaky client or network therefore keeps the proxy’s socket open and delays the final Apache log entry. You can confirm this by checking socket states (ss, netstat), Apache’s mod_status, and TCP retransmissions. The rest of the system is likely fine; this is common when dealing with a small fraction of slow or interrupted clients.
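
As a rough illustration of the socket-state check, the sketch below parses the output of `ss -tn` and lists connections whose Send-Q is non-zero, meaning bytes the peer has not yet acknowledged. It is only a sketch under assumptions: the helper names `parse_ss` and `snapshot` are made up for this example, and the parser assumes the usual column order (State, Recv-Q, Send-Q, local address, peer address).

```python
import subprocess


def parse_ss(output, min_send_q=1):
    """Return connections from `ss -tn` output whose Send-Q is at least
    min_send_q. Lines that don't look like data rows are skipped."""
    pending = []
    for line in output.splitlines():
        fields = line.split()
        # The header row has non-numeric Recv-Q/Send-Q columns.
        if len(fields) < 5 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        state, _recv_q, send_q, local, peer = fields[:5]
        if int(send_q) >= min_send_q:
            pending.append(
                {"state": state, "send_q": int(send_q), "local": local, "peer": peer}
            )
    return pending


def snapshot():
    """Run `ss -tn` (TCP sockets, numeric addresses) and parse the result."""
    result = subprocess.run(
        ["ss", "-tn"], capture_output=True, text=True, check=True
    )
    return parse_ss(result.stdout)
```

Calling `snapshot()` every few seconds on the Apache host and logging entries on the HTTPS port whose Send-Q stays non-zero would show whether the slow requests line up with clients that are not draining their sockets.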
