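Suppose you cap concurrency on an httpx.AsyncClient with limits and then fire several requests at YouTube at the same time, but network restrictions mean the connection can never be established.
Assume 4 requests in total, with max_connections=2 and timeout=5s.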
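- What happens by default is not the following: the first two connections time out during the TCP handshake, the remaining two then start connecting and time out as well, so every request has failed after 2 * timeout = 10s.
- Instead, with the default configuration a single timeout value also applies to the time a request spends waiting locally for a free connection slot because of limits (the pool timeout). That clock starts as soon as the request is awaited, so the two queued requests fail with httpx.PoolTimeout at the same moment the first two hit their connect timeout, and all four requests have timed out after 1 * timeout = 5s.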
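The arithmetic behind the two bullets can be sketched with a small toy model. It does not call httpx at all; the function name failure_times and the rule that a tie between "slot becomes free" and "pool timeout expires" counts as a pool timeout are assumptions made for illustration, chosen to match the behavior described above.

```python
import heapq
from typing import List, Optional, Tuple


def failure_times(
    n_requests: int,
    max_conn: int,
    connect_timeout: float,
    pool_timeout: Optional[float],
) -> List[Tuple[str, float]]:
    """Toy model: when does each request fail if no connect attempt ever succeeds?

    All requests are submitted at t=0. pool_timeout=None means a queued
    request may wait for a connection slot indefinitely.
    """
    # Each heap entry is the time at which one connection slot becomes free.
    slots = [0.0] * max_conn
    heapq.heapify(slots)
    results = []
    for _ in range(n_requests):
        free_at = heapq.heappop(slots)  # earliest moment a slot is available
        if pool_timeout is not None and free_at >= pool_timeout:
            # Assumption: a tie counts as a pool timeout.
            heapq.heappush(slots, free_at)  # the slot was never taken
            results.append(("pool timeout", float(pool_timeout)))
        else:
            # The request gets a slot, then its connect attempt times out.
            failed_at = free_at + connect_timeout
            heapq.heappush(slots, failed_at)  # the slot is released on failure
            results.append(("connect timeout", failed_at))
    return results


if __name__ == "__main__":
    # One timeout value for everything: the pool wait is limited to 5s as well.
    print(failure_times(4, 2, 5.0, 5.0))
    # Pool wait unlimited: queued requests get their own connect attempt.
    print(failure_times(4, 2, 5.0, None))
```

With 4 requests and 2 slots, the first call reports every request failing at 5s (two connect timeouts, two pool timeouts), while the second reports connect timeouts at 5s and 10s.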
1. Example code
The following example reproduces the problem:
import asyncio
import logging
import os

import httpx

max_conn = 2
max_keepalive = max_conn

project_root = os.path.abspath(os.path.join(os.path.dirname(__file__), "."))
assets_root = os.path.join(project_root, "assets")
cert_path = os.path.join(assets_root, "cert", "cert.pem")
TIMEOUT = 5

logging.basicConfig(level=logging.DEBUG)


async def __log_content_length__(response: httpx.Response):
    """Event hook that is called whenever a 'response' event fires."""
    # Prefer the Content-Length header when it is present.
    content_length_header = response.headers.get("Content-Length")
    if content_length_header is not None:
        # The header exists, so use it directly.
        body_length = content_length_header
    else:
        # The header is missing: compute the actual content length...
        # (that computation is elided here; a placeholder keeps the
        # log call below from raising NameError)
        body_length = "unknown"
    logging.info(
        f"<-- Received response: {response.status_code} {response.request.method} {response.url} "
        f"- Length: {body_length} bytes"
    )


async def make_req(client: httpx.AsyncClient, url):
    try:
        response = await client.get(url)
    except httpx.TimeoutException as e:
        # Covers both connect timeouts and pool timeouts.
        logging.error(f"Timeout while making request to {url}: {e}")
        return None
    return response


def main():
    limits = httpx.Limits(
        max_connections=max_conn,
        max_keepalive_connections=max_keepalive,
    )
    httpx_client = httpx.AsyncClient(
        timeout=TIMEOUT,
        limits=limits,
        event_hooks={"response": [__log_content_length__]},
        verify=False,
    )
    tasks = [make_req(httpx_client, "https://youtube.com") for _ in range(10)]

    async def runner():
        await asyncio.gather(*tasks)

    asyncio.run(runner())


if __name__ == "__main__":
    main()
2. Fix
Pass a Timeout object as timeout so that the time a request spends waiting in the pool is no longer timed:
timeout_config = httpx.Timeout(TIMEOUT, pool=None)