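When you send an HTTP request with curl_cffi and set the impersonate parameter, curl_cffi sets the User-Agent (UA) automatically based on the value of impersonate. But what if the headers we pass in also contain a UA? Which one is used?

Let's verify it.

Setting up a test server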

from fastapi import FastAPI, Request
import uvicorn


app = FastAPI()


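# Note: the clients below request /print-headers without the trailing slash;
# FastAPI redirects such requests to this route by default.
@app.get("/print-headers/")
async def print_headers(request: Request):
    # Print every request header the server received, then echo them back as JSON
    headers = dict(request.headers)
    for key, value in headers.items():
        print(f"{key}: {value}")
    return {"headers": headers}


if __name__ == "__main__":
    # 'server:app' assumes this file is saved as server.py
    uvicorn.run(
        app='server:app',
        host="0.0.0.0",
        port=8886,
        workers=1,
        reload=True
    )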

The test client

Version 1: without the impersonate parameter

from curl_cffi import requests

# Target URL
url = "http://127.0.0.1:8886/print-headers"

# Send a GET request
response = requests.get(url)

# Print the response; the body echoes the request headers the server received
print("Response status code:", response.status_code)
print("Response headers:", response.headers)
print("Response text:", response.text)

Server output:

host: 127.0.0.1:8886
accept: */*
accept-encoding: gzip, deflate, br, zstd

As you can see, no UA is sent in this case.

Version 2: with the impersonate parameter, and no UA set in the request headers

from curl_cffi import requests

# Target URL
url = "http://127.0.0.1:8886/print-headers"

# Send a GET request, impersonating Chrome 120
response = requests.get(url, impersonate='chrome120')

# Print the response; the body echoes the request headers the server received
print("Response status code:", response.status_code)
print("Response headers:", response.headers)
print("Response text:", response.text)

Server output:

host: 127.0.0.1:8886
connection: Upgrade, HTTP2-Settings
upgrade: h2c
http2-settings: AAEAAQAAAAIAAAAAAAQAYAAAAAYABAAA
sec-ch-ua: "Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "macOS"
upgrade-insecure-requests: 1
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
sec-fetch-site: none
sec-fetch-mode: navigate
sec-fetch-user: ?1
sec-fetch-dest: document
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9

Version 3: with the impersonate parameter, and a custom UA set in the request headers

from curl_cffi import requests

# Target URL
url = "http://127.0.0.1:8886/print-headers"

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36',
    'sec-ch-ua-platform': '"Windows"',
    'Referer': 'https://www.baidu.com/',
}

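# Send a GET request, impersonating Chrome 120 and passing our own headers
response = requests.get(url, impersonate='chrome120', headers=headers)

# Print the response; the body echoes the request headers the server received
print("Response status code:", response.status_code)
print("Response headers:", response.headers)
print("Response text:", response.text)

Server output: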

host: 127.0.0.1:8886
connection: Upgrade, HTTP2-Settings
upgrade: h2c
http2-settings: AAEAAQAAAAIAAAAAAAQAYAAAAAYABAAA
sec-ch-ua: "Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
upgrade-insecure-requests: 1
user-agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
sec-fetch-site: none
sec-fetch-mode: navigate
sec-fetch-user: ?1
sec-fetch-dest: document
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
referer: https://www.baidu.com/
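
One thing worth noticing in the output above: user-agent now reports Chrome/85, while sec-ch-ua still reports version 120, because we only overrode part of the header set. Below is a minimal sketch of a check for this kind of mismatch. The helper names (`chrome_major_from_ua`, `chrome_major_from_sec_ch_ua`, `ua_matches_client_hints`) are made up for this illustration and are not part of curl_cffi; the regexes only assume the Chrome-style header formats shown in the outputs above.

```python
import re
from typing import Mapping, Optional


def chrome_major_from_ua(user_agent: str) -> Optional[int]:
    """Extract the Chrome major version from a User-Agent string, if present."""
    match = re.search(r"Chrome/(\d+)", user_agent)
    return int(match.group(1)) if match else None


def chrome_major_from_sec_ch_ua(sec_ch_ua: str) -> Optional[int]:
    """Extract the "Google Chrome" version from a sec-ch-ua header, if present."""
    match = re.search(r'"Google Chrome";v="(\d+)"', sec_ch_ua)
    return int(match.group(1)) if match else None


def ua_matches_client_hints(headers: Mapping[str, str]) -> bool:
    """Return True when both headers name the same Chrome major version.

    Header names are compared case-insensitively. If either version cannot
    be parsed, the headers are treated as not matching.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    ua_version = chrome_major_from_ua(lowered.get("user-agent", ""))
    hint_version = chrome_major_from_sec_ch_ua(lowered.get("sec-ch-ua", ""))
    return ua_version is not None and ua_version == hint_version


# Header values copied from the version 2 and version 3 server outputs
SEC_CH_UA = '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"'
VERSION_2 = {
    "sec-ch-ua": SEC_CH_UA,
    "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
}
VERSION_3 = {
    "sec-ch-ua": SEC_CH_UA,
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36",
}

if __name__ == "__main__":
    print("version 2 consistent:", ua_matches_client_hints(VERSION_2))
    print("version 3 consistent:", ua_matches_client_hints(VERSION_3))
```

If you do override the UA while using impersonate, a check like this can serve as a reminder to override the related sec-ch-ua headers too.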

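Conclusion

Comparing the three outputs: the headers we pass in ourselves override the impersonate defaults of the same name (here user-agent and sec-ch-ua-platform), headers that are not among the defaults (here referer) are added, and all other impersonate defaults are kept as they are. In other words, our own headers partially override the defaults.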
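
The observed behavior can be modeled as a case-insensitive dictionary merge. The sketch below is only a model of the result seen in the test above, not curl_cffi's actual implementation; `merge_headers` is a hypothetical helper, and the defaults dictionary contains just a few of the chrome120 headers copied from the server output.

```python
from typing import Dict, Mapping


def merge_headers(defaults: Mapping[str, str],
                  overrides: Mapping[str, str]) -> Dict[str, str]:
    """Model of the observed result: user headers win, the rest are kept.

    Header names are lowercased so that 'User-Agent' replaces 'user-agent'.
    An overridden header keeps its original position; new headers are appended.
    """
    merged = {name.lower(): value for name, value in defaults.items()}
    for name, value in overrides.items():
        merged[name.lower()] = value
    return merged


# A subset of the defaults seen with impersonate='chrome120' (from the version 2 output)
IMPERSONATE_DEFAULTS = {
    "sec-ch-ua": '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
    "sec-ch-ua-platform": '"macOS"',
    "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "accept-language": "en-US,en;q=0.9",
}

# The headers passed by the version 3 client
USER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36",
    "sec-ch-ua-platform": '"Windows"',
    "Referer": "https://www.baidu.com/",
}

MERGED = merge_headers(IMPERSONATE_DEFAULTS, USER_HEADERS)

if __name__ == "__main__":
    for name, value in MERGED.items():
        print(f"{name}: {value}")
```

Running it prints the merged headers in the same relative order as the version 3 server output: the overridden headers stay in place and referer comes last.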


universe_king