Welcome to curl_cffi’s documentation!

Chinese README | Discuss on Telegram

curl_cffi is a Python binding for curl-impersonate via cffi.

Unlike other pure-Python HTTP clients like httpx or requests, curl_cffi can impersonate browsers' TLS signatures and JA3 fingerprints. If you are blocked by a website for no obvious reason, give this package a try.
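
For example, switching the fingerprint takes a single keyword argument. A minimal sketch; the JSON key names follow the tls.browserleaks.com response used in the Basic Usage example below:

from curl_cffi import requests

url = "https://tls.browserleaks.com/json"

# Without impersonation the TLS handshake is curl's own and easy to
# fingerprint; with impersonate="chrome" the reported ja3n hash
# matches a real Chrome browser.
print(requests.get(url).json()["ja3n_hash"])
print(requests.get(url, impersonate="chrome").json()["ja3n_hash"])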


Scrapfly

Scrapfly is an enterprise-grade solution providing a Web Scraping API that aims to simplify the scraping process by managing everything: real browser rendering, rotating proxies, and fingerprints (TLS, HTTP, browser) to bypass all major anti-bots. Scrapfly also unlocks observability by providing an analytical dashboard and measuring success and block rates in detail.

Scrapfly is a good choice if you are looking for a cloud-managed solution for curl_cffi. If you are managing TLS/HTTP fingerprints yourself with curl_cffi, they also maintain a tool to convert a curl command into Python curl_cffi code!


Features

  • Supports impersonation of JA3/TLS and HTTP/2 fingerprints.

  • Much faster than requests/httpx, on par with aiohttp/pycurl; see the benchmarks.

  • Mimics the requests API, so there is no need to learn another one.

  • Pre-compiled, so you don’t have to compile on your machine.

  • Supports asyncio, with proxy rotation on each request (see the asyncio example below).

  • Supports HTTP/2, which requests does not.

  • Supports websocket.

Feature matrix

               requests   aiohttp   httpx   pycurl   curl_cffi
http2          ❌         ❌        ✅      ✅       ✅
sync           ✅         ❌        ✅      ✅       ✅
async          ❌         ✅        ✅      ❌       ✅
websocket      ❌         ✅        ❌      ❌       ✅
fingerprints   ❌         ❌        ❌      ❌       ✅
speed          🐇         🐇🐇      🐇      🐇🐇     🐇🐇

Install

pip install curl_cffi --upgrade

For more details, see Install.

Basic Usage

requests-like

from curl_cffi import requests

url = "https://tls.browserleaks.com/json"

# Notice the impersonate parameter
r = requests.get(url, impersonate="chrome")

print(r.json())
# output: {..., "ja3n_hash": "aa56c057ad164ec4fdcb7a5a283be9fc", ...}
# the ja3n fingerprint should be the same as the target browser's

# http/socks proxies are supported
proxies = {"https": "http://localhost:3128"}
r = requests.get(url, impersonate="chrome", proxies=proxies)

proxies = {"https": "socks://localhost:3128"}
r = requests.get(url, impersonate="chrome", proxies=proxies)

Sessions

s = requests.Session()

# httpbin is an HTTP test website
s.get("https://httpbin.org/cookies/set/foo/bar")

print(s.cookies)
# <Cookies[<Cookie foo=bar for httpbin.org />]>

r = s.get("https://httpbin.org/cookies")
print(r.json())
# {'cookies': {'foo': 'bar'}}
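
Per-request options such as impersonate also work on session requests, so a session can keep both its cookies and a consistent browser fingerprint. A short sketch reusing the parameters shown above:

s = requests.Session()

# Cookies persist on the session, and each request can still
# impersonate a browser, keeping the fingerprint consistent.
r = s.get("https://tls.browserleaks.com/json", impersonate="chrome")
print(r.json()["ja3n_hash"])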

asyncio

import asyncio
from curl_cffi.requests import AsyncSession

async def main():
    async with AsyncSession() as s:
        return await s.get("https://example.com")

r = asyncio.run(main())

More concurrency:

import asyncio
from curl_cffi.requests import AsyncSession

urls = [
    "https://google.com/",
    "https://facebook.com/",
    "https://twitter.com/",
]

async def main():
    async with AsyncSession() as s:
        tasks = []
        for url in urls:
            tasks.append(s.get(url))  # collect coroutines; not awaited yet
        return await asyncio.gather(*tasks)

results = asyncio.run(main())
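
Because proxies is a per-request parameter, rotating proxies on each request of an async session fits naturally into the same pattern. A sketch; the proxy URLs below are placeholders for your own pool:

import asyncio
import itertools
from curl_cffi.requests import AsyncSession

# Placeholder pool; substitute your own proxy URLs.
proxy_pool = itertools.cycle([
    "http://localhost:3128",
    "http://localhost:3129",
])

async def main():
    async with AsyncSession() as s:
        tasks = [
            # a different proxy from the pool for each request
            s.get("https://example.com/", proxies={"https": next(proxy_pool)})
            for _ in range(4)
        ]
        return await asyncio.gather(*tasks)

results = asyncio.run(main())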

WebSockets

from curl_cffi.requests import Session, WebSocket

def on_message(ws: WebSocket, message):
    print(message)

with Session() as s:
    ws = s.ws_connect(
        "wss://api.gemini.com/v1/marketdata/BTCUSD",
        on_message=on_message,
    )
    ws.run_forever()  # blocks, dispatching each incoming message to on_message
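
ws_connect accepts other callbacks besides on_message. A sketch, assuming the on_open and on_error hooks with the signatures below; check the API reference of your installed version:

def on_open(ws: WebSocket):
    # Called once the connection is established; a typical place to
    # send a subscribe message on feeds that require one.
    print("connected")

def on_error(ws: WebSocket, error):
    print("error:", error)

with Session() as s:
    ws = s.ws_connect(
        "wss://api.gemini.com/v1/marketdata/BTCUSD",
        on_open=on_open,
        on_message=on_message,
        on_error=on_error,
    )
    ws.run_forever()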
