Welcome to curl_cffi’s documentation!
- Install
- Impersonate guide
- Cookies
- Advanced Usage
- Exceptions
- Compatibility with requests
- FAQ
- Why does the JA3 fingerprints change for Chrome 110+ impersonation?
- Can I bypass Cloudflare with this project? or any other specific site.
- I’m getting certs errors
- ErrCode: 77, Reason: error setting certificate verify locations
- How to use with fiddler/charles to intercept content
- ErrCode: 92, Reason: ‘HTTP/2 stream 0 was not closed cleanly: PROTOCOL_ERROR (err 1)’
- Packaging with PyInstaller
- How to set proxy?
- How to change the order of headers?
- API References
- Change Log
curl_cffi is a Python binding for the curl-impersonate fork via cffi.
Unlike pure Python HTTP clients such as httpx or requests, curl_cffi can impersonate browsers’ TLS signatures or JA3 fingerprints. If a website blocks you for no obvious reason, you can give this package a try.
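To make the term concrete, the sketch below shows how a JA3 fingerprint is conventionally derived: five fields from the TLS ClientHello are joined into one string and hashed with MD5. The numeric values are illustrative only, not a real browser handshake, and the helper names (`ja3_string`, `ja3_hash`, `ja3n_string`) are hypothetical and not part of curl_cffi. The normalized variant assumes that a "ja3n" hash is computed over the same string with the extension IDs sorted, so the hash stays stable when a client shuffles its extension order.

```python
import hashlib


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 text form: five comma-separated fields, values joined by '-'."""
    fields = [
        str(version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    return ",".join(fields)


def ja3_hash(ja3):
    """The JA3 fingerprint is the MD5 hex digest of the JA3 string."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


def ja3n_string(version, ciphers, extensions, curves, point_formats):
    """Assumed normalization: same as JA3, but with extension IDs sorted."""
    return ja3_string(version, ciphers, sorted(extensions), curves, point_formats)


# Illustrative values only: two handshakes that differ in extension order.
hello_a = (771, [4865, 4866], [0, 23, 65281], [29, 23], [0])
hello_b = (771, [4865, 4866], [65281, 0, 23], [29, 23], [0])

print(ja3_hash(ja3_string(*hello_a)) == ja3_hash(ja3_string(*hello_b)))    # False
print(ja3_hash(ja3n_string(*hello_a)) == ja3_hash(ja3n_string(*hello_b)))  # True
```

A server that compares this hash against known browser values can tell a stock HTTP client from a browser, which is the check that impersonation is meant to pass.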
Scrapfly is an enterprise-grade solution providing a Web Scraping API that aims to simplify the scraping process by managing everything: real browser rendering, rotating proxies, and fingerprints (TLS, HTTP, browser) to bypass all major anti-bots. Scrapfly also provides observability through an analytical dashboard that measures the success rate and block rate in detail.
Scrapfly is a good solution if you are looking for a cloud-managed solution for curl_cffi.
If you are managing TLS/HTTP fingerprints yourself with curl_cffi, they also maintain a curl to python converter.
Features
- Supports JA3/TLS and HTTP/2 fingerprint impersonation.
- Much faster than requests/httpx, on par with aiohttp/pycurl, see benchmarks.
- Mimics the requests API, no need to learn another one.
- Pre-compiled, so you don’t have to compile on your machine.
- Supports asyncio with proxy rotation on each request.
- Supports HTTP/2, which requests does not.
- Supports websocket.
| | requests | aiohttp | httpx | pycurl | curl_cffi |
|---|---|---|---|---|---|
| http2 | ❌ | ❌ | ✅ | ✅ | ✅ |
| sync | ✅ | ❌ | ✅ | ✅ | ✅ |
| async | ❌ | ✅ | ✅ | ❌ | ✅ |
| websocket | ❌ | ✅ | ❌ | ❌ | ✅ |
| fingerprints | ❌ | ❌ | ❌ | ❌ | ✅ |
| speed | 🐇 | 🐇🐇 | 🐇 | 🐇🐇 | 🐇🐇 |
Install
pip install curl_cffi --upgrade
For more details, see Install.
Basic Usage
requests-like
from curl_cffi import requests
url = "https://tools.scrapfly.io/api/fp/ja3"
# Notice the impersonate parameter
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome110")
print(r.json())
# output: {..., "ja3n_hash": "aa56c057ad164ec4fdcb7a5a283be9fc", ...}
# the ja3n fingerprint should be the same as the target browser's
# To keep using the latest browser version as `curl_cffi` updates,
# simply set impersonate="chrome" without specifying a version.
# Other similar values are: "safari" and "safari_ios"
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome")
# http/socks proxies are supported
proxies = {"https": "http://localhost:3128"}
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome110", proxies=proxies)
proxies = {"https": "socks://localhost:3128"}
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome110", proxies=proxies)
Sessions
s = requests.Session()
# httpbin is an HTTP test website
s.get("https://httpbin.org/cookies/set/foo/bar")
print(s.cookies)
# <Cookies[<Cookie foo=bar for httpbin.org />]>
r = s.get("https://httpbin.org/cookies")
print(r.json())
# {'cookies': {'foo': 'bar'}}
asyncio
import asyncio
from curl_cffi.requests import AsyncSession

async def main():
    async with AsyncSession() as s:
        r = await s.get("https://example.com")

asyncio.run(main())
More concurrency:
import asyncio
from curl_cffi.requests import AsyncSession

urls = [
    "https://google.com/",
    "https://facebook.com/",
    "https://twitter.com/",
]

async def main():
    async with AsyncSession() as s:
        # Create all request coroutines first, then await them together
        tasks = []
        for url in urls:
            task = s.get(url)
            tasks.append(task)
        results = await asyncio.gather(*tasks)

asyncio.run(main())
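The feature list mentions proxy rotation on each request. One way this could look is sketched below, assuming a per-request `proxies` argument like the one used in the requests-like examples. The proxy URLs, `PROXY_POOL`, `assign_proxies`, and `fetch_all` are hypothetical names introduced for illustration. curl_cffi is imported inside `fetch_all` so the pairing helper can be tried on its own.

```python
import asyncio
from itertools import cycle

# Hypothetical proxy endpoints; replace them with proxies you actually run.
PROXY_POOL = ["http://localhost:3128", "http://localhost:3129"]


def assign_proxies(urls, pool):
    """Pair each URL with a proxies dict, cycling through the pool in order."""
    rotation = cycle(pool)
    return [(url, {"https": next(rotation)}) for url in urls]


async def fetch_all(urls, pool=PROXY_POOL):
    # Imported lazily so assign_proxies() works without curl_cffi installed.
    from curl_cffi.requests import AsyncSession

    async with AsyncSession() as s:
        tasks = [
            s.get(url, impersonate="chrome", proxies=proxies)
            for url, proxies in assign_proxies(urls, pool)
        ]
        return await asyncio.gather(*tasks)


# Usage (needs network access and reachable proxies):
# results = asyncio.run(fetch_all(["https://google.com/", "https://facebook.com/"]))
```

Cycling through a fixed pool keeps the assignment deterministic. A random choice per request would work just as well if an even spread is not required.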
WebSockets
from curl_cffi.requests import Session, WebSocket

def on_message(ws: WebSocket, message):
    print(message)

with Session() as s:
    ws = s.ws_connect(
        "wss://api.gemini.com/v1/marketdata/BTCUSD",
        on_message=on_message,
    )
    ws.run_forever()
Sponsor
Click here to buy me a coffee.
Bypass Cloudflare with API
Yescaptcha is a proxy service that bypasses Cloudflare and uses the API interface to obtain verified cookies (e.g. cf_clearance). Click here to register.
ScrapeNinja
ScrapeNinja is a web scraping API with two engines: a fast one with high performance and TLS fingerprinting, and a slower one with a real browser under the hood.
ScrapeNinja handles headless browsers, proxies, timeouts, retries, and helps with data extraction, so you can just get the data in JSON. Rotating proxies are available out of the box on all subscription plans.