Rate limits & bans

The google backend is a reverse-engineered, unauthenticated endpoint. That means Google can and does block aggressive callers — by rate, by bot fingerprint, or by IP reputation. gfly ships a persistent, cross-process politeness throttle that addresses the first vector. Understanding all three will save you a lot of debugging.

An agent invokes gfly as a fresh process per call. An in-memory timer would reset on every invocation, so it could never enforce a gap between calls. Instead, gfly persists its throttle state to disk at:

$XDG_STATE_HOME/gfly/ratelimit.json # default (~/.local/state/gfly/ratelimit.json)
$GFLY_STATE_DIR/ratelimit.json # override with GFLY_STATE_DIR

The state file is keyed per-backend (google / serpapi) and is written with 0600 permissions. It tracks the timestamp of the last request, a recent-call window, the circuit-breaker blocked_until epoch, and the consecutiveBlocks counter.
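The lookup order above can be sketched in a few lines of Python. This is only an illustration of the documented paths, not gfly's actual code: the function name `ratelimit_path` is hypothetical, and the precedence (GFLY_STATE_DIR first, then XDG_STATE_HOME, then ~/.local/state) is inferred from the two path lines above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def ratelimit_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve the throttle state file location (hypothetical helper)."""
    env = os.environ if env is None else env

    # Documented override: $GFLY_STATE_DIR/ratelimit.json
    override = env.get("GFLY_STATE_DIR")
    if override:
        return Path(override) / "ratelimit.json"

    # Documented default: $XDG_STATE_HOME/gfly/ratelimit.json,
    # which is ~/.local/state/gfly/ratelimit.json when the variable is unset.
    base = env.get("XDG_STATE_HOME") or str(Path.home() / ".local" / "state")
    return Path(base) / "gfly" / "ratelimit.json"
```

A script that wants to inspect the state file directly could call `ratelimit_path()` and read the JSON from there, although `gfly doctor` is the supported way to see the same data.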

When a request would arrive too soon, gfly does not silently sleep, because a hung CLI can deadlock an agent loop. Instead it writes a structured error to stderr and exits non-zero:

{
"error": "request too soon; try again in 8s",
"code": "RATE_LIMITED",
"remediation": "wait 8s, or pass --wait to block automatically",
"retryAfterSeconds": 8
}

Your caller should read retryAfterSeconds and schedule a retry. If you want gfly to handle the wait itself, pass --wait.
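On the caller side, reading retryAfterSeconds might look like the sketch below. It assumes stderr contains only the JSON object shown above, and the helper name `retry_delay` is hypothetical.

```python
import json
from typing import Optional


def retry_delay(stderr_text: str) -> Optional[float]:
    """Return the seconds to wait before retrying, or None if not retryable.

    Assumes stderr holds exactly one JSON object shaped like the error above.
    """
    try:
        payload = json.loads(stderr_text)
    except json.JSONDecodeError:
        return None  # not a structured gfly error
    if not isinstance(payload, dict):
        return None
    delay = payload.get("retryAfterSeconds")
    if isinstance(delay, (int, float)) and not isinstance(delay, bool):
        return float(delay)
    return None
```

An agent loop can then schedule its next attempt after the returned delay instead of sleeping inside the CLI call.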

| Flag | Env var | Default | Effect |
| --- | --- | --- | --- |
| --min-interval | GFLY_MIN_INTERVAL | 12 (seconds) | Minimum gap between consecutive google requests |
| --wait | | off | Block (sleep) until the interval or cooldown clears, up to --max-wait |
| --max-wait | | 60 (seconds) | Cap on how long --wait will sleep before failing |
| --no-throttle | GFLY_NO_THROTTLE | off | Bypass the throttle entirely (risky) |

# opt into blocking sleep (gfly will wait up to 60s for the interval to clear)
gfly search JFK LHR --depart 2026-08-15 --wait
# extend the blocking cap to 3 minutes
gfly search JFK LHR --depart 2026-08-15 --wait --max-wait 180
# disable pacing (risky — increases ban probability significantly)
gfly search JFK LHR --depart 2026-08-15 --no-throttle
# set a custom interval globally via env
export GFLY_MIN_INTERVAL=20
gfly search JFK LHR --depart 2026-08-15
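
The pacing rule behind --min-interval comes down to simple arithmetic on the last-request timestamp. The sketch below is an assumption about how such a check could work, not gfly's internals; `seconds_until_allowed` is a hypothetical name.

```python
def seconds_until_allowed(last_request: float, now: float,
                          min_interval: float = 12.0) -> float:
    """Seconds remaining before another request satisfies the minimum gap.

    last_request and now are epoch seconds; 12.0 mirrors the documented default.
    """
    elapsed = now - last_request
    return max(0.0, min_interval - elapsed)
```

With the default interval, a call made 4 s after the previous one would have 8 s left to wait, which matches the "try again in 8s" example earlier on this page.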

The dates command runs one upstream request per day in the range. It logs the expected pacing upfront so you can estimate wall-clock time before committing to a wide window:

note: scanning 14 day(s) = 14 upstream request(s), paced ~12s apart (~156s total)
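
The ~156s figure in that log line is consistent with counting the gaps between requests, not the requests themselves: 14 requests have 13 gaps of 12 s each. A rough estimator under that assumption (the function name is hypothetical, and network latency is ignored):

```python
def estimate_scan_seconds(days: int, min_interval: int = 12) -> int:
    """Estimate pacing time for a date scan: one gap between each pair of requests."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return (days - 1) * min_interval
```

For example, `estimate_scan_seconds(14)` gives 156, and raising the interval to 20 s stretches the same scan to 260 s.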

When the upstream returns a 429 or CAPTCHA signal, gfly opens a circuit breaker and refuses further requests until the cooldown expires. Subsequent calls inside the cooldown short-circuit immediately (they never touch the network) and exit with code 20 (BLOCKED):

{
"error": "blocked by upstream; cooldown 300s",
"code": "BLOCKED",
"remediation": "wait for cooldown, use --proxy, or --backend serpapi",
"retryAfterSeconds": 300
}

The cooldown escalates with each consecutive block, following this fixed schedule:

| Block # | Cooldown |
| --- | --- |
| 1st | 30 s |
| 2nd | 60 s |
| 3rd | 120 s |
| 4th | 300 s (5 min) |
| 5th | 600 s (10 min) |
| 6th+ | 1800 s (30 min) |

A clean successful response resets the counter back to zero.
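
The schedule can be expressed as a clamped table lookup. This is a sketch of the documented values only; `COOLDOWN_SCHEDULE` and `cooldown_for` are hypothetical names, not identifiers taken from gfly.

```python
# Cooldowns in seconds for the 1st through 6th consecutive block (from the table above).
COOLDOWN_SCHEDULE = (30, 60, 120, 300, 600, 1800)


def cooldown_for(consecutive_blocks: int) -> int:
    """Return the cooldown for the given consecutive-block count.

    Counts beyond the end of the schedule stay at the last entry (30 min);
    a count of zero means no block is active.
    """
    if consecutive_blocks < 1:
        return 0
    index = min(consecutive_blocks, len(COOLDOWN_SCHEDULE)) - 1
    return COOLDOWN_SCHEDULE[index]
```

A caller that tracks consecutiveBlocks from `gfly doctor` can use a lookup like this to predict how long the next block would cost.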

doctor exposes a live snapshot of the throttle without touching the upstream:

gfly doctor --json | jq .throttle
{
"backend": "google",
"lastRequest": 1750000000.0,
"blocked": false,
"blockedUntil": null,
"cooldownSeconds": 0,
"consecutiveBlocks": 0
}

If blocked is true, cooldownSeconds tells you exactly how long to wait before the circuit clears. The schema command includes the same snapshot under the throttle key.
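
A caller could check this snapshot before issuing a search. The sketch below parses the output of `gfly doctor --json` and assumes the `throttle` object has the fields shown above; `wait_from_snapshot` is a hypothetical helper.

```python
import json


def wait_from_snapshot(doctor_json: str) -> float:
    """Return how many seconds to wait before the circuit clears (0 if not blocked).

    Expects the full JSON text printed by `gfly doctor --json`.
    """
    throttle = json.loads(doctor_json)["throttle"]
    if throttle.get("blocked"):
        return float(throttle.get("cooldownSeconds", 0))
    return 0.0
```

Because doctor does not touch the upstream, polling it this way does not count against the throttle.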

Politeness helps with rate only. Google uses three distinct signals to identify and block scrapers:

1. Request rate (addressed by the throttle)

Too many requests in a short window lead to a 429 or a soft CAPTCHA. The throttle directly addresses this. Keep --min-interval at 12 s or higher for sustained workloads.

2. Bot fingerprint (not addressable by rate alone)

Google inspects TLS/HTTP2 fingerprints, header ordering, and other request signatures. A sufficiently distinctive fingerprint can trigger a CAPTCHA on the very first request regardless of rate. This is a property of the fast-flights library’s request shape, not of how frequently you call it.

3. IP reputation (not addressable by rate alone)

Datacenter and cloud IPs (AWS, GCP, Azure, Hetzner, etc.) are pre-flagged in Google’s IP reputation databases. Because the endpoint is unauthenticated, the block applies to the IP address itself: requests from these ranges may be rejected before any request completes.

--proxy (or GFLY_PROXY) passes an HTTP(S) proxy URL to the google backend. A residential proxy with a clean IP reputation can bypass IP-based blocks:

gfly search JFK LHR --depart 2026-08-15 --proxy http://user:pass@residential-proxy:8080

The proxy flag has no effect on the serpapi backend.

--backend serpapi routes requests through SerpApi, which handles fingerprinting, IP reputation, and CAPTCHA solving on their infrastructure. It requires an API key (see Authentication) but is the reliable escape hatch when the google backend is blocked:

export GFLY_SERPAPI_KEY=your_key_here
gfly search JFK LHR --depart 2026-08-15 --backend serpapi

SerpApi is throttle-exempt in gfly, but SerpApi itself enforces per-plan quota limits (HTTP 429 → exit code 7 with retryAfterSeconds: 60).
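
One way an agent might combine the two escape hatches is to rebuild the command line after a BLOCKED error. The sketch below uses only the documented --backend and --proxy flags; the helper name `fallback_args` and the preference order (SerpApi first, then proxy) are assumptions, not gfly behavior.

```python
from typing import List, Optional


def fallback_args(args: List[str], error_code: str,
                  proxy: Optional[str] = None,
                  serpapi_key_set: bool = False) -> Optional[List[str]]:
    """Return a retry command for a BLOCKED error, or None if the caller should wait.

    serpapi_key_set: whether GFLY_SERPAPI_KEY is exported in the environment.
    """
    if error_code != "BLOCKED":
        return None  # other errors are handled elsewhere (e.g. retryAfterSeconds)
    if serpapi_key_set:
        return [*args, "--backend", "serpapi"]
    if proxy:
        return [*args, "--proxy", proxy]
    return None  # no escape hatch available: wait out the cooldown
```

Returning None keeps the decision to wait explicit, so the caller can fall back to the retryAfterSeconds value from the error payload.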

| Problem | Fix |
| --- | --- |
| Too-soon error (RATE_LIMITED) | Use --wait, increase --min-interval, or schedule retries with retryAfterSeconds |
| Circuit breaker open (BLOCKED) | Wait out cooldownSeconds, use --proxy, or switch to --backend serpapi |
| CAPTCHA on first request | Switch to --backend serpapi or use a residential --proxy |
| Datacenter IP blocked | Switch to --backend serpapi or use a residential --proxy |

See also: Backends · Authentication · Exit codes