ProxyOmega
For Quants, Research Funds & Compliance

SEC EDGAR Scraping Proxies for 10-K, 13F and Insider Filings

SEC EDGAR is public, but production-scale filing pipelines still bottleneck on the 10 req/s per-IP fair-use rate. Tier-1 US residential proxies spread load across a rotating pool so a thousand workers can index 10-K, 10-Q, 13F-HR, Form 4 and 8-K filings in parallel — each IP individually within the SEC's policy. Sticky sessions hold a single extraction cursor on one IP when the pipeline requires it.

Why production EDGAR pipelines need rotating US residential

EDGAR's fair-use policy is generous on a per-IP basis but unforgiving in aggregate. At 10 req/s, a single-IP crawler tops out at about 36,000 requests per hour (10 × 3,600), and a single filing often takes more than one request to fetch. Most quant pipelines need 10x to 100x that to keep a real-time signal.
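The arithmetic behind the single-IP ceiling, and the pool size needed for a 10x or 100x target, can be sketched as follows. The function names are illustrative, and the only input taken from this page is the 10 req/s per-IP rate.

```python
import math

PER_IP_RATE = 10  # requests per second per IP, the fair-use figure quoted above


def requests_per_hour(ips: int, rate: int = PER_IP_RATE) -> int:
    """Aggregate hourly request ceiling for a pool of `ips` addresses."""
    return ips * rate * 3600


def ips_needed(target_per_hour: int, rate: int = PER_IP_RATE) -> int:
    """Smallest pool size whose per-IP ceilings add up to the target."""
    return math.ceil(target_per_hour / (rate * 3600))


single = requests_per_hour(1)
print(single)                    # 36000
print(ips_needed(10 * single))   # 10
print(ips_needed(100 * single))  # 100
```

These are request counts, not filing counts. A pipeline that needs several requests per filing should divide accordingly.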

10 req/s per-IP ceiling

SEC publishes a 10 req/s per-IP fair-use rate. A rotating residential pool turns that into 10 req/s × number-of-IPs-in-rotation, so a 1,000-worker pipeline stays compliant per-IP while indexing at production throughput.
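Staying within the per-IP rate is the client's job, not the pool's. A minimal single-threaded sketch of a per-key limiter follows, assuming each key stands for one exit IP or sticky session. The class name and the injectable `clock` and `sleep` arguments are illustrative, not part of any ProxyOmega or SEC API.

```python
import time


class PerIPLimiter:
    """Spaces out requests so each key stays at or below `rate` requests per second."""

    def __init__(self, rate: float = 10.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / rate
        self.clock = clock
        self.sleep = sleep
        self.next_slot = {}  # key -> earliest time the next request may start

    def wait(self, key: str) -> float:
        """Block until `key` may send again and return the seconds slept."""
        now = self.clock()
        slot = self.next_slot.get(key, now)
        delay = max(0.0, slot - now)
        if delay:
            self.sleep(delay)
        # Reserve the following slot one interval after this request's slot.
        self.next_slot[key] = max(slot, now) + self.interval
        return delay
```

Call `limiter.wait(session_key)` immediately before each request. This version is not thread-safe; a multi-threaded crawler would need a lock around `wait`.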

US-ASN routing

EDGAR is hosted on US infrastructure that prioritises US-originating traffic. Tier-1 US residential ASNs give consistent latency and avoid the peak-window soft-throttle some non-US ranges hit.

User-Agent compliance

The SEC requires a descriptive User-Agent with a contact email on every request. Our proxies are transparent to that header — we do not strip or rewrite it, so your pipeline stays compliant end-to-end.
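Because the header passes through unchanged, it is worth validating it once in the pipeline before any request goes out. A small sketch, assuming a "name plus contact email" layout; the helper name and the deliberately loose email check are illustrative.

```python
import re

# Loose shape check only: something@something.tld with no spaces.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def sec_headers(org: str, contact_email: str) -> dict:
    """Build request headers with a descriptive User-Agent that carries a contact email."""
    org = org.strip()
    if not org:
        raise ValueError("organisation name must not be empty")
    if not EMAIL_RE.fullmatch(contact_email):
        raise ValueError(f"not a usable contact email: {contact_email!r}")
    return {"User-Agent": f"{org} {contact_email}"}


print(sec_headers("Quant Research", "ops@example.com"))
```

The returned dict can be passed as `headers=` to an HTTP client such as httpx or requests.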

Sticky for paginated extracts

Full-text search and large-index pagination break if the IP rotates mid-cursor. Sticky sessions up to 60 hours hold one IP per extraction job.
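A long extraction job should know how much of the 60-hour ceiling it has left so it can checkpoint its cursor before the IP changes. The sketch below tracks that on the client side. `StickyLease` and its fields are hypothetical names, and how a session label is passed to the proxy is provider-specific and not shown here.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

STICKY_CEILING = timedelta(hours=60)  # sticky session ceiling quoted above


@dataclass
class StickyLease:
    job_id: str
    started_at: datetime

    @property
    def session_id(self) -> str:
        """Stable, opaque label derived from the job id (hypothetical convention)."""
        return hashlib.sha256(self.job_id.encode()).hexdigest()[:12]

    def remaining(self, now: datetime) -> timedelta:
        """Time left before the ceiling, never negative."""
        return max(timedelta(0), self.started_at + STICKY_CEILING - now)

    def must_checkpoint(self, now: datetime, margin: timedelta = timedelta(minutes=30)) -> bool:
        """True once the job is within `margin` of the ceiling."""
        return self.remaining(now) <= margin


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
lease = StickyLease("13f-backfill-0001067983", start)
print(lease.remaining(start + timedelta(hours=59)))  # 1:00:00
```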

XBRL backfill bandwidth

Multi-year, all-issuer XBRL backfills run into the terabytes. Pair Platinum for the live pipeline with Budget Unlimited for the historical backfill to keep the bill bounded.

13F filing-window bursts

Form 13F-HR is due within 45 days of quarter end, and institutional filings concentrate into the last few days before that deadline. Burst capacity of up to 10,000 concurrent connections covers the whole window without queueing.
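To schedule burst capacity, a pipeline can compute the nominal deadline as quarter end plus 45 days. This is a sketch only: it ignores any adjustment for weekends or holidays, and the five-day lead for the burst window is an illustrative assumption, not a figure from the SEC or from this page.

```python
from datetime import date, timedelta

QUARTER_ENDS = {1: (3, 31), 2: (6, 30), 3: (9, 30), 4: (12, 31)}


def nominal_13f_deadline(year: int, quarter: int) -> date:
    """Quarter-end date plus 45 calendar days (no weekend or holiday adjustment)."""
    month, day = QUARTER_ENDS[quarter]
    return date(year, month, day) + timedelta(days=45)


def burst_window(year: int, quarter: int, lead_days: int = 5) -> tuple:
    """Dates between which to provision extra workers (lead_days is an assumption)."""
    deadline = nominal_13f_deadline(year, quarter)
    return deadline - timedelta(days=lead_days), deadline


print(nominal_13f_deadline(2023, 4))  # 2024-02-14
print(burst_window(2024, 1))
```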

What pipelines this fits

Pipeline | Why Platinum fits
13F-HR institutional holdings tracker | Burst capacity during the 45-day post-quarter window, sticky sessions per fund cursor
Form 4 insider transactions | Real-time pull from EDGAR full-text search with US-ASN routing
10-K and 10-Q NLP pipelines | Bulk filing download with sticky session per ticker cursor
8-K material-event monitor | High-frequency polling against the daily filing index
S-1 IPO pipeline analytics | Sustained crawl of the prospectus archive
XBRL financial statement backfill | Bandwidth-heavy historical archive; pair with Budget Unlimited
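For the 8-K monitor, the polling target is the daily index file for the current date. The helper below builds that URL under the assumption that daily form indexes live at `/Archives/edgar/daily-index/<year>/QTR<n>/form.<YYYYMMDD>.idx`. Verify the layout against the live archive before relying on it.

```python
from datetime import date

# Assumed path layout for EDGAR daily form indexes; confirm before production use.
DAILY_INDEX_ROOT = "https://www.sec.gov/Archives/edgar/daily-index"


def daily_form_index_url(day: date) -> str:
    """URL of the form-sorted daily index for `day`, under the assumed layout."""
    quarter = (day.month - 1) // 3 + 1
    return f"{DAILY_INDEX_ROOT}/{day.year}/QTR{quarter}/form.{day:%Y%m%d}.idx"


print(daily_form_index_url(date(2024, 2, 14)))
```

Index files are typically absent for weekends and holidays, so a poller should treat a 404 for those dates as expected rather than as a failure.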

Recommended plan

Budget Unlimited Residential — from $51.99/mo

Add it alongside Platinum when running a one-off multi-terabyte XBRL or filing-archive backfill where bandwidth dominates the bill.
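A rough cost comparison shows why bandwidth-heavy backfills favour the flat plan. The figures are the ones quoted on this page ($3 per GB on Platinum, $1.70 per GB at 1,000 GB, Budget Unlimited from $51.99/mo). The sketch assumes Budget Unlimited carries no per-GB charge and that the backfill fits in one billing month; neither is confirmed here.

```python
PLATINUM_PER_GB = 3.00          # list price quoted on this page
PLATINUM_VOLUME_PER_GB = 1.70   # quoted volume price at 1,000 GB
BUDGET_UNLIMITED_MONTHLY = 51.99  # "from" price; assumed flat with no per-GB fee


def metered_cost(gb: float, per_gb: float) -> float:
    """Cost of moving `gb` gigabytes at a per-GB rate."""
    return round(gb * per_gb, 2)


def breakeven_gb(flat_monthly: float, per_gb: float) -> float:
    """Monthly volume above which the flat plan is cheaper than the metered one."""
    return round(flat_monthly / per_gb, 2)


backfill_gb = 2000  # illustrative 2 TB backfill
print(metered_cost(backfill_gb, PLATINUM_PER_GB))         # 6000.0
print(metered_cost(backfill_gb, PLATINUM_VOLUME_PER_GB))  # 3400.0
print(breakeven_gb(BUDGET_UNLIMITED_MONTHLY, PLATINUM_PER_GB))  # 17.33
```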

Integration — Python EDGAR crawler with US-ASN routing

# Python — SEC EDGAR 13F crawler via ProxyOmega Platinum US
import httpx

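UA = "Quant Research you@example.com"  # SEC requires a descriptive UA with a contact email; use your own address
proxy = "http://YOUR_USER-cc-us-asn-comcast:YOUR_PASS@PROXY_HOST:20228"  # YOUR_PASS and PROXY_HOST are placeholders; use your account values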

with httpx.Client(proxy=proxy, headers={"User-Agent": UA}, timeout=30) as c:
    # 13F-HR filings for a CIK
    r = c.get(
        "https://www.sec.gov/cgi-bin/browse-edgar",
        params={"action": "getcompany", "CIK": "0001067983",
                "type": "13F-HR", "dateb": "", "owner": "include",
                "count": "40"},
    )
    print(r.status_code, len(r.text))

The numbers

10/s per IP, SEC fair-use compliant
90M+ rotating US-routable IPs
60 hr sticky session ceiling
10k concurrent connections
ASN-level US routing
$3 per GB on Platinum

Frequently asked questions

Do I really need proxies for SEC EDGAR? It is public data.

EDGAR is public, but the SEC enforces 10 req/s per-IP fair-use and requires a descriptive User-Agent. A production pipeline indexing tens of thousands of filings per day hits that ceiling from a single IP. A residential pool spreads load while staying within policy on every individual IP.

What can I scrape on EDGAR?

Form 10-K and 10-Q periodic reports, 8-K material events, 13F-HR institutional holdings, Form 4 insider transactions, S-1 prospectuses, DEF 14A proxies, full filing indexes, company tickers/CIK maps, XBRL financial statement data and the full-text search across all filings since 2001.

Why US-ASN routing for SEC?

EDGAR is hosted on US infrastructure that prioritises US-originating traffic. Tier-1 US residential ASNs give consistent latency and avoid the peak-window soft-throttle some non-US ranges hit.

How fast can I crawl without violating SEC fair-use?

10 req/s per IP. With 90M+ IPs in rotation, you can run 1,000+ concurrent workers each at 10 req/s and stay compliant on a per-IP basis. Sticky sessions are available when an extraction job needs one cursor on one IP.

Do these work with sec-edgar, edgar-tools and Scrapy?

Yes. Standard HTTP/HTTPS/SOCKS5 with user:pass or IP-whitelist auth. Compatible with sec-edgar, edgar-tools, sec-parsers, Scrapy, requests, httpx and any framework that takes a proxy URL.
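Most of these clients take the same proxy URL in a slightly different shape. The helpers below build the URL and the two common shapes: the `proxies` mapping that requests accepts and the `proxy` key that Scrapy reads from `Request.meta`. The host, port and credentials are placeholders, and the helper names are illustrative.

```python
from urllib.parse import quote


def proxy_url(user: str, password: str, host: str, port: int, scheme: str = "http") -> str:
    """Build a proxy URL, percent-encoding credentials so characters like '@' are safe."""
    return f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


def requests_proxies(url: str) -> dict:
    """Mapping for requests, e.g. requests.get(target, proxies=requests_proxies(url))."""
    return {"http": url, "https": url}


def scrapy_meta(url: str) -> dict:
    """Meta dict for Scrapy, e.g. scrapy.Request(target, meta=scrapy_meta(url))."""
    return {"proxy": url}


# Placeholder values; substitute your own gateway host and credentials.
url = proxy_url("YOUR_USER-cc-us", "p@ss", "proxy.example.com", 20228)
print(url)  # http://YOUR_USER-cc-us:p%40ss@proxy.example.com:20228
```

For httpx, the same URL string is passed directly as the `proxy=` argument, as in the integration example on this page.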

Is XBRL and Financial Statement Data Set crawling supported?

Yes. The Financial Statement Data Sets ZIPs and per-company XBRL endpoints work through the same proxy URL. For multi-year, all-issuer historical backfills, pair Platinum with Budget Unlimited to bound the bandwidth bill.

Related use cases

Amazon Price Monitoring

Retailer pricing data to pair with retailer 10-K analysis.

SERP Scraping

Search ranking data for sentiment and discovery signals.

Ecommerce Scraping

Public retailer data for cross-validation with filings.

Start now

$5 trial credit. Pay-as-you-go after. Volume pricing on Platinum down to $1.70/GB at 1,000 GB.