How to Detect & Block Scrapers on Shopify (4-Step Setup)

A scraper is not a vulnerability — it is a parasite. Bots that crawl your Shopify product catalog cost you bandwidth, dirty your analytics, inflate ad costs, and hand your work to competitors at zero cost to them. The good news: scrapers leave a clear fingerprint and can be blocked with high accuracy. This guide covers how to detect scrapers on Shopify, the categories that hurt merchants most, and the 4-step blocking setup that catches them without breaking real customer experience.

The four types of scraper hitting Shopify stores in 2026

  • Price scrapers. Competitors monitor your prices, often hourly. Pure intelligence-gathering — no direct fraud.
  • Catalog cloners. Dropshippers and AI-driven storefronts copy your entire catalog (titles, descriptions, images, variants) to populate fake stores.
  • AI training scrapers. LLM training pipelines slurp e-commerce sites for product data, descriptions, and review content.
  • SEO competitor tools. Ahrefs, Semrush, and similar tools crawl your store to index keyword data. Less harmful, but they still consume bandwidth.

The first three are the ones you want to block. The fourth (SEO tools) you may want to allow — they crawl politely and the data they produce is useful for your own SEO research.

How to detect scrapers on your Shopify store

  • Page hit volume per IP. No human visits 200 product pages in 10 minutes. A scraper does.
  • User-agent patterns. python-requests, curl, scrapy, Go-http-client, fasthttp — all dead giveaways.
  • Missing browser headers. Real browsers send Accept-Language and Accept-Encoding, and usually Referer when navigating within the site. Scrapers often skip them.
  • Datacenter origin. Scrapers typically run on cloud hosts such as Amazon AWS, DigitalOcean, and OVH. Storefront traffic from these ASNs is almost never a real customer.
  • Behavioral patterns. Scrapers crawl in alphabetical or sequential order. Humans browse semi-randomly.
  • No conversions. A “visitor” who hit 50 pages, never added to cart, and never returned is almost certainly a bot.

Your Shopify native analytics will not surface these signals clearly. You need a security app with traffic-level logging — ShopFence Plus includes a Threat Dashboard that shows each of these per visitor.
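If you want to see how the signals above fit together, here is a minimal Python sketch that checks one visitor against several of them. It assumes you already have per-visitor data from your own request logs. The VisitorSummary record, its field names, and the signal labels are hypothetical, and the thresholds simply reuse the numbers from the list (200 pages in 10 minutes, 50 pages with no cart activity). It is an illustration, not how any particular security app works.

```python
from dataclasses import dataclass

# Substrings that commonly appear in scripted HTTP clients (from the list above).
BOT_UA_TOKENS = ("python-requests", "curl", "scrapy", "go-http-client", "fasthttp")
# Headers a mainstream browser normally sends on a page request.
EXPECTED_HEADERS = ("accept-language", "accept-encoding")


@dataclass
class VisitorSummary:
    """Hypothetical per-visitor record assembled from your own request logs."""
    user_agent: str
    header_names: set          # lower-cased header names seen on requests
    pages_last_10_min: int
    from_datacenter_asn: bool
    added_to_cart: bool


def scraper_signals(v: VisitorSummary) -> list:
    """Return the names of the detection signals this visitor trips."""
    signals = []
    if v.pages_last_10_min >= 200:
        signals.append("high_volume")
    ua = v.user_agent.lower()
    if any(token in ua for token in BOT_UA_TOKENS):
        signals.append("bot_user_agent")
    if any(h not in v.header_names for h in EXPECTED_HEADERS):
        signals.append("missing_headers")
    if v.from_datacenter_asn:
        signals.append("datacenter_origin")
    if v.pages_last_10_min >= 50 and not v.added_to_cart:
        signals.append("no_conversion")
    return signals
```

A visitor that trips several signals at once is a much stronger candidate for blocking than one that trips a single signal, which is why the setup below layers the checks.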

The 4-step scraper blocking setup

Step 1: Block known bad user agents

The least sophisticated scrapers send a default library user agent such as “python-requests/2.31”. Block this entire category. A good security app maintains an updated list of known bot user agents. This catches ~30% of scraper traffic with essentially no false positives, since real customers do not browse with these clients.
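As a rough illustration of what a user-agent filter does, the Python sketch below matches the User-Agent header against a short token list. The tokens come from the user agents named in this guide. The function name and the choice to treat an empty User-Agent as automated are assumptions made for the example.

```python
import re
from typing import Optional

# Tokens drawn from the user agents named in this guide; a real list needs regular updates.
BLOCKED_UA_TOKENS = [
    "python-requests", "curl", "wget", "go-http-client",
    "scrapy", "fasthttp", "httpie",
]

# One case-insensitive pattern that matches any token anywhere in the header value.
_BLOCKED_UA_RE = re.compile(
    "|".join(re.escape(token) for token in BLOCKED_UA_TOKENS), re.IGNORECASE
)


def is_blocked_user_agent(user_agent: Optional[str]) -> bool:
    """True if the User-Agent is missing, empty, or contains a blocked token."""
    if not user_agent:
        return True  # assumption: treat a missing User-Agent as automated
    return _BLOCKED_UA_RE.search(user_agent) is not None
```

Substring matching is deliberately loose so that version bumps (2.31, 2.32, and so on) do not slip past the filter.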

Step 2: Rate-limit per IP

Set a per-IP page-view limit (e.g., 30 pages per minute). Humans never hit it. Scrapers using a single IP do. Catches another ~30%.
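One common way to implement a per-IP limit is a sliding window. The Python sketch below is an in-memory illustration only: the class name is hypothetical, the defaults mirror the 30-pages-per-minute example, and a production setup would need shared storage across servers plus cleanup of idle IPs.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `limit` page views per IP within a rolling `window` seconds."""

    def __init__(self, limit: int = 30, window: float = 60.0):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # ip -> timestamps of recent page views

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        """Record a page view and return False once the IP is over the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Calling limiter.allow(ip) on every page view returns True for normal browsing and False once a single IP exceeds the limit inside the window. Rejected requests are not recorded, so a blocked IP recovers as soon as its older hits age out.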

Step 3: Block datacenter ASNs

Traffic from Amazon AWS, Google Cloud, DigitalOcean, and OVH ASNs is overwhelmingly automated. Block them outright on storefront pages (but allow them on API endpoints if you have legitimate B2B integrations). Catches another ~25%.
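In practice an ASN block comes down to checking the visitor IP against a list of network ranges. The Python sketch below shows the idea with the standard ipaddress module. The ranges are placeholders (reserved documentation blocks), not real datacenter prefixes, and the "/api/" exemption is a hypothetical path standing in for whatever endpoints your B2B integrations use.

```python
import ipaddress

# PLACEHOLDER ranges (RFC 5737 documentation blocks). Real datacenter prefixes
# would be loaded from an ASN database or the cloud providers' published lists.
DATACENTER_NETWORKS = [
    ipaddress.ip_network(cidr) for cidr in ("192.0.2.0/24", "198.51.100.0/24")
]
# Hypothetical path prefix for B2B integrations that should stay reachable.
API_PREFIXES = ("/api/",)


def should_block(ip: str, path: str) -> bool:
    """Block datacenter IPs on storefront pages, but not on exempt API paths."""
    if path.startswith(API_PREFIXES):
        return False
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in DATACENTER_NETWORKS)
```

Checking the path first keeps legitimate integrations working even when they originate from a cloud host.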

Step 4: Behavioral detection

The remaining ~15% of scrapers use real browsers (Puppeteer, Playwright) routed through residential proxies, which defeats steps 1-3. The only way to catch them is behavioral fingerprinting that detects unnatural mouse movement, missing focus events, and JavaScript timing patterns. This requires a security app with browser fingerprinting (ShopFence Plus).
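To make the idea concrete, here is a simplified Python sketch of the server-side half of behavioral detection. It assumes a client-side script (not shown) has already collected mouse positions, a focus-event count, and the gaps between input events. The function names, flag labels, and thresholds are all hypothetical, and real fingerprinting products use far more signals than this.

```python
from statistics import pstdev


def _is_straight_line(points):
    """True if every (x, y) sample lies on one line through the first point."""
    x0, y0 = points[0]
    direction = next(
        ((x - x0, y - y0) for x, y in points if (x, y) != (x0, y0)), None
    )
    if direction is None:
        return True  # the pointer never moved
    dx, dy = direction
    # The cross product is zero when a point lies on the line through (x0, y0).
    return all(dx * (y - y0) - dy * (x - x0) == 0 for x, y in points)


def behavior_flags(mouse_points, focus_events, event_gaps_ms):
    """Return hypothetical flag names for input that looks machine-generated."""
    flags = []
    if len(mouse_points) < 3:
        flags.append("no_mouse_movement")
    elif _is_straight_line(mouse_points):
        flags.append("linear_mouse_path")
    if focus_events == 0:
        flags.append("no_focus_events")
    # Near-identical gaps between events suggest a scripted loop (threshold is illustrative).
    if len(event_gaps_ms) >= 5 and pstdev(event_gaps_ms) < 5.0:
        flags.append("uniform_timing")
    return flags
```

Human input is noisy: pointer paths curve, windows gain and lose focus, and the gaps between actions vary widely. A session that trips several of these flags at once looks scripted even when the browser and IP address look legitimate.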

Combine all four and you catch 95%+ of scrapers without affecting real visitors.

What about allowing Google, Bing, Facebook bots?

Whitelist legitimate crawlers explicitly. Standard whitelist:

  • Googlebot (verifiable via a reverse DNS lookup that resolves to a googlebot.com or google.com hostname)
  • Bingbot
  • Facebookexternalhit (Open Graph preview)
  • Twitterbot
  • LinkedInBot
  • Slackbot
  • WhatsApp (link previews)

Always whitelist by reverse DNS lookup, not just user agent — attackers spoof user agents trivially. ShopFence does this verification automatically.
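For readers who want to see what the verification involves, here is a minimal Python sketch of forward-confirmed reverse DNS using the standard socket module. The function name and the injectable lookup parameters are choices made for this example. The suffix list covers Googlebot only, and socket.gethostbyname_ex resolves IPv4 addresses only, so treat this as an illustration rather than a complete verifier.

```python
import socket

# Hostname suffixes accepted for Googlebot in this sketch.
GOOGLEBOT_SUFFIXES = (".googlebot.com", ".google.com")


def verify_crawler(ip, allowed_suffixes,
                   reverse_lookup=socket.gethostbyaddr,
                   forward_lookup=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS.

    1. Resolve the IP to a hostname and check its domain suffix.
    2. Resolve that hostname back to IPs and confirm the original IP is among them.
    """
    try:
        hostname = reverse_lookup(ip)[0]
        if not hostname.lower().endswith(allowed_suffixes):
            return False
        return ip in forward_lookup(hostname)[2]
    except OSError:
        # Covers failed lookups (socket.herror and socket.gaierror are OSError subclasses).
        return False
```

The second lookup matters because anyone who controls the reverse DNS for their own IP range can make it claim a Google hostname. Only the forward lookup, which is answered by the real domain owner, confirms the claim.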

Frequently asked questions

How do I detect scrapers on my Shopify store?

Watch for high page-hit volume per IP, suspicious user agents (python-requests, curl, scrapy), datacenter ASN origin, and zero-conversion behavior. ShopFence Plus automates the detection.

Can I block all scrapers on Shopify?

You can block 95%+ with a 4-layer setup (user-agent filter + rate limit + ASN block + behavioral detection). The remaining ~5% (sophisticated scrapers using real browsers and residential proxies) are hard to stop without potentially blocking real customers — accept some leakage.

Will scraper blocking hurt my Shopify SEO?

Only if you accidentally block legitimate search engine crawlers. Whitelist Googlebot, Bingbot, and other major bots by reverse DNS verification — ShopFence does this by default.

What user agents should I block on Shopify?

python-requests, curl, wget, Go-http-client, scrapy, fasthttp, java-http-client, httpie, and HeadlessChrome (unless you have an explicit reason to allow it). A maintained block list is essential, because new scraper user agents appear monthly.

Can scrapers bypass IP blocking?

Sophisticated ones, yes — by rotating through residential proxies. Layer IP blocking with behavioral detection and VPN/proxy detection to catch the rotators.

Set up scraper blocking today

For most stores: install ShopFence Plus, turn on bot detection + rate limiting + datacenter ASN block. Monitor the Threat Dashboard for a week — you will see exactly what was hitting your store. Read our bot attacks defense guide for the broader picture and the complete 2026 security guide for everything else.