Shopify Fraud Scoring: How AI Rules Protect Your Store

A single signal is not enough to stop fraud. A VPN does not make someone a criminal. A disposable email does not prove a bot is running card tests. Fast typing does not mean a checkout is automated.

But a VPN combined with a disposable email, instant form completion, a device fingerprint seen on three other failed checkouts today, and a billing address that does not match the card — that is a different picture entirely.

This is the core insight behind fraud scoring. Instead of making binary yes/no decisions based on individual signals, a scoring system evaluates every checkout across dozens of rules simultaneously, assigns weighted points for each risk factor it detects, and produces a composite score that reflects the overall probability of fraud.

ShieldFlow uses 28 rules organized into six groups to score every checkout that passes through your store. This article explains how the scoring works, why cumulative multi-signal analysis catches attacks that single-signal systems miss, and how you can tune the system for your specific traffic patterns.

What Is Fraud Scoring?

Fraud scoring assigns a numerical risk value — 0 to 100 — to every checkout event. A score of 0 means no risk factors were detected. A score of 100 means the checkout triggered multiple high-confidence fraud indicators.

The score is not a guess. It is the sum of weighted contributions from individual detection rules, each of which examines a specific aspect of the checkout: the device fingerprint, the email address, the behavioral signals, the network origin, the velocity patterns, and the payment data.

Every rule runs independently. Each rule that fires adds its weighted score to the total. The total determines the verdict: allow the checkout, warn the customer, or block the transaction entirely.

This approach has three major advantages over traditional fraud detection:

No single built-in rule can block a checkout on its own. A customer using a VPN gets a few points, not a block. Only when multiple risk signals converge does the score reach the blocking threshold.
Sophisticated attacks cannot evade detection by spoofing one signal. A bot that rotates IP addresses still gets caught by fingerprint clustering, behavioral analysis, and velocity detection.
Merchants can tune sensitivity. Adjusting rule weights and verdict thresholds lets you calibrate fraud prevention for your specific store, customer base, and risk tolerance.

How Rules Contribute Scores: Cumulative With Diminishing Returns

Not all rules carry equal weight. A device fingerprint linked to 50 failed checkouts in the last hour is a much stronger fraud signal than a billing-shipping address mismatch.

ShieldFlow uses a weighted scoring model with diminishing returns to prevent score inflation from correlated signals.

Primary and Secondary Scoring

When a rule fires, its contribution to the total score depends on whether it is the first rule in its group to fire or a subsequent one:

Primary rule (first in group to fire): contributes its full weighted score
Secondary rules (additional rules in same group): contribute 50% of their weighted score

This diminishing returns mechanism exists because rules within the same group often detect overlapping aspects of the same underlying fraud signal. If a checkout comes from a known datacenter IP through a VPN with proxy headers, those are three network-layer signals that all point to the same thing: the visitor is masking their real location. Without diminishing returns, three correlated network signals could push the score past the blocking threshold on network evidence alone — even if every other signal (device, behavior, email, velocity) looks clean.

With diminishing returns, the first network rule fires at full weight. The second and third contribute half weight. The combined network signal is still substantial, but it leaves room for cross-group signals to meaningfully influence the verdict.
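
The arithmetic for a single group can be sketched as follows. This is a minimal illustration, not ShieldFlow's actual code: the function name `group_score` and the three network weights are assumptions chosen for the example, and the "first rule to fire" is taken to be the first weight in the list.

```python
def group_score(weights, secondary_factor=0.5):
    """Score for one rule group.

    weights: weights of the rules that fired in this group, in firing order.
    The first rule counts in full (primary); every later rule in the same
    group counts at secondary_factor (50% in the model described above).
    """
    if not weights:
        return 0.0
    primary, *secondary = weights
    return primary + secondary_factor * sum(secondary)


# Hypothetical weights: datacenter IP, VPN, proxy headers.
network_weights = [15, 12, 10]

without_dr = sum(network_weights)        # plain sum: 37
with_dr = group_score(network_weights)   # 15 + 0.5 * (12 + 10) = 26.0

print(without_dr, with_dr)
```

With these assumed weights, the network group contributes 26 points instead of 37, which leaves more of the 0 to 100 range for signals from other groups.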

Weight Ranges

Each rule has a configurable weight, typically ranging from 5 to 30 points depending on its signal strength:

Signal Strength	Typical Weight	Example
Strong	20-30 points	Known fraudulent fingerprint, extreme velocity (50+ checkouts/hour)
Medium	10-19 points	Disposable email domain, headless browser detection, VPN from datacenter
Weak	5-9 points	Billing-shipping mismatch, uncommon screen resolution, minor timing anomaly

A single weak signal barely moves the needle. Multiple strong signals across different groups push the score toward blocking territory quickly.

Verdict Thresholds: Allow, Warn, Block

The composite score maps to one of three verdicts that determine what happens to the checkout:

ALLOW (Score < 40)

The checkout proceeds with no intervention. The customer sees nothing unusual. This is the default state for legitimate shoppers. Most real customers score between 0 and 15.

A customer browsing your store from their home WiFi, with a consistent device fingerprint, natural mouse movements, normal typing speed, a Gmail address with a history of previous purchases, and no velocity anomalies will score 0. There is nothing to flag.

WARN (Score 40-79)

The checkout is allowed to proceed, but the customer sees a non-blocking banner with a message like “Please verify your information before continuing.” The session is flagged for merchant review in the ShieldFlow dashboard.

The WARN tier exists for borderline cases: situations where there is enough signal to justify caution but not enough confidence to justify blocking a potentially legitimate customer. A real person shopping over hotel WiFi (flagged as shared/commercial IP) with a new email address and fast form completion might hit 55. Blocking them would be a false positive. Warning them is the appropriate response.

BLOCK (Score >= 80)

The checkout is stopped. The block_progress capability of the Shopify checkout UI extension prevents the customer from advancing past the contact information or shipping step. They see a message explaining that the checkout cannot be completed.

A score of 80 or above requires multiple strong fraud signals across different rule groups. This threshold is deliberately high to minimize false positives. Reaching 80 almost always means the checkout exhibits fraud indicators in several dimensions at once, such as device, behavior, network, email, and velocity.

Why These Thresholds?

The default thresholds (40/80) are calibrated based on analysis of hundreds of thousands of checkout events across Shopify stores experiencing active fraud attacks. They are intended to balance catch rate against false positive rate for the median Shopify merchant.

But every store is different. A luxury brand selling $5,000 handbags has a lower tolerance for fraud than a $15 t-shirt store. ShieldFlow lets you adjust both thresholds up or down to match your risk profile.
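
The mapping from score to verdict can be sketched in a few lines. The function name and signature are assumptions for illustration. Only the default thresholds (40 and 80) and the three verdict names come from the description above.

```python
def verdict(score, warn_threshold=40, block_threshold=80):
    """Map a composite score (0-100) to ALLOW, WARN, or BLOCK."""
    if warn_threshold >= block_threshold:
        raise ValueError("warn_threshold must be below block_threshold")
    if score >= block_threshold:
        return "BLOCK"
    if score >= warn_threshold:
        return "WARN"
    return "ALLOW"


print(verdict(15))                      # ALLOW
print(verdict(55))                      # WARN
print(verdict(80))                      # BLOCK
print(verdict(65, block_threshold=60))  # BLOCK under a stricter threshold
```

Keeping the thresholds as parameters rather than constants mirrors the idea that a store can move either boundary without touching the rules themselves.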

The 6 Rule Groups Explained

ShieldFlow’s 28 detection rules are organized into six groups. Each group targets a distinct category of fraud signal. The groups are designed so that a sophisticated attack that evades detection in one group will still be caught by rules in other groups.

1. Device Fingerprint Rules (6 rules)

These rules analyze the browser fingerprint collected on your storefront via the Theme App Extension. The fingerprint includes canvas rendering hash, WebGL renderer string, screen resolution, installed fonts, audio context hash, and other browser-level signals.

What they detect:

Fingerprints previously associated with blocked checkouts or confirmed fraud
Headless browser signatures (missing APIs, automation flags, Puppeteer/Playwright artifacts)
Fingerprint anomalies (inconsistencies between claimed browser and actual rendering capabilities)
Fingerprint clustering (many “unique” sessions sharing the same underlying device hash)
Spoofed fingerprints (anti-detect browsers that randomize values but produce statistically improbable combinations)
Missing fingerprint (no storefront fingerprint found in cart attributes — possible direct-to-checkout bot)

Device fingerprinting is the most reliable single signal category because it is resistant to IP rotation and proxy networks. A bot can change its IP address every request but cannot easily change its canvas rendering output or WebGL driver string.

2. Behavioral Analysis Rules (5 rules)

These rules evaluate how the visitor interacted with the page before reaching checkout. Data is collected by JavaScript on the storefront and transmitted alongside the fingerprint.

What they detect:

Form fill timing (fields completed faster than humanly possible — under 800ms for full checkout form)
Mouse movement patterns (no mouse events, perfectly linear movements, or zero idle time)
Keystroke dynamics (uniform timing between keystrokes, no corrections or pauses)
Scroll and navigation behavior (no page scrolling, no product page views before checkout)
Copy-paste detection (credit card number or email pasted rather than typed)

Behavioral rules are particularly effective against headless browser bots that have been patched to pass fingerprint checks. Even with puppeteer-extra-plugin-stealth masking automation flags, the behavioral patterns of a script filling forms at machine speed are fundamentally different from a human typing, pausing, moving a mouse, and scrolling.
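
As a rough illustration of two of these checks, the sketch below flags keystroke timing that is too uniform and form completion that is too fast. The 800 ms floor comes from the list above. Everything else (function names, the coefficient-of-variation measure, the 0.1 cutoff) is an assumption for the example, not ShieldFlow's detection logic.

```python
from statistics import mean, pstdev


def keystroke_uniformity(timestamps_ms):
    """Coefficient of variation of the gaps between keystrokes.

    Values near 0 mean the gaps are almost identical, which is typical of
    scripted input. Returns None when there is too little data to judge.
    """
    gaps = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    if len(gaps) < 2:
        return None
    avg = mean(gaps)
    if avg <= 0:
        return None
    return pstdev(gaps) / avg


def looks_scripted(timestamps_ms, form_fill_ms, cv_cutoff=0.1, min_fill_ms=800):
    """True if typing is suspiciously uniform or the form was filled too fast."""
    cv = keystroke_uniformity(timestamps_ms)
    too_uniform = cv is not None and cv < cv_cutoff
    too_fast = form_fill_ms < min_fill_ms
    return too_uniform or too_fast


bot_keys = [0, 10, 20, 30, 40, 50]            # one key every 10 ms, no variation
human_keys = [0, 180, 310, 620, 700, 1150]    # irregular gaps with pauses

print(looks_scripted(bot_keys, form_fill_ms=200))       # True
print(looks_scripted(human_keys, form_fill_ms=14000))   # False
```

In a scoring model these would be two separate rules in the behavioral group, each adding its own weight. They are combined into one boolean here only to keep the example short.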

3. Network and Geolocation Rules (5 rules)

These rules examine the network origin of the checkout request, captured from the IP address in the request headers when the checkout extension calls the ShieldFlow backend.

What they detect:

Known datacenter or hosting provider IPs (AWS, GCP, Azure, DigitalOcean; real shoppers rarely browse from servers)
VPN and proxy detection (commercial VPN exit nodes, Tor exit nodes, SOCKS proxies)
IP geolocation mismatch (IP in Nigeria, billing address in Wisconsin)
IP reputation scoring (addresses flagged in abuse databases for spam, brute force, or fraud)
Impossible travel (same device fingerprint checking out from two countries within minutes)

Network rules provide useful signal but are the easiest category for attackers to spoof via residential proxy networks. This is why, at typical weights, network rules alone do not reach a BLOCK verdict under the diminishing returns model: they need corroboration from other rule groups.

4. Email and Identity Rules (5 rules)

These rules analyze the email address and identity information provided at checkout.

What they detect:

Disposable email domains (Guerrilla Mail, Temp Mail, Mailinator, and 15,000+ other throwaway domains from a continuously updated list)
Email pattern analysis (randomly generated local parts, such as a jumble of letters and digits, vs. natural names)
Email-name mismatch (checkout name “John Smith” but email starts with “maria.garcia”)
New email with high-value order (email address not seen before combined with unusually large order total)
Multiple emails from same device (three different email addresses used on the same fingerprint within an hour)

Email rules are particularly valuable for catching fake checkout attacks that aim to pollute your marketing lists. Card testing bots typically use programmatically generated email addresses from disposable domains. Catching these before they reach your Klaviyo or Mailchimp instance prevents downstream email list contamination.

5. Velocity and Rate Rules (4 rules)

These rules detect abnormal checkout frequency — the hallmark of automated card testing attacks.

What they detect:

IP velocity (too many checkout attempts from a single IP within a time window)
Fingerprint velocity (too many checkout attempts from the same device fingerprint, regardless of IP rotation)
Email domain velocity (surge of checkouts using emails from the same disposable domain)
Store-wide velocity anomaly (overall checkout rate exceeding historical baseline by a configured multiplier)

Velocity rules are implemented in Redis for sub-millisecond lookups. They use sliding window counters that track checkout frequency across multiple dimensions simultaneously. When a card testing bot starts firing 500 checkouts per hour through rotating proxies, the fingerprint velocity rule catches the pattern even though no single IP exceeds the rate limit.
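
The sliding window idea can be shown with a small in-memory counter. This is a sketch under stated assumptions: it uses a Python dictionary of deques instead of Redis, and the class name, key format, and limit of 5 per hour are invented for the example.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Counts events per key over the last window_seconds."""

    def __init__(self, window_seconds, limit):
        self.window_seconds = window_seconds
        self.limit = limit
        self._events = defaultdict(deque)

    def record(self, key, now):
        """Record an event at time `now` (seconds); return True if over the limit."""
        events = self._events[key]
        events.append(now)
        # Drop timestamps that have fallen out of the window.
        cutoff = now - self.window_seconds
        while events and events[0] <= cutoff:
            events.popleft()
        return len(events) > self.limit


ip_velocity = SlidingWindowCounter(window_seconds=3600, limit=5)
fp_velocity = SlidingWindowCounter(window_seconds=3600, limit=5)

ip_flags, fp_flags = [], []
for i in range(10):
    now = i * 7.2  # 500 checkouts per hour is one every 7.2 seconds
    # The bot rotates to a new IP on every attempt but keeps one fingerprint.
    ip_flags.append(ip_velocity.record(f"ip:203.0.113.{i}", now))
    fp_flags.append(fp_velocity.record("fp:same-device", now))

print(any(ip_flags))         # False: no single IP exceeds the limit
print(fp_flags.index(True))  # 5: the sixth attempt trips the fingerprint rule
```

Tracking the same events under several keys (IP, fingerprint, email domain, store) is what lets one dimension catch a pattern that another dimension misses.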

6. Payment and Order Rules (3 rules)

These rules examine the transaction characteristics and order data available at checkout time and via webhooks.

What they detect:

Card BIN risk scoring (certain BIN ranges are disproportionately represented in fraud, based on aggregate data)
Order value anomalies (extremely small orders typical of card testing, or extremely large orders from new customers)
Multiple declined attempts (repeated checkout submissions with different card numbers from the same session)

Payment rules primarily fire on webhook-processed events (checkouts/create, orders/create) and contribute to the post-checkout safety net for express checkout methods that bypass the storefront fingerprinting layer.

Why Cumulative Scoring Beats Single-Signal Detection

The fundamental problem with single-signal fraud detection is that every individual signal has a legitimate explanation.

VPN? Millions of privacy-conscious shoppers use VPNs daily.
Disposable email? Some people use them for all online purchases to avoid spam.
Fast form fill? Browser autofill can complete a checkout form in under a second.
Datacenter IP? Corporate employees browsing during lunch from their office network.
New device fingerprint? A customer on a new phone or freshly installed browser.

Any system that blocks on a single signal will produce unacceptable false positive rates. Block all VPN users and you lose 5-8% of your legitimate traffic. Block all disposable emails and you alienate privacy-focused customers.

Cumulative scoring solves this by requiring convergence across multiple independent signal dimensions before reaching a blocking verdict. The math makes false positives extremely unlikely:

Probability of a legitimate customer using a VPN: ~7%
Probability of a legitimate customer using a disposable email: ~2%
Probability of both simultaneously: ~0.14%
Probability of both plus anomalous behavioral signals: ~0.003%

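Each additional uncorrelated signal multiplies the probabilities together, which in this example cuts the false positive probability by more than an order of magnitude per signal. The calculation assumes the signals are independent. By the time a checkout scores 80+, the probability that it is a legitimate customer is vanishingly small.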
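
The multiplication behind these figures is easy to reproduce. The VPN and disposable email rates are the illustrative values listed above. The 2% rate for anomalous behavior is an assumption, chosen because it roughly reproduces the ~0.003% figure, and the whole calculation holds only if the signals are independent.

```python
p_vpn = 0.07          # legitimate customer uses a VPN
p_disposable = 0.02   # legitimate customer uses a disposable email
p_behavior = 0.02     # assumed: legitimate customer shows anomalous behavior

# For independent events, the joint probability is the product.
p_two = p_vpn * p_disposable
p_three = p_two * p_behavior

print(f"{p_two:.2%}")    # 0.14%
print(f"{p_three:.4%}")  # 0.0028%
```

If the signals are correlated in practice (for example, privacy-conscious shoppers who use both a VPN and a disposable email), the joint probability is higher than the product suggests.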

Conversely, a multi-signal attack (a bot behind a residential proxy with a spoofed fingerprint and a real-looking email address) might evade two or three rule groups entirely. But it is almost impossible to evade all six. The behavioral rules will catch the machine-speed form filling. The velocity rules will catch the frequency pattern. The email rules can still flag the address if it was programmatically generated. The cumulative score still reaches the blocking threshold.

This is why ShieldFlow uses 28 rules instead of relying on one or two strong signals. Breadth of coverage matters more than depth in any single detection dimension.

How Merchants Tune Rules

Every store has different traffic patterns, customer demographics, and risk tolerance. A one-size-fits-all configuration does not work. ShieldFlow provides two primary tuning mechanisms.

Score Weight Sliders

Each of the 28 rules has an adjustable weight slider in the merchant dashboard. The slider controls how many points that rule contributes when it fires.

When to increase a rule’s weight:

You are seeing fraud that this rule detects but the current weight is not pushing scores high enough to trigger a WARN or BLOCK
Your store’s traffic patterns make a specific signal more indicative of fraud than average (e.g., if you never sell internationally, a foreign IP is a much stronger signal)

When to decrease a rule’s weight:

A rule is generating false positives for your customer base (e.g., if many of your customers are in the tech industry and use VPNs routinely)
You have independent verification that a signal is not correlated with fraud for your specific niche

When to disable a rule entirely:

The rule conflicts with a known characteristic of your customer base. Some stores selling privacy-focused products will disable the VPN detection rule entirely because a significant portion of their legitimate customers use VPNs.

Threshold Adjusters

Beyond individual rule weights, you can adjust the two verdict thresholds:

WARN threshold (default: 40): Lower this to catch more borderline cases for review. Raise it if too many legitimate customers are seeing warning banners.
BLOCK threshold (default: 80): Lower this if you are under active attack and want more aggressive blocking. Raise it if you are experiencing false positives on the BLOCK verdict.

A practical approach: start with defaults. Monitor the dashboard for one to two weeks. If you see legitimate customers hitting WARN verdicts frequently, identify which rules are firing and reduce their weights or raise the WARN threshold. If fraud is getting through with scores in the 60-75 range, lower the BLOCK threshold to 60 temporarily until you identify which rule weights need increasing.
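
Before changing a threshold, it can help to replay recent scores against the proposed value and compare verdict counts. The sketch below does that with a made-up list of scores. The function name and the sample data are assumptions, not a ShieldFlow export format.

```python
def verdict_counts(scores, warn_threshold=40, block_threshold=80):
    """Count how many scores fall into each verdict under the given thresholds."""
    counts = {"ALLOW": 0, "WARN": 0, "BLOCK": 0}
    for score in scores:
        if score >= block_threshold:
            counts["BLOCK"] += 1
        elif score >= warn_threshold:
            counts["WARN"] += 1
        else:
            counts["ALLOW"] += 1
    return counts


# Hypothetical scores from a monitoring period.
scores = [0, 0, 5, 12, 15, 42, 55, 62, 68, 74, 85, 100]

default = verdict_counts(scores)
stricter = verdict_counts(scores, block_threshold=60)

print(default)   # {'ALLOW': 5, 'WARN': 5, 'BLOCK': 2}
print(stricter)  # {'ALLOW': 5, 'WARN': 2, 'BLOCK': 5}
```

In this sample, lowering the BLOCK threshold moves three checkouts from WARN to BLOCK and leaves ALLOW untouched, which is the trade-off to weigh against the risk of blocking a legitimate customer.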

Example Scenarios

Understanding how scoring works in practice makes the system tangible. Here are three representative scenarios.

Scenario 1: Card Testing Bot Attack (Score: 100 — BLOCK)

A card testing bot hits your store at 3 AM using Puppeteer behind a residential proxy network. It rotates through stolen credit card numbers at a rate of 200 attempts per hour.

Rule	Points
Headless browser detected (device group, primary)	+25
Fingerprint seen on 40+ checkouts today (device group, secondary, 50%)	+10
Form filled in 0.2 seconds (behavioral group, primary)	+20
Zero mouse movements (behavioral group, secondary, 50%)	+7
Fingerprint velocity exceeded (velocity group, primary)	+20
IP velocity exceeded (velocity group, secondary, 50%)	+8
Disposable email domain (email group, primary)	+15
Total	105 (capped at 100)

Verdict: BLOCK. The bot triggered rules across four of six groups. Even with diminishing returns on secondary rules, the cumulative score far exceeds the blocking threshold. The checkout is stopped before the payment is submitted.
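
The table can be reproduced with a short script. Rule identifiers are hypothetical. The full weights of the secondary rules are inferred by doubling the halved points in the table (10 to 20, 7 to 14, 8 to 16), and the first rule listed in each group is treated as that group's primary.

```python
SCORE_CAP = 100


def score_checkout(fired_rules):
    """Return (breakdown, raw_total, capped_total) for the rules that fired.

    fired_rules: list of (rule_name, group, full_weight) in firing order.
    The first rule per group is primary (100%); later ones are secondary (50%).
    """
    seen_groups = set()
    breakdown = []
    for name, group, weight in fired_rules:
        is_primary = group not in seen_groups
        seen_groups.add(group)
        points = weight * (1.0 if is_primary else 0.5)
        breakdown.append((name, "primary" if is_primary else "secondary", points))
    raw_total = sum(points for _, _, points in breakdown)
    return breakdown, raw_total, min(raw_total, SCORE_CAP)


scenario_1 = [
    ("headless_browser", "device", 25),
    ("fingerprint_cluster", "device", 20),
    ("form_fill_timing", "behavioral", 20),
    ("no_mouse_movement", "behavioral", 14),
    ("fingerprint_velocity", "velocity", 20),
    ("ip_velocity", "velocity", 16),
    ("disposable_email", "email", 15),
]

breakdown, raw_total, final_score = score_checkout(scenario_1)
for name, role, points in breakdown:
    print(f"{name:22} {role:10} +{points:g}")
print(raw_total, final_score)  # 105.0 100
```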

Scenario 2: Suspicious But Possibly Legitimate (Score: 50 — WARN)

A real person shops over coffee shop WiFi, uses a relatively new Gmail address, ships to an address that differs from their billing address, and lets browser autofill complete the checkout form quickly.

Rule	Points
Commercial/shared IP detected (network group, primary)	+12
Fast form completion via autofill (behavioral group, primary)	+10
New email, no purchase history (email group, primary)	+8
Billing-shipping address mismatch (payment group, primary)	+10
IP geolocation differs from billing (network group, secondary, 50%)	+5
Minor screen resolution anomaly (device group, primary)	+5
Total	50

Verdict: WARN. The customer sees a “Please verify your information” banner but can complete checkout. The session is flagged for merchant review. No sale is lost. If the customer completes the purchase and it turns out legitimate, the merchant can whitelist the fingerprint.

Scenario 3: Clean Legitimate Customer (Score: 0 — ALLOW)

A returning customer visits your store from their home network, browses three products, adds one to cart, and completes checkout at normal human speed using their regular email address.

Rule	Points
(No rules fired)	0
Total	0

Verdict: ALLOW. Zero friction. The customer never knows ShieldFlow exists. This is the experience for the vast majority of your real customers.

Custom Rules on Top of Built-In

ShieldFlow’s 28 built-in rules cover the most common fraud patterns across the Shopify ecosystem. But some stores face niche-specific threats that warrant custom detection logic.

ShieldFlow supports custom rules that you can define to supplement the built-in set:

Geo-Restriction Rules

If you only ship to the US and Canada, you can create a custom rule that adds 25 points for any checkout originating from an IP outside those countries. Combined with other signals, this effectively blocks international fraud without a hard geographic block that might catch legitimate customers using VPNs.

Product-Specific Rules

High-risk products (gift cards, digital goods, high-resale-value items) attract disproportionate fraud. A custom rule can add weight when specific product SKUs appear in the cart, raising sensitivity for orders that include your most targeted items.

Time-Based Rules

If your legitimate customers almost never shop between 2 AM and 6 AM local time, a custom rule can add modest points to late-night checkouts. This does not block night owls on its own — but combined with other signals, it provides useful context.

Blocklist Rules

Maintain explicit blocklists of known-bad fingerprints, email addresses, email domains, or IP ranges. When a blocklisted entity appears at checkout, the custom rule adds maximum weight — effectively guaranteeing a BLOCK verdict. This is useful for repeat offenders you have already identified.

Custom rules follow the same scoring and diminishing returns model as built-in rules. They can be assigned to any of the six rule groups or to a dedicated “custom” group.
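
One way to picture a custom rule is as a name, a group, a weight, and a predicate over the checkout data. The sketch below is an assumption about shape only. ShieldFlow's real configuration format is not described here, and the field names (`ip_country`, `skus`), the SKU values, and the 10-point product weight are invented for the example. The 25-point geo weight comes from the example above.

```python
from dataclasses import dataclass
from typing import Callable

SHIP_TO_COUNTRIES = {"US", "CA"}
HIGH_RISK_SKUS = {"GIFT-CARD-50", "GIFT-CARD-100"}  # hypothetical SKUs


@dataclass(frozen=True)
class CustomRule:
    name: str
    group: str
    weight: int
    matches: Callable[[dict], bool]


def ip_outside_ship_to(checkout):
    """Fires when the IP country is known and is not a ship-to country."""
    country = checkout.get("ip_country")
    return country is not None and country not in SHIP_TO_COUNTRIES


def has_high_risk_sku(checkout):
    """Fires when any cart SKU is on the high-risk list."""
    return any(sku in HIGH_RISK_SKUS for sku in checkout.get("skus", []))


CUSTOM_RULES = [
    CustomRule("geo_restriction", "custom", 25, ip_outside_ship_to),
    CustomRule("high_risk_product", "custom", 10, has_high_risk_sku),
]


def fired_custom_rules(checkout):
    """Return the names of the custom rules that match this checkout."""
    return [rule.name for rule in CUSTOM_RULES if rule.matches(checkout)]


print(fired_custom_rules({"ip_country": "FR", "skus": ["GIFT-CARD-50"]}))
print(fired_custom_rules({"ip_country": "US", "skus": ["TSHIRT-M"]}))
```

Because each rule only adds points, a foreign IP on its own stays well below the default WARN threshold. It matters only when other signals fire as well.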

Frequently Asked Questions

How long does fraud scoring take? Will it slow down my checkout?

The entire scoring pipeline — all 28 rules evaluated against the checkout data — executes in under 50 milliseconds. Redis-backed velocity checks run in sub-millisecond time. The checkout extension has a 4-5 second timeout, and the scoring response typically returns in under 200ms including network latency. Customers do not experience any perceptible delay.

Can a legitimate customer ever score 80 or above?

It is theoretically possible but extremely unlikely with default rule weights. Reaching 80 requires strong fraud signals across multiple independent rule groups. A legitimate customer would need to simultaneously use a datacenter VPN, have a disposable email, exhibit bot-like behavioral patterns, and match a velocity anomaly. In practice, false positive rates on BLOCK verdicts are below 0.1% across ShieldFlow’s merchant base. If your store’s traffic patterns cause elevated false positives, adjusting rule weights resolves the issue.

What happens if a rule I disabled would have caught fraud?

Each rule operates independently. Disabling one rule removes its score contribution but does not affect the other 27. If the checkout exhibits fraud signals that other active rules detect, the cumulative score will still push the verdict toward WARN or BLOCK. The system is designed for redundancy — disabling a single rule degrades detection marginally, not catastrophically.

Can I see which rules fired on a specific checkout?

Yes. The ShieldFlow dashboard provides a full scoring breakdown for every checkout event. You can see the composite score, each rule that fired, the points it contributed (including whether it was a primary or secondary contribution), and the final verdict. This transparency is essential for tuning — you can identify exactly which rules are driving verdicts and adjust accordingly.

How often are the built-in rules updated?

ShieldFlow continuously updates detection logic based on emerging fraud patterns across its merchant network. New disposable email domains are added to the blocklist weekly. Headless browser detection signatures are updated as new automation frameworks and anti-detection tools appear. Rule weight defaults are recalibrated quarterly based on aggregate performance data. Updates are deployed automatically — you do not need to take any action.

Does the scoring system learn from my store’s specific fraud patterns?

The built-in rules use detection logic trained on aggregate data from the Shopify ecosystem. Over time, ShieldFlow builds a store-specific baseline for your checkout patterns — normal velocity, typical customer device profiles, expected geographic distribution. Deviations from your baseline are weighted more heavily than deviations from the global average, which means the system becomes more accurate for your store the longer it runs.

What if I am under active attack and need to block more aggressively right now?

Lower your BLOCK threshold temporarily. Dropping it from 80 to 60 means that checkouts with moderate fraud signals are blocked instead of warned. You can also increase the weights on velocity and fingerprint clustering rules during an active attack, then restore defaults when the attack subsides. ShieldFlow’s dashboard shows real-time scoring distribution so you can see immediately how threshold changes affect verdicts.

How does scoring handle express checkout methods like Shop Pay and Apple Pay?

Express checkout methods bypass the storefront fingerprinting layer, which means device and behavioral rules have limited data to work with. For these checkouts, scoring relies more heavily on network, email, velocity, and payment rules. The post-checkout webhook pipeline scores the order and can auto-cancel if the score exceeds the BLOCK threshold. This is a secondary safety net — less granular than pre-checkout scoring but still effective against automated attacks. Read more in our express checkout fraud guide.

Bottom Line

Fraud scoring works because fraud is a pattern, not a single event. No individual signal reliably separates a bot from a human. But 28 signals evaluated simultaneously, weighted by confidence, with diminishing returns to prevent correlated signal inflation, produce a composite score that separates fraud from legitimate traffic with high accuracy.

The three-verdict system — allow, warn, block — gives you graduated responses instead of a binary gate. Clean customers pass through invisibly. Borderline cases get flagged for review without losing the sale. High-confidence fraud is stopped at the checkout before the payment is processed, before the email reaches your marketing tools, before the chargeback enters your dispute queue.

If you are relying on a single signal — IP blocking, email validation, or Shopify’s built-in fraud analysis alone — you are leaving gaps that modern attackers have already learned to exploit. Multi-signal scoring closes those gaps.

ShieldFlow runs 28 rules on every checkout, scores in under 50ms, and gives you full control over weights and thresholds. If you want to understand exactly what those 28 rules detect and how, read our complete bot detection guide. For a broader comparison of fraud prevention options, see our roundup of the best Shopify fraud prevention apps for 2026.
