How to Evaluate an IP Geolocation API: Pricing, Accuracy, and False Positives
Most teams evaluating a location intelligence API start with the price per lookup and stop there. That approach misses the cost that actually matters: the cost of wrong answers. A false positive on a fraud check blocks a legitimate customer. A false negative lets a bad transaction through. The real question isn't "how cheap is the API?" — it's "how much does each wrong answer cost me?"
What Buyers Overlook in IP Data Procurement
Why Price-Per-Lookup Is the Wrong Starting Point
Every vendor publishes a number: $0.00033 per lookup, $0.0012 per lookup, $0.004 per lookup. These figures are real, but they describe only one dimension of the total cost of operating a location intelligence layer. Three other factors usually dominate the actual expense:
1. False positives — blocking legitimate traffic
If your geolocation API flags 5% of legitimate users as suspicious because it misidentifies corporate VPNs, shared ISPs, or mobile carrier NAT pools as proxies, you are paying the API to lose customers. In e-commerce, the average lifetime value of a blocked legitimate customer far exceeds the per-lookup cost by orders of magnitude. The vendor with the cheaper lookup but the higher false-positive rate is usually the more expensive option.
2. Latency — the hidden tax on every request
If your IP lookup adds 200ms to each login, checkout, or API call, you are paying in user abandonment. Research consistently shows that every 100ms of latency correlates with a measurable drop in conversion. APIs that return results in under 50ms keep your critical paths fast. Slow lookups cascade into timeout retries, circuit-breaker trips, and degraded user experience.
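The p50/p95/p99 figures referenced throughout this guide can be computed from raw request timings. A minimal sketch in TypeScript using the nearest-rank method — the function name and sample data are illustrative, not part of any vendor SDK:

```typescript
// Nearest-rank percentile over a set of latency samples (in ms).
// Collect samples per request, e.g. via performance.now() around each call.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

// Example: 100 simulated lookup latencies of 1..100 ms
const latencies = Array.from({ length: 100 }, (_, i) => i + 1);
const p50 = percentile(latencies, 50); // 50
const p99 = percentile(latencies, 99); // 99
```

Track p99, not just the average: a fast median with a slow tail still trips timeouts on your critical path.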
3. Coverage gaps — the cost of missing data
APIs with strong coverage in North America and Europe but weak data in Southeast Asia, Africa, or Latin America will return low-confidence results or null values for users in those regions. If your platform operates globally, every null or low-confidence response is a decision you cannot make — which means either elevated manual review costs or accepted risk exposure.
Vendor Evaluation Framework: What to Actually Compare
When evaluating a location intelligence or IP data vendor, use a structured comparison across seven criteria. Price is one of them — but it should be weighted against the others based on your use case.
| Evaluation Criterion | Why It Matters | How to Test |
|---|---|---|
| Accuracy (country / city) | Country-level accuracy should exceed 99%. City-level varies — check your target markets. | Run 1,000 known IPs through the API and compare against ground truth. |
| VPN / proxy detection | Critical for fraud use cases. Low detection rates let masked traffic through; high false-positive rates block legitimate VPN users. | Test with known VPN IPs, corporate VPNs, and residential proxies. Measure precision and recall. |
| Response latency (p50 / p99) | Sub-50ms p99 is essential for login, checkout, and real-time enforcement. | Run load tests from your deployment region. Track p50, p95, and p99. |
| Data coverage and freshness | 232-country coverage with ASN, ISP, connection type. Freshness affects detection of new proxy networks. | Check ASN assignment dates. Test recently-provisioned IPs. |
| Compliance posture | SOC 2 Type II, GDPR, CCPA, ISO 27001. Matters for regulated industries. | Request compliance documentation. Verify certification status directly. |
| Bulk and batch support | Data cleansing, enrichment pipelines, and analytics workloads need efficient batch processing. | Upload a sample CSV (10K+ records). Measure throughput and accuracy at scale. |
| Total cost of ownership | Price per lookup times volume, plus integration cost, support cost, and false-positive cost. | Model TCO over 12 months including engineering time and fraud losses. |
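The last row of the table — modeling TCO over 12 months — can be sketched as a simple function. All input values here are illustrative placeholders you would replace with your own figures:

```typescript
// 12-month TCO sketch: lookup fees plus one-time integration cost plus
// the ongoing cost of blocked legitimate users. Inputs are your own numbers.
interface TcoInputs {
  monthlyLookups: number;
  pricePerLookup: number;
  integrationHours: number;     // one-time engineering effort
  hourlyEngRate: number;
  falsePositiveRate: number;    // share of lookups that wrongly block a user
  costPerFalsePositive: number; // e.g. expected lost customer value
}

function twelveMonthTco(i: TcoInputs): number {
  const lookupCost = i.monthlyLookups * i.pricePerLookup * 12;
  const integrationCost = i.integrationHours * i.hourlyEngRate;
  const fpCost =
    i.monthlyLookups * i.falsePositiveRate * i.costPerFalsePositive * 12;
  return lookupCost + integrationCost + fpCost;
}

// Example: 100K lookups/month at $0.00023, 40 hours of integration at
// $150/hr, 0.1% false-positive rate at $50 expected loss each:
const tco = twelveMonthTco({
  monthlyLookups: 100_000,
  pricePerLookup: 0.00023,
  integrationHours: 40,
  hourlyEngRate: 150,
  falsePositiveRate: 0.001,
  costPerFalsePositive: 50,
}); // ≈ $66,276 — and $60,000 of it is false positives, not API fees
```

Notice the proportions in the example: at realistic rates, the false-positive term dwarfs the lookup fees. That is the point of weighting price against the other six criteria.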
Pricing Structure Breakdown: What You Actually Pay
Most IP data vendors price on a credit-based model where one credit equals one API request. The economics change significantly at volume. Here is how a typical tiered structure works, using current market pricing as a reference:
| Tier | Monthly Lookups | Price | Cost per Lookup | Best For |
|---|---|---|---|---|
| Starter | 7,500 | $2.50/mo | $0.00033 | Small SaaS, prototyping, MVPs |
| Growth | 40,000 | $11/mo | $0.00028 | Growing platforms, mid-market |
| Business | 150,000 | $35/mo | $0.00023 | E-commerce, fintech, production scale |
| Enterprise | Custom | Custom | Negotiated | High-volume, compliance-critical |
For bulk data cleansing workloads, pricing often drops further. Processing 1 million IP records through a batch endpoint typically costs around $0.00027 per record and completes in 15-20 minutes. At 10 million records, the per-IP cost can fall below $0.00015.
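Using the reference rates quoted above (not a specific vendor quote), a batch job's cost is a straightforward multiplication:

```typescript
// Batch cost estimate from the article's reference per-record rates:
// ~$0.00027 per record at 1M scale, falling toward $0.00015 at 10M+.
function batchCostUsd(records: number): number {
  const rate = records >= 10_000_000 ? 0.00015 : 0.00027;
  return records * rate;
}

batchCostUsd(1_000_000);  // ≈ $270
batchCostUsd(10_000_000); // ≈ $1,500
```

At these rates, cleansing even a ten-million-row CRM export costs less than a single fraud loss — which is why batch enrichment is usually the easiest line item to justify.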
The False-Positive Problem: Why Accuracy Numbers Mislead
A vendor claiming "99.6% country-level accuracy" is reporting aggregate performance across all IPs and all regions. That number does not tell you how the API performs on the specific traffic patterns that matter to your business. Two vendors can both honestly claim 99.6% accuracy and deliver wildly different results on your actual traffic mix.
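To see why the aggregate number misleads, weight each region's accuracy by its share of your traffic. The region accuracies below are invented for illustration — the arithmetic, not the figures, is the point:

```typescript
// Illustrative: two vendors with similar aggregate accuracy can diverge
// sharply on a specific traffic mix. All accuracy figures are made up.
type RegionStats = { share: number; accuracy: number };

function mixAccuracy(regions: RegionStats[]): number {
  return regions.reduce((acc, r) => acc + r.share * r.accuracy, 0);
}

// A platform with 60% of its traffic in Southeast Asia:
const vendorA = mixAccuracy([
  { share: 0.6, accuracy: 0.97 },  // weak SEA coverage
  { share: 0.4, accuracy: 0.999 }, // strong NA/EU coverage
]); // ≈ 0.9816
const vendorB = mixAccuracy([
  { share: 0.6, accuracy: 0.995 },
  { share: 0.4, accuracy: 0.99 },
]); // ≈ 0.993
```

Vendor B's brochure number may look worse, yet on this traffic mix it misclassifies roughly 40% fewer requests.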
The most consequential accuracy dimension for most commercial deployments is not country accuracy at all — it is proxy and VPN classification accuracy. When an API classifies a legitimate corporate VPN user as a proxy (false positive), the downstream effect is a blocked login, a flagged transaction, or a forced step-up authentication. When it misses a real proxy (false negative), a fraudster walks through.
Practical accuracy test protocol
- Collect 500 IP addresses from your own production traffic that you know are legitimate users.
- Collect 200 IP addresses from known VPN providers (commercial and corporate).
- Collect 100 IP addresses from known residential proxy networks.
- Run all 800 through each vendor's API.
- Measure false-positive rate on set 1, false-negative rate on sets 2 and 3.
- Calculate the total cost: (false positives × customer LTV) + (false negatives × average fraud loss) + (lookups × price).
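The final step of the protocol translates directly into code. The dollar amounts in the example are placeholders — supply your own LTV and fraud-loss figures from the evaluation:

```typescript
// Total-cost formula from the test protocol:
// (FPs × customer LTV) + (FNs × avg fraud loss) + (lookups × price).
function evaluationCost(
  falsePositives: number,
  customerLtv: number,
  falseNegatives: number,
  avgFraudLoss: number,
  lookups: number,
  pricePerLookup: number
): number {
  return (
    falsePositives * customerLtv +
    falseNegatives * avgFraudLoss +
    lookups * pricePerLookup
  );
}

// 4 FPs at $300 LTV, 2 FNs at $150 loss, 800 lookups at $0.00023:
evaluationCost(4, 300, 2, 150, 800, 0.00023); // ≈ $1,500.18
```

Note that the lookup fees contribute about 18 cents of the $1,500 total — the wrong answers, not the API bill, dominate.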
Evaluating API Quality with a Test Script
Here is a practical evaluation flow: a single-lookup spot check, followed by a batch script that measures the response quality you actually get. It works with any REST-based IP intelligence API:

```shell
curl -s "https://ip-info.app/api/v1/geolocate/8.8.8.8" \
  -H "x-api-key: YOUR_KEY" | python3 -m json.tool
```

A typical response:

```json
{
  "ip": "8.8.8.8",
  "countryCode": "US",
  "city": {
    "name": "Ashburn",
    "latitude": 39.03,
    "longitude": -77.5,
    "accuracy_radius": 1000,
    "time_zone": "America/New_York"
  },
  "asn": 15169,
  "aso": "Google LLC",
  "isVPN": false,
  "threatLevel": "low",
  "connectionType": "corporate"
}
```

To evaluate a vendor across a batch of test IPs, run each one through the API, time the calls, and tally results against your ground-truth labels:

```typescript
interface IpResult {
  ip: string;
  countryCode: string;
  city: { name: string; accuracy_radius: number };
  isVPN: boolean;
  threatLevel: string;
  asn: number;
  aso: string;
}

async function evaluateVendor(
  testIps: string[],
  apiClient: (ip: string) => Promise<IpResult>
) {
  let falsePositives = 0;
  let falseNegatives = 0;
  let nullResults = 0;
  let totalLatency = 0;

  for (const ip of testIps) {
    const start = performance.now();
    const result = await apiClient(ip);
    totalLatency += performance.now() - start;

    if (!result.countryCode) nullResults++;
    // Compare against your known ground-truth labels here
    // to tally falsePositives and falseNegatives
  }

  return {
    avgLatencyMs: totalLatency / testIps.length,
    falsePositiveRate: falsePositives / testIps.length,
    falseNegativeRate: falseNegatives / testIps.length,
    nullRate: nullResults / testIps.length,
  };
}
```

Decision Matrix: Which Tier Fits Your Use Case
Matching your use case to the right plan is not just about volume. The enforcement pattern, response-time requirement, and data depth all affect which tier makes economic sense:
| Use Case | Monthly Volume | Latency Need | Recommended Tier |
|---|---|---|---|
| Login risk scoring | 50K - 500K | <50ms p99 | Business or Enterprise |
| E-commerce fraud screening | 200K - 2M | <50ms p99 | Business or Enterprise |
| Marketing geo-targeting | 10K - 100K | <200ms | Growth or Business |
| CRM data enrichment | Bulk batch | Async (minutes) | Business (batch pricing) |
| Compliance / AML screening | 100K - 1M | <100ms | Enterprise (SLA + audit) |
| Content localization | 500K+ | <100ms | Business or Enterprise |
Compliance and Security: Non-Negotiable Requirements
If your organization operates in a regulated industry — financial services, healthcare, payment processing, government — the vendor's compliance posture is a gating criterion, not a differentiator. Confirm these before signing:
- SOC 2 Type II: Required by most enterprise procurement processes. Verifies data handling controls.
- GDPR: Mandatory for processing EU user IPs. The vendor should not store or log raw IP addresses.
- CCPA: Required for California user data. Covers IP-based profiling and tracking.
- ISO 27001: Information security management standard. Important for financial services and government.
- PCI DSS: Matters for payment-adjacent use cases. Not all IP vendors meet this bar.
- Zero data retention: IPs processed in memory only. No persistent logs. Critical for privacy-first architectures.
Ready to Test with Your Own Traffic?
Run a real evaluation with your production IP patterns. Start with the free tier — no credit card required — and measure accuracy, latency, and false-positive rates against your actual users.
Frequently Asked Questions
How do I calculate the true cost of a false positive?
Multiply your false-positive rate by your monthly lookup volume, then multiply by the average cost of each blocked legitimate interaction. For e-commerce, this is typically the average order value times the probability the user does not return after being blocked. For SaaS, it is the monthly subscription value times the churn probability after a denied login.
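The arithmetic in this answer, expressed as a function — the example inputs are illustrative, not benchmarks:

```typescript
// Monthly false-positive cost: FP rate × lookup volume × cost per block.
// costPerBlock is AOV × P(customer does not return) for e-commerce,
// or subscription value × churn probability for SaaS.
function monthlyFalsePositiveCost(
  fpRate: number,        // e.g. 0.005 = 0.5% of lookups wrongly blocked
  monthlyLookups: number,
  costPerBlock: number
): number {
  return fpRate * monthlyLookups * costPerBlock;
}

// 0.5% FP rate, 200K lookups/month, $40 expected loss per blocked user:
monthlyFalsePositiveCost(0.005, 200_000, 40); // ≈ $40,000/month
```

Compare that figure to the tier pricing above: a $35/month plan with a 0.5% false-positive rate costs over a thousand times its sticker price.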
What accuracy level do I actually need?
Country-level accuracy above 99% is the baseline for any commercial use. City-level accuracy matters for localization and content personalization — 95%+ is strong for major markets. VPN and proxy detection accuracy is the most impactful metric for fraud prevention: look for vendors that publish separate precision and recall figures rather than a single aggregate percentage.
Should I evaluate multiple vendors simultaneously?
Run a parallel evaluation with 2-3 vendors using your own production IP samples. Most vendors offer free tiers or trial keys sufficient for a meaningful accuracy and latency test. The evaluation should run for at least one week to capture day-of-week and time-of-day variation in both accuracy and response times.