tallytarik · 2025-12-15T21:59:44 1765835984

Working on improving the data pipeline for https://iplocate.io - an IP intelligence service I've worked on since 2017. A couple of recent focuses:

1. VPN and proxy detection. We already track dozens of providers, but we can do better here. There's also a bunch of metadata we collect as part of this process which we don't currently surface, so I'm looking at what else we can bring to our databases and free API.

2. Better detail and evidence on how we build and test our own geolocation database, which we create from scratch. There's been a recent trend of misinformation about geo accuracy, including from some other providers, so I want to better explain the accuracy (and inaccuracy) of various techniques, our policy for when we prefer certain data, and so on.

(Open to partnerships for any folks looking for a new provider!)

tallytarik · 2025-12-15T06:04:32 1765778672

There are plenty of VPN and proxy detection services, either as a service (API) or downloadable database, which are surprisingly comprehensive. Disclaimer: I’ve run one since 2017. Years on, our primary data source is literally holding dozens of subscriptions to every commercial provider we can find, and enumerating the exit node IP addresses they use.

There are also other methods, like using zmap/zgrab to probe for servers that respond to VPN software handshakes, which can in theory be run against the entire IP space. (this also highlights non-commercial VPNs which are not generally the target of our detection, so we use this sparingly)

It will never cover every VPN or proxy in existence, but it gets pretty close.

acka · 2025-12-15T13:25:02 1765805102

> Years on, our primary data source is literally holding dozens of subscriptions to every commercial provider we can find, and enumerating the exit node IP addresses they use.

Assuming your VPN identification service operates commercially, I trust that you are in full compliance with all contractual agreements and Terms of Service for the services you utilize. Many of these agreements specifically prohibit commercial use, which could encompass the harvesting of exit node IP addresses and the subsequent sale of such information.

infecto · 2025-12-15T15:38:25 1765813105

TOS are pretty meaningless in cases like this. It amounts to getting rejected as a customer and your account canceled.

itintheory · 2025-12-15T20:08:12 1765829292

I think ToS violations can also run afoul of CFAA.

infecto · 2025-12-15T20:14:55 1765829695

Those are pretty old cases that I think the courts have moved away from, and even in those cases it was a TOS violation plus an explicit C&D that the company ignored.

qingcharles · 2025-12-16T00:43:24 1765845804

I don't think they can any longer, I think there is case law on this.

Illinois law makes it a misdemeanor to violate a web site's ToS, though. And a felony for the second offense, IIRC. Other states probably have similar laws.

fourside · 2025-12-15T15:50:14 1765813814

Maybe the tables could be turned and we can build a service with dozens of subscriptions to every VPN detection service and report them for ToS violations ;)

MangoToupe · 2025-12-15T16:48:27 1765817307

> I trust that you are in full compliance with all contractual agreements and Terms of Service

Why? It's not like there's any real moral (or, likely, legal) reason to care beyond avoiding the service's ban hammer.

qingcharles · 2025-12-16T00:45:35 1765845935

In Illinois you could, in theory, be jailed for up to three years for violating a web site ToS. (classified as "Computer Tampering")

MangoToupe · 2025-12-16T00:47:51 1765846071

I don't think that would hold up in court anymore.

qingcharles · 2025-12-16T07:45:10 1765871110

It's a statutory offense, so you could get lucky and the prosecutor wouldn't prosecute it, but it's there for them to use:

https://www.ilga.gov/Documents/legislation/ilcs/documents/07...

... "the owner authorizes patrons, customers, or guests to access the computer network and the person accessing the computer network is an authorized patron, customer, or guest and complies with all terms or conditions for use of the computer network that are imposed by the owner;"

immibis · 2025-12-16T07:59:11 1765871951

There's a little secret that most of the business world knows but individuals do not know: You don't have to follow Terms of Service. In most cases, the maximum penalty the company can impose for a ToS violation is a termination of your account. And it's not illegal to make a new account. They can legally ban you from making a new account, and you can legally evade the ban.

Unless you're the one-in-a-million unlucky user who gets prosecuted under the CFAA's very generic "unauthorized access to a protected computer" clause, like Aaron Swartz. It seems the general consensus is this doesn't apply to breaking a website ToS, and Aaron was only in so much trouble because he broke into a network closet, as well as for copyright violation. But consult a lawyer if unsure. (That's another difference: A business will ask a lawyer if it wants to do something shady, while an individual will simply avoid doing it)

addandsubtract · 2025-12-15T12:59:08 1765803548

Tangent: if you hold access to all VPN providers, have you thought about also releasing benchmarks for them? I would be interested in knowing which ones offer the best bandwidth / peering (ping).

0xdeadbeefbabe · 2025-12-15T15:39:06 1765813146

> which are surprisingly comprehensive

How does the buyer even know what the precision and recall rates might be?

recursive · 2025-12-15T20:16:06 1765829766

Probably contrary to the stealth aspect.

ranger_danger · 2025-12-15T14:16:13 1765808173

This will also cause problems with anyone that happens to (even accidentally/unknowingly) use apps that integrate services from companies such as BrightData/Luminati/HolaVPN/etc. where they sell idle time on your device/connection to their VPN/proxy customers.

The legitimate end-user will then no longer be able to use e.g. SoundCloud.

blibble · 2025-12-15T15:23:36 1765812216

I fail to see the problem if people who allow their internet connection to be used by scammers/AI crawlers are banned from every service.

kstrauser · 2025-12-15T16:27:54 1765816074

I’m with you on this one. Some of my projects are flooded with sus traffic from Brazil. I don’t believe there are a million eager Brazilian hackers targeting me in particular. It’s pretty clear from analysis that they’re all residential hosts running proxies, knowingly or otherwise.

The more concise word for this is “botnet”. Computers participating in one should be quarantined until they stop.

majorchord · 2025-12-15T16:32:45 1765816365

> unknowingly

Oftentimes random shovelware apps will have these proxy SDKs embedded in them, and the only mention of it being part of the software is buried in some long ToS that nobody reads.

Dylan16807 · 2025-12-16T01:35:43 1765848943

Sort of valid today.

But the more sites that require a residential VPN for normal use, the less legitimate that argument becomes.

GoblinSlayer · 2025-12-15T16:44:22 1765817062

You might want to learn how internets work today: https://en.wikipedia.org/wiki/Network_address_translation

rdsubhas · 2025-12-15T10:06:36 1765793196

Interesting. I assumed all VPNs switched to IPv6 by now, making detection much harder.

bombcar · 2025-12-15T15:30:48 1765812648

IPv6 isn't magically unrouteable, it just routes much larger blocks of "end IP addresses."

You just track and block /24 or /16 as necessary.

tallytarik · 2025-12-15T22:01:37 1765836097

Much of the internet still does not support IPv6, so most providers will give you an IPv4 address. In fact only a few providers even support IPv6 at all.

Even with IPv6 it's not a huge problem. With a few samples we can know that a provider is operating in a given /64 or /48 or even /32 space, and can assign a confidence level that the range is used for VPNs.
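
A minimal sketch of that idea, not IPLocate's actual pipeline: group observed exit addresses by covering prefix and score each prefix by how many samples landed in it. The sample addresses come from the IPv6 documentation range, and the `confidence` function with its `saturation` parameter is a made-up illustration of "assign a confidence level".

```python
import ipaddress
from collections import Counter

def prefix_counts(samples, prefix_len):
    """Count how many observed addresses fall inside each covering prefix."""
    counts = Counter()
    for addr in samples:
        # strict=False masks off the host bits, giving the covering network.
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        counts[net] += 1
    return counts

def confidence(n_samples, saturation=5):
    """Hypothetical score: grows with the sample count, capped at 1.0."""
    return min(1.0, n_samples / saturation)

# Made-up exit node observations (2001:db8::/32 is reserved for documentation).
samples = [
    "2001:db8:1:1::10",
    "2001:db8:1:1::2f",
    "2001:db8:1:2::7",
    "2001:db8:2:9::1",
]

for plen in (64, 48, 32):
    for net, n in sorted(prefix_counts(samples, plen).items()):
        print(f"/{plen} {net} samples={n} confidence={confidence(n):.1f}")
```

Wider prefixes collect more samples but also risk covering unrelated customers, so a real scoring rule would probably weigh prefix length as well.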

tux3 · 2025-12-15T15:08:16 1765811296

Many websites, including Soundcloud, are still only accessible through IPv4, so this is moot. Even if VPNs support IPv6, it's enough for Soundcloud to block their IPv4 exit nodes.

vb-8448 · 2025-12-15T18:43:17 1765824197

just out of curiosity: if i'm located in spain and i set up an ec2 or digital ocean instance in germany and use it as a socks proxy over ssh, will you detect me?

kube-system · 2025-12-16T03:11:38 1765854698

It is even easier to block hosting providers. They typically publish official lists. Here's the full list for both of those providers:

https://ip-ranges.amazonaws.com/ip-ranges.json

https://digitalocean.com/geo/google.csv

(And even if they don't publish them, you can just look up the ranges owned by any autonomous network with the appropriate registry.)
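
A rough sketch of checking an address against such a list. It assumes the layout of the AWS file (a `prefixes` array with `ip_prefix` entries and an `ipv6_prefixes` array with `ipv6_prefix` entries); the inline sample below uses made-up documentation ranges rather than real AWS data, and `load_networks` / `is_hosting_ip` are hypothetical helper names.

```python
import ipaddress
import json

# Made-up data in the shape of ip-ranges.json; in practice you would
# download the real file and pass its parsed contents instead.
SAMPLE = json.loads("""
{
  "prefixes": [
    {"ip_prefix": "198.51.100.0/24", "region": "eu-central-1", "service": "EC2"}
  ],
  "ipv6_prefixes": [
    {"ipv6_prefix": "2001:db8::/32", "region": "eu-central-1", "service": "EC2"}
  ]
}
""")

def load_networks(doc):
    """Collect every IPv4 and IPv6 prefix in the document as network objects."""
    nets = [ipaddress.ip_network(p["ip_prefix"]) for p in doc.get("prefixes", [])]
    nets += [ipaddress.ip_network(p["ipv6_prefix"]) for p in doc.get("ipv6_prefixes", [])]
    return nets

def is_hosting_ip(addr, networks):
    """Return True if addr falls inside any of the given networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip.version == net.version and ip in net for net in networks)

NETWORKS = load_networks(SAMPLE)
print(is_hosting_ip("198.51.100.25", NETWORKS))
print(is_hosting_ip("203.0.113.5", NETWORKS))
```

A linear scan is fine for a sketch; with thousands of prefixes a radix tree or sorted interval lookup would be the more usual choice.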

tallytarik · 2025-12-15T19:46:09 1765827969

It won’t end up in our proxy detection database, but we track hosting provider ranges separately: https://www.iplocate.io/data/hosting-providers/

dizhn · 2025-12-15T19:05:39 1765825539

That's a hosting service IP block. Some sites block them already. Netflix for instance.

m00dy · 2025-12-15T15:54:21 1765814061

who's buying your service ?

cons0le · 2025-12-15T22:40:24 1765838424

Sounds like snitching as a service

tallytarik · 2025-12-13T23:23:11 1765668191

Most of these providers are in fact open about the fact that these locations are “virtual”, so it’s misleading to say they don’t match where they claim to be.

There is however an interesting question about how VPNs should be considered from a geolocation perspective.

Should they record where the exit server is located, or the country claimed by the VPN (even if this is a “virtual” location)? In my view there is useful information in where the user wanted to be located in the latter case, which you lose if you only ever report the location of servers.

(disclaimer: I run a competing service. we currently provide the VPN reported locations because the majority of our customers expect it to work that way, as well as clearly flagging them as VPNs)
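
One way to avoid losing either piece of information is to store both locations and choose at query time. This is only an illustrative sketch: the `VpnLocation` type, its field names, and the `prefer_claimed` switch are hypothetical, not the schema of any real service.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class VpnLocation:
    ip: str
    is_vpn: bool
    claimed_country: Optional[str]  # country advertised by the VPN provider
    exit_country: Optional[str]     # country where the server appears to be

    @property
    def is_virtual(self) -> bool:
        """True when the advertised country differs from the server's country."""
        return (
            self.is_vpn
            and self.claimed_country is not None
            and self.exit_country is not None
            and self.claimed_country != self.exit_country
        )

    def reported_country(self, prefer_claimed: bool = True) -> Optional[str]:
        """Pick which country to report, falling back to whichever is known."""
        if prefer_claimed and self.claimed_country is not None:
            return self.claimed_country
        if self.exit_country is not None:
            return self.exit_country
        return self.claimed_country

# Made-up record: a "virtual" location advertised as BS but served from US.
record = VpnLocation("203.0.113.7", True, "BS", "US")
print(record.is_virtual, record.reported_country(), record.reported_country(prefer_claimed=False))
```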

balder1991 · 2025-12-14T02:22:28 1765678948

Yeah, Proton is quite explicit about that: https://protonvpn.com/support/how-smart-routing-works

reincoder · 2025-12-14T04:34:28 1765686868

I work for IPinfo, and I appreciate your comment.

Our product philosophy is centered on accuracy and reliability. We intentionally diverge from the broader IP geolocation industry's trust-based model. Instead of relying primarily on "aggregation and echo", we focus on evidence-backed geolocation.

Like others in the industry, we do ingest self-reported IP geolocation data, and we do that well. Given our scale and reputation, we receive a significant volume of feedback and guidance from network operators worldwide. We actively conduct outreach, and exchange ideas with ISPs, IXPs, and ASNs. We attend NOG events, participate in research conferences, and collaborate with academia. We have a community and launch hackathon events, which allow us to talk to all the stakeholders involved.

Where we differ is in who our core users are. Our primary user base operates at a critical scale, where compromises on data accuracy are simply not acceptable. For these users, IP geolocation cannot be a trust-based model. It must be backed by verifiable data and evidence.

We believe the broader internet ecosystem benefits from this approach. That belief is reflected in our decision to provide free data downloads, a free API with unlimited requests, and active collaboration with multiple platforms to make our data widely accessible. Our free datasets are licensed under CC-BY-SA 4.0, without an EULA, which makes integration, even for commercial use straightforward.

I appreciate you recognizing that our product philosophy is different. We are intentionally trying to differentiate ourselves from the industry at large, and it is encouraging to see competing services acknowledge that they are focused on a different model.

LunaSea · 2025-12-14T08:51:02 1765702262

If we can pay them in virtual dollars, no problem

tallytarik · 2025-10-13T06:14:34 1760336074

Working on improving the data pipeline for https://iplocate.io - an IP intelligence service I've worked on since 2017.

Recent focus has been on geolocation accuracy, and in particular being able to share more data about why we say a resource is in a certain place.

Lots of folks seem to be interested in this data, and there's very little out there. Most other industry players don't talk about their methodology, and those that do aren't overly honest about how X or Y strategy actually leads to a given prediction, or the realistic scale or inaccuracies of a given strategy, and so on. So this is an area I'm very interested in at the moment and I'm confident we can do better in. And it's overall a fascinating data challenge!

tallytarik · 2025-10-07T08:55:32 1759827332

Our government has been paying Deloitte & co. to produce slop for years before AI was being used to generate said slop.

Can we get a refund for all of the others too?

tallytarik · 2025-08-27T22:28:23 1756333703

ISPs have no obligation, although the ubiquity of sites and apps relying on IP geolocation means that ISPs are incentivized to provide correct info these days.

I run a geolocation service, and over the years we've seen more and more ISPs providing official geofeeds. The majority of medium-large ISPs in the US now provide a geofeed, for example. But there's still an ongoing problem in geofeeds being up-to-date, and users being assigned to a correct 'pool' etc.

Mobile IPs are similar but are still certainly the most difficult (relative lack of geofeeds or other accurate data across providers)
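
For anyone who hasn't seen one, a geofeed in the RFC 8805 format is a CSV file with one prefix per line: `prefix,country,region,city,postal_code`, with `#` starting a comment. Here is a minimal parsing sketch; the feed contents are made up, `parse_geofeed` is a hypothetical helper, and a real consumer would also need to validate the country and region codes.

```python
import csv
import io
import ipaddress

def parse_geofeed(text):
    """Parse RFC 8805-style geofeed text into a list of dicts."""
    entries = []
    for row in csv.reader(io.StringIO(text)):
        # Skip blank lines and comment lines.
        if not row or row[0].lstrip().startswith("#"):
            continue
        # Pad short rows so the five-field unpacking below always works.
        fields = [f.strip() for f in row] + [""] * 5
        prefix, country, region, city, postal = fields[:5]
        try:
            network = ipaddress.ip_network(prefix, strict=False)
        except ValueError:
            continue  # ignore lines whose prefix does not parse
        entries.append({
            "network": network,
            "country": country,
            "region": region,
            "city": city,
            "postal": postal,
        })
    return entries

# Made-up feed using documentation prefixes.
FEED = """
# hypothetical geofeed
192.0.2.0/24,US,US-CA,Los Angeles,
2001:db8::/32,DE,DE-BE,Berlin,
not-a-prefix,US,,,
"""

for entry in parse_geofeed(FEED):
    print(entry["network"], entry["country"], entry["city"])
```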

miki123211 · 2025-08-28T06:50:33 1756363833

Mobile IPs reflect the user's "registered area" at best, not their actual location.

This is mostly because of how APN / GGSN / P-GW systems work. E.g. you may have an APN that puts you straight into a corporate network, and the mobile network needs you to keep using that APN when roaming. This is why your roaming IP is usually in the country you're from, not the one you're currently in.

I've heard of local breakout being possible, but never actually seen it in practice.

tallytarik · 2025-08-17T21:54:22 1755467662

I thought this was going to be an analysis of articles that are clearly AI-generated.

I feel like that’s an increasing ratio of top posts, and they’re usually an instant skip for me. Would be interested in some data to see if that’s true.

tallytarik · 2025-08-13T17:46:29 1755107189

G2, Sourceforge (yes, that one), and Gartner’s Capterra/GetApp/SoftwareAdvice all have the same business plan: charge vendors $x,xxx+ per month to outrank other vendors in their made up categories.

Of course, you can technically list for free.

But look! For the low low price of $x,xxx per month, now you can show one of 40 tailor-made award icons on your site!

Or, unlock the privilege of showing “user reviews” from our site on your site! (of course if you had managed to get reviews independently, you’re not allowed to use the widget without paying)

Don’t have reviews? Ah, I forgot to mention. The $x,xxx plan also comes with “review generation” — we’ll pay users to write reviews for you!

Oh, and on an unrelated note, the $x,xxx plan just also happens to unlock dofollow links across each of those 40 made up categories, which all rank highly in google. And the $xx,xxx plan means that - user ratings aside - you can end up at the top of those categories.

It’s hard to describe it other than as the author does: a grift. Seeing those logos on other companies’ sites is now a huge turn-off to me personally, and I haven’t yet capitulated for my own SaaS, but I suspect this isn’t the feeling of the execs they seek to target. Or maybe it is, and it’s just the price of doing business.

abirch · 2025-08-13T18:15:02 1755108902

I think this is the same model that the NYTimes book reviews had back in the 1990s. Pay us money and we'll say nice things.

It'll be interesting to see how AI Agents approach things. My prediction is that more of our media is going to be controlled by our AI Agent's Algorithm instead of Google, Twitter, and Facebook's algorithm or some distant editors who decided what went on the front page of the newspaper.

tallytarik · 2025-08-02T13:36:37 1754141797

I've tried variations of this. I find it will often cause it to include cringey bullshit phrases like:

"Here's your brutally honest answer–just the hard truth, no fluff: [...]"

I don't know whether that's better or worse than the fake flattery.

arrowsmith · 2025-08-02T15:07:59 1754147279

You need a system prompt to get that behaviour? I find ChatGPT does it constantly as its default setting:

"Let's be blunt, I'm not gonna sugarcoat this. Getting straight to the hard truth, here's what you could cook for dinner tonight. Just the raw facts!"

It's so annoying it makes me use other LLMs.

cruffle_duffle · 2025-08-02T15:43:58 1754149438

Its response is still flattery, just packaged in a different form. Patronizing, actually.

BrawnyBadger53 · 2025-08-02T13:38:01 1754141881

Similar experience, feels very ironic

dcre · 2025-08-02T13:49:57 1754142597

Curious whether you find this on the best models available. I find that Sonnet 4 and Gemini 2.5 Pro are much better at following the spirit of my system prompt rather than the letter. I do not use OpenAI models regularly, so I’m not sure about them.

danielscrubs · 2025-08-02T14:59:12 1754146752

That is neither the spirit nor the letter, though.

dcre · 2025-08-03T03:39:13 1754192353

That is a good point. I guess the reason that distinction came to mind is that what’s happening here is the LLM trying to manifest its obedience in letter (i.e., by saying it).

tallytarik · 2025-07-01T21:02:50 1751403770

SEEKING FREELANCER | Remote | Integration Engineers, Content Writers

IPLocate is on a mission to provide developers with reliable, affordable, and easy-to-use IP address intelligence - geolocation, threat data, network information and more.

We're looking for engineers to write SDKs and integrations to use our APIs with popular programming languages, frameworks, and tools. We would prefer to work with multiple folks who are experts in their respective language/framework rather than a single engineer to write 20 integrations, so we'd love to hear about your experience.

We're also looking for content writers to help write practical tutorials, step-by-step guides, and real-world use cases for our website and blog, and for publication elsewhere (e.g. Medium, Dev.to).

Details and contact links: https://www.iplocate.io/build-for-iplocate

(We've recently launched this page as an open offer to interested folks. Get in touch with your details and we're happy to formalize an offer.)