Coffee Space


Detecting VPNs

Disclaimer: This article is, for now, completely theoretical and may be incorrect. Take it with a pinch of salt until I am able to prove whether it works or not.

TL;DR

This article suggests a timing-based VPN detection method that rests on a few reasonable assumptions.

Problem

We have some service we provide to users and some users use VPNs to abuse the service. We control both the code on the server and the code on the client.

Currently, blocks are implemented via IP-based bans, using some netmask to either ban a single IP or a range. With the use of VPNs, such blocking processes are almost completely ineffective.
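For reference, a netmask-based ban check of the kind described above might look roughly like this. It is a minimal sketch: the `BANNED_NETWORKS` list and the `is_banned` helper are hypothetical names, and the addresses come from the ranges reserved for documentation rather than from any real ban list.

```python
import ipaddress

# Hypothetical ban list: one single address (/32) and one range (/24).
# Both come from documentation-reserved blocks, not real offenders.
BANNED_NETWORKS = [
    ipaddress.ip_network("203.0.113.7/32"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_banned(client_ip: str) -> bool:
    """Return True if client_ip falls inside any banned network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BANNED_NETWORKS)
```

A user who reconnects through a VPN simply shows up with an address outside every entry in the list, which is why this check alone stops working.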

The question we ask ourselves is: Can we detect the use of a VPN reliably?

Firstly, we make some basic assumptions about the problem space:

  • Client Application – The client runs an application that we have full control over. The user may choose to hack their client, but the protocol it must speak is completely controlled by us.
  • Server Application – We have a server application that we have full control over.
  • VPN – This is some generic service and the malicious user does not have control over the implementation of the VPN.
  • Location – It’s unlikely that the person lives next to the VPN service provider.

After some research online, the general consensus is that it apparently cannot be done:

Unfortunately, there’s is no proper technical way to get the information you want. You might invent some tests, but those will have a very low correlation with the reality. So either you’ll not catch those you want, or you’ll have a larger number of false positives. Neither can be considered to make sense.

Generating any kind of traffic backwards from an Internet server in response to an incoming client (a port scan, or even a simple ping) is generally frowned upon. Or, in the case of a port scan, it may be even worse for you, eg when the client lives behind a central corporate firewall, the worst of which is when the client comes from behind the central government network firewall pool…

As you can see, there are some restrictions on what kinds of solutions we can even consider. This includes:

  • No port-scanning – You can’t do a mass scan, as this is slow, unreliable (you do not know their network infrastructure) and potentially will get you in trouble.
  • No crazy back-traffic – In particular, nothing that would get you in trouble with the authorities, especially a government.

The one thing we do have on our side is time. For this particular application, it is okay to make these checks as the person begins to use the service. We can reasonably wait at least 10 seconds to get the result of the detection method.

System Setup

In the setup, we consider the server, the local network and the client. The local network stands for whichever piece of localized networking hardware we hit first at the user’s IP: the client machine itself, a router, or an actual local network. The reason for considering this will become apparent later.

No VPN diagram

In the no-VPN scenario, we see that a server connects to the local network via the internet, and then the local network connects to the client.

In the VPN scenario, we see that the server connects to the VPN first, the VPN then to the local network, and then the local network to the client.

VPN diagram

What we want to do is detect this additional hop. The VPN is specifically designed in such a way that it cannot be so easily detected, but this should in fact be the very thing that makes it detectable.

Timing

Firstly, we assume we have some mechanism to calculate the time it takes to contact the client on their IP. One way to do this could be to ping them, but some routers deliberately block ping, so it is not a reliable way of measuring this time.

What we can do instead is perform a traceroute, which sends probes with an increasing TTL (ICMP echo requests in some implementations, UDP packets by default in others). Each router at which the TTL expires answers with an ICMP Time Exceeded message. This will give us a list of IPs through the backbone up until the user’s network/VPN, but perhaps not the network itself. We simply pick the last hop in the list that responded to our probe (which will be relatively close to the target) and treat its response time as the target’s ping time, which is our round trip time (RTT). This represents the orange arrows in the diagram.
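Picking the last responding hop could be sketched as below. This assumes numeric output in the style of Linux `traceroute -n` (hop number, address, then up to three times in ms); other implementations format their output differently. The function name and the sample text are made up for illustration, and taking the minimum of the probes as the RTT estimate is one possible choice, not something the method dictates.

```python
import re
from typing import Optional, Tuple

# Assumed line shape: "<hop>  <address>  <t1> ms  <t2> ms  <t3> ms"
HOP_LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+(.*)$")
RTT_VALUE = re.compile(r"([\d.]+)\s*ms")


def last_responding_hop(traceroute_output: str) -> Optional[Tuple[str, float]]:
    """Return (address, rtt_ms) for the last hop that answered, or None."""
    result = None
    for line in traceroute_output.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # header line or anything else that is not a hop
        _, address, rest = match.groups()
        if address == "*":
            continue  # no reply from this hop at all
        rtts = [float(value) for value in RTT_VALUE.findall(rest)]
        if rtts:
            # Use the fastest probe as the estimate for this hop.
            result = (address, min(rtts))
    return result


# Hypothetical sample output using documentation-reserved addresses.
SAMPLE = """traceroute to 203.0.113.7 (203.0.113.7), 30 hops max, 60 byte packets
 1  192.0.2.1  1.204 ms  1.102 ms  1.310 ms
 2  198.51.100.1  9.871 ms  9.512 ms  9.964 ms
 3  * * *
"""

hop = last_responding_hop(SAMPLE)
```

Here the third hop never answers, so the second hop is treated as the closest measurable point to the target.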

No VPN timing diagram

Next we request some state change from the client application, and ask it to send us the result of this state change. This represents the pink arrows in the diagram. As you can see, for the non-VPN scenario this will be the sum of:

  • [A] RTT
  • [B] Local network overhead (x2 for return time)
  • [C] Time taken to perform the state change

VPN timing diagram

And for the VPN scenario, this will be:

  • [D] RTT
  • [E] VPN bouncing overhead (x2 for return time)
  • [F] Local network overhead (x2 for return time)
  • [G] Time taken to perform the state change
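The application-level measurement (the pink arrows) could be timed along these lines. This is only a local sketch: a `socket.socketpair()` stands in for the real connection, `client_stub` stands in for the client application, and reversing the challenge bytes is a made-up placeholder for the state change. None of these names come from an actual implementation.

```python
import socket
import threading
import time


def client_stub(conn: socket.socket) -> None:
    """Stand-in for the client app: perform the 'state change' and reply."""
    challenge = conn.recv(64)
    conn.sendall(challenge[::-1])  # hypothetical state change: reverse bytes


def measure_app_round_trip(conn: socket.socket, challenge: bytes) -> float:
    """Send a challenge; return seconds elapsed until the correct reply."""
    start = time.perf_counter()
    conn.sendall(challenge)
    reply = conn.recv(64)
    elapsed = time.perf_counter() - start
    if reply != challenge[::-1]:
        raise ValueError("client returned an unexpected result")
    return elapsed


def run_local_demo() -> float:
    """Time one challenge/response exchange over a local socket pair."""
    server_side, client_side = socket.socketpair()
    worker = threading.Thread(target=client_stub, args=(client_side,))
    worker.start()
    try:
        return measure_app_round_trip(server_side, b"nonce-1234")
    finally:
        worker.join()
        server_side.close()
        client_side.close()
```

Over a real connection the returned value would contain the RTT, the overheads listed above and the state-change time, which is why the state change needs to take a known, roughly constant time.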

If we can make the assumption that local network time is minimal (which you would hope it is, even over WiFi!) and that the client state change request requires constant time, then you should be able to detect a VPN with reasonable accuracy. As a guesstimate, if you’re seeing additional time of say 10 ms or more, then there is a good chance that there are significant additional hops happening.
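Put together, the comparison amounts to subtracting the traceroute RTT and the known state-change time from the application-level round trip and checking what is left over. The sketch below uses hypothetical names (`Timing`, `looks_like_vpn`) and the 10 ms guesstimate as its default threshold; that threshold is untested and would need tuning against real measurements.

```python
from dataclasses import dataclass

VPN_THRESHOLD_MS = 10.0  # the guesstimate from the text, not a measured value


@dataclass
class Timing:
    network_rtt_ms: float      # RTT to the last responding traceroute hop
    app_round_trip_ms: float   # request/response time through the client app
    state_change_ms: float     # assumed-constant time for the state change


def unexplained_overhead_ms(timing: Timing) -> float:
    """Time left over once the RTT and the state change are accounted for."""
    return (
        timing.app_round_trip_ms
        - timing.network_rtt_ms
        - timing.state_change_ms
    )


def looks_like_vpn(timing: Timing, threshold_ms: float = VPN_THRESHOLD_MS) -> bool:
    """Flag the connection when the leftover time reaches the threshold."""
    return unexplained_overhead_ms(timing) >= threshold_ms
```

Without a VPN the leftover is just the local network overhead, so something like `Timing(20.0, 23.0, 1.0)` leaves 2 ms and is not flagged, whereas `Timing(20.0, 61.0, 1.0)` leaves 40 ms and is.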

Potential Flaws

So, we made quite a few assumptions, not all of which I am particularly happy about:

  • RTT – We assume that we can retrieve the RTT, but this may not be true. One case where it may be hard to calculate is when the client is on a mobile internet connection. One workaround for this could be to whitelist IPs on mobile internet connections, as these are unlikely locations for service abusers to use and are easily IP banned.
  • Local network – We assume that the local network does not have significant overhead, when again this may not be true. If for example a student is using this application on a large education network, perhaps the latency is significant. Also consider that in some rural areas, networks can be daisy chained to share internet from house to house to build some monstrous network infrastructure.

Next Steps

I believe the next steps will be to build out a small scale test. I would like to build it out in Java, but unfortunately Java’s standard library does not expose raw ICMP sockets. It will likely need to be done by wrapping some Unix command…
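Wrapping a Unix command is straightforward in most languages. As a rough illustration (in Python rather than Java, purely for brevity), the sketch below shells out to `ping` and scrapes the `time=… ms` fields. It assumes a Linux/macOS-style `ping` that accepts `-c <count>`; the function names are made up, and the same pattern would apply to wrapping traceroute.

```python
import re
import shutil
import subprocess
from typing import List, Optional

# Matches fields such as "time=11.3 ms" in per-reply ping lines.
TIME_FIELD = re.compile(r"time=([\d.]+)\s*ms")


def parse_ping_times(ping_output: str) -> List[float]:
    """Extract every per-reply time (in ms) from ping's text output."""
    return [float(value) for value in TIME_FIELD.findall(ping_output)]


def ping_rtt_ms(host: str, count: int = 3) -> Optional[float]:
    """Run the system ping and return the fastest reply in ms, or None."""
    if shutil.which("ping") is None:
        return None  # no ping binary available on this machine
    try:
        completed = subprocess.run(
            ["ping", "-c", str(count), host],  # assumes Unix-style "-c"
            capture_output=True,
            text=True,
            timeout=10 + count,
            check=False,  # a non-zero exit just means no replies arrived
        )
    except subprocess.TimeoutExpired:
        return None
    times = parse_ping_times(completed.stdout)
    return min(times) if times else None
```

Returning `None` when nothing answers keeps the caller honest about hosts that block ping, which was one of the concerns raised earlier.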

I believe the best method to use would be to make some HTTP server and simply ask different people to connect to it, both directly from their own IP and through VPNs, to see how it responds and tweak it. The server could simply respond with “VPN (not) detected” and they can report their results to me.