Identifying visitors is crucial for almost any website. Cookies are the usual tool, but they have several drawbacks: users can delete or block them (e.g. by using the “incognito” mode of a web browser). Besides, cookies cannot identify a user who switches between several web browsers, even on the same device. Hence the idea of a browser fingerprint: a unique user identifier that does not change between successive sessions and does not depend on the selected web browser.
Identification of persons
What do you need to identify a person? The answer is not that easy. A single piece of data such as gender, date of birth or ZIP code will not identify anybody on its own; it becomes identifying only in combination with other details.
In information technology, entropy is a measure of uncertainty about information. It can be used to measure how much a specific piece of information increases the possibility of revealing someone’s identity. We can think of entropy as the number of bits needed to distinguish between the equally likely values of a random variable: two possibilities give 1 bit of entropy, four possibilities give 2 bits, and so on. Since there are about 7 billion people on Earth, you need about 33 bits of entropy (2 ^ 33 ≈ 8.6 billion) to identify a random person.
Each detail you know about a person reduces the entropy by a certain amount, which you can calculate using this formula: ΔS = -log2 Pr(X = x), where ΔS is the entropy decrease expressed in bits and Pr(X = x) is the probability of a given fact. For example, for a date of birth: ΔS = -log2 Pr(DOB = 01.11) = -log2(1/365) ≈ 8.51 bits of information. (See Panopticlick)
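The arithmetic above can be checked with a few lines of JavaScript. This is only a minimal sketch; the helper names `surprisalBits` and `bitsToIdentify` are made up for illustration and do not come from any library.

```javascript
// Bits of information revealed by learning a fact that holds with probability p:
// deltaS = -log2 Pr(X = x)
function surprisalBits(p) {
  if (!(p > 0 && p <= 1)) {
    throw new RangeError('probability must be in (0, 1]');
  }
  return -Math.log2(p);
}

// Bits needed to single out one member of a population of the given size.
function bitsToIdentify(populationSize) {
  return Math.log2(populationSize);
}

const birthdayBits = surprisalBits(1 / 365); // about 8.51 bits
const worldBits = bitsToIdentify(7e9);       // about 32.7 bits
console.log(birthdayBits.toFixed(2), worldBits.toFixed(1));
```

If the facts are statistically independent, their bits simply add up, which is why a handful of ordinary details can approach the 33 bits needed to single out one person.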
Techniques of fingerprinting
How does the calculation from the previous section apply to web browsers? You can identify them not only by their IP address or cookies, but also by other features. The technique I would like to present is called browser or machine fingerprinting.
What features can you use to identify browsers?
Use of popular plugins
Adobe Flash is very convenient for the identification of web surfers because it is adopted by many users. Moreover, the plugin version is an excellent identifying feature.
Detecting IP address
Users can set their browsers to send all requests through a proxy in order to hide their IP address. With a Flash application you can bypass the HTTP proxy and contact your backend directly to obtain the real IP address.
Special fingerprint plugin
Such a plugin can read a lot of information about a user’s system, but you need user consent for installing your plugin.
Detecting browser and OS version
You cannot rely on the User-Agent HTTP header, because users can easily modify it.
You can identify a browser by:
- property order of special browser objects, such as navigator and screen objects, which depends on a browser, its version and operating system,
- presence of some unique methods and properties of special browser objects, such as navigator.mozSms (Firefox), navigator.webkitStartActivity (Chrome), navigator.appMinorVersion (IE),
- new browser features introduced in each version.
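The first two checks in the list above can be sketched as follows. The function names are illustrative, and the vendor-specific members are the ones named in the text; they are historical and may well be absent in current browser versions, so treat this as a sketch of the approach rather than a working detector.

```javascript
// Names of enumerable properties in the order the engine reports them.
// Joined into a string, this order can serve as one fingerprint component.
function propertyOrder(obj) {
  const names = [];
  for (const name in obj) {
    names.push(name);
  }
  return names.join(',');
}

// Guess the browser family from vendor-specific members named in the text.
// Assumption: these members exist only in the listed browsers.
function guessBrowserFamily(nav) {
  if ('mozSms' in nav) return 'Firefox';
  if ('webkitStartActivity' in nav) return 'Chrome';
  if ('appMinorVersion' in nav) return 'IE';
  return 'unknown';
}

// In a browser you would pass the real objects, e.g.
//   guessBrowserFamily(window.navigator); propertyOrder(window.screen);
// Here a plain object stands in for navigator so the sketch runs anywhere.
const fakeNavigator = { userAgent: 'Mozilla/5.0', appMinorVersion: '0' };
console.log(guessBrowserFamily(fakeNavigator), propertyOrder(fakeNavigator));
```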
Checking if particular plugins are present
User-agent spoofing extensions are a very interesting example. Although some users rely on user-agent spoofing to avoid browser fingerprinting, you can find out which extension is used and employ it as a part of browser identification. You can identify each user-agent spoofing extension by a specific feature:
- modification of navigator.userAgent, while leaving properties such as
- no support for screen object alteration, so browsers report invalid screen resolutions in case of mobile devices,
- modification of HTTP header only, instead of complete
HTML5 canvas fingerprinting
The same HTML5 Canvas element produces different pixel data depending on the web browser and the operating system. Web browsers use different image processing engines, export options and compression levels, so the exported images may have different hashes even if the rendered pixels are identical, whereas operating systems use different algorithms and settings for anti-aliasing and sub-pixel rendering. (See Browserleaks.com)
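A canvas fingerprint along these lines could be computed as in the sketch below. The drawing commands, the canvas size and the choice of a 32-bit FNV-1a hash are all assumptions made for illustration; any fixed drawing and any stable hash would serve. The function takes the document as a parameter (`doc`) so that it does not depend on a global.

```javascript
// 32-bit FNV-1a hash, returned as 8 hex digits. Any stable hash would do.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0).toString(16).padStart(8, '0');
}

// Draw fixed text and shapes, export the canvas, and hash the result.
// `doc` is expected to behave like the browser's `document`.
function canvasFingerprint(doc) {
  const canvas = doc.createElement('canvas');
  canvas.width = 240;
  canvas.height = 60;
  const ctx = canvas.getContext('2d');
  ctx.textBaseline = 'top';
  ctx.font = '16px Arial';
  ctx.fillStyle = '#f60';
  ctx.fillRect(10, 10, 100, 30);
  ctx.fillStyle = '#069';
  ctx.fillText('Browser fingerprint, 123!', 4, 20);
  // The data URL encodes the rendered image, so rendering differences change the hash.
  return fnv1a(canvas.toDataURL());
}

// In a browser: canvasFingerprint(document)
```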
You can simply check the level of the browser’s WebGL support and retrieve some parameters related to the web browser’s identity.
(See How to identify WebGL)
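As a rough sketch of what such a check might collect, the function below reads a few standard WebGL parameters and, where the browser exposes it, the `WEBGL_debug_renderer_info` extension. The function name `webglInfo` and the selection of parameters are illustrative choices, not taken from the referenced material.

```javascript
// Collect identifying WebGL parameters from a rendering context.
// `gl` is expected to be a WebGLRenderingContext, or null when WebGL is unsupported.
function webglInfo(gl) {
  if (!gl) return { supported: false };
  const info = {
    supported: true,
    vendor: gl.getParameter(gl.VENDOR),
    renderer: gl.getParameter(gl.RENDERER),
    version: gl.getParameter(gl.VERSION),
    maxTextureSize: gl.getParameter(gl.MAX_TEXTURE_SIZE),
  };
  // Optional extension exposing the underlying GPU vendor and model;
  // browsers may withhold it, in which case getExtension returns null.
  const dbg = gl.getExtension('WEBGL_debug_renderer_info');
  if (dbg) {
    info.unmaskedVendor = gl.getParameter(dbg.UNMASKED_VENDOR_WEBGL);
    info.unmaskedRenderer = gl.getParameter(dbg.UNMASKED_RENDERER_WEBGL);
  }
  return info;
}

// In a browser:
//   const gl = document.createElement('canvas').getContext('webgl');
//   console.log(webglInfo(gl));
```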
TCP SYN packet signature
A SYN packet is the first packet sent by a client when negotiating a TCP connection. Usually, the packet signature, in particular its TCP options, varies from one operating system to another, and even between versions of the same operating system. This method is used by Nmap for OS fingerprinting.
Check the latency between the server and the client. Latency is affected by many factors, and the timings of a particular user vary over a range of values, so you should take only the standard deviation into account.
    const perfData = window.performance.timing;
    const requestTime = perfData.responseStart - perfData.requestStart; // request sent -> first byte
    const networkLatency = perfData.responseEnd - perfData.fetchStart; // fetch start -> last byte
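The standard deviation mentioned above could be computed from several such measurements as in this sketch. It assumes that latency samples from multiple page loads have already been collected somewhere (how they are stored is not covered here), and it uses the population standard deviation, which is a choice made for simplicity.

```javascript
// Arithmetic mean of an array of numbers.
function mean(values) {
  return values.reduce((sum, x) => sum + x, 0) / values.length;
}

// Population standard deviation of latency samples (in milliseconds).
function stdDev(samples) {
  if (samples.length === 0) {
    throw new RangeError('at least one sample is required');
  }
  const m = mean(samples);
  const variance = samples.reduce((sum, x) => sum + (x - m) ** 2, 0) / samples.length;
  return Math.sqrt(variance);
}

// Hypothetical latency samples collected over eight page loads.
const latenciesMs = [2, 4, 4, 4, 5, 5, 7, 9];
console.log(stdDev(latenciesMs)); // 2
```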
After collecting the data described above, it is possible to calculate a fingerprint which:
- does not depend on the IP address and can efficiently track changes in the IP address,
- can distinguish between different PCs behind a NAT,
- is not affected by computer upgrades, browser updates, switching browsers or plugins, or emptying local storage.
Some of the features, such as browser version, IP address, font list or canvas fingerprint, can be used to track changes in the fingerprint, as there is almost no correlation between them in the case of popular operating systems. If one value changes while the other three remain unchanged, it usually means that the user switched browsers or installed new fonts. Nevertheless, it is still the same user.
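The "one value changed, three unchanged" rule can be sketched as a simple comparison of two fingerprint records. The field names, the sample values and the threshold parameter `minMatches` are hypothetical and chosen only for illustration.

```javascript
// The four loosely correlated features named in the text (field names are hypothetical).
const FEATURES = ['browserVersion', 'ipAddress', 'fontList', 'canvasHash'];

// Count how many features have identical values in both records.
function matchingFeatures(a, b) {
  return FEATURES.filter((key) => a[key] === b[key]).length;
}

// Treat two records as the same user if at most one feature differs.
function isSameUser(a, b, minMatches = 3) {
  return matchingFeatures(a, b) >= minMatches;
}

const before = {
  browserVersion: 'Firefox 38', ipAddress: '203.0.113.7',
  fontList: 'Arial,Verdana', canvasHash: 'e40c292c',
};
// Same machine after the user switched to a different browser.
const after = Object.assign({}, before, { browserVersion: 'Chrome 43' });
console.log(isSameUser(before, after)); // true
```

A real system would probably weight the features by how much entropy each one carries instead of counting them equally.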
- [Darkwave Technologies](http://www.darkwavetech.com/device_fingerprint.html)
- [W3Org](http://www.w3.org/wiki/images/7/7d/Is_preventing_browser_fingerprinting_a_lost_cause.pdf)
- [Panopticlick](https://panopticlick.eff.org)
- [noc.to](http://noc.to)
- [Privacy Enhancing Technologies](http://pet-portal.eu/blog/tag_search/?tag=fingerprint)
- [Browserspy.dk](http://browserspy.dk)
Remember that browser fingerprinting should be performed only in compliance with both ethical and legal requirements.