Practical Application of TLS Fingerprinting in Bot Mitigation
Cybersecurity is now a critical concern for individuals, organizations, and even nations, and automated "bot traffic" has become one of its most significant threats.
Bot traffic is generated by automated scripts or programs and is widely used for malicious activities such as DDoS attacks, spam email, phishing, and fraudulent ad clicks. These activities threaten the privacy and financial security of individual users and put businesses, organizations, and national network infrastructure at risk, which makes detecting and defending against bot traffic an important topic in cybersecurity. This guide explains how to use TLS fingerprinting to detect and identify bot traffic.
Introduction to TLS Fingerprinting
TLS, which stands for Transport Layer Security, is a commonly used protocol in network communication to ensure the secure transmission of data. TLS uses encryption technology during the data sending and receiving process to prevent data from being intercepted or tampered with, thereby protecting the integrity and confidentiality of the information.
TLS is used to encrypt the vast majority of traffic on the Internet, from web browsing, registration and login, payment transactions, and streaming media, to the increasingly popular Internet of Things (IoT). Attackers benefit from the same protection: they use TLS to hide the communication traffic of malware.
At the start of a TLS connection, the client sends a TLS Client Hello packet. This packet, generated by the client application, informs the server about the supported ciphers and preferred communication methods and is transmitted in plaintext. The contents of the Client Hello are characteristic of each application or its underlying TLS library, and a hash calculated from selected fields of this packet is known as the TLS fingerprint.
Figure 1: TLS Handshake Process
The most widely used TLS fingerprinting methods today are JA3, open-sourced by Salesforce, and JA4, developed by FoxIO as an upgraded successor to JA3 that covers more detection dimensions and scenarios. This article therefore focuses on the application and practice of JA4-based TLS fingerprinting in bot mitigation.
1. JA3 & JA3S
The JA3 method collects the decimal values of the following fields in the client's Client Hello packet: TLS version, cipher suites, extensions list, elliptic curves, and elliptic curve point formats. It then concatenates these values in the order they appear, separating each field with a comma and each value within a field with a hyphen.
Example:
771,4865-4866-4867-49195-49196-52393-49199-49200-52392-49171-49172-156-157-47-53,0-23-65281-10-11-35-16-5-13-51-45-43-21,29-23-24,0
The JA3 fingerprint is obtained by applying an MD5 hash to the concatenated string, which yields 32 hexadecimal characters:
JA3: f79b6bad2ad0641e1921aef10262856b
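When calculating the JA3 fingerprint, GREASE values must be ignored wherever they appear in the Client Hello, for example in the cipher suite, extension, and elliptic curve lists. GREASE is a mechanism introduced by Google and standardized in RFC 8701, in which clients insert reserved random values to prevent extensibility failures in the TLS ecosystem.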
Figure 2: Client Hello Message
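The JA3 steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the Client Hello has already been parsed into lists of integers, and the function names `ja3_string` and `ja3_hash` are hypothetical.

```python
import hashlib

# GREASE values reserved by RFC 8701: 0x0a0a, 0x1a1a, ..., 0xfafa.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 input string from already-parsed Client Hello fields.

    Values keep their order of appearance; GREASE values are dropped.
    """
    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    return ",".join([
        str(version),
        join(ciphers),
        join(extensions),
        join(curves),
        join(point_formats),
    ])


def ja3_hash(ja3_str):
    """MD5 of the JA3 string, rendered as 32 hexadecimal characters."""
    return hashlib.md5(ja3_str.encode("ascii")).hexdigest()


# Rebuild the example string from this section (no GREASE values present).
example = ja3_string(
    771,
    [4865, 4866, 4867, 49195, 49196, 52393, 49199, 49200,
     52392, 49171, 49172, 156, 157, 47, 53],
    [0, 23, 65281, 10, 11, 35, 16, 5, 13, 51, 45, 43, 21],
    [29, 23, 24],
    [0],
)
print(example)
print(ja3_hash(example))
```

Because the values are kept in their order of appearance, two clients that offer the same ciphers in a different order produce different JA3 fingerprints.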
After generating the JA3 fingerprint, we use a similar method to fingerprint the server side (i.e., the TLS Server Hello message). The JA3S method collects the decimal values of the following fields in the Server Hello packet: TLS version, the single cipher suite selected by the server, and the extensions list. These values are then concatenated in the order they appear, with each field separated by a comma and each value within a field separated by a hyphen.
Example:
771,49200,65281-0-11-35-16-23
The JA3S fingerprint is obtained by applying an MD5 hash to the concatenated string, which yields 32 hexadecimal characters:
JA3S: d154fcfa5bb4f0748e1dd1992c681104
Figure 3: Server Hello Message
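The server-side calculation follows the same pattern with fewer fields. The sketch below assumes the Server Hello is already parsed; the function name `ja3s` is hypothetical.

```python
import hashlib


def ja3s(version, cipher, extensions):
    """Return the JA3S input string and its MD5 hex digest.

    `cipher` is the single cipher suite chosen by the server;
    `extensions` keeps the order seen in the Server Hello.
    """
    text = ",".join([
        str(version),
        str(cipher),
        "-".join(str(e) for e in extensions),
    ])
    return text, hashlib.md5(text.encode("ascii")).hexdigest()


text, digest = ja3s(771, 49200, [65281, 0, 11, 35, 16, 23])
print(text)    # the example string from this section
print(digest)
```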
2. JA4+
JA4+ provides an easy-to-use and shareable modular network fingerprinting system, replacing the JA3 TLS fingerprinting standard introduced in 2017. The JA4 detection method is more readable, which aids threat hunting and analysis. All JA4+ fingerprints are formatted as a_b_c, with underscores separating the parts, so searches and detections can use just ab, ac, or c. For instance, if you only want to analyze the cookies from incoming applications, you can look at JA4H_c. This locality-preserving format supports deeper and richer analysis while remaining simple, easy to use, and scalable.
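As a rough illustration of matching on parts of an a_b_c fingerprint, the sketch below splits a three-part JA4 string and compares only the selected parts. The helper names `split_ja4` and `matches_on` are hypothetical, and the two fingerprints are the Chrome values listed in Table 1 of this article.

```python
def split_ja4(fingerprint):
    """Split a three-part JA4 fingerprint into its a, b, and c parts."""
    a, b, c = fingerprint.split("_")
    return {"a": a, "b": b, "c": c}


def matches_on(fingerprint, reference, parts):
    """True if both fingerprints agree on every part named in `parts`.

    `parts` is a string such as "ab", "ac", or "c".
    """
    left, right = split_ja4(fingerprint), split_ja4(reference)
    return all(left[p] == right[p] for p in parts)


initial = "t13d1517h2_8daaf6152771_b1ff8ab2d16f"
reconnect = "t13d1517h2_8daaf6152771_b0da82dd1658"

print(matches_on(initial, reconnect, "ab"))  # True: same a and b parts
print(matches_on(initial, reconnect, "c"))   # False: the c part differs
```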
JA4+ fingerprints include the following dimensions:
- JA4 — TLS Client
- JA4S — TLS Server Response
- JA4H — HTTP Client
- JA4L — Light Distance/Location
- JA4X — X509 TLS Certificate
- JA4SSH — SSH Traffic
This article primarily introduces the application of JA4. For detailed information on other dimensions, please refer to the JA4 open-source repository: https://github.com/FoxIO-LLC/ja4.
Figure 4: JA4 Schematic Diagram
JA4 is composed of three parts, JA4_a, JA4_b, and JA4_c:
JA4_r = JA4_a(t13d1516h2)_JA4_b(sorted cipher suites)_JA4_c(sorted extensions_signature algorithms in original order)
JA4_a: for example, t13d1516h2. It encodes the transport protocol ("t" for TLS over TCP, "q" for QUIC), the client's TLS version ("13" for TLS 1.3), whether SNI is present ("d" when a domain is sent, "i" when it is not), the number of cipher suites ("15"), the number of extensions ("16"), and the ALPN value ("h2"). ALPN indicates the protocol the application wants to speak once the TLS negotiation is complete; "00" indicates that no ALPN was sent. Note that the presence of ALPN "h2" does not necessarily indicate a browser, as many IoT devices communicate via HTTP/2. However, the absence of ALPN might suggest that the client is not a web browser. JA4 fingerprints the client regardless of whether the traffic is via TCP or QUIC. QUIC is the protocol used by the new HTTP/3 standard, which encapsulates TLS 1.3 in UDP packets.
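The layout of JA4_a can be sketched as a small formatting function. This is a simplified illustration: it assumes the inputs are already extracted from the Client Hello, that the counts already exclude GREASE values, and that the ALPN value is a plain ASCII string. The function name `ja4_a` and its parameters are hypothetical.

```python
# Map of TLS version numbers to the two-character JA4 version code.
TLS_VERSIONS = {0x0304: "13", 0x0303: "12", 0x0302: "11", 0x0301: "10"}


def ja4_a(transport, tls_version, has_sni, n_ciphers, n_extensions, alpn):
    """Format the JA4_a part from pre-extracted Client Hello properties."""
    proto = "q" if transport == "quic" else "t"
    version = TLS_VERSIONS.get(tls_version, "00")
    sni = "d" if has_sni else "i"
    # First and last character of the first ALPN value, or "00" if absent.
    alpn_tag = alpn[0] + alpn[-1] if alpn else "00"
    # Counts are written as two digits and capped at 99.
    return (f"{proto}{version}{sni}"
            f"{min(n_ciphers, 99):02d}{min(n_extensions, 99):02d}{alpn_tag}")


print(ja4_a("tcp", 0x0304, True, 15, 16, "h2"))   # t13d1516h2
print(ja4_a("tcp", 0x0304, False, 31, 10, None))  # t13i311000
```

The second call reproduces the JA4_a part of the sqlmap fingerprint in Table 1: no SNI, 31 cipher suites, 10 extensions, and no ALPN.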
JA4_b: Perform SHA256 on the sorted cipher suites and take the first 12 characters. For example:
002f,0035,009c,009d,1301,1302,1303,c013,c014,c02b,c02c,c02f,c030,cca8,cca9 = 8daaf6152771
JA4_c: Perform SHA256 on the sorted extensions, followed by an underscore and the signature algorithms in their original order, and take the first 12 characters. For example:
0005,000a,000b,000d,0012,0015,0017,001b,0023,002b,002d,0033,4469,ff01_0403,0804,0401,0503,0805,0501,0806,0601 = e5627efa2ab1
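The two hashed parts can be sketched as follows. This is a simplified reading of the JA4 description, under stated assumptions: inputs are already-parsed integer lists, GREASE values are dropped, and the SNI (0000) and ALPN (0010) extensions are left out of the hashed extension list because JA4_a already covers them. The helper names are hypothetical.

```python
import hashlib

# GREASE values reserved by RFC 8701: 0x0a0a, 0x1a1a, ..., 0xfafa.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}
SNI, ALPN = 0x0000, 0x0010


def hex4(values):
    """Render values as 4-digit lowercase hex, skipping GREASE."""
    return [f"{v:04x}" for v in values if v not in GREASE]


def cipher_string(ciphers):
    """Comma-separated cipher suites in sorted order."""
    return ",".join(sorted(hex4(ciphers)))


def extension_string(extensions, signature_algorithms):
    """Sorted extensions, then '_' and signature algorithms as sent."""
    exts = sorted(hex4(e for e in extensions if e not in (SNI, ALPN)))
    sigs = hex4(signature_algorithms)
    text = ",".join(exts)
    return f"{text}_{','.join(sigs)}" if sigs else text


def truncated_sha256(text):
    """First 12 hexadecimal characters of the SHA256 digest."""
    return hashlib.sha256(text.encode("ascii")).hexdigest()[:12]


def ja4_b(ciphers):
    return truncated_sha256(cipher_string(ciphers))


def ja4_c(extensions, signature_algorithms):
    return truncated_sha256(extension_string(extensions, signature_algorithms))


# Same cipher suites in two different orders give the same JA4_b.
offered = [0x1301, 0x1302, 0x1303, 0xC02B, 0xC02F, 0x0A0A]
print(ja4_b(offered) == ja4_b(list(reversed(offered))))  # True
```

Sorting before hashing is what keeps the fingerprint stable when a client shuffles the order of its cipher suites or extensions between connections.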
Application of JA4 Fingerprinting in Bot Mitigation
Detection Principle
Different clients (browsers, desktop software, scripts) support different protocol versions, cipher suites, extensions, and signature algorithms. Because the Client Hello is transmitted in plaintext during the TLS handshake, we can calculate the JA4 fingerprint from it and identify what kind of client is actually connecting, regardless of what it claims to be.
- Firefox (JA4 Client Hello) ≠ Chrome (JA4 Client Hello)
- Chrome 120 (JA4 Client Hello) ≠ Chrome 80 (JA4 Client Hello)
- Chrome iOS (JA4 Client Hello) ≠ Chrome Android (JA4 Client Hello)
- Heritrix (JA4 Client Hello) ≠ Chrome (JA4 Client Hello)
When the client has not been maliciously tampered with, the JA4 fingerprint remains stable.
Application Method
In bot mitigation scenarios, applying JA4 fingerprinting for client identification requires combining it with other information: client IP information, client operating system information, client device name, version number, etc. JA4 fingerprinting has two main application methods in these scenarios: fingerprint uniqueness detection and fingerprint consistency detection.
- Uniqueness Detection:
Some client programs, such as crawlers, scanners, and malware, have their own distinctive JA4 fingerprints, and these fingerprints change infrequently. Uniqueness detection matches incoming fingerprints against these known values to identify such abnormal clients.
Application | JA4+ Fingerprints
Chrome | JA4=t13d1517h2_8daaf6152771_b1ff8ab2d16f (initial)
Chrome | JA4=t13d1517h2_8daaf6152771_b0da82dd1658 (reconnect)
Firefox | JA4=t13d1715h2_5b57614c22b0_7121afd63204 (initial)
Firefox | JA4=t13d1715h2_5b57614c22b0_7121afd63204 (reconnect)
Safari | JA4=t13d2014h2_a09f3c656075_14788d8d241b
heritrix | JA4=t13d491100_bd868743f55c_fa269c3d986d
undetected_chromedriver | JA4=t13d1516h2_8daaf6152771_02713d6af862
IcedID Malware | JA4=t13d201100_2b729b4bf6f3_9e7b989ebec8
sqlmap | JA4=t13i311000_e8f1e7e78f70_d41ae481755e
AppScan | JA4=t12i3006h2_a0f71150605f_1da50ec048a3
Table 1: Common Client JA4 Fingerprints
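Uniqueness detection can be sketched as a lookup against known tool fingerprints. The dictionary below contains only the non-browser entries from Table 1; a real deployment would rely on a much larger database, and the names `KNOWN_TOOLS` and `identify_tool` are hypothetical.

```python
# Fingerprints of automated tools and malware, taken from Table 1.
KNOWN_TOOLS = {
    "t13d491100_bd868743f55c_fa269c3d986d": "heritrix",
    "t13d1516h2_8daaf6152771_02713d6af862": "undetected_chromedriver",
    "t13d201100_2b729b4bf6f3_9e7b989ebec8": "IcedID Malware",
    "t13i311000_e8f1e7e78f70_d41ae481755e": "sqlmap",
    "t12i3006h2_a0f71150605f_1da50ec048a3": "AppScan",
}


def identify_tool(ja4):
    """Return the known tool name for a JA4 fingerprint, or None."""
    return KNOWN_TOOLS.get(ja4.strip())


print(identify_tool("t13i311000_e8f1e7e78f70_d41ae481755e"))  # sqlmap
print(identify_tool("t13d2014h2_a09f3c656075_14788d8d241b"))  # None
```

A match is a strong signal on its own, while a miss says nothing either way, which is why the article pairs this check with consistency detection.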
- Consistency Detection:
The principle of fingerprint consistency detection involves comparing the client’s declared device information (operating system, browser type, version number) with its JA4 fingerprint to check if it matches the actual device information corresponding to the fingerprint.
Client-Declared Device Information | Client JA4 Fingerprint | Consistency
"browser": "Chrome", "browser_version": "89.8.7866", "os": "Windows", "os_version": "7" | t12d290400_11b08e233c4b_017f05e53f6d | Abnormal
"browser": "Chrome", "browser_version": "93.0.4622", "os": "Windows", "os_version": "10" | t13d431000_c7886603b240_5ac7197df9d2 | Abnormal
"browser": "Python Requests", "browser_version": "2.31" | t13d1516h2_8daaf6152771_02713d6af862 | Abnormal
"browser": "Chrome", "browser_version": "93.0.4577", "os": "Windows", "os_version": "10" | t13d1516h2_8daaf6152771_e5627efa2ab1 | Normal
"browser": "Firefox", "browser_version": "116.0", "os": "Ubuntu" | t13d321200_1b30506679d3_58ed7828516f | Abnormal
"browser": "Edge", "browser_version": "14.14393", "os": "Windows", "os_version": "10" | t12d040400_a6a9ac001284_255c81f47ac1 | Abnormal
"browser": "Safari", "browser_version": "15.6", "os": "Mac OS X", "os_version": "10.15.7" | t13d2014h2_a09f3c656075_f62623592221 | Normal
Table 2: JA4 Consistency Detection
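Consistency detection can be sketched as a comparison between the declared browser and the fingerprints expected for it. The mapping below is illustrative only: it is assembled from the browser fingerprints shown in Tables 1 and 2, ignores browser version and operating system, and is not a complete database. The names `EXPECTED_JA4` and `check_consistency` are hypothetical.

```python
# Fingerprints observed for each browser family in Tables 1 and 2.
EXPECTED_JA4 = {
    "chrome": {
        "t13d1516h2_8daaf6152771_e5627efa2ab1",
        "t13d1517h2_8daaf6152771_b1ff8ab2d16f",
        "t13d1517h2_8daaf6152771_b0da82dd1658",
    },
    "firefox": {"t13d1715h2_5b57614c22b0_7121afd63204"},
    "safari": {
        "t13d2014h2_a09f3c656075_14788d8d241b",
        "t13d2014h2_a09f3c656075_f62623592221",
    },
}


def check_consistency(declared_browser, ja4):
    """Compare a declared browser with the observed JA4 fingerprint.

    Returns "Normal", "Abnormal", or "Unknown" when the declared
    browser has no entry in the mapping.
    """
    expected = EXPECTED_JA4.get(declared_browser.strip().lower())
    if expected is None:
        return "Unknown"
    return "Normal" if ja4 in expected else "Abnormal"


print(check_consistency("Chrome", "t13d1516h2_8daaf6152771_e5627efa2ab1"))
print(check_consistency("Firefox", "t13d321200_1b30506679d3_58ed7828516f"))
```

The first call returns "Normal" and the second "Abnormal", matching the corresponding rows of Table 2. A production system would also key the expected fingerprints on browser version and operating system.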
- JA4 Fingerprint Database
In bot mitigation scenarios, whether through JA4 uniqueness or consistency characteristics, the underlying logic relies on a vast JA4 fingerprint database for data support, similar to JA3 fingerprinting. Therefore, building a comprehensive fingerprint database is one of the key factors in determining the success of JA4 in identifying bot traffic.
As the official JA4+ fingerprint database, related applications, and recommended detection logic are still under construction, there is currently no available fingerprint database. Therefore, the CDNetworks Security Lab has collected common client fingerprints for specific bot mitigation scenarios and implemented corresponding detection algorithms.
Conclusion
This analysis shows that TLS fingerprinting is an effective tool for bot mitigation. By analyzing the fields of the TLS Client Hello packet, we can generate a characteristic JA4 fingerprint for each client and use it to identify specific malicious bot traffic.
While TLS fingerprinting can effectively detect bot traffic, it also has certain limitations. As attackers continuously upgrade and change their strategies, TLS fingerprints will continue to be tampered with or forged. Therefore, we need to constantly update and improve detection mechanisms to maintain an advantage in the ongoing battle between attack and defense.
In bot mitigation scenarios, TLS fingerprinting provides a powerful identification mechanism, but it cannot replace other security measures. It should be considered part of a comprehensive bot security strategy, used in conjunction with threat intelligence, browser fingerprinting, and other measures to provide thorough protection.