Practical Application of TLS Fingerprinting in Bot Mitigation

September 12, 2024

In today’s digital world, cybersecurity has become a crucial issue for individuals, organizations, and even nations. Among the various threats, bot traffic, meaning traffic generated by automated programs and botnets, has emerged as a significant concern.

Bot traffic is generated primarily by automated scripts or programs and is widely used in malicious activities such as DDoS attacks, spam, phishing, and ad click fraud. These activities threaten the privacy and financial security of individual users and pose significant risks to businesses, organizations, and even national network infrastructure, so detecting and defending against bot traffic has become an important topic in cybersecurity. This guide explains how to use TLS fingerprinting to detect and identify bot traffic.

Introduction to TLS Fingerprinting

TLS (Transport Layer Security) is a widely used protocol for securing data in transit. It encrypts and authenticates data as it is sent and received, preventing interception and tampering and thereby protecting the confidentiality and integrity of the information.

TLS encrypts the vast majority of traffic on the Internet, from web browsing, registration and login, payment transactions, and streaming media to the increasingly popular Internet of Things (IoT). The same protection also appeals to attackers, who use TLS to hide malware communication traffic.

At the start of a TLS connection, the client sends a TLS Client Hello packet. This packet, generated by the client application, tells the server which ciphers and communication options the client supports and prefers, and it is transmitted in plaintext. The contents of the Client Hello are characteristic of each application or its underlying TLS library, and the hash value calculated from this packet is known as the TLS fingerprint.

Figure 1: TLS Handshake Process

The most widely used TLS fingerprinting methods today are JA3, open-sourced by Salesforce, and JA4, published by FoxIO as part of the JA4+ suite. JA4 is an upgraded successor to JA3 that covers more detection dimensions and scenarios. Therefore, this article mainly focuses on the application and practice of JA4-based TLS fingerprinting in bot mitigation.

1. JA3 & JA3S

The JA3 method collects the decimal values of the bytes from the following fields in the client’s Client Hello packet: TLS version, cipher suites, extensions list, elliptic curves (supported groups), and elliptic curve point formats. It then concatenates these values in the order they appear, separating each field with a comma and each value within a field with a hyphen.

Example:

771,4865-4866-4867-49195-49196-52393-49199-49200-52392-49171-49172-156-157-47-53,0-23-65281-10-11-35-16-5-13-51-45-43-21,29-23-24,0

The JA3 fingerprint is the MD5 hash of the concatenated string, written as 32 hexadecimal characters:

JA3: f79b6bad2ad0641e1921aef10262856b

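When calculating the JA3 fingerprint, GREASE values must be ignored wherever they appear, including in the cipher suite and extension lists. GREASE (Generate Random Extensions And Sustain Extensibility, RFC 8701) is a mechanism introduced by Google in which clients advertise randomly chosen reserved values so that servers stay tolerant of unknown values, preventing extensibility failures in the TLS ecosystem. Because these values vary between connections, including them would make the fingerprint unstable.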
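
The JA3 steps described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes the Client Hello has already been parsed into lists of integers, and the function names `ja3_string` and `ja3_hash` are hypothetical. The input values are taken from the example string above, with one GREASE value added to each list to show the filtering.

```python
import hashlib

# GREASE values from RFC 8701: 0x0A0A, 0x1A1A, ..., 0xFAFA.
GREASE_VALUES = {0x0A0A + 0x1010 * i for i in range(16)}


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 input string from already-parsed Client Hello fields."""

    def join(values):
        # Decimal values joined by hyphens, with GREASE values dropped.
        return "-".join(str(v) for v in values if v not in GREASE_VALUES)

    fields = [str(version), join(ciphers), join(extensions),
              join(curves), join(point_formats)]
    return ",".join(fields)


def ja3_hash(ja3_input):
    """Return the MD5 digest of the JA3 string as 32 hex characters."""
    return hashlib.md5(ja3_input.encode("ascii")).hexdigest()


# Field values from the example string above, plus one GREASE value
# (0x0A0A = 2570) per list to show that it is filtered out.
ciphers = [2570, 4865, 4866, 4867, 49195, 49196, 52393, 49199, 49200,
           52392, 49171, 49172, 156, 157, 47, 53]
extensions = [2570, 0, 23, 65281, 10, 11, 35, 16, 5, 13, 51, 45, 43, 21]
curves = [2570, 29, 23, 24]

ja3_input = ja3_string(771, ciphers, extensions, curves, [0])
print(ja3_input)
print(ja3_hash(ja3_input))
```

The first printed line should reproduce the concatenated example string, and the second is its MD5 digest, which can be compared with the JA3 value quoted above.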

Figure 2: Client Hello Message

After generating the JA3 fingerprint, we use a similar method to fingerprint the server side (i.e., the TLS Server Hello message). The JA3S method collects the decimal values of the bytes from the following fields in the Server Hello packet: TLS version, the selected cipher suite, and extensions list. These values are then concatenated in the order they appear, with each field separated by a comma and each value within a field separated by a hyphen.

Example:

771,49200,65281-0-11-35-16-23

The JA3S fingerprint is the MD5 hash of the concatenated string, written as 32 hexadecimal characters:

JA3S: d154fcfa5bb4f0748e1dd1992c681104

Figure 3: Server Hello Message

2. JA4+

JA4+ provides an easy-to-use and shareable modular network fingerprinting system, replacing the JA3 TLS fingerprinting standard introduced in 2017. The JA4 detection method enhances readability, aiding in more effective threat hunting and analysis. All JA4+ fingerprints are formatted as a_b_c, with the parts of the fingerprint separated by underscores. This allows for searches and detections using just ab, ac, or c. For instance, if you only want to analyze the cookies from incoming applications, you can look at JA4H_c. This locality-preserving format supports deeper and richer analysis while remaining simple, easy to use, and scalable.
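
To illustrate matching on only some sections of an a_b_c fingerprint, here is a small Python sketch. The helper names `split_ja4` and `partial_key` are hypothetical, and the sketch covers only three-part JA4 (TLS client) fingerprints. The two sample values are the Chrome initial and reconnect fingerprints listed later in Table 1.

```python
from typing import NamedTuple


class JA4Parts(NamedTuple):
    a: str
    b: str
    c: str


def split_ja4(fingerprint: str) -> JA4Parts:
    """Split a three-part JA4 fingerprint on its underscores."""
    parts = fingerprint.strip().split("_")
    if len(parts) != 3:
        raise ValueError(f"expected a_b_c, got {fingerprint!r}")
    return JA4Parts(*parts)


def partial_key(fingerprint: str, sections: str = "ab") -> str:
    """Return only the requested sections, e.g. 'ab', 'ac', or 'c'."""
    parts = split_ja4(fingerprint)
    return "_".join(getattr(parts, name) for name in sections)


chrome_initial = "t13d1517h2_8daaf6152771_b1ff8ab2d16f"
chrome_reconnect = "t13d1517h2_8daaf6152771_b0da82dd1658"

# The two handshakes differ in section c but share sections a and b.
print(partial_key(chrome_initial, "ab") == partial_key(chrome_reconnect, "ab"))
print(partial_key(chrome_initial, "c") == partial_key(chrome_reconnect, "c"))
```

Grouping by the ab key treats both Chrome handshakes as the same client, while a full-string comparison would count them as two different fingerprints.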

JA4+ fingerprints include the following dimensions:

JA4 — TLS Client
JA4S — TLS Server Response
JA4H — HTTP Client
JA4L — Light Distance/Location
JA4X — X509 TLS Certificate
JA4SSH — SSH Traffic

This article primarily introduces the application of JA4. For detailed information on other dimensions, please refer to the JA4 open-source repository: https://github.com/FoxIO-LLC/ja4.

Figure 4: JA4 Schematic Diagram

JA4 is composed of JA4_a, JA4_b, and JA4_c. In its raw (unhashed) form, JA4_r, the three parts are:

JA4_r = JA4_a(t13d1516h2)_JA4_b(sorted cipher suites)_JA4_c(sorted extensions_signature algorithms in original order)

JA4_a: for example, t13d1516h2. It encodes the transport protocol (“t” for TCP, “q” for QUIC), the client’s TLS version, whether SNI is present (“d” when a domain name is sent, “i” otherwise), the number of cipher suites, the number of extensions, and the first ALPN value. ALPN indicates the application protocol the client wants to use once the TLS negotiation is complete; “00” indicates that no ALPN was sent. Note that the presence of ALPN “h2” does not necessarily indicate a browser, as many IoT devices communicate via HTTP/2. However, the absence of ALPN might suggest that the client is not a web browser. JA4 fingerprints the client regardless of whether the traffic is via TCP or QUIC. QUIC is the protocol used by the new HTTP/3 standard, which encapsulates TLS 1.3 in UDP packets.
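
As a rough sketch of how the JA4_a section is assembled, the Python function below builds it from already-parsed handshake properties. The function name `ja4_a` and its parameters are hypothetical, and several details are assumptions based on the public JA4 description rather than on this article: counts are capped at 99, unknown TLS versions map to “00”, and the ALPN code is the first and last character of the first ALPN value. GREASE values are assumed to have been removed before counting.

```python
# TLS version numbers as they appear on the wire, mapped to two-character codes.
# Assumption: versions not listed here are encoded as "00".
TLS_VERSION_CODES = {0x0304: "13", 0x0303: "12", 0x0302: "11", 0x0301: "10"}


def ja4_a(transport, tls_version, has_sni, n_ciphers, n_extensions, first_alpn):
    """Assemble the JA4_a section from parsed Client Hello properties."""
    proto = "q" if transport == "quic" else "t"
    version = TLS_VERSION_CODES.get(tls_version, "00")
    sni = "d" if has_sni else "i"
    # First and last character of the first ALPN value; "00" when absent.
    alpn = first_alpn[0] + first_alpn[-1] if first_alpn else "00"
    # Counts are written as two digits and capped at 99.
    return f"{proto}{version}{sni}{min(n_ciphers, 99):02d}{min(n_extensions, 99):02d}{alpn}"


# TLS 1.3 over TCP, SNI present, 15 cipher suites, 16 extensions, ALPN "h2".
print(ja4_a("tcp", 0x0304, True, 15, 16, "h2"))

# TLS 1.3 over TCP, no SNI, 31 cipher suites, 10 extensions, no ALPN.
print(ja4_a("tcp", 0x0304, False, 31, 10, None))
```

The first call corresponds to the t13d1516h2 example above; the second shows how a client that connects without SNI or ALPN ends up with an “i” flag and a “00” ALPN code.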

JA4_b: Perform SHA256 on the sorted cipher suites and take the first 12 characters. For example:

002f,0035,009c,009d,1301,1302,1303,c013,c014,c02b,c02c,c02f,c030,cca8,cca9 = 8daaf6152771

JA4_c: Perform SHA256 on the sorted extensions followed by the signature algorithms in their original order, and take the first 12 characters. For example:

0005,000a,000b,000d,0012,0015,0017,001b,0023,002b,002d,0033,4469,ff01_0403,0804,0401,0503,0805,0501,0806,0601 = e5627efa2ab1
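
The two hashing steps can be sketched in Python as below. This is an illustrative sketch under stated assumptions: the helper names are hypothetical, inputs are four-character lowercase hex strings with GREASE values already removed, and the caller has already excluded any extensions that should not be hashed. The input lists are the ones shown in the examples above.

```python
import hashlib


def truncated_sha256(text, length=12):
    """SHA256 of the text, truncated to the first `length` hex characters."""
    return hashlib.sha256(text.encode("ascii")).hexdigest()[:length]


def ja4_b(ciphers):
    """Hash the cipher suites after sorting them."""
    return truncated_sha256(",".join(sorted(ciphers)))


def ja4_c(extensions, signature_algorithms):
    """Hash sorted extensions plus signature algorithms in original order."""
    ext_part = ",".join(sorted(extensions))
    sig_part = ",".join(signature_algorithms)  # order is preserved on purpose
    return truncated_sha256(f"{ext_part}_{sig_part}")


ciphers = "002f,0035,009c,009d,1301,1302,1303,c013,c014,c02b,c02c,c02f,c030,cca8,cca9".split(",")
extensions = "0005,000a,000b,000d,0012,0015,0017,001b,0023,002b,002d,0033,4469,ff01".split(",")
signature_algorithms = "0403,0804,0401,0503,0805,0501,0806,0601".split(",")

print(ja4_b(ciphers))
print(ja4_c(extensions, signature_algorithms))
```

Because the cipher suites and extensions are sorted before hashing, a client that only shuffles their order keeps the same JA4_b and JA4_c values. The printed values can be compared with the hashes quoted in the examples above.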

Application of JA4 Fingerprinting in Bot Mitigation

Detection Principle

Different clients (browsers, computer software, programs) support different protocol versions, cipher suites, extensions, and encryption algorithms. During the TLS handshake, the Client Hello is transmitted in plaintext, allowing us to calculate the JA4 fingerprint to identify the client’s true properties.

Firefox (JA4 Client Hello) ≠ Chrome (JA4 Client Hello)
Chrome 120 (JA4 Client Hello) ≠ Chrome 80 (JA4 Client Hello)
Chrome iOS (JA4 Client Hello) ≠ Chrome Android (JA4 Client Hello)
Heritrix (JA4 Client Hello) ≠ Chrome (JA4 Client Hello)

When the client has not been maliciously tampered with, the JA4 fingerprint remains stable.

Application Method

In bot mitigation scenarios, applying JA4 fingerprinting for client identification requires combining it with other information: client IP information, client operating system information, client device name, version number, etc. JA4 fingerprinting has two main application methods in these scenarios: fingerprint uniqueness detection and fingerprint consistency detection.

Uniqueness Detection:
Some client programs are built in such a way that they produce a JA4 fingerprint unique to that program, and these fingerprints change infrequently. Uniqueness detection matches incoming fingerprints against those known values, so such abnormal clients can be identified directly.

Application	JA4 Fingerprint
Chrome	JA4=t13d1517h2_8daaf6152771_b1ff8ab2d16f (initial)
Chrome	JA4=t13d1517h2_8daaf6152771_b0da82dd1658 (reconnect)
Firefox	JA4=t13d1715h2_5b57614c22b0_7121afd63204 (initial)
Firefox	JA4=t13d1715h2_5b57614c22b0_7121afd63204 (reconnect)
Safari	JA4=t13d2014h2_a09f3c656075_14788d8d241b
heritrix	JA4=t13d491100_bd868743f55c_fa269c3d986d
undetected_chromedriver	JA4=t13d1516h2_8daaf6152771_02713d6af862
IcedID Malware	JA4=t13d201100_2b729b4bf6f3_9e7b989ebec8
sqlmap	JA4=t13i311000_e8f1e7e78f70_d41ae481755e
AppScan	JA4=t12i3006h2_a0f71150605f_1da50ec048a3

Table 1: Common Client JA4 Fingerprints
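
A uniqueness check can be as simple as a lookup against known tool fingerprints. The sketch below uses the non-browser entries from Table 1 as its data; the dictionary and the function name `identify_tool` are hypothetical and do not represent the CDNetworks implementation.

```python
# Fingerprints of non-browser tools, copied from Table 1.
KNOWN_TOOL_FINGERPRINTS = {
    "t13d491100_bd868743f55c_fa269c3d986d": "heritrix",
    "t13d1516h2_8daaf6152771_02713d6af862": "undetected_chromedriver",
    "t13d201100_2b729b4bf6f3_9e7b989ebec8": "IcedID Malware",
    "t13i311000_e8f1e7e78f70_d41ae481755e": "sqlmap",
    "t12i3006h2_a0f71150605f_1da50ec048a3": "AppScan",
}


def identify_tool(ja4):
    """Return the known tool name for a JA4 fingerprint, or None if unknown."""
    return KNOWN_TOOL_FINGERPRINTS.get(ja4.strip().lower())


for fingerprint in (
    "t13i311000_e8f1e7e78f70_d41ae481755e",  # listed as sqlmap in Table 1
    "t13d2014h2_a09f3c656075_14788d8d241b",  # listed as Safari in Table 1
):
    tool = identify_tool(fingerprint)
    print(fingerprint, "->", tool if tool else "no known tool match")
```

A miss in this lookup does not mean the client is benign; it only means the fingerprint is not in the list, which is why the approach depends on the coverage of the fingerprint database.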

Consistency Detection:

The principle of fingerprint consistency detection involves comparing the client’s declared device information (operating system, browser type, version number) with its JA4 fingerprint to check if it matches the actual device information corresponding to the fingerprint.

Client-Declared Device Information	Client JA4 Fingerprint	Consistency
“browser”: “Chrome”, “browser_version”: “89.8.7866”, “os”: “Windows”, “os_version”: “7”	t12d290400_11b08e233c4b_017f05e53f6d	Abnormal
“browser”: “Chrome”, “browser_version”: “93.0.4622”, “os”: “Windows”, “os_version”: “10”	t13d431000_c7886603b240_5ac7197df9d2	Abnormal
“browser”: “Python Requests”, “browser_version”: “2.31”	t13d1516h2_8daaf6152771_02713d6af862	Abnormal
“browser”: “Chrome”, “browser_version”: “93.0.4577”, “os”: “Windows”, “os_version”: “10”	t13d1516h2_8daaf6152771_e5627efa2ab1	Normal
“browser”: “Firefox”, “browser_version”: “116.0”, “os”: “Ubuntu”	t13d321200_1b30506679d3_58ed7828516f	Abnormal
“browser”: “Edge”, “browser_version”: “14.14393”, “os”: “Windows”, “os_version”: “10”	t12d040400_a6a9ac001284_255c81f47ac1	Abnormal
“browser”: “Safari”, “browser_version”: “15.6”, “os”: “Mac OS X”, “os_version”: “10.15.7”	t13d2014h2_a09f3c656075_f62623592221	Normal

Table 2: JA4 Consistency Detection
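
The consistency check can be sketched as a comparison between the declared client and the fingerprints expected for it. In the Python sketch below, the `EXPECTED_FINGERPRINTS` mapping is a hypothetical stand-in for a real fingerprint database and contains only the two rows marked Normal in Table 2; the dictionary keys and the function name `check_consistency` are likewise assumptions. A production system would also key on browser and operating system versions.

```python
# Hypothetical mini database: (browser, os) -> JA4 fingerprints seen for it.
# Only the two "Normal" rows from Table 2 are included here.
EXPECTED_FINGERPRINTS = {
    ("Chrome", "Windows"): {"t13d1516h2_8daaf6152771_e5627efa2ab1"},
    ("Safari", "Mac OS X"): {"t13d2014h2_a09f3c656075_f62623592221"},
}


def check_consistency(declared, ja4):
    """Compare declared device information with the observed JA4 fingerprint."""
    key = (declared.get("browser"), declared.get("os"))
    expected = EXPECTED_FINGERPRINTS.get(key)
    if expected is None:
        # No reference data for this client, so no verdict can be given.
        return "Unknown"
    return "Normal" if ja4 in expected else "Abnormal"


declared_chrome = {"browser": "Chrome", "browser_version": "93.0.4577",
                   "os": "Windows", "os_version": "10"}

print(check_consistency(declared_chrome, "t13d1516h2_8daaf6152771_e5627efa2ab1"))
print(check_consistency(declared_chrome, "t13d431000_c7886603b240_5ac7197df9d2"))
```

The first call pairs a declared Chrome on Windows with its expected fingerprint, and the second pairs the same declaration with a fingerprint that does not belong to it. Declarations with no reference data return "Unknown" here; how to treat them is a policy decision outside this sketch.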

JA4 Fingerprint Database

In bot mitigation scenarios, whether through JA4 uniqueness or consistency characteristics, the underlying logic relies on a vast JA4 fingerprint database for data support, similar to JA3 fingerprinting. Therefore, building a comprehensive fingerprint database is one of the key factors in determining the success of JA4 in identifying bot traffic.

As the official JA4+ fingerprint database, related applications, and recommended detection logic are still under construction, there is currently no available fingerprint database. Therefore, the CDNetworks Security Lab has collected common client fingerprints for specific bot mitigation scenarios and implemented corresponding detection algorithms.

Conclusion

TLS fingerprinting is an effective tool for identifying automated clients. By analyzing the fields in the TLS client’s Client Hello packet, we can generate JA4 fingerprints and use them to identify specific malicious bot traffic.

While TLS fingerprinting can effectively detect bot traffic, it also has certain limitations. As attackers continuously upgrade and change their strategies, TLS fingerprints will continue to be tampered with or forged. Therefore, we need to constantly update and improve detection mechanisms to maintain an advantage in the ongoing battle between attack and defense.

In bot mitigation scenarios, TLS fingerprinting provides a powerful identification mechanism, but it cannot replace other security measures. It should be considered part of a comprehensive bot security strategy, used in conjunction with threat intelligence, browser fingerprinting, and other measures to provide thorough protection.
