NJA3

NJA3 is an algorithm for deriving a fingerprint string from a TLS Client Hello message. It aims to be a more robust and accurate version of JA3. It makes the following changes to JA3:

  1. extension codes are sorted in ascending order
  2. known conditional extensions are not included: server_name, padding, pre_shared_key, session_ticket, application_layer_protocol_negotiation, next_protocol_negotiation, token_binding, channel_id, channel_id_old
  3. the following code groups are added:
    • record header TLS version
    • supported TLS versions
    • signature algorithms
    • pre-shared key exchange modes
    • certificate compression algorithms
  4. 16-bit GREASE values are replaced with 0x0A0A (2570) and 8-bit ones (PskKeyExchangeModes) with 0x0B (11); their positions are preserved in all code groups except for the extensions group, in which codes are sorted
  5. the fingerprint hash is SHA-256 truncated to its leftmost 128 bits

Points 1 and 2 aim to make the fingerprint stable in the face of predictable variations in a client's TLS Client Hello message. Extension codes are sorted because Chromium randomizes the order of its extensions, and extensions that clients are known to send only some of the time are excluded. Most extensions in the exclusion list are taken from Troy Kent's "(JA) 3 Reasons to Rethink Your Encrypted Traffic Analysis Strategies".

Points 3-5 make the fingerprint more accurate. NJA3 adds values from within supported_versions, signature_algorithms, psk_key_exchange_modes and compress_certificate - extensions whose contents JA3 does not look at, most of which were standardized after JA3 was conceived. The TLS version from the record header is now also included. Each GREASE value is changed to 0x0A0A (if 16-bit) or 0x0B (if it is a PskKeyExchangeMode), and its position within each code group is preserved, with the exception of the extensions group, in which codes are sorted (this approach to GREASE is inspired by mercury's). MD5 is replaced with a more collision-resistant hash, SHA-256, truncated to preserve MD5's convenient 16-byte length (again, something mercury does as well).
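The GREASE substitution described above can be sketched in a few lines of Python. This is only an illustration, not the reference Go code: the function names are hypothetical, and the sets of GREASE values are those reserved by RFC 8701.

```python
# 8-bit GREASE values reserved for PskKeyExchangeModes in RFC 8701.
GREASE_PSK_MODES = {0x0B, 0x2A, 0x49, 0x68, 0x87, 0xA6, 0xC5, 0xE4}


def is_grease16(code):
    """True for the 16-bit GREASE values 0x0A0A, 0x1A1A, ..., 0xFAFA."""
    return (code & 0x0F0F) == 0x0A0A and (code >> 8) == (code & 0xFF)


def normalize_grease16(codes):
    """Replace every 16-bit GREASE value with 0x0A0A, keeping its position."""
    return [0x0A0A if is_grease16(c) else c for c in codes]


def normalize_psk_modes(modes):
    """Replace every 8-bit GREASE PskKeyExchangeMode with 0x0B, keeping its position."""
    return [0x0B if m in GREASE_PSK_MODES else m for m in modes]
```

For instance, a supported_groups list that starts with the GREASE value 0x3A3A would come out with 2570 in first position and the real groups unchanged.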

To sum it up, NJA3v1 is composed of the following code groups:

  • record header TLS version
  • handshake TLS version
  • cipher suites
  • extensions (sorted, conditional extensions ignored)
  • supported groups (from the supported_groups extension)
  • supported point formats (from the ec_point_formats extension)
  • supported TLS versions (from the supported_versions extension)
  • signature algorithms (from the signature_algorithms extension)
  • pre-shared key exchange modes (from the psk_key_exchange_modes extension)
  • certificate compression algorithms (from the compress_certificate extension)

Ignored extensions:

  • server_name (0)
  • padding (21)
  • pre_shared_key (41)
  • session_ticket (35)
  • application_layer_protocol_negotiation (16)
  • next_protocol_negotiation (13172)
  • token_binding (24)
  • channel_id (30032)
  • channel_id_old (30031)
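A minimal Python sketch of how the extensions group could be derived from the list above follows. It assumes the extension codes have already been parsed out of the Client Hello as integers; the names `IGNORED_EXTENSIONS` and `extensions_group` are hypothetical and not taken from the Go implementation.

```python
# Codes of the conditional extensions that NJA3v1 ignores.
IGNORED_EXTENSIONS = {
    0,      # server_name
    21,     # padding
    41,     # pre_shared_key
    35,     # session_ticket
    16,     # application_layer_protocol_negotiation
    13172,  # next_protocol_negotiation
    24,     # token_binding
    30032,  # channel_id
    30031,  # channel_id_old
}


def is_grease16(code):
    """True for the 16-bit GREASE values 0x0A0A, 0x1A1A, ..., 0xFAFA."""
    return (code & 0x0F0F) == 0x0A0A and (code >> 8) == (code & 0xFF)


def extensions_group(codes):
    """Drop ignored extensions, normalize GREASE to 0x0A0A, sort ascending."""
    kept = [c for c in codes if c not in IGNORED_EXTENSIONS]
    normalized = [0x0A0A if is_grease16(c) else c for c in kept]
    return sorted(normalized)
```

Because the result is sorted, any shuffling of the input list yields the same group, which is the property that keeps the fingerprint stable for clients that randomize extension order.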

Future versions of NJA3 may be defined, to adapt to changes in TLS and to amend shortcomings found in previous versions.

Why this name? The N used to stand for "normalized", which is what the folks at tlsfingerprint.io call their new fingerprints with sorted extension codes (see tlsfingerprint.io/norm_fp). However, since NJA3 has come to do more than sort extension codes, let's just say it means "nervuri's take on JA3".

Example

This is the NJA3v1 fingerprint for Chromium version 116.0.5845.180 running on Debian 12.1:

  • NJA3v1: 769,771,2570-4867-4865-4866-52393-52392-49195-49199-49196-49200-49171-49172-156-157-47-53,5-10-11-13-18-23-27-43-45-51-2570-2570-17513-65281,2570-29-23-24,0,2570-772-771,1027-2052-1025-1283-2053-1281-2054-1537,1,2
  • NJA3v1 SHA256/128: 8e0ed9d95486aa6a004a682cebd14afe

It's the same fingerprint in normal browsing mode and in incognito mode, whether session resumption is used or not. JA3, on the other hand, produces a different fingerprint on virtually every connection, because it preserves the randomized extension order.
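The layout of the example string suggests how the code groups are serialized: decimal codes joined with "-" inside a group, and groups joined with ",". The Python sketch below rebuilds the string from the Chromium code groups and hashes it. It rests on two assumptions not stated in this document: that the hash input is the fingerprint string encoded as ASCII (as in JA3), and that an absent extension would produce an empty group. The function names are hypothetical.

```python
import hashlib


def nja3_string(groups):
    """Join codes with '-' inside a group and groups with ','."""
    return ",".join("-".join(str(code) for code in group) for group in groups)


def nja3_hash(fingerprint):
    """SHA-256 of the string, truncated to the leftmost 128 bits, as hex.

    Assumption: the hash is computed over the ASCII fingerprint string.
    """
    return hashlib.sha256(fingerprint.encode("ascii")).digest()[:16].hex()


# Code groups of the Chromium example, in NJA3v1 order.
chromium_groups = [
    [769],                                   # record header TLS version
    [771],                                   # handshake TLS version
    [2570, 4867, 4865, 4866, 52393, 52392, 49195, 49199,
     49196, 49200, 49171, 49172, 156, 157, 47, 53],  # cipher suites
    [5, 10, 11, 13, 18, 23, 27, 43, 45, 51,
     2570, 2570, 17513, 65281],              # extensions (sorted)
    [2570, 29, 23, 24],                      # supported groups
    [0],                                     # supported point formats
    [2570, 772, 771],                        # supported TLS versions
    [1027, 2052, 1025, 1283, 2053, 1281, 2054, 1537],  # signature algorithms
    [1],                                     # PSK key exchange modes
    [2],                                     # certificate compression algorithms
]

fingerprint = nja3_string(chromium_groups)
print(fingerprint)
print(nja3_hash(fingerprint))
```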

Alternate approaches

Mercury's TLS fingerprint algorithm ignores any extension codes not found in the following set:

TLS_EXT_FIXED = {
    0x0001, 0x0005, 0x0007, 0x0008, 0x0009, 0x000a, 0x000b, 0x000d,
    0x000f, 0x0010, 0x0011, 0x0018, 0x001b, 0x001c, 0x002b, 0x002d,
    0x0032, 0x5500
}

Ignoring extensions outside of a fixed set has the advantage that future conditional extensions will not affect the fingerprint's stability. Perhaps future versions of NJA3 will use this approach. The drawback is that it makes the fingerprint less precise.
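To make the trade-off concrete, here is a small Python sketch of allowlist filtering using the set quoted above. It illustrates the idea only and is not mercury's code; the function name is hypothetical.

```python
# The fixed set quoted above.
TLS_EXT_FIXED = {
    0x0001, 0x0005, 0x0007, 0x0008, 0x0009, 0x000a, 0x000b, 0x000d,
    0x000f, 0x0010, 0x0011, 0x0018, 0x001b, 0x001c, 0x002b, 0x002d,
    0x0032, 0x5500,
}


def allowlisted_extensions(codes):
    """Keep only the extension codes in the fixed set, preserving order."""
    return [c for c in codes if c in TLS_EXT_FIXED]
```

With this filter, a new conditional extension is dropped automatically without any change to the algorithm. The cost is that two clients that differ only in extensions outside the set become indistinguishable.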

GREASE can be approached in several ways:

  • ignore GREASE values completely, as JA3 does;
  • normalize GREASE values and maintain their positions, as mercury and NJA3 do;
  • mark code groups which contain GREASE values, but ignore the positions of GREASE values within those groups - an intermediary approach.
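The difference between the three approaches shows up when two Client Hellos differ only in where GREASE values sit. The following Python sketch (hypothetical function names, 16-bit values only) implements each approach on a single code group.

```python
def is_grease16(code):
    """True for the 16-bit GREASE values 0x0A0A, 0x1A1A, ..., 0xFAFA."""
    return (code & 0x0F0F) == 0x0A0A and (code >> 8) == (code & 0xFF)


def ignore_grease(codes):
    """First approach: drop GREASE values entirely."""
    return [c for c in codes if not is_grease16(c)]


def normalize_grease(codes):
    """Second approach: replace GREASE values with 0x0A0A, keeping positions."""
    return [0x0A0A if is_grease16(c) else c for c in codes]


def mark_grease(codes):
    """Third approach: flag whether the group held GREASE, drop the values."""
    rest = ignore_grease(codes)
    return (len(rest) != len(codes), rest)


# Two groups that differ only in the GREASE value and its position.
first = [0x6A6A, 29, 23]
last = [29, 23, 0x1A1A]
```

Here `ignore_grease` and `mark_grease` return the same result for `first` and `last`, while `normalize_grease` tells them apart. Only the second approach is sensitive to GREASE position.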

RFC 8701 states that:

Implementations SHOULD balance diversity in GREASE advertisements with determinism. For example, a client that randomly varies GREASE value positions for each connection may only fail against a broken server with some probability. This risks the failure being masked by automatic retries. A client that positions GREASE values deterministically over a period of time (such as a single software release) stresses fewer cases but is more likely to detect bugs from those cases.

Following this guideline, Chromium places GREASE values at fixed positions within each list, including the extensions list, even as most real extensions are shuffled. This is what informed the choice to include GREASE positions in NJA3v1 (an exception is made for extension codes, which are all sorted to simplify implementation). Future versions of NJA3 will ignore GREASE positions if other TLS implementations are found to randomize them.

On a final note, string-based fingerprinting is fundamentally limited compared to a function-based approach. More advanced fingerprinting solutions store the entire Client Hello message and provide it as input to one or more client detection functions, the output of which can include a confidence level. In addition to TLS parameters and their order, such functions can make use of values within conditional extensions, as well as any perceivable patterns in the TLS implementation's behavior. Other messages in the TLS connection could also be used for fingerprinting - see the Future work section in "The use of TLS in Censorship Circumvention":

Client Hello messages provide a rich amount of features useful in fingerprinting TLS implementations, but there are other messages in the TLS connection that could be used to detect or block tools. For instance, once the connection is established and sends encrypted records, the lengths of these encrypted records may reveal differences between implementations
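The shape of a function-based approach might look like the Python sketch below. Everything in it is hypothetical: the dict layout of the parsed Client Hello, the two toy rules, their labels and their confidence values are invented for illustration and do not describe any real client or product.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    confidence: float  # between 0.0 and 1.0


def is_grease16(code):
    """True for the 16-bit GREASE values 0x0A0A, 0x1A1A, ..., 0xFAFA."""
    return (code & 0x0F0F) == 0x0A0A and (code >> 8) == (code & 0xFF)


def detect_grease_sender(hello):
    """Toy rule: the client advertises GREASE among its cipher suites."""
    if any(is_grease16(c) for c in hello.get("cipher_suites", [])):
        return Detection("sends-grease", 0.6)
    return None


def detect_h2_offer(hello):
    """Toy rule using a value inside a conditional extension (ALPN)."""
    if "h2" in hello.get("alpn", []):
        return Detection("offers-http2", 0.9)
    return None


def classify(hello, detectors):
    """Run every detector and return the most confident match, if any."""
    results = [detector(hello) for detector in detectors]
    matches = [r for r in results if r is not None]
    return max(matches, key=lambda r: r.confidence, default=None)
```

Unlike a fingerprint string, each detector sees the whole parsed message, so it can weigh fields that a stable string has to leave out.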

Implementation

The first implementation is written in Go. It is part of TLS Client Hello Mirror, a live instance of which runs at tlsprivacy.nervuri.net and will (among other things) generate the NJA3 fingerprint of any HTTPS or Gemini client that connects to it.