Broadcom Bluetooth: Generating (not so?) random numbers

When looking into the most recent Bluetooth patches of the Samsung Galaxy Note 20 5G, I found something interesting. Once again, it's about the Random Number Generator (RNG). Jörn, Felix, and I already published a WOOT paper about this last year. But there have been some updates since then, which this blog post covers. Oh, and it also contains assembly for the curious reverse engineer, additional honest opinions (I hope reviewer 2 will never find my blog), rants, and explanations for people who are new to Bluetooth.

If you already saw the talk or read the paper, you might want to skip forward to "The unexpected patch". But you will miss some ranting!!!11!

Why should anyone look into a random number generator?

An RNG is one of those dragons that live deep inside the firmware or even hardware. Just don't look at them. Otherwise, something might break :)

The Bluetooth Core Specification, currently at version 5.2, specifies that a Bluetooth chip has to provide a FIPS-compliant RNG. Secure encryption and authentication depend on this RNG. So, shortly before paper submission, we had one of those moments when Felix and I were frantically checking the specification to find out how exactly those schemes break, because this is not spelled out in the specification. But be assured, pairing security breaks if random numbers are not random.

And it's worse than this. If you're building an embedded device and add a Bluetooth chip, this means that you have a FIPS-compliant RNG. This RNG is easily accessible via the HCI command HCI_LE_Rand (0x2018, see p. 2521 of the specification). Thus, I've heard of someone using the Bluetooth RNG as a source of randomness for higher-level applications :D

This is not that far-fetched. By the way, the Android Bluetooth stack uses the HCI_LE_Rand command to obtain random numbers, instead of some operating system internal function. Probably not the worst idea given the heterogeneous devices Android runs on.
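For readers new to HCI, here is a minimal Python sketch of what such a request looks like on the wire. The opcode 0x2018 is the LE controller group (OGF 0x08) combined with command 0x0018, and the controller answers with a Command Complete event carrying a status byte and 8 random bytes. The function names are made up for this example, and the sketch assumes UART (H4) framing for the command; it only builds and parses bytes and does not talk to a real chip.

```python
import struct

OGF_LE_CONTROLLER = 0x08      # opcode group for LE controller commands
OCF_LE_RAND = 0x0018          # command field of HCI_LE_Rand
H4_COMMAND = 0x01             # UART (H4) packet indicator for HCI commands
EVT_COMMAND_COMPLETE = 0x0E   # event code of Command Complete


def hci_opcode(ogf: int, ocf: int) -> int:
    """Combine the 6-bit OGF and the 10-bit OCF into a 16-bit opcode."""
    return (ogf << 10) | ocf


def build_le_rand() -> bytes:
    """H4-framed HCI_LE_Rand: indicator, little-endian opcode, zero parameters."""
    opcode = hci_opcode(OGF_LE_CONTROLLER, OCF_LE_RAND)
    return struct.pack("<BHB", H4_COMMAND, opcode, 0)


def parse_le_rand_complete(event: bytes) -> bytes:
    """Return the 8 random bytes from a Command Complete event (no H4 byte)."""
    if len(event) < 14:
        raise ValueError("event too short for an HCI_LE_Rand response")
    code, _length, _num_cmds, opcode, status = struct.unpack_from("<BBBHB", event)
    if code != EVT_COMMAND_COMPLETE:
        raise ValueError("not a Command Complete event")
    if opcode != hci_opcode(OGF_LE_CONTROLLER, OCF_LE_RAND):
        raise ValueError("event belongs to a different command")
    if status != 0:
        raise ValueError(f"controller returned status 0x{status:02x}")
    return event[6:14]


if __name__ == "__main__":
    print(build_le_rand().hex())
```

Every caller gets exactly 8 bytes per command, so anything that needs more randomness has to send the command repeatedly.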

Finding a hardware bug in a Bluetooth RNG could result in weak security - and this bug might be unpatchable. Quite a motivation to look into it.

Current state of Bluetooth security

But then a couple of things happened in 2020. So, given the number of specification-compliant security issues found recently, one could even assume that the specification might have been bugdoored. Let's take a look at the latest Bluetooth security notices:

So, what does all that even mean? Back in the day of the first Bluetooth specifications, they were full of bugs. To solve these issues, the Bluetooth SIG designed a completely new scheme called Secure Simple Pairing (SSP). There is even a paper dating back to 2007, which proves SSP to be secure. Yet, protocol proofs are limited to what you actually modeled. So I'm explaining the bugs discovered in 2018-2020 for human, non-crypto readers :D

SSP is based on elliptic curves. Two points on this elliptic curve are exchanged. The Bluetooth specification only required to authenticate the x coordinate of these points, but not the y coordinate. An attacker could, thus, choose invalid points that were not located on the curve.
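The missing check is cheap: a receiver only has to plug both coordinates into the curve equation. Below is a minimal sketch on a toy curve. The parameters P, A, and B are made up for illustration and are not the curves Bluetooth uses; only the shape of the check carries over.

```python
# Toy curve y^2 = x^3 + A*x + B over GF(P); illustrative parameters only.
P, A, B = 97, 2, 3


def is_on_curve(x: int, y: int) -> bool:
    """Check that (x, y) satisfies the curve equation modulo P."""
    return (y * y - (x * x * x + A * x + B)) % P == 0


if __name__ == "__main__":
    print(is_on_curve(3, 6))   # valid point
    print(is_on_curve(3, 7))   # same x, manipulated y
```

A point with a valid x but a manipulated y fails this test, which is exactly what an implementation that only looks at x never notices.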

During each session, a new encryption key is generated. In the KNOB attack, the attacker sets the encryption key size (basically its entropy) to 1 byte instead of 16 bytes. This is a valid command in the Bluetooth specification /o\ The same authors later on published the BIAS attack.

SSP and its BLE variant allow pairing mechanisms like comparing numbers displayed on two devices or entering a number into one device that is displayed on the other device. In the method confusion attack, both modes are mixed m)

Classic Bluetooth and BLE can convert keys of one type into the other. There exist some devices that e.g. broadcast their identity using BLE but transfer data using Classic Bluetooth. I think my headphones do that, for example. So this is a legitimate action, but that key conversion can also break things.

Given all these attacks, it would be nice to see a completely new pairing mechanism. The current one is called "Secure Simple Pairing" but neither secure nor simple.

Since most attacks were published in 2020, finding a bug in an RNG doesn't seem that tragic anymore. Bluetooth security in terms of pairing and encryption is flawed anyway. But you might face RNGs in different places that are equally important.

Initial measurements

The Bluetooth firmware in the Nexus 5 as well as all Cypress evaluation boards has a Pseudo-Random Number Generator (PRNG) fallback in case the Hardware Random Number Generator (HRNG) is not available. On all those devices the HRNG seems to be always available. But the code was maintained over years, i.e., the newer Cypress evaluation boards added a cache. Thus, we assumed that:

  • The HRNG is not available at all on some devices. (PRNG fallback)
  • The HRNG is too slow sometimes. (Caching)

The PRNG uses some hardware-related registers as random input. Jörn measured these registers and found that they were not random at all. This is not too surprising, though. According to the symbol leaks, their names are dc_fhout_adr and agcStatus_adr. Thus, they are related to signal processing and might stay the same for at least the transmission period of one Bluetooth frame and the corresponding response cycle, which is 0.625ms.

These registers are then combined with a CRC32 and the previous value. But CRC32 is not a cryptographically secure hash function. Thus, Felix told me that values might become xor-dependent over a series of values returned by the PRNG. Ooops :)
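To see why CRC32 is a poor mixing function, note that it is affine over xor: for any three inputs of the same length, the xor of their checksums equals the checksum of their xor. The sketch below demonstrates only this property, with Python's zlib.crc32 standing in for the firmware's CRC32. It is not a reimplementation of the PRNG, and the helper names are made up for this example.

```python
import os
import zlib


def xor_bytes(*chunks: bytes) -> bytes:
    """XOR several equal-length byte strings together."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        if len(chunk) != len(out):
            raise ValueError("all inputs must have the same length")
        for i, byte in enumerate(chunk):
            out[i] ^= byte
    return bytes(out)


def crc32_xor_relation_holds(a: bytes, b: bytes, c: bytes) -> bool:
    """True if crc(a) ^ crc(b) ^ crc(c) == crc(a ^ b ^ c)."""
    lhs = zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)
    return lhs == zlib.crc32(xor_bytes(a, b, c))


if __name__ == "__main__":
    a, b, c = (os.urandom(16) for _ in range(3))
    print(crc32_xor_relation_holds(a, b, c))
```

A cryptographic hash has no such relation between inputs and outputs, which is why structure in barely changing inputs can leak into the outputs when CRC32 does the mixing.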

Given those measurements, we started responsible disclosure mid-January 2020. Since this is Broadcom, we first got a CVE from MITRE (CVE-2020-6616), which usually takes 1-3 days, and then reported the issue to Broadcom, Cypress, and a few selected customers. This was the end of Jörn's thesis...

Broadcom's reply was as expected: Yes, there is a PRNG fallback. But we never use the PRNG fallback.

Broken device contest winner: Samsung Galaxy S8

In previous responsible disclosures, Broadcom didn't necessarily tell the truth. Moreover, they're maintaining so many different firmware versions and seem to have very bad version management. They might simply not be aware if there was a device without HRNG.

I found the time to look into a few more devices at the end of January. My first pick was the Samsung Galaxy S8. It's the first Bluetooth 5 compatible smartphone. And as such it definitely deserves a trophy: ⭐ We test in production! ⭐ At least this also means that the Samsung Galaxy S8 doesn't have 3-year-old libraries and just a somewhat newer compile date as some other chips, yay.

There's probably not too much to do to avoid testing in prod as a developer. Not that many compatible real devices to test against. But believe me, every single BLE handler has a hook into the Patchram on this device m) And the best part, in January 2020 it was still in Samsung's monthly patch cycle. So it was simply my first pick.

Guess what? It only had a PRNG. No HRNG.

So, on the night from January 31 to February 1, I wrote yet another mail to Samsung because they won the broken device lottery. However, they were a bit grumpy because they didn't make the 90-day disclosure deadline, I guess, so it's just the CVE mentioned in their update and not any reporter names. At least the CVE fulfilled its purpose: Tracking Broadcom's patches, which is a mess anyway...

Better fix than expected

Apparently, the HRNG was still present on the Samsung Galaxy S8. But without the code for initialization during startup, reading random values from the registers wouldn't work. So, after the June 2020 update, random numbers are random again. Since the RNG implements caching, one needs to request multiple values until the HRNG register changes.

The unexpected patch: Samsung Galaxy Note 20 5G

Shortly after the Samsung Galaxy S8, Broadcom rewrote the RNG library and removed the PRNG fallback. Devices like the Samsung Galaxy S10/S20 or the iPhone 11/SE2 no longer had a PRNG :)

Trying to understand how checksums are applied to .hcd files (see the previous blog post), I was looking for references to CRC32 and other hash functions on a Samsung Galaxy Note 20 5G. And suddenly saw a patch that re-introduced the PRNG fallback:


The weirdest part in this piece of code is that they still use CRC32, even though the firmware is capable of calculating cryptographically secure hashes. Maybe they didn't read our paper, who knows. Building a more secure PRNG fallback would definitely be possible. This PRNG fallback is still insecure if triggered multiple times in a row.

So I wrote to Samsung and Broadcom again. They replied within a week, that's quite fast :D And their response was... The PRNG fallback is used in the very exceptional case that the HRNG is not available.

Or is it?

I didn't do any stress test on the Note 20, but I did stress test the HRNG on a couple of other devices and can confirm that the PRNG fallback was never triggered. Given the screenshot of the PRNG, one can simply set a breakpoint at the first address and then run the HCI command HCI_LE_Rand to request some random values. Looks okay so far:

# Check that the PRNG function is where we expect it to be
> disasm 0x17AD04
  17ad04:       b570            push    {r4, r5, r6, lr}
  17ad06:       f698 fec7       bl      13a98 <.text-0x16726c>
  17ad0a:       4b1a            ldr     r3, [pc, #104]  ; (17ad74 <.text+0x70>)
  17ad0c:       4c1a            ldr     r4, [pc, #104]  ; (17ad78 <.text+0x74>)

# InternalBlue on Android 11 is a bit slower, but code is changed despite timeout after multiple seconds
> writeasm 0x17ad04 bkpt 1
[*] Assembler was successful. Machine code (len = 2 bytes) is:
[!] _sendThreadFunc: No response from the firmware.
[!] sendHciCommand: waiting for response timed out!
[<] Writing Memory: Write failed!
> disasm 0x17AD04
  17ad04:       be01            bkpt    0x0001
  17ad06:       f698 fec7       bl      13a98 <.text-0x16726c>
  17ad0a:       4b1a            ldr     r3, [pc, #104]  ; (17ad74 <.text+0x70>)
  17ad0c:       4c1a            ldr     r4, [pc, #104]  ; (17ad78 <.text+0x74>)

# Prior to first usage
> hd -l 0x20 0x352600
00352600: 03 00 00 00  ff ff 0f 20  ff ff ff ff  77 77 77 77   
                             ^---- not ready!

# A bit later... use a long sleep because Android 11 is sloooow.
> repeat 7000 sendhcicmd 0x2018
> hd -l 0x20 0x352600
00352600: 03 00 00 00  01 00 00 00  d3 09 90 b8  77 77 77 77  
                                      ^---- new random value 

# ...and again a bit later
> sendhcicmd 0x2018
> hd -l 0x20 0x352600

00352600: 03 00 00 00  00 00 00 00  ec 1b a7 6a  77 77 77 77 
                                      ^---- new random value 

Accessing the HRNG like this is very slow. But I measured it with custom assembler snippets on a lot of devices and found that when it's present it's also fast enough and random according to the Dieharder test suite. So I don't see any reason why the Note 20 should be any different.

The cache on top of the HRNG is quite interesting, though, because it doesn't fill the cache continuously but only when it runs out of random numbers. So the cache might still run into the same performance issues as on the plain Nexus 5 code without a cache >.<
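The effect of refilling only on exhaustion can be sketched in a few lines of Python. Everything here is an assumption for illustration: the class name, the cache size, and the idea that the whole cache is refilled in one batch. The source only tells us that the refill happens when the cache runs out.

```python
from collections import deque
from typing import Callable, Deque


class RefillOnEmptyCache:
    """Hypothetical cache that touches the slow HRNG only when it is empty."""

    def __init__(self, read_hrng: Callable[[], int], size: int = 4) -> None:
        self._read_hrng = read_hrng
        self._size = size
        self._values: Deque[int] = deque()
        self.slow_reads = 0  # number of (slow) HRNG accesses so far

    def rand(self) -> int:
        if not self._values:
            # The unlucky caller that finds the cache empty waits for the refill.
            for _ in range(self._size):
                self._values.append(self._read_hrng())
                self.slow_reads += 1
        return self._values.popleft()


if __name__ == "__main__":
    counter = iter(range(1000))          # stand-in for the real HRNG
    cache = RefillOnEmptyCache(lambda: next(counter), size=4)
    for _ in range(5):
        cache.rand()
    print(cache.slow_reads)
```

In this model, a burst of requests larger than the cache still ends up waiting on the HRNG, which is the same bottleneck the uncached code has.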

Open questions

So, why do Broadcom engineers distrust their HRNG that much? Why did they maintain multiple PRNG fallback versions? I don't know, but I have a couple of ideas.

A data sheet of another Broadcom chip said that they might be using a temperature-based HRNG. Jörn even did some measurements at low and high temperatures. Anyway, that didn't lead to anything and probably it's just the thermal noise, so the surrounding temperature really doesn't matter.

The RNG code that accesses the HRNG always performs the following checks:

  • Is the HRNG available? If not, execute the PRNG.
  • Wait for the HRNG until the next 4 byte value is ready.

For the curious reverse engineer, it looks like this on the CYW20819 evaluation board:


The function rbg_rand is the only one that accesses the HRNG. The firmware runs on a single ARM core. So I don't expect any issues due to multithreading or similar.
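For readers who prefer pseudocode over assembly, the two checks listed above can be sketched in Python as follows. This is a model of the control flow only: the callables stand in for register accesses, all names are hypothetical, and the poll limit is my addition so that the sketch terminates. I don't know whether the firmware has a timeout at that point.

```python
from typing import Callable


def rng_read_sketch(
    hrng_available: Callable[[], bool],
    hrng_ready: Callable[[], bool],
    read_hrng: Callable[[], int],
    prng_fallback: Callable[[], int],
    max_polls: int = 1000,
) -> int:
    """Return one 32-bit value, preferring the HRNG over the PRNG fallback."""
    # Check 1: no HRNG at all, so use the PRNG fallback.
    if not hrng_available():
        return prng_fallback() & 0xFFFFFFFF
    # Check 2: busy-wait until the next 4 byte value is ready.
    for _ in range(max_polls):
        if hrng_ready():
            return read_hrng() & 0xFFFFFFFF
    raise TimeoutError("HRNG did not become ready")


if __name__ == "__main__":
    polls = iter([False, False, True])
    value = rng_read_sketch(
        hrng_available=lambda: True,
        hrng_ready=lambda: next(polls),
        read_hrng=lambda: 0xB89009D3,
        prng_fallback=lambda: 0,
    )
    print(hex(value))
```

Note that the fallback decision is made once up front, so in this model the PRNG only ever runs when the availability check itself fails.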

But there is one core that runs in parallel on combo chips: The Wi-Fi core. It might be that they share the same HRNG and that it's not available for a short moment in that case. So far that's my most plausible explanation. While randomness exhaustion between Wi-Fi and Bluetooth would definitely be funny, we already have much worse memory leakage and even code execution (CVE-2020-10367 and CVE-2020-10368). So at that point in time it doesn't matter if this was the cause and I'm too lazy to make those measurements. Feel free to make them if you're looking for a low-hanging Broadcom CVE that will never be patched ;)

