Hunting Ghosts in Bluetooth Firmware: BrakTooth Meets Frankenstein

There's a new Bluetooth vulnerability collection, called BrakTooth. PoCs are under NDA until end of October. Are these bugs real? Do they affect more than the Cypress dev kit, for example, iPhones and MacBooks?

Jan, who developed the Frankenstein emulator for Cypress chips and discovered the BlueFrag vulnerability within Android, joined my efforts in reproducing these bugs. All tooling we used is already public, as well as the BrakTooth vulnerability descriptions. We will not publish our own BrakTooth PoCs until the BrakTooth NDA ends to protect end users.


Link Management Protocol

If you're reading this and already have a background in Bluetooth, skip to the next section. Everyone else still needs to get familiar with a nightmare of more than 3,000 pages, called the "Bluetooth Specification". All page numbers refer to version 5.2, even though version 5.3 was released recently. As of now, there's no 5.3 chip available.

The Link Management Protocol (LMP) negotiates connections and their parameters in Bluetooth Classic. There's also a Link Control Protocol (LCP) in Bluetooth Low Energy, which serves a similar purpose, but apart from that has very different properties. Everything in this post will be about LMP packets as well as their headers.

LMP is available prior to pairing, since it is meant to initiate connections and new pairings. In this state, it provides basic information about the Bluetooth chip: The Bluetooth version and even the Bluetooth subversion, typically being identical to the firmware version. Anyone who knows the lower four bytes of a MAC address can talk LMP to a Bluetooth chip. (This is a lesser-known fact you can try at home using l2ping on Linux: the first 2 bytes can contain arbitrary values.) Even if a chip is not discoverable, meaning that it is not listed in the scan results of other devices, it typically accepts LMP connections. For those who are less familiar with embedded devices, a Bluetooth chip is analogous to a public web server, and everyone who knows its IPv4 address can connect to it and would immediately get all information about the software running on that server.

LMP does not only manage connections, it is also responsible for pairing two devices, starting encrypted sessions, etc. The logic bugs in LMP are endless, and instead of citing >10 papers, all you have to know is that encryption and pairing broke every few months since 2018. Meanwhile, all major vendors built their own key exchange mechanisms on top of Bluetooth. It is hard to find such logic bugs with fuzzing. Instead, one has to read the specification, find weird exceptions and error conditions, and try them on real implementations. In practice, some of these bugs were as simple as "setting a key component to zero" or "request a specification-compliant encryption downgrade".

In addition to not providing any security at all, LMP serves a few other basic tasks, such as negotiating parameters for the physical layer. If something on this layer is misconfigured, connections are dropped. Errors in the LMP state machine easily lead to denial of service, for example in the form of hangs, crashes, or degraded connection quality.

Each LMP packet has up to 17 bytes. The first byte is an opcode, and each opcode has a fixed length payload (p. 679). Nonetheless, parsing these bytes can be error-prone. BrakTooth V1 is a vulnerability in the ESP32 chip, which allows arbitrary code execution via an LMP_features_res_ext packet. Another issue I identified in Broadcom chips (CVE-2018-19860) allows code execution under certain constraints on older firmware for the undefined, vendor-specific opcode 0.
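
To make the "fixed length per opcode" idea concrete, here is a minimal sketch of what a strict length check could look like. The table only holds three PDUs and the lengths are written down from my reading of the specification's LMP PDU summary, so treat them as assumptions to verify. The names LMP_PDU_INFO and check_lmp_pdu are made up for this example and do not exist in any firmware. Note that on the wire the first byte carries a transaction ID in bit 0 and the opcode in bits 1 to 7.

```python
# Illustrative subset: opcode -> (name, total length in bytes including the
# opcode byte). Assumed values, to be checked against the specification.
LMP_PDU_INFO = {
    7: ("LMP_detach", 2),
    45: ("LMP_max_slot", 2),
    48: ("LMP_timing_accuracy_res", 3),
}

MAX_LMP_LEN = 17  # opcode byte plus up to 16 bytes of parameters


def check_lmp_pdu(pdu: bytes) -> str:
    """Classify a raw LMP PDU the way a strict parser might."""
    if not 1 <= len(pdu) <= MAX_LMP_LEN:
        return "reject: length outside 1..17"
    # Bit 0 of the first byte is the transaction ID, bits 1-7 are the opcode.
    opcode = pdu[0] >> 1
    info = LMP_PDU_INFO.get(opcode)
    if info is None:
        return f"reject: opcode {opcode} not in table"
    name, expected = info
    if len(pdu) != expected:
        return f"reject: {name} must be {expected} bytes, got {len(pdu)}"
    return f"accept: {name}"
```

A well-formed LMP_max_slot passes, while the same opcode followed by 30 extra bytes is rejected before any copy happens. Bugs show up when a parser skips a check like this or performs it too late.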

The LMP nightmare doesn't stop here. The BrakTooth researchers were the first to fuzz LMP payloads in combination with headers over-the-air. A Bluetooth packet has two headers: A packet header and a payload header. The curious reader will directly spot the redundant Flow bit, but it's worse than that.



The packet header is also called baseband header in the BrakTooth writeup. It defines the logical transport address (LT_ADDR, p. 450) as well as the packet Type (p. 471). Whether a type is valid for a given physical layer modulation scheme is decided by a fair dice roll.


A DM1 packet is basically an ACL packet (p. 474) and can have up to 18 bytes of payload (p. 477). Thus, it has an implicit length constraint of 18 bytes. When encoding an LMP packet into a DM1 packet, 1 byte is used for the payload header and the remaining 17 bytes are the LMP packet as described above. In contrast to a normal ACL packet, the Logical Link Identifier (LLID) is set to 3 to indicate that something is an LMP packet (p. 482).

The payload header has yet another length field. This field is 5 bits wide, which means that it could define payload lengths up to 31 bytes.

Depending on the packet type in the packet header, the payload header could also be 2 bytes, meaning that the length field is 10 bits wide, resulting in packet payloads of up to 1023 bytes.
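
The two header variants can be sketched in a few lines. This assumes the bit layout as I read it in the specification: LLID in the two lowest bits, then the Flow bit, then the length field, with the remaining bits of the 2-byte variant reserved. The function names are made up for illustration.

```python
def parse_payload_header(data: bytes, two_byte: bool = False) -> dict:
    """Decode an ACL payload header (fields are packed LSB first)."""
    if two_byte:
        raw = int.from_bytes(data[:2], "little")
        length = (raw >> 3) & 0x3FF  # 10-bit length field, 0..1023
    else:
        raw = data[0]
        length = (raw >> 3) & 0x1F   # 5-bit length field, 0..31
    return {"llid": raw & 0x3, "flow": (raw >> 2) & 0x1, "length": length}


def build_payload_header(llid: int, flow: int, length: int) -> bytes:
    """Encode a 1-byte payload header; only the field width is enforced."""
    if not 0 <= length <= 31:
        raise ValueError("a 5-bit length field cannot hold this value")
    return bytes([(length << 3) | ((flow & 1) << 2) | (llid & 3)])
```

With LLID=3 and Flow=1, a 2-byte LMP packet gets the header 0x17, while the maximum length of 31 gives 0xff. Nothing in the encoding itself ties the length field to the 17 bytes that LMP actually allows.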

Prior to receiving an LMP packet, the packet and the payload header must be parsed correctly. The combination of changing packet and payload header as well as the payload could lead to overflows and code execution in the worst case, or to very weird protocol parsing states. BrakTooth aims to identify such bugs over-the-air.

Vulnerability Classification

The BrakTooth website lists four vulnerabilities affecting the CYW20735 evaluation kit. The researchers used the official SDK to check if the firmware restarts or raises assertions. This means that they would get messages like DBFW_ASSERT_TYPE_FATAL on the Cypress SDK. They didn't use the Frankenstein heap sanitizer to check if these crashes are memory corruptions. They also didn't attach InternalBlue, which can parse stack dumps when the chip crashes and thereby helps with debugging potentially exploitable bugs that aren't memory corruptions. Instead, they were running the Cypress rfcomm_serial_port example application from the SDK, without detailed debugging of the chip's state.

Note that these example applications don't connect to a host but are standalone. Thus, the overall behavior of the stack might be a bit different when we attach Frankenstein, which attaches to the Linux BlueZ stack as a host. The LMP implementation is always handled by the controller (the CYW20735 chip), so even if we attach Frankenstein, vulnerabilities within LMP should reproduce.

But first, let's classify whether the crashes could be exploitable heap overflows or not. The older ThreadX block buffer structure, as present in the CYW20735 evaluation kit, doesn't have security checks. Each block starts with a 4-byte pointer to the next block. Thus, everything that looks like a controllable 4-byte overflow into the next block could be exploitable.
  1. AU Rand Flooding (V12). The vulnerability description states that "unsolicited LMP responses trigger a heap overflow", which in turn leads to DoS. Jan pointed out that the memory allocator is called dynamic_memory_AllocateOrDie. Dying is implemented by raising an assertion and terminating execution. Thus, if the firmware runs out of memory, it will terminate in a controlled way, not resulting in a memory corruption. There's also a function dynamic_memory_AllocateOrReturnNULL, which is used more rarely, but this should not crash when out of memory, assuming that the caller checks the return value.
  2. Invalid Max Slot Type (V13). Setting the LT_ADDR=0 and Type=0xa in the baseband header will crash the target. However, the type field is only indirectly associated with lengths, as it defines the packet type and modulation (BT 5.2 p. 471). The LT_ADDR is even less exciting, it only defines source/destination type of a transmission (p. 450). The specification states that this address shall not be 0 for certain packet types. So, most likely connections are reset or the firmware has an assertion in case this condition is not met, and would terminate gracefully.
  3. Max Slot Length Overflow (V14). If the payload header length is set to 31, there is an overflow. A one byte payload header, as used in LMP, can't exceed 31 bytes, since it is a 5 bit value (p. 482). However, the maximum payload length for LMP packets is 17 bytes (p. 679ff). The LMP_max_slot packet only has 2 bytes of payload. Thus, this might overflow by 29 bytes, which could be sufficient to corrupt the next block element by 4 or more bytes. That some more packets need to be sent afterwards, such as the LMP_auto_rate mentioned in the description, is a typical requirement for a memory corruption bug to cause a crash. This bug might lead to remote code execution! This is the bug that we'll analyze in-depth in the following.
  4. Invalid Timing Accuracy (V15). Setting invalid timings in a connection leads to a firmware crash. The BrakTooth description mentions that there's a watchdog in the firmware that leads to a restart. There is a Cypress SDK specific watchdog initialized in the wiced_init_timer, which most likely is what the BrakTooth researchers observed being triggered. When setting an invalid timing accuracy, the scan window is minimized (p. 653). This likely leads to lost connections and timeouts and indeed could trigger a watchdog. The vulnerability affects multiple chips and the BrakTooth description is a bit unclear. For example, they mention that the LMP_timing_accuracy_res would be sent using opcode 91 instead of 48 in the vulnerability's diagram, and the number of silent pairings, disconnects, and retries in the loop is unspecified. Based on the description, V15 is hard to reproduce and could also just lead to a memory exhaustion similar to V12.
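
Why is a 4-byte overflow into the next block so interesting? The following toy model shows the idea. It is a strong simplification and not the real ThreadX allocator: the class ToyBlockPool, its end marker, and the flat bytearray are all assumptions made for illustration. The only property taken from the firmware is that each block starts with a 4-byte pointer to the next block and that nothing validates it.

```python
import struct

HDR = 4  # each block starts with a 4-byte "next" pointer


class ToyBlockPool:
    def __init__(self, buf_size: int, capacity: int):
        self.stride = HDR + buf_size
        self.mem = bytearray(self.stride * capacity)
        # Chain all blocks into a free list; 0xFFFFFFFF marks the end.
        for i in range(capacity):
            nxt = (i + 1) * self.stride if i + 1 < capacity else 0xFFFFFFFF
            struct.pack_into("<I", self.mem, i * self.stride, nxt)
        self.free_head = 0

    def alloc(self) -> int:
        """Pop the first free block and return the offset of its data area."""
        block = self.free_head
        (self.free_head,) = struct.unpack_from("<I", self.mem, block)
        return block + HDR

    def write(self, data_off: int, data: bytes) -> None:
        """Unchecked copy, like a memcpy that trusts the sender's length."""
        self.mem[data_off:data_off + len(data)] = data


pool = ToyBlockPool(buf_size=48, capacity=4)
first = pool.alloc()
# 48 legitimate bytes plus 4 bytes that land in the next block's header.
pool.write(first, b"A" * 48 + struct.pack("<I", 0x41414141))
pool.alloc()
print(hex(pool.free_head))  # 0x41414141
```

After the overflow, one more allocation is enough to move the attacker's value into the free list head, so the allocation after that would hand out memory at an attacker-chosen address. This is also why a memory corruption often only crashes once a few more packets have been processed.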

Note that Jan fuzzed LMP using Frankenstein. He only focused on memory corruption bugs. If there were conditions that lead to terminated connections or weird audio artifacts, Frankenstein would not discover them. Ideally, if V14 is an overflow, Frankenstein should've found it. But fuzzers can also miss bugs, for example, due to limitations caused by the emulation, very rare conditions, and the fact that the fuzzed input to LMP doesn't complete connections most of the time.

Why do we even care about remote code execution in a Cypress dev kit? Well... Broadcom sold parts of their wireless IoT stuff to Cypress in 2016, and the code base is similar. (Cypress was later on acquired by Infineon, but from a technical standpoint, this only changes responsibilities and not bugs in the code.) A vulnerability in a Cypress chip is likely to affect Broadcom chips as well, which are literally everywhere: All iPhones, all MacBooks, the non-US Samsung Galaxy S and Note series, all Raspberry Pis, and more. Moreover, due to coexistence vulnerabilities I discovered, a code execution within the Bluetooth part of the chip means that one can also execute code within Wi-Fi. 

Is it patched?

The interesting part here is not the Cypress dev kit patch. Cypress has to triage the bugs and also inform Broadcom. Broadcom then needs to inform their customers and develop patches. This process takes rather long. There's a high chance that BrakTooth is not patched on iPhones. Everyone uses Bluetooth on smartphones...

After downloading the latest IPSW, we can extract it, then open the largest DMG and browse to /usr/share/firmware/bluetooth/. No matter if we do this on iOS 14.5.1 or 14.7.1, the file names and sizes are the same. According to the file names, the latest firmware is from January 2021. Depending on the iPhone model, there are different code names. For example, Aladdin is the iPhone Xs, Moana the iPhone 11, Fiti the iPhone SE2020, and all the nuts are the iPhone 12 series.

The date in the file name is not always accurate. But since it didn't change between these two iOS versions and the BrakTooth advisory by Cypress mentions that fixes for validation were sent to the researchers mid August, it's next to impossible that any of the iOS patches include BrakTooth fixes. The only exception would be that even the somewhat older iPhone Xs isn't vulnerable to BrakTooth, but that seems even more unlikely.

Time for hunting and reproducing these bugs!

Edit after the iOS 15 release: Finally some new firmware versions, so BrakTooth should be fixed. Somewhat concerning, though, the iPhone 13 code names seem to be Camellia, Mimosa, Lilac and Acacia and belong to the same chip as the iPhone 12, so they're saving some money by reusing the previous chip generation.

Sending LMP with InternalBlue

The BrakTooth researchers mention that they will release their ESP32-based PoCs end of October. I don't have an ESP32, and I don't want to sign an NDA. But the vulnerability descriptions are very verbose. Releasing the PoCs end of October only protects against script kiddies, but not from attackers that already have a toolchain for sending arbitrary LMP packets.

Dennis Mantz, who initially developed InternalBlue, added basic LMP sniffing and sending capabilities. Moreover, I already found two vulnerabilities within Broadcom LMP (CVE-2018-19860 and CVE-2019-6994). The initial version of InternalBlue only supports the Google Nexus 5 smartphone. I didn't bother too much to port some of the LMP sending capabilities to other chipsets. However, sending arbitrary LMP works for me. And it sufficiently worked for others, for example, the researchers who found the KNOB and BIAS vulnerabilities heavily relied on the Nexus 5 LMP feature.

The Nexus 5 implementation has some shortcomings. We can only send up to 17 bytes of payload. The baseband and payload header are generated later on by the firmware. This is sufficient to trigger some of the BrakTooth bugs according to their description, such as V1, V3, V4 and so on. But we want to reproduce V14, so it's a bit limited. Will 17 bytes be enough to trigger the same overflow? Given that the LMP_max_slot payload is supposed to be 2 bytes, this could work as well.

As a first setup, I use the Nexus 5 as attacking device. My target device is a MacBook Pro 2020 with the BCM4364B3 Bluetooth+Wi-Fi combo chip. This one is a bit special. Even though the MacBook is recent, this chip still connects via UART, not PCIe, and it seems to be based on a Cortex M3 instead of a Cortex M4. Thus, the transport layer is potentially similar to the CYW20735 chip. Moreover, I want to evaluate which packets arrive on the MacBook. Using the Bluetooth PacketLogger from Apple's Additional Tools for Xcode, we can enable LMP logging. Note that this will not work on all device combinations. But once again, it works for me. It's a nice and debuggable setup :)

All Broadcom and Cypress firmwares have an HCI command that enables sending LMP packets. This command is checked against packet lengths. The InternalBlue command fuzzlmp disables this check and we can now send arbitrary LMP payloads to an active connection.
After enabling InternalBlue's fuzzing feature, it is possible to send arbitrary LMP opcodes in combination with arbitrary payloads. In the following screenshot, I'm cycling through opcodes 2 to 8 and set the payload of all of these to the value 0x0123456789abcdef. Shortly after sending opcode 7, the connection is terminated, because this opcode corresponds to an LMP_detach packet. Connections might also be detached by the host for various other reasons. For example, a phone typically needs to be configured as wireless Bluetooth hotspot (BNEP) and is disconnected otherwise. So, injecting LMP before the target detaches can be a bit tricky without actually coding something within the Nexus 5 LMP state machine...
On the MacBook Pro 2020, we can open the PacketLogger and will see all LMP payloads injected by the Nexus 5. So, yaaay, the tooling we built some years ago still works!
Next, I build a PoC for V14, using the existing PoCs for my other LMP CVEs. All in plain assembly, what could possibly go wrong? Finally, I have something that sends an LMP_max_slot request. When I send it to the MacBook, music playback either gets horribly noisy or the connection is closed. I test it against a couple of iPhones: lots of noise on the iPhone SE2020 and iPhone 11, nothing happens on the iPhone 7, 8 and 12. Thus, the bug was introduced in some version and internally fixed in the latest version. I test it against the CYW20735 board. I use a Linux machine as a host in a loudspeaker role and play music from an iPhone 7, and then connect the Nexus 5 to it. The Frankenstein heap sanitizer triggers:
[*] Firmware says: Heap Corruption Detected
[*] Firmware says: Prehook
[*] Firmware says: dynamic_memory_sanitizer_lr = 0x02dee5
[*] Firmware says: dynamic_memory_sanitizer_r0 = 0x21134c
[*] Firmware says: dynamic_memory_sanitizer_r1 = 0x211380
[*] Firmware says: dynamic_memory_sanitizer_r2 = 0x07
[*] Firmware says: dynamic_memory_sanitizer_r3 = 0x05c4
[*] Firmware says: pool = 0x20d368
[*] Firmware says: pool->block_start = 0x2135e0
[*] Firmware says: pool->capacity = 0x10
[*] Firmware says: pool->size = 0x0180
[*] Firmware says: *free_chunk = 0x0400090f
The function at 0x02dee5 is hci_sendEvent. This is reasonable, since LMP_max_slot in turn informs the host about the maximum slots using HCI.

Okay, cool, so here we go and have arbitrary code execution? Maybe, not so fast...

New and old vulnerability

Still confused why 17 bytes seem to be sufficient, I try some more stuff. Even a normal LMP_max_slot causes the same behaviour, with a length set to 2 bytes. This is not an LMP overflow. This is something else.

When connecting to the Linux host with the same iPhone 7 + CYW20735 combo, without any modified LMP packets, there's still a heap overflow in hci_sendEvent. This is just CVE-2019-18614 being triggered. This is also indicated by the affected pool size, 384. If it were LMP that is overflowing, it would likely be in the pool with 48 byte elements. Quite interesting, because recently, when I tried to reproduce it on purpose, it didn't trigger. And now I can even play music without this CVE triggering any bugs in the firmware, but the Nexus 5 reliably triggers it. Ooops.

Just for reference, these are the block sizes and pools on the CYW20735 dev kit without patches against CVE-2019-18614, and I didn't update the dev kit while I was testing:
> info heap
[*]   [ Idx ] @Pool-Addr  Buf-Size  Avail/Capacity  Mem-Size @ Addr
[*]   -----------------------------------------------------------------
[*]   BLOC[0] @ 0x200498:       48    33 / 36           1872 @ 0x211610
[*]   BLOC[1] @ 0x2004BC:       96    20 / 20           2000 @ 0x211D60
[*]   BLOC[2] @ 0x2004E0:      268     9 / 10           2720 @ 0x212530
[*]   BLOC[3] @ 0x20D344:      384     4 /  4           1552 @ 0x212FD0
[*]   BLOC[4] @ 0x20D368:      384    16 / 16           6208 @ 0x2135E0
[*]   BLOC[5] @ 0x20D38C:      264    15 / 15           4020 @ 0x214E20
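
The sanitizer output and the heap listing can be cross-checked with a bit of arithmetic. The sketch below just copies the numbers from the two logs above; the helper names are made up. It also confirms the 4-byte header per block, since each pool's memory size equals capacity times (buffer size + 4).

```python
# (index, pool address, buffer size, capacity, memory size) from `info heap`.
POOLS = [
    (0, 0x200498, 48, 36, 1872),
    (1, 0x2004BC, 96, 20, 2000),
    (2, 0x2004E0, 268, 10, 2720),
    (3, 0x20D344, 384, 4, 1552),
    (4, 0x20D368, 384, 16, 6208),
    (5, 0x20D38C, 264, 15, 4020),
]
BLOCK_HEADER = 4  # per-block "next" pointer


def pool_by_address(addr: int):
    """Return the pool tuple whose control structure sits at addr."""
    for pool in POOLS:
        if pool[1] == addr:
            return pool
    return None


def mem_size_matches(pool) -> bool:
    """Check that memory size = capacity * (buffer size + header)."""
    _, _, buf_size, capacity, mem_size = pool
    return capacity * (buf_size + BLOCK_HEADER) == mem_size


# Values reported by the heap sanitizer: pool, pool->size, pool->capacity.
hit = pool_by_address(0x20D368)
print(hit[0], hit[2] == 0x0180, hit[3] == 0x10)  # 4 True True
print(all(mem_size_matches(p) for p in POOLS))   # True
```

So the corrupted pool is BLOC[4] with 384-byte buffers, not BLOC[0] with the 48-byte buffers where a 32-byte LMP element would be expected to live.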

So what is the noise that I hear on the iPhone SE2020 (14.7.1), 11 (14.5.1), and the MacBook Pro 2020 (11.5.2)? Apparently, a new vulnerability. But again just something in the LMP state machine and not exploitable beyond DoS. I already reported it, and in fact, it has nothing to do with LMP_max_slot. It was just me trying to program assembly and doing things horribly wrong. It may or may not be a bug collision with BrakTooth, but it doesn't match any of the BrakTooth vulnerability descriptions.

Setting the packet and payload header

The function DHM_LMPTx pre-processes and sends LMP payloads. LMP payloads are stored into an slist before being sent by the Bluetooth Core Scheduler (BCS). Elements in this slist are 32 bytes long. The first 12 bytes are typically set to zero, but can also contain 4 byte long callback function addresses. At offset 12, the struct continues with the LMP opcode, and then continues with the LMP payload. Offset 29 (=12 + maximum payload length of 17) contains the length. The length is obtained from a handler table (LM_LMPInfoTable and LM_LMPInfoTableEsc4). Setting other lengths at offset 29 and then putting the LMP payload into the slist results in a crash of the Nexus 5 firmware rather than an invalid packet being sent.
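
As a rough picture of that 32-byte element, here is a sketch that lays out the fields at the offsets described above. Two things are assumptions on my side: that the opcode byte is stored as-is, and that the length at offset 29 counts the opcode byte. The function name is made up, and bytes 30 and 31 are simply left at zero.

```python
ELEM_SIZE = 32
OPCODE_OFF = 12                          # after 12 bytes of list/callback data
MAX_LMP_LEN = 17                         # opcode plus up to 16 parameter bytes
LENGTH_OFF = OPCODE_OFF + MAX_LMP_LEN    # = 29


def build_lmp_slist_element(opcode: int, params: bytes, length=None) -> bytes:
    """Lay out one LMP transmit element; `length` overrides the table value."""
    pdu = bytes([opcode]) + params
    if len(pdu) > MAX_LMP_LEN:
        raise ValueError("LMP PDU does not fit into the element")
    elem = bytearray(ELEM_SIZE)          # first 12 bytes stay zero here
    elem[OPCODE_OFF:OPCODE_OFF + len(pdu)] = pdu
    # Normally taken from the handler table; assumed to include the opcode.
    elem[LENGTH_OFF] = len(pdu) if length is None else length
    return bytes(elem)
```

Calling build_lmp_slist_element(45, b"\x05", length=31) would model the naive attempt of just writing another length at offset 29, which is what crashes the Nexus 5 firmware instead of producing an invalid packet.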

Instead of using InternalBlue for the PoC, Frankenstein becomes an option. Technically, it does almost the same as InternalBlue on the Nexus 5. But Frankenstein supports writing patches in C, and we have full symbols for the CYW20735 board since they were leaked in Cypress WICED Studio 6.2.

We need to set the packet and payload headers to exactly reproduce BrakTooth. Jan already did this for EIR packets to build a PoC for CVE-2019-11516. The fact that Frankenstein identified this bug and that it requires an invalid payload header demonstrates that the fuzzer should also be able to identify BrakTooth V14, assuming that randomness and mutations are with us.

No matter if something is an EIR, SCO or ACL packet, the packet and payload header format is the same. Instead of setting the EIR headers in bcs_dmaTxEnableEir, we can set the ACL header (LMP is similar to ACL, LMP just uses LLID=3 instead of LLID=1 and 2) in bcs_dmaTxEnable. The remaining details are left as an exercise to the reader. But the feature set implemented by BrakTooth was always "hidden" inside Frankenstein. Frankenstein even ships with a patch that flips bits in the ACL headers, which led to discovering BlueFrag on Android. However, we never built a nice toolchain for it, and we never systematically tested a large number of devices over-the-air. The BrakTooth researchers built a nice framework, and doing all the tests is really cool to see. Also having another Bluetooth toolchain based on the ESP32 is great news.

What could overflow within LMP?

Based on Frankenstein, Jan and I build a couple of PoCs and try them against the Cypress eval kit. No crash happens. Is this overflow real?

The function DHM_BasebandRx handles LMP packets received via ACL. If the LLID is 0b11 (3), it will call the function lm_LmpReceived. This function adds an LMP packet with the same structure as for transmission to an slist and then calls bthci_lm_thread_SetEvent. This event signals the link manager to process the packet. lm_LmpReceived does not use memcpy with a variable length. Instead, independent from the length field in the payload header, it will copy 20 bytes from the ACL payload. 

int lm_LmpReceived(int ACLConnectionPtr, _DWORD *lmp_payload_20_bytes)
{
  int result; // r0
  int buffer_32; // r4
  void (__fastcall *callback_function)(int, int); // r2

  result = *(unsigned __int8 *)(ACLConnectionPtr + 155);
  if ( !result )
  {
    buffer_32 = lm_allocLmpBlock();// performs dynamic_memory_AllocateOrDie(32)
    *(_DWORD *)(buffer_32 + 12) = lmp_payload_20_bytes[1];// opcode starts at offset 12.
    *(_DWORD *)(buffer_32 + 16) = lmp_payload_20_bytes[2];
    *(_DWORD *)(buffer_32 + 20) = lmp_payload_20_bytes[3];
    *(_DWORD *)(buffer_32 + 24) = lmp_payload_20_bytes[4];
    *(_DWORD *)(buffer_32 + 28) = lmp_payload_20_bytes[5];
    callback_function = *(void (__fastcall **)(int, int))(ACLConnectionPtr + 4);
    if ( callback_function )
      callback_function(ACLConnectionPtr, buffer_32 + 12);
    *(_BYTE *)(buffer_32 + 31) = *(_DWORD *)ACLConnectionPtr;
    *(_BYTE *)(buffer_32 + 30) = 0;
    slist_add_tail((_DWORD *)buffer_32, (_DWORD **)&lmpMsgList);
    result = bthci_lm_thread_SetEvent(0x20000);
  }
  return result;
}
Note that this is one of the many places that would lead to a controlled firmware crash if memory is exhausted (BrakTooth V12 and maybe V15). Moreover, ignoring the length field in an LMP packet header and later on acknowledging it irrespective of the length is not specification-compliant (BrakTooth A1+A2).

The LMP payload will asynchronously be processed by lm_handleEvents. For the event code 0x20000, it calls lm_handleLmpMsg, which gets elements from the slist and either acknowledges them or processes them in lm_HandleLmpReceivedPdu. This LMP reception handler is finally the one that applies the length fields as defined in the Bluetooth specification using the function lm_getLmpInfoType, which accesses the definitions in LM_LmpInfoTable and LM_LmpInfoTableEsc4. Moreover, lm_HandleLmpReceivedPdu calls diag_logRxLmpPkt, which statically logs 17 bytes of the payload, no matter what the actual packet length was. This explains why we will see the additional bytes in Apple's PacketLogger, and why it always reports the LMP length to be 17 bytes, no matter what the actual transmitted packet length was. Thus, these effects are not caused by an overflow, but by handling all LMP packets as fixed length until actually processing their payload.

Even if the LMP_max_slot packet had 31 bytes of payload, these would be cut once they reach lm_LmpReceived.
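
The copy pattern of the decompiled lm_LmpReceived can be replayed in a few lines. This is only a model of the five fixed-size word copies and the two trailing byte writes. The padding of short inputs and the conn_id parameter are assumptions for the sake of a runnable example, and what the skipped first word contains is not clear from the decompilation.

```python
def lm_lmp_received_model(acl_payload: bytes, conn_id: int = 0) -> bytes:
    """Mimic the fixed-size copy in lm_LmpReceived (model, not firmware)."""
    src = acl_payload.ljust(24, b"\x00")   # pad so short inputs work here
    buf = bytearray(32)                    # dynamic_memory_AllocateOrDie(32)
    buf[12:32] = src[4:24]                 # words 1..5 go to offsets 12..31
    buf[31] = conn_id & 0xFF               # low byte of the connection word
    buf[30] = 0
    return bytes(buf)


short = lm_lmp_received_model(bytes(range(8)))
huge = lm_lmp_received_model(bytes(range(200)))
print(len(short), len(huge))  # 32 32
```

No matter how many bytes arrive or what the payload header claims, exactly 20 bytes are copied and the element stays 32 bytes long, so there is no room for an overflow at this stage.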

There are still a couple of checks a packet must pass before DHM_BasebandRx gets the packet, as well as some initial copying from the modem.

The modem

There's a big difference between our CYW20735-based PoC and the BrakTooth PoC for the ESP32. The platforms are fundamentally different. While it is possible to control the firmware on the CYW20735 evaluation kit and set modem registers to configure the modulation scheme (aka packet Type field), the payload Length field, etc., the modems might show different behavior.

To further evaluate and compare modem behavior in detail, the BrakTooth PoC, the CYW20735 PoC, and ideally an expensive SDR-based setup are all required. At the time of writing this article, we don't have all of these available.

The Bluetooth Core Scheduler (BCS) in the Cypress firmware has helper functions used during packet reception, called bcs_utilBbRxPktHdr, bcs_utilBbRxPktSetLength, bcs_utilBbRxPyldCheck, etc. When using two Cypress evaluation boards (CYW20735+CYW20819 in our setup), we extract information from these functions.

Once a payload length of more than 17 bytes is set on the attacker's end, the receiver will no longer see this packet in the bcs_utilBbRxPktSetLength function. This function gets length information from the pkt_log register and applies it to a target struct holding packet information. Thus, it is possible that the Cypress modem on the attacker or receiver end drops DM1 packets longer than 17 bytes. This could lead to different behavior and explain why our PoC does not lead to a crash.

In addition to DM1 packets (Type 3), the Cypress firmware also accepts 3-DH1 packets (Type 8) with LMP payloads, according to DHM_BasebandRx. 3-DH1 packets can have a regular payload length of up to 85 bytes (p. 485). We can successfully set the type to 3-DH1, and this type reaches the function bcs_utilBbRxPyldHdr over-the-air. Nonetheless, the payload is never actually passed on to LMP processing, even if we leave the length field intact. Yet, something in the modem starts spamming a lot of messages, resulting in way more bcs_utilBbRxPyldHdr calls than usual. In our tests, this did not lead to memory exhaustion, but in general, this could be possible.

This is not the first time that we encounter weird modem behavior in combination with setting lengths. The PoC for CVE-2019-11516 sets the EIR header length, similar to the method we used to reproduce the BrakTooth vulnerabilities. On the receiver, the EIR payload is mapped to a memory area of the modem. In this area, the EIR payload is duplicated, meaning that there is a partial copy of the original payload appended to the intended payload.

If BrakTooth triggers a crash or overflow in or via the modem, it is possible that our Cypress-based PoC doesn't reproduce this.

Improper state handling

If we search for improper state handling of LMP packets and their headers in general, the root cause is likely in the DHM_BasebandRx function, which decides if something is an LMP packet, or the aclTaskRxDone function, which initially parses the ACL packet and payload header. The bit and byte fiddling in these functions is super weird and might be error-prone. Issues in these functions are not related to LMP in particular, but ACL in general, and should've been detected by Frankenstein.

Something that could still cause some trouble is that lm_HandleLmpMaxSlotPdu accesses the current max slots setting for the active ACL connection. This is not related to an out-of-bounds access, but if the max slots setting is applied incorrectly or only on one of the two ends of a connection, this could cause the connection to be terminated, and maybe also lead to issues in the modem. This might be detected by an over-the-air testing approach like BrakTooth but not by Frankenstein.

When we run a PoC that sets the payload header length of the LMP_max_slot packet to 31 and send it to the MacBook Pro, it terminates all current connections. The LMP_max_slot packet sets the number of slots (BT 5.2 p. 606), and this can be accepted or denied. The max slot setting seems to be applied correctly and afterwards communication stops. This might be the BrakTooth vulnerability.

Here is a screenshot of a connection between the CYW20735 evaluation kit (attacker role) and the MacBook Pro (under attack) captured with the PacketLogger. The LMP_max_slot packet as well as the LMP_auto_rate packet are part of the normal connection setup. Thus, we only need to set the length to 31 in the LMP_max_slot packet. Setting it in the LMP_auto_rate packet instead seems to work as well, which we did in the screenshots below. The PacketLogger still shows both packets being sent/received. Afterwards, in contrast to a normal connection setup, the firmware attempts to send further packets, which are not acknowledged, and thus the connection ends in a supervision timeout. We assume that BrakTooth breaks something in the reception handler of the MacBook. The CYW20735 evaluation kit still receives the LMP messages from the MacBook and queues LMP reply messages.
The controller indicates the connection timeout after the last LMP message, which is a name request as shown in the screenshot, to the host, which is macOS in our case. This is a simple connection timeout, being reported to the host.
The BrakTooth researchers noted that they received a DBFW_ASSERT_TYPE_FATAL!!! module_id: 0, id: 9b00 message for this specific issue during testing. They were using the Cypress SDK, which doesn't have HCI, and, thus, might handle connection failures differently. The SDK handler for assertions cuts the first byte from the id, and the only reference using the id 0x80009b00 is in dynamic_memory_AllocateOrDie. This is significantly different from what we observed. The root cause for this is unclear and indicates that, although we were not able to reproduce BrakTooth on our end, it could still be dangerous. Assuming the least impact, it could indicate that the chip runs out of memory, but the allocator could also fail due to other reasons.
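
The mapping from the full assertion id to the printed one is a single mask. This sketch assumes that "cutting the first byte" means dropping the most significant 8 bits before printing in hex. The function name is made up.

```python
def sdk_assert_id(full_id: int) -> str:
    """Drop the top byte of a 32-bit assertion id and print it as hex."""
    # Assumption: the SDK handler masks off the most significant byte.
    return format(full_id & 0x00FFFFFF, "x")


print(sdk_assert_id(0x80009B00))  # 9b00
```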

On our end, the only special behavior is that if another device is connected in parallel, such as AirPods, all connections are terminated. This is likely a macOS-specific behavior, it does not happen on iPhones.

In comparison, these are the CYW20735 and MacBook establishing a connection while no attack is ongoing. LMP packets, both sent and received, appear in the log (even though sometimes out of order, since HCI is asynchronous), and the setup continues further with a proper LMP_detach message. Both devices are not paired, and the detach message after a few seconds is the expected behavior in this case.
When using the same PoC that works against the MacBook Pro for iPhones (tested: 8, 11, 12), connections are not terminated. We assume that this is due to the macOS Bluetooth stack. At the end of 2018, there used to be a similar behavior in the iOS Bluetooth stack: If a connection was initiated but not completed after 60 seconds, the Bluetooth daemon would restart. Apple has a lot of developers, and they keep them busy by maintaining two different Bluetooth stacks for iOS and macOS, as well as fixing all bugs twice.

We observe the same effect for BrakTooth V13, which also sends an LMP_max_slot packet but with an invalid LT_ADDR and Type. However, the modem might again have an effect on this setting.

Based on our triage, we assume that the BrakTooth vulnerabilities also apply to Broadcom chips, but are not exploitable beyond denial of service attacks. With root causes within modem components as well as the ACL and LMP state machines, Frankenstein could not identify them. We are looking forward to the BrakTooth researchers publishing their PoCs, as well as us publishing our PoCs, to test them in practice and against more targets.

Special thanks

We want to thank Matheus Garbelini from the BrakTooth researchers for commenting on our triage attempts with the Cypress evaluation kits, as well as plonk for proofreading this post.
