BlueZ: Linux Bluetooth Stack Overview
Found some time for another Bluetooth rant :) This time it's going to be about BlueZ, the Linux Bluetooth stack. Note that there are other Bluetooth stacks for Linux such as BTstack, but I didn't find the time to play around with these, and BlueZ is still what you get these days if you install a normal Linux distribution.
This is my view on BlueZ, and a couple of things might be over-simplified. Feel free to add comments to this post if anything is wrong or is better explained elsewhere. However, I found that there is no good overview from a programming and hacking perspective, and I often get questions about patching certain things within InternalBlue that have a root cause deep down in the Linux kernel.
BlueZ is missing documentation. In fact, I ended up using dynamic debugging here and there to understand which functions are still called and which are deprecated. Otherwise, this blog post would not be needed for an open-source project m)
Linux Bluetooth stack vs. mobile stacks
Most Bluetooth stacks consist of two components: A kernel module and a Bluetooth daemon. Note that many IoT applications simply run within the Bluetooth chip, so this model is not accurate in that case. However, it applies to iOS, Android, Linux and many other platforms.
On mobile devices, the kernel has high privileges and the Bluetooth daemon is a sandboxed process running as a separate user. Thus, mobile stacks tend to put all functionality into the Bluetooth daemon. For example, the iOS Bluetooth daemon has multiple abstract interfaces that support Broadcom chips with PCIe, UART and USB interfaces and also a few other chips, such as Apple's RTKit-based Bluetooth chip in the Apple Watch, and I think the Apple TV has yet another Bluetooth chip. Since tvOS and watchOS are based on iOS, all of them are supported by the iOS Bluetooth daemon.
The main design principle is: If anything can run in user space on a mobile device, it runs in user space.
Linux, however, does not have sandboxing and the Bluetooth daemon simply runs as root. The kernel is responsible for all the chip and protocol flavors. Even protocols like BNEP (Bluetooth tethering to share an Internet connection between two devices) are fully implemented in the kernel. If there's nothing to separate in terms of privileges, just put it all into the kernel...
(Pro tip: use the hostname honeypot when enabling Bluetooth on Linux to prevent being hacked :D Given the fact that Raspberry Pis ran out of Patchram slots years ago and any parsing issue within the Bluetooth kernel module or the Bluetooth daemon could lead to a full system compromise, that's the only reasonable action...)
BlueZ layers and interfaces
The BlueZ stack abstracts everything in the opposite direction. It uses a management interface that is exposed to user space. Opening a Bluetooth management socket requires special privileges. The Linux Bluetooth daemon uses this management interface to communicate information to the kernel. For example, on initialization, bluetoothd loads various information from /var/lib/bluetooth/ and sends it via the management interface using routines like load_devices. You can find a generic BlueZ overview and a BLE overview in presentations.
The corresponding part on the kernel is separated into protocol components (/net/bluetooth/) and device components (/drivers/bluetooth/).
Protocols include HCI, which is a standardized interface that lets an operating system (host) communicate with a Bluetooth chip (controller). However, even upper layer protocols like BNEP (tethering), SCO (headsets) and more are implemented in the kernel.
Drivers include various flavors for Intel, Broadcom, ... chips. This is what you would always expect to be in a kernel up to a certain extent.
For a project like InternalBlue, that directly wants to communicate over HCI with a Broadcom chip, there is still a mechanism that enables us to do this without patching the kernel. It is possible to open an HCI socket and send HCI commands. Some commands are restricted to root, others aren't. Wireshark and btmon do the same to observe packets.
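To make the "send HCI commands" part more concrete, here is a minimal sketch of how a raw HCI command is laid out before it goes into such a socket. It assumes the usual framing of a packet type byte, a 16-bit little-endian opcode built from OGF and OCF, and a one-byte parameter length. The function names are made up for this illustration and are not part of BlueZ or InternalBlue.

```python
import struct

HCI_COMMAND_PKT = 0x01  # packet type byte that marks an HCI command


def hci_opcode(ogf: int, ocf: int) -> int:
    """Combine the 6-bit OGF and the 10-bit OCF into a 16-bit opcode."""
    if not 0 <= ogf < 0x40 or not 0 <= ocf < 0x400:
        raise ValueError("OGF must fit into 6 bits and OCF into 10 bits")
    return (ogf << 10) | ocf


def build_hci_command(ogf: int, ocf: int, params: bytes = b"") -> bytes:
    """Return type byte + little-endian opcode + length + parameters."""
    if len(params) > 0xFF:
        raise ValueError("HCI command parameters are limited to 255 bytes")
    header = struct.pack("<BHB", HCI_COMMAND_PKT, hci_opcode(ogf, ocf), len(params))
    return header + params


# HCI Reset is OGF 0x03, OCF 0x0003, which yields opcode 0x0c03.
reset = build_hci_command(0x03, 0x0003)
print(reset.hex())
```

The resulting bytes are what would be written to an HCI socket opened with sufficient privileges. Opening and binding that socket is left out here on purpose, since it needs a real controller.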
When I asked why there is a management socket and an HCI socket in parallel and why the Bluetooth daemon exclusively uses the management socket, someone in the #bluez IRC channel told me that this is a feature. The kernel needs to talk HCI in some cases (e.g. when initializing the firmware of a chip), and this is prone to races with HCI traffic originating from the Bluetooth daemon. The management layer abstracts the HCI functionality with different command codes and is documented in detail in the BlueZ doc/mgmt-api.txt.
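For comparison with raw HCI, here is a rough sketch of what a management command looks like on the wire, based on the packet layout described in doc/mgmt-api.txt: a 6-byte little-endian header with command code, controller index and parameter length, followed by the parameters. The constants follow the names used in the kernel headers, while build_mgmt_command is a hypothetical helper for this example.

```python
import struct

MGMT_OP_READ_VERSION = 0x0001  # "Read Management Version Information"
MGMT_INDEX_NONE = 0xFFFF       # command is not tied to a specific controller


def build_mgmt_command(opcode: int, index: int = MGMT_INDEX_NONE,
                       params: bytes = b"") -> bytes:
    """Return command code, controller index and length, then parameters."""
    return struct.pack("<HHH", opcode, index, len(params)) + params


# Ask the kernel for its management interface version (no parameters).
packet = build_mgmt_command(MGMT_OP_READ_VERSION)
print(packet.hex())
```

Note that there is no packet type byte and no OGF/OCF here. The command codes live in their own number space, which is what lets the kernel keep control over the actual HCI traffic.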
So, yep, I was tracing hci_send_req and hci_send_cmd with Frida, but both are never called by the Bluetooth daemon during an active connection. It's still in the code because tools like hcitool can send raw HCI commands, as well as the socket implementation.
Broadcom diagnostics
One reason to hook into HCI and modify HCI is Broadcom diagnostic commands. On most HCI implementations, HCI commands have the hex prefix byte 0x01, HCI events use 0x04, and on the same layer, ACL has 0x02 and SCO has 0x03. Broadcom diagnostics introduce the prefix 0x07.
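The prefix bytes above can be summarized in a small lookup. This is just an illustrative sketch for classifying a captured packet by its first byte. The table and the function name are assumptions made for this example, not code from BlueZ.

```python
# Prefix bytes as listed above; 0x07 is the Broadcom-specific addition.
PACKET_TYPES = {
    0x01: "HCI command",
    0x02: "ACL data",
    0x03: "SCO data",
    0x04: "HCI event",
    0x07: "Broadcom diagnostics",
}


def classify(packet: bytes) -> str:
    """Name the packet type based on the first (prefix) byte."""
    if not packet:
        raise ValueError("empty packet")
    prefix = packet[0]
    return PACKET_TYPES.get(prefix, f"unknown (0x{prefix:02x})")


print(classify(bytes.fromhex("01030c00")))  # an HCI Reset command
print(classify(bytes.fromhex("07f001")))    # a Broadcom diagnostic packet
```

A stack that only knows the four standard prefixes would treat 0x07 as invalid, which is why the diagnostics need extra support somewhere along the path.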
If the Linux Bluetooth kernel module correctly identifies the chip as a Broadcom chip and if the chip has a UART (or USB? not sure any more, sorry...) interface, the stack creates a vendor_diag interface in the debugfs. In this case, you can enable diagnostics as follows:
echo 1 > /sys/kernel/debug/bluetooth/hci0/vendor_diag
From then on, you will see diagnostic messages in btmon, which is a BlueZ tool.
Sounds easy? Well, I only managed to see this interface and diagnostic messages once, when booting a plain Ubuntu on a MacBook :D While the chips in all Raspberry Pis could support this, it doesn't work there because they're using the wrong type of interface. It also doesn't work for Cypress chips since their vendor ID is 0x131 instead of 0xf, even though it should definitely work for a couple of these.
The diagnostics interface provides various interesting pieces of information. From a Bluetooth protocol security perspective, displaying Link Management Protocol (LMP) packets is the most interesting one. Sending and observing LMP is essential for the BIAS and KNOB attacks, and Daniele published a PoC for BIAS in a configuration that didn't use a hardware combination that supports BlueZ diagnostics out of the box. So I wondered how exactly he did it...
Patching around in the Linux kernel
The clean patch would be to fix vendor_diag within BlueZ and the Linux kernel. A while ago, I had a student waste a lot of time trying to get that to work, without success.
Echoing 1 to vendor_diag corresponds to sending the bytes 07 f0 01 over raw HCI, as InternalBlue does in the function enableBroadcomDiagnosticLogging. So, instead of properly fixing the diagnostic interface, which then would only allow decoding LMP with btmon and not the Wireshark LMP plugin as far as I know, Daniele opted to fix HCI within the kernel to accept the prefix 0x07. He did that for the kernel version 4.14.111 and didn't provide a diff file but just the full source code of the patched files. Thus, the solution to use his patches is basically running BlueZ with a pre-compiled 4.14.111 kernel /o\
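As a small illustration of the three bytes mentioned above, the following sketch builds the diagnostic toggle packet. The enable case (07 f0 01) is taken from the text. Using 00 as the last byte to disable diagnostics again is an assumption of this sketch, and the function name is hypothetical rather than InternalBlue's actual API.

```python
BCM_DIAG_PREFIX = 0x07  # Broadcom diagnostics packet type
BCM_DIAG_TOGGLE = 0xF0  # second byte of the enable sequence from the text


def broadcom_diag_toggle(enable: bool) -> bytes:
    """Build the raw packet that switches diagnostic logging on or off.

    Only the enable case is confirmed by the text above; 0x00 for
    disabling is an assumption.
    """
    return bytes([BCM_DIAG_PREFIX, BCM_DIAG_TOGGLE, 0x01 if enable else 0x00])


print(broadcom_diag_toggle(True).hex(" "))
```

On an unpatched kernel, writing these bytes to a raw HCI socket is expected to be rejected because of the 0x07 prefix, which is the restriction the patch lifts.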
Obviously, one could still diff Daniele's patched files against the original 4.14.111 kernel. His patch includes the following files:
include/net/bluetooth/hci.h
drivers/bluetooth/h4_recv.h
drivers/bluetooth/hci_uart.h
drivers/bluetooth/hci_h4.c
net/bluetooth/hci_sock.c
net/bluetooth/hci_core.c
At least my copy of 4.14.111 does not have h4_recv.h. Other than this, the patch is rather simple. It removes all existing diagnostic parsing and instead adds 0x07 as a new HCI type. This is definitely not a patch that should go into the mainline kernel, since it enables the usually invalid byte prefix on all devices regardless of whether they're a Broadcom chip or not. But at least the check whether they support Broadcom vendor diagnostics is irrelevant with that :) In case you want to see how the patch works, the diff output is here. Happy LMP hacking ;)
Hey Jiska, thanks for the post! It was a nice read.
I think it is also possible (for serial at least) to use the btattach util with -P bcm to view the vendor diag messages via the bluetooth-monitor interface without any kernel patching.
Hi Paul,
thanks for reading the blog post and the reply :) I just tried to get this working on a CYW20735 evaluation board with UART. I attach the board as you suggested:
# btattach -B /dev/ttyUSB0 -P bcm -S 3000000
However, setting the interface up results in a timeout:
# hciconfig hci0 up
Can't init device hci0: Connection timed out (110)
The btmon output indicates that the device is set to Broadcom. Something else still seems to go wrong :(
Bluetooth monitor ver 5.55
= Note: Linux version 5.10.0-5-amd64 (x86_64)
...
= New Index: 00:00:00:00:00:00 (Primary,UART,hci0) [hci0] 166.244078
= Open Index: 00:00:00:00:00:00 [hci0] 166.244154
= Index Info: 00:00:00:00:00:00 (Broadcom Corporation) [hci0] 166.244162
< HCI Command: Reset (0x03|0x0003) plen 0 #3538 [hci0] 166.244274
@ RAW Open: hciconfig (privileged) version 2.22 {0x0002} 171.154985
= Close Index: 00:00:00:00:00:00
...
A MAC of zeros usually indicates that the stack cannot talk to the board as needed. Using btattach without the -P bcm flag works without any issues.
Yes, it's a Cypress board, but I know the firmware, it has exactly the same debug feature. No idea what exactly goes wrong, dmesg just claims timeouts that don't happen when I don't use the Broadcom mode:
[790624.727709] Bluetooth: hci0: command 0x0c03 tx timeout
[790632.855113] Bluetooth: hci0: BCM: Reset failed (-110)
So, if you have any idea what exactly goes wrong that would be really helpful :)
Hey Jiska, thanks for the reply!
I also used a Cypress board (CYW20819) and it worked fine for capture.
I tried the steps you did and also had some timeout issues with your baud rate (3000000) and when switching between those btattach flags.
Not sure if the baud rate even matters, but with 115200 and without switching the modes it worked kind of reliably for me.
My system:
Bluetooth monitor ver 5.58
= Note: Linux version 5.11.14-arch1-1 (x86_64)
= Note: Bluetooth subsystem version 2.22
So I just attached the board with: sudo btattach -P bcm --speed 115200 -B /dev/ttyUSB0
Enabled the LMP messages with: echo 1 | sudo tee /sys/kernel/debug/bluetooth/hci1/vendor_diag
Started wireshark on the bluetooth-monitor interface.
Wireshark requires a small change to decode the LMP packets with your dissector plugin; it should show those packets as HCI_MON packets with opcode 11.
This method only works for capturing AFAIR. Sending bcm messages still requires patching.
Maybe this helps :)
Hi Paul :)
Sorry for the late reply. Using the CYW20735 board, I still get a timeout for `-P bcm`, no matter which baud rate I use.
But you're right - for the CYW20819 board the arguments you provided work. I guess most people bought the more recent CYW20819 board anyway. And I can confirm that I see vendor diagnostic messages with btmon and also in Wireshark :D
Best,
Jiska