+++
title = "Supporting ALSA compressed offload in PipeWire"
date = 2024-03-18
+++

## Background

**Editor's note**: this work was completed in late 2022 but this post was unfortunately delayed.

Modern audio hardware often comes with Digital Signal Processors (DSPs) integrated into SoCs and audio codecs. Processing compressed or encoded data on such DSPs saves power compared to carrying out the same processing on the CPU.

```
+---------+      +---------+      +---------+
|   CPU   | ---> |   DSP   | ---> |  Codec  |
|         | <--- |         | <--- |         |
+---------+      +---------+      +---------+
```

This post takes a look at how all this works.

## Audio processing

A traditional audio pipeline might look like the one below. An application reads encoded audio and then might leverage a media framework like GStreamer or a library like ffmpeg to decode the encoded audio to PCM. The decoded audio stream is then handed off to an audio server like PulseAudio or PipeWire, which eventually hands it off to ALSA.

```
+----------------+
|  Application   |
+----------------+
        | mp3
+----------------+
|   GStreamer    |
+----------------+
        | pcm
+----------------+
|    PipeWire    |
+----------------+
        | pcm
+----------------+
|      ALSA      |
+----------------+
```

With ALSA compressed offload, the same audio pipeline would look like this. The encoded audio stream is passed through to ALSA unchanged. ALSA then, via its compressed offload API, sends the encoded data to the DSP. The DSP does the decoding and rendering.

```
+----------------+
|  Application   |
+----------------+
        | mp3
+----------------+
|   GStreamer    |
+----------------+
        | mp3
+----------------+
|    PipeWire    |
+----------------+
        | mp3
+----------------+
|      ALSA      |
+----------------+
```

Since the processing of the compressed data is handed to specialised hardware, namely the DSP, this results in a significant reduction in power consumption compared to CPU-based processing.
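
In the traditional pipeline, every PCM buffer that reaches ALSA has a playback duration that follows directly from its size, which is something an audio server relies on for scheduling. The sketch below works through that arithmetic in C. The helper name `pcm_bytes_to_usec` and its parameters are hypothetical and are not part of ALSA or PipeWire.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not an ALSA or PipeWire API): convert the size of an
 * interleaved PCM buffer into its playback duration in microseconds.
 *
 * bytes_per_sample: 2 for S16LE, 4 for S32LE, and so on.
 * Returns 0 when the stream parameters are invalid.
 */
static uint64_t pcm_bytes_to_usec(uint64_t bytes,
                                  unsigned int bytes_per_sample,
                                  unsigned int channels,
                                  unsigned int rate_hz)
{
    /* One frame holds one sample for every channel. */
    uint64_t frame_size = (uint64_t)bytes_per_sample * channels;

    if (frame_size == 0 || rate_hz == 0)
        return 0;

    /* Whole frames in the buffer; a trailing partial frame is ignored. */
    uint64_t frames = bytes / frame_size;

    /* frames / rate gives seconds; scale first to keep integer precision. */
    return frames * 1000000ULL / rate_hz;
}
```

For a 1920-byte buffer of S16LE stereo audio at 48 kHz, `pcm_bytes_to_usec(1920, 2, 2, 48000)` yields 10000 microseconds: 1920 bytes at 4 bytes per frame is 480 frames, and 480 frames at 48000 frames per second is 10 ms.
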
## Challenges

- The ALSA Compressed Offload API, which is separate from the ALSA PCM interface, provides the control and data streaming interface for audio DSPs. This API is exposed to userspace by the [tinycompress](https://github.com/alsa-project/tinycompress) library.
- With PCM there is the notion of `bytes ~ time`. For example, 1920 bytes of S16LE, 2 channel, 48 kHz audio correspond to 10 ms (4 bytes per frame gives 480 frames). This breaks down for compressed streams. For most compressed data it is impossible to reliably estimate the duration of an audio buffer from its size.
- While sampling rate, number of channels and bits per sample are enough to completely specify PCM, various additional parameters may have to be specified to enable the DSP to deal with multiple compressed formats.
- For some codecs, additional firmware has to be loaded by the DSP. This has to be handled outside the context of the audio server.

## Requirements

- Expose all possible compressed formats.
- Allow a client to negotiate the format.
- Stream encoded audio frames and not PCM.

## PipeWire

PipeWire has become the default sound server on many Linux distributions, handling multimedia routing and audio pipeline processing. It offers capture and playback for both audio and video with minimal latency, and supports PulseAudio, JACK, ALSA, and GStreamer-based applications.

## SPA

PipeWire is built on top of SPA (Simple Plugin API), a header-only API for building plugins. SPA provides a set of low-level primitives.

SPA plugins are shared libraries (.so files) that can be loaded at runtime. Each library provides one or more `factories`, each of which may implement several `interfaces`.

The most interesting interface is the `node`.

- A node consumes or produces buffers through ports.
- In addition to ports and other well-defined interface methods, a node can have events and callbacks.

Ports are also first-class objects within the node.

- There is a set of port-related interface methods on the node.
- Ports may be statically allocated at instance initialization.
- There can be dynamic ports managed with the `add_port` and `remove_port` methods.
- Ports have `params` which can be queried using the `port_enum_params` method to determine the list of supported formats (`EnumFormat`), the currently configured format (`Format`), buffer configuration, latency information, `I/O areas` for data structures shared by the port, and other such information.
- Some params, such as the selected format, can be set using the `port_set_param` method.

## Implementing compressed sink SPA node

This section covers some primary implementation details of a PipeWire SPA node which can accept an encoded audio stream and then write it out using the ALSA compressed offload API.

```c
static const struct spa_node_methods impl_node = {
	SPA_VERSION_NODE_METHODS,
	.add_listener = impl_node_add_listener,
	.set_callbacks = impl_node_set_callbacks,
	.enum_params = impl_node_enum_params,
	.set_io = impl_node_set_io,
	.send_command = impl_node_send_command,
	.add_port = impl_node_add_port,
	.remove_port = impl_node_remove_port,
	.port_enum_params = impl_node_port_enum_params,
	.port_set_param = impl_node_port_set_param,
	.port_use_buffers = impl_node_port_use_buffers,
	.port_set_io = impl_node_port_set_io,
	.process = impl_node_process,
};
```

Some key node methods defining the actual implementation are as follows.

### **`port_enum_params`**

`params` for ports are queried using this method. This is akin to finding out the capabilities of a port on the node.

For the compressed sink SPA node, the following are present.

- `EnumFormat`: builds up the list of encoded formats that are handled by the node and returns it as the result.
- `Format`: returns the currently set format on the port.
- `Buffers`: provides information on the size, and the minimum and maximum number, of buffers to be used when streaming data to this node.
- `IO`: the node exchanges information via `IO` areas. There are various types of `IO` areas, such as buffers, clock, and position. The compressed sink SPA node only advertises `buffer` areas at the moment.


The results are returned in an [SPA POD](https://docs.pipewire.org/page_spa_pod.html).

### **`port_use_buffers`**

Tells the port to use the given buffers via the `IO` area.

### **`port_set_param`**

The various `params` on the port are set via this method.

A `Format` `param` request sets the actual encoded format that is going to be streamed to this SPA node by a PipeWire client, such as `pw-cat` or another application, for sending to the DSP.

### **`process`**

Buffers containing the encoded media are handled here. The media stream is delivered through the `IO` buffer area, using the buffers that were provided in `port_use_buffers`. The encoded media stream is then written to the DSP by calling `compress_write`.

### **`add_port`** and **`remove_port`**

Since dynamic ports aren't supported, these methods return `-ENOTSUP`.

## `pw-cat`

`pw-cat` was modified to support negotiation of encoded formats and to pass the encoded stream through as is when linked to the compressed sink node.

## Deploying on hardware

Based on discussions with upstream compress offload maintainers, we chose a Dragonboard 845c with the Qualcomm SDM845 SoC as our test platform.

For deploying Linux on embedded devices, the tool of choice is Yocto. Yocto is a build automation framework and cross-compile environment used to create custom Linux distributions and board support packages for embedded devices.

The primary dependencies are:

- tinycompress
- ffmpeg
- PipeWire
- WirePlumber

The `tinycompress` library is what provides the compressed offload API. It makes `ioctl()` calls to the underlying kernel driver and sound subsystem.

`ffmpeg` is a dependency for the example `fcplay` utility provided by `tinycompress`. It's also used in `pw-cat` to read basic metadata of the encoded media. This metadata is then used to determine and negotiate the format with the compressed sink node.

`PipeWire` is where the compressed sink node resides, and `WirePlumber` acts as the session manager for `PipeWire`.
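
To give a feel for the kind of `ioctl()` call involved, here is a minimal sketch that asks a compress device for its capabilities using only the kernel UAPI header `sound/compress_offload.h`. It is an illustration under assumptions, not the code used by `tinycompress` or by the compressed sink node. The helper names `query_compress_caps` and `print_compress_caps` are hypothetical, and the node is opened write-only on the assumption that it is a playback device.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <sound/compress_offload.h>

/*
 * Hypothetical helper: query the capabilities of the compress device node at
 * `path` (for example /dev/snd/comprC0D0, where the card and device numbers
 * are an assumption about the target system).
 * Returns 0 on success and -1 if the node cannot be opened or queried.
 */
static int query_compress_caps(const char *path, struct snd_compr_caps *caps)
{
    int fd, ret;

    memset(caps, 0, sizeof(*caps));

    /* Assumption: a playback node, so it is opened write-only. */
    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    /* The kernel fills in fragment limits and the list of codec IDs. */
    ret = ioctl(fd, SNDRV_COMPRESS_GET_CAPS, caps);
    close(fd);

    return ret < 0 ? -1 : 0;
}

/* Hypothetical helper: dump the fields filled in by the query above. */
static void print_compress_caps(const struct snd_compr_caps *caps)
{
    unsigned int i;

    printf("fragment size: %u to %u bytes\n",
           caps->min_fragment_size, caps->max_fragment_size);
    printf("fragments: %u to %u\n", caps->min_fragments, caps->max_fragments);

    /* Codec IDs are the SND_AUDIOCODEC_* values from compress_params.h. */
    for (i = 0; i < caps->num_codecs && i < MAX_NUM_CODECS; i++)
        printf("codec id: %u\n", caps->codecs[i]);
}
```

On a system without a compress device, `query_compress_caps()` simply returns -1 because the node cannot be opened. On offload-capable hardware, the codec IDs it reports are the raw material from which a list of supported encoded formats can be built.
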

Going into how Yocto works is beyond the scope of what can be covered in a blog post. Basic Yocto project concepts can be found [here](https://docs.yoctoproject.org/overview-manual/concepts.html?highlight=meta+layer#yocto-project-concepts).

In Yocto speak, a custom [meta layer](https://github.com/asymptotic-io/meta-asymptotic) was written.

Yocto makes it quite easy to build `autoconf`-based projects. A new `tinycompress` [bitbake recipe](https://github.com/asymptotic-io/meta-asymptotic/blob/master/recipes-multimedia/tinycompress/tinycompress.bb) was written to build the latest sources from upstream and also include the `fcplay` and `cplay` utilities for initial testing.

The existing PipeWire and WirePlumber recipes were modified to point to custom git sources, with minor changes to default settings included as part of the build.

## Updates since the original work

Since we completed the original patches, a number of changes have happened thanks to the community (primarily Carlos Giani). These include:

- A device plugin for autodetecting compress nodes on the system
- Replacing `tinycompress` with an internal library to make all the requisite `ioctl()`s
- Compressed format detection (which was previously waiting on [an upstream API addition we implemented in `tinycompress`](https://github.com/alsa-project/tinycompress/pull/16))

## Future work

- Make the compressed sink node provide clocking information. While the API provides a method to retrieve timestamp information, the relevant timestamp fields do not seem to be populated by the `q6asm-dai` driver.
- Validate other encoded formats. So far only MP3 and FLAC have been validated.
- Perhaps the wider community can help test this on other hardware.
- Add the capability to the GStreamer plugin to work with the compressed sink node. This would also help in validating pause and resume.