Add new post on UIO

Signed-off-by: Sanchayan Maity <maitysanchayan@gmail.com>
2019-08-30 23:56:42 +05:30 · 2019-08-30 23:56:42 +05:30 · 1571ec0480
parent 85271c6438
commit 1571ec0480
1 changed files with 178 additions and 0 deletions
--- a/posts/2019-08-30-tale-of-working-with-uio.md
+++ b/posts/2019-08-30-tale-of-working-with-uio.md
@@ -0,0 +1,178 @@
+---
+author: Sanchayan Maity
+title: A tale of working with Xilinx DisplayPort & UIO
+tags: linux, drivers, kernel
+---
+
+At work, we use a DisplayPort IP from Xilinx, which comes as a
+[TX](https://www.xilinx.com/support/documentation/ip_documentation/v_dp_txss1/v2_0/pg299-v-dp-txss1.pdf)
+and an [RX](https://www.xilinx.com/support/documentation/ip_documentation/v_dp_rxss1/v2_0/pg300-v-dp-rxss1.pdf)
+subsystem. Xilinx does not provide a Linux driver for either. Baremetal code
+is provided; however, we needed support for Linux.
+Ignoring interrupts, it is easy to get this baremetal code to work on Linux.
+Xilinx's baremetal code at its core uses the [Xil_Out32](https://github.com/Xilinx/embeddedsw/blob/master/lib/bsp/standalone/src/common/xil_io.h#L219) and [Xil_In32](https://github.com/Xilinx/embeddedsw/blob/master/lib/bsp/standalone/src/common/xil_io.h#L147)
+functions for writing and reading registers. The implementations for
+these can be replaced with mmap for accessing the registers. For DP TX side, we
+do not need to handle interrupts and setting up the registers is enough. For RX,
+however, we need interrupts to set up RX. For example, the link training for DP
+is initiated once a Training Pattern 1 (TP1) interrupt is detected.
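+
+As a rough sketch of that replacement (not the code we actually shipped), the
+two accessors can be reduced to volatile loads and stores on an mmap-ed base
+pointer. The names `map_registers`, `reg_write32` and `reg_read32` are
+hypothetical, and the sketch assumes the registers are exposed as the first
+memory map of an already opened device node such as */dev/uioX*.
+
+```c
+#include <assert.h>
+#include <stddef.h>
+#include <stdint.h>
+#include <sys/mman.h>
+
+/* Map `size` bytes of register space from an open device node.
+ * Assumption: the registers are the first map of the node (offset 0).
+ * Returns NULL on failure. */
+static inline volatile void *map_registers(int fd, size_t size)
+{
+	void *base = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+
+	return base == MAP_FAILED ? NULL : base;
+}
+
+/* Hypothetical stand-in for Xil_Out32: store a 32-bit value at a byte offset. */
+static inline void reg_write32(volatile void *base, size_t offset, uint32_t value)
+{
+	assert((offset & 3u) == 0); /* 32-bit registers are 4-byte aligned */
+	*(volatile uint32_t *)((volatile uint8_t *)base + offset) = value;
+}
+
+/* Hypothetical stand-in for Xil_In32: load a 32-bit value from a byte offset. */
+static inline uint32_t reg_read32(volatile void *base, size_t offset)
+{
+	assert((offset & 3u) == 0);
+	return *(volatile uint32_t *)((volatile uint8_t *)base + offset);
+}
+```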
+
+Linux keeps a strict separation between kernel space and user space, and
+hardware interrupts can only be handled in kernel space. However, it was
+easier to use the ported baremetal code in user space, so we also needed to
+handle interrupts in user space. Generally, one writes a driver to do all this,
+but since we only needed the interrupt part to be handled in kernel space, this
+is where the UIO subsystem comes in.
+
+Using the UIO subsystem, it is possible to handle the interrupts in kernel space
+while the rest, like reading or writing the registers, is done in user space.
+There is a [Userspace I/O Platform driver with generic IRQ handling code](https://elixir.bootlin.com/linux/latest/source/drivers/uio/uio_pdrv_genirq.c).
+I will take the example of Xilinx DisplayPort RX here. My colleague who works
+on the FPGA side generates the FPGA firmware along with device trees, which
+have entries for the peripherals configured on the FPGA side. Let's say the DP
+RX peripheral is in the memory region 0x80004000 to 0x80006000. An interrupt
+will also be assigned based on how the FPGA Programmable Logic (PL) connects to
+the Processing System (PS). PL is the FPGA and PS is the ARM64 SoC.
+
+The extended device tree entry looks like this. *reg* specifies the memory that
+can be mmaped in user space and accessed while the *interrupts* will be used by
+the kernel code.
+
+```c
+&SUBBLOCK_DP_BASE_v_dp_rxss1_0 {
+    compatible = "dprxss-uio";
+    interrupt-parent = <&gic>;
+    interrupts = <0 92 4 0 92 4>;
+    reg = <0x0 0x80004000 0x0 0x2000>;
+    status = "okay";
+};
+```
+
+To link the UIO platform driver to this, we add the following to the *bootargs*
+environment variable in u-boot.
+
+```c
+uio_pdrv_genirq.of_id=dprxss-uio
+```
+
+We need to do this because the driver does not hard-code a compatible string
+for the device tree match. See [here](https://elixir.bootlin.com/linux/latest/source/drivers/uio/uio_pdrv_genirq.c#L252).
+Instead, the compatible string is filled in from the *of_id* module parameter.
+
+```c
+static struct of_device_id uio_of_genirq_match[] = {
+	{ /* This is filled with module_parm */ },
+	{ /* Sentinel */ },
+};
+MODULE_DEVICE_TABLE(of, uio_of_genirq_match);
+module_param_string(of_id, uio_of_genirq_match[0].compatible, 128, 0);
+MODULE_PARM_DESC(of_id, "Openfirmware id of the device to be handled by uio");
+```
+
+Now, a combination of poll and read can be used to wait for interrupts in user
+space.
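+
+A minimal sketch of such a wait is shown here. The helper name `uio_wait_irq`
+is made up for illustration, and the sketch assumes `fd` was obtained by
+opening the */dev/uioX* node. A read on a UIO node returns a 4-byte interrupt
+count.
+
+```c
+#include <errno.h>
+#include <poll.h>
+#include <stdint.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+/* Wait until the device behind `fd` signals an interrupt or the timeout
+ * expires. Returns 1 and stores the interrupt count on success, 0 on
+ * timeout and -1 on error. Hypothetical helper, not part of any library. */
+static inline int uio_wait_irq(int fd, int timeout_ms, uint32_t *count)
+{
+	struct pollfd pfd = { .fd = fd, .events = POLLIN };
+	int ready;
+
+	/* Retry if a signal interrupts the wait. */
+	do {
+		ready = poll(&pfd, 1, timeout_ms);
+	} while (ready < 0 && errno == EINTR);
+
+	if (ready <= 0)
+		return ready; /* 0: timed out, -1: poll failed */
+
+	/* The read must ask for exactly 4 bytes. */
+	if (read(fd, count, sizeof(*count)) != (ssize_t)sizeof(*count))
+		return -1;
+
+	return 1;
+}
+```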
+
+So all well and good; however, there is a caveat to know. Once an interrupt
+fires, the kernel handler disables it, and it stays disabled until user space
+re-enables it.
+
+```c
+static irqreturn_t uio_pdrv_genirq_handler(int irq, struct uio_info *dev_info)
+{
+	struct uio_pdrv_genirq_platdata *priv = dev_info->priv;
+
+	/* Just disable the interrupt in the interrupt controller, and
+	 * remember the state so we can allow user space to enable it later.
+	 */
+
+	spin_lock(&priv->lock);
+	if (!__test_and_set_bit(UIO_IRQ_DISABLED, &priv->flags))
+		disable_irq_nosync(irq);
+	spin_unlock(&priv->lock);
+
+	return IRQ_HANDLED;
+}
+```
+
+The interrupt re-enable logic is in the below function.
+
+```c
+static int uio_pdrv_genirq_irqcontrol(struct uio_info *dev_info, s32 irq_on)
+{
+	struct uio_pdrv_genirq_platdata *priv = dev_info->priv;
+	unsigned long flags;
+
+	/* Allow user space to enable and disable the interrupt
+	 * in the interrupt controller, but keep track of the
+	 * state to prevent per-irq depth damage.
+	 *
+	 * Serialize this operation to support multiple tasks and concurrency
+	 * with irq handler on SMP systems.
+	 */
+
+	spin_lock_irqsave(&priv->lock, flags);
+	if (irq_on) {
+		if (__test_and_clear_bit(UIO_IRQ_DISABLED, &priv->flags))
+			enable_irq(dev_info->irq);
+	} else {
+		if (!__test_and_set_bit(UIO_IRQ_DISABLED, &priv->flags))
+			disable_irq_nosync(dev_info->irq);
+	}
+	spin_unlock_irqrestore(&priv->lock, flags);
+
+	return 0;
+}
+```
+
+This is called from [uio_write](https://elixir.bootlin.com/linux/latest/source/drivers/uio/uio.c#L648).
+And if you know how [file operations](https://linux-kernel-labs.github.io/master/labs/device_drivers.html)
+work, this uio_write will be called when a *write* system call is issued with a
+file descriptor received from opening the */dev/uioX* node.
+
+```c
+static const struct file_operations uio_fops = {
+	.owner		= THIS_MODULE,
+	.open		= uio_open,
+	.release	= uio_release,
+	.read		= uio_read,
+	.write		= uio_write,
+	.mmap		= uio_mmap,
+	.poll		= uio_poll,
+	.fasync		= uio_fasync,
+	.llseek		= noop_llseek,
+};
+```
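+
+From user space, the re-enable is a 4-byte write of a non-zero value to the
+same file descriptor. A small sketch follows. The helper name `uio_irq_set` is
+hypothetical, and `fd` is assumed to come from opening */dev/uioX*.
+
+```c
+#include <stdint.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+/* Ask the UIO driver to enable (non-zero) or disable (zero) the interrupt.
+ * The value ends up as the irq_on argument of the irqcontrol callback.
+ * Returns 0 on success and -1 on failure. Hypothetical helper name. */
+static inline int uio_irq_set(int fd, int enable)
+{
+	int32_t irq_on = enable ? 1 : 0;
+	ssize_t written = write(fd, &irq_on, sizeof(irq_on));
+
+	return written == (ssize_t)sizeof(irq_on) ? 0 : -1;
+}
+```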
+
+Now, here comes the problem. For me, after the very first interrupt I was not
+getting any more interrupts. So, basically my write call was not working to
+re-enable the interrupt. Putting print statements in uio_write, I could see a
+call to the write function did not result in invocation of uio_write.
+
+I was perplexed and wasted 4-5 hours trying to figure out what's wrong. I wrote
+a small piece of code outside the project workspace which opened /dev/uioX
+and then did a write. In this case, I could see my prints from uio_write function
+which eventually called uio_pdrv_genirq_irqcontrol to enable the interrupt. So,
+something was wrong with my project.
+
+I use neovim and ctags for code navigation. Trying jump to definition on my
+write call landed me in a write.c file. The initial project setup was done by
+my FPGA engineer colleague, since the Xilinx SDK generates baremetal code
+samples based on the FPGA design. I had not noticed this file before. It
+seemed to be an artifact of the code ported over from baremetal and had a
+write function as below.
+
+```c
+__attribute__((weak)) sint32 write(sint32 fd, char8 *buf, sint32 nbytes)
+```
+
+I was aware of the weak attribute from my work in u-boot, where it is used to
+define board specific functions overriding default ones. The GCC manual defines
+it as: "The weak attribute causes the declaration to be emitted as a weak symbol
+rather than a global. This is primarily useful in defining library functions
+which can be overridden in user code."
+
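+There was no other write function defined in my project, so I expected the
+call to be resolved to glibc's write. That was not what was happening. A likely
+explanation is that the weak definition in write.c already satisfied the
+reference when the project was linked, so the linker never needed glibc's
+version; a weak definition typically only gives way to a strong one in the
+program's own object files. I did not need the implementation in write.c,
+which was actually writing to a UART port, so I removed it and everything
+started working. DP link training finally succeeded.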
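+
+The effect can be reproduced in a few lines. This is only an illustration and
+assumes GCC or Clang on Linux with glibc linked dynamically. The counter
+`shadow_calls` and the wrapper `send_bytes` are made-up names, and the stub
+body is not the one from the Xilinx sources.
+
+```c
+#include <stddef.h>
+#include <sys/types.h>
+
+/* Counts calls that land in the local definition below. */
+static int shadow_calls;
+
+/* A weak, project-local write() in the style of the leftover baremetal stub.
+ * It performs no I/O at all and only records that it was called. */
+__attribute__((weak)) ssize_t write(int fd, const void *buf, size_t nbytes)
+{
+	(void)fd;
+	(void)buf;
+	shadow_calls++;
+	return (ssize_t)nbytes;
+}
+
+/* Callers in this program now reach the stub instead of the system call. */
+static inline ssize_t send_bytes(int fd, const void *buf, size_t nbytes)
+{
+	return write(fd, buf, nbytes);
+}
+```
+
+With the stub linked in, a call like `send_bytes(fd, &value, 4)` never enters
+the kernel, which matches the missing prints from uio_write.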
+
+You can read more about UIO [here](https://www.kernel.org/doc/html/latest/driver-api/uio-howto.html).