Performance overview
IPC and Syscall Performance
Because L4Re is a microkernel-based system, its IPC and syscall performance is of particular interest. IPC is a base-level communication mechanism that allows a limited amount of payload data to be exchanged between two threads. The fastest IPC is between two threads running in the same address space (task) on the same CPU core. A syscall is also an IPC, but it communicates only with the kernel.
The following table provides IPC performance numbers for a single IPC on various popular platforms. To perform the measurement, the L4Re microkernel has been configured in its performance configuration (CONFIG_PERFORMANCE=y), i.e., without assertions.
The source code of the benchmark program can be found here. The images used for the measurements are linked in the table below.
The numbers are measured with hardware performance counters. On Arm, the cycle counter is used. On x86, the fixed-function counters are used.
Platform | Processor | IPC intra (CPU cycles) | IPC inter (CPU cycles) | Syscall | Image |
---|---|---|---|---|---|
amd64 / x86_64 | Intel N100 | 173/622/5431 [4] | 392/1395/5871 [4] | 64/190/1481 [4] | Img [6] |
amd64 / x86_64 | Intel Xeon Platinum 8352S | 511/649/5431 [4] | 934/1128/5871 [4] | 222/160/1481 [4] | |
Raspberry Pi 5 64bit - EL1 | Arm Cortex-A76 | 247 | 384 | 138 | Img [5] |
Raspberry Pi 5 64bit - EL2 | Arm Cortex-A76 | 300 | 401 | 202 | Img [5] |
NXP S32G2 64bit - EL1 | Arm Cortex-A53 | 562 | 691 | 230 | |
NXP S32G2 64bit - EL2 | Arm Cortex-A53 | 661 | 770 | 228 | |
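
To put the Arm rows of the table above into relation, the following minimal sketch computes how much the cycle counts change between the EL1 and EL2 rows of the same platform. The cycle counts are copied from the table; the dictionary layout and the helper names `relative_change` and `el2_overhead` are hypothetical and exist only for this illustration. They are not part of L4Re or of the benchmark program.

```python
# Cycle counts copied from the table above, as (intra, inter, syscall).
# The data layout and function names are illustrative assumptions.
MEASUREMENTS = {
    "Raspberry Pi 5 (Cortex-A76)": {"EL1": (247, 384, 138), "EL2": (300, 401, 202)},
    "NXP S32G2 (Cortex-A53)": {"EL1": (562, 691, 230), "EL2": (661, 770, 228)},
}
LABELS = ("intra", "inter", "syscall")


def relative_change(el1: int, el2: int) -> float:
    """Return the change from the EL1 to the EL2 value in percent of EL1."""
    return (el2 - el1) / el1 * 100.0


def el2_overhead(measurements):
    """Map each platform to its per-column EL1-to-EL2 change in percent."""
    result = {}
    for platform, modes in measurements.items():
        result[platform] = {
            label: round(relative_change(el1, el2), 1)
            for label, el1, el2 in zip(LABELS, modes["EL1"], modes["EL2"])
        }
    return result


if __name__ == "__main__":
    for platform, changes in el2_overhead(MEASUREMENTS).items():
        print(platform, changes)
```

For the Raspberry Pi 5, this gives roughly 21.5 % more cycles for the intra case, 4.4 % for the inter case, and 46.4 % for the syscall. For the NXP S32G2, the intra and inter cases grow by about 17.6 % and 11.4 %, while the syscall numbers are nearly identical (228 versus 230 cycles).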