Performance Overview

IPC, Context-Switch and Syscall Performance

Because L4Re is a microkernel-based system and hypervisor, its IPC and syscall performance, as well as the performance of context switches, are of particular interest. IPC is a basic communication mechanism that allows two threads to exchange a limited amount of payload data. A context switch is the switch from one executing thread to another, which is exactly what happens when a message is sent. The fastest IPC is between two threads running in the same address space (task) on the same CPU core (Intra). Inter denotes IPC between two address spaces. A syscall is also an IPC, but it communicates only with the kernel.

The following table lists the cost of a single IPC on various popular platforms. For the measurements, the L4Re microkernel was configured in its performance configuration (CONFIG_PERFORMANCE=y), i.e., without assertions.

The source code of the benchmark program can be found here. The images used for the measurements are linked in the table below.

The numbers are measured with the hardware performance counters. On Arm, the cycle counter is used. On x86, the fixed-function counters are used.
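The general shape of such a counter-based measurement can be sketched as follows. This is only an illustration of the method (read a counter, run the operation many times, read the counter again, divide), not the actual benchmark program. The function name `cycles_per_operation`, its parameters, and the use of `time.perf_counter_ns` as a stand-in counter are assumptions made for this sketch; a real measurement would read the hardware cycle counter and invoke an IPC or syscall instead.

```python
import time


def cycles_per_operation(operation, read_counter, iterations=10_000, warmup=100):
    """Return the average number of counter ticks spent per call of `operation`.

    `read_counter` is any zero-argument callable returning a monotonically
    increasing integer (hypothetical stand-in for a hardware cycle counter).
    """
    # Run a few untimed iterations first so caches and predictors are warm.
    for _ in range(warmup):
        operation()

    start = read_counter()
    for _ in range(iterations):
        operation()
    end = read_counter()

    # Averaging over many iterations amortizes the cost of reading the counter.
    return (end - start) / iterations


# Stand-in usage: the counter ticks in nanoseconds here, not CPU cycles,
# and the result includes the overhead of the Python loop itself.
average = cycles_per_operation(lambda: None, time.perf_counter_ns)
print(f"{average:.1f} ticks per call")
```

Passing the counter as a parameter keeps the timing loop independent of the platform, mirroring the fact that the counter source differs between Arm and x86.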

Platform	Processor	Intra IPC (CPU cycles)	Inter IPC (CPU cycles)	Syscall	Image
Raspberry Pi 5 64bit - EL1	Arm Cortex-A76	247	384	138	Img [4]
Raspberry Pi 5 64bit - EL2	Arm Cortex-A76	300	401	202	Img [4]
NXP S32G2 64bit - EL1	Arm Cortex-A53	562	691	230
NXP S32G2 64bit - EL2	Arm Cortex-A53	661	770	228
Ampere Altra (32 Cores) 64bit - EL2	Arm Neoverse-N1	298	440	148
amd64 / x86_64	Intel N100	173/622/543 [5]	392/1395/587 [5]	64/190/148 [5]	Img [6]
amd64 / x86_64	Intel Xeon Platinum 8352S	511/649/543 [5]	934/1128/587 [5]	222/160/148 [5]	Img [6]
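To put the table into perspective, the following sketch derives two simple figures from the Arm rows: the extra cycles an Inter IPC costs over an Intra IPC, and the conversion of a cycle count to time. The cycle counts are copied from the table above. The helper names `inter_overhead` and `cycles_to_ns` are hypothetical, and the clock frequency is an assumption, since the table does not state the frequency at which the measurements were taken. The x86 rows are left out because their three-part values depend on footnote [5].

```python
# Cycle counts copied from the table above (Arm platforms only).
RESULTS = {
    "Raspberry Pi 5 - EL1": {"intra": 247, "inter": 384, "syscall": 138},
    "Raspberry Pi 5 - EL2": {"intra": 300, "inter": 401, "syscall": 202},
    "NXP S32G2 - EL1": {"intra": 562, "inter": 691, "syscall": 230},
    "NXP S32G2 - EL2": {"intra": 661, "inter": 770, "syscall": 228},
    "Ampere Altra - EL2": {"intra": 298, "inter": 440, "syscall": 148},
}


def inter_overhead(row):
    """Extra cycles of an Inter IPC compared with an Intra IPC."""
    return row["inter"] - row["intra"]


def cycles_to_ns(cycles, freq_hz):
    """Convert a cycle count to nanoseconds for an assumed clock frequency."""
    return cycles * 1e9 / freq_hz


for name, row in RESULTS.items():
    extra = inter_overhead(row)
    ratio = row["inter"] / row["intra"]
    print(f"{name}: Inter costs {extra} cycles more than Intra ({ratio:.2f}x)")
```

For example, at a hypothetical clock of 1 GHz, `cycles_to_ns(247, 1e9)` gives 247 ns for one Intra IPC on the first row; the real figure scales inversely with the actual core frequency.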
