Thursday, February 21, 2013

Enforce RTOS in hypervisor

This is a brain dump of a simple idea: using the hypervisor to make sure that a real-time operating system is, honestly, real-time.

An operating system can't be real-time if any part of it masks interrupts. It also can't be real-time if an interrupt handler runs for too long, causing other interrupts to be missed. However, everything in the kernel runs at a privileged level, and there is no law enforcement in this wild, wild west.

Unless you run the kernel under a hypervisor that enforces these policies.

When a kernel runs under a hypervisor, privileged instructions such as cli/sti (clear interrupt flag, set interrupt flag) are trapped by the hypervisor. Since the hypervisor handles all interrupts first, it would normally just disable or enable interrupt relay for that particular virtual machine. In our case, to ensure that the kernel never masks interrupts, all the hypervisor has to do is abort the kernel when it traps a cli. Simply ignoring the instruction would likely cause the kernel to corrupt its internal data structures and die sometime later anyway.

Alternatively, the hypervisor could simply log each cli/sti as it happens but still emulate it. An RTOS kernel would pass certification only if no such events are logged.
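
To make the two policies concrete, here is a rough sketch in Python that models only the decision logic, not a real hypervisor. Every name in it (Policy, VirtualCpu, on_trap, passes_certification) is made up for illustration, and it assumes the hypervisor is handed the trapped instruction as a plain string.

```python
from dataclasses import dataclass, field
from enum import Enum


class Policy(Enum):
    ABORT = "abort"  # kill the guest as soon as it tries to mask interrupts
    LOG = "log"      # emulate the instruction, but record that it happened


class GuestAborted(Exception):
    """Raised when the (hypothetical) hypervisor terminates the guest."""


@dataclass
class VirtualCpu:
    interrupts_enabled: bool = True
    log: list = field(default_factory=list)  # trapped cli/sti events, in order


def on_trap(vcpu, instruction, policy):
    """Handle one trapped cli or sti for a virtual CPU."""
    if instruction not in ("cli", "sti"):
        raise ValueError(f"unexpected instruction: {instruction}")
    if policy is Policy.ABORT and instruction == "cli":
        raise GuestAborted("guest tried to mask interrupts")
    if policy is Policy.LOG:
        vcpu.log.append(instruction)
    # Emulate the instruction: sti enables interrupt relay, cli disables it.
    vcpu.interrupts_enabled = (instruction == "sti")


def passes_certification(vcpu):
    """The kernel passes only if no cli/sti event was ever logged."""
    return not vcpu.log
```

Under Policy.LOG the guest keeps running and the log is inspected afterwards; under Policy.ABORT the first cli ends the run, which makes a violation hard to miss during testing.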

The hypervisor could also measure how long an interrupt handler in the kernel takes to execute. If a handler takes too long, the hypervisor would abort the VM. Alternatively, it could detect and log a long-running handler, or log when an interrupt is about to be dropped because of a long-running handler or masked interrupts.
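
The timing check could look something like the sketch below. It is only a model with explicit timestamps: HandlerWatchdog and its methods are hypothetical names, and it assumes the hypervisor can tell when a handler finishes (for example by trapping the return from interrupt), which the idea above does not spell out. It also assumes only one handler runs at a time.

```python
from dataclasses import dataclass, field
from typing import Optional


class GuestAborted(Exception):
    """Raised when the (hypothetical) hypervisor terminates the guest."""


@dataclass
class HandlerWatchdog:
    budget_us: int                  # longest a handler is allowed to run
    abort_on_overrun: bool = False  # abort the VM instead of logging
    active: Optional[tuple] = None  # (irq, start time) of the running handler
    log: list = field(default_factory=list)

    def on_inject(self, irq, now_us):
        """Called when an interrupt arrives for the guest.

        Returns True if it is delivered, False if it has to wait behind
        a handler that is still running (and so risks being dropped).
        """
        if self.active is not None:
            self.log.append(("blocked", irq, self.active[0]))
            return False
        self.active = (irq, now_us)
        return True

    def on_handler_exit(self, now_us):
        """Called when the guest's handler returns; checks the time budget."""
        if self.active is None:
            raise RuntimeError("no handler is running")
        irq, start_us = self.active
        self.active = None
        elapsed = now_us - start_us
        if elapsed > self.budget_us:
            if self.abort_on_overrun:
                raise GuestAborted(f"irq {irq} handler ran {elapsed} us")
            self.log.append(("overrun", irq, elapsed))
        return elapsed
```

A run would feed it the inject and exit events in order, e.g. `w = HandlerWatchdog(budget_us=50)`, then `w.on_inject(3, 0)` and `w.on_handler_exit(80)`, and read `w.log` at the end.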

ps. Some good references: