Wednesday, November 27, 2013

REPL kernel and CPU scheduler

Today I saw a picture of BeOS and its kernel debugger. It occurred to me that a debugger is really the read-eval-print loop (REPL) of a crude language like assembly or C. It makes perfect sense why that's a good idea. BASIC, Scheme, Standard ML, Python, and JavaScript are some examples of languages featuring a REPL. GDB even features a limited REPL that lets you call an existing function while debugging a C/C++ program. A REPL makes it easier to inspect the program state and try out modifications to the program while it's running. It's a powerful rapid prototyping tool. Any serious programming environment must support a REPL.

A kernel debugger must be tied in some way to the CPU scheduler, since it needs to suspend execution of the kernel and inspect its state while suspended. This leads me to consider, in a very abstract sense, how to write a CPU scheduler in general.
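typedef struct task task_t;  // Opaque; the layout is up to the OS.

task_t *next(void);      // OS sched. alg.: next ready task, or NULL if none.
void put(task_t *task);  // OS sched. alg.: requeue or dispose of the task.
void halt(void);         // CPU specific: idle until an interrupt occurs.
void run(task_t *task);  // CPU specific: switch into task until an interrupt.

void scheduler(void) {
  while (1) {
    task_t *task = next();
    if (!task) {
      halt();  // Nothing is ready; wait for an interrupt.
      continue;
    }
    run(task);
    put(task);
  }
}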
The scheduler is really an event-driven loop using some CPU specific subroutines and OS scheduler algorithm functions.

The CPU specific functions are:
  • halt() makes the CPU idle, possibly enters power-saving mode, until an interrupt occurs.
  • run() context switches into the task and returns when an interrupt occurs.
Interrupts are the primary way for the scheduler to regain control. For example, preemptive scheduling can be implemented by setting a timer interrupt that forces run() to return when it fires. It could be a periodic or a one-shot interrupt; we don't really care. For hardware that has only a single one-shot timer, we can easily write a timer multiplexer.
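To make the timer multiplexer idea concrete, here is a minimal sketch. Everything in it is my own assumption for illustration: hw_timer_arm() and hw_deadline stand in for the single hardware one-shot timer, and timer_add(), timer_interrupt(), and count_fired() are hypothetical names. It keeps a small table of software timers and always programs the hardware for the earliest pending deadline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TIMERS 8
#define NO_DEADLINE UINT64_MAX

// Stand-in for the hardware (assumption): the one-shot timer's deadline.
static uint64_t hw_deadline = NO_DEADLINE;
static void hw_timer_arm(uint64_t deadline) { hw_deadline = deadline; }

typedef void (*timer_fn)(void *arg);

typedef struct {
  bool active;
  uint64_t deadline;
  timer_fn fn;
  void *arg;
} soft_timer_t;

static soft_timer_t timers[MAX_TIMERS];

// Program the hardware for the earliest pending deadline, if any.
static void rearm(void) {
  uint64_t earliest = NO_DEADLINE;
  for (int i = 0; i < MAX_TIMERS; i++)
    if (timers[i].active && timers[i].deadline < earliest)
      earliest = timers[i].deadline;
  hw_timer_arm(earliest);
}

// Register a software timer; returns its slot, or -1 if the table is full.
int timer_add(uint64_t deadline, timer_fn fn, void *arg) {
  assert(fn != NULL);
  for (int i = 0; i < MAX_TIMERS; i++) {
    if (!timers[i].active) {
      timers[i] = (soft_timer_t){true, deadline, fn, arg};
      rearm();
      return i;
    }
  }
  return -1;
}

// Called from the timer interrupt: fire everything due at `now`, then re-arm.
void timer_interrupt(uint64_t now) {
  for (int i = 0; i < MAX_TIMERS; i++) {
    if (timers[i].active && timers[i].deadline <= now) {
      timers[i].active = false;  // One-shot: clear before the callback.
      timers[i].fn(timers[i].arg);
    }
  }
  rearm();
}

// Example callback: bumps the counter that `arg` points to.
static void count_fired(void *arg) { (*(int *)arg)++; }
```

A periodic timer falls out of this for free: the callback just calls timer_add() again with the next deadline. A real kernel would also need to mask interrupts (or take a lock) around the table, which this sketch leaves out.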

To make things simple, assume that everything in the kernel and the user processes can be structured as a task. The run() function would have to discern what type of context switch is appropriate for what type of task.

The OS scheduling functions are:
  • next() to fetch the next task in the scheduling queue that is ready to run.
  • put() places the task back into the scheduling queue.
The scheduler itself has no notion of priorities. It is next() that returns the next task to run. But depending on the implementation of the queue, the actual scheduling decision can be made in either next() or put(). Furthermore, we defer the notion of task completion to put(). If put() determines that a task is done, it will dispose of the task in some other way instead of putting it back into the queue.
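As one possible shape for these two functions, here is a sketch of the simplest policy I can think of: a plain FIFO (round-robin) queue. The task_t fields (link, done, id) and the disposed counter are assumptions made up for this example. Here next() does no real decision making, and put() is where completion is detected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct task {
  struct task *link;  // Next task in the queue.
  bool done;          // Set once the task has finished (assumed convention).
  int id;
} task_t;

static task_t *head, *tail;
static int disposed;  // How many finished tasks put() has retired.

// Append a task to the back of the queue.
static void enqueue(task_t *task) {
  assert(task != NULL);
  task->link = NULL;
  if (tail)
    tail->link = task;
  else
    head = task;
  tail = task;
}

// Take the task at the front of the queue, or NULL if nothing is ready.
task_t *next(void) {
  task_t *task = head;
  if (task) {
    head = task->link;
    if (!head) tail = NULL;
    task->link = NULL;
  }
  return task;
}

// Requeue a task that still has work to do; retire one that is done.
void put(task_t *task) {
  if (task->done) {
    disposed++;  // A real kernel would free its resources here.
    return;
  }
  enqueue(task);
}
```

A priority scheduler could keep the same interface and either insert in sorted order in put() or search for the best candidate in next().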

In a multi-processor system, each CPU will run its own scheduler and have its own task queue. The next() function might attempt to steal a task from another CPU's queue if its own queue is empty. A queue takes ownership of a task, but the task can migrate from one queue to another. Work stealing across distributed queues is a good load-balancing strategy, and I think it's the only such strategy that is provably scalable.
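Here is a rough sketch of what a work-stealing next() could look like. It is deliberately single-threaded: there is no locking, so treat it as an illustration of the control flow only. The cpu parameter, NCPU, QCAP, and the ring buffer layout are all assumptions of mine. The owner takes from the front of its queue while a thief takes from the back, which is the usual arrangement to keep the two from fighting over the same end.

```c
#include <assert.h>
#include <stddef.h>

#define NCPU 4
#define QCAP 16

typedef struct { int id; } task_t;

// Fixed-capacity ring buffer, one per CPU.
typedef struct {
  task_t *slot[QCAP];
  size_t head;   // Index of the oldest task.
  size_t count;  // Number of queued tasks.
} queue_t;

static queue_t queues[NCPU];

// Append to the back; returns -1 if the queue is full.
static int push_back(queue_t *q, task_t *t) {
  assert(t != NULL);
  if (q->count == QCAP) return -1;
  q->slot[(q->head + q->count) % QCAP] = t;
  q->count++;
  return 0;
}

// The owner takes the oldest task from the front.
static task_t *pop_front(queue_t *q) {
  if (q->count == 0) return NULL;
  task_t *t = q->slot[q->head];
  q->head = (q->head + 1) % QCAP;
  q->count--;
  return t;
}

// A thief takes the newest task from the back.
static task_t *pop_back(queue_t *q) {
  if (q->count == 0) return NULL;
  q->count--;
  return q->slot[(q->head + q->count) % QCAP];
}

// Returns 0 on success, -1 if this CPU's queue is full.
int put(int cpu, task_t *t) { return push_back(&queues[cpu], t); }

task_t *next(int cpu) {
  task_t *t = pop_front(&queues[cpu]);
  if (t) return t;
  // Own queue is empty: visit the other CPUs in order and try to steal.
  for (int i = 1; i < NCPU; i++) {
    t = pop_back(&queues[(cpu + i) % NCPU]);
    if (t) return t;  // Ownership migrates to this CPU.
  }
  return NULL;  // Nothing anywhere; the scheduler will halt().
}
```

When the stolen task is later handed to put() on the thief's CPU, it lands in the thief's queue, which is how migration happens without any extra bookkeeping.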

Within this scheduler, one possibility is to represent the REPL as a task and schedule it like all other kernel tasks. This is the simplest approach, but the REPL would run concurrently with all other tasks, so it cannot be used to debug the scheduler or examine a frozen state of the CPU. On the other hand, having a REPL that can look at the live state of the kernel is pretty cool. Suspending the kernel is quite easy: simply allow uninterruptible kernel tasks in run(); the REPL can then switch between live and frozen states by toggling the interruptible flag of its own task. You can also suspend the other CPUs by using a boolean variable that forces their next() to return the nil task, which makes each CPU halt. You don't need to disable their interrupts. To resume, just reset the variable.
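The freeze and thaw mechanics might look something like the sketch below. The names repl_freeze(), repl_thaw(), and kernel_suspended are hypothetical, as is the interruptible field, and the ready list is reduced to a bare LIFO stack to keep the example short. I'm using a C11 atomic_bool for the shared flag since other CPUs read it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct task {
  struct task *link;
  bool interruptible;  // If false, run() would keep interrupts masked.
} task_t;

static task_t *ready;                 // Simplified ready list (LIFO).
static atomic_bool kernel_suspended;  // Shared by every CPU's next().

void put(task_t *task) {
  assert(task != NULL);
  task->link = ready;
  ready = task;
}

task_t *next(void) {
  // While suspended, pretend nothing is ready so the scheduler calls halt().
  if (atomic_load(&kernel_suspended)) return NULL;
  task_t *task = ready;
  if (task) ready = task->link;
  return task;
}

// Enter the frozen state: the REPL can no longer be preempted, and the
// other CPUs drain into halt() the next time they call next().
void repl_freeze(task_t *repl) {
  repl->interruptible = false;
  atomic_store(&kernel_suspended, true);
}

// Return to the live state: tasks are handed out again and the REPL is
// scheduled like any other task.
void repl_thaw(task_t *repl) {
  atomic_store(&kernel_suspended, false);
  repl->interruptible = true;
}
```

Note that a task already running on another CPU keeps going until its next interrupt returns control to that CPU's scheduler, so the freeze is not instantaneous.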

Even so, such a REPL won't allow you to step into or over the scheduler or interrupt handlers. A true SoftICE-style debugger is only possible if you hijack the CPU-specific implementation of run(). The good news is that it should be able to coexist with our REPL, which is already pretty useful.
