Sunday, August 30, 2009

Analysis of Darwin kernel exploit

This is a follow-up account of The Story of a Simple and Dangerous Kernel Bug.

On Unix, the fcntl() system call is used to manipulate file descriptors (duplication, close-on-exec, access mode, ownership, and locking). On Mac OS X, it is also used to provide read-ahead and write-back hints, and therefore some of its commands are file-system specific. The file-system specific commands are passed to the device driver as if an ioctl() call had been made; on Unix, ioctl() allows the device driver to export arbitrary method calls (associated with a file descriptor) to userland.

The problem with passing fcntl() arguments to ioctl() in Mac OS X is that ioctl() and fcntl() check their arguments differently. How the argument is interpreted depends on which command code is passed to the function. If the argument is a pointer, it should be validated to make sure it points to valid memory. However, the fcntl() argument checking code does not recognize ioctl commands, so it simply assumes that the arguments are integers. This allows unchecked pointers to be passed to device driver code, which can then overwrite arbitrary memory addresses.

Wednesday, August 26, 2009

Summary of 32-bit Windows 3GB memory limit

Just read this article yesterday about how the memory limit is more a licensing restriction than a technical one, and thought I would summarize it below.
  • x86 hardware reserves the physical address space between 3GB and 4GB for PCI memory-mapped I/O (i.e. talking to devices the same way we access memory), so this part of the RAM is hidden by design of the hardware architecture on the motherboard.
  • Some motherboard hardware supports 36-bit addressing on the bus and remaps the RAM in the 3GB--4GB range to a physical address higher than what a 32-bit physical address can reach.
  • Intel CPUs after the Pentium Pro come with Physical Address Extension (PAE), which allows a 32-bit virtual address to be mapped to a 36-bit physical address.
This boils down to:
  • If the motherboard supports 36-bit addressing and remaps the shadowed RAM in the 3GB--4GB range, then an operating system (such as Linux, Mac OS X, or Solaris) can use that RAM through Physical Address Extension, as well as any RAM beyond 4GB.
However, the licensing restriction on Windows works like this:
  • Prior to Windows XP SP2, there was no licensing restriction, so on capable hardware, 32-bit Windows XP could use as much physical memory as the hardware supported.
  • Due to the licensing restriction, the kernel in and after Windows XP SP2 ignores RAM located beyond the 32-bit physical address space, effectively capping usable RAM at 3GB even if the hardware has 36-bit physical addressing and remaps the 3GB--4GB RAM.
  • This licensing restriction is lifted on 32-bit versions of Windows Server and Windows Datacenter, as well as 64-bit Windows. These versions of Windows can access all available RAM supported by the hardware.
  • Windows Vista SP2 now reports the amount of RAM installed on the system, not the amount of RAM used by the operating system.

Friday, August 14, 2009

Unit testing for C++ "concept"

When you gain the ability to do template generic programming like this:
template<typename T>
class MyClass {
  // use T somewhere
};
It follows naturally that you expect T to provide certain functionality. For example, many people write code like this:
template<typename T>
class MyClass {
  T Add(T x, T y) { return x + y; }
};
This assumes that T has a well-defined operator+. It might be more precise to write this:
template<class T /* implements Addition */>
class MyClass {
  T Add(T x, T y) { return x + y; }
};
where the interface Addition could be specified like this:
template<typename T>
class Addition {
  T operator+(T that);
};
This is to say that the T you use to instantiate MyClass cannot be any type, but only a type that has implemented addition. The concept of addition is not very interesting, so let's use another example.

Let's say you have the concept of a map<K, V>, an associative map from keys of type K to values of type V. For the sake of testing, you can assume that K is both hashable and comparable (using an integer key will do). Then you are given two implementations of the map<K, V> concept: one implements a binary search tree sorted by K, and one implements a hash table hashed by K. Most of the tests you want to run have to do with the concept of map<K, V> itself, e.g. insert, replace, find, and remove; not with the particular way the map is implemented. How do you avoid duplication? Use Type-Parameterized Tests.

However, here is a tricky bit. A concept does not describe how a class is constructed or destroyed, only that the class implements certain methods. What if the class does not have a default constructor? For example, a hash table requires an initial size. Most implementations of a hash table supply a "good enough" default value for the default constructor, but there are cases where this is not applicable.

I figured out that you could parameterize the test using a factory class instead, and the factory would act like a default constructor. This is illustrated here.

However, this is not as straightforward as I would like, because when a template class inherits from a base class which is also a template, the derived class does not have access to any of the protected member variables! This is illustrated here. This code compiles, but if you replace the self->x_ and self->y_ references with simply x_ and y_, gcc fails to compile.

Update: apparently gcc is correct, and the workaround is to simply prefix the member access with this->.

Thursday, August 13, 2009

Unit testing not a good fit for exhaustiveness

So I implemented a layered memory allocator from scratch, somewhat in the fashion of Hoard (mine has a different abstraction). My allocator heavily uses C++ Mixins and the Curiously Recurring Template Pattern (CRTP). Mixin is an overloaded term, but the concept I'm citing here is a class whose superclass is a template parameter. This allows a limited form of aspect-oriented programming where the Mixin is an added aspect of an existing class. CRTP is a recursive use of Mixin, which coerces the inheritance hierarchy of the user class and the Mixin to form one class. The result allows static dispatch of methods that would otherwise have to be declared virtual and dispatched dynamically through a virtual table.

Anyway, long story short: after creating many composable layers of memory allocator, I also created a corresponding unit test for each layer. However, the unit tests only covered the best-case scenario, where Alloc() never returns NULL. That worked well. But I also wanted to test the case where some allocation returns NULL, in order to tell whether each layer handled NULL pointer propagation correctly. This unit test scenario turned out to be quite a nuisance.

To briefly explain why: no allocator always deterministically returns a valid pointer, nor does it always deterministically return NULL, and it's the handling of the non-determinism that matters. Constructing a mock allocator that always returns NULL would still bypass some code paths. To make sure we traverse all code paths, we would have to construct an allocator that returns NULL after n successes for an arbitrary n, and run the unit tests for n = 1, 2, ... and so on. This may never exhaust all possibilities for a complex allocator layer such as the free list allocator. And what about a code path that is not reached unless the allocator succeeds on every 2 attempts, or every 3 attempts, and so on?

A NULL pointer is a special form of the 'a option datatype in ML. The concept is really simple: the 'a option datatype has two constructors, SOME, which carries an item of type 'a, and NONE, which doesn't. When a pointer is not NULL, it is assumed to refer to a valid object, hence corresponds to SOME; when a pointer is NULL, it is assumed that no such object exists (typically due to an exception), hence corresponds to NONE. When you pattern match against the constructors of a datatype, ML also performs exhaustiveness analysis. Forgetting to check for NULL is a kind of non-exhaustive pattern matching. Since exhaustiveness is a local property, using a unit test (which tests a property much wider in scope) is overkill and ineffective. It is very hard to create a unit test that runs all possible execution paths, especially in error handling code, because simulating all possible combinations of error conditions is impractical.

Although a lot of the time it's easier to write unit tests for operations that are in-spec (e.g. to make sure a sorting algorithm permutes the sequence in order), it is best to leave exhaustiveness analysis to a compiler, like ATS, which effectively deals with out-of-spec situations.

Tuesday, August 11, 2009

The coming of virtual appliances

Everyone knows that web 2.0 has revolutionized people's private lives, with online social and entertainment applications like Facebook and YouTube. However, web 2.0 has so far had only limited success with business applications in offices, where the real money is. While people are willing to entrust personal data to web 2.0, businesses can't do the same with their critical business data. Even a well-known brand name like Google is fighting a hard fight selling its Apps for Domain service.

The business concerns are manifold. Privacy: do I get to control who has access to my data? Safety: once data is stored, will it persist and out-survive any form of failure? Liability: is there someone I can sue and recover some damages from when worst comes to worst? With the business model still in its infancy, many web 2.0 companies have terms of service that run against all these concerns. Only after a while do people realize that these are non-issues with certain brand-name companies, simply because of the sheer size of their user base and the lack of negative publicity.
Digression: in fact, some negative publicity became a confidence booster. When lightning struck one of Amazon's data centers for EC2, it brought down a rack of machines. All their customers had to do was restart their instances on other machines that were still working. They lost only the temporary instance data that had not been committed to permanent storage, the same kind of loss as if you unplugged the power to your servers. It took only a few minutes to resume normal operation.
It is different if you are a small web 2.0 startup that wants to focus on business customers. How do you build a customer base when nobody wants to trust their data with you, and how do you build trust without a customer base? Furthermore, the remoteness of the web means it's more difficult to hold you physically accountable, so that is one more reason not to trust you. This is an obvious reason why business-oriented web 2.0 applications have failed to flourish so far.

Also, as a small web 2.0 startup, you want to focus on developing the application, not on the plumbing. You could go back to the pre-Internet model, selling software on a disc and having your customers install it on their own machines. However, some web 2.0 applications are built on top of many dependencies (web server, database, programming language runtime) as well as being intricately integrated with the operating system (disk access policy, user privileges, network firewall rules, performance tuning). It might not be in your customer's best interest to run IT in-house either.

Cloud computing is a solution to this problem. Here comes a brand-name utility provider like Amazon, with a service like EC2 that you are more likely to trust, and a third-party software company with a web 2.0 application you need. They give you a stateless virtual machine image (such as an AMI) and charge you a license fee for using their software. Amazon provides the hardware, storage, and network infrastructure, and bills you for the operational costs. All the costs are fully transparent: the license fee pays for the development and support of the application, and Amazon charges only for what you use.

When it comes to a customer who has built an infrastructure around VMware technology, the third party simply re-releases their web 2.0 application as a VMware virtual appliance. It is easy to adapt to another virtualization technology because the only difference is the tool used for packaging the virtual appliance; the software stack (the operating system and all the dependencies) stays very much the same.

Once the web 2.0 company grows large enough, it could consider selling hardware appliances, or, if willing, provide a hosted solution for its apps when someone asks for it.

To summarize, cloud computing with virtual appliances is going to be a great platform for business-oriented web 2.0 technology, particularly by lowering the barrier to entering a market wary of entrusting critical data to outsiders. And now, web 2.0 can finally enter the business sector.