Sunday, January 11, 2009

Objective-C memory management

Here is a quick summary of Objective-C memory management.

Objects are reference counted, so if you do:
NSClass *objp = [[NSClass alloc] init];
It will need an accompanying:
[objp release];
But many Foundation classes also have factory (class) methods that look like:
NSSomething *objp = [NSSomething somethingWithInitializer:initializer];
For example, the NSArray methods +array and +arrayWithArray:, the NSString methods +string and +stringWithString:, and so on. The Cocoa convention is that an object returned by a factory method still has a reference count of 1, but it has already been added to the "autorelease" pool. You must neither release nor autorelease such an object unless you have retained it yourself.

The "autorelease" pool is simply a list of objects to be released at a later time. The pool is dynamically scoped (like exception frames) and is unrelated to the static (lexical) scope of the program. When the program leaves an autorelease scope, all the objects in the pool of that scope are released. An object can be added to the pool several times, in which case it will be released the same number of times it was added. There doesn't seem to be any way to take an object out of an autorelease pool.
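To make the pool mechanics concrete, here is a toy model in plain C. It is only a sketch of the bookkeeping described above, not how Cocoa is implemented, and every name in it (Object, Pool, object_make, pool_push, pool_pop, and so on) is made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: none of these names are Cocoa APIs. */
typedef struct Object {
    int refcount;
    int *freed_flag;                 /* set to 1 when the object is deallocated */
} Object;

#define POOL_CAPACITY 16

typedef struct Pool {
    struct Pool *parent;             /* enclosing (older) pool frame */
    Object *pending[POOL_CAPACITY];  /* one entry per autorelease call */
    int count;
} Pool;

static Pool *current_pool = NULL;    /* top of the dynamically scoped pool stack */

/* Like [[NSClass alloc] init]: the caller owns one reference. */
Object *object_alloc(int *freed_flag) {
    Object *o = malloc(sizeof *o);
    assert(o != NULL);
    o->refcount = 1;
    o->freed_flag = freed_flag;
    return o;
}

Object *object_retain(Object *o) {
    o->refcount++;
    return o;
}

void object_release(Object *o) {
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        if (o->freed_flag != NULL)
            *o->freed_flag = 1;
        free(o);
    }
}

/* Schedule one release for when the current pool frame is popped. */
Object *object_autorelease(Object *o) {
    assert(current_pool != NULL && current_pool->count < POOL_CAPACITY);
    current_pool->pending[current_pool->count++] = o;
    return o;
}

/* Factory-style constructor: reference count 1, already in the pool. */
Object *object_make(int *freed_flag) {
    return object_autorelease(object_alloc(freed_flag));
}

void pool_push(Pool *p) {
    p->parent = current_pool;
    p->count = 0;
    current_pool = p;
}

/* Leaving the scope: release every entry, once per time it was added. */
void pool_pop(void) {
    Pool *p = current_pool;
    assert(p != NULL);
    for (int i = 0; i < p->count; i++)
        object_release(p->pending[i]);
    current_pool = p->parent;
}
```

In this model, an object obtained from object_make() is freed by pool_pop() unless the caller called object_retain() on it first, in which case it survives with a count of 1 and must be released explicitly later.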

Cocoa's event loop (in the Application Kit) always creates an autorelease pool frame prior to handling an event. The frame is popped after the event handler returns. This means your program typically creates some garbage that is cleaned up all at once after an event has been handled. Objects that outlive the scope of an event must be explicitly retained.

Apparently, Leopard also supports garbage collection with the -fobjc-gc GCC flag.

Thursday, January 8, 2009

Making NULL-pointer reference legal

Normally, dereferencing a NULL pointer results in an access violation and is illegal. However, it is possible to make a NULL pointer dereference legal on several operating systems. There are two ways to do that: (a) using mmap(2) with the MAP_FIXED flag (works on Linux and Mac OS X), and (b) using shmat(2) with the SHM_RND flag (works on Linux). Consider the following program:
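#include <stdio.h>
#include <sys/mman.h>

int main()
{
  /* Assumed arguments: one anonymous page at address 0, pinned by MAP_FIXED. */
  void* p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                 MAP_FIXED | MAP_PRIVATE | MAP_ANON, -1, 0);

  /* mmap returning address 0 means the page at NULL is now mapped. */
  if (p == NULL)
    printf("*(NULL) = %x.\n", *((int*) NULL));

  return 0;
}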
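The program, when run on Linux and Mac OS X, prints *(NULL) = 0, which is the result of the program dereferencing a NULL pointer. This doesn't work on AIX. On newer Linux kernels, a nonzero vm.mmap_min_addr setting makes the mmap call fail for unprivileged processes, in which case the program prints nothing.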
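For comparison, here is a rough sketch of approach (b). It assumes the trick is to ask shmat(2) for the unaligned address 1 together with SHM_RND, so that the address is rounded down to 0. The function name attach_segment_at_null is made up, and the function only reports whether the attach landed at address 0; it does not read through the pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: returns 0 if a shared memory segment was attached at
   address 0, otherwise the errno value describing why it could not be. */
int attach_segment_at_null(void) {
    /* Rounding address 1 down only yields 0 if the alignment exceeds 1. */
    assert(SHMLBA > 1);

    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (id == -1)
        return errno;

    /* Address 1 is not aligned; SHM_RND rounds it down to a multiple of
       SHMLBA, which is address 0. */
    void *p = shmat(id, (void *) 1, SHM_RND);
    int err = (p == (void *) -1) ? errno : 0;

    /* Mark the segment for removal; it is destroyed after the last detach. */
    shmctl(id, IPC_RMID, NULL);

    if (err != 0)
        return err;

    int attached_at_null = (p == NULL);
    shmdt(p);
    return attached_at_null ? 0 : EINVAL;
}
```

Whether this returns 0 depends on the system: where mapping page zero is not permitted, the shmat call fails and the function returns the error code instead.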

Uses of hugetlb

While kernel hackers are busy working out what performance can reasonably be expected of hugetlb, the Linux kernel facility that lets applications use the large memory pages (much larger than 4KB, typically about 4MB) offered by Page Size Extension, using large pages has a few consequences that restrict where they are applicable. A large page improves performance by significantly reducing TLB misses, but its sheer size and alignment requirement are cause for concern. Assuming a large page is 4MB in size, it has to be aligned to 4MB in physical memory and has to be physically contiguous, so an OS that mixes 4KB and 4MB pages might run out of contiguous physical memory because of fragmentation caused by the 4KB pages.

An application could use large pages as a general-purpose heap, but it should avoid fork(). There are currently two ways to allocate large pages: using mmap(2) to map a file opened in hugetlbfs (on Linux only), or shmget(2) passing a special flag (SHM_HUGETLB on Linux, SHM_LGPAGE on AIX, noting that on AIX a large page is 16MB).
  • With shmget(2), shared memory backed by large pages cannot be made copy-on-write, so after fork() the parent and child processes share the same heap, which causes unexpected race conditions. Furthermore, either process could create a new heap space that is private to it, and references to the new heap will cause memory errors in the other process. References to memory newly allocated in another private heap, such as one managed by malloc(3), will likewise be invalid in the other process.
  • While mmap(2) lets you set MAP_PRIVATE to get copy-on-write, the copying is going to be expensive for two reasons. The most obvious one is the cost of copying 4MB of data even if the child only modifies one word of it and then proceeds with an exec(3). With 4KB memory pages, copying a page on demand is much less expensive and is easily amortized across memory accesses over the process's run time. The other reason is that Linux might run out of hugetlb pool space and have to assemble physically contiguous blocks of memory to back the new large page mapping. This is due to the way Page Size Extension works. Furthermore, the OS might also require the swap space backing a large page to be contiguous, so it would need to move blocks around on disk, causing lots of thrashing. This is much more expensive than simple copy-on-write.
A large page could also be used as a read-only memory store for caching large amounts of static data (e.g. for web servers), in which case it doesn't have to be made copy-on-write. It could also be made read-write but shared, provided memory access is synchronized across processes, for example with System V semaphores (semget(2)) or with lock-free or wait-free data structures. This is appropriate for database servers.
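The shmget(2) route mentioned above can be sketched roughly as follows for Linux. The helper name large_page_attach is made up, and the sketch assumes the caller passes a size that is a multiple of the large page size. On systems whose headers do not define SHM_HUGETLB it simply fails with ENOSYS.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE              /* makes SHM_HUGETLB visible in glibc headers */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: returns memory backed by large pages, or NULL with
   errno set (for example when the hugetlb pool has no free pages). */
void *large_page_attach(size_t size) {
    assert(size > 0);
#ifdef SHM_HUGETLB
    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | SHM_HUGETLB | 0600);
    if (id == -1)
        return NULL;

    void *p = shmat(id, NULL, 0);
    int saved_errno = errno;

    /* Mark the segment for removal; it is destroyed after the last detach. */
    shmctl(id, IPC_RMID, NULL);

    if (p == (void *) -1) {
        errno = saved_errno;
        return NULL;
    }
    return p;
#else
    errno = ENOSYS;
    return NULL;
#endif
}
```

Memory obtained this way stays shared across fork(), which is exactly the behaviour the first bullet above warns about, so it fits the shared, explicitly synchronized use better than a general-purpose heap. Release it with shmdt(2) when done.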