Wednesday, September 10, 2008

Minix3 in VMware Player

Here are a few installation tips:
  • Use easyvmx to create a virtual machine configuration.
  • It appears that 256MB of memory should be enough. This can be changed later in VMware player.
  • Network driver should be vlance (which appears as AMD Lance to Minix).
  • The installation is going to be much faster if you download the Minix ISO (IDE-3.1.2a.iso) and fix that image as the second CDROM drive in VMware player. Leave the first drive on auto detect.
  • Minix doesn't support more than 4GB hard drive space. When partitioning, I use 1024 MB for /home and the rest for /usr (which gives about 3000MB).
  • Sound driver: Ensonic ES1371.
  • Disable serial and parallel ports. On my machine I don't have access to them as a normal user.

Saturday, September 6, 2008

Funny how people use scripting languages...

As a reflection to Firefox TraceMonkey benchmark against Google Chrome V8, the trend that JavaScript will become a common run-time language is emerging. JavaScript is dynamically typed, and now fairly optimized in performance. More and more applications are written for the web 2.0 platform, in JavaScript and HTML.

In addition, as we hope that strongly typed systems will be adoped more for debugging aid, serious programmers will probably write applications in something like Google Web Toolkit, which is a compiler from a subset of Java—a typed language—to JavaScript. This further adds to the impression that JavaScript can be used as a common run-time object language for other source languages to compile into.

Before JavaScript, there is C, which is used by source languages like ATS, Haskell (for bootstraping), and SmallTalk (also for bootstraping). Looks like JavaScript might becoming to assume the role of C as an object language on the Web platform.

Thursday, September 4, 2008

Hard to find bug

A while back, I implemented an array based queue. For the purpose of lock-free wait-free operations, the two indices for front and rear are strictly increasing, and the corresponding index into the array is computed by taking the modulo of array length. It is designed that way so when two threads modify the front or rear pointers, they can detect when they're in each other's way (using compare and swap) and help along the other thread to make progress, rather than just spinning and waiting.

The problem is this. In a practical computer, these front and rear pointers are represented in finite integers (most likely 32-bit), and they will wrap around after a long time. If the length does not divide 232, then 232 modulo array length should be non-zero, while the wrap around makes the pointers 0, and 0 modulo length is zero. This causes discontinuity in array access, and it only happens after manipulating the queue for a very long time.

The fix is to force array length to be a power of 2, which divides 232, so 232 modulo length of the array is 0 just like the integer wrap-around makes it so.

Wednesday, September 3, 2008

Interview Questions

Here are some of the programming interview questions that I came up with. These are not questions from someone else. Their similarity to existing questions are purely coincidental.
  • Suppose you are given two numbers represented by a list of their prime factors and the power (e.g. 2520 = 23 × 32 × 5 × 7, and 2772 = 22 × 32 × 7 × 11),

    • Determine what is a good representation of numbers as prime factors; and
    • Write the gcd (greatest common denominator) and lcm (least common multiplier) for two arbitrary numbers.
  • Suppose I give you a function (predicate) that takes an integer and tells you whether the number is smaller than or equal to some integer constant only the function knows about. You can assume the hidden constant is an integer greater than or equal to 0. Write a function that tries to guess the hidden integer (hint: use binary search, but watch out for subtle corner cases).

Language features that I cannot live without

  • Mesh implementations around for code reuse. Also a detailed specification so the compiler would warn when a particular combination may not produce the desired result (i.e. combining an iterator with O(logn) collection lookup does not result in O(n) algorithm).
  • Detailed stack trace when an unhandled exception happens. Requires:

    • Tracing stack frames.
    • Converting code address to symbols.
    • Look-up function type from symbol (so we know how to print the arguments).
    • The facility to convert any run-time value to a string.
  • Unit testing. Also needs to convert any run-time value to a string for printing the test case that fails expectation.
Build system
  • Like ocamlbuild, specify the source code for "main" executable and automatically figures out the dependencies.
  • Should be able to aggregate source files into an artifact (i.e. library) and build around the artifacts.
  • Leverage external Makefile targets when necessary.
  • Need a way to automatically generate documentation from source code. The detail would be written as special comments.

Update: (May 29, 2009)

Higher order functions
  • Lexically scoped closures. The closure doesn't have to be heap allocated. I think most closures I use do not escape.
  • Higher-order function inlining. A lot of time I use higher order function to customize an algorithm, like list iterator. In these cases, inlining makes it as efficient as writing the algorithm by hand everytime I use it.