Wednesday, March 4, 2009

stdio.h discard a line from input

There are a few different ways to phrase this question, all relating to stdio.h input handling:
  • What is a better way to discard input other than using fgetc() to read it character by character?
  • How do I discard a line without using fgets() to store it first? It may not clear the line completely due to limited buffer size.
Through creative use of scanf(), this can be done. The format string of scanf() resembles that of printf(). Rather than passing the value to be printed, you pass a pointer to the memory location where the parsed value should be stored. In addition, you can use an asterisk '*' modifier to tell scanf() to parse the input but discard it, in which case no pointer is passed for that conversion.
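As a small illustration of the '*' modifier, here is a sketch that uses sscanf() on a string rather than scanf() on standard input, so that it is easy to try. The helper name parse_second_field and the "word number" input layout are assumptions made up for this example.

```cpp
#include <cstdio>

// Parses "<word> <int>" from text, discarding the word and keeping the number.
// Returns true when the integer was parsed.
// parse_second_field is a hypothetical helper used only for illustration.
bool parse_second_field(const char *text, int *value) {
    // "%*s" reads a word but stores nothing: it takes no pointer argument
    // and is not counted in the return value, so success here means 1.
    return std::sscanf(text, "%*s %d", value) == 1;
}
```

Calling parse_second_field("width 42", &v) skips the word "width" and stores 42 in v.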

The first try, scanf("%*s\n"), doesn't quite do what we want. The format "%s" skips leading white space and then matches only non-white characters (anything other than space, tab, newline, etc.), so we only discard up to the nearest white space. The "\n" then eats any run of white space, not just a single newline, contrary to intuition.

The Glibc version* of scanf can match a sequence of characters of any length from a set given in brackets, somewhat like POSIX regular expression bracket expressions. Unlike "%s", this conversion neither skips leading white space nor stops at it, so scanf("%*[^\n]") discards all non-newline characters from the input. But that doesn't quite do the trick yet. The input buffer still has a newline character.

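Finally, scanf("%*[^\n]\n") does the trick in the common case. It first discards all characters leading up to a newline, then the "\n" discards the newline together with any white space that follows it. Two caveats: if the line is already empty, "%*[^\n]" fails to match and scanf() returns before reaching the "\n", leaving the newline in the input; and because "\n" matches any run of white space, it also swallows blank lines and leading white space on the next line, and on interactive input it keeps waiting until a non-white character arrives.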
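Here is a minimal sketch of the same idea wrapped in a function that works on any FILE*. It is a variant rather than the exact one-liner: it splits the work into two calls and uses "%*c" for the newline, an assumption on my part, so that an empty line is handled and nothing beyond the newline is consumed. The name discard_line is hypothetical.

```cpp
#include <cstdio>

// Discards the remainder of the current line from stream, including the
// terminating newline. Returns false if end-of-file was reached before a
// newline could be consumed.
// discard_line is a hypothetical helper used only for illustration.
bool discard_line(std::FILE *stream) {
    // Skip everything up to, but not including, the next newline.
    // On an empty line this matches nothing and returns 0, which is fine.
    if (std::fscanf(stream, "%*[^\n]") == EOF) return false;
    // Consume exactly one character, which can only be the newline here.
    return std::fscanf(stream, "%*c") != EOF;
}
```

For standard input, the call would be discard_line(stdin).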

*Update: this appears to be in ANSI X3.159-1989, so the feature should be portable to any conforming implementation. AIX has it. BSD has it. Glibc is the only one I tried.

Monday, March 2, 2009

C++ undefined vtable symbol

I was recently puzzled by a student's problem using GCC to compile some C++ code. He compiled the partial code after every change in order to make sure he didn't screw up. At one point, the code compiled but gave this linker error:
Undefined symbols:
"vtable for PRNG_Uniform", referenced from:
__ZTV12PRNG_Uniform$non_lazy_ptr in ccBZAsss.o
ld: symbol(s) not found
collect2: ld returned 1 exit status
Searching on the web didn't yield any meaningful information about the error message. It took me a while, but I finally realized that it just means "there are some virtual methods you haven't defined; please define them."

In this particular example, the idea is to have an abstract base class PRNG (pseudo-random number generator) from which PRNG_Uniform (uniformly distributed) and other PRNGs with a variety of probability distributions (e.g. exponential, normal, or even deterministic) are derived. The base class declares a virtual method that takes a random sample, and the subclasses implement that method. Another part of the system he implements would refer to the base class PRNG without caring what probability distribution characterizes the random samples.

The C++ compiler creates a virtual table for PRNG and each of its subclasses. The virtual table allows a method call on an object to be re-routed to the appropriate subclass implementation, even if all we have is a pointer to a PRNG object rather than to PRNG_Uniform or another specific subclass of PRNG.

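In order for this virtual table to be available to the linker, every virtual method that is declared but not pure must be defined somewhere by link time. GCC emits a class's virtual table in the object file that defines its first non-inline, non-pure virtual method, so if that definition is missing, the virtual table itself is missing, and that is what the linker complains about.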
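To make this concrete, here is a minimal sketch of such a hierarchy. The student's actual code is not shown in this post, so the method name sample(), the seed parameter, the use of std::mt19937, and the helper draw() are all assumptions for illustration.

```cpp
#include <random>

// Abstract base class. sample() is a hypothetical name for the virtual
// method that takes a random sample.
class PRNG {
public:
    virtual ~PRNG() {}
    virtual double sample() = 0;  // pure virtual: needs no definition
};

class PRNG_Uniform : public PRNG {
public:
    explicit PRNG_Uniform(unsigned seed) : engine_(seed), dist_(0.0, 1.0) {}
    double sample() override;     // declared here, so it must be defined
private:
    std::mt19937 engine_;
    std::uniform_real_distribution<double> dist_;
};

// Leaving out this definition is the kind of omission that, with GCC,
// shows up at link time as an undefined "vtable for PRNG_Uniform".
double PRNG_Uniform::sample() { return dist_(engine_); }

// Code that only knows about the base class; the call is dispatched
// through the virtual table to the subclass implementation.
double draw(PRNG &rng) { return rng.sample(); }
```

With the definition of PRNG_Uniform::sample() in place, constructing a PRNG_Uniform and passing it to draw() links and returns a sample from the uniform distribution.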