Monday, November 23, 2020

pidfd_getfd is harmful!

Everyday, it seems like Linux is gaining more Windows features to allow one process to hijack another process, which enables computer viruses and malware to propagate and steal user data. A recent example is the addition of pidfd_getfd(), a system call that lets one process duplicate a file descriptor belonging to another process. The idea is that process Mallory would inspect procfs and see which file descriptors process Victim has open, then Mallory can duplicate them at will.

Before pidfd_getfd(), regular files may be reopened via procfs. It works even if the file has been unlinked, although the opening is still subject to inode permission. It may not seem like a big deal, but file permission controls access by the owner user and group, so it could not let one process keep secret from another process belonging to the same user, or from root. Fortunately, sockets and pipes cannot be reopened from procfs.

There are some questionably legitimate uses for pidfd_getfd(), but this mechanism is flaky. A commentator on LWN already pointed out the hilarious confusion due to race condition. There is a delay between Mallory enumerating the file descriptors in procfs and when it decides to steal a file descriptor. In the mean time, Victim could have duplicated another file to it (e.g. via dup2()).

Here is the real reason why pidfd_getfd() is harmful: it violates the assumption in POSIX that a file descriptor is a credential testifying that a process has access to some resource. This credential can be passed around using sendmsg() with SCM_RIGHTS, or inherited by a child process through fork(). Both of these are deliberate actions taken by someone who possesses the credential. Now the security model is weakened to allow anyone who can call pidfd_getfd() to gain unauthorized access to these resources. It's unauthorized because it lacks deliberate action by the credential owner.

Before, a process could create a socketpair() and be secure in the knowledge that only the process it shares the file descriptor with can communicate over this secure local channel. Since the socket pair is already connected, it cannot be connected to another socket. But pidfd_getfd() allows Mallory to access the secret channel between Alice and Bob.

Before, Alice could send a file descriptor for a file she owns to a specific process owned by Bob even though Bob normally couldn't open the file. Another process owned by Bob would not be able to reopen that file from procfs. However, pidfd_getfd() bypasses the file permissions check.

An important principle in designing a secure system is to disallow access or an operation by an outsider unless a deliberate action is taken by the owner of the resource. Unfortunately, pidfd_getfd() breaks that principle. It was introduced in Linux kernel 5.6, so versions henceforth are no longer secure by design.

Thursday, November 5, 2020

Electronic Voting

As Americans patiently wait for the result of the 2020 POTUS election, I decided to take a crack at solving the electronic voting problem, with a timing that is perhaps a bit irrelevant now but may be useful again in 4 years. Now, this is by no means a survey of the state of the art on electronic voting, since the research has been done in so many years by so many people. This exercise is just for me to kill time.

A naive electronic voting may have accountability issues, namely we don't know if a vote is legitimate, or that all votes have been counted correctly. In order to know if a vote is counted, we need to know how everyone voted, which seems to sacrifice anonymity, unless we come up with an electronic analogy of a paper ballot. Traditionally, a ballot is issued in person at the voting station, or mailed to the voter in advance for mail-in ballots. There is little protection against ballot forgery, but it would be difficult to forge ballots physically at a large scale.

For electronic voting, a ballot certificate would be issued by the county government to the voter upon voter registration. The issuer is responsible for checking voter identification before issuance. This ballot certificate gives the voter the right to vote, and should be kept secret before a vote is cast (anyone in possession of a ballot certificate could vote with it). The certificate has a unique serial number and a cryptographic signature signed by the issuer. The serial number ensures that the ballot could only be used to cast at most one vote, and the signature verifies that the ballot is issued legally by the county. Otherwise the ballot is not tied to the voter identity. The county should maintain a separate list of voters who already registered, so a voter could only ever receive one ballot certificate. Large scale fraud can be detected if a county somehow issued more ballot certificates than its voting population.

Votes are entered into the voting databases along with the ballot certificate. Since the voting database contains no personal identifiable information, its entirety can be downloaded by the public when the election is concluded, and anyone could verify the presence of their vote in the database as well as independently count all of the votes. US population is only ~328 million, so the database is probably ever going to be only a few gigabytes which can comfortably fit onto a smart phone. Ballot verification could still take an hour or two (e.g. ed25519 signature verification rate is 71K/s), but counting should take only a few minutes if not seconds.

The challenge is distributing the database while ensuring integrity, since anyone receiving a copy with parts of the database missing or tempered with would not know it, and would arrive at a different count.

To ensure that all the votes are recorded without being tempered, they need to appear as part of a Merkle Tree (like how git works). Each transaction begins with a base commit hash and stores the content of the voting record (ballot, vote), both of which would derive a new commit hash. The complete voting record (base commit hash, ballot, vote, new commit hash) is returned to the voter as a receipt, and the new hash is also used as the base commit hash for the next transaction. Once a vote is cast, the ballot certificate is no longer a secret.

This simplest form of the Merkle Tree is just a singly linked list, but this type of transaction is very inefficient, since it requires all the votes in the universe to be sequentialized (i.e. only one person can vote at any given time). A tree construction allows transactions to take place in parallel (i.e. many people can cast vote at the same time). First, vote commits are built sequentially as before, but when a good number of votes are collected, they are finalized into a block by recording the final commit hash of the block. Multiple blocks can be built concurrently and then merged into a superblock by creating a commit hash out of all the block hashes. Superblocks can be similarly merged, until the election is closed, at which point the whole election is given a final commit hash.

This way, a voter can verify that their vote is counted in the election if they could trace their vote's commit hash all the way to the final election hash. If a vote is tempered with, the commit hash would change. If a block is tempered with, its block hash would change. If any merges are tempered with, the final election hash would change. Anyone could download the whole voting database and know that they have the complete, untempered copy. They can write software to independently verify the Merkle Tree without trusting the software that generated it.

The electronic ballot certificate could be printed on paper as a 2D barcode, which allows the bearer to cast the vote as a paper ballot. When voting in person, the voting machine would scan the barcode, ask for voter input, conduct the transaction, then print out the receipt with the commit hash. If this 2D barcode is a QR code, it could be scanned by a mobile app for possibly voting by phone, but the risk is that a malware app could steal the ballot and cast the vote without the voter's knowledge.

I believe this system is secure and practical, but I haven't spent too much time agonizing over it to tell for sure. I purposefully avoided mentioning the word "blockchain" because many people use that buzzword without knowing what it really means. Even though a blockchain is built using a Merkle Tree as well, it limits the rate of transactions globally to 5 per second (really low) through proof of work. The proof of work was invented for Bitcoin to impose artificial scarcity to the cryptocurrency. In the context of election, scarcity may be able to mitigate large scale fraud, but this is not a good way for ensuring election integrity.