Tuesday, March 4, 2025

Review of Memory Safe C/C++ Proposals

The article C++ creator calls for help to defend programming language from 'serious attacks' mentioned a few proposals, which I review below.

Assuming we all know why memory safety is important, let's dive into how each of these proposals delivers on memory safety. Bear in mind that these are working proposals, so some of the details are "magic" or wishful thinking that needs to be fleshed out further.

Profiles (C++)

Profiles by Bjarne Stroustrup is not a single proposal, but a collection of proposals. Each profile states a promise about the safety property it provides and the language features or checks needed to achieve it. Profile enforcement can be specified in code or toggled through compiler flags.

Through a new expect() function, which is like assert(), error handling can be done in one of the following ways: ignore, log and ignore, log and throw an exception, throw an exception, or exit the program.
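
To make that concrete, here is a minimal sketch of this kind of policy dispatch in today's C++. The enum, the default policy, and the function signature are my own inventions for illustration; the actual expect() interface is not pinned down in the Profiles papers.

#include <cstdio>
#include <cstdlib>
#include <stdexcept>

enum class violation_policy { ignore, log, log_and_throw, do_throw, do_exit };

// Behaves like assert(), but the reaction to a violation is selectable.
inline void expect(bool condition, const char* message,
                   violation_policy policy = violation_policy::do_throw) {
    if (condition) return;
    switch (policy) {
    case violation_policy::ignore:
        return;
    case violation_policy::log:
        std::fprintf(stderr, "violation: %s\n", message);
        return;
    case violation_policy::log_and_throw:
        std::fprintf(stderr, "violation: %s\n", message);
        throw std::logic_error(message);
    case violation_policy::do_throw:
        throw std::logic_error(message);
    case violation_policy::do_exit:
        std::fprintf(stderr, "violation: %s\n", message);
        std::exit(EXIT_FAILURE);
    }
}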

Here are the profiles and their summaries:

  • Profile: Type stipulates that "every object is used only in accordance with its definition." It may be a union of several profiles such as Ranges, Invalidation, Algorithms, Casting, RAII and Union. One idea is to eliminate raw pointer handling from collection objects through a new span abstraction.
  • Profile: Arithmetic detects overflow, underflow, and conversion errors.
  • Profile: Concurrency detects race conditions and deadlocks. It acknowledges that this is the "least mature of the suggested profiles" and "has received essentially no work specifically related to profiles."
  • Profile: Ranges detects out of range indexing.
  • Profile: Pointers stipulates that "every pointer points to an object or is the nullptr; every iterator points to an element or the end-of-range; every access through a pointer or iterator is not through the nullptr nor through a pointer to end-of range." It introduces a new language feature not_null, which looks like a type qualifier but it is not clearly specified.
  • Profile: Algorithms stipulates that "no range errors from mis-specified ranges (e.g., pairs of iterators or pointer and size). No dereferences of invalid iterators. No dereference of iterators to one-past-the-end of a range." There is some overlap with the pointers profile regarding the iterator end of range. It introduces a new language feature not_end(c, p) which returns whether iterator p is at the end of the container c.
  • Profile: Initialization stipulates that "every object is explicit initialized."
  • Profile: Casting prevents integer truncation from narrowing casts. It provides a narrow_cast<> with runtime checking (see the sketch after this list).
  • Profile: Invalidation prevents "access through an invalidated pointer or iterator." The compiler is supposed to "ban calls of non-const functions on a container when a pointer to an element of the container has been taken," and suggests that the compiler does it through "serious static analysis involving both type analysis and flow analysis" (i.e. magic).
  • Profile: RAII prevents resource leaks by representing every resource as a scoped object. The constructor and destructor handle the acquisition and release of the resource. This can also be used to do reference counting.
  • Profile: Union says "every field of a union is used only as set" and suggests later providing pattern matching (i.e. algebraic data types) as an alternative.
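
To give a rough idea of what the Ranges and Casting checks amount to, here is a sketch using only facilities that exist today (std::span from C++20 plus a hand-rolled narrowing check). This is not the profiles' actual interface, just the behavior they aim to make automatic.

#include <cstddef>
#include <span>
#include <stdexcept>
#include <vector>

// A checked narrowing conversion in the spirit of the Casting profile's narrow_cast<>.
// (A full implementation would also check for sign changes, as gsl::narrow does.)
template <class To, class From>
To checked_narrow(From value) {
    To result = static_cast<To>(value);
    if (static_cast<From>(result) != value)
        throw std::runtime_error("narrowing conversion lost information");
    return result;
}

int sum_first(std::span<const int> s, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // Manual bounds check; the Ranges profile aims to inject this automatically.
        if (i >= s.size()) throw std::out_of_range("index out of range");
        total += s[i];
    }
    return total;
}

int main() {
    std::vector<int> v{1, 2, 3};
    long long big = 1'000'000'000'000LL;
    return sum_first(v, 3) + checked_narrow<int>(big % 100);  // big % 100 == 0, so this narrows fine
}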

It is clear that Profiles leverage many C++-only features and will not apply to C. However, the strength of this approach is that it recognizes safety as a synthesis of many issues that can be addressed incrementally. It allows legacy code to be updated to satisfy one profile at a time, so there is less upfront cost on the path to memory safety.

Profiles can also become unnecessarily broad. For example, detecting race conditions through flow analysis is another can of worms that requires reasoning about arbitrary interleavings of concurrent accesses. Invalidation is also magic, as most code does not sufficiently express its intent to transfer resource ownership. On the other hand, it is unclear whether all the profiles together will guarantee what people now expect from "safe" Rust.
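
To illustrate why Invalidation needs serious flow analysis, the profile would have to reject perfectly ordinary-looking code like this (plain C++, no annotations), because the non-const push_back() may reallocate the vector's storage and leave the reference dangling:

#include <vector>

int main() {
    std::vector<int> v{1, 2, 3};
    int& first = v.front();  // reference into the container's storage
    v.push_back(4);          // non-const call: may reallocate, invalidating 'first'
    return first;            // potential use of a dangling reference
}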

TrapC

TrapC by Robin Rowe proposes a new dialect of C with the following modifications:

  • malloc() always allocates from a garbage collected heap, and free() is a no-op.
  • Pointers are instrumented with type and size information, and pointer dereferencing is checked at runtime.
  • Access violations can be caught using a new trap statement. Unhandled violations will terminate the program.
  • goto and union are not supported.
  • TrapC can call C functions but not vice versa.
  • Typesafe printf() with a generic "{}" format specifier.
  • No special provision for thread-safety.

TrapC is supposed to be able to compile unmodified legacy C code with additional runtime checks. When access violations cause unwanted program termination, users can write trap handlers as necessary. The white paper suggests a possible goal to "produce executables that are smaller and faster than from C compilers", but it is not clear how that is possible given the additional runtime checking overhead (i.e. magic).

There is some escape analysis, so if a scoped object is returned as a pointer, it becomes heap-allocated, similar to Go. Through this feature, TrapC proclaims that "it is possible to have code that is wrong in C, yet right in TrapC," but it is not clear how much legacy code that used to have undefined behavior in C will actually benefit from the newly defined behavior.
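
A classic instance of the kind of code this might refer to (my own illustration, not an example from the white paper): returning the address of a local variable is undefined behavior in C and C++, but under TrapC's escape analysis the local would presumably be promoted to the garbage collected heap.

// Undefined behavior in C/C++: 'count' dies when the function returns.
// Under TrapC's escape analysis, 'count' would instead live on the GC heap,
// making the returned pointer valid.
int* make_counter(void) {
    int count = 0;
    return &count;
}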

Fil-C

Fil-C by Filip Pizlo proposes a new dialect of C with the following modifications:

  • malloc() always allocates from a garbage collected heap, but free() puts the object on a free list.
  • Pointers are instrumented with type and size information, and pointer dereferencing is checked at runtime.
  • Garbage collection implementation supports concurrent collection without stop-the-world.

I have some doubts about the free list. The proposal does not prevent pointers from being aliased (having multiple pointers to the same object). Freeing an object will nullify one pointer but the other pointer is still valid. The proposal may be a little immature.
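
Here is a sketch of the aliasing concern (my own example, not one from the manifesto): two pointers refer to the same allocation, and free() is called through one of them.

#include <cstdlib>

int main() {
    int* a = static_cast<int*>(std::malloc(sizeof(int)));
    int* b = a;        // alias to the same object
    *a = 42;
    std::free(a);      // per the proposal, the object goes onto a free list
    return *b;         // access through the surviving alias: a use-after-free in plain C/C++
}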

Much of the manifesto extols the virtues of the author's garbage collector design, so it's not clear whether the author is selling a new language or a new garbage collector. A garbage collector is not supposed to be tied to the language: there is no one-size-fits-all garbage collector, so it ought to be possible to use different garbage collection strategies depending on the workload requirements of the application.

Mini-C, or "Compiling C to Safe Rust, Formalized"

Aymeric Fromherz (Inria, France) and Jonathan Protzenko (Microsoft Azure Research, US) explore how to compile C to Rust without resorting to "unsafe" Rust. The resulting code provides the same strong safety guarantees that Rust provides. Some of the considerations include:

  • Static analysis to translate pointer arithmetic in C into slices and slice splitting in Rust.
  • Inference of when a reference needs mutable borrowing, including references from a struct.

They validated the feasibility of their approach on a subset of C that is already formally verified through other means, but it is probably a long way from being able to accept arbitrary legacy C code.

It relies on the fact that some carefully written C code has internal consistencies that are not explicitly expressed, but we can design inference algorithms to figure out what these internal consistencies are. Some of the inference techniques used in this paper could also be applied in reverse to Rust, to reduce the notational overhead of Rust code.

The resulting executable does not need garbage collection, but still relies on runtime bounds checking (in Rust).

Safe C++ (also known as Circle C++)

Sean Baxter started the Circle C++ compiler around 2019 as "a compiler that extends C++17 for new introspection, reflection and compile-time execution" with a flair for metaprogramming. Some of the memory safety extensions implemented by this compiler over the years are now being proposed as a C++ draft.

Some highlights of the proposal:

  • #feature on safety activates the compile-time memory safety checks, like #pragma in existing compilers.
  • A safe specifier that requires the use of the safety language extensions in a function (though it still allows explicit unsafe code), analogous to the noexcept specifier that disallows a function from throwing exceptions.
  • It still relies on runtime bounds checking.
  • Ownership tracking through checked references T^ (mutable) and const T^ (shared). Each object may have either a single mutable reference, or any number of shared references, but not both at once.
  • Named lifetime parameter /a and borrowing reference T^/a.
  • Lifetime binder template parameter typename T+.
  • A mut statement prefix to establish a mutable context that allows conversions from lvalues to mutable borrows and references.
  • A new standard library with safe containers and algorithms. In particular, it replaces begin() and end() iterators with slice iterators annotated by named lifetime parameters.
  • Pattern matching with "choice type" to enforce that optional values have to be checked before access.
  • Type traits for thread safety: T~is_send, T~is_sync.
  • Type traits for allowing unsafe pointer arithmetic: T~as_pointer, T~as_length; not fully explained.
  • Type traits T~string, T~is_trivially_destructible; not fully explained.

The safety semantics are inspired mostly by Rust, which is mentioned in the proposal 85 times.

Safe C++ may very well provide a concrete design for some of the Profiles work by Stroustrup. In contrast to Profiles, Safe C++'s monolithic "all or nothing" approach might make it more difficult to port legacy code, due to the upfront cost of satisfying every memory safety requirement at once. Perhaps choice types, thread safety, and pointer arithmetic could be split into their own Profiles.

Conclusion

There are several ways to compare and contrast these approaches.

  • Whether they expect significant modification to legacy code:
    • Upfront: Safe C++, Mini-C.
    • Incrementally: Profiles C++.
    • No: TrapC, Fil-C.
  • Whether they force the use of garbage collection:
    • Yes: TrapC, Fil-C.
    • No: Profiles C++, Safe C++, Mini-C.
  • Whether they require C++:
    • Yes: Profiles C++, Safe C++.
    • No: TrapC, Fil-C, Mini-C.

In light of the recent Rust-in-Linux debacle, if we were to port Linux kernel code to a memory safe C dialect, we would not be able to use garbage collection, nor would we be able to use C++. This leaves Mini-C as the only viable option. However, the inference algorithm may not be able to handle the complexity of Linux kernel code, so some kind of object borrowing or lifetime annotation will still be needed.

Incremental safety checking is a useful feature, as it alleviates the upfront cost of fixing legacy code for every memory safety issue at once.

It's worth noting that all of the proposals above require runtime bounds checking. Without some kind of size annotation throughout the code, it would be hard for static analysis to infer whether bounds checking can be safely omitted. This precise problem is solved by ATS through the use of dependent types. Perhaps it could be useful to design a dependent type system for a dialect of C for those who aren't used to the ML-style programming of ATS. We can take some inspiration from Mini-C to reduce the notational overhead.

Saturday, February 8, 2025

Polling vs. Interrupt

Recently, news has been reverberating across the interwebs that a Tiny Linux kernel tweak could cut datacenter power use by 30%. The claim is based on this code commit, detailed by the paper Kernel vs. User-Level Networking: Don't Throw Out the Stack with the Interrupts. The main idea is to mix polling and interrupt-driven I/O to get the best of both worlds.

Polling is just busy looping to wait for something to happen. It only takes a few CPU cycles each time we check, so the turnaround is very efficient. But if we keep looping and nothing happens, it is wasteful work.

Interrupt-driven I/O programs the CPU with a handler for when an event happens, then lets the CPU go do something else or go to sleep when there is nothing else to do. The CPU has to save the context of the current program and switch contexts between programs, so the overhead is greater. But it is more efficient if there is a long wait between events.

When events take some time to arrive but tend to arrive in bunches (e.g. a Poisson process with exponentially distributed inter-arrival times), it makes sense to wait for the next event using an interrupt, then process subsequent events using polling. We can further limit the time spent polling to the opportunity cost of taking an interrupt instead, and it's a win-win: if an event arrives sooner, we save the cost of the interrupt minus the cost of polling, and we are never worse off.
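
Here is a minimal user-space sketch of the spin-then-block idea using standard C++ threading primitives. The spin budget and the names are my own; the actual kernel patch applies this at the NAPI/epoll level rather than with condition variables.

#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>

struct event_channel {
    std::atomic<bool> ready{false};
    std::mutex m;
    std::condition_variable cv;

    void wait() {
        using clock = std::chrono::steady_clock;
        // Poll for roughly the opportunity cost of a block/wakeup (assumed figure).
        const auto deadline = clock::now() + std::chrono::microseconds(50);
        while (clock::now() < deadline) {
            if (ready.exchange(false)) return;  // event arrived while polling: cheap path
        }
        // Nothing arrived within the budget: fall back to an interrupt-style blocking wait.
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] { return ready.exchange(false); });
    }

    void signal() {
        { std::lock_guard<std::mutex> lock(m); ready.store(true); }
        cv.notify_one();
    }
};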

This idea is in fact not new. The technique is often used in task schedulers. I remember seeing its use in an old MIT Cilk version circa 1998. I have seen it in some mutex implementations. I have also used the same technique in my Ph.D. work circa 2014 (in the artifact Liu_bu_0017E_429/libikai++-dissertation.tbz2, look for parallel/parallel.cc and the deep_yield() function in particular). The technique just seems so trivial that I have never seen anyone bother to mention it.

So will this technique cut datacenter power use by 30%? The paper measured "queries per second" improvements in a memcached microbenchmark, but it is naive to assume that a 30% increase in performance automatically translates to a 30% reduction in power use. That assumes all the datacenter does is receive network packets, with no additional work such as cryptography, marshaling and unmarshaling of the wire protocol, local memory and data storage I/O, calling out to external services, and computing a response. The 30% figure is an upper bound in the most ideal scenario, and that is what caught the attention of the Internet.

It is a welcome optimization nonetheless, and the authors have done the hard work to validate their claims with benchmarks and a thorough explanation of how it works, so they deserve credit for the execution. Perhaps someone could go on to audit whether other OS subsystems (e.g. block devices, the CPU scheduler) can also benefit from this treatment, for Linux and other OSes.

Friday, January 3, 2025

Why RAID 1+0 is Better Than RAID 0+1?

In this article, we discuss why RAID 1+0 (stripes of mirrors) is better than RAID 0+1 (mirror of stripes) for those who are building a storage array.

Image credit: Network Storage Server (angle view with case open) from Wikimedia Commons

What is RAID, and why?

RAID (redundant array of independent disks) describes a system of arrangements to combine a bunch of disks into a single storage volume. The arrangements are codified into RAID levels. Here is a summary:

  • RAID 0 (striped) essentially concatenates two or more disks. The stripe refers to the fact that blocks from each of the disks are interleaved. If any disk fails, the whole array fails and suffers data loss.
  • RAID 1 (mirrored) puts the same data across all disks. It does not increase storage capacity or write performance, but reads can be distributed to each of the disks, thereby increasing read performance. It also allows continued operation until the last disk fails, which improves reliability against data loss (assuming that disk failures happen independently, see discussion below).
  • RAID 2, 3, 4, 5 (parity) employ various parity schemes to allow the array to still operate without data loss up to one disk failure.
  • RAID 6 (parity) uses two parity disks to allow up to two disk failures without data loss.
  • RAID-Z (parity) is a parity scheme implemented by ZFS using dynamic stripe width. RAID-Z1 allows one disk to fail without data loss, RAID-Z2 two disks, and RAID-Z3 three disks.

Disk failures may be correlated by many factors (operational temperature, vibration, wear, age, and manufacturing defects shared by the same batch), so they are not fully independent. However, the failures can be somewhat randomized if we mix disks from different vendors and models. Each year, Backblaze publishes hard drive failure rates aggregated by vendor and model (e.g. 2023), which is an invaluable resource for deciding which vendor and model to use.

Why use nested RAID?

When disks fail in a RAID within its tolerated number of failures, a failed disk can be replaced with a blank disk, and the data is resilvered onto the new disk. The lower bound on resilver time is the capacity of the disk divided by how fast we can write to it, so a larger disk takes longer to resilver. If disk failures are not fully independent, there is a greater risk of data loss if another disk fails during resilvering.

By way of example, a 24TB disk at a maximum write speed of 280 MB/s will take about a day to resilver (rule of thumb: each 1 TB takes 1 hour). This makes a large disk somewhat unappealing due to the long resilver time. More realistically, we may want to use smaller disks so they can complete resilvering more quickly.
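
Spelling out the arithmetic behind that rule of thumb: \( 24 \times 10^{12} \textit{ bytes} \div (280 \times 10^6 \textit{ bytes/s}) \approx 85{,}700 \textit{ s} \approx 23.8 \textit{ hours} \), i.e. roughly 1 TB per hour at this write speed.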

Creating a large array from smaller-capacity disks necessitates nesting the RAID arrangements. Here are some ways we can arrange the nesting:

  • RAID 0+1 (a mirror of stripes) takes stripes of disks (RAID 0) and mirrors the stripes (RAID 1). When a disk fails, it takes down the entire stripe, and the whole stripe needs to be resilvered.
  • RAID 1+0 (a stripe of mirrors) takes a bunch of mirrored disks (RAID 1) and creates a stripe (RAID 0) out of the mirrors. When a disk fails, it impacts just the mirror. When the disk is replaced, only the mirror needs to be resilvered.
  • RAID 5+0, RAID 6+0, etc. create inner parity RAIDs combined into an outer stripe (RAID 0).

Not all nestings are created equal!

We can see qualitatively, from the failure impact point of view, that RAID 1+0 has a smaller impact scope and is therefore better than RAID 0+1. But we can also analyze quantitatively the probability that a second failed disk causes total data loss.

  • In RAID 0+1, suppose we have N total disks arranged as a 2-way mirror of \( N / 2 \)-disk stripes. The first failure already brought down one entire stripe (one side of the mirror). The second failed disk brings down the whole RAID if it happens in the other stripe, which is approximately a 50% chance. Note that we cannot simply swap striped disks between the mirrors to avert the failure, since each disk's membership belongs to a specific RAID. You might be able to manually hack around that by hex-editing disk metadata, but that is a risky undertaking for the truly desperate.
  • In RAID 1+0, suppose we have N total disks arranged into \( N / 2 \) stripes of 2-way mirrors. The first failure already brought down one disk of one of the mirrors. The second failed disk brings down the whole RAID only if it is the other disk of the same mirror, which is approximately a \( 1 / N \) chance. The number of failed disks we can tolerate without data loss is a variation of the Birthday Paradox.

Deep dive into aggregate probability of failure

Another way to analyze failure impact quantitatively is to see what happens if we build a RAID 0+1 or RAID 1+0 array using disks of the same average single disk failure rate. We compute the aggregate probability of failure that would suffer data loss, assuming that failures are independent.

  • Suppose the probability of average single disk failure is p.
  • For RAID 0 with N striped disks, the probability that any of the N disks fails is \( P_0^N(p) = 1 - (1 - p)^N \), i.e. the complement of the event that none of the N disks fail.
  • For RAID 1 with N mirrored disks, the probability that all N disks fail at the same time is \( P_1^N(p) = p^N \).
  • For RAID 0+1, with k-way mirrors of \( N / k \) stripes, the failure probability is:
    \[
      \begin{aligned}
        P_{0+1}^{k,N/k}(p) & = P_1^{k}(P_0^{N/k}(p)) \\
          & = P_1^k( 1 - (1 - p)^{N/k} ) \\
          & = ( 1 - (1 - p)^{N/k} )^k
      \end{aligned}
    \]
  • For RAID 1+0, with \( N / k \) stripes of k-way mirrors, the failure probability is:
    \[
      \begin{aligned}
        P_{1+0}^{N/k,k}(p) & = P_0^{N/k}(P_1^k(p)) \\
          & = P_0^{N/k}(p^k) \\
          & = 1 - (1 - p^k)^{N/k}
      \end{aligned}
    \]
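
As a quick sanity check of these formulas (my own arithmetic): with \( N = 8 \), \( k = 2 \), and \( p = 0.01 \),

\[
  \begin{aligned}
    P_{0+1}^{2,4}(0.01) & = (1 - 0.99^4)^2 \approx 1.6 \times 10^{-3} \\
    P_{1+0}^{4,2}(0.01) & = 1 - (1 - 0.01^2)^4 \approx 4.0 \times 10^{-4}
  \end{aligned}
\]

so both arrangements beat the single-disk failure rate, but RAID 1+0 is roughly four times less likely to lose data than RAID 0+1, consistent with the plots below.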

We can plot \( P_{0+1}(p) \) and \( P_{1+0}(p) \) and compare them against p, the probability of single disk failure representing the AFR (annualized failure rate). Here are some examples of p from Backblaze Drive Stats for 2023 picking the models with the largest drive count from each vendor:

MFG      Model            Drive Size  Drive Count  AFR
HGST     HUH721212ALE604  12TB        13,144       0.95%
Seagate  ST16000NM001G    16TB        27,433       0.70%
Toshiba  MG07ACA14TA      14TB        37,913       1.12%
WDC      WUH721816ALE6L4  16TB        21,607       0.30%

In most cases, we are looking at p < 0.01. Some models have an AFR > 10%, but it is hard to tell how accurate that number is due to the small drive count for those models. Models with a small average age can also be biased towards high early mortality due to the Bathtub Curve.

We generate the plots in gnuplot using the following commands:

gnuplot> N = 4
gnuplot> k = 2
gnuplot> set title sprintf("N = %g, k = %g", N, k)
gnuplot> plot [0:0.02] x title "p", (1 - (1-x)**(N/k))**k title "RAID 0+1", 1 - (1 - x**k) ** (N/k) title "RAID 1+0"

Number of disks N = 4 with mirror size k = 2 is the smallest possible nested RAID 0+1 or RAID 1+0. At p < 0.01, both RAID 0+1 and RAID 1+0 offer significant improvement over p.

At N = 8 while keeping the same mirror size k=2, we see that both RAID 0+1 and RAID 1+0 still offer improvements over p. However, RAID 1+0 failure rate doubles, and RAID 0+1 more than triples.


At N = 16, the trend of multiplying failure rate continues. Note that RAID 0+1 can now be less reliable than a single disk, while RAID 1+0 still offers 6x improvement over p.


To get the failure rate under control, we need to increase the mirror size to k = 3. Even so, RAID 1+0 failure rate (very close to 0) is still orders of magnitude lower than RAID 0+1.


So from the quantitative point of view, RAID 1+0 is much less likely to suffer total data loss than RAID 0+1.

Conclusion

As a rule of thumb, when nesting RAID levels, we want the striping to happen at the outermost layer because striping is the arrangement that accumulates failure rates. When we mirror for the inner layer, we reduce the failure rates by orders of magnitude, so the striping accumulates failure rates much slower.

This conclusion also applies to storage pool aware filesystems like ZFS. RAID-Z does not allow adding disks to an existing set. The best practice is to always add disks as mirrored pairs. You can also add a new disk to mirror an existing disk in a pool. Stripes of mirrors offer the most flexibility to expand the storage pool in the future.

This is why RAID 1+0 is better.

Sunday, November 3, 2024

irony-install-server with MacPorts

Update: I was only able to get irony to work up to the 2019 version; irony has been unmaintained since 2023 and has since suffered significant bit-rot, and I could not get the 2023 version to work. There seem to be other alternatives (e.g. lsp-mode). However, I'm keeping this page for the record.

The irony server provides symbol completion for irony-mode on Emacs. Under the hood, it uses libclang to parse C and C++ source. On Mac OS, the Xcode command line tools come with clang, but they do not provide the necessary header files. Here are the specific instructions for MacPorts.

$ cd "$(mktemp -d)"  # any empty temporary directory will do.
$ clang -v
Apple clang version 17.0.0 (clang-1700.0.13.3)
Target: x86_64-apple-darwin24.4.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
$ sudo port install cmake clang-17
--->  Computing dependencies for cmake
...
--->  Cleaning cmake
--->  Computing dependencies for clang-17
...
--->  Cleaning clang-17
--->  Scanning binaries for linking errors
--->  No broken files found.                             
--->  No broken ports found.
$ cmake \
  -DLIBCLANG_LIBRARY=/opt/local/libexec/llvm-17/lib/libclang.dylib \
  -DLIBCLANG_INCLUDE_DIR=/opt/local/libexec/llvm-17/include \
  -DCMAKE_MACOSX_RPATH=1 \
  -DCMAKE_INSTALL_RPATH=/opt/local/libexec/llvm-17/lib \
  -DCMAKE_INSTALL_PREFIX=$HOME/.emacs.d/irony/ \
  $HOME/.emacs.d/elpa/irony-20231018.1915/server &&
cmake --build . --use-stderr --config Release --target install

Some notes:

  • MacPorts llvm package provides the libclang.dylib, but clang provides the header files for it (see macports #69392).
  • LIBCLANG_LIBRARY has to point to the dylib (or the .so on Linux), not the lib/ directory.
  • irony-20231018.1915 has a bug where irony.el version string is now ";; Package-Version: 20231018.1915", but the server/src/CMakeLists.txt still uses ";; Version:" to detect the version. Update the string accordingly.
  • Requires cl-libify to patch cl to cl-lib. Install it in emacs using M-x package-install cl-libify.

Friday, March 29, 2024

SSH server compromised by xz/liblzma 5.6.0 and 5.6.1

A backdoor compromising the SSH server, introduced in xz/liblzma 5.6.0 and 5.6.1, was reported today to oss-security by Andres Freund. According to the analysis, when the sshd binary is initialized by the dynamic linker at startup, the initialization code in liblzma installs a hook into the dynamic linker that modifies subsequent dynamic library symbol tables (before they are made read-only), replacing SSH RSA encryption functions with malicious code.

Image credit: Wikimedia Commons

The xz repository provides a widely used data compression command line program "xz" as well as a library "liblzma" that allows the compression algorithm to be used in other programs. SSH is a secure remote login server. Although SSH does not use liblzma directly, many distributions such as Debian and Red Hat patch it to integrate with systemd notifications, which use liblzma.

The malicious code in liblzma was introduced as obfuscated binary test data in a git commit and patched into the compiled binary using an obfuscated M4 macro as part of the build system. The malicious git commit was introduced by JiaT75, who has been contributing xz commits for about two years while taking advantage of the mental health issues of the original author of xz, who maintained it as an unpaid hobby project. Most of Jia's commits are non-technical fixes such as translations or documentation. Jia reportedly urged the Red Hat and Debian maintainers of the xz package to push the new version to production, which suggests that the attack was premeditated. The attack was only discovered because Andres Freund noticed that his SSH logins got slower and decided to investigate.

It is nearly impossible to manually audit supply chain attacks like this, but there is one way to mitigate this attack vector: security-sensitive binaries such as sshd and setuid programs should be statically linked with the static-PIE linking option. Static linking eliminates the dynamic linking attack vector, while PIE enables address-space layout randomization (ASLR) to make it much harder for malicious actors to patch code at runtime. Allegedly, OpenBSD already compiles system binaries this way.

Some may have concerns about static linking, but they can read the refutation by Gavin D. Howard.

Advisory About Go

Static linking alone does not completely guard against runtime code patching; we also need address-space layout randomization (ASLR) for both code and data. That is because the data sections also contain function pointers that could alter the code path. Without ASLR for data, any function pointer in a heap-allocated object can be compromised by a supply chain attack.

Go does not support heap data ASLR (golang/go#27583). This means that malicious code using the unsafe package could traverse the heap and overwrite interface function pointers to change the behavior of existing code. This is despite the fact that the Go toolchain always compiles and statically links the whole program, and has -buildmode=pie for code ASLR. Unfortunately, the Go crypto/ssh, crypto/tls, and net/http packages all make ample use of interfaces, and every one of these can be an attack vector.

Saturday, March 16, 2024

Deep Dive into MQA-CD Encoding

A few weeks ago, I saw this video by Techmoan introducing the MQA-CD. MQA-CD is an audio CD that can be played back in a regular CD player, which is limited to 16-bit samples at 44.1 kHz. However, when played back through an MQA decoder, it promises better sound quality at 24-bit at 192 kHz.

Before we dig into the MQA marketing material, we need to understand that MQA is an encoding scheme that can exist outside of a CD, e.g. audio delivered over the radio or the Internet. Some of the non-CD transports are assumed to carry 24-bit at 48 kHz or higher. However, MQA-CD transport is limited to 16-bit at 44.1 kHz by the CD as its physical medium.

At first glance, MQA violates the Nyquist–Shannon sampling theorem, which requires at least 2B samples per second to uniquely represent a signal containing frequencies up to B. However, we can give it some leeway by allowing for lossy encoding, even though some MQA marketing material claims that the encoding is lossless.

In a lossy scheme, we can steal some of the least significant bits from each sample to pass through a data stream that employs psychoacoustic coding, like MP3. The least significant bits sound like the noise floor when listened to without the decoder, and the psychoacoustic coding allows us to pack more detail into that noise more economically: basically, the data stream contains instructions for synthesizing only the sounds humans can hear, so we use less data than if we had to encode the full Nyquist-Shannon spectrum. Furthermore, the data stream only needs to contain the delta, which is the sound not already present in the non-stolen bits.

The question about MQA-CD is: how many bits is it stealing?

Music Origami, according to MQA

The MQA website links to Bob Talks, a blog by the MQA inventor, which discusses the CD encoding in some technical detail, but it is a little confusing:

If the original source is 44.1kHz/24b or if the sample rate is 88.2, 176.4, 352,8 kHz, or DSD, then a standard MQA file will be 44.1 kHz/24b. The file contains the information for decoding, ‘unfolding’, and rendering.

This 24b MQA file is structured so that, if in distribution it encounters a ’16-bit bottle-neck’ (e.g. in a wireless or automotive application), then the information in the top 16 bits is arranged to maximise the downstream sound quality and still permits unfolding and rendering. See [2]

[2] MQA-CD: Origami and the Last Mile 

So reference [2] should contain some information about how the 24-bit is truncated to 16-bit. Here are some mentions:

The Green signal is completely removed by MQA decoders; but it is there so that we can hear more of the music when playback is limited to a 16-bit stream.

Sometimes we might want to listen to MQA music on equipment that doesn’t support 24 bits – maybe only 16? Rather than throw away all the buried information, MQA carries a small data channel (shown in Green) which can contain the ‘B’ estimates, enabling significantly improved playback quality on, e.g. a CD, over ‘Airplay’, in-car, to certain WiFi speakers and similar scenarios.

But it is also confusing because it shows the “Green signal” at -120 dB. We know that CD dynamic range is 96 dB, so it could not represent a -120 dB noise floor. Samples at 24 bits have a dynamic range of 144 dB. However, the signal charts on the page show a floor of -168 dB and put some information below -144 dB, which would require 28 bits.

As a side note, CD dynamic range of 96 dB is determined by the formula in terms of the 16-bit sample depth: \( 20 \times \log_{10}{2^{16}} \approx 96 \). As a rule of thumb, each bit in the sample represents about 6 dB in dynamic range.
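
Applying the same formula to the other figures mentioned above: \( 20 \times \log_{10}{2^{24}} \approx 144 \) dB for 24-bit samples, and reaching a -168 dB floor would indeed require about \( 168 / 6 = 28 \) bits.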

Another page Deeper Look: MQA 16b and Provenance in the Last Mile also states that:

If we look at the block diagram above, we can see there are three components to the MQA data, broadly described as: i) top 16 bits, ii) MQA signalling and iii) bottom 8 bits

The block diagram clearly shows that the encoding results in a 24-bit master file, but it still does not explain how that is reduced for MQA-CD, which is bottlenecked to 16-bit samples.

Is Bit Stealing Plausible?

Since MQA still does not explain how the 24-bit master is reduced to 16-bit transport depth on a CD, we are left to speculate about the bit stealing idea earlier.

If we allow stealing 4 bits per sample, then we get a data rate of \( 2 \textit{ channels} \times 4 \textit{ bits per sample} \times 44100 \textit{ Hz} \approx 353 \textit{ kbps} \). This is pretty generous for high quality AAC, which is typically 256 kbps. The dynamic range before decoding is reduced from 96 dB to 72 dB, which is still comparable to a very high quality magnetic tape.

So I would say it is plausible, but it is inconclusive from the MQA marketing material if this is how they did it.

Furthermore, I don’t see the point of MQA’s “Music Origami” that folds 24-bit 192 kHz into 24-bit 48 kHz. If the transport is already capable of lossless 24-bit data, it must be a digital transport that is not a CD, which means there is no requirement to maintain backwards compatibility with a Red Book CD player. We can just use the whole stream to transport encoded audio, e.g. AAC or FLAC. Even some later CD players from the 2000s can play MP3 from a data CD or a USB drive. That was all possible before MQA launched in 2014.

Which is why Techmoan says that even if you believe MQA delivers higher quality audio, it is a format that came a little too late.

Carrot and Stick Security Design

Carrot and stick security design is the idea of having the frontend and backend work together to enforce security policies in software. The frontend interacts with the user and steers them towards compliance, while the backend enforces the security rules. Although we don’t necessarily use carrot and stick to mean reward and punishment, the carrot is a “soft nudge” and the stick is a “hard boundary.” If the user bypasses the frontend and tries to interact with the backend directly, they will be met with a hard error message.


Image credit: Wikimedia Commons

(“Good cop, bad cop” is a similar strategy, although the cop analogy may be controversial.)

An example is a photo gallery that allows visitors to browse but only signed-in users to download images. The frontend may present a “download” button but will ask the user to either login or create an account. The backend checks that the login credentials are present before allowing the image to be downloaded.

If the security check is only done in the frontend, then the user could simply bypass login by forging a URL request directly to download the images. If the security check is only done in the backend, then an innocent user who did not know they needed to sign up or log in first may be confronted with an unfriendly error message.

I’m intentionally using the terms “frontend” and “backend” loosely. In practice, the designations may differ depending on the application:

  • For a user-facing website, frontend is the client side Javascript, and backend is the HTTP server.
  • For a mobile app, frontend is the app, and backend is some API used by the app.
  • For an API, frontend is the HTTP server middleware, and backend is the internal data storage.

Even at the API level, the API design should try to encourage well-defined use cases (the carrot), and let the protocol layer check for malformed requests (the stick).

What this means is that a complete software stack that spans the gamut of client side Javascript or app, an API middleware, and backend storage should implement security enforcement at all layers.

Friday, March 1, 2024

Memory Safety State of the Union 2024, Rationale Explained

There has been renewed interest in programming languages after The White House recently published a recommendation suggesting the transition to a memory safe language as a national security objective. Although I am not an author of the report, I want to explain the rationale that someone might use to consider whether their infrastructure meets the memory safety recommendations. This is more of an executive-level overview than a technical guide to programmers.

Image credit: Wikimedia Commons, Whitehouse North.

On February 26, 2024, the White House released Statements of Support for Software Measurability and Memory Safety calling attention to a technical report from the Office of the National Cyber Director titled “Back to the Building Blocks: A Path Towards Secure and Measurable Software” (PDF link). The whole framework works like this: we ultimately want to be able to measure how good the software is (e.g. by giving it a score), and it relies on memory safety as a signal. The tech report also references another report published by Cybersecurity and Infrastructure Security Agency (CISA) titled The Case for Memory Safe Roadmaps which contains a list of memory safe language recommendations.

To supplement their list, I will be using the TIOBE index top 20 most popular programming languages to provide the examples: Python, C, C++, Java, C#, JavaScript, SQL, Go, Visual Basic, PHP, Fortran, Pascal (Delphi), MATLAB, assembly language, Scratch, Swift, Kotlin, Rust, COBOL, Ruby.

I will also throw in some of my personal favorites: LISP, Objective Caml, Haskell, Shell, Awk, Perl, Lua, and ATS-lang.

High Level Languages

Languages that do not expose memory access to the programmer tend to be memory safe, because they reduce the opportunity for programmers to make memory violation mistakes. When accessing a null pointer or an array with an out-of-bounds index, these languages raise an exception that can be caught, or return a sentinel value (0 or an undefined value), rather than silently corrupting memory contents.

Examples: Python, Java, C#, JavaScript, SQL, Go, Visual Basic, PHP, MATLAB, Scratch, Kotlin, Ruby; LISP, Objective CAML, Haskell, Shell, Perl, Awk, Lua.

Under the hood, they employ an automatic memory management strategy such as reference counting or garbage collection, but programmers will have little to no influence over it because the language does not expose memory access.

It does not matter whether the language execution happens at the abstract syntax tree (AST) level, compiled to a byte code, or compiled to machine code. In general, any language could have a runtime implementation that spans the whole spectrum through Just In Time compilation.

Things to Watch Out For

High level languages are prone to unsanitized data execution errors such as SQL injection, code injection, and most recently Log4j. This happens when user input is passed through to a privileged execution environment and treated as executable code. High level languages often blur the line between data and code, so extra care must be taken to separate data from code execution. Data validation helps, but ultimately data should not influence code behavior unless it is explicitly designed to do so.

I strongly oppose using PHP or any products written in PHP, which is particularly notorious for SQL and code injection problems and single-handedly responsible for all highly critical WordPress vulnerabilities. But if you have inherited legacy infrastructure in PHP, there are principles that will help harden it.

Even though memory access errors are raised as exceptions, if these exceptions are not caught, they can still cause the entire program to abort. These languages also still allow potentially unbounded memory consumption leading to exhaustion, which causes the program to abort or suffer severely degraded performance, leading to denial of service.

Some languages provide an “unsafe” module which is essentially a backdoor to memory access. Using them is inherently unsafe.

Most languages also allow binding with an unsafe language through a Foreign Function Interface (ffi) like SWIG. This allows the high level code to run potentially unsafe code written in a non-safe language like C or C++.

Mid Level Languages

These languages expose some aspects of memory management to the programmer—such as explicit reference counting—and provide language facilities to make it safer.

Examples: Swift, Rust; also ATS-lang.

Performance is the main reason to use these languages, as memory management overheads have a negative performance impact, and in some time-sensitive applications, we have to carefully control when to incur these overheads. The tradeoff is programmer productivity, since programmers have to worry about more things. Since performance is the main concern, these languages tend to be compiled into machine code before running in production.

I want to call out ATS-lang because it is a language I helped work on for my Ph.D. advisor, Hongwei Xi. It was conceived in 2002 and predates Rust (2015). ATS code can mostly be written like Standard ML or Objective Caml with fully automatic memory management. It also provides facilities for manual memory management. Safety is ensured by requiring programmers to write theorems proving that the code uses memory in a safe manner (papers). The theorem checker uses stateful views inspired by linear logic to reason about the acquisition, transfer of ownership, and disposal of resources.

Things to Watch Out For

These languages are safe by virtue of the compiler checking for most programmer errors at compile time, but they still provide unsafe ways to access memory.

Furthermore, they are still prone to denial of service, SQL injection, and code injection vulnerabilities.

Low Level Languages

These languages require the programmer to manually handle all aspects of memory management for legacy reasons. For this reason, they are inherently unsafe.

Examples: C, C++, Pascal.

Although garbage collection was invented by John McCarthy in 1959 for the LISP language, that concept did not gain mainstream adoption until much later.

Even so, there are a few strategies to make these languages more memory friendly.

  1. Use an add-on garbage collector like Boehm-GC. Note that object references stored in the system malloc heap are not traced for liveness, so care must be taken when using both GC malloc and system malloc.
  2. C++ code should use the Resource Acquisition Is Initialization (RAII) idiom as much as possible. The language already tracks object lifetime through variable scope. The constructor is called when an object is introduced into the scope, and the destructor is called when the object leaves the scope. Smart pointers like std::unique_ptr and std::shared_ptr use RAII to manage memory automatically (see the sketch below).
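
To make strategy 2 concrete, here is a minimal sketch using only standard C++ (the file path is hypothetical, for illustration only):

#include <cstdio>
#include <memory>

// RAII wrapper: the constructor acquires the resource, the destructor releases it.
struct file_handle {
    explicit file_handle(const char* path) : f(std::fopen(path, "r")) {}
    ~file_handle() { if (f) std::fclose(f); }
    file_handle(const file_handle&) = delete;             // prevent accidental double-close
    file_handle& operator=(const file_handle&) = delete;
    std::FILE* f;
};

void read_config() {
    auto buffer = std::make_unique<char[]>(4096);  // heap memory, freed automatically
    file_handle config("/tmp/example.conf");       // hypothetical path
    if (config.f) std::fread(buffer.get(), 1, 4096, config.f);
}   // destructors run here: neither the buffer nor the file handle leaks,
    // even if we return early or an exception is thrown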

My particular contribution in this field is my Ph.D. dissertation (2014), which proposed a different type of smart pointer in C++ that does not auto-free memory but still helps catch memory errors. I showed that it is practical to use by implementing a memory allocator with the proposed smart pointer.

Legacy Languages

I have omitted Fortran and COBOL from the lists above. Historically, Fortran and COBOL only allowed static memory management: all the memory to be used by the program is declared in the code, and the OS provisions it before the program is loaded for execution. However, they never had any array bounds checking, so they are not memory safe. Furthermore, attempts to modernize these languages with dynamic memory allocation exacerbated the problem that they were not designed to be memory safe.

I have also omitted assembly languages as a whole. Assembly languages tend to be bare metal and provide completely unfettered memory access, but there has been some research on enhancing the safety of assembly languages (e.g. Typed Assembly Language).

Conclusion

There is no language that is completely memory safe. Some languages are safer by design because they either limit the programmer's ability to manipulate memory or give the programmer facilities to use memory in a safer way. However, almost all languages have ways to access memory unsafely in practice.

Memory safety violations are also not the only way to cause a security breach. Executing data as code is the main cause of SQL and code injection, which lead to highly critical privilege escalation, penetration, and data leak attacks. One safety net we can provide is making code read-only, but this does not diminish the importance of good data hygiene.

Unbounded memory consumption can degrade performance or cause denial of service attacks, so memory usage planning must be considered in the infrastructure design.

In any case, the language is not a panacea for memory safety or security problems. Programmer training and a culture that emphasizes good engineering principles are paramount. However, I would love to see renewed interest in programming language research to enhance language safety, by designing languages that encourage good practices.


Sunday, February 18, 2024

Infinite Monkey Theorem, Debunked

If you let a monkey hit random keys on a typewriter for an infinite amount of time, will it happen to write the complete works of Shakespeare? Common wisdom says that not only will it write Shakespeare, but it will “almost surely” type every possible text an infinite number of times.

The proof idea goes like this: the probability to produce a permutation matching the complete works of Shakespeare is very small, but still greater than zero, so it will “almost surely” happen.

This is assuming that the monkey will eventually produce such permutation with a non-zero probability.

What if some permutations are simply not possible? There could be several reasons why this particular monkey and this particular typewriter could not have written Shakespeare. In order of increasing complexity:

  • Some letters on the keyboard are broken, so some letters in the works of Shakespeare could not be typed.
  • Due to the clumsiness of the monkey, it can only hit the space bar before or after hitting one of the letters directly above it. This means that every word must begin and end with ‘z’, ‘x’, ‘c’, ‘v’, ‘b’, ‘n’, or ‘m’. Shakespeare often used words that do not satisfy this criterion, so the works of Shakespeare could not have been written.
  • The monkey is not able to repeat what it types within a certain number of words, so phrases like “to be or not to be” can never be written.

Infinity does not mean that it will allow every permutation. An example is the infinite sequence that concatenates the set of numbers consisting of only even digits: 0 2 4 6 8 20 22 24 26 28 40 42 44 46 48 etc. This sequence is infinite, but it would never contain the number ‘1’.

(On the other hand, the sequence of all even numbers concatenated will contain all finite numbers, including the odd numbers; since starting from 10—which is still an even number—we simply ignore the last digit and look at the tens and above, e.g. 123 is contained in 1230 1232 1234...)

This is not an argument about “almost surely” being practically zero. Instead, we are simply saying that the infinite monkey is not “almost sure” to produce Shakespeare, as its output is handicapped in some way, albeit still infinite.

If we want to apply the infinite monkey theorem elsewhere, infinity alone is not sufficient. We have to also show that the desired permutation has a non-zero probability to happen, as this is not a given. On the contrary, an application that only establishes infinity can be disproven by showing that the desired output actually has a zero probability of happening.

∞ ⨉ 𝜀 = ∞ but ∞ ⨉ 0 = 0

In reality, we often confuse the mechanism of permutation with the existence of the output. We cannot say that because the complete works of Shakespeare exist in the output, they must have been produced by the monkey. What if we have a programmed typewriter that, if the monkey hits ‘s’ a million times in a row, inserts the complete works of Shakespeare? The complete works “exist” in the output, but they were not produced through the monkey's mechanism of permutation.

(This is analogous to the panspermia hypothesis saying that life could have been introduced from an external source rather than permuted naturally within the system. The hypothesis allows that a system capable of sustaining finite life may not have been able to produce life infinitely.)

Showing that some improbable event has zero probability of being produced could be quite difficult but “almost surely” doable. However, if we try to categorically argue that “almost surely” is practically zero, then the argument is only “almost surely” believable.

Friday, January 19, 2024

Deep Learning and Solving NP-Complete Problems

This recent article about AlphaGeometry being able to solve Olympiad geometry proof problems brings to my attention that there has been active research in recent years combining deep learning with theorem proving. The novelty of AlphaGeometry is that it specializes in constructing geometry proofs, using synthetic training data guided by the theorem prover, which previous approaches have had difficulty with.

As a formal method, like type checking in programming languages, theorem proving tries to overcome the ambiguity of mathematical proofs written in natural language by formalizing proof terms in an algebraic language with rigorous semantics (e.g. Lean), so that proofs written in the language can be verified algorithmically for correctness. In the past, proofs had to be tediously written by humans, as computers had great difficulty constructing them. Computers can do exhaustive proof search, but the search space blows up exponentially with the number of terms, which renders exhaustive search intractable. However, once a proof is constructed, verification is quick. Problems that seem intractable to solve but are easy to verify are called NP-complete.

Traditional proof search limits the search space to simple deductions, which can offload some of the tedium of manually writing proofs. In recent years, machine learning has been used to prune or prioritize the search space, making it possible to write longer proofs and solve larger problems. It is now possible for computers to construct complete mathematical proofs for a selection of high-school-level math problems. The idea is a natural extension of what AlphaGo does to prune the game's search space, which I have written about before.

This is quite promising for a programming language that embeds theorem proving in its type system to express more elaborate safety properties, such as Applied Type System (ATS) designed by my advisor, which I have written about before. Programs written in ATS enjoy a high degree of safety: think of it as a super-charged Rust, even though ATS predates it. Now that generative AI can help people write Rust, a logical next step would be to create a generative AI for ATS.

I would be interested to see how machine learning can be applied to solve other NP-complete problems in general, such as bin-packing, job shop scheduling, or the traveling salesman problem. However, machine learning will face fierce competition here, as these problems are traditionally solved using approximation algorithms, which come up with a close-enough solution using fast heuristics. Also, only a few of them have real-world applications. Admittedly, none are as eye-catching as "AI is able to solve math problems," which gives the impression of achieving general intelligence; in reality, the general intelligence is built into the theorem proving engine when it is used in conjunction with machine learning.

However, I think proof construction is a very promising direction for machine learning. Traditionally, machine learning models are a black box: even though they can "do stuff," they cannot rationalize how they do it, let alone teach a human or share the same insight with a different machine learning model. With proof construction, at least the proofs are human readable and have a universal meaning, which means we can actually learn something from the new insights gained by machine learning. It is still not general intelligence, and the symbolic insight to be gained might very well be a mere probabilistic coincidence, but it does allow machine learning to begin contributing to human knowledge.

Sunday, January 7, 2024

So many camera systems! My thoughts.

My camera journey is a little meandering, but it can be roughly summarized as "all I want is the perfect camera."

My first camera is a Canon PowerShot A50, bought in 1999, which is a compact CCD camera. It can do 10-bit RAW, which I later discovered allowed me to correct underexposed photos more easily, and the colors actually looked decent! The camera writes to an original CompactFlash card. Without a reader, you'd have to transfer images out of the camera with a serial port dongle, which is very slow (115200 baud, or about 14 kB/s). I later sold it at a fundraiser.

Next I had a Sanyo Xacti VPC-HD1000, bought in 2007, which is a camcorder that can shoot 720p60, 1080p30, or 1080i60 video. It also came with a strobe light for photos. It had some very rudimentary electronic stabilization that didn't work very well. I tried Johnny Lee's Poor Man's Steadicam hack, but I was still not happy with the result. After a few years of use, the sensor suddenly went grey on me, and eventually it was disposed of through electronics recycling.

My next camcorder is a Panasonic HC-V750, bought in 2014, which can do 1080p. It had great image stabilization out of the box, but the color was a little washed out, so I never really used it very much.

My first somewhat serious camera is the Lytro Illum. Bought in 2015, it was the first computational camera that lets you adjust focus and depth of field after a picture has been taken. It was ahead of its time. Even though it had a 40MP sensor, the pixels had to record depth, so the final image was only about 10MP, and it was very noisy in low light. Nonetheless, it was a great tool for studying depth of field in composition. I still have it, but after Google acquired Lytro, they discontinued support. The software still works on an Intel Mac, and it should work on Apple ARM silicon for now using Rosetta 2 emulation, but its years are numbered. The Lytro software on Windows may have better luck!

In 2018, I decided I wanted an interchangeable lens camera that can shoot video. At that time, the Panasonic GH5S was one of the first that could shoot 4K 30p 10-bit in camera, although its 10-bit video is not natively supported by Mac OS even to this day (VLC/FFmpeg supports it). Over the next few years, I also started a collection of Micro Four Thirds lenses. I will talk about them later.

I was pretty happy with the GH5S, except the contrast detection autofocus is not very good. The lack of IBIS also meant I could not do much hand-held video. Some lenses have OIS, but they are not quite enough. I bought a gimbal, but it attracted enough attention that I was asked to leave the premises a few times when out shooting. Also, since Mac OS does not natively support the 10-bit H.264 from the GH5S, I had to transcode the footage to ProRes when editing videos, which is a pain. As a result, I used it more for stills photography. The 14-bit RAW is great for color grading, though the sensor is limited to 10.2 MP.

The 14-bit RAW at 10.2 MP means the GH5S is excellent for taking time lapses. The camera's built-in conversion only resulted in a lower-resolution 8-bit video, which I ended up using more due to laziness, but the RAW files can be converted to DNG using Adobe DNG Converter and edited in Davinci Resolve for the most pristine quality time lapse.

In 2022, I bought two more cameras which overcompensated for the want of image stabilization and the want of megapixels.

The camera that overcompensated for my want of image stabilization is the GoPro Hero 10. Although the image stabilization is decent, the 4K 60p video is very noisy even during daytime in the shade. The 5.3K cinematic video is probably better, but it can still only record 8-bit video (their next release later in 2022, the Hero 11, records in 10-bit). The GoPro also tends to overheat after 20 minutes or so of continuous recording. This is the first camera I have had to worry about overheating with, and sadly not the last. As a result, I did not end up using it very much.

The other camera that overcompensated for my want of megapixels is the Fujifilm GFX 100S, which is a whopping 102 megapixels. It is very satisfying to keep enlarging the picture and play a game of "Where's Waldo" with all the details. However, when pixel peeping, I can see some issues with chromatic aberration and corner softness at F/4 with the 32-64mm zoom lens. I don't have a different lens to know if this is a common issue. I've seen some photographers stop down to F/22, which tends to make the lens sharper for the high resolution.

Like its predecessor the GFX 100, the 100S can shoot 4Kp30 10-bit videos but is more compact. The phase-detect autofocus works great for me. The image stabilization is excellent both for photo and video, although in some situations the video can momentarily have a wobbling effect. Above all, I am most impressed with the colors. After experimenting with a few settings, combining the DR-P (dynamic range priority) mode with the Velvia film simulation gave me the best result. DR-P on its own gives a color that is too flat, but the Velvia simulation restores some contrast; Velvia on its own tends to be too contrasty, since it was originally a color reversal film with only 5 stops of dynamic range. Using DR-P and Velvia together balanced out the colors very well.

Although I shoot in RAW+JPEG, the out-of-camera colors are so good that my attempts to color grade the RAW usually result in slightly worse colors! The film simulation can also be applied to the 10-bit videos, and I am very impressed with the video colors as well. The video is 10-bit in H.265, which is natively supported by Mac OS. The 16-bit RAW files from the GFX 100S are still about 100MB apiece after lossless compression, so I find myself really slowing down and being more meticulous with composition when taking photos. However, it still attracted unwanted attention. I was again asked to leave the premises a few times when shooting.

Towards the end of 2023, I bought two more cameras, this time overcompensating in my attempt to avoid unwanted attention.

The first was the Sony A6700. It is slightly larger than the ZV-E1, but the APS-C A6700 has more compact lenses than the full-frame ZV-E1. The video looks very clean at 4K 60p. The low-light performance is great. Clear Image Zoom (digital zoom) also looks excellent. Not to mention, Sony has the best autofocus. However, Sony's image stabilization is lacking: the footage is usable when standing still, but unusable when walking, which means I would have to use the gimbal again. It also tends to overheat after 20 minutes, but the Ulanzi external fan helped.

For the second camera, I decided to try the DJI Osmo Pocket 3. I am very impressed with the low light performance. At night, its default setting tends to overexpose the image, but I find that -2 EV compensation looks the best. The 4K has a lot of detail at no zoom, but it begins to lose detail with digital zoom. Autofocus is reasonable, although I do notice the occasional hunting. The image stabilization is also impeccable, since the camera itself is a gimbal, and it fits into the palm of my hand! I think this has the best potential for avoiding unwanted attention.

In reality, I probably will have to use the Pocket 3 and A6700 interchangeably: shoot with Pocket 3 when walking, and A6700 with the 16-50mm lens if I need to zoom.

My biggest horror story with the A6700 happened when a friend asked me to do a photoshoot for their family, and they insisted that I use "my latest camera." After being spoiled by Fujifilm, I was unimpressed with Sony's S-Cinetone, which is supposed to be their best color profile. Worse, when color grading the RAW files, I had to start by correcting the terrible lens distortion and vignetting from the pancake lens I used. Remember, I bought this camera for the compactness, not for the glass! Correcting the lens defects in post is doable, and Sony can also do it in camera very well, but it took me hours. This is probably why Sony lets me apply my own LUTs in camera, and why many photographers sell their LUTs online.

So my experience with cameras is indeed quite meandering, but I will summarize my thoughts on camera systems here.

Canon

I purposefully avoided Canon throughout the years. I will give them credit that they have a respectable camera system with a wide selection of EF mount lenses for DSLR and now RF mount lenses for mirrorless cameras. Also respectably, Canon still develops their own sensors with reasonably good colors, while most other cameras just use Sony sensors. However, I avoided Canon for two reasons:

  • Back in the day, they deliberately crippled the video capabilities of their photography cameras to avoid cannibalizing their cinema camera line. This changed with the Canon EOS R5, which can shoot 8K internally in RAW. Although the R5 has overheating issues, the R5 C addressed them with an internal fan.
  • Although there are many third-party EF lenses and third-party cameras supporting the EF mount, they are reverse-engineered and not officially supported by Canon. Furthermore, Canon started being litigious with third-party RF lenses.

This means that photographers are buying into the Canon walled garden and have to put up with their superficial product segmentation, which could come back in the future if it isn't here already.

Nikon

I really don't have much experience with Nikon. I know some people are happy with it. That's about it.

Micro Four-Thirds

I have the most experience with Micro Four-Thirds, which includes cameras and lenses made by Panasonic and OM System (formerly Olympus). Panasonic lenses are pretty good for video, but I like the Olympus lenses best for their superior optical quality and bokeh. I am absolutely in love with the clutch manual focus of Olympus lenses: clutching the focus ring engages mechanical manual focus, and de-clutching it allows autofocus as well as focus by wire. It gives you a taste of pulling manual focus like a cine lens, but Olympus lenses only have about 90° of focus pull rotation compared to 200° for cine lenses, which makes them a little too sensitive for cinematography.

The Micro Four-Thirds sensor is 1/4 the surface area of full-frame with exactly a 2x crop factor, so focal length conversion is pretty straightforward. Due to the smaller sensor, the native telephoto lenses are miniature compared to their full-frame counterparts. The sensor size works well with Super35 (APS-C) cinema lenses. It is also possible to use the camera with B4 ENG lenses made for 2/3" sensors using a 2x teleconverter. In theory, this makes the system the most versatile and diverse, but throughout the years it has had a lot of unrealized potential.

First, the cameras were late to phase-detect autofocus. Olympus had PDAF first, but their cameras could not shoot 10-bit video. Panasonic only recently released the G9 II with PDAF. Even so, the G9 II body is exactly the same size as the full-frame S5 IIx. I feel that Micro Four-Thirds makes more sense with compact cameras and pancake lenses, but so far Sony is eating their lunch on compactness.

L-Mount

Another camera system of interest is the L-Mount, which was originally designed by Leica but is now shared by Panasonic, Sigma, and DJI in a closed consortium. It has a great lens selection in full-frame and APS-C formats, and the APS-C system can potentially be very compact as well. Leica cameras and lenses are premium, but Panasonic also has great cameras and lenses for video.

It is a promising system with good vendor diversity.

Sony E-Mount

Sony makes the vast majority of camera sensors, so there is no question about the image quality there. I am just not personally a fan of their color science.

Sony is also reasonable when it comes to licensing E-Mount to other lens makers, so there is a great diversity of lenses from Sigma, Voigtlander, and even Fujinon. However, the E-Mount was fundamentally designed for the APS-C sensor size and not for full-frame, as the throat diameter partially occludes the corners of a full-frame sensor. This becomes a challenge in lens design, where distortion and vignetting have to be corrected in camera or in post. Lens distortion and vignetting are prominent even in their high-end FE G-series lenses (e.g. FE 20-70mm F/4 G, FE 20mm F/1.8 G).

This means photographers who want to work with RAW will find themselves spending more time in post-processing. They could save some time by saving their own LUTs to the camera and letting the camera do both lens and color correction, which explains why Sony cameras seem to have a thriving aftermarket of LUTs.

Sony makes almost all E-Mount cameras, but it is not easy to choose the right camera, as they have many lines segmented for every need and every price point. Here is what I could gather about their camera lineup as of 2023.

Purpose        Full Frame        APS-C
Cinema         FX3, FX6, FX9     FX30
Vlogging       ZV-E1             ZV-E10
Photography    A7R V, A1         A6700
Videography    A7C II, A9 III    n/a

Some of the cameras share the same sensor: for example, the FX3 and ZV-E1 share the same full-frame sensor, and the FX30 and A6700 share the same APS-C sensor. Apart from the obvious form-factor differences, newer cameras tend to have more computational features that are missing from older cameras, and Sony is known for neglecting firmware updates, so the older bodies rarely catch up.

The A6700 is a more recent camera that is supposed to have the most intuitive user interface, but I still find myself confused. For example, shooting RAW photos also disables Clear Image Zoom for video. H.265 only offers 60p or 24p; 30p is missing. Sony also does not use the common names for video codecs: H.264 is called XAVC S, and H.265 is called XAVC HS. The camera has three separate modes: photo, video, and S&Q (slow and quick), but the custom presets in these modes are all completely independent. In photo mode, the video button can still record video, but in video mode the shutter button is disabled. None of it makes sense!

As far as I can tell, all the non-cinema cameras can overheat when shooting long video, and that is probably by design. Sony cameras tend to be the most compact, which makes thermal management more challenging.

Fujifilm / Fujinon

Fujifilm is the camera brand, while Fujinon is the lens brand. Fuji has two separate systems: the XF mount for APS-C, and the G mount for medium format. Sigma and Tokina also make third-party XF lenses, and Venus Laowa makes some G mount lenses. However, the vast majority of XF and G lenses are made by Fujinon, so the selection is more limited, though you should be able to find a lens for most focal lengths and apertures.

What Fujifilm is famous for is its legendary film simulation in camera, as I have mentioned before. The film simulation profiles are based on color analysis of real film stock: Astia, Provia, Velvia, Eterna, Acros, etc. I think they are successful not so much because of the nostalgic factor, but because the colors from the original films were already time-proven.

They recently released the GFX 100 II, which can shoot 8K, but due to the computational cost of scaling 100 MP down to 8K, the 8K is cropped from a 35mm image area, which is a shame. I actually think a 40 MP GFX that shoots 8K would make a lot of sense. They already have the X-H2, a 40MP APS-C camera that can shoot uncropped 8K. I would be happy buying into their XF system in the future.

Action Cameras, etc.

Action cameras like the GoPro can be fun to use for sports and travel, and they are fairly easy to carry around as an extra camera, just in case or for behind-the-scenes footage. They are more durable than camera phones and can stand a lot of abuse. However, GoPro's market share is increasingly being eroded by Insta360, DJI, and other Chinese companies. Action cameras are also known for poor low-light performance. The DJI Osmo Pocket 3 is probably going to eat their lunch, except it is not weather-proof or water-proof. People who use action cameras will probably always juggle multiple cameras anyway.

Conclusion

If you ask for my recommendation, my favorite camera system by far is still the Fujifilm for the legendary color science alone. It is such a joy to shoot.

I still think that the Micro Four-Thirds has a lot of potential. I would not mind replacing my GH5S with the G9 II when it breaks, but right now I do not see a need to upgrade. I would like to see more compact cameras and pancake lenses.

Sony is probably fine for casual video, but I don't recommend it for photographers because of the time they would need to spend in post-processing to correct the lens distortion and colors. Sony cameras are also not as intuitive to use, but I suppose one can get used to it.

Although my opinion about the Sony camera system is not favorable, this is just speaking from my own experience with some influence from the research done by others. In the end, photography is highly subjective to personal preference.

Friday, December 22, 2023

Why is a Transformer rated in kVA, not in kW?

I saw a post like this fly by on my Facebook feed today. I lost it when I closed the window, but the question lingered. Why is a transformer rated in kVA, not in kW? I did not know the answer, but I should, because I have also seen kVA ratings used in datacenter contracts rather than kW.

Many electricians point out that kVA represents apparent power, and kW is the available amount of work under load. The difference is the power factor representing efficiency, so kW = kVA × pf. In a perfect system, kVA = kW. And electricians explain that power factor is less than 1 due to transmission loss.

But someone also pointed out that for DC (direct current), kVA = kW because the current does not go out of phase (not a great explanation though). That reminded me of college physics: AC (alternating current) is a sine wave, and its RMS (root mean squared) voltage is smaller than the peak voltage. For a sine wave, rmsV = (√2 / 2) × Vp ≅ 0.707 Vp. RMS voltage, not peak voltage, is what is used to calculate work.
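Here is a quick numerical sanity check of that relationship (a minimal Python sketch; the 10 kVA rating and 0.8 power factor are just assumed example values, not from any real transformer):

import math

# Sample one full cycle of a unit-amplitude sine wave and compute its RMS value.
N = 100_000
samples = [math.sin(2 * math.pi * k / N) for k in range(N)]
rms = math.sqrt(sum(s * s for s in samples) / N)
print(rms)                      # ~0.7071, i.e. 1/sqrt(2) of the peak

# A nominal 120 V RMS line therefore swings to a noticeably higher peak voltage.
v_rms = 120.0
v_peak = v_rms * math.sqrt(2)   # ~169.7 V
print(v_peak)

# Apparent power vs. usable work under an assumed power factor.
kva = 10.0                      # hypothetical transformer rating
pf = 0.8                        # hypothetical power factor
kw = kva * pf                   # 8 kW of usable work from a 10 kVA rating
print(kw)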

As a side note, AC in the real world only approximates a sine wave, and the modified sine wave from a UPS can look much worse.

So the answer is this: a transformer is rated in kVA because it has to handle the peak voltage, while the usable work in kW is based on the RMS voltage, which would under-specify the design requirements of a transformer.

Here is an afterthought: when physics gets applied in the real world, the electricians learn a simplified version of it stated to them as a matter of fact, which is not wrong, but they do not understand the reasoning behind it. Some people are perfectly happy to learn just enough to get stuff done. But if you want to keep innovating, you have to understand the reasoning behind an existing system, or else you would reinvent the wheel poorly by not learning from past mistakes and making additional mistakes that people before you knew to avoid.

On that topic, I recommend watching this video by SmarterEveryDay - I Was SCARED To Say This To NASA... (But I said it anyway).



Saturday, July 15, 2023

ZFS zpool replace "no such device in pool"

Since I last built my ZFS pool in 2016 with 4x 1TB SSDs in a Thunderbolt 2 enclosure, I ran out of space in 2019 and upgraded the SSDs to 4x 2TB. I didn't have an extra enclosure, so I took a fairly risky approach: yank each SSD out, put in a new SSD, and let ZFS resilver. It only worked because my pool is a raidz1. During resilvering, the pool was in a degraded state due to the lack of redundancy.

And I am just about to run out of space again after 4 years. I also lucked out because SSDs are heavily discounted right now due to oversupply. The chip makers are trying to reduce production, so the price might rise again.

This time, I bought 4x 4TB M.2 NVMe (non-affiliate B&H link) and put them in a Thunderbolt 3 enclosure (non-affiliate vendor link). This means that I could do a proper replace without risking the degraded state.

But when I try to replace the SSDs, I keep getting the "no such device in pool" error. I tried these variations:

$ sudo zpool replace tank disk2 disk13
cannot replace disk2 with disk13: no such device in pool
$ sudo zpool replace tank disk2 /dev/disk13
cannot replace disk2 with /dev/disk13: no such device in pool
$ sudo zpool replace tank disk2 media-728C3B9F-647B-4C40-9DDF-E140E55D8C89
cannot replace disk2 with media-728C3B9F-647B-4C40-9DDF-E140E55D8C89: no such device in pool
$ sudo zpool replace tank disk2 /var/run/disk/by-id/media-728C3B9F-647B-4C40-9DDF-E140E55D8C89
cannot replace disk2 with /var/run/disk/by-id/media-728C3B9F-647B-4C40-9DDF-E140E55D8C89: no such device in pool

None of the above worked, and it was driving me nuts. When I was younger, I would have pulled my hair out. Now that my hair is more sparse, I've learned to take better care of it by showing restraint.

It turned out that "no such device in pool" actually meant it could not find disk2, even though it is able to find disk13 just fine for some reason. This worked:

$ sudo zpool replace tank media-E235653B-0924-2F46-B708-FDEFFAB5CB15 disk13

The colon in the error message makes it appear that "no such device in pool" is about the new disk, which is confusing, because the new disk should not need to be in the pool in the first place. The error is actually about ZFS not being able to identify the existing disk in the pool that needs replacement. In fact, if it could not find the new disk, the error message is different:

$ sudo zpool replace tank disk2 disk777
cannot open 'disk777': no such device in /dev
must be a full path or shorthand device name

The error message could be improved, for sure. I wonder how much baldness in the tech industry could have been prevented if more error messages were helpful.

Another thing I learned is that when upgrading raidz1, it is far more efficient to replace all the disks at once and let ZFS resilver them in parallel. It takes the same amount of time to resilver one disk as to resilver all disks. The reason is that resilvering requires reading all disks in order to check the parity, whether it is resilvering one disk or many. In my case, the process is bottlenecked by the read, not the write.
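For example, after the first replacement command, the remaining ones can be issued back to back, and they should all fold into the same resilver pass. This is just a sketch of what that looks like (the <old-disk-N> placeholders stand in for the long media- identifiers of the existing disks, and disk14 through disk16 are hypothetical names for the other new NVMe disks):

$ sudo zpool replace tank <old-disk-2> disk14
$ sudo zpool replace tank <old-disk-3> disk15
$ sudo zpool replace tank <old-disk-4> disk16
$ sudo zpool status tank   # all in-progress replacements show up under one resilver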

Friday, June 30, 2023

Beware of Passkeys Account Takeover Risks

What are Passkeys?

This is the problem passkeys promise to solve by replacing passwords, according to the FIDO Alliance:

Based on FIDO standards, passkeys are a replacement for passwords that provide faster, easier, and more secure sign-ins to websites and apps across a user’s devices. Unlike passwords, passkeys are always strong and phishing-resistant.​

Passkeys simplify account registration for apps and websites, are easy to use, work across most of a user’s devices, and even work on other devices within physical proximity.

There is a lot of fanfare. And if you want to read the spec, it is easy to see only pine needles and miss the forest. The passkeys spec is built upon several FIPS standards, and the underlying mathematics of the cryptography is sound. However, if the cryptography is used poorly, the overall security of the user experience can still be compromised.

The spec focuses on the communication protocol between the authenticator (device), the client (user), and the website (or app). A passkey consists of a public key and a private key. You give away the public key but keep the private key secret. You can prove possession of the private key without revealing it, and the evidence can be verified by the public key alone. Furthermore, you cannot guess the private key from the public key.
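As a rough illustration of that challenge-response idea, here is a minimal sketch in Python using Ed25519 signatures from the cryptography package (this is not the actual WebAuthn wire format, just the gist of the possession proof):

import os
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# Registration: the authenticator generates a key pair and hands out only the public key.
private_key = Ed25519PrivateKey.generate()   # stays on the device
public_key = private_key.public_key()        # stored by the website

# Login: the website sends a random challenge, and the device signs it with the private key.
challenge = os.urandom(32)
signature = private_key.sign(challenge)

# The website verifies the signature with the stored public key; it never sees the private key.
try:
    public_key.verify(signature, challenge)
    print("login accepted")
except InvalidSignature:
    print("login rejected")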

Yubico has an intuitive overview of how passkeys work, but here is my paraphrase of their sequence diagram.

What is the Problem?

The device used to create an account on a website stores the passkey so the user can log in from the same device later, but the spec does not prescribe what happens when the user wants to log in from another device. When using a hardware FIDO U2F security key, the private key is stored on the hardware and cannot be copied, but you can plug the key into another computer. Passkeys are software generated, so cross-device use is left to the solutions provided by the platform, with varying degrees of security.

The solution provided by Apple seems to suggest that the "device" is an iCloud keychain. This means the user has to trust Apple for safe-keeping of the private key on their servers. Despite Apple's synchronization security claims that passkeys are end-to-end encrypted, the fact that their implementation is "rate limited to help prevent brute-force attacks" and "recoverable even if the user loses all their devices" should raise an eyebrow. This means an attacker could still "recover" your passkeys without accessing any of your devices. This is permissible because the passkeys spec allows creating roaming passkeys.

The solution provided by Google seems to suggest that when a user wants to log in from a new device, the new device generates a new public key which is shown as a QR code, and the user has to take affirmative action to copy the new public key to an existing device to be authorized by an existing key. However, according to the FAQ, the private key is still synchronized to the Google account and is recoverable even if you lose your devices.

Security is only as strong as its weakest link. Any time a key is recoverable without using an authorized device, that is an opportunity for account takeover. If the recovery is done by sending a verification code via SMS, it can be susceptible to a SIM swapping attack. If the recovery is done by a special passcode, then the passcode itself can be brute-forced or stolen. These targeted attacks are typically aimed at high-profile public figures.

The one scenario where passkeys do better than passwords is when a website suffers a data breach. Since the website only knows the public keys, they are useless to hackers.

It Could Be Worse

Even if the platform makes account recovery impossible, there are still ways to design new-device login that render the security useless.

Let's say an existing device lets the user copy the private key to another device. That means malware running on the existing device can exfiltrate the private key. If the private key is shown as a QR code, then anyone with a telescope could steal it at a distance. The private key could be encrypted with a passphrase in transit, but then the passphrase could be brute-forced.

Alternatively, the website may send a push authentication to an old device to authorize the login. When an account is under attack, the user would be overwhelmed with many push notifications and might click the wrong button and accidentally accept. They may also be duped into thinking that the new login comes from a device they already enrolled that somehow lost its authorization.

Best Practice

The best way to implement passkeys is to store the private key in a hardware secure element that makes retrieval physically impossible. This makes the passkey always physically bound to the device. To enroll a new device, the user has to take affirmative action to copy the new public key out to an existing authorized device. After verifying that the user is who they claim to be, the existing device would then create a signature, using an existing key that is already trusted by the website, to certify that the new key is trustworthy. After the new public key is enrolled by the existing device, the website will allow the new device to log in.

This process of creating a new passkey on a new device to be authorized by an old device can still be automated by physically tethering the two devices. Authorization of new passkeys could also be done wirelessly over a reasonably secure channel, as long as the channel is mutually authenticated between the two devices and guarded against man-in-the-middle attacks. It is important to note that the passkey is still device-specific and never shared with another device. The user can keep a backup device in a bank safe deposit box if needed.
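Sticking with the earlier Ed25519 sketch, the cross-signing step might look roughly like this (hypothetical names throughout; a real implementation would keep the keys in a secure element and bind additional device metadata rather than sign the bare public key):

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat
from cryptography.exceptions import InvalidSignature

# Existing device: holds a key that the website already trusts.
old_device_key = Ed25519PrivateKey.generate()
trusted_public_key = old_device_key.public_key()   # already on file with the website

# New device: generates its own key pair; only the public key ever leaves the device.
new_device_key = Ed25519PrivateKey.generate()
new_public_bytes = new_device_key.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)

# After verifying the user (e.g. biometrics plus physical tethering), the existing
# device certifies the new public key by signing it with the already-trusted key.
endorsement = old_device_key.sign(new_public_bytes)

# The website checks the endorsement against the key it already trusts,
# then enrolls the new public key for future logins from the new device.
try:
    trusted_public_key.verify(endorsement, new_public_bytes)
    print("new device enrolled")
except InvalidSignature:
    print("enrollment rejected")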

Conclusion

It is fair to say that passkeys do mitigate data breaches from a website, but there are still a lot of ways a platform could introduce account takeover risks if it employs passkeys improperly. Even though the passkeys spec is built on sound cryptography, it does not rule out scenarios where account takeover could still happen under a targeted attack. Since security is only as strong as its weakest link, this is something people should be aware of.