Friday, December 19, 2025

My advice if you are a 13 year old vibe coding to become the next Bill Gates...

(Insert obligatory AI generated Ghibli styled graphics here.)

If you are a 13 year old vibe coding to become the next Bill Gates, here are the computer science concepts you will need to know to successfully direct AI to do the right things. I assume you want to build the next big thing, not just fixate on the 1970s BASIC interpreter that Bill Gates wrote when he was 20 without AI. Also, if you want to go into fundamental research on AI or quantum computing, vibe coding is probably not what you're after, but it can be helpful to learn about randomized algorithms.

If you ask AI the same question today, you will get some hand-wavy advice, and here is my take on why it is not that useful.

  • Master prompt engineering.
    • Why this is not useful: the only way you master prompting is by having the right vocabulary for the fundamental concepts in computing, and you need to understand the concepts behind that vocabulary. It is not effective to prompt with an alphabet soup of jargon unless you use the terms in a meaningful way.
  • Learn the tech stack (e.g. Gemini, GitHub Copilot).
    • Why this is not useful: tech companies are going to make AI as easy to use as tap water. Obviously, there is fascinating practical engineering about water resourcing and plumbing infrastructure to get the water to your tap, but it's not exactly rocket science to learn how to turn on the faucet. Anyone who thinks they have a particular edge in prompt engineering will find out that it is quickly disappearing.
  • Problem solving and strategic vision.
    • Why this is not useful: don't just dream about things in your head. It is even more important to learn to try things and observe the outcome. If something didn't happen as expected, try to understand why. This is not something AI can do for you, since AI can only learn from its training data. Always validate your ideas in the real world and pivot as necessary.
  • Financial literacy.
    • If you pay attention in high school math class, you should have the tools you need to predict the outcome of your decisions. Watch this video about why Math Just Got Important.

Instead of recommending specific frameworks or products (since you can ask AI for more timely recommendations), here is a list of timeless computer science concepts that you will want to learn:

  • Computer architecture, especially the memory hierarchy and the principle of locality in the context of cloud services; also the separation of code and data, which is important for security.
    • Why is it important: the memory hierarchy and the principle of locality let you understand how all computer systems operate under the same space-time constraints, similar to relativity in physics, and the cost-benefit trade-offs needed to optimize for them (e.g. caching).
    • Without the discipline to separate code and data, mixing the two is a constant source of exploits, compromising cybersecurity at the national level.
  • Algorithms and data structures: when you learn about sorting, set your sights on the time and space complexity analysis and try not to get bogged down in the mechanism itself. Hash tables will be relevant to load balancing, and graph traversal to network architecture.
    • Learn programming in the context of algorithms and data structures so you have a vocabulary to describe them.
    • Why is it important: complexity analysis tells you why your system is provably not scalable when the dataset becomes larger, even if you add more computing resources to it.
  • Network architecture: DNS, IP addressing, client/server and load balancing.
    • Why is it important: although many "cloud" providers have this abstracted away, understanding the network architecture lets you design your cloud in a more cost effective way and be able to debug when issues arise.
  • Server side: HTTP and REST, cryptography (TLS, JOSE), SQL (particularly SQL injection, which is what happens when you fail to separate code and data).
    • Why is it important: although many frameworks also abstract away most of these details, your system architecture is limited by the assumptions made by the underlying protocols. Ultimately this boils down to trade-offs, whether to accept these limitations as "good enough" for your application or seek alternatives, e.g. HTTP long polling vs. WebSocket, X.509 vs. JWK, SQL vs. NoSQL.
  • Client side: fundamentally, at least HTML/DOM, CSS, and JavaScript.
    • They are very capable nowadays, so you don't really need a separate framework, but feel free to let AI try different frameworks.
    • Why is it important: you still need to know HTML, CSS, and JavaScript in order to debug framework code.

There are some more advanced topics that could be relevant for domain-specific work such as games, especially the triple-A titles with good graphics and physics, unless you are satisfied with writing yet another unimpressive Minesweeper.

In a world where the cost of answers is dropping to zero, the value of the question becomes everything. It is still important to learn the concepts so you have a vocabulary to express ideas in your head, and to observe if your ideas work in the real-world and pivot if not.

Friday, December 12, 2025

Secretive Key Attestation Possible?

On an old MacBook Pro 2019, I've been trying out Secretive (secretive.dev) to let me use Touch ID (backed by the Secure Enclave) as a hardware SSH key, like how I would use a YubiKey on other computers.

One of the things I would like to do is to create a certificate authority for key signing, to avoid having to manually add new keys to each host I have to SSH into. The certificate authority needs to be able to tell that the hardware key was indeed generated by some qualified hardware, so the private key cannot be easily exfiltrated. Also, I should not be able to cheat by generating a key pair on a computer, making a copy somewhere, then loading the private key onto the hardware token and pretending it was generated there.

Normally, using U2F with SSH gives me the option to generate a key attestation that can then be verified by the CA before signing the public key certificate. Essentially, I tell ssh-keygen to write the attestation to another file as the new key is generated, and the attestation has to be parsed using a custom script in order to check it against the vendor certificates published by the FIDO Alliance.
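For reference, the ssh-keygen side of that flow looks roughly like this (filenames are illustrative, and a FIDO2 token must be plugged in):

```shell
# Generate a random challenge for the token to sign into the attestation.
head -c 32 /dev/urandom > challenge.bin

# Create a FIDO2-backed SSH key; the token produces an attestation blob
# binding the new key to the authenticator, written to attestation.bin.
ssh-keygen -t ecdsa-sk \
  -O challenge=challenge.bin \
  -O write-attestation=attestation.bin \
  -f ~/.ssh/id_ecdsa_sk
```

The CA would then verify attestation.bin (together with challenge.bin) against the vendor certificates before signing the public key.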

There are several ways to generate a key from the Secure Enclave.

Note that the Secretive 3.0.4 release notes mention "GitHub attestation", but that is entirely different: it just means the binary built by GitHub Actions will now have provenance.

Even though DeviceCheck.DCAppAttestService seems like a likely candidate to generate attested keys, the fact that it requires a validation roundtrip to Apple's server can be problematic. It likely only works for apps uploaded to the App Store (which Secretive is not), and people will not be able to compile their own binaries and expect it to work. I'm also not sure how people may feel about Apple being able to keep a record of all their SSH keys.

Probably, it's easier to just buy another Yubikey.

Thursday, December 11, 2025

Certificate Signing Request with Authentication Credentials

A PKCS #10 (RFC 2986) certificate signing request (X509_REQ) typically contains at minimum the subject and the public key to be signed into an X.509 certificate issued by a certificate authority (CA). Here is an example of what I send to Let's Encrypt:

$ openssl asn1parse -item X509_REQ -in worship.likai.org.csr 
X509_REQ: 
  req_info: 
    version: 0
    subject: CN=worship.likai.org
    pubkey:     X509_PUBKEY: 
      algor: 
        algorithm: id-ecPublicKey (1.2.840.10045.2.1)
        parameter: OBJECT:secp384r1 (1.3.132.0.34)
      public_key:  (0 unused bits)
        0000 - 04 e4 5c 30 85 6f b5 37-35 f4 e9 21 3b 2c 74   ..\0.o.75..!;,t
        ... omitted for brevity ...
        005a - 33 2f c8 fb c1 21 86                           3/...!.
    attributes:
      <EMPTY>
  sig_alg: 
    algorithm: ecdsa-with-SHA256 (1.2.840.10045.4.3.2)
    parameter: <ABSENT>
  signature:  (0 unused bits)
    0000 - 30 66 02 31 00 ce 23 b9-c3 b0 53 24 2f d3 8c b0   0f.1..#...S$/...
    ... omitted for brevity ...
    0060 - cc fe a2 64 fa f2 7a e6-                          ...d..z.

X509_REQ can also contain extensions, most notably subjectAltName. Some of the extensions are documented in x509v3_config, which includes the ability to send arbitrary extensions. PKCS #9 (RFC 2985) defines the attributes that can be used in a PKCS #10 CSR. Arbitrary extensions are defined by an object identifier (OID) key and some ASN.1 value. We can look up the OID assignments.
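As a sketch (key and domain names are made up), a CSR carrying a subjectAltName extension can be produced with OpenSSL 1.1.1 or later:

```shell
# Generate a P-256 key and a CSR requesting a subjectAltName extension.
openssl ecparam -name prime256v1 -genkey -noout -out example.key
openssl req -new -key example.key -out example.csr \
  -subj "/CN=example.org" \
  -addext "subjectAltName=DNS:example.org,DNS:www.example.org"

# Inspect the requested extensions embedded in the CSR.
openssl req -in example.csr -noout -text
```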

As currently defined, X509_REQ does not have the ability to include authentication credentials; i.e. the CA only knows the subject and its public key, and the X509_REQ itself is signed with the subject's own private key for integrity, but the CA does not know whether the subject is who they claim to be.

That kind of trust has to be verified out of band. In the case of Let's Encrypt (how it works), the ACME protocol performs domain validation by fetching the result of a challenge response over plaintext HTTP at the domain name matching the CN subject in the CSR. This requires the host to be reachable from the public Internet, which is not feasible for internal servers. There are other challenge types: TLS-ALPN-01 also requires the host to be reachable from the public Internet; DNS-01 does not require the host to be reachable, but the requestor must be able to modify the DNS record, which can present other difficulties.

Enrollment over Secure Transport (EST) is another protocol to automate certificate signing (like ACME). It authenticates the requestor at the API layer, but not as part of the CSR itself; binding the API authentication to the CSR remains a challenge.

There are a few potential OIDs we can investigate for their suitability to include authentication credential as part of the CSR:

  • PKCS #9 defines a challengePassword OID 1.2.840.113549.1.9.7, but it is supposedly used to request certificate revocation. It seems to be largely unimplemented. Also, it is generally a bad idea to repurpose a field for a different purpose than originally intended.
    • The content of challengePassword is plaintext equivalent, which means the CSR should be protected by other means so it is not publicly accessible.
  • RFC 7894 proposes additional fields to disambiguate the PKCS #9 challengePassword, notably: otpChallenge, revocationChallenge, estIdentityLinking.
    • It assumes that the primary authentication is done by the underlying transport such as EST. The additional fields can provide a second-factor authentication.
  • The SMI mechanism OID 1.3.6.1.5.5 references additional mechanisms tracked by IANA. In particular:
    • The aforementioned RFC 7894.
    • There is a working draft draft-ietf-lamps-csr-attestation-08 to attest the CSR with evidence, which is intended to come from a hardware TPM or PSA. Presumably, the evidence can be further extended to cover FIDO U2F or passkeys. This evidence must not contain plaintext-equivalent secrets.
  • Kerberos OID 1.2.840.113554.1.2.2 (RFC 1964) does not have an OID for the challenge response, as far as I can tell.
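For what it's worth, openssl req can still emit the PKCS #9 challengePassword attribute through the [req_attributes] section of its config file; a minimal sketch (the names and password are made up):

```shell
# Write a req config whose attributes section carries challengePassword.
cat > csr.cnf <<'EOF'
[ req ]
prompt             = no
distinguished_name = dn
attributes         = req_attributes

[ dn ]
CN = internal.example.org

[ req_attributes ]
challengePassword = not-a-real-secret
EOF

openssl ecparam -name prime256v1 -genkey -noout -out internal.key
openssl req -new -key internal.key -config csr.cnf -out internal.csr

# The attribute is visible in cleartext, hence "plaintext equivalent".
openssl asn1parse -in internal.csr
```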

These proposals tend to avoid putting plaintext secrets into CSR, but the ability to secure secret credentials in a CSR remains elusive.

Envelope Proposal

To be able to include secrets in a CSR, we can use the CA's public key to encrypt the plaintext and store the ciphertext in an envelopedData structure. There are three potential OIDs defining envelopedData.

The envelopedData defined by PKCS #7 (RFC 2315) is obsoleted by the latest version, RFC 5652 (CMS), which seems to be the correct candidate to store encrypted secrets destined for the CA. If the CA has multiple certificates, they can all be included as recipientInfo entries.

Inside the envelopedData should be a SEQUENCE that contains at least the SET OF Attribute{{ IOSet }} as defined in PKCS #10 (RFC 2986). This also opens up the possibility to include anything from an LDAP directory OID 2.5.4 (RFC 4519 or ISO/IEC 9594-6:2020 or X.520) that might need to be protected from public access, such as userPassword 2.5.4.35 or maybe encryptedUserPassword 2.5.4.35.2.

Afterthoughts

On one hand, having a global OID means applications and protocols can share relevant definitions when appropriate, but navigating the maze of existing OID assignments can be an exercise in frustration. It is similar to code reuse: sometimes the code you borrow almost does what you need except for some corner cases, but you cannot change the corner case behavior because it may break existing users relying specifically on the corner case. So you have to choose the OID carefully based on how you expect the OID to evolve in the future, and try to find an OID that best aligns with the use case.

It is tempting to start from scratch and define exactly what an application needs in an application-specific way without any historical baggage, like JOSE (JWS/JWT). Their RFCs are still long, but still shorter than the maze of OIDs, RFCs, and ITU/ISO standards. It would be nearly trivial to abbreviate the CSR to another JSON object.

It is also interesting how OIDs leave breadcrumbs showing how the Internet standards evolved. At first, RSA developed PKCS under its OID 1.2.840.113549 {iso(1) member-body(2) us(840) rsadsi(113549)}; this namespace was later donated to IANA. Similarly, the OID 1.3.6.1 {iso(1) identified-organization(3) dod(6) internet(1)} used to be owned by the DoD (Department of Defense), now hijacked by IANA as well.

Tuesday, December 9, 2025

Runtime TLS Certificate Hot-Reload with OpenSSL

For long-running microservices that can act as both client and server, it is possible that the lifetime of the microservice outlives its TLS certificate validity, especially if an organization opts for a security policy that rotates TLS certificates quickly (e.g. every day). We need to be able to reload both the client and server certificates without exiting the service.

OpenSSL provides the functionality as follows:

  • For a client, the caller should use SSL_CTX_set_client_cert_cb() to provide a callback which is "called when a client certificate is requested by a server and no certificate was yet set for the SSL object." The caller provides a client_cert_cb() function that modifies the X509 ** and EVP_PKEY ** (pointer to a pointer), which OpenSSL will use to call SSL_use_certificate() and SSL_use_private_key() internally.
  • For a server, the caller should use SSL_CTX_set_tlsext_servername_callback() to provide a callback which is called during SNI. The caller provides a cb() function that modifies the SSL * object directly to override the SSL_CTX using SSL_set_SSL_CTX().

The callback can use an out-of-band method to reload the private key and certificate as needed. Once a connection has been authenticated during handshake, it will probably remain active even after the certificate expires, so the callbacks only apply to new connections.

To actively disconnect TLS transport with an expired certificate, some kind of connection pool management will be needed to wrap OpenSSL. This connection pool can also handle load balancing.

It is generally a good idea to allow certificate rotation to have some validity overlap so connections can be "upgraded" to the new certificates gradually. It avoids the problem where all connections die and have to be reconnected all at once, which can have undesirable latency impact.
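To sanity-check the overlap window, the validity of the outgoing and incoming certificates can be compared (file names are illustrative):

```shell
# Print the validity windows; the new certificate's notBefore should
# precede the old certificate's notAfter.
openssl x509 -in old.crt -noout -startdate -enddate
openssl x509 -in new.crt -noout -startdate -enddate

# Assert that a certificate remains valid for at least another hour
# (exits non-zero if it expires within 3600 seconds).
openssl x509 -in new.crt -noout -checkend 3600
```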