Thursday, April 19, 2018

My Experience with AWS CloudFront and Lambda@Edge

I've got a few classic Google Sites with custom domains, and now I want to enable TLS for them. I thought this would be easy; after all, both App Engine and Blogger now generate managed TLS certificates for custom domains. I thought classic Google Sites would be the same, but I was wrong. Not only is this not supported, but some guy over at the product support forum turned indignant at people like me who were seeking help.

The basic idea is to use a CDN to terminate TLS. There are many CDN providers, such as AWS CloudFront, Google Cloud CDN, and CloudFlare, and in theory it shouldn't matter which one you use.

AWS CloudFront worked well for my AWS S3 bucket lifecs-static.likai.org, which hosts the static assets for this blog. An S3 bucket named after a custom domain generally can't be accessed over HTTPS under that name, because the wildcard certificate only covers the immediate *.s3.amazonaws.com but not further subdomains. Fortunately, CloudFront is able to serve from an S3 bucket directly. For other destinations, you have to specify the origin protocol. My hello.likai.org is hosted by nearlyfreespeech.net, and it can be accessed over HTTPS as likaior.nfshost.com. They do provide a custom script to enable Let's Encrypt, but I still decided to go with CloudFront for TLS termination.

One of the Google Sites that I want to enable TLS for is cs.likai.org, which is a classic Google Site. The site can also be accessed over HTTPS as sites.google.com/a/likai.org/cs, but when accessed in this manner, the links generated in the page have the wrong base URI (the full sites.google.com etc.), which won't work when served over a custom domain. On the other hand, Google Sites refuses to serve HTTPS for custom domains, so the origin protocol would have to be plain HTTP. Even so, there is another problem: style sheets and images have HTTP links as well, so when naively served over HTTPS, the browser's mixed-content policy would block those resources and break the rendering of the site.

As I was playing around, I noticed that we can use AWS Lambda with CloudFront to rewrite requests and responses, also known as Lambda@Edge. This can be used to rewrite the links in the page to the preferred scheme-less form. For example, http://example.com/foo/bar would become //example.com/foo/bar, which means that when the base document is served over HTTPS, the link to example.com will also be HTTPS, while the link would be HTTP for a base document served over HTTP. Of course, since you can rewrite the response body with Lambda, I could have used https://sites.google.com/a/likai.org/cs as my origin so I could have end-to-end encryption, and trimmed the base URI from the response.
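To make the scheme-less rewrite concrete, here is a minimal sketch of the string transformation on its own. The helper name toSchemeRelative is made up for illustration, and it uses a plain regex rather than a URL parser; it only touches http: and https: links and leaves everything else as it is.

```javascript
'use strict'

// Hypothetical helper (not part of the original script): turn an absolute
// http(s) link into a scheme-relative one.
function toSchemeRelative(link) {
  // Strip "http:" or "https:" only when it is followed by "//host".
  // Relative links, mailto:, javascript:, data: and so on are returned as is.
  return link.replace(/^https?:(?=\/\/)/i, '')
}

console.log(toSchemeRelative('http://example.com/foo/bar'))
console.log(toSchemeRelative('mailto:someone@example.com'))
console.log(toSchemeRelative('/foo/bar'))
```

The browser then resolves //example.com/foo/bar against whatever scheme the base document was loaded with.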

Before, cs.likai.org was a CNAME for ghs.google.com. Obviously, I could not tell CloudFront to use cs.likai.org as the origin and then also point cs.likai.org as a CNAME back to CloudFront, as this would create an infinite request loop, so I created a new mapping cs-http.likai.org in the Google Apps admin console. The plan is to serve cs.likai.org with CloudFront, which fetches cs-http.likai.org over unencrypted HTTP. It would have been easier if CloudFront let me use ghs.google.com as the origin and override the Host header, but it wouldn't let me.
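For reference, the origin part of that plan might look roughly like the following, written as a JavaScript object in the shape that I understand the CloudFront API uses for a custom origin. This is only a sketch: the Id is a made-up label, and the rest of the distribution config (aliases, cache behaviors, certificate) is left out.

```javascript
'use strict'

// Sketch of a single custom origin entry. Field names follow the CloudFront
// API's Origin structure as I understand it; treat them as an assumption and
// check them against the API reference before use.
const originConfig = {
  Id: 'classic-sites-http',         // hypothetical origin ID
  DomainName: 'cs-http.likai.org',  // the extra mapping that avoids the loop
  CustomOriginConfig: {
    HTTPPort: 80,
    HTTPSPort: 443,
    // Google Sites won't serve HTTPS for the custom domain, so fetch over HTTP.
    OriginProtocolPolicy: 'http-only'
  }
}

console.log(JSON.stringify(originConfig, null, 2))
```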

I wrote the following script for the Lambda (feel free to adapt it for your own use), to be associated as a CloudFront origin-response handler.
'use strict'

const cheerio = require('cheerio')
const url = require('url')

function stripLinkProtocol(attrname, elem) {
  const $ = cheerio
  // cheerio uses htmlparser2 and domhandler internally, so elem is a
  // domhandler DOM-like object.
  const link = $(elem).attr(attrname) // $.attr() is O(n), beware!
  if (!link) // e.g. a <form> without an action attribute
    return
  const parsed = url.parse(link)
  // Only strip http: and https:; leave mailto:, javascript:, etc. alone.
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:')
    return
  const newlink = link.substr(parsed.protocol.length)
  $(elem).attr(attrname, newlink) // $.attr() is O(n), beware!
}

function stripHtmlLinks(html) {
  const $ = cheerio.load(html)
  $('[href]').each((i, elem) => { stripLinkProtocol("href", elem) })
  $('[src]').each((i, elem) => { stripLinkProtocol("src", elem) })
  $('form').each((i, elem) => { stripLinkProtocol("action", elem) })
  return $.html()
}

exports.handler = (event, context, callback) => {
  const response = event.Records[0].cf.response;
  if (response.status == 200 &&
    response.headers['content-type'] &&
    response.headers['content-type'][0].value.startsWith('text/html') &&
    response.body) {
    response.body = stripHtmlLinks(response.body)
    response.headers['x-lambda-edge'] = [
      {'key': 'X-Lambda-Edge', 'value': 'html-strip-link-protocol'}
    ]
  }
  callback(null, response)
}
Deploying it wasn't as straightforward, though, so here are some highlights of the surprises I encountered along the way:
  • Lambda@Edge only supports NodeJS 6.10 (as of April 19, 2018) even though AWS Lambda in general supports other runtimes.
  • The IAM role for Lambda@Edge needs an additional trust relationship (but not the extra IAM permissions, which are only for associating the Lambda with CloudFront).
  • Any external dependencies (such as cheerio) must be prepackaged and uploaded as a naked zip file (the added files are not contained in a root folder). File permissions in the zip have to be world readable, or you will get an "EACCES: permission denied, open '/var/task/index.js'" error. The extra dependencies can be fetched easily with the "npm install ..." command. The main Lambda entry point must be called index.js.
  • You have to publish a version of the Lambda before you can associate it with CloudFront. After association, this version can't be deleted until all replicas have been disassociated.
  • To test the Lambda without affecting what's live, create a new CloudFront distribution specifically for testing, then attach the testing Lambda version to it. Do note that CloudFront updates are very slow (they can take a few hours).
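For the trust relationship in particular, the role's trust policy has to name the Lambda@Edge service principal in addition to the regular Lambda one. Here is a sketch of that policy document, built as a JavaScript object so it can be printed and pasted into the IAM console; the variable name is arbitrary.

```javascript
'use strict'

// Trust policy for the Lambda execution role: both the regular Lambda
// service and the Lambda@Edge service must be allowed to assume it.
const trustPolicy = {
  Version: '2012-10-17',
  Statement: [
    {
      Effect: 'Allow',
      Principal: {
        Service: ['lambda.amazonaws.com', 'edgelambda.amazonaws.com']
      },
      Action: 'sts:AssumeRole'
    }
  ]
}

console.log(JSON.stringify(trustPolicy, null, 2))
```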
I also created a test event, which helps with trying out different things.
{
  "Records": [
    {
      "cf": {
        "config": {
          "distributionId": "EXAMPLE"
        },
        "response": {
          "status": "200",
          "headers": {
            "content-type": [
              {
                "value": "text/html; charset=utf-8",
                "key": "Content-Type"
              }
            ]
          },
          "statusDescription": "OK",
          "body": "<link rel='stylesheet' type='text/css' href='http://www.gstatic.com'> <a href='http://example.com/foo/bar'>"
        }
      }
    }
  ]
}
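An event like this can also be fed to a handler locally with plain Node, without going through the Lambda console. Below is a minimal sketch. To keep it runnable without cheerio, the handler here is a regex-based stand-in with the same shape as my script, not the script itself; with the real code you would load the handler from index.js instead.

```javascript
'use strict'

// Stand-in handler (an assumption for this sketch): same flow as the real
// one, but it rewrites links with a regex instead of cheerio.
function handler(event, context, callback) {
  const response = event.Records[0].cf.response
  const type = response.headers['content-type']
  if (response.status == 200 && type &&
      type[0].value.startsWith('text/html') && response.body) {
    // Drop "http:" or "https:" at the start of href/src/action values.
    response.body = response.body.replace(
      /\b(href|src|action)=(['"])https?:(?=\/\/)/gi, '$1=$2')
    response.headers['x-lambda-edge'] = [
      { key: 'X-Lambda-Edge', value: 'html-strip-link-protocol' }
    ]
  }
  callback(null, response)
}

// A trimmed-down version of the test event.
const event = { Records: [{ cf: {
  config: { distributionId: 'EXAMPLE' },
  response: {
    status: '200',
    statusDescription: 'OK',
    headers: {
      'content-type': [{ key: 'Content-Type', value: 'text/html; charset=utf-8' }]
    },
    body: "<a href='http://example.com/foo/bar'>"
  }
} }] }

// The callback runs synchronously here, so result is set right away.
let result = null
handler(event, {}, (err, response) => {
  if (err) throw err
  result = response
})
console.log(result.body)
console.log(result.headers['x-lambda-edge'][0].value)
```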
For some reason, even though the test passed and CloudFront had apparently finished deploying with no errors, I still wasn't able to see any evidence that my Lambda ran. The links were left as they were, and there was no sign of the X-Lambda-Edge response header I inserted in my code.

And then I discovered this fine print in "Updating HTTP Responses in Origin-Response Triggers", buried under the mountains of documentation:
When you’re working with the HTTP response, note that Lambda@Edge does not expose the HTML body that is returned by the origin server to the origin-response trigger. You can generate a static content body by setting it to the desired value, or remove the body inside the function by setting the value to be empty.
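In other words, an origin-response trigger can change headers and can replace the body wholesale, but it can't read what the origin sent. A sketch of what that leaves room for, going only by the quoted passage: the handler name, the header value, and the 404 page below are all made up for illustration.

```javascript
'use strict'

// Hypothetical origin-response handler that never reads response.body,
// since the origin's body is not exposed to this trigger.
function originResponseHandler(event, context, callback) {
  const response = event.Records[0].cf.response

  // Adding or changing headers does not depend on the body.
  response.headers['x-lambda-edge'] = [
    { key: 'X-Lambda-Edge', value: 'origin-response-ran' }
  ]

  // The body can only be set to static content, e.g. a custom error page.
  if (String(response.status) === '404') {
    response.body = '<h1>Not found</h1>'
    response.headers['content-type'] = [
      { key: 'Content-Type', value: 'text/html; charset=utf-8' }
    ]
  }

  callback(null, response)
}

// Simulated event: note there is no body field, as in a real invocation.
const simulated = { Records: [{ cf: { response: {
  status: '404', statusDescription: 'Not Found', headers: {}
} } }] }

let out = null
originResponseHandler(simulated, {}, (err, response) => { out = response })
console.log(out.body)
```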

I feel that AWS Lambda@Edge is similar to App Engine, in the sense that if you wanted, you could write the whole server in Lambda, which has access to backend storage like DynamoDB and S3, and AWS takes care of replication and scaling. But the Achilles' heel of AWS is the slow replication.

I should have created an App Engine app to front my classic Google Sites instead; then I would get HTTPS for free.

2 comments:

Anonymous said...

Perhaps you should state upfront that everything in this article will not work...

Anonymous said...

As Cloudfront is currently implementing Cloudfront Functions (a Lambda edge replacement) I added this problem to their issue list on github.

https://github.com/aws-samples/amazon-cloudfront-functions/issues/3