Simple, Scalable and Secure websites with S3 and EFS+ECS/Fargate

By Archis Gore

I was pretty excited about the feature release from AWS that allows mounting EFS volumes onto Fargate tasks. In this post, which was enabled by that very feature, I want to drill down into one particularly effective use-case of building static websites powered by a CMS.

Part 1: Security vs Usability

When Polyverse started, our entire website had a modest little home on GitHub Pages. It wasn’t the greatest thing since sliced bread (it didn’t support SSL for custom domains yet), but people would know that we existed. To this day a number of our web properties (conver.io, Entropy Visualiser, etc.) still reside on GitHub Pages, and if you don’t need more than what it offers, I would highly recommend sticking with it as long as possible.

Over time we kept the site static, which has a number of advantages, but moved it to S3, built a CI/CD pipeline to deploy to it (going back and forth between CodeBuild/CodeDeploy before ultimately settling on TravisCI), and CloudFront’ed it.

Our backend changed a bit too: from humble Markdown renderings, to the more sophisticated Ghost, and now React/Next.js.

I bet you’re wondering: why not use a CMS? What’s with all this self-inflicted complexity? You wouldn’t be the first to bring this up. CMSes were prone to trivial vulnerabilities (Polyscripting didn’t exist yet), required constant upkeep with patching and updates, and so on. Despite the complexity of our site, there was comfort in knowing that in no reasonable world would we wake up to find that S3 or CloudFront wasn’t handling the traffic being thrown at it. While building and deploying may have been a pain, if we came down with the flu (or worse, COVID-19), nobody had to monitor the website, and we knew it was always going to be fine.

We traded usability for security. While the draw of a CMS was not lost on us, it took a couple of experiments to get there.

Part 2: Polyscripting

The next key part of the puzzle was ensuring that no code that we didn’t want running could run on our CMS.

In native code there is a distinction between executable instructions (which are loaded once and stay fixed for the lifetime of the program) and data blocks (which can change as much as they want, but can never be executed).

Polyscripting does something similar for WordPress and prevents a large variety of attacks. While it is a fascinating technology, it is out of scope for this post.

Part 3: Serverless WordPress

With Polyscripting we now have a CMS with a reasonable defense against willy-nilly code injections in our WordPress (which we verified by commissioning a third-party penetration test).

However, we couldn’t get Polyscripting on any hosting provider (and despite what Hacker News would have you believe, code signing isn’t a solved problem). And I didn’t have the heart to follow AWS’s WordPress Reference Architecture diagram.

[Figure: AWS WordPress Reference Architecture diagram (aws-refarch-wordpress-v20171026.jpeg). Caption: NOT what we do]

Just… No! All of that complexity so one architecture can serve two distinct goals. The entity that’s serving websites to millions of people is simultaneously also somehow securely and easily trying to help a minuscule number of those people to author content on it.

For me, the solution necessitated Persistent (though not necessarily Stateful) filesystem mounting on Fargate Tasks. Let me get this out of the way: EFS latency can suck. EFS throughput can suck if you aren’t using Provisioned Throughput mode. I wouldn’t recommend running an internet-facing WordPress backed by EFS without very careful monitoring and tuning, at least initially. I personally prefer things I don’t have to look at.

But there are two things to consider:

  1. The interfaces are there. EFS may use NFS, but it doesn’t have to. I don’t care what it uses. And that’s the point. AWS can go make it faster later.
  2. Authoring doesn’t require nearly as much provisioned capacity as serving. At most there might be five people editing content on our website, while there might be thousands reading it.

That’s the crucial difference between a Headless CMS and what we ended up building. Even though a Headless CMS might separate the rendering of content, it still merges authoring with viewing.

We separated the CMS used for authoring from the simple S3 bucket used for serving. Ignoring the serving part for a moment, this is how we architect WordPress:

Yes, there’s a VPC in there (EFS still needs to figure out who it is when not connected to “A Server”). But it’s far simpler, with fewer moving parts and definitely fewer “servers” than the reference. The only things we worry about are patching the Polyscripted PHP in the container and periodically updating WordPress.
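To make the EFS-on-Fargate piece concrete, here is a minimal sketch of what such a task definition might look like, built as a plain Python dict. The field names follow the ECS task definition schema, but everything else is an assumption for illustration: the family name, the volume name, the container path, the CPU/memory sizes, and the IDs and image passed in are hypothetical, not our actual configuration.

```python
import json


def wordpress_task_definition(file_system_id, access_point_id, image):
    """Build an ECS task definition dict that mounts an EFS volume on Fargate.

    All names and sizes below are illustrative placeholders.
    """
    return {
        "family": "wordpress-authoring",  # hypothetical family name
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",  # required for Fargate tasks
        "cpu": "512",
        "memory": "1024",
        "volumes": [
            {
                "name": "wp-content",  # hypothetical volume name
                "efsVolumeConfiguration": {
                    "fileSystemId": file_system_id,
                    # IAM authorization requires transit encryption.
                    "transitEncryption": "ENABLED",
                    "authorizationConfig": {
                        "accessPointId": access_point_id,
                        "iam": "ENABLED",
                    },
                },
            }
        ],
        "containerDefinitions": [
            {
                "name": "wordpress",
                "image": image,
                "essential": True,
                "portMappings": [{"containerPort": 80, "protocol": "tcp"}],
                "mountPoints": [
                    {
                        # Must match the volume name declared above.
                        "sourceVolume": "wp-content",
                        "containerPath": "/var/www/html/wp-content",
                        "readOnly": False,
                    }
                ],
            }
        ],
    }


if __name__ == "__main__":
    task_def = wordpress_task_definition(
        "fs-12345678",  # placeholder file system ID
        "fsap-0123456789abcdef0",  # placeholder access point ID
        "example/polyscripted-wordpress:latest",  # placeholder image
    )
    print(json.dumps(task_def, indent=2))
```

A dict shaped like this could be passed to boto3's `register_task_definition(**task_def)` on an ECS client, or dumped to JSON for the AWS CLI. A real definition would likely also need an execution role and log configuration, and EFS volumes on Fargate need platform version 1.4.0 or later.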

The fortuitous launch of CloudWatch Synthetics gave us an External Cron which doubles up as a health-check without third-party services.

In total it took about an hour to set up and needs very little babysitting. Even if it were to go down though, it’s not a big deal, as we’re about to see…

Putting it all together

How does polyverse.com work? What’s the workflow? Here is the complete architecture:

On every change to WordPress, a Travis build is triggered using a simple plugin. The build renders the website into static pages, applies templating and styling, and then dumps everything into S3, which is CloudFront’ed for serving.
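The "dump everything into S3" step can be sketched roughly as follows. This is not our actual build script: the function name `plan_uploads`, the idea of comparing content hashes against a previously published manifest, and the directory layout are all assumptions made for illustration. The sketch only decides which rendered files need publishing and with what content type; it does not talk to S3.

```python
import hashlib
import mimetypes
import tempfile
from pathlib import Path


def plan_uploads(site_dir, already_published):
    """Return {s3_key: {"content_type", "sha256"}} for new or changed files.

    site_dir is the directory holding the rendered static site.
    already_published maps S3 keys to the SHA-256 hex digest last uploaded
    (a hypothetical manifest kept by the build).
    """
    root = Path(site_dir)
    uploads = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # S3 keys use forward slashes regardless of the build machine's OS.
        key = path.relative_to(root).as_posix()
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if already_published.get(key) == digest:
            continue  # unchanged since the last publish
        content_type, _ = mimetypes.guess_type(path.name)
        uploads[key] = {
            "content_type": content_type or "application/octet-stream",
            "sha256": digest,
        }
    return uploads


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        site = Path(tmp)
        (site / "blog").mkdir()
        (site / "index.html").write_text("<h1>Home</h1>")
        (site / "blog" / "post.html").write_text("<h1>Post</h1>")
        for key, info in plan_uploads(site, {}).items():
            print(key, info["content_type"])
```

Each planned entry could then be uploaded with something like boto3's `upload_file` with `ExtraArgs={"ContentType": ...}`, or the whole directory could simply be handed to `aws s3 sync`. The point is that publishing is a one-way copy of plain files, so the serving side never depends on WordPress being up.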

Authors never know this architecture exists. From their point of view, they are simply using the most popular CMS in the world. They can find help, guidance, tutorials, troubleshooting and support because it’s simply WordPress. They aren’t learning or using esoteric geeky technology in the name of security. Operationally, nobody is worried about scaling, predictions, load or taking down “a server”.

To sum it up, I wanted to call out how a seemingly small AWS feature reduced MASSIVE complexity and helped us achieve a no-compromise use case where we say yes to Security AND Usability. To learn more about Polyscripting and how it can protect your WordPress, visit our website.
