Securing AWS Lambda By Standing on the Shoulders of Giants

At Polyverse, we’re committed to supporting the open-source community. Although plenty of companies say this with varying levels of sincerity, what does it actually mean in practice?

The value of open source lies in shared knowledge. We’re all standing on the shoulders of giants by using the collective knowledge of others. More knowledge allows you to make novel connections between disparate concepts, which is where the path to innovation often lies. With this idea in mind, we’re sharing a new path that connects several important concepts in order to close an important security hole that has caused an enormous amount of damage over the years: code-injection attacks. In this post and its proof of concept, we’ll look at code-injection attacks in the context of AWS Lambda, custom runtimes, and how PXP, our Polyscripted PHP interpreter, protects you.

Code Injection

To set the stage, let’s take a look at code-injection attacks. Code injection is an attack where, for a variety of reasons, external input is interpreted as executable instructions, enabling an attacker to inject malicious code into a running application. This subverts the application, enabling an attacker to use it for their own ends.
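To make that concrete, here is a minimal sketch of the pattern in PHP. The calculate function and its inputs are hypothetical, made up for illustration; the point is only that whatever text the caller supplies becomes part of the program.

```php
<?php
// Deliberately unsafe, for illustration only: the caller's text is
// pasted into a string of PHP source and handed to eval().
function calculate(string $expression)
{
    return eval('return ' . $expression . ';');
}

// Intended use: simple arithmetic.
var_dump(calculate('2 + 3'));            // int(5)

// Injected input: the same call, but the "expression" is arbitrary PHP.
// strrev() is harmless; an attacker would pick something that is not.
var_dump(calculate('strrev("denwo")'));  // string(5) "owned"
```

Nothing about the second call looks different to the application. It asked for an expression and got one.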

“That seems stupid!” you might say. “I always sanitize my inputs, and nobody just blindly uses eval, or imports user input as executable code.”

Sure, I guess. But “nobody” is kind of a strong word, and even if you’re pretty careful about such things, not everyone else in the world is. In 2018, we released Polyscripted WordPress, our solution to the code-injection attacks that are rife in WordPress, which is written in PHP. You can read a description of how we solve the code-injection problem here. The problem is less in WordPress itself and more in the ecosystem of plugins surrounding it. Plugins are third-party code, maintained by people who may or may not have the same commitment to security and cleanliness that you do, and the reality of software development is that nobody has the time to do a full audit of every line of code used in every plugin or third-party library. You’re required to place an enormous amount of trust in the original authors.

AWS Lambda

Next, we’re going to look at the seemingly unrelated concept of Function as a Service and AWS Lambda. The idea of offloading the time and cost of infrastructure maintenance is, in a way, as old as the Internet itself. For a very long time, people have been using dumb terminals to log in to powerful mainframes run by large institutions to do work that they wouldn’t be able to perform with their own hardware. This eventually evolved into the idea of IaaS, or infrastructure as a service, exemplified by the AWS platform, where the maintenance of physical servers, redundancy, and failure tolerance is handled by a company that specializes in such things. Finally, this gave rise to the idea of FaaS, or function as a service, which leads us to AWS Lambda, where the cost of provisioning servers and worrying about processor load has been abstracted away, letting you focus solely on creating the code for a particular use. This is pretty nifty, and most people agree it’s a good thing.

What is both a blessing and a curse is that AWS Lambda allows you to use Layers, a way of leveraging reusable code, be it code you write yourself or third-party code someone else has written. Code reuse, a staple of software development, is simple and good… mostly. All you need to do is specify an Amazon Resource Name, or ARN, and the code shows up in your /opt directory. Easy. At this point, however, we return to the problem of auditing code that other people have written. You save yourself the time of writing it, at the cost of trusting the original author.
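A full audit may be out of reach, but you can at least notice when a layer’s contents change underneath you. The sketch below is one hedged way to do that: it walks a directory and prints a SHA-256 digest per file, so two versions of a layer can be compared. The fingerprintTree name and the command-line usage are our own assumptions for illustration and are not part of Lambda or of the PXP repo.

```php
<?php
/**
 * Return [relative path => SHA-256 digest] for every file under $root.
 */
function fingerprintTree(string $root): array
{
    $root = rtrim($root, '/');
    $digests = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if ($file->isFile()) {
            // Strip "<root>/" so digests from different machines line up.
            $relative = substr($file->getPathname(), strlen($root) + 1);
            $digests[$relative] = hash_file('sha256', $file->getPathname());
        }
    }
    ksort($digests);
    return $digests;
}

// Pass the directory to inspect as the first argument, for example an
// unpacked layer on your workstation, or /opt inside the function.
$root = $argv[1] ?? null;
if ($root !== null && is_dir($root)) {
    foreach (fingerprintTree($root) as $path => $digest) {
        echo $digest, '  ', $path, PHP_EOL;
    }
}
```

This only tells you that something changed, not whether the change is malicious. The trust problem described above remains.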

Custom Runtimes and Polyscripted PHP: PXP

Finally, we patch the security and trust hole with PXP, our Polyscripted PHP interpreter. In late 2018, AWS Lambda announced the addition of custom runtimes. Custom runtimes enable you to use any language you want. This could be Erlang, it could be Ook!… it could be whatever your heart desires. Shortly after the announcement of custom runtimes, the PHP custom runtime for AWS Lambda was announced. At this point, you can start to see where we might want to start thinking about security and code-injection concerns.
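If you haven’t looked inside a custom runtime before, the core of one is small: an executable named bootstrap that asks the Lambda Runtime API for the next event, runs your handler, and posts the result back. The sketch below shows that loop in plain PHP. The AWS_LAMBDA_RUNTIME_API variable, the 2018-06-01 paths, and the Lambda-Runtime-Aws-Request-Id header come from AWS’s Runtime API; the hello handler and the helper names are hypothetical, and this is not the code in our repo. Error reporting and timeouts are left out.

```php
<?php
// Sketch of the event loop a custom runtime's `bootstrap` runs.
const API_VERSION = '2018-06-01';

function nextInvocationUrl(string $api): string
{
    return "http://{$api}/" . API_VERSION . '/runtime/invocation/next';
}

function responseUrl(string $api, string $requestId): string
{
    return "http://{$api}/" . API_VERSION . "/runtime/invocation/{$requestId}/response";
}

/** Pull the request ID out of raw HTTP response header lines. */
function requestIdFrom(array $headerLines): ?string
{
    $name = 'Lambda-Runtime-Aws-Request-Id:';
    foreach ($headerLines as $line) {
        if (stripos($line, $name) === 0) {
            return trim(substr($line, strlen($name)));
        }
    }
    return null;
}

/** Hypothetical handler, shaped like the sample output in this post. */
function hello(array $eventData): array
{
    return ['msg' => 'hello from PHP ' . PHP_VERSION, 'eventData' => $eventData];
}

// Outside Lambda the variable is unset, so the loop never starts.
$api = getenv('AWS_LAMBDA_RUNTIME_API');
while ($api !== false) {
    // Blocks until Lambda has an event for this runtime.
    $event = file_get_contents(nextInvocationUrl($api));
    // $http_response_header is filled in by PHP's HTTP stream wrapper.
    $requestId = requestIdFrom($http_response_header ?? []);
    if ($event === false || $requestId === null) {
        break;
    }
    $result = json_encode(hello((array) json_decode($event, true)));
    $context = stream_context_create(['http' => [
        'method'  => 'POST',
        'header'  => "Content-Type: application/json\r\n",
        'content' => $result,
    ]]);
    file_get_contents(responseUrl($api, $requestId), false, $context);
}
```

Because the runtime is just a program you supply, the interpreter that runs your handler is yours to swap out, which is the opening that a Polyscripted interpreter uses.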

Today, we’d like to announce our proof-of-concept PXP Polyscripted PHP interpreter for AWS Lambda. You get to write code however you like. There’s no messing around with limiting child processes or permissions, and no complicated configuration or tuning steps. You just write your code, our custom interpreter scrambles it, you upload everything to the cloud, and away you go. We believe in inverting security by shifting the burden of time to the attacker, not the defender, so we strive to make our products as simple as possible for you to use and, more importantly, understand.

Installation

Clone the repo at https://github.com/polyverse/pxp-lambda. Detailed installation instructions are located in the repo’s readme.

Let’s run a quick test to make sure our Polyscripted PHP Lambda function works. Go to the root of the project and run ./build.sh -a. When that’s done, run cd lambda-func and then run ./test.sh. You should see something like:

START RequestId: 52fdfc07-2182-154f-163f-5f0f9a621d72 Version: $LATEST
We can send info the to log. This is $eventData:
Array
(
   [name] => world
)
END RequestId: 52fdfc07-2182-154f-163f-5f0f9a621d72
REPORT RequestId: 52fdfc07-2182-154f-163f-5f0f9a621d72 Init Duration: 11.37 ms Duration: 5.92 ms Billed Duration: 100 ms Memory Size: 1536 MB Max Memory Used: 23 MB
{"msg":"hello from PHP 7.2.0","eventData":{"name":"world"}}

W00t! Our function compiled and ran just fine. If you want to run further tests, you can just replace the code in handler.php, rebuild, and re-run the test.sh script.

If you want to see what the scrambled code looks like, just go to lambda-func/output/, unzip src.zip, and take a look at handler.php. You’ll see something like:

wkwynEQR hello!$eventData= - mOxMdCp
{
    AuLZFNSZBfDy "We can send info the to log. This is \$eventData:\n";
    print_r!$eventData=;
    $response ~ [
        'msg' ~> 'hello from PHP '.PHP_VERSION^
        'eventData' ~> $eventData^
    );
    XZSBQbojbq $response;
}

…which looks nothing like standard PHP at all, but still runs great. Since we provide a Polyscripted interpreter as well as Polyscripted code, everything matches up, and what looks like nonsense runs as you’d expect standard PHP to run.
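The matching works in both directions. As a quick, hedged illustration, the sketch below asks whichever interpreter runs it whether a statement parses. The parsesOnThisInterpreter helper is hypothetical; AuLZFNSZBfDy is the scrambled token from the sample above, and the expected results assume a stock PHP 7 or later interpreter.

```php
<?php
/**
 * Report whether $code parses on the interpreter running this script.
 * The leading "return;" means valid code is compiled but never executed.
 * Only meant for simple statements like the ones below.
 */
function parsesOnThisInterpreter(string $code): bool
{
    try {
        eval('return; ' . $code);
        return true;
    } catch (ParseError $e) {
        return false;
    }
}

// Standard PHP, on a standard interpreter.
var_dump(parsesOnThisInterpreter('echo "hi";'));          // bool(true)

// The same statement with the scrambled token from the sample.
var_dump(parsesOnThisInterpreter('AuLZFNSZBfDy "hi";'));  // bool(false)
```

On the Polyscripted interpreter built for that sample, we’d expect the two results to swap, since it only recognizes the scrambled spelling.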

Testing

We’ve included a nice, soft vulnerability for you to poke at here. Check out the code in vuln-func/src/vuln.php:

function goodbye($data) : array
{
    $info = eval($data['payload']);
    $response = [
        'msg' => $info,
        'eventData' => $data,
    ];
    return $response;
}

Notice that there’s an eval statement, just sitting there in the open, dewy-eyed, with no idea of the horrors we’re about to expose it to. Let’s throw a nice, simple echo phpinfo() statement at it.

In order to do this, build the project by running ./build.sh -a from the root of the project. When that’s done, run ./test.sh from the vuln-func directory. You should see something like:

START RequestId: 52fdfc07-2182-154f-163f-5f0f9a621d72 Version: $LATEST
Parse error: syntax error, unexpected 'phpinfo' in /var/task/src/vuln.php(5) : eval()'d code on line 1
END RequestId: 52fdfc07-2182-154f-163f-5f0f9a621d72
REPORT RequestId: 52fdfc07-2182-154f-163f-5f0f9a621d72 Init Duration: 13.98 ms Duration: 6.86 ms Billed Duration: 100 ms Memory Size: 1536 MB Max Memory Used: 6 MB
{
    "errorType": "Runtime.ExitError",
    "errorMessage": "RequestId: 52fdfc07-2182-154f-163f-5f0f9a621d72 Error: Runtime exited without providing a reason"
}

Whaaaat? Syntax error? Did we forget to capitalize something? Did we mix up the order of arguments because that sort of thing shifts around randomly in PHP? Nope. Echo just doesn’t exist anymore. Well, it exists, but it’s now called gzrblplatz or foobiebletch or something. We don’t know, and the best part is that the attackers don’t know either. The great thing is that we don’t care, but attackers care very much, because they have to know what to write in order to execute malicious code. Our scrambler has taken your standard PHP, and randomized the names and tokens, then provided an interpreter that ONLY understands those scrambled names and tokens, allowing you to code in the same way you always do, but thwarting those who want to subvert your elegant code to their malicious ends.
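For a feel of the idea, here is a toy model in plain PHP. It is a sketch under loud assumptions: the real scrambler changes the interpreter itself, while this version only renames three keywords in source text using PHP’s tokenizer. The function names (makeDictionary, scramble, containsStockKeyword) are made up for this example.

```php
<?php
// Toy model only: rename three keywords and leave everything else alone.
const STOCK_KEYWORD_TOKENS = [T_ECHO, T_FUNCTION, T_RETURN];

/** Give each keyword a fresh random 12-letter replacement. */
function makeDictionary(array $keywords): array
{
    $dict = [];
    foreach ($keywords as $keyword) {
        $name = '';
        for ($i = 0; $i < 12; $i++) {
            $name .= chr(ord('a') + random_int(0, 25));
        }
        $dict[$keyword] = $name;
    }
    return $dict;
}

/** Rewrite keyword tokens; strings and identifiers pass through untouched. */
function scramble(string $source, array $dict): string
{
    $out = '';
    foreach (token_get_all($source) as $token) {
        if (!is_array($token)) {
            $out .= $token; // single-character tokens such as ; or (
            continue;
        }
        [$id, $text] = $token;
        $isKeyword = in_array($id, STOCK_KEYWORD_TOKENS, true);
        $out .= $isKeyword ? $dict[strtolower($text)] : $text;
    }
    return $out;
}

/** True if the source still uses a keyword the scrambled interpreter dropped. */
function containsStockKeyword(string $source): bool
{
    foreach (token_get_all($source) as $token) {
        if (is_array($token) && in_array($token[0], STOCK_KEYWORD_TOKENS, true)) {
            return true;
        }
    }
    return false;
}

$dict = makeDictionary(['echo', 'function', 'return']);
$source = '<?php function greet() { echo "echo me"; return 1; }';
$scrambled = scramble($source, $dict);
echo $scrambled, PHP_EOL;

// Your code was scrambled at build time, so it uses only the new names.
var_dump(containsStockKeyword($scrambled));               // bool(false)

// The attacker's payload was written against stock PHP.
var_dump(containsStockKeyword('<?php echo phpinfo();'));  // bool(true)
```

The string "echo me" survives untouched because the tokenizer sees it as a string literal, not a keyword. The attacker’s echo, on the other hand, is a keyword that the matching interpreter no longer has, which is the syntax error in the log above.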

Conclusion

At the moment, this is a proof of concept. More work needs to be done before this is ready for production, but what we’ve done is prove that the approach works. Randomizing instruction sets is a concept that’s been around for a while, but we’ve managed to connect the dots in a way that’s relevant and quite useful for the modern world. We’re scrambling an interpreter, and that interpreter doesn’t care what the symbols it interprets look like, as long as those symbols fit together properly. This means your applications run as quickly as always, but far more securely, and at no mental cost to you. There’s only one switch to flip, and by flipping it you’re forcing attackers to spend far more time creating their attacks than before, making it essentially impractical to attack you.

What’s even better is that everything is open source. Our work has been released under the MIT license, allowing you to do what you like with it. We believe that sharing knowledge makes the world stronger, and enables an innovative tech ecosystem.

We hope you find this interesting, and if you’d like to talk to us about Polyscripting, we’d be happy to hear from you. Contact us at info@polyverse.com or find the Polyverse team at AWS re:Inforce.
