NPM/Node.js code injection attack

By Archis Gore

NPM/Node.js recently had a clever, yet simple, code injection attack using a “dependency confusion” vulnerability. Below, I describe the attack as conducted (simulated, really), and a systemic solution that Polyverse specifically built to solve this problem.

A recap of the attack, for baseline:

Node dependencies are specified by name and version, but not by address/location. Take a look at an example scenario: {"sorter": "1.0", "binary-search": "2.0", "polyverse-billing": "1.0"}.

Notice the last one? It’s intended to be Polyverse internal, and contains our proprietary (and sensitive) billing code. Obviously it does not exist on the public upstream Node package repository. Instead, it comes from a private repository hosted by Polyverse.

In a sequence diagram, this is how the flow worked before the attack. Pretty straightforward.

All an attacker had to do was publish a public package called “polyverse-billing” on npmjs, and the flow would go like this:

The attacker was able to inject code through a dependency called polyverse-billing that I did not intend.
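The substitution can be sketched in a few lines. This is a simplified model, not npm's actual resolution algorithm: the two in-memory registries, the `resolve` function, and the assumption that the public registry is consulted first are all hypothetical and exist only to illustrate the flow described above.

```javascript
// Hypothetical in-memory registries; names, versions, and contents are illustrative only.
const privateRegistry = {
  name: "private",
  packages: { "polyverse-billing": { version: "1.0", code: "internal billing code" } },
};
const publicRegistry = {
  name: "public",
  packages: {
    sorter: { version: "1.0", code: "sort" },
    "binary-search": { version: "2.0", code: "search" },
  },
};

// Naive resolver: looks a package up by name and version only,
// and returns the first registry that has a match.
function resolve(name, version, registries) {
  for (const registry of registries) {
    const pkg = registry.packages[name];
    if (pkg && pkg.version === version) {
      return { name, version, source: registry.name, code: pkg.code };
    }
  }
  return null;
}

// Assumption: the public registry is consulted before the private one.
const lookupOrder = [publicRegistry, privateRegistry];

const before = resolve("polyverse-billing", "1.0", lookupOrder);
console.log(before.source); // private

// The attacker publishes a package with the same name and version publicly.
publicRegistry.packages["polyverse-billing"] = { version: "1.0", code: "attacker code" };

const after = resolve("polyverse-billing", "1.0", lookupOrder);
console.log(after.source); // public
```

Nothing in the request changed between the two calls. The same name and version simply started resolving to a different place.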

The fundamental problem: package identity does not consider package source.

Simply put, if two packages with the same name (say, sorter) originate from two different places, then once downloaded, either sorter can be substituted for the other, and their identity is considered the same.
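To make the identity problem concrete, here is a minimal sketch. The `identityOf` and `identityWithSource` helpers and the two example hosts are hypothetical; they only illustrate what it means for identity to ignore, or include, the source.

```javascript
// Identity as described above: name and version only.
function identityOf(pkg) {
  return `${pkg.name}@${pkg.version}`;
}

// Hypothetical alternative that also records where the package came from.
function identityWithSource(pkg) {
  return `${pkg.name}@${pkg.version}#${pkg.source}`;
}

// Placeholder hosts, for illustration only.
const fromPublic = { name: "sorter", version: "1.0", source: "https://public.example" };
const fromPrivate = { name: "sorter", version: "1.0", source: "https://private.example" };

console.log(identityOf(fromPublic) === identityOf(fromPrivate)); // true
console.log(identityWithSource(fromPublic) === identityWithSource(fromPrivate)); // false
```

Under the first definition, the two packages are indistinguishable once they are on disk, which is exactly what makes the substitution possible.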

A necessary step to mitigation

A simple, effective, and necessary step toward ensuring all dependencies are sourced from the expected place is to identify the most restricted location in the dependency chain, and use that, and only that, as the single source for all dependencies.

What this means is, if any one dependency in the entire tree comes from a private repository (more restricted than a public repository), then ALL dependencies should be served from a private repository (and ideally only one private repository.)

An all-private repository closure looks like:

With this model, you are insulated from any actions on upstream repositories, malicious or otherwise. There is only one source of truth, and it is the most restrictive of all the original sources (so anything that came from a less restrictive source only becomes that much more controlled).
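One way to audit for this single-source property is to check where each locked dependency was actually downloaded from. The sketch below assumes a parsed lockfile in the shape npm's package-lock.json uses in recent lockfile versions (a `packages` map whose entries carry a `resolved` URL). The `findForeignSources` helper and the host names are hypothetical.

```javascript
// Returns the lockfile entries whose "resolved" URL does not come from the approved origin.
function findForeignSources(lock, approvedOrigin) {
  const violations = [];
  for (const [path, entry] of Object.entries(lock.packages || {})) {
    if (!entry.resolved) continue; // e.g. the root project entry has no "resolved" field
    if (new URL(entry.resolved).origin !== approvedOrigin) {
      violations.push({ path, resolved: entry.resolved });
    }
  }
  return violations;
}

// Hypothetical lockfile excerpt; the internal host is a placeholder.
const lock = {
  lockfileVersion: 3,
  packages: {
    "": { name: "app", version: "1.0.0" },
    "node_modules/sorter": {
      version: "1.0.0",
      resolved: "https://npm.internal.example/sorter/-/sorter-1.0.0.tgz",
    },
    "node_modules/polyverse-billing": {
      version: "1.0.0",
      resolved: "https://registry.npmjs.org/polyverse-billing/-/polyverse-billing-1.0.0.tgz",
    },
  },
};

const violations = findForeignSources(lock, "https://npm.internal.example");
console.log(violations.map((v) => v.path)); // [ 'node_modules/polyverse-billing' ]
```

A check like this only reports where packages came from at install time. It does not, by itself, stop a package from another source being executed later.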

This leaves an open question though.

How can we enforce an approved source for all packages?

The problem with the model above is that Node itself still doesn’t know the difference between two packages with the same name that come from different sources.

Code signing does help in transport, but doesn’t do much good at execution time. How do you ensure that, at the time your application is executing, it isn’t executing a sorter it obtained from somewhere other than the approved source?

This is where Polyscripting comes into play. A description of Polyscripting is out of scope for this document, but can be found on our website.

Polyscripting creates a grammar that is unique to your instance, cluster, fleet, team, organization, etc. Packages written in the regular grammar (or in any other Polyscripted grammar) are completely unexecutable on instances with a particular unique Polyscripted grammar. This enforcement happens in the interpreter itself, at the very source of execution. This means that even if a package signature is forged, encryption is broken, or a side-channel transport is somehow used to place it on a host, it remains unexecutable.

We demonstrate how Polyscripting makes such enforcement easy and straightforward. An organization may have any number N of unique grammars assigned to individual instances, with the transformation dictionary and its association known only to the registry. Let’s assume for simplicity that there is only one grammar that is unique across the entire organization.
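As a rough intuition for grammar-level enforcement, consider the toy model below. It is not Polyverse's implementation: the one-keyword language, the `scramble` and `run` functions, and the dictionary contents are all hypothetical, and a real interpreter's grammar is far larger. The only point it illustrates is that code written in the standard grammar fails to parse on an interpreter that accepts only the scrambled one.

```javascript
// Hypothetical organization-wide dictionary: standard keyword -> scrambled keyword.
const dictionary = new Map([["print", "wrx"]]);

// Registry side: rewrite a package from the standard grammar into the scrambled one.
function scramble(source, dict) {
  return source
    .split("\n")
    .map((line) => {
      const [keyword, ...rest] = line.trim().split(/\s+/);
      if (!dict.has(keyword)) throw new SyntaxError(`unknown keyword: ${keyword}`);
      return [dict.get(keyword), ...rest].join(" ");
    })
    .join("\n");
}

// Interpreter side: accepts only scrambled keywords and rejects everything else.
function run(source, dict) {
  const toStandard = new Map([...dict].map(([standard, scrambled]) => [scrambled, standard]));
  const output = [];
  for (const line of source.split("\n")) {
    const [keyword, ...rest] = line.trim().split(/\s+/);
    const standard = toStandard.get(keyword);
    if (standard === undefined) throw new SyntaxError(`unknown keyword: ${keyword}`);
    if (standard === "print") output.push(rest.join(" "));
  }
  return output;
}

// A package that passed through the registry runs normally.
const trusted = scramble("print hello", dictionary);
console.log(run(trusted, dictionary)); // [ 'hello' ]

// A package in the standard grammar, placed on the host out-of-band, does not run.
let rejected = false;
try {
  run("print injected", dictionary);
} catch (err) {
  rejected = err instanceof SyntaxError;
}
console.log(rejected); // true
```

In this toy, the rejection happens in the interpreter rather than in the download path, so it does not matter how the unscrambled code reached the host.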

Now let’s see how the code injection flow would work with Polyscripting:

As you can see, the unintended code would never execute. You are assured that the code that executes came from an auditable, authorized place, and that code from anywhere else is never executed, even when someone adds a new package or pulls a package out-of-band.

Polyscripting provides a clear solution to supply chain attacks, especially Dependency Confusion.

