Fuzzing NPM packages

In today's post we are going to explore how to fuzz test an npm package from within the Node.js ecosystem.

Just to recap, fuzzing, in the context of software development, is a technique used to discover bugs, vulnerabilities, and unexpected behavior in code by providing random or mutated inputs to a program. It involves automated testing and is particularly useful for identifying software (security) flaws.

This technique has proved very successful for bug hunters and security researchers, but we have yet to see fuzzing gain much traction in mainstream software testing or in Secure Software Development Life Cycle (SDLC) programs.

Fuzzers are generally available for pretty much any programming language, yet only a handful exist for JavaScript/TypeScript, which seems at odds with how widely these languages are used today.

The npm ecosystem ranges from well-documented and thoroughly tested packages to unmaintained and abandoned ones, presenting a wide spectrum of quality and reliability.

The majority of security concerns around JavaScript/TypeScript applications arise from the challenges of managing dependencies, commonly referred to as "dependency hell". The problem occurs when a large number of dependencies are pulled in, some of which are unmaintained or not readily visible to developers. The infamous left-pad incident is a notable example.

Even when a package is known to be vulnerable to a certain attack, it is quite common that a proof-of-concept is missing or that the package is incorrectly marked as vulnerable (https://github.com/jaredhanson/utils-merge/issues/8).

No matter how well-developed or tested a codebase is, bugs are an inevitable part of software development. Writing tests is a tedious and rather unpopular part of development.

Looking over quite a few npm packages, many of them seem to lack tests altogether. The rest mostly rely on hard-coded test cases, a.k.a. example-based tests.

A common example:

// Prevent prototype pollution.
// (Imports assumed to make the snippet runnable: the `cases` helper from jest-in-case
// and the shvl package, run under Jest.)
import cases from 'jest-in-case';
import * as shvl from 'shvl';

cases('get(Object, key)', ({ key, value, test }) => {
  shvl.set({}, key, value);
  expect(shvl.get(Object, test || key)).toEqual(undefined);
}, {
  "set({}, '__proto__.b', 'foo')": { key: '__proto__.b', value: 'foo' },
  "set({}, 'a.__proto__.b', 'foo')": { key: 'a.__proto__.b', value: 'foo' },
  "set({}, 'constructor.prototype.b', 'foo')": { key: 'constructor.prototype.b', value: 'foo' },
  "set({}, 'a.constructor.prototype.b', 'foo')": { key: 'a.constructor.prototype.b', value: 'foo' }
});

Unit tests define fixed values for inputs and expected return values of functions. Similarly, integration and end-to-end tests involve hardcoding inputs and scenarios to be covered. In the case of end-to-end tests, the expected outputs of a function are replaced by what is anticipated to be displayed on the screen. Consequently, the majority of tests rely on precise sets of inputs and anticipate specific outcomes.

So which is better: property-based testing or fuzzing? I recommend reading an interesting piece on this topic: https://hypothesis.works/articles/what-is-property-based-testing/
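For contrast with the hard-coded cases above, a property-based test for the same shvl behaviour might look roughly like the sketch below, written with the fast-check library under Jest (this is our illustration, not part of the shvl test suite; it assumes the fast-check and shvl packages are installed):

const fc = require('fast-check');
const shvl = require('shvl');

// Property: for any path built from "safe" segments and any string value,
// reading back a value we just wrote returns that same value.
test('get returns what set stored', () => {
  fc.assert(
    fc.property(
      fc.array(fc.string({ minLength: 1, maxLength: 8 }), { minLength: 1, maxLength: 5 }),
      fc.string(),
      (segments, value) => {
        // Skip segments that contain dots or target the prototype chain;
        // the example-based tests above cover those cases explicitly.
        fc.pre(segments.every(s =>
          !s.includes('.') && !['__proto__', 'constructor', 'prototype'].includes(s)));
        const target = {};
        const path = segments.join('.');
        shvl.set(target, path, value);
        return shvl.get(target, path) === value;
      }
    )
  );
});

Instead of four hand-picked payloads, the framework generates hundreds of inputs and shrinks any failing case down to a minimal example, which is conceptually very close to what a fuzzer does.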

Let’s see how we can fuzz test a package from npmjs.com.

As we are not intending to reveal any 0-days in this post, let's pick a known vulnerability and try to find it with NodeBee. A very popular package with over 91 million weekly downloads, yargs-parser, will be a good candidate.

A serious prototype pollution vulnerability was discovered in this package back in March 2020 and was swiftly patched afterwards: https://security.snyk.io/vuln/SNYK-JS-YARGSPARSER-560381

Preparing the target package

So first let's grab the vulnerable version from the source repository releases.

https://github.com/yargs/yargs-parser/releases/tag/v18.1.0

Download, extract and install locally:

~ # wget https://github.com/yargs/yargs-parser/archive/refs/tags/v18.1.0.zip
Connecting to github.com (140.82.121.3:443)
Connecting to codeload.github.com (140.82.121.9:443)
saving to 'v18.1.0.zip'
v18.1.0.zip          100% |*************************************************************************************************************************| 57750  0:00:00 ETA
'v18.1.0.zip' saved
~ # unzip v18.1.0.zip
Archive:  v18.1.0.zip
  creating: yargs-parser-18.1.0/
~ # cd yargs-parser-18.1.0/
~/yargs-parser-18.1.0 # npm i --omit=dev

added 2 packages, and audited 3 packages in 8s

found 0 vulnerabilities
~/yargs-parser-18.1.0 #

After installing the package locally, we can verify from the Node.js REPL that it works. It is important to test how the target function behaves and what the inputs should look like.
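For example, a quick session along these lines (exact output may differ slightly) shows that the parser accepts both a raw string and an argv-style array:

~/yargs-parser-18.1.0 # node
> const parser = require('./')
undefined
> parser('--foo bar --num 42')
{ _: [], foo: 'bar', num: 42 }
> parser(['--foo', 'bar'])
{ _: [], foo: 'bar' }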

Looking at the code, the main export is the `Parser` function:

function Parser (args, opts) {
  const result = parse(args.slice(), opts)
  return result.argv
}

// parse arguments and return detailed
// meta information, aliases, etc.
Parser.detailed = function (args, opts) {
  return parse(args.slice(), opts)
}

module.exports = Parser

This calls the internal `parse` function:

function parse (args, opts) {
  opts = Object.assign(Object.create(null), opts)
  // allow a string argument to be passed in rather
  // than an argv array.
  args = tokenizeArgString(args)

Here `args` is produced by the library's `tokenizeArgString` function:

module.exports = function (argString) {
  if (Array.isArray(argString)) {
    return argString.map(e => typeof e !== 'string' ? e + '' : e)
  }

  argString = argString.trim()

  let i = 0
  let prevC = null
  let c = null
  let opening = null
  const args = []

  for (let ii = 0; ii < argString.length; ii++) {
    prevC = c
    c = argString.charAt(ii)

    // split on spaces unless we're in quotes.
    if (c === ' ' && !opening) {
      if (!(prevC === ' ')) {
        i++
      }
      continue
    }

    // don't split the string if we're in matching
    // opening or closing single and double quotes.
    if (c === opening) {
      opening = null
    } else if ((c === "'" || c === '"') && !opening) {
      opening = c
    }

    if (!args[i]) args[i] = ''
    args[i] += c
  }

  return args
}

We can see from the call stack above that `Parser` takes two arguments: the string (or array) input and an `options` object. Looking at the Snyk proof-of-concept (PoC), we only need the argument string, as this seems to be the main vector. Options are rarely controlled by user input, so they are not that interesting for now.

Creating a fuzzing harness

We quickly sketch out a basic harness which looks like the following:

// https://github.com/yargs/yargs-parser
// https://www.npmjs.com/package/yargs-parser
// vulnerable version 18.1.0

const yargsParserOld = require("/root/yargs-parser-18.1.0/");

module.exports = {
  handler: function (param1) {
    if (typeof yargsParserOld === 'undefined') {
      console.log("yargsParserOld : UNDEFINED");
      process.exit(1);
    }

    // PoC by Snyk
    // const parser = require("yargs-parser");
    // console.log(parser('--foo.__proto__.bar baz')); console.log(({}).bar);
    //
    // https://security.snyk.io/vuln/SNYK-JS-YARGSPARSER-560381

    const argv = yargsParserOld(param1);
  },

  inputSpec: [{
    handler: { // a handler field is currently required to distinguish multiple inputSpecs
      type: "fixed",
      data: "handler1"
    },
    param1: {
      type: "string"
    }
  }],

  paramKeys: ["param1"] // this tells the fuzzer what parameters to call the handler with and in what order
}

The above simply feeds a string from the fuzzing engine through the `param1` parameter, and the engine mutates the input based on the coverage fed back to the algorithm. (There is a range of configuration options, which are not highlighted here.)
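Before handing the harness to the fuzzer, it is worth smoke-testing it by hand. A minimal sketch, assuming the module above is saved as harness.js:

// smoke-test.js: exercise the harness handler directly with a sample input.
const harness = require('./harness');
harness.handler('--foo bar'); // should return without throwing
console.log('harness OK');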

After we are confident that this harness will work, let's instrument the package so the tool can gather feedback from the execution.

We can use either AFL or RQM mode. Without going into much detail for now, the main difference between the two algorithms is that AFL is good at finding simpler inputs fast, while RQM is slightly slower but better at figuring out how to bypass bottlenecks and explicit checks, thus revealing more complex inputs.

Instrumenting the package

Fuzzing

So let's stick with AFL mode for now and start fuzzing.

For catching bugs we will rely on the built-in sanitizers/checkers; however, it is important to highlight that these are fully customizable and can be written in JavaScript, in case we want to describe a specific behavior to test against (e.g. one that is not strictly security related).
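At their core such checks are plain JavaScript. A tool-agnostic sketch of what a prototype pollution check boils down to (this is not NodeBee's actual sanitizer interface) could look like this:

// Run the target with a candidate input and verify that Object.prototype
// did not gain any new own properties as a side effect.
function checkPrototypePollution (target, input) {
  const before = new Set(Object.getOwnPropertyNames(Object.prototype));
  target(input);
  const added = Object.getOwnPropertyNames(Object.prototype).filter(k => !before.has(k));
  if (added.length > 0) {
    throw new Error(`prototype pollution via [${added}] for input: ${JSON.stringify(input)}`);
  }
}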

For more insight, check out the “Custom Sanitizers” section of one of our previous blog posts, “Solving a CTF challenge with NodeBee”.

We can see that our fuzzer starts finding valid inputs and quickly (within 4 minutes) manages to find an input that can be used to trigger a prototype pollution vulnerability.

Validation of the finding

Checking the saved input, we can quickly validate the finding in the Node.js REPL.
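The exact input saved by the fuzzer is not reproduced here, but the Snyk-style payload quoted in the harness comments demonstrates the same class of finding:

~/yargs-parser-18.1.0 # node
> const parser = require('./')
undefined
> const argv = parser('--foo.__proto__.bar baz')
undefined
> ({}).bar
'baz'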

Indeed, it looks like a valid prototype pollution vulnerability, and the input vector is very similar to the one the Snyk research team found.

Now with a PoC saved, maintainers could quickly start working on fixes and use the fuzzer to validate that the issue has been properly mitigated.
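Alongside re-running the fuzzer, the saved PoC translates almost directly into a regression test. A sketch, assuming Jest as the test runner:

// Parsing a dotted __proto__ key must not add properties to Object.prototype.
const parser = require('yargs-parser');

test('does not pollute Object.prototype via __proto__ in dotted keys', () => {
  parser('--foo.__proto__.polluted baz');
  expect(({}).polluted).toBeUndefined();
});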

Here the package maintainers decided to use a `sanitizeKey` helper function, which aims to disarm a `__proto__` key passed in as input. However, we can see there might be room for further fuzzing even on the latest version, to check whether it is possible to circumvent this mitigation.

// TODO(bcoe): in the next major version of yargs, switch to
// Object.create(null) for dot notation:
function sanitizeKey (key: string): string {
  if (key === '__proto__') return '___proto___'
  return key
}
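To try that, the harness above only needs its require path pointed at the patched package:

// Fuzz the latest release instead of the local vulnerable copy
// (install it first, e.g. npm i yargs-parser in the harness directory).
const yargsParserLatest = require("yargs-parser");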

Summary

To summarize: with a very simple setup and minimal effort, it is possible to reveal high-severity vulnerabilities in the target code.

Furthermore, fuzzing can easily be used to verify either that a fix has been properly implemented or that a new change does not introduce a flaw into the code base.

We keep hearing that fuzzing is only a component for organizations with more mature SDLC or Application Security programmes and requires deep security expertise, which is clearly not the case.

In our forthcoming posts, we will demonstrate how to run this fuzzing setup within a CI/CD environment commonly employed by open-source package maintainers, and illustrate how to consume the results much like the output of a state-of-the-art security scanner or analyzer.

Stay tuned, and let us know what targets you would like to see us fuzz!

