Fuzzer for the Fonto XPath engine - How to fuzz your XPath expressions?

A few days ago I wrote a fuzzer for our XPath engine. A fuzzer is essentially a tool for finding bugs in code automatically by generating random inputs for a program and seeing whether it crashes on those inputs. Take a look at the following definition from Wikipedia:

Fuzzing or fuzz testing is an automated software testing technique that involves providing invalid, unexpected, or random data as inputs to a computer program. The program is then monitored for exceptions such as crashes, failing built-in code assertions, or potential memory leaks. Typically, fuzzers are used to test programs that take structured inputs.

In our case this means generating random XPath expressions and executing them with FontoXPath. In the following sections, I’ll go over the different components of the fuzzer. Take a look at the Add basic fuzzer #334 PR if you’re interested in the code.

This blog post is not intended to cover every aspect of fuzzing, if you’re interested in fuzzing, I highly recommend watching the Fuzz Week playlist on YouTube by Gamozolabs. Furthermore, I recommend following his Twitter, YouTube and Twitch channels.

Basic setup

The following pseudocode describes the basic loop that makes up the fuzzer:

while (true) {
	// 1. Select randomly from corpus
	let input = corpus.getRandomInput();
 
	// 2. Mutate the input
	mutate(input);
 
	// 3. Try to execute
	try {
		let _ = evaluateXPath(input);
	} catch (error) {
		// 4. Filter out expected errors
		// ...
 
		// 5. Report the unexpected crash
		report_error(error);
	}
 
	// 6. Report progress
	report_progress();
}

The essence of this loop is to mutate() the input and execute to see if it causes an error (catch(error)). In the following sections I’ll explain each step in more detail. Remember the goal of the fuzzer is find unexpected crashes given the input, it is not intended to test the resulting output for correctness, hence the let _ =. If you want to verify correctness, you’re better off writing unit – or integration tests.

1. Select randomly from corpus

A corpus is a set of inputs. These inputs may either be valid or invalid for the program under test. In our case those inputs are existing XPath expressions which I took from one of the Fonto applications. This yielded a corpus of ~200 valid XPath expressions.

This was sufficient to get a basic fuzzer running, but it is by no means a good corpus: it contains a lot of repetition, and it does not include a good representation of all possible XPath language constructs. But hey, you have to start somewhere. Building a good corpus is an art in itself, and I’m by no means an expert, so all I can do is refer you to the Gamozolabs videos. A better corpus will improve the performance of the fuzzer.

For every iteration in the fuzzing loop, the fuzzer selects a random input from the corpus.

2. Mutate the input

Now that we have a randomly selected input from our corpus, we need to mutate it because executing from the corpus over and over again won’t yield new results.

A mutation could be as simple as replacing a random character in the input with a randomly generated character. This assumes your input is a string, but you can do this for binary data as well by replacing random bits. This is sometimes referred to as a bit flipper.

Now you may think: “Making random mutations may not generate a valid XPath!”. And you’d be correct! But that is also the point: by making invalid inputs you may hit edge cases in the program under test which may cause crashes and that is what a fuzzer is looking for.

For the XPath fuzzer, I wrote two mutation functions:

A simple character flipper which replaces up to 4 characters randomly from the set abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_-[]{}()<>,./?:;\"\'|\\!@#$%^&*+=.
A string mutation which has a 5% chance to randomly reverse the input, 20% chance to delete parts of the input and a 20% chance to copy a random part.

The numbers in these mutation functions are hyperparameters which can be tuned to optimize the performance of the fuzzer.

These mutation functions are crude because it is unlikely to generate new XPath syntax. For example, it is very unlikely that it will generate a call to an XPath function like fn:subsequence. However, they’re good enough for an initial implementation. There are more intelligent mutation strategies, but that is subject for another time.

3. Try to execute & 4. Filter out expected errors

Now that we have a mutated input, we evaluate it and see what happens. That is exactly what we do: try to parse and execute the mutated XPath expression. This is done in a try/catch block to capture any errors thrown by FontoXPath, such that we can examine them.

Note the result of the evaluateXPath is not used. This is because the fuzzer has no idea what the expected value should be. A fuzzer is not meant to find those kinds of correctness bugs. Those bugs are better left to unit – or integration tests.

The FontoXPath engine throws specific errors for errors it specifically handles. For example, the XPST0003 error is thrown to signal an expression is not syntactically valid. Those are expected errors, so we filter them out to focus on unexpected crashes.

5. Report the unexpected crash

If an expected crash is caught, it is logged because it may very well be a bug. The log includes the input that caused the crash, which acts as a repro case, as well as the full details of the crash including the description and the stack trace.

Now you can imagine that lots of different inputs trigger the same crash and thus are symptoms of the same problem. It is generally not helpful to see all different inputs that cause the same crash. To prevent spamming logs, I implemented a straightforward de-duplication based on the stack trace. If the stack trace is the same, it is considered to be the same error. This is not perfect because there may be different paths in the program that lead to the same top of the stack but in our case it is sufficient.

Each unique crash may be an actual bug and deserves further investigation. This investigation needs to be done by a dev manually, but with the input and the crash details, it points the dev in the right direction.

6. Report progress

The final component of our fuzzer is the reporting of progress. It prints the total number of fuzz cases, the number of fuzz cases per second, and the number of unique crashes found.

Although your fuzzer does not necessarily need this, it is nice to see progress. The program under test may go into an endless loop for example and that would be visible in the total number of cases not increasing. Also, when tuning the hyperparameters mentioned earlier it is essential to see what the impact is on performance.

Results

Even with this basic fuzzer implementation, I managed to find 4 different unique crashes. 3 of which are arguably bugs in the XPath engine implementation. The final one is: Not implemented: let expressions with namespace usage, which is a known issue and thus should be filtered out.

It took about 16 million fuzz cases to find those 4 unique crashes. Remember: a fuzzer is essentially brute-forcing its way through the space of all possible inputs randomly. It took about 30 minutes on my laptop. Certainly there is plenty of room for optimization.

Although a fuzzer is not intended as a benchmark to measure performance, it does offer a good stress test for your program and a reasonable indication of how fast your program actually is.

Conclusion

Fuzzing is a useful technique to automatically find bugs in your program by randomly changing inputs and see if that causes unexpected crashes. Due to its random nature, it tends to generate inputs that you won’t think to write a unit test for or to be resistant against. Fuzzing does not prove your program works as expected, it only finds unexpected errors.

Writing a basic fuzzer is easy to do once you understand the core concepts of fuzzing, which I hopefully covered in this post. However, writing a good fuzzer requires specific expertise. Again, I highly recommend watching the Fuzz Week playlist on YouTube by Gamozolabs. Furthermore, I recommend following his Twitter, YouTube and Twitch channels.

Bert Willems

Product Architect @ Fonto. Passionate about structured content, web technology, computer sciences and natural language processing

Fuzzing FontoXPath

A fuzzer for the FontoXPath engine