Besides working with source code written in C and C++, Joern supports CPGs generated from LLVM Bitcode.
This article shows a basic example of how to use llvm2cpg with Joern.
The basic workflow is the following:
- Convert a program into LLVM Bitcode
- Generate a CPG using llvm2cpg
- Import the CPG into Joern and start the analysis
Let's start with a simple C program:
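A minimal sketch of such a program, assuming two functions named source and sink (the names the analysis later in this article refers to). The layout is hypothetical: it is arranged so that the call to source sits on line 5 and the call to sink on line 6, matching the line numbers reported further down. The stub definitions at the bottom are also assumptions, added only so that the file links and runs on its own.

```c
/* foo.c: hypothetical stand-in, not necessarily the article's exact listing */
int source(void);     /* produces the value whose flow we track */
void sink(int value); /* receives the tracked value */
void foo(void) {
  int x = source();   /* line 5: the flow starts here */
  sink(x);            /* line 6: the flow ends here */
}

/* Assumed stubs so the sketch is self-contained; an analysis target
   could just as well leave source and sink undefined. */
static int counter = 41;
static int last_sunk = -1;
int source(void) { return ++counter; }
void sink(int value) { last_sunk = value; }
```

Dropping the stubs leaves source and sink as external declarations, so both calls stay visible in the generated bitcode.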
You can convert foo.c to the bitcode format by running clang on it with the flags -emit-llvm -S -g -O1 -o foo.ll.
Here is a brief explanation of what each flag does:
- -emit-llvm: tells clang to emit LLVM Bitcode instead of an object file or an executable
- -S: forces clang to emit the bitcode in a human-readable, textual format
- -g: enables debug info. Strictly speaking, this flag is not required, but it is essential if we want to map bitcode instructions back to the original source code
- -O1: by default, clang emits very inefficient bitcode with a lot of redundancy. This flag tells clang to apply some optimizations that make the bitcode more concise
- -o foo.ll: tells clang to store the result in the file foo.ll
foo.ll should contain the following:
Note: you will very likely see a different target datalayout and target triple, depending on the machine and OS you are running.
To convert LLVM Bitcode into a CPG, you need to get llvm2cpg and run the following command:
Once done, the CPG (/tmp/foo.cpg.bin.zip) can be fed to Joern.
Let's find a simple data flow in the program above:
Joern tells us that the result of the call to source (line 5) is passed to the function sink as an argument (line 6).
Looking at the original code, this is correct:
The previous example may seem too boring, so let's look at something a bit more interesting now. Consider the following program with a double-free bug:
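A minimal sketch of a program with this kind of bug, under stated assumptions: the function name process, its input parameter, and the 16-byte buffer are hypothetical and chosen only for illustration. The buffer is released on the error path and then released again unconditionally, so the same pointer reaches free twice whenever the error path runs.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: returns 0 on success, -1 on failure. */
int process(const char *input) {
  char *buf = malloc(16);
  if (buf == NULL) {
    return -1;
  }
  int status = 0;
  if (strlen(input) >= 16) {
    free(buf);          /* first free, on the error path */
    status = -1;
  } else {
    strcpy(buf, input); /* input fits, including the terminator */
  }
  free(buf);            /* second free: a double free if the error path ran */
  return status;
}
```

Only inputs of 16 characters or more trigger the bug, so a call such as process("short") runs cleanly. Both calls to free still receive the same pointer, which is the kind of flow the analysis in this section looks for.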
Following the same steps, we get a CPG:
Then we start the analysis. Here we want to see whether any value passed as an argument to the free function is later passed as an argument to the free function again:
By default, we get three flows as follows:
The first two are 'loops': each is a flow from a call to free to itself.
We can filter these results out by only asking for flows that are longer than one:
This yields the double-free bug in the program!