Usage¶
Import the SUT¶
A system under test (SUT) must be imported to be used with CTADL. The general form of the command is:
ctadl import <language> <artifact> -o <workdir>
ctadl import --help lists, among other things, the languages
your installation supports.
The language is the language of the SUT.
Artifacts are specific to the language, as you’ll see below.
Importing creates a directory, <workdir> with a variety of results.
The
factssubdir represents the entire native program in a TSV (tab-separated values) formatted, suitable for input to CTADL. This format is typically referred to as Datalog “facts.”Other subdirs, such as
sources, contain decompiled output
Analyze Android APKs and Java bytecode¶
To import myapp.apk, you’d execute:
ctadl import jadx myapp.apk -o <workdir>
This creates an <workdir> directory with everything needed to run
CTADL for that myapp.apk. It includes decompiled sources (in the
sources subdir).
Ghidra PCODE¶
To decompile and import /usr/bin/ls:
ctadl import pcode /usr/bin/ls
Note
Importing a binary through Ghidra requires that Ghidra is
installed and that the GHIDRA_HOME environment variable is
set properly, typically to GHIDRA/lib/ghidra where GHIDRA
is the place where Ghidra was extracted.
Index the SUT¶
Indexing runs our compositional data flow analysis over the entire SUT.
Run the CTADL indexer with:
ctadl [--directory <working-directory>] index
By default, it looks for the import in the current directory, but you can provide a path, too. The indexing process autodetects the import language.
First, CTADL generates an index.dl containing the Datalog code for
the indexer. CTADL then checks whether it’s compiled an indexer for this
language before. If not, it calls out to Souffle to compile the indexer,
then runs it.
Next, this command creates an index, a sqlite database file
ctadlir.db. The index contains a data flow graph, a call graph, and
other analysis artifacts. The filename is unfortunately not
configurable due to the limitations of the Souffle Datalog engine’s
compiler. To optimize indexing, ensure that the index is not being
written to over the network. You can pass -j to set the number of
cores to use. I’d recommend using as many as you can.
Indexing can take some time and unfortunately there’s no good way to
measure its progress. We print a live view of resources consumed,
including load average and RAM consumption (if psutil is installed).
Query the SUT: Run Taint Analysis¶
Run a CTADL query with the command:
$ ctadl query [models.json]
CTADL reads the index from ctadlir.db and performs taint analysis.
It creates a query.dl file containing the complete Datalog code for
the query. It prints a summary of the paths, sources, sinks, and taint
labels found. CTADL outputs the query results into ctadlir.db. You
can skip the query analysis with --skip if it’s already cached in
the index.
Without a models.json argument, CTADL chooses a default query. The
default query uses a pre-selected, language-specific set of interesting
sources and sinks.