Usage
=====

Import the SUT
----------------------------

A system under test (SUT) must be imported to be used with CTADL. The general
form of the command is:

.. code-block:: sh

   ctadl import -o

``ctadl import --help`` lists, among other things, the languages your
installation supports. The language is the language of the SUT. Artifacts are
specific to the language, as you'll see below.

Importing creates a directory (the one named with ``-o``) containing a variety
of results.

- The ``facts`` subdir represents the entire native program in a TSV
  (tab-separated values) format, suitable for input to CTADL. This format is
  typically referred to as Datalog "facts."
- Other subdirs, such as ``sources``, contain decompiled output.

Analyze Android APKs and Java bytecode
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To import ``myapp.apk``, you’d execute:

.. code-block:: sh

   ctadl import jadx myapp.apk -o

This creates the directory named with ``-o``, with everything needed to run
CTADL for ``myapp.apk``. It includes decompiled sources (in the ``sources``
subdir).

Ghidra PCODE
^^^^^^^^^^^^

To decompile and import ``/usr/bin/ls``:

.. code-block:: sh

   ctadl import pcode /usr/bin/ls

.. note::

   Importing a binary through Ghidra requires that Ghidra is installed and
   that the ``GHIDRA_HOME`` environment variable is set properly, typically to
   ``GHIDRA/lib/ghidra``, where GHIDRA is the directory into which Ghidra was
   extracted.

Index the SUT
----------------

Indexing runs our compositional data flow analysis over the entire SUT. Run the
CTADL indexer with:

.. code-block:: sh

   ctadl [--directory ] index

By default, it looks for the import in the current directory, but you can
provide a path, too. The indexing process autodetects the import language.

First, CTADL generates an ``index.dl`` containing the Datalog code for the
indexer. CTADL then checks whether it has already compiled an indexer for this
language. If not, it calls out to Souffle to compile the indexer, then runs it.

Next, this command creates an index, a SQLite database file ``ctadlir.db``. The
index contains a data flow graph, a call graph, and other analysis artifacts.
The filename is unfortunately *not* configurable due to the limitations of the
Souffle Datalog engine’s compiler.

To optimize indexing, make sure the index is written to a local disk rather
than over the network. You can pass ``-j`` to set the number of cores to use.
We recommend using as many as you can.

Indexing can take some time and unfortunately there’s no good way to measure
its progress. While it runs, we print a live view of resources consumed,
including load average and RAM consumption (if ``psutil`` is installed).

Query the SUT: Run Taint Analysis
--------------------------------------------

Run a CTADL query with the command:

.. code-block:: sh

   ctadl query [models.json]

CTADL reads the index from ``ctadlir.db`` and performs taint analysis. It
creates a ``query.dl`` file containing the complete Datalog code for the query.
It prints a summary of the paths, sources, sinks, and taint labels found. CTADL
writes the query results into ``ctadlir.db``. You can skip the query analysis
with ``--skip`` if it’s already cached in the index.

Without a ``models.json`` argument, CTADL chooses a default query. The default
query uses a pre-selected, language-specific set of interesting sources and
sinks.
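
Because the index and the query results both live in ``ctadlir.db``, an
ordinary SQLite file, you can look inside it with any SQLite client. The
following is a minimal sketch using Python's standard ``sqlite3`` module. It
only lists the table names it finds and assumes nothing about CTADL's schema.
The helper name ``list_tables`` is hypothetical and not part of CTADL.

.. code-block:: python

   import sqlite3
   from pathlib import Path


   def list_tables(db_path="ctadlir.db"):
       """Return the sorted names of all tables in a SQLite database.

       Hypothetical helper for illustration; not part of the CTADL CLI.
       """
       path = Path(db_path).resolve()
       if not path.is_file():
           raise FileNotFoundError(f"no index found at {path}")
       # Open read-only so that inspecting the index can never modify it.
       conn = sqlite3.connect(path.as_uri() + "?mode=ro", uri=True)
       try:
           rows = conn.execute(
               "SELECT name FROM sqlite_master "
               "WHERE type = 'table' ORDER BY name"
           ).fetchall()
       finally:
           conn.close()
       return [name for (name,) in rows]

Calling ``list_tables()`` from the directory that holds the index returns the
table names as a list of strings. Which tables appear depends on your CTADL
version and on whether a query has been run.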