Clang Libtool

Technically, Clang receives an Abstract Syntax Tree (AST), build by Clang Parser (clang/Parse/*), not the input C/C++ code (although Parser is part of Clang code base). There is obviously a Lexer in between this process, but neither Lexer nor Parser is our focus in this tutorial. Clang is responsible to convert the AST to LLVM IR which is implemented in Clang CodeGen. Once source code is translated to LLVM IR, all instrumentation have to be in LLVM IR. It is certainly easy to instrument in fine-grain language like LLVM-IR then complicated top-level source code. But in certain cases (e.g. code formatter, code documentation etc.), it is expected to modify the source code (overwrite the source file).

Clang Libtool is an API to modify the AST while it is in buffer and write back it to the source file in input source language. So, a developer can insert/remove/modify any node in AST by the API functionality and all others will be handled by the libtool. This design gives the plugin developer flexibility to focus on specific purpose. Besides good understanding of C++ coding, a developer will require following knowledge:

  • AST tree structure, understands their nodes including their attributes
  • Basic code structure of Clang Libtool
  • AST matcher (this is optional but it gives more flexibility in tool development)

Note: Some may face various issues working on Clang Libtool. As an example: Clang Libtool only converts source code to AST but does not expand code with macros.

Writeup Explanation

In this write-up, we will write a Clang Libtool that will insert a different version of a target function in source code and replace all the call invokation of original function to the new function. Additionally, the newly defined function will print all the params value in it. Consider the following source code:

int doSum(int a, int b){
    int sum;
    sum = a + b;
    return sum;
}
int main(){
    return doSum(10, 20);
}

The above code is a simple C code which has a method doSum() that accepts two integers, sum them, store it in a local integer variable, and finally returns it as integer. The function can be called from multiple places with different arguments (e.g. here inside main()). Let’s consider, we want to print the params received by doSum() for every invocation and we do not want to modify the original function. So, we write a different version of the method just after the original:

int doSum(int a, int b){
    int sum;
    sum = a + b;
    return sum;
}
int Wrap_doSum(int a, int b){
    printf("a = %d\n", a);
    printf("b = %d\n", b);
    return doSum(a, b);
}
int main(){
    return Wrap_doSum(10, 20);
}

As we have mentioned earlier, the only user input will be the target function name, in this case doSum. But, the user may also input a prefix for the wrapper method.

To achieve this, we have to insert a new function with two printf() for the two params and also call the original target function and return its results back to the call-site. Besides that, it will also have to replace every call invocation of the target function with the newly created function. We will split this job into following tasks:

  • Identify the target function.
  • Identify the function params and their types.
  • Create a function (prefix + original name) just after the target function.
  • Create the function body with two printf() to print the params value.
  • Append it with call to target function with a return.
  • Identify every call to the target function and redirect them to the newly created function.

Abstract Syntax Tree

Let’s start with step 1: understanding the AST. We use the following command to dump the AST and redirect it to a file:

clang-check -ast-dump target_test.c --extra-arg="-fno-color-diagnostics" -- > ast.out

We use the clang-check to do a basic error check in the input source and the -ast-dump to dump the AST. We add an extra argument to discard any format styler in AST (in plain file they don’t work). The full file may look scary but the only part we want to look into is where the parser generates AST for doSum() and main().

|-FunctionDecl <../target_test.c:3:1, line:7:1> line:3:5 used doSum 'int (int, int)'
| |-ParmVarDecl <col:11, col:15> col:15 used a 'int'
| |-ParmVarDecl <col:18, col:22> col:22 used b 'int'
| `-CompoundStmt <col:24, line:7:1>
|   |-DeclStmt <line:4:5, col:12>
|   | `-VarDecl <col:5, col:9> col:9 used sum 'int'
|   |-BinaryOperator <line:5:5, col:15> 'int' '='
|   | |-DeclRefExpr <col:5> 'int' lvalue Var 0xc778510 'sum' 'int'
|   | `-BinaryOperator <col:11, col:15> 'int' '+'
|   |   |-ImplicitCastExpr <col:11> 'int' <LValueToRValue>
|   |   | `-DeclRefExpr <col:11> 'int' lvalue ParmVar 0xc778300 'a' 'int'
|   |   `-ImplicitCastExpr <col:15> 'int' <LValueToRValue>
|   |     `-DeclRefExpr <col:15> 'int' lvalue ParmVar 0xc778378 'b' 'int'
|   `-ReturnStmt <line:6:5, col:12>
|     `-ImplicitCastExpr <col:12> 'int' <LValueToRValue>
|       `-DeclRefExpr <col:12> 'int' lvalue Var 0xc778510 'sum' 'int'
`-FunctionDecl <line:8:1, line:10:1> line:8:5 main 'int ()'
  `-CompoundStmt <col:11, line:10:1>
    `-ReturnStmt <line:9:5, col:24>
      `-CallExpr <col:12, col:24> 'int'
        |-ImplicitCastExpr <col:12> 'int (*)(int, int)' <FunctionToPointerDecay>
        | `-DeclRefExpr <col:12> 'int (int, int)' Function 0xc778450 'doSum' 'int (int, int)'
        |-IntegerLiteral <col:18> 'int' 10
        `-IntegerLiteral <col:22> 'int' 20

AST is afterall a regular tree structure with nodes and every node is derived from its parent node. In this code, we can see there are two siblings node in top-level. They are both FunctionDecl for two functions in user source code. We can notice the name of the function with its signature in this format: func_name 'return_type (param_type, param_type)'. Next, we can see ParmVarDecl for every param where they also have the variable naming with respective type. Finally, a CompoundStmt starts which is the function body that ends up with a ReturnStmt. For doSum(), there are more than ReturnStmt under the CompoundStmt e.g. we can see a BinaryOperator for = operator in sum = a + b; followed by a DeclRefExpr indicates the lValue sum and another BinaryOperator for the + operation that is also followed by two other DeclRefExpr for variable a and b, each of them are derived from ImplicitCastExpr to explain that they have to be converted to rValue from the lValue. Another important node for this job is the CallExpr in main(). We can see it has two IntegerLiteral which defines the call’s two arguments.

Workspace Structure

It is a standard procedure to build the Clang Libtool from the Clang tools directory (i.e. llvm/tools/clang/tools/). So, we create a directory (clang-wrapper) for the tool and add the following line at the end of CMakeLists.txt (in llvm/tools/clang/tools/).

add_clang_subdirectory(clang-wrapper)

Inside our tool workspace, we will have another CMakeLists.txt which usually looks like following:

cmake_minimum_required(VERSION 2.8.8)
project(syssec-workshop)

include_directories(${CMAKE_CURRENT_SOURCE_DIR}/..)
set(LLVM_LINK_COMPONENTS support)

add_clang_executable(clang-wrapper
  wrap-method.cpp
  )
target_link_libraries(clang-wrapper
  clangTooling
  clangBasic
  clangASTMatchers
  )
install(TARGETS clang-wrapper RUNTIME DESTINATION bin)

At the beginning, it ensures the cmake minimum version to build this tool. Then we give a project name (in this case, syssec-workshop). Next, include_directories is for the clang source code path. The set basic LLVM link support.

The add_clang_executable is important, we first give the executable name (or tool name) and than the list of dependent source code to compile. The target_link_libraries is also important which mentions what clang runtime support will be required for this tool (i.e. clang-wrapper). We will use clangTooling (libtool support), clangBasic (clang basic support for user input handling), and clangASTMatchers (for clang ASTMatcher api). install will set the path where to install the tool.

Next to the CMakeLLists.txt, we should add the C++ source code for the libtool (we will have one i.e. wrap-method.cpp). Once we have everything ready to compile, we can build the tool the same way we build the clang/llvm from its build directory. The libtool will be available in the build directory (i.e. build/bin/).

Installation

The workspace directory is available here: https://github.com/mustakcsecuet/syssec-workshop/tree/master/clang-libtool/clang-wrapper. Copy this to your (llvm/tools/clang/tools/) and add the directory to the CMakeLists.txt file there. Build the tool with your Clang build. Use the tool in following way:

clang-wrapper -wrap=true -wrap-prefix=wrap -wrap-target=doSum target_test.c --

Basic Code Structure

The code structure of a Clang libtool can be divided into four parts. We definitely require a main function which will process command line user input and prepare the next phase. The next phase is responsible to preapare the Rewriter (metaphore a pen) for the Compiler processed AST buffer. In the third step, we will prepare the AST matcher (metaphore pattern recognization engine). Finally, we will write handler to use the Rewritter for matched AST.

Command Line Parser: Inside the main(), we process the command line options. We can do extra verfication on user input here. It also creates the ClangTool instance and runs the tool with a customized ASTFrontendAction. Libtool API use the factory design pattern to return the instance of the customized ASTFrontendAction.

SYSSECFrontEndAction: This class extends from ASTFrontendAction where we can override the CreateASTConsumer()and prepare the Rewriter for the AST buffer. Later, we subvert the control flow to our customized ASTConsumer. Besides that, we also override EndSourceFileAction() to inform the compiler to commit the change in buffer to source file at the end of process.

SYSSECASTConsumer: This class extends from ASTConsumer where we can finally have the ASTContext and define-use the MatchFinder. In its private member, we have the MatchFinder and two handlers. In the class constructor, we define the match pattern and set their respective callback handler. We override HandleTranslationUnit() to request the compiler to start the MatchFinder process.

Handlers: The handlers are automatically called when MatchFinder finds a match. We have two handler, first one to create wrap function immediate to target function with identical function signature, the later one redirect every call expression to original target function to the wrap function.

So, overall the code structure looks like:

class SYSSECWrapper : public MatchFinder::MatchCallback {
public:
  SYSSECWrapper(Rewriter &Rewrite) : Rewrite(Rewrite) {}
  virtual void run(const MatchFinder::MatchResult &Result) {
    // action for the matched pattern
  }
private:
  Rewriter &Rewrite;
};
class SYSSECRedirect : public MatchFinder::MatchCallback {
private:
public:
  SYSSECRedirect(Rewriter &Rewrite) : Rewrite(Rewrite) {}
  virtual void run(const MatchFinder::MatchResult &Result) {
    // action for the matched pattern
  }
private:
  Rewriter &Rewrite;
};
class SYSSECASTConsumer : public ASTConsumer {
public:
  SYSSECASTConsumer(Rewriter &R) : handleWrapper(R), handleRedirect(R) {
    // define MatchFinder pattern
  }
  void HandleTranslationUnit(ASTContext &Context) override {
    Matcher.matchAST(Context);
  }

private:
  SYSSECWrapper handleWrapper;
  SYSSECRedirect handleRedirect;
  MatchFinder Matcher;
};
class SYSSECFrontEndAction : public ASTFrontendAction {
public:
  SYSSECFrontEndAction() {}
  void EndSourceFileAction() override {
    TheRewriter.getEditBuffer(TheRewriter.getSourceMgr().getMainFileID())
        .write(llvm::outs());
  }
  std::unique_ptr<ASTConsumer> CreateASTConsumer(CompilerInstance &CI,
                                                 StringRef file) override {
    TheRewriter.setSourceMgr(CI.getSourceManager(), CI.getLangOpts());
    return llvm::make_unique<SYSSECASTConsumer>(TheRewriter);
  }
private:
  Rewriter TheRewriter;
};
int main(int argc, const char **argv) {
  CommonOptionsParser op(argc, argv, SYSSEC_COMPILER_WORKSHOP);
  ClangTool Tool(op.getCompilations(), op.getSourcePathList());
  // process command line option
  return Tool.run(newFrontendActionFactory<SYSSECFrontEndAction>().get());
}

Notice, once we have defined Rewriter in SYSSECFrontEndAction, we always carry it to the upper steps of the code structure.

Command Line Option

At the beginning of the source code (global space), we have defined multiple cl fields for different command-line operations.

// creates a option category to show what functionality available in this tool
// try clang-wrapper -help
static llvm::cl::OptionCategory
    SYSSEC_COMPILER_WORKSHOP("SYSSEC-CLANG-WRAPPER");

// creates multiple options to feed user inputs
// -wrap takes true/false and required
static llvm::cl::opt<bool>
    wFlag("wrap", llvm::cl::desc("Do you want to wrap a function?"),
          llvm::cl::Required, llvm::cl::cat(SYSSEC_COMPILER_WORKSHOP));
// -wrap-prefix takes a string and optional
static llvm::cl::opt<std::string>
    wrapPrefix("wrap-prefix",
               llvm::cl::desc("Select the prefix of the wrapper."),
               llvm::cl::Optional, llvm::cl::cat(SYSSEC_COMPILER_WORKSHOP));
// -wrap-prefix takes a string and optional
static llvm::cl::opt<std::string>
    targetMethod("wrap-target", llvm::cl::desc("Name the function to wrap."),
                 llvm::cl::Optional, llvm::cl::cat(SYSSEC_COMPILER_WORKSHOP));

// it is possible to show enhance message about the tool
static llvm::cl::extrahelp MoreHelp("\nA Clang Libtool to create a wrapper for "
                                    "a function to show its input values\n");

Inside the main(), we can do further sanitization on user provided command line input.

  // we can do simple extra user input validation
  if (wFlag) {
    // we atleast need to know the target function name
    if (targetMethod.length()) {
      llvm::errs() << "The target wrap function: " << targetMethod << "\n";
      if (wrapPrefix.length()) {
        llvm::errs() << "Prefix (User): " << wrapPrefix << "\n";
      } else {
        // this is default prefix if user does not provide
        wrapPrefix = "syssec";
        llvm::errs() << "Prefix (Default): " << wrapPrefix << "\n";
      }
    } else {
      llvm::errs() << "Please, input a target function name.\n";
      return 0;
    }
  }

As an independent compiler tool, user interaction is important and libtool gives enough flexibility to handle them.

AST Matcher

We have AST matcher to match two different patterns: 1) function declaration of target function, 2) call expression with target function as callee.

To match following AST node:

|-FunctionDecl 0xc778450 <../target_test.c:3:1, line:7:1> line:3:5 used doSum 'int (int, int)'

We have following straightforward matcher:

    // we ask AST to match the function declaration with the target function
    // name, if so, callback the hangleWrapper
    Matcher.addMatcher(functionDecl(hasName(targetMethod)).bind("wrapFunc"),
                       &handleWrapper);

The .bind(tag) basically tag the findings so that we can use the tag later to retrieve the instance of matched pattern.

To match the following AST:

-CallExpr 0xc778860 <col:12, col:24> 'int'
        |-ImplicitCastExpr 0xc778848 <col:12> 'int (*)(int, int)' <FunctionToPointerDecay>
        | `-DeclRefExpr 0xc7787b8 <col:12> 'int (int, int)' Function 0xc778450 'doSum' 'int (int, int)'

We write the following matcher:

// we ask AST to match the call expression with target function as callee
    // and if so, callback the handleRedirect
    Matcher.addMatcher(callExpr(callee(functionDecl(hasName(targetMethod))))
                           .bind("callMatched"),
                       &handleRedirect);

This may seem confusing because there is no function declaration under call expression. Basically, callExpr(callee)returns the DeclRefExpr which we have fine-grained by functionDecl as callee must be a functionDecl (it could be cxxMethodDecl for C++ code).

Details about AST matcher is available here: https://clang.llvm.org/docs/LibASTMatchersReference.html

Handlers

This is where we start writing the action for matched pattern (with the help of the Rewriter). We have two different handlers for two different matcher. Both of them extends the MatchFinder::MatchCallback. The override run() receives const MatchFinder::MatchResult & from which we can extract the findings using the binding tag. We can also have the ASTContext to access the SourceManager.

SYSSECWrapper: We bind the FunctionDecl with tag wrapFunc which we access as following:

const FunctionDecl *func =
        Result.Nodes.getNodeAs<clang::FunctionDecl>("wrapFunc");

Once we have the FunctionDecl instance, we get access to every information related to the function. As an example: we can access the return type and number of params of the function:

string retType = func->getReturnType().getAsString();
unsigned int paramNum = func->getNumParams();

In a similar fashion, we can access ParamVarDecl, function body Expr etc. For any Expr, we can access their source location using getBeginLoc() or getEndLoc() (there are more functions available to access different locations related to an expression).

      // the target function end point is the '}', so we ask for +1 offset
      SourceLocation TARGET_END = func->getEndLoc().getLocWithOffset(1);
      std::stringstream wrapFunction;
      string wrapFunctionName = wrapPrefix + "_" + targetMethod;
      // we create the entire wrap function text
      wrapFunction << "\n" + retType + " " + wrapFunctionName + +"(" +
                          funcParamSignature + ")\n{\n" + funcBody + "\n}";
      // let's insert the wrap function at the end of target function
      Rewrite.InsertText(TARGET_END, wrapFunction.str(), true, true);

Finally, we like to insert a function at the end of the target function. So, once our function body and signature is read (wrapFunction), we use the Rewrite.InsertText(). We first mention the location where to insert and then the text. The third param of this method asks if we want to insert after the location or before, the fourth param is for indentation to new line.

SYSSECRedirect: We bind the CallExpr in the matcher with tag callMatched and we access it here. Once we have it, we only need to replace the callee with the wrap function. So, we use Rewrite.ReplaceText() where we first mention the source range of the callee that we want to modify and then we mention the new callee name.

const CallExpr *cexpr =
        Result.Nodes.getNodeAs<clang::CallExpr>("callMatched");
Rewrite.ReplaceText(cexpr->getCallee()->getSourceRange(), wrapFunctionName);

That’s All

Yes, thats all you require to know to write your own Clang libtool. Just follow the code structure and write your code using all the functionality available from Clang. Do not try to break the structure, it will help you to avoid any issue.

SVF: Interprocedural Static Value-Flow Analysis in LLVM

SVF is a static analysis framework implemented in LLVM that allows value-flow construction and pointer analysis to be performed in an iterative manner (sparse analysis – analysis conducted into stages, from overapproximate analysis to precise, expensive analysis). It uses (default) points-to information from Andersen’s analysis and constructs an interprocedural memory SSA (Static-Single Assignment) form where def-use chains of both top-level and address-taken variables are included. From the design perspective, the SVF can be seen as two modules: pointer analysis and memory SSA. To be capable of allowing another points-to analysis than Andersen’s, the design of pointer analysis is split into three parts: Graph, Rules, and Solver. To give control of scalability and precision, the memory SSA is designed to allow users to partition memory into a set of regions. Comparing to previous work, SVF takes a big leap by constructing an interprocedural value-flow graph. To identify bugs, it is an important requirement to be able to analyze a program across its procedural boundaries.

The latest version of SVF is implemented on LLVM-7.0. SVF takes LLVM IR code (a bitcode file) as input (LLVM IR module) and can analysis both C/C++ programs. To use SVF, the source needs to be compiled with Clang to generate the bitcode (clang -emit-llvm source.c -o source.bc) and later links all (.bc) file into a large single (.bc) file through LLVM Gold plugin (llvm-link source1.bc source2.bc -o source.bc).

As mentioned before, the pointer analysis consists of three loosely coupled components: Graph, Rules, and Solver. The graph is a higher level abstraction, shows where a pointer analysis should take place. A rule defines how to derive the points-to information from a statement in the graph. The ‘rules’ is the location where anyone can implement a different set of points-to analysis mechanism than Andersen’s. As the pointer-analysis will take place interprocedural, it is an important requirement to design a completely separate part to indicate which order should the points-to analysis work on the graph, i.e. the Solver. As an example, for Andersen’s points-to analysis, it uses transitive closure rules and a solver named wave. Similarly, a flow-sensitive pointer analysis uses a set of flow-sensitive strong/weal update rules with a points-to propagation solver on a sparse value-flow graph.

There are four steps in the value-flow graph construction. First, it performs a ‘Mod-Ref Analysis’ to capture interprocedural reference and modification side effect for each variable (a set of indirect defs). The ‘Mem Region Partitioning’ separates the memory partition into regions to let the user have the capability of deciding scalability and precision trade-offs. The ‘Memory SSA’ take place once the indirect uses and defs are known. It follows a standard SSA conversion algorithm to structure the program in memory SSA form. Finally, the ‘VFG Construction’ module links the uses of a variable to its defs and construct the value-flow graph.

The following statements are important to construct a complete value-flow graph for address-taken variables:

  1. ADDROF is known as an allocation site (a stack object [alloca instruction] or a global object [an allocation site or a program function name] or a dynamically created object at its allocation site). Note: For flow-sensitive pointer analysis, the initializations for global objects take place at the entry of main().
  2. Copy denotes either a casting instruction or a value assignment from a PHI instruction in LLVM.
  3. PHI is introduced at a confluence point in the CFG to select the value of a variable from different control-flow branches.
  4. LOAD is a memory accessing instruction that reads a value from an address-taken object.
  5. STORE is a memory accessing instruction that writes a value into an address-taken object.
  6. CALL denotes a call instruction, where the target can be either a global variable (for a direct call) or a stack virtual register (for an indirect call). In a cell site, a return value or in the entry of a procedure, the parameter could be defined as direct or indirect.

One featured client of SVF is source-sink analysis (detecting security bugs). An example bug detector is a memory leak detector where it is important to find out whether a memory allocation (source) will reach a free site (sink) or not. SVF’s value-flow graph could be used by a source-sink analysis client to achieve it. The other client could be a selective instruction client which will allow dynamic analysis tool to look only at suspicious sites. It could also be used for developing a debug tool to guide the developer.

Besides their 40K lines of code for SVF, the group also open-sourced a pointer analysis mirco benchmark, PTABen.

How I’ve Learned Intel Pin Tool

The most difficult part of doing research is prototyping. Especially when it’s about security, its a must one. A researcher has to prove the proposed system is legitimate. It’s true for both attack and defense. Researchers greatly depend on existing technology and software to implement their prototype. It cuts the development time to start from scratch. But on another hand, it is also complicated to work on others work. Most cases the technology is also a prototype from other researcher and presumably unstable, have not enough documentation and very limited feature. Luck sometimes favor for researcher too. Today I am going to discuss how I learned to use Intel Pin tool. It’s developed by Intel research team and well documented for reference. The only limitation is it is only usable for Intel architecture.

Intel pin tool is a great dynamic analysis tool. We can instrument the binary at runtime and collect information about execution. We can collect memory information, register value, which instruction is executing and so on. The main site is here: https://software.intel.com/en-us/articles/pin-a-dynamic-binary-instrumentation-tool. Here I will describe how I get familiar with this tool. I have used Linux distribution (Ubuntu) with Intel x86-64 architecture to install Intel pin 3.4 (https://software.intel.com/en-us/articles/pin-a-binary-instrumentation-tool-downloads). The manual will be available here: https://software.intel.com/sites/landingpage/pintool/docs/97438/Pin/html/.

Download the .tar.gz of Intel pin 3.4 from the above link. Untar it to any location. Inside of it, we will see a pin executive which is our tool. To demonstrate that our system works correctly, let’s start with building few example tools delivered with the source. Follow the following commands:

cd source/tools/ManualExamples/
make all TARGET=intel64
../../../pin -t obj-intel64/inscount0.so -- /bin/ls

So with these commands, we basically go inside of ManualExamples delivered with Intel pin tool and build all tools inside that targeting Intel x86-64 architecture. The last command will execute one tool named inscount0.so for the binary /bin/ls. The inscount0.so tool basically instrument all instruction /bin/ls to count how many instructions being executed. The result will be stored in inscount.out file. If the test is successful, we will have that output file and inside we have how many instructions being executed. Let’s try to do the same last command with a different argument for /bin/ls.

../../../pin -t obj-intel64/inscount0.so -- /bin/ls ..

So now /bin/ls executed with pin tool inscount0.so for path .. from the current position.

This is a good time to look at the tool code to understand how it count number of instructions. Let’s open inscount0.cpp from source/tools/ManualExamples. So, this is our source code and our make command create the .so file inside obj-intel64/ which we execute with Intel pin. Here is the source code:

/*BEGIN_LEGAL 
Intel Open Source License 

Copyright (c) 2002-2017 Intel Corporation. All rights reserved.
 
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.  Redistributions
in binary form must reproduce the above copyright notice, this list of
conditions and the following disclaimer in the documentation and/or
other materials provided with the distribution.  Neither the name of
the Intel Corporation nor the names of its contributors may be used to
endorse or promote products derived from this software without
specific prior written permission.
 
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE INTEL OR
ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
END_LEGAL */
#include <iostream>
#include <fstream>
#include "pin.H"

ofstream OutFile;

// The running count of instructions is kept here
// make it static to help the compiler optimize docount
static UINT64 icount = 0;

// This function is called before every instruction is executed
VOID docount() { icount++; }
    
// Pin calls this function every time a new instruction is encountered
VOID Instruction(INS ins, VOID *v)
{
    // Insert a call to docount before every instruction, no arguments are passed
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END);
}

KNOB<string> KnobOutputFile(KNOB_MODE_WRITEONCE, "pintool",
    "o", "inscount.out", "specify output file name");

// This function is called when the application exits
VOID Fini(INT32 code, VOID *v)
{
    // Write to a file since cout and cerr maybe closed by the application
    OutFile.setf(ios::showbase);
    OutFile << "Count " << icount << endl;
    OutFile.close();
}

/* ===================================================================== */
/* Print Help Message                                                    */
/* ===================================================================== */

INT32 Usage()
{
    cerr << "This tool counts the number of dynamic instructions executed" << endl;
    cerr << endl << KNOB_BASE::StringKnobSummary() << endl;
    return -1;
}

/* ===================================================================== */
/* Main                                                                  */
/* ===================================================================== */
/*   argc, argv are the entire command line: pin -t <toolname> -- ...    */
/* ===================================================================== */

int main(int argc, char * argv[])
{
    // Initialize pin
    if (PIN_Init(argc, argv)) return Usage();

    OutFile.open(KnobOutputFile.Value().c_str());

    // Register Instruction to be called to instrument instructions
    INS_AddInstrumentFunction(Instruction, 0);

    // Register Fini to be called when the application exits
    PIN_AddFiniFunction(Fini, 0);
    
    // Start the program, never returns
    PIN_StartProgram();
    
    return 0;
}

This is a C code written using Intel pin library (pin.H).  It is always the best idea to start exploring code from main method. In the main method, we can see a conditional check using PIN_Init(argc, argv) which validate the command line argument if there are in proper order or not. It also initiates Intel pin system (Pin provides its own locking and thread management API’s, which the Pintool should use).

Next, we have opened a file “inscount.out” to write. If we look back how we declare the filename variable KnobOutputFile, it will look weird at first why we create things complicated for such case. KnobOutputFile is a type Knob defined within Intel pin to automate the parsing and management of command line switches. Just consider, we want to use a different filename in different test execution; the best way is sending the filename with the command line. Because of the knob, we can now send filename in command line like as follows:

../../../pin -t obj-intel64/inscount0.so -o pininscount.out -- /bin/ls

So, the instruction count now will be stored in pininscount.out instead of inscount.out. The only difference we include -o filename here, and even we don’t need to make any change in the source code to overwrite the default filename. So, how pin parse the filename if we want to send such multiple information from the command line. Look closely at the declaration of KnobOutputFile.

KNOB<string> KnobOutputFile(KNOB_MODE_WRITEONCE, "pintool", "o", "inscount.out", "specify output file name");

The first argument is KNOB_MODE_WRITEONCE  which means we want to open the file to write once. Next, we have “pintool” which mentions which family it is supposed to be included; and then we have “o” which works as a flag suggest pin tool that in command line whatever after -o will be the desired string for this variable. Look how we replace the default filename last time (-o pininscount.out). The last argument just gives a suggestion the user what he is missing providing in command line if that field is required. For this case, we provide a default filename “inscount.out” as the fourth argument.

Again, come back to our main method.  Next, we have INS_AddInstrumentFunction() where we send our instrumentation callback method as the first argument and the second argument is a variable length argument where we can send argument for the instrumentation method. In this case, we send Instruction as instrumentation callback method and this method does not require any argument; so we send 0 in the second argument.

Let’s have a look at our instrumentation callback method, Instruction. Typical instrumentation callback method should be looked like following:

typedef VOID(* LEVEL_PINCLIENT::INS_INSTRUMENT_CALLBACK)(INS ins, VOID *v)

So, our Instruction method as the similar signature with (INS ins, VOID *v). The first argument ins is the targeted instruction which is going to be instrumented. Inside our callback method, we have INS_InsertCall() call. The first argument of this call is the target instruction, here ins. Then we have IPOINT_BEFORE, which is a pin flag. This flag means instrument before the target instruction. This is a really important part of instrumentation and wisely need to use for different purposes. Right now, just keep it simple like we can instrument before or after instruction or based on target instruction evaluation (like conditional instruction; it could be taken or not taken). Next, we have a function which should be called before executing the target instruction, here it is named as docount. We have to cast it to (AFUNPTR); it’s a defined return type for such method in pin system to match any return type for the method. Later, we can pass arguments to our docount method if required; but we don’t so we indicate it with IARG_END. The docount method only increments a counter to track how many instructions being executed.

Come back again to the main method, we have another call to PIN_AddFiniFunction() similar to previous one. We send another callback method Fini; which will be called just before application exit. Our Fini method do the file write in the usual way and close it.

At last, we have PIN_StartProgram() which will start executing the target binary with full preparation to instrument all instruction before they executed to count how many instructions are executed. That is a good example to introduce with Intel pin tool.

Let’s try to build our own first tool. Consider we like to know what an instruction read from a global variable. We know for Intel x86-64 architecture, in instruction global variables are represented with memory address based on RIP register. A regular instruction read from a global variable will look like following:

mov r9, qword ptr [rip+0x217cad]
mov r9, qword ptr [rip+0x217cad]
mov rdi, qword ptr [rip+0x217af6]
mov rax, qword ptr [rip+0x217c31]
mov rax, qword ptr [rip+0x217c31]
mov rsi, qword ptr [rip+0x217e23]
mov rax, qword ptr [rip+0x394279]
mov eax, dword ptr [rip+0x217e55]

As a first step, we create a directory under “pin/source/tools/” named “dreamlandcoder”. Now we copied two files from the “pin/source/tools/ManualExamples/”; files are “makefile” and “makefile.rules”. Next, we edit our “makefile.rules” to change only this variable “TEST_TOOL_ROOTS”. Previously it was assigned to source names as comma separated list, but as now we have only one tool and we named our source file as “globalTrace.cpp”; so we do:

TEST_TOOL_ROOTS := globalTrace

So, we open a new .cpp file named “globalTrace.cpp” where we will write our pin tool instructions. The complete code looks like following:

#include <iostream>
#include <fstream>
#include "pin.H"

ofstream outFile;

VOID ReadContent(ADDRINT *addr)
{
    ADDRINT value;
    PIN_SafeCopy(&value, addr, sizeof(ADDRINT));
    outFile << "Address 0x" << hex << (unsigned long long)addr << dec << ":\t" << value << endl;
    return;
}

VOID Instruction(INS ins, VOID *v)
{
    if (INS_IsMov(ins) && INS_OperandIsMemory(ins, 1) && INS_MemoryBaseReg(ins) == REG_RIP)
    {
        INS_InsertCall(ins,
                       IPOINT_BEFORE,
                       AFUNPTR(ReadContent),
                       IARG_MEMORYREAD_EA,
                       IARG_END);
    }
}

KNOB<string> KnobOutputFile(KNOB_MODE_WRITEONCE, "pintool",
                            "o", "global_dump.out", "specify output file name");

// This function is called when the application exits
VOID Fini(INT32 code, VOID *v)
{
    outFile.close();
}

INT32 Usage()
{
    cerr << "This tool dump heap memory information (global variable) ..." << endl;
    cerr << endl
         << KNOB_BASE::StringKnobSummary() << endl;
    return -1;
}

int main(int argc, char *argv[])
{
    // Initialize pin
    if (PIN_Init(argc, argv))
        return Usage();
    outFile.open(KnobOutputFile.Value().c_str());
    // Register Instruction to be called to instrument instructions
    INS_AddInstrumentFunction(Instruction, 0);
    // Register Fini to be called when the application exits
    PIN_AddFiniFunction(Fini, 0);

    // Start the program, never returns
    PIN_StartProgram();

    return 0;
}

We have most of the codes from our previous example except instruction check to instrument and Intel pin mechanism to read from memory. Let’s go through our new codes.

Start with Instruction() method where we can do a check of the instruction first, and then if it passed; do the instrumentation. In our last example, we don’t do any check; we just instrument all instructions. Our intention here is to be able to detect instructions that read from memory and that memory has a base which is basically RIP register. That’s also the basic properties of usage of global variables in the system. First, we do INS_OperandIsMemory checking operand 1 in ins instruction is a memory or not. Next, we do INS_MemoryBaseReg for ins instructor and check if the base register of memory is RIP register or not. We also have an INS_IsMov for ins instruction at the beginning which at first filter only move instructions.

When an instruction passed all these conditions, we should do instrument to that instruction to read the memory content. Now, as like the last example, we mentioned the action method here too which is ReadContent(). Now, this method needs to know the read memory address to perform the retrieval of the content from that memory. So, we send this piece of information with our INS_InsertCall() using IARG_MEMORYREAD_EA argument passing to ReadContent(). Look at the ReadContent() method as a parameter as ADDRINT which accept the memory address of reading memory address passed by INS_InsertCall().

Now, discuss what ReadContent() function does here for us. At first, we declared an ADDRINT to hold the memory content which we are just going to fetch from the memory address. Next, we have called PIN_SafeCopy() with passing the memory address and how many bytes we want to read from there beside where we expect the content will be written. Finally, we do writing down the information as:

memory_address memory_content

So, if we look back to our output file, we will have something like following:

Address 0x7f27a4b630a0:	149632
Address 0x7f27a4b61cf8:	3219913727
Address 0x7f27a4b61d00:	0
Address 0x7f27a4b630a8:	139807916161040
Address 0x7f27a4b61ca4:	10114392040684128785
Address 0x7f27a4b630c8:	139807916875776
Address 0x7f27a4b630d0:	139807916868768
Address 0x7f27a4b61e08:	0
Address 0x7f27a4b61e60:	140726958392976
Address 0x7f27a4b61fe0:	139808243855680
Address 0x7f27a4b61e08:	0
Address 0x7f27a4b630c8:	139807916875776
Address 0x7f27a4b630d0:	139807916868806
Address 0x7f27a4b630c8:	139807916875776
Address 0x7f27a4b630d0:	139807916869992
Address 0x7f27a4b61cb8:	4096
Address 0x7f27a4b61cb8:	4096

We can build the new tool in following steps:

make

make obj-intel64/globalTrace.so

and then we can run this tool with “/bin/ls” as like:

../../../pin -t obj-intel64/globalTrace.so -- /bin/ls

We will enrich this example more in next update.