ROSE Compiler Framework/How-tos

From Wikibooks, open books for an open world
Jump to: navigation, search

Quick, short, and focused tutorials about how to do common tasks as a ROSE developer.

Please create a new wikibook page for each how-to topic. Each how-to wiki page should NOT contain any level one (=) or level two(==) heading so it can be included at the correct levels in the print version of this wikibook.

Contents

How to write a How-to[edit]

Quick, short, and focused tutorials about how to do common tasks as a ROSE developer. Please create a new wikibook page for each how-to topic. Each how-to wiki page should NOT contain any level one (=) or level two(==) heading so it can be included at the correct levels in the print version of this wikibook.

Create a new page[edit]

==[[ROSE Compiler Framework/How to write a How-to|How to write a How-to]]==
{{:ROSE Compiler Framework/How to write a How-to}}
  • rename three places of the pasted text with the desired page name, for example
==[[ROSE Compiler Framework/How to do XYZ|How to do XYZ]]==
{{:ROSE Compiler Framework/How to do XYZ}}
  • click save page
  • You will see red text trying to link to the not yet existing How to do XYZ page
  • click any of the red text, it will bring you to an editing window to add content of your new how-to page
  • you can now add new content and save it.
    • Again, each how-to wiki page should NOT contain any level one (=) or level two(==) heading so it can be included at the correct levels in the print version of this wikibook.

Insert image to wiki page[edit]

Rules of the content[edit]

  • Only level three headings (===) and higher are allowed in a how-to page. This is necessary for the how-to page to be correctly included into the final one-page print version of this wikibook. Sorry about this restriction.
    • Again, please don't use level one (=) or level two (==) headings in a how-to page!
  • Keep each how-to short and focused. Readers are expected to only spend 30-minutes or much less to quickly learn how to do something using ROSE.
  • After you created a new how-to page and saved your contributions. Please go to the print version to make sure it shows up correctly.
  • please specify the how-to topic is the current practice or the proposed new ways of doing things. So we can have clear guideline for code review for what is mandatory and what is optional.

How to incrementally work on a project[edit]

Developing a big, sophisticated project entails many challenges. To mitigate some of these challenges, we have adopted several best practices: incremental development, code review, and continuous integration.

Here are some tips on how to divide up a big project into smaller, bite-sized pieces so each piece can be incrementally developed, code reviewed, and integrated.

  • Input: define different sets of test inputs based on complexity and difficulty. Tackle simpler sets first.
  • Output: define intermediate results leading to the final output. Often, results A and B are needed to generate C. So the project can have multiple stages, based on the intermediate results.
  • Algorithm: complex compiler algorithms are often just enhanced versions of more fundamental algorithms. Implement the fundamental algorithms first to gain insight and experience. Then, afterward, you can implement the full-blown versions.
  • Language: for projects dealing with multiple languages, focus on one language at a time.
  • Platform: limit the scope of supported platforms: Linux, Ubuntu, OS X (TODO: add reference to ROSE supported platforms)
  • Performance: Start with a basic, working implementation first. Then try to optimize its performance, efficiency.
  • Scope: your translator could first focus on working at a function scope, then grow to handle an entire source file, or even multiple files, at the same time.
  • Skeleton then meat: a project should be created with the major components defined first. Each component can be enriched separately later on.
  • Annotations (manual vs. automated): Performing one compiler task often requires results from many other tasks being developed. Defining source code annotations as the interface between two tasks can decouple these dependencies in a clean manner. The annotations can be first manually inserted. Later the annotations can be automatically generated by the finished analysis.
  • Optional vs. Default: introducing a flag to turn on/off your feature. Make it as a default option when it matures.

How to visualize AST[edit]

Overview[edit]

Three things are needed to visualize ROSE AST:

  • Sample input code: you provide it
  • a dot graph generator to generate a dot file from AST: ROSE provides dot graph generators
  • a visualization tool to open the dot graph: ZGRViewer and Graphviz are used by ROSE developers

If you don't want to install ROSE+ZGRview + Graphvis from scratch, you can directly use ROSE virtual machine image, which has everything you need installed and configured so you can just visualize your sample code.

Sample input code[edit]

Please prepare simplest input code without including any headers so you can get a small enough AST to digest.

Dot Graph Generator[edit]

We provide ROSE_INSTALLATION_TREE/bin/dotGeneratorWholeASTGraph (complex graph) and dotGenerator (a simpler version) to generate a dot graph of the detailed AST of input code.

Tools to generate AST graph in dot format. There are two versions

  • dotGenerator: simple AST graph generator showing essential nodes and edges
  • dotGeneratorWholeASTGraph: whole AST graph showing more details. It provides filter options to show/hide certain AST information.

command line:

 dotGeneratorWholeASTGraph  yourcode.c  // it is best to avoid including any header into your input code to have a small enough tree to visualize.
dotGeneratorWholeASTGraph --help
   -rose:help                     show this help message
   -rose:dotgraph:asmFileFormatFilter           [0|1]  Disable or enable asmFileFormat filter
   -rose:dotgraph:asmTypeFilter                 [0|1]  Disable or enable asmType filter
   -rose:dotgraph:binaryExecutableFormatFilter  [0|1]  Disable or enable binaryExecutableFormat filter
   -rose:dotgraph:commentAndDirectiveFilter     [0|1]  Disable or enable commentAndDirective filter
   -rose:dotgraph:ctorInitializerListFilter     [0|1]  Disable or enable ctorInitializerList filter
   -rose:dotgraph:defaultFilter                 [0|1]  Disable or enable default filter
   -rose:dotgraph:defaultColorFilter            [0|1]  Disable or enable defaultColor filter
   -rose:dotgraph:edgeFilter                    [0|1]  Disable or enable edge filter
   -rose:dotgraph:expressionFilter              [0|1]  Disable or enable expression filter
   -rose:dotgraph:fileInfoFilter                [0|1]  Disable or enable fileInfo filter
   -rose:dotgraph:frontendCompatibilityFilter   [0|1]  Disable or enable frontendCompatibility filter
   -rose:dotgraph:symbolFilter                  [0|1]  Disable or enable symbol filter
   -rose:dotgraph:emptySymbolTableFilter        [0|1]  Disable or enable emptySymbolTable filter
   -rose:dotgraph:typeFilter                    [0|1]  Disable or enable type filter
   -rose:dotgraph:variableDeclarationFilter     [0|1]  Disable or enable variableDeclaration filter
   -rose:dotgraph:variableDefinitionFilter      [0|1]  Disable or enable variableDefinitionFilter filter
   -rose:dotgraph:noFilter                      [0|1]  Disable or enable no filtering
Current filter flags' values are: 
         m_asmFileFormat = 0 
         m_asmType = 0 
         m_binaryExecutableFormat = 0 
         m_commentAndDirective = 1 
         m_ctorInitializer = 0 
         m_default = 1 
         m_defaultColor = 1 
         m_edge = 1 
         m_emptySymbolTable = 0 
         m_expression = 0 
         m_fileInfo = 1 
         m_frontendCompatibility = 0 
         m_symbol = 0 
         m_type = 0 
         m_variableDeclaration = 0 
         m_variableDefinition = 0 
         m_noFilter = 0 

Dot Graph Visualization[edit]

To visualize the generated dot graph, you have to install

Please note that you have to configure ZGRViewer to have correct paths to some commands it uses. You can do it from its configuration/setting menu item. Or directly modify the text configuration file (.zgrviewer).

One example configuration is shown below (cat .zgrviewer)

<?xml version="1.0" encoding="UTF-8"?>
<zgrv:config xmlns:zgrv="http://zvtm.sourceforge.net/zgrviewer">
    <zgrv:directories>
        <zgrv:tmpDir value="true">/tmp</zgrv:tmpDir>
        <zgrv:graphDir>/home/liao6/svnrepos</zgrv:graphDir>
        <zgrv:dot>/home/liao6/opt/graphviz-2.18/bin/dot</zgrv:dot>
        <zgrv:neato>/home/liao6/opt/graphviz-2.18/bin/neato</zgrv:neato>
        <zgrv:circo>/home/liao6/opt/graphviz-2.18/bin/circo</zgrv:circo>
        <zgrv:twopi>/home/liao6/opt/graphviz-2.18/bin/twopi</zgrv:twopi>
        <zgrv:graphvizFontDir>/home/liao6/opt/graphviz-2.18/bin</zgrv:graphvizFontDir>
    </zgrv:directories>
    <zgrv:webBrowser autoDetect="true" options="" path=""/>
    <zgrv:proxy enable="false" host="" port="80"/>
    <zgrv:preferences antialiasing="false" cmdL_options=""
        highlightColor="-65536" magFactor="2.0" saveWindowLayout="false"
        sdZoom="false" sdZoomFactor="2" silent="true"/>
    <zgrv:plugins/>
    <zgrv:commandLines/>
</zgrv:config>

You have to configure the run.sh script to have correct path also

cat run.sh

#!/bin/sh

# If you want to be able to run ZGRViewer from any directory,
# set ZGRV_HOME to the absolute path of ZGRViewer's main directory
# e.g. ZGRV_HOME=/usr/local/zgrviewer

ZGRV_HOME=/home/liao6/opt/zgrviewer-0.8.1

java -jar $ZGRV_HOME/target/zgrviewer-0.8.1.jar "$@"

Example session[edit]

A complete example

# make sure the environment variables(PATH, LD_LIBRARY_PATH) for the installed rose are correctly set
which dotGeneratorWholeASTGraph
~/workspace/masterClean/build64/install/bin/dotGeneratorWholeASTGraph

# run the dot graph generator
dotGeneratorWholeASTGraph -c ttt.c

#see it
which run.sh
~/64home/opt/zgrviewer-0.8.2/run.sh

run.sh ttt.c_WholeAST.dot

How to create a translator[edit]

Translator basically converts one AST to another version of AST. The translation process may add, delete, or modify the information stored in AST.

Overview[edit]

A ROSE-based translator usually has the following steps

  1. Search for the AST nodes you want to translate.
  2. Perform the translation action on the found AST nodes. This action can be one of two major variants
  • Updating the existing AST nodes
  • Creating new AST nodes to replace the original ones. This is usually cleaner approach than patching up existing AST and is better supported by SageBuilder and SageInterface functions.
  • Deep copying existing AST subtrees to duplicate the code. May expression subtrees should not be shared. So deep copy them is required to get the correct AST.
  • Optionally update other related information for the translation.

First Step[edit]

Get familiar with the ASTs before and after your translation. So you know for sure what your code will deal with and what AST you code will generate.

The best way is to prepare simplest sample codes and carefully examine the whole dot graphs of them.

Design considerations[edit]

It is usually a good idea to

  • separate the searching step from the translation step so one search (traversal) can be reused by all sorts of translations.
  • When design the order of searching and translation, be careful about if the translation will negatively impact on the searching
    • Please void pre-order traversal since you may end up modifying AST nodes to be visited later on, similar to the effect of iterator invalidation.
    • please use post-order, or reverse order of pre-order for your traversal hooked up with translation

Searching for the AST node[edit]

There are multiple ways to find things you want to translate in AST.

AST Query[edit]

  • Via AST Query: Node query returns a list of AST nodes in the same type. This is often enough to simple translations
Rose_STL_Container<SgNode*> ProgramHeaderStatementList = NodeQuery::querySubTree (project,V_SgProgramHeaderStatement);
for (Rose_STL_Container<SgNode*>::iterator i = ProgramHeaderStatementList.begin(); i != ProgramHeaderStatementList.end(); i++)
{
    SgProgramHeaderStatement* ProgramHeaderStatement = isSgProgramHeaderStatement(*i);
    ...
}


More information about AST Query can be found at "6 Query Library" of the ROSE User Manual pdf.

AST Traversal[edit]

  • Through AST traversal: walks through whole AST using different orders (pre-order or post order). Post-order traversal is recommended to avoid modifying things the traversal will hit later on (similar problem as iterator invalidation in C++)
    • The AST traversal gives visit() functions to hook up your translation functions. A switch statement is can be used for handling different types of AST node.
class f2cTraversal : public AstSimpleProcessing
{
  public:
    virtual void visit(SgNode* n);
};

void f2cTraversal::visit(SgNode* n)
{
  switch(n->variantT())
  {
    case V_SgSourceFile:
      {
        SgFile* fileNode = isSgFile(n);
        translateFileName(fileNode);
      }
      break;
    case V_SgProgramHeaderStatement:
      {
        ...
      }
      break;
    default:
      break;
  }
}

More information about AST Traversal can be found at "7 AST Traversal" of the ROSE User manual pdf online.

Performing Translation[edit]

Before you write your translator, please read Chapter 32 AST Construction of ROSE tutorial pdf documentation (http://rosecompiler.org/ROSE_Tutorial/ROSE-Tutorial.pdf). It contains essential information for any translation writers.

The translations you want to do often depend on the types of the AST nodes you visit. For example you can have a set of translation functions defined in your namespace

  • void translateForLoop(SgForLoop* n)
  • void translateFileName(SgFile* n)
  • void translateReturnStatement(SgReturnStmt* n), and so on

Other tips

Updating Tree[edit]
  • You might need to handle some details, like removing symbol, updating parent, and symbol table.
  • Be careful to use deepDelete() and deepCopy(). Some information might not be updated properly. For example, deepDelete might not update your symbol table.

Verify the correctness[edit]

You can use wholeAST graph to verify your translation.

All ROSE-based translators should call AstTests::runAllTests(project) after all the transformation is done to make sure the translated AST is correct.

This has a higher standard than just correctly unparsed to compilable code. It is common for an AST to go through unparsing correctly but fail on the sanity check.

More information is at Sanity_check

Sample translators[edit]

Here we list a few sample translators which can grow to more sophisticated ones you want.

Find pragmas[edit]

/*
toy code
by Liao, 12/14/2007
*/
#include "rose.h"
#include <iostream>
using namespace std;

class visitorTraversal : public AstSimpleProcessing
{
  protected:
    virtual void visit(SgNode* n);
};

void visitorTraversal::visit(SgNode* node)
{
  if (node->variantT() == V_SgPragmaDeclaration) {
      cout << "pragma!" << endl;
  }
}

int main(int argc, char * argv[])
{
  SgProject *project = frontend (argc, argv);
  visitorTraversal myvisitor;
  myvisitor.traverseInputFiles(project,preorder);

  return backend(project);
}

Loop transformation[edit]

SageInterface namespace (http://rosecompiler.org/ROSE_HTML_Reference/namespaceSageInterface.html) has many translation functions, such as those for loops.

For example, there is a loop tiling function defined in https://github.com/rose-compiler/rose/blob/master/src/frontend/SageIII/sageInterface/sageInterface.C :

//     Tile the n-level (starting from 1) loop of a perfectly nested loop nest using tiling size s.
bool     loopTiling (SgForStatement *loopNest, size_t targetLevel, size_t tileSize)


An example Test translator is provided to test this function:

And it has a test input file:

How to build your translator[edit]

See How to set up the makefile for a translator

How to create a cross-language translator[edit]

In this HOW-to, it presents the steps of generating a cross-language translator. We will use Fortran to C translator as an example here.

Change the sourcefile information[edit]

  • change the output file name. The suffix name has to be changed with this following function.
void SgFile::set_unparse_output_filename (std::string unparse_output_filename ) 
  • change the output language type.
void SgFile::set_outputLanguage(SgFile::outputLanguageOption_enum outputLanguage)       
  • Set the output to be target-language only.
 We use set_C_only for the Fortran to C translation.  This process might be optional.
void SgFile::set_C_only(bool C_only)

Identify language-dependent AST node[edit]

  • Example: ROSE AST uses different AST nodes to present a loop in C and Fortran. The following two figures represent the same loop for different languages.
C uses SgForStatement for the for loops.
C SgForStatement
Fortran uses SgFortranDo for the do loops.
Fortran SgFortranDo

Implement the translation functions[edit]

  • Use the wholeAST as reference to implement the translation function.
  • Generate the new AST node by copy required information from the original AST node.
  • Remove the original node, and make sure the parent/child relationship in AST is setup properly.

Testing output code[edit]

  • If compiler is available to test the output code, run the backend to generate object by the backend compiler.
  • If compiler is not available for the target language, make sure output codes can be generated from the testing cases. It is suggested to run the compilation tests for all the testing output.

How to set up the makefile for a translator[edit]

In this How-to, you will create a makefile to compile and test your own custom ROSE translator.

You may want to first look at "How-to install ROSE": ROSE Compiler Framework/Installation.

Environment variables[edit]

You must have the proper environment variable set so you translator can find the librose.so during execution.

export LD_LIBRARY_PATH=${ROSE_INSTALL}/lib:${BOOST_INSTALL}/lib:$LD_LIBRARY_PATH

Translator Code[edit]

Here is a simplest ROSE translator.

// ROSE translator example: identity translator.
//
// No AST manipulations, just a simple translation:
//
//    input_code > ROSE AST > output_code

#include <rose.h>

int main (int argc, char** argv)
{
    // Build the AST used by ROSE
    SgProject* project = frontend(argc, argv);

    // Run internal consistency tests on AST
    AstTests::runAllTests(project);

    // Insert your own manipulations of the AST here...

    // Generate source code from AST and invoke your
    // desired backend compiler
    return backend(project);
}

Makefile for ROSE-edg4.x[edit]

ROSE's headers are now installed under install_path/include/rose. librose.so also depends on boost_iostreams

## A sample Makefile to build a ROSE tool.
##
## Important: remember that Makefile recipes must contain tabs:
##
##     <target>: [ <dependency > ]*
##         [ <TAB> <command> <endl> ]+
## So you have to replace spaces with Tabs if you copy&paste this file from a browser!

## ROSE installation contains
##   * libraries, e.g. "librose.la"
##   * headers, e.g. "rose.h"
ROSE_INSTALL=/home/liao6/workspace/rose-2014-10-13/install

## ROSE uses the BOOST C++ libraries
BOOST_INSTALL=/home/liao6/opt/boost_1.47.0

## Your translator
TRANSLATOR=myTranslator
TRANSLATOR_SOURCE=$(TRANSLATOR).cpp

## Input testcode for your translator
TESTCODE=hello.cpp

#-------------------------------------------------------------
# Makefile Targets
#-------------------------------------------------------------

all: $(TRANSLATOR)

# compile the translator and generate an executable
# -g is recommended to be used by default to enable debugging your code
# Note: depending on the version of boost, you may have to use something like -I $(BOOST_ROOT)/include/boost-1_40 instead. 
$(TRANSLATOR): $(TRANSLATOR_SOURCE)
        g++ -g $(TRANSLATOR_SOURCE) -I$(BOOST_INSTALL)/include -I$(ROSE_INSTALL)/include/rose -L$(ROSE_INSTALL)/lib -lrose -lboost_iostreams -o $(TRANSLATOR)

# test the translator
check: $(TRANSLATOR)
        source ./set.rose ; ./$(TRANSLATOR) -c -I. -I$(ROSE_INSTALL)/include $(TESTCODE) 

clean:
        rm -rf $(TRANSLATOR) *.o rose_* *.dot

Complete examples[edit]

There are three project examples demonstrating different ways of building your projects using ROSE's headers/libraries.

They are available at: https://github.com/chunhualiao/rose-project-templates

A few templates for independent projects using ROSE. By independent, we mean the projects are located outside of ROSE's source tree.

  • template-project-v1 : using Makefile to build the project
  • template-project-v2 : using autotools to build the project, but requiring manual changes to paths.
  • template-project-v3 : using autotools, and support --with-boost and --with-rose options

Makefile for ROSE-edg3[edit]

Here is a sample makefile. Please make sure replacing some leading spaces of make rules with leading Tabs if you copy & paste this sample.

## A sample Makefile to build a ROSE tool.
##
## Important: remember that Makefile rules must contain tabs:
##
##     <target>: [ <dependency > ]*
##         [ <TAB> <command> <endl> ]+
## 
## Please replace space with TAB if you copy & paste this file to create your Makefile!!

## ROSE installation contains
##   * libraries, e.g. "librose.la"
##   * headers, e.g. "rose.h"
ROSE_INSTALL=/path/to/rose/installation

## ROSE uses the BOOST C++ libraries, the --prefix path
BOOST_INSTALL=/path/to/boost/installation

## Your translator
TRANSLATOR=my_translator
TRANSLATOR_SOURCE=$(TRANSLATOR).cpp

## Input testcode for your translator
TESTCODE=input_code_ifs.cpp

#-------------------------------------------------------------
# Makefile Targets
#-------------------------------------------------------------

all: $(TRANSLATOR)

# compile the translator and generate an executable
# -g is recommended to be used by default to enable debugging your code
$(TRANSLATOR): $(TRANSLATOR_SOURCE)
        g++ -g $(TRANSLATOR_SOURCE) -o $(TRANSLATOR) -I$(BOOST_INSTALL)/include -I$(ROSE_INSTALL)/include -L$(ROSE_INSTALL)/lib -lrose

# test the translator
check: $(TRANSLATOR)
        ./$(TRANSLATOR) -c -I. -I$(ROSE_INSTALL)/include $(TESTCODE) 

clean:
        rm -rf $(TRANSLATOR) *.o rose_* *.dot

A complete example[edit]

The sample Makefile prepared within ROSE virtual machine image.

demo@ubuntu:~/myTranslator$ cat makefile 
## A sample Makefile to build a ROSE tool.
##
## Important: remember that Makefile recipes must contain tabs:
##
##     <target>: [ <dependency > ]*
##         [ <TAB> <command> <endl> ]+
## So you have to replace spaces with Tabs if you copy&paste this file from a browser!


## ROSE installation contains
##   * libraries, e.g. "librose.la"
##   * headers, e.g. "rose.h"
ROSE_INSTALL=/home/demo/opt/rose-inst

## ROSE uses the BOOST C++ libraries
BOOST_INSTALL=/home/demo/opt/boost-1.40.0

## Your translator
TRANSLATOR=myTranslator
TRANSLATOR_SOURCE=$(TRANSLATOR).cpp

## Input testcode for your translator
TESTCODE=hello.cpp

#-------------------------------------------------------------
# Makefile Targets
#-------------------------------------------------------------

all: $(TRANSLATOR)

# compile the translator and generate an executable
# -g is recommended to be used by default to enable debugging your code
# Note: depending on the version of boost, you may have to use something like -I $(BOOST_ROOT)/include/boost-1_40 instead. 
$(TRANSLATOR): $(TRANSLATOR_SOURCE)
        g++ -g $(TRANSLATOR_SOURCE) -I$(BOOST_INSTALL)/include -I$(ROSE_INSTALL)/include -L$(ROSE_INSTALL)/lib -lrose -o $(TRANSLATOR)

# test the translator
check: $(TRANSLATOR)
        ./$(TRANSLATOR) -c -I. -I$(ROSE_INSTALL)/include $(TESTCODE) 

clean:
        rm -rf $(TRANSLATOR) *.o rose_* *.dot


demo@ubuntu:~/myTranslator$ make check
g++ -g myTranslator.cpp -I/home/demo/opt/boost-1.40.0/include -I/home/demo/opt/rose-inst/include -L/home/demo/opt/rose-inst/lib -lrose -o myTranslator
./myTranslator -c -I. -I/home/demo/opt/rose-inst/include hello.cpp 

How to debug a translator[edit]

It is rare that your translator will just work after your finish up coding. Using gdb to debug your code is indispensable to make sure your code works as expected. This page shows examples of how to debug your translator.

Preparations[edit]

First and foremost, make sure your ROSE installation and your translator was built with -g and without GCC optimizations turned on. This will ensure all debug information will be best preserved.

You can use "make V=1" to check the actual compilation options. If the default ROSE's configure options are not desirable, you can turn off optimizations by:

 configure --with-C_OPTIMIZE=no --with-CXX_OPTIMIZE=no 

Before you debug your own translators, you may want to doublecheck if ROSE's builtin translator (identityTranslator) can handle your input code properly. If not, you should report the bug to the ROSE team.

If identityTranslator can handle it but your customized translator cannot. The problem may be caused by the customizations you introduced in your translators.

Another thing is to reduce your input code to be as small as possible so it can just trigger the error you are interested in. This will simplify the bug hunting process dramatically. It is very difficult to debug a translator processing thousands of lines of code.

A translator not built by ROSE's build system[edit]

If the translator is built using a makefile without using libtool. The debugging steps of your translator are just classic steps to use gdb.

  • Make sure your translator is compiled with the GNU debugging option -g so there is debugging information in your object codes

These are the steps of a typical debugging session:

1. Set a break point

2. Examine the execution path to make sure the program goes through the path that you expected

3. Examine the local data to validate their values

# how to print out information about a AST node
#-------------------------------------
(gdb) print n
$1 = (SgNode *) 0xb7f12008

# Check the type of a node
#-------------------------------------
(gdb) print n->sage_class_name()
$2 = 0x578b3af "SgFile"

(gdb) print n->get_parent()
$7 = (SgNode *) 0x95e75b8  

# Convert a node to its real node type then call its member functions
#---------------------------
(gdb) isSgFile(n)->getFileName ()

#-------------------------------------
# When displaying a pointer to an object, identify the actual (derived) type of the object 
# rather than the declared type, using the virtual function table. 
#-------------------------------------
(gdb) set print object on
(gdb) print astNode
$6 = (SgPragmaDeclaration *) 0xb7c68008


# unparse the AST from a node
# Only works for AST pieces with full scope information
# It will report error if scope information is not available at any ancestor level.
#-------------------------------------
(gdb) print n->unparseToString()

# print out Sg_File_Info 
#-------------------------------------
(gdb) print n->get_file_info()->display()

A translator shipped with ROSE[edit]

ROSE turns on -O2 and -g by default so the translators shipped with ROSE should already have some debugging information available. But some variables may be optimized away. To preserve the max debugging information, you may have to reconfigure/recompile rose to turn off optimizations.

../sourcetree/configure --with-C_OPTIMIZE=no --with-CXX_OPTIMIZE=no  ...


ROSE uses libtool so the executables in the build tree are not real -- they're simply wrappers around the actual executable files. You have two choices:

  • Find the real executable in the .lib directory then debug the real executables there
  • Use libtool command line as follows:
$ libtool --mode=execute gdb --args ./built_in_translator file1.c

If you can set up alias command alias debug='libtool --mode=execute gdb -args' then all your debugging sessions can be as simple as

$ debug ./built_in_translator file1.c


The remaining steps are the same as a regular gdb session with the typical operations, such as breakpoints, printing data, etc.

Example 1: Fixing a real bug in ROSE

1. Reproduce the reported bug:

$ make check
...
./testVirtualCFG \
    --edg:no_warnings -w -rose:verbose 0 --edg:restrict \
    -I$ROSE/tests/CompileTests/virtualCFG_tests/../Cxx_tests \
    -I$ROSE/sourcetree/tests/CompileTests/A++Code \
    -c $ROSE/sourcetree/tests/CompileTests/virtualCFG_tests/../Cxx_tests/test2001_01.C

...
lt-testVirtualCFG: $ROSE/src/frontend/SageIII/virtualCFG/virtualCFG.h:111:
    VirtualCFG::CFGEdge::CFGEdge(VirtualCFG::CFGNode, VirtualCFG::CFGNode):
    Assertion `src.getNode() != __null && tgt.getNode() != __null' failed.

Ah, so we've failed an assertion within the virtualCFG.h header file on line 111:

Assertion `src.getNode() != __null && tgt.getNode() != __null' failed

And the error was produced by running the lt-testVirtualCFG libtool executable translator, i.e. the actual translator name is testVirtualCFG (without the lt- prefix).

2. Run the same translator command line with Libtool to start a GDB debugging session:

$ libtool --mode=execute gdb --args ./testVirtualCFG \
    --edg:no_warnings -w -rose:verbose 0 --edg:restrict \
    -I$ROSE/tests/CompileTests/virtualCFG_tests/../Cxx_tests \
    -I$ROSE/sourcetree/tests/CompileTests/A++Code \
    -c $ROSE/sourcetree/tests/CompileTests/virtualCFG_tests/../Cxx_tests/test2001_01.C

GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-42.el5_8.1)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from ${ROSE_BUILD_TREE}tests/CompileTests/virtualCFG_tests/.libs/lt-testVirtualCFG...done.
(gdb)

The GDB session has started, and we're provided with a command line prompt to begin our debugging.

3. Let's run the program, which will hit the failed assertion:

(gdb) r
Starting program: \
    ${ROSE_BUILD_TREE}/tests/CompileTests/virtualCFG_tests/.libs/lt-testVirtualCFG \
    --edg:no_warnings -w -rose:verbose 0 --edg:restrict \
    -I${ROSE}/tests/CompileTests/virtualCFG_tests/../Cxx_tests \
    -I../../../../sourcetree/tests/CompileTests/A++Code
    -c   ${ROSE}/tests/CompileTests/virtualCFG_tests/../Cxx_tests/test2001_01.C
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x2aaaaaaab000
[Thread debugging using libthread_db enabled]
lt-testVirtualCFG: ${ROSE}/src/frontend/SageIII/virtualCFG/virtualCFG.h:111: 

VirtualCFG::CFGEdge::CFGEdge(VirtualCFG::CFGNode, VirtualCFG::CFGNode): Assertion `src.getNode() != __null && tgt.getNode() != __null' failed.

Program received signal SIGABRT, Aborted.
0x0000003752230285 in raise () from /lib64/libc.so.6

Okay, we've reproduced the problem in our GDB session.

4. Let's check the backtrace to see how we wound up at this failed assertion:

(gdb) bt
#0  0x0000003752230285 in raise () from /lib64/libc.so.6
#1  0x0000003752231d30 in abort () from /lib64/libc.so.6
#2  0x0000003752229706 in __assert_fail () from /lib64/libc.so.6

#3  0x00002aaaad6437b2 in VirtualCFG::CFGEdge::CFGEdge (this=0x7fffffffb300, src=..., tgt=...)
     at ${ROSE}/../src/frontend/SageIII/virtualCFG/virtualCFG.h:111
#4  0x00002aaaad643b60 in makeEdge<VirtualCFG::CFGNode, VirtualCFG::CFGEdge> (from=..., to=..., result=...)
     at ${ROSE}/../src/frontend/SageIII/virtualCFG/memberFunctions.C:82
#5  0x00002aaaad62ef7d in SgReturnStmt::cfgOutEdges (this=0xbfaf10, idx=1)
     at ${ROSE}/../src/frontend/SageIII/virtualCFG/memberFunctions.C:1471
#6  0x00002aaaad647e69 in VirtualCFG::CFGNode::outEdges (this=0x7fffffffb530)
     at ${ROSE}/../src/frontend/SageIII/virtualCFG/virtualCFG.C:636
#7  0x000000000040bf7f in getReachableNodes (n=..., s=...) at ${ROSE}/tests/CompileTests/virtualCFG_tests/testVirtualCFG.C:13
...

5. Next, we'll move backwards (or upwards) in the program to get to the point of assertion:

(gdb) up
#1  0x0000003752231d30 in abort () from /lib64/libc.so.6

(gdb) up
#2  0x0000003752229706 in __assert_fail () from /lib64/libc.so.6

(gdb) up
#3  0x00002aaaad6437b2 in VirtualCFG::CFGEdge::CFGEdge (this=0x7fffffffb300, src=..., tgt=...)
     at ${ROSE}/src/frontend/SageIII/virtualCFG/virtualCFG.h:111
111         CFGEdge(CFGNode src, CFGNode tgt): src(src), tgt(tgt) \
                   { assert(src.getNode() != NULL && tgt.getNode() != NULL); }

Okay, so the assertion is inside of a constructor for CFGEdge:

CFGEdge(CFGNode src, CFGNode tgt): src(src), tgt(tgt) \
{
    assert(src.getNode() != NULL && tgt.getNode() != NULL);  # This is the failed assertion
}

Unfortunately, we can't tell at a glance which of the two conditions in the assertion is failing.

6. Figure out why the assertion is failing:

Let's examine the two conditions in the assertion:

(gdb) p src.getNode()
$1 = (SgNode *) 0xbfaf10

So src.getNode() is returning a non-null pointer to an SgNode. How bout tgt.getNode()?

(gdb) p tgt.getNode()
$2 = (SgNode *) 0x0

Ah, there's the culprit. So for some reason, tgt.getNode() is returning a null SgNode pointer (0x0).

From here, we used the GDB up command to backtrace in the program to figure out where the node returned by tgt.getNode() was assigned a NULL value.

We eventually found a call to SgReturnStmt::cfgOutEdges which returns a variable, called enclosingFunc. In the source code, there's currently no assertion to check the value of enclosingFunc, and that's why we received the assertion later on in the program. As a side note, it is good practice to add assertions as soon as possible in your source code so in times like this, we don't have to spend time unnecessarily back-tracing.

After adding the assertion for enclosingFunc, we run the program again to reach this new assertion point:

lt-testVirtualCFG: ${ROSE}sourcetree/src/frontend/SageIII/virtualCFG/memberFunctions.C:1473: \
    virtual std::vector<VirtualCFG::CFGEdge, std::allocator<VirtualCFG::CFGEdge> > \
    SgReturnStmt::cfgOutEdges(unsigned int): \

    Assertion `enclosingFunc != __null' failed.

Okay, it's failing so we know that the assignment to enclosingFunc is NULL.

# enclosingFunc is definitely NULL (0x0)
(gdb) p enclosingFunc
$1 = (SgFunctionDefinition *) 0x0

# What is the current context?
(gdb) p this
$2 = (SgReturnStmt * const) 0xbfaf10

Okay, we're inside of an SgReturnStmt object. Let's set a break point where enclosingFunc is being assigned to:

Breakpoint 1, SgReturnStmt::cfgOutEdges (this=0xbfaf10, idx=1) at ${ROSE}/src/frontend/SageIII/virtualCFG/memberFunctions.C:1472
1472              SgFunctionDefinition* enclosingFunc = SageInterface::getEnclosingProcedure(this);

So this is the line we're examining:

SgFunctionDefinition* enclosingFunc = SageInterface::getEnclosingProcedure(this);

So the NULL value must be coming from SageInterface::getEnclosingProcedure(this);.

After code reviewing the function getEnclosingProcedure, we discovered a flaw in the algorithm.

The function tries to return a SgNode which is the enclosing procedure of the specified type, SgFunctionDefinition. However, upon checking the function's state at the point of return, we see that it incorrectly detected a SgBasicBlack as the enclosing procedure for the SgReturnStmt.

(gdb) p parent->class_name()
$12 = {static npos = 18446744073709551615,
   _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x7cd0e8 "SgBasicBlock"}}

Specifically, the last piece: 0x7cd0e8 "SgBasicBlock".

But this is wrong because we're looking for SgFunctionDefinition, not SgBasicBlock.

Upon further examination, we figured out that the function simply returned the first enclosing node it found, and not the first enclosing node that matched the user's criteria.

We added the necessary logic to make the function complete, tested it to verify its correctness, and then called it a day.

Phew! Not so bad, right?

How to add a new project directory[edit]

Most code development that is layered above the ROSE library starts out its life as a project in the projects directory. Some projects are eventually refactored into the ROSE library once they mature. This chapter describes how one adds a new project to ROSE.

Required Files[edit]

A ROSE project encapsulates a complete program or set of related programs that use the ROSE library. Each project exists as a subdirectory of the ROSE "projects" directory and should include files "README", "config/support-rose.m4", "Makefile.am", and any necessary source files, scripts, tests, etc.

  • The "README" should provide an explanation about the project purpose, algorithm, design, implementation, etc.
  • The "support-rose.m4" integrates the project into the ROSE build system in a manner that allows the project to be an optional component (they can be disabled, renamed, deleted, or withheld from distribution without changing any ROSE configuration files). Most older projects are lacking this file and are thus more tightly coupled with the build system.
  • The "Makefile.am" serves as the input to the GNU automake system that ROSE employs to generate Makefiles.
  • Each project should also include all necessary source files, documentation, and test cases.

Setting up support-rose.m4[edit]

The "config/support-rose.m4" file integrates the project into the ROSE configure and build system. At a minimum, it should contain a call to the autoconf AC_CONFIG_FILES macro with a list of the project's Makefiles (without the ".am" extension) and its doxygen configuration file (without the ".in" extension). It may also contain any other necessary autoconf checks that are not already performed by ROSE's main configure scripts, including code to enable/disable the project based on the availability of the project's prerequisites.

Here's an example:

dnl List of all makefiles and autoconf-generated                          -*- autoconf -*-
dnl files for this project
AC_CONFIG_FILES([projects/DemoProject/Makefile
                 projects/DemoProject/gui/Makefile
                 projects/DemoProject/doxygen/doxygen.conf
                ])

dnl Even if this project is present in ROSE's "projects" directory, we might not have the
dnl prerequisites to compile this project.  Enable the project's makefiles by using the
dnl ROSE_ENABLE_projectname automake conditional.  Many prerequisites have probably already
dnl been tested by ROSE's main configure script, so we don't need to list them here again
dnl (although it usually doesn't hurt).
AC_MSG_CHECKING([whether DemoProject prerequisites are satisfied])
if test "$ac_cv_header_gcrypt_h" = "yes"; then
        AC_MSG_RESULT([yes])
        rose_enable_demo_project=yes
else
        AC_MSG_RESULT([no])
        rose_enable_demo_project=
fi
AM_CONDITIONAL([ROSE_ENABLE_DEMO_PROJECT], [test "$rose_enable_demo_project" = yes])

Since all configuration for the project is encapsulated in the "support-rose.m4" file, renaming, disabling, or removing the project is trivial: a project can be renamed simply by renaming its directory, it can be disabled by renaming/removing "support-rose.m4", or it can be removed by removing its directory. The "build" and "configure" scripts should be rerun after any of these changes.

Since projects are self-encapsulated and optional parts of ROSE, they need not be distributed with ROSE. This enables end users to drop in their own private projects to an existing ROSE source tree without modifying any ROSE files, and it allows ROSE developers to work on projects that are not distributed publicly. Any project directory that is not part of ROSE's main Git repository will not be distributed (this includes not distributing Git submodules, although the submodule's placeholder empty directory will be distributed).

Setting up Makefile.am[edit]

Each project should have at least one Makefile.am, each of which is processed by GNU automake and autoconf to generate a Makefile. See documentation for automake for details about what these files should contain. Some important variables and targets are:

  • include $(top_srcdir)/config/Makefile.for.ROSE.includes.and.libs: This brings in the definitions from the higher level Makefiles and is required by all projects. It should be near the top of the Makefile.am.
  • SUBDIRS: This variable should contain the names all the project's subdirectories that have Makefiles. It may be omitted if the project's only Makefile is in that project's top-level directory.
  • INCLUDES: This would have the the flags that need to be added during compilation (flags like -I$(top_srcdir)/projects/RTC/include). Your flags should be placed before $(ROSE_INCLUDES) to ensure the correct files are found. This brings in all the necessary headers from the src directory to your project.
  • lib_*: These variables/targets are necessary if you are creating a library from your project, which can be linked in with other projects or the src directory later. This is the recommended way of handling projects.
  • EXTRA_DIST: These are the files that are not listed as being needed to build the final object (like source and header files), but must still be in the ROSE tarball distribution. This could include README or configuration files, for example.
  • check-local: This is the target that will be called from the higher level Makefiles when make check is called.
  • clean-local: Provides you with a step to perform manual cleanup of your project, for instance, if you manually created some files (so Automake won't automatically clean them up).

A basic example[edit]

Many projects start as a translator, analyzer or optimizer, which takes into input code and generate output.

A basic sample commit which adds a new project directory into ROSE: https://github.com/rose-compiler/rose/commit/edf68927596960d96bb773efa25af5e090168f4a

Please look through the diffs so you know what files to be added and changed for a new project.

Essentially, a basic project should contain

  • a README file explaining what this project is about, algorithm, design, implementation, etc
  • a translator acts as a driver of your project
  • additional source files and headers as needed to contain the meat of your project
  • test input files
  • Makefile.am to
    • compile and build your translator
    • contain make check rule so your translator will be invoked to process your input files and generate expected results

To connect your project into ROSE's build system, you also need to

  • Add one more subdir entry into projects/Makefile.am for your project directory
  • Add one line into config/support-rose.m4 for EACH new Makefile (generated from each Makefile.am) used by your projects.

Installing project targets[edit]

Install your project's content to a separate directory within the user's specified --prefix location. The reason behind this is that we don't want to pollute the core ROSE installation space. By doing so, we can reduce the complexity and confusion of the ROSE installation tree, while eliminating cross-project file collisions. It also keeps the installation tree modular.

Example[edit]

This example uses a prefix for installation. It also maintains Semantic Versioning.

From projects/RosePoly:

  ## 1. Version your project properly (http://semver.org/)
  rosepoly_API_VERSION=0.1.0

  ## 2. Install to separate directory
  ##
  ##    Installation tree should resemble:
  ##
  ##    <--prefix>
  ##    |--bin      # ROSE/bin
  ##    |--include  # ROSE/include
  ##    |--lib      # ROSE/lib
  ##    |
  ##    |--<project>-<version>
  ##       |--bin      # <project>/bin
  ##       |--include  # <project>/include
  ##       |--lib      # <project>/lib
  ##
  exec_prefix=${prefix}/rosepoly-$(rosepoly_API_VERSION)

  ## Installation/include tree should resemble:
  ##   |--<project>-<version>
  ##      |--bin      # <project>/bin
  ##      |--include  # <project>/include
  ##         |--<project>
  ##      |--lib      # <project>/lib
  librosepoly_la_includedir = ${exec_prefix}/include/rosepoly

Generate Doxygen Documentation[edit]

0. Install Doxygen tool

Using MacPorts for Apple's Mac OS:

  $ port install doxygen
 
  # set path to MacPort's bin/
  # ...

Using one of the LLNL machines:

  $ export PATH=/nfs/apps/doxygen/latest/bin:$PATH


1. Create a Doxygen configuration file

  $ doxygen -g
 
Configuration file `Doxyfile' created.
 
Now edit the configuration file and enter
 
  doxygen Doxyfile
 
to generate the documentation for your project


2. Customize the configuration file (Doxyfile):

...

# If the EXTRACT_ALL tag is set to YES doxygen will assume all entities in
# documentation are documented, even if no documentation was available.
# Private class members and static file members will be hidden unless
# the EXTRACT_PRIVATE and EXTRACT_STATIC tags are set to YES

EXTRACT_ALL            = YES

...

# If the value of the INPUT tag contains directories, you can use the
# FILE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp
# and *.h) to filter out the source-files in the directories. If left
# blank the following patterns are tested:
# *.c *.cc *.cxx *.cpp *.c++ *.d *.java *.ii *.ixx *.ipp *.i++ *.inl *.h *.hh
# *.hxx *.hpp *.h++ *.idl *.odl *.cs *.php *.php3 *.inc *.m *.mm *.dox *.py
# *.f90 *.f *.for *.vhd *.vhdl

FILE_PATTERNS          = *.cpp *.hpp

# The RECURSIVE tag can be used to turn specify whether or not subdirectories
# should be searched for input files as well. Possible values are YES and NO.
# If left blank NO is used.

RECURSIVE              = YES

...


3. Generate the Doxygen documentation

  # Invoke from your top-level directory
  $ doxygen Doxyfile


4. View and verify the HTML documentation

  $ firefox html/index.html &

5. Add target to your Makefile.am to generate the documentation

.PHONY: docs
docs:
    doxygen Doxyfile # TODO: should be $(DOXYGEN)

How to fix a bug[edit]

If you are trying to fix a bug ( your own or a bug assigned to you to fix). Here are high level steps to do the work

Reproduce the bug[edit]

You can only fix a bug when you can reproduce it. This step may be more difficult than it sounds. In order to reproduce a bug, you have to

  • find a proper input file
  • find a proper translator: a translator shipped with ROSE is easy to find. But be patient and sincere when you ask for a translator written by users.
  • find a similar/identical software and hardware environment: a bug may only appear on a specific platform when a specific software configuration is used

Possible results for this step:

  • You can reproduce the bug reliably. Bingo! Go to the next step.
  • You cannot reproduce the bug. Either the bug report is invalid or you have to keep trying.
  • You can reproduce the bug once a while (random errors). Oops. This is kind of difficult situation.

Find causes of the bug[edit]

Once you can reproduce the bug. You have to identify the root cause of the bug using a debugger like gdb.

Common steps involved

  • simplify the input code as much as possible: It can be very hard to debug a problem with a huge input. Always try to prepare the simplest possible code which can just trigger the bug.
    • Often, you have to use a binary search approach to narrow down the input code: only use half of the input at a time to try. Recursively cut the input file into two parts until no further cut is possible while you can still trigger the bug.
  • forward tracking: for the translator, it usually takes input and generate intermediate results before the final output is generated. Using a debugger to set break points at each critical stages of the code to check if the intermediate results are what you expect.
  • backwards tracking: similar to the previous techniques. But you just back tracking the problem.

Fix the bug[edit]

Any bug fix commit should contain

  • a regression test: so make check rules can make sure the bug is actually fixed and no further code changes will make the bug relapse.

How to add a ROSE commandline option[edit]

Often a feature added into ROSE comes with a set of command line options. These options can enable and customize the feature.

For example, the OpenMP support in ROSE is disabled by default. A special option is need to enable it. Also, the support can be as little as simply parsing the OpenMP directive or as complex as translating into multithreaded code.

This HOWTO quickly go through key steps to add options.

internal flags[edit]

Options need to be stored somewhere. There are two choices for the storage, as a data member of SgProject v. a data member of SgFile. If the option can be as specific as per file, it is recommended to add a new data member to SgFile to save the option value.

For example, here is a command line option to turn on the UPC language support:

ROSE/src/ROSETTA/src/support.C // add a date member for SgFile

    // Liao (6/6/2008): Support for UPC model of C , 6/19/2008: add support for static threads compilation
    File.setDataPrototype         ( "bool", "UPC_only", "= false",
                                    NO_CONSTRUCTOR_PARAMETER, BUILD_ACCESS_FUNCTIONS, NO_TRAVERSAL, NO_DELETE);

ROSETTA process this information to automatically generate a member and the corresponding member access functions (set/get_member()).

process the option[edit]

Command line options should be handled within src/frontend/SageIII/sage_support/cmdline.cpp .

File level options are handled by void SgFile::processRoseCommandLineOptions ( vector<string> & argv )


Example code for processing the -rose:openmp option

     set_openmp(false);
     ROSE_ASSERT (get_openmp() == false);
       ...
     if ( CommandlineProcessing::isOption(argv,"-rose:","(OpenMP|openmp)",true) == true )
        {
          if ( SgProject::get_verbose() >= 1 )
               printf ("OpenMP option specified \n");
          set_openmp(true);
         //side effect for enabling OpenMP, define the macro as required
           argv.push_back("-D_OPENMP");
        }


ROSE commandline options should be removed after being processed, to avoid confusing the backend compiler

SgFile::stripRoseCommandLineOptions ( vector<string>& argv ) should have the code to strip off the option.

use the option[edit]

In your code, you can use the automatically generated access functions to set/retrieve the stored option values.

For example

  if (sourceFile->get_openmp())
     //... do something here ....

document the option[edit]

Any option should be explained by the online help output.

Please add brief help text for your option in void SgFile::usage ( int status ) of ./src/frontend/SageIII/sage_support/cmdline.cpp: