Jump to content

ROSE Compiler Framework/FAQ

From Wikibooks, open books for an open world

We collect a list of frequently asked questions about ROSE, mostly from the rose-public mailing list link

General

[edit | edit source]

How to search rose-public mailinglist for previously asked questions?

[edit | edit source]

Use the following command on google search
site:https://mailman.nersc.gov/pipermail/rose-public $(ADD_YOUR_SEARCH_TERM_HERE)

How to check the version of ROSE?

[edit | edit source]

ROSE_Install_path/include/rose/rosePublicConfig.h

/* Define to the version of this package. */
#define ROSE_PACKAGE_VERSION "0.9.8.54"


To check this in your code

bool checkRoseVersionNumber(const std::string &need) {
    std::vector<std::string> needParts = rose::StringUtility::split('.',
need);
    std::vector<std::string> haveParts = rose::StringUtility::split('.',
ROSE_PACKAGE_VERSION);

    for (size_t i=0; i < needParts.size() && i < haveParts.size(); ++i) {
        if (needParts[i] != haveParts[i])
            return needParts[i] < haveParts[i];
    }
    // E.g., need = "1.2" and have = "1.2.x", or vice versa
    return true;
} 

Why can't ROSE staff members answer all my questions?

[edit | edit source]

It can feel very frustrating when you get no responses to your questions submitted to the rose-public@nersc.gov mailing list. You may wonder why the ROSE staff cannot help neither sometimes.

Here are some possible excuses:

  • They are just as busy as everybody else in the research and development fields. They may be working around the clock to meet deadlines for proposals, papers, project reviews, deliverables, etc.
  • They don't know every corner of their own compiler, given the breadth and depth of contributions made to ROSE by collaborators, former staff members, post-docs, and interns. Moreover, most contributions lack good documentation--something that should be remedied in the future.
  • Some questions are simply difficult and open research and development questions. They may have no clue, either.
  • They just feel lazy sometimes or are taking a thing called vacation.

Possible alternatives to have your questions answered and your problems solved in a timely fashion:

  • Please do you own homework first (e.g. Google).
  • The ROSE team is actively addressing the documentation problem, through an internal code review process to enforce well-documented contributions going forward.
  • Help others to help yourself. Answer questions on the rose-public@nersc.gov mailing list and contribute to this community-editable Wikibook.
  • Find ways to formally collaborate with, or fund, the ROSE team. Things go faster when money is flowing :-) Sad, but true, reality in this busy world.

How many lines of source code does ROSE have?

[edit | edit source]

Excluding the EDG submodule and all source code comments, the core of ROSE (rose/src) has about 674,000 lines of C/C++ source code as of July 11, 2012.

Including tests, projects, and tutorial directories, ROSE has about 2 Million lines of code.

Some details are shown below:

[rose/src]./cloc-1.56.pl .
    3076 text files.
    2871 unique files.                                          
     716 files ignored.

http://cloc.sourceforge.net v 1.56  T=26.0 s (91.7 files/s, 39573.3 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C++                            908          75280          93960         354636
C                              123          12010           3717         199087
C/C++ Header                   915          28302          38412         121373
Bourne Shell                    17           3346           4347          25326
Perl                             4            743           1078           7888
Java                            18           1999           4517           7096
m4                               1            747             20           6489
Python                          34           1984           1174           5363
make                           148           1682           1071           3666
C#                              11            899            274           2546
SQL                              1              0              0           1817
Pascal                           5            650             31           1779
CMake                          168           1748           4880           1702
yacc                             3            352            186           1544
Visual Basic                     6            228            421           1180
Ruby                            11            281            181            809
Teamcenter def                   3              3              0            606
lex                              2            103             47            331
CSS                              1             95             32            314
Fortran 90                       1             34              6            244
Tcl/Tk                           2             29              6            212
HTML                             1              8              0             15
-------------------------------------------------------------------------------
SUM:                          2383         130523         154360         744023
-------------------------------------------------------------------------------

How large is ROSE?

[edit | edit source]

To show top level information only (in MB): du -msl * | sort -nr

170	tests
109	projects
90	src
19	docs
16	winspecific
16	ROSE_ResearchPapers
15	binaries
7	scripts
5	LicenseInformation
4	tutorial
4	autom4te.cache
2	libltdl
2	exampleTranslators
2	configure
2	config
2	ChangeLog

Sort directories by their sizes in MegaBytes

 du -m | sort -nr >~/size.txt
709	.
250	./.git
245	./.git/objects
243	./.git/objects/pack
170	./tests
109	./projects
90	./src
76	./tests/CompileTests
50	./tests/RunTests
40	./tests/RunTests/FortranTests
34	./tests/RunTests/FortranTests/LANL_POP
29	./tests/RunTests/FortranTests/LANL_POP/netcdf-4.1.1
27	./src/3rdPartyLibraries
23	./tests/roseTests
23	./src/frontend
22	./tests/CompileTests/Fortran_tests
21	./tests/CompilerOptionsTests
19	./docs
18	./tests/CompileTests/RoseExample_tests
18	./src/midend
18	./docs/Rose
16	./winspecific
16	./ROSE_ResearchPapers
15	./tests/CompileTests/Fortran_tests/gfortranTestSuite
15	./binaries/samples
15	./binaries
14	./tests/CompileTests/Fortran_tests/gfortranTestSuite/gfortran.dg
14	./src/roseExtensions
11	./projects/traceAnalysis
10	./tests/CompileTests/A++Code
10	./tests/CompilerOptionsTests/testCpreprocessorOption
10	./tests/CompilerOptionsTests/A++Code
10	./src/roseExtensions/qtWidgets
10	./src/frontend/Disassemblers
10	./projects/symbolicAnalysisFramework
10	./projects/SATIrE
10	./projects/compass
9	./winspecific/MSVS_ROSE
9	./tests/RunTests/A++Tests
9	./tests/roseTests/binaryTests
9	./src/frontend/SageIII
9	./projects/symbolicAnalysisFramework/src
9	./docs/Rose/powerpoints
8	./winspecific/MSVS_project_ROSETTA_empty
8	./projects/simulator
7	./tests/RunTests/FortranTests/LANL_POP_OLD
7	./tests/CompileTests/Cxx_tests
7	./src/midend/programTransformation
7	./src/midend/programAnalysis
7	./src/3rdPartyLibraries/libharu-2.1.0
7	./scripts
7	./projects/symbolicAnalysisFramework/src/mpiAnal
7	./projects/RTC
6	./winspecific/MSVS_ROSE/Debug
6	./tests/RunTests/FortranTests/LANL_POP/netcdf-4.1.1/ncdap_test
6	./tests/roseTests/programAnalysisTests
6	./src/3rdPartyLibraries/ckpt
6	./src/3rdPartyLibraries/antlr-jars
6	./projects/SATIrE/src
5	./tests/RunTests/FortranTests/LANL_POP/pop-distro
5	./tests/RunTests/FortranTests/LANL_POP/netcdf-4.1.1/libcf
5	./tests/CompileTests/ElsaTestCases
5	./src/ROSETTA
5	./src/3rdPartyLibraries/qrose
5	./projects/DatalogAnalysis
5	./projects/backstroke
5	./LicenseInformation
5	./docs/Rose/AstProcessing

To list files based on size

 find . -type f -print0 | xargs -0 ls -s | sort -k1,1rn
241568 ./.git/objects/pack/pack-f366503d291fc33cb201781e641d688390e7f309.pack
13484 ./tests/CompileTests/RoseExample_tests/Cxx_Grammar.h
10240 ./projects/traceAnalysis/vmp-hw-part.trace
6324 ./tests/RunTests/FortranTests/LANL_POP_OLD/poptest.tgz
5828 ./winspecific/MSVS_ROSE/Debug/MSVS_ROSETTA.pdb
4732 ./.git/objects/pack/pack-f366503d291fc33cb201781e641d688390e7f309.idx
4488 ./binaries/samples/bgl-helloworld-mpicc
4488 ./binaries/samples/bgl-helloworld-mpixlc
4080 ./LicenseInformation/edison_group.pdf
3968 ./projects/RTC/tags
3952 ./src/frontend/Disassemblers/x86-InstructionSetReference-NZ.pdf
3908 ./tests/CompileTests/RoseExample_tests/trial_Cxx_Grammar.C
3572 ./winspecific/MSVS_project_ROSETTA_empty/MSVS_project_ROSETTA_empty.ncb
3424 ./src/frontend/Disassemblers/x86-InstructionSetReference-AM.pdf
2868 ./.git/index
2864 ./projects/compassDistribution/COMPASS_SUBMIT.tar.gz
2864 ./projects/COMPASS_SUBMIT.tar.gz
2740 ./ROSE_ResearchPapers/2007-CommunicatingSoftwareArchitectureUsingAUnifiedSingle-ViewVisualization-ICECC
S.pdf
2592 ./docs/Rose/powerpoints/rose_compiler_users.pptx
2428 ./src/3rdPartyLibraries/ckpt/wrapckpt.c
2408 ./projects/DatalogAnalysis/jars/weka.jar
2220 ./scripts/graph.tar
1900 ./src/3rdPartyLibraries/antlr-jars/antlr-3.3-complete.jar
1884 ./src/3rdPartyLibraries/antlr-jars/antlr-3.2.jar
1848 ./src/midend/programTransformation/ompLowering/run_me_defs.inc
1772 ./src/3rdPartyLibraries/qrose/docs/QROSE.pdf
1732 ./tests/CompileTests/Cxx_tests/longFile.C
1724 ./src/midend/programTransformation/ompLowering/run_me_task_defs.inc
1656 ./ChangeLog
1548 ./tests/roseTests/binaryTests/yicesSemanticsExe.ans
1548 ./tests/roseTests/binaryTests/yicesSemanticsLib.ans
1480 ./ROSE_ResearchPapers/1997-ExpressionTemplatePerformanceIssues-IPPS.pdf
1408 ./docs/Rose/powerpoints/ExaCT_AllHands_March2012_ROSE.pptx

...

Compilation

[edit | edit source]

Cannot download the EDG binary tar ball

[edit | edit source]

Three possible reasons

  • the website hosting EDG binaries is down (there is a manual way to get the binary)
  • we don't support the platform you use so there is no EDG binary is available for you.
  • you cloned your rose from an un-official repo so the build process cannot figure out the right version of EDG binary for you. (there is a solution mentioned below)

It is possible that the rosecompiler.org website is down for maintenance.

So you may encounter the following error message:

make[3]: Entering directory `/home/leo/workspace/github-rose/buildtree/src/frontend/CxxFrontend' test -d /nfs/casc/overture/ROSE/git/ROSE_EDG_Binaries && cp /nfs/casc/overture/ROSE/git/ROSE_EDG_Binaries/roseBinaryEDG-3-3-i686-pc-linux-gnu-GNU-4.4-32fe4e698c2e4a90dba3ee5533951d4c.tar.gz . || wget http://www.rosecompiler.org/edg_binaries/roseBinaryEDG-3-3-i686-pc-linux-gnu-GNU-4.4-32fe4e698c2e4a90dba3ee5533951d4c.tar.gz --2012-08-05 12:58:29-- http://www.rosecompiler.org/edg_binaries/roseBinaryEDG-3-3-i686-pc-linux-gnu-GNU-4.4-32fe4e698c2e4a90dba3ee5533951d4c.tar.gz Resolving www.rosecompiler.org... 128.55.6.204 Connecting to www.rosecompiler.org|128.55.6.204|:80... failed: No route to host. make[3]: *** [roseBinaryEDG-3-3-i686-pc-linux-gnu-GNU-4.4-32fe4e698c2e4a90dba3ee5533951d4c.tar.gz] Error 4

In this case, you should ask for the missing tar ball or find it on our backup location

You don't have to clone the entire edge binary repo since it is big. You can just download the one you need (click raw file link on github.com).

Once you get the bar ball, copy it to your build tree's CxxFrontend subdirectory:

  • buildtree/src/frontend/CxxFrontend

Then you should be able to normally build rose by typing make.

TODO: automate the search using the alternative path to obtain edg binary

Another possible reason is that you cloned your local rose repo from an unofficial repository.

  • In order to maintain the correct matching between rose source and EDG binary, we require a canonical repository to be available.

make[3]: Leaving directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src/frontend/CxxFrontend/Clang'
Unable to find a remote tracking a canonical repository.  Please add a
canonical repository as a remote and ensure it is up to date.  Currently
configured remotes are:

   origin => git@xxx.com/myrose.git

Potential canonical repositories include:

   anything ending with "rose.git" (case insensitive)
Unable to find a remote tracking a canonical repository.  Please add a
canonical repository as a remote and ensure it is up to date.  Currently
configured remotes are:

   origin => git@xxx.com/myrose.git

Potential canonical repositories include:

   anything ending with "rose.git" (case insensitive)
make[3]: Entering directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src/frontend/CxxFrontend'
test -d /nfs/casc/overture/ROSE/git/ROSE_EDG_Binaries && cp /nfs/casc/overture/ROSE/git/ROSE_EDG_Binaries/roseBinaryEDG-3-3-x86_64-pc-linux-gnu-GNU-4.3-.tar.gz  . || wget http://www.rosecompiler.org/edg_binaries/roseBinaryEDG-3-3-x86_64-pc-linux-gnu-GNU-4.3-.tar.gz
--2013-02-15 17:26:42--  http://www.rosecompiler.org/edg_binaries/roseBinaryEDG-3-3-x86_64-pc-linux-gnu-GNU-4.3-.tar.gz
Resolving www.rosecompiler.org... 128.55.6.204
Connecting to www.rosecompiler.org|128.55.6.204|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2013-02-15 17:26:42 ERROR 404: Not Found.

make[3]: *** [roseBinaryEDG-3-3-x86_64-pc-linux-gnu-GNU-4.3-.tar.gz] Error 1
make[3]: Leaving directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src/frontend/CxxFrontend'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src/frontend/CxxFrontend'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src/frontend'
make: *** [all-recursive] Error 1
make: Leaving directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src'

Solution: add an official rose repo as an additional remote repo of your local repo

  • add a canonical repository, like the one at github: git add remote official-rose https://github.com/rose-compiler/rose.git
  • git fetch official-rose // to retrieve hash numbers etc in the canonical repository
  • Now you can build rose again. it should find the canonical repo you just added and use it to find a matching EDG binary

How to access EDG or EDG-SAGE connection code?

[edit | edit source]

From page 5 of http://rosecompiler.org/ROSE_UserManual/ROSE-UserManual.pdf

The connection code that was used to translate EDG’s AST to SAGE III was derived loosely from the EDG C++ source generator and has formed the basis of the SAGE III translator from EDG to SAGE III’s IR.

Under the license we have, the EDG source code and the translation from the EDG AST in distributions are excluded from source release and are made available through a binary format. No part of the EDG work is visible to the user of ROSE. The EDG source are available only to those who have the EDG research or commercial license.

Chapter 2.6 "Getting a Free EDG License for Research Use" of the manual has instructions about how to obtain the EDG license.

Once you obtain the license, please contact the staff members of ROSE to verify your license. After that, they will give you more instructions about how to proceed.

How to speedup compiling ROSE?

[edit | edit source]

Question It takes hours to compile ROSE, how can I speed up this process?

Answer:

  • if you have multi-core processors, try to use make -j4 (make by using four processes or even more if you like).
  • also try to only build librose.so under src/ by typing make -C src/ -j4
  • Or only try to build the language support you are interested in during configure, such as
    • ../sourcetree/configure --enable-only-c # if you are only interested in C/C++ support
    • ../sourcetree/configure --enable-only-fortran # if you are only interested in Fortran support
    • ../sourcetree/configure --help # show all other options to enable only a few languages.

Can ROSE accept incomplete code?

[edit | edit source]

https://mailman.nersc.gov/pipermail/rose-public/2011-July/001015.html

ROSE does not handle incomplete code. Though this might be possible in the future. It would be language dependent and likely depend heavily on some of the language specific tools that we use internally. This is however, not really a priority for our work. If you want to for example demonstrate how some of the internal tools we are using or alternative tools that we could use might handle incomplete code, this might be interesting and we could discuss it.

For example, we are not presently using Clang, but if it handled incomplete code that might be interesting for the future. I recall that some of the latest EDG work might handle some incomplete code, and if that is true then that might be interesting as well. I have not attempted to handle incomplete code with OFP, so I am not sure how well that could be expected to work. Similarly, I don't know what the incomplete code handling capabilities of ECJ Java support is either. If you know any of these questions we could discuss this further.

I have some doubts about how much meaningful information can come from incomplete code analysis and so that would worry me a bit. I expect it is very language dependent and there would be likely some constraints on the incomplete code. So understanding the subject better would be an additional requirement for me.

Can ROSE analyze Linux Kernel sources?

[edit | edit source]

https://mailman.nersc.gov/pipermail/rose-public/2011-April/000856.html

Question: I'm trying to analyze the Linux kernel. I was not sure of the size of the code-base that can be handled by ROSE, and could not find references as to whether it has been tried on the Linux kernel source. As of now I'm trying to run the identity translator on the source, and would like to know if it can be done using ROSE, and if it has been successfully tested before.

Short answer: Not for now

Long answer: We are using EDG 3.3 internally by default and this version of EDG does not handle the GNU specific register modifiers used in the asm() statements of the Linux Kernel code. There might be other problems, but that was at least the one that we noticed in previous work on this some time ago. But we are working on upgrading the EDG frontend to be a more recent version 4.4.

Can ROSE compile C++ Boost library?

[edit | edit source]

https://mailman.nersc.gov/pipermail/rose-public/2010-November/000544.html

not yet.

I know of a few cases where ROSE can't handle parts of Boost. In each case it is an EDG problem where we are using an older version of EDG. We are trying to upgrade to a newer version of EDG (4.x), but that version's use within ROSE does not include enough C++ support, so it is not ready. The C support is internally tested, but we need more time to work on this.

How to find XYZ in AST?

[edit | edit source]

The usually steps to retrieve information from AST are:

  • prepare a simplest (preferably 5-10 lines only), compilable sample code with the code feature you want to find (e.g array[i][j] if you are curious about how to find use of multi-dimensional arrays in AST), avoid including any headers (#include file.h) to keep the code small.
    • Please note: don't include any headers in the sample code. A header (#include <stdio.h> for example) can bring in thousands of nodes into AST.
  • use dotGeneratorWholeASTGraph to generate a detailed AST dot graph of the input code
  • use zgrviewer-0.8.2's run.sh to visualize the dot graph
  • visually/manually locate the information you want in the dot graph, understand what to look and where to look

Some sample AST graphs are available at https://github.com/chunhualiao/rose-ast

How to get children of an AST node?

[edit | edit source]

Once you know how to find a child in the AST manually. You can use codes to walk the AST using AST member functions, traversal, or SageInteface functions, etc to retrieve the information you want

  • ROSE provides member access functions like get_X() by default for a child named X. such as get_lhs_operand() for SgBinaryOp with a child named lhs_operand in the AST graph.
  • The names are shown in AST graph as labels of edges from parents to children.

To get a child by index use the function (not recommended though):

virtual SgNode * 	get_traversalSuccessorByIndex (size_t idx)

and/or related, similarly named functions.

How to filter out header files from AST traversals?

[edit | edit source]

https://mailman.nersc.gov/pipermail/rose-public/2010-April/000144.html

Question: I want to exclude functions in #include files from my analysis/transformations during my processing.

By default, AST traversal may visit all AST nodes, including the ones come from headers.

So AST processing classes provide three functions :

  • T traverse (SgNode * node, ..): traverse full AST , nodes which represent code from include files
  • T traverseInputFiles(SgProject* projectNode,..) traverse the subtree of AST which represents the files specified on the command line
  • T traverseWithinFile(SgNode* node,..): only the nodes which represent code of the same file as the start node

Should SgIfStmt::get_true_body() return SgBasicBlock?

[edit | edit source]

https://mailman.nersc.gov/pipermail/rose-public/2011-April/000930.html

Both true/false bodies were SgBasicBlock before.

Later, we decided to have more faithful representation of both blocked (with {...}) and single-statement (without { ..} ) bodies. So they are SgStatement (SgBasicBlock is a subclass of SgStatement) now.

But it seems like the document has not been updated to be consistent with the change.

You have to check if the body is a block or a single statement in your code. Or you can use the following function to ensure all bodies must be SgBasicBlock.

//A wrapper of all ensureBasicBlockAs*() above to ensure the parent of s is a scope statement with list of statements as children, otherwise generate a SgBasicBlock in between.

SgLocatedNode * SageInterface::ensureBasicBlockAsParent (SgStatement *s)

How to handle #include "header.h", #if, #define etc. ?

[edit | edit source]

It is called preprocessing info. within ROSE's AST. They are attached before, after, or within a nearby AST node (only the one with source location information.)

An example translator is provided to traverse the input code's AST and dump information about the found preprocessing information. The source code of this translator is https://github.com/rose-compiler/rose/blob/master/exampleTranslators/defaultTranslator/preprocessingInfoDumper.C .

To use the translator:

buildtree/exampleTranslators/defaultTranslator/preprocessingInfoDumper -c main.cxx
-----------------------------------------------
Found an IR node with preprocessing Info attached:
(memory address: 0x2b7e1852c7d0 Sage type: SgFunctionDeclaration) in file
/export/tmp.liao6/workspace/userSupport/main.cxx (line 3 column 1)
-------------PreprocessingInfo #0 ----------- :
classification = CpreprocessorIncludeDeclaration:
  String format = #include "all_headers.h"

relative position is = before

SgClassDeclaration::get_definition() returns NULL?

[edit | edit source]

If you look at the whole AST graph carefully, you can find defining and non-defining declarations for the same class.

A symbol is usually associated with a non-defining declaration. A class definition is associated with a defining declaration.

You may want to get the defining declaration from the non-defining declaration before you try to grab the definition, as in this function:

SgFunctionDefinition* getFunctionDefinitionFromDeclaration(const SgFunctionDeclaration* funcDecl) {
  //Get the defining declaration (we don't know if funcDecl is the defining or nonDefining declaration
  SgFunctionDeclaration* funcDefDecl = isSgFunctionDeclaration(funcDecl->get_definingDeclaration()); 
  ROSE_ASSERT(funcDefDecl != NULL);

  //Get the definition from the defining declaration
  SgFunctionDefinition* funcDef = isSgFunctionDefinition(funcDefDecl->get_definition());  
  ROSE_ASSERT(funcDef != NULL);
  return funcDef;
}

How to handle arrays?

[edit | edit source]

The first step is to get familiar with the AST representing Array types (SgArrayType) and array references (SgPntrArrRefExp). Then you can retrieve the necessary information from the AST.

To understand array types and array references, Here is one example,

// cat ~/temp/array.c 
int a[5][10][15];  // array declaration, a type is declared
int foo()
{
  return a[0][1][2]; // a reference to array element
}

An Array Type is represented by SgArrayType.

int a[5][10][15], corresponding three SgArrayType linked together

List a->get_type() will return the first one

  • SgArrayType_1: (index=5, base_type = SgArrayType_2)
  • SgArrayType_2: (index=10, base_type = SgArrayType_3)
  • SgArrayType_3: (index=15, base_type = SgTypeInt )

So a traverse from the first to the element type will get all dimension sizes 5-10-15

The subtree looks like

     SgArrayType_1
      /       \
     5      SgArrayType_2
            /       \
           10      SgArrayType_3  
                     /    \
                    15     SgTypeInt

An array reference is represented by SgPntrArrRefExp

A reference like: a[0][1][2]

  • SgPntrArrRefExp_1 <lhs= ref_2, rhs=2>
  • SgPntrArrRefExp_2 <lhs= ref_3, rhs=1>
  • SgPntrArrRefExp_3 <lhs= SgVarRefExp (a_symbol), rhs=0>

The subtree should look like the following:

    a[0][1][2] //SgPntrArrRefExp
      /    \
  a[0][1]  2 // SgIntVal
    / \
 a[0]  1
  / \
a    0

SgVarRefExp 

There are quite a few functions related to array handling in http://rosecompiler.org/ROSE_HTML_Reference/namespaceSageInterface.html

You can just search "array" to find them:

//Check if an expression is an array access (SgPntrArrRefExp). If so, return its name expression and subscripts if requested. Users can use convertRefToInitializedName() to get the possible name. It does not check if the expression is a top level SgPntrArrRefExp. 

SageInterface::isArrayReference (SgExpression *ref, SgExpression **arrayNameExp=NULL, std::vector< SgExpression * > **subscripts=NULL)


// 	returns the array dimensions in an array as defined for arrtype
std::vector< SgExpression * > 	SageInterface::get_C_array_dimensions (const SgArrayType &arrtype)

// 	Get the number of dimensions of an array type. 
 int 	SageInterface::getDimensionCount (SgType *t)

// 	Get the element type of an array. 
 SgType * 	SageInterface::getArrayElementType (SgType *t)

Some example code using these functions can be found in https://github.com/rose-compiler/rose-develop/blob/master/src/midend/programTransformation/ompLowering/omp_lowering.cpp

For example, void linearizeArrayAccess(SgPntrArrRefExp* top_array_ref) rewrites array reference using multiple-dimension subscripts to a reference using one-dimension subscripts:

  • a[i][j] is changed to a[i*col_size +j]
  • a [i][j][k] is changed to a [(i*col_size + j)*K_size +k]

Sample code to handle 1-D array references

For 1-D array element access a[0], the AST with 3 nodes looks like:

   a[0]          // node 1: SgPntrArrRefExp
  /    \
a       0    //node 3:  SgIntVal
|
// node 2: SgVarRefExp

So the code searching for SgVarRefExp will find a. The next step is to check its type.

SgVarRefExp *vref = ... 
ROSE_ASSERT (vref != NULL);

 SgType* t = vref->get_type();

  if (SgArrayType* atype= isSgArrayType(t)) // now you have array type
  {
    // obtain the dimension vector
    vector<SgExpression*> dimensions  =  SageInterface::get_C_array_dimensions (* atype);
    // dimensions.size() should be 1 if you only handle 1-D array types
    if (dimensions.size() ==1)
    {
      SgPntrArrRefExp * arr_ref_exp = vref->get_parent(); // now you get a[0] from a.
      //do your things you want , with a (vref) and a[o] (arr_ref_exp)

    }
  }
   else if (SageInterface::isScalarType(t))// if scalar types, handle them differently
   {
     ...
   } 

How to add new AST nodes?

[edit | edit source]

There is a section named "1.7 Adding New SAGE III IR Nodes (Developers Only)" in ROSE Developer’s Guide (http://www.rosecompiler.org/ROSE_DeveloperInstructions.pdf)

But before you decide adding new nodes, you may consider if AstAttribute (user defined objects attached to AST) would be sufficient for your problem.

For example, the 1st version of the OpenMP implementation in ROSE (rose/projects/OpenMP_Translator) started by using AstAttribute to represent information parsed from pragmas. Only in the 2nd version we introduced dedicated AST nodes.

There are two separate steps when new kinds of IR nodes are added into ROSE:

  • First step (declaration): Adding class declaration/implementation into ROSE for the new IR nodes. This step is mostly related to ROSETTA.
  • Second step (creation): Creating those new IR nodes at some point: such as somewhere within frontend, midend, or even backend if desired. So this step is decided case by case.

If the new types of IR come from their counterparts in EDG, then modifications to the EDG/SAGE connection code are needed. If not, the EDG/SAGE connection code may be irrelevant.

If you are trying to add new nodes to represent pragma information, you can create your new nodes without involving EDG or its connection to ROSE. You just parse the pragma string in the original AST and create your own nodes to get a new version of AST. Then it should be done.

How does the AST merge work?

[edit | edit source]

tests that demonstrate the AST Merge are in the directory:

    tests/nonsmoke/functional/CompileTests/mergeAST_tests

(run "make check" to see hundreds of tests go by).

parent vs. scope

[edit | edit source]

An AST node can have a parent node which is different from the its scope.

For example: the struct declaration's parent is the typedef declaration. But the struct's scope is the scope of the typedef declaration.

typedef struct frame {int x;} s_frame;

Parsing text into AST

[edit | edit source]

There is some experimental support to parse simple code text into AST pieces. It is not intended to parse entire source codes. But the support should be able to be extended to handle more types of input.

Some documentation about this work:

Example project using the parser building blocks

  • projects/pragmaParsing should work.

Translation

[edit | edit source]

How to skip system headers in translation?

[edit | edit source]

Often we are only interested in user code. The AST represents all codes from users and system headers. We need to skip things from system headers.


// Final most complete version, skip all header files, we cannot unparse changed AST from header files , at least by default

    if (Inliner::skipHeaders)
    { 
      string filename= funcall->get_file_info()->get_filename();
      string suffix = StringUtility ::fileNameSuffix(filename);
      //vector.tcc: This is an internal header file, included by other library headers
      if (suffix=="h" ||suffix=="hpp"|| suffix=="hh"||suffix=="H" ||suffix=="hxx"||suffix=="h++" ||suffix=="tcc")
        return false;

      // also check if it is compiler generated, mostly template instantiations. They are not from user code.
      if (funcall->get_file_info()->isCompilerGenerated() )
        return false;

      // check if the file is within include-staging/ header directories
      if (insideSystemHeader(funcall))
       return false;

    }

//------------partial solutions

bool processStatements(SgNode* n)
{
  ROSE_ASSERT (n!=NULL);
  // Skip compiler generated code, system headers, etc.
  if (isSgLocatedNode(n))
  {
    if (isSgLocatedNode(n)->get_file_info()->isCompilerGenerated())
      return false;
  }
 ...
}

This is based on Sg_File_Info

Inside of Sg_File_Info::display(debug.......) 
     isTransformation                      = false 
     isCompilerGenerated                   = true (no position information) 
     isOutputInCodeGeneration              = false 
     isShared                              = false 
     isFrontendSpecific                    = true (part of ROSE support for gnu compatability) 
     isSourcePositionUnavailableInFrontend = false 
     isCommentOrDirective                  = false 
     isToken                               = false 
     file_id  = 2 
     filename = /home/liao6/daily-test-rose/upcwork/install/include/gcc_HEADERS/rose_edg_required_macros_and_functions.h 
     line     = 167  column   = 1 

....
shared[1] int gsj;
Inside of Sg_File_Info::display(debug.......) 
     isTransformation                      = false 
     isCompilerGenerated                   = false 
     isOutputInCodeGeneration              = false 
     isShared                              = false 
     isFrontendSpecific                    = false 
     isSourcePositionUnavailableInFrontend = false 
     isCommentOrDirective                  = false 
     isToken                               = false 
     filename = /home/liao6/svnrepos/mycode/rose/upc/unshared.upc 
     line     = 6  column = 1 
     file_id  = 1 
     filename = /home/liao6/svnrepos/mycode/rose/upc/unshared.upc 
     line     = 6  column   = 1 


Another way, rose make a copy for all system headers and store them in dedicated paths

  bool insideSystemHeader (SgLocatedNode* node)
  {
    bool rtval = false;
    ROSE_ASSERT (node != NULL);
    Sg_File_Info* finfo = node->get_file_info();
    if (finfo!=NULL)
    {
      string fname = finfo->get_filenameString();
      string buildtree_str1 = string("include-staging/gcc_HEADERS");
      string buildtree_str2 = string("include-staging/g++_HEADERS");
      string installtree_str1 = string("include/edg/gcc_HEADERS");
      string installtree_str2 = string("include/edg/g++_HEADERS");
      // if the file name has a sys header path of either source or build tree
      if ((fname.find (buildtree_str1, 0) != string::npos) ||
          (fname.find (buildtree_str2, 0) != string::npos) ||
          (fname.find (installtree_str1, 0) != string::npos) ||
          (fname.find (installtree_str2, 0) != string::npos)
          )
        rtval = true;
    }
    return rtval;                                                                                                              
  } 



Can ROSE identityTranslator generate 100% identical output file?

[edit | edit source]

https://mailman.nersc.gov/pipermail/rose-public/2011-January/000604.html

Questions: Rose identityTranslator performs some modifications, "automatically".

These modifications are:

  • Expanding the assert macro.
  • Adding extra brackets around constants of typedef types (e.g. c=Typedef_Example(12); is translated in the output to c = Typedef_Example((12));)
  • Converting NULL to 0.

Can I avoid these modifications?

Answer: No.

There is no easy way to avoid these changes currently. Some of them are introduced by the cpp preprocessor. Others are introduced by the EDG front end ROSE uses. 100% faithful source-to-source translation may require significant changes to preprocessing directive handling and the EDG internals.

We have had some internal discussion to save raw token strings into AST and use them to get faithful unparsed code. But this effort is still at its initial stage as far as I know.

How to build a tool inserting function calls?

[edit | edit source]

https://mailman.nersc.gov/pipermail/rose-public/2010-July/000319.html

Question: I am trying to build a tool which insert one or more function calls whenever in the source code there is a function belonging to a certain group (e.g. all functions beginning with foo_*). During the ast traversal, how can I find the right place, i.e., there is a function in ROSE that searches for a string pattern or something similar?

Answers:

  • In Chapter 28 AST Construction of the ROSE tutorial, there are examples to instrument function calls into the AST using traversals or a queryTree. I would approach this by checking the node for the specific SgFunctionDefinition (or whatever you need) and then check the name of the node to find its location.
  • You can
    • use the AST query mechanism to find all functions and store them in a container. e.g Rose_STL_Container<SgNode*> nodeList = NodeQuery::querySubTree(root_node,V_Sg????);
    • Then iterate the container to check each function to see if the function name matches what you want.
    • use SageBuilder namespace's buildFunctionCallStmt() to create a function call statement.
    • use SageInterface namespace's insertStatement () to do the insertion.

How to insert a header into an input file?

[edit | edit source]

There is an SageInterface function for doing this:

// Insert include "filename" or include <filename> (system header) into the global scope containing the current scope, right after other include XXX.
PreprocessingInfo *     SageInterface::insertHeader (const std::string &filename, PreprocessingInfo::RelativePositionType position=PreprocessingInfo::after, bool isSystemHeader=false, SgScopeStatement *scope=NULL) 

How to copy/clone a function?

[edit | edit source]

https://mailman.nersc.gov/pipermail/rose-public/2011-April/000919.html

We need to be more specific about the function you want to copy. Is it just a prototype function declaration (non-defining declaration in ROSE's term ) or a function with a definition (defining declaration in ROSE's term)?

  • Copying a non-defining function declaration can be achieved by using the following function:
// Build a prototype for an existing function declaration (defining or nondefining is fine).
SgFunctionDeclaration* SageBuilder::buildNondefiningFunctionDeclaration (const SgFunctionDeclaration *funcdecl, SgScopeStatement *scope=NULL)
  • Copying a defining function declaration is semantically a problem since it introduces redefinition of the same function.

It is at least a hack to first introduce something wrong and later correct it. Here is an example translator to do the hack (copy a defining function, rename it, fix its symbol):


#include <rose.h>
#include <stdio.h>
using namespace SageInterface;

int main(int argc, char** argv)
{
  SgProject* project = frontend(argc, argv);
  AstTests::runAllTests(project);

// Find a defining function named "bar" under project

  SgFunctionDeclaration* func=
findDeclarationStatement<SgFunctionDeclaration> (project, "bar", NULL,
true);
  ROSE_ASSERT (func != NULL);

// Make a copy and set it to a new name
  SgFunctionDeclaration* func_copy =
isSgFunctionDeclaration(copyStatement (func));
  func_copy->set_name("bar_copy");

// Insert it to a scope
  SgGlobal * glb = getFirstGlobalScope(project);
  appendStatement (func_copy,glb);

#if 0  // fix up the missing symbol, this should be optional now since SageInterface::appendStatement() should handle it transparently. 
  SgFunctionSymbol *func_symbol =  glb->lookup_function_symbol
("bar_copy", func_copy->get_type());
  if (func_symbol == NULL);
  {
    func_symbol = new SgFunctionSymbol (func_copy);
    glb ->insert_symbol("bar_copy", func_symbol);
  }
#endif
  AstTests::runAllTests(project);
  backend(project);
  return 0;
}

ROSE's unparser checks for Sg_File_Info objects of AST pieces before it decides to print out text format of the AST pieces. Only the AST coming from the same file of the input file or AST generated by transformation should be unparsed by default. For example, some AST subtrees come from an included header. But it is often not desired to unparse the content of an included header.

If the file info is still the original file info, the solution is to set the copied AST to be transformation-generated:

// Recursively set source position info(Sg_File_Info) as transformation generated.
SageInterface::setSourcePositionForTransformation (SgNode *root) 

Can I transform code within a header file?

[edit | edit source]

https://mailman.nersc.gov/pipermail/rose-public/2011-May/000971.html

No. ROSE does not unparse AST from headers right now. A summer project tried to do this. But it did not finish and not well tested.

The option is -rose:unparseHeaderFiles -rose:unparseHeaderFilesRootFolder UNPARSED_HEADERS_DIR in tests/CompilerTests/UnparseHeadersTests

https://mailman.nersc.gov/pipermail/rose-public/2010-August/000344.html

I guess ROSE does not support writing out changed headers for safety/practical reasons. A changed header has to be saved to another file since writing to the original header is very dangerous (imaging debugging a header translator which corrupts input headers). Then all other files/headers using the changed header have to be updated to use the new header file.

Also all files involved have to be writable by user's translators.

As a result, the current unparser skips subtrees of AST from headers by checking file flags (compiler_generated and/or output_in_code_generation etc.) stored in Sg_File_Info objects.

How to work with formal and actual arguments of functions?

[edit | edit source]

https://mailman.nersc.gov/pipermail/rose-public/2011-June/001008.html

     //Get the actual arguments
     SgExprListExp* actualArguments = NULL;
     if (isSgFunctionCallExp(callSite))
         actualArguments = isSgFunctionCallExp(callSite)->get_args();
     else if (isSgConstructorInitializer(callSite))
         actualArguments = isSgConstructorInitializer(callSite)->get_args();
     ROSE_ASSERT(actualArguments != NULL);

     const SgExpressionPtrList& actualArgList = 
actualArguments->get_expressions();

     //Get the formal arguments.
     SgInitializedNamePtrList formalArgList;
     if (calleeDef != NULL)
         formalArgList = calleeDef->get_declaration()->get_args();

     //The number of actual arguments can be less than the number of 
formal arguments (with implicit arguments) or greater
     //than the number of formal arguments (with varargs)

How to translate multiple files scattered in different directories of a project?

[edit | edit source]

Expected behavior of a ROSE Translator:

A translator built using ROSE is designed to act like a compiler (gcc, g++,gfortran ,etc depending on the input file types). So users of the translator only need to change the build system for the input files to use the translator instead of the original compiler.

If the original compiler used by you implicitly include or link anything, you may have to make the include or linking paths explicit after the change. For example, if mpiCC transparently links to /path/to/mpilib.a, you have to add this linking flag into your modified Makefile.

On 07/25/2012 11:20 AM, Fernando Rannou wrote:
> > Hello
> >
> > We are trying to use ROSE to refactor  a big project consisting of
> > several  *.cc and *.hh files, located at various directories. Each
> > class is defined in a *.hh file and implemented in a *.cc file.
> > Classes include (#include) other class definitions. But we have only
> > found single file examples.
> >
> > Is this possible? If so, how?
> >
> >
> > Thanks

Unparsing

[edit | edit source]

Generate code into different files

[edit | edit source]

https://mailman.nersc.gov/pipermail/rose-public/2012-August/001742.html Question: I wonder is it possible for ROSE to generate two files (.c and .cl) when it translates C-to-OpenCL ?

Answer: The ROSE outliner has an option to output the generated function into a new file.

https://github.com/rose-compiler/rose/blob/master/src/midend/programTransformation/astOutlining/Outliner.hh

...
// Generate the outlined function into a separated new source file
// -rose:outline:new_file
extern bool useNewFile;
...

You may want to check how this option is used in the outliner source files to get what you want.

Binary Analysis

[edit | edit source]

How is the binary analysis capability in ROSE?

[edit | edit source]

Question: how is the binary analysis capability in ROSE? Is it just disassembly? is it possible to associate the binary code with the source if combined with ROSE source code analysis?

Answer:

ROSE has various binary disassemblers (x86, ARM, MIPS, PowerPC) that, like source code analysis, create an internal representation of the binary in the form of an AST. Although the types of AST nodes for source and binaries are largely disjoint, one can analyze the binary AST using concepts similar to source analysis. ROSE has a few binary analyses. Here are some off the top of my head:

  • Control flow graphs, both virtual and using Boost Graph Library.
  • Function call graphs.
  • Operations on control flow graphs: dominator, post-dominator
  • Pointer detection analysis that tries to figure out which memory locations are used as if they were pointers in a higher level language.
  • Instruction partitioning: figuring out how to group instructions into basic blocks, and how to group basic blocks into functions when all you have is a list of instructions. Its accuracy on automatically partitioning stripped, obfuscated code has been shown to be better than the best disassemblers that use debugging info and symbol tables.
  • Instruction semantics for x86. This is an area of active development but supports only 32-bit integer instructions. We plan to add floating point, SIMD, 64-bit, other architectures, and a simpler API. But even as it stands, it is complete enough to simulate entire ELF executables (even "vi"). See next bullet
  • An x86 simulator for ELF executables. This project is able to simulate how the Linux kernel loads an executable, and the various system calls made by the executable. It it complete enough to simulate many Linux programs, but also provides callback points for the user to insert various kinds of analyses. For instance, you could use it to disassemble an entire process after it has been dynamically linked. There are many examples in the projects/simulator directory. In contrast to simulators like Qemu, Bochs, valgrind, VirtualBox, VMware, etc. where speed is a primary design driver, the ROSE simulator is designed to provide user-level access to as many aspects of execution as possible.
  • Plugins for instruction semantics. Instruction semantics is written in such a way that different "semantic domains" can be plugged in. ROSE has a symbolic domain, an interval domain, and a partial-symbolic domain. The symbolic domain can be used in conjunction with an SMT solver (currently supporting Yices). The interval domain is actually sets of intervals, and is binary-arithmetic-aware (i.e., correctly handles overflows, etc on a fixed word size). The partial-symbolic domain uses single-node expressions in order to optimize for speed and size at the expense of accuracy. Users can and have written other domains, and a new API (in the works) will make this even easier.
  • Examples of data-flow analysis (e.g., the pointer analysis already mentioned), but not a well defined framework yet (someone is working on one). Currently, data-flow type analyses are implemented using the instruction semantics support: as each instruction is "executed" the domain in which it executes causes the data to flow in the machine state. Each analysis provides its own flow equation to handle the points where control flow joins from two or more directions; and provides its own "next-instruction" function to iterate over the control flow graph.
  • Clone detection of various formats: various forms of syntactic, including one using locality-sensitive hashing; and semantic clone detection via fuzz testing in a simulator.

By Robb


You ask about combining source and binary analysis... Its certainly possible since ROSE can hold both the binary and source ASTs in memory at the same time. But I'm not aware of any analysis that "sews" them together. We do support parsing DWARF info from ELF executables, so you might be able to use that to sew the two ASTs together.

--Robb

Daily work

[edit | edit source]

git clone returns error: SSL certificate problem?

[edit | edit source]

Symptom:

git clone https://github.com/rose-compiler/rose.git
Cloning into rose...
error: SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed while accessing https://github.com/rose-compiler/rose.git/info/refs

fatal: HTTP request failed

The reason may be that you are behind a firewall which tweaks the original SSL certification.

Solutions: Tell cURL to not check for SSL certificates:

#Solution 1: Environment variable (temporary)
      $ env GIT_SSL_NO_VERIFY=true git pull

# Solution 2: git-config (permanent)
      # set local configuration
      $ git config --local http.sslVerify false

# Solution 2:  set global configuration
      $ git config --global http.sslVerify false

What is the best IDE for ROSE developers?

[edit | edit source]

https://mailman.nersc.gov/pipermail/rose-public/2010-April/000115.html

There may not be a widely recognized best integrated development environment. But developers have reported that they are using

  • vim
  • emacs
  • KDevelop
  • Source Navigator
  • Eclipse
  • Netbeans

The thing is that ROSE is huge and has some ridiculously large generated source file (CxxGrammar.h and CxxGrammar.C are generated in the build tree for example). So many code browsers may have trouble in handling ROSE.

Portability

[edit | edit source]

What is the status for supporting Windows?

[edit | edit source]

We do maintain some preliminary Windows Support of building ROSE/src to generate librose.so by leveraging cmake. However, the work is not finished.

To build librose under windows, type the following command lines in the top level source tree

 mkdir ROSE-build-cmake
 cd ROSE-build-cmake
 cmake .. -DBOOST_ROOT=${ROSE_TEST_BOOST_PATH}  // Example: boost installation path /opt/boost_1_40_0-inst

https://mailman.nersc.gov/pipermail/rose-public/2011-December/001349.html

We have not finished the Windows work yet. IT is on our list of things to do. It was started and ROSE internally compiles using MS Visual Studio (using project files generated from the Cmake build that we maintain and test within our release process for ROSE) but does not pass our tests. So it is not ready. The distribution of the EDG binaries for Windows is another step that would come after that. We don't know at present when this will be done, it is important, but not a high priority for our DOE specific work, but important for other work. The effort required is something that we could discuss. If you want to call me that would be the best way to proceed. Send me email off of the main list and we can set that up.

https://mailman.nersc.gov/pipermail/rose-public/2011-March/000798.html

Under Windows ROSE uses CMake. This is a project that is currently under development. As of November 2010 we are able to compile and link the src directory. We are also able to run example programs that link against librose and execute the frontend and backend. {\em However, this is an internal capability and not available externally yet since we don't distribute the Windows generated EDG binaries that would be required. Also the current support for Windows is still incomplete, ROSE does not yet pass its internal tests under Windows.}