ROSE Compiler Framework/Lattice

Introduction

Lattices are mathematical structures. They can be used as a general way to express an order among objects. This data can be exploited in data flow analysis.

Lattices can describe transformations effected by basic blocks on data flow values also known as flow functions.

Lattices can describe data flow frameworks when instantiated as algebraic structures consisting of a set of data flow values, a set of flow functions, and a merge operator.

Poset

Partial ordering: ${\displaystyle \leq }$

A partial ordering is a binary relation ${\displaystyle \leq }$ over a set P which is reflexive, antisymmetric and transitive, i.e.

• Reflexive x<=x
• Anti-Symmetric, if ${\displaystyle x\leq y,y\leq x}$ then x=y
• Transitive: if ${\displaystyle x\leq y,y\leq z}$ then ${\displaystyle x\leq z}$

Partial orders should not be confused with total orders. A total order is a partial order but not vice versa. In a total order any two elements in the set P can be compared. This is not required in a partial order. Two elements that can be compared are said to be comparable

A partially ordered set, also known as a poset, is a set with a partial order.

Given a poset there may exist an infimum or a supremum. However, not all posets contain these.

Given a poset P with set X and order ${\displaystyle \leq }$:

An infimum of a subset S of X is an element a of X such that

• ${\displaystyle a\leq x}$ for all x in S and
• for all y in X, if for all x in S, ${\displaystyle y\leq x}$ then ${\displaystyle y\leq a}$

The dual of this notion is the supremum which has the definition of infimum if you switch ${\displaystyle \leq }$ with ${\displaystyle \geq }$

If we simply pick an element of X that satisfies the first condition we have a lower bound. The second condition ensures that we have (if it exists) the unique greatest lower bound. Similarly for suprema.

A lattice is a particular kind of poset. In particular, a lattice L is a poset P(X, ${\displaystyle \leq }$ where For any two elements of the lattice a and b, the set {a, b} has a join and a meet

The join and meet operations MUST satisfy the following conditions

• 1) The join and meet must commute
• 2) The join and meet are associative
• 3) The join and meet are idempotent, that is, x join itself or x meet itself are both x.

If the lattice contains a meet it is a meet-semilattice, if a lattice contains a join it is a join-semilattice, similarly there exists a meet-semilattice

(Definitions obtained from wikipedia with minimal modification)

Lattice Definition

Definition of a Lattice (L, ${\displaystyle \land }$, ${\displaystyle \lor }$ )

• L is a poset under ${\displaystyle \leq }$ such that
• Every pair of elements has a unique greatest lower bound (meet) and least upper bound (join)
• Not every poset is a lattice: greatest lower bounds and least upper bounds need not exist in a poset.

Infinite vs. Finite lattices

• Infinite: An infinite lattice does not contain an 0 (bottom) or 1 (top) element, even though every pair of elements contains a greatest lower bound and a least upper bound on the entire underlying set. By the definition of unbounded or infinite sets we know that given X an unbounded set given any x in X we can find an x' that is greater than x (under some ordering, in this case the lattice). Similarly for greatest lower bounds.
• a finite/bounded lattice: the underlying set itself has a greatest lower bound and a least upper bound, For now we will call the greatest lower bound 0 and the least upper bound 1.
• if a${\displaystyle \leq }$ x, for all x in L, then a is the 0 element of L, ${\displaystyle \bot }$, recall that this is a unique element
• if a${\displaystyle \geq }$ x for all x from L, then a is the 1 element of L, ${\displaystyle \top }$

Meet ${\displaystyle \land }$ is a binary operation such that a ${\displaystyle \land }$ b take the greatest lower bound of the set (this is guaranteed by the definition lattice.

Similarly Join ${\displaystyle \lor }$ returns the least upper bound of the set, guaranteed to exist by the definition of a lattice.

To recap, a lattice L is a triple {X, ${\displaystyle \land }$, ${\displaystyle \lor }$} composed of a set, a Meet function, and a Join function

Properties of Meet and ${\displaystyle \land }$.

• We refer to the ${\displaystyle \lor }$ as ${\displaystyle \lor }$ and the ${\displaystyle \land }$ as J
• Closure: If x and y belong to L, then there exists a unique z and a unique w from L such that x ${\displaystyle \lor }$ y = z, and x ${\displaystyle \land }$ y = w
• Commutativity: for all x, y in L, x ${\displaystyle \lor }$ y = y meet x, x ${\displaystyle \land }$ y = y ${\displaystyle \land }$ x:
• Associativity: (x ${\displaystyle \lor }$ y) ${\displaystyle \lor }$ z = x ${\displaystyle \lor }$ (y ${\displaystyle \lor }$ z), similarly in the ${\displaystyle \land }$ operation
• There are two unique elements of L called bottom ( _|_ ), and top (T) , such that for all x, x ${\displaystyle \lor }$ _|_ = _|_ and x ${\displaystyle \land }$ T = T
• Many lattices, with some exceptions, notably the lattice corresponding to constant propagatioin, are also distributive: x ${\displaystyle \lor }$ y ${\displaystyle \land }$z = (x ${\displaystyle \land }$z) ${\displaystyle \lor }$ (y ${\displaystyle \land }$z)

Lattices and partial order:

${\displaystyle x\sqsubseteq y}$ if and only if ${\displaystyle x\sqcap y=x}$

A strictly ascending chain is a sequence of elements of a set X such that, for x_i in X, ${\displaystyle x_{1},x_{2},...,x_{n}}$ has the property ${\displaystyle \bot =x_{1}\sqsubset x_{2}\sqsubset ...\sqsubset x_{n}=\top }$. The greatest is the chain with final index n such that n is the greatest such final index among all strictly ascending chains.

The height of a lattice is defined as the length of the longest strictly ascending chain it contains.

If a data-flow analysis lattice has a finite height and a monotonic flow function then we know that the associated data flow analysis algorithm will terminate.

• Example: If the greatest strictly ascending chain of a lattice L is finite and it takes finitely many steps to reach the top, we can infer that the associated data flow algorithm terminates.

(wikipedia used for definitions)

Example: Bit vector Lattices

• The elements of the set are bit vectors
• The bottom is the 0 vector
• The top is a 1 vector
• Meet is a bitwise And
• Join is a bitwise Or

${\displaystyle BV^{n}}$ denotes the lattice of bit vectors of length n.

Constructing complex lattices from multiple less complex lattices

• Example: The product operation which combines (concatenates) lattices elementwise
• The product of two lattices L1 and L2 with meet operators M1, M2, respectively: L1 x L2
• The elements in the lattice: {<x1, x2> | x1 from L1, x2 from L2}
• The meet operator: <x1, x2> M <y1, y2> = <x1 M y1, x2 M y2>
• The join operation: <x1, x2> J <y1, y2> = <x1 J y1, x2 J y2>
• Example:
• BV^n is the product of n copies of the trivial bit vector attice BV^1 with bottom 0 and top 1

Graphical Representation BV^3

          111
/     |    \
110       101      011
|    x        x   \
100       010      001
\     |     /
000


Here meet and join operators induce a partial order on the lattice elements

x is less than or equal to (<=) y if an only if x M y = x

For the BV^3: 000<= 010 <= 101<=111

The partial order on the lattice is:

• Transitive x <= y and y <= z, then x <=z
• Antisymmetric: if x<=y and y<=x, then x = y
• Reflexive: for all x: x<=x:

The height of the lattice is the length of its longest strictly ascending chain:

• The maximal n such that there exists a strictly ascending chain x1, x2, ..., xn such that
• Bottom = x1 < x2 < xn = Top

For BV^3 lattice, height = 4

Monotonic Functions

A monotonic function is a function that preserves an ordering.

Examples

A function f from L to itself, f: L -> L, is monotonic if for all x, y from L, x<=y ==> f(x)<=f(y)

f: BV^3 -> BV^3: f (<x1 x2 x3>) -> <x1 1 x3>

Lattice Tuples

Simple analyses may require complex lattices:

• Problem:
• Reaching Constants: V 2^(v*c) where v is the number of variables and c is the constants
• Solution:
• Construct a tuple of lattices where each lattice corresponds to a variable

V = constant U {Top, Bottom}

integer value: ICP

This is used in constant propagation Elements: Top, Bottom, Integers, Booleans

• n M Bottom = Bottom
• n J Top = Top
• n J n = n M n = n
• Integers and booleans m,n, if m != n, then m M n = Bottom, m J n = Top
• The lattice has three levels: the top element, all other elements, the bottom element
• Join operation: Higher level to lower level
• Meet operation: Lower level to higher level

Relevance to data flow analysis

A lattice provides a set of flow values to a particular data flow analysis.

Lattices are used to argue the existence of a solution obtainable through fixed-point iteration

• At each program point a lattice represents an IN[p] or OUT[p] set (flow value)
• meet: merge flow values, e.g. set union, deal with control flow branches merge
• Top usually represents the best information (initial flow value). Note people can use top to represent worst-base information also!!
• The bottom value represents the worst-base information
• if BOTTOM <= x <= y <= TOP , then x is a conservative approximation of y. e.g. x is a superset

e.g. liveness analysis

bitvector for all variables x_1, x_2, ..., x_n

First step: design the lattice values

• top value: empty set {}, initial value, knowing nothing
• bottom value: all set {x_1, x_2, ..., x_n}: max possible value, knowing every variable is live

n = 3, 3 variable case: a flow value==> a set of live variable at a point

S = {v1, v2, v3}

value set: 2^3 = { empty, {v1},{v2}, {v3}, {v1, v2}, {v1,v3}, {v2, v3}, {v1, v2, ve} }

Design lattice

• top value, best case: none live { T } // top
• bottom value, worst ase: all live {v1, v2, v3}

Design meet: set Union (Or operation): bring the value down to the bottom, context insensitive

• design partial order <= --> ${\displaystyle \supseteq }$

In between, a partial order: inferior/conservative solutions are lower on the lattice

         Top
/    |   \
{v1}   {v2}  {v3}
|    x      x   |
{v1, v2}  {v1,v3}  {v2,v3}
\     |      /
{v1, v2, v3} = Bottom


Flow function F: ${\displaystyle f_{n}(X)=Gen_{n}\cup (X-Kill_{n}),\forall _{n}}$}

reaching definition

Value: 2^n n = number of all definitions

top: empty set: knowing nothing, initial value

bottom: all set: all definitions are reaching definition

Meet operation: set union: bring down the levels of values, from unknowing to knowing