Compiler Construction/Optimization
Optimization
On modern computers, a compiler can be considered to have satisfactory performance if it translates a moderately sized source program (say, about 1000 lines) in a matter of seconds. The way to get a compiler with satisfactory performance is much the same as the way you would get any program to perform well:
- Design using good algorithms.
- Ensure your data structures match the algorithms.
- Structure using modules with clean simple interfaces.
- Implement using clear straightforward code.
- When there is an overall performance problem:
- Measure the actual performance in reasonable detail.
- Identify the troublesome areas.
- Redesign and re-implement these problem areas.
In this book we will consider various algorithms and data structures and discuss their likely impact on performance.
Note that actual measurement is crucial, since the problems are often not where you guess they might be. For your initial implementation you may well have selected simple algorithms which are known to perform poorly in order to get something working quickly. Nevertheless, you should still measure performance in detail, since there may be some other source of (at least some of) your problems.
- If you are very lucky, your implementation language might have some optional facility for selectively measuring CPU time. Take care to only activate such a feature for a few crucial routines; the timing overhead could easily exceed the execution time for small routines and distort the result.
- More commonly you will be unlucky and will have to explicitly add timing code to selected routines; make sure you can easily enable and disable it as required. Typically you will insert calls to some CPU timing routine at the beginning and end of a routine, then subtract the two values to get the time for that routine, which will include the time for any routines it calls; a sketch of one such scheme follows this list.
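As a minimal sketch in C, assuming the standard clock() routine and illustrative macro names (any similar scheme will do), the timing code can be made easy to switch on and off at compile time:

#include <stdio.h>
#include <time.h>

/* Compile with -DTIMING to enable measurement; otherwise the
   macros expand to nothing and add no overhead. */
#ifdef TIMING
#define TIME_START(t) clock_t t = clock()
#define TIME_END(t, name) \
    fprintf(stderr, "%s: %.3f s\n", name, \
            (double)(clock() - (t)) / CLOCKS_PER_SEC)
#else
#define TIME_START(t)
#define TIME_END(t, name)
#endif

void parse_program(void) { /* routine being measured */ }

int main(void) {
    TIME_START(t0);
    parse_program();
    TIME_END(t0, "parse_program");
    return 0;
}

Remember the warning above: apply such macros only to a few substantial routines, since the calls to clock() themselves take time.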
Various measurements on the performance of actual compilers have been reported over the years. Specific areas which have been known to cause problems include:
- Multiple routine calls during lexical analysis for each and every source character.
- Skipping white space during lexical analysis.
- Skipping over a comment.
- Decoding a tightly packed parse table during syntax analysis.
- Looking things up in the name table during semantic analysis.
- Determining whether some name is a reserved keyword or a user-definable identifier.
An extensive list of optimizations can be found on Wikipedia in the compiler optimization article. Optimization is a very rich and complex topic, so this chapter will only attempt to introduce the basics.
Optimization within a compiler is concerned with improving the generated object code in some way while ensuring that the results it computes remain identical. Technically, a better name for this chapter might be "Improvement", since compilers only attempt to improve the operations the programmer has requested. Optimizations fall into three categories:
- Speed: improving the runtime performance of the generated object code. This is the most common optimization.
- Space: reducing the size of the generated object code.
- Safety: reducing the possibility of data structures becoming corrupted (for example, ensuring that an illegal array element is not written to).
Unfortunately, many "speed" optimizations make the code larger, and many "space" optimizations make the code slower; this is known as the space-time tradeoff.
Common Optimization Algorithms
Common optimization algorithms deal with specific cases:
- Dead Code Elimination
- Common Sub-expression Elimination
- Copy Propagation
- Code Motion
- Induction Variable Elimination
- Reduction In Strength
- Function Chunking
Dead Code Elimination
Dead code elimination is a size optimization (although it also produces some speed improvement) that aims to remove logically impossible statements from the generated object code. Dead code is code which will never execute, regardless of the input.
Consider the following program:
a = 5
if (a != 5) {
    // Some complicated calculation
}
...
It is obvious that the complicated calculation will never be performed: since the last value assigned to a before the if statement is a constant, the condition can be evaluated at compile time. Simple substitution of the constant produces if (5 != 5), which is false. The body of an if (false) statement will never execute; it is dead code, so we can rewrite the program as:
a = 5
// Some statements
The algorithm has identified and removed the section of dead code.
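A minimal sketch of the underlying check, over a toy representation invented here for illustration (real compilers perform constant propagation and reachability analysis on a proper intermediate representation):

#include <stdbool.h>

/* Toy IR: an if statement guarded by "lhs != rhs", where each
   operand is either a compile-time constant or unknown. */
typedef struct {
    bool known;          /* value known at compile time? */
    int  value;
} Operand;

typedef struct {
    Operand lhs, rhs;    /* condition: lhs != rhs */
    bool    dead;        /* set when the body can be removed */
} IfNotEqual;

/* Mark the statement dead if the condition is provably false. */
void eliminate_if_dead(IfNotEqual *s) {
    if (s->lhs.known && s->rhs.known && s->lhs.value == s->rhs.value)
        s->dead = true;  /* e.g. (5 != 5): the body never runs */
}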
Common Sub-expression Elimination
Common sub-expression elimination is a speed optimization that aims to reduce unnecessary recalculation by identifying, through code-flow analysis, expressions (or parts of expressions) which will evaluate to the same value: the recomputation of an expression can be avoided if the expression has previously been computed and the values of its operands have not changed since that computation.
Consider the following program:
a = b + c
d = e + f
g = b + c
In the above example, the right-hand sides of the first and last statements are identical and the values of the operands do not change between the two statements; thus b + c is a common sub-expression.
The recomputation can be avoided by storing the value of the sub-expression in a temporary variable and reusing it. After applying common sub-expression elimination, the program becomes:
t0 = b + c
a = t0
d = e + f
g = t0
Thus in the last statement the recomputation of the expression b + c is avoided.
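A minimal sketch of the bookkeeping involved, on a toy three-address form (the structure and the quadratic scan are invented for illustration; real compilers use value numbering or dataflow analysis, and must also check that no operand is reassigned between the two occurrences):

/* Toy three-address statement: dest = src1 op src2 */
typedef struct { char dest, src1, op, src2; } Stmt;

/* Replace a recomputation of an identical right-hand side with a
   copy of the earlier result.  Assumes no operand is reassigned
   in between (a real pass must verify this). */
void cse(Stmt *code, int n) {
    for (int i = 0; i < n; i++)
        for (int j = 0; j < i; j++)
            if (code[j].src1 == code[i].src1 &&
                code[j].op   == code[i].op   &&
                code[j].src2 == code[i].src2) {
                code[i].src1 = code[j].dest;  /* e.g. g = a */
                code[i].op   = '=';           /* becomes a copy */
                code[i].src2 = 0;
                break;
            }
}

Note that this simplified version reuses the earlier destination directly instead of introducing a temporary such as t0; introducing the temporary is safer when the earlier destination may itself be reassigned.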
Copy Propagation
Given an assignment x = y, replace later uses of x with y, provided there are no intervening assignments to x or y. The code does not necessarily shrink, but the substitution may enable further optimizations. Example:
x[i]=a;
sum=x[i]+a;
Instead of this we can use:
x[i]=a;
sum=a+a;
Here x[i] = a is a copy statement, so a can be used in place of x[i] after it, wherever the copy still holds. This may not appear to be an improvement in itself, but it opens up scope for other optimizations (for example, a + a is cheaper to evaluate than x[i] + a and may simplify further).
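A minimal sketch on the same toy three-address form as above (names invented for illustration; a real pass must track exactly where the copy remains valid):

/* Toy statement: dest = src1 op src2; op == '=' means a copy. */
typedef struct { char dest, src1, op, src2; } Stmt;

/* After a copy "x = y", rewrite later uses of x as y, stopping
   as soon as x or y is assigned again. */
void copy_propagate(Stmt *code, int n) {
    for (int i = 0; i < n; i++) {
        if (code[i].op != '=') continue;        /* not a copy */
        char x = code[i].dest, y = code[i].src1;
        for (int j = i + 1; j < n; j++) {
            if (code[j].src1 == x) code[j].src1 = y;  /* use of x */
            if (code[j].src2 == x) code[j].src2 = y;
            if (code[j].dest == x || code[j].dest == y)
                break;                          /* copy killed */
        }
    }
}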
Code Motion
Code motion moves a computation out of a loop when its result does not change between iterations, so the work is performed once rather than on every pass. For example, consider the code below:
for (i = 0; i < n; ++i) {
x = y + z;
a[i] = 6 * i + x * x;
}
The calculations x = y + z and x * x can be moved outside the loop since they are loop invariant: they do not change over the iterations of the loop. The optimized code will be something like this:
x = y + z;
t1 = x * x;
for (i = 0; i < n; ++i) {
a[i] = 6 * i + t1;
}
This code can be optimized further. For example, strength reduction could remove the multiplications inside the loop: the explicit 6 * i and the implicit address computation in a[i]. A sketch of the first of these follows.
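A sketch of what strength reduction of 6 * i might produce (the temporary t2 is illustrative; the indexing in a[i] would be reduced similarly at the address level):

x = y + z;
t1 = x * x;
t2 = 0;                      /* running value of 6 * i */
for (i = 0; i < n; ++i) {
    a[i] = t2 + t1;
    t2 = t2 + 6;             /* addition replaces multiplication */
}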
Another example of code motion:
for (i = 0; i < 10; i++) {
    a = a + c;
}
Here a = a + c is not loop invariant (a changes on every iteration), but since the loop simply adds c to a ten times, the whole loop can be replaced by a single statement:
a = a + 10*c;
Induction Variable Elimination
A variable is said to be an induction variable if it is increased or decreased by a fixed amount on every iteration of a loop. For example:
for (i = 0; i < 10; i++) {
    j = 17 * i;
}
Here i and j are both induction variables. When a loop contains two or more induction variables, it may be possible to get rid of all but one: here, after strength-reducing j to an addition, the loop can be controlled by j alone and i can be eliminated.
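A sketch of the combined transformation, assuming i is not needed after the loop (strength reduction turns the multiplication into an addition, and the loop test is then rewritten in terms of j):

/* j takes the values 0, 17, 34, ..., 153, exactly as before */
for (j = 0; j < 170; j += 17) {
    /* loop body uses j */
}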
Reduction In Strength
Strength reduction substitutes an expensive machine operation with a cheaper one (or a sequence of cheaper ones) that yields the same result. Operators differ in strength, that is, in how costly they are to execute: multiplication is typically more expensive than addition, and division more expensive still. This type of optimization can give large gains, especially when the compiler knows the cost characteristics of the target hardware. In strength reduction, higher-strength operators are replaced by lower-strength operators.
Example
Given the expression Length(S1 || S2), where || denotes string concatenation and S1 and S2 are strings, the rule

Length(S1 || S2) ---(replaced by)--> Length(S1) + Length(S2)

avoids building the concatenated string at all: the + operation on two integer lengths is far cheaper than the concatenation ||, and, as the example makes clear, the result is unchanged.
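Another classic instance, sketched in C (whether a given compiler applies this rewrite depends on its cost model for the target):

/* Multiplication by a power of two can be replaced by a shift;
   for unsigned operands the two forms are exactly equivalent. */
unsigned int times8(unsigned int x) {
    return x << 3;    /* cheaper than x * 8 on many targets */
}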
Function Chunking
Function chunking is a compiler optimization for improving code locality. Profiling information is used to move rarely executed code outside of the main function body, so that memory pages containing rarely executed code can be swapped out.
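As an illustration, the GCC/Clang cold attribute lets a programmer supply the same hint by hand (a real compiler extension; the exact placement behaviour is compiler dependent, and profile-guided compilers can infer this information without annotations):

/* Mark the error path as rarely executed so the compiler may
   move it out of the hot instruction stream. */
__attribute__((cold, noinline))
static void report_error(void) {
    /* rarely executed recovery and logging code */
}

int process(int x) {
    if (x < 0) {          /* unlikely case */
        report_error();
        return -1;
    }
    return x * 2;         /* hot path stays compact */
}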