The Science of Programming/And Now for Something Completely Different

From Wikibooks, open books for an open world
Jump to navigation Jump to search

Enough with the differentiation, already! Time for something new. New, but not all that different. Once you have learned about differentiation in your Math classes, you will surely start to learn about integration. As stated very early on, integration is the summing up of all the tiny (infinitesimal) pieces of a curve. So if dy represents one of the (infinitely) many bits of the curve y, then:

   

Indeed, if you sum up all the bits of a thing, you get the thing itself. [1]

Summing a series[edit | edit source]

As with differentiation, there are two kinds of integration, numeric and symbolic. Like differentiation, numeric integration gives an approximate result while symbolic integration gives an exact result. Since numeric integration involves the actual summing up of a bunch of little bits, let's get some practice summing.

What is the sum of:

   
   

with an infinite number of fractions following the pattern. We have a two issues to solve before we can write code to give us the answer. The first is that the sum seems to be composed of two different things, a whole number (1) and a bunch of fractions.

Life is so much easier when all things are the same; if this is so, we have no need of cluttering up our code with if expresssions to decide on what kind of thing we are working on. [2] In general, if we have two kinds of things we wish to treat the same, we need to either make the first look like the second, the second look like the first, or both look like some other thing. In the specific case of this summation, we need to either:

  • make the fractions look like whole numbers,
  • make the whole number look like a fraction, or
  • make the whole number and the fractions look like a third kind of thing

Of the three strategies, the second seems to be the easiest route since a a whole number is easily represented as a fraction by placing a one in the denominator:

   

At this point, you should familiarize yourself with iterative loops and recursive loops.


The remaining issue we need to solve is how to structure our code. It is clear that we will need some kind of loop, either recursive or iterative, to add up as many fractions as desired. In order to use a loop, we usually need to make the items we are looping over look the same or at least be a function of the loop counter. In our case, how do we make each fraction look alike?

If we look at the denominator, we see that each can be described as a power of two:

   

It seems that we can use a loop counter (starting at zero) as the exponent in the denominator of each fraction. With these two issues out of the way, we can begin to write some code. Let's try the recursive route first.

The recursive way[edit | edit source]

We will use n as our loop counter:

   function total(n)
       {
       1 / (2 ^ n) + total(n + 1);
       }

If we call total with an initial value of zero for n, we should get:

   1 / (2 ^ 0) + total(1)

Upon the next recursive call, we should get:

   1 / (2 ^ 0) + [1 / (2 ^ 1) + total(2)]

And then:

   1 / (2 ^ 0) + [1 / (2 ^ 1) + [1 / (2 ^ 2) + total(3)]]

This is exactly what we want, but we are going to end up summing an infinite number of terms. We have neglected to have a base case for stopping the recursion. Let's pass in the number of fractions we wish to sum up as max:

   function total(n,max)
       {
       if (n == max)
           {
           0;
           }
       else
           {
           1 / (2 ^ n) + total(n + 1,max);
           }
       }

The recursion stops when n reaches max.

What do we get if we total up five fractions?

   sway> total(0,5);
   REAL_NUMBER: 1.9375000000

Is this correct? Let's add up five fractions explicitly:

   sway> 1.0 / 1 + (1.0 / 2) + (1.0 / 4) + (1.0 / 8) + (1.0 / 16);
   REAL_NUMBER: 1.9375000000

Seems to be. How about ten fractions?

   sway> total(0,10);
   REAL_NUMBER: 1.9980468750

Fifty fractions?

   sway> total(0,50);
   REAL_NUMBER: 2.0000000000

One hundred fractions?

   sway> total(0,100);
   REAL_NUMBER: 2.0000000000

These results lead us to believe that if we were to add up these fractions out to infinity, we would get 2 as a total. Of course, since our result is a real number, we need to be wary of trusting it absolutely, but in this case, I'd be willing to bet the result is correct.

How about we add some visualization to this function? We probably should have done this first to make sure our code was behaving as expected:

   function total(n,max)
       {
       if (n == max)
           {
           println("0");
           0;
           }
       else
           {
           var denom = 2 ^ n;
           print("1/",integer(denom)," + ");
           1 / denom + total(n + 1,max);
           }
       }

Note how we 'precomputed' the denominator since we need it twice, once for the visualization and one for the actual computation. We also use the integer function to convert the real number that exponentiation produces back to an integer.

   sway>total(0,5);
   1/1 + 1/2 + 1/4 + 1/8 + 1/16 + 0
   REAL_NUMBER: 1.9375000000

Looking good! Of course, our visualization does not produce a valid Sway expression (whitespace and precedence problems), but that is rarely necessary for a visualization.

An iterative approach[edit | edit source]

Let's try using an iterative loop instead. There is a standard methodology for converting a recursive loop to an iterative loop. The first step is to initialize a local variable, say result, to the recursive loop's base case return value (and then return it). In this case, the return value is zero:

   function total(n,max)
       {
       var result = 0;
       return result;
       }

Now we add a while loop with the opposite of the test found in the recursive loop's if expression:

   function total(n,max)
       {
       var result = 0;
       while (not(n == max))     //better is n != max)
           {
           }
       return result;
       }

In the body of the loop, we place the recursive case calculation:

   function total(n,max)
       {
       var result = 0;
       while (n != max))
           {
           var denom = 2 ^ n;
           1 / denom + total(n + 1,max);
           }
       return result;
       }

replacing the recursive call with result:

   function total(n,max)
       {
       var result = 0;
       while (n != max))
           {
           var denom = 2 ^ n;
           1 / denom + result;   //was total(n + 1,max)
           }
       }

and assigning the whole expression back to result:

   function total(n,max)
       {
       var result = 0;
       while (n != max))
           {
           var denom = 2 ^ n;
           result = 1 / denom + result;   //was total(n + 1,max)
           }
       return result;
       }

Finally, if any variable was updated in the original recursive call, update the variable at the bottom of the loop using assignment:

   function total(n,max)
       {
       var result = 0;
       while (n != max))
           {
           var denom = 2 ^ n;
           result = 1 / denom + result;
           n = n + 1;
           }
       return result;
       }

Voila! We're done. This technique generally works well and will even work if there are multiple base cases and mulitple recursive cases. Sometimes, however, things go a little bit wrong in the transformation. Here's and example; let's convert the recursive version of the greatest common divisor function:

   function gcd(n,d)
       {
       if (d == 0)
           {
           n;
           }
       else
           {
           gcd(d,n % d);
           }
       }

We start with the base case return value:

    function gcd(n,d)
        {
        var result = n;
        return result;
        }

Next, we add a while loop:

    function gcd(n,d)
        {
        var result = n;
        while (d != 0)
            {
            }
        return result;
        }

Then we copy over the recursive case calculation:

    function gcd(n,d)
        {
        var result = n;
        while (d != 0)
            {
            gcd(d,n % d);
            }
        return result;
        }

We replace the recursive call in the calculation by result:

    function gcd(n,d)
        {
        var result = n;
        while (d != 0)
            {
            result;      //was gcd(d,n % d);
            }
        return result;
        }

Now we update the variables that change in the recursive call:

    function gcd(n,d)
        {
        var result = n;
        while (d != 0)
            {
            result;      //was gcd(d,n % d);
            n = d;
            d = n % d;
            }
        return result;
        }

We should be done at this point, but we still have a couple of problems. The first is that the statement:

   result;

in the body of the while loop isn't doing anything useful. This is our first clue that something is amiss. For now, let's delete it:

    function gcd(n,d)
        {
        var result = n;
        while (d != 0)
            {
            n = d;
            d = n % d;
            }
        return result;
        }

The second problem is that in the recursive call, the update to variable d used the old value of n but in our transformation, the update to d uses the new value of n. We need to save the old value and we use a temporary variable to do so:

    function gcd(n,d)
        {
        var result = n;
        while (d != 0)
            {
            var temp = n;
            n = d;
            d = temp % d;
            }
        return result;
        }

The final problem is that result never changes. That's because we initially set result to the original value of n, not the last value of n. Note that the recursive version uses the last value of n as desired. Thus, we update result after the while loop terminates to get the last value of n.

    function gcd(n,d)
        {
        var result = n;
        while (d != 0)
            {
            var temp = n;
            n = d;
            d = temp % d;
            }
        result = n;
        return result;
        }

Of course, we can clean up our function up a bit by realizing that result isn't doing anything at all; we can just return the last value of n:

    function gcd(n,d)
        {
        while (d != 0)
            {
            var temp = n;
            n = d;
            d = temp % d;
            }
        return n;
        }

This technique, at best, gets you the right answer. At worst, it gets you well on the way towards the right answer.

Summing as an abstraction[edit | edit source]

Note that in both our recursive and iterative solutions to the original summation, we have hard-wired the function that computes the next element in the sequence (namely the inverted exponentiation). An important design strategy for writing computer programs is called separation of concerns. With this strategy, we try to write functions, or sets of functions, so that each function performs a single task. In our solutions, both the generation of the next fraction (one concern) and the summing of those fractions (another concern), are found in a single function, We can separate those concerns into individual functions. One function generates the fractions to be summed; the other performs the actual summing, given the previously generated collection of fractions. Our first task, then, is to generate the collection of fractions. We will examine to ways of gathering a collection together, arrays and lists.


At this point, you should familiarize yourself with arrays and lists.


Here is the code for generating a list of the desired fractions:

   function invPowOfTwo(n,max)
       {
       if (n == max)
           {
           :null;
           }
       else
           {
           1 / (2 ^ n) join invPowOfTwo(n + 1,max);
           }
       }

If I wanted an array instead of a list, I might write the function as:[3]

   function invPowOfTwo(n,max)
       {
       var a = allocate(max);
       while (n < max)
           {
           a[n] = 1 / (2 ^ n);
           n = n + 1;
           }
       a;
       }

Either way, once I have my collection of fractions, I can now sum then with a general purpose summer:

   function sum(items)
       {
       if (items == :null)
           {
           0;
           }
       else
           {
           head(items) + sum(tail(items);
           }
       }

Whether I pass an array or list to sum, I get the total of all the items in the collection. .[4]

I could also have written sum iteratively:

   include("basics");
    
   function sum(items)
       {
       var i;
       var total = 0;
       for-each(i,items)
           {
           total = total + i;
           }
       total;
       }

This version will also sum up the collection of items in either list or array form, but for one of these forms, the process is rather slow (see the exercises).

The advantage of separating the generation of elements from the process of summing is this. Once we have our list of elements, we can do more things than just sum with them. We can visualize them, invert them, (after inverting) find the ones that are Mersenne primes plus 1, and so on. Likeise, we can use our summing function to sum up other kinds of collections.

Pseudo-infinite sequences[edit | edit source]

What if we go through the trouble a generating a set of elements to be summed or processed in some fashion and neglected to produce a large enough collection? We are stuck with regenerating the collection again, throwing away all the work we did previously.

The problem is we had to specify the max before we perform the summing:

   function invPowOfTwo(n,max)
       {
       if (n == max)
           {
           :null;
           }
       else
           {
           1 / (2 ^ n) join invPowOfTwo(n + 1,max);
           }
       }

It sure would be nice if we didn't have to specify in advance how large the collection should be. Not requiring a max also would simplify the code:

   function invPowOfTwo(n)
       {
       1 / (2 ^ n) join invPowOfTwo(n + 1);
       }

The problem is, without a base case, we will fall into an infinite recursive loop. However, we can avoid this pitfall through delayed or lazy evaluation.


At this point, you should familiarize yourself with lazy evaluation.


Let's write a version of invPowOfTwo that delays evaluation of the recursive call:

   function invPowOfTwo(n)
       {
       1 / (2 ^ n) join delay(invPowOfTwo(n + 1));
       }

Note the use of the delay function wrapped around the recursive call. What delay does is save everything needed to compute the recursive call without actually making the call. The actual call can be made later using the information saved by delay.

Let's look at the result of calling our new function:

   sway> var items = invPowOfTwo(0);
   LIST: (1.0000000000 # <THUNK 11167>)

We see the first element of our list is 1.0, as expected. Rather than seeing the rest of our list, however, we see that a thunk has been glued on as the tail of the list (the sharp sign signifies that the tail of the list is not a proper list). We can examine the thunk:

   sway> var t = tail(items);
   THUNK: <THUNK 11167>
    
   sway> pp(t);
   <THUNK 11086>:
       context: <OBJECT 11071>
       code: invPowOfTwo(n + 1)
   THUNK: <THUNK 11086>

We see that the code field of the thunk object holds the actual call. The context field holds an environment containing the values of n and invPowOfTwo, all that is needed to make the actual call.

A list of this sort is known as a delayed stream or stream for short. The stream metaphor is used since as long as there is source of water (memory), the stream will never run dry.

How do we get the elements other than the first element out of the stream? With the force function. The force function, when given a thunk as an argument, performs the evaluation that was delayed in the first place:

   sway> head(items);
   REAL_NUMBER: 1.0000000000
     
   sway> tail(items):
   THUNK: <THUNK 11167>
   sway> force(tail(items));
   LIST: (0.5000000000 # <THUNK 11193>)

By forcing the tail of the original list, we end up with a new list generated by the call invPowOfTwo(n + 1). When this call was delayed, n had a value of zero. Now that the call is being made, invPowOfTwo is passed a value of one. This generates a new list with at the head and another delayed recursive call. This time, delay will remember that n has a value of one instead of zero.

Let's define a function to obtain any element of a stream. We will pass in the index and then successively force the tail of the list until we get to the correct element:

   function streamIndex(s,i)
       {
       if (index == 0) // the head of the stream is desired
           {
           head(s);
           }
       else
           {
           streamIndex(force(tail(s)),i - 1);
           }
       }

Note that if we want element i of a list with more than i elements, that is the same as wanting element i - 1 of the tail of the list. When we take the tail of the list, the desired element appears to move one step closer to the front of the list. [5]

Now we can sum a stream:

   function sumStream(s,count)
       {
       if (count == 0)
           {
           0;
           }
       else
           {
           head(s) + sumStream(force(tail(s)),count - 1);
           }
       }

without having to worry about how many elements were originally generated. Iteratively, summing a stream might look like:

   function sumStream(s,count)
       {
       var total = 0;
       while (count > 0)
           {
           total = total + head(s);
           s = force(tail(s));
           count = count + 1;
           }
       total;
       }

Questions[edit | edit source]

Footnotes[edit | edit source]

  1. Assuming you don't believe in that magical marketing term, synergy.
  2. We saw this previously when we made a constant look like a term.
  3. Lists lend themselves to recursive solutions, arrays to iterative solutions.
  4. Don't try this with other languages! In almost every other language, the operators/functions for decomposing a list are different than decomposing an array.
  5. This is very important. Convince yourself this is so.

Do What You Can · Up The Down Staircase