Introduction to Chemical Engineering Processes/Excel

From Wikibooks, open books for an open world
Jump to navigation Jump to search

Introduction to Spreadsheets[edit | edit source]

This tutorial probably works with other spreadsheets (such as w:open office) with minor modifications.

A spreadsheet such as Excel is a program that lets you analyze moderately large amounts of data by placing each data point in a cell and then performing the same operation on groups of cells at once. One of the nice things about spreadsheets is that data input and manipulation is relatively intuitive and hence easier than doing the same tasks in a programming language like MATLAB (discussed next). This section shows how to do some of these manipulations so that you don't have to by hand.

Anatomy of a spreadsheet[edit | edit source]

A spreadsheet has a number of parts that you should be familiar with. When you first open up the spreadsheet program, you will see something that looks like this (the image is from the German version of open office)


First off, notice that the entire page is split up into boxes, and each one is labeled. Rows are labeled with numbers and columns with letters. Also, try typing something in, and notice that the box above the spreadsheet (to the right of ) will change automatically as you type. When you're just putting in numbers, this info box will just have the same number in there. But when you're putting in formulas, the cell will display the value calculated from the formula, while the info box will display what the formula was.

Inputting and Manipulating Data in Excel[edit | edit source]

The first step in any spreadsheet analysis is to input the raw data you want to analyze. It is most effective if you put it in columns, with one column for each variable. It lets you see more data at once, and it also is less limited because the maximum number of rows is much larger than the maximum number of columns.

It is good practice to use the first row for the names of the variables, and the remaining for the data points. Make sure you include units. In this section, the following data will be used as illustration:

(Rownumber) Column A Column B
1 t(min) D (yards)
2 1.1 559.5
3 1.9 759.5
4 3.0 898.2
5 3.8 1116.3
6 5.3 1308.7

Using formulas[edit | edit source]

In order to tell the spreadsheet that you want to use a formula rather than just enter a number, you have to start the entry with an equal sign (=). You can then use combinations of decimal values and cell designations. A cell designation is simply the column letter followed by the row number containing the value you wish to manipulate. For example, if you wanted to find the product of the distance traveled and the time spent traveling, you could put in the formula:

= A2*B2

into any empty cell and it would give you the answer. From here out it will be assumed that this value is in cell C2. You should label the column with the type of calculation you're performing.

Performing Operations on Groups of Cells[edit | edit source]

The question may arise: why not just put in the numbers themselves instead of referencing the cell? There are two major reasons for this:

  1. If you change the value in the referenced cell, the value calculated in the formula will automatically change.
  2. The built-in dragging capability of most spreadsheets.

The dragging capability is a simple concept. If you have put a formula into a spreadsheet, you can have it copied to any number of cells you want. To do this, select the cell with the formula and bring the mouse pointer to its lower-right hand corner. You should see a dark + icon:

Info Bar "=A2*B2"
(rownum) Column A Column B Column C
1 t(min) D(yards) t*D
2 1.1 559.5
| 625.45 |
3 1.9 759.5
4 3.0 898.2
5 3.8 1116.3
6 5.3 1308.7

Click on the + and drag it down. This will cause the formulas to change according to how you drag the box. In this case, if you drag it down to row 6, the spreadsheet will produce the following:

    A            B              C
1  t (min)      D (yards)      t*D
2  1.1         559.5         615.45
3  1.9         759.5         1443.05
4  3.0         898.2         2694.6
5  3.8         1116.3         4241.94
6  5.3         1308.7         6936.11

If you click on the last value in column C (6936.11) the info bar will display:


This is very useful for performing the same operations on multiple sets of data at once; rather than having to do the multiplication 5 separate times here, we just do it once and drag down the box.

Special Functions in Excel[edit | edit source]

In order to do many mathematical operations in Excel (or at least the easiest way), it is necessary to use functions (not to be confused with formulas). A function is simply an implementation someone already wrote for the mathematical operation, so all you have to do is know how to tell it to do the operation and where to put it when it's done. In excel, you can call a function named "function" by typing the following into a cell:


The function will then execute, and the cell containing the call will display the answer. The necessary inputs are sometimes numbers but are more often the cell addresses. For example, in the data above, say you wanted to take the exponential () of all the time points in column A, and place the result in column D. The function for exponential is exp, and it can only accept one input at a time, but due to the dragging capability of Excel this will not matter much, you can just call it once and then drag the cell as you would with any formula containing cell addresses. So to do this you would type into cell D2:


Hit enter, then click the + in the bottom right and drag the cell down. You should end up with something like this after labeling the D column appropriately:

         A          B            C          D
1       t (min)   D(yards)     t*D       e^t
2       1.1      559.5       615.45    3.004166024
3       1.9      759.5       1443.05   6.685894442
4       3        898.2       2694.6    20.08553692
5       3.8      1116.3      4241.94   44.70118449
6       5.3      1308.7      6936.11   200.33681

All excel functions output only one value at a time, though some can accept multiple cells at a time as input (mostly statistical functions).

Following is a brief synopsis of the functions available. For a complete list, see the help files for your spreadsheet, as the availability of each function may vary depending on which one you are using. CELL signifies either the row/column designation of the cell you want to pass to the function as input, or some numerical value you enter manually.

Mathematics Functions[edit | edit source]

Generally these only take one input at a time.

abs(CELL): Absolute value of CELL
sqrt(CELL): Square root of CELL [to do nth roots, use CELL^(1/n)]
ln(CELL): Natural log of CELL
log10(CELL): Log of CELL to base 10
log(CELL, NUM): Log of CELL to the base NUM (use for all bases except e and 10)
exp(CELL): Exponential(e^x) of CELL. Use since Excel doesn't have a built-in constant "e".
sin(CELL), cos(CELL), tan(CELL): Trigonometric functions sine, cosine, and tangent of CELL. CELL must be in radians
asin(CELL), acos(CELL), atan(CELL): Inverse trigonometric functions (returns values in radians)
sinh(CELL), cosh(CELL), tanh(CELL): Hyperbolic functions
asinh(CELL), acosh(CELL), atanh(CELL): Inverse hyperbolic functions

Statistics Functions[edit | edit source]

These are examples of useful statistics functions in Excel, they are not by any means the only ones.

GROUP means a group of cells that are directly next to each other. Define a group by the syntax FIRSTCELL:LASTCELL, for example, using GROUP = A2:A5 passes all the cells between A2 and A5 (inclusive) to the function. If a function requires two different groups (for example, a y and an x), both groups must be within continuous groups of cells.

average(CELL1, CELL2, ...) OR average(GROUP): Computes the arithmetic average of all inputs.
intercept(GROUP1, GROUP2): Calculates the y-intercept (b) of the regression line where y = GROUP1 and x = GROUP2.
   GROUP1 and GROUP2 must have the same size.
pearson(GROUP1, GROUP2): Calculates the Pearson correlation coefficient (R) between GROUP1 and GROUP2.
stdev(CELL1, CELL2, ...) OR stdev(GROUP): Computes the sample standard deviation (divides by n-1) of all inputs.
slope(GROUP1, GROUP2): Calculates the slope (m) of the regression line where y = GROUP1 and x = GROUP2. 
   GROUP1 and GROUP2 must have the same size.

Programming Functions[edit | edit source]

Solving Equations in Spreadsheets: Goal Seek[edit | edit source]

Excel and possibly other spreadsheets have a very useful tool called goalseek which allows the user to solve single-variable equations (and can be used as an aid in guess-and-check for systems of algebraic equations). Let's suppose for the purposes of this tutorial that you wish to find a solution to the equation:

In order to set up the problem in Goalseek, it is necessary to define a cell for the variable you want to change (X) and a cell for the function you want to evaluate.

Goalseek will only work if you tell it to evaluate some function until it reaches a constant value. You cannot tell it to equal something that can change, so for example you cannot do something like this:

because neither side is a constant. The easiest way around this is generally to solve the function for zero and then use that as the evaluating function.

Here, we could set up the cells as follows:

   A    B          
1  X    f(X)
2  -1    =A2^3 + 2*A2^2 - A2 + 1

To solve this one, go to

Tools > Goalseek...

It'll give you three boxes: "Set Cell", "To Value", and "By Changing Cell". Since we want the value in cell B2 to equal 0, enter B2 into the "Set Cell" box and 0 into the "To Value" box. Since cell B2 depends on cell A2, we want to change A2 so that B2 equals 0. Hence, the "By Changing Cell" box should contain A2. Put that in and click "OK", and Goalseek will converge to an answer:

   A           B          
1  X          f(X)
2  -2.54683   -0.00013

Notice that the success of the goalseek depends on what your initial guess was. If you try to put in an initial guess of 0 in this example (instead of -1), goalseek will diverge. It will tell you so, saying "Goal Seeking with Cell B2 May Not have Found a Solution". However, the algorithm is generally fairly robust so it shouldn't take too many guesses to obtain convergence.

You can only enter one cell into each of the "Set Cell" and "By Changing Cell" boxes, and the value in "To Value" must be a constant


Graphing Data in Excel[edit | edit source]

In Excel, there are a variety of ways to graph the data you have inserted, such as bar graphs, pie charts, and many others. The most commonly-used in my experience is the scatterplot, which is the name Excel uses for the typical x-y "line graph" plot that you probably think of first when you think of a graph.

Scatterplots[edit | edit source]

Scatterplots can be made relating any one independent variable to any number of dependent variables, though if you try to graph too many it will get crowded and hard to read. Excel will automatically give each different dependent variable a different color and a different shape, so that you can distinguish between them. You can also name each "series" of data differently and Excel will automatically set up a legend for you.

This is how to make a scatterplot:

  1. Put the data into columns just like it was given in the problem statement.
  2. Now we need to set up the graph. Go to: .
  3. Select "XY (scatter)" and click "next".
  4. Click the "series" tab (on top). If there are any series present, remove them with the remove button (since it usually guesses wrong what you want to graph).

Now we can add a series for each dependent variable we want to graph as follows:

  1. Click "add".
  2. Next to "X values" click the funky arrow symbol to the right of the text box. A small box will pop up.
  3. Click on the first value for the independent variable and drag the mouse down to the last value. Click the funky symbol again to bring you back to the main window.
  4. Do the same thing with the "Y values" but this time you want to select the values of the dependent variable.
  5. Click next, and give the graph a title and labels if you want. Then click next and "finish" to generate your graph.

Performing Regressions of the Data from a Scatterplot[edit | edit source]

Once you have a scatterplot of your data, you can do one of several types of regression: logarithmic, exponential, polynomial (up to 6th degree), linear, or moving-average. Excel will plot the regression curve against your data automatically, and (except for moving average) you can tell it to give you an equation for the curve. To do this:

  1. Right click on one of the data points (it doesn't matter which). Click "add trendline..."
  2. A new window will come up, asking you for the type of regression. Choose the type of regression you want to use.
  3. Click on the "options" tab, and check the "Display Equation on Chart" box (and, if you want, the "Display R-squared value on Chart" box). Click OK.

If you chose a "linear" regression with the sample data above, the equation and value appear on the graph as . Note Excel displays rather than R (so that we don't need to worry about negative vs. positive values); if you want R just take the square root, which is 0.9921 as we calculated in the section on linear regressions.

Further resources for Spreadsheets[edit | edit source]

Excel and other spreadsheets can do far more than what is described here. For additional information, see Microsoft Office, w:Excel, or the help files for the program you are using.