# Recipes for the Design of Experiments/Chapter 3: Completely Randomized Designs

This is a sample recipe for a two-factor, multi-level experiment. It uses data from "Cars.csv" (observations with missing values in the raw data are dropped) to determine whether the model year of a vehicle or the country in which it was made has any effect on the vehicle's horsepower. The factor model year has 13 levels and the factor country of origin has 3 levels. This study does not involve randomization, because the dataset simply describes each car model. We assume that the sample cars were randomly picked before the model information was recorded. For example, for the Toyota Camry, cars of that type would have been randomly picked and tested, and the model's figures calculated from them. Replication is also not included in this work. Replication is the repetition of an experimental condition so that the variability associated with the phenomenon can be estimated; we do not have samples to replicate. An analysis of variance is performed to determine whether the variation in mean horsepower between vehicle samples is a result of the vehicle model year and the country of origin. We first test the horsepower means across model years with a one-way ANOVA. The null hypothesis is that the mean horsepower is equal across all years. The test gives a p-value < 0.0001, so we reject the null hypothesis and conclude that at least one yearly mean differs from the others. In the second test, we focus on the country factor, and the null hypothesis is that mean horsepower is equal across all countries of origin. Again, the ANOVA shows that at least one mean differs from the others (p-value < 0.0001). Finally, we analyze the variation in horsepower due to the interaction of model year and country of origin and obtain a p-value < 0.05, so the interaction term may have an effect on mean horsepower.
Tukey's Honestly Significant Difference test is performed to identify specifically which horsepower means differ significantly from the others. http://rpubs.com/maxwinkelman/32711
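
The one-way ANOVA used in this recipe compares between-group and within-group variability through an F statistic. The following is a minimal sketch of that calculation in plain Python, not the code used in the linked analysis (which was done in R). The function name `one_way_anova_f` and the example values are hypothetical and are not taken from "Cars.csv"; converting F to a p-value would additionally require the F distribution, which is left out here.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical values for three groups (illustration only)
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(example_groups)
print(f_stat, df_b, df_w)
```

For these made-up groups the between-group mean square is 13 and the within-group mean square is 1, so F is 13 (up to floating-point rounding) on 2 and 6 degrees of freedom. A large F relative to the F distribution is what leads to rejecting the null hypothesis of equal means.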

The following link leads to a three-factor, multi-level analysis. The data come from a study of traffic fatality rates, alcohol and drunk driving laws, and general demographic information by state over a number of years. In the study we perform an analysis of variance on three factors (minimum legal drinking age, whether or not the state mandates a jail sentence for offenses, and per capita personal income for the state) to determine whether they have an effect on the response variable, the traffic fatality rate (per 10,000) in that state. The data were obtained from the "Ecdat" package in R. It is not clear whether the data were gathered completely at random; the analysis proceeds under the assumption that they were. Regarding replicates and repeated measures, each configuration of input variables appears only once in the data, so there is no evidence of repeated measures. All data were analyzed together, so there was no blocking either. With this background, the analysis was divided into three steps: (i) Exploratory Data Analysis (EDA), (ii) Hypothesis Testing using ANOVA, and (iii) Diagnostics and Model Accuracy Checking. http://rpubs.com/Tothk2/DOERecipe3

In this study, a two-factor, multi-level experiment is performed (using Lahman's Baseball Database) to see whether the number of hits earned by a given team in a given season, the number of home runs earned by a given team in a given season, or both via interaction, has a statistically significant effect on the number of wins that team earns in that season. In the dataset, the factor ‘H’ is the number of hits that a team earned in a given year and the factor ‘HR’ is the number of home runs that a team earned in a given year. The response variable is ‘W’, the number of regular season wins that a team earned in a given year. To assess significance, an ANOVA is performed and Tukey Honest Significant Differences are computed.

http://rpubs.com/howelb/46076

This analysis is a three-factor, multi-level design that uses data from a CS-A exam. The data cover all 50 states and were recorded over a span of several years. We take a subset of this data to analyze the effect of 'yield per teacher' and the percentage of female students passing on the performance of all minority groups on this CS exam. The third factor is the period, that is, each successive year. The rationale is that, with the increasing use of computers in day-to-day life each year, the effect should also be visible in education, for example in the performance of minority communities on a CS exam. The design of this experiment focuses on ANOVA testing along with Tukey's HSD test. http://rpubs.com/Uzma_1004/32868

The following link leads to a two-factor, multi-level data analysis. The data of interest are a series of absorbance values obtained via Fourier Transform Infrared Spectroscopy to evaluate the presence of residual solvent after biomaterial fabrication. The experimental runs were done in random order. Each treatment has 11 replicates, and the average is used in the analysis. The analysis uses a fixed-effects model in which the factors "Treatment" and "Day" account for the explained variance in the data. It centers on an analysis of variance, extended with Tukey's Honest Significant Difference test to identify significant differences between the levels of each factor. http://rpubs.com/adamato/32887

The following is a three-factor, multi-level data analysis. The dataset, obtained from the WHO website, contains information about cigarette consumption. The aim of the analysis is to test three factors that may influence the percentage of people who smoke: region, gender, and years of education. Models with combinations of these factors are also analyzed by ANOVA in order to explain the variance in the smoking percentage. Tukey's test and other model-checking methods are used to select the model and check its adequacy. This analysis may have practical value for reducing smoking in the future. http://rpubs.com/chenh16/32918

The following analysis is a multi-factor, multi-level analysis of variance. The data were gathered by the World Health Organization and look at the effect of gender, country, region, income group, and other factors on mortality rate (out of 1000 individuals). In this ANOVA, mortality rate was taken as the response variable, and models were designed to analyze which factors might cause the variation between groups. A Tukey test was then used to look at the differences between the treatment levels in the groups. Model adequacy testing showed that the data are not normally distributed and therefore do not satisfy the assumptions needed for an ANOVA. Because of this, a Kruskal-Wallis test was performed as a non-parametric alternative to ANOVA. http://rpubs.com/braunj6/32931
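
The Kruskal-Wallis test mentioned above replaces the observations with their ranks in the pooled sample, which is why it does not rely on normality. As a rough illustration (not the code from the linked analysis), the sketch below computes the H statistic in plain Python. The helper names and the example data are hypothetical, and the usual correction for ties is omitted for brevity.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction)."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Hypothetical, fully separated groups (illustration only)
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(h_stat)
```

Under the null hypothesis, H is compared with a chi-squared distribution with (number of groups − 1) degrees of freedom; that lookup is not included in this sketch.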

The following link leads to a two-factor, multi-level experiment. The dataset is the California Test Score Data Set from the Ecdat package in R. The data come from schools in California in the years 1998 and 1999. The dataset has 17 continuous variables. For this experiment, the two factors are the continuous variables number of computers per student and student-teacher ratio. The response variable is the average reading score for the school. Because the dataset doesn't contain all schools in California, we assume that the schools were randomly selected by some type of sampling design.

ANOVA was used to analyze whether the variation in average reading score could come from the variation in the number of computers per student or from the variation in the student-teacher ratio. The null hypothesis was that the variation in average reading scores cannot be attributed to the variation in either the number of computers per student or the student-teacher ratio. The alternative hypothesis was that the variation can be attributed to the number of computers per student or the student-teacher ratio.

Three different ANOVA tests were used. The first tested whether the variation in the average reading score could come from the variation in the number of computers per student. Based on its results, the null hypothesis is rejected: the average reading score may be explained by something other than randomization, such as the number of computers per student. The second tested whether the variation in the average reading score could come from the variation in the student-teacher ratio. Here the null hypothesis is also rejected: the average reading score may be explained by something other than randomization, such as the student-teacher ratio. The third tested whether the variation in the average reading score could come from the interaction of the number of computers per student and the student-teacher ratio. In this test it is still possible to attribute the average reading scores to the number of computers per student or the student-teacher ratio individually. For the interaction of the two factors, however, the variation cannot be attributed to anything other than randomization. http://rpubs.com/tranc3/32941

The following analysis is a multi-factor, multi-level analysis of variance. The data were gathered by data collectors in the Metropolitan California area and give some insight into the air quality in an area based on many factors. In this ANOVA, air quality is the response variable, and the factors tested include the location and the amount of rain in the area. A Tukey test was then used to validate the model and check model adequacy; it identifies significant differences between the levels of the factor being analyzed. http://rpubs.com/macchm/32950

The following link looks at a set of statistics on baseball catchers and examines the effect of three individual catcher statistics on their teams' ERA, or runs given up. The statistics in focus are Errors (E), Put Outs (PO), and Stolen Bases (SB). These factors represent the catcher's attributes that could have a direct impact on runs scored by the opposing team. A multi-factor, multi-level ANOVA is performed. Multiple models are created to examine the effects, and the models are then checked to ensure they satisfy the assumptions. Tukey's range test compares every combination of the factors at each level to test for a difference in means between groups. Interactions between the factors are also analyzed, to account for possible relationships among them: http://rpubs.com/svoboa/33092

This study explores data on Advanced Placement passing rates to evaluate the effect of various covariates. The main objective is to assess whether two factors, number of schools and number of exams given, influence the total number of exams passed in each state. A preliminary analysis of variance was conducted to determine treatment effects on the dependent variable. Both factors were then tested for independent effects, and interaction effects were estimated using a blocking technique. The model was evaluated for adequacy by examining normality, a fitted-versus-residuals plot, and a TukeyHSD plot. Finally, an interaction plot was constructed to observe any potential interaction effects between the factors. https://rpubs.com/manzat/32405

This study examines the effect that the “colour” and “clarity” of a diamond have on its price. It consists of a multi-factor, multi-level analysis of variance performed on previously collected data. Because we were not present for the experiment (data collection), a completely randomized design cannot be assumed. A completely randomized design requires that the tests be conducted, or the data collected, in random order. This is typically done by randomly assigning a run order to all the experiments before the data are collected. The way the data are organized suggests that the order was not random, although the data could also have been sorted after collection. The price of a diamond is defined by the 4 C’s (Carat, Clarity, Colour and Cut), and the analysis was done on two of these factors. Carat could also have been included as a factor, since it is in the data set, but two factors may have been deemed sufficient for educational purposes. In the fitted aov models, the p-value for the “clarity” factor approaches 0 and its F-value is fairly large, so the alternative hypothesis (there is a relation between “clarity” and “price”) can be accepted. For “colour”, however, the p-value is 0.095 and the F-value (1.9) is smaller. When the interaction effect is tested, the p-value is small (0.016) and the F-value is 1.85, which indicates that there may be interaction effects on the price. Depending on the thresholds used, the result shows that the clarity of the diamond and/or the interaction between clarity and colour may help to explain the variance in diamond pricing. http://rpubs.com/serena049/doehw3
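
The random run order that defines a completely randomized design can be generated before any data are collected. The snippet below is a minimal sketch of that step; the level labels, the number of replicates, and the seed are all hypothetical choices for illustration and do not come from the diamond data set.

```python
import random
from itertools import product

# Hypothetical factor levels and replicate count (illustration only)
clarity_levels = ["low", "medium", "high"]
colour_levels = ["D", "E", "F"]
replicates = 2

# Every treatment combination, repeated once per replicate
runs = [
    (clarity, colour, rep)
    for clarity, colour in product(clarity_levels, colour_levels)
    for rep in range(1, replicates + 1)
]

# Shuffle the full list so each run is equally likely in each position;
# the fixed seed only makes this illustration reproducible
rng = random.Random(2024)
run_order = rng.sample(runs, k=len(runs))

for position, run in enumerate(run_order, start=1):
    print(position, run)
```

Data that arrive already sorted by factor level, as appears to be the case here, cannot be checked against such a plan, which is why randomization has to be assumed rather than verified.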

The following analysis of a completely randomized experiment design uses two-factor ANOVA to examine the effect of the mother's race and the frequency of doctor visits on the birth weight of babies. This is of particular importance because low birth weight is associated with high rates of infant mortality and birth defects. The 189 observations give information on the mother’s behavior during pregnancy, which is a strong predictor of the health of a newborn. Predictor variables, or effects, are thought to be smoking habits, dietary habits, and level of prenatal care. The factors of interest in this case are race and level of prenatal care, described by the variables RACE and FTV (first trimester physician visits). Bar plots compare the race of the respondents with the frequency of visits. The response variable (birth weight, denoted BWT) is measured in grams, and descriptive statistics are presented. A randomized experiment is set up to test the hypothesis that either RACE or FTV can predict BWT. Boxplots show a wide range of BWT for white newborns and a narrower spread, with an outlier, for black and ‘other’ newborns. The QQ plot suggests approximate normality in the data, although the fit is not perfect. There may be slight interactions between RACE and FTV, but the effect is not immediately clear. The null hypothesis, that randomization alone could account for the variation in BWT, may be rejected. RACE is a significant factor with a p-value of .0071, but neither FTV nor the FTV*RACE interaction was found to be significant. The Tukey comparison highlights a difference in mean birth weight between white mothers and the ‘black/other’ categories. Plotting the residuals supports the model's assumption of normality. http://rpubs.com/konraz/39538

**Fall 2016 Projects**

The following link examines the Star dataset from the “Ecdat” package. The Star data set is used for examining “Effects on Learning of Small Class Sizes” and consists of 8 variables and 5748 observations. From these, 4000 observations were selected at random, and Class type (3 levels), Teacher experience (2 levels), Gender (2 levels), and Free lunch support (2 levels) were defined as factors. The response variable is the student's Math Score, which is continuous. For the main and interaction effects, the null hypothesis was set as: “learning in small classes is not affected by class type, gender of the students, experience of the teacher, free lunch qualification, or any two-way interactions of these factors.” The ANOVA results showed that all main effects and some of the interaction effects are significant. Main effects and interaction effects were calculated and also presented with the pid package. Model adequacy was confirmed by evaluating normality and homogeneity with QQ and residuals-versus-fitted plots. http://rpubs.com/unnuk/216193

This link examines a dataset entitled Wages from the "Ecdat" package available in R. The dataset is the result of a survey of 595 heads of household over the course of 7 years, for a total of 4165 observations. Of the original 11 factors, 4 factors of 2 levels each were chosen: Blue Collar, South, Sex, and Union-Set Wages. Main effect and interaction plots were produced, and the statistical significance of the results was tested using one- or two-factor ANOVA. Significance was found for three main effects and four of the six interaction effects, although some interaction effects were small. http://rpubs.com/mtwassick/217369

The following link examines the Global health data, one of the 100 interesting data sets. Within the Global health data, ‘Mental health’ is selected, which includes Mental health governance (3 factors: legislation, plan, and policy), Human resources (1 factor: psychiatrists), and Suicide rates (1 response variable). The data set was collected to examine the effects of mental health care (governance and human resources) on the suicide rate and consists of 5 variables and 160 observations. We use 4 factors with two levels each and analyze their main and interaction effects on the suicide rate. The main effects of legislation and psychiatrists are statistically significant even at the 1% significance level. However, no interaction effects are statistically significant at the 5% significance level in this experiment. http://rpubs.com/bokjh3/217510

This experiment examines the effects of several childhood experiences on future wages. The data were taken from a study on wages, schooling, and proximity to college campuses, collected from individuals in the United States in 1976. This is the Schooling data set in the Ecdat package, with 3010 observations and 28 variables. Four factors were studied, each with the two levels yes and no: whether or not a person grew up in a metropolitan statistical area (an area with a relatively high population density), whether or not a person grew up in close proximity to a 4-year college, whether or not a person had a library card at age 14, and whether or not a person had a single mother at age 14. The response variable is the log of wages, a continuous numerical value with a normal distribution. The study walks through the experimental design and exploratory data analysis, and tests the four main effects and six interaction effects being examined. All four main effects were found to be significant at the 5% significance level, and one of the interaction effects was found to be significant at this level. http://rpubs.com/shamuswheeler/217564
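
The count of four main effects and six two-way interaction effects follows directly from having four factors. The short sketch below enumerates them, along with the 16 treatment combinations of a full two-level design; the short factor names are hypothetical labels chosen for this illustration, not the column names of the Schooling data set.

```python
from itertools import combinations, product

# Hypothetical short names for the four two-level factors
factors = ["metro", "near_college", "library_card", "single_mother"]

# One main effect per factor
main_effects = [(f,) for f in factors]

# One two-way interaction per unordered pair of factors: C(4, 2) = 6
two_way_interactions = list(combinations(factors, 2))

# Every yes/no combination of the four factors: 2**4 = 16 cells
cells = list(product(["no", "yes"], repeat=len(factors)))

print(len(main_effects), len(two_way_interactions), len(cells))
for pair in two_way_interactions:
    print(" x ".join(pair))
```

Higher-order interactions (three-way and four-way) would be enumerated the same way with `combinations(factors, 3)` and `combinations(factors, 4)`; the study described above restricts itself to main effects and pairs.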

In this study, we perform a statistical analysis of the survivors of the Titanic using the Titanic dataset on Kaggle. The main question we address is whether there is a statistically significant relation between a person's survival and their passenger class, age, sex, and/or the port where they embarked. http://rpubs.com/prasanna_date/217915