Recipes for the Design of Experiments/Chapter 5: Blocked Designs with multiple explanatory and nuisance factors

From Wikibooks, open books for an open world

This recipe examines fuel economy data for a large sample of vehicles. The data was obtained by the US EPA and contains a number of factors, many with a large number of levels. This experimental analysis focuses on the effects of fuel type, drive type, and transmission type on the response variable, highway gas mileage. The novelty of this analysis lies in blocking on two variables, the year and the make of the vehicles, to reduce the variation in the response caused by these two nuisance factors.

The recipe uses data taken from 'storms' to formulate a multi-factor, completely randomized design. Statistical analysis is used to determine the effect of storm year, day, and hour on the storm pressure (millibars). Two further factors, storm type and storm month, serve as blocking factors. Several ANOVAs are conducted to determine whether the variation in storm pressure can be attributed to the variation in these factors. Tukey's HSD is also used to determine which pressure means differ significantly from each other.

The following analysis was completed using the EPA's fuel economy data. The design is a completely randomized block "pseudo-design." Two blocking variables were selected and then tested for their adequacy as blocking variables: the vehicle make and the fuel type. Three factors of interest were also selected: drive type, transmission, and vehicle class. These were used as the factors in the ANOVA, with the two blocking variables included as blocks. The response variable in this analysis was the average city gas mileage of the car. Tukey tests were used to examine the differences in means between levels of the same factor, and nonparametric tests were carried out because the data were non-normal.

This recipe includes an analysis of the fuel economy dataset initially provided by the US EPA and compiled by Hadley Wickham. The experiment uses a completely randomized block design and seeks to understand which factors have an effect on highway fuel efficiency, measured in miles per gallon. The factors under test are Year, Class, and Transmission Type. The nuisance factors blocked on to reduce unwanted variation and improve model accuracy were Engine Displacement and Number of Cylinders, chosen because they showed low within-group variation and high between-group variation. Analysis of Variance is used to determine the individual factor effects, as well as the interaction effects on the response, for the years 1985 and 1986. These years were chosen because they showed significant improvements in fuel efficiency over previous years. This research seeks to understand which combination of factors contributed most to this dramatic increase.
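The blocking criterion described above (low within-group variation, high between-group variation) can be illustrated with a short sketch. The Python below is not taken from the recipe; the function name and the mpg values are made up for illustration. It computes the between-group and within-group mean squares for one candidate blocking variable, and a large ratio of the two suggests the variable may be a useful block.

```python
from statistics import mean

def between_within_mean_squares(groups):
    """Return (MS_between, MS_within) for a dict: block level -> responses.

    Hypothetical helper for screening a candidate blocking variable.
    """
    all_values = [y for ys in groups.values() for y in ys]
    grand = mean(all_values)
    k = len(groups)       # number of block levels
    n = len(all_values)   # total number of observations

    # Variation of the block means around the grand mean
    ss_between = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in groups.values())
    # Variation of the observations around their own block mean
    ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)

    return ss_between / (k - 1), ss_within / (n - k)

# Made-up highway mpg values, grouped by number of cylinders (illustrative only)
hwy_by_cyl = {
    4: [30.0, 32.0, 31.0],
    6: [24.0, 25.0, 26.0],
    8: [18.0, 19.0, 17.0],
}

ms_b, ms_w = between_within_mean_squares(hwy_by_cyl)
f_ratio = ms_b / ms_w
print(f"MS between = {ms_b:.2f}, MS within = {ms_w:.2f}, ratio = {f_ratio:.2f}")
```

With real data the same comparison would be run for each candidate variable, and those with the largest ratios considered as blocks.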

In this project, we use the “2014 EPA fuel economy” dataset to construct a completely randomized block “pseudo-design” with three factors and two blocks. The purpose of the experiment is to test whether the number of cylinders, the drive type, and/or the transmission type contribute to the variation in vehicle highway fuel economy, while controlling for two blocks: year and vehicle make. The results show that adding blocks to the experiment helps to increase the precision of the model estimates, and that the variation in highway fuel economy is not due to randomization alone.

The following analysis is based on the 'fueleconomy' dataset and investigates how different factors influence vehicle city fuel economy. Here we use a completely randomized block design, selecting three factors under test (transmission type, drive train, and vehicle size class) and two blocks (fuel type and number of cylinders). Factors and blocks are chosen based on exploratory analysis. Through ANOVA, Tukey's test, and model adequacy checking, it is shown that vehicle city fuel economy can be explained by the factors and their interactions rather than by randomization.

Using fuel economy data collected by the EPA from 1985-2015, a completely randomized three-factor, multi-level block design is created to see if the "make," the "drive type," or the "transmission" of a vehicle has a statistically significant effect on the "city fuel economy" (in miles per gallon, 'mpg') of that vehicle. This analysis includes two blocking variables (as a means of controlling any nuisance factors that may exist in this experiment), which are the "year" and the "fuel type" of each vehicle being considered. To assess significance, an ANOVA is performed and Tukey Honest Significant Differences are computed.

This analysis uses fuel economy survey data collected by the EPA, covering the years 1985 to 2015. It is a completely randomized three-factor, multi-level design with two blocking variables. We study the effects model with 'transmission type', 'drive type', and 'cylinder number' as factors and the 'city mileage' of each vehicle as the response variable. However, there are various other independent variables in the dataset that are known to have a significant impact on the response variable. Through this analysis we identify those factors (often called nuisance variables) and block on two of them (fuel type and make of the vehicle), since they are not of interest for this specific analysis. ANOVA (analysis of variance) is performed within each block, and Tukey's Honest Significant Differences test is also carried out.

This recipe analyses the storms dataset. This experiment is a multilevel, multifactor analysis with two blocking factors, intended to determine whether month, day, and hour have an effect on wind pressure when blocked by year and type of storm. An ANOVA and a Tukey test were performed on each block and factor in order to analyze these effects and variances.

The following recipe uses fuel economy data that was collected by the EPA between 1985 and 2014. The experiment is a multi-level, three-factor analysis using two blocking factors. The study examines fuel type, year, and drive type to determine if they have an effect on the response variable, highway gas mileage. The data is blocked by two factors, cylinders and displacement, each of which has two levels. The blocking factors are first checked to see if they are "good" blocking factors, and then an analysis of variance is performed, followed by model adequacy checking.

The recipe below examines the flights dataset from the nycflights13 package. A three-factor analysis blocked by two additional factors is set up and analyzed. The design seeks to determine whether the airline carrier, month, and hour of departure have an effect on the arrival delay of flights out of NYC airports, with the origin and destination cities used as the blocking factors. The data is subset because each factor had too many levels to run an interaction model. Initial analysis is performed, ANOVA models are created, model adequacy is checked, and contingencies are discussed and performed.

This recipe examines the vehicle data from the fueleconomy package. This dataset contains fuel economy data from vehicle testing done at the Environmental Protection Agency's National Vehicle and Fuel Emissions Laboratory in Ann Arbor, Michigan. The experiment tests the effect of three factors and two blocking factors on city fuel economy.

This recipe examines and analyzes the fuel economy dataset. The data was collected by the EPA between 1985 and 2014. The experiment is a completely blocked design with multiple explanatory and nuisance factors, specifically blocked with 2 factors and 3 levels. This study examines the effects of gas mileage, specifically city gas mileage, and year and make of vehicle to determine if they have an effect on each other. The data is blocked by year and make of vehicle. An analysis of variance is performed, followed by model adequacy checking.

The following analysis of a completely randomized experiment uses a five-factor ANOVA to examine the effect of drive type, number of cylinders, and fuel type on city gas mileage, blocking on two additional factors, year and car class, to increase precision.
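Several of the summaries above state that blocking increases precision. The following Python sketch illustrates why, under simplifying assumptions that are not drawn from any of the recipes: a single factor, a single block, one observation per cell, and made-up mpg values. The function name is hypothetical. The block sum of squares is removed from the error term, so the F statistic for the factor is computed against a smaller residual.

```python
from statistics import mean

def rcbd_sums_of_squares(table):
    """Sums of squares for a randomized complete block design.

    table[i][j] is the response for factor level i in block j
    (one observation per cell). Returns (SS_total, SS_factor, SS_block, SS_error).
    """
    a = len(table)      # number of factor levels
    b = len(table[0])   # number of blocks
    grand = mean(y for row in table for y in row)
    factor_means = [mean(row) for row in table]
    block_means = [mean(table[i][j] for i in range(a)) for j in range(b)]

    ss_total = sum((y - grand) ** 2 for row in table for y in row)
    ss_factor = b * sum((m - grand) ** 2 for m in factor_means)
    ss_block = a * sum((m - grand) ** 2 for m in block_means)
    ss_error = ss_total - ss_factor - ss_block
    return ss_total, ss_factor, ss_block, ss_error

# Made-up city mpg: rows = two drive types, columns = three model-year blocks
mpg = [
    [20.0, 25.0, 28.0],
    [23.0, 26.0, 31.0],
]
a, b = len(mpg), len(mpg[0])
ss_total, ss_factor, ss_block, ss_error = rcbd_sums_of_squares(mpg)

ms_factor = ss_factor / (a - 1)
# With blocking: error has (a - 1)(b - 1) degrees of freedom
f_blocked = ms_factor / (ss_error / ((a - 1) * (b - 1)))
# Without blocking: block variation stays in the error term, df = a*b - a
f_unblocked = ms_factor / ((ss_error + ss_block) / (a * b - a))

print(f"F with blocking    = {f_blocked:.2f}")
print(f"F without blocking = {f_unblocked:.2f}")
```

In this toy data most of the total variation sits between the year blocks, so ignoring the block leaves the factor effect buried in the error term. The recipes themselves use more factors, more blocks, and unbalanced observational data, for which a statistical package's ANOVA routine would be the appropriate tool.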