Statistics for Sociology/Presenting Data

Creating a Frequency Distribution in R

It is often useful to create a frequency distribution for a variable, particularly a nominal or ordinal one, to see how many cases fall into each category. In R, a frequency distribution can be created with the following command:

table(dataset$variable)

To cross-tabulate two variables, use the same command with both variables:

table(dataset$variable1, dataset$variable2)

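To convert the table into proportions, first save the table as an object, then pass it to prop.table. The second argument selects the margin: 1 gives row proportions, 2 gives column proportions, and omitting it gives proportions of the grand total:

frequencytable <- table(dataset$variable1, dataset$variable2)
prop.table(frequencytable, 1)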
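
The commands above can be sketched end to end on a small made-up data set. The data frame `survey` and its columns `gender` and `employed` are hypothetical and simply stand in for `dataset`, `variable1`, and `variable2`:

```r
# Hypothetical example data: six respondents, two nominal variables
survey <- data.frame(
  gender   = c("F", "M", "F", "F", "M", "M"),
  employed = c("yes", "yes", "no", "yes", "no", "no")
)

# One-way frequency distribution
gender_counts <- table(survey$gender)
print(gender_counts)

# Two-way table: rows are gender, columns are employment status
crosstab <- table(survey$gender, survey$employed)
print(crosstab)

# Row proportions (margin 1): each row sums to 1
row_props <- prop.table(crosstab, 1)
print(round(row_props, 2))

# Column proportions (margin 2): each column sums to 1
col_props <- prop.table(crosstab, 2)
print(round(col_props, 2))
```

Row proportions answer "of the women, what share is employed?", while column proportions answer "of the employed, what share is women?", so the choice of margin depends on the question being asked.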

NOTE: By default, R excludes missing values from tables. To include them, add the option exclude = NULL to the command, like this:

table(dataset$variable, exclude = NULL)
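
To see the difference, here is a minimal sketch using a made-up vector `religion` (a hypothetical name) that contains two missing values. It also shows `useNA = "ifany"`, another option of `table` that behaves the same way in this case:

```r
# Hypothetical responses with two missing values
religion <- c("Catholic", "None", NA, "Protestant", "None", NA)

# Default: NA values are dropped, so only four cases are counted
default_counts <- table(religion)
print(default_counts)

# exclude = NULL keeps NA as its own category, so all six cases are counted
with_missing <- table(religion, exclude = NULL)
print(with_missing)

# useNA = "ifany" adds an NA category only when missing values are present
with_missing2 <- table(religion, useNA = "ifany")
print(with_missing2)
```

Comparing the totals of the two tables is a quick way to check how many cases are missing on a variable.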

Creating Charts and Graphs in R

Line Charts

The default plot command creates a scatterplot of your variable (VAR) against its index, with automatically chosen scales for the two axes and no line connecting the plotted values:

plot(VAR)

To connect the points with a line, use type "o":

plot(VAR, type="o")

To add color, use the col argument:

plot(VAR, type="o", col="blue")

To adjust the scales of the axes, use the ylim or xlim arguments:

plot(VAR, type="o", col="blue", ylim=c(0,89))

To add a title, use the title command:

title(main="Age", col.main="green", font.main=4)

To label the x- and y-axes (draw the plot with ann=FALSE first so that the default labels are not overprinted):

title(xlab="Years", col.lab=rgb(0,0.5,0))
title(ylab="Total", col.lab=rgb(0,0.5,0))

To create a legend:

legend("topleft", legend=c("VAR1","VAR2"), cex=0.8,
   col=c("blue","red"), pch=21:22, lty=1:2)
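
Putting the pieces together, a complete line chart might look like the sketch below. The two series `VAR1` and `VAR2` hold made-up values, and the helper name `y_range` is an assumption introduced only for this illustration:

```r
# Hypothetical yearly totals for two groups
VAR1 <- c(12, 25, 31, 48, 60)
VAR2 <- c(20, 22, 35, 41, 75)

# Shared y-axis range, starting at 0, that fits both series
y_range <- range(0, VAR1, VAR2)

# Draw the first series; ann = FALSE suppresses the default title and labels
plot(VAR1, type = "o", col = "blue", pch = 21, lty = 1,
     ylim = y_range, ann = FALSE)

# Add the second series to the same plot
lines(VAR2, type = "o", col = "red", pch = 22, lty = 2)

# Title and axis labels
title(main = "Totals by Year", col.main = "green", font.main = 4)
title(xlab = "Years", col.lab = rgb(0, 0.5, 0))
title(ylab = "Total", col.lab = rgb(0, 0.5, 0))

# Legend in the top-left corner, matching each series' color, symbol and line type
legend("topleft", legend = c("VAR1", "VAR2"), cex = 0.8,
       col = c("blue", "red"), pch = 21:22, lty = 1:2)
```

Computing the y-axis range from both series before plotting keeps the second line from being cut off, since `plot` only scales the axes to the first series it is given.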

Bar Charts

For nominal or ordinal variables, bar charts are preferred. First create a table of the values of your nominal variable (NOMVAR):

NOMVAR_table <- table(NOMVAR)

Then use that table to create the bar chart:

barplot(NOMVAR_table)

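Additional arguments let you customize the chart. To name the categories of the variable (one name per category):

barplot(NOMVAR_table, names.arg = c("NAME1", "NAME2", "NAME3"))

To fill the bars with shading lines of a given density:

barplot(NOMVAR_table, density=c(10, 20, 30, 40, 50))

To color the bars:

barplot(NOMVAR_table, col=rainbow(4))

To color the borders of the bars:

barplot(NOMVAR_table, border="red")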
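
As a minimal sketch, several of these arguments can be combined in one call. The marital-status values below are made up for illustration, and `bar_midpoints` is a hypothetical name for the value that `barplot` returns (the x-positions of the bar centers):

```r
# Hypothetical nominal variable: marital status of eight respondents
NOMVAR <- c("married", "single", "single", "divorced",
            "married", "married", "single", "married")
NOMVAR_table <- table(NOMVAR)

# table() sorts categories alphabetically: divorced, married, single
bar_midpoints <- barplot(
  NOMVAR_table,
  names.arg = c("Divorced", "Married", "Single"),  # labels, in table order
  col = rainbow(3),                                # one fill color per bar
  border = "red",                                  # outline color
  main = "Marital Status",
  ylab = "Number of respondents"
)
```

Because `names.arg` is matched to the bars by position, it is worth printing the table first to confirm the order of the categories before relabeling them.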

Pie Charts

To create a pie chart with a nominal variable, the first step is to create a table:

NOMVAR_table <- table(NOMVAR)

Then use that table to create the pie chart:

pie(NOMVAR_table)

Most of the additional arguments shown above for Bar Charts (col, border, density) also work with pie. Category names are set with labels instead of names.arg.
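
A minimal sketch of a labeled pie chart follows, again with made-up marital-status data. The names `pct` and `slice_labels` are assumptions introduced for this illustration:

```r
# Hypothetical nominal variable: marital status of ten respondents
NOMVAR <- c(rep("married", 5), rep("single", 3), rep("divorced", 2))
NOMVAR_table <- table(NOMVAR)

# Share of each category in percent, rounded to whole numbers
pct <- round(100 * prop.table(NOMVAR_table))

# Build labels such as "married (50%)" in the same order as the table
slice_labels <- paste0(names(NOMVAR_table), " (", pct, "%)")

# Draw the pie chart with one color per slice
pie(NOMVAR_table, labels = slice_labels, col = rainbow(3),
    border = "white", main = "Marital Status")
```

Adding the percentages to the labels helps readers compare slices, since angles alone are hard to judge by eye.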