Displaying data: R code for Chapter 2 examples
I will put all of my R markdown code for my rpubs examples here. The data are in the data folder - coreysparks/Rcode. You are not signed in. You don't need an account to work on this lesson, but if you want to save your work, remember to sign in or create an account before you get started. ## year sex oceanAgeYears lengthMm massKg ## 1 1996 FALSE 3 513 3.090 ## 2 1996 FALSE 3 513 2.909 ## 3 1996 FALSE 3 525 3.056 ## 4 1996 FALSE 3 501 2.690 ## 5 1996 FALSE 3 513 2.876 ## 6 1996 FALSE 3 501 2.978. Histograms, testing different bin widths. Comparing the effect of setting bin width to 0.1, 0.3, and 0.5 kg.
Download the R code on this page as a single file here.
New methods on this page
Click on a function argument for a short description of its meaning. The variable names are plucked from the examples further below.
Read data from a comma-separated (.csv) file into a data frame. R will take columns having only numbers and make them numeric variables in the data frame. Columns having characters will be converted to variables, rather than to character variables.
<- read.csv()
Inspect data in a data frame. The
head
function shows the first few lines of the data frame to confirm the data was read properly. The nrow
function indicates the number of cases (rows) in the data frame.head()
nrow()
Basic strip chart relating a numeric variable to a categorical variable:
stripchart(, , ,
)
Frequency table or contingency table:
table()
Bar graph for count data:
barplot(, , )
Histogram for a numeric variable:
hist(, )
Scatter plot showing association between two numeric variables:
plot(, )
Other new methods:
Multiple-histogram plot.
Line plot.
Load an R package.
Set the desired ordering of categories of a factor variable in tables and graphs.
Multiple-histogram plot.
Line plot.
Load an R package.
Set the desired ordering of categories of a factor variable in tables and graphs.
Figure 2.1-2. Locust serotonin
Strip chart of serotonin levels in the central nervous system of desert locusts that were experimentally crowded for 0 (the control group), 1, and 2 hours.
Read the data and store in data frame (here named locustData). The following command uses
read.csv
https://awaredownload.mystrikingly.com/blog/windows-casino-games. to grab the data from a file on the internet (on the current web site).Show the first few lines of the data, to ensure it read correctly. Determine the number of cases in the data.
Draw a stripchart (the tilde '~' means that the first argument below is a formula, relating one variable to the other).
We show how to draw a fancier strip chart, closer to that shown in Figure 2.1-2, by including more options.
Figure 2.1-3. Education spending
![Rcode 2 8 Rcode 2 8](https://user-images.githubusercontent.com/4749052/89999797-3a001100-dc87-11ea-9aeb-18ef9ebbb441.png)
A bar graph of education spending per student in different years in British Columbia.
Read the data into a data frame named
educationSpending
, and inspect.Draw a bar graph. The
names
argument generates numbers between 1998 and 2004 in 1-year increments, to be used as labels along the horizontal axis.Game 2.8
Snapmotion 4 2 5. A slightly fancier bar graph like that in Figure 2.1-3, which includes additional options, is shown .
Example 2.2A. Deaths from tigers
Frequency table and bar graph showing activities of 88 people at the time they were killed by tigers near Chitwan National Park, Nepal, from 1979 to 2006.
Read the data into data frame named
tigerData
Generate a frequency table. The
sort
function is included to sort the categories by their frequencies.You can arrange the frequency table vertically.
Use the
addmargins
command to include sums in your frequency table.Draw a bar graph. The additional arguments
cex.names = 0.5
shrinks the axis labels by 50%, and las = 2
flips the labels, so that they all fit in the window.A slightly fancier bar graph of the data with more options, like that shown in Figure 2.2-1, is shown .
Example 2.2B. Desert bird abundance
Frequency table and histogram illustrating the frequency distribution of bird species abundance at Organ Pipe Cactus National Monument.
Read and inspect the data.
Generate a frequency table of the numeric bird abundance variable. The option
right = FALSE
ensures that abundance value 300 (for example) is counted in the 300-400 bin rather than in the 200-300 bin.The same table oriented vertically and including the sum.
Draw a histogram of bird abundances.
Commands to draw a histogram with more options, such as Figure 2.2-3, are .
Figure 2.2-5. Salmon body size
Histograms of body mass of 228 female sockeye salmon sampled from Pick Creek in Alaska. The same data are plotted for three different interval widths: 0.1 kg, 0.3 kg, and 0.5 kg.
Read the data into data frame named
salmonSizeData
Histograms with different bin widths (0.1, 0.3, and 0.5).
Example 2.3A. Bird malaria
Contingency table, grouped bar plot, and mosaic plot showing the association between egg removal treatment and incidence of malaria in female great tits.
Read the data from the data file and inspect.
Optional step: Set the desired order of treatment categories in tables and graphs. The
factor
command allows us to order the levels so that the category 'Egg removal' comes before 'Control' in tables and graphs (categories are otherwise ordered alphabetically).Create a contingency table. Use
table
with the names of two categorical variables as arguments, beginning with the response variable.Include row and column sums in the contingency table.
Draw a grouped bar graph using the contingency table.
Commands to produce a grouped bar graph with more options are shown .
Draw a mosaic plot. The '
t
' command flips (transposes) the table to ensure that the explanatory variable is along the horizontal axis. Pyramid slot game. Blueharvest 7 0 0 for mac crack keygen download.Example 2.3B. Guppy father and son comparison
Scatter plot of the relationship between the ornamentation of male guppies and the average attractiveness of their sons.
Read the data from the file.
Draw a scatter plot using the formula approach (with the '~').
See for commands to draw a fancier scatter plot like that in Figure 2.3-3.
Example 2.3C. Human hemoglobin and elevation
Strip chart, box plot, and multiple histograms showing hemoglobin concentration in men living at high altitude in three different parts of the world and in a sea level population (USA).
Read the data.
Obtain the sample sizes in each of the four populations
Draw a strip chart of the hemoglobin data.
Commands for a fancier strip chart like that shown in Figure 2.3-4 are .
R 8253-2 Code Du Travail
Draw a box plot, relating hemoglobin to population.
A box plot with more options, like that in Figure 2.3-4, is shown here.
Show the association between hemoglobin and population using the multiple histograms approach (Figure 2.3-5). The following commands use the
lattice
package, which must first be loaded. we show commands to draw the multiple histograms in basic R, without using the
lattice
Club player casino no deposit bonus codes 2018. package. This approach is more tedious, but the resulting graphs are often easier to modify.Example 2.4A. Measles outbreaks
Line graph showing confirmed cases of measles in England and Wales from 1995 to 2011. Annual counts are quarterly.
Read the data from file.
Drawing a line graph uses the same command as a scatter plot but adds the
type = 'l'
argument to draw a line graph instead.Commands to draw a fancier line plot (like that in Figure 2.4-1) are shown .
A factor is a special variable type in R that keeps track of known categories (“levels”) of a categorical variable.
The name you choose for the R data frame where the data from the file will be stored.
Name of a data file in comma-separated format. Replace this with the exact phrase
file.choose()
if instead you want a window to pop up that lets you select the file on your computer by point-and-click.Name of the data frame.
The '~' indicates that this is a formula relating a numeric response variable (
serotoninLevel
) on the y-axis to a categorical explanatory variable (treatmentTime
) on the x-axis.Indicates the name of the data frame containing the two variables.
Displaces the points a small random amount to reduce overlap.
The name of the categorical variable being tabulated. For a contingency table, include the names of two categorical variables, separated by comma.
Name of the frequency table or variable being plotted.
Shrinks the x-axis labels by 50% so that they fit in the window.
![Rcode Rcode](https://uploads-us-west-2.insided.com/coursera-en/attachment/cab9815a-d693-495b-8827-47182d6d0a94.png)
This option makes the histogram bins 'right-open'. This means that abundance value 300 (for example) is counted in the 300-400 bin rather than in the 200-300 bin.
A formula indicating the name of the variable to be plotted on the vertical axis (named on the left of the '~') and the variable to be plotted on the horizontal axis.
The name of the data frame containing the two variables.