> x <- c(1, 3, 2, 5)
> x
[1] 1 3 2 5

Note that the > is not part of the command; rather, it is printed by R to
indicate that it is ready for another command to be entered. We can also
save things using = rather than <-:

> x = c(1, 6, 2)
> x
[1] 1 6 2
> y = c(1, 4, 3)

Hitting the up arrow multiple times will display the previous commands,
which can then be edited. This is useful since one often wishes to repeat
a similar command. In addition, typing ?funcname will always cause R to
open a new help file window with additional information about the function
funcname.

We can tell R to add two sets of numbers together. It will then add the
first number from x to the first number from y, and so on. However, x and
y should be the same length. We can check their length using the length()
function.

> length(x)
[1] 3
> length(y)
[1] 3
> x + y
[1]  2 10  5

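The element-wise behaviour described above can be sketched as follows. The vectors x and y are the ones from the text; the shorter vector z is a hypothetical addition used only to illustrate what R does when the lengths differ (it recycles the shorter vector and issues a warning, which is suppressed here).

```r
# Element-wise addition of two equal-length vectors (values from the text).
x <- c(1, 6, 2)
y <- c(1, 4, 3)
total <- x + y                      # adds x[i] to y[i] for each i

# A simple check that the lengths agree before adding.
same_length <- length(x) == length(y)

# Hypothetical shorter vector: R recycles it (10, 20, 10) instead of failing.
z <- c(10, 20)
recycled <- suppressWarnings(x + z) # R would warn about the length mismatch

print(total)
print(recycled)
```

Because recycling happens silently apart from the warning, checking length() first, as the lab does, is a reasonable habit.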
The ls() function allows us to look at a list of all of the objects, such
as data and functions, that we have saved so far. The rm() function can be
used to delete any that we don’t want.

> ls()
[1] "x" "y"
> rm(x, y)
> ls()
character(0)

It’s also possible to remove all objects at once:

> rm(list = ls())

2. Statistical Learning
The matrix() function can be used to create a matrix of numbers. Before
we use the matrix() function, we can learn more about it:

> ?matrix

The help file reveals that the matrix() function takes a number of inputs,
but for now we focus on the first three: the data (the entries in the matrix),
the number of rows, and the number of columns. First, we create a simple
matrix.

> x = matrix(data = c(1, 2, 3, 4), nrow = 2, ncol = 2)
> x
     [,1] [,2]
[1,]    1    3
[2,]    2    4

Note that we could just as well omit typing data=, nrow=, and ncol= in the
matrix() command above: that is, we could just type

> x = matrix(c(1, 2, 3, 4), 2, 2)

and this would have the same effect. However, it can sometimes be useful to
specify the names of the arguments passed in, since otherwise R will assume
that the function arguments are passed into the function in the same order
that is given in the function’s help file. As this example illustrates, by
default R creates matrices by successively filling in columns. Alternatively,
the byrow=TRUE option can be used to populate the matrix in order of the
rows.

> matrix(c(1, 2, 3, 4), 2, 2, byrow = TRUE)
     [,1] [,2]
[1,]    1    2
[2,]    3    4

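To make the difference between positional and named arguments, and between column-wise and row-wise filling, concrete, here is a small sketch. The 2 x 3 shape and the object names (vals, by_col, by_row, swapped) are our own choices for illustration, not part of the lab.

```r
vals <- 1:6

# Positional arguments: data, nrow, ncol, in the order given in ?matrix.
by_col <- matrix(vals, 2, 3)                     # filled column by column

# Named arguments may be given in any order.
swapped <- matrix(ncol = 3, nrow = 2, data = vals)

# byrow = TRUE fills the matrix row by row instead.
by_row <- matrix(vals, 2, 3, byrow = TRUE)

print(by_col)
print(by_row)
```

Here by_col and swapped are the same matrix, while by_row holds the same six values in a different arrangement.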
Notice that in the above command we did not assign the matrix to a value
such as x. In this case the matrix is printed to the screen but is not saved
for future calculations. The sqrt() function returns the square root of each
element of a vector or matrix. The command x^2 raises each element of x
to the power 2; any powers are possible, including fractional or negative
powers.

> sqrt(x)
     [,1] [,2]
[1,] 1.00 1.73
[2,] 1.41 2.00
> x^2
     [,1] [,2]
[1,]    1    9
[2,]    4   16

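As a brief sketch of the fractional and negative powers mentioned above, the following uses the same 2 x 2 matrix x as the text. The comparison with the %*% operator (matrix multiplication) is our own addition, included only to stress that ^ works element by element.

```r
x <- matrix(c(1, 2, 3, 4), 2, 2)

half_power  <- x^0.5     # fractional power: same values as sqrt(x)
reciprocal  <- x^-1      # negative power: element-wise 1/x, not a matrix inverse
elementwise <- x^2       # squares each entry
matrix_prod <- x %*% x   # true matrix product, shown for contrast

print(elementwise)
print(matrix_prod)
```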
The rnorm() function generates a vector of random normal variables,
with first argument n the sample size. Each time we call this function, we
will get a different answer. Here we create two correlated sets of numbers,
x and y, and use the cor() function to compute the correlation between
them.

2.3 Lab: Introduction to R

> x = rnorm(50)
> y = x + rnorm(50, mean = 50, sd = .1)
> cor(x, y)
[1] 0.995

By default, rnorm() creates standard normal random variables with a mean
of 0 and a standard deviation of 1. However, the mean and standard
deviation can be altered using the mean and sd arguments, as illustrated above.
Sometimes we want our code to reproduce the exact same set of random
numbers; we can use the set.seed() function to do this. The set.seed()
function takes an (arbitrary) integer argument.

> set.seed(1303)
> rnorm(50)
[1] -1.1440  1.3421  2.1854  0.5364  0.0632  0.5022 -0.0004
. . .

We use set.seed() throughout the labs whenever we perform calculations
involving random quantities. In general this should allow the user to
reproduce our results. However, it should be noted that as new versions of
R become available it is possible that some small discrepancies may form
between the book and the output from R.

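The effect of set.seed() can be checked directly. In this sketch the seed 1303 is the one used in the text, while the sample sizes and object names are arbitrary choices of ours. No particular random values are assumed, since these may differ between versions of R.

```r
# Resetting the seed restarts the random number stream.
set.seed(1303)
first <- rnorm(5)

set.seed(1303)
second <- rnorm(5)   # same seed, so the same five numbers as 'first'
third  <- rnorm(5)   # continues the stream, so these differ from 'first'

# The mean and sd arguments shift and scale the draws.
shifted <- rnorm(1000, mean = 50, sd = 0.1)

print(identical(first, second))
print(mean(shifted))
```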
The mean() and var() functions can be used to compute the mean and
variance of a vector of numbers. Applying sqrt() to the output of var()
will give the standard deviation. Or we can simply use the sd() function.

> set.seed(3)
> y = rnorm(100)
> mean(y)
[1] 0.0110
> var(y)
[1] 0.7329
> sqrt(var(y))
[1] 0.8561
> sd(y)
[1] 0.8561

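To see what var() and sd() compute, the sample variance can be written out by hand. This is a sketch that reuses the seed and sample size from the text; the names n, manual_var and manual_sd are ours. Note that var() divides by n - 1 rather than n.

```r
set.seed(3)
y <- rnorm(100)
n <- length(y)

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
manual_var <- sum((y - mean(y))^2) / (n - 1)
manual_sd  <- sqrt(manual_var)

print(c(manual_var, var(y)))
print(c(manual_sd, sd(y)))
```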
2.3.2 Graphics
The plot() function is the primary way to plot data in R. For instance,
plot(x,y) produces a scatterplot of the numbers in x versus the numbers
in y. There are many additional options that can be passed in to the plot()
function. For example, passing in the argument xlab will result in a label
on the x-axis. To find out more information about the plot() function,
type ?plot.

> x = rnorm(100)
> y = rnorm(100)
> plot(x, y)
> plot(x, y, xlab = "this is the x-axis", ylab = "this is the y-axis",
    main = "Plot of X vs Y")

We will often want to save the output of an R plot. The command that we
use to do this will depend on the file type that we would like to create. For
instance, to create a pdf, we use the pdf() function, and to create a jpeg,
we use the jpeg() function.

> pdf("Figure.pdf")
> plot(x, y, col = "green")
> dev.off()
null device
          1

The function dev.off() indicates to R that we are done creating the plot.
Alternatively, we can simply copy the plot window and paste it into an
appropriate file type, such as a Word document.

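The open, plot, close sequence above can be wrapped in a script that also confirms the file was written. This is only a sketch: the temporary file path from tempfile() and the width and height values (in inches) are our assumptions, chosen so the example does not overwrite anything in the working directory.

```r
x <- rnorm(100)
y <- rnorm(100)

# Hypothetical output location; the lab writes to "Figure.pdf" instead.
out_file <- tempfile(fileext = ".pdf")

pdf(out_file, width = 6, height = 4)  # open the pdf graphics device
plot(x, y, col = "green")             # drawing goes to the file, not the screen
invisible(dev.off())                  # close the device so the file is completed

file_written <- file.exists(out_file) && file.size(out_file) > 0
print(file_written)
```

Until dev.off() is called the file may be incomplete, which is why the check comes last.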
The function seq() can be used to create a sequence of numbers. For
instance, seq(a,b) makes a vector of integers between a and b. There are
many other options: for instance, seq(0,1,length=10) makes a sequence of
10 numbers that are equally spaced between 0 and 1. Typing 3:11 is a
shorthand for seq(3,11) for integer arguments.

> x = seq(1, 10)
> x
 [1]  1  2  3  4  5  6  7  8  9 10
> x = 1:10
> x
 [1]  1  2  3  4  5  6  7  8  9 10
> x = seq(-pi, pi, length = 50)

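The different ways of calling seq() can be compared side by side. In this sketch the full argument name length.out is used in place of the abbreviation length that appears in the text, and the by argument is an extra option of seq() that the lab does not use.

```r
a <- seq(3, 11)                      # integers from 3 to 11
b <- 3:11                            # shorthand for the same sequence

grid <- seq(0, 1, length.out = 10)   # 10 equally spaced points from 0 to 1
step <- seq(0, 1, by = 0.25)         # fixed step size instead of a fixed count

print(a)
print(grid)
print(step)
```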
We will now create some more sophisticated plots. The contour()
function produces a contour plot in order to represent three-dimensional data;
it is like a topographical map. It takes three arguments:

1. A vector of the x values (the first dimension),
2. A vector of the y values (the second dimension), and
3. A matrix whose elements correspond to the z value (the third
   dimension) for each pair of (x, y) coordinates.

As with the plot() function, there are many other inputs that can be used
to fine-tune the output of the contour() function. To learn more about
these, take a look at the help file by typing ?contour.

> y = x
> f = outer(x, y, function(x, y) cos(y) / (1 + x^2))
> contour(x, y, f)
> contour(x, y, f, nlevels = 45, add = T)
> fa = (f - t(f)) / 2
> contour(x, y, fa, nlevels = 15)

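The matrix passed to contour() is built with outer(), which evaluates a function at every pair (x[i], y[j]). The sketch below uses a much coarser grid than the text (5 points rather than 50, our choice) so that the entries can be inspected, and gives the function the name g for convenience.

```r
x <- seq(-pi, pi, length.out = 5)   # coarse grid, for easy inspection
y <- x

g <- function(x, y) cos(y) / (1 + x^2)

f  <- outer(x, y, g)     # f[i, j] equals g(x[i], y[j])
fa <- (f - t(f)) / 2     # antisymmetric part of f: fa[i, j] = -fa[j, i]

print(dim(f))
print(round(fa, 3))
```

Since fa is antisymmetric, its diagonal is zero and its contours mirror across the line y = x with the sign flipped.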
The image() function works the same way as contour(), except that it
produces a color-coded plot whose colors depend on the z value. This is
known as a heatmap, and is sometimes used to plot temperature in weather
forecasts. Alternatively, persp() can be used to produce a three-dimensional
plot. The arguments theta and phi control the angles at which the plot is
viewed.

> image(x, y, fa)
> persp(x, y, fa)
> persp(x, y, fa, theta = 30)
> persp(x, y, fa, theta = 30, phi = 20)
> persp(x, y, fa, theta = 30, phi = 70)
> persp(x, y, fa, theta = 30, phi = 40)

2.3.3 Indexing Data
We often wish to examine part of a set of data. Suppose that our data is
stored in the matrix A.

> A = matrix(1:16, 4, 4)
> A
     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16

Then, typing

> A[2, 3]
[1] 10

will select the element corresponding to the second row and the third
column. The first number after the open-bracket symbol [ always refers to
the row, and the second number always refers to the column. We can also
select multiple rows and columns at a time, by providing vectors as the
indices.

> A[c(1, 3), c(2, 4)]
     [,1] [,2]
[1,]    5   13
[2,]    7   15
> A[1:3, 2:4]
     [,1] [,2] [,3]
[1,]    5    9   13
[2,]    6   10   14
[3,]    7   11   15
> A[1:2, ]
     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6   10   14
> A[, 1:2]
     [,1] [,2]
[1,]    1    5
[2,]    2    6
[3,]    3    7
[4,]    4    8

The last two examples include either no index for the columns or no index
for the rows. These indicate that R should include all columns or all rows,
respectively. R treats a single row or column of a matrix as a vector.

> A[1, ]
[1]  1  5  9 13

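The fact that a single row comes back as a vector can be checked with dim(). The drop = FALSE argument shown here is a standard option of the [ operator that the lab does not cover; we include it as an aside for cases where the matrix shape needs to be kept.

```r
A <- matrix(1:16, 4, 4)

row_vec <- A[1, ]                 # a plain vector: the dim attribute is dropped
row_mat <- A[1, , drop = FALSE]   # still a matrix, with 1 row and 4 columns

print(dim(row_vec))   # NULL, since vectors have no dimensions
print(dim(row_mat))
```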
The use of a negative sign - in the index tells R to keep all rows or columns
except those indicated in the index.

> A[-c(1, 3), ]
     [,1] [,2] [,3] [,4]
[1,]    2    6   10   14
[2,]    4    8   12   16
> A[-c(1, 3), -c(1, 3, 4)]
[1] 6 8

The dim() function outputs the number of rows followed by the number of
columns of a given matrix.

> dim(A)
[1] 4 4

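Negative indices are one of several equivalent ways to pick out the same rows. As a sketch, the following selects rows 2 and 4 of the matrix A from the text in three ways; the logical-vector form is an additional indexing style not discussed in the lab.

```r
A <- matrix(1:16, 4, 4)

keep_neg <- A[-c(1, 3), ]                       # drop rows 1 and 3
keep_pos <- A[c(2, 4), ]                        # keep rows 2 and 4
keep_lgl <- A[c(FALSE, TRUE, FALSE, TRUE), ]    # TRUE marks the rows to keep

corner <- A[-c(1, 3), -c(1, 3, 4)]              # rows 2 and 4 of column 2

print(keep_neg)
print(corner)
```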
2.3.4 Loading Data
For most analyses, the first step involves importing a data set into R. The
read.table() function is one of the primary ways to do this. The help file
contains details about how to use this function. We can use the function
write.table() to export data.

Before attempting to load a data set, we must make sure that R knows
to search for the data in the proper directory. For example, on a Windows
system one could select the directory using the Change dir... option under
the File menu. However, the details of how to do this depend on the
operating system (e.g. Windows, Mac, Unix) that is being used, and so we
do not give further details here. We begin by loading in the Auto data set.
This data is part of the ISLR library (we discuss libraries in Chapter 3) but
to illustrate the read.table() function we load it now from a text file. The
following command will load the Auto.data file into R and store it as an
object called Auto, in a format referred to as a data frame. (The text file
can be obtained from this book’s website.) Once the data has been loaded,
the fix() function can be used to view it in a spreadsheet-like window.
However, the window must be closed before further R commands can be
entered.

> Auto = read.table("Auto.data")
> fix(Auto)

Note that Auto.data is simply a text file, which you could alternatively
open on your computer using a standard text editor. It is often a good idea
to view a data set using a text editor or other software such as Excel before
loading it into R.

This particular data set has not been loaded correctly, because R has
assumed that the variable names are part of the data and so has included
them in the first row. The data set also includes a number of missing
observations, indicated by a question mark ?. Missing values are a common
occurrence in real data sets. Using the option header=T (or header=TRUE) in
the read.table() function tells R that the first line of the file contains the
variable names, and using the option na.strings tells R that any time it
sees a particular character or set of characters (such as a question mark),
it should be treated as a missing element of the data matrix.

> Auto = read.table("Auto.data", header = T, na.strings = "?")
> fix(Auto)

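The effect of header and na.strings can be reproduced without the Auto.data file. The sketch below writes a tiny, made-up file in a similar layout to a temporary location (the three rows and the file path are our own invention) and reads it back both ways.

```r
# A hypothetical miniature data file: a header line plus two observations,
# one of which has a missing horsepower value coded as "?".
tmp <- tempfile(fileext = ".data")
writeLines(c("mpg horsepower name",
             "18.0 130 \"chevrolet chevelle malibu\"",
             "25.0 ? \"ford pinto\""), tmp)

raw   <- read.table(tmp)                                   # header read as data
clean <- read.table(tmp, header = TRUE, na.strings = "?")  # header and NAs handled

print(raw)
print(clean)
```

In raw the variable names appear as the first row under the default names V1, V2 and V3; in clean they become the column names and the question mark is read as NA.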
Excel is a common-format data storage program. An easy way to load such
data into R is to save it as a csv (comma separated value) file and then use
the read.csv() function to load it in.

> Auto = read.csv("Auto.csv", header = T, na.strings = "?")
> fix(Auto)
> dim(Auto)
[1] 397   9
> Auto[1:4, ]

The dim() function tells us that the data has 397 observations, or rows, and
nine variables, or columns. There are various ways to deal with the missing
data. In this case, only five of the rows contain missing observations, and
so we choose to use the na.omit() function to simply remove these rows.

> Auto = na.omit(Auto)
> dim(Auto)
[1] 392   9

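Before dropping rows it can be useful to count how many are affected. This sketch uses a small, made-up data frame (the values are ours, not taken from Auto) together with complete.cases(), a base R function that the lab does not mention.

```r
# Hypothetical data frame with missing values in two different rows.
df <- data.frame(mpg        = c(18, NA, 25, 30),
                 horsepower = c(130, 165, NA, 90))

n_incomplete <- sum(!complete.cases(df))  # rows with at least one NA
df_clean     <- na.omit(df)               # keep only the complete rows

print(n_incomplete)
print(dim(df_clean))
```

The same count applied to the Auto data would correspond to the five rows removed in the text.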
Once the data are loaded correctly, we can use names() to check the
variable names.

> names(Auto)
[1] "mpg"          "cylinders"    "displacement" "horsepower"
[5] "weight"       "acceleration" "year"         "origin"
[9] "name"

2.3.5 Additional Graphical and Numerical Summaries
We can use the plot() function to produce scatterplots of the quantitative
variables. However, simply typing the variable names will produce an error
message, because R does not know to look in the Auto data set for those
variables.

> plot(cylinders, mpg)
Error in plot(cylinders, mpg) : object 'cylinders' not found

To refer to a variable, we must type the data set and the variable name
joined with a $ symbol. Alternatively, we can use the attach() function in
order to tell R to make the variables in this data frame available by name.

> plot(Auto$cylinders, Auto$mpg)
> attach(Auto)
> plot(cylinders, mpg)

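As a sketch of the different ways to reach a variable inside a data frame, the following uses a small, made-up data frame named toy (not the Auto data). The [[ ]] operator and the with() function are base R alternatives that the lab itself does not use; with() evaluates an expression using the data frame's columns without attaching it.

```r
# Hypothetical stand-in for a data frame such as Auto.
toy <- data.frame(cylinders = c(4, 8, 6), mpg = c(30, 15, 20))

m_dollar  <- mean(toy$mpg)         # data set and variable joined with $
m_bracket <- mean(toy[["mpg"]])    # same column, selected by name
m_with    <- with(toy, mean(mpg))  # columns visible by name inside with()

print(c(m_dollar, m_bracket, m_with))
```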
The cylinders variable is stored as a numeric vector, so R has treated it
as quantitative. However, since there are only a small number of possible
values for cylinders, one may prefer to treat it as a qualitative variable.
The as.factor() function converts quantitative variables into qualitative
variables.

> cylinders = as.factor(cylinders)

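What as.factor() does to a numeric vector can be seen on a short example. The six values below are made up for illustration. One point worth noting is that converting a factor straight back with as.numeric() returns the internal level codes rather than the original numbers.

```r
cyl   <- c(4, 8, 6, 4, 4, 8)   # hypothetical cylinder counts
cyl_f <- as.factor(cyl)

lv     <- levels(cyl_f)                     # the distinct categories, as text
counts <- table(cyl_f)                      # observations per category
codes  <- as.integer(cyl_f)                 # level codes, not the original values
back   <- as.numeric(as.character(cyl_f))   # recovers the original numbers

print(lv)
print(counts)
print(codes)
```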
If the variable plotted on the x-axis is categorical, then boxplots will
automatically be produced by the plot() function. As usual, a number
of options can be specified in order to customize the plots.

> plot(cylinders, mpg)
> plot(cylinders, mpg, col = "red")
> plot(cylinders, mpg, col = "red", varwidth = T)
> plot(cylinders, mpg, col = "red", varwidth = T, horizontal = T)
> plot(cylinders, mpg, col = "red", varwidth = T, xlab = "cylinders",
    ylab = "MPG")

The hist() function can be used to plot a histogram. Note that col=2
has the same effect as col="red".

> hist(mpg)
> hist(mpg, col = 2)
> hist(mpg, col = 2, breaks = 15)

The pairs() function creates a scatterplot matrix, i.e. a scatterplot for every
pair of variables for any given data set. We can also produce scatterplots
for just a subset of the variables.

> pairs(Auto)
> pairs(~ mpg + displacement + horsepower + weight +
    acceleration, Auto)

In conjunction with the plot() function, identify() provides a useful
interactive method for identifying the value for a particular variable for
points on a plot. We pass in three arguments to identify(): the x-axis
variable, the y-axis variable, and the variable whose values we would like
to see printed for each point. Then clicking on a given point in the plot
will cause R to print the value of the variable of interest. Right-clicking on
the plot will exit the identify() function (control-click on a Mac). The
numbers printed under the identify() function correspond to the rows for
the selected points.

> plot(horsepower, mpg)
> identify(horsepower, mpg, name)

The summary() function produces a numerical summary of each variable in
a particular data set.

> summary(Auto)
      mpg          cylinders      displacement
 Min.   : 9.00   Min.   :3.000   Min.   : 68.0
 1st Qu.:17.00   1st Qu.:4.000   1st Qu.:105.0
 Median :22.75   Median :4.000   Median :151.0
 Mean   :23.45   Mean   :5.472   Mean   :194.4
 3rd Qu.:29.00   3rd Qu.:8.000   3rd Qu.:275.8
 Max.   :46.60   Max.   :8.000   Max.   :455.0

   horsepower        weight      acceleration
 Min.   : 46.0   Min.   :1613   Min.   : 8.00
 1st Qu.: 75.0   1st Qu.:2225   1st Qu.:13.78
 Median : 93.5   Median :2804   Median :15.50
 Mean   :104.5   Mean   :2978   Mean   :15.54
 3rd Qu.:126.0   3rd Qu.:3615   3rd Qu.:17.02
 Max.   :230.0   Max.   :5140   Max.   :24.80

      year           origin                      name
 Min.   :70.00   Min.   :1.000   amc matador       :  5
 1st Qu.:73.00   1st Qu.:1.000   ford pinto        :  5
 Median :76.00   Median :1.000   toyota corolla    :  5
 Mean   :75.98   Mean   :1.577   amc gremlin       :  4
 3rd Qu.:79.00   3rd Qu.:2.000   amc hornet        :  4
 Max.   :82.00   Max.   :3.000   chevrolet chevette:  4
                                 (Other)           :365

For qualitative variables such as name, R will list the number of observations
that fall in each category. We can also produce a summary of just a single
variable.

> summary(mpg)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   9.00   17.00   22.75   23.45   29.00   46.60

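The six numbers reported by summary() for a quantitative variable can be reproduced with other base R functions. The sketch below uses a short made-up vector v instead of mpg, and relies on the default quantile() method.

```r
v <- c(1, 2, 3, 4, 10)   # hypothetical values, chosen for easy arithmetic

s <- summary(v)

# The same six quantities computed one at a time.
by_hand <- c(min(v), quantile(v, 0.25), median(v),
             mean(v), quantile(v, 0.75), max(v))

print(s)
print(unname(by_hand))
```

Note that the mean (4) is pulled above the median (3) by the single large value, which is the kind of skew a summary makes easy to spot.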
Once we have finished using R, we type q() in order to shut it down, or
quit. When exiting R, we have the option to save the current workspace so
that all objects (such as data sets) that we have created in this R session
will be available next time. Before exiting R, we may want to save a record
of all of the commands that we typed in the most recent session; this can
be accomplished using the savehistory() function. Next time