Learning R Software: Advantages, Challenges, and Coefficient of Determination | Lecture notes Logic

it easy to program new statistical methods. The graphics of the language allow easy production of advanced, publication-quality graphics. Since a wide variety of experts use the program, R includes a comprehensive library of statistical functions, including many cutting-edge statistical methods. In addition to this, many third-party specialized methods are publicly available. And most important, R is free and open source.

A common concern of beginning users of R is the steep learning curve involved in using it. This concern stems from the fact that R is a command-driven environment. Consequently, a statistical analysis is performed in a series of steps, in which commands are typed out and the results from each step are stored in objects that can be used by further inquiries. This contrasts with other programs, such as SPSS and SAS, which require users to determine all characteristics of the analysis up front and which provide extensive output, thus relying on the users to identify what is relevant to their initial question.

Another source of complaints relates to the difficulty of writing new functions. The more complex the function, the more difficult it becomes to identify errors in syntax or logic. R will prompt the user with an error message, but no indication is given of the nature of the problem or its location within the new code. Consequently, despite the advantage afforded by being able to add new functions to R, many users may find it frustrating to write new routines. In addition, complex analyses and simulations in R tend to be very demanding on the computer memory and processor; thus, the more complex the analysis, the longer the time necessary to complete the task, sometimes days.

Large data sets or complex tasks place heavy demands on computer RAM, resulting in slow output.

Brandon K. Vaughn and Aline Orr

See also SAS; SPSS; Statistica; Systat

Web Sites

Comprehensive R Archive Network (CRAN): http://CRAN.R-project.org

The R Project for Statistical Computing: http://www.r-project.org

R-squared ($R^2$) is a statistic that explains the amount of variance accounted for in the relationship between two (or more) variables. Sometimes $R^2$ is called the coefficient of determination, and it is given as the square of a correlation coefficient.
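The statement that $R^2$ is the square of a correlation coefficient can be checked numerically. The following is a minimal sketch, assuming the Pearson correlation is the coefficient meant and using hypothetical data; the function name `pearson_r` and the values of `x` and `y` are illustrative and do not come from the source.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of centered products; the common 1/n factor cancels in the ratio.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical illustrative data (an assumption, not from the source).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(x, y)
r_squared = r ** 2  # coefficient of determination for a one-predictor model
print(round(r, 4), round(r_squared, 4))
```

For this data the correlation is about 0.77, so roughly 60% of the variance in `y` is accounted for by its linear relationship with `x`.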

Given paired variables $(X_i, Y_i)$, $i = 1, \ldots, n$, a linear model that explains the relationship between the variables is given by

$$Y = \beta_0 + \beta_1 X + e,$$

where $e$ is a mean zero error. The parameters of the linear model can be estimated using the least squares method and are denoted by $\hat{\beta}_0$ and $\hat{\beta}_1$, respectively. The parameters are estimated by minimizing the sum of squared residuals between the variable $Y_i$ and the model $\beta_0 + \beta_1 X_i$, that is,

$$(\hat{\beta}_0, \hat{\beta}_1) = \arg\min_{\beta_0, \beta_1} \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2.$$

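It can be shown that the least squares estimates are

$$\hat{\beta}_0 = \bar{Y} - \bar{X}\,\frac{S_{xy}}{S_{xx}} \quad \text{and} \quad \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}},$$

where the sample cross-covariance $S_{xy}$ is defined as

$$S_{xy} = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) = \overline{XY} - \bar{X}\,\bar{Y}.$$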
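The step behind "it can be shown" can be sketched as follows. This is a short derivation under one assumption the text does not spell out: that $S_{xx}$ is defined analogously to $S_{xy}$, as $\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$. The symbol $Q$ for the sum of squared residuals is introduced here only for illustration.

```latex
\begin{aligned}
Q(\beta_0, \beta_1) &= \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2 \\[4pt]
\frac{\partial Q}{\partial \beta_0} = 0
  &\;\Rightarrow\; \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i) = 0
  \;\Rightarrow\; \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} \\[4pt]
\frac{\partial Q}{\partial \beta_1} = 0
  &\;\Rightarrow\; \sum_{i=1}^{n} X_i \,(Y_i - \beta_0 - \beta_1 X_i) = 0 \\[4pt]
\text{substituting } \hat{\beta}_0:\quad
  &\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})
   = \hat{\beta}_1 \sum_{i=1}^{n} (X_i - \bar{X})^2 \\[4pt]
  &\;\Rightarrow\; \hat{\beta}_1 = \frac{n\,S_{xy}}{n\,S_{xx}} = \frac{S_{xy}}{S_{xx}}
\end{aligned}
```

In the substitution step, $X_i$ may be replaced by $X_i - \bar{X}$ because the first condition already forces the residuals to sum to zero.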

Statistical packages such as SAS, S-PLUS, and R provide routines for obtaining the least squares estimates. The estimated model is denoted as

$$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X.$$
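To make the closed-form estimates concrete, here is a minimal sketch that computes them directly rather than calling a packaged routine. It assumes $S_{xx}$ is the analogue of $S_{xy}$ with the same $1/n$ factor; the function name `least_squares_fit` and the data are hypothetical and chosen only for illustration.

```python
def least_squares_fit(xs, ys):
    """Return (beta0_hat, beta1_hat) for the simple linear model Y = b0 + b1*X + e."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample cross-covariance S_xy, centered form with a 1/n factor.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    # S_xx, assumed here to be defined the same way as S_xy.
    s_xx = sum((x - x_bar) ** 2 for x in xs) / n
    if s_xx == 0:
        raise ValueError("all X values are identical; slope is undefined")
    beta1 = s_xy / s_xx
    beta0 = y_bar - x_bar * beta1
    return beta0, beta1


# Hypothetical illustrative data (an assumption, not from the source).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

beta0_hat, beta1_hat = least_squares_fit(x, y)
y_hat = [beta0_hat + beta1_hat * xi for xi in x]  # fitted values

# The alternative form of S_xy: mean of products minus product of means.
n = len(x)
s_xy_alt = sum(xi * yi for xi, yi in zip(x, y)) / n - (sum(x) / n) * (sum(y) / n)

print(round(beta0_hat, 4), round(beta1_hat, 4), round(s_xy_alt, 4))
```

For this data the fitted line is approximately $\hat{Y} = 2.2 + 0.6X$, and both forms of $S_{xy}$ give 1.2.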

With the above notation, the sum of squared errors (SSE), or the sum of squared residuals, is given by

$$\mathrm{SSE} = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2.$$
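The SSE formula can be illustrated with a small sketch. The data are hypothetical, and the coefficients 2.2 and 0.6 are the closed-form least squares estimates for this particular data set; the second line is an arbitrary alternative, included only to show that it yields a larger SSE.

```python
def sse(xs, ys, beta0, beta1):
    """Sum of squared residuals between ys and the line beta0 + beta1 * x."""
    return sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical illustrative data (an assumption, not from the source).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

# Least squares line for this data: intercept 2.2, slope 0.6.
sse_ls = sse(x, y, 2.2, 0.6)

# An arbitrary alternative line with a slightly different intercept.
sse_alt = sse(x, y, 2.0, 0.6)

print(round(sse_ls, 4), round(sse_alt, 4))
```

The least squares line gives an SSE of about 2.4, while the alternative line gives about 2.6, consistent with the least squares estimates being the minimizers of the sum of squared residuals.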

SSE measures the amount of variability in $Y$ that is not explained by the model. Then how does one measure the amount of variability in $Y$ that is explained by the model? To answer this question,


$$R^2 = \frac{\mathrm{SSR}}{\mathrm{SST}} = \frac{\mathrm{SST} - \mathrm{SSE}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}$$
