Workshop on Modern Regression and Classification Using R -
Preparation
Participants will be expected to bring their own laptops (PC or MacOS
X or Linux), with a recent version of R (preferably R-2.13.2 or more
recent; the current version is R-2.14.0) already installed.
Additionally, a number of R packages should be installed.
For details, see the section "R Packages that should be installed",
below.
Intending participants with limited previous experience with R should
do some modest amount of preparation.
In preparation for the Course - Getting Familiar with R
Download the R binary, install it on your machine, start up R, and
start typing!
- Windows users: Click here to obtain R
- Other systems: Click here to look for a binary for your system (MacOS X, some flavours of Linux).
What should I type?
> 1+1
This may suggest some other possibilities!
> nn <- 1:5
Create in the workspace an integer vector nn that holds the values
1,2,3,4,5. NB: <- is the assignment symbol.
> nn
Display (print) the contents of nn.
> ls()
Show the contents of the workspace. You should see "nn" listed.
> q()
End (quit) the session. When asked if you want to save the
workspace, make a habit of clicking on "Yes". This saves everything
in the workspace into a file (called .RData, for those who really
must know) in the working directory.
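As one illustration of the "other possibilities", the short sketch below carries on from the vector nn created above. The names total, squares and avg are chosen here purely for illustration; they are not part of the course material.

```r
# Carry on from the first session: simple calculations with the vector nn
nn <- 1:5

total <- sum(nn)     # add up the five values
squares <- nn^2      # square each element in turn
avg <- mean(nn)      # arithmetic mean of the values

print(total)         # 15
print(squares)       # 1 4 9 16 25
print(avg)           # 3
```

Typing ls() afterwards should now list avg, nn, squares and total.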
There will at some point be a need to know the path to the working
directory. Start R again (the workspace, if saved on the previous
exit, gets reloaded), and type:
> ls()
Show the contents of the workspace.
> getwd()
Get the path to the working directory.
If not set or changed from the default, Windows systems are likely to use
"C:/Documents and Settings/Owner/My Documents" as the
working directory. Other uses for working directories (there can be
as many as you want) will become apparent as the course
proceeds.
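The following is a minimal sketch of how a session might move to a different working directory and back again. The directory name "Rcourse" is hypothetical, and it is created under R's temporary directory so that nothing is written into your own folders.

```r
# Remember where the session started
old_dir <- getwd()

# A hypothetical course directory, placed under R's temporary directory
course_dir <- file.path(tempdir(), "Rcourse")
dir.create(course_dir, showWarnings = FALSE)

# Make it the working directory, and confirm the change
setwd(course_dir)
print(getwd())

# Return to the original working directory
setwd(old_dir)
```

In practice, course_dir would more likely be a folder of your own choosing, one per project.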
There are a number of demonstrations to try.
> demo()
Gives a list of demos that can be tried.
> demo(graphics)
Show off the graphics. Press the ENTER key to display the first
graph, and to display each successive graph.
A good follow-up is to run the code that is included in the document
"Datasets, and familiarisation exercises".
Familiarity with these datasets will help in following the tutorials
and doing course exercises. R code is given that can be used to get
summary information and to plot graphs that will help reveal important
features of the data.
Tutorial Material for R
Work through chapters 1 and 2, and preferably also chapter 3, of the
document
http://www.maths.anu.edu.au/~johnm/courses/r/notes/rnotes1-36.pdf
Click here if you want chapter 4 also!
Click here to get scripts
Scripts for all 15 chapters of intro to R
Other Introductory Documents from the Web
Go to
http://mirror.aarnet.edu.au/pub/CRAN
and click on Documentation to see some of the possibilities.
Try, perhaps, R for Beginners (Emmanuel Paradis).
R Packages that should be installed
Laptops should as far as possible be set up, ready for use with R and
R packages, prior to the course. Note that administrator privileges
are not required for installation of R. In the absence of
administrator privileges, R will be installed into a user directory.
After installing R, install also the
packages animation, DAAG, gamclass, e1071,
latticist, latticeExtra, playwith, Rcmdr,
randomForest, rattle, scales, slam,
Ecdat, nws, oz, survey, mlbench,
fgui and ggplot2. Several of these packages have a
number of dependencies, so that other packages will be installed along
with them. Mac users who install from the Mac GUI should be sure to
tick the box "Install dependencies".
Other packages to which there may be reference
include dichromat, odfWeave, rpanel, fortunes,
scatterplot3d, schoolmath, sp, adabag
and ape.
For playwith, and preferably also for rattle, Gtk2 should
be installed. NB: Gtk2 is not part of R. It is required in order
to use some or all of the abilities of certain R packages.
For R-2.12 or later under Windows, download and install http://downloads.sourceforge.net/gtk-win/gtk2-runtime-2.22.0-2010-10-21-ash.exe
For use with R under MacOS X, download and install
http://r.research.att.com/libs/GTK_2.18.5-X11.pkg
(For R-2.11 under Windows, download and install
http://downloads.sourceforge.net/gladewin32/gtk-2.12.9-win32-2.exe)
If R has access to a live internet connection, packages can be
installed from the menu. You will need to select a repository.
In Australia, choose an Australian repository. Alternatively, packages
can be installed from the command line. For Rcmdr, a suitable
command is:
install.packages("Rcmdr", dependencies=TRUE)
The R commander has many dependencies, indirect as well as direct.
Unless the internet connection is fast, this may take some time.
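The same approach can be extended to the whole list of course packages. What follows is a sketch only: the object names are illustrative, the repository URL is an assumption (substitute a mirror near you, e.g. an Australian one), and e1071 is the spelling under which that package appears on CRAN. Some of the listed packages may not be available for every version of R. Installation is switched off by default, so that running the code merely reports what is missing.

```r
# Packages named in the list of required packages
pkgs <- c("animation", "DAAG", "gamclass", "e1071", "latticist",
          "latticeExtra", "playwith", "Rcmdr", "randomForest", "rattle",
          "scales", "slam", "Ecdat", "nws", "oz", "survey", "mlbench",
          "fgui", "ggplot2")

# Which of these are not yet installed on this machine?
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
print(missing_pkgs)

# Set do_install to TRUE to install whatever is missing.
# The repository URL below is an assumption, not taken from the course notes.
do_install <- FALSE
if (do_install && length(missing_pkgs) > 0) {
  install.packages(missing_pkgs, dependencies = TRUE,
                   repos = "https://cloud.r-project.org")
}
```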
Installation of the RStudio Integrated Development Environment
This free and open source development environment (editor, and much
more) is strongly recommended for use of R. Download it from the
RStudio website (Mac: ∼ 40MB; Windows: ∼ 24MB; Linux: ∼ 24MB).
Installation of Java JDK
For text mining using the tm package, a Java JDK must be
installed. Go to:
http://www.oracle.com/technetwork/java/javase/downloads/index.html.
Then, under Java Platform, Standard Edition,
click on Download JDK. (This is described as Java SE6 Update 23.)
Mac users should already have JDK installed as part of the Macintosh
system.
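One rough way to see whether Java is visible from within R is sketched below. It only looks for a java executable on the search path; it does not confirm that a full JDK is present, nor that the tm package will find it.

```r
# Look for a "java" executable on the system search path.
# Sys.which() returns an empty string if nothing is found.
java_path <- Sys.which("java")

if (nzchar(java_path)) {
  message("Java found at: ", java_path)
} else {
  message("No java executable was found on the search path")
}
```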
Checking the Installation
To check, e.g., that latticeExtra (and dependencies) is
properly installed, start R and type, on the command line:
library(latticeExtra)
As rattle will be important for the course, please check that
you are able to run it. Start up R, and type:
library(rattle)
rattle()
Mac users may get warning messages. These can almost certainly be ignored.
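To check several packages in one pass, something along the following lines may be convenient. This is a sketch, with course_pkgs and installed_ok as illustrative names; extend the vector to cover whichever packages you wish to check. It tests only that each package can be loaded, not that its graphical interface starts.

```r
# Packages to check; add further names as required
course_pkgs <- c("latticeExtra", "rattle")

# requireNamespace() returns TRUE if the package can be loaded, else FALSE
installed_ok <- vapply(course_pkgs, requireNamespace, logical(1),
                       quietly = TRUE)
print(installed_ok)
```

Any package that shows FALSE needs to be installed, or reinstalled, before the course.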
Further Notes on the Installation of R
See the document
Installation of R, of R packages, and editor environments
Installation of Packages (or even running R) from a DVD
DVDs and memory sticks will be available at the course, from which
it will be possible to install, for R-2.14.0 or R-2.14.1 (if available
by the time of the course), any packages that are
lacking. Additionally, these DVDs will include an R executable that
has relevant packages already installed. Once the DVD is in a
computer's DVD drive, R can be run from the DVD.
Additional Exercises
These exercises are additional to those in the course notes.
Click here to get the scripts
Weaving the Exercises (R's Sweave function; for Techos Only!)
Here is a brief introduction to the combining of LaTeX source and suitably
annotated R code in a document that can then be processed through R's
Sweave function to give the final document.
R talks to LaTeX
Sweave versions of the exercise scripts
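For those who want to see the mechanics, here is a minimal sketch of a Sweave run. The file name demo.Rnw and its contents are made up for illustration and are not one of the course scripts. The source file is written to R's temporary directory, and Sweave() then produces the corresponding LaTeX file there.

```r
# A tiny Sweave source: LaTeX text, one R code chunk, and one \Sexpr{}
rnw <- c("\\documentclass{article}",
         "\\begin{document}",
         "Summary of the integers 1 to 5:",
         "<<>>=",
         "summary(1:5)",
         "@",
         "The mean is \\Sexpr{mean(1:5)}.",
         "\\end{document}")

# Write the (hypothetical) file demo.Rnw into the temporary directory
rnw_path <- file.path(tempdir(), "demo.Rnw")
writeLines(rnw, rnw_path)

# Sweave() writes demo.tex into the current working directory,
# so move to the temporary directory for the duration of the run
old_dir <- setwd(tempdir())
Sweave(rnw_path)
setwd(old_dir)

tex_path <- file.path(tempdir(), "demo.tex")
print(file.exists(tex_path))
```

The resulting demo.tex can then be processed with LaTeX in the usual way to give the final document.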
Do you have data that you are happy to expose to wider view?
Contact the presenter with the details. Data that have been used for
a published paper may be especially suitable.
Links
- Further exercises; and Weaving with R (strictly for those who want
some greater challenge!)
- Web site for R
(CRAN = Comprehensive R Archive Network)
- There are further interesting R links here.
- John Maindonald's web site
- email: john.maindonald AT anu.edu.au
Last updated: November 3 2011.