Workshop on the R System and Packages

AMSI/SSAI ASC2008 Satellite Workshop: Computing With R

June 27-29, 2008

Workshop blurb (for the historical record), and Report on Workshop

Notes and Overheads (updated July 11 2008)

Graphics: lattice, ggplot2 and rgl.
The R system has several different flavours of graphics. These include: base or traditional graphics; lattice's highly stylized graphics; grid graphics, on which lattice is built; and the ggplot2 implementation of Wilkinson's Grammar of Graphics. For three-dimensional rotating graphics, note rgl and the dynamic graphics capabilities of the rggobi interface to GGobi.
Supplement and summary notes (28pp.)
Code for the examples.
Overheads - lattice
Generalized Linear Models (Peter Dunn)
Generalized Linear Models extend linear models in ways that have proved especially useful in the analysis of count data.
Overheads
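As a rough illustration of the count-data case mentioned under Generalized Linear Models, here is a minimal sketch of a Poisson regression with a log link, fitted by iteratively reweighted least squares. It is written in plain Python so that it is self-contained, it is not taken from the workshop materials, and the function name `fit_poisson_glm` and the toy data are assumptions made for the example.

```python
import math


def fit_poisson_glm(x, y, iterations=25):
    """Fit log(mu) = b0 + b1 * x to counts y by iteratively reweighted least squares."""
    # Start from the intercept-only fit: the log of the mean count.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iterations):
        eta = [b0 + b1 * xi for xi in x]   # linear predictor
        mu = [math.exp(e) for e in eta]    # fitted means under the log link
        # Working response and weights for the Poisson family with log link.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        w = mu
        # Weighted least squares of z on x (intercept and slope).
        sw = sum(w)
        xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
        zbar = sum(wi * zi for wi, zi in zip(w, z)) / sw
        sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
        sxz = sum(wi * (xi - xbar) * (zi - zbar) for wi, xi, zi in zip(w, x, z))
        b1 = sxz / sxx
        b0 = zbar - b1 * xbar
    return b0, b1


# Toy data (an assumption for illustration): counts that double with each unit of x,
# so the fitted slope should be close to log(2).
b0, b1 = fit_poisson_glm([0, 1, 2, 3, 4], [2, 4, 8, 16, 32])
print(b0, b1)
```

In R itself the corresponding fit would normally be done with the `glm` function and a Poisson family; the loop above only shows the kind of computation involved.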
Rattle (Graham Williams)
Rattle (the R Analytical Tool To Learn Easily) is a data mining toolkit used to analyse very large collections of data. Rattle presents statistical and visual summaries of data, transforms data into forms that can be readily modelled, builds both unsupervised and supervised models from the data, presents the performance of models graphically, and scores new datasets.
Overheads
Multi-level models (Andrew Robinson)
Many real data sets have a hierarchical multi-level structure of variation; for example, multiple measurements within trees within stands within forests. The modeling approaches that have been developed for such data provide a rich source of insights and challenges. We will showcase the nlme and lme4 packages, both of which provide extensive infrastructure for the analysis of multi-level data. The lme4 package is under very active continuing development, with new features and improvements appearing at regular intervals.
Overheads
Applications in medical statistics - meta-analysis, nonparametric testing, and power calculations (Malcolm Hudson)
Topics include:
i. graphic presentation of meta analysis results with,
ii. the coin package for nonparametric testing and power computations, with comparisons to bootstrap procedures.
Notes
Overheads
Mixing R and LaTeX/Office (Peter Dunn)
Sweave provides a flexible framework for mixing text and S code (R implements the S language) for automatic report generation, for example to enable reproducible research. The basic idea is to place R code in the LaTeX or Open Office document and ask R to replace the code with its output (text, tables and/or graphs), so that the final document contains only the text and the output of the statistical analysis. Currently, S code with markup can be incorporated into either LaTeX or Open Office documents. This makes it possible to regenerate a report if the input data change, and the same file that produces the report also documents the code needed to reproduce the analysis. Where published papers report statistical analyses and/or summaries, it is too often hard to be sure just what analysis was done. Reference to a Sweave version (typically on a web page) documents the analysis to a standard, and with a completeness, that is not otherwise possible.
Notes
Overheads
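The replace-code-with-output idea behind Sweave can be sketched in a few lines. The example below is a toy written in Python, not Sweave itself: it recognises chunks opened by `<<...>>=` and closed by `@` (the noweb-style markers that Sweave uses), and hands each chunk to an evaluator. The names `weave` and `toy_evaluate` are hypothetical, and `toy_evaluate` is only a stand-in for sending the chunk to R.

```python
import re

# A chunk starts with a line like "<<options>>=" and ends with a line holding "@".
CHUNK = re.compile(r"<<[^>\n]*>>=\n(.*?)\n@[ \t]*\n?", re.DOTALL)


def weave(source, evaluate):
    """Return source with every code chunk replaced by evaluate(chunk_code)."""
    return CHUNK.sub(lambda match: evaluate(match.group(1)) + "\n", source)


def toy_evaluate(code):
    """Stand-in evaluator (an assumption): evaluates one simple arithmetic expression."""
    return str(eval(code, {"__builtins__": {}}, {"sum": sum}))


document = "The mean is\n<<>>=\nsum([2, 4, 6]) / 3\n@\nas computed above.\n"
woven = weave(document, toy_evaluate)
print(woven)
```

Because the report text and the code live in one source file, rerunning the weave step after the data change regenerates the whole report, which is the property the session description emphasises.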
BRugs for Bayesian Analysis (Matt Wand)
The BRugs package facilitates Bayesian statistical analyses through the use of scripts, i.e. without the need for menus and mouse-clicks. Scripting in both the R and BUGS (Bayesian inference Using Gibbs Sampling) languages is required. Other than time, there is no firm limit on the complexity of Bayesian models that can be handled with BRugs. Because R is used at both the front end and the back end of the analysis, one can take advantage of R's functionality for data input and pre-processing, as well as for summary and graphical display. This component of the short course will provide illustrations at both introductory and advanced levels.
Spatial statistics (Adrian Baddeley, CSIRO/UWA)
There are three main kinds of spatial data: geostatistical data, where the response variable is recorded at a point location (e.g. daily temperature records at a set of weather stations); regional data, where the response variable is obtained from a spatial region (e.g. number of HIV notifications in each health authority area); and spatial point patterns, where the response is the location of an event (e.g. locations of petty crimes in Chicago). The R packages 'geoR', 'spdep' and 'spatstat' (respectively) provide functionality for these types of data.
Overheads
Time series forecasting (Rob Hyndman: web page)
The forecasting bundle of R packages provides new forecasting methods, and graphical tools for displaying and analysing forecasts.
Overheads
Package Construction (Rob Hyndman: web page)
Much of the power and flexibility of R derives from the large variety of powerful packages that are available to add on to the base system. Putting code into R packages is surprisingly straightforward, for a user who is careful to follow the rules.
Notes