Essential software for statistics

Introduction

This web page was created for students in the Department of Statistics and Actuarial Science at the University of Waterloo. The aim is to present an overview of software packages that will eventually become essential, or at least useful, for statisticians.

This is an ongoing project and only part of the content on this page is actually written on behalf of the UWaterloo Department of Statistics and Actuarial Science. For topics we have not covered yet, we provide links to web pages and documents that we find useful.

The content specifically created for this web page so far is:

Integrated Development Environments (IDEs)

IDEs make the programming workflow more efficient. Many IDEs exist, and their strengths depend on the programming language.

  • UWaterloo: Emacs and Eclipse (PDF) can be used for most programming languages including R, C, Java, and also LaTeX.

  • RStudio: a good IDE for R that also integrates Sweave and knitr support.

LaTeX

LaTeX is a document preparation system which is used by many academics and students to write up their projects, assignments, publications and presentation slides.

Search the Internet for an introduction to LaTeX or read a book. Many good sources exist.

A popular book, based on Amazon ranking, is More Math Into LaTeX, published by Springer and available for free to UWaterloo students from the SpringerLink website. The author has made a set of freely available videos to help you get started.

Once you have figured out the basics of LaTeX, inform yourself about:

Using R

Introduction to R, Emacs and Eclipse

Data manipulation

Graphics: base, grid, lattice and ggplot2

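Sweave and knitr

Sweave and knitr let you mix LaTeX source with R code in a single document, so that results and figures are regenerated from the code whenever the document is compiled. This is a convenient way to write your assignments.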

Programming R

Many good books on the topic now exist.

Functions and Methods

Object oriented model

Building your own library

Accessing C and C++ from within R

Creating graphical user interfaces with the tcltk library in R

Introduction to Tcl

The tcltk library for R

Geometry managers

SAS Academic Program

Information (including access codes) about free e-learning resources for students can be found on slide/page 7.

Information on SAS Certification for students can be found on slide/page 10, and on the SAS Certification website. You’ll need to e-mail certification@sas.com with proof that you’re a University of Waterloo student, and they’ll e-mail you a $90 discount voucher.

SAS Academic Program - Resources for Waterloo Students (PDF)

General programming

C and C++

  • External file: C and C++ in 5 days by Philip Machanick

Java

Regular expressions

UNIX

Basics of UNIX

UWaterloo math servers

UWaterloo statistics students have access to several servers to run their simulations on. You will find all the necessary information on the Math Faculty Computing Facility (MFCF) website.

To access the servers, log in from anywhere via ssh:

ssh username@linux.math.uwaterloo.ca

Consider setting up ssh keys.

Once logged in to the servers, you may need to raise your resource limits, for example to remove the caps on CPU time and file size:

limit cputime unlimited
limit filesize unlimited

You can make this happen automatically at login by editing the .cshrc file in your home directory: replace the "limit cputime" line with "unlimit cputime".

Use nohup to run your process so that it keeps running after you log out. For example, to run an R simulation in the background, use

nohup R CMD BATCH myRfile.R &
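
Once a job has been started this way, it helps to keep track of it. The following is a minimal sketch of one way to do that, written as a POSIX sh script rather than for the csh login shell. The file names job.log and job.pid are hypothetical, and a short sleep stands in for the real simulation so that the script can be tried anywhere.

```shell
#!/bin/sh
# Stand-in for a long-running job (assumption for this sketch).
# On the servers this would be something like: R CMD BATCH myRfile.R
job='sleep 1; echo "simulation finished"'

# Start the job immune to hangups, sending its output to a log file
# instead of the default nohup.out.
nohup sh -c "$job" > job.log 2>&1 &

# $! holds the process ID of the most recently started background job.
pid=$!
echo "$pid" > job.pid

# Later, even from a new login session, check whether it is still running.
# kill -0 sends no signal; it only tests that the process exists.
if kill -0 "$(cat job.pid)" 2>/dev/null; then
    echo "job is still running"
fi

# For this demonstration only: wait for the job to end, then show its log.
wait "$pid"
cat job.log
```

Save the sketch to a file and run it with sh. For a real simulation, replace the stand-in with the R command; R CMD BATCH writes the R output to myRfile.Rout by default, so that file is the one to inspect for results and error messages.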


Author: Adrian Waddell <arwaddel@math.uwaterloo.ca>
Date: 2014-02-13
