Essential software for statistics

Introduction

This web page was created for students in the Department of Statistics and Actuarial Science at the University of Waterloo. It aims to give an overview of software packages that statisticians will eventually find essential, or at least useful.

This is an ongoing project and only part of the content on this page is actually written on behalf of the UWaterloo Department of Statistics and Actuarial Science. For topics we have not covered yet, we provide links to web pages and documents that we find useful.

The content specifically created for this web page so far is:

Integrated Development Environments (IDEs)

IDEs make the programming workflow more efficient. Many IDEs exist, and their strengths depend on the programming language.

  • UWaterloo: Emacs and Eclipse (PDF) can be used for most programming languages including R, C, Java, and also LaTeX.

  • RStudio: a good R IDE that also integrates Sweave and knitr support.

LaTeX

LaTeX is a document preparation system which is used by many academics and students to write up their projects, assignments, publications and presentation slides.

Search the Internet for an introduction to LaTeX or read a book. Many good sources exist.

A popular book, based on Amazon ranking, is More Math Into LaTeX, published by Springer and available for free to UWaterloo students from the SpringerLink website. The author has made a set of freely available videos to help you get started.

Once you have figured out the basics of LaTeX, inform yourself about:

Using R

Introduction to R, Emacs and Eclipse

Data manipulation

Graphics: base, grid, lattice and ggplot2

Sweave and knitr

Sweave and knitr let you mix LaTeX source with R source in a single document. This is a great way to write your assignments.

Programming R

Many good books on the topic now exist.

Functions and Methods

Object oriented model

Building your own library

Accessing C and C++ from within R

Creating graphical user interfaces with the tcltk library in R

Introduction to Tcl

The tcltk library for R

Geometry managers

SAS Academic Program

Free SAS software is available to all University of Waterloo students, professors and researchers. Our free cloud platform, SAS OnDemand for Academics, includes SAS Studio and a connection to Jupyter Hub.  

If SAS OnDemand for Academics does not suit your needs, please connect with Lindsay.Hart@sas.com to explore other free options available to you.

For access to free e-learning courses, curriculum content, and certification resources, please visit the SAS Academic Hub. You must register with your University of Waterloo email address.  

General programming

C and C++

  • External file: C and C++ in 5 days by Philip Machanick

Java

Regular expressions

UNIX

Basics of UNIX

UWaterloo math servers

UWaterloo statistics students have access to several servers to run their simulations on. You will find all the necessary information on the Math Faculty Computing Facility (MFCF) website.

To access the servers, log in from anywhere via ssh:

ssh username@linux.math.uwaterloo.ca

Consider setting up ssh keys.
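
Setting up keys usually comes down to two steps: generate a key pair on your own machine, then copy the public key to the server. The following is a minimal sketch, assuming OpenSSH's ssh-keygen and ssh-copy-id are installed locally. The function name setup_ssh_key and the key path are illustrative choices, not something prescribed by MFCF. The steps are wrapped in a function so that nothing runs until you call it.

```shell
# setup_ssh_key USER HOST
# Hypothetical helper: creates an Ed25519 key pair if none exists yet,
# then installs the public key on HOST for USER.
setup_ssh_key() {
    user=$1
    host=$2
    key="$HOME/.ssh/id_ed25519"   # assumed key location

    mkdir -p "$HOME/.ssh"
    chmod 700 "$HOME/.ssh"

    # Generate a key pair only if this one does not exist yet;
    # ssh-keygen prompts for a passphrase.
    if [ ! -f "$key" ]; then
        ssh-keygen -t ed25519 -f "$key" || return 1
    fi

    # Append the public key to ~/.ssh/authorized_keys on the server;
    # this asks for your account password.
    ssh-copy-id -i "$key.pub" "$user@$host"
}
```

Calling setup_ssh_key username linux.math.uwaterloo.ca on your own machine should then let later ssh logins authenticate with the key instead of the account password.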

Once logged in to the servers, you may have to change your resource limits:

limit cputime unlimited
limit filesize unlimited

To apply this automatically at login, edit the .cshrc file in your home directory and replace the "limit cputime" line with "unlimit cputime".
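
The limit and unlimit commands are csh/tcsh built-ins. If your login shell is instead a Bourne-style shell such as bash (an assumption; check with echo $SHELL), the counterpart is the ulimit built-in. A minimal sketch follows; the function names raise_limits and show_limits are illustrative.

```shell
# Raise the soft limits for CPU time and file size as far as the hard
# limits allow. If the hard limit is "unlimited", so is the new soft limit.
raise_limits() {
    ulimit -S -t "$(ulimit -H -t)"
    ulimit -S -f "$(ulimit -H -f)"
}

# Print the current soft limits: CPU time in seconds, file size in blocks.
show_limits() {
    echo "cputime:  $(ulimit -S -t)"
    echo "filesize: $(ulimit -S -f)"
}
```

Calling raise_limits from a startup file such as .profile would apply the change at every login. A soft limit cannot be raised above the hard limit set by the system administrators.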

Use nohup to keep your process running after you log out. For example, to run an R simulation, use

nohup R CMD BATCH myRfile.R &
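
It often helps to send the job's output to a log file and to note its process ID, so that you can check on the job or stop it after logging back in. The following is a minimal sketch in POSIX sh (not csh syntax, so save it in a script or run it under sh or bash). The function name run_detached and the log file name used below are illustrative.

```shell
# run_detached LOGFILE COMMAND [ARGS...]
# Starts COMMAND under nohup in the background, sends its stdout and
# stderr to LOGFILE, and prints the process ID of the background job.
run_detached() {
    logfile=$1
    shift
    nohup "$@" > "$logfile" 2>&1 &
    echo "$!"
}
```

For example, run_detached sim.log R CMD BATCH myRfile.R starts the simulation and prints its process ID, and ps -p with that ID later shows whether the job is still running. R CMD BATCH writes the R output itself to myRfile.Rout by default, so sim.log mainly captures messages from the launch.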


Author: Adrian Waddell <arwaddel@math.uwaterloo.ca>
Date: 2014-02-13