Skill-Building Workshops

Workshops are offered each day from January 12 - 16 (Monday - Friday) and feature instruction in software tools for modeling, analysis, scientific computing and visualization, as well as how to use cluster, grid and cloud resources.  

Location:

Workshops take place in the Northwest Building, 52 Oxford Street, B1 level, and are open to the public. 

Questions? Email computefest@seas.harvard.edu 

Registration is open! Click below on each day's workshop schedule to register.

 

Introduction to R (9:30am)

9:30am - 12:30pm
Introduction to R (Presented by the Institute for Quantitative Social Science)

This workshop offers a hands-on introduction to R, the open-source system for statistical computation and graphics. You will learn how to import and manage datasets, create R objects, install and load R packages, conduct basic statistical analyses, and create common graphical displays. This workshop is appropriate for those with little or no prior experience with R. Participants should bring a laptop with a recent version of R (http://cran.r-project.org/) and RStudio (http://www.rstudio.com/products/rstudio/download/) installed.

Register here

Scientific and Research Computing on AWS (9:30am)

9:30am - 12:00pm and 1:30pm - 2:30pm
Scientific and Research Computing on Amazon Web Services (Presented by Angel Pizarro of Amazon AWS)

It is possible to bootstrap a personal compute cluster on Amazon Web Services within minutes, but what does that really mean? Which services should you use, and what are the implications of those services for how you develop algorithms and data analysis pipelines? In this workshop, we'll cover the essential services for scientific computing on AWS. We'll also discuss some practical examples of using AWS for HPC workloads, with some hands-on experience.

Note: The morning session runs from 9:30am - 12:00pm and will include all classroom instruction. The workshop breaks for lunch from 12:00 - 1:30pm. The afternoon session takes place from 1:30 - 2:30pm and will consist of the hands-on portion of the workshop. Attendees who would like to complete the hands-on exercises on-site are expected to bring their own internet-connected laptop.

Register here

Vowpal Wabbit (1:30pm)

1:30 - 4:30pm
Vowpal Wabbit (Presented by John Langford, Machine Learning Research Scientist at Microsoft Research)

Vowpal Wabbit (aka VW) is a fast, open-source, out-of-core learning system, available as both a library and a program. It is notable as an efficient, scalable implementation of online machine learning, with support for a number of machine learning reductions, importance weighting, and a selection of different loss functions and optimization algorithms.
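To make the terms "online", "out-of-core" and "importance weighting" concrete, here is a minimal sketch in plain Python. It is not Vowpal Wabbit's code or API, and VW's actual update rules are more sophisticated. The function names, the toy data (labels equal to 2 times the input) and the learning rate are all assumptions made up for the illustration.

```python
import itertools

def example_stream(n_examples):
    """Yield (features, label, importance) one example at a time, so the
    full data set never has to sit in memory (the out-of-core idea)."""
    inputs = itertools.cycle([1.0, 2.0, 0.5])  # hypothetical toy inputs
    for i, x in enumerate(itertools.islice(inputs, n_examples)):
        importance = 2.0 if i % 2 == 0 else 1.0  # made-up importance weights
        yield {"x": x}, 2.0 * x, importance

def online_squared_loss_sgd(examples, learning_rate=0.1):
    """Update a linear model after every single example (online learning).

    Plain stochastic gradient descent on the loss 0.5 * (prediction - label)**2,
    with each step simply scaled by the example's importance weight.
    """
    weights = {}
    for features, label, importance in examples:
        prediction = sum(weights.get(name, 0.0) * value
                         for name, value in features.items())
        error = prediction - label
        for name, value in features.items():
            gradient = error * value
            weights[name] = (weights.get(name, 0.0)
                             - learning_rate * importance * gradient)
    return weights

weights = online_squared_loss_sgd(example_stream(300))
print(weights)
```

Because the learner consumes a generator, memory use stays constant no matter how many examples are streamed through it.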

Register Here

Intro to Programming in Python (10:00am)

10:00am - 12:30pm
Introduction to Programming in Python (presented by Harvard SEAS Computing)

This session will provide an introduction to basic programming concepts using the Python language, and enable participants to write basic programs, read and write files, and generate data plots. Hands-on.  Participants must bring a laptop with Anaconda Python Distribution installed. Click here to download Python and retrieve tutorial documents for the workshop. 
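As a rough idea of the kind of small program the session aims at (this is not taken from the workshop materials), the sketch below writes a data file and reads it back. The file name, column names and values are invented for the example, and plotting is left out so that only the Python standard library is needed.

```python
import csv
import os
import tempfile

def write_measurements(path, rows):
    """Write (time, value) pairs to a CSV file with a header row."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["time", "value"])
        writer.writerows(rows)

def read_measurements(path):
    """Read the CSV file back into a list of (float, float) tuples."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        return [(float(row["time"]), float(row["value"])) for row in reader]

# Hypothetical sample data, made up for this example.
sample = [(0.0, 1.5), (1.0, 2.5), (2.0, 4.0)]

# Write into a fresh temporary directory so no existing file is touched.
data_path = os.path.join(tempfile.mkdtemp(), "measurements.csv")
write_measurements(data_path, sample)

loaded = read_measurements(data_path)
mean_value = sum(value for _, value in loaded) / len(loaded)
print(loaded, mean_value)
```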

Register here

R graphics with ggplot2 (9:30am)

9:30am - 12:30pm
R graphics with ggplot2 (Presented by the Institute for Quantitative Social Science)

This introduction to the popular ggplot2 R graphics package will show you how to create a wide variety of graphical displays in R. Topics covered include aesthetic mapping and scales, faceting, and themes. This is an intermediate-level workshop appropriate for those already familiar with R. Participants should be familiar with importing and saving data and manipulating data.frames in R. Participants should bring a laptop with a recent version of R (http://cran.r-project.org/) and RStudio (http://www.rstudio.com/products/rstudio/download/) installed.

Register here

Introduction to MATLAB: Problem-Solving and Programming (1:00pm)

1:00 - 5:00pm
Introduction to MATLAB: Problem-Solving and Programming (A hands-on workshop, presented by MathWorks)

MATLAB is a high-level language that allows you to quickly perform computation and visualization through easy-to-use programming constructs. This hands-on lab presents the essentials you need to use MATLAB for your classes or research.  In this hands-on workshop, attendees will learn how to import data from an external file, plot the data over time, then perform some analysis to view the data trends. You’ll learn how to write a MATLAB script and publish it to a format for sharing, such as HTML. You’ll also learn how to write your own MATLAB functions, use flow control, and create loops.  By the end of the session, you’ll have learned to create an application in MATLAB.

Key Topics Include:

  • Navigating the MATLAB desktop
  • Working with variables in MATLAB
  • Calling MATLAB functions
  • Importing and extracting data
  • Visualizing data
  • Conducting computational analysis
  • Fitting data to a curve
  • Automating analysis with scripts
  • Publishing MATLAB programs
  • Programming in MATLAB

NOTE: Attendees should bring a laptop with MATLAB installed. Necessary workshop files will be provided. In advance of the session, MathWorks will provide each registrant with a temporary MATLAB license that attendees will be required to install. Please register for the hands-on workshop only if you are certain that you can attend.

Presented By: Eoin Moore delivers MATLAB training to academic and corporate audiences and also develops online self-paced MATLAB courses. He holds a B.S. in physics from the University of Massachusetts and an M.S. in physics from the University of California San Diego. His graduate research involved the analysis of plasmas, including turbulent flow and nonlinear dynamics.

Register Here

Deep Learning with Theano & Pylearn2 - Part I (9:30am)

9:30 - 11:30am
Deep Learning with Theano and Pylearn2 - PART I (Presented by the Institute for Applied Computational Science)

Classical machine learning techniques often involve extracting some form of manually designed features from data and then training a model for classification. Designing the features is an important task, often involving domain knowledge and manual tuning. Deep learning methods instead learn multiple levels of feature representations directly from the data. Learning the features has been shown to improve classification results, and make the same model applicable to different data modalities like images, speech, and text.
Theano and Pylearn2 are two great Python libraries that facilitate using deep learning methods, even on the GPU.

In this workshop we will introduce the basics of deep learning and models like deep networks, autoencoders, and convolutional networks, and how to train them. Part one (Wednesday, January 14) will focus on the fundamentals and how to learn features in a supervised and unsupervised setting. We will also discuss best practices for training these complex models. Part two (Thursday, January 15) will introduce convolutional neural networks and cover advanced tips and tricks for training.

Part 1 (Wednesday, January 14): Deep learning fundamentals, deep neural networks, autoencoders and best practices for training.

Part 2 (Thursday, January 15): Convolutional neural networks and advanced tips and tricks (dropout, maxout, etc.).

Prerequisites: Python programming; a laptop with Python 2.7, IPython, Theano and Pylearn2 installed for in-class work. Installation instructions will be emailed to all registrants. Registrants may sign up for either or both sessions.

Register for Part I here

Data Analytics & Machine Learning with MATLAB (9:30am)

9:30 - 11:30am
Data Analytics & Machine Learning with MATLAB (Presented by MathWorks)

Learn approaches and techniques available in MATLAB to turn large volumes of complex data into actionable information and how to use MATLAB to develop and integrate effective analytics.

Using Data Analytics to turn large volumes of complex data into actionable information can help you improve design and decision-making processes. However, developing effective analytics and integrating them into business systems can be challenging. In this seminar you will learn approaches and techniques available in MATLAB® to tackle these challenges.

Highlights include:

  • Accessing, exploring, and analyzing data stored in files, the web, and data warehouses
  • Techniques for cleaning, exploring, visualizing, and combining complex multivariate data sets
  • Prototyping, testing, and refining predictive models using machine learning methods
  • Integrating and running analytics within enterprise business systems and interactive web applications

Presented By: Adam Filion holds a BS and MS in Aerospace Engineering from Virginia Tech where his research involved nonlinear controls of spacecraft and periodic orbits in the three-body problem. After graduating he joined the MathWorks Engineering Development Group in 2010 and moved to Application Engineering in 2012. He currently focuses on data analytics, machine learning and big data.

Register Here

Scientific Programming in Python (1:30pm)

1:30 - 4:30pm
Scientific Programming in Python (Presented by Continuum Analytics)

In many cases, Python can be a good or even better alternative to programming in MATLAB, C, R and other languages. This workshop will introduce NumPy (which provides high-performance numerical data structures and associated routines for multi-dimensional arrays), and then build on this by exploring matplotlib (for MATLAB-like plotting), SciPy (a collection of common numerical methods implemented in C with Python interfaces), and multiprocessing (for small- to medium-scale parallel processing). Tools and techniques that make Python a great language for any scientist will also be presented, along with short hands-on exercises for those with Anaconda Python. Participants must bring a laptop with Anaconda Python Distribution installed.

Register here

Intro to Julia (1:30pm)

1:30 - 4:30pm
Julia

Registration and details coming soon!

Tackling Big Data with MATLAB (9:30am)

9:30 - 11:30am
Tackling Big Data with MATLAB (Presented by MathWorks)

Discover new big data capabilities in MATLAB R2014b, including distributed memory, MapReduce programming techniques, cluster computing, and Hadoop.

Are the data sets you need to analyze becoming uncomfortably large to work with in memory? Are they taking too long to compute? Are you finding it challenging to scale your algorithms to big data sets? In this seminar, you will learn strategies and techniques for handling large amounts of data in MATLAB. New big data capabilities in MATLAB R2014b will be highlighted.

Highlights include:

  • Using best practices for memory use in MATLAB
  • Accessing data in large text files, databases or from the Hadoop Distributed File System (HDFS)
  • Leveraging distributed memory to work with large data sets
  • Processing data using the MapReduce programming technique
  • Developing algorithms on your desktop and scaling to a cluster, cloud or Hadoop

Presented By: Adam Filion holds a BS and MS in Aerospace Engineering from Virginia Tech where his research involved nonlinear controls of spacecraft and periodic orbits in the three-body problem. After graduating he joined the MathWorks Engineering Development Group in 2010 and moved to Application Engineering in 2012. He currently focuses on data analytics, machine learning and big data.

Register Here

Deep Learning with Theano and Pylearn2 - Part II (9:30am)

9:30 - 11:30am
Deep Learning with Theano and Pylearn2 - PART II (Presented by the Institute for Applied Computational Science)

Classical machine learning techniques often involve extracting some form of manually designed features from data and then training a model for classification. Designing the features is an important task, often involving domain knowledge and manual tuning. Deep learning methods instead learn multiple levels of feature representations directly from the data. Learning the features has been shown to improve classification results, and make the same model applicable to different data modalities like images, speech, and text.
Theano and Pylearn2 are two great Python libraries that facilitate using deep learning methods, even on the GPU.

In this workshop we will introduce the basics of deep learning and models like deep networks, autoencoders, and convolutional networks, and how to train them. Part one (Wednesday, January 14) will focus on the fundamentals and how to learn features in a supervised and unsupervised setting. We will also discuss best practices for training these complex models. Part two (Thursday, January 15) will introduce convolutional neural networks and cover advanced tips and tricks for training.

Part 1 (Wednesday, January 14): Deep learning fundamentals, deep neural networks, autoencoders and best practices for training.

Part 2 (Thursday, January 15): Convolutional neural networks and advanced tips and tricks (dropout, maxout, etc.).

Prerequisites: Python programming; a laptop with Python 2.7, IPython, Theano and Pylearn2 installed for in-class work. Installation instructions will be emailed to all registrants. Registrants may sign up for either or both sessions.

Register for Part II here

Advanced Python (1:30pm)

1:30 - 4:30pm
Advanced Python (Presented by Continuum Analytics)

Generators, decorators, descriptors, function wrappers, metaclasses, standard library containers you may not know, and a mental model of how Python works that will give you the high ground in any debate about how to use Python properly. This workshop will cover a range of advanced Python topics through live coding demonstrations and opportunities to follow along and complete short hands-on exercises for anyone with Anaconda Python. Participants must bring a laptop with Anaconda Python Distribution installed.
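For readers unsure what the first three topics look like in practice, the sketch below shows one small instance of each: a generator, a decorator, and a descriptor. It is not drawn from the workshop materials. All names are invented for the illustration, and it is written for current Python 3 (`__set_name__` needs Python 3.6 or later).

```python
import functools

def countdown(n):
    """Generator: lazily yields n, n-1, ..., 1."""
    while n > 0:
        yield n
        n -= 1

def call_counter(func):
    """Decorator: wraps func and counts how many times it is called."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

class Positive:
    """Descriptor: rejects non-positive values on attribute assignment."""
    def __set_name__(self, owner, name):
        self.storage = "_" + name  # per-instance attribute holding the value

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.storage)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("value must be positive")
        setattr(instance, self.storage, value)

class Sample:
    mass = Positive()  # every assignment to .mass goes through Positive.__set__

    def __init__(self, mass):
        self.mass = mass

@call_counter
def square(x):
    return x * x

squares = [square(i) for i in countdown(3)]
print(squares, square.calls, Sample(2.0).mass)
```

Assigning a non-positive mass, for example `Sample(-1)`, raises `ValueError` because the assignment in `__init__` is routed through the descriptor.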

Register here

Intro to GPU Computing & Use of Libraries (1:30pm)

1:30 - 5:00pm
Introduction to GPU Computing and Use of Libraries (Presented by NVIDIA)

NVIDIA GPUs are the world’s fastest and most efficient accelerators delivering world record scientific application performance. 

NVIDIA CUDA is the most pervasive parallel computing model, used by over 250 scientific applications and over 150,000 developers worldwide.  

The 1.5 Day Programming Workshop will focus on introducing scientific computing programming utilizing NVIDIA GPUs to accelerate applications. The Workshop will introduce programming techniques using CUDA and OpenACC paradigms as well as optimization, profiling, and debugging methods for GPU programming.  

Topics covered include GPU Architecture, OpenACC, Introduction to CUDA, CUDA Libraries, and CUDA performance tools such as NVIDIA Visual Profiler along with an introduction to GPUs for Machine Learning.

This workshop will cover:

  • High-level overview of GPU architecture (lecture)
  • OpenACC 2.0 update
  • Basics of CUDA Programming (hands-on)
      • CUDA Syntax
      • Memory Allocation
      • Launching Simple Kernels
  • CUDA Enabled Libraries (hands-on)
      • AmgX
      • cuDNN
      • Examples of linking to libraries

Register here

NVIDIA: Performance and Optimizations (9:30am)

9:30am - 12:30pm
Performance and Optimizations (Presented by NVIDIA)

NVIDIA GPUs are the world’s fastest and most efficient accelerators delivering world record scientific application performance. 

NVIDIA CUDA is the most pervasive parallel computing model, used by over 250 scientific applications and over 150,000 developers worldwide.  

The 1.5 Day Programming Workshop will focus on introducing scientific computing programming utilizing NVIDIA GPUs to accelerate applications. The Workshop will introduce programming techniques using CUDA and OpenACC paradigms as well as optimization, profiling, and debugging methods for GPU programming.  Topics covered include GPU Architecture, OpenACC, Introduction to CUDA, CUDA Libraries, and CUDA performance tools such as NVIDIA Visual Profiler along with an introduction to GPUs for Machine Learning.

This workshop session will cover fundamental performance optimizations (hands-on): global and shared memory optimizations, using matrix transpose and matrix multiply as examples, and the use of the NVIDIA Visual Profiler to identify performance bottlenecks.
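The transpose exercise is commonly built around tiles: a small block of the matrix is staged in fast shared memory and then written out transposed, so that both the reads from and the writes to global memory stay contiguous. The sketch below shows only that tile decomposition, in CPU-only Python rather than CUDA. It gives no performance benefit in Python, and the function name and tile size are assumptions chosen for the illustration, not workshop code.

```python
def tiled_transpose(matrix, tile=2):
    """Transpose a list-of-lists matrix one tile at a time.

    Each (tile x tile) block is first copied into a small staging buffer,
    standing in for the shared-memory tile of a GPU thread block, and then
    written to its transposed position in the output.
    """
    rows, cols = len(matrix), len(matrix[0])
    out = [[None] * rows for _ in range(cols)]
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            # Stage one tile; slices shrink automatically at the matrix edges.
            staged = [row[c0:c0 + tile] for row in matrix[r0:r0 + tile]]
            # Write the staged tile back with row and column indices swapped.
            for i, staged_row in enumerate(staged):
                for j, value in enumerate(staged_row):
                    out[c0 + j][r0 + i] = value
    return out

grid = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15],
]
transposed = tiled_transpose(grid, tile=2)
print(transposed)
```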

Register here

Mathematica and the Wolfram Language (9:30am)

9:30am - 12:00pm
Mathematica and the Wolfram Language (Presented by Wolfram Research)

Developed for more than 25 years as part of Mathematica, the Wolfram Language offers a vast depth of built-in algorithms and knowledge, easily accessible through its symbolic nature. Through "hands-on" coding of simple programs, we will learn the basic principles of the language, and how to use its advanced functionalities, such as automated machine learning.

Prior to the workshop, registrants should create a Wolfram ID and download a free 15-day trial of Mathematica here: https://www.wolfram.com/mathematica/trial/

Register Here

Deep Machine Learning Frameworks with GPUs (1:30pm)

1:30 - 5:00pm
Deep Machine Learning Frameworks with GPUs (Presented by NVIDIA)

NVIDIA GPUs are the world’s fastest and most efficient accelerators delivering world record scientific application performance. 

NVIDIA CUDA is the most pervasive parallel computing model, used by over 250 scientific applications and over 150,000 developers worldwide.  

The 1.5 Day Programming Workshop will focus on introducing scientific computing programming utilizing NVIDIA GPUs to accelerate applications. The Workshop will introduce programming techniques using CUDA and OpenACC paradigms as well as optimization, profiling, and debugging methods for GPU programming.  

Topics covered include GPU Architecture, OpenACC, Introduction to CUDA, CUDA Libraries, and CUDA performance tools such as NVIDIA Visual Profiler along with an introduction to GPUs for Machine Learning.

This workshop will cover:

  • Advanced Optimizations: Using Streams and Concurrency to overlap communication and computation. cuBLAS example.
  • Machine Learning
      • Machine learning on GPUs, current uses, frameworks (lecture)
      • Hands-on example showing machine learning on GPUs

Register here

Intro to Julia (1:30pm)

1:30 - 4:30pm
Julia

Registration and details coming soon!
