Workshops

ALL WORKSHOPS PROCEEDING AS SCHEDULED

SKILL BUILDING WORKSHOPS
Dates: January 17-19, 2018
Learn computational skills in a hands-on format with expert instructors.  Workshops will include: advanced Python, Deep Learning with Image Classification, Microsoft Azure ML, H2O, Domino, Stan, Tableau, Deep Learning with NVIDIA.  

REGISTRATION
Workshops are free and open to Harvard affiliates only. Affiliates include Harvard staff, postdocs, researchers, faculty, students, and alumni.
Workshops will not be videotaped or livestreamed.

SEARCH WORKSHOPS BY DAY:

Intro to Tableau

Introduction to Tableau
Presented by Lauren Kearney, Tableau
Wednesday, Jan 17, 2018
8:30 AM - 11:30 AM

Get ready to learn how to inform and inspire others when presenting data with Tableau’s business intelligence platform. Join us for this hands-on session to discover how to turn data into actionable insights and find out why Forbes recently ranked Tableau as the technical skill with the third biggest rise in demand. We’ll provide an overview of Tableau and walk you through the key features and functionality of our visual analytics platform. This hands-on workshop will excite you to explore your data in a different way!

Note: Participants should bring a laptop with Tableau downloaded and installed.  Participants will be emailed instructions to download Tableau prior to the workshop.

REGISTER HERE

Build a Bot with Domino Data Lab!

Build a Bot with Domino Data Lab!
Presented by Mac Rogers & Team, Domino Data Lab
Wednesday, January 17, 2018
8:30 AM - 11:30 AM
 

Data science in industry is more than just testing ideas and writing reports - you have to create data products your business can leverage. However, data science can seem like magic at times - prompting questions from skeptical downstream consumers of your work. So, good data science products have to be explainable and reproducible. 

In complex, regulated industries like healthcare, finance, and insurance, auditability and collaboration are key. You have to make sure your models can retrain themselves to stay on the cutting edge; make them accessible to and decipherable by regulators; and fit them into the broader stack of work in your organization, because data science is a team sport.

In this workshop, participants will use Domino Data Lab, an enterprise R&D, collaboration, and deployment platform for data science teams. You’ll experience the full data science life cycle using cutting-edge industry technology: data prep and exploration; model development and tuning; and implementation and deployment of API endpoints and web apps. 

REGISTER HERE

Intro to Bayesian Inference with Stan

Introduction to Bayesian Inference with Stan
Presented by Michael Betancourt, Stan Development Team
Wednesday, Jan 17, 2018
12:00 PM - 5:00 PM

Despite the promise of big data, inferences are often limited not by the size of data but rather by its systematic structure. Only by carefully modeling this structure can we take full advantage of the data -- big data must be complemented with big models and the algorithms that can fit them. Stan is a platform for facilitating this modeling, providing an expressive modeling language for specifying bespoke models and implementing state-of-the-art algorithms to draw subsequent Bayesian inferences.

In this workshop I will introduce how to implement a robust Bayesian workflow in Stan, from constructing models to analyzing inferences and validating the underlying modeling assumptions. The workshop will emphasize interactive exercises run through PyStan, the Python interface to Stan.

Pre-requisites: The workshop will assume familiarity with the basics of calculus, probability, and statistics but the core concepts will be reviewed.

In order to participate in the interactive exercises attendees must provide a laptop with PyStan 2.17 (https://pystan.readthedocs.io/en/latest/) installed. Note that installation on Windows machines is highly nontrivial so Windows users are encouraged to try to install as soon as possible and report any issues at http://discourse.mc-stan.org.

REGISTER HERE

Intro to Microsoft Azure Machine Learning Services

Intro to Microsoft Azure Machine Learning Services
Presented by Heather Shapiro, Azure Machine Learning
Wednesday, January 17, 2018
3:00 PM - 6:00 PM

The Azure Machine Learning Workbench is an integrated, end-to-end data science and advanced analytics solution that helps professional data scientists prepare data, develop experiments, and deploy models at cloud scale. The Machine Learning Workbench includes cutting-edge AI frameworks from both Microsoft and the open source community.

In this session, we will walk through the basics of the Machine Learning services (preview) and use the Machine Learning Workbench for preparing, building, and deploying a Machine Learning model.

Pre-requisites:

  • If you don't have an Azure subscription, please create a free account before the workshop.
  • Download the latest Azure Machine Learning Workbench installer (Windows)(Mac)


NOTE: Currently, you can install the Azure Machine Learning Workbench desktop app on the following operating systems:

  • Windows 10
  • Windows Server 2016
  • macOS Sierra
  • macOS High Sierra

REGISTER HERE

 

 

Intro to Automatic Machine Learning with H2O

Intro to Automatic Machine Learning with H2O
H2O Workshop for Python Users
Presented by: Lauren DiPerna, H2O
Thursday, Jan 18, 2018
8:30 AM - 11:30 AM

H2O is an open-source machine-learning platform designed to build accurate models on distributed datasets at impressive speeds. This means you can use H2O to tackle terabytes of data in an ecosystem like Hadoop or Spark, but it doesn't mean you have to; you can also use H2O to build fast and accurate models on your laptop. The platform includes multiple APIs including Python, R, and Scala, and a web-based UI called Flow - so the learning curve is easy for new users. In this session we will focus on the Python API and show you how to use Flow in parallel, to view your data and monitor your models in the UI.

This workshop will kick off with an introduction to the H2O platform, then jump into a hands-on session, where you will learn how to build models using H2O's Python API. In the first half of the workshop, you will learn how to read in data, do basic data manipulations, build a supervised learning model, and use your model to make predictions.

In the second half of the workshop, you will learn about H2O's AutoML: an easy-to-use interface that automates the challenging process of training a large number of candidate models. AutoML includes the Stacked Ensemble algorithm, which will automatically train on the collection of individual models to produce a highly predictive ensemble model which, in most cases, will be the top performing model.

By the end of the workshop you will know how to use H2O to read in data, build models, and tune a selection of supervised models with AutoML.

Topics Covered:

  • What is H2O?
  • Training models on big data.
  • Building a supervised model with H2O's Python API.
  • Automatic machine learning with H2O's AutoML.

Pre-requisites & Pre-work:

All are welcome; however, the following assumptions are made:

  • You have completed this pre-workshop survey by January 15, 2018.
  • You have a laptop with H2O installed (see the Installation Requirements section below for details on how to download H2O).
  • You have a basic understanding of machine learning (i.e. you are familiar with supervised learning, understand the concept of train/validation/test splits or cross-validation, and are familiar with common methods used to evaluate a model's performance).
  • You are comfortable with Python to the extent that you can import data, manipulate it, and understand functions.

Installation Requirements:

1. Please see the H2O Minimum Requirements document for requirements and installation help.
2. Please install Jupyter Notebook on your laptop with either Python 2 or Python 3. Installation instructions can be found here.

Note: if you have issues installing H2O please come 15-30 minutes early to the workshop.

REGISTER HERE

Practical Applications of Deep Learning with MATLAB

Practical Applications of Deep Learning – a Hands-on MATLAB Workshop
Presented by Jianghao Wang, MathWorks
Thursday, January 18, 2018
12:00 PM - 2:30 PM

Are you new to deep learning and want to learn how to apply these techniques in your work? Deep learning achieves human-like accuracy for many tasks considered algorithmically unsolvable with traditional machine learning. It is frequently used to develop applications such as face recognition, automated driving, and image classification.

In this hands-on workshop, you will write code and use MATLAB to:

  1. Learn the fundamentals of deep learning and understand terms like “layers”, “networks”, and “loss”
  2. Build a deep network that can classify your own handwritten digits
  3. Access and explore various pretrained models
  4. Use transfer learning to build a network that classifies different types of food
  5. Train deep learning networks on GPUs in the cloud
  6. Learn how to use GPU code generation technology to accelerate inference performance

Pre-requisites:

Attendees who are unfamiliar with MATLAB are encouraged to review the 2-hour MATLAB Onramp tutorial. Additionally, attendees can come prepared with questions if they complete the 2-hour Deep Learning Onramp, also hosted at MATLAB Academy.

Attendees should bring a laptop; they will use MATLAB Online to access MATLAB through their web browsers.

REGISTER HERE

Python Machine Learning with sklearn

Python Machine Learning with sklearn
Presented by Rahul Dave, Institute for Applied Computational Science
Thursday, January 18, 2018
3:00 PM - 6:00 PM
 

We'll learn the basic concepts of machine learning models through the Python library sklearn. We'll start out with the basic concepts of learning a model: training, preventing overfitting, choosing the correct complexity, and cross-validation. In the process we will learn both regression (the prediction of continuous outcomes) and the use of the sklearn API. We'll then apply what we have learned to classification (the prediction of labels), including concepts such as feature selection, cross-validation, and regularization.
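
As a rough illustration of what "choosing the correct complexity" with cross-validation means, here is a minimal sketch in plain Python. It is not workshop material and does not use sklearn: the nearest-neighbor averaging model, the synthetic data, and every function name below are assumptions made up for this example.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def knn_predict(train, x, n_neighbors):
    """Average the y values of the n_neighbors training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:n_neighbors]
    return sum(y for _, y in nearest) / len(nearest)

def cv_error(data, n_neighbors, k=5):
    """Mean squared error, measured only on points held out from training."""
    total, count = 0.0, 0
    for held_out in k_fold_indices(len(data), k):
        held = set(held_out)
        train = [p for i, p in enumerate(data) if i not in held]
        for i in held_out:
            x, y = data[i]
            total += (knn_predict(train, x, n_neighbors) - y) ** 2
            count += 1
    return total / count

# Hypothetical noisy data: y = x^2 plus Gaussian noise.
rng = random.Random(1)
data = [(x / 10, (x / 10) ** 2 + rng.gauss(0, 0.5)) for x in range(60)]

# Fewer neighbors = more flexible model; pick the setting with the lowest
# held-out error rather than the lowest training error.
scores = {n: cv_error(data, n) for n in (1, 3, 5, 10, 20)}
best = min(scores, key=scores.get)
print(scores, "best n_neighbors:", best)
```

Scoring each setting only on held-out points is what guards against overfitting: a model that memorizes its training data gains nothing on points it never saw.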

This workshop is now full.

Pandas: Relational Database Concepts for Data Science

Pandas: Relational Database Concepts for Data Science
Presented by Rahul Dave, Institute for Applied Computational Science
Friday, January 19, 2018
8:30 AM - 11:30 AM


We'll learn the basics of relational querying of data using the pandas library in Python. You will learn how to use pandas to access relational databases as well as CSV files, and how to use SQLite as a backend for data science in Python. Along the way we'll explore how a few verbs underlie "relational" data analysis systems, from dplyr to pandas to SQL. These concepts scale from the smaller databases accessed by these tools to the larger databases and data access patterns used in large systems such as Spark.
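
To give a flavor of the "few verbs" idea, here is a minimal sketch using only Python's built-in sqlite3 module. It is not workshop material; the tables, column names, and rows are hypothetical. In pandas the same verbs correspond roughly to boolean indexing (filter), column selection (select), `merge` (join), and `groupby` (group and aggregate).

```python
import sqlite3

# In-memory database with two small, made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE workshops (id INTEGER PRIMARY KEY, title TEXT, day TEXT);
    CREATE TABLE signups (workshop_id INTEGER, attendee TEXT);
    INSERT INTO workshops VALUES
        (1, 'Tableau', 'Wed'), (2, 'Stan', 'Wed'), (3, 'Pandas', 'Fri');
    INSERT INTO signups VALUES (1, 'ana'), (1, 'bo'), (3, 'cy');
""")

# Verbs: filter (WHERE), select (column list), sort (ORDER BY).
wednesday = conn.execute(
    "SELECT title FROM workshops WHERE day = 'Wed' ORDER BY title"
).fetchall()

# Verbs: join (LEFT JOIN), group (GROUP BY), aggregate (COUNT).
# COUNT(s.attendee) skips NULLs, so a workshop with no signups counts as 0.
counts = conn.execute("""
    SELECT w.title, COUNT(s.attendee)
    FROM workshops AS w
    LEFT JOIN signups AS s ON s.workshop_id = w.id
    GROUP BY w.id, w.title
    ORDER BY w.id
""").fetchall()

print(wednesday)
print(counts)
conn.close()
```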

REGISTER HERE

Deep Learning Image Classification with Keras

Introduction to Deep Learning Image Classification using Keras
Presented by Pavlos Protopapas, Harvard Institute for Applied Computational Science
Friday, January 19, 2018
8:30 AM - 11:30 AM

This workshop will cover the following topics:

  • Theory about neural networks, backpropagation, optimization, etc.
  • Keras basics (functions, etc.)
  • Dropout and/or other kinds of normalization (theory + Keras)
  • Matplotlib visualization

Attendees will be emailed installation instructions prior to the workshop.

This workshop is now full.

Deep Learning for Healthcare Image Analysis

Deep Learning For Healthcare Image Analysis
Presented by Barton Fiske and Brad Palmer, NVIDIA
Friday, January 19, 2018
12:00 PM - 2:30 PM

In this workshop, you will learn how to apply Convolutional Neural Networks (CNNs) to Magnetic Resonance Imaging (MRI) scans to perform a variety of medical tasks and calculations. 

We will present two hands-on lab exercises:

  • Perform image segmentation on MRI images to determine the location of the left ventricle.
  • Calculate ejection fractions by measuring differences between diastole and systole using CNNs applied to MRI scans to detect heart disease.
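
For orientation, the ejection fraction is the share of the blood in the ventricle at end-diastole that is pumped out by end-systole. The sketch below is not workshop material: it shows only this final arithmetic step, and the two volumes are hypothetical numbers standing in for values that would be derived from the segmented MRI frames.

```python
def ejection_fraction(end_diastolic_volume, end_systolic_volume):
    """Return (EDV - ESV) / EDV as a fraction between 0 and 1."""
    if end_diastolic_volume <= 0:
        raise ValueError("end-diastolic volume must be positive")
    if not 0 <= end_systolic_volume <= end_diastolic_volume:
        raise ValueError("end-systolic volume must lie between 0 and EDV")
    return (end_diastolic_volume - end_systolic_volume) / end_diastolic_volume

# Hypothetical left-ventricle volumes in millilitres.
ef = ejection_fraction(120.0, 50.0)
print(f"Ejection fraction: {ef:.1%}")
```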

This workshop is from NVIDIA's Deep Learning Institute (DLI). The original course is 6 hours; we will present an abridged version of the course and will cover as much material as the allotted time allows. Learn more about NVIDIA’s DLI here.

Participants should bring a laptop to class with an SSH client already installed. A GPU in your laptop is not required. We will use the SSH client to access remote servers with pre-configured GPU resources. PuTTY is the recommended SSH client.

REGISTER HERE.

Digital Content Creation Using GANs and Autoencoders

Digital Content Creation Using GANs and Autoencoders
Presented by Barton Fiske and Brad Palmer, NVIDIA
Friday, January 19, 2018
3:00 PM - 6:00 PM

In this course, you will receive hands-on training on the latest techniques for designing, training, and deploying generative adversarial networks (GANs) for digital content creation (DCC). You will learn how to:

  • Train a Generative Adversarial Network (GAN) to generate images.
  • Train your own denoiser for rendered images.

Time allowing, we will also explore the architectural innovations and training techniques used to perform arbitrary video style transfer. On completion, you will be able to create digital assets using deep learning approaches.

This workshop is from NVIDIA's Deep Learning Institute (DLI). The original course is 6 hours; we will present an abridged version of the course and will cover as much material as the allotted time allows. Learn more about NVIDIA’s DLI here.

Participants should bring a laptop to class with an SSH client already installed. A GPU in your laptop is not required. We will use the SSH client to access remote servers with pre-configured GPU resources. PuTTY is the recommended SSH client.

REGISTER HERE

 

SEARCH WORKSHOPS BY TOPIC:

Bayesian Inference with Stan

Introduction to Bayesian Inference with Stan
Presented by Michael Betancourt, Stan Development Team
Wednesday, Jan 17, 2018
12:00 PM - 5:00 PM

Despite the promise of big data, inferences are often limited not by the size of data but rather by its systematic structure. Only by carefully modeling this structure can we take full advantage of the data -- big data must be complemented with big models and the algorithms that can fit them. Stan is a platform for facilitating this modeling, providing an expressive modeling language for specifying bespoke models and implementing state-of-the-art algorithms to draw subsequent Bayesian inferences.

In this workshop I will introduce how to implement a robust Bayesian workflow in Stan, from constructing models to analyzing inferences and validating the underlying modeling assumptions. The workshop will emphasize interactive exercises run through PyStan, the Python interface to Stan.

Pre-requisites: The workshop will assume familiarity with the basics of calculus, probability, and statistics but the core concepts will be reviewed.

In order to participate in the interactive exercises attendees must provide a laptop with PyStan 2.17 (https://pystan.readthedocs.io/en/latest/) installed. Note that installation on Windows machines is highly nontrivial so Windows users are encouraged to try to install as soon as possible and report any issues at http://discourse.mc-stan.org.

REGISTER HERE

Build a Bot with Domino Data Lab

Build a Bot with Domino Data Lab!
Presented by Mac Rogers & Team, Domino Data Lab
Wednesday, January 17, 2018
8:30 AM - 11:30 AM
 

Data science in industry is more than just testing ideas and writing reports - you have to create data products your business can leverage. However, data science can seem like magic at times - prompting questions from skeptical downstream consumers of your work. So, good data science products have to be explainable and reproducible. 

In complex, regulated industries like healthcare, finance, and insurance, auditability and collaboration are key. You have to make sure your models can retrain themselves to stay on the cutting edge; make them accessible to and decipherable by regulators; and fit them into the broader stack of work in your organization, because data science is a team sport.

In this workshop, participants will use Domino Data Lab, an enterprise R&D, collaboration, and deployment platform for data science teams. You’ll experience the full data science life cycle using cutting-edge industry technology: data prep and exploration; model development and tuning; and implementation and deployment of API endpoints and web apps. 

REGISTER HERE

Deep Learning Image Classification with Keras

Introduction to Deep Learning Image Classification using Keras
Presented by Pavlos Protopapas, Harvard Institute for Applied Computational Science
Friday, January 19, 2018
8:30 AM - 11:30 AM

This workshop will cover the following topics:

  • Theory about neural networks, backpropagation, optimization, etc.
  • Keras basics (functions, etc.)
  • Dropout and/or other kinds of normalization (theory + Keras)
  • Matplotlib visualization

Attendees will be emailed installation instructions prior to the workshop.

This workshop is now full.

Deep Learning for Healthcare Image Analysis

Deep Learning For Healthcare Image Analysis
Presented by Barton Fiske and Brad Palmer, NVIDIA
Friday, January 19, 2018
12:00 PM - 2:30 PM

In this workshop, you will learn how to apply Convolutional Neural Networks (CNNs) to Magnetic Resonance Imaging (MRI) scans to perform a variety of medical tasks and calculations. 

We will present two hands-on lab exercises:

  • Perform image segmentation on MRI images to determine the location of the left ventricle.
  • Calculate ejection fractions by measuring differences between diastole and systole using CNNs applied to MRI scans to detect heart disease.

This workshop is from NVIDIA's Deep Learning Institute (DLI). The original course is 6 hours; we will present an abridged version of the course and will cover as much material as the allotted time allows. Learn more about NVIDIA’s DLI here.

Participants should bring a laptop to class with an SSH client already installed. A GPU in your laptop is not required. We will use the SSH client to access remote servers with pre-configured GPU resources. PuTTY is the recommended SSH client.

REGISTER HERE.

Deep Learning with MATLAB

Practical Applications of Deep Learning – a Hands-on MATLAB Workshop
Presented by Jianghao Wang, MathWorks
Thursday, January 18, 2018
12:00 PM - 2:30 PM

Are you new to deep learning and want to learn how to apply these techniques in your work? Deep learning achieves human-like accuracy for many tasks considered algorithmically unsolvable with traditional machine learning. It is frequently used to develop applications such as face recognition, automated driving, and image classification.

In this hands-on workshop, you will write code and use MATLAB to:

  1. Learn the fundamentals of deep learning and understand terms like “layers”, “networks”, and “loss”
  2. Build a deep network that can classify your own handwritten digits
  3. Access and explore various pretrained models
  4. Use transfer learning to build a network that classifies different types of food
  5. Train deep learning networks on GPUs in the cloud
  6. Learn how to use GPU code generation technology to accelerate inference performance

Pre-requisites:

Attendees who are unfamiliar with MATLAB are encouraged to review the 2-hour MATLAB Onramp tutorial. Additionally, attendees can come prepared with questions if they complete the 2-hour Deep Learning Onramp, also hosted at MATLAB Academy.

Attendees should bring a laptop; they will use MATLAB Online to access MATLAB through their web browsers.

REGISTER HERE

Digital Content Creation Using GANs and Autoencoders

Digital Content Creation Using GANs and Autoencoders
Presented by Barton Fiske and Brad Palmer, NVIDIA
Friday, January 19, 2018
3:00 PM - 6:00 PM

In this course, you will receive hands-on training on the latest techniques for designing, training, and deploying generative adversarial networks (GANs) for digital content creation (DCC). You will learn how to:

  • Train a Generative Adversarial Network (GAN) to generate images.
  • Train your own denoiser for rendered images.

Time allowing, we will also explore the architectural innovations and training techniques used to perform arbitrary video style transfer. On completion, you will be able to create digital assets using deep learning approaches.

This workshop is from NVIDIA's Deep Learning Institute (DLI). The original course is 6 hours; we will present an abridged version of the course and will cover as much material as the allotted time allows. Learn more about NVIDIA’s DLI here.

Participants should bring a laptop to class with an SSH client already installed. A GPU in your laptop is not required. We will use the SSH client to access remote servers with pre-configured GPU resources. PuTTY is the recommended SSH client.

REGISTER HERE

 

Machine Learning with H2O

Intro to Automatic Machine Learning with H2O
H2O Workshop for Python Users
Presented by: Lauren DiPerna, H2O
Thursday, Jan 18, 2018
8:30 AM - 11:30 AM

H2O is an open-source machine-learning platform designed to build accurate models on distributed datasets at impressive speeds. This means you can use H2O to tackle terabytes of data in an ecosystem like Hadoop or Spark, but it doesn't mean you have to; you can also use H2O to build fast and accurate models on your laptop. The platform includes multiple APIs including Python, R, and Scala, and a web-based UI called Flow - so the learning curve is easy for new users. In this session we will focus on the Python API and show you how to use Flow in parallel, to view your data and monitor your models in the UI.

This workshop will kick off with an introduction to the H2O platform, then jump into a hands-on session, where you will learn how to build models using H2O's Python API. In the first half of the workshop, you will learn how to read in data, do basic data manipulations, build a supervised learning model, and use your model to make predictions.

In the second half of the workshop, you will learn about H2O's AutoML: an easy-to-use interface that automates the challenging process of training a large number of candidate models. AutoML includes the Stacked Ensemble algorithm, which will automatically train on the collection of individual models to produce a highly predictive ensemble model which, in most cases, will be the top performing model.

By the end of the workshop you will know how to use H2O to read in data, build models, and tune a selection of supervised models with AutoML.

Topics Covered:

  • What is H2O?
  • Training models on big data.
  • Building a supervised model with H2O's Python API.
  • Automatic machine learning with H2O's AutoML.

Pre-requisites & Pre-work:

All are welcome; however, the following assumptions are made:

  • You have completed this pre-workshop survey by January 15, 2018.
  • You have a laptop with H2O installed (see the Installation Requirements section below for details on how to download H2O).
  • You have a basic understanding of machine learning (i.e. you are familiar with supervised learning, understand the concept of train/validation/test splits or cross-validation, and are familiar with common methods used to evaluate a model's performance).
  • You are comfortable with Python to the extent that you can import data, manipulate it, and understand functions.

Installation Requirements:

1. Please see the H2O Minimum Requirements document for requirements and installation help.
2. Please install Jupyter Notebook on your laptop with either Python 2 or Python 3. Installation instructions can be found here.

Note: if you have issues installing H2O please come 15-30 minutes early to the workshop.

REGISTER HERE

Machine Learning with Microsoft Azure

Intro to Microsoft Azure Machine Learning Services
Presented by Heather Shapiro, Azure Machine Learning
Wednesday, January 17, 2018
3:00 PM - 6:00 PM

The Azure Machine Learning Workbench is an integrated, end-to-end data science and advanced analytics solution that helps professional data scientists prepare data, develop experiments, and deploy models at cloud scale. The Machine Learning Workbench includes cutting-edge AI frameworks from both Microsoft and the open source community.

In this session, we will walk through the basics of the Machine Learning services (preview) and use the Machine Learning Workbench for preparing, building, and deploying a Machine Learning model.

Pre-requisites:

  • If you don't have an Azure subscription, please create a free account before the workshop.
  • Download the latest Azure Machine Learning Workbench installer (Windows)(Mac)


NOTE: Currently, you can install the Azure Machine Learning Workbench desktop app on the following operating systems:

  • Windows 10
  • Windows Server 2016
  • macOS Sierra
  • macOS High Sierra

REGISTER HERE

 

 

Machine Learning with Python sklearn

Python Machine Learning with sklearn
Presented by Rahul Dave, Institute for Applied Computational Science
Thursday, January 18, 2018
3:00 PM - 6:00 PM
 

We'll learn the basic concepts of machine learning models through the Python library sklearn. We'll start out with the basic concepts of learning a model: training, preventing overfitting, choosing the correct complexity, and cross-validation. In the process we will learn both regression (the prediction of continuous outcomes) and the use of the sklearn API. We'll then apply what we have learned to classification (the prediction of labels), including concepts such as feature selection, cross-validation, and regularization.

This workshop is now full.

Pandas: Relational Database Concepts

Pandas: Relational Database Concepts for Data Science
Presented by Rahul Dave, Institute for Applied Computational Science
Friday, January 19, 2018
8:30 AM - 11:30 AM


We'll learn the basics of relational querying of data using the pandas library in Python. You will learn how to use pandas to access relational databases as well as CSV files, and how to use SQLite as a backend for data science in Python. Along the way we'll explore how a few verbs underlie "relational" data analysis systems, from dplyr to pandas to SQL. These concepts scale from the smaller databases accessed by these tools to the larger databases and data access patterns used in large systems such as Spark.

REGISTER HERE

Tableau

Introduction to Tableau
Presented by Lauren Kearney, Tableau
Wednesday, Jan 17, 2018
8:30 AM - 11:30 AM

Get ready to learn how to inform and inspire others when presenting data with Tableau’s business intelligence platform. Join us for this hands-on session to discover how to turn data into actionable insights and find out why Forbes recently ranked Tableau as the technical skill with the third biggest rise in demand. We’ll provide an overview of Tableau and walk you through the key features and functionality of our visual analytics platform. This hands-on workshop will excite you to explore your data in a different way!

Note: Participants should bring a laptop with Tableau downloaded and installed.  Participants will be emailed instructions to download Tableau prior to the workshop.

REGISTER HERE