Introduction to Data Science and Machine Learning

24-26 September, University of Wyoming

Summary

Intensive 3-Day Workshop with Lectures, Demos, and Exercises

This workshop is intended for students, researchers, and practitioners with basic experience in data science and machine learning who want to take their skills to the next level. It provides the theoretical knowledge and hands-on skills needed to apply machine learning and data science techniques to real problems: analyzing data, building predictive models, and optimizing their performance.

  • Data Analysis
  • Supervised and Unsupervised Machine Learning
  • Evaluation of Machine Learning Models
  • Parameter Tuning
  • Ensembles, Boosting, Feature Selection
  • mlr R Package (presented by the authors of mlr)
  • Basic Knowledge of Statistics, Programming, and R required

Presenters


Bernd Bischl

Julia Moosbauer

Martin Binder

Stefan Coors

Prof. Bernd Bischl leads the computational statistics group at LMU Munich and directs the Munich Center for Machine Learning. He is one of the principal authors of the mlr Machine Learning package, which the other presenters also contribute to. All presenters have extensive experience developing machine learning and data science approaches and applying them to real-world problems.

The mlr package is one of the most comprehensive machine learning packages in R and has a rapidly growing user base. It is installed more than 15,000 times per month, and the source code repository has more than 1,200 stars on GitHub. Version 3 is a complete reimplementation that builds on the many lessons learned from previous versions to make machine learning easier, more flexible, and more efficient.

Schedule (tentative)

Tuesday, September 24

9.30–11.00    lecture: revision of prerequisites
11.00–11.15   break
11.15–12.30   demo/exercises: basic concepts in mlr
12.30–13.30   lunch break
13.30–15.00   lecture: mlr on the Titanic dataset
15.00–15.15   break
15.15–16.45   exercises

Wednesday, September 25

9.30–11.00    lecture: trees and tree ensembles
11.00–11.15   break
11.15–12.30   demo/exercises
12.30–13.30   lunch break
13.30–15.00   lecture: principal component analysis and clustering
15.00–15.15   break
15.15–16.45   exercises

Thursday, September 26

9.30–11.00    lecture: boosting and stacking
11.00–11.15   break
11.15–12.30   demo/exercises
12.30–13.30   lunch break
13.30–15.00   lecture: tuning and feature selection
15.00–15.15   break
15.15–16.45   exercises

In addition to the main workshop program, the presenters will be available on Friday, September 27, for individual consultations on specific data science and machine learning problems, by appointment only. To make an appointment, please contact the organizer, Lars Kotthoff, at larsko@uwyo.edu and describe the problem you would like to discuss.

Prerequisites

You must have R and mlr installed before the workshop; we will not provide any help with installation during the workshop. You should have basic familiarity with programming and R. You can find a curated list of resources on installing and getting started with R at https://www.rstudio.com/online-learning/.

You should be familiar with basic concepts in data science and machine learning. We assume that you know the material covered in the first two days of the online introduction to machine learning course at https://compstat-lmu.github.io/lecture_i2ml/articles/content.html. The course provides lecture videos, slides, and exercises with solutions. Please work through this material before September 24; otherwise you will not be able to follow the workshop. Focus on understanding the theoretical concepts; the exercises serve to illustrate them.

Registration

The workshop is open to everyone, regardless of affiliation with the University of Wyoming. Registration is free but mandatory. Please register your interest at https://forms.gle/n88ehcP3wGc1MoZa9 by September 1. Registrations will be accepted on a first-come, first-served basis until all spaces are filled. If there are more registrations than spaces, preference will be given to people who have a background in data science or machine learning. After the registration deadline, you will receive an e-mail either confirming your registration or notifying you that the workshop is full; until then you are not registered.

For any questions, please contact the workshop organizer Lars Kotthoff at larsko@uwyo.edu. You can download a flyer for the event here.