Data Analysis – Introduction to Machine Learning and Data Management in the Cloud

Recent years have seen massive growth in the volume of available data. Across a wide variety of applications and business domains, more and more data are collected explicitly for subsequent analysis. However, because machine learning-based data analysis is evolving very rapidly, finding the best-suited approach and algorithm(s) for a particular data analysis problem has become increasingly complex. At the same time, “the Cloud” has become a ubiquitously available platform for running complex applications. The first generation of the Cloud addressed the sharing of IT resources (CPU cycles, storage capacity, etc.) with almost unlimited scalability in a pay-as-you-go mode (i.e., in contrast to on-premises data centers, Cloud users pay only for the resources they actually consume). More recently, a new generation of the Cloud has emerged that offers a rich set of APIs for advanced functionality on top of these basic IT resources. Notably, this includes APIs for analyzing data of various types, with support for complete machine learning / data analytics pipelines.

The objective of this course is twofold. First, it provides an in-depth introduction to Cloud computing and data management in the Cloud. Second, it introduces basic methods and algorithms for data analysis and machine learning. The course combines a conceptual analysis of machine learning approaches with a practical introduction to the most commonly used machine learning software, applied to a concrete data analysis project.

Day 1:

  • Introduction to Data Analysis and Machine Learning
  • Introduction to the Cloud
  • Practical exercise (getting started with the Cloud)
  • Practical exercise (machine learning)

Day 2:

  • Data Management in the Cloud
  • Cloud services
  • Practical exercise (Cloud machine learning APIs)
  • Practical exercise (data analysis project)

Participants and requirements: Participants should have basic programming skills, in particular basic familiarity with Python.

Organizers: Dr. Marcel Lüthi, Dr. Luca Rossetto, Prof. Heiko Schuldt, Prof. Thomas Vetter

Date and duration: Thursday, January 16, and Friday, January 17, 2020 (9.00 am – 5.00 pm each day).

Location: Biozentrum, Universität Basel, Klingelbergstrasse 70, 4056 Basel, Room 106