Lecture "Data Warehousing and Data Mining Techniques"

Information
Classification: Master (Informatik, Wirtschaftsinformatik)
Credits: 5
Exam: Oral
Regular Dates: Every Tuesday, starting 13th April 2021 (Online), 09:45-12:15 Hrs.
Contents

In this course, we examine the aspects of building, maintaining, and operating data warehouses and give an insight into the main knowledge discovery techniques. The course deals with basic issues like the storage of data, execution of analytical queries, various data mining procedures and a short introduction to deep learning.

Exams

The examination dates are now available on the front page. Please contact our secretary to schedule your oral exam slot.

This course will be held in English.

The general structure of the course is as follows:

  • Typical DW use case scenarios
  • Basic architecture of DW
  • Data modelling on conceptual, logical and physical levels
  • Multidimensional E/R modelling
  • Cubes, dimensions, measures
  • Query processing, OLAP queries (OLAP vs OLTP), roll-up, drill down, slice, dice, pivot
  • MOLAP, ROLAP, HOLAP
  • SQL99 OLAP operators, MDX
  • Snowflake, star and starflake schemas for relational storage
  • Multidimensional physical storage (linearization)
  • DW indexing as a means of search optimization: R-Trees, UB-Trees, Bitmap indexes
  • Other optimization procedures: data partitioning, star join optimization, materialized views
  • ETL
  • Association rule mining, sequence patterns, time series
  • Classification: Decision trees, naive Bayes classification, SVM
  • Cluster analysis: K-means, hierarchical clustering, agglomerative clustering, outlier analysis
  • Bootstrapping, Bagging, adaptive Boosting
  • Deep Learning Intro

Course Organization

To join this course, please register for the lecture in Stud.IP. We will publish announcements and further information on Stud.IP. The lecture videos, slides and exercise materials will be posted below on this website over the course of the semester.

The lectures will take place in our IfIS Webex room. We will publish a link to the room on Stud.IP before the first lecture.

Course Materials

The materials (slides [Lxx], exercises [Exx], solutions [Sxx] and videos [Vxx_EN]) will be provided in English. The lecture recap and live exercises are also held in English. However, if you are not fluent in English, you can ask for a short German explanation during the lecture or exercise.

Additionally, we provide videos of the lecture from the last semester in which it was held in German [Vxx_DE]. Note that those videos are old (summer term 2009) and large parts of the lecture have changed since then. The ground truth for the exam is the current lecture; the German videos are meant as an additional service. Furthermore, not all lectures are covered by the 2009 videos, so some lectures will only have an English video recording.

Materials
No.  Date      Topic                                       Slides  Exercises     Videos
0    13.04.21  Course Organization                         [L00]   -             -
1    20.04.21  Introduction                                [L01]   [E01], [S01]  [V01_EN] / [V01_DE]
2    27.04.21  Architecture                                [L02]   [E02], [S02]  [V02_EN] / [V02_DE]
3    04.05.21  Modeling                                    [L03]   [E03], [S03]  [V03_EN] / [V03_DE]
4    11.05.21  Indexing                                    [L04]   [E04], [S04]  [V04_EN] / [V04_DE]
5    18.05.21  Optimization                                [L05]   [E05], [S05]  [V05_EN] / [V05_DE]
6    25.05.21  OLAP Operations & Queries                   [L06]   [E06], [S06]  [V06_EN] / [V06_DE_1], [V06_DE_2]
7    01.06.21  Build the DW, ETL                           [L07]   [E07], [S07]  [V07_EN] / [V07_DE]
8    08.06.21  Real-Time DW                                [L08]   [E08], [S08]  [V08_EN] / -
9    15.06.21  DM Overview & Association Rule Mining       [L09]   [E09], [S09]  [V09_EN] / [V09_DE]
10   22.06.21  Sequence Patterns & Time Series Durability  [L10]   [E10], [S10]  [V10_EN] / [V10_DE]
11   29.06.21  Classification                              [L11]   [E11], [S11]  [V11_EN] / [V11_DE]
12   06.07.21  Clustering                                  [L12]   [E12], [S12]  [V12_EN] / [V12_DE]
13   13.07.21  Meta-Algorithms for Classification          [L13]   [E13], [S13]  [V13_EN] / -
14   20.07.21  Deep Learning                               [L14]   [E14], [S14]  [V14_EN] / -
-    -         Complete Slide Set                          [LNC]   -             -