Lecture “Data Warehousing and Data Mining Techniques”

Information
Classification: Master Informatik / Wirtschaftsinformatik
Credits: 4
Exam: Oral (at least 50% of the total exercise points are required to take the final exam)
Regular Dates: Thursdays, 10:30 - 13:00, Room 161
First lecture: Thursday, 02.04.2009
Contents
Summary: 

In this course, we examine the aspects of building, maintaining, and operating data warehouses, and give an insight into the main knowledge discovery techniques. The course deals with basic issues such as data storage, the execution of analytical queries, and data mining procedures.

The general structure of the course is:

  • Typical DW use case scenarios
  • Basic architecture of a DW
  • Data modelling on a conceptual, logical and physical level
  • Multidimensional E/R modelling
  • Cubes, dimensions, measures
  • Query processing, OLAP queries (OLAP vs. OLTP), roll-up, drill-down, slice, dice, pivot
  • MOLAP, ROLAP, HOLAP
  • SQL99 OLAP operators, MDX
  • Snowflake, star and starflake schemas for relational storage
  • Multidimensional physical storage (linearization)
  • DW indexing as a means of search optimization: R-trees, UB-trees, bitmap indexes
  • Other optimization procedures: data partitioning, star join optimization, materialized views
  • ETL
  • Association rule mining, sequence patterns, time series
  • Classification: decision trees, naive Bayes classification, SVM
  • Cluster analysis: k-means, hierarchical clustering, agglomerative clustering, outlier analysis
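
To give a first impression of the OLAP operations named in the list above, here is a minimal sketch of slice and roll-up on a tiny cube. It is an illustration only and is not taken from the lecture material: the fact table, its dimension names (year, city, product), the measure "amount", and the helper functions slice_cube and roll_up are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact table (invented for illustration): each row is one cell
# of a tiny sales cube with dimensions (year, city, product) and measure "amount".
FACTS = [
    {"year": 2008, "city": "Berlin", "product": "TV", "amount": 10},
    {"year": 2008, "city": "Berlin", "product": "Radio", "amount": 5},
    {"year": 2008, "city": "Hamburg", "product": "TV", "amount": 7},
    {"year": 2009, "city": "Berlin", "product": "TV", "amount": 12},
    {"year": 2009, "city": "Hamburg", "product": "Radio", "amount": 3},
]


def slice_cube(facts, dimension, value):
    """Slice: fix one dimension to a single value and keep the matching cells."""
    return [row for row in facts if row[dimension] == value]


def roll_up(facts, keep_dimensions, measure="amount"):
    """Roll-up: sum the measure over every dimension not in keep_dimensions."""
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[d] for d in keep_dimensions)
        totals[key] += row[measure]
    return dict(totals)


if __name__ == "__main__":
    # Roll up from (year, city, product) to year only.
    print(roll_up(FACTS, ["year"]))
    # Slice on city = Berlin, then roll up to product.
    print(roll_up(slice_cube(FACTS, "city", "Berlin"), ["product"]))
```

Drill-down is the inverse direction: calling roll_up with more dimensions in keep_dimensions returns the finer-grained totals again. A real data warehouse would express the same operations in SQL or MDX rather than in application code.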

Materials

Note

Please note that you need at least 50% of all exercise points to be admitted to the final exam. Exercises must be handed in by the Thursday of the following week, before the lecture, and may be completed in teams of two students.
Please hand in your solutions on paper at the mailbox on the IFIS floor (Mühlenpfordtstraße 23, 2nd floor). Do not forget to put your name and “Matrikelnummer” (student ID number) on your solutions.

Download

Date      Topic                      Slides                 Exercises                Video
02.04.09  Introduction               Slides - Print Slides  Exercise 1               Video1
09.04.09  Architecture               Slides - Print Slides  Exercise 2               Video2
16.04.09  DW Modeling                Slides - Print Slides  Exercise 3               Video3
23.04.09  Queries                    Slides - Print Slides  Exercise 4               Video4
30.04.09  Queries                    Slides - Print Slides  Exercise 5               Video5
07.05.09  Optimization               Slides - Print Slides  Exercise 6 - Solutions   Video6
14.05.09  Optimization               Slides - Print Slides  Exercise 7 - Solutions   Video7
28.05.09  Building the DW            Slides - Print Slides  Exercise 8 - Solutions   Video8
11.06.09  BI                         Slides - Print Slides  Exercise 9 - Solutions   Video9
18.06.09  DM Patterns & Time Series  Slides - Print Slides  Exercise 10 - Solutions  Video10
25.06.09  DM Classification          Slides - Print Slides  Exercise 11 - Solutions  Video11
02.07.09  DM Cluster Analysis        Slides - Print Slides  None                     Video12
09.07.09  Decision Support Systems   Slides - Print Slides  None                     Video13

Exercise results:

Matr. Nr. Percentage (including Sheet 11)

2953995 61.6
2765837 28.2
2787909 79
2830144 79
2843047 71.8
2864628 79.4
2865355 70.2
2870609 83.2
2906913 88.8
2921150 66.6
2927358 82.2
2931755 93.8
2969170 10.4
2969413 83.2
3018597 81
3020695 83
3026026 73.8
3033268 66.6
3033323 72.4
3033365 83.6