CSE5334 Data Mining

Class number: 83431
Fall 2008, Tuesday/Thursday, 11:00am - 12:20pm, NH 110

First day of class: Sep. 2nd, 2008 (We will reschedule the classes of Aug. 26th and 28th.)




Schedule

The schedule below is tentative and may change as necessary.

Course Project: P0: Team Information; P1: Proposal; P2: Progress Report; P3: Project Presentation and Demo; P4: Final Report.

Date   #   Lecture   Assignment Out   Assignment Due   Lecture Notes
08/26   Rescheduled to 09/12 1-3:30pm      
08/28   Rescheduled to 09/12 1-3:30pm      
09/02 1 Course Overview     [PDF]
09/04 2 Introduction (Chapter 1) HW1   [PDF]
Data Warehousing, OLAP, Data Cube (Chapter 3, 4)
09/09 3 [CD97] An Overview of Data Warehousing and OLAP Technology. [EE]     [PDF]
09/11 4 [GCB+97] Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals. [EE]   P0 [PDF]
09/12 5 Index in Data Warehousing     [PDF]
09/12 6 Course Project Topics     [PDF]
Data Preprocessing (Chapter 2)
09/16 7 Data and Data Quality   HW1 [PDF]
Classification and Prediction (Chapter 6)
09/18 8 Decision Tree MP1   [PDF]
09/23 9 Evaluating Classification Models     [PDF]
09/25 10 Bayesian Classifiers   P1 [PDF]
09/30 11 Rule-based Classifiers HW2   [PDF]
10/02 12 Nearest Neighbor Classifiers, Artificial Neural Network, Support Vector Machine   MP1 (due 10/04) [PDF]
Frequent Pattern and Association Rule Mining (Chapter 5)
10/07 13 [AS94] Fast Algorithms for Mining Association Rules in Large Databases. [EE]     [PDF]
10/09 14 Correlation Analysis   HW2 [PDF]
10/14   Midterm Exam (in class)
Clustering (Chapter 7)
10/16 15 Overview of Clustering, Similarity/Dissimilarity Measure HW3   [PDF]
10/21 16 K-means     [PDF]
10/23 17 Hierarchical     [PDF]
10/28 18 Density-Based, Outlier Analysis     [PDF]
Text and Web Mining
10/30 19 Boolean Query Model MP2 HW3 [PDF]
11/04 20 Vector Space Model   [PDF]
11/06 21 Document Classification and Clustering P2 [PDF]
11/11 22 Information Extraction: Snowball: Extracting Relations from Large Plain-Text Collections. [EE]     [PDF]
11/13 23 MapReduce: Simplified Data Processing on Large Clusters. [EE] HW4 MP2 [PDF]
11/18 24 Link Analysis: Reading material on PageRank (emailed to every student and posted on WebCT)     [PDF]
11/20 25 Link Analysis (continued)      
11/24   (NOT required) Department colloquium, 12-1pm, NH202, James Caverlee (TAMU): Towards Robust Trust Establishment in Online Communities with SocialTrust
11/25 26 Final Review   HW4 (due 11/26) [PDF]
11/27   No Class. Thanksgiving Holiday.
12/02 27 Project Presentation and Demo (Anirban; Arnab; Ning, Nabila, Juan)   P3 (due 12/01)
12/04 28 Project Presentation and Demo (Harish, Kunal; Joel; Saravanan)   P4 (due 12/07)
12/05   Project Presentation and Demo (Alex; Xiaonan, Hussain, Kulsawasd; Usha, Agasthya; Gaurang, Sushant) (11:20-1pm, NH315). Please make an effort to attend.
12/09   Final Exam (Tuesday, Dec. 9th, 11-1:30pm, NH110)

University calendar: Fall 2008