Fall 2009   CSE4392 / 5334   Data Mining


Resources: Google, Google Scholar, CiteSeer, DBLP Bibliography, ACM Digital Library, IEEE Xplore, Other Computer Science articles


Course Information:

Instructor: Chengkai Li

TA: Xiaonan (Michael) Li

  • Office hours: Tue. 2pm-3pm
  • Office: GeoScience 237
  • Phone: (817) 272-0896
  • E-mail: xiaonan.li [AT] mavs [DOT] uta [DOT] edu
  • Homepage:

Course Description: This is an introductory course on data mining. Data mining refers to the automatic discovery of patterns and knowledge from large data repositories, including databases, data warehouses, the Web, document collections, and data streams. We will study the basic topics of data mining, including data preprocessing, data warehousing and OLAP, data cubes, frequent pattern and association rule mining, correlation analysis, classification and prediction, and clustering, as well as advanced topics covering the techniques and applications of data mining for Web and text data.

Prerequisites: CSE 3330/5330 Database Systems I, or CSE 4331/5331 Database Systems II, or similar courses, or consent of instructor

Textbook

Grades


Announcements: Stay tuned and make sure to check WebCT frequently. Important announcements will be posted there.

Assignments and Deadlines

Regrading: Regrading requests must be made within 7 days after we post scores on WebCT. The TA will handle regrade requests. If you are not satisfied with the regrading results, you have 7 days to request a second regrade. The instructor will handle the second regrade, and the decision is final.

WebCT: Log in to the WebCT page at http://www.uta.edu/webct with your NetID and password. We use WebCT for: (1) Announcements; (2) Assignment Submission; (3) Discussion Group; (4) Releasing materials, assignments, scores, and grades. Follow these steps exactly during electronic assignment submission.


Ethics Policies and Academic Integrity: The College cannot and will not tolerate any form of academic dishonesty by its students. This includes, but is not limited to, cheating on examinations, plagiarism, or collusion (explained in the document below). Students are required to read the following document carefully, sign it, return the signed copy to the instructor, and keep a copy for their own records. Hard copies of this document will be provided to students in the first class and can also be picked up in the instructor's office. If you print it yourself, please print it double-sided.

Statement on Ethics, Professionalism, and Conduct for Engineering Students

Miscellaneous: If you require accommodation based on disability, I would like to meet with you in the privacy of my office during the first week of the semester to ensure that you are appropriately accommodated. Please read the Office for Students with Disabilities page.


Schedule:

                                                                     Assignment
Date   #   Lecture                                                   Out  Due          Lecture Notes
08/24  1   Course Overview                                                             [PDF]
08/26  2   Introduction (Chapter 1)                                                    [PDF]
08/31  3   Course Project Topics                                                       [PDF]
Data Warehousing, OLAP, Data Cube (Chapter 3, 4)
09/02  4   Data Warehousing and OLAP                                 HW1               [PDF]
09/07      No Class. Labor Day Holiday.
09/09  5   Data Cube
Classification and Prediction (Chapter 6)
09/14  6   Decision Tree                                             P1                [PDF]
09/16  7   Decision Tree (cont'd)                                         HW1
09/21  8   Evaluating Classification Models                          HW2               [PDF]
09/23  9   Evaluating Classification Models (cont'd)
09/28  10  Bayesian Classifiers                                                        [PDF]
09/30  11  Rule-based                                                                  [PDF]
Data Preprocessing (Chapter 2)
10/05  12  Nearest Neighbor Classifiers                                   HW2 (10/07)  [PDF]
10/07  13  Data, Data Quality, Data Preprocessing                                      [PDF]
10/12      Midterm Exam (Monday, Oct. 12th, 5:30pm-6:50pm, NH229)
Frequent Pattern and Association Rule Mining (Chapter 5)
10/14  14  Association Rule Mining                                   P2   P1           [PDF]
10/19  15  Correlation Analysis                                                        [PDF]
Clustering (Chapter 7)
10/21  16  Overview of Clustering, Similarity/Dissimilarity Measure  HW3               [PDF]
10/26  17  K-means                                                                     [PPT]
10/28  18  Hierarchical                                                                [PPT]
11/02  19  Hierarchical (cont'd)
Text and Web Mining
11/04  20  Vector Space Model                                             P2           [PDF]
11/09  21  Document Classification                                   P3                [PDF]
11/11  22  Document Clustering                                       HW4  HW3          [PDF]
11/16  23  Information Extraction                                                      [PDF]
11/18  24  MapReduce                                                                   [PPT]
11/23  25  Link Analysis: PageRank                                                     [PDF]
11/25  26  Link Analysis (cont'd)                                         HW4
11/30  27  Social Network Analysis
12/02  28  Final Review                                                   P3
12/07      Final Exam (Monday, Dec. 7th, 5:30pm-8pm, NH22)

University calendar: Fall 2009