CSE 5301 Syllabus for Data Analysis and Modeling Techniques



Instructor:    Jean Gao
Email:         gao@uta.edu
Office:        ERB 538, phone: 817-272-3628
Office Hours:  Tue and Thu, 12:30 - 1:30 pm, or by appointment

TA Info:       Please check Blackboard


Course Description and Objectives:

The objective of this course is to provide students with basic data analysis and modeling concepts and methodologies using probability theory. Basic statistics and probability concepts will be covered, along with fundamental data analysis and hypothesis testing techniques. Further data modeling methodologies such as Hidden Markov Models and Bayesian networks will be introduced.


Students who successfully complete this course will have gained a solid understanding of probabilistic data modeling, interpretation, and analysis, and will thus have formed an important basis for solving practical statistics and data analysis problems arising in computer science, engineering, and daily life.




All students are expected to have a background in basic probability, calculus, and algebra before taking this course.


Textbook: Probability and Statistics for Computer Scientists, by Michael Baron, Chapman and Hall/CRC; 2nd edition (August 5, 2013),  ISBN-10: 1439875901.

References (on 2-hour reserve in the library):

1. Art of Computer Systems Performance Analysis: Techniques For Experimental Design Measurements Simulation and Modeling, Raj Jain (Wiley; 2nd edition, 2015),  ISBN: 978-1118858424.

2. A Concise Course in Advanced Level Statistics with worked examples (Oxford University Press; 4th Revised edition, 2014),  ISBN: 1408522292.

3. Additional course materials will be available electronically through the course website. 


Homework (10%):  There will be about 5 to 6 homework assignments. Some of them may include small computer projects.  You are free to use the programming language you are most comfortable with.

Hard copies of handwritten or typed homework are collected in or before class on the due date. Late submissions will not be accepted.

Exams (75%):  There will be three midterms, each covering approximately one-third of the course material.

Quizzes (15%):  There will be three 15-minute quizzes.
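
As an illustration only, the stated weights can be combined as in the following Python sketch. It assumes that every item within a category counts equally and is scored on a 0 to 100 scale; the syllabus does not state this, nor any letter-grade cutoffs, so the function names and the sample scores are hypothetical.

```python
# Category weights as stated in this syllabus (they sum to 100%).
WEIGHTS = {"homework": 0.10, "exams": 0.75, "quizzes": 0.15}


def category_average(scores):
    """Average of percentage scores (0-100).

    Assumption: every item in a category counts equally.
    """
    return sum(scores) / len(scores)


def overall_score(scores_by_category):
    """Weighted overall percentage from per-category lists of scores."""
    return sum(
        WEIGHTS[name] * category_average(scores_by_category[name])
        for name in WEIGHTS
    )


# Hypothetical scores, for illustration only.
example = {
    "homework": [90, 80, 100, 70, 85],
    "exams": [78, 84, 90],
    "quizzes": [60, 80, 100],
}
print(round(overall_score(example), 2))
```

Because exams carry 75% of the weight, a change in the exam average moves the overall score five times as much as the same change in the quiz average.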

Tentative Major Topics to Be Covered (subject to change based on course progress):


1. Basic probability:

Discrete and continuous random variables, independence, covariance, central limit theorem, Chebyshev inequality, diverse continuous and discrete distributions.


2. Statistics, Parameter Estimation, and Fitting a Distribution:

Descriptive statistics, graphical statistics, method of moments, maximum likelihood estimation


3. Random Numbers and Simulation:

Sampling of continuous distributions, Monte Carlo methods

4. Hypothesis Testing:

Type I and II errors, rejection regions; Z-test, T-test, F-test, Chi-Square test, Bayesian test

5. Stochastic Processes and Data Modeling:

Markov process, Hidden Markov Models, Poisson Process, Bayesian Network, Regression, Queuing systems

Course Policy:


Class attendance is required.


Cooperative efforts at understanding the material and the assignments are encouraged.  However, the work you submit must be work that you have completed individually. There will be no make-up exams or quizzes for this course unless the instructor is notified IN ADVANCE of extenuating circumstances.

There will be extra credit assignments.


Disabilities Act

The University of Texas at Arlington is on record as being committed to both the spirit and letter of federal equal opportunity legislation; reference Public Law 93-112 -- The Rehabilitation Act of 1973 as amended. With the passage of new federal legislation entitled Americans with Disabilities Act (ADA), pursuant to section 504 of The Rehabilitation Act, there is renewed focus on providing this population with the same opportunities enjoyed by all citizens. As a faculty member, I am required by law to provide "reasonable accommodation" to students with disabilities, so as not to discriminate on the basis of that disability. Student responsibility primarily rests with informing faculty at the beginning of the semester and in providing authorized documentation through designated administrative channels.


Academic Dishonesty

It is the philosophy of The University of Texas at Arlington that academic dishonesty is a completely unacceptable mode of conduct and will not be tolerated in any form. All persons involved in academic dishonesty will be disciplined in accordance with University regulations and procedures. Discipline may include suspension or expulsion from the University. "Scholastic dishonesty includes but is not limited to cheating, plagiarism, collusion, the submission for credit of any work or materials that are attributable in whole or in part to another person, taking an examination for another person, any act designed to give unfair advantage to a student or the attempt to commit such acts." (Regents' Rules and Regulations, Part One, Chapter VI, Section 3, Subsection 3.2, Subdivision 3.22)


Student Support Services Available

The University of Texas at Arlington supports a variety of student success programs to help you connect with the University and achieve academic success. These programs include learning assistance, developmental education, advising and mentoring, admission and transition, and federally funded programs. Students requiring assistance academically, personally, or socially should contact the Office of Student Success Programs at 817-272-6107 for more information and appropriate referrals.