Home | Research | Publications | Professional | Teaching | Personal
CSE 6339 SEC 002

DATA EXPLORATION AND ANALYSIS IN RELATIONAL DATABASES

 

Spring 2008

 

Instructor: Dr. Gautam Das

Office: 302 Nedderman Hall
Phone: 817 272 7595
Email: gdas@cse.uta.edu

Office Hours:  TBA

 

Teaching Assistant: Arjun Dasgupta
Office Hours: Tue 2-4
Email: arjundasgupta [AT] UTA [dot] edu


About the Course

Much of the world’s recorded data is locked up in structured sources such as databases, which are often the proprietary information of private corporations and government agencies. Searching for and exploring information within databases is currently very cumbersome: the data explorer often has to know comprehensive query languages (such as SQL), as well as important information on how the data is structured into different tables and columns (the database schema). In recent years, researchers have worked on improving the search and exploration capabilities of relational databases. This includes adapting probabilistic and approximate querying methods to improve the scalability of query answering, as well as information retrieval techniques such as relevance ranking and keyword search. This class will explore recent efforts by researchers in these important and challenging fields. We will read and discuss the latest research literature drawn from premier conferences in databases and information retrieval. It is hoped that this class will spur students to pursue further research in these areas.

The following is a tentative list of topics which we will attempt to cover:

1. Probabilistic Methods in Databases

            Sampling Methods in Databases: Basics

            Approximate Query Processing

            Processing of Fuzzy/Uncertain Data

2. Unstructured Search in Databases

            Keyword Queries in Databases

            Ranking of Database Query Results 

3. DB and IR integration

            Top-K algorithms

We will cover a broad range of topics, understand the central contributions of these efforts, and try to predict future research directions.


Prerequisites


Advanced Algorithms and Database II are the prerequisite courses. However, exceptions will be made on a case-by-case basis, especially if the student has prior exposure to these concepts or demonstrates the initiative to learn them quickly on their own.


Presentations

The actual reading list, consisting of recent research papers, will be selected and finalized as the course progresses. Each student will present one or more papers (depending on the enrollment) during the semester. Students will participate in class discussions during and after each presentation. Attendance is required.


Project

In addition to reading papers and presenting them in class, students will have the option of attempting a programming project during the semester. The projects will involve developing portions of information retrieval systems for structured databases based on the techniques suggested in the papers. The projects will also be tested on real data, which students are expected to obtain access to. A long-term objective is that the more promising projects will serve as infrastructure and test-beds for students to continue their research in these areas beyond the course.


Evaluation

The grade will be based on the paper presentations, class attendance and participation, performance in the projects, and possibly 1-2 take-home examinations.

 

Schedule

Given below is a tentative schedule for class presentations over the course of this semester.

 

Announcements

  • Please check this section regularly during the semester for updates and announcements on the course.
  • The ethics statement is available here. Please print, sign, and submit it to the instructor during class.
  • Presentation allotments have been put up!



 

