CSE 6339 SEC 002




Spring 2008


Instructor: Dr. Gautam Das

Office: 302 Nedderman Hall
Phone: 817 272 7595

Office Hours:  TBA


Teaching Assistant: Arjun Dasgupta

Office Hours: Tue 2-4
Email: arjundasgupta [AT] UTA [dot] edu

About the Course

Much of the world’s recorded data is locked up in structured sources such as databases, which are often the proprietary information of private corporations and government agencies. Searching and exploring for information within databases is currently very cumbersome: the data explorer often has to know comprehensive query languages (such as SQL), as well as how the data is structured into different tables and columns (the database schema). In recent years, researchers have examined the problem of improving the search and exploration capabilities of relational databases. This work includes adapting probabilistic and approximate querying methods to improve the scalability of query answering, as well as information retrieval techniques such as relevance ranking and keyword search. This class will explore recent efforts by researchers in these extremely important and challenging fields. We will read and discuss the latest research literature gleaned from premier conferences in databases and information retrieval. It is hoped that this class will spur students to pursue further research in these areas.

The following is a tentative list of topics which we will attempt to cover:

1. Probabilistic Methods in Databases

            Sampling Methods in Databases: Basics

            Approximate Query Processing

            Processing of Fuzzy/Uncertain Data

2. Unstructured Search in Databases

            Keyword Queries in Databases

            Ranking of Database Query Results 

3. DB and IR integration

            Top-K algorithms

We will cover various topics in breadth, understand the central contributions of these efforts, and try to predict future research directions.


Advanced Algorithms and Database II are the prerequisite courses. However, exceptions will be made on a case-by-case basis, especially if the student has prior exposure or demonstrates the initiative to learn these concepts quickly on his/her own.


The actual reading list, consisting of recent research papers, will be selected and finalized as the course progresses. Each student will present one or more papers (depending on the enrollment) during the semester. Students will participate in class discussions during and after each presentation. Attendance is required.


In addition to reading papers and presenting them in class, students will have the option of attempting a programming project during the semester. The projects will involve developing portions of information retrieval systems for structured databases based on the techniques suggested in the papers. The projects will also be tested using real data that the students should get access to. A long-term objective is that the more promising projects will serve as infrastructure/test-beds for students to continue their research in these areas beyond the course.


The grade will be based on the paper presentations, class attendance and participation, performance in the projects, and possibly 1-2 take-home examinations.



Given below is a tentative schedule for class presentations over the course of this semester.



  • Please check this section regularly during the semester for updates and announcements on the course.
  • Ethics statement is available here. Please print, sign and submit it to the instructor during class.
  • Presentation allotments have been put up!

