University of Texas, Arlington

                                    Computer Science and Engineering

                                     CSE5370   Bioinformatics

                                                    

                                               Instructor:           Jean Gao
                                                     Email:           gao@cse.uta.edu
                                                    Office:           338 Nedderman Hall,  phone: 817-272-3628
                                         Office Hours:          Monday and Wednesday, 12:00 - 1:00pm or by appointment
                               Course Information:        Monday and Wednesday, 4:00 - 5:20pm.
                                             Classroom:          NH229

                                                          TA:          Young Bun Kim
                                                     Email:          ybkim@uta.edu
                                                     Office:          NH 234, 817-272-5796
                                          Office Hours:         Monday and Wednesday, 2:00 - 3:00pm

   


Course Description:

Biological sciences are undergoing a revolution in how they are practiced. In the last decade, a vast amount of data (DNA sequences,
protein sequences, etc.) has become available, and computational methods are playing a fundamental role in transforming this data into scientific understanding.

    Bioinformatics involves developing and applying computational methods for managing and analyzing information about the sequence, structure, and function of biological molecules and systems.  Topics will include the evolutionary organization of genes (genomics), the structure and function of gene products (proteomics), and the dynamics of gene expression in biological processes (transcriptomics).

Objectives:   

    To provide students with an understanding of the fundamental computational problems in molecular biology and genomics, and of a core set of widely used algorithms in computational biology.  The course is intended to give students a working knowledge of a variety of publicly available data and computational tools important in bioinformatics, and a grasp of the underlying principles
of contemporary bioinformatics.

       Prerequisites:

          1. A background in biology is not required, but students should be interested in catching up quickly on relevant topics.
          2. Half of the homework assignments are programming projects.  Students are free to choose any language they are comfortable with.
              For those who prefer Matlab, the Matlab Bioinformatics Toolbox is installed in the public labs at Ransom Hall and Nedderman Hall.
         

       Textbook:

            N. Jones & P. Pevzner, "An Introduction to Bioinformatics Algorithms," 2004, ISBN 0262101068.
          

       References:     

         
          --   D.W. Mount, "Bioinformatics: Sequence and Genome Analysis," Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 2001,
                 xii, 564 pp. ISBN 0879696087.
                 (The UTA library has an electronic version of this book.)

          --   R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, "Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids,"
                 Cambridge University Press, 1998.

          --   A. Malcolm Campbell and Laurie Heyer, "Discovering Genomics, Proteomics and Bioinformatics," Benjamin Cummings, 2003.
                 ISBN 0-8053-4722-4.

          --   L. Stryer, "Biochemistry," 5th ed., W. H. Freeman and Co.


      Grading:

          Homework & Projects                                                                                                           30%
                 -- There will be about 4 homework assignments.
                 -- 2 are programming projects and 2 are written exercises (though they
                    can be done by computer).
 
          Exam                                                                                                                                        30%
                 -- There will be one midterm.

          Class Attendance                                                                                                                   15%
                          
          Class Presentation                                                                                                                 25%
                 -- Each student will give a class presentation on a topic selected from a given list.
                 -- Grading will be based on clarity of presentation, preparedness, understanding of the problem, and
                      quality of the slides.
    
          Homework Policy:
                -- All assignments are due at class time on the due date.  Hard copies with source code should be turned in during class.
                    Source code should be emailed to the TA before then.
                --  Emails and phone calls regarding an assignment will not be answered within 24 hours of its due time.
                --  Late submissions will be penalized 10% of the assignment score for each 24 hours past the due time.

      Academic Misconduct:

          All homework assignments must be done individually.  Cheating and plagiarism will result in a default "F" grade for this course.
          Code for programming assignments must NOT be developed in groups, nor should it be shared.  Discussions with peers or the TA about
          approaches and techniques are encouraged, but not at the level of implementation details.
       

     Tentative Topics:

        1. Introduction
            -- Introduction to bioinformatics
            -- Whirlwind tour of Chem/MolBio/BioChem
            -- Primer on probability theory

        2. Genomics
            -- Tools for sequence alignment and database searches
            -- Pairwise sequence alignment
            -- Sequence database search
            -- Multiple sequence alignment
            -- Hidden Markov Models (HMM)
            -- Gene Finding

        3. Proteomics
            -- Protein structure and its prediction
            -- Structure alignment
            -- Phylogenetic inference
            -- Molecular modeling (mechanics & dynamics)
            -- Protein threading

        4. Functional Genomics and Proteomics
            -- Construction and use of microarrays
            -- Statistical analysis of microarray data: clustering methods
            -- Microarray analysis: dimensionality reduction
            -- Mass spectrometry for proteomics: protein selection and identification


      Acknowledgement

        I would like to give special thanks to Prof. Russ Altman at Stanford University and Prof. Mark Craven at the University of
        Wisconsin for providing their bioinformatics lecture notes and allowing me to use them.  Only with their help was I able to create the
        adapted lecture slides.  Thanks also go to Prof. Chris Bailey-Kellogg at Purdue University for opening the door to bioinformatics for
        me.