Pierre Baldi
   

School of Information and Computer Sciences (ICS)
Institute for Genomics and Bioinformatics (IGB)
University of California at Irvine (UCI)

Bioinformatics: the Machine Learning Approach
-second edition-
-table of contents- 

Series Foreword

Preface

1.  Introduction 
Biological data in digital symbol sequences
Genomes--diversity, size, & structure
Proteins & proteomes  
On the information content of biological sequences
Prediction of molecular function & structure

2.   Machine-Learning Foundations: The Probabilistic Framework  
Introduction: Bayesian modeling 
The Cox-Jaynes axioms
Bayesian inference & induction
Model structures: graphical models & other tricks
Summary
  
3.   Probabilistic Modeling & Inference: Examples
The simplest sequence models
Statistical mechanics
 
4.   Machine Learning Algorithms  
Introduction  
Dynamic programming
Gradient descent
EM/GEM algorithms
Markov-chain Monte-Carlo methods
Simulated annealing
Evolutionary & genetic algorithms
Learning algorithms: miscellaneous aspects

5.   Neural Networks: The Theory
Introduction
Universal approximation properties  
Priors & likelihoods  
Learning algorithms: backpropagation
    
6.   Neural Networks: Applications     
Sequence encoding & output interpretation
Sequence correlations & neural networks
Prediction of protein secondary structure
Prediction of signal peptides & their cleavage sites
Applications for DNA & RNA nucleotide sequences
Prediction performance evaluation  
Different performance measures     
      
7.  Hidden Markov Models: The Theory
Introduction
Prior information & initialization  
Likelihood & basic algorithms  
Learning algorithms  
Applications of HMMs: general aspects
    
8.  Hidden Markov Models: Applications  
Protein applications  
DNA & RNA applications  
Advantages & limitations of HMMs  
           
9.  Probabilistic Graphical Models in Bioinformatics
The zoo of graphical models in bioinformatics
Markov models & DNA symmetries
Markov models & gene finders  
Hybrid models & neural network parameterization of graphical models
The single-model case  
Bidirectional recurrent neural networks for protein secondary structure prediction

10.  Probabilistic Models of Evolution: Phylogenetic Trees
Introduction to probabilistic models of evolution 
Substitution probabilities & evolutionary rates 
Rates of evolution
Data likelihood
Optimal trees & learning  
Parsimony 
Extensions

11.  Stochastic Grammars & Linguistics
Introduction to formal grammars
Formal grammars & the Chomsky hierarchy
Applications of grammars to biological sequences 
Prior information & initialization
Likelihood
Learning algorithms
Applications of SCFGs  
Experiments 
Future directions 

12.  Microarrays & Gene Expression
Introduction to microarray data 
Probabilistic modeling of array data 
Clustering 
Gene regulation
    
13.  Internet Resources & Public Databases
A rapidly changing set of resources
Databases over databases and tools
Databases over databases in molecular biology
Sequence & structure databases
Sequence similarity searches
Alignment
Selected prediction servers 
Molecular biology software links 
Ph.D. courses over the Internet 
Bioinformatics societies     
HMM/NN simulator     
    
A.   Statistics     
Decision theory & loss functions
Quadratic loss functions
The bias/variance trade-off     
Combining estimators
Error bars 
Sufficient statistics     
Exponential family     
Additional useful distributions     
Variational methods  

B.   Information Theory, Entropy, & Relative Entropy     
Entropy     
Relative entropy
Mutual information
Jensen's inequality
Maximum entropy
Minimum relative entropy

C.   Probabilistic Graphical Models 
Notation & preliminaries
The undirected case: Markov random fields
The directed case: Bayesian networks
  
D.   HMM Technicalities, Scaling, Periodic Architectures, State Functions, and Dirichlet Mixtures    
Scaling     
Periodic architectures
State functions: bendability 
Dirichlet mixtures     
           
E.   Gaussian Processes, Kernel Methods, and Support Vector Machines
Gaussian process models    
Kernel methods & support vector machines
Theorems for Gaussian processes & SVMs

F.   Symbols and Abbreviations
 

References 
           
Index