Instructors: Bernd Bickel, Jon Bollback, Christoph Lampert, Gasper Tkacik

Teaching Assistants: Anna Levina, Srdjan Sarikas

The course will consist of 4 segments of approximately 3-4 weeks each. Each segment will follow the same structure. The first week will consist mainly of lectures and background study. The second week will introduce a small project based on real data, or a computational assignment, and the results will be presented in the last week. The emphasis will be on hands-on work with data and computations.

Tentative summary of the course segments:

**Segment 1: Working with real data (Instructor: Gasper Tkacik)**

*Goal: Characterizing basic statistical properties of an unknown dataset*

- Assumptions about the data: IID samples, stationarity, statistical and systematic experimental errors
- Histograms and histogram statistics
- Quantifying correlations, principal component analysis (PCA), k-means clustering
- Error bar estimation via bootstrap / jackknife methods
- Spectral estimates by Fourier transform, linear filtering and convolution
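
As an illustration of the error bar topic in the list above, here is a minimal sketch of bootstrap and jackknife estimates of the standard error of a statistic. It is not taken from the course materials. The function names `bootstrap_standard_error` and `jackknife_standard_error`, the default of 1000 resamples, and the fixed seed are assumptions made for this example, and the data are assumed to be IID samples.

```python
import math
import random
import statistics


def bootstrap_standard_error(data, statistic=statistics.mean,
                             n_resamples=1000, seed=0):
    """Standard error of `statistic`, estimated by resampling with replacement.

    Assumes the entries of `data` are IID samples (illustrative helper).
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Draw n points with replacement and recompute the statistic.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    # The spread of the resampled statistics is the error bar.
    return statistics.stdev(estimates)


def jackknife_standard_error(data, statistic=statistics.mean):
    """Standard error of `statistic` from leave-one-out estimates."""
    n = len(data)
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    center = statistics.mean(leave_one_out)
    variance = (n - 1) / n * sum((v - center) ** 2 for v in leave_one_out)
    return math.sqrt(variance)


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.gauss(0.0, 1.0) for _ in range(50)]
    print("bootstrap:", bootstrap_standard_error(sample))
    print("jackknife:", jackknife_standard_error(sample))
```

For the sample mean, the jackknife estimate coincides with the textbook formula (sample standard deviation divided by the square root of the sample size), which makes it a convenient sanity check before applying either method to statistics that have no closed-form error.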

**Segment 2: Predictive models (Instructor: Christoph Lampert)**

*Goal: Understand and be able to handle predictive models*

- What are predictive models / examples
- Linear regression and classification (least squares, logistic regression)
- Nonlinear regression and classification (random forests, deep networks)
- Loss functions, parameter fitting by maximum likelihood
- Judging models by predictions on held-out data, overfitting, regularization
- Choosing between models by cross-validation or surrogates
- Learning and testing large-scale predictive models: tricks-of-the-trade
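
To make the items on held-out data, regularization, and cross-validation concrete, the following is a small sketch, not course code. It fits a one-parameter ridge regression (no intercept) and scores each regularization strength by k-fold cross-validation. The names `fit_ridge_1d`, `k_fold_mse`, and `choose_lambda`, and the interleaved fold assignment, are assumptions chosen for brevity.

```python
def fit_ridge_1d(xs, ys, lam):
    """Least-squares slope w for y ~ w * x with penalty lam * w**2."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def k_fold_mse(xs, ys, lam, k=5):
    """Mean squared prediction error on held-out folds, averaged over k folds.

    Assumes len(xs) >= k so that every fold is non-empty.
    """
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        # Every k-th point (starting at `fold`) is held out for testing.
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        w = fit_ridge_1d(train_x, train_y, lam)
        errors = [(ys[i] - w * xs[i]) ** 2 for i in test_idx]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


def choose_lambda(xs, ys, candidates, k=5):
    """Pick the candidate penalty with the lowest cross-validated error."""
    return min(candidates, key=lambda lam: k_fold_mse(xs, ys, lam, k))


if __name__ == "__main__":
    xs = [0.1 * i for i in range(1, 21)]
    ys = [2.0 * x for x in xs]  # noise-free toy data
    print(choose_lambda(xs, ys, [0.0, 0.1, 1.0, 10.0]))
```

On noise-free data like the toy example, any penalty only biases the slope, so the unregularized model should win. With noisy data and few samples, a nonzero penalty can give the lower held-out error.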

**Segment 3: Simulations and numerics (Instructor: Bernd Bickel)**

*Goal: Understand and apply basic computational techniques for simulations in a variety of applied science problems, with focus on differential equations*

- Discretization methods for ordinary and partial differential equations
- Comparing solvers, explicit and implicit methods
- Stability, sensitivity, and optimizations
- Control problems
- Example advanced applications: chaos, software, predictive capability
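
The contrast between explicit and implicit methods, and why stability matters, can be sketched on the linear test equation y' = -lam * y. This is an illustrative example rather than material from the lectures. The function names and the parameter values in the demo are assumptions.

```python
import math


def explicit_euler(lam, y0, h, n_steps):
    """Forward Euler for y' = -lam * y: y_next = y + h * f(y)."""
    y = y0
    for _ in range(n_steps):
        y = y + h * (-lam * y)
    return y


def implicit_euler(lam, y0, h, n_steps):
    """Backward Euler for y' = -lam * y.

    The update y_next = y + h * (-lam * y_next) is linear in y_next,
    so it can be solved in closed form: y_next = y / (1 + h * lam).
    """
    y = y0
    for _ in range(n_steps):
        y = y / (1.0 + h * lam)
    return y


if __name__ == "__main__":
    # Stiff case: h * lam = 5 lies outside the explicit stability region.
    print("explicit, large step:", explicit_euler(50.0, 1.0, 0.1, 20))
    print("implicit, large step:", implicit_euler(50.0, 1.0, 0.1, 20))
    # Small step: both methods approach the exact solution exp(-lam * t).
    print("explicit, small step:", explicit_euler(1.0, 1.0, 0.001, 1000))
    print("exact:", math.exp(-1.0))
```

Forward Euler multiplies the solution by (1 - h * lam) at each step, so it diverges once h * lam exceeds 2, although the true solution decays. Backward Euler multiplies by 1 / (1 + h * lam), which decays for any positive step size, at the cost of solving an equation at each step (trivial here, but generally a linear or nonlinear solve).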

**Segment 4: Bayesian models (Instructor: Jon Bollback)**

*Goal: Understand and apply Markov Chain Monte Carlo methods, particularly for inference of model parameters*

- Conditional probabilities, likelihood, priors
- Markov Chain Monte Carlo, sampling algorithms and tricks-of-the-trade for efficient sampling
- Bayesian model inference with application to DNA sequence data
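
As a toy illustration of the likelihood, prior, and MCMC items above, here is a minimal random-walk Metropolis sampler for the bias of a coin under a flat prior. It is a sketch under stated assumptions, not the course's tree and model code. The coin example, the names `log_posterior` and `metropolis`, the step size, and the burn-in length are all chosen for illustration only.

```python
import math
import random


def log_posterior(p, heads, tails):
    """Unnormalized log posterior of coin bias p with a flat prior on (0, 1)."""
    if not 0.0 < p < 1.0:
        return -math.inf  # zero prior probability outside the unit interval
    return heads * math.log(p) + tails * math.log(1.0 - p)


def metropolis(heads, tails, n_samples=20000, burn_in=1000, step=0.1, seed=0):
    """Random-walk Metropolis sampler with Gaussian proposals."""
    rng = random.Random(seed)
    p = 0.5
    logp = log_posterior(p, heads, tails)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = p + rng.gauss(0.0, step)
        logp_new = log_posterior(proposal, heads, tails)
        # Accept with probability min(1, posterior ratio);
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < logp_new - logp:
            p, logp = proposal, logp_new
        if i >= burn_in:
            samples.append(p)
    return samples


if __name__ == "__main__":
    draws = metropolis(heads=7, tails=3)
    print("posterior mean estimate:", sum(draws) / len(draws))
```

With a flat prior the exact posterior is a Beta distribution with mean (heads + 1) / (heads + tails + 2), so the sampler's output can be checked against a known answer before the same scheme is applied to models without a closed-form posterior.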

Prerequisites:

- undergraduate mathematics: linear algebra, calculus, probabilities
- basic procedural programming in a language of your choice, ability to understand C and Python code
- interest in working with real, in particular biological, data

Date | Topic | Location | Other |
---|---|---|---|
Feb 29 | Cycle 1: Histograms | Mondi 3 | |
Mar 2 | Cycle 1: Histogram statistics, error bars | Mondi 3 | |
Mar 7 | Cycle 1: Correlations | Mondi 3 | |
Mar 9 | Cycle 1: PCA, K-means | Mondi 3 | |
Mar 14 | Cycle 1: Power spectra | Mondi 3 | |
Mar 16 | Cycle 1: Power spectra | Mondi 3 | |
Apr 4 | Cycle 1: K-means, ICA | Mondi 3 | |
Apr 6 | Cycle 1: Presentations | Mondi 3 | |
... | | | |
May 17 | Cycle 3 | Mondi 3 | |
May 18 | Cycle 3 | Mondi 3 | |
May 23 | Cycle 3 | Mondi 3 | |
May 25 | Cycle 3 | Mondi 3 | |

File | Due Date | Example solutions |
---|---|---|
Segment 1 | | |
See lecture notes and trace1.zip below | HW for Week 1 due Mar 8 (11am) to TAs | |
See updated lecture notes and also trace2.zip below | HW for Week 2 due March 20 to TAs | |
See updated lecture notes and files below | HW for Week 3 due March 28 to TAs, Projects due April 6 in class | |
Segment 2 | | |
See S02E01.pdf file below | HW for Week 1 due Apr 19 to TAs | |
See S02E02.pdf file below | HW for Week 2 due Apr 25 to TAs | |
See S02E03.pdf file below | Description of final project (due Apr 27) and report (due May 5 to ChLa) | |

Cycle1 Lecture notes (updated) containing homework assignments (gray text) and the project.

Relevant Numerical Recipes chapters

Segment 2:

- exercise sheets: sheet1 sheet2
- project description: sheet
- Diabetes dataset: diabetes.txt
- lecture slides: lecture1 lecture2 lecture3 lecture4 addendum

Segment 3:

Lecture 1 (including exercise 1), Matlab scripts for lecture 1, additional notes (Witkin, Baraff, see HW)

Lecture 2 (including exercise 2), Matlab script for recording a movie

Lecture 3

Lecture 4

Project description, Code skeleton

Segment 4:

Lecture 1 (including homework 1)

Lecture 3 (including project description)

Python code for trees and models (this code requires the following C library for functionality)
