251-0566-00L Summer semester 2006




Graphical Models and Causality

This class is a weekly reading group discussing research papers on causal inference from observational or experimental data. In a purely observational setting, quantities of interest (variables) can be recorded, but not acted upon. In an experimental setting, some controllable variables can be acted upon. The selected papers aim at understanding machine learning techniques to infer causality, including causal graphs derived from "graphical models".
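The difference between the two settings can be made concrete on a toy example. The sketch below is not taken from any of the selected papers: it assumes a hypothetical three-variable model (a confounder Z influencing both X and Y, and X influencing Y) with made-up probabilities, and compares recording X = 1 with setting X = 1 by intervention.

```python
from itertools import product

# Hypothetical toy model; every number here is an illustrative assumption.
# Graph: Z -> X, Z -> Y, X -> Y, all variables binary.
P_Z = {0: 0.5, 1: 0.5}                       # P(Z = z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}              # P(X = 1 | Z = z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (0, 1): 0.5,   # P(Y = 1 | X = x, Z = z),
                 (1, 0): 0.3, (1, 1): 0.7}   # keyed by (x, z)


def bernoulli(p_one, value):
    """Probability of a binary value given the probability of 1."""
    return p_one if value == 1 else 1.0 - p_one


def joint(z, x, y):
    """Joint probability P(Z = z, X = x, Y = y) from the factorization."""
    return (P_Z[z]
            * bernoulli(P_X1_GIVEN_Z[z], x)
            * bernoulli(P_Y1_GIVEN_XZ[(x, z)], y))


def p_y1_given_x(x):
    """Observational setting: X = x was recorded, not acted upon."""
    numerator = sum(joint(z, x, 1) for z in (0, 1))
    denominator = sum(joint(z, x, y) for z, y in product((0, 1), repeat=2))
    return numerator / denominator


def p_y1_do_x(x):
    """Experimental setting: X is set to x, which removes the Z -> X arrow."""
    return sum(P_Z[z] * P_Y1_GIVEN_XZ[(x, z)] for z in (0, 1))


if __name__ == "__main__":
    print("P(Y=1 | X=1)     =", round(p_y1_given_x(1), 3))
    print("P(Y=1 | do(X=1)) =", round(p_y1_do_x(1), 3))
```

In this toy model, conditioning on X = 1 gives about 0.62, while intervening on X gives 0.5. The gap comes entirely from the confounder Z, which is the kind of effect the readings address.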

Tuesday 17:00-18:00, CAB H 57 (plan)

Joachim Buhmann
Pattern Analysis and Machine Learning Group
Office: CAB G 69.2 Universitätstrasse 6
Phone: (044) 63 23124

Email: jbuhmann@inf.ethz.ch

Each week, students will have to read a paper or a book chapter, which will be discussed in class. Students are encouraged to familiarize themselves with Bayesian networks using the GeNIe software. The software runs natively on Windows, and may also be run through Wine on Unix.
The following steps were tested on Ubuntu Linux 5.10:
* Install Wine (http://www.winehq.com/). Most distributions have their own way to install this package, e.g. "apt-get install wine" on Debian-based distributions.
* Download the GeNIe setup file and run "wine genie2_setup.exe". This should install all the needed files to ~/.wine/drive_c/Program Files/GeNIe 2.0
* Change to this directory and run "wine genie.exe".
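
For readers who want to see what a Bayesian network computes before opening GeNIe, here is a minimal sketch of inference by enumeration in plain Python. It assumes a five-node network patterned on the family-out example in Charniak's paper (family out, bowel problem, light on, dog out, hear bark). The probability values are illustrative and should be checked against the paper. This is not how GeNIe works internally.

```python
from itertools import product

# Assumed structure: fo -> lo, fo -> do, bp -> do, do -> hb.
# All probabilities below are illustrative assumptions.
P_FO = 0.15                                   # P(family out)
P_BP = 0.01                                   # P(bowel problem)
P_LO = {1: 0.60, 0: 0.05}                     # P(light on | fo)
P_DO = {(1, 1): 0.99, (1, 0): 0.90,           # P(dog out | fo, bp),
        (0, 1): 0.97, (0, 0): 0.30}           # keyed by (fo, bp)
P_HB = {1: 0.70, 0: 0.01}                     # P(hear bark | do)

NAMES = ("fo", "bp", "lo", "do", "hb")


def bern(p_one, value):
    """Probability of a binary value given the probability of 1."""
    return p_one if value == 1 else 1.0 - p_one


def joint(fo, bp, lo, do, hb):
    """Joint probability as the product of the local conditional tables."""
    return (bern(P_FO, fo) * bern(P_BP, bp) * bern(P_LO[fo], lo)
            * bern(P_DO[(fo, bp)], do) * bern(P_HB[do], hb))


def query(target, evidence):
    """P(target = 1 | evidence), summing the joint over all assignments."""
    numerator = denominator = 0.0
    for values in product((0, 1), repeat=len(NAMES)):
        assignment = dict(zip(NAMES, values))
        if any(assignment[k] != v for k, v in evidence.items()):
            continue  # inconsistent with the evidence
        p = joint(**assignment)
        denominator += p
        if assignment[target] == 1:
            numerator += p
    return numerator / denominator


if __name__ == "__main__":
    print("P(fo)          =", round(query("fo", {}), 4))
    print("P(fo | hb = 1) =", round(query("fo", {"hb": 1}), 4))
```

Enumeration is exponential in the number of variables, so it is only practical for small networks like this one. The belief propagation and variational papers in the schedule cover methods that scale further.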

The class is worth 4 credit points.

Requirements:

Each student will have to take the lead of the discussion on one paper.
Students should sign up for a date. This will be done on a first-come, first-served basis. Please email the instructor at guyoni@inf.ethz.ch. This assignment will require a little more preparation that particular week, including writing a summary of the paper and a log of the discussion. This is NOT a presentation: slides may be used as partial support, but are not required.

Grading:

There will be no exam. Attendance at a minimum of 2/3 of the reading groups is required. There will be a sign-up sheet every time to record attendance and completion of the homework assignment (usually reading a paper or book chapter). You earn one point for attendance and one point for homework completion. You must earn at least 18 points to get your credit points for the class.
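
The arithmetic of this rule can be sketched as follows. The sketch assumes 14 sessions (the number of dates listed in the schedule) and rounds the 2/3 attendance threshold up to a whole session. Both are assumptions, and the function name is hypothetical.

```python
import math

SESSIONS = 14                                  # assumed: dates in the schedule
MIN_ATTENDANCE = math.ceil(2 * SESSIONS / 3)   # 2/3 of the sessions, rounded up
MIN_POINTS = 18                                # threshold stated above


def earns_credit(sessions_attended, homeworks_done):
    """One point per session attended plus one per homework completed."""
    points = sessions_attended + homeworks_done
    return sessions_attended >= MIN_ATTENDANCE and points >= MIN_POINTS


if __name__ == "__main__":
    print("Minimum sessions to attend:", MIN_ATTENDANCE)
    print("10 sessions, 8 homeworks:", earns_credit(10, 8))
    print("14 sessions, 3 homeworks:", earns_credit(14, 3))
```

Under these assumptions, a student who attends the minimum number of sessions still needs enough homework points to reach 18.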

Schedule

The readings are organized around 4 themes built on chapters from Judea Pearl's book "Causality". They are complemented by algorithm/theory papers, book chapters by other authors presenting complementary viewpoints, and application papers. This tentative list of papers is subject to change; please check every week the assignment for the following week. Copies of the paper to read the next week will be provided. Send requests to guyoni@inf.ethz.ch.

All sessions take place on Tuesdays. Each entry gives the session number and date, the paper, the discussion leader, and the available materials (summary, discussion, and/or slides). Theme titles precede the sessions they cover.

Theme: Introduction

1. 4 April
Paper: Graphical models and causality reading group presentation (tutorial)
Discussion leader: A. Elisseeff
Materials: Slides [ppt]. Slides [pdf].

Theme: I. Basic concepts

2. 11 April
Paper: Bayesian networks without tears, by Eugene Charniak
Discussion leader: I. Guyon
Assignment: Install GeNIe. Use the software to build a network implementing the example of Fig. 2 of the paper.
Materials: Slides. Summary.

3. 18 April
Paper: Probabilistic Reasoning. Chapter 14 of the book "Artificial Intelligence: A modern approach" by Stuart Russell and Peter Norvig.
Discussion leader: Jiwen Li
Materials: Summary. Slides a. Slides b.

4. 25 April
Paper: Introduction to probabilities, graphs, and causal models. Chapter 1 of the book "Causality" by Judea Pearl. The whole chapter is available in pdf.
Discussion leader: Severin Hacker
Materials: Summary. Slides.

Theme: II. Basic methods

5. 2 May
Paper: Belief propagation. An introduction to factor graphs, by Hans-Andrea Loeliger.
Discussion leader: Hans-Andrea Loeliger
Materials: Slides.

6. 9 May
Paper: Structure learning. A tutorial on learning with Bayesian networks, by David Heckerman.
Discussion leader: Markus Kalisch
Materials: Summary. Slides [ppt]. Slides [pdf].

7. 16 May
Paper: Variational methods. An Introduction to Variational Methods for Graphical Models, by Michael Jordan, Zoubin Ghahramani, Tommi Jaakkola, and Lawrence Saul.
Discussion leader: Patrick Pletscher
Materials: Summary. Slides. Slide handouts.

8. 23 May
Paper: A theory of inferred causation. Chapter 2 of the book "Causality" by Judea Pearl (see also availability from Pearl's site).
Discussion leader: Daniel Küttel
Materials: Slides (Pearl). Slides (Daniel). Summary.

Theme: III. Identification of causal dependencies

9. 30 May
Paper: Statistical causality analysis of INFOSEC alert data, by Xinzhou Qin and Wenke Lee.
Discussion leader: Annie Chen
Materials: Slides.

10. 6 June
Paper: Causal diagrams and the identification of causal effects. Chapter 3 of the book "Causality" by Judea Pearl (see also availability from Pearl's site).
Discussion leader: Saikumar Chalasani
Materials: Slides (Pearl). Slides (Saikumar).

11. 13 June
Paper: Application. Using Bayesian networks to analyze expression data, by Nir Friedman, Michal Linial, Iftach Nachman, and Dana Pe'er.
Discussion leader: Andreas Kägi
Materials: Slides. Summary.

Theme: IV. Control, action, planning

12. 20 June
Paper: The art and science of cause and effect. Epilogue of the book "Causality" by Judea Pearl.
Discussion leader: Ulf Holm Nielsen
Materials: Slides.

13. 27 June
Paper: N-1 experiments suffice to determine the causal relations among N variables, by Frederick Eberhardt, Clark Glymour, and Richard Scheines.
Discussion leader: Simon Meier
Materials: Slides. Handouts.

14. 4 July
Paper: Application. Marginal structural models and causal inference in epidemiology, by James Robins, Miguel Angel Hernan, and Babette Brumback.
Discussion leader: Thomas Fuchs
Materials: Slides.

Links

We provide links to other recommended readings and software:

Other courses and reading groups:
- Rina Dechter @ UCI: probabilistic reasoning class
- Rina Dechter @ UCI: genetic linkage with graphical models
- Causality seminar @ Columbia
- Causality in the social sciences @ Columbia
- Causality is undefinable, by L. Zadeh
- Causality seminar @ York

Papers:

- Using graphical models and genomic expression data to statistically validate models of genetic regulatory networks by Alexander J. Hartemink et al.
- Using Path Diagrams as a Structural Equation Modelling Tool by Peter Spirtes

Books:

1) Bayesian networks and decision graphs - Finn V. Jensen
2) Bayesian artificial intelligence - Kevin B. Korb and Ann E. Nicholson
3) An introduction to graphical models - Kevin P. Murphy (http://www.cs.ubc.ca/~murphyk/Papers/intro_gm.pdf)
4) Bayesian networks and beyond - Daphne Koller and Nir Friedman (unpublished book)

Software:

1) Hugin (http://www.hugin.com) is an excellent tool to construct and test Bayesian networks.
2) WinBUGS (http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml) is a statistical tool for Bayesian inference by MCMC simulation using Gibbs sampling.

Ebooks from the ETH library:

*Applied Bayesian modelling. Peter Congdon - Wiley (2003) (http://www3.interscience.wiley.com/cgi-bin/booktoc/104531773?CRETRY=1&SRETRY=0 )

*Bayesian approach to image interpretation. Sunil K. Kopparapu, Uday B. Desai - Kluwer Academic Publishers (2001) (http://ebooks.springerlink.com/summary.asp?id=70076 )

*Bayesian approaches to clinical trials and health-care evaluation. David J. Spiegelhalter, Keith R. Abrams, Jonathan P. Myles - Wiley (2004) (http://www3.interscience.wiley.com/cgi-bin/booktoc/107614005 )

*Bayesian artificial intelligence. Kevin B. Korb, Ann E. Nicholson - Chapman & Hall/CRC (2004) (http://www.statsnetbase.com/books/1219/c3871_fm.pdf )

*Bayesian economics through numerical methods: a guide to econometrics and decision-making with prior information. Jeffrey H. Dorfman - Springer (1997) (http://ebooks.springerlink.com/summary.asp?id=99654 )

*Bayesian forecasting and dynamic models. Mike West, Jeff Harrison - Springer (cop. 1997) (http://ebooks.springerlink.com/summary.asp?id=104512 )

*Bayesian nonparametrics. J.K. Ghosh, R.V. Ramamoorthi - Springer (2003) (http://ebooks.springerlink.com/summary.asp?id=98937 )

*Introduction to Bayesian statistics. William M. Bolstad - Wiley (2004) (http://www3.interscience.wiley.com/cgi-bin/bookhome/109855377 )

*Likelihood, Bayesian, and MCMC methods in quantitative genetics. Daniel Sorensen, Daniel Gianola - Springer (2002) (http://ebooks.springerlink.com/summary.asp?id=98912 )

*Measurement error and misclassification in statistics and epidemiology: impacts and bayesian adjustments. Paul Gustafson - CRC Press Company (2004) (http://www.statsnetbase.com/books/1196/c3359_fm.pdf )

*Multivariate Bayesian statistics: models for source separation and signal unmixing. Daniel B. Rowe - Chapman & Hall/CRC (2003) (http://www.statsnetbase.com/books/980/c3189_fm.pdf )