Business 41912: Applied Multivariate Analysis
Spring Quarter of 2008
Instructor: Ruey S. Tsay
Office: HPC 455
Tel: 773-702-6750
Fax: 773-702-0458
e-mail: ruey.tsay@ChicagoGSB.edu
Office hours: (a) Wednesday 1:30 pm to 2:30 pm
(b) By appointment
You may e-mail me questions. E-mail is the easiest way
to make contact with me.
Teaching Assistant: Mr. David Matteson
Text:
Applied Multivariate Statistical Analysis by R.A. Johnson and D.W. Wichern,
6th ed., Prentice Hall, 2007. ISBN 0-13-187715-1
Grading:
Midterm 30% + Final Exam 45% + Homework 25%,
where the score of each component is normalized to be out of 100.
New focus of the course: Recent developments in dimension reduction, such as
sliced inverse regression, high-dimensional panel data, independent component analysis, etc.
Computing:
R is the main package, but students may use any other program.
Instructions for R will be given. The following R packages are useful for the
course: mnormt, fastICA, CCA, mvoutlier
Lecture:
Week 1: Review of Chapters 1 to 3 and the first half of Chapter 4:
lec1
Week 2: Random sample from a multivariate normal distribution & inference about the mean
lec2
Data set: Monthly simple returns of IBM, 3M, JNJ, GM, and INTC stocks
from January 1996 to December 2005: m-5c9605.txt
Data set: Monthly log returns of Boeing, Abbott Labs, Motorola, and General Motors stocks
from January 1998 to December 2007: m-ba4c9807.txt
Matlab program to obtain Chi-square QQ-plot: qqchi2.m
Matlab program to compute Hotelling T^2: hotelling.m
Matlab program for transformation: boxcox.m
Matlab program to compute various confidence intervals for means of components: cfinterval.m
Matlab programs to handle missing values in a Gaussian random sample:
(a) EM-algorithm: emmiss.m (b) MCMC method: mcmcmiss.m
Splus commands to obtain Chi-square QQ-plot: splusqqchi2.txt
R package: mvoutlier has some new tools for detecting multivariate outliers
R program to compute Chi-square QQ-plot: rqqchi2.txt
R program to compute statistics for outlier detection: r-outlier.txt
R program to compute Hotelling T^2: hotelling.txt
R program to compute various confidence intervals for means:
Use data: r-cregion.txt, Use summary statistics: r-cregiona.txt
R program for two multivariate control charts: r-t2chart.txt & r-t2future.txt
Minitab commands to obtain Chi-square QQ-plot: minitab-d1.txt
Week 3: Multivariate Analysis of Variance (MANOVA)
Lec3
R programs: r-contrast.txt, Behrens.txt, Box-M.txt
Week 4: Multivariate Analysis of Variance (MANOVA) & Linear Regression
Lec4
R-demo: r-manova
Data sets used: t6-14.dat, t6-5.dat, t6-6.dat
R programs: r-profile.txt, r-growth.txt
[Part of the lecture is in Lec3]
Week 5: Multivariate Linear Regression
Lec4a
R example for regression models with time series errors: mlreg-ts
Week 7: Multivariate multiple linear regression & Principal Component Analysis
Lec5 & Lec6
R program for MMLR: r-mmlr.txt
R commands for principal component analysis, independent component analysis, and
factor analysis: princomp, fastICA, and factanal, respectively.
R demonstration: r-pca, data: m-pca5c-9003.txt
Week 8: Independent Components and Factor models:
Lec7 & Data set used: T8-4.DAT
R demonstration: r-factor
Week 9: Canonical correlation analysis, discriminant analysis, and classification
Lec8 & Lec9
R program: r-discri.txt,
S-Plus demo: discr-splus.txt
Data sets used: T11-1.DAT, T11-2.DAT
Week 10: Hierarchical clustering and multidimensional scaling
Lec10 & Data sets used: T12-4.DAT, T12-5.DAT, T12-7m.DAT, T12-9.DAT
R program: r-discrim.txt (discrimination for more than 2 categories).
Reading materials for high-dimensional data analysis:
(a) Sliced inverse regression approach: sirphd
Homework assignment:
HW#1: Data sets used: Q1-m-5c8807.txt, Q4-t4-6.DAT
Solutions: hw1s
HW#2: Data sets used: T5-2.DAT, T1-8.DAT, T5-8.DAT, T6-11.DAT, T5-12.DAT
Solutions: hw2s
HW#3: Data sets used: T6-9.DAT, T6-10.DAT, T6-12.DAT, T11-7.DAT, T6-17.DAT
Solutions: hw3s
HW#4: Data sets used: T7-6.DAT, T8-4.DAT, T8-5.DAT
Solutions: hw4s
HW#5: Data sets used: T1-5.DAT, T1-6.DAT, T7-7.DAT, T8-5.DAT
Solutions: hw5s
Minitab example:
(a) Hotelling's T^2 test: minitab-hotel.txt
(b) Square-root of a positive definite matrix: minitab-sq.txt
(c) One-way analysis of variance: crash.txt with data crash.dat
(d) Growth curve fitting: growth.txt
Old exams
(a) Year 2004: pdf, solution: answer
(b) Year 2006: Midterm & solutions; final & solutions
Midterm: Week 6, Friday, May 9
Exam and solutions
Final Exam: Week 11, Friday, June 13, 3:00 pm to 6:00 pm.
Exam and solutions