## Courses Taught at the University of Pennsylvania

#### Spring 2018 BSTA651 Linear models & generalized linear models

Course instructors:

Justine Shults (Part I: Linear models) and Yong Chen (Part II: Generalized linear models)

Description:

This is a course on methods for generalized linear models (GLMs), rather than a course on using software for data analysis with GLMs. It is designed to give students a fundamental understanding of the theory and applications of GLMs. Emphasis will be placed on statistical modeling, building from standard normal linear models, extending to GLMs, and going beyond GLMs. The main subjects are logit models for nominal and ordinal data, log-linear models, models for repeated categorical data, generalized linear mixed models, and other mixture models for categorical data. Methods of maximum likelihood, weighted least squares, and generalized estimating equations will be used for estimation and inference.

Textbooks:

1. Agresti, A. (2002). Categorical Data Analysis (Second Edition). Wiley. ISBN-10: 0471360937.

2. McCullagh, P. and Nelder, J.A. (1989). Generalized Linear Models (Second Edition). Chapman and Hall. ISBN-10: 0412317605.

Learning objectives:

Regression analysis has been developed over many years and remains one of the most commonly used statistical tools for addressing scientific questions. Generalized linear models (GLMs) were formulated by John Nelder and Robert Wedderburn as a way of unifying various other statistical models, including ANCOVA, linear regression, logistic regression, and log-linear models for contingency tables and count data. This course will introduce GLMs and some recent developments in regression techniques, with a focus on generalized linear models, quasi-likelihood methods, and estimating function approaches.

List of topics:

• Generalized linear models and the maximum likelihood method

• Quasi-likelihood methods and estimating equations

• Model selection

• Analysis of binary data

• Analysis of polytomous responses

• Analysis of count data: log-linear models

• Analysis of contingency tables

• Generalized linear mixed-effects models

• Analysis of matched data

• Inference for correlated responses: marginal models and random-effects models

Expectation:

By the end of the course, the students are expected to: 1) understand the main components of GLMs; 2) build and apply appropriate models to binary, nominal, ordinal or count data; 3) build and apply appropriate models to correlated outcomes; 4) make inference for a given model and interpret the results in the scientific context.

#### Fall 2017 EPID582 Systematic reviews & meta-analysis

Course directors:

Craig Umscheid and Yong Chen

Objective:

This 1.0 unit graduate-level course will provide an introduction to the fundamentals of systematic reviews and meta-analyses.  It will cover introductory principles of meta-analysis; protocol development; search strategies; data abstraction methods; quality assessment; meta-analytic methods; and applications of meta-analysis.  The course is composed of a series of weekly small group lectures and discussions. Students will be expected to attend weekly didactics, participate in class discussions, review assigned readings, complete homework assignments, and draft a systematic review protocol of their choosing suitable for IRB submission.

Assignments:

Students will be required to complete readings in the textbook and articles referenced for each session. In addition, each student will complete homework assignments set by the instructors, including a data analysis project using a meta-analysis dataset provided by the instructors: download Stata meta-analysis modules from the Stata website, review the dataset variables, complete an analysis, and write up the findings. Finally, students will draft a systematic review protocol of their choosing and present it at the conclusion of the class. There are no examinations.

#### Fall 2016 BSTA622 Advanced statistical inference

Course instructors:

Yong Chen (Part I) and Jinbo Chen (Part II)

Outline of topics:

Parametric Inference:

Unbiased estimation and unbiased estimating functions

Maximum likelihood estimation: Consistency, asymptotic normality, and efficiency

Hypothesis testing: Wald test, Likelihood ratio test, Score test

Influence functions

EM algorithm

Model checking, model misspecification, and model selection

Examples of non-regular maximum likelihood estimation

Marginal likelihood, Conditional likelihood, (modified) profile likelihood, composite likelihood, and pseudolikelihood

U-statistics theory

Contiguity theory

Bayes and Empirical Bayes estimators, Bayesian tests

Semiparametric Inference:

Semiparametric maximum likelihood estimation (Case-control study; Cox proportional hazards regression)

Z-estimation/M-estimation

Generalized score test, with Pearson’s chi-squared test as an example

Semiparametric inference with incomplete data

#### Fall 2015 EPID621 Longitudinal data analysis

Course instructor:

Yong Chen

Description:

This course presents extensions of general and generalized linear models to longitudinal and correlated outcome data, with special emphasis on clinical, epidemiologic, and public health applications. Major topics include generalized linear mixed models (GLMMs) for continuous, binomial, and count data; maximum likelihood estimation; generalized estimating equations (GEE); current general and specialized software applicable to these methods; and readings from current statistical literature. Each student will be required to participate in 4 labs and complete associated problem sets. Software will include Stata.

Textbooks:

1. Diggle, P., Heagerty, P., Liang, K-Y. and Zeger, S. (2013). Analysis of Longitudinal Data (Second Edition). Oxford University Press. ISBN-10: 0198524846.

2. Fitzmaurice, G.M., Laird, N.M. and Ware, J.H. (2011). Applied Longitudinal Analysis (Second Edition). Wiley. ISBN: 978-0-470-38027-7.

3. Singer, J.D. and Willett, J.B. (2003). Applied Longitudinal Data Analysis. Oxford University Press.

Graphics texts:

Mitchell, M.N. (2012). A Visual Guide to Stata Graphics (Third Edition). Stata Press.

## Courses Taught at the University of Texas School of Public Health

#### Fall 2014 PH1916 Generalized linear models

Course instructor:

Yong Chen

Description:

This is a course on methods for generalized linear models (GLMs), rather than a course on using software for data analysis with GLMs. It is designed to give students a fundamental understanding of the theory and applications of GLMs. Emphasis will be placed on statistical modeling, building from standard normal linear models, extending to GLMs, and going beyond GLMs. The main subjects are logit models for nominal and ordinal data, log-linear models, models for repeated categorical data, generalized linear mixed models, and other mixture models for categorical data. Methods of maximum likelihood, weighted least squares, and generalized estimating equations will be used for estimation and inference.

Textbooks:

1. Agresti, A. (2002). Categorical Data Analysis (Second Edition). Wiley. ISBN-10: 0471360937.

2. McCullagh, P. and Nelder, J.A. (1989). Generalized Linear Models (Second Edition). Chapman and Hall. ISBN-10: 0412317605.

Learning objectives:

Regression analysis has been developed over many years and remains one of the most commonly used statistical tools for addressing scientific questions. Generalized linear models (GLMs) were formulated by John Nelder and Robert Wedderburn as a way of unifying various other statistical models, including ANCOVA, linear regression, logistic regression, and log-linear models for contingency tables and count data. This course will introduce GLMs and some recent developments in regression techniques, with a focus on generalized linear models, quasi-likelihood methods, and estimating function approaches.

List of topics:

• Generalized linear models and the maximum likelihood method

• Quasi-likelihood methods and estimating equations

• Model selection

• Analysis of binary data

• Analysis of polytomous responses

• Analysis of count data: log-linear models

• Analysis of contingency tables

• Generalized linear mixed-effects models

• Analysis of matched data

• Inference for correlated responses: marginal models and random-effects models

Expectation:

By the end of the course, the students are expected to: 1) understand the main components of GLMs; 2) build and apply appropriate models to binary, nominal, ordinal or count data; 3) build and apply appropriate models to correlated outcomes; 4) make inference for a given model and interpret the results in the scientific context.

#### Spring 2014 PH1918 Methods for correlated data

Course instructor:

Yong Chen

Description:

This course presents extensions of general and generalized linear models to longitudinal and correlated outcome data, with special emphasis on clinical, epidemiologic, and public health applications. Major topics include generalized linear mixed models (GLMMs) for continuous, binomial, and count data; maximum likelihood estimation; generalized estimating equations (GEE); current general and specialized software applicable to these methods; and readings from current statistical literature. Each student will be required to participate in 4 labs and complete associated problem sets. Software will include Stata.

Textbooks:

1. Diggle, P., Heagerty, P., Liang, K-Y. and Zeger, S. (2013). Analysis of Longitudinal Data (Second Edition). Oxford University Press. ISBN-10: 0198524846.

2. Fitzmaurice, G.M., Laird, N.M. and Ware, J.H. (2011). Applied Longitudinal Analysis (Second Edition). Wiley. ISBN: 978-0-470-38027-7.

3. Singer, J.D. and Willett, J.B. (2003). Applied Longitudinal Data Analysis. Oxford University Press.

Graphics texts:

Mitchell, M.N. (2012). A Visual Guide to Stata Graphics (Third Edition). Stata Press.

#### Fall 2013 PH1916 Generalized linear models

Course instructor:

Yong Chen

Description:

This is a course on methods for generalized linear models (GLMs), rather than a course on using software for data analysis with GLMs. It is designed to give students a fundamental understanding of the theory and applications of GLMs. Emphasis will be placed on statistical modeling, building from standard normal linear models, extending to GLMs, and going beyond GLMs. The main subjects are logit models for nominal and ordinal data, log-linear models, models for repeated categorical data, generalized linear mixed models, and other mixture models for categorical data. Methods of maximum likelihood, weighted least squares, and generalized estimating equations will be used for estimation and inference.

Textbooks:

1. Agresti, A. (2002). Categorical Data Analysis (Second Edition). Wiley. ISBN-10: 0471360937.

2. McCullagh, P. and Nelder, J.A. (1989). Generalized Linear Models (Second Edition). Chapman and Hall. ISBN-10: 0412317605.

Learning objectives:

Regression analysis has been developed over many years and remains one of the most commonly used statistical tools for addressing scientific questions. Generalized linear models (GLMs) were formulated by John Nelder and Robert Wedderburn as a way of unifying various other statistical models, including ANCOVA, linear regression, logistic regression, and log-linear models for contingency tables and count data. This course will introduce GLMs and some recent developments in regression techniques, with a focus on generalized linear models, quasi-likelihood methods, and estimating function approaches.

List of topics:

• Generalized linear models and the maximum likelihood method

• Quasi-likelihood methods and estimating equations

• Model selection

• Analysis of binary data

• Analysis of polytomous responses

• Analysis of count data: log-linear models

• Analysis of contingency tables

• Generalized linear mixed-effects models

• Analysis of matched data

• Inference for correlated responses: marginal models and random-effects models

Expectation:

By the end of the course, the students are expected to: 1) understand the main components of GLMs; 2) build and apply appropriate models to binary, nominal, ordinal or count data; 3) build and apply appropriate models to correlated outcomes; 4) make inference for a given model and interpret the results in the scientific context.

#### Fall 2012 PH1916 Generalized linear models

Course instructor:

Yong Chen

Description:

This is a course on methods for generalized linear models (GLMs), rather than a course on using software for data analysis with GLMs. It is designed to give students a fundamental understanding of the theory and applications of GLMs. Emphasis will be placed on statistical modeling, building from standard normal linear models, extending to GLMs, and going beyond GLMs. The main subjects are logit models for nominal and ordinal data, log-linear models, models for repeated categorical data, generalized linear mixed models, and other mixture models for categorical data. Methods of maximum likelihood, weighted least squares, and generalized estimating equations will be used for estimation and inference.

Textbooks:

1. Agresti, A. (2002). Categorical Data Analysis (Second Edition). Wiley. ISBN-10: 0471360937.

2. McCullagh, P. and Nelder, J.A. (1989). Generalized Linear Models (Second Edition). Chapman and Hall. ISBN-10: 0412317605.

Learning objectives:

Regression analysis has been developed over many years and remains one of the most commonly used statistical tools for addressing scientific questions. Generalized linear models (GLMs) were formulated by John Nelder and Robert Wedderburn as a way of unifying various other statistical models, including ANCOVA, linear regression, logistic regression, and log-linear models for contingency tables and count data. This course will introduce GLMs and some recent developments in regression techniques, with a focus on generalized linear models, quasi-likelihood methods, and estimating function approaches.

List of topics:

• Generalized linear models and the maximum likelihood method

• Quasi-likelihood methods and estimating equations

• Model selection

• Analysis of binary data

• Analysis of polytomous responses

• Analysis of count data: log-linear models

• Analysis of contingency tables

• Generalized linear mixed-effects models

• Analysis of matched data

• Inference for correlated responses: marginal models and random-effects models

Expectation:

By the end of the course, the students are expected to: 1) understand the main components of GLMs; 2) build and apply appropriate models to binary, nominal, ordinal or count data; 3) build and apply appropriate models to correlated outcomes; 4) make inference for a given model and interpret the results in the scientific context.

#### Spring 2012 PH1918 Methods for correlated data

Course instructor:

Yong Chen

Description:

This course presents extensions of general and generalized linear models to longitudinal and correlated outcome data, with special emphasis on clinical, epidemiologic, and public health applications. Major topics include generalized linear mixed models (GLMMs) for continuous, binomial, and count data; maximum likelihood estimation; generalized estimating equations (GEE); current general and specialized software applicable to these methods; and readings from current statistical literature. Each student will be required to participate in 4 labs and complete associated problem sets. Software will include Stata.

Textbooks:

1. Diggle, P., Heagerty, P., Liang, K-Y. and Zeger, S. (2013). Analysis of Longitudinal Data (Second Edition). Oxford University Press. ISBN-10: 0198524846.

2. Fitzmaurice, G.M., Laird, N.M. and Ware, J.H. (2011). Applied Longitudinal Analysis (Second Edition). Wiley. ISBN: 978-0-470-38027-7.

3. Singer, J.D. and Willett, J.B. (2003). Applied Longitudinal Data Analysis. Oxford University Press.

Graphics texts:

Mitchell, M.N. (2012). A Visual Guide to Stata Graphics (Third Edition). Stata Press.

#### Spring 2012 PH1916 Generalized linear models

Course instructor:

Yong Chen

Description:

This is a course on methods for generalized linear models (GLMs), rather than a course on using software for data analysis with GLMs. It is designed to give students a fundamental understanding of the theory and applications of GLMs. Emphasis will be placed on statistical modeling, building from standard normal linear models, extending to GLMs, and going beyond GLMs. The main subjects are logit models for nominal and ordinal data, log-linear models, models for repeated categorical data, generalized linear mixed models, and other mixture models for categorical data. Methods of maximum likelihood, weighted least squares, and generalized estimating equations will be used for estimation and inference.

Textbooks:

1. Agresti, A. (2002). Categorical Data Analysis (Second Edition). Wiley. ISBN-10: 0471360937.

2. McCullagh, P. and Nelder, J.A. (1989). Generalized Linear Models (Second Edition). Chapman and Hall. ISBN-10: 0412317605.

Learning objectives:

Regression analysis has been developed over many years and remains one of the most commonly used statistical tools for addressing scientific questions. Generalized linear models (GLMs) were formulated by John Nelder and Robert Wedderburn as a way of unifying various other statistical models, including ANCOVA, linear regression, logistic regression, and log-linear models for contingency tables and count data. This course will introduce GLMs and some recent developments in regression techniques, with a focus on generalized linear models, quasi-likelihood methods, and estimating function approaches.

List of topics:

• Generalized linear models and the maximum likelihood method

• Quasi-likelihood methods and estimating equations

• Model selection

• Analysis of binary data

• Analysis of polytomous responses

• Analysis of count data: log-linear models

• Analysis of contingency tables

• Generalized linear mixed-effects models

• Analysis of matched data

• Inference for correlated responses: marginal models and random-effects models

Expectation:

By the end of the course, the students are expected to: 1) understand the main components of GLMs; 2) build and apply appropriate models to binary, nominal, ordinal or count data; 3) build and apply appropriate models to correlated outcomes; 4) make inference for a given model and interpret the results in the scientific context.