Before propensity scores can be estimated, it is necessary to handle any missing data in the covariates. I recommend multiple imputation because it has been shown to outperform other methods of handling missing data such as listwise deletion, pairwise deletion, and single imputation (hot deck imputation, regression imputation).

There are two main approaches to multiple imputation: joint modeling and multiple imputation by chained equations (MICE). I use MICE because it does not require the specification of a joint distribution of covariates.

In the video below, I review R code for multiple imputation as well as single imputation of covariates prior to propensity score estimation using MICE.
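As a rough illustration of the workflow described above (this is not the code from the video), the sketch below imputes missing covariate values with the *mice* package. The data set and the variable names `age`, `income`, and `female` are hypothetical and simulated only for this example, and the settings `m = 5` and `maxit = 10` are illustrative choices rather than recommendations.

```r
# Minimal sketch; assumes the 'mice' package is installed.
library(mice)

# Hypothetical toy covariates with values set to missing at random positions.
set.seed(2021)
n <- 200
covariates <- data.frame(
  age    = rnorm(n, mean = 40, sd = 10),
  income = rnorm(n, mean = 50, sd = 15),
  female = factor(rbinom(n, 1, 0.5))
)
covariates$age[sample(n, 20)]    <- NA
covariates$income[sample(n, 30)] <- NA
covariates$female[sample(n, 10)] <- NA

# Multiple imputation by chained equations: create m = 5 imputed data sets.
# mice() chooses a default imputation method for each variable type.
imputed <- mice(covariates, m = 5, maxit = 10, seed = 2021, printFlag = FALSE)

# Extract all imputed data sets as a list of data frames.
imputed_list <- complete(imputed, action = "all")

# Single imputation: the same procedure with m = 1, keeping the one data set.
single_fit <- mice(covariates, m = 1, maxit = 10, seed = 2021, printFlag = FALSE)
single     <- complete(single_fit, 1)
```

With multiple imputation, propensity scores would then be estimated separately within each element of `imputed_list`; with single imputation, they would be estimated once on `single`.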

PROPENSITY SCORE ESTIMATION WITH LOGISTIC REGRESSION

The most common method to estimate propensity scores is logistic regression, because it is a parametric model that is familiar to many researchers. Although there are many advanced data mining methods that can potentially outperform logistic regression, I recommend that researchers use logistic regression first because it frequently produces propensity scores that result in adequate covariate balance. If you are able to achieve covariate balance using the propensity scores estimated with logistic regression, it is not necessary to use advanced data mining methods.

In the video below, I review R code for propensity score estimation with logistic regression.
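To make the idea concrete, here is a minimal sketch (again, not the code from the video) that estimates propensity scores with logistic regression using base R's `glm()`. The simulated data, the covariates `age`, `income`, and `female`, and the treatment indicator `treat` are all hypothetical.

```r
# Hypothetical simulated data: three covariates and a binary treatment.
set.seed(42)
n <- 500
dat <- data.frame(
  age    = rnorm(n, mean = 40, sd = 10),
  income = rnorm(n, mean = 50, sd = 15),
  female = rbinom(n, 1, 0.5)
)

# Treatment assignment depends on the covariates (assumed for illustration).
true_logit <- -0.5 + 0.03 * (dat$age - 40) + 0.02 * (dat$income - 50) +
  0.4 * dat$female
dat$treat <- rbinom(n, 1, plogis(true_logit))

# Logistic regression of the treatment indicator on the covariates.
ps_model <- glm(treat ~ age + income + female,
                data = dat, family = binomial(link = "logit"))

# The propensity score is the fitted probability of receiving treatment.
dat$pscore <- fitted(ps_model)

# Inspect the distribution of estimated propensity scores by group.
tapply(dat$pscore, dat$treat, summary)
```

The estimated scores in `dat$pscore` would then feed into matching, weighting, or stratification, after which covariate balance can be checked.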

Code for Chapter 2 Propensity Score Estimation

R Code for Propensity Score Estimation (file type: r, file size: 15 kb)

Data for Example of Propensity Score Estimation

R Data for Propensity Score Estimation Example (file type: rdata, file size: 294 kb)

Many data mining methods can be used to estimate propensity scores, such as generalized boosted modeling, random forests, and neural networks. In this video, I show how to estimate propensity scores using generalized boosted modeling with the *twang* package of R.

In the video below, I show how to estimate propensity scores with random forests using the *party* package of R.

**Related Research:**

Leite, W. L., Aydin, B., & Cetin-Berber, D. D. (2021). Imputation of Missing Covariate Data Prior to Propensity Score Analysis: A Tutorial and Evaluation of Robustness of Practical Approaches. *Evaluation Review*. https://doi.org/10.1177/0193841X211020245

Code for the paper

Collier, Z. K., & Leite, W. L. (2021). A Tutorial on Artificial Neural Networks in Propensity Score Analysis. *Journal of Experimental Education*. https://doi.org/10.1080/00220973.2020.1854158
