Matching for Several Sparse Nominal Variables in a Case-Control Study of Readmission Following Surgery
Coauthor(s): Caroline Reinke, Rachel Kelz, Jeffrey Silber, Paul Rosenbaum.
Matching for several nominal covariates with many levels has usually been thought to be difficult because these covariates combine to form an enormous number of interaction categories with few if any people in most such categories. Moreover, because nominal variables are not ordered, there is often no notion of a "close substitute" when an exact match is unavailable. In a case-control study of the risk factors for readmission within 30 days of surgery in the Medicare population, we wished to match for 47 hospitals, 15 surgical procedures nested within 5 procedure groups, and two genders, giving 47 × 15 × 2 = 1410 categories. In addition, we wished to match as closely as possible for the continuous variable age (65–80 years). There were 1380 readmitted patients, or cases, which is fewer than the number of categories. A fractional factorial experiment may balance main effects and low-order interactions without achieving balance for high-order interactions. In an analogous fashion, we balance certain main effects and low-order interactions among the covariates; moreover, we use as many exactly matched pairs as possible. This is done by creating a match that is exact for several variables, with a close match for age, and both a "near-exact match" and a "finely balanced match" for another nominal variable, in this case a 47 × 5 = 235 category variable representing the interaction of the 47 hospitals and the five surgical procedure groups. The method is easily implemented in R.
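The sparsity arithmetic and the idea of fine balance can be illustrated with a minimal Python sketch. It is not the authors' R implementation: the function names `is_finely_balanced` and `exact_pair_count` and the toy hospital-by-group labels are hypothetical, introduced only for illustration. The sketch assumes the usual meaning of fine balance, namely that each level of a nominal variable occurs equally often among cases and matched controls, whether or not individual pairs agree on that level.

```python
from collections import Counter

# Counts taken from the abstract.
n_hospitals, n_procedures, n_genders = 47, 15, 2
n_procedure_groups = 5
n_cases = 1380

full_cells = n_hospitals * n_procedures * n_genders  # hospital x procedure x gender
coarse_cells = n_hospitals * n_procedure_groups      # hospital x procedure group

# Average cases per cell if cases were spread evenly (an idealization;
# real data are lumpier, so many cells would be empty).
cases_per_full_cell = n_cases / full_cells
cases_per_coarse_cell = n_cases / coarse_cells


def is_finely_balanced(case_levels, control_levels):
    """Return True when every level has the same count in both groups."""
    return Counter(case_levels) == Counter(control_levels)


def exact_pair_count(case_levels, control_levels):
    """Count matched pairs (aligned by position) that agree on the level."""
    return sum(c == k for c, k in zip(case_levels, control_levels))


# Hypothetical toy match: pair i is (toy_cases[i], toy_controls[i]).
toy_cases = ["H1-G1", "H1-G2", "H2-G1", "H2-G2"]
toy_controls = ["H1-G1", "H2-G1", "H1-G2", "H2-G2"]

print(full_cells, coarse_cells)
print(round(cases_per_full_cell, 3), round(cases_per_coarse_cell, 3))
print(is_finely_balanced(toy_cases, toy_controls))
print(exact_pair_count(toy_cases, toy_controls))
```

In the toy match only two of the four pairs agree exactly, yet the marginal counts of each level are identical in the two groups. This is the sense in which a match can be finely balanced without being exact.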
Source: The American Statistician
Zubizarreta, Jose, Caroline Reinke, Rachel Kelz, Jeffrey Silber, and Paul Rosenbaum. "Matching for Several Sparse Nominal Variables in a Case-Control Study of Readmission Following Surgery." The American Statistician 65, no. 4 (2011): 229-238.