Working with statistical model results

Working with statistical model results in R often means learning the class of the model object before you can manipulate it further. A few packages can help with that.

Model results are complex data structures that often vary a lot depending on the package and underlying model procedure that produced them.
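To make that variation concrete, here is a minimal sketch that uses only base R and the built-in `mtcars` data. The two models are my own choice for illustration: a linear model and a logistic regression return objects with different classes, different components, and differently labelled coefficient tables.

```r
# Two models fitted with base R on the built-in mtcars data
fit_lm  <- lm(mpg ~ wt + hp, data = mtcars)
fit_glm <- glm(am ~ wt, data = mtcars, family = binomial)

class(fit_lm)   # "lm"
class(fit_glm)  # "glm" "lm"

# Components that exist in the glm object but not in the lm object
setdiff(names(fit_glm), names(fit_lm))

# Coefficient tables are matrices whose column names depend on the model:
# "t value" for the linear model, "z value" for the logistic regression
colnames(coef(summary(fit_lm)))
colnames(coef(summary(fit_glm)))
```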

Learning the internals of each model class is instructive at first but quickly becomes tedious, so several R packages have been developed to convert model results into objects that are easier to manipulate and/or print.

This note presents some of these packages.

`broom`

The broom package renders a large range of statistical models as “tidy data frames”. Its introductory vignette explains the details and provides a few simple examples.
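As a rough illustration, the three main verbs of broom can be tried on a simple linear model. The model and the `mtcars` data are assumptions of mine, not examples taken from the vignette.

```r
library(broom)

fit <- lm(mpg ~ wt + hp, data = mtcars)

# One row per coefficient: term, estimate, std.error, statistic, p.value
tidy(fit)

# One row for the whole model: r.squared, sigma, AIC, ...
glance(fit)

# One row per observation, with added columns such as .fitted and .resid
augment(fit)
```

Each call returns a data frame, so the results can be filtered, joined or plotted like any other data.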

My small contribution to the package includes tidiers for the exponential random graph models produced by the ergm and xergm packages.

There is still a lot that a package like broom could try to achieve. Out of this list of ideas by Ben Bolker, the feature that I would find most useful is a ‘tidy’ method for marginal effects and conditional probabilities: see below.

`margins` and `prediction`

The margins package is a port of the Stata margins and marginsplot commands, which produce numerical estimates and plots of marginal effects from model results. The logic of the package is very well presented in its introductory vignette.
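A minimal sketch of the workflow, assuming a linear model with an interaction term fitted on the built-in `mtcars` data (my choice of example): with an interaction, the coefficients alone no longer give the effect of `cyl` or `hp`, which is where average marginal effects become useful.

```r
library(margins)

# The cyl * hp interaction means the effect of each variable
# depends on the value of the other
fit <- lm(mpg ~ cyl * hp + wt, data = mtcars)

# Marginal effects computed for every observation
m <- margins(fit)

# Average marginal effects as a data frame, one row per variable
summary(m)
```

Calling `plot(m)` on the same object draws the average marginal effects with their confidence intervals, in the spirit of Stata's marginsplot.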

The prediction package is the logical complement to margins, and is written by the same author. In similar fashion to the broom package, the prediction package brings more “tidy” logic to model results by wrapping around the output of many, many predict functions, in order to always return a (tidy) data frame.

Details on the prediction package are not yet available as a vignette, but the README file of its GitHub repository is explicit enough.
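For a quick idea of what that looks like, here is a small sketch, again on a linear model of my own choosing fitted on `mtcars`. The `at` argument requests predictions at chosen values of a predictor.

```r
library(prediction)

fit <- lm(mpg ~ wt + hp, data = mtcars)

# Predictions for the data used to fit the model: the original
# columns are returned alongside a `fitted` column
p <- prediction(fit)
head(p)

# Predictions with wt set to 2, 3 and 4, other variables as observed
prediction(fit, at = list(wt = c(2, 3, 4)))
```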

`texreg`

Like the stargazer package, the texreg package renders a large range of statistical models as plain text, HTML or TeX tables. The package also comes with a detailed introductory vignette.
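The three output formats map onto three functions. The sketch below compares two nested linear models, which are illustrative choices of mine, side by side.

```r
library(texreg)

m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- lm(mpg ~ wt + hp, data = mtcars)

screenreg(list(m1, m2))  # plain text, for the console
texreg(list(m1, m2))     # LaTeX
htmlreg(list(m1, m2))    # HTML
```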

My sole contribution to the package has consisted in adding support for displaying the residual standard error of linear models, which is usually the only goodness of fit statistic that I care about.
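For reference, the residual standard error of a linear model is the square root of the residual sum of squares divided by the residual degrees of freedom. In base R it is returned by `sigma()`, as this short check on an illustrative model shows.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Residual standard error, as reported by summary(fit)
sigma(fit)

# The same quantity computed by hand
sqrt(sum(residuals(fit)^2) / df.residual(fit))
```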

If you are familiar with Stata, you are almost certainly familiar with the estout, leanout, outreg and outreg2 packages, which perform similar operations. The leanout package, in particular, embeds a very convincing logic that trims down regression results to their bare essentials (i.e. t-values instead of p-values) and privileges the root mean squared error (the residual standard error) over the “R-squared.” In R, some of the same logic is embedded in the lme4 package.

This note was heavily updated on July 2, 2016 and April 19, 2017.

Update (November 24, 2016): this blog post by Joseph Rickert contains links to several recent R packages that help to manipulate various model results. See the prediction, glm.predict, MCMCvis and oddsratio packages.

First published on September 17th, 2015
