Working with statistical model results
Working with statistical model results in R often means that the user has to learn about the class of the model to further manipulate it. A few packages can help with that.
Model results are complex data structures that often vary a lot depending on the package and underlying model procedure that produced them.
Learning about each model object class is instructive at first but quickly becomes tedious, so several R packages have been developed to convert model results into objects that are easier to manipulate and/or print.
This note presents some of these packages.
broom
The broom package renders a large range of statistical models as “tidy data frames”. Its introductory vignette explains the details and provides a few simple examples.
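As a minimal sketch of what “tidy data frames” means here, the example below fits a linear model and passes it to the three main broom verbs. The model formula and the use of the built-in mtcars dataset are my own choices for illustration, not examples taken from the vignette.

```r
library(broom)

# Illustrative model on a dataset that ships with R (an assumption for this sketch)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# One row per model term: estimate, std.error, statistic, p.value
coefs <- tidy(fit)

# One row for the whole model: r.squared, sigma, AIC, ...
fit_stats <- glance(fit)

# One row per observation: model variables plus .fitted, .resid, ...
obs <- augment(fit)

print(coefs)
```

Since all three results are ordinary data frames, they can be filtered, joined or plotted like any other data.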
My small contribution to the package includes tidiers for the exponential random graph models produced by the ergm and xergm packages.
There are still a lot of different things that a package like broom could try to achieve. Out of this list of ideas by Ben Bolker, the feature that I would find most useful is a ‘tidy’ method for marginal effects and conditional probabilities: see below.
margins and prediction
The margins package is a port of the Stata margins and marginsplot commands, which produce numerical estimates and plots of marginal effects from model results. The logic of the package is very well presented in its introductory vignette.
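The following sketch shows the basic workflow as I understand it: fit a model, call margins() on it, and summarise the result to obtain average marginal effects. The model specification is an arbitrary one on the built-in mtcars data, chosen only because it contains an interaction term.

```r
library(margins)

# Illustrative model with an interaction, so that marginal effects
# differ from the raw coefficients (an assumption for this sketch)
fit <- lm(mpg ~ cyl * hp + wt, data = mtcars)

# Compute marginal effects for every variable in the model
mfx <- margins(fit)

# Average marginal effects as a data frame, one row per variable
ame <- summary(mfx)
print(ame)

# plot(mfx) would draw the same estimates, in the spirit of marginsplot
```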
The prediction package is the logical complement to margins, and is written by the same author. In similar fashion to the broom package, the prediction package brings more “tidy” logic to model results by wrapping around the output of many, many predict functions, in order to always return a (tidy) data frame.
Details on the prediction package are not yet available as a vignette, but the README file of its GitHub repository is explicit enough.
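To illustrate the difference with base R, here is a small sketch that compares predict(), which returns a bare numeric vector for a linear model, with prediction(), which returns a data frame. The model and dataset are again illustrative assumptions of mine.

```r
library(prediction)

# Illustrative model on a built-in dataset (an assumption for this sketch)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Base R: a named numeric vector of fitted values
base_pred <- predict(fit)

# prediction(): a data frame holding the data plus a 'fitted' column
pred <- prediction(fit)

head(pred)
```

The benefit shows mostly when switching between model classes, since the return value stays a data frame regardless of what the underlying predict method produces.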
texreg
Like the stargazer package, the texreg package renders a large range of statistical models as plain text, HTML or TeX tables. The package also comes with a detailed introductory vignette.
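The three output formats map onto three functions, which the sketch below applies to the same pair of models. The models themselves are arbitrary examples on the built-in mtcars data.

```r
library(texreg)

# Two nested illustrative models (an assumption for this sketch)
m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- lm(mpg ~ wt + hp, data = mtcars)
models <- list(m1, m2)

# Plain text table, printed to the console
screenreg(models)

# TeX and HTML versions of the same table, returned as character strings
tex <- texreg(models)
html <- htmlreg(models)
```

Both texreg() and htmlreg() also accept a file argument to write the table to disk instead of returning it.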
My sole contribution to the package has consisted in adding support for displaying the residual standard error of linear models, which is usually the only goodness of fit statistic that I care about.
If you are familiar with Stata, you are almost certainly familiar with the estout, leanout, outreg and outreg2 packages, which perform similar operations. The leanout package, in particular, embeds a very convincing logic that trims down regression results to their bare essentials (i.e. t-values instead of p-values) and privileges the root mean squared error (the residual standard error) over the “R-squared.” In R, some of the same logic is embedded in the lme4 package.
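The same “lean” logic can be approximated in base R without any extra package. The sketch below is my own rough equivalent, not a reproduction of what leanout prints: it keeps only estimates and t-values, and reports the residual standard error through sigma().

```r
# Illustrative model on a built-in dataset (an assumption for this sketch)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Keep only the estimates and t-values from the coefficient table
lean <- coef(summary(fit))[, c("Estimate", "t value")]

# Residual standard error (root mean squared error of the fit)
rmse <- sigma(fit)

print(round(lean, 2))
cat("Residual standard error:", round(rmse, 2), "\n")
```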
This note was heavily updated on July 2, 2016 and April 19, 2017.
Update (November 24, 2016): this blog post by Joseph Rickert contains links to several recent R packages that help to manipulate various model results: see the prediction, glm.predict, MCMCvis and oddsratio packages.
- First published on September 17th, 2015