# Anova-Freak and Bayesian Hipster #rstats

This article is originally published at https://strengejacke.wordpress.com

I’m pleased to announce an update of my sjstats-package. New features are specifically implemented for the Anova and Bayesian statistic and summary functions. Here’s a short overview of what’s new…

# Anova statistics

Beside the already implemented functions to calculate eta-squared, partial eta-squared and omega-squared, it is now also possible to calculate *partial omega-squared* statistics for Anova-tables.

```
library(sjstats)
# load sample data
data(efc)
# fit linear model
fit <- aov(
c12hour ~ as.factor(e42dep) + as.factor(c172code) + c160age,
data = efc
)
omega_sq(fit, partial = TRUE)
#> # A tibble: 3 x 2
#> term partial.omegasq
#> 1 as.factor(e42dep) 0.278
#> 2 as.factor(c172code) 0.00547
#> 3 c160age 0.0649
```

Furthermore, `eta_sq()`

and `omega_sq()`

have a `ci.lvl`

-argument, which – if not `NULL`

– also computes a confidence interval.

For eta-squared, i.e. `eta_sq()`

with `partial = FALSE`

, due to non-symmetry, confidence intervals are based on bootstrap-methods. Confidence intervals for partial omega-squared, i.e. `omega_sq()`

with `partial = TRUE`

– are also based on bootstrapping. In these cases, `n`

indicates the number of bootstrap samples to be drawn to compute the confidence intervals.

```
eta_sq(fit, partial = TRUE, ci.lvl = .8)
#> # A tibble: 3 x 4
#> term partial.etasq conf.low conf.high
#> 1 as.factor(e42dep) 0.281 0.247 0.310
#> 2 as.factor(c172code) 0.00788 0.00109 0.0160
#> 3 c160age 0.0665 0.0467 0.0885
```

# Bayiasian statistics and summaries

Some of the summary-functions for Bayesian models were polished in the current update, e.g. `hdi()`

. Let’s fit a (non-sense) model with the great **brms**-package first (you’ll see later why I don’t use **rstanarm** here):

```
library(lme4)
library(brms)
library(dplyr)
data(sleepstudy)
sleepstudy$mygrp <- sample(1:5, size = 180, replace = TRUE)
sleepstudy %
group_by(mygrp) %>%
mutate(mysubgrp = sample(1:30, size = n(), replace = TRUE))
m <- brm(
Reaction ~ Days + (1 | mygrp / mysubgrp) + (1 | Subject),
data = sleepstudy
)
```

## Highest Density Intervals

`hdi()`

calculates by default the 89% HDI for Bayesian regression models:

```
hdi(m)
#> # A tibble: 3 x 3
#> term hdi.low hdi.high
#> 1 b_Intercept 232.00 268.0
#> 2 b_Days 9.15 11.8
#> 3 sigma 27.20 33.5
```

Now you can calculate the HDI for multiple tresholds (probabilities) at once, using the `prob`

-argument:

```
hdi(m, prob = c(.5, .8, .95))
#> # A tibble: 3 x 7
#> term hdi.low_0.5 hdi.high_0.5 hdi.low_0.8 hdi.high_0.8 hdi.low_0.95 hdi.high_0.95
#> 1 b_Intercept 244.00 258.0 237.00 265.0 229.00 274.0
#> 2 b_Days 9.97 11.1 9.51 11.6 8.90 12.1
#> 3 sigma 28.90 31.5 27.60 32.5 26.60 33.8
```

## Tidy summary

A tidy summary, similar to the `tidy()`

-function from the **broom**-package, can be obtained with `tidy_stan()`

. By default, for multilevel models, this function only shows the fixed effects. Use the different options from the `type`

-argument if you want random effects only or both random and fixed effects to be printed.

```
tidy_stan(m)
#> # A tibble: 3 x 8
#> term estimate std.error hdi.low hdi.high n_eff Rhat mcse
#> 1 b_Intercept 250.0 10.900 233.00 268.0 0.326 1.00 0.312
#> 2 b_Days 10.6 0.835 9.18 11.8 1.000 1.00 0.013
#> 3 sigma 30.2 1.930 27.30 33.4 1.000 1.00 0.202
```

## Intraclass Correlation Coefficient (ICC)

The ICC is a useful statistic for random-intercept models, which describes the proportion of variance that is explained by the random effects structure (see `?icc`

for more details). The ICC is based on the variance and correlation components of the model, which you typically get with the `VarCorr`

function. According to the **brms**-package, there’s a good and a bad news: from a developers perspective, the „bad“ news is that `VarCorr.brmsfit`

has recently changed in its internal code structure, so I had to revise my `icc()`

-function to work with **brms**-models again, with success:

```
icc(m)
#> Bayesian mixed model
#> Family: gaussian (identity)
#> Formula: list() Reaction ~ Days + (1 | mygrp/mysubgrp) + (1 | Subject) list()
#>
#> ICC (mygrp): 0.032317
#> ICC (mygrp:mysubgrp): 0.013529
#> ICC (Subject): 0.598439
```

The good news is, that `VarCorr.brmsfit`

has recently changed in its internal code structure. Now it is also possible to get the variance and correlation components *for each sample* of the model, which allows us to compute the ICC *for each sample* of the model, which again in turn allows us to compute an ICC including uncertainty intervals very quickly – simply add `posterior = TRUE`

to the `icc()`

-function call:

```
icc(m, posterior = TRUE)
#>
```*# Random Effect Variances and ICC*
#>
#> Family: gaussian (identity)
#> Formula: Reaction ~ Days + (1 | mygrp/mysubgrp) + (1 | Subject)
#>
#> ## mygrp
#> ICC: 0.022 (HDI 89%: 0.000-0.104)
#> Between-group: 57.406 (HDI 89%: 0.000-277.588)
#>
#> ## mygrp:mysubgrp
#> ICC: 0.011 (HDI 89%: 0.000-0.050)
#> Between-group: 28.535 (HDI 89%: 0.000-127.271)
#>
#> ## Subject
#> ICC: 0.578 (HDI 89%: 0.419-0.737)
#> Between-group: 1449.446 (HDI 89%: 682.362-2413.380)
#>
#> ## Residuals
#> Within-group: 909.039 (HDI 89%: 722.154-1085.644)

By default, the 89% interval around the ICC „point estimate“ is shown in the output, however, the `print()`

-method has an argument to change the interval range to any value, e.g.

`print(icc(m, posterior = TRUE), prob = .5, digits = 2)`

.

# Final words

The colored output above is not just playing with CSS, the R console output is indeed colored!

Thanks for visiting r-craft.org

This article is originally published at https://strengejacke.wordpress.com

Please visit source website for post related comments.