5.3 One-Way ANOVA
A one-way ANOVA is a generalized version of the two-sample t-test that is used to determine whether there is a significant difference between the means of three or more groups. The null hypothesis is that all group means are equal, and the alternative is that at least one of the means is different from the rest. Written another way, the null hypothesis is that the difference between any two means is zero, and the alternative is that the difference between at least two means is not zero.
Hypotheses:
\(H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k \quad \text{or} \quad \mu_i - \mu_j = 0, \quad \forall\, i \neq j \in \{1, 2, 3, \dots, k\}\)
\(H_A: \mu_i \neq \mu_j \quad \text{or} \quad \mu_i - \mu_j \neq 0, \quad \text{for some } i \neq j \in \{1, 2, 3, \dots, k\}\)
Note: A one-way ANOVA does not tell us which means are different—only that a difference exists.
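To make the hypotheses concrete, here is a minimal sketch of a classical one-way ANOVA on simulated data, using only base R. The group labels, sample sizes, and effect size are made up for illustration. Note that this is the ordinary F-test, not the moderated F-statistic that limma reports.

```r
# Simulated data: three hypothetical groups with 10 observations each.
# Groups A and B share a mean of 0; group C is shifted to 1.5.
set.seed(1)
group <- factor(rep(c("A", "B", "C"), each = 10))
value <- rnorm(30, mean = rep(c(0, 0, 1.5), each = 10))

# Classical one-way ANOVA (equal variances assumed).
# H0: all group means are equal.
res <- oneway.test(value ~ group, var.equal = TRUE)
res$statistic  # F-statistic
res$p.value    # p-value for H0

# The same F-statistic computed by hand:
# between-group mean square divided by within-group mean square.
k <- nlevels(group)
n <- length(value)
group_means <- ave(value, group)  # each observation's group mean
ms_between <- sum((group_means - mean(value))^2) / (k - 1)
ms_within  <- sum((value - group_means)^2) / (n - k)
f_manual   <- ms_between / ms_within
```

A small p-value here only says that at least one group mean differs; it does not identify which one.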
MSnSet.utils::limma_gen is a wrapper around functions from the limma package that performs one-way ANOVA. We will use it to test if there is a significant difference between any two levels of SUBTYPE: “Immunoreactive,” “Proliferative,” “Mesenchymal,” and “Differentiated.” Since SUBTYPE is a factor, the first level (“Immunoreactive”) will be used as the reference. That is, we will be testing whether the means of the “Proliferative,” “Mesenchymal,” or “Differentiated” groups are different from the mean of the “Immunoreactive” group for each feature in the MSnSet m.
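The role of the reference level can be sketched with base R alone. The example below assumes that the wrapper builds its design matrix through the standard formula interface (this is an assumption, not something stated here), and the `pheno` data frame is a hypothetical stand-in for the sample annotation of m.

```r
# Hypothetical sample annotation: two samples per subtype.
# The level order is assumed, with "Immunoreactive" first.
subtype_levels <- c("Immunoreactive", "Proliferative",
                    "Mesenchymal", "Differentiated")
pheno <- data.frame(
  SUBTYPE = factor(rep(subtype_levels, each = 2), levels = subtype_levels)
)

# The first factor level is the reference level.
levels(pheno$SUBTYPE)[1]

# With treatment coding, every non-reference level gets its own
# coefficient, named "<variable><level>".
design <- model.matrix(~ SUBTYPE, data = pheno)
colnames(design)

# relevel() changes the reference, and with it the set of coefficients.
pheno_alt <- transform(pheno, SUBTYPE = relevel(SUBTYPE, ref = "Mesenchymal"))
design_alt <- model.matrix(~ SUBTYPE, data = pheno_alt)
colnames(design_alt)
```

Changing the reference level changes which pairwise differences appear as coefficients, but not the overall test of whether all group means are equal.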
anova_res <- limma_gen(eset = m, model.str = "~ SUBTYPE", 
                       coef.str = "SUBTYPE")
head(arrange(anova_res, adj.P.Val)) # top 6 rows arranged by adjusted p-value
##             SUBTYPEProliferative SUBTYPEMesenchymal SUBTYPEDifferentiated
## NP_055140.1           -0.4979740         0.24131186            -0.3342889
## NP_000388.2           -1.2232098        -0.21980158            -0.7849428
## NP_009005.1           -1.0097220         0.04832193            -0.6224298
## NP_000878.2           -0.7633419         0.07176514            -0.5563074
## NP_001944.1           -1.3465807        -0.17808291            -0.9476618
## NP_115584.1           -0.2718495         0.93758021             0.1842301
##                   AveExpr        F      P.Value    adj.P.Val
## NP_055140.1  2.269399e-18 24.74128 3.642291e-11 2.951348e-07
## NP_000388.2 -3.421920e-18 23.63972 8.266856e-11 3.349317e-07
## NP_009005.1 -1.273715e-17 19.72001 1.784885e-09 4.820974e-06
## NP_000878.2 -1.710960e-18 18.89587 3.521123e-09 5.195885e-06
## NP_001944.1 -5.322987e-18 19.03216 3.144239e-09 5.195885e-06
## NP_115584.1  1.172771e-18 18.76318 4.488608e-09 5.195885e-06
The row names are the features that were tested, and the first three columns are the average log2 fold-changes for each contrast: “Proliferative - Immunoreactive,” “Mesenchymal - Immunoreactive,” and “Differentiated - Immunoreactive.” That is, a positive value indicates that the mean of the “Immunoreactive” group is lower than the mean of the other group, and a negative value indicates that the mean of the “Immunoreactive” group is higher than the mean of the other group. To find the logFC between the “Proliferative” and “Mesenchymal” groups for protein NP_055140.1, for example, we would take the difference between “SUBTYPEProliferative” and “SUBTYPEMesenchymal”: -0.498 - 0.241 = -0.739. The other columns are:
- AveExpr: overall mean (same as rowMeans(exprs(m), na.rm = TRUE))
- F: moderated F-statistic
- P.Value: p-value
- adj.P.Val: BH-adjusted p-value
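The subtraction of two coefficient columns described above can be applied to every feature at once. The sketch below copies the coefficients of the first two proteins from the printed output into a small data frame; the column name `Prolif_vs_Mesen` is a hypothetical label chosen for this example.

```r
# Coefficients for the first two features, copied from the output above.
coefs <- data.frame(
  SUBTYPEProliferative  = c(-0.4979740, -1.2232098),
  SUBTYPEMesenchymal    = c(0.24131186, -0.21980158),
  SUBTYPEDifferentiated = c(-0.3342889, -0.7849428),
  row.names = c("NP_055140.1", "NP_000388.2")
)

# Both columns are differences from the "Immunoreactive" mean, so the
# reference mean cancels when one is subtracted from the other:
# (Prolif - Immuno) - (Mesen - Immuno) = Prolif - Mesen
coefs$Prolif_vs_Mesen <- coefs$SUBTYPEProliferative - coefs$SUBTYPEMesenchymal

round(coefs["NP_055140.1", "Prolif_vs_Mesen"], 3)  # -0.739
```

This gives only the estimated difference between the two non-reference groups. It does not provide a p-value for that pairwise comparison.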
Below is a graphical representation of the results for a specific feature.

We consider features with adjusted p-values less than 0.05 to differ significantly between at least two groups.
# TRUE - significant, FALSE - not significant
table(anova_res$adj.P.Val < 0.05)
## 
## FALSE  TRUE 
##  7049  1054
Of the features tested, 7049 do not exhibit statistically significant group differences, while 1054 do exhibit such differences.
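The relationship between raw p-values, BH adjustment, and the significance count can be sketched with simulated p-values. Everything in this example is made up for illustration (the number of features, the mix of null and non-null p-values, and the feature names), so it will not reproduce the counts above.

```r
set.seed(42)

# Simulated raw p-values: 900 null features (uniform on [0, 1]) and
# 100 features with signal (p-values concentrated near zero).
p_raw <- c(runif(900), rbeta(100, shape1 = 0.5, shape2 = 50))
names(p_raw) <- paste0("feature_", seq_along(p_raw))

# Benjamini-Hochberg adjustment across all features.
p_adj <- p.adjust(p_raw, method = "BH")

# TRUE - significant, FALSE - not significant
sig <- p_adj < 0.05
table(sig)

# Names of the features that pass the threshold.
sig_features <- names(p_raw)[sig]
head(sig_features)
```

BH-adjusted p-values are never smaller than the raw ones, so thresholding on the adjusted column selects fewer features than thresholding on the raw p-values would.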