Section 7 Differential Analysis
In this section, we will use wrappers around functions from the limma package to fit linear models (linear regression, t-test, and ANOVA) to proteomics data. While LIMMA was originally intended for use with microarray data, it is useful for other data types. When working with LIMMA, the LIMMA User’s Guide is an invaluable resource.
LIMMA makes use of empirical Bayes techniques to borrow information across all features being tested to increase the degrees of freedom available for the test statistics. This results in so-called moderated test statistics and improved power to detect differential expression (Gordon K. Smyth, 2004).
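To make the idea of borrowed degrees of freedom concrete, the following is a minimal sketch that calls limma directly rather than through the MSnSet.utils wrappers. The data are simulated and purely hypothetical (they are not the CPTAC dataset), and the two-group design and object names are assumptions chosen for illustration. After eBayes(), the fit stores the prior degrees of freedom estimated across all features (df.prior) alongside the per-feature residual degrees of freedom (df.residual); the moderated statistics use their combination (df.total).

```r
library(limma)

set.seed(1)

# Hypothetical data: 100 features measured in 6 samples (log-scale values)
n_features <- 100
n_samples <- 6
x <- matrix(rnorm(n_features * n_samples),
            nrow = n_features,
            dimnames = list(paste0("feature_", seq_len(n_features)),
                            paste0("sample_", seq_len(n_samples))))

# Two groups of three samples each (assumed design)
group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ group)

# Fit one linear model per feature, then moderate the variances
fit <- lmFit(x, design)
fit <- eBayes(fit)

# Degrees of freedom: prior (shared), residual (per feature), and total
fit$df.prior
head(fit$df.residual)
head(fit$df.total)

# Moderated t-statistics and p-values for the group effect
top <- topTable(fit, coef = "groupB", number = 5)
top
```

With only three samples per group, each feature has few residual degrees of freedom of its own, which is the situation where the moderation is expected to help most.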
We will use the CPTAC ovarian cancer proteomics dataset for this section. The required packages are MSnSet.utils for the LIMMA wrappers and volcano plots, dplyr for data frame manipulation, and ggplot2 for p-value histograms and to further customize the volcano plots. We load the cptac_oca data and assign oca.set to m, which will be used in the examples.
## Install missing packages
cran_packages <- c("remotes", "dplyr", "ggplot2")
for (pkg_i in cran_packages) {
  if (!require(pkg_i, quietly = TRUE, character.only = TRUE))
    install.packages(pkg_i)
}
if (!require("MSnSet.utils", quietly = TRUE))
  remotes::install_github("PNNL-Comp-Mass-Spec/MSnSet.utils")
## ------------------------
library(MSnSet.utils)
library(dplyr)
library(ggplot2)

# MSnSet for testing
data("cptac_oca")
m <- oca.set
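Before fitting any models, it can help to confirm what was loaded. The following is a minimal sketch, assuming that m is an MSnSet and therefore supports the standard Biobase accessors exprs() and pData(); the calls are namespaced in case Biobase is not attached. No particular dimensions or column names are assumed here.

```r
library(MSnSet.utils)

data("cptac_oca")
m <- oca.set

# Number of features (rows) and samples (columns)
dim(m)

# First few rows of the expression matrix for the first three samples
head(Biobase::exprs(m)[, 1:3])

# Sample-level variables available for use in model formulas
colnames(Biobase::pData(m))
```

The names returned by the last call are the candidates for the predictors and covariates used when specifying a model.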