Package 'crmn' reference manual

Title:	CCMN and Other Normalization Methods for Metabolomics Data
Description:	Implements the Cross-contribution Compensating Multiple standard Normalization (CCMN) method described in Redestig et al. (2009) Analytical Chemistry https://doi.org/10.1021/ac901143w and other normalization algorithms.
Authors:	Henning Redestig
Maintainer:	Henning Redestig <[email protected]>
License:	GPL (>= 3)
Version:	0.0.21
Built:	2025-03-29 02:44:27 UTC
Source:	https://github.com/hredestig/crmn

Accessor for the analytes

Description

Subset a data set so that it contains only the analytes.

Usage

analytes(object, standards=NULL, ...)

Arguments

`object`	an `ExpressionSet`, `matrix` or `data.frame`
`standards`	a logical vector indicating which rows are internal standards
`...`	not used

Value

subsetted dataset

Author(s)

Henning Redestig

Examples

data(mix)
analytes(mix)
analytes(exprs(mix), fData(mix)$tag == 'IS')

Accessor for the analytes

Description

Subset an expression set to remove the internal standards

Usage

analytes_eset(object, where = "tag", what = "IS", ...)

Arguments

`object`	an `ExpressionSet`
`where`	Column index or name in fData whose value equals `what` for the ISs (and something else for the analytes)
`what`	The value in column `where` that marks the ISs; analytes are the rows that do not equal it. Can be a vector of values too.
`...`	not used
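
The `where`/`what` selection can be pictured in base R as follows. This is a minimal sketch on toy data, not the package's implementation; the objects `fd` and `Y` and their row names are made up for illustration.

```r
# Toy feature annotation (hypothetical): one row per measured compound
fd <- data.frame(tag = c("IS", "analyte", "analyte", "IS"),
                 row.names = c("std1", "ala", "gly", "std2"))
# Toy intensity matrix with compounds as rows and samples as columns
Y <- matrix(1:8, nrow = 4, dimnames = list(rownames(fd), c("s1", "s2")))

where <- "tag"  # annotation column to inspect
what  <- "IS"   # value(s) marking the internal standards

is_standard   <- fd[[where]] %in% what
analyte_rows  <- Y[!is_standard, , drop = FALSE]  # rows not matching `what`
standard_rows <- Y[is_standard, , drop = FALSE]   # rows matching `what`
```

Using `%in%` rather than `==` is what lets `what` be a vector of several marker values.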

Value

ExpressionSet

Author(s)

Henning Redestig

Examples

data(mix)
analytes(mix)
fData(mix)$test <- fData(mix)$tag
analytes(mix, where="test")

Accessor for the analytes

Description

Subset an expression set to remove the internal standards

Usage

analytes_other(object, standards, ...)

Arguments

`object`	an `ExpressionSet`
`standards`	a logical vector indicating which rows are internal standards
`...`	not used

Value

ExpressionSet

Author(s)

Henning Redestig

Examples

data(mix)
analytes(exprs(mix), fData(mix)$tag == 'IS')

CRMN

Description

Normalize metabolomics data using CCMN and other methods

Details

Package:	crmn
Type:	Package
Developed since:	2009-05-14
Depends:	Biobase, pcaMethods (>= 1.20.2), pls, methods
License:	GPL (>=3)
LazyLoad:	yes

A package implementing the 'Cross-contribution compensating multiple standard normalization' described in Redestig et al. (2009) Analytical Chemistry, https://doi.org/10.1021/ac901143w. Can be used to normalize metabolomics data. Do openVignette("crmn") to see the manual.

Author(s)

Henning Redestig

Drop unused levels

Description

Drop unused factor levels in a data frame.

Usage

dropunusedlevels(x)

Arguments

`x`	the data frame
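
For orientation, the effect described here can be reproduced in base R. The helper `drop_unused_sketch` below is a hypothetical stand-in written for this illustration, not the package's code; base R's `droplevels()` offers comparable behaviour for data frames.

```r
# Hypothetical re-implementation, for illustration only
drop_unused_sketch <- function(x) {
  stopifnot(is.data.frame(x))
  is_fac <- vapply(x, is.factor, logical(1))
  # Calling factor() on a factor rebuilds it with only the levels that occur
  x[is_fac] <- lapply(x[is_fac], factor)
  x
}

sub <- iris[1:10, ]
nlevels(sub$Species)                      # 3: unused levels are still attached
nlevels(drop_unused_sketch(sub)$Species)  # 1: only "setosa" remains
```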

Author(s)

Henning Redestig

Examples

iris[1:10,]$Species
dropunusedlevels(iris[1:10,])$Species

Make X

Description

Construct a design matrix

Usage

makeX(object, factors, ...)

## S4 method for signature 'ANY,matrix'
makeX(object, factors, ...)

## S4 method for signature 'ExpressionSet,character'
makeX(object, factors, ...)

Arguments

`object`	an `ExpressionSet`
`factors`	column names from the pheno data of `object` or a design matrix
`...`	not used

Details

Make a design matrix from the pheno data slot of an expression set, taking care that factors and numeric variables are handled properly. No interactions are included and the formula is the simplest possible, i.e. y~-1+term1+term2+.... `object` can also be anything, in which case `factors` must be a design matrix. In that case the same design matrix is returned.
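
The kind of formula described above can be built by hand with `model.matrix`. This is a sketch on a made-up pheno data frame (`pheno` and its columns are assumptions for illustration); it is not taken from the package source.

```r
# Toy pheno data (hypothetical): one factor and one numeric covariate
pheno <- data.frame(type = factor(c("a", "a", "b", "b")),
                    runorder = c(1, 2, 3, 4))
factors <- c("type", "runorder")

# Build the one-sided formula ~ -1 + type + runorder from the column names
fml <- as.formula(paste("~ -1 +", paste(factors, collapse = " + ")))
X <- model.matrix(fml, data = pheno)
colnames(X)  # "typea" "typeb" "runorder"
```

Because the intercept is dropped, the factor gets one indicator column per level, while the numeric covariate enters as a single column.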

Value

a design matrix

Author(s)

Henning Redestig

Examples

data(mix)
makeX(mix, "runorder")
runorder <- mix$runorder
makeX(mix, model.matrix(~-1+runorder))

Accessor for the method

Description

Get the method

Usage

method(object, ...)

Arguments

`object`	an `nFit` object
`...`	not used

Value

the method (content differs between normalization methods)

Author(s)

Henning Redestig

Matrix safe accessor of expression slot

Description

Get the expression data from an ExpressionSet or just return the given matrix

Usage

mexprs(object)

## S4 method for signature 'ExpressionSet'
mexprs(object)

Arguments

`object`	an `ExpressionSet` or `matrix`

Value

the expression data

Author(s)

Henning Redestig

Examples

data(mix)
head(mexprs(mix))
head(mexprs(exprs(mix)))

Accessor

Description

Matrix safe setter of expression slot

Usage

mexprs(object) <- value

## S4 replacement method for signature 'ExpressionSet,matrix'
mexprs(object) <- value

Arguments

`object`	an `ExpressionSet` or `matrix`
`value`	the value to assign

Details

Set the expression data in an ExpressionSet or just return the given matrix

Value

the expression data

Author(s)

Henning Redestig

Examples

data(mix)
test <- mix
mexprs(test) <- exprs(mix) * 0
head(mexprs(test))
test <- exprs(mix)
mexprs(test) <- test * 0
head(mexprs(test))

Dilution mixture dataset.

Description

Mixture dilution series

Usage

data(mix)

Details

Multi-component dilution series. GC-TOF/MS measurements by Miyako Kusano. Input concentrations are known and given in the original publication.

Author(s)

Henning Redestig

Examples

data(mix)
fData(mix)
exprs(mix)
pData(mix)

Accessor for the model

Description

Get the model

Usage

model(object, ...)

Arguments

`object`	an `nFit` object
`...`	not used

Value

the model (content differs between normalization models)

Author(s)

Henning Redestig

Normalization model

Description

Common class representation for normalization models.

Author(s)

Henning Redestig

Normalize a metabolomics dataset

Description

Normalization methods for metabolomics data

Usage

normalize(object, method, segments = NULL, ...)

Arguments

`object`	an `ExpressionSet`
`method`	the desired method
`segments`	normalization in a cross-validation setup, only to use for validation/QC purposes.
`...`	passed on to `normFit` and `normPred`

Details

Wrapper function for normFit and normPred

Value

the normalized dataset

Author(s)

Henning Redestig

Examples

data(mix)
normalize(mix, "crmn", factor="type", ncomp=3)
#other methods
normalize(mix, "one")
normalize(mix, "avg")
normalize(mix, "nomis")
normalize(mix, "t1")
normalize(mix, "ri")
normalize(mix, "median")
normalize(mix, "totL2")
## can also do normalization with matrices
Y <- exprs(mix)
G <- with(pData(mix), model.matrix(~-1+type))
isIS <- with(fData(mix), tag == "IS")
normalize(Y, "crmn", factor=G, ncomp=3, standards=isIS)

Fit a normalization model

Description

Fit the parameters for normalization of a metabolomics data set.

Usage

normFit(
  object,
  method,
  one = "Succinate_d4",
  factors = NULL,
  lg = TRUE,
  fitfunc = lm,
  formula = TRUE,
  ...
)

Arguments

`object`	an `ExpressionSet` or a `matrix` (with samples as columns) in which case the `standards` must be passed on via `...`
`method`	chosen normalization method
`one`	single internal standard to use for normalization
`factors`	column names in the pheno data slot describing the biological factors. Or a design matrix directly.
`lg`	logical indicating that the data should be log transformed
`fitfunc`	the function that creates the model fit for normalization, must use the same interfaces as `lm`.
`formula`	if fitfunc has formula interface or not
`...`	passed on to `standardsFit`, `standards`, `analytes`

Details

Normalization is first done by fitting a model and then applying that model either to new data or the same data using normPred. Five different methods are implemented.

t1: divide by row-means of the $L_2$ scaled internal standards
one: divide by value of a single, user defined, internal standard
totL2: divide by the square of sums of the full dataset
nomis: See Sysi-Aho et al.
crmn: See Redestig et al.
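
As an illustration of the simplest of these, the arithmetic described for method `one` can be sketched in base R. The toy matrix below is invented, and details such as log transformation or which rows the package returns are not covered here; this shows only the division by a single internal standard.

```r
# Toy data (hypothetical): compounds as rows, samples as columns
Y <- rbind(Succinate_d4 = c(2, 4, 8),
           ala          = c(10, 20, 40),
           gly          = c(3, 6, 12))
colnames(Y) <- c("s1", "s2", "s3")
one <- "Succinate_d4"  # the single, user-defined internal standard

# Divide every sample (column) by its value of the chosen standard
analytes_only <- Y[rownames(Y) != one, , drop = FALSE]
normed <- sweep(analytes_only, 2, Y[one, ], "/")
normed
```

In this toy example the standard doubles from sample to sample exactly as the analytes do, so each analyte row becomes constant after normalization.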

Value

a normalization model

Author(s)

Henning Redestig

References

Sysi-Aho, M.; Katajamaa, M.; Yetukuri, L. & Oresic, M. Normalization method for metabolomics data using optimal selection of multiple internal standards. BMC Bioinformatics, 2007, 8, 93

Redestig, H.; Fukushima, A.; Stenlund, H.; Moritz, T.; Arita, M.; Saito, K. & Kusano, M. Compensation for systematic cross-contribution improves normalization of mass spectrometry based metabolomics data Anal Chem, 2009, 81, 7974-7980

Examples

data(mix)
nfit <- normFit(mix, "crmn", factors="type", ncomp=3)
slplot(sFit(nfit)$fit$pc, scol=as.integer(mix$runorder))
## same thing
Y <- exprs(mix)
G <- model.matrix(~-1+mix$type)
isIS <- fData(mix)$tag == 'IS'
nfit <- normFit(Y, "crmn", factors=G, ncomp=3, standards=isIS)
slplot(sFit(nfit)$fit$pc, scol=as.integer(mix$runorder))

Predict for normalization

Description

Predict the normalized data using a previously fitted normalization model.

Usage

normPred(normObj, newdata, factors = NULL, lg = TRUE, predfunc = predict, ...)

Arguments

`normObj`	the result from `normFit`
`newdata`	an `ExpressionSet` or a `matrix` (in which case the `standards` must be passed on via `...`), possibly the same as used to fit the normalization model in order to get the fitted data.
`factors`	column names in the pheno data slot describing the biological factors. Or a design matrix.
`lg`	logical indicating that the data should be log transformed
`predfunc`	the function to use to get predicted values from the fitted object (only for crmn)
`...`	passed on to `standardsPred`, `standardsFit`, `standards`, `analytes`

Details

Apply fitted normalization parameters to new data to get normalized data. Currently, matrices cannot be handled as input for methods 'RI' and 'one'.

Value

the normalized data

Author(s)

Henning Redestig

Examples

data(mix)
nfit <- normFit(mix, "crmn", factor="type", ncomp=3)
normedData <- normPred(nfit, mix, "type")
slplot(pca(t(log2(exprs(normedData)))), scol=as.integer(mix$type))
## same thing
Y <- exprs(mix)
G <- with(pData(mix), model.matrix(~-1+type))
isIS <- fData(mix)$tag == 'IS'
nfit <- normFit(Y, "crmn", factors=G, ncomp=3, standards=isIS)
normedData <- normPred(nfit, Y, G, standards=isIS)
slplot(pca(t(log2(normedData))), scol=as.integer(mix$type))

Muffle the pca function

Description

PCA and Q2 issue warnings about biasedness and poorly estimated PCs. The first is non-informative, and the poorly estimated PCs will show up as poor overfitting, which leads to a choice of fewer PCs, i.e. not a problem. This function is meant to muffle those warnings. Only used for versions of pcaMethods before 1.26.0.
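
The general base R technique for muffling selected warnings is a calling handler that invokes the `muffleWarning` restart. The sketch below assumes that a handler of this shape is what is meant; `muffle_matching` and `noisy` are hypothetical names, and the message pattern is invented.

```r
# Hypothetical handler factory: muffle only warnings whose message matches
muffle_matching <- function(pattern) {
  function(w) {
    if (grepl(pattern, conditionMessage(w))) {
      invokeRestart("muffleWarning")
    }
  }
}

noisy <- function() {
  warning("estimate may be biased")  # stands in for a non-informative warning
  42
}

# The computation continues after the warning and its value is returned
res <- withCallingHandlers(noisy(), warning = muffle_matching("biased"))
res
```

Warnings that do not match the pattern pass through the handler untouched, so unrelated problems stay visible.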

Usage

pcaMuffle(w)

Arguments

`w`	a warning

Value

nothing

Author(s)

Henning Redestig

Plot statistics for a CRMN normalization model

Description

Simple plot function for a CRMN normalization model.

Usage

## S3 method for class 'nFit'
plot(x, y = NULL, ...)

Arguments

`x`	an `nFit` object
`y`	not used
`...`	passed on to the scatter plot calls

Details

Shows Tz and the optimization (if computed) of the PCA model. The number of components used for normalization should not exceed the maximum indicated by Q2. The structure shown in the Tz plot indicates the analytical variance, which is exactly independent of the experimental design. The corresponding loading plot shows how this structure is captured by the ISs used.

Value

nothing

Author(s)

Henning Redestig

Examples

data(mix)
nfit <- normFit(mix, "crmn", factors="type", ncomp=2)
plot(nfit)

Accessor for the standards model

Description

Get the sFit

Usage

sFit(object, ...)

Arguments

`object`	an `nFit` object
`...`	not used

Value

the sFit is only defined for CRMN

Author(s)

Henning Redestig

Show method for nFit

Description

Show some basic information for an nFit model

Usage

## S4 method for signature 'nFit'
show(object)

Arguments

`object`	the `nFit` object

Value

prints some basic information

Author(s)

Henning Redestig

Examples

data(mix)
normFit(mix, "avg")

Show nfit

Description

Show method for nFit

Usage

show_nfit(object)

Arguments

`object`	the `nFit` object

Value

prints some basic information

Author(s)

Henning Redestig

Accessor for the Internal Standards

Description

Subset a data set so that it contains only the labeled internal standards.

Usage

standards(object, standards=NULL, ...)

Arguments

`object`	an `ExpressionSet`, `matrix` or `data.frame`
`standards`	a logical vector indicating which rows are internal standards
`...`	not used

Value

subsetted dataset

Author(s)

Henning Redestig

Examples

data(mix)
standards(mix)
standards(exprs(mix), fData(mix)$tag == 'IS')

Accessor for the Internal Standards

Description

Subset a data set so that it contains only the labeled internal standards.

Usage

standards_eset(object, where = "tag", what = "IS", ...)

Arguments

`object`	an `ExpressionSet`
`where`	Column index or name in fData which equals `what` for the ISs
`what`	What the column `where` equals for ISs
`...`	not used

Value

subsetted dataset

Author(s)

Henning Redestig

Examples

data(mix)
standards(mix)
fData(mix)$test <- fData(mix)$tag
standards(mix, where="test")

Accessor for the Internal Standards

Description

Subset a data set so that it contains only the labeled internal standards.

Usage

standards_other(object, standards, ...)

Arguments

`object`	a `matrix` or `data.frame`
`standards`	a logical vector indicating which rows are internal standards
`...`	not used

Value

subsetted dataset

Author(s)

Henning Redestig

Examples

data(mix)
standards(exprs(mix), fData(mix)$tag == 'IS')

Standards model

Description

Fit a model which describes the variation of the labeled internal standards from the biological factors.

Usage

standardsFit(object, factors, ncomp = NULL, lg = TRUE, fitfunc = lm, ...)

Arguments

`object`	an `ExpressionSet` or a `matrix`. Note that if you pass a `matrix`, you have to specify the identity of the standards by passing the appropriate argument to `standards`.
`factors`	the biological factors described in the pheno data slot if `object` is an `ExpressionSet` or a design matrix if `object` is a `matrix`.
`ncomp`	number of PCA components to use. Determined by cross-validation if left `NULL`
`lg`	logical indicating that the data should be log transformed
`fitfunc`	the function that creates the model fit for normalization, must use the same interfaces as `lm`.
`...`	passed on to `Q2`, `pca` (if pcaMethods > 1.26.0), `standards` and `analytes`

Details

There is often unwanted variation among the labeled internal standards which is related to the experimental factors due to overlapping peaks etc. This function fits a model that describes that overlapping variation using a scaled and centered PCA / multiple linear regression model. Scaling is done outside the PCA model.
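
One way to read this description is sketched below in base R: scale the standards, regress them on the design matrix, and run a PCA on what the design does not explain. This is an interpretation under stated assumptions, using `lm` and `prcomp` on simulated data, and is not the package's implementation (which relies on pcaMethods); all object names are invented.

```r
set.seed(1)
# Toy data (hypothetical): 3 internal standards (rows) in 8 samples (columns)
G <- model.matrix(~ -1 + factor(rep(c("a", "b"), each = 4)))  # design matrix
Z <- 2^matrix(rnorm(3 * 8), nrow = 3,
              dimnames = list(paste0("IS", 1:3), paste0("s", 1:8)))

# Center and scale each standard across samples (samples become rows);
# the scaling happens here, outside the PCA
Zs <- scale(t(log2(Z)))

# Multiple linear regression of the scaled standards on the design
fit <- lm(Zs ~ -1 + G)
E <- residuals(fit)  # variation not explained by the experimental factors

# PCA of the residuals; the scores play the role of Tz in this sketch
pc <- prcomp(E, center = FALSE, scale. = FALSE)
Tz <- pc$x[, 1:2]
```

By construction the residuals are orthogonal to the design matrix, which is why the resulting scores do not carry variation attributable to the experimental factors.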

Value

a list containing the PCA/MLR model, the recommended number of components for that model, the standard deviations and mean values and Q2/R2 for the fit.

Author(s)

Henning Redestig

Examples

data(mix)
sfit <- standardsFit(mix, "type", ncomp=3)
slplot(sfit$fit$pc)
## same thing
Y <- exprs(mix)
G <- model.matrix(~-1+mix$type)
isIS <- fData(mix)$tag == 'IS'
sfit <- standardsFit(Y, G, standards=isIS, ncomp=3)

Predict effect for new data (or get fitted data)

Description

Predicted values for the standards

Usage

standardsPred(model, newdata, factors, lg = TRUE, ...)

Arguments

`model`	result from `standardsFit`
`newdata`	an `ExpressionSet` or `matrix` with new data (or the data used to fit the model to get the fitted data)
`factors`	the biological factors described in the pheno data slot if `newdata` is an `ExpressionSet` or a design matrix if `newdata` is a `matrix`.
`lg`	logical indicating that the data should be log transformed
`...`	passed on to `standards` and `analytes`

Details

There is often unwanted variation among the labeled internal standards which is related to the experimental factors due to overlapping peaks etc. This function predicts this effect given a model of the overlapping variance. The prediction is given by $\hat{X}_{IS}=X_{IS}-X_{IS}B$

Value

the corrected data

Author(s)

Henning Redestig

Examples

data(mix)
fullFit <- standardsFit(mix, "type", ncomp=3)
sfit <- standardsFit(mix[,-1], "type", ncomp=3)
pred <- standardsPred(sfit, mix[,1], "type")
cor(scores(sfit$fit$pc)[1,], scores(fullFit$fit$pc)[1,])
## could just as well have been done as
Y <- exprs(mix)
G <- model.matrix(~-1+mix$type)
isIS <- fData(mix)$tag == 'IS'
fullFit <- standardsFit(Y, G, ncomp=3, standards=isIS)
sfit    <- standardsFit(Y[,-1], G[-1,], ncomp=3,
                        standards=isIS)
pred <- standardsPred(sfit, Y[,1,drop=FALSE], G[1,,drop=FALSE], standards=isIS)
cor(scores(sfit$fit$pc)[1,], scores(fullFit$fit$pc)[1,])

Normalize by sample weight

Description

Normalize samples by their weight (as in grams fresh weight)

Usage

weightnorm(object, weight = "weight", lg = FALSE)

Arguments

`object`	an `ExpressionSet`
`weight`	a string naming the pheno data column with the weight or a numeric vector with one weight value per sample.
`lg`	logical indicating whether the assay data are already on the log scale. If `TRUE`, the weight value is also log-transformed and subtraction is used instead of division.

Details

Normalize each sample by dividing by the loaded sample weight. The weight is taken from the pheno data (or given as a numeric vector with one value per sample). Missing values are not tolerated.
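
The two modes can be compared on a toy matrix in base R. This is only a sketch of the arithmetic described above; the data are invented, and the use of `log2` for the log-scale case is an assumption, since the base of the logarithm is not stated here.

```r
# Toy data (hypothetical): compounds as rows, samples as columns
Y <- matrix(c(10, 20, 30, 40), nrow = 2,
            dimnames = list(c("ala", "gly"), c("s1", "s2")))
w <- c(s1 = 2, s2 = 4)  # sample weights, e.g. grams fresh weight

stopifnot(!anyNA(w))    # missing weights are not tolerated

# Raw scale: divide each sample (column) by its weight
raw_normed <- sweep(Y, 2, w, "/")

# Log scale (log2 assumed): subtract the log-transformed weight instead
log_normed <- sweep(log2(Y), 2, log2(w), "-")
all.equal(2^log_normed, raw_normed)  # TRUE
```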

Value

the normalized expression set

Author(s)

Henning Redestig

Examples

data(mix)
w <- runif(ncol(mix),1, 1.3)
weightnorm(mix, w)
