Package 'calmate' reference manual

Title:	Improved Allele-Specific Copy Number of SNP Microarrays for Downstream Segmentation
Description:	The CalMaTe method calibrates preprocessed allele-specific copy number estimates (ASCNs) from DNA microarrays by controlling for single-nucleotide polymorphism-specific allelic crosstalk. The resulting ASCNs are on average more accurate, which increases the power of segmentation methods for detecting changes between copy number states in tumor studies including copy neutral loss of heterozygosity. CalMaTe applies to any ASCNs regardless of preprocessing method and microarray technology, e.g. Affymetrix and Illumina.
Authors:	Maria Ortiz [aut, ctb], Ander Aramburu [ctb], Henrik Bengtsson [aut, cre, cph], Pierre Neuvial [aut, ctb], Angel Rubio [aut, ctb]
Maintainer:	Henrik Bengtsson <[email protected]>
License:	LGPL (>= 2.1)
Version:	0.13.0
Built:	2024-08-24 05:00:20 UTC
Source:	https://github.com/HenrikBengtsson/calmate

Package calmate

Description

The CalMaTe method calibrates preprocessed allele-specific copy number estimates (ASCNs) from DNA microarrays by controlling for single-nucleotide polymorphism-specific allelic crosstalk. The resulting ASCNs are on average more accurate, which increases the power of segmentation methods for detecting changes between copy number states in tumor studies including copy neutral loss of heterozygosity. CalMaTe applies to any ASCNs regardless of preprocessing method and microarray technology, e.g. Affymetrix and Illumina.

Requirements

This package depends on a set of packages that are all available via CRAN. It has been tested and verified to run on all common operating systems on which R runs, including Linux, Windows and OSX.

Installation and updates

To install this package, do install.packages("calmate").

To get started

To process SNP and non-polymorphic signals, see calmateByTotalAndFracB(). If you are working solely with SNP signals, calmateByThetaAB() is also available, but we recommend the former.
For processing data in the aroma framework, see CalMaTeCalibration.

How to cite

Please cite [1] when using CalMaTe.

License

LGPL (>= 2.1).

Author(s)

Maria Ortiz [aut, ctb], Ander Aramburu [ctb], Henrik Bengtsson [aut, cre, cph], Pierre Neuvial [aut, ctb], Angel Rubio [aut, ctb].

References

[1] M. Ortiz-Estevez, A. Aramburu, H. Bengtsson, P. Neuvial and A. Rubio, CalMaTe: A method and software to improve allele-specific copy number of SNP arrays for downstream segmentation, Bioinformatics, 2012 [PMC3381965].

Normalize allele-specific copy numbers (CA,CB)

Description

Normalize allele-specific copy numbers (CA,CB).

Usage

## S3 method for class 'array'
calmateByThetaAB(data, references=NULL, ..., truncate=FALSE, refAvgFcn=NULL,
  flavor=c("v2", "v1"), verbose=FALSE)
## S3 method for class 'array'
calmateByThetaAB(data, references=NULL, ..., truncate=FALSE, refAvgFcn=NULL,
  flavor=c("v2", "v1"), verbose=FALSE)

Arguments

`data`	An Jx2xI `numeric` `array`, where J is the number of SNPs, 2 is the number of alleles, and I is the number of samples.
`references`	An index `vector` in [1,I] or a `logical` `vector` of length I specifying which samples are used when calculating the reference signals. If `NULL`, all samples are used. At least 3 samples.
`...`	Additional arguments passed to the internal fit function `fitCalMaTeInternal`.
`truncate`	If `TRUE`, final ASCNs are forced to be non-negative while preserving the total CNs.
`refAvgFcn`	(optional) A `function` that takes a JxI `numeric` `matrix` an argument `na.rm` and returns a `numeric` `vector` of length J. It should calculate some type of average for each of the J rows, e.g. `rowMedians`. If specified, then the total copy numbers of the calibrated ASCNs are standardized toward (twice) the average of the total copy numbers of the calibrated reference ASCNs.
`flavor`	A `character` string specifying which flavor of the CalMaTe algorithm to use for fitting the model.
`verbose`	See `Verbose`.

Value

Returns an Jx2xI numeric array with the same dimension names as argument data.

Flavors

For backward compatibility, we try to keep all major versions of the CalMaTe algorithm available. Older versions can be used by specifying argument flavor. The default flavor is v2. For more information about the different flavors, see fitCalMaTeInternal.

References

Examples

# Load example (thetaA,thetaB) signals
path <- system.file("exData", package="calmate");
theta <- loadObject("thetaAB,100x2x40.Rbin", path=path);

# Calculate (CA,CB)
thetaR <- matrixStats::rowMedians(theta[,"A",] + theta[,"B",], na.rm=TRUE);
C <- 2*theta/thetaR;

# Calibrate (CA,CB) by CalMaTe
CC <- calmateByThetaAB(theta);

# Plot two "random" arrays
Clim <- c(0,4);
subplots(4, ncol=2, byrow=FALSE);
for (ii in c(1,5)) {
  sampleName <- dimnames(C)[[3]][ii];
  sampleLabel <- sprintf("Sample #%d ('%s')", ii, sampleName);
  plot(C[,,ii], xlim=Clim, ylim=Clim);
  title(main=sampleLabel);
  plot(CC[,,ii], xlim=Clim, ylim=Clim);
  title(main=sprintf("%s\ncalibrated", sampleLabel));
}
# Load example (thetaA,thetaB) signals
path <- system.file("exData", package="calmate");
theta <- loadObject("thetaAB,100x2x40.Rbin", path=path);

# Calculate (CA,CB)
thetaR <- matrixStats::rowMedians(theta[,"A",] + theta[,"B",], na.rm=TRUE);
C <- 2*theta/thetaR;

# Calibrate (CA,CB) by CalMaTe
CC <- calmateByThetaAB(theta);

# Plot two "random" arrays
Clim <- c(0,4);
subplots(4, ncol=2, byrow=FALSE);
for (ii in c(1,5)) {
  sampleName <- dimnames(C)[[3]][ii];
  sampleLabel <- sprintf("Sample #%d ('%s')", ii, sampleName);
  plot(C[,,ii], xlim=Clim, ylim=Clim);
  title(main=sampleLabel);
  plot(CC[,,ii], xlim=Clim, ylim=Clim);
  title(main=sprintf("%s\ncalibrated", sampleLabel));
}

Normalize allele-specific copy numbers (total,fracB)

Description

Normalize allele-specific copy numbers (total,fracB), where total is the total (non-polymorphic) signal and fracB is the allele B fraction. It is only loci with a non-missing (NA) fracB value that are considered to be SNPs and normalized by CalMaTe. The other loci are left untouched.

Usage

## S3 method for class 'array'
calmateByTotalAndFracB(data, references=NULL, ..., refAvgFcn=NULL, verbose=FALSE)
## S3 method for class 'array'
calmateByTotalAndFracB(data, references=NULL, ..., refAvgFcn=NULL, verbose=FALSE)

Arguments

`data`	An Jx2xI `numeric` `array`, where J is the number of loci, 2 is total and fracB (in that order, if unnamed), and I is the number of samples.
`references`	A `logical` or `numeric` `vector` specifying which samples should be used as the reference set. By default, all samples are considered. If not NULL at least 3 samples.
`...`	Additional arguments passed to `*calmateByThetaAB()`.
`refAvgFcn`	(optional) A `function` that takes a JxI `numeric` `matrix` an argument `na.rm` and returns a `numeric` `vector` of length J. It should calculate some type of average for each of the J rows, e.g. `rowMedians`. If specified, then the total copy numbers of the calibrated ASCNs are standardized toward (twice) the average of the total copy numbers of the calibrated reference ASCNs.
`verbose`	See `Verbose`.

Value

Returns an Jx2xI numeric array with the same dimension names as argument data.

References

Examples

# Load example (thetaA,thetaB) signals
path <- system.file("exData", package="calmate");
theta <- loadObject("thetaAB,100x2x40.Rbin", path=path);

# Transform to (total,fracB) signals
data <- thetaAB2TotalAndFracB(theta);

# Calibrate (total,fracB) by CalMaTe
dataC <- calmateByTotalAndFracB(data);

# Calculate copy-number ratios
theta <- data[,"total",];
thetaR <- matrixStats::rowMedians(theta, na.rm=TRUE);
data[,"total",] <- 2*theta/thetaR;

# Plot two "random" arrays
Clim <- c(0,4);
Blim <- c(0,1);
subplots(4, ncol=2, byrow=FALSE);
for (ii in c(1,5)) {
  sampleName <- dimnames(data)[[3]][ii];
  sampleLabel <- sprintf("Sample #%d ('%s')", ii, sampleName);
  plot(data[,,ii], xlim=Clim, ylim=Blim);
  title(main=sampleLabel);
  plot(dataC[,,ii], xlim=Clim, ylim=Blim);
  title(main=sprintf("%s\ncalibrated", sampleLabel));
}


# Assert that it also works with a single unit
dummy <- calmateByTotalAndFracB(data[1,,,drop=FALSE]);
stopifnot(length(dim(dummy)) == 3);
# Load example (thetaA,thetaB) signals
path <- system.file("exData", package="calmate");
theta <- loadObject("thetaAB,100x2x40.Rbin", path=path);

# Transform to (total,fracB) signals
data <- thetaAB2TotalAndFracB(theta);

# Calibrate (total,fracB) by CalMaTe
dataC <- calmateByTotalAndFracB(data);

# Calculate copy-number ratios
theta <- data[,"total",];
thetaR <- matrixStats::rowMedians(theta, na.rm=TRUE);
data[,"total",] <- 2*theta/thetaR;

# Plot two "random" arrays
Clim <- c(0,4);
Blim <- c(0,1);
subplots(4, ncol=2, byrow=FALSE);
for (ii in c(1,5)) {
  sampleName <- dimnames(data)[[3]][ii];
  sampleLabel <- sprintf("Sample #%d ('%s')", ii, sampleName);
  plot(data[,,ii], xlim=Clim, ylim=Blim);
  title(main=sampleLabel);
  plot(dataC[,,ii], xlim=Clim, ylim=Blim);
  title(main=sprintf("%s\ncalibrated", sampleLabel));
}


# Assert that it also works with a single unit
dummy <- calmateByTotalAndFracB(data[1,,,drop=FALSE]);
stopifnot(length(dim(dummy)) == 3);

The CalMaTeCalibration class

Description

Package: calmate
Class CalMaTeCalibration

Object
~~|
~~+--ParametersInterface
~~~~~~~|
~~~~~~~+--CalMaTeCalibration

Directly known subclasses:

public static class CalMaTeCalibration
extends ParametersInterface

This class represents the CalMaTe method [1], which corrects for SNP effects in allele-specific copy-number estimates (ASCNs).

Usage

CalMaTeCalibration(data=NULL, tags="*", references=NULL, flavor=c("v2", "v1"), ...)
CalMaTeCalibration(data=NULL, tags="*", references=NULL, flavor=c("v2", "v1"), ...)

Arguments

`data`	A named `list` with data set named `"total"` and `"fracB"` where the former should be of class `AromaUnitTotalCnBinarySet` and the latter of class `AromaUnitFracBCnBinarySet`. The two data sets must be for the same chip type, have the same number of samples and the same sample names.
`tags`	Tags added to the output data sets.
`references`	An optional `numeric` `vector` specifying which samples should be as reference samples for estimating the model parameters. If `NULL`, all samples are used.
`flavor`	A `character` string specifying which flavor of the CalMaTe algorithm to use for fitting the model. See `fitCalMaTeInternal` for details.
`...`	Additional arguments passed to `calmateByTotalAndFracB`().

Fields and Methods

Methods:

	`findUnitsTodo`	-
	`getDataSets`	-
	`getFullName`	-
	`getName`	-
	`getOutputDataSets`	-
	`getPath`	-
	`getReferences`	-
	`getRootPath`	-
	`getTags`	-
	`nbrOfFiles`	-
	`process`	-
	`setTags`	-

Methods inherited from ParametersInterface:
getParameterSets, getParameters, getParametersAsString

Methods inherited from Object:
$, $<-, [[, [[<-, as.character, attach, attachLocally, clearCache, clearLookupCache, clone, detach, equals, extend, finalize, getEnvironment, getFieldModifier, getFieldModifiers, getFields, getInstantiationTime, getStaticInstance, hasField, hashCode, ll, load, names, objectSize, print, save, asThis

Reference samples

In order to estimate the calibration parameters, the model assumes that, for any given SNP, there are a majority of samples that are diploid at that SNP. Note that it does not have to be the same set of samples for all SNPs.

By using argument references, it is possible so specify which samples should be used when estimating the calibration parameters. This is useful when for instance there are several tumor samples with unknown properties as well as a set of normal samples that can be assumed to be diploid.

Theoretical, a minimum of three reference samples are needed in order for the model to be identifiable. If less, an error is thrown. However, in practice more reference samples should be used, that is, in the order of at least 6-10 reference samples with a diverse set of genotypes.

Flavors

For backward compatibility, we try to keep all major versions of the CalMaTe algorithm available. Older versions can be used by specifying argument flavor. For more information about the different flavors, see fitCalMaTeInternal.

References

Examples

## Not run: 
 
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# CRMAv2 - Preprocess raw Affymetrix data
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
library("aroma.affymetrix");  # Needed for CRMAv2
dataSet <- "Affymetrix_2006-TumorNormal";
chipType <- "Mapping250K_Nsp";
dsList <- doCRMAv2(dataSet, chipType=chipType, combineAlleles=FALSE,
                                             plm="RmaCnPlm", verbose=-10);
print(dsList);


# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# CalMaTe - Post-calibration of ASCNs estimates
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
asn <- CalMaTeCalibration(dsList);
print(asn);

# For speed issues, we will here only process loci on Chromosome 17.
chr <- 17;
ugp <- getAromaUgpFile(dsList$total);
units <- getUnitsOnChromosome(ugp, chr);

dsNList <- process(asn, units=units, verbose=verbose);
print(dsNList);


# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Plot allele B fractions (before and after)
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Sample #1 and Chromosome 17
ii <- 1;

# Extract raw (TCN,BAF)
df <- getFile(dsList$total, ii);
dfR <- getAverageFile(dsList$total, verbose=verbose);
gamma <- extractRawCopyNumbers(df, logBase=NULL, chromosome=chr);
gammaR <- extractRawCopyNumbers(dfR, logBase=NULL, chromosome=chr);
gamma <- 2*divideBy(gamma, gammaR);
df <- getFile(dsList$fracB, ii);
beta <- extractRawAlleleBFractions(df, chromosome=chr);

# Extract calibrated (TCN,BAF)
dfN <- getFile(dsNList$fracB, ii);
betaN <- extractRawAlleleBFractions(dfN, chromosome=chr);
dfN <- getFile(dsNList$total, ii);
gammaN <- extractRawCopyNumbers(dfN, logBase=NULL, chromosome=chr);

# Plot
subplots(4, ncol=2, byrow=FALSE);
plot(beta);
title(sprintf("%s", getName(beta)));
plot(gamma);
plot(betaN);
title(sprintf("%s (CalMaTe)", getName(betaN)));
plot(gammaN);


## End(Not run)## Not run: 
 
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# CRMAv2 - Preprocess raw Affymetrix data
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
library("aroma.affymetrix");  # Needed for CRMAv2
dataSet <- "Affymetrix_2006-TumorNormal";
chipType <- "Mapping250K_Nsp";
dsList <- doCRMAv2(dataSet, chipType=chipType, combineAlleles=FALSE,
                                             plm="RmaCnPlm", verbose=-10);
print(dsList);


# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# CalMaTe - Post-calibration of ASCNs estimates
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
asn <- CalMaTeCalibration(dsList);
print(asn);

# For speed issues, we will here only process loci on Chromosome 17.
chr <- 17;
ugp <- getAromaUgpFile(dsList$total);
units <- getUnitsOnChromosome(ugp, chr);

dsNList <- process(asn, units=units, verbose=verbose);
print(dsNList);


# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Plot allele B fractions (before and after)
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Sample #1 and Chromosome 17
ii <- 1;

# Extract raw (TCN,BAF)
df <- getFile(dsList$total, ii);
dfR <- getAverageFile(dsList$total, verbose=verbose);
gamma <- extractRawCopyNumbers(df, logBase=NULL, chromosome=chr);
gammaR <- extractRawCopyNumbers(dfR, logBase=NULL, chromosome=chr);
gamma <- 2*divideBy(gamma, gammaR);
df <- getFile(dsList$fracB, ii);
beta <- extractRawAlleleBFractions(df, chromosome=chr);

# Extract calibrated (TCN,BAF)
dfN <- getFile(dsNList$fracB, ii);
betaN <- extractRawAlleleBFractions(dfN, chromosome=chr);
dfN <- getFile(dsNList$total, ii);
gammaN <- extractRawCopyNumbers(dfN, logBase=NULL, chromosome=chr);

# Plot
subplots(4, ncol=2, byrow=FALSE);
plot(beta);
title(sprintf("%s", getName(beta)));
plot(gamma);
plot(betaN);
title(sprintf("%s (CalMaTe)", getName(betaN)));
plot(gammaN);


## End(Not run)

Package 'calmate'

Help Index

Package calmate

Description

Requirements

Installation and updates

To get started

How to cite

License

Author(s)

References

Normalize allele-specific copy numbers (CA,CB)

Description

Usage

Arguments

Value

Flavors

References

See Also

Examples

Normalize allele-specific copy numbers (total,fracB)

Description

Usage

Arguments

Value

References

See Also

Examples

The CalMaTeCalibration class

Description

Usage

Arguments

Fields and Methods

Reference samples

Flavors

References

See Also

Examples