Title: | Analysis of Parent-Specific DNA Copy Numbers |
---|---|
Description: | Segmentation of allele-specific DNA copy number data and detection of regions with abnormal copy number within each parental chromosome. Both tumor-normal paired and tumor-only analyses are supported. |
Authors: | Henrik Bengtsson [aut, cre, cph], Pierre Neuvial [aut], Venkatraman E. Seshan [aut], Adam B. Olshen [aut], Paul T. Spellman [aut], Richard A. Olshen [aut], Frank E Harrell Jr [ctb] (Weighted quantile estimator adopted from Hmisc::wtd.quantile()) |
Maintainer: | Henrik Bengtsson <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.67.0 |
Built: | 2024-12-13 04:58:32 UTC |
Source: | https://github.com/HenrikBengtsson/PSCBS |
Segmentation of allele-specific DNA copy number data and detection of regions with abnormal copy number within each parental chromosome. Both tumor-normal paired and tumor-only analyses are supported..
This package should be considered to be in an alpha or beta phase. You should expect the API to be changing over time.
To install this package, use install.packages("PSCBS")
.
To get started, see:
Vignettes 'Parent-specific copy-number segmentation using Paired PSCBS' and 'Total copy-number segmentation using CBS'.
segmentByCBS
() - segments total copy-numbers, or any
other unimodal genomic signals, using the CBS method [3,4].
segmentByPairedPSCBS
() - segments allele-specific
tumor signal from a tumor with a matched normal
using the Paired PSCBS method [1,2].
segmentByNonPairedPSCBS
() - segments allele-specific
tumor signal from a tumor without a matched normal
using the Non-Paired PSCBS method adopted from [1,2].
Please use [1] and [2] to cite when using Paired PSCBS, and [3] and [4] when using CBS. When using Non-Paired PSCBS, please cite [1] and [2] as well.
GPL (>= 2).
Henrik Bengtsson
[1] A.B. Olshen, H. Bengtsson, P. Neuvial, P.T. Spellman, R.A. Olshen, V.E. Seshan, Parent-specific copy number in paired tumor-normal studies using circular binary segmentation, Bioinformatics, 2011
[2] H. Bengtsson, P. Neuvial and T.P. Speed, TumorBoost: Normalization of allele-specific tumor copy numbers from a single pair of tumor-normal genotyping microarrays, BMC Bioinformatics, 2010
[3] A.B. Olshen, E.S. Venkatraman (aka Venkatraman E. Seshan), R. Lucito and M. Wigler, Circular binary segmentation for the analysis of array-based DNA copy number data, Biostatistics, 2004
[4] E.S. Venkatraman and A.B. Olshen, A faster circular binary segmentation algorithm for the analysis of array CGH data, Bioinformatics, 2007
Calls/drops single-locus outliers along the genome that have a signal that differ significantly from the neighboring loci.
## Default S3 method: callSegmentationOutliers(y, chromosome=0, x=NULL, method="DNAcopy::smooth.CNA", ..., verbose=FALSE) ## S3 method for class 'data.frame' callSegmentationOutliers(y, ...) ## Default S3 method: dropSegmentationOutliers(y, ...) ## S3 method for class 'data.frame' dropSegmentationOutliers(y, ...)
## Default S3 method: callSegmentationOutliers(y, chromosome=0, x=NULL, method="DNAcopy::smooth.CNA", ..., verbose=FALSE) ## S3 method for class 'data.frame' callSegmentationOutliers(y, ...) ## Default S3 method: dropSegmentationOutliers(y, ...) ## S3 method for class 'data.frame' dropSegmentationOutliers(y, ...)
y |
|
chromosome |
(Optional) An |
x |
Optional |
method |
A |
... |
Additional arguments passed to internal outlier detection method. |
verbose |
See |
callSegmentationOutliers()
returns a logical
vector
of length J.
dropSegmentationOutliers()
returns an object of the same type
as argument y
, where the signals for which outliers were called
have been set to NA
.
Signals as well as genomic positions may contain missing
values, i.e. NA
s or NaN
s. By definition, these cannot
be outliers.
Henrik Bengtsson
Internally smooth.CNA
is utilized to identify
the outliers.
A CBS object holds results from the Circular Binary Segmentation (CBS) method for one sample for one or more chromosomes.
Package: PSCBS
Class CBS
list
~~|
~~+--
AbstractCBS
~~~~~~~|
~~~~~~~+--
CBS
Directly known subclasses:
public abstract class CBS
extends AbstractCBS
CBS(...)
CBS(...)
... |
Arguments passed to the constructor of |
Methods:
as |
- | |
c |
- | |
estimateStandardDeviation |
- | |
plotTracks |
- | |
pruneBySdUndo |
- | |
segmentByCBS |
- | |
seqOfSegmentsByDP |
- | |
writeSegments |
- | |
Methods inherited from AbstractCBS:
adjustPloidyScale, all.equal, as.data.frame, clearCalls, drawChangePoints, drawKnownSegments, dropChangePoint, dropChangePoints, dropRegion, dropRegions, extractCNs, extractChromosome, extractChromosomes, extractRegions, extractSegments, extractWIG, getChangePoints, getChromosomeOffsets, getChromosomeRanges, getChromosomes, getLocusData, getLocusSignalNames, getMeanEstimators, getSampleName, getSegmentSizes, getSegmentTrackPrefixes, getSegments, mergeThreeSegments, mergeTwoSegments, nbrOfChangePoints, nbrOfChromosomes, nbrOfLoci, nbrOfSegments, normalizeTotalCNs, ploidy, ploidy<-, plotTracks, print, pruneByDP, pruneByHClust, renameChromosomes, report, resegment, resetSegments, sampleCNs, sampleName, sampleName<-, seqOfSegmentsByDP, setLocusData, setMeanEstimators, setPloidy, setSampleName, setSegments, shiftTCN, tileChromosomes, updateMeans, writeWIG
Methods inherited from list:
Ops,nonStructure,vector-method, Ops,structure,vector-method, Ops,vector,nonStructure-method, Ops,vector,structure-method, all.equal, as.data.frame, attachLocally, callHooks, coerce,ANY,list-method, relist, type.convert, within
A CBS object is similar to DNAcopy objects with the major difference that a CBS object holds only one sample, whereas a DNAcopy object can hold more than one sample.
The segmentByCBS
() method returns an object of this class.
Henrik Bengtsson
Identifies gaps of a genome where there exist no observations.
## Default S3 method: findLargeGaps(chromosome=NULL, x, minLength, resolution=1L, ...)
## Default S3 method: findLargeGaps(chromosome=NULL, x, minLength, resolution=1L, ...)
chromosome |
(Optional) An |
x |
|
minLength |
A positive |
resolution |
A non-negative |
... |
Not used. |
Returns data.frame
zero or more rows and with columns
chromosome
(if given), start
, stop
,
and length
.
Henrik Bengtsson
Use gapsToSegments
() to turn the set of identified gaps into
the complementary set of segments such that they can be passed
to segmentByCBS
(), segmentByPairedPSCBS
() and
segmentByNonPairedPSCBS
() via argument knownSegments
.
Gets the genomic segments that are complementary to the gaps, with default chromosome boundaries being -Inf
and +Inf
.
## S3 method for class 'data.frame' gapsToSegments(gaps, resolution=1L, minLength=0L, dropGaps=FALSE, ...)
## S3 method for class 'data.frame' gapsToSegments(gaps, resolution=1L, minLength=0L, dropGaps=FALSE, ...)
gaps |
A |
resolution |
A non-negative |
minLength |
Minimum length of segments to be kept. |
dropGaps |
If |
... |
Not used. |
Returns data.frame
of least one row with columns chromosome
if that argument is given), start
, stop
and length
.
The segments are ordered along the genome.
Henrik Bengtsson
Package: PSCBS
Class NonPairedPSCBS
list
~~|
~~+--
AbstractCBS
~~~~~~~|
~~~~~~~+--
PSCBS
~~~~~~~~~~~~|
~~~~~~~~~~~~+--
NonPairedPSCBS
Directly known subclasses:
public abstract class NonPairedPSCBS
extends PSCBS
A NonPairedPSCBS is an object containing the results from the Non-paired PSCBS method.
NonPairedPSCBS(fit=list(), ...)
NonPairedPSCBS(fit=list(), ...)
fit |
A |
... |
Not used. |
Methods:
No methods defined.
Methods inherited from PSCBS:
as.data.frame, c, drawChangePoints, extractChromosomes, extractWIG, getLocusData, getLocusSignalNames, getSegmentTrackPrefixes, isLocallyPhased, isSegmentSplitter, normalizeTotalCNs, writeSegments
Methods inherited from AbstractCBS:
adjustPloidyScale, all.equal, as.data.frame, clearCalls, drawChangePoints, drawKnownSegments, dropChangePoint, dropChangePoints, dropRegion, dropRegions, extractCNs, extractChromosome, extractChromosomes, extractRegions, extractSegments, extractWIG, getChangePoints, getChromosomeOffsets, getChromosomeRanges, getChromosomes, getLocusData, getLocusSignalNames, getMeanEstimators, getSampleName, getSegmentSizes, getSegmentTrackPrefixes, getSegments, mergeThreeSegments, mergeTwoSegments, nbrOfChangePoints, nbrOfChromosomes, nbrOfLoci, nbrOfSegments, normalizeTotalCNs, ploidy, ploidy<-, plotTracks, print, pruneByDP, pruneByHClust, renameChromosomes, report, resegment, resetSegments, sampleCNs, sampleName, sampleName<-, seqOfSegmentsByDP, setLocusData, setMeanEstimators, setPloidy, setSampleName, setSegments, shiftTCN, tileChromosomes, updateMeans, writeWIG
Methods inherited from list:
Ops,nonStructure,vector-method, Ops,structure,vector-method, Ops,vector,nonStructure-method, Ops,vector,structure-method, all.equal, as.data.frame, attachLocally, callHooks, coerce,ANY,list-method, relist, type.convert, within
Henrik Bengtsson
The segmentByNonPairedPSCBS
() method returns an object of this class.
Package: PSCBS
Class PairedPSCBS
list
~~|
~~+--
AbstractCBS
~~~~~~~|
~~~~~~~+--
PSCBS
~~~~~~~~~~~~|
~~~~~~~~~~~~+--
PairedPSCBS
Directly known subclasses:
public abstract class PairedPSCBS
extends PSCBS
A PairedPSCBS is an object containing the results from the Paired PSCBS method.
PairedPSCBS(fit=list(), ...)
PairedPSCBS(fit=list(), ...)
fit |
A |
... |
Not used. |
Methods:
callAB |
- | |
callCopyNeutral |
- | |
callGNL |
- | |
callGNLByTCNofAB |
- | |
callGainNeutralLoss |
- | |
callLOH |
- | |
callNTCN |
- | |
callROH |
- | |
estimateDeltaAB |
- | |
estimateDeltaLOH |
- | |
estimateKappa |
- | |
extractCNs |
- | |
hasBootstrapSummaries |
- | |
plotTracks |
- | |
segmentByNonPairedPSCBS |
- | |
segmentByPairedPSCBS |
- | |
seqOfSegmentsByDP |
- | |
Methods inherited from PSCBS:
as.data.frame, c, drawChangePoints, extractChromosomes, extractWIG, getLocusData, getLocusSignalNames, getSegmentTrackPrefixes, isLocallyPhased, isSegmentSplitter, normalizeTotalCNs, writeSegments
Methods inherited from AbstractCBS:
adjustPloidyScale, all.equal, as.data.frame, clearCalls, drawChangePoints, drawKnownSegments, dropChangePoint, dropChangePoints, dropRegion, dropRegions, extractCNs, extractChromosome, extractChromosomes, extractRegions, extractSegments, extractWIG, getChangePoints, getChromosomeOffsets, getChromosomeRanges, getChromosomes, getLocusData, getLocusSignalNames, getMeanEstimators, getSampleName, getSegmentSizes, getSegmentTrackPrefixes, getSegments, mergeThreeSegments, mergeTwoSegments, nbrOfChangePoints, nbrOfChromosomes, nbrOfLoci, nbrOfSegments, normalizeTotalCNs, ploidy, ploidy<-, plotTracks, print, pruneByDP, pruneByHClust, renameChromosomes, report, resegment, resetSegments, sampleCNs, sampleName, sampleName<-, seqOfSegmentsByDP, setLocusData, setMeanEstimators, setPloidy, setSampleName, setSegments, shiftTCN, tileChromosomes, updateMeans, writeWIG
Methods inherited from list:
Ops,nonStructure,vector-method, Ops,structure,vector-method, Ops,vector,nonStructure-method, Ops,vector,structure-method, all.equal, as.data.frame, attachLocally, callHooks, coerce,ANY,list-method, relist, type.convert, within
Henrik Bengtsson
The segmentByPairedPSCBS
() method returns an object of this class.
Package: PSCBS
Class PSCBS
list
~~|
~~+--
AbstractCBS
~~~~~~~|
~~~~~~~+--
PSCBS
Directly known subclasses:
NonPairedPSCBS, PairedPSCBS
public abstract class PSCBS
extends AbstractCBS
A PSCBS is an object containing results from parent-specific copy-number (PSCN) segmentation.
PSCBS(fit=list(), ...)
PSCBS(fit=list(), ...)
fit |
A |
... |
Not used. |
Methods:
c |
- | |
isLocallyPhased |
- | |
normalizeTotalCNs |
- | |
writeSegments |
- | |
Methods inherited from AbstractCBS:
adjustPloidyScale, all.equal, as.data.frame, clearCalls, drawChangePoints, drawKnownSegments, dropChangePoint, dropChangePoints, dropRegion, dropRegions, extractCNs, extractChromosome, extractChromosomes, extractRegions, extractSegments, extractWIG, getChangePoints, getChromosomeOffsets, getChromosomeRanges, getChromosomes, getLocusData, getLocusSignalNames, getMeanEstimators, getSampleName, getSegmentSizes, getSegmentTrackPrefixes, getSegments, mergeThreeSegments, mergeTwoSegments, nbrOfChangePoints, nbrOfChromosomes, nbrOfLoci, nbrOfSegments, normalizeTotalCNs, ploidy, ploidy<-, plotTracks, print, pruneByDP, pruneByHClust, renameChromosomes, report, resegment, resetSegments, sampleCNs, sampleName, sampleName<-, seqOfSegmentsByDP, setLocusData, setMeanEstimators, setPloidy, setSampleName, setSegments, shiftTCN, tileChromosomes, updateMeans, writeWIG
Methods inherited from list:
Ops,nonStructure,vector-method, Ops,structure,vector-method, Ops,vector,nonStructure-method, Ops,vector,structure-method, all.equal, as.data.frame, attachLocally, callHooks, coerce,ANY,list-method, relist, type.convert, within
Henrik Bengtsson
Segment genomic signals using the CBS method of the DNAcopy package.
This is a convenient low-level wrapper for the DNAcopy::segment()
method. It is intended to be applied to a sample at the time.
For more details on the Circular Binary Segmentation (CBS) method
see [1,2].
## Default S3 method: segmentByCBS(y, chromosome=0L, x=NULL, index=seq_along(y), w=NULL, undo=0, avg=c("mean", "median"), ..., joinSegments=TRUE, knownSegments=NULL, seed=NULL, verbose=FALSE)
## Default S3 method: segmentByCBS(y, chromosome=0L, x=NULL, index=seq_along(y), w=NULL, undo=0, avg=c("mean", "median"), ..., joinSegments=TRUE, knownSegments=NULL, seed=NULL, verbose=FALSE)
y |
|
chromosome |
Optional |
x |
Optional |
index |
An optional |
w |
|
undo |
A non-negative |
avg |
A |
... |
Additional arguments passed to the |
joinSegments |
If |
knownSegments |
Optional |
seed |
An (optional) |
verbose |
See |
Internally segment
of DNAcopy is used to
segment the signals.
This segmentation method support weighted segmentation.
Returns a CBS
object.
The DNAcopy::segment()
implementation of CBS uses approximation
through random sampling for some estimates. Because of this,
repeated calls using the same signals may result in slightly
different results, unless the random seed is set/fixed.
Signals may contain missing values (NA
or NaN
), but not
infinite values (+/-Inf
). Loci with missing-value signals
are preserved and keep in the result.
Likewise, genomic positions may contain missing values.
However, if they do, such loci are silently excluded before
performing the segmentation, and are not kept in the results.
The mapping between the input locus-level data and ditto of
the result can be inferred from the index
column of
the locus-level data of the result.
None of the input data may have infinite values,
i.e. -Inf
or +Inf
. If so, an informative error is thrown.
Henrik Bengtsson
[1] A.B. Olshen, E.S. Venkatraman (aka Venkatraman E. Seshan), R. Lucito and M. Wigler, Circular binary segmentation for the analysis of array-based DNA copy number data, Biostatistics, 2004
[2] E.S. Venkatraman and A.B. Olshen, A faster circular binary segmentation algorithm for the analysis of array CGH data, Bioinformatics, 2007
To segment allele-specific tumor copy-number signals from a tumor
with a matched normal, see segmentByPairedPSCBS
().
For the same without a matched normal,
see segmentByNonPairedPSCBS
().
It is also possible to prune change points after segmentation (with
identical results) using
pruneBySdUndo()
.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Simulating copy-number data # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - set.seed(0xBEEF) # Number of loci J <- 1000 mu <- double(J) mu[200:300] <- mu[200:300] + 1 mu[350:400] <- NA # centromere mu[650:800] <- mu[650:800] - 1 eps <- rnorm(J, sd=1/2) y <- mu + eps x <- sort(runif(length(y), max=length(y))) * 1e5 w <- runif(J) w[650:800] <- 0.001 # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Segmentation # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - fit <- segmentByCBS(y, x=x) print(fit) plotTracks(fit) xlab <- "Position (Mb)" ylim <- c(-3,3) xMb <- x/1e6 plot(xMb,y, pch=20, col="#aaaaaa", xlab=xlab, ylim=ylim) drawLevels(fit, col="red", lwd=2, xScale=1e-6) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # TESTS # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - fit <- segmentByCBS(y, x=x, seed=0xBEEF) print(fit) ## id chromosome start end nbrOfLoci mean ## 1 y 0 55167.82 20774251 201 0.0164 ## 2 y 0 20774250.85 29320105 99 1.0474 ## 3 y 0 29320104.86 65874675 349 -0.0227 ## 4 y 0 65874675.06 81348129 151 -1.0813 ## 5 y 0 81348129.20 99910827 200 -0.0612 # Test #1: Reverse the ordering and segment fitR <- segmentByCBS(rev(y), x=rev(x), seed=0xBEEF) # Sanity check stopifnot(all.equal(getSegments(fitR), getSegments(fit))) # Sanity check stopifnot(all.equal(rev(getLocusData(fitR)$index), getLocusData(fit)$index)) # Test #2: Reverse, but preserve ordering of 'data' object fitRP <- segmentByCBS(rev(y), x=rev(x), preserveOrder=TRUE) stopifnot(all.equal(getSegments(fitRP), getSegments(fit))) # (Test #3: Change points inbetween data points at the same locus) x[650:654] <- x[649] fitC <- segmentByCBS(rev(y), x=rev(x), preserveOrder=TRUE, seed=0xBEEF) # Test #4: Allow for some missing values in signals y[450] <- NA fitD <- segmentByCBS(y, x=x, seed=0xBEEF) # Test #5: Allow for some missing genomic annotations x[495] <- NA fitE <- segmentByCBS(y, x=x, seed=0xBEEF) # Test #6: Undo all change points found fitF <- segmentByCBS(y, x=x, undo=Inf, seed=0xBEEF) print(fitF) stopifnot(nbrOfSegments(fitF) == 1L) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # MISC. # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Emulate a centromere x[650:699] <- NA fit <- segmentByCBS(y, x=x, seed=0xBEEF) xMb <- x/1e6 plot(xMb,y, pch=20, col="#aaaaaa", xlab=xlab, ylim=ylim) drawLevels(fit, col="red", lwd=2, xScale=1e-6) fitC <- segmentByCBS(y, x=x, joinSegments=FALSE, seed=0xBEEF) drawLevels(fitC, col="blue", lwd=2, xScale=1e-6) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Multiple chromosomes # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Appending CBS results fit1 <- segmentByCBS(y, chromosome=1, x=x) fit2 <- segmentByCBS(y, chromosome=2, x=x) fit <- c(fit1, fit2) print(fit) plotTracks(fit, subset=NULL, lwd=2, Clim=c(-3,3)) # Segmenting multiple chromosomes at once chromosomeWG <- rep(1:2, each=J) xWG <- rep(x, times=2) yWG <- rep(y, times=2) fitWG <- segmentByCBS(yWG, chromosome=chromosomeWG, x=xWG) print(fitWG) plotTracks(fitWG, subset=NULL, lwd=2, Clim=c(-3,3)) # Assert same results fit$data[,"index"] <- getLocusData(fitWG)[,"index"] # Ignore 'index' stopifnot(all.equal(getLocusData(fitWG), getLocusData(fit))) stopifnot(all.equal(getSegments(fitWG), getSegments(fit)))
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Simulating copy-number data # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - set.seed(0xBEEF) # Number of loci J <- 1000 mu <- double(J) mu[200:300] <- mu[200:300] + 1 mu[350:400] <- NA # centromere mu[650:800] <- mu[650:800] - 1 eps <- rnorm(J, sd=1/2) y <- mu + eps x <- sort(runif(length(y), max=length(y))) * 1e5 w <- runif(J) w[650:800] <- 0.001 # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Segmentation # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - fit <- segmentByCBS(y, x=x) print(fit) plotTracks(fit) xlab <- "Position (Mb)" ylim <- c(-3,3) xMb <- x/1e6 plot(xMb,y, pch=20, col="#aaaaaa", xlab=xlab, ylim=ylim) drawLevels(fit, col="red", lwd=2, xScale=1e-6) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # TESTS # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - fit <- segmentByCBS(y, x=x, seed=0xBEEF) print(fit) ## id chromosome start end nbrOfLoci mean ## 1 y 0 55167.82 20774251 201 0.0164 ## 2 y 0 20774250.85 29320105 99 1.0474 ## 3 y 0 29320104.86 65874675 349 -0.0227 ## 4 y 0 65874675.06 81348129 151 -1.0813 ## 5 y 0 81348129.20 99910827 200 -0.0612 # Test #1: Reverse the ordering and segment fitR <- segmentByCBS(rev(y), x=rev(x), seed=0xBEEF) # Sanity check stopifnot(all.equal(getSegments(fitR), getSegments(fit))) # Sanity check stopifnot(all.equal(rev(getLocusData(fitR)$index), getLocusData(fit)$index)) # Test #2: Reverse, but preserve ordering of 'data' object fitRP <- segmentByCBS(rev(y), x=rev(x), preserveOrder=TRUE) stopifnot(all.equal(getSegments(fitRP), getSegments(fit))) # (Test #3: Change points inbetween data points at the same locus) x[650:654] <- x[649] fitC <- segmentByCBS(rev(y), x=rev(x), preserveOrder=TRUE, seed=0xBEEF) # Test #4: Allow for some missing values in signals y[450] <- NA fitD <- segmentByCBS(y, x=x, seed=0xBEEF) # Test #5: Allow for some missing genomic annotations x[495] <- NA fitE <- segmentByCBS(y, x=x, seed=0xBEEF) # Test #6: Undo all change points found fitF <- segmentByCBS(y, x=x, undo=Inf, seed=0xBEEF) print(fitF) stopifnot(nbrOfSegments(fitF) == 1L) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # MISC. # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Emulate a centromere x[650:699] <- NA fit <- segmentByCBS(y, x=x, seed=0xBEEF) xMb <- x/1e6 plot(xMb,y, pch=20, col="#aaaaaa", xlab=xlab, ylim=ylim) drawLevels(fit, col="red", lwd=2, xScale=1e-6) fitC <- segmentByCBS(y, x=x, joinSegments=FALSE, seed=0xBEEF) drawLevels(fitC, col="blue", lwd=2, xScale=1e-6) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Multiple chromosomes # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Appending CBS results fit1 <- segmentByCBS(y, chromosome=1, x=x) fit2 <- segmentByCBS(y, chromosome=2, x=x) fit <- c(fit1, fit2) print(fit) plotTracks(fit, subset=NULL, lwd=2, Clim=c(-3,3)) # Segmenting multiple chromosomes at once chromosomeWG <- rep(1:2, each=J) xWG <- rep(x, times=2) yWG <- rep(y, times=2) fitWG <- segmentByCBS(yWG, chromosome=chromosomeWG, x=xWG) print(fitWG) plotTracks(fitWG, subset=NULL, lwd=2, Clim=c(-3,3)) # Assert same results fit$data[,"index"] <- getLocusData(fitWG)[,"index"] # Ignore 'index' stopifnot(all.equal(getLocusData(fitWG), getLocusData(fit))) stopifnot(all.equal(getSegments(fitWG), getSegments(fit)))
Segment total copy numbers and allele B fractions using the Non-paired PSCBS method [1]. This method does not requires matched normals. This is a low-level segmentation method. It is intended to be applied to one tumor sample at the time.
## Default S3 method: segmentByNonPairedPSCBS(CT, betaT, ..., flavor=c("tcn", "tcn&dh", "tcn,dh", "sqrt(tcn),dh", "sqrt(tcn)&dh"), tauA=NA, tauB=1 - tauA, verbose=FALSE)
## Default S3 method: segmentByNonPairedPSCBS(CT, betaT, ..., flavor=c("tcn", "tcn&dh", "tcn,dh", "sqrt(tcn),dh", "sqrt(tcn)&dh"), tauA=NA, tauB=1 - tauA, verbose=FALSE)
CT |
A |
betaT |
A |
... |
Additional arguments passed to |
flavor |
A |
tauA , tauB
|
Lower and upper thresholds ( |
verbose |
See |
Internally segmentByPairedPSCBS
() is used for segmentation.
This segmentation method does not support weights.
Returns the segmentation results as a NonPairedPSCBS
object.
The "DNAcopy::segment" implementation of CBS uses approximation through random sampling for some estimates. Because of this, repeated calls using the same signals may result in slightly different results, unless the random seed is set/fixed.
Although it is possible to segment each chromosome independently using Paired PSCBS, we strongly recommend to segment whole-genome (TCN,BAF) data at once. The reason for this is that downstream CN-state calling methods, such as the AB and the LOH callers, performs much better on whole-genome data. In fact, they may fail to provide valid calls if done chromosome by chromosome.
The total copy number signals as well as any optional positions
must not contain missing values, i.e. NA
s or NaN
s.
If there are any, an informative error is thrown.
Allele B fractions may contain missing values, because such are
interpreted as representing non-polymorphic loci.
None of the input signals may have infinite values, i.e. -Inf
or +Inf
.
If so, an informative error is thrown.
If allele B fractions for the matched normal (betaN
) are
not available, but genotypes (muN
) are, then it is possible
to run Paired PSCBS. See segmentByPairedPSCBS
() for details.
Henrik Bengtsson
[1] A.B. Olshen, H. Bengtsson, P. Neuvial, P.T. Spellman, R.A. Olshen, V.E. Seshan, Parent-specific copy number in paired tumor-normal studies using circular binary segmentation, Bioinformatics, 2011
[2] H. Bengtsson, P. Neuvial and T.P. Speed, TumorBoost: Normalization of allele-specific tumor copy numbers from a single pair of tumor-normal genotyping microarrays, BMC Bioinformatics, 2010
To segment paired tumor-normal total copy numbers and allele B fractions,
see segmentByPairedPSCBS
().
To segment total copy numbers, or any other unimodal signals,
see segmentByCBS
().
verbose <- R.utils::Arguments$getVerbose(-10*interactive(), timestamp=TRUE) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Load SNP microarray data # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - data <- PSCBS::exampleData("paired.chr01") str(data) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Paired PSCBS segmentation # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Drop single-locus outliers dataS <- dropSegmentationOutliers(data) # Speed up example by segmenting fewer loci dataS <- dataS[seq(from=1, to=nrow(data), by=20),] str(dataS) R.oo::attachLocally(dataS) # Non-Paired PSCBS segmentation fit <- segmentByNonPairedPSCBS(CT, betaT=betaT, chromosome=chromosome, x=x, seed=0xBEEF, verbose=verbose) print(fit) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Bootstrap segment level estimates # (used by the AB caller, which, if skipped here, # will do it automatically) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - fit <- bootstrapTCNandDHByRegion(fit, B=100, verbose=verbose) print(fit) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Calling segments in allelic balance (AB) # NOTE: Ideally, this should be done on whole-genome data # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Explicitly estimate the threshold in DH for calling AB # (which be done by default by the caller, if skipped here) deltaAB <- estimateDeltaAB(fit, flavor="qq(DH)", verbose=verbose) print(deltaAB) fit <- callAB(fit, delta=deltaAB, verbose=verbose) print(fit) # Even if not explicitly specified, the estimated # threshold parameter is returned by the caller stopifnot(fit$params$deltaAB == deltaAB) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Calling segments in loss-of-heterozygosity (LOH) # NOTE: Ideally, this should be done on whole-genome data # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Explicitly estimate the threshold in C1 for calling LOH # (which be done by default by the caller, if skipped here) deltaLOH <- estimateDeltaLOH(fit, flavor="minC1|nonAB", verbose=verbose) print(deltaLOH) fit <- callLOH(fit, delta=deltaLOH, verbose=verbose) print(fit) plotTracks(fit) # Even if not explicitly specified, the estimated # threshold parameter is returned by the caller stopifnot(fit$params$deltaLOH == deltaLOH)
verbose <- R.utils::Arguments$getVerbose(-10*interactive(), timestamp=TRUE) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Load SNP microarray data # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - data <- PSCBS::exampleData("paired.chr01") str(data) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Paired PSCBS segmentation # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Drop single-locus outliers dataS <- dropSegmentationOutliers(data) # Speed up example by segmenting fewer loci dataS <- dataS[seq(from=1, to=nrow(data), by=20),] str(dataS) R.oo::attachLocally(dataS) # Non-Paired PSCBS segmentation fit <- segmentByNonPairedPSCBS(CT, betaT=betaT, chromosome=chromosome, x=x, seed=0xBEEF, verbose=verbose) print(fit) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Bootstrap segment level estimates # (used by the AB caller, which, if skipped here, # will do it automatically) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - fit <- bootstrapTCNandDHByRegion(fit, B=100, verbose=verbose) print(fit) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Calling segments in allelic balance (AB) # NOTE: Ideally, this should be done on whole-genome data # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Explicitly estimate the threshold in DH for calling AB # (which be done by default by the caller, if skipped here) deltaAB <- estimateDeltaAB(fit, flavor="qq(DH)", verbose=verbose) print(deltaAB) fit <- callAB(fit, delta=deltaAB, verbose=verbose) print(fit) # Even if not explicitly specified, the estimated # threshold parameter is returned by the caller stopifnot(fit$params$deltaAB == deltaAB) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Calling segments in loss-of-heterozygosity (LOH) # NOTE: Ideally, this should be done on whole-genome data # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Explicitly estimate the threshold in C1 for calling LOH # (which be done by default by the caller, if skipped here) deltaLOH <- estimateDeltaLOH(fit, flavor="minC1|nonAB", verbose=verbose) print(deltaLOH) fit <- callLOH(fit, delta=deltaLOH, verbose=verbose) print(fit) plotTracks(fit) # Even if not explicitly specified, the estimated # threshold parameter is returned by the caller stopifnot(fit$params$deltaLOH == deltaLOH)
Segment total copy numbers and allele B fractions using the Paired PSCBS method [1]. This method requires matched normals. This is a low-level segmentation method. It is intended to be applied to one tumor-normal sample at the time.
## Default S3 method: segmentByPairedPSCBS(CT, thetaT=NULL, thetaN=NULL, betaT=NULL, betaN=NULL, muN=NULL, rho=NULL, chromosome=0, x=NULL, alphaTCN=0.009, alphaDH=0.001, undoTCN=0, undoDH=0, ..., avgTCN=c("mean", "median"), avgDH=c("mean", "median"), flavor=c("tcn&dh", "tcn,dh", "sqrt(tcn),dh", "sqrt(tcn)&dh", "tcn"), tbn=is.null(rho), joinSegments=TRUE, knownSegments=NULL, dropMissingCT=TRUE, seed=NULL, verbose=FALSE, preserveScale=FALSE)
## Default S3 method: segmentByPairedPSCBS(CT, thetaT=NULL, thetaN=NULL, betaT=NULL, betaN=NULL, muN=NULL, rho=NULL, chromosome=0, x=NULL, alphaTCN=0.009, alphaDH=0.001, undoTCN=0, undoDH=0, ..., avgTCN=c("mean", "median"), avgDH=c("mean", "median"), flavor=c("tcn&dh", "tcn,dh", "sqrt(tcn),dh", "sqrt(tcn)&dh", "tcn"), tbn=is.null(rho), joinSegments=TRUE, knownSegments=NULL, dropMissingCT=TRUE, seed=NULL, verbose=FALSE, preserveScale=FALSE)
CT |
A |
thetaT , thetaN
|
(alternative) As an alternative to specifying
tumor TCN ratios relative to the match normal by
argument |
betaT |
A |
betaN |
A |
muN |
An optional |
rho |
(alternative to |
chromosome |
(Optional) An |
x |
Optional |
alphaTCN , alphaDH
|
The significance levels for segmenting total copy numbers (TCNs) and decrease-in-heterozygosity signals (DHs), respectively. |
undoTCN , undoDH
|
Non-negative |
avgTCN , avgDH
|
A |
... |
Additional arguments passed to |
flavor |
A |
tbn |
If |
joinSegments |
If |
knownSegments |
Optional |
dropMissingCT |
If |
seed |
An (optional) |
verbose |
See |
preserveScale |
Defunct - gives an error is specified. |
Internally segmentByCBS
() is used for segmentation.
The Paired PSCBS segmentation method does not support weights.
Returns the segmentation results as a PairedPSCBS
object.
The "DNAcopy::segment" implementation of CBS uses approximation through random sampling for some estimates. Because of this, repeated calls using the same signals may result in slightly different results, unless the random seed is set/fixed.
Although it is possible to segment each chromosome independently using Paired PSCBS, we strongly recommend to segment whole-genome (TCN,BAF) data at once. The reason for this is that downstream CN-state calling methods, such as the AB and the LOH callers, performs much better on whole-genome data. In fact, they may fail to provide valid calls if done chromosome by chromosome.
The total copy number signals as well as any optional positions
must not contain missing values, i.e. NA
s or NaN
s.
If there are any, an informative error is thrown.
Allele B fractions may contain missing values, because such are
interpreted as representing non-polymorphic loci.
None of the input signals may have infinite values, i.e. -Inf
or +Inf
.
If so, an informative error is thrown.
If allele B fractions for the matched normal (betaN
) are
not available, but genotypes (muN
) are, then it is possible
to run a version of Paired PSCBS where TumorBoost normalization
of the tumor allele B fractions is skipped. In order for this
to work, argument tbn
must be set to FALSE
.
Henrik Bengtsson
[1] A.B. Olshen, H. Bengtsson, P. Neuvial, P.T. Spellman, R.A. Olshen, V.E. Seshan, Parent-specific copy number in paired tumor-normal studies using circular binary segmentation, Bioinformatics, 2011
[2] H. Bengtsson, P. Neuvial and T.P. Speed, TumorBoost: Normalization of allele-specific tumor copy numbers from a single pair of tumor-normal genotyping microarrays, BMC Bioinformatics, 2010
Internally, callNaiveGenotypes
is used to
call naive genotypes, normalizeTumorBoost
is
used for TumorBoost normalization, and segmentByCBS
() is used
to segment TCN and DH separately.
To segment tumor total copy numbers and allele B fractions
without a matched normal, see segmentByNonPairedPSCBS
().
To segment total copy-numbers, or any other unimodal signals,
see segmentByCBS
().
verbose <- R.utils::Arguments$getVerbose(-10*interactive(), timestamp=TRUE) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Load SNP microarray data # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - data <- PSCBS::exampleData("paired.chr01") str(data) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Paired PSCBS segmentation # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Drop single-locus outliers dataS <- dropSegmentationOutliers(data) # Speed up example by segmenting fewer loci dataS <- dataS[seq(from=1, to=nrow(data), by=10),] str(dataS) R.oo::attachLocally(dataS) # Paired PSCBS segmentation fit <- segmentByPairedPSCBS(CT, betaT=betaT, betaN=betaN, chromosome=chromosome, x=x, seed=0xBEEF, verbose=verbose) print(fit) # Plot results plotTracks(fit) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Bootstrap segment level estimates # (used by the AB caller, which, if skipped here, # will do it automatically) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - fit <- bootstrapTCNandDHByRegion(fit, B=100, verbose=verbose) print(fit) plotTracks(fit) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Calling segments in allelic balance (AB) # NOTE: Ideally, this should be done on whole-genome data # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Explicitly estimate the threshold in DH for calling AB # (which be done by default by the caller, if skipped here) deltaAB <- estimateDeltaAB(fit, flavor="qq(DH)", verbose=verbose) print(deltaAB) ## [1] 0.1657131 fit <- callAB(fit, delta=deltaAB, verbose=verbose) print(fit) plotTracks(fit) # Even if not explicitly specified, the estimated # threshold parameter is returned by the caller stopifnot(fit$params$deltaAB == deltaAB) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Calling segments in loss-of-heterozygosity (LOH) # NOTE: Ideally, this should be done on whole-genome data # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Explicitly estimate the threshold in C1 for calling LOH # (which be done by default by the caller, if skipped here) deltaLOH <- estimateDeltaLOH(fit, flavor="minC1|nonAB", verbose=verbose) print(deltaLOH) ## [1] 0.625175 fit <- callLOH(fit, delta=deltaLOH, verbose=verbose) print(fit) plotTracks(fit) # Even if not explicitly specified, the estimated # threshold parameter is returned by the caller stopifnot(fit$params$deltaLOH == deltaLOH)
verbose <- R.utils::Arguments$getVerbose(-10*interactive(), timestamp=TRUE) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Load SNP microarray data # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - data <- PSCBS::exampleData("paired.chr01") str(data) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Paired PSCBS segmentation # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Drop single-locus outliers dataS <- dropSegmentationOutliers(data) # Speed up example by segmenting fewer loci dataS <- dataS[seq(from=1, to=nrow(data), by=10),] str(dataS) R.oo::attachLocally(dataS) # Paired PSCBS segmentation fit <- segmentByPairedPSCBS(CT, betaT=betaT, betaN=betaN, chromosome=chromosome, x=x, seed=0xBEEF, verbose=verbose) print(fit) # Plot results plotTracks(fit) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Bootstrap segment level estimates # (used by the AB caller, which, if skipped here, # will do it automatically) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - fit <- bootstrapTCNandDHByRegion(fit, B=100, verbose=verbose) print(fit) plotTracks(fit) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Calling segments in allelic balance (AB) # NOTE: Ideally, this should be done on whole-genome data # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Explicitly estimate the threshold in DH for calling AB # (which be done by default by the caller, if skipped here) deltaAB <- estimateDeltaAB(fit, flavor="qq(DH)", verbose=verbose) print(deltaAB) ## [1] 0.1657131 fit <- callAB(fit, delta=deltaAB, verbose=verbose) print(fit) plotTracks(fit) # Even if not explicitly specified, the estimated # threshold parameter is returned by the caller stopifnot(fit$params$deltaAB == deltaAB) # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Calling segments in loss-of-heterozygosity (LOH) # NOTE: Ideally, this should be done on whole-genome data # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # Explicitly estimate the threshold in C1 for calling LOH # (which be done by default by the caller, if skipped here) deltaLOH <- estimateDeltaLOH(fit, flavor="minC1|nonAB", verbose=verbose) print(deltaLOH) ## [1] 0.625175 fit <- callLOH(fit, delta=deltaLOH, verbose=verbose) print(fit) plotTracks(fit) # Even if not explicitly specified, the estimated # threshold parameter is returned by the caller stopifnot(fit$params$deltaLOH == deltaLOH)