matrixStats - Functions that Apply to Rows and Columns of Matrices (and to Vectors)
High-performing functions operating on rows and columns of matrices, e.g. col / rowMedians(), col / rowRanks(), and col / rowSds(). Functions are optimized per data type and for subsetted calculations such that both memory usage and processing time are minimized. There are also optimized vector-based methods, e.g. binMeans(), madDiff() and weightedMedian().
Last updated 2 months ago
matrix performance vector
18.09 score 208 stars 2.3k dependents 20k scripts 236k downloads
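As a rough illustration of the row-wise functions named in the matrixStats entry above, the sketch below computes row medians with base R and, if matrixStats happens to be installed, checks that rowMedians() agrees. The matrix `X` and its values are made up for the example.

```r
# Toy 2 x 3 matrix (values are illustrative only)
X <- matrix(c(1, 5, 3,
              2, 8, 4), nrow = 2, byrow = TRUE)

# Base-R reference: apply median() over each row
base_medians <- apply(X, 1, median)

# Optimized equivalent; skipped quietly when matrixStats is not installed
if (requireNamespace("matrixStats", quietly = TRUE)) {
  fast_medians <- matrixStats::rowMedians(X)
  stopifnot(isTRUE(all.equal(fast_medians, base_medians)))
}
```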
future - Unified Parallel and Distributed Processing in R for Everyone
The purpose of this package is to provide a lightweight and unified Future API for sequential and parallel processing of R expressions via futures. The simplest way to evaluate an expression in parallel is to use `x %<-% { expression }` with `plan(multisession)`. This package implements sequential, multicore, multisession, and cluster futures. With these, R expressions can be evaluated on the local machine, in parallel on a set of local machines, or distributed on a mix of local and remote machines. Extensions to this package implement additional backends for processing futures via compute cluster schedulers, etc. Because of its unified API, there is no need to modify any code in order to switch from sequential on the local machine to, say, distributed processing on a remote compute cluster. Another strength of this package is that global variables and functions are automatically identified and exported as needed, making it straightforward to tweak existing code to make use of futures.
Last updated 7 months ago
asynchronous distributed-computing futures hpc hpc-clusters parallel-computing parallel-processing parallelization programming promises
17.43 score 972 stars 1.2k dependents 15k scripts 225k downloads
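A minimal sketch of the usage described in the future entry above, assuming the 'future' package is installed (the block is skipped otherwise). The expression `sum(1:10)` and the variable names are placeholders chosen for illustration.

```r
# Skipped quietly when the 'future' package is not installed
if (requireNamespace("future", quietly = TRUE)) {
  library(future)

  plan(sequential)             # resolve futures in the current R session

  f <- future({ sum(1:10) })   # explicit future
  total <- value(f)            # collect the result

  y %<-% { sum(1:10) }         # future assignment, as in the description

  # Switching to plan(multisession) would evaluate the same code in
  # background R sessions; the lines above would not need to change.
}
```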
parallelly - Enhancing the 'parallel' Package
Utility functions that enhance the 'parallel' package and support the built-in parallel backends of the 'future' package. For example, availableCores() gives the number of CPU cores available to your R process as given by the operating system, 'cgroups' and Linux containers, R options, and environment variables, including those set by job schedulers on high-performance compute clusters. If none is set, it will fall back to parallel::detectCores(). Another example is makeClusterPSOCK(), which is backward compatible with parallel::makePSOCKcluster() while doing a better job in setting up remote cluster workers without the need for configuring the firewall to do port-forwarding to your local computer.
Last updated 1 month ago
parallel-computing
14.21 score 133 stars 1.3k dependents 448 scripts 303k downloads
future.apply - Apply Function to Elements in Parallel using Futures
Implementations of apply(), by(), eapply(), lapply(), Map(), .mapply(), mapply(), replicate(), sapply(), tapply(), and vapply() that can be resolved using any future-supported backend, e.g. parallel on the local machine or distributed on a compute cluster. These future_*apply() functions come with the same pros and cons as the corresponding base-R *apply() functions but with the additional feature of being able to be processed via the future framework <doi:10.32614/RJ-2021-048>.
Last updated 4 months ago
asynchronous distributed-computing future hpc hpc-clusters parallel parallel-computing parallel-processing parallelization programming
14.02 score 214 stars 925 dependents 2.3k scripts 187k downloads
R.utils - Various Programming Utilities
Utility functions useful when programming and developing R packages.
Last updated 1 year ago
13.60 score 63 stars 795 dependents 5.7k scripts 156k downloads
R.oo - R Object-Oriented Programming with or without References
Methods and classes for object-oriented programming in R with or without references. Considerable effort has gone into making the definition of methods as simple as possible with a minimum of maintenance for package developers. The package has been developed since 2001 and is now considered very stable. This is a cross-platform package implemented in pure R that defines standard S3 classes without any tricks.
Last updated 4 months ago
11.49 score 20 stars 828 dependents 329 scripts 189k downloads
doFuture - Use Foreach to Parallelize via the Future Framework
The 'future' package provides a unifying parallelization framework for R that supports many parallel and distributed backends. The 'foreach' package provides a powerful API for iterating over an R expression in parallel. The 'doFuture' package brings the best of the two together. There are two alternative ways to use this package. The recommended approach is to use 'y <- foreach(...) %dofuture% { ... }', which does not require using 'registerDoFuture()' and has many advantages over '%dopar%'. The alternative is the traditional 'foreach' approach: register the 'foreach' adapter with 'registerDoFuture()' so that 'y <- foreach(...) %dopar% { ... }' parallelizes via the 'future' framework.
Last updated 1 year ago
batchjobs batchtools biocparallel distributed-computing foreach hpc hpc-clusters parallel plyr
11.20 score 84 stars 92 dependents 1.5k scripts 30k downloads
listenv - Environments Behaving (Almost) as Lists
List environments are environments that have list-like properties. For instance, the elements of a list environment are ordered and can be accessed and iterated over using index subsetting, e.g. 'x <- listenv(a = 1, b = 2); for (i in seq_along(x)) x[[i]] <- x[[i]] ^ 2; y <- as.list(x)'.
Last updated 1 year ago
10.96 score 32 stars 1.2k dependents 81 scripts 198k downloads
R.matlab - Read and Write MAT Files and Call MATLAB from Within R
Methods readMat() and writeMat() for reading and writing MAT files. For users with MATLAB v6 or newer installed (either locally or on a remote host), the package also provides methods for controlling MATLAB (trademark) via R and sending and retrieving data between R and MATLAB.
Last updated 3 years ago
matlab
10.55 score 85 stars 25 dependents 2.9k scripts 6.3k downloads
globals - Identify Global Objects in R Expressions
Identifies global ("unknown" or "free") objects in R expressions by code inspection using various strategies (ordered, liberal, or conservative). The objective of this package is to make it as simple as possible to identify global objects for the purpose of exporting them in parallel, distributed compute environments.
Last updated 2 years ago
code-inspection
10.44 score 29 stars 1.2k dependents 258 scripts 200k downloads
illuminaio - Parsing Illumina Microarray Output Files
Tools for parsing Illumina's microarray output files, including IDAT.
Last updated 4 months ago
infrastructure dataimport microarray proprietaryplatforms bioconductor
10.29 score 5 stars 36 dependents 58 scripts 4.5k downloads
R.cache - Fast and Light-Weight Caching (Memoization) of Objects and Results to Speed Up Computations
Memoization can be used to speed up repetitive and computationally expensive function calls. The first time a function that implements memoization is called, the results are stored in a cache memory. The next time the function is called with the same set of parameters, the results are retrieved almost instantly from the cache, avoiding repeating the calculations. With this package, any R object can be cached in a key-value storage where the key can be an arbitrary set of R objects. The cache memory is persistent (on the file system).
Last updated 3 years ago
cache memoization
9.59 score 39 stars 112 dependents 94 scripts 63k downloads
profmem - Simple Memory Profiling for R
A simple and light-weight API for memory profiling of R expressions. The profiling is built on top of R's built-in memory profiler ('utils::Rprofmem()'), which records every memory allocation done by R (also native code).
Last updated 4 years ago
memory-profiler performance ram
9.13 score 36 stars 11 dependents 141 scripts 16k downloads
R.methodsS3 - S3 Methods Simplified
Methods that simplify the setup of S3 generic functions and S3 methods. Considerable effort has gone into making the definition of methods as simple as possible with a minimum of maintenance for package developers. For example, generic functions are created automatically, if missing, and naming conflicts are automatically resolved, if possible. The method setMethodS3() is a good start for those who in the future may want to migrate to S4. This is a cross-platform package implemented in pure R that generates standard S3 methods.
Last updated 3 years ago
8.65 score 1 star 822 dependents 23 scripts 156k downloads
illuminaio - Parsing Illumina Microarray Output Files
Tools for parsing Illumina's microarray output files, including IDAT.
Last updated 2 years ago
infrastructure dataimport microarray proprietaryplatforms bioconductor
8.30 score 5 stars 37 dependents 51 scripts
affxparser - Affymetrix File Parsing SDK
Package for parsing Affymetrix files (CDF, CEL, CHP, BPMAP, BAR). It provides methods for fast and memory-efficient parsing of Affymetrix files using Affymetrix' Fusion SDK. Both ASCII- and binary-based files are supported. Currently, there are methods for reading chip definition files (CDF) and cell intensity files (CEL). These files can be read either in full or in part. For example, probe signals from a few probesets can be extracted very quickly from a set of CEL files into a convenient list structure.
Last updated 2 months ago
infrastructure dataimport microarray proprietaryplatforms onechannel bioconductor cpp
8.19 score 7 stars 14 dependents 65 scripts 2.3k downloads
R.rsp - Dynamic Generation of Scientific Reports
The RSP markup language makes any text-based document come alive. RSP provides a powerful markup for controlling the content and output of LaTeX, HTML, Markdown, AsciiDoc, Sweave and knitr documents (and more), e.g. 'Today's date is <%=Sys.Date()%>'. Contrary to many other literate programming languages, with RSP it is straightforward to loop over mixtures of code and text sections, e.g. in month-by-month summaries. RSP also has several preprocessing directives for incorporating static and dynamic contents of external files (local or online) among other things. Functions rstring() and rcat() make it easy to process RSP strings, rsource() sources an RSP file as if it were an R script, while rfile() compiles it (even online) into its final output format, e.g. rfile('report.tex.rsp') generates 'report.pdf' and rfile('report.md.rsp') generates 'report.html'. RSP is ideal for self-contained scientific reports and R package vignettes. It's easy to use - if you know how to write an R script, you'll be up and running within minutes.
Last updated 1 year ago
document markup report reproducibility science
7.91 score 31 stars 9 dependents 36 scripts 13k downloads
PSCBS - Analysis of Parent-Specific DNA Copy Numbers
Segmentation of allele-specific DNA copy number data and detection of regions with abnormal copy number within each parental chromosome. Both tumor-normal paired and tumor-only analyses are supported.
Last updated 1 year ago
acgh copynumbervariants snp microarray onechannel twochannel genetics
7.59 score 7 stars 9 dependents 34 scripts 686 downloads
future.callr - A Future API for Parallel Processing using 'callr'
Implementation of the Future API on top of the 'callr' package. This allows you to process futures, as defined by the 'future' package, in parallel out of the box, on your local (Linux, macOS, Windows, ...) machine. Contrary to backends relying on the 'parallel' package (e.g. 'future::multisession') and socket connections, the 'callr' backend provided here can run more than 125 parallel R processes.
Last updated 2 years ago
asynchronous futures parallel-computing parallel-processing parallelization programming promises
7.17 score 62 stars 1 dependent 249 scripts 3.2k downloads
R.devices - Unified Handling of Graphics Devices
Functions for creating plots and image files in a unified way regardless of output format (EPS, PDF, PNG, SVG, TIFF, WMF, etc.). Default device options as well as scales and aspect ratios are controlled in a uniform way across all device types. Switching output format requires minimal changes in code. This package is ideal for large-scale batch processing, because it will never leave open graphics devices or incomplete image files behind, even on errors or user interrupts.
Last updated 1 year ago
graphics
6.88 score 19 stars 13 dependents 67 scripts 3.0k downloads
affxparser - Affymetrix File Parsing SDK
Package for parsing Affymetrix files (CDF, CEL, CHP, BPMAP, BAR). It provides methods for fast and memory-efficient parsing of Affymetrix files using Affymetrix' Fusion SDK. Both ASCII- and binary-based files are supported. Currently, there are methods for reading chip definition files (CDF) and cell intensity files (CEL). These files can be read either in full or in part. For example, probe signals from a few probesets can be extracted very quickly from a set of CEL files into a convenient list structure.
Last updated 2 months ago
infrastructure dataimport microarray proprietaryplatforms onechannel bioconductor cpp
6.83 score 7 stars 14 dependents 65 scripts
future.batchtools - A Future API for Parallel and Distributed Processing using 'batchtools'
Implementation of the Future API on top of the 'batchtools' package. This allows you to process futures, as defined by the 'future' package, in parallel out of the box, not only on your local machine or ad-hoc cluster of machines, but also via high-performance compute ('HPC') job schedulers such as 'LSF', 'OpenLava', 'Slurm', 'SGE', and 'TORQUE' / 'PBS', e.g. 'y <- future.apply::future_lapply(files, FUN = process)'.
Last updated 1 year ago
distributed-computing hpc job-scheduler parallel pbs sge slurm torque
6.66 score 84 stars 408 scripts 1.3k downloads
startup - Friendly R Startup Configuration
Adds support for R startup configuration via '.Renviron.d' and '.Rprofile.d' directories in addition to '.Renviron' and '.Rprofile' files. This makes it possible to keep private / secret environment variables separate from other environment variables. It also makes it easier to share specific startup settings by simply copying a file to a directory.
Last updated 3 months ago
configuration environment-variables startup utility
6.54 score 166 stars 16 scripts 1.3k downloads
aroma.light - Light-Weight Methods for Normalization and Visualization of Microarray Data using Only Basic R Data Types
Methods for microarray analysis that take basic data types such as matrices and lists of vectors. These methods can be used standalone, be utilized in other packages, or be wrapped up in higher-level classes.
Last updated 4 months ago
infrastructure microarray onechannel twochannel multichannel visualization preprocessing bioconductor
6.43 score 1 star 20 dependents 26 scripts 2.9k downloads
TopDom - An Efficient and Deterministic Method for Identifying Topological Domains in Genomes
The 'TopDom' method identifies topological domains in genomes from Hi-C sequence data (Shin et al., 2016 <doi:10.1093/nar/gkv1505>). The authors published an implementation of their method as an R script (two different versions; also available in this package). This package originates from those original 'TopDom' R scripts and provides help pages adapted from the original 'TopDom' PDF documentation. It also provides a small number of bug fixes to the original code.
Last updated 4 years ago
genomics hic topological-domains
5.76 score 19 stars 1 dependent 20 scripts 188 downloads
aroma.affymetrix - Analysis of Large Affymetrix Microarray Data Sets
A cross-platform R framework that facilitates processing of any number of Affymetrix microarray samples regardless of computer system. The only parameter that limits the number of chips that can be processed is the amount of available disk space. The Aroma Framework has successfully been used in studies to process tens of thousands of arrays. This package has actively been used since 2006.
Last updated 1 year ago
infrastructure proprietaryplatforms exonarray microarray onechannel gui dataimport datarepresentation preprocessing qualitycontrol visualization reportwriting acgh copynumbervariants differentialexpression geneexpression snp transcription affymetrix analysis copy-number dna expression hpc large-scale notebook reproducibility rna
5.70 score 10 stars 3 dependents 112 scripts 764 downloads
aroma.light - Light-Weight Methods for Normalization and Visualization of Microarray Data using Only Basic R Data Types
Methods for microarray analysis that take basic data types such as matrices and lists of vectors. These methods can be used standalone, be utilized in other packages, or be wrapped up in higher-level classes.
Last updated 2 years ago
infrastructure microarray onechannel twochannel multichannel visualization preprocessing bioconductor
5.67 score 1 star 20 dependents 26 scripts
port4me - Get the Same, Personal, Free 'TCP' Port over and over
An R implementation of the cross-platform, language-independent "port4me" algorithm (<https://github.com/HenrikBengtsson/port4me>), which (1) finds a free Transmission Control Protocol ('TCP') port in [1024,65535] that the user can open, (2) is designed to work in multi-user environments, (3) gives different users different ports, (4) gives the user the same port over time with high probability, (5) gives different ports for different software tools, and (6) requires no configuration.
Last updated 1 year ago
bash cli high-performance-computing hpc multi-tenant multi-user port pypi-package python r-language r-programming tcp utility
5.11 score 13 stars 5 scripts 255 downloads
future.tests - Test Suite for 'Future API' Backends
Backends implementing the 'Future' API, as defined by the 'future' package, should use the tests provided by this package to validate that they meet the minimal requirements of the 'Future' API. The tests can be performed easily from within R or from outside of R from the command line making it straightforward to include them in package tests and in Continuous Integration (CI) pipelines.
Last updated 2 years ago
future testing
4.48 score 10 stars 4 scripts 305 downloads
aroma.core - Core Methods and Classes Used by 'aroma.*' Packages Part of the Aroma Framework
Core methods and classes used by higher-level 'aroma.*' packages part of the Aroma Project, e.g. 'aroma.affymetrix' and 'aroma.cn'.
Last updated 2 years ago
microarray onechannel twochannel multichannel dataimport datarepresentation gui visualization preprocessing qualitycontrol acgh copynumbervariants
4.16 score 1 star 6 dependents 16 scripts 810 downloads
R.huge - Methods for Accessing Huge Amounts of Data [deprecated]
DEPRECATED. Do not start building new projects based on this package. Cross-platform alternatives are the following packages: bigmemory (CRAN), ff (CRAN), BufferedMatrix (Bioconductor). The main usage of it was inside the aroma.affymetrix package. (The package currently provides a class representing a matrix where the actual data is stored in a binary format on the local file system. This way the size limit of the data is set by the file system and not the memory.)
Last updated 1 year ago
3.88 score 5 dependents 2 scripts 477 downloads
aroma.apd - A Probe-Level Data File Format Used by 'aroma.affymetrix' [deprecated]
DEPRECATED. Do not start building new projects based on this package. (The (in-house) APD file format was initially developed to store Affymetrix probe-level data, e.g. normalized CEL intensities. Chip types can be added to an APD file and, similar to methods in the affxparser package, this package provides methods to read APDs organized by units (probesets). In addition, the probe elements can be arranged optimally such that the elements are guaranteed to be read in order when, for instance, data is read unit by unit. This speeds up the read substantially. This package supports the Aroma framework and should not be used elsewhere.)
Last updated 2 years ago
microarray dataimport
3.82 score 4 dependents 11 scripts 578 downloads
seguid - Sequence Globally Unique Identifier ('SEGUID') Checksums for Linear, Circular, Single-Stranded and Double-Stranded Biological Sequences
An R implementation of the original Sequence Globally Unique Identifier ('SEGUID') algorithm [Babnigg and Giometti (2006) <doi:10.1002/pmic.200600032>] and 'SEGUID' v2 (<https://www.seguid.org>), which extends 'SEGUID' v1 with support for linear, circular, single- and double-stranded biological sequences, e.g. DNA, RNA, and proteins.
Last updated 1 year ago
seguid
3.30 score 5 scripts 143 downloads
ACNE - Affymetrix SNP Probe-Summarization using Non-Negative Matrix Factorization
A summarization method to estimate allele-specific copy number signals for Affymetrix SNP microarrays using non-negative matrix factorization (NMF).
Last updated 1 year ago
acgh copynumbervariants snp microarray onechannel twochannel genetics
3.18 score 2 scripts 400 downloads
aroma.cn - Copy-Number Analysis of Large Microarray Data Sets
Methods for analyzing DNA copy-number data. Specifically, this package implements the multi-source copy-number normalization (MSCN) method for normalizing copy-number data obtained on various platforms and technologies. It also implements the TumorBoost method for normalizing paired tumor-normal SNP data.
Last updated 1 year ago
proprietaryplatforms acgh copynumbervariants snp microarray onechannel twochannel dataimport datarepresentation preprocessing qualitycontrol
2.70 score 1 star 9 scripts 440 downloads
calmate - Improved Allele-Specific Copy Number of SNP Microarrays for Downstream Segmentation
The CalMaTe method calibrates preprocessed allele-specific copy number estimates (ASCNs) from DNA microarrays by controlling for single-nucleotide polymorphism-specific allelic crosstalk. The resulting ASCNs are on average more accurate, which increases the power of segmentation methods for detecting changes between copy number states in tumor studies including copy neutral loss of heterozygosity. CalMaTe applies to any ASCNs regardless of preprocessing method and microarray technology, e.g. Affymetrix and Illumina.
Last updated 3 years ago
acgh copynumbervariants snp microarray onechannel twochannel genetics
2.70 score 1 star 6 scripts 353 downloads
dChipIO - Methods for Reading dChip Files
Functions for reading DCP and CDF.bin files generated by the dChip software.
Last updated 9 years ago
infrastructure dataimport
2.70 score 3 scripts 179 downloads
future.tools - Tools for Working with Futures
Tools for Working with Futures.
Last updated 9 months ago
parallel-computing parallel-programming
2.60 score 2 stars
doFuture.tests.extra - Extra Test Sets for the 'doFuture' Package
Runs examples of packages that use 'foreach' and '%dopar%' for parallelization, where 'doFuture' is used as the 'foreach' adapter making it possible to use any future backend for parallelization. The package tests use these tools to test 'doFuture' with 'foreach'-based examples from packages 'BiocParallel', 'caret', 'doParallel', 'glmnet', 'NMF', 'plyr', and 'TSP'. These tests are run with many known future backends.
Last updated 1 year ago
futureverse parallel testsuite
1.70 score
sfit - Multidimensional Simplex Fitting
Methods for robustly fitting a K-dimensional simplex in M dimensions.
Last updated 2 years ago
cconemodelsimplex
1.70 score 1 star