future - Unified Parallel and Distributed Processing in R for Everyone
The purpose of this package is to provide a lightweight and unified Future API for sequential and parallel processing of R expressions via futures. The simplest way to evaluate an expression in parallel is to use `x %<-% { expression }` with `plan(multisession)`. This package implements sequential, multicore, multisession, and cluster futures. With these, R expressions can be evaluated on the local machine, in parallel on a set of local machines, or distributed on a mix of local and remote machines. Extensions to this package implement additional backends for processing futures via compute cluster schedulers, etc. Because of its unified API, there is no need to modify any code in order to switch from sequential processing on the local machine to, say, distributed processing on a remote compute cluster. Another strength of this package is that global variables and functions are automatically identified and exported as needed, making it straightforward to tweak existing code to make use of futures.
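As a minimal sketch of the `%<-%` and `plan(multisession)` usage described above (the variable names and the choice of two workers are illustrative assumptions, not part of the package description):

```r
library(future)

# Resolve futures in two background R sessions (the worker count is arbitrary)
plan(multisession, workers = 2)

a <- 10

# Future assignment: the block is evaluated in a worker, and the global 'a'
# is identified and exported automatically
x %<-% { a * 2 }

# Explicit form of the same idea: create a future, then collect its value
f <- future(sum(1:10))
y <- value(f)

# Using 'x' blocks until its future is resolved
total <- x + y

# Switching backend only needs a different plan(); the code above is unchanged
plan(sequential)
```

Replacing `plan(multisession, workers = 2)` with `plan(sequential)` at the top should give the same values without any parallel workers.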
Last updated 4 months ago
asynchronous, distributed-computing, futures, hpc, hpc-clusters, parallel-computing, parallel-processing, parallelization, programming, promises
18.63 score 956 stars 1.1k packages 15k scripts 255k downloads
matrixStats - Functions that Apply to Rows and Columns of Matrices (and to Vectors)
High-performing functions operating on rows and columns of matrices, e.g. col / rowMedians(), col / rowRanks(), and col / rowSds(). The functions are optimized per data type and for subsetted calculations such that both memory usage and processing time are minimized. There are also optimized vector-based methods, e.g. binMeans(), madDiff() and weightedMedian().
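A small sketch of the row/column functions and the subsetted calculations mentioned above; the example matrix is made up for illustration:

```r
library(matrixStats)

# 3 x 3 matrix whose columns are 1:3, 4:6 and 7:9
x <- matrix(as.numeric(1:9), nrow = 3)

row_med <- rowMedians(x)  # median of each row
col_sd  <- colSds(x)      # standard deviation of each column

# Subsetted calculation: only rows 1-2 and columns 1 and 3 are used,
# without first creating the sub-matrix x[1:2, c(1, 3)]
sub_med <- rowMedians(x, rows = 1:2, cols = c(1, 3))
```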
Last updated 2 months ago
matrix, performance, vector
18.19 score 203 stars 2.2k packages 19k scripts 248k downloads
future.apply - Apply Function to Elements in Parallel using Futures
Implementations of apply(), by(), eapply(), lapply(), Map(), .mapply(), mapply(), replicate(), sapply(), tapply(), and vapply() that can be resolved using any future-supported backend, e.g. parallel on the local machine or distributed on a compute cluster. These future_*apply() functions come with the same pros and cons as the corresponding base-R *apply() functions but with the additional feature of being able to be processed via the future framework <doi:10.32614/RJ-2021-048>.
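For illustration, a hedged sketch of two of the future_*apply() functions; the inputs and the two-worker multisession backend are assumptions chosen only to keep the example small:

```r
library(future)
library(future.apply)

# Any future backend can be used; two local background sessions are assumed here
plan(multisession, workers = 2)

# Drop-in counterparts of sapply() and lapply()
squares <- future_sapply(1:5, function(i) i^2)
lens    <- future_lapply(list(a = 1:3, b = 1:5), length)

plan(sequential)
```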
Last updated 25 days ago
asynchronous, distributed-computing, future, hpc, hpc-clusters, parallel, parallel-computing, parallel-processing, parallelization, programming
14.06 score 210 stars 872 packages 1.9k scripts 184k downloads
R.utils - Various Programming Utilities
Utility functions useful when programming and developing R packages.
Last updated 1 year ago
13.67 score 62 stars 802 packages 5.6k scripts 187k downloads
parallelly - Enhancing the 'parallel' Package
Utility functions that enhance the 'parallel' package and support the built-in parallel backends of the 'future' package. For example, availableCores() gives the number of CPU cores available to your R process as given by the operating system, 'cgroups' and Linux containers, R options, and environment variables, including those set by job schedulers on high-performance compute clusters. If none is set, it will fall back to parallel::detectCores(). Another example is makeClusterPSOCK(), which is backward compatible with parallel::makePSOCKcluster() while doing a better job in setting up remote cluster workers without the need for configuring the firewall to do port-forwarding to your local computer.
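The two functions named above can be tried as follows. This is only a sketch: the two local workers and the toy task are assumptions.

```r
library(parallelly)

# Number of cores this R process is allowed to use
n_cores <- availableCores()

# Set up two local PSOCK workers; the returned cluster works with the
# functions of the 'parallel' package
cl <- makeClusterPSOCK(2)
res <- parallel::parLapply(cl, 1:4, function(i) i * 2)
parallel::stopCluster(cl)
```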
Last updated 14 days ago
parallel-computing
12.93 score 130 stars 1.2k packages 349 scripts 261k downloads
doFuture - Use Foreach to Parallelize via the Future Framework
The 'future' package provides a unifying parallelization framework for R that supports many parallel and distributed backends. The 'foreach' package provides a powerful API for iterating over an R expression in parallel. The 'doFuture' package brings the best of the two together. There are two alternative ways to use this package. The recommended approach is to use 'y <- foreach(...) %dofuture% { ... }', which does not require using 'registerDoFuture()' and has many advantages over '%dopar%'. The alternative is the traditional 'foreach' approach of registering the 'foreach' adapter with 'registerDoFuture()' so that 'y <- foreach(...) %dopar% { ... }' parallelizes via the 'future' framework.
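A minimal sketch of the recommended `%dofuture%` form; the loop body, `.combine = c`, and the two-worker backend are illustrative assumptions:

```r
library(doFuture)  # attaches 'foreach' and 'future' as well

plan(multisession, workers = 2)

# No registerDoFuture() call is needed with %dofuture%
y <- foreach(i = 1:3, .combine = c) %dofuture% {
  i^2
}

plan(sequential)
```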
Last updated 11 months ago
batchjobs, batchtools, biocparallel, distributed-computing, foreach, hpc, hpc-clusters, parallel, plyr
12.04 score 84 stars 80 packages 1.3k scripts 29k downloads
listenv - Environments Behaving (Almost) as Lists
List environments are environments that have list-like properties. For instance, the elements of a list environment are ordered and can be accessed and iterated over using index subsetting, e.g. 'x <- listenv(a = 1, b = 2); for (i in seq_along(x)) x[[i]] <- x[[i]] ^ 2; y <- as.list(x)'.
Last updated 10 months ago
11.91 score 29 stars 1.1k packages 79 scripts 207k downloads
R.oo - R Object-Oriented Programming with or without References
Methods and classes for object-oriented programming in R with or without references. Great effort has been made to make the definition of methods as simple as possible with a minimum of maintenance for package developers. The package has been developed since 2001 and is now considered very stable. This is a cross-platform package implemented in pure R that defines standard S3 classes without any tricks.
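To make "with references" concrete, here is a small sketch using a hypothetical `Counter` class (the class name, field, and method are invented for illustration):

```r
library(R.oo)

# Constructor: extending Object() gives the class reference semantics
setConstructorS3("Counter", function(count = 0) {
  extend(Object(), "Counter", count = count)
})

# Method that modifies the object in place
setMethodS3("increment", "Counter", function(this, ...) {
  this$count <- this$count + 1
  invisible(this)
})

counter <- Counter()
increment(counter)
increment(counter)
n <- counter$count  # reflects both calls because 'counter' is a reference
```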
Last updated 20 days ago
11.75 score 20 stars 819 packages 319 scripts 181k downloads
globals - Identify Global Objects in R Expressions
Identifies global ("unknown" or "free") objects in R expressions by code inspection using various strategies (ordered, liberal, or conservative). The objective of this package is to make it as simple as possible to identify global objects for the purpose of exporting them in parallel, distributed compute environments.
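As an illustration of identifying globals by code inspection, the sketch below uses `findGlobals()` on a made-up expression:

```r
library(globals)

# 'a' and 'x' are free variables; 'b' is assigned before it is used
expr <- quote({
  b <- a * x
  sqrt(b)
})

# Names of the global objects, found by code inspection using the
# default "ordered" strategy
found <- findGlobals(expr)
print(found)
```

Functions such as `sqrt` and operators are reported as globals too, since they are also objects that the expression depends on.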
Last updated 2 years ago
code-inspection
11.59 score 28 stars 1.2k packages 258 scripts 300k downloads
R.matlab - Read and Write MAT Files and Call MATLAB from Within R
Methods readMat() and writeMat() for reading and writing MAT files. For users with MATLAB v6 or newer installed (either locally or on a remote host), the package also provides methods for controlling MATLAB (trademark) via R and sending and retrieving data between R and MATLAB.
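A short round-trip sketch with writeMat() and readMat(); the variable name `A` and the temporary file are assumptions, and no MATLAB installation is needed for this part:

```r
library(R.matlab)

path <- tempfile(fileext = ".mat")

# Write a named variable to a MAT file ...
A <- matrix(as.numeric(1:6), nrow = 2)
writeMat(path, A = A)

# ... and read it back as a named list
data <- readMat(path)
A2 <- data$A
```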
Last updated 2 years ago
matlab
10.58 score 86 stars 24 packages 2.8k scripts 7.3k downloads
illuminaio - Parsing Illumina Microarray Output Files
Tools for parsing Illumina's microarray output files, including IDAT.
Last updated 23 days ago
infrastructure, dataimport, microarray, proprietaryplatforms, bioconductor
10.39 score 6 stars 44 packages 50 scripts 4.5k downloads
R.cache - Fast and Light-Weight Caching (Memoization) of Objects and Results to Speed Up Computations
Memoization can be used to speed up repetitive and computationally expensive function calls. The first time a function that implements memoization is called, the results are stored in a cache memory. The next time the function is called with the same set of parameters, the results are quickly retrieved from the cache, avoiding repeating the calculations. With this package, any R object can be cached in a key-value storage where the key can be an arbitrary set of R objects. The cache memory is persistent (on the file system).
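The key-value pattern described above might look like this in practice. It is a sketch only: `slow_square()` is a hypothetical function, and the cache is redirected to a temporary directory purely so the example leaves nothing behind.

```r
library(R.cache)

# Assumption for this sketch: keep the cache in a temporary directory
cache_dir <- file.path(tempdir(), "Rcache-demo")
dir.create(cache_dir, showWarnings = FALSE)
setCacheRootPath(cache_dir)

slow_square <- function(x) {
  # The key can be any set of R objects
  key <- list("slow_square", x)

  # Return the cached result if there is one
  res <- loadCache(key)
  if (!is.null(res)) return(res)

  Sys.sleep(0.5)  # stand-in for an expensive computation
  res <- x^2
  saveCache(res, key = key)
  res
}

first  <- slow_square(4)  # computed and stored
second <- slow_square(4)  # retrieved from the cache
```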
Last updated 2 years ago
cache, memoization
9.65 score 38 stars 109 packages 92 scripts 79k downloads
profmem - Simple Memory Profiling for R
A simple and light-weight API for memory profiling of R expressions. The profiling is built on top of R's built-in memory profiler ('utils::Rprofmem()'), which records every memory allocation done by R (also native code).
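A minimal sketch of profiling an expression. It assumes R was built with memory profiling enabled, which the code checks first; the allocation being profiled is arbitrary.

```r
library(profmem)

# Memory profiling is only available if R was built with support for it
if (capabilities("profmem")) {
  p <- profmem({
    x <- numeric(1e5)  # allocates roughly 800 kB
  })
  print(p)             # one row per recorded allocation
  bytes <- total(p)    # total number of bytes allocated
}
```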
Last updated 4 years ago
memory-profiler, performance, ram
9.02 score 35 stars 10 packages 149 scripts 14k downloads
R.methodsS3 - S3 Methods Simplified
Methods that simplify the setup of S3 generic functions and S3 methods. Major effort has been made to make the definition of methods as simple as possible with a minimum of maintenance for package developers. For example, generic functions are created automatically, if missing, and naming conflicts are automatically resolved, if possible. The method setMethodS3() is a good start for those who in the future may want to migrate to S4. This is a cross-platform package implemented in pure R that generates standard S3 methods.
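For example, the automatic creation of a missing generic can be sketched as follows (the `area` method and `Circle` class are hypothetical names chosen for illustration):

```r
library(R.methodsS3)

# Defines area.Circle() and, because no generic area() exists yet,
# creates that generic as well
setMethodS3("area", "Circle", function(shape, ...) {
  pi * shape$r^2
})

circle <- structure(list(r = 1), class = "Circle")
a <- area(circle)
```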
Last updated 2 years ago
8.72 score 1 stars 828 packages 23 scripts 184k downloads
illuminaio - Parsing Illumina Microarray Output Files
Tools for parsing Illumina's microarray output files, including IDAT.
Last updated 2 years ago
infrastructure, dataimport, microarray, proprietaryplatforms, bioconductor
8.43 score 6 stars 44 packages 49 scripts
future.callr - A Future API for Parallel Processing using 'callr'
Implementation of the Future API on top of the 'callr' package. This allows you to process futures, as defined by the 'future' package, in parallel out of the box, on your local (Linux, macOS, Windows, ...) machine. Contrary to backends relying on the 'parallel' package (e.g. 'future::multisession') and socket connections, the 'callr' backend provided here can run more than 125 parallel R processes.
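A hedged sketch of selecting this backend; the two workers and the toy expression are assumptions:

```r
library(future)
library(future.callr)

# Resolve futures in background R processes launched via 'callr'
plan(callr, workers = 2)

f <- future(Sys.getpid())
worker_pid <- value(f)  # process ID of the R process that evaluated the future

plan(sequential)
```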
Last updated 1 year ago
asynchronous, futures, parallel-computing, parallel-processing, parallelization, programming, promises
8.03 score 62 stars 1 packages 229 scripts 2.5k downloads
R.rsp - Dynamic Generation of Scientific Reports
The RSP markup language makes any text-based document come alive. RSP provides a powerful markup for controlling the content and output of LaTeX, HTML, Markdown, AsciiDoc, Sweave and knitr documents (and more), e.g. 'Today's date is <%=Sys.Date()%>'. Contrary to many other literate programming languages, with RSP it is straightforward to loop over mixtures of code and text sections, e.g. in month-by-month summaries. RSP also has several preprocessing directives for incorporating static and dynamic contents of external files (local or online) among other things. Functions rstring() and rcat() make it easy to process RSP strings, rsource() sources an RSP file as if it were an R script, while rfile() compiles it (even online) into its final output format, e.g. rfile('report.tex.rsp') generates 'report.pdf' and rfile('report.md.rsp') generates 'report.html'. RSP is ideal for self-contained scientific reports and R package vignettes. It's easy to use - if you know how to write an R script, you'll be up and running within minutes.
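As a small sketch of rstring() on RSP strings, including a loop over mixed code and text (both template strings are made up for illustration):

```r
library(R.rsp)

# <%= ... %> inserts the value of an R expression into the text
s <- rstring("The sum of 1 and 2 is <%= 1 + 2 %>.")
print(s)

# <% ... %> runs R code, so text sections can be repeated in a loop
looped <- rstring("<% for (i in 1:3) { %>Item <%= i %>; <% } %>")
print(looped)
```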
Last updated 9 months ago
document, markup, report, reproducibility, science
7.95 score 31 stars 9 packages 36 scripts 15k downloads
affxparser - Affymetrix File Parsing SDK
Package for parsing Affymetrix files (CDF, CEL, CHP, BPMAP, BAR). It provides methods for fast and memory-efficient parsing of Affymetrix files using the Affymetrix Fusion SDK. Both ASCII- and binary-based files are supported. Currently, there are methods for reading a chip definition file (CDF) and a cell intensity file (CEL). These files can be read either in full or in part. For example, probe signals from a few probesets can be extracted very quickly from a set of CEL files into a convenient list structure.
Last updated 23 days ago
infrastructure, dataimport, microarray, proprietaryplatforms, onechannel, bioconductor
7.82 score 7 stars 15 packages 65 scripts 2.3k downloads
future.batchtools - A Future API for Parallel and Distributed Processing using 'batchtools'
Implementation of the Future API on top of the 'batchtools' package. This allows you to process futures, as defined by the 'future' package, in parallel out of the box, not only on your local machine or ad-hoc cluster of machines, but also via high-performance compute ('HPC') job schedulers such as 'LSF', 'OpenLava', 'Slurm', 'SGE', and 'TORQUE' / 'PBS', e.g. 'y <- future.apply::future_lapply(files, FUN = process)'.
Last updated 11 months ago
distributed-computing, hpc, job-scheduler, parallel, pbs, sge, slurm, torque
7.76 score 84 stars 382 scripts 1.8k downloads
PSCBS - Analysis of Parent-Specific DNA Copy Numbers
Segmentation of allele-specific DNA copy number data and detection of regions with abnormal copy number within each parental chromosome. Both tumor-normal paired and tumor-only analyses are supported.
Last updated 9 months ago
acgh, copynumbervariants, snp, microarray, onechannel, twochannel, genetics
7.59 score 7 stars 9 packages 34 scripts 760 downloads
affxparser - Affymetrix File Parsing SDK
Package for parsing Affymetrix files (CDF, CEL, CHP, BPMAP, BAR). It provides methods for fast and memory-efficient parsing of Affymetrix files using the Affymetrix Fusion SDK. Both ASCII- and binary-based files are supported. Currently, there are methods for reading a chip definition file (CDF) and a cell intensity file (CEL). These files can be read either in full or in part. For example, probe signals from a few probesets can be extracted very quickly from a set of CEL files into a convenient list structure.
Last updated 10 months ago
infrastructure, dataimport, microarray, proprietaryplatforms, onechannel, bioconductor
7.16 score 7 stars 15 packages 65 scripts
R.devices - Unified Handling of Graphics Devices
Functions for creating plots and image files in a unified way regardless of output format (EPS, PDF, PNG, SVG, TIFF, WMF, etc.). Default device options as well as scales and aspect ratios are controlled in a uniform way across all device types. Switching output format requires minimal changes in code. This package is ideal for large-scale batch processing, because it will never leave open graphics devices or incomplete image files behind, even on errors or user interrupts.
Last updated 10 months ago
graphics
7.08 score 19 stars 13 packages 84 scripts 3.9k downloads
aroma.light - Light-Weight Methods for Normalization and Visualization of Microarray Data using Only Basic R Data Types
Methods for microarray analysis that take basic data types such as matrices and lists of vectors. These methods can be used standalone, be utilized in other packages, or be wrapped up in higher-level classes.
Last updated 23 days ago
infrastructure, microarray, onechannel, twochannel, multichannel, visualization, preprocessing, bioconductor
6.42 score 1 stars 20 packages 25 scripts 2.9k downloads
startup - Friendly R Startup Configuration
Adds support for R startup configuration via '.Renviron.d' and '.Rprofile.d' directories in addition to '.Renviron' and '.Rprofile' files. This makes it possible to keep private / secret environment variables separate from other environment variables. It also makes it easier to share specific startup settings by simply copying a file to a directory.
Last updated 4 months ago
configuration, environment-variables, startup, utility
6.42 score 163 stars 16 scripts 493 downloads
TopDom - An Efficient and Deterministic Method for Identifying Topological Domains in Genomes
The 'TopDom' method identifies topological domains in genomes from Hi-C sequence data (Shin et al., 2016 <doi:10.1093/nar/gkv1505>). The authors published an implementation of their method as an R script (two different versions; also available in this package). This package originates from those original 'TopDom' R scripts and provides help pages adapted from the original 'TopDom' PDF documentation. It also provides a small number of bug fixes to the original code.
Last updated 4 years ago
genomics, hic, topological-domains
5.76 score 20 stars 1 packages 19 scripts 203 downloads
aroma.affymetrix - Analysis of Large Affymetrix Microarray Data Sets
A cross-platform R framework that facilitates processing of any number of Affymetrix microarray samples regardless of computer system. The only parameter that limits the number of chips that can be processed is the amount of available disk space. The Aroma Framework has successfully been used in studies to process tens of thousands of arrays. This package has actively been used since 2006.
Last updated 9 months ago
infrastructure, proprietaryplatforms, exonarray, microarray, onechannel, gui, dataimport, datarepresentation, preprocessing, qualitycontrol, visualization, reportwriting, acgh, copynumbervariants, differentialexpression, geneexpression, snp, transcription, affymetrix, analysis, copy-number, dna, expression, hpc, large-scale, notebook, reproducibility, rna
5.70 score 10 stars 3 packages 112 scripts 792 downloads
aroma.light - Light-Weight Methods for Normalization and Visualization of Microarray Data using Only Basic R Data Types
Methods for microarray analysis that take basic data types such as matrices and lists of vectors. These methods can be used standalone, be utilized in other packages, or be wrapped up in higher-level classes.
Last updated 2 years ago
infrastructure, microarray, onechannel, twochannel, multichannel, visualization, preprocessing, bioconductor
5.65 score 1 stars 20 packages 25 scripts
future.tests - Test Suite for 'Future API' Backends
Backends implementing the 'Future' API, as defined by the 'future' package, should use the tests provided by this package to validate that they meet the minimal requirements of the 'Future' API. The tests can be performed easily from within R or from outside of R from the command line making it straightforward to include them in package tests and in Continuous Integration (CI) pipelines.
Last updated 2 years ago
future, testing
5.48 score 10 stars 4 scripts 300 downloads
port4me - Get the Same, Personal, Free 'TCP' Port over and over
An R implementation of the cross-platform, language-independent "port4me" algorithm (<https://github.com/HenrikBengtsson/port4me>), which (1) finds a free Transmission Control Protocol ('TCP') port in [1024,65535] that the user can open, (2) is designed to work in multi-user environments, (3) gives different users different ports, (4) gives the user the same port over time with high probability, (5) gives different ports for different software tools, and (6) requires no configuration.
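A brief sketch of requesting ports. The tool name "rstudio" is only an example label, and the exact port numbers returned depend on the user and on which ports are free:

```r
library(port4me)

# A free TCP port for the current user
port <- port4me()

# A port for a specific tool; different tool names are meant to give
# different ports for the same user
port_tool <- port4me(tool = "rstudio")
```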
Last updated 9 months ago
bash, cli, high-performance-computing, hpc, multi-tenant, multi-user, port, pypi-package, python, r-language, r-programming, tcp, utility
5.41 score 13 stars 5 scripts 187 downloads
aroma.core - Core Methods and Classes Used by 'aroma.*' Packages Part of the Aroma Framework
Core methods and classes used by higher-level 'aroma.*' packages part of the Aroma Project, e.g. 'aroma.affymetrix' and 'aroma.cn'.
Last updated 2 years ago
microarray, onechannel, twochannel, multichannel, dataimport, datarepresentation, gui, visualization, preprocessing, qualitycontrol, acgh, copynumbervariants
4.16 score 1 stars 6 packages 16 scripts 938 downloads
R.huge - Methods for Accessing Huge Amounts of Data [deprecated]
DEPRECATED. Do not start building new projects based on this package. Cross-platform alternatives are the following packages: bigmemory (CRAN), ff (CRAN), BufferedMatrix (Bioconductor). The main usage of it was inside the aroma.affymetrix package. (The package currently provides a class representing a matrix where the actual data is stored in a binary format on the local file system. This way the size limit of the data is set by the file system and not the memory.)
Last updated 10 months ago
3.88 score 5 packages 2 scripts 415 downloads
aroma.apd - A Probe-Level Data File Format Used by 'aroma.affymetrix' [deprecated]
DEPRECATED. Do not start building new projects based on this package. (The (in-house) APD file format was initially developed to store Affymetrix probe-level data, e.g. normalized CEL intensities. Chip types can be added to an APD file and, similar to the methods in the affxparser package, this package provides methods to read APDs organized by units (probesets). In addition, the probe elements can be arranged optimally such that the elements are guaranteed to be read in order when, for instance, data is read unit by unit. This speeds up the read substantially. This package supports the Aroma framework and should not be used elsewhere.)
Last updated 1 year ago
microarray, dataimport
3.82 score 4 packages 11 scripts 562 downloads
seguid - Sequence Globally Unique Identifier ('SEGUID') Checksums for Linear, Circular, Single-Stranded and Double-Stranded Biological Sequences
An R implementation of the original Sequence Globally Unique Identifier ('SEGUID') algorithm [Babnigg and Giometti (2006) <doi:10.1002/pmic.200600032>] and 'SEGUID' v2 (<https://www.seguid.org>), which extends 'SEGUID' v1 with support for linear, circular, single- and double-stranded biological sequences, e.g. DNA, RNA, and proteins.
Last updated 9 months ago
seguid
3.70 score 5 scripts 136 downloads
ACNE - Affymetrix SNP Probe-Summarization using Non-Negative Matrix Factorization
A summarization method to estimate allele-specific copy number signals for Affymetrix SNP microarrays using non-negative matrix factorization (NMF).
Last updated 9 months ago
acgh, copynumbervariants, snp, microarray, onechannel, twochannel, genetics
3.18 score 2 scripts 385 downloads
future.tools - Tools for Working with Futures
Tools for Working with Futures.
Last updated 6 months ago
parallel-computing, parallel-programming
2.90 score 2 stars
aroma.cn - Copy-Number Analysis of Large Microarray Data Sets
Methods for analyzing DNA copy-number data. Specifically, this package implements the multi-source copy-number normalization (MSCN) method for normalizing copy-number data obtained on various platforms and technologies. It also implements the TumorBoost method for normalizing paired tumor-normal SNP data.
Last updated 9 months ago
proprietaryplatforms, acgh, copynumbervariants, snp, microarray, onechannel, twochannel, dataimport, datarepresentation, preprocessing, qualitycontrol
2.70 score 1 stars 9 scripts 427 downloads
calmate - Improved Allele-Specific Copy Number of SNP Microarrays for Downstream Segmentation
The CalMaTe method calibrates preprocessed allele-specific copy number estimates (ASCNs) from DNA microarrays by controlling for single-nucleotide polymorphism-specific allelic crosstalk. The resulting ASCNs are on average more accurate, which increases the power of segmentation methods for detecting changes between copy number states in tumor studies including copy neutral loss of heterozygosity. CalMaTe applies to any ASCNs regardless of preprocessing method and microarray technology, e.g. Affymetrix and Illumina.
Last updated 3 years ago
acgh, copynumbervariants, snp, microarray, onechannel, twochannel, genetics
2.70 score 1 stars 6 scripts 284 downloads
dChipIO - Methods for Reading dChip Files
Functions for reading DCP and CDF.bin files generated by the dChip software.
Last updated 9 years ago
infrastructure, dataimport
2.70 score 3 scripts 142 downloads
doFuture.tests.extra - Extra Test Sets for the 'doFuture' Package
Runs examples of packages that use 'foreach' and '%dopar%' for parallelization, where 'doFuture' is used as the 'foreach' adapter making it possible to use any future backend for parallelization. The package tests use these tools to test 'doFuture' with 'foreach'-based examples from packages 'BiocParallel', 'caret', 'doParallel', 'glmnet', 'NMF', 'plyr', and 'TSP'. These tests are run with many known future backends.
Last updated 12 months ago
futureverse, parallel, testsuite
1.70 score
sfit - Multidimensional Simplex Fitting
Methods for robustly fitting a K-dimensional simplex in M dimensions.
Last updated 2 years ago
cconemodelsimplex
1.70 score 1 stars