matrixStats - Functions that Apply to Rows and Columns of Matrices (and to Vectors)
High-performing functions operating on rows and columns of matrices, e.g. col / rowMedians(), col / rowRanks(), and col / rowSds(). Functions are optimized per data type and for subsetted calculations such that both memory usage and processing time are minimized. There are also optimized vector-based methods, e.g. binMeans(), madDiff() and weightedMedian().
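As a rough illustration of the row and column functions named above, the sketch below compares `rowMedians()` with the base-R `apply()` equivalent and runs one subsetted calculation. The matrix `x` and its dimensions are made up for the example.

```r
library(matrixStats)

# Hypothetical example data: a 1000 x 50 numeric matrix
set.seed(42)
x <- matrix(rnorm(1000 * 50), nrow = 1000, ncol = 50)

# Row medians via matrixStats and via base R
m_fast <- rowMedians(x)
m_base <- apply(x, MARGIN = 1, FUN = median)

# Both approaches should agree up to floating-point precision
stopifnot(isTRUE(all.equal(m_fast, m_base)))

# Subsetted calculation: standard deviations of the first 10 columns only,
# without first creating the intermediate matrix x[, 1:10]
s_sub <- colSds(x, cols = 1:10)
```

The `cols` argument is what the description means by "subsetted calculations": the subset is handled inside the function instead of being copied beforehand.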
Last updated 4 months ago
matrix, performance, vector
200 stars 14.74 score 0 dependencies 2127 dependents
future - Unified Parallel and Distributed Processing in R for Everyone
The purpose of this package is to provide a lightweight and unified Future API for sequential and parallel processing of R expressions via futures. The simplest way to evaluate an expression in parallel is to use `x %<-% { expression }` with `plan(multisession)`. This package implements sequential, multicore, multisession, and cluster futures. With these, R expressions can be evaluated on the local machine, in parallel on a set of local machines, or distributed on a mix of local and remote machines. Extensions to this package implement additional backends for processing futures via compute cluster schedulers, etc. Because of its unified API, there is no need to modify any code in order to switch from sequential processing on the local machine to, say, distributed processing on a remote compute cluster. Another strength of this package is that global variables and functions are automatically identified and exported as needed, making it straightforward to tweak existing code to make use of futures.
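A minimal sketch of the pattern described above, assuming two local workers are acceptable; the variable names `a`, `x`, `f`, and `y` are chosen only for illustration.

```r
library(future)

# Evaluate futures in parallel background R sessions on the local machine
plan(multisession, workers = 2)

# 'a' is a global variable; it is identified and exported automatically
a <- 10

# Future assignment: the expression is evaluated asynchronously
x %<-% {
  a * 2
}

# Explicit API: create a future, then collect its value
f <- future(sum(1:100))
y <- value(f)

# Accessing 'x' blocks until its future has been resolved
print(x)
print(y)

# Switch back to sequential processing; the code above needs no changes
plan(sequential)
```

Replacing `plan(multisession, workers = 2)` with another plan is the only change needed to run the same code on a different backend.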
Last updated 4 months ago
asynchronous, distributed-computing, futures, hpc, hpc-clusters, parallel-computing, parallel-processing, parallelization, programming, promises
948 stars 13.18 score 5 dependencies 1039 dependents
parallelly - Enhancing the 'parallel' Package
Utility functions that enhance the 'parallel' package and support the built-in parallel backends of the 'future' package. For example, availableCores() gives the number of CPU cores available to your R process as given by the operating system, 'cgroups' and Linux containers, R options, and environment variables, including those set by job schedulers on high-performance compute clusters. If none is set, it will fall back to parallel::detectCores(). Another example is makeClusterPSOCK(), which is backward compatible with parallel::makePSOCKcluster() while doing a better job in setting up remote cluster workers without the need for configuring the firewall to do port-forwarding to your local computer.
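The two functions mentioned above can be tried out as follows. This is a sketch assuming a local machine where starting two background R workers is acceptable; the squaring task is a placeholder.

```r
library(parallelly)

# Number of CPU cores available to this R process, respecting
# cgroups, R options, and job-scheduler environment variables
n <- availableCores()
print(n)

# Set up a PSOCK cluster with two local workers
cl <- makeClusterPSOCK(2)

# The cluster can be used with functions from the 'parallel' package
res <- parallel::parLapply(cl, 1:4, function(i) i^2)

# Always shut down the workers when done
parallel::stopCluster(cl)
```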
Last updated 5 months ago
parallel-computing
124 stars 12.61 score 0 dependencies 1077 dependents
globals - Identify Global Objects in R Expressions
Identifies global ("unknown" or "free") objects in R expressions by code inspection using various strategies (ordered, liberal, or conservative). The objective of this package is to make it as simple as possible to identify global objects for the purpose of exporting them in parallel, distributed compute environments.
Last updated 2 years ago
code-inspection
29 stars 12.49 score 1 dependency 1062 dependents
listenv - Environments Behaving (Almost) as Lists
List environments are environments that have list-like properties. For instance, the elements of a list environment are ordered and can be accessed and iterated over using index subsetting, e.g. 'x <- listenv(a = 1, b = 2); for (i in seq_along(x)) x[[i]] <- x[[i]] ^ 2; y <- as.list(x)'.
Last updated 6 months ago
28 stars 12.38 score 0 dependencies 1029 dependents
future.apply - Apply Function to Elements in Parallel using Futures
Implementations of apply(), by(), eapply(), lapply(), Map(), .mapply(), mapply(), replicate(), sapply(), tapply(), and vapply() that can be resolved using any future-supported backend, e.g. parallel on the local machine or distributed on a compute cluster. These future_*apply() functions come with the same pros and cons as the corresponding base-R *apply() functions but with the additional feature of being able to be processed via the future framework.
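For instance, `future_lapply()` can stand in for `lapply()` as sketched below. The function `slow_square()` is a hypothetical placeholder for real work, and two local workers are assumed.

```r
library(future)
library(future.apply)

# Resolve futures in two background R sessions
plan(multisession, workers = 2)

# Hypothetical slow function standing in for real work
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

# Base-R version and its future-based counterpart
y_seq <- lapply(1:4, slow_square)
y_par <- future_lapply(1:4, slow_square)

# Same result, regardless of backend
stopifnot(identical(y_seq, y_par))

plan(sequential)
```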
Last updated 4 months ago
asynchronous, distributed-computing, future, hpc, hpc-clusters, parallel, parallel-computing, parallel-processing, parallelization, programming
206 stars 11.86 score 6 dependencies 804 dependents
R.oo - R Object-Oriented Programming with or without References
Methods and classes for object-oriented programming in R with or without references. Considerable effort has been made to make the definition of methods as simple as possible with a minimum of maintenance for package developers. The package has been developed since 2001 and is now considered very stable. This is a cross-platform package implemented in pure R that defines standard S3 classes without any tricks.
Last updated 6 months ago
20 stars 11.19 score 1 dependency 688 dependents
R.methodsS3 - S3 Methods Simplified
Methods that simplify the setup of S3 generic functions and S3 methods. Considerable effort has been made to make the definition of methods as simple as possible with a minimum of maintenance for package developers. For example, generic functions are created automatically, if missing, and naming conflicts are automatically resolved, if possible. The method setMethodS3() is a good start for those who in the future may want to migrate to S4. This is a cross-platform package implemented in pure R that generates standard S3 methods.
Last updated 2 years ago
1 star 11.15 score 0 dependencies 684 dependents
R.utils - Various Programming Utilities
Utility functions useful when programming and developing R packages.
Last updated 8 months ago
61 stars 11.13 score 2 dependencies 654 dependents
R.cache - Fast and Light-Weight Caching (Memoization) of Objects and Results to Speed Up Computations
Memoization can be used to speed up repetitive and computationally expensive function calls. The first time a function that implements memoization is called, the results are stored in a cache memory. The next time the function is called with the same set of parameters, the results are quickly retrieved from the cache, avoiding repeating the calculations. With this package, any R object can be cached in a key-value storage where the key can be an arbitrary set of R objects. The cache memory is persistent (on the file system).
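The key-value caching described above might look like the sketch below. The function `slow_sum()`, the key contents, and the temporary cache directory are assumptions made for the example.

```r
library(R.cache)

# Keep the cache in a temporary directory for this example
setCacheRootPath(file.path(tempdir(), "r-cache-demo"))

# Hypothetical expensive function
slow_sum <- function(n) {
  Sys.sleep(1)
  sum(seq_len(n))
}

# The key can be an arbitrary list of R objects
key <- list("slow_sum", n = 1000)

# First look in the cache; compute and store only on a cache miss
res <- loadCache(key)
if (is.null(res)) {
  res <- slow_sum(1000)
  saveCache(res, key = key)
}

# A second lookup with the same key is served from the file-system cache
res2 <- loadCache(key)
```

Because the cache lives on the file system, a later R session pointing at the same cache root path would find the stored result as well.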
Last updated 2 years ago
cache, memoization
38 stars 6.67 score 4 dependencies 99 dependents
doFuture - Use Foreach to Parallelize via the Future Framework
The 'future' package provides a unifying parallelization framework for R that supports many parallel and distributed backends. The 'foreach' package provides a powerful API for iterating over an R expression in parallel. The 'doFuture' package brings the best of the two together. There are two alternative ways to use this package. The recommended approach is to use 'y <- foreach(...) %dofuture% { ... }', which does not require using 'registerDoFuture()' and has many advantages over '%dopar%'. The alternative is the traditional 'foreach' approach of registering the 'foreach' adapter with 'registerDoFuture()' so that 'y <- foreach(...) %dopar% { ... }' parallelizes via the 'future' framework.
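A short sketch of the recommended `%dofuture%` approach, assuming two local workers; the square-root loop is only a placeholder task.

```r
library(foreach)
library(future)
library(doFuture)

# Any future backend can be used; here, two local background sessions
plan(multisession, workers = 2)

# No registerDoFuture() needed with %dofuture%
y <- foreach(i = 1:4, .combine = c) %dofuture% {
  sqrt(i)
}
print(y)

plan(sequential)
```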
Last updated 7 months ago
batchjobs, batchtools, biocparallel, distributed-computing, foreach, hpc, hpc-clusters, parallel, plyr
83 stars 6.48 score 9 dependencies 77 dependents
illuminaio - Parsing Illumina Microarray Output Files
Tools for parsing Illumina's microarray output files, including IDAT.
Last updated 3 months ago
bioconductor-package
5.06 score 4 dependencies 44 dependents
R.matlab - Read and Write MAT Files and Call MATLAB from Within R
Methods readMat() and writeMat() for reading and writing MAT files. For users with MATLAB v6 or newer installed (either locally or on a remote host), the package also provides methods for controlling MATLAB (trademark) via R and sending and retrieving data between R and MATLAB.
Last updated 2 years ago
matlab
86 stars 4.98 score 3 dependencies 22 dependents
startup - Friendly R Startup Configuration
Adds support for R startup configuration via '.Renviron.d' and '.Rprofile.d' directories in addition to '.Renviron' and '.Rprofile' files. This makes it possible to keep private / secret environment variables separate from other environment variables. It also makes it easier to share specific startup settings by simply copying a file to a directory.
Last updated 8 months ago
configuration, environment-variables, startup, utility
161 stars 4.96 score 0 dependencies
illuminaio - Parsing Illumina Microarray Output Files
Tools for parsing Illumina's microarray output files, including IDAT.
Last updated 1 year ago
bioconductor-package, bioconductor
6 stars 4.75 score 4 dependencies 44 dependents
TopDom - An Efficient and Deterministic Method for Identifying Topological Domains in Genomes
The 'TopDom' method identifies topological domains in genomes from Hi-C sequence data (Shin et al., 2016 <doi:10.1093/nar/gkv1505>). The authors published an implementation of their method as an R script (two different versions; also available in this package). This package originates from those original 'TopDom' R scripts and provides help pages adapted from the original 'TopDom' PDF documentation. It also provides a small number of bug fixes to the original code.
Last updated 3 years ago
genomics, hic, topological-domains
18 stars 4.33 score 34 dependencies 1 dependent
future.batchtools - A Future API for Parallel and Distributed Processing using 'batchtools'
Implementation of the Future API on top of the 'batchtools' package. This allows you to process futures, as defined by the 'future' package, in parallel out of the box, not only on your local machine or ad-hoc cluster of machines, but also via high-performance compute ('HPC') job schedulers such as 'LSF', 'OpenLava', 'Slurm', 'SGE', and 'TORQUE' / 'PBS', e.g. 'y <- future.apply::future_lapply(files, FUN = process)'.
Last updated 7 months ago
distributed-computing, hpc, job-scheduler, parallel, pbs, sge, slurm, torque
83 stars 3.97 score 27 dependencies
affxparser - Affymetrix File Parsing SDK
Package for parsing Affymetrix files (CDF, CEL, CHP, BPMAP, BAR). It provides methods for fast and memory-efficient parsing of Affymetrix files using the Affymetrix Fusion SDK. Both ASCII- and binary-based files are supported. Currently, there are methods for reading chip definition files (CDF) and cell intensity files (CEL). These files can be read either in full or in part. For example, probe signals from a few probesets can be extracted very quickly from a set of CEL files into a convenient list structure.
Last updated 3 months ago
bioconductor-package
3.93 score 0 dependencies 15 dependents
future.callr - A Future API for Parallel Processing using 'callr'
Implementation of the Future API on top of the 'callr' package. This allows you to process futures, as defined by the 'future' package, in parallel out of the box, on your local (Linux, macOS, Windows, ...) machine. Contrary to backends relying on the 'parallel' package (e.g. 'future::multisession') and socket connections, the 'callr' backend provided here can run more than 125 parallel R processes.
Last updated 12 months ago
asynchronous, futures, parallel-computing, parallel-processing, parallelization, programming, promises
62 stars 3.64 score 10 dependencies 1 dependent
profmem - Simple Memory Profiling for R
A simple and light-weight API for memory profiling of R expressions. The profiling is built on top of R's built-in memory profiler ('utils::Rprofmem()'), which records every memory allocation done by R (including allocations made by native code).
Last updated 4 years ago
memory-profiler, performance, ram
33 stars 3.61 score 0 dependencies 10 dependents
aroma.light - Light-Weight Methods for Normalization and Visualization of Microarray Data using Only Basic R Data Types
Methods for microarray analysis that take basic data types such as matrices and lists of vectors. These methods can be used standalone, be utilized in other packages, or be wrapped up in higher-level classes.
Last updated 1 year ago
bioconductor-package, bioconductor
1 star 3.47 score 4 dependencies 20 dependents
aroma.light - Light-Weight Methods for Normalization and Visualization of Microarray Data using Only Basic R Data Types
Methods for microarray analysis that take basic data types such as matrices and lists of vectors. These methods can be used standalone, be utilized in other packages, or be wrapped up in higher-level classes.
Last updated 3 months ago
bioconductor-package
3.38 score 4 dependencies 20 dependents
affxparser - Affymetrix File Parsing SDK
Package for parsing Affymetrix files (CDF, CEL, CHP, BPMAP, BAR). It provides methods for fast and memory-efficient parsing of Affymetrix files using the Affymetrix Fusion SDK. Both ASCII- and binary-based files are supported. Currently, there are methods for reading chip definition files (CDF) and cell intensity files (CEL). These files can be read either in full or in part. For example, probe signals from a few probesets can be extracted very quickly from a set of CEL files into a convenient list structure.
Last updated 6 months ago
bioconductor-package, bioconductor
7 stars 3.36 score 0 dependencies 15 dependents
R.devices - Unified Handling of Graphics Devices
Functions for creating plots and image files in a unified way regardless of output format (EPS, PDF, PNG, SVG, TIFF, WMF, etc.). Default device options as well as scales and aspect ratios are controlled in a uniform way across all device types. Switching output format requires minimal changes in code. This package is ideal for large-scale batch processing, because it will never leave open graphics devices or incomplete image files behind, even on errors or user interrupts.
Last updated 6 months ago
graphics
17 stars 3.32 score 4 dependencies 13 dependents
PSCBS - Analysis of Parent-Specific DNA Copy Numbers
Segmentation of allele-specific DNA copy number data and detection of regions with abnormal copy number within each parental chromosome. Both tumor-normal paired and tumor-only analyses are supported.
Last updated 5 months ago
7 stars 3.18 score 13 dependencies 9 dependents
aroma.core - Core Methods and Classes Used by 'aroma.*' Packages Part of the Aroma Framework
Core methods and classes used by higher-level 'aroma.*' packages that are part of the Aroma Project, e.g. 'aroma.affymetrix' and 'aroma.cn'.
Last updated 2 years ago
1 star 2.09 score 20 dependencies 6 dependents
aroma.affymetrix - Analysis of Large Affymetrix Microarray Data Sets
A cross-platform R framework that facilitates processing of any number of Affymetrix microarray samples regardless of computer system. The only parameter that limits the number of chips that can be processed is the amount of available disk space. The Aroma Framework has successfully been used in studies to process tens of thousands of arrays. This package has actively been used since 2006.
Last updated 5 months ago
affymetrix, analysis, copy-number, dna, expression, hpc, large-scale, microarray, notebook, reproducibility, rna
10 stars 2.08 score 24 dependencies 3 dependents
port4me - Get the Same, Personal, Free 'TCP' Port over and over
An R implementation of the cross-platform, language-independent "port4me" algorithm (<https://github.com/HenrikBengtsson/port4me>), which (1) finds a free Transmission Control Protocol ('TCP') port in [1024,65535] that the user can open, (2) is designed to work in multi-user environments, (3) gives different users different ports, (4) gives the user the same port over time with high probability, (5) gives different ports for different software tools, and (6) requires no configuration.
Last updated 5 months ago
bash, cli, high-performance-computing, hpc, multi-tenant, multi-user, port, pypi-package, python, r-language, r-programming, tcp, utility
13 stars 2.01 score 0 dependencies
R.huge - Methods for Accessing Huge Amounts of Data [deprecated]
DEPRECATED. Do not start building new projects based on this package. Cross-platform alternatives are the following packages: bigmemory (CRAN), ff (CRAN), BufferedMatrix (Bioconductor). The main usage of it was inside the aroma.affymetrix package. (The package currently provides a class representing a matrix where the actual data is stored in a binary format on the local file system. This way the size limit of the data is set by the file system and not the memory.)
Last updated 6 months ago
1.83 score 3 dependencies 5 dependents
future.tests - Test Suite for 'Future API' Backends
Backends implementing the 'Future' API, as defined by the 'future' package, should use the tests provided by this package to validate that they meet the minimal requirements of the 'Future' API. The tests can be performed easily from within R or from outside of R from the command line making it straightforward to include them in package tests and in Continuous Integration (CI) pipelines.
Last updated 1 year ago
future, testing
10 stars 1.82 score 10 dependencies
seguid - Sequence Globally Unique Identifier ('SEGUID') Checksums for Linear, Circular, Single-Stranded and Double-Stranded Biological Sequences
An R implementation of the original Sequence Globally Unique Identifier ('SEGUID') algorithm [Babnigg and Giometti (2006) <doi:10.1002/pmic.200600032>] and 'SEGUID' v2 (<https://www.seguid.org>), which extends 'SEGUID' v1 with support for linear, circular, single- and double-stranded biological sequences, e.g. DNA, RNA, and proteins.
Last updated 5 months ago
seguid
1.64 score 2 dependencies
aroma.cn - Copy-Number Analysis of Large Microarray Data Sets
Methods for analyzing DNA copy-number data. Specifically, this package implements the multi-source copy-number normalization (MSCN) method for normalizing copy-number data obtained on various platforms and technologies. It also implements the TumorBoost method for normalizing paired tumor-normal SNP data.
Last updated 5 months ago
1 star 0.95 score 22 dependencies
future.tools - Tools for Working with Futures
Tools for Working with Futures.
Last updated 2 months ago
parallel-computing, parallel-programming
2 stars 0.71 score 37 dependencies
dChipIO - Methods for Reading dChip Files
Functions for reading DCP and CDF.bin files generated by the dChip software.
Last updated 9 years ago
0.62 score 0 dependencies
doFuture.tests.extra - Extra Test Sets for the 'doFuture' Package
Runs examples of packages that use 'foreach' and '%dopar%' for parallelization, where 'doFuture' is used as the 'foreach' adapter making it possible to use any future backend for parallelization. The package tests use these tools to test 'doFuture' with 'foreach'-based examples from packages 'BiocParallel', 'caret', 'doParallel', 'glmnet', 'NMF', 'plyr', and 'TSP'. These tests are run with many known future backends.
Last updated 8 months ago
futureverse, parallel, testsuite
0.09 score 10 dependencies
sfit - Multidimensional Simplex Fitting
Methods for robustly fitting a K-dimensional simplex in M dimensions.
Last updated 2 years ago
cconemodelsimplex
1 star 0.09 score 2 dependencies
HaarSeg - Fast and Flexible Microarray Segmentation
A fast and flexible method for the segmentation of aCGH data using the HaarSeg method by Ben-Yaacov and Eldar (2008) <doi:10.1093/bioinformatics/btn272>.
Last updated 2 years ago
legacy
0.00 score 0 dependencies
expectile - Modelling of Expectiles
Methods for fitting a simplex or polyhedral cone to multivariate data, for doing expectile regression and skyline/baseline estimation.
Last updated 2 years ago
0.00 score 1 dependency