Category Package Title Description
1 Bayesian Inference abc Tools for Approximate Bayesian Computation (ABC) Implements several ABC algorithms for performing parameter estimation, model selection, and goodness-of-fit testing. Cross-validation tools are also available for measuring the accuracy of ABC estimates and for calculating the misclassification probabilities of different models.
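The rejection-sampling idea behind ABC can be sketched in a few lines. This is a language-agnostic illustration in Python, not the ‘abc’ package's R API; the Gaussian model, uniform prior bounds, summary statistic and tolerance are all invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Observed data: 100 draws from N(true_mu, 1). We pretend the likelihood
# is intractable and estimate mu by ABC rejection sampling.
true_mu = 2.0
observed = rng.normal(true_mu, 1.0, size=100)
s_obs = observed.mean()  # summary statistic

# Draw parameters from the prior, simulate data under each draw, and keep
# draws whose simulated summary lies within `tol` of the observed summary.
n_sims, tol = 20000, 0.05
mu_prior = rng.uniform(-5, 5, size=n_sims)
s_sim = np.array([rng.normal(m, 1.0, 100).mean() for m in mu_prior])
accepted = mu_prior[np.abs(s_sim - s_obs) < tol]

# `accepted` is an approximate posterior sample for mu.
print(len(accepted), accepted.mean())
```

The accepted draws approximate the posterior increasingly well as the tolerance shrinks, at the cost of a lower acceptance rate; this trade-off is why cross-validation tools for assessing ABC accuracy matter in practice.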
2 Bayesian Inference abn Modelling Multivariate Data with Additive Bayesian Networks Bayesian network analysis is a form of probabilistic graphical modelling that derives, from empirical data, a directed acyclic graph (DAG) describing the dependency structure between random variables. An additive Bayesian network model is a DAG in which each node comprises a generalized linear model (GLM). Additive Bayesian network models are equivalent to Bayesian multivariate regression using graphical modelling: they generalise the usual multivariable regression (GLM) to multiple dependent variables. ‘abn’ provides routines to help determine optimal Bayesian network models for a given data set, where these models are used to identify statistical dependencies in messy, complex data. The additive formulation of these models is equivalent to multivariate generalised linear modelling (including mixed models with iid random effects). The usual term for this model selection process is structure discovery. The core functionality is concerned with model selection - determining the most robust empirical model of data from interdependent variables. Laplace approximations are used to estimate goodness-of-fit metrics and model parameters, and wrappers to the INLA package (available from http://www.r-inla.org) are also included. The testing version is recommended; it can be downloaded by running: source("http://www.math.ntnu.no/inla/givemeINLA-testing.R"). A comprehensive set of documented case studies, numerical accuracy/quality assurance exercises, and additional documentation are available from the ‘abn’ website.
3 Bayesian Inference AdMit Adaptive Mixture of Student-t Distributions Provides functions to perform the fitting of an adaptive mixture of Student-t distributions to a target density through its kernel function as described in Ardia et al. (2009) doi:10.18637/jss.v029.i03. The mixture approximation can then be used as the importance density in importance sampling or as the candidate density in the Metropolis-Hastings algorithm to obtain quantities of interest for the target density itself.
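The second use named above, a heavy-tailed candidate density inside an independence Metropolis-Hastings sampler, can be sketched as follows. This is a generic Python illustration, not the AdMit R API; the skew-normal target and the single Student-t candidate (rather than an adaptively fitted mixture) are simplifications chosen for the example:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Toy target: a skew-normal kernel (shape alpha = 3), known only up to a
# constant. A heavy-tailed Student-t candidate covers the target's tails.
def log_target(x):
    return stats.norm.logpdf(x) + stats.norm.logcdf(3 * x)

cand = stats.t(df=3, loc=0.5, scale=1.5)  # candidate (proposal) density

n_iter = 20000
ys = cand.rvs(size=n_iter, random_state=rng)  # pre-drawn candidates
lt = log_target(ys)                           # log target at candidates
lq = cand.logpdf(ys)                          # log candidate density
u = np.log(rng.uniform(size=n_iter))

draws = np.empty(n_iter)
x, cur_lt, cur_lq = 0.0, log_target(0.0), cand.logpdf(0.0)
for i in range(n_iter):
    # Independence MH acceptance ratio: [p(y)/p(x)] * [q(x)/q(y)]
    if u[i] < (lt[i] - cur_lt) + (cur_lq - lq[i]):
        x, cur_lt, cur_lq = ys[i], lt[i], lq[i]
    draws[i] = x

print(draws[2000:].mean())  # should approach the skew-normal mean
```

The same candidate could instead serve as the importance density in importance sampling; in either case, heavy tails in the candidate are what keep the weights or acceptance ratios well behaved.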
4 Bayesian Inference arm (core) Data Analysis Using Regression and Multilevel/Hierarchical Models Functions to accompany A. Gelman and J. Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models, Cambridge University Press, 2007.
5 Bayesian Inference AtelieR A GTK GUI for teaching basic concepts in statistical inference, and doing elementary Bayesian tests A collection of statistical simulation and computation tools with a GTK GUI, to help teach statistical concepts and compute probabilities. Two domains are covered: I. Understanding (Central-Limit Theorem and the Normal Distribution, Distribution of a sample mean, Distribution of a sample variance, Probability calculator for common distributions), and II. Elementary Bayesian Statistics (Bayesian inference on proportions, contingency tables, means and variances, with informative and noninformative priors).
6 Bayesian Inference BaBooN Bayesian Bootstrap Predictive Mean Matching - Multiple and Single Imputation for Discrete Data Included are two variants of Bayesian Bootstrap Predictive Mean Matching to multiply impute missing data. The first variant is a variable-by-variable imputation combining sequential regression and Predictive Mean Matching (PMM) that has been extended for unordered categorical data. The Bayesian Bootstrap allows for generating approximately proper multiple imputations. The second variant is also based on PMM, but the focus is on imputing several variables at the same time. The suggestion is to use this variant, if the missing-data pattern resembles a data fusion situation, or any other missing-by-design pattern, where several variables have identical missing-data patterns. Both variants can be run as ‘single imputation’ versions, in case the analysis objective is of a purely descriptive nature.
7 Bayesian Inference BACCO (core) Bayesian Analysis of Computer Code Output (BACCO) The BACCO bundle of packages is replaced by the BACCO package, which provides a vignette that illustrates the constituent packages (emulator, approximator, calibrator) in use.
8 Bayesian Inference BaM Functions and Datasets for Books by Jeff Gill Functions and datasets for Jeff Gill: “Bayesian Methods: A Social and Behavioral Sciences Approach”. First, Second, and Third Edition. Published by Chapman and Hall/CRC (2002, 2007, 2014).
9 Bayesian Inference BAS Bayesian Variable Selection and Model Averaging using Bayesian Adaptive Sampling Package for Bayesian Variable Selection and Model Averaging in linear models and generalized linear models using stochastic or deterministic sampling without replacement from posterior distributions. Prior distributions on coefficients are from Zellner’s g-prior or mixtures of g-priors corresponding to the Zellner-Siow Cauchy Priors or the mixture of g-priors from Liang et al (2008) doi:10.1198/016214507000001337 for linear models or mixtures of g-priors in GLMs of Li and Clyde (2015) <arXiv:1503.06913>. Other model selection criteria include AIC, BIC and Empirical Bayes estimates of g. Sampling probabilities may be updated based on the sampled models using sampling without replacement, or an efficient MCMC algorithm may sample models using the BAS tree structure as an efficient hash table. Uniform priors over all models or beta-binomial prior distributions on model size are allowed, and for large p truncated priors on the model space may be used. The user may force variables to always be included. Details behind the sampling algorithm are provided in Clyde, Ghosh and Littman (2010) doi:10.1198/jcgs.2010.09049. This material is based upon work supported by the National Science Foundation under Grant DMS-1106891. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
10 Bayesian Inference BayesDA Functions and Datasets for the book “Bayesian Data Analysis” Functions for Bayesian Data Analysis, with datasets from the book “Bayesian Data Analysis (second edition)” by Gelman, Carlin, Stern and Rubin. Not all datasets are included yet; the collection will hopefully be completed soon.
11 Bayesian Inference bayesGARCH Bayesian Estimation of the GARCH(1,1) Model with Student-t Innovations Provides the bayesGARCH() function which performs the Bayesian estimation of the GARCH(1,1) model with Student’s t innovations as described in Ardia (2008) doi:10.1007/978-3-540-78657-3.
12 Bayesian Inference bayesImageS Bayesian Methods for Image Segmentation using a Potts Model Various algorithms for segmentation of 2D and 3D images, such as computed tomography and satellite remote sensing. This package implements Bayesian image analysis using the hidden Potts model with external field prior. Latent labels are sampled using chequerboard updating or Swendsen-Wang. Algorithms for the smoothing parameter include pseudolikelihood, path sampling, the exchange algorithm, approximate Bayesian computation (ABC-MCMC and ABC-SMC), and Bayesian indirect likelihood (BIL).
13 Bayesian Inference bayesm (core) Bayesian Inference for Marketing/Micro-Econometrics Covers many important models used in marketing and micro-econometrics applications. The package includes: Bayes Regression (univariate or multivariate dep var), Bayes Seemingly Unrelated Regression (SUR), Binary and Ordinal Probit, Multinomial Logit (MNL) and Multinomial Probit (MNP), Multivariate Probit, Negative Binomial (Poisson) Regression, Multivariate Mixtures of Normals (including clustering), Dirichlet Process Prior Density Estimation with normal base, Hierarchical Linear Models with normal prior and covariates, Hierarchical Linear Models with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a Dirichlet Process prior and covariates, Hierarchical Negative Binomial Regression Models, Bayesian analysis of choice-based conjoint data, Bayesian treatment of linear instrumental variables models, Analysis of Multivariate Ordinal survey data with scale usage heterogeneity (as in Rossi et al, JASA (01)), Bayesian Analysis of Aggregate Random Coefficient Logit Models as in BLP (see Jiang, Manchanda, Rossi 2009). For further reference, consult our book, Bayesian Statistics and Marketing by Rossi, Allenby and McCulloch (Wiley 2005) and Bayesian Non- and Semi-Parametric Methods and Applications (Princeton U Press 2014).
14 Bayesian Inference bayesmeta Bayesian Random-Effects Meta-Analysis A collection of functions allowing one to derive the posterior distribution of the two parameters in a random-effects meta-analysis, and providing functionality to evaluate joint and marginal posterior probability distributions, predictive distributions, shrinkage effects, posterior predictive p-values, etc.
15 Bayesian Inference bayesmix Bayesian Mixture Models with JAGS The fitting of finite mixture models of univariate Gaussian distributions using JAGS within a Bayesian framework is provided.
16 Bayesian Inference bayesQR Bayesian Quantile Regression Bayesian quantile regression using the asymmetric Laplace distribution; both continuous and binary dependent variables are supported. The package consists of implementations of the methods of Yu & Moyeed (2001) doi:10.1016/S0167-7152(01)00124-9, Benoit & Van den Poel (2012) doi:10.1002/jae.1216 and Al-Hamzawi, Yu & Benoit (2012) doi:10.1177/1471082X1101200304. To speed up the calculations, the Markov chain Monte Carlo core of all algorithms is programmed in Fortran and called from R.
17 Bayesian Inference BayesSummaryStatLM MCMC Sampling of Bayesian Linear Models via Summary Statistics Methods for generating Markov Chain Monte Carlo (MCMC) posterior samples of Bayesian linear regression model parameters that require only summary statistics of data as input. Summary statistics are useful for systems with very limited amounts of physical memory. The package provides two functions: one function that computes summary statistics of data and one function that carries out the MCMC posterior sampling for Bayesian linear regression models where summary statistics are used as input. The function read.regress.data.ff utilizes the R package ‘ff’ to handle data sets that are too large to fit into a user’s physical memory, by reading in data in chunks.
18 Bayesian Inference bayesSurv (core) Bayesian Survival Regression with Flexible Error and Random Effects Distributions Contains Bayesian implementations of Mixed-Effects Accelerated Failure Time (MEAFT) models for censored data. Those can be not only right-censored but also interval-censored, doubly-interval-censored or misclassified interval-censored.
19 Bayesian Inference Bayesthresh Bayesian thresholds mixed-effects models for categorical data This package fits a linear mixed model for ordinal categorical responses using Bayesian inference via Markov chain Monte Carlo. The default is the Nandram & Chen algorithm, using a Gaussian link function and saving just the summaries of the chains. The package also allows two alternatives: using a Student’s t link function, and saving the full chains.
20 Bayesian Inference BayesTree Bayesian Additive Regression Trees This is an implementation of BART: Bayesian Additive Regression Trees, by Chipman, George, McCulloch (2010).
21 Bayesian Inference BayesValidate BayesValidate Package BayesValidate implements the software validation method described in the paper “Validation of Software for Bayesian Models using Posterior Quantiles” (Cook, Gelman, and Rubin, 2005). It inputs a function to perform Bayesian inference as well as functions to generate data from the Bayesian model being fit, and repeatedly generates and analyzes data to check that the Bayesian inference program works properly.
22 Bayesian Inference BayesVarSel Bayes Factors, Model Choice and Variable Selection in Linear Models Conceived to calculate Bayes factors in linear models and then to provide a formal Bayesian answer to testing and variable selection problems. From a theoretical side, the emphasis in this package is placed on the prior distributions, and it allows a wide range of them: Jeffreys (1961); Zellner and Siow (1980) doi:10.1007/bf02888369; Zellner and Siow (1984); Zellner (1986) doi:10.2307/2233941; Fernandez et al. (2001) doi:10.1016/S0304-4076(00)00076-2; Liang et al. (2008) doi:10.1198/016214507000001337 and Bayarri et al. (2012) doi:10.1214/12-aos1013. The interaction with the package is through a friendly interface that syntactically mimics the well-known lm() command of R. The resulting objects can be easily explored, providing the user with valuable information (like marginal, joint and conditional inclusion probabilities of potential variables; the highest posterior probability model, HPM; the median probability model, MPM) about the structure of the true (data-generating) model. Additionally, this package incorporates abilities to handle problems with a large number of potential explanatory variables through parallel and heuristic versions of the main commands, Garcia-Donato and Martinez-Beneito (2013) doi:10.1080/01621459.2012.742443.
23 Bayesian Inference BayesX R Utilities Accompanying the Software Package BayesX This package provides functionality for exploring and visualising estimation results obtained with the software package BayesX for structured additive regression. It also provides functions to read, write and manipulate map objects that are required in spatial analyses performed with BayesX, a free software for estimating structured additive regression models (http://www.bayesx.org).
24 Bayesian Inference BayHaz R Functions for Bayesian Hazard Rate Estimation A suite of R functions for Bayesian estimation of smooth hazard rates via Compound Poisson Process (CPP) and Bayesian Penalized Spline (BPS) priors.
25 Bayesian Inference BAYSTAR On Bayesian analysis of Threshold autoregressive model (BAYSTAR) Provides functionality for Bayesian estimation of threshold autoregressive (TAR) models.
26 Bayesian Inference bbemkr Bayesian bandwidth estimation for multivariate kernel regression with Gaussian error Bayesian bandwidth estimation for Nadaraya-Watson type multivariate kernel regression with Gaussian error density.
27 Bayesian Inference BCBCSF Bias-Corrected Bayesian Classification with Selected Features Fully Bayesian classification with a subset of high-dimensional features, such as expression levels of genes. The data are modeled with a hierarchical Bayesian model using heavy-tailed t distributions as priors. When a large number of features are available, one may like to select only a subset of features to use, typically those features strongly correlated with the response in training cases. Such a feature selection procedure is, however, invalid, since the relationship between the response and the features has been exaggerated by feature selection. This package provides a way to avoid this bias and yield better-calibrated predictions for future cases when the F-statistic is used to select features.
28 Bayesian Inference BCE Bayesian composition estimator: estimating sample (taxonomic) composition from biomarker data Function to estimate taxonomic compositions from biomarker data, using a Bayesian approach.
29 Bayesian Inference bclust Bayesian Hierarchical Clustering Using Spike and Slab Models Builds a dendrogram using the log posterior as a natural distance defined by the model, weighting the clustering variables in the process. It is also capable of computing equivalent Bayesian discrimination probabilities. The adopted method suits the small-sample, large-dimension setting. The model parameter estimation may be difficult, depending on data structure and the chosen distribution family.
30 Bayesian Inference bcp Bayesian Analysis of Change Point Problems Provides an implementation of the Barry and Hartigan (1993) product partition model for the normal errors change point problem using Markov Chain Monte Carlo. It also extends the methodology to regression models on a connected graph (Wang and Emerson, 2015); this allows estimation of change point models with multivariate responses. Parallel MCMC, previously available in bcp v.3.0.0, is currently not implemented.
31 Bayesian Inference bisoreg Bayesian Isotonic Regression with Bernstein Polynomials Provides functions for fitting Bayesian monotonic regression models to data.
32 Bayesian Inference BLR Bayesian Linear Regression Bayesian Linear Regression
33 Bayesian Inference BMA Bayesian Model Averaging Package for Bayesian model averaging and variable selection for linear models, generalized linear models and survival models (Cox regression).
34 Bayesian Inference Bmix Bayesian Sampling for Stick-Breaking Mixtures This is a bare-bones implementation of sampling algorithms for a variety of Bayesian stick-breaking (marginally DP) mixture models, including particle learning and Gibbs sampling for static DP mixtures, particle learning for dynamic BAR stick-breaking, and DP mixture regression. The software is designed to be easy to customize to suit different situations and for experimentation with stick-breaking models. Since particles are repeatedly copied, it is not an especially efficient implementation.
35 Bayesian Inference BMS Bayesian Model Averaging Library Bayesian model averaging for linear models with a wide choice of (customizable) priors. Built-in priors include coefficient priors (fixed, flexible and hyper-g priors) and 5 kinds of model priors; model sampling is by enumeration or various MCMC approaches. Post-processing functions allow for inferring posterior inclusion and model probabilities, various moments, and coefficient and predictive densities. Plotting functions are available for posterior model size, MCMC convergence, predictive and coefficient densities, best-models representation, and BMA comparison.
36 Bayesian Inference bnlearn Bayesian Network Structure Learning, Parameter Learning and Inference Bayesian network structure learning, parameter learning and inference. This package implements constraint-based (GS, IAMB, Inter-IAMB, Fast-IAMB, MMPC, Hiton-PC), pairwise (ARACNE and Chow-Liu), score-based (Hill-Climbing and Tabu Search) and hybrid (MMHC and RSMAX2) structure learning algorithms for discrete, Gaussian and conditional Gaussian networks, along with many score functions and conditional independence tests. The Naive Bayes and the Tree-Augmented Naive Bayes (TAN) classifiers are also implemented. Some utility functions (model comparison and manipulation, random data generation, arc orientation testing, simple and advanced plots) are included, as well as support for parameter estimation (maximum likelihood and Bayesian) and inference, conditional probability queries and cross-validation. Development snapshots with the latest bugfixes are available from http://www.bnlearn.com.
37 Bayesian Inference boa (core) Bayesian Output Analysis Program (BOA) for MCMC A menu-driven program and library of functions for carrying out convergence diagnostics and statistical and graphical analysis of Markov chain Monte Carlo sampling output.
38 Bayesian Inference Bolstad Functions for Elementary Bayesian Inference A set of R functions and data sets for the book Introduction to Bayesian Statistics, Bolstad, W.M. (2017), John Wiley & Sons ISBN 978-1-118-09156-2.
39 Bayesian Inference Boom Bayesian Object Oriented Modeling A C++ library for Bayesian modeling, with an emphasis on Markov chain Monte Carlo. Although Boom contains a few R utilities (mainly plotting functions), its primary purpose is to install the BOOM C++ library on your system so that other packages can link against it.
40 Bayesian Inference BoomSpikeSlab MCMC for Spike and Slab Regression Spike and slab regression a la McCulloch and George (1997).
41 Bayesian Inference bqtl Bayesian QTL Mapping Toolkit QTL mapping toolkit for inbred crosses and recombinant inbred lines. Includes maximum likelihood and Bayesian tools.
42 Bayesian Inference bridgesampling Bridge Sampling for Marginal Likelihoods and Bayes Factors Provides functions for estimating marginal likelihoods, Bayes factors, posterior model probabilities, and normalizing constants in general, via different versions of bridge sampling (Meng & Wong, 1996, <http://www3.stat.sinica.edu.tw/statistica/j6n4/j6n43/j6n43.htm>).
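The bridge identity can be demonstrated on a case with a known answer. A minimal Python sketch, not the bridgesampling R API: it estimates the normalizing constant of the unnormalized kernel q1(x) = exp(-x²/2), whose true constant is √(2π) ≈ 2.5066, using the simple geometric bridge rather than the iteratively optimized bridge a real implementation would use:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# With the geometric bridge h = (q1 * p2)^(-1/2), Meng & Wong's identity
# reduces to:  Z1 = E_p2[sqrt(q1/p2)] / E_p1[sqrt(p2/q1)],
# where p1 is the normalized target and p2 a normalized proposal density.
q1 = lambda x: np.exp(-0.5 * x**2)  # unnormalized target kernel
p2 = stats.norm(0, 1.2)             # normalized proposal, N(0, 1.2^2)

n = 50000
x1 = rng.normal(0, 1, size=n)           # draws from the normalized target
x2 = p2.rvs(size=n, random_state=rng)   # draws from the proposal

num = np.mean(np.sqrt(q1(x2) / p2.pdf(x2)))
den = np.mean(np.sqrt(p2.pdf(x1) / q1(x1)))
z_hat = num / den

print(z_hat)  # close to sqrt(2*pi) ~ 2.5066
```

Because the estimator averages over draws from both densities, it remains stable even when neither density dominates the other, which is the practical advantage of bridge sampling over plain importance sampling for marginal likelihoods.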
43 Bayesian Inference bspec Bayesian Spectral Inference Bayesian inference on the (discrete) power spectrum of time series.
44 Bayesian Inference bspmma bspmma: Bayesian Semiparametric Models for Meta-Analysis Some functions for nonparametric and semiparametric Bayesian models for random-effects meta-analysis.
45 Bayesian Inference BSquare Bayesian Simultaneous Quantile Regression This package models the quantile process as a function of predictors.
46 Bayesian Inference bsts Bayesian Structural Time Series Time series regression using dynamic linear models fit using MCMC. See Scott and Varian (2014) doi:10.1504/IJMMNO.2014.059942, among many other sources.
47 Bayesian Inference BVS Bayesian Variant Selection: Bayesian Model Uncertainty Techniques for Genetic Association Studies The functions in this package focus on analyzing case-control association studies involving a group of genetic variants. In particular, we are interested in modeling the outcome variable as a function of a multivariate genetic profile using Bayesian model uncertainty and variable selection techniques. The package incorporates functions to analyze data sets involving common variants as well as extensions to model rare variants via the Bayesian Risk Index (BRI) as well as haplotypes. Finally, the package also allows the incorporation of external biological information to inform the marginal inclusion probabilities via the iBMU.
48 Bayesian Inference catnet Categorical Bayesian Network Inference Structure learning and parameter estimation of discrete Bayesian networks using likelihood-based criteria. Exhaustive search for fixed node orders and stochastic search of optimal orders via a simulated annealing algorithm are implemented.
49 Bayesian Inference coalescentMCMC MCMC Algorithms for the Coalescent Flexible framework for coalescent analyses in R. It includes a main function running the MCMC algorithm, auxiliary functions for tree rearrangement, and some functions to compute population genetic parameters.
50 Bayesian Inference coda (core) Output Analysis and Diagnostics for MCMC Provides functions for summarizing and plotting the output from Markov Chain Monte Carlo (MCMC) simulations, as well as diagnostic tests of convergence to the equilibrium distribution of the Markov chain.
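One convergence-related summary such packages report is the effective sample size: the chain length divided by the integrated autocorrelation time. A generic Python sketch of the idea (the AR(1) chain stands in for sampler output; coda's effectiveSize() uses a spectral-density estimate rather than this truncated-autocorrelation rule):

```python
import numpy as np

rng = np.random.default_rng(2)

# A deliberately autocorrelated chain: AR(1) with coefficient rho,
# mimicking typical MCMC output. ESS should be well below the raw length.
rho, n = 0.9, 50000
chain = np.empty(n)
chain[0] = rng.normal()
for t in range(1, n):
    chain[t] = rho * chain[t - 1] + rng.normal()

def effective_sample_size(x):
    """ESS = n / (1 + 2 * sum of positive-lag autocorrelations)."""
    x = x - x.mean()
    denom = np.dot(x, x)
    s = 0.0
    for k in range(1, len(x) // 2):
        r = np.dot(x[:-k], x[k:]) / denom  # lag-k autocorrelation
        if r <= 0:                         # truncate at first non-positive lag
            break
        s += r
    return len(x) / (1 + 2 * s)

ess = effective_sample_size(chain)
print(ess)  # far below n = 50000 for rho = 0.9
```

For an AR(1) chain the integrated autocorrelation time is (1 + rho) / (1 - rho) = 19, so the estimate should land near n/19; diagnostics like this are what tell a user how long to run a sampler.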
51 Bayesian Inference cudaBayesreg CUDA Parallel Implementation of a Bayesian Multilevel Model for fMRI Data Analysis Compute Unified Device Architecture (CUDA) is a software platform for massively parallel high-performance computing on NVIDIA GPUs. This package provides a CUDA implementation of a Bayesian multilevel model for the analysis of brain fMRI data. An fMRI data set consists of time series of volume data in 4D space. Typically, volumes are collected as slices of 64 x 64 voxels. Analysis of fMRI data often relies on fitting linear regression models at each voxel of the brain. The volume of the data to be processed, and the type of statistical analysis to perform in fMRI analysis, call for high-performance computing strategies. In this package, the CUDA programming model uses a separate thread for fitting a linear regression model at each voxel in parallel. The global statistical model implements a Gibbs Sampler for hierarchical linear models with a normal prior. This model has been proposed by Rossi, Allenby and McCulloch in ‘Bayesian Statistics and Marketing’, Chapter 3, and is referred to as ‘rhierLinearModel’ in the R-package bayesm. A notebook equipped with an NVIDIA ‘GeForce 8400M GS’ card having Compute Capability 1.1 has been used in the tests. The data sets used in the package’s examples are available in the separate package cudaBayesregData.
52 Bayesian Inference dclone Data Cloning and MCMC Tools for Maximum Likelihood Methods Low level functions for implementing maximum likelihood estimating procedures for complex models using data cloning and Bayesian Markov chain Monte Carlo methods. Sequential and parallel MCMC support for JAGS, WinBUGS and OpenBUGS.
53 Bayesian Inference deal Learning Bayesian Networks with Mixed Variables Bayesian networks with continuous and/or discrete variables can be learned and compared from data.
54 Bayesian Inference deBInfer Bayesian Inference for Differential Equations A Bayesian framework for parameter inference in differential equations. This approach offers a rigorous methodology for parameter inference as well as modeling the link between unobservable model states and parameters, and observable quantities. Provides templates for the DE model, the observation model and data likelihood, and the model parameters and their prior distributions. A Markov chain Monte Carlo (MCMC) procedure processes these inputs to estimate the posterior distributions of the parameters and any derived quantities, including the model trajectories. Further functionality is provided to facilitate MCMC diagnostics and the visualisation of the posterior distributions of model parameters and trajectories.
55 Bayesian Inference dlm Bayesian and Likelihood Analysis of Dynamic Linear Models Maximum likelihood, Kalman filtering and smoothing, and Bayesian analysis of Normal linear State Space models, also known as Dynamic Linear Models.
56 Bayesian Inference DPpackage (core) Bayesian Nonparametric Modeling in R Functions to perform inference via simulation from the posterior distributions for Bayesian nonparametric and semiparametric models. Although the name of the package was motivated by the Dirichlet Process prior, the package considers and will consider other priors on functional spaces. So far, DPpackage includes models considering Dirichlet Processes, Dependent Dirichlet Processes, Dependent Poisson-Dirichlet Processes, Hierarchical Dirichlet Processes, Polya Trees, Linear Dependent Tailfree Processes, Mixtures of Triangular distributions, Random Bernstein polynomials priors and Dependent Bernstein Polynomials. The package also includes models considering Penalized B-Splines. Includes semiparametric models for marginal and conditional density estimation, ROC curve analysis, interval censored data, binary regression models, generalized linear mixed models, IRT type models, and generalized additive models. Also contains functions to compute Pseudo-Bayes factors for model comparison, and to elicit the precision parameter of the Dirichlet Process. To maximize computational efficiency, the actual sampling for each model is done in compiled FORTRAN. The functions return objects which can be subsequently analyzed with functions provided in the ‘coda’ package.
57 Bayesian Inference EbayesThresh Empirical Bayes Thresholding and Related Methods Empirical Bayes thresholding using the methods developed by I. M. Johnstone and B. W. Silverman. The basic problem is to estimate a mean vector given a vector of observations of the mean vector plus white noise, taking advantage of possible sparsity in the mean vector. Within a Bayesian formulation, the elements of the mean vector are modelled as having, independently, a distribution that is a mixture of an atom of probability at zero and a suitable heavy-tailed distribution. The mixing parameter can be estimated by a marginal maximum likelihood approach. This leads to an adaptive thresholding approach on the original data. Extensions of the basic method, in particular to wavelet thresholding, are also implemented within the package.
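The core idea, estimating the spike/slab mixing weight from the data and then shrinking, can be sketched compactly. A simplified Python illustration, not the EbayesThresh R API: a normal slab stands in for the heavy-tailed slab of Johnstone & Silverman, the slab scale tau is taken as known, and the mixing weight is fit by marginal maximum likelihood on a grid:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Sparse-means setup: observe x_i = theta_i + N(0,1) noise, where most
# theta_i are exactly zero and the rest come from a N(0, tau^2) slab.
n, tau = 2000, 5.0
theta = np.where(rng.uniform(size=n) < 0.1, rng.normal(0, tau, size=n), 0.0)
x = theta + rng.normal(size=n)

# Marginal density: x ~ w*N(0, 1+tau^2) + (1-w)*N(0, 1).
# Estimate the mixing weight w by marginal maximum likelihood (grid search).
grid = np.linspace(0.01, 0.5, 100)
slab = stats.norm.pdf(x, 0, np.sqrt(1 + tau**2))
spike = stats.norm.pdf(x, 0, 1)
loglik = [np.sum(np.log(w * slab + (1 - w) * spike)) for w in grid]
w_hat = grid[np.argmax(loglik)]

# Posterior probability that each mean is nonzero, and the resulting
# shrinkage estimate: small |x_i| are pulled hard toward zero, large
# |x_i| are (almost) kept -- an adaptive thresholding behaviour.
p_nonzero = w_hat * slab / (w_hat * slab + (1 - w_hat) * spike)
post_mean = p_nonzero * (tau**2 / (1 + tau**2)) * x
```

Because w is learned from the data, the implied threshold adapts to the sparsity of the mean vector, which is the point of the empirical Bayes approach.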
58 Bayesian Inference ebdbNet Empirical Bayes Estimation of Dynamic Bayesian Networks Infer the adjacency matrix of a network from time course data using an empirical Bayes estimation procedure based on Dynamic Bayesian Networks.
59 Bayesian Inference eco Ecological Inference in 2x2 Tables Implements the Bayesian and likelihood methods proposed in Imai, Lu, and Strauss (2008 doi:10.1093/pan/mpm017) and (2011 doi:10.18637/jss.v042.i05) for ecological inference in 2 by 2 tables as well as the method of bounds introduced by Duncan and Davis (1953). The package fits both parametric and nonparametric models using either the Expectation-Maximization algorithms (for likelihood models) or the Markov chain Monte Carlo algorithms (for Bayesian models). For all models, the individual-level data can be directly incorporated into the estimation whenever such data are available. Along with in-sample and out-of-sample predictions, the package also provides a functionality which allows one to quantify the effect of data aggregation on parameter estimation and hypothesis testing under the parametric likelihood models.
60 Bayesian Inference eigenmodel Semiparametric factor and regression models for symmetric relational data This package estimates the parameters of a model for symmetric relational data (e.g., the above-diagonal part of a square matrix), using a model-based eigenvalue decomposition and regression. Missing data is accommodated, and a posterior mean for missing data is calculated under the assumption that the data are missing at random. The marginal distribution of the relational data can be arbitrary, and is fit with an ordered probit specification.
61 Bayesian Inference ensembleBMA Probabilistic Forecasting using Ensembles and Bayesian Model Averaging Bayesian Model Averaging to create probabilistic forecasts from ensemble forecasts and weather observations.
62 Bayesian Inference evdbayes Bayesian Analysis in Extreme Value Theory Provides functions for the Bayesian analysis of extreme value models, using MCMC methods.
63 Bayesian Inference exactLoglinTest Monte Carlo Exact Tests for Log-linear models Monte Carlo and MCMC goodness-of-fit tests for log-linear models.
64 Bayesian Inference factorQR Bayesian quantile regression factor models Package to fit Bayesian quantile regression models that assume a factor structure for at least part of the design matrix.
65 Bayesian Inference FME A Flexible Modelling Environment for Inverse Modelling, Sensitivity, Identifiability and Monte Carlo Analysis Provides functions to help in fitting models to data, and to perform Monte Carlo, sensitivity and identifiability analysis. It is intended to work with models written as a set of differential equations that are solved either by an integration routine from package ‘deSolve’, or a steady-state solver from package ‘rootSolve’. However, the methods can also be used with other types of functions.
66 Bayesian Inference geoR Analysis of Geostatistical Data Geostatistical analysis including traditional, likelihood-based and Bayesian methods.
67 Bayesian Inference geoRglm A Package for Generalised Linear Spatial Models Functions for inference in generalised linear spatial models. The posterior and predictive inference is based on Markov chain Monte Carlo methods. Package geoRglm is an extension to the package geoR, which must be installed first.
68 Bayesian Inference ggmcmc Tools for Analyzing MCMC Simulations from Bayesian Inference Tools for assessing and diagnosing convergence of Markov chain Monte Carlo simulations, as well as for graphically displaying results from full MCMC analyses. The package also facilitates the graphical interpretation of models by providing flexible functions to plot the results against observed variables.
69 Bayesian Inference glmmBUGS Generalised Linear Mixed Models with BUGS and JAGS Automates running Generalized Linear Mixed Models, including spatial models, with WinBUGS, OpenBUGS and JAGS. Models are specified with formulas, with the package writing model files, arranging unbalanced data in ragged arrays, and creating starting values. The model is re-parameterized, and functions are provided for converting model outputs to the original parameterization.
70 Bayesian Inference gRain Graphical Independence Networks Probability propagation in graphical independence networks, also known as Bayesian networks or probabilistic expert systems.
71 Bayesian Inference growcurves Bayesian Semi and Nonparametric Growth Curve Models that Additionally Include Multiple Membership Random Effects Employs a non-parametric formulation for by-subject random effect parameters to borrow strength over a constrained number of repeated measurement waves in a fashion that permits multiple effects per subject. One class of models employs a Dirichlet process (DP) prior for the subject random effects and includes an additional set of random effects that utilize a different grouping factor and are mapped back to clients through a multiple membership weight matrix; e.g. treatment(s) exposure or dosage. A second class of models employs a dependent DP (DDP) prior for the subject random effects that directly incorporates the multiple membership pattern.
72 Bayesian Inference hbsae Hierarchical Bayesian Small Area Estimation Functions to compute small area estimates based on a basic area or unit-level model. The model is fit using restricted maximum likelihood, or in a hierarchical Bayesian way. In the latter case numerical integration is used to average over the posterior density for the between-area variance. The output includes the model fit, small area estimates and corresponding MSEs, as well as some model selection measures. Additional functions provide means to compute aggregate estimates and MSEs, to minimally adjust the small area estimates to benchmarks at a higher aggregation level, and to graphically compare different sets of small area estimates.
73 Bayesian Inference HI Simulation from distributions supported by nested hyperplanes Simulation from distributions supported by nested hyperplanes, using the algorithm described in Petris & Tardella, “A geometric approach to transdimensional Markov chain Monte Carlo”, Canadian Journal of Statistics, v.31, n.4, (2003). Also random direction multivariate Adaptive Rejection Metropolis Sampling.
74 Bayesian Inference Hmisc Harrell Miscellaneous Contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, importing and annotating datasets, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of R objects to LaTeX and html code, and recoding variables.
75 Bayesian Inference iterLap Approximate Probability Densities by Iterated Laplace Approximations The iterLap (iterated Laplace approximation) algorithm approximates a general (possibly non-normalized) probability density on R^p, by repeated Laplace approximations to the difference between current approximation and true density (on log scale). The final approximation is a mixture of multivariate normal distributions and might be used for example as a proposal distribution for importance sampling (e.g., in Bayesian applications). The algorithm can be seen as a computational generalization of the Laplace approximation suitable for skew or multimodal densities.
76 Bayesian Inference LaplacesDemon Complete Environment for Bayesian Inference Provides a complete environment for Bayesian inference using a variety of different samplers (see ?LaplacesDemon for an overview). The README describes the history of the package development process.
77 Bayesian Inference LearnBayes Functions for Learning Bayesian Inference LearnBayes contains a collection of functions helpful in learning the basic tenets of Bayesian statistical inference. It contains functions for summarizing basic one and two parameter posterior distributions and predictive distributions. It contains MCMC algorithms for summarizing posterior distributions defined by the user. It also contains functions for regression models, hierarchical models, Bayesian tests, and illustrations of Gibbs sampling.
78 Bayesian Inference lme4 Linear Mixed-Effects Models using ‘Eigen’ and S4 Fit linear and generalized linear mixed-effects models. The models and their components are represented using S4 classes and methods. The core computational algorithms are implemented using the ‘Eigen’ C++ library for numerical linear algebra and ‘RcppEigen’ “glue”.
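A minimal sketch of the lme4 formula interface (assuming the package is installed; the sleepstudy data set ships with lme4):

```r
library(lme4)

# Random-intercept and random-slope model: reaction time as a function of
# days of sleep deprivation, with by-subject variation in both terms.
fit <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

summary(fit)  # fixed effects and variance components
fixef(fit)    # estimated fixed-effect coefficients (intercept and slope)
```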
79 Bayesian Inference lmm Linear Mixed Models Some improved procedures for linear mixed models.
80 Bayesian Inference MasterBayes ML and MCMC Methods for Pedigree Reconstruction and Analysis The primary aim of MasterBayes is to use MCMC techniques to integrate over uncertainty in pedigree configurations estimated from molecular markers and phenotypic data. Emphasis is put on the marginal distribution of parameters that relate the phenotypic data to the pedigree. All simulation is done in compiled C++ for efficiency.
81 Bayesian Inference matchingMarkets Analysis of Stable Matchings Implements structural estimators to correct for the sample selection bias from observed outcomes in matching markets. This includes one-sided matching of agents into groups as well as two-sided matching of students to schools. The package also contains algorithms to find stable matchings in the three most common matching problems: the stable roommates problem, the college admissions problem, and the house allocation problem.
82 Bayesian Inference mcmc (core) Markov Chain Monte Carlo Simulates continuous distributions of random vectors using Markov chain Monte Carlo (MCMC). Users specify the distribution by an R function that evaluates the log unnormalized density. Algorithms are random walk Metropolis algorithm (function metrop), simulated tempering (function temper), and morphometric random walk Metropolis (Johnson and Geyer, 2012, https://doi.org/10.1214/12-AOS1048, function morph.metrop), which achieves geometric ergodicity by change of variable.
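A minimal sketch of the metrop() workflow just described: the user supplies only an R function evaluating the log unnormalized density (here a standard bivariate normal, an illustrative choice):

```r
library(mcmc)

# Log unnormalized density of a standard bivariate normal
lud <- function(x) -0.5 * sum(x^2)

set.seed(42)
out <- metrop(lud, initial = c(0, 0), nbatch = 1000)

out$accept           # acceptance rate of the random walk Metropolis chain
colMeans(out$batch)  # sample means, near (0, 0) for this target
```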
83 Bayesian Inference MCMCglmm MCMC Generalised Linear Mixed Models MCMC Generalised Linear Mixed Models.
84 Bayesian Inference MCMCpack (core) Markov Chain Monte Carlo (MCMC) Package Contains functions to perform Bayesian inference using posterior simulation for a number of statistical models. Most simulation is done in compiled C++ written in the Scythe Statistical Library Version 1.0.3. All models return coda mcmc objects that can then be summarized using the coda package. Some useful utility functions such as density functions, pseudo-random number generators for statistical distributions, a general purpose Metropolis sampling algorithm, and tools for visualization are provided.
85 Bayesian Inference mgcv Mixed GAM Computation Vehicle with Automatic Smoothness Estimation Generalized additive (mixed) models, some of their extensions and other generalized ridge regression with multiple smoothing parameter estimation by (Restricted) Marginal Likelihood, Generalized Cross Validation and similar. Includes a gam() function, a wide variety of smoothers, JAGS support and distributions beyond the exponential family.
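A minimal gam() sketch with REML smoothness selection (simulated data; gamSim() is mgcv's built-in example-data generator):

```r
library(mgcv)

set.seed(1)
dat <- gamSim(1, n = 200, verbose = FALSE)  # standard mgcv test data

# Additive model with smooth terms; smoothing parameters chosen by REML
fit <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")
summary(fit)
```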
86 Bayesian Inference mlogitBMA Bayesian Model Averaging for Multinomial Logit Models Provides a modified function bic.glm of the BMA package that can be applied to multinomial logit (MNL) data. The data is converted to binary logit using the Begg & Gray approximation. The package also contains functions for maximum likelihood estimation of MNL.
87 Bayesian Inference MNP R Package for Fitting the Multinomial Probit Model Fits the Bayesian multinomial probit model via Markov chain Monte Carlo. The multinomial probit model is often used to analyze the discrete choices made by individuals recorded in survey data. Examples where the multinomial probit model may be useful include the analysis of product choice by consumers in market research and the analysis of candidate or party choice by voters in electoral studies. The MNP package can also fit the model with different choice sets for each individual, and complete or partial individual choice orderings of the available alternatives from the choice set. The estimation is based on the efficient marginal data augmentation algorithm developed by Imai and van Dyk (2005). “A Bayesian Analysis of the Multinomial Probit Model Using Marginal Data Augmentation,” Journal of Econometrics, Vol. 124, No. 2 (February), pp. 311-334. doi:10.1016/j.jeconom.2004.02.002 Detailed examples are given in Imai and van Dyk (2005). “MNP: R Package for Fitting the Multinomial Probit Model.” Journal of Statistical Software, Vol. 14, No. 3 (May), pp. 1-32. doi:10.18637/jss.v014.i03.
88 Bayesian Inference mombf Moment and Inverse Moment Bayes Factors Model selection and parameter estimation based on non-local and Zellner priors. Bayes factors, marginal densities and variable selection in regression setups. Routines to sample, evaluate prior densities, distribution functions and quantiles are included.
89 Bayesian Inference monomvn Estimation for Multivariate Normal and Student-t Data with Monotone Missingness Estimation of multivariate normal and student-t data of arbitrary dimension where the pattern of missing data is monotone. Through the use of parsimonious/shrinkage regressions (plsr, pcr, lasso, ridge, etc.), where standard regressions fail, the package can handle a nearly arbitrary amount of missing data. The current version supports maximum likelihood inference and a full Bayesian approach employing scale-mixtures for Gibbs sampling. Monotone data augmentation extends this Bayesian approach to arbitrary missingness patterns. A fully functional standalone interface to the Bayesian lasso (from Park & Casella), Normal-Gamma (from Griffin & Brown), Horseshoe (from Carvalho, Polson, & Scott), and ridge regression with model selection via Reversible Jump, and student-t errors (from Geweke) is also provided.
90 Bayesian Inference MSBVAR Markov-Switching, Bayesian, Vector Autoregression Models Provides methods for estimating frequentist and Bayesian Vector Autoregression (VAR) models and Markov-switching Bayesian VAR (MSBVAR) models. Functions for reduced form and structural VAR models are also available. Includes methods for generating posterior inferences for these models, forecasts, impulse responses (using likelihood-based error bands), and forecast error decompositions. Also includes utility functions for plotting forecasts and impulse responses, and generating draws from Wishart and singular multivariate normal densities. Current version includes functionality to build and evaluate models with Markov switching.
91 Bayesian Inference NetworkChange Bayesian Package for Network Changepoint Analysis Network changepoint analysis for undirected network data. The package implements a hidden Markov multilinear tensor regression model (Park and Sohn, 2017, http://jhp.snu.ac.kr/NetworkChange.pdf). Functions for break number detection using the approximate marginal likelihood and WAIC are also provided.
92 Bayesian Inference nimble MCMC, Particle Filtering, and Programmable Hierarchical Modeling A system for writing hierarchical statistical models largely compatible with ‘BUGS’ and ‘JAGS’, writing nimbleFunctions to operate models and do basic R-style math, and compiling both models and nimbleFunctions via custom-generated C++. ‘NIMBLE’ includes default methods for MCMC, particle filtering, Monte Carlo Expectation Maximization, and some other tools. The nimbleFunction system makes it easy to do things like implement new MCMC samplers from R, customize the assignment of samplers to different parts of a model from R, and compile the new samplers automatically via C++ alongside the samplers ‘NIMBLE’ provides. ‘NIMBLE’ extends the ‘BUGS’/‘JAGS’ language by making it extensible: New distributions and functions can be added, including as calls to external compiled code. Although most people think of MCMC as the main goal of the ‘BUGS’/‘JAGS’ language for writing models, one can use ‘NIMBLE’ for writing arbitrary other kinds of model-generic algorithms as well. A full User Manual is available at http://r-nimble.org.
93 Bayesian Inference openEBGM EBGM Scores for Mining Large Contingency Tables An implementation of DuMouchel’s (1999) doi:10.1080/00031305.1999.10474456 Bayesian data mining method for the market basket problem. Calculates Empirical Bayes Geometric Mean (EBGM) and quantile scores from the posterior distribution using the Gamma-Poisson Shrinker (GPS) model to find unusually large cell counts in large, sparse contingency tables. Can be used to find unusually high reporting rates of adverse events associated with products. In general, can be used to mine any database where the co-occurrence of two variables or items is of interest. Also calculates relative and proportional reporting ratios. Builds on the work of the ‘PhViD’ package, from which much of the code is derived. Some of the added features include stratification to adjust for confounding variables and data squashing to improve computational efficiency.
94 Bayesian Inference pacbpred PAC-Bayesian Estimation and Prediction in Sparse Additive Models This package is intended to perform estimation and prediction in high-dimensional additive models, using a sparse PAC-Bayesian point of view and an MCMC algorithm. The method is fully described in Guedj and Alquier (2013), ‘PAC-Bayesian Estimation and Prediction in Sparse Additive Models’, Electronic Journal of Statistics, 7, 264-291.
95 Bayesian Inference PAWL Implementation of the PAWL algorithm Implementation of the Parallel Adaptive Wang-Landau algorithm. Also implemented for comparison: parallel adaptive Metropolis-Hastings and an SMC sampler.
96 Bayesian Inference PottsUtils Utility Functions of the Potts Models A package including several functions related to the Potts models.
97 Bayesian Inference predmixcor Classification rule based on Bayesian mixture models with feature selection bias corrected The function “train_predict_mix” predicts the binary response with binary features.
98 Bayesian Inference PReMiuM Dirichlet Process Bayesian Clustering, Profile Regression Bayesian clustering using a Dirichlet process mixture model. This model is an alternative to regression models, non-parametrically linking a response vector to covariate data through cluster membership. The package allows Bernoulli, Binomial, Poisson, Normal, survival and categorical response, as well as Normal and discrete covariates. It also allows for fixed effects in the response model, where a spatial CAR (conditional autoregressive) term can be also included. Additionally, predictions may be made for the response, and missing values for the covariates are handled. Several samplers and label switching moves are implemented along with diagnostic tools to assess convergence. A number of R functions for post-processing of the output are also provided. In addition to fitting mixtures, it may additionally be of interest to determine which covariates actively drive the mixture components. This is implemented in the package as variable selection.
99 Bayesian Inference prevalence Tools for Prevalence Assessment Studies The prevalence package provides Frequentist and Bayesian methods for prevalence assessment studies. IMPORTANT: the truePrev functions in the prevalence package call on JAGS (Just Another Gibbs Sampler), which therefore has to be available on the user’s system. JAGS can be downloaded from http://mcmc-jags.sourceforge.net/.
100 Bayesian Inference profdpm Profile Dirichlet Process Mixtures This package facilitates profile inference (inference at the posterior mode) for a class of product partition models (PPM). The Dirichlet process mixture is currently the only available member of this class. These methods search for the maximum posterior (MAP) estimate for the data partition in a PPM.
101 Bayesian Inference pscl Political Science Computational Laboratory Bayesian analysis of item-response theory (IRT) models, roll call analysis; computing highest density regions; maximum likelihood estimation of zero-inflated and hurdle models for count data; goodness-of-fit measures for GLMs; data sets used in writing and teaching at the Political Science Computational Laboratory; seats-votes curves.
102 Bayesian Inference R2jags Using R to Run ‘JAGS’ Provides wrapper functions to implement Bayesian analysis in JAGS. Some major features include monitoring convergence of an MCMC model using Rubin and Gelman Rhat statistics, automatically running an MCMC model until it converges, and implementing parallel processing of an MCMC model for multiple chains.
103 Bayesian Inference R2WinBUGS Running ‘WinBUGS’ and ‘OpenBUGS’ from ‘R’ / ‘S-PLUS’ Invokes a ‘BUGS’ model in ‘OpenBUGS’ or ‘WinBUGS’, and provides a class “bugs” for ‘BUGS’ results along with functions to work with that class. Function write.model() allows a ‘BUGS’ model file to be written. The class and auxiliary functions could be used with other MCMC programs, including ‘JAGS’.
104 Bayesian Inference ramps Bayesian Geostatistical Modeling with RAMPS Bayesian geostatistical modeling of Gaussian processes using a reparameterized and marginalized posterior sampling (RAMPS) algorithm designed to lower autocorrelation in MCMC samples. Package performance is tuned for large spatial datasets.
105 Bayesian Inference rbugs Fusing R and OpenBugs and Beyond Functions to prepare files needed for running BUGS in batch-mode, and running BUGS from R. Support for Linux and Windows systems with OpenBugs is emphasized.
106 Bayesian Inference revdbayes Ratio-of-Uniforms Sampling for Bayesian Extreme Value Analysis Provides functions for the Bayesian analysis of extreme value models. The ‘rust’ package https://cran.r-project.org/package=rust is used to simulate a random sample from the required posterior distribution. The functionality of ‘revdbayes’ is similar to the ‘evdbayes’ package https://cran.r-project.org/package=evdbayes, which uses Markov Chain Monte Carlo (‘MCMC’) methods for posterior simulation. Also provided are functions for making inferences about the extremal index, using the K-gaps model of Suveges and Davison (2010) doi:10.1214/09-AOAS292. See the ‘revdbayes’ website for more information, documentation and examples.
107 Bayesian Inference RJaCGH Reversible Jump MCMC for the Analysis of CGH Arrays Bayesian analysis of CGH microarrays fitting Hidden Markov Chain models. The selection of the number of states is made via their posterior probability computed by Reversible Jump Markov Chain Monte Carlo Methods. Also returns probabilistic common regions for gains/losses.
108 Bayesian Inference rjags Bayesian Graphical Models using MCMC Interface to the JAGS MCMC library.
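A minimal rjags sketch (requires the JAGS library installed on the system; the model here, a normal mean with vague priors, is illustrative):

```r
library(rjags)

model_string <- "
model {
  for (i in 1:N) { y[i] ~ dnorm(mu, tau) }
  mu  ~ dnorm(0, 1.0E-4)   # vague prior on the mean
  tau ~ dgamma(0.01, 0.01) # vague prior on the precision
}"

set.seed(1)
y <- rnorm(50, mean = 2)

jm <- jags.model(textConnection(model_string),
                 data = list(y = y, N = length(y)), n.chains = 2)
samp <- coda.samples(jm, variable.names = c("mu", "tau"), n.iter = 2000)
summary(samp)  # posterior summaries via coda
```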
109 Bayesian Inference RSGHB Functions for Hierarchical Bayesian Estimation: A Flexible Approach Functions for estimating models using a Hierarchical Bayesian (HB) framework. The flexibility comes in allowing the user to specify the likelihood function directly instead of assuming predetermined model structures. Types of models that can be estimated with this code include the family of discrete choice models (Multinomial Logit, Mixed Logit, Nested Logit, Error Components Logit and Latent Class) as well as ordered response models like ordered probit and ordered logit. In addition, the package allows for flexibility in specifying parameters as either fixed (non-varying across individuals) or random with continuous distributions. Parameter distributions supported include normal, positive/negative log-normal, positive/negative censored normal, and the Johnson SB distribution. Kenneth Train’s Matlab and Gauss code for doing Hierarchical Bayesian estimation has served as the basis for a few of the functions included in this package. These Matlab/Gauss functions have been rewritten to be optimized within R. Considerable code has been added to increase the flexibility and usability of the code base. Train’s original Gauss and Matlab code can be found here: http://elsa.berkeley.edu/Software/abstracts/train1006mxlhb.html See Train’s chapter on HB in Discrete Choice with Simulation here: http://elsa.berkeley.edu/books/choice2.html; and his paper on using HB with non-normal distributions here: http://eml.berkeley.edu//~train/trainsonnier.pdf.
111 Bayesian Inference rstan R Interface to Stan User-facing R functions are provided to parse, compile, test, estimate, and analyze Stan models by accessing the header-only Stan library provided by the ‘StanHeaders’ package. The Stan project develops a probabilistic programming language that implements full Bayesian statistical inference via Markov Chain Monte Carlo, rough Bayesian inference via ‘variational’ approximation, and (optionally penalized) maximum likelihood estimation via optimization. In all three cases, automatic differentiation is used to quickly and accurately evaluate gradients without burdening the user with the need to derive the partial derivatives.
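A minimal rstan sketch (model compilation requires a C++ toolchain; the normal-mean model below is an illustrative choice):

```r
library(rstan)

stan_code <- "
data { int<lower=1> N; vector[N] y; }
parameters { real mu; real<lower=0> sigma; }
model {
  y ~ normal(mu, sigma);  // likelihood; Stan's default flat priors otherwise
}"

set.seed(1)
fit <- stan(model_code = stan_code,
            data = list(N = 50L, y = rnorm(50, mean = 2)),
            chains = 2, iter = 1000)
print(fit)  # posterior summaries for mu and sigma, plus Rhat diagnostics
```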
112 Bayesian Inference rstiefel Random orthonormal matrix generation on the Stiefel manifold This package simulates random orthonormal matrices from linear and quadratic exponential family distributions on the Stiefel manifold. The most general type of distribution covered is the matrix-variate Bingham-von Mises-Fisher distribution. Most of the simulation methods are presented in Hoff (2009) “Simulation of the Matrix Bingham-von Mises-Fisher Distribution, With Applications to Multivariate and Relational Data.”
113 Bayesian Inference runjags Interface Utilities, Model Templates, Parallel Computing Methods and Additional Distributions for MCMC Models in JAGS User-friendly interface utilities for MCMC models via Just Another Gibbs Sampler (JAGS), facilitating the use of parallel (or distributed) processors for multiple chains, automated control of convergence and sample length diagnostics, and evaluation of the performance of a model using drop-k validation or against simulated data. Template model specifications can be generated using a standard lme4-style formula interface to assist users less familiar with the BUGS syntax. A JAGS extension module provides additional distributions including the Pareto family of distributions, the DuMouchel prior and the half-Cauchy prior.
114 Bayesian Inference Runuran R Interface to the UNU.RAN Random Variate Generators Interface to the UNU.RAN library for Universal Non-Uniform RANdom variate generators. Thus it allows one to build non-uniform random number generators for quite arbitrary distributions. In particular, it provides an algorithm for fast numerical inversion for distributions with a given density function. In addition, the package contains densities, distribution functions and quantiles for a couple of distributions.
115 Bayesian Inference RxCEcolInf R x C Ecological Inference With Optional Incorporation of Survey Information Fits the R x C inference model described in Greiner and Quinn (2009). Allows incorporation of survey results.
116 Bayesian Inference SamplerCompare A Framework for Comparing the Performance of MCMC Samplers A framework for running sets of MCMC samplers on sets of distributions with a variety of tuning parameters, along with plotting functions to visualize the results of those simulations. See sc-intro.pdf for an introduction.
117 Bayesian Inference SampleSizeMeans Sample size calculations for normal means A set of R functions for calculating sample size requirements using three different Bayesian criteria in the context of designing an experiment to estimate a normal mean or the difference between two normal means. Functions for calculation of required sample sizes for the Average Length Criterion, the Average Coverage Criterion and the Worst Outcome Criterion in the context of normal means are provided. Functions for both the fully Bayesian and the mixed Bayesian/likelihood approaches are provided.
118 Bayesian Inference SampleSizeProportions Calculating sample size requirements when estimating the difference between two binomial proportions A set of R functions for calculating sample size requirements using three different Bayesian criteria in the context of designing an experiment to estimate the difference between two binomial proportions. Functions for calculation of required sample sizes for the Average Length Criterion, the Average Coverage Criterion and the Worst Outcome Criterion in the context of binomial observations are provided. In all cases, estimation of the difference between two binomial proportions is considered. Functions for both the fully Bayesian and the mixed Bayesian/likelihood approaches are provided.
119 Bayesian Inference sbgcop Semiparametric Bayesian Gaussian copula estimation and imputation This package estimates parameters of a Gaussian copula, treating the univariate marginal distributions as nuisance parameters as described in Hoff (2007). It also provides a semiparametric imputation procedure for missing multivariate data.
120 Bayesian Inference SimpleTable Bayesian Inference and Sensitivity Analysis for Causal Effects from 2 x 2 and 2 x 2 x K Tables in the Presence of Unmeasured Confounding SimpleTable provides a series of methods to conduct Bayesian inference and sensitivity analysis for causal effects from 2 x 2 and 2 x 2 x K tables when unmeasured confounding is present or suspected.
121 Bayesian Inference sna Tools for Social Network Analysis A range of tools for social network analysis, including node and graph-level indices, structural distance and covariance methods, structural equivalence detection, network regression, random graph generation, and 2D/3D network visualization.
122 Bayesian Inference spBayes Univariate and Multivariate Spatial-Temporal Modeling Fits univariate and multivariate spatio-temporal random effects models for point-referenced data using Markov chain Monte Carlo (MCMC). Details are given in Finley, Banerjee, and Gelfand (2015) doi:10.18637/jss.v063.i13 and Finley, Banerjee, and Cook (2014) doi:10.1111/2041-210X.12189.
123 Bayesian Inference spikeslab Prediction and variable selection using spike and slab regression Spike and slab for prediction and variable selection in linear regression models. Uses a generalized elastic net for variable selection.
124 Bayesian Inference spikeSlabGAM Bayesian Variable Selection and Model Choice for Generalized Additive Mixed Models Bayesian variable selection, model choice, and regularized estimation for (spatial) generalized additive mixed regression models via stochastic search variable selection with spike-and-slab priors.
125 Bayesian Inference spTimer Spatio-Temporal Bayesian Modelling Fits, spatially predicts and temporally forecasts large amounts of space-time data using [1] Bayesian Gaussian Process (GP) Models, [2] Bayesian Auto-Regressive (AR) Models, and [3] Bayesian Gaussian Predictive Processes (GPP) based AR Models for spatio-temporal big-n problems. Bakar and Sahu (2015) doi:10.18637/jss.v063.i15.
126 Bayesian Inference stochvol Efficient Bayesian Inference for Stochastic Volatility (SV) Models Efficient algorithms for fully Bayesian estimation of stochastic volatility (SV) models via Markov chain Monte Carlo (MCMC) methods.
127 Bayesian Inference tgp Bayesian Treed Gaussian Process Models Bayesian nonstationary, semiparametric nonlinear regression and design by treed Gaussian processes (GPs) with jumps to the limiting linear model (LLM). Special cases also implemented include Bayesian linear models, CART, treed linear models, stationary separable and isotropic GPs, and GP single-index models. Provides 1-d and 2-d plotting functions (with projection and slice capabilities) and tree drawing, designed for visualization of tgp-class output. Sensitivity analysis and multi-resolution models are supported. Sequential experimental design and adaptive sampling functions are also provided, including ALM, ALC, and expected improvement. The latter supports derivative-free optimization of noisy black-box functions.
128 Bayesian Inference tRophicPosition Bayesian Trophic Position Calculation with Stable Isotopes Estimates the trophic position of a consumer relative to a baseline species. It implements a Bayesian approach which combines an interface to the ‘JAGS’ MCMC library of ‘rjags’ and stable isotopes. Users are encouraged to test the package and send bugs and/or errors to trophicposition-support@googlegroups.com.
129 Bayesian Inference zic Bayesian Inference for Zero-Inflated Count Models Provides MCMC algorithms for the analysis of zero-inflated count models. The case of stochastic search variable selection (SVS) is also considered. All MCMC samplers are coded in C++ for improved efficiency. A data set considering the demand for health care is provided.
130 Chemometrics and Computational Physics ALS (core) Multivariate Curve Resolution Alternating Least Squares (MCR-ALS) Alternating least squares is often used to resolve components contributing to data with a bilinear structure; the basic technique may be extended to alternating constrained least squares. Commonly applied constraints include unimodality, non-negativity, and normalization of components. Several data matrices may be decomposed simultaneously by assuming that one of the two matrices in the bilinear decomposition is shared between datasets.
131 Chemometrics and Computational Physics AnalyzeFMRI Functions for analysis of fMRI datasets stored in the ANALYZE or NIFTI format Functions for I/O, visualisation and analysis of functional Magnetic Resonance Imaging (fMRI) datasets stored in the ANALYZE or NIFTI format.
132 Chemometrics and Computational Physics AquaEnv Integrated Development Toolbox for Aquatic Chemical Model Generation Toolbox for the experimental aquatic chemist, focused on acidification and CO2 air-water exchange. It contains all elements to model the pH, the related CO2 air-water exchange, and aquatic acid-base chemistry for an arbitrary marine, estuarine or freshwater system. It contains a suite of tools for sensitivity analysis, visualisation, modelling of chemical batches, and can be used to build dynamic models of aquatic systems. As from version 1.0-4, it also contains functions to calculate the buffer factors.
133 Chemometrics and Computational Physics astro Astronomy Functions, Tools and Routines The astro package provides a series of functions, tools and routines in everyday use within astronomy. Broadly speaking, one may group these functions into 7 main areas, namely: cosmology, FITS file manipulation, the Sersic function, plotting, data manipulation, statistics and general convenience functions and scripting tools.
134 Chemometrics and Computational Physics astrodatR Astronomical Data A collection of 19 datasets from contemporary astronomical research. They are described in the textbook ‘Modern Statistical Methods for Astronomy with R Applications’ by Eric D. Feigelson and G. Jogesh Babu (Cambridge University Press, 2012, Appendix C) or on the website of Penn State’s Center for Astrostatistics (http://astrostatistics.psu.edu/datasets). These datasets can be used to exercise methodology involving: density estimation; heteroscedastic measurement errors; contingency tables; two-sample hypothesis tests; spatial point processes; nonlinear regression; mixture models; censoring and truncation; multivariate analysis; classification and clustering; inhomogeneous Poisson processes; periodic and stochastic time series analysis.
135 Chemometrics and Computational Physics astroFns Astronomy: time and position functions, misc. utilities Miscellaneous astronomy functions, utilities, and data.
136 Chemometrics and Computational Physics astrolibR Astronomy Users Library Several dozen low-level utilities and codes from the Interactive Data Language (IDL) Astronomy Users Library (http://idlastro.gsfc.nasa.gov) are implemented in R. They treat: time, coordinate and proper motion transformations; terrestrial precession and nutation, atmospheric refraction and aberration, barycentric corrections, and related effects; utilities for astrometry, photometry, and spectroscopy; and utilities for planetary, stellar, Galactic, and extragalactic science.
137 Chemometrics and Computational Physics Bchron Radiocarbon Dating, Age-Depth Modelling, Relative Sea Level Rate Estimation, and Non-Parametric Phase Modelling Enables quick calibration of radiocarbon dates under various calibration curves (including user generated ones); Age-depth modelling as per the algorithm of Haslett and Parnell (2008) doi:10.1111/j.1467-9876.2008.00623.x; Relative sea level rate estimation incorporating time uncertainty in polynomial regression models; and non-parametric phase modelling via Gaussian mixtures as a means to determine the activity of a site (and as an alternative to the Oxcal function SUM). The package includes a vignette which explains most of the basic functionality.
138 Chemometrics and Computational Physics BioMark Find Biomarkers in Two-Class Discrimination Problems Variable selection methods are provided for several classification methods: the lasso/elastic net, PCLDA, PLSDA, and several t-tests. Two approaches for selecting cutoffs can be used, one based on the stability of model coefficients under perturbation, and the other on higher criticism.
139 Chemometrics and Computational Physics bvls The Stark-Parker algorithm for bounded-variable least squares An R interface to the Stark-Parker implementation of an algorithm for bounded-variable least squares.
140 Chemometrics and Computational Physics cda Coupled-Dipole Approximation for Electromagnetic Scattering by Three-Dimensional Clusters of Sub-Wavelength Particles Coupled-dipole simulations for electromagnetic scattering of light by sub-wavelength particles in arbitrary 3-dimensional configurations. Scattering and absorption spectra are simulated by inversion of the interaction matrix, or by an order-of-scattering approximation scheme. High-level functions are provided to simulate spectra with varying angles of incidence, as well as with full angular averaging.
141 Chemometrics and Computational Physics celestial Collection of Common Astronomical Conversion Routines and Functions Contains a number of common astronomy conversion routines, particularly the HMS and degrees schemes, which can be fiddly to convert between en masse due to the textual nature of the former. It allows users to coordinate match datasets quickly. It also contains functions for various cosmological calculations.
142 Chemometrics and Computational Physics CellularAutomaton One-Dimensional Cellular Automata This package is an object-oriented implementation of one-dimensional cellular automata. It supports many of the features offered by Mathematica, including elementary rules, user-defined rules, radii, user-defined seeding, and plotting.
143 Chemometrics and Computational Physics chemCal (core) Calibration Functions for Analytical Chemistry Simple functions for plotting linear calibration functions and estimating standard errors for measurements according to the Handbook of Chemometrics and Qualimetrics: Part A by Massart et al. There are also functions for estimating the limit of detection (LOD) and limit of quantification (LOQ). The functions work on model objects from - optionally weighted - linear regression (lm) or robust linear regression (‘rlm’ from the ‘MASS’ package).
144 Chemometrics and Computational Physics chemometrics Multivariate Statistical Analysis in Chemometrics R companion to the book “Introduction to Multivariate Statistical Analysis in Chemometrics” written by K. Varmuza and P. Filzmoser (2009).
145 Chemometrics and Computational Physics ChemometricsWithR Chemometrics with R - Multivariate Data Analysis in the Natural Sciences and Life Sciences Functions and scripts used in the book “Chemometrics with R - Multivariate Data Analysis in the Natural Sciences and Life Sciences” by Ron Wehrens, Springer (2011). Data used in the package are available from github.
146 Chemometrics and Computational Physics ChemoSpec Exploratory Chemometrics for Spectroscopy A collection of functions for top-down exploratory data analysis of spectral data obtained via nuclear magnetic resonance (NMR), infrared (IR) or Raman spectroscopy. Includes functions for plotting and inspecting spectra, peak alignment, hierarchical cluster analysis (HCA), principal components analysis (PCA) and model-based clustering. Robust methods appropriate for this type of high-dimensional data are available. ChemoSpec is designed with metabolomics data sets in mind, where the samples fall into groups such as treatment and control. Graphical output is formatted consistently for publication quality plots. ChemoSpec is intended to be very user friendly and help you get usable results quickly. A vignette covering typical operations is available.
147 Chemometrics and Computational Physics CHNOSZ Thermodynamic Calculations for Geobiochemistry An integrated set of tools for thermodynamic calculations in geochemistry and compositional biology. Thermodynamic properties are taken from a database for minerals and inorganic and organic aqueous species including biomolecules, or from amino acid group additivity for proteins. High-temperature properties are calculated using the revised Helgeson-Kirkham-Flowers equations of state for aqueous species, and activity coefficients can be calculated for specified ionic strength. Functions are provided to define a system using basis species, automatically balance reactions, calculate the chemical affinities of formation reactions for selected species, calculate equilibrium activities, and plot the results on chemical activity diagrams.
148 Chemometrics and Computational Physics clustvarsel Variable Selection for Gaussian Model-Based Clustering Variable selection for Gaussian model-based clustering as implemented in the ‘mclust’ package. The methodology allows one to find the (locally) optimal subset of variables in a data set that have group/cluster information. A greedy or headlong search can be used, either in a forward-backward or backward-forward direction, with or without sub-sampling at the hierarchical clustering stage for starting ‘mclust’ models. By default the algorithm uses a sequential search, but parallelisation is also available.
149 Chemometrics and Computational Physics compositions Compositional Data Analysis The package provides functions for the consistent analysis of compositional data (e.g. portions of substances) and positive numbers (e.g. concentrations) in the way proposed by Aitchison and Pawlowsky-Glahn.
150 Chemometrics and Computational Physics cosmoFns Functions for cosmological distances, times, luminosities, etc. The package encapsulates standard expressions for distances, times, luminosities, and other quantities useful in observational cosmology, including molecular line observations. Currently coded for a flat universe only.
151 Chemometrics and Computational Physics CosmoPhotoz Photometric redshift estimation using generalized linear models User-friendly interfaces to perform fast and reliable photometric redshift estimation. The code makes use of generalized linear models and can adopt gamma or inverse Gaussian families, either from a frequentist or a Bayesian perspective. The code additionally includes a Shiny application that provides a simple user interface.
152 Chemometrics and Computational Physics CRAC Cosmology R Analysis Code R functions for cosmological research. The main functions are similar to those in the Python library cosmolopy.
153 Chemometrics and Computational Physics dielectric Defines some physical constants and dielectric functions commonly used in optics, plasmonics Physical constants. Gold, silver and glass permittivities, together with spline interpolation functions.
154 Chemometrics and Computational Physics diffractometry Baseline identification and peak decomposition for x-ray diffractograms Residual-based baseline identification and peak decomposition for x-ray diffractograms as introduced in Davies/Gather/Mergel/Meise/Mildenberger (2008).
155 Chemometrics and Computational Physics drc Analysis of Dose-Response Curves Analysis of dose-response data is made available through a suite of flexible and versatile model fitting and after-fitting functions.
156 Chemometrics and Computational Physics drm Regression and association models for repeated categorical data Likelihood-based marginal regression and association modelling for repeated, or otherwise clustered, categorical responses, using the dependence ratio as a measure of association.
157 Chemometrics and Computational Physics EEM Read and Preprocess Fluorescence Excitation-Emission Matrix (EEM) Data Reads raw EEM data and prepares it for further analysis.
158 Chemometrics and Computational Physics elasticnet Elastic-Net for Sparse Estimation and Sparse PCA Provides functions for fitting the entire solution path of the Elastic-Net and for estimating sparse principal components. The Lasso solution paths can be computed by the same function. First version: 2005-10.
159 Chemometrics and Computational Physics enpls Ensemble Partial Least Squares Regression An algorithmic framework for measuring feature importance, outlier detection, model applicability domain evaluation, and ensemble predictive modeling with (sparse) partial least squares regressions.
160 Chemometrics and Computational Physics fastICA FastICA Algorithms to Perform ICA and Projection Pursuit Implementation of the FastICA algorithm to perform Independent Component Analysis (ICA) and Projection Pursuit.
161 Chemometrics and Computational Physics FITSio FITS (Flexible Image Transport System) Utilities Utilities to read and write files in the FITS (Flexible Image Transport System) format, a standard format in astronomy (see e.g. https://en.wikipedia.org/wiki/FITS for more information). Present low-level routines allow: reading, parsing, and modifying FITS headers; reading FITS images (multi-dimensional arrays); reading FITS binary and ASCII tables; and writing FITS images (multi-dimensional arrays). Higher-level functions allow: reading files composed of one or more headers and a single (perhaps multidimensional) image or single table; reading tables into data frames; generating vectors for image array axes; scaling and writing images as 16-bit integers. Known incompletenesses are reading random group extensions, as well as bit, complex, and array descriptor data types in binary tables.
162 Chemometrics and Computational Physics fmri Analysis of fMRI Experiments Contains R-functions to perform an fMRI analysis as described in Tabelow et al. (2006) doi:10.1016/j.neuroimage.2006.06.029, Polzehl et al. (2010) doi:10.1016/j.neuroimage.2010.04.241, Tabelow and Polzehl (2011) doi:10.18637/jss.v044.i11.
163 Chemometrics and Computational Physics fpca Restricted MLE for Functional Principal Components Analysis A geometric approach to MLE for functional principal components.
164 Chemometrics and Computational Physics FTICRMS Programs for Analyzing Fourier Transform-Ion Cyclotron Resonance Mass Spectrometry Data This package was developed partially with funding from the NIH Training Program in Biomolecular Technology (2-T32-GM08799).
165 Chemometrics and Computational Physics homals Gifi Methods for Optimal Scaling Performs a homogeneity analysis (multiple correspondence analysis) and various extensions. Rank restrictions on the category quantifications can be imposed (nonlinear PCA). The categories are transformed by means of optimal scaling with options for nominal, ordinal, and numerical scale levels (for rank-1 restrictions). Variables can be grouped into sets, in order to emulate regression analysis and canonical correlation analysis.
166 Chemometrics and Computational Physics hyperSpec Work with Hyperspectral Data, i.e. Spectra + Meta Information (Spatial, Time, Concentration, …) Comfortable ways to work with hyperspectral data sets, i.e. spatially or time-resolved spectra, or spectra with any other kind of information associated with each spectrum. The spectra can be data as obtained in XRF, UV/VIS, Fluorescence, AES, NIR, IR, Raman, NMR, MS, etc. More generally, any data recorded over a discretized variable, e.g. absorbance = f(wavelength), stored as a vector of absorbance values for discrete wavelengths, is suitable.
167 Chemometrics and Computational Physics investr Inverse Estimation/Calibration Functions Functions to facilitate inverse estimation (e.g., calibration) in linear, generalized linear, nonlinear, and (linear) mixed-effects models. A generic function is also provided for plotting fitted regression models with or without confidence/prediction bands that may be of use to the general user.
168 Chemometrics and Computational Physics Iso (core) Functions to Perform Isotonic Regression Linear order and unimodal order (univariate) isotonic regression; bivariate isotonic regression with linear order on both variables.
169 Chemometrics and Computational Physics kohonen (core) Supervised and Unsupervised Self-Organising Maps Functions to train self-organising maps (SOMs). Also interrogation of the maps and prediction using trained maps are supported. The name of the package refers to Teuvo Kohonen, the inventor of the SOM.
170 Chemometrics and Computational Physics leaps Regression Subset Selection Regression subset selection, including exhaustive search.
171 Chemometrics and Computational Physics lspls LS-PLS Models Implements the LS-PLS (least squares - partial least squares) method described in, for instance, Jorgensen, K., Segtnan, V. H., Thyholt, K., Nas, T. (2004) A Comparison of Methods for Analysing Regression Models with Both Spectral and Designed Variables. Journal of Chemometrics, 18(10), 451-464.
172 Chemometrics and Computational Physics MALDIquant Quantitative Analysis of Mass Spectrometry Data A complete analysis pipeline for matrix-assisted laser desorption/ionization-time-of-flight (MALDI-TOF) and other two-dimensional mass spectrometry data. In addition to commonly used plotting and processing methods it includes distinctive features, namely baseline subtraction methods such as morphological filters (TopHat) or the statistics-sensitive non-linear iterative peak-clipping algorithm (SNIP), peak alignment using warping functions, handling of replicated measurements as well as allowing spectra with different resolutions.
173 Chemometrics and Computational Physics minpack.lm R Interface to the Levenberg-Marquardt Nonlinear Least-Squares Algorithm Found in MINPACK, Plus Support for Bounds The nls.lm function provides an R interface to lmder and lmdif from the MINPACK library, for solving nonlinear least-squares problems by a modification of the Levenberg-Marquardt algorithm, with support for lower and upper parameter bounds. The implementation can be used via nls-like calls using the nlsLM function.
174 Chemometrics and Computational Physics moonsun Basic astronomical calculations with R A collection of basic astronomical routines for R based on “Practical astronomy with your calculator” by Peter Duffett-Smith.
175 Chemometrics and Computational Physics nlme Linear and Nonlinear Mixed Effects Models Fit and compare Gaussian linear and nonlinear mixed-effects models.
176 Chemometrics and Computational Physics nlreg Higher Order Inference for Nonlinear Heteroscedastic Models Likelihood inference based on higher order approximations for nonlinear models with possibly non-constant variance.
177 Chemometrics and Computational Physics nnls (core) The Lawson-Hanson algorithm for non-negative least squares (NNLS) An R interface to the Lawson-Hanson implementation of an algorithm for non-negative least squares (NNLS). Also allows the combination of non-negative and non-positive constraints.
178 Chemometrics and Computational Physics OrgMassSpecR Organic Mass Spectrometry Organic/biological mass spectrometry data analysis.
179 Chemometrics and Computational Physics pcaPP Robust PCA by Projection Pursuit Provides functions for robust PCA by projection pursuit. The methods are described in Croux et al. (2006) doi:10.2139/ssrn.968376, Croux et al. (2013) doi:10.1080/00401706.2012.727746, Todorov and Filzmoser (2013) doi:10.1007/978-3-642-33042-1_31.
180 Chemometrics and Computational Physics Peaks Peaks Spectrum manipulation: background estimation, Markov smoothing, deconvolution and peaks search functions. Ported from ROOT/TSpectrum class.
181 Chemometrics and Computational Physics PET Simulation and Reconstruction of PET Images This package implements different analytic/direct and iterative reconstruction methods of Peter Toft. It also offers the possibility to simulate PET data.
182 Chemometrics and Computational Physics planar Multilayer Optics Solves the electromagnetic problem of reflection and transmission at a planar multilayer interface. Also computed are the decay rates and emission profile for a dipolar emitter.
183 Chemometrics and Computational Physics pls (core) Partial Least Squares and Principal Component Regression Multivariate regression methods Partial Least Squares Regression (PLSR), Principal Component Regression (PCR) and Canonical Powered Partial Least Squares (CPPLS).
184 Chemometrics and Computational Physics plspm Tools for Partial Least Squares Path Modeling (PLS-PM) Partial Least Squares Path Modeling (PLS-PM) analysis for both metric and non-metric data, as well as REBUS analysis.
185 Chemometrics and Computational Physics ppls Penalized Partial Least Squares Contains linear and nonlinear regression methods based on partial least squares and penalization techniques. Model parameters are selected via cross-validation, and confidence intervals and tests for the regression coefficients can be conducted via jackknifing.
186 Chemometrics and Computational Physics prospectr Miscellaneous functions for processing and sample selection of vis-NIR diffuse reflectance data The package provides functions for pretreatment and sample selection of visible and near infrared diffuse reflectance spectra.
187 Chemometrics and Computational Physics psy Various procedures used in psychometry Kappa, ICC, Cronbach alpha, screeplot, mtmm
188 Chemometrics and Computational Physics PTAk (core) Principal Tensor Analysis on k Modes A multiway method to decompose a tensor (array) of any order, as a generalisation of SVD also supporting non-identity metrics and penalisations. 2-way SVD with these extensions is also available. The package includes also some other multiway methods: PCAn (Tucker-n) and PARAFAC/CANDECOMP with these extensions.
189 Chemometrics and Computational Physics quantchem Quantitative chemical analysis: calibration and evaluation of results Statistical evaluation of calibration curves by different regression techniques: ordinary, weighted, robust (up to 4th order polynomial). Log-log and Box-Cox transform, estimation of optimal power and weighting scheme. Tests for heteroscedasticity and normality of residuals. Different kinds of plots commonly used in illustrating calibrations. Easy “inverse prediction” of concentration by given responses and statistical evaluation of results (comparison of precision and accuracy by common tests).
190 Chemometrics and Computational Physics rcdk Interface to the ‘CDK’ Libraries Allows the user to access functionality in the ‘CDK’, a Java framework for chemoinformatics. This allows the user to load molecules, evaluate fingerprints, calculate molecular descriptors and so on. In addition, the ‘CDK’ API allows the user to view structures in 2D.
191 Chemometrics and Computational Physics rcdklibs The CDK Libraries Packaged for R An R interface to the Chemistry Development Kit, a Java library for chemoinformatics. Given the size of the library itself, this package is not expected to change very frequently. To make use of the CDK within R, it is suggested that you use the ‘rcdk’ package. Note that it is possible to directly interact with the CDK using ‘rJava’. However ‘rcdk’ exposes functionality in a more idiomatic way. The CDK library itself is released as LGPL and the sources can be obtained from https://github.com/cdk/cdk.
192 Chemometrics and Computational Physics represent Determine the representativity of two multidimensional data sets Contains workhorse function jrparams(), as well as two helper functions Mboxtest() and JRsMahaldist(), and four example data sets.
193 Chemometrics and Computational Physics resemble Regression and Similarity Evaluation for Memory-Based Learning in Spectral Chemometrics Implementation of functions for spectral similarity/dissimilarity analysis and memory-based learning (MBL) for non-linear modeling in complex spectral datasets. In chemometrics MBL is also known as local modeling.
194 Chemometrics and Computational Physics RobPer Robust Periodogram and Periodicity Detection Methods Calculates periodograms based on (robustly) fitting periodic functions to light curves (irregularly observed time series, possibly with measurement accuracies, occurring in astroparticle physics). Three main functions are included: RobPer() calculates the periodogram. Outlying periodogram bars (indicating a period) can be detected with betaCvMfit(). Artificial light curves can be generated using the function tsgen(). For more details see the corresponding article: Thieler, Fried and Rathjens (2016), Journal of Statistical Software 69(9), 1-36, doi:10.18637/jss.v069.i09.
195 Chemometrics and Computational Physics rpubchem An Interface to the PubChem Collection Access PubChem data (compounds, substances, assays) using R. Structural information is provided in the form of SMILES strings. It currently only provides access to a subset of the precalculated data stored by PubChem. Bio-assay data can be accessed to obtain descriptions as well as the actual data. It is also possible to search for assay IDs by keyword.
196 Chemometrics and Computational Physics sapa Spectral Analysis for Physical Applications Software for the book Spectral Analysis for Physical Applications, Donald B. Percival and Andrew T. Walden, Cambridge University Press, 1993.
197 Chemometrics and Computational Physics SCEPtER Stellar CharactEristics Pisa Estimation gRid SCEPtER pipeline for estimating the stellar age, mass, and radius given observational effective temperature, [Fe/H], and asteroseismic parameters. The results are obtained adopting a maximum likelihood technique over a grid of pre-computed stellar models.
198 Chemometrics and Computational Physics simecol Simulation of Ecological (and Other) Dynamic Systems An object oriented framework to simulate ecological (and other) dynamic systems. It can be used for differential equations, individual-based (or agent-based) and other models as well. The package helps to organize scenarios (to avoid copy and paste) and aims to improve readability and usability of code.
199 Chemometrics and Computational Physics snapshot Gadget N-body cosmological simulation code snapshot I/O utilities Functions for reading and writing Gadget N-body snapshots. The Gadget code is popular in astronomy for running N-body / hydrodynamical cosmological and merger simulations. To find out more about Gadget see the main distribution page at www.mpa-garching.mpg.de/gadget/
200 Chemometrics and Computational Physics solaR Radiation and Photovoltaic Systems Calculation methods of solar radiation and performance of photovoltaic systems from daily and intradaily irradiation data sources.
201 Chemometrics and Computational Physics som Self-Organizing Map Self-Organizing Map (with application in gene clustering).
202 Chemometrics and Computational Physics speaq Tools for Nuclear Magnetic Resonance (NMR) Spectra Alignment, Peak Based Processing, Quantitative Analysis and Visualizations Makes Nuclear Magnetic Resonance spectroscopy (NMR spectroscopy) data analysis as easy as possible by only requiring a small set of functions to perform an entire analysis. ‘speaq’ offers the possibility of raw spectra alignment and quantitation but also an analysis based on features whereby the spectra are converted to peaks which are then grouped and turned into features. These features can be processed with any number of statistical tools either included in ‘speaq’ or available elsewhere on CRAN. More detail can be found in doi:10.1186/1471-2105-12-405 and doi:10.1101/138503.
203 Chemometrics and Computational Physics spls Sparse Partial Least Squares (SPLS) Regression and Classification Provides functions for fitting sparse partial least squares (SPLS) regression and classification models.
204 Chemometrics and Computational Physics stellaR stellar evolution tracks and isochrones A package to manage and display stellar tracks and isochrones from the Pisa low-mass database. Includes tools for isochrone construction and track interpolation.
205 Chemometrics and Computational Physics stepPlr L2 penalized logistic regression with a stepwise variable selection L2 penalized logistic regression for both continuous and discrete predictors, with forward stagewise/forward stepwise variable selection procedure.
206 Chemometrics and Computational Physics subselect Selecting Variable Subsets A collection of functions which (i) assess the quality of variable subsets as surrogates for a full data set, in either an exploratory data analysis or in the context of a multivariate linear model, and (ii) search for subsets which are optimal under various criteria.
207 Chemometrics and Computational Physics TIMP Fitting Separable Nonlinear Models in Spectroscopy and Microscopy A problem-solving environment (PSE) for fitting separable nonlinear models to measurements arising in physics and chemistry experiments; has been extensively applied to time-resolved spectroscopy and FLIM-FRET data.
208 Chemometrics and Computational Physics titan Titration analysis for mass spectrometry data GUI to analyze mass spectrometric data on the relative abundance of two substances from a titration series.
209 Chemometrics and Computational Physics titrationCurves Acid/Base, Complexation, Redox, and Precipitation Titration Curves A collection of functions to plot acid/base titration curves (pH vs. volume of titrant), complexation titration curves (pMetal vs. volume of EDTA), redox titration curves (potential vs. volume of titrant), and precipitation titration curves (either pAnalyte or pTitrant vs. volume of titrant). Options include the titration of mixtures, the ability to overlay two or more titration curves, and the ability to show equivalence points.
210 Chemometrics and Computational Physics UPMASK Unsupervised Photometric Membership Assignment in Stellar Clusters An implementation of the UPMASK method for performing membership assignment in stellar clusters in R. It is prepared to use photometry and spatial positions, but it can take into account other types of data. The method is able to take into account arbitrary error models, and it is unsupervised, data-driven, physical-model-free and relies on as few assumptions as possible. The approach followed for membership assessment is based on an iterative process, principal component analysis, a clustering algorithm and a kernel density estimation.
211 Chemometrics and Computational Physics varSelRF Variable Selection using Random Forests Variable selection from random forests using both backwards variable elimination (for the selection of small sets of non-redundant variables) and selection based on the importance spectrum (somewhat similar to scree plots; for the selection of large, potentially highly-correlated variables). Main applications in high-dimensional data (e.g., microarray data, and other genomics and proteomics applications).
212 Chemometrics and Computational Physics webchem Chemical Information from the Web Chemical information from around the web. This package interacts with a suite of web APIs for chemical information.
213 Chemometrics and Computational Physics WilcoxCV Wilcoxon-based variable selection in cross-validation This package provides functions to perform fast variable selection based on the Wilcoxon rank sum test in the cross-validation or Monte-Carlo cross-validation settings, for use in microarray-based binary classification.
214 Clinical Trial Design, Monitoring, and Analysis adaptTest (core) Adaptive two-stage tests The functions defined in this program serve for implementing adaptive two-stage tests. Currently, four tests are included: Bauer and Koehne (1994), Lehmacher and Wassmer (1999), Vandemeulebroecke (2006), and the horizontal conditional error function. User-defined tests can also be implemented. Reference: Vandemeulebroecke, An investigation of two-stage tests, Statistica Sinica 2006.
215 Clinical Trial Design, Monitoring, and Analysis AGSDest Estimation in Adaptive Group Sequential Trials Calculation of repeated confidence intervals as well as confidence intervals based on the stage-wise ordering in group sequential designs and adaptive group sequential designs. For adaptive group sequential designs the confidence intervals are based on the conditional rejection probability principle. Currently the procedures do not support the use of futility boundaries or more than one adaptive interim analysis.
216 Clinical Trial Design, Monitoring, and Analysis asd (core) Simulations for Adaptive Seamless Designs Runs simulations for adaptive seamless designs, with and without early outcomes, for treatment selection and subpopulation-type designs.
217 Clinical Trial Design, Monitoring, and Analysis asypow Calculate Power Utilizing Asymptotic Likelihood Ratio Methods A set of routines written in the S language that calculate power and related quantities utilizing asymptotic likelihood ratio methods.
218 Clinical Trial Design, Monitoring, and Analysis bcrm (core) Bayesian Continual Reassessment Method for Phase I Dose-Escalation Trials Implements a wide variety of one and two-parameter Bayesian CRM designs. The program can run interactively, allowing the user to enter outcomes after each cohort has been recruited, or via simulation to assess operating characteristics.
219 Clinical Trial Design, Monitoring, and Analysis bifactorial (core) Inferences for bi- and trifactorial trial designs This package makes global and multiple inferences for given bi- and trifactorial clinical trial designs using bootstrap methods and a classical approach.
220 Clinical Trial Design, Monitoring, and Analysis binomSamSize Confidence Intervals and Sample Size Determination for a Binomial Proportion under Simple Random Sampling and Pooled Sampling A suite of functions to compute confidence intervals and necessary sample sizes for the parameter p of the Bernoulli B(p) distribution under simple random sampling or under pooled sampling. Such computations are of interest, for example, when investigating the incidence or prevalence in populations. The package contains functions to compute coverage probabilities and coverage coefficients of the provided confidence interval procedures. Sample size calculations are based on expected length.
221 Clinical Trial Design, Monitoring, and Analysis blockrand (core) Randomization for block random clinical trials Creates randomizations for block random clinical trials. Can also produce a PDF file of randomization cards.
222 Clinical Trial Design, Monitoring, and Analysis clinfun (core) Clinical Trial Design and Data Analysis Functions Utilities to make your clinical collaborations easier if not fun. Contains functions for designing studies, such as Simon two-stage and group sequential designs, and for data analysis, such as the Jonckheere-Terpstra test and estimation of survival quantiles.
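As a minimal sketch of the design side of clinfun, a Simon two-stage design can be obtained with ph2simon(); the response rates and error levels below are illustrative values, not a recommendation.

```r
library(clinfun)

# Simon two-stage design: unacceptable response rate 20%,
# desirable response rate 40%, type I error 5%, type II error 10%
# (all values chosen for illustration only)
design <- ph2simon(pu = 0.2, pa = 0.4, ep1 = 0.05, ep2 = 0.10)
print(design)  # reports the optimal and minimax two-stage designs
```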
223 Clinical Trial Design, Monitoring, and Analysis clinsig Clinical Significance Functions Functions for calculating clinical significance.
224 Clinical Trial Design, Monitoring, and Analysis clusterPower Power Calculations for Cluster-Randomized and Cluster-Randomized Crossover Trials Calculate power for cluster randomized trials (CRTs) that compare two means, two proportions, or two counts using closed-form solutions. In addition, calculate power for cluster randomized crossover trials using Monte Carlo methods. For more information, see Reich et al. (2012) doi:10.1371/journal.pone.0035564.
225 Clinical Trial Design, Monitoring, and Analysis coin Conditional Inference Procedures in a Permutation Test Framework Conditional inference procedures for the general independence problem including two-sample, K-sample (non-parametric ANOVA), correlation, censored, ordered and multivariate problems.
226 Clinical Trial Design, Monitoring, and Analysis conf.design Construction of factorial designs This small package contains a series of simple tools for constructing and manipulating confounded and fractional factorial designs.
227 Clinical Trial Design, Monitoring, and Analysis CRM Continual Reassessment Method (CRM) for Phase I Clinical Trials CRM simulator for Phase I Clinical Trials
228 Clinical Trial Design, Monitoring, and Analysis crmPack Object-Oriented Implementation of CRM Designs Implements a wide range of model-based dose escalation designs, ranging from classical and modern continual reassessment methods (CRMs) based on dose-limiting toxicity endpoints to dual-endpoint designs taking into account a biomarker/efficacy outcome. The focus is on Bayesian inference, making it very easy to set up a new design with its own JAGS code. However, it is also possible to implement 3+3 designs for comparison or models with non-Bayesian estimation. The whole package is written in a modular form in the S4 class system, making it very flexible for adaptation to new models, escalation or stopping rules.
229 Clinical Trial Design, Monitoring, and Analysis CRTSize (core) Sample Size Estimation Functions for Cluster Randomized Trials Sample size estimation in cluster (group) randomized trials. Contains traditional power-based methods, empirical smoothing (Rotondi and Donner, 2009), and updated meta-analysis techniques (Rotondi and Donner, 2012).
230 Clinical Trial Design, Monitoring, and Analysis dfcrm (core) Dose-finding by the continual reassessment method This package provides functions to run the CRM and TITE-CRM in phase I trials and calibration tools for trial planning purposes.
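A minimal sketch of a CRM model fit with dfcrm is shown below; the prior toxicity skeleton, target rate, and observed outcomes are all hypothetical values invented for illustration.

```r
library(dfcrm)

# Hypothetical skeleton of prior toxicity probabilities for 5 dose levels
prior  <- c(0.05, 0.10, 0.20, 0.35, 0.50)
target <- 0.20  # target toxicity rate

# Illustrative data so far: dose levels administered and toxicity outcomes
level <- c(3, 4, 4, 3, 3)
tox   <- c(0, 0, 1, 0, 0)

fit <- crm(prior, target, tox, level)
fit$mtd  # recommended dose level for the next cohort
```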
231 Clinical Trial Design, Monitoring, and Analysis dfped Extrapolation and Bridging of Adult Information in Early Phase Dose-Finding Paediatrics Studies A unified method for designing and analysing dose-finding trials in paediatrics while bridging information from adults. The dose range can be calculated under three extrapolation methods: linear, allometry, and maturation adjustment, using pharmacokinetic (PK) data; to do this, target exposures are assumed to be the same in both populations. The working model and prior distribution parameters of the dose-toxicity and dose-efficacy relationships can be obtained using early-phase adult toxicity and efficacy data at several dose levels. Priors are incorporated into the dose-finding process through Bayesian model selection or adaptive priors, which adjust the amount of prior information to the differences between adults and children and calibrate the model against misspecification when the adult and paediatric data are very different. Users can supply their own Bayesian model written in Stan code; a template of this model is provided in the examples of the corresponding R functions. Finally, the package includes a simulation function for one or more trials.
232 Clinical Trial Design, Monitoring, and Analysis dfpk Bayesian Dose-Finding Designs using Pharmacokinetics (PK) for Phase I Clinical Trials Statistical methods involving PK measures are provided for the dose-allocation process during Phase I clinical trials. These methods incorporate pharmacokinetics (PK) into dose-finding designs in several ways, including covariate models, dependent-variable models, and hierarchical models. The package provides functions to generate data from several scenarios and to run simulations whose objective is to determine the maximum tolerated dose (MTD).
233 Clinical Trial Design, Monitoring, and Analysis DoseFinding Planning and Analyzing Dose Finding Experiments The DoseFinding package provides functions for the design and analysis of dose-finding experiments (with focus on pharmaceutical Phase II clinical trials). It provides functions for: multiple contrast tests, fitting non-linear dose-response models (using Bayesian and non-Bayesian estimation), calculating optimal designs and an implementation of the MCPMod methodology.
234 Clinical Trial Design, Monitoring, and Analysis epibasix Elementary Epidemiological Functions for Epidemiology and Biostatistics This package contains elementary tools for analysis of common epidemiological problems, ranging from sample size estimation, through 2x2 contingency table analysis and basic measures of agreement (kappa, sensitivity/specificity). Appropriate print and summary statements are also written to facilitate interpretation wherever possible. Source code is commented throughout to facilitate modification. The target audience includes advanced undergraduate and graduate students in epidemiology or biostatistics courses, and clinical researchers.
235 Clinical Trial Design, Monitoring, and Analysis ewoc Escalation with Overdose Control An implementation of a variety of escalation with overdose control designs introduced by Babb, Rogatko and Zacks (1998) doi:10.1002/(SICI)1097-0258(19980530)17:10%3C1103::AID-SIM793%3E3.0.CO;2-9. It calculates the next dose as a clinical trial proceeds as well as performs simulations to obtain operating characteristics.
236 Clinical Trial Design, Monitoring, and Analysis experiment (core) experiment: R package for designing and analyzing randomized experiments The package provides various statistical methods for designing and analyzing randomized experiments. One main functionality of the package is the implementation of randomized-block and matched-pair designs based on possibly multivariate pre-treatment covariates. The package also provides the tools to analyze various randomized experiments including cluster randomized experiments, randomized experiments with noncompliance, and randomized experiments with missing data.
237 Clinical Trial Design, Monitoring, and Analysis FrF2 Fractional Factorial Designs with 2-Level Factors Regular and non-regular Fractional Factorial 2-level designs can be created. Furthermore, analysis tools for Fractional Factorial designs with 2-level factors are offered (main effects and interaction plots for all factors simultaneously, cube plot for looking at the simultaneous effects of three factors, full or half normal plot, alias structure in a more readable format than with the built-in function alias).
238 Clinical Trial Design, Monitoring, and Analysis GroupSeq (core) A GUI-Based Program to Compute Probabilities Regarding Group Sequential Designs A graphical user interface to compute group sequential designs based on normally distributed test statistics, particularly critical boundaries, power, drift, and confidence intervals of such designs. All computations are based on the alpha spending approach by Lan-DeMets with various alpha spending functions being available to choose among.
239 Clinical Trial Design, Monitoring, and Analysis gsbDesign Group Sequential Bayes Design Group sequential operating characteristics for clinical, Bayesian, two-arm trials with known sigma and normal endpoints.
240 Clinical Trial Design, Monitoring, and Analysis gsDesign (core) Group Sequential Design Derives group sequential designs and describes their properties.
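A minimal sketch of deriving a design with gsDesign follows; the number of looks, error rates, and choice of an O'Brien-Fleming-type spending function are illustrative assumptions.

```r
library(gsDesign)

# Two-sided symmetric design with 3 analyses and O'Brien-Fleming-type
# spending (alpha, beta, and k are illustrative choices)
x <- gsDesign(k = 3, test.type = 2, alpha = 0.025, beta = 0.1, sfu = "OF")
x$upper$bound  # efficacy boundaries (z-scale) at each analysis
summary(x)     # describes the design's properties
```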
241 Clinical Trial Design, Monitoring, and Analysis HH Statistical Analysis and Data Display: Heiberger and Holland Support software for Statistical Analysis and Data Display (Second Edition, Springer, ISBN 978-1-4939-2121-8, 2015) and (First Edition, Springer, ISBN 0-387-40270-5, 2004) by Richard M. Heiberger and Burt Holland. This contemporary presentation of statistical methods features extensive use of graphical displays for exploring data and for displaying the analysis. The second edition includes redesigned graphics and additional chapters. The authors emphasize how to construct and interpret graphs, discuss principles of graphical design, and show how accompanying traditional tabular results are used to confirm the visual impressions derived directly from the graphs. Many of the graphical formats are novel and appear here for the first time in print. All chapters have exercises. All functions introduced in the book are in the package. R code for all examples, both graphs and tables, in the book is included in the scripts directory of the package.
242 Clinical Trial Design, Monitoring, and Analysis Hmisc (core) Harrell Miscellaneous Contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, importing and annotating datasets, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of R objects to LaTeX and html code, and recoding variables.
243 Clinical Trial Design, Monitoring, and Analysis InformativeCensoring Multiple Imputation for Informative Censoring Multiple Imputation for Informative Censoring. This package implements two methods. Gamma Imputation from Jackson et al. (2014) doi:10.1002/sim.6274 and Risk Score Imputation from Hsu et al. (2009) doi:10.1002/sim.3480.
244 Clinical Trial Design, Monitoring, and Analysis ldbounds (core) Lan-DeMets Method for Group Sequential Boundaries Computations related to group sequential boundaries. Includes calculation of bounds using the Lan-DeMets alpha spending function approach.
245 Clinical Trial Design, Monitoring, and Analysis longpower Sample Size Calculations for Longitudinal Data The longpower package contains functions for computing power and sample size for linear models of longitudinal data based on the formulae of Liu and Liang (1997) and Diggle et al. (2002). Either formula is expressed in terms of marginal model or Generalized Estimating Equations (GEE) parameters. The package contains functions which translate pilot mixed-effects model parameters (e.g. random intercept and/or slope) into marginal model parameters, so that the formulae of Diggle et al. or Liu and Liang can be applied to produce sample size calculations for two-sample longitudinal designs assuming known variance.
246 Clinical Trial Design, Monitoring, and Analysis MChtest (core) Monte Carlo hypothesis tests with Sequential Stopping The package performs Monte Carlo hypothesis tests. It allows two different sequential stopping boundaries (a truncated sequential probability ratio test boundary and a boundary proposed by Besag and Clifford, 1991) and gives valid p-values and confidence intervals on p-values.
247 Clinical Trial Design, Monitoring, and Analysis MCPMod Design and Analysis of Dose-Finding Studies Implements a methodology for the design and analysis of dose-response studies that combines aspects of multiple comparison procedures and modeling approaches (Bretz, Pinheiro and Branson, 2005, Biometrics 61, 738-748, doi:10.1111/j.1541-0420.2005.00344.x). The package provides tools for the analysis of dose finding trials as well as a variety of tools necessary to plan a trial to be conducted with the MCP-Mod methodology. Please note: The ‘MCPMod’ package will not be further developed, all future development of the MCP-Mod methodology will be done in the ‘DoseFinding’ R-package.
248 Clinical Trial Design, Monitoring, and Analysis Mediana Clinical Trial Simulations Provides a general framework for clinical trial simulations based on the Clinical Scenario Evaluation (CSE) approach. The package supports a broad class of data models (including clinical trials with continuous, binary, survival-type and count-type endpoints as well as multivariate outcomes that are based on combinations of different endpoints), analysis strategies and commonly used evaluation criteria.
249 Clinical Trial Design, Monitoring, and Analysis meta General Package for Meta-Analysis User-friendly general package providing standard methods for meta-analysis and supporting Schwarzer, Carpenter, and Rucker doi:10.1007/978-3-319-21416-0, “Meta-Analysis with R” (2015): - fixed effect and random effects meta-analysis; - several plots (forest, funnel, Galbraith / radial, L’Abbe, Baujat, bubble); - statistical tests and trim-and-fill method to evaluate bias in meta-analysis; - import data from ‘RevMan 5’; - prediction interval, Hartung-Knapp and Paule-Mandel method for random effects model; - cumulative meta-analysis and leave-one-out meta-analysis; - meta-regression (if R package ‘metafor’ is installed); - generalised linear mixed models (if R packages ‘metafor’, ‘lme4’, ‘numDeriv’, and ‘BiasedUrn’ are installed).
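A minimal sketch of a generic inverse-variance meta-analysis with meta is shown below; the treatment estimates (TE), standard errors (seTE), and study labels are fabricated for illustration.

```r
library(meta)

# Generic inverse-variance meta-analysis from per-study estimates and SEs
# (the numbers and study names below are illustrative only)
m <- metagen(TE = c(0.12, -0.05, 0.21),
             seTE = c(0.10, 0.08, 0.14),
             studlab = c("Study A", "Study B", "Study C"),
             sm = "MD")
summary(m)  # fixed effect and random effects results
forest(m)   # forest plot
```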
250 Clinical Trial Design, Monitoring, and Analysis metafor Meta-Analysis Package for R A comprehensive collection of functions for conducting meta-analyses in R. The package includes functions to calculate various effect sizes or outcome measures, fit fixed-, random-, and mixed-effects models to such data, carry out moderator and meta-regression analyses, and create various types of meta-analytical plots (e.g., forest, funnel, radial, L’Abbe, Baujat, GOSH plots). For meta-analyses of binomial and person-time data, the package also provides functions that implement specialized methods, including the Mantel-Haenszel method, Peto’s method, and a variety of suitable generalized linear (mixed-effects) models (i.e., mixed-effects logistic and Poisson regression models). Finally, the package provides functionality for fitting meta-analytic multivariate/multilevel models that account for non-independent sampling errors and/or true effects (e.g., due to the inclusion of multiple treatment studies, multiple endpoints, or other forms of clustering). Network meta-analyses and meta-analyses accounting for known correlation structures (e.g., due to phylogenetic relatedness) can also be conducted.
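As a brief illustration of the metafor workflow, the classic BCG vaccine dataset shipped with the package can be analysed with escalc() and rma():

```r
library(metafor)

# Compute log relative risks and sampling variances from 2x2 counts
# in the dat.bcg example dataset bundled with metafor
dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
              data = dat.bcg)
res <- rma(yi, vi, data = dat)  # random-effects model (REML by default)
summary(res)                    # pooled estimate and heterogeneity statistics
forest(res)                     # forest plot of the fitted model
```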
251 Clinical Trial Design, Monitoring, and Analysis metaLik Likelihood Inference in Meta-Analysis and Meta-Regression Models First- and higher-order likelihood inference in meta-analysis and meta-regression models.
252 Clinical Trial Design, Monitoring, and Analysis metasens Advanced Statistical Methods to Model and Adjust for Bias in Meta-Analysis The following methods are implemented to evaluate how sensitive the results of a meta-analysis are to potential bias in meta-analysis and to support Schwarzer et al. (2015) doi:10.1007/978-3-319-21416-0, Chapter 5 “Small-Study Effects in Meta-Analysis”: - Copas selection model described in Copas & Shi (2001) doi:10.1177/096228020101000402; - limit meta-analysis by Rucker et al. (2011) doi:10.1093/biostatistics/kxq046; - upper bound for outcome reporting bias by Copas & Jackson (2004) doi:10.1111/j.0006-341X.2004.00161.x.
253 Clinical Trial Design, Monitoring, and Analysis multcomp Simultaneous Inference in General Parametric Models Simultaneous tests and confidence intervals for general linear hypotheses in parametric models, including linear, generalized linear, linear mixed effects, and survival models. The package includes demos reproducing analyses presented in the book “Multiple Comparisons Using R” (Bretz, Hothorn, Westfall, 2010, CRC Press).
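A short sketch of simultaneous inference with multcomp, using all pairwise (Tukey) comparisons on the built-in warpbreaks data:

```r
library(multcomp)

# Fit a one-way ANOVA, then test all pairwise differences between
# tension levels with multiplicity adjustment
amod <- aov(breaks ~ tension, data = warpbreaks)
comp <- glht(amod, linfct = mcp(tension = "Tukey"))
summary(comp)  # adjusted p-values for each pairwise comparison
confint(comp)  # simultaneous confidence intervals
```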
254 Clinical Trial Design, Monitoring, and Analysis nppbib Nonparametric Partially-Balanced Incomplete Block Design Analysis Implements a nonparametric statistical test for rank or score data from partially-balanced incomplete block-design experiments.
255 Clinical Trial Design, Monitoring, and Analysis PIPS (core) Predicted Interval Plots Generate Predicted Interval Plots. Simulate and plot confidence intervals of an effect estimate given observed data and a hypothesis about the distribution of future data.
256 Clinical Trial Design, Monitoring, and Analysis PowerTOST (core) Power and Sample Size Based on Two One-Sided t-Tests (TOST) for (Bio)Equivalence Studies Contains functions to calculate power and sample size for various study designs used for bioequivalence studies; see function known.designs() for the study designs covered. Moreover, the package contains functions for power and sample size based on ‘expected’ power in the case of uncertain (estimated) variability and/or uncertain theta0. Functions are included for the power and sample size for the ratio of two means with normally distributed data on the original scale (based on Fieller’s confidence (‘fiducial’) interval). Further functions cover power and sample size calculations based on the non-inferiority t-test; this is not a TOST procedure but is potentially useful if a question of ‘non-superiority’ must be evaluated. The power and sample size calculations based on the non-inferiority test may also be performed via ‘expected’ power in the case of uncertain (estimated) variability and/or uncertain theta0. Functions power.scABEL() and sampleN.scABEL() calculate power and sample size for the BE decision via scaled (widened) BE acceptance limits (EMA recommended) based on simulations; functions scABEL.ad() and sampleN.scABEL.ad() iteratively adjust alpha in order to maintain the overall consumer risk in ABEL studies and adapt the sample size for the loss in power. Functions power.RSABE() and sampleN.RSABE() calculate power and sample size for the BE decision via the reference-scaled ABE criterion according to the FDA procedure, based on simulations. Functions power.NTIDFDA() and sampleN.NTIDFDA() calculate power and sample size for the BE decision via the FDA procedure for NTIDs, based on simulations.
Functions power.HVNTID() and sampleN.HVNTID() calculate power and sample size for the BE decision via the FDA procedure for highly variable NTIDs (see the FDA dabigatran / rivaroxaban guidances). Functions pa.ABE(), pa.scABE(), and pa.NTIDFDA() perform power analysis of a sample size plan for ABE, scaled ABE, and scaled ABE for NTIDs, analysing power when deviating from the assumptions of the plan. Further functions support power calculations and sample size estimation for dose-proportionality studies using the power model.
257 Clinical Trial Design, Monitoring, and Analysis pwr (core) Basic Functions for Power Analysis Power analysis functions along the lines of Cohen (1988).
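As a minimal illustration of pwr, the sample size per group for a two-sample t-test can be obtained from an assumed effect size (Cohen's d = 0.5 here is an illustrative "medium" effect):

```r
library(pwr)

# Sample size per group for a two-sample t-test detecting d = 0.5
# with 80% power at a two-sided alpha of 0.05
pwr.t.test(d = 0.5, sig.level = 0.05, power = 0.80,
           type = "two.sample", alternative = "two.sided")
```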
258 Clinical Trial Design, Monitoring, and Analysis PwrGSD (core) Power in a Group Sequential Design Tools for the evaluation of interim analysis plans for sequentially monitored trials on a survival endpoint; tools to construct efficacy and futility boundaries, to derive the power of a sequential design at a specified alternative, and a template for evaluating the performance of candidate plans at a set of time-varying alternatives.
259 Clinical Trial Design, Monitoring, and Analysis qtlDesign (core) Design of QTL experiments Tools for the design of QTL experiments
260 Clinical Trial Design, Monitoring, and Analysis rmeta Meta-analysis Functions for simple fixed and random effects meta-analysis for two-sample comparisons and cumulative meta-analyses. Draws standard summary plots, funnel plots, and computes summaries and tests for association and heterogeneity
261 Clinical Trial Design, Monitoring, and Analysis samplesize Sample Size Calculation for Various t-Tests and Wilcoxon-Test Computes sample size for Student’s t-test and for the Wilcoxon-Mann-Whitney test for categorical data. The t-test function allows paired and unpaired (balanced / unbalanced) designs as well as homogeneous and heterogeneous variances. The Wilcoxon function allows for ties.
262 Clinical Trial Design, Monitoring, and Analysis seqmon (core) Group Sequential Design Class for Clinical Trials S4 class object for creating and managing group sequential designs. It calculates the efficacy and futility boundaries at each look. It allows modifying the design and tracking the design update history.
263 Clinical Trial Design, Monitoring, and Analysis speff2trial (core) Semiparametric efficient estimation for a two-sample treatment effect The package performs estimation and testing of the treatment effect in a 2-group randomized clinical trial with a quantitative, dichotomous, or right-censored time-to-event endpoint. The method improves efficiency by leveraging baseline predictors of the endpoint. The inverse probability weighting technique of Robins, Rotnitzky, and Zhao (JASA, 1994) is used to provide unbiased estimation when the endpoint is missing at random.
264 Clinical Trial Design, Monitoring, and Analysis ssanv Sample Size Adjusted for Nonadherence or Variability of Input Parameters A set of functions to calculate sample size for two-sample difference-in-means tests. Adjusts for either nonadherence or variability that comes from using data to estimate parameters.
265 Clinical Trial Design, Monitoring, and Analysis survival (core) Survival Analysis Contains the core survival analysis routines, including definition of Surv objects, Kaplan-Meier and Aalen-Johansen (multi-state) curves, Cox models, and parametric accelerated failure time models.
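A minimal sketch of the core survival workflow, using the lung cancer dataset bundled with the package:

```r
library(survival)

# Kaplan-Meier curves by sex for the built-in lung cancer data
fit <- survfit(Surv(time, status) ~ sex, data = lung)
summary(fit, times = c(180, 365))  # survival estimates at ~6 and 12 months
plot(fit, lty = 1:2, xlab = "Days", ylab = "Survival probability")

# Cox proportional hazards model with two covariates
coxph(Surv(time, status) ~ age + sex, data = lung)
```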
266 Clinical Trial Design, Monitoring, and Analysis TEQR (core) Target Equivalence Range Design The TEQR package contains software to calculate the operating characteristics for the TEQR and ACT designs. The TEQR (toxicity equivalence range) design is a toxicity-based cumulative cohort design with added safety rules. The ACT (activity constrained for toxicity) design is also a cumulative cohort design with additional safety rules; its unique feature is that dose is escalated based on lack of activity rather than on lack of toxicity, and is de-escalated only if an unacceptable level of toxicity is experienced.
267 Clinical Trial Design, Monitoring, and Analysis ThreeArmedTrials Design and Analysis of Clinical Non-Inferiority or Superiority Trials with Active and Placebo Control Design and analyze three-arm non-inferiority or superiority trials which follow a gold-standard design, i.e. trials with an experimental treatment, an active, and a placebo control.
268 Clinical Trial Design, Monitoring, and Analysis ThreeGroups ML Estimator for Baseline-Placebo-Treatment (Three-Group) Experiments Implements the Maximum Likelihood estimator for baseline, placebo, and treatment groups (three-group) experiments with non-compliance proposed by Gerber, Green, Kaplan, and Kern (2010).
269 Clinical Trial Design, Monitoring, and Analysis TrialSize (core) R Functions for Chapters 3, 4, 6, 7, 9, 10, 11, 12, 14, and 15 of Sample Size Calculation in Clinical Research Functions and examples from Sample Size Calculation in Clinical Research.
270 Cluster Analysis & Finite Mixture Models AdMit Adaptive Mixture of Student-t Distributions Provides functions to perform the fitting of an adaptive mixture of Student-t distributions to a target density through its kernel function as described in Ardia et al. (2009) doi:10.18637/jss.v029.i03. The mixture approximation can then be used as the importance density in importance sampling or as the candidate density in the Metropolis-Hastings algorithm to obtain quantities of interest for the target density itself.
271 Cluster Analysis & Finite Mixture Models ADPclust Fast Clustering Using Adaptive Density Peak Detection An implementation of the ADPclust clustering procedures (Fast Clustering Using Adaptive Density Peak Detection). The work builds and improves upon the idea of Rodriguez and Laio (2014) doi:10.1126/science.1242072. ADPclust clusters data by finding density peaks in a density-distance plot generated from local multivariate Gaussian density estimation. It includes an automatic centroid-selection and parameter-optimization algorithm, which finds the number of clusters and the cluster centroids by comparing average silhouettes on a grid of candidate clustering results; it also includes an interactive algorithm that allows the user to manually select cluster centroids from a two-dimensional density-distance plot. The research article associated with this package is Wang, Xiao-Feng, and Yifan Xu (2015) doi:10.1177/0962280215609948, “Fast clustering using adaptive density peak detection”, Statistical Methods in Medical Research. url: http://smm.sagepub.com/content/early/2015/10/15/0962280215609948.abstract.
272 Cluster Analysis & Finite Mixture Models amap Another Multidimensional Analysis Package Tools for Clustering and Principal Component Analysis (With robust methods, and parallelized functions).
273 Cluster Analysis & Finite Mixture Models apcluster Affinity Propagation Clustering Implements Affinity Propagation clustering introduced by Frey and Dueck (2007) doi:10.1126/science.1136800. The algorithms are largely analogous to the ‘Matlab’ code published by Frey and Dueck. The package further provides leveraged affinity propagation and an algorithm for exemplar-based agglomerative clustering that can also be used to join clusters obtained from affinity propagation. Various plotting functions are available for analyzing clustering results.
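A minimal sketch of affinity propagation with apcluster on simulated data; the two Gaussian point clouds below are fabricated for illustration.

```r
library(apcluster)

set.seed(1)
# Two illustrative 2-D point clouds
x <- rbind(matrix(rnorm(60, mean = 0), ncol = 2),
           matrix(rnorm(60, mean = 4), ncol = 2))

# Affinity propagation using negative squared Euclidean distances
# as the similarity measure
ap <- apcluster(negDistMat(r = 2), x)
length(ap@clusters)  # number of exemplar-based clusters found
plot(ap, x)          # clusters with their exemplars highlighted
```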
274 Cluster Analysis & Finite Mixture Models BayesLCA Bayesian Latent Class Analysis Bayesian Latent Class Analysis using several different methods.
275 Cluster Analysis & Finite Mixture Models bayesm Bayesian Inference for Marketing/Micro-Econometrics Covers many important models used in marketing and micro-econometrics applications. The package includes: Bayes Regression (univariate or multivariate dep var), Bayes Seemingly Unrelated Regression (SUR), Binary and Ordinal Probit, Multinomial Logit (MNL) and Multinomial Probit (MNP), Multivariate Probit, Negative Binomial (Poisson) Regression, Multivariate Mixtures of Normals (including clustering), Dirichlet Process Prior Density Estimation with normal base, Hierarchical Linear Models with normal prior and covariates, Hierarchical Linear Models with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a Dirichlet Process prior and covariates, Hierarchical Negative Binomial Regression Models, Bayesian analysis of choice-based conjoint data, Bayesian treatment of linear instrumental variables models, Analysis of Multivariate Ordinal survey data with scale usage heterogeneity (as in Rossi et al, JASA (01)), Bayesian Analysis of Aggregate Random Coefficient Logit Models as in BLP (see Jiang, Manchanda, Rossi 2009) For further reference, consult our book, Bayesian Statistics and Marketing by Rossi, Allenby and McCulloch (Wiley 2005) and Bayesian Non- and Semi-Parametric Methods and Applications (Princeton U Press 2014).
276 Cluster Analysis & Finite Mixture Models bayesMCClust Mixtures-of-Experts Markov Chain Clustering and Dirichlet Multinomial Clustering This package provides various Markov chain Monte Carlo (MCMC) samplers for model-based clustering of discrete-valued time series obtained by observing a categorical variable with several states (in a Bayesian approach). In order to analyze group membership, we also provide an extension to these approaches by formulating a probabilistic model for the latent group indicators within the Bayesian classification rule, using a multinomial logit model.
277 Cluster Analysis & Finite Mixture Models bayesmix Bayesian Mixture Models with JAGS The fitting of finite mixture models of univariate Gaussian distributions using JAGS within a Bayesian framework is provided.
278 Cluster Analysis & Finite Mixture Models bclust Bayesian Hierarchical Clustering Using Spike and Slab Models Builds a dendrogram using the log posterior as a natural distance defined by the model, while weighting the clustering variables. It is also capable of computing equivalent Bayesian discrimination probabilities. The adopted method suits small-sample, large-dimension settings. Model parameter estimation may be difficult, depending on the data structure and the chosen distribution family.
279 Cluster Analysis & Finite Mixture Models bgmm Gaussian Mixture Modeling Algorithms and the Belief-Based Mixture Modeling Two partially supervised mixture modeling methods: soft-label and belief-based modeling are implemented. For completeness, we equipped the package also with the functionality of unsupervised, semi- and fully supervised mixture modeling. The package can be applied also to selection of the best-fitting from a set of models with different component numbers or constraints on their structures. For detailed introduction see: Przemyslaw Biecek, Ewa Szczurek, Martin Vingron, Jerzy Tiuryn (2012), The R Package bgmm: Mixture Modeling with Uncertain Knowledge, Journal of Statistical Software doi:10.18637/jss.v047.i03.
280 Cluster Analysis & Finite Mixture Models biclust BiCluster Algorithms The main function biclust provides several algorithms to find biclusters in two-dimensional data: Cheng and Church, Spectral, Plaid Model, Xmotifs and Bimax. In addition, the package provides methods for data preprocessing (normalization and discretisation), visualisation, and validation of bicluster solutions.
281 Cluster Analysis & Finite Mixture Models Bmix Bayesian Sampling for Stick-Breaking Mixtures This is a bare-bones implementation of sampling algorithms for a variety of Bayesian stick-breaking (marginally DP) mixture models, including particle learning and Gibbs sampling for static DP mixtures, particle learning for dynamic BAR stick-breaking, and DP mixture regression. The software is designed to be easy to customize to suit different situations and for experimentation with stick-breaking models. Since particles are repeatedly copied, it is not an especially efficient implementation.
282 Cluster Analysis & Finite Mixture Models bmixture Bayesian Estimation for Finite Mixture of Distributions Provides statistical tools for Bayesian estimation of finite mixtures of distributions, mainly mixtures of Gamma, Normal, and t-distributions. The package implements recent improvements from the Bayesian literature on finite mixtures of distributions, including Mohammadi et al. (2013) doi:10.1007/s00180-012-0323-3 and Mohammadi and Salehi-Rad (2012) doi:10.1080/03610918.2011.588358.
283 Cluster Analysis & Finite Mixture Models cba Clustering for Business Analytics Implements clustering techniques such as Proximus and Rock, utility functions for efficient computation of cross distances and data manipulation.
284 Cluster Analysis & Finite Mixture Models cclust Convex Clustering Methods and Clustering Indexes Convex Clustering methods, including K-means algorithm, On-line Update algorithm (Hard Competitive Learning) and Neural Gas algorithm (Soft Competitive Learning), and calculation of several indexes for finding the number of clusters in a data set.
285 Cluster Analysis & Finite Mixture Models CEC Cross-Entropy Clustering Cross-Entropy Clustering (CEC) divides the data into Gaussian-type clusters. It performs automatic reduction of unnecessary clusters while simultaneously allowing the use of various types of Gaussian mixture models.
286 Cluster Analysis & Finite Mixture Models CHsharp Choi and Hall Style Data Sharpening Functions for use in perturbing data prior to use of nonparametric smoothers and clustering.
287 Cluster Analysis & Finite Mixture Models clue Cluster Ensembles CLUster Ensembles.
288 Cluster Analysis & Finite Mixture Models cluster (core) “Finding Groups in Data”: Cluster Analysis Extended Rousseeuw et al. Methods for cluster analysis, much extended from the original by Peter Rousseeuw, Anja Struyf and Mia Hubert, based on Kaufman and Rousseeuw (1990) “Finding Groups in Data”.
289 Cluster Analysis & Finite Mixture Models clusterfly Explore clustering interactively using R and GGobi Visualise clustering algorithms with GGobi. Contains both general code for visualising clustering results and specific visualisations for model-based, hierarchical and SOM clustering.
290 Cluster Analysis & Finite Mixture Models clusterGeneration Random Cluster Generation (with Specified Degree of Separation) Provides functions for generating random clusters, generating random covariance/correlation matrices, calculating a separation index (data and population versions) for pairs of clusters or cluster distributions, and producing 1-D and 2-D projection plots to visualize clusters. The package also contains a function to generate random clusters based on factorial designs with factors such as degree of separation, number of clusters, number of variables, and number of noisy variables.
291 Cluster Analysis & Finite Mixture Models clusterRepro Reproducibility of gene expression clusters A function for validating microarray clusters via reproducibility.
292 Cluster Analysis & Finite Mixture Models clusterSim Searching for Optimal Clustering Procedure for a Data Set Distance measures (GDM1, GDM2, Sokal-Michener, Bray-Curtis, for symbolic interval-valued data), cluster quality indices (Calinski-Harabasz, Baker-Hubert, Hubert-Levine, Silhouette, Krzanowski-Lai, Hartigan, Gap, Davies-Bouldin), data normalization formulas, data generation (typical and non-typical data), HINoV method, replication analysis, linear ordering methods, spectral clustering, agreement indices between two partitions, plot functions (for categorical and symbolic interval-valued data). (MILLIGAN, G.W., COOPER, M.C. (1985) doi:10.1007/BF02294245, HUBERT, L., ARABIE, P. (1985) doi:10.1007/BF01908075, RAND, W.M. (1971) doi:10.1080/01621459.1971.10482356, JAJUGA, K., WALESIAK, M. (2000) doi:10.1007/978-3-642-57280-7_11, MILLIGAN, G.W., COOPER, M.C. (1988) doi:10.1007/BF01897163, CORMACK, R.M. (1971) doi:10.2307/2344237, JAJUGA, K., WALESIAK, M., BAK, A. (2003) doi:10.1007/978-3-642-55721-7_12, CARMONE, F.J., KARA, A., MAXWELL, S. (1999) doi:10.2307/3152003, DAVIES, D.L., BOULDIN, D.W. (1979) doi:10.1109/TPAMI.1979.4766909, CALINSKI, T., HARABASZ, J. (1974) doi:10.1080/03610927408827101, HUBERT, L. (1974) doi:10.1080/01621459.1974.10480191, TIBSHIRANI, R., WALTHER, G., HASTIE, T. (2001) doi:10.1111/1467-9868.00293, KRZANOWSKI, W.J., LAI, Y.T. (1988) doi:10.2307/2531893, BRECKENRIDGE, J.N. (2000) doi:10.1207/S15327906MBR3502_5, WALESIAK, M., DUDEK, A. (2008) doi:10.1007/978-3-540-78246-9_11).
293 Cluster Analysis & Finite Mixture Models clustMixType k-Prototypes Clustering for Mixed Variable-Type Data Functions to perform k-prototypes partitioning clustering for mixed variable-type data according to Z.Huang (1998): Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Variables, Data Mining and Knowledge Discovery 2, 283-304, doi:10.1023/A:1009769707641.
294 Cluster Analysis & Finite Mixture Models clustvarsel Variable Selection for Gaussian Model-Based Clustering Variable selection for Gaussian model-based clustering as implemented in the ‘mclust’ package. The methodology finds the (locally) optimal subset of variables in a data set that carry group/cluster information. A greedy or headlong search can be used, either in a forward-backward or backward-forward direction, with or without sub-sampling at the hierarchical clustering stage for starting ‘mclust’ models. By default the algorithm uses a sequential search, but parallelisation is also available.
295 Cluster Analysis & Finite Mixture Models clv Cluster Validation Techniques Contains most of the popular internal and external cluster validation methods, ready to use with most of the outputs produced by functions from the ‘cluster’ package. Also contains functions and usage examples for a cluster stability approach that can be applied to algorithms implemented in the ‘cluster’ package as well as to user-defined clustering algorithms.
296 Cluster Analysis & Finite Mixture Models clValid Validation of Clustering Results Statistical and biological validation of clustering results.
297 Cluster Analysis & Finite Mixture Models CoClust Copula Based Cluster Analysis Copula Based Cluster Analysis.
298 Cluster Analysis & Finite Mixture Models compHclust Complementary Hierarchical Clustering Performs the complementary hierarchical clustering procedure and returns X’ (the expected residual matrix) and a vector of the relative gene importances.
299 Cluster Analysis & Finite Mixture Models dbscan Density Based Clustering of Applications with Noise (DBSCAN) and Related Algorithms A fast reimplementation of several density-based algorithms of the DBSCAN family for spatial data. Includes the DBSCAN (density-based spatial clustering of applications with noise) and OPTICS (ordering points to identify the clustering structure) clustering algorithms, HDBSCAN (hierarchical DBSCAN), and the LOF (local outlier factor) algorithm. The implementations use the kd-tree data structure (from library ANN) for faster k-nearest neighbor search. An R interface to fast kNN and fixed-radius NN search is also provided.
300 Cluster Analysis & Finite Mixture Models dendextend Extending ‘Dendrogram’ Functionality in R Offers a set of functions for extending ‘dendrogram’ objects in R, letting you visualize and compare trees of ‘hierarchical clusterings’. You can (1) Adjust a tree’s graphical parameters - the color, size, type, etc of its branches, nodes and labels. (2) Visually and statistically compare different ‘dendrograms’ to one another.
301 Cluster Analysis & Finite Mixture Models depmix Dependent Mixture Models Fits (multigroup) mixtures of latent or hidden Markov models on mixed categorical and continuous (timeseries) data. The Rdonlp2 package can optionally be used for optimization of the log-likelihood and is available from R-forge.
302 Cluster Analysis & Finite Mixture Models depmixS4 Dependent Mixture Models - Hidden Markov Models of GLMs and Other Distributions in S4 Fits latent (hidden) Markov models on mixed categorical and continuous (time series) data, otherwise known as dependent mixture models.
303 Cluster Analysis & Finite Mixture Models dpmixsim Dirichlet Process Mixture model simulation for clustering and image segmentation The package implements a Dirichlet Process Mixture (DPM) model for clustering and image segmentation. The DPM model is a Bayesian nonparametric methodology that relies on MCMC simulations for exploring mixture models with an unknown number of components. The code implements conjugate models with normal structure (conjugate normal-normal DP mixture model). The package’s applications are oriented towards the classification of magnetic resonance images according to tissue type or region of interest.
304 Cluster Analysis & Finite Mixture Models dynamicTreeCut Methods for Detection of Clusters in Hierarchical Clustering Dendrograms Contains methods for detection of clusters in hierarchical clustering dendrograms.
305 Cluster Analysis & Finite Mixture Models e1071 Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien Functions for latent class analysis, short time Fourier transform, fuzzy clustering, support vector machines, shortest path computation, bagged clustering, naive Bayes classifier, …
306 Cluster Analysis & Finite Mixture Models edci Edge Detection and Clustering in Images Detection of edge points in images based on the difference of two asymmetric M-kernel estimators. Linear and circular regression clustering based on redescending M-estimators. Detection of linear edges in images.
307 Cluster Analysis & Finite Mixture Models EMCluster EM Algorithm for Model-Based Clustering of Finite Mixture Gaussian Distribution EM algorithms and several efficient initialization methods for model-based clustering of finite mixture Gaussian distributions with unstructured dispersion, in both unsupervised and semi-supervised learning.
308 Cluster Analysis & Finite Mixture Models evclust Evidential Clustering Various clustering algorithms that produce a credal partition, i.e., a set of Dempster-Shafer mass functions representing the membership of objects to clusters. The mass functions quantify the cluster-membership uncertainty of the objects. The algorithms are: Evidential c-Means (ECM), Relational Evidential c-Means (RECM), Constrained Evidential c-Means (CECM), EVCLUS and EK-NNclus.
309 Cluster Analysis & Finite Mixture Models FactoClass Combination of Factorial Methods and Cluster Analysis Some functions of ‘ade4’ and ‘stats’ are combined to obtain a partition of the rows of a data table, with columns representing variables of quantitative, qualitative or frequency scales. First, a principal axes method is performed; then, a combination of Ward agglomerative hierarchical classification and K-means is performed using some of the first coordinates obtained from the principal axes method. The function ‘kmeansW’, a modification of ‘kmeans’ programmed in C++, is included to allow different weights for the elements to be clustered. Some complementary functions and datasets are included. See, for example: Lebart, L., Piron, M. and Morineau, A. (2006). Statistique exploratoire multidimensionnelle, Dunod, Paris.
310 Cluster Analysis & Finite Mixture Models fastcluster Fast Hierarchical Clustering Routines for R and Python This is a two-in-one package which provides interfaces to both R and Python. It implements fast hierarchical, agglomerative clustering routines. Part of the functionality is designed as drop-in replacement for existing routines: linkage() in the SciPy package ‘scipy.cluster.hierarchy’, hclust() in R’s ‘stats’ package, and the ‘flashClust’ package. It provides the same functionality with the benefit of a much faster implementation. Moreover, there are memory-saving routines for clustering of vector data, which go beyond what the existing packages provide. For information on how to install the Python files, see the file INSTALL in the source distribution.
311 Cluster Analysis & Finite Mixture Models fclust Fuzzy Clustering Algorithms for fuzzy clustering, cluster validity indices and plots for cluster validity and visualizing fuzzy clustering results.
312 Cluster Analysis & Finite Mixture Models FisherEM The Fisher-EM algorithm Provides an efficient algorithm for the unsupervised classification of high-dimensional data. The FisherEM algorithm models and clusters the data in a discriminative and low-dimensional latent subspace, and also provides a low-dimensional representation of the clustered data. A sparse version of the Fisher-EM algorithm is also provided.
313 Cluster Analysis & Finite Mixture Models flashClust Implementation of optimal hierarchical clustering Fast implementation of hierarchical clustering
314 Cluster Analysis & Finite Mixture Models flexclust (core) Flexible Cluster Algorithms The main function kcca implements a general framework for k-centroids cluster analysis supporting arbitrary distance measures and centroid computation. Further cluster methods include hard competitive learning, neural gas, and QT clustering. There are numerous visualization methods for cluster results (neighborhood graphs, convex cluster hulls, barcharts of centroids, …), and bootstrap methods for the analysis of cluster stability.
315 Cluster Analysis & Finite Mixture Models flexCWM Flexible Cluster-Weighted Modeling Allows for maximum likelihood fitting of cluster-weighted models, a class of mixtures of regression models with random covariates.
316 Cluster Analysis & Finite Mixture Models flexmix (core) Flexible Mixture Modeling A general framework for finite mixtures of regression models using the EM algorithm is implemented. The package provides the E-step and all data handling, while the M-step can be supplied by the user to easily define new models. Existing drivers implement mixtures of standard linear models, generalized linear models and model-based clustering.
317 Cluster Analysis & Finite Mixture Models fpc Flexible Procedures for Clustering Various methods for clustering and cluster validation. Fixed point clustering. Linear regression clustering. Clustering by merging Gaussian mixture components. Symmetric and asymmetric discriminant projections for visualisation of the separation of groupings. Cluster validation statistics for distance based clustering including corrected Rand index. Cluster-wise cluster stability assessment. Methods for estimation of the number of clusters: Calinski-Harabasz, Tibshirani and Walther’s prediction strength, Fang and Wang’s bootstrap stability. Gaussian/multinomial mixture fitting for mixed continuous/categorical variables. Variable-wise statistics for cluster interpretation. DBSCAN clustering. Interface functions for many clustering methods implemented in R, including estimating the number of clusters with kmeans, pam and clara. Modality diagnosis for Gaussian mixtures. For an overview see package?fpc.
318 Cluster Analysis & Finite Mixture Models FunCluster Functional Profiling of Microarray Expression Data FunCluster performs a functional analysis of microarray expression data based on Gene Ontology & KEGG functional annotations. From expression data and functional annotations FunCluster builds classes of putatively co-regulated biological processes through a specially designed clustering procedure.
319 Cluster Analysis & Finite Mixture Models funFEM Clustering in the Discriminative Functional Subspace The funFEM algorithm (Bouveyron et al., 2014) clusters functional data by modeling the curves within a common and discriminative functional subspace.
320 Cluster Analysis & Finite Mixture Models funHDDC Model-based clustering in group-specific functional subspaces Provides the funHDDC algorithm (Bouveyron & Jacques, 2011), which clusters functional data by modeling each group within a specific functional subspace.
321 Cluster Analysis & Finite Mixture Models gamlss.mx Fitting Mixture Distributions with GAMLSS The main purpose of this package is to allow fitting of mixture distributions with GAMLSS models.
322 Cluster Analysis & Finite Mixture Models genie A New, Fast, and Outlier Resistant Hierarchical Clustering Algorithm A new hierarchical clustering linkage criterion: the Genie algorithm links two clusters in such a way that a chosen economic inequity measure (e.g., the Gini index) of the cluster sizes does not increase drastically above a given threshold. Benchmarks indicate a high practical usefulness of the introduced method: it most often outperforms the Ward or average linkage in terms of the clustering quality while retaining the single linkage speed, see (Gagolewski et al. 2016a doi:10.1016/j.ins.2016.05.003, 2016b doi:10.1007/978-3-319-45656-0_16) for more details.
323 Cluster Analysis & Finite Mixture Models GLDEX Fitting Single and Mixture of Generalised Lambda Distributions (RS and FMKL) using Various Methods The fitting algorithms considered in this package have two major objectives. One is to provide a smoothing device to fit distributions to data using the weighted and unweighted discretised approaches based on the bin width of the histogram. The other is to provide a definitive fit to the data set using maximum likelihood and quantile matching estimation. Other methods such as moment matching, the starship method, and L-moment matching are also provided. Goodness-of-fit diagnostics can be done via qqplots, KS-resample tests, and by comparing the mean, variance, skewness and kurtosis of the data with those of the fitted distribution.
324 Cluster Analysis & Finite Mixture Models GSM Gamma Shape Mixture Implementation of a Bayesian approach for estimating a mixture of gamma distributions in which the mixing occurs over the shape parameter. This family provides a flexible and novel approach for modeling heavy-tailed distributions; it is computationally efficient and only requires specifying a prior distribution for a single parameter.
325 Cluster Analysis & Finite Mixture Models HDclassif High Dimensional Supervised Classification and Clustering Discriminant analysis and data clustering methods for high-dimensional data, based on the assumption that high-dimensional data live in different subspaces with low dimensionality. Proposes a new parametrization of the Gaussian mixture model which combines the ideas of dimension reduction and constraints on the model.
326 Cluster Analysis & Finite Mixture Models hybridHclust Hybrid Hierarchical Clustering Hybrid hierarchical clustering via mutual clusters. A mutual cluster is a set of points closer to each other than to all other points. Mutual clusters are used to enrich top-down hierarchical clustering.
327 Cluster Analysis & Finite Mixture Models idendr0 Interactive Dendrograms Interactive dendrogram that enables the user to select and color clusters, to zoom and pan the dendrogram, and to visualize the clustered data not only in a built-in heat map, but also in ‘GGobi’ interactive plots and user-supplied plots. This is a backport of Qt-based ‘idendro’ (https://github.com/tsieger/idendro) to base R graphics and Tcl/Tk GUI.
328 Cluster Analysis & Finite Mixture Models isopam Isopam (Clustering) Isopam clustering algorithm and utilities. Isopam optimizes clusters and optionally cluster numbers in a brute force style and aims at an optimum separation by all or some descriptors (typically species).
329 Cluster Analysis & Finite Mixture Models kernlab Kernel-Based Machine Learning Lab Kernel-based machine learning methods for classification, regression, clustering, novelty detection, quantile regression and dimensionality reduction. Among other methods ‘kernlab’ includes Support Vector Machines, Spectral Clustering, Kernel PCA, Gaussian Processes and a QP solver.
330 Cluster Analysis & Finite Mixture Models kml K-Means for Longitudinal Data An implementation of k-means specifically designed to cluster longitudinal data. It provides facilities to deal with missing values, computes several quality criteria (Calinski and Harabasz, Ray and Turi, Davies and Bouldin, BIC, …), and offers a graphical interface for choosing the ‘best’ number of clusters.
331 Cluster Analysis & Finite Mixture Models largeVis High-Quality Visualizations of Large, High-Dimensional Datasets Implements the largeVis algorithm (see Tang, et al. (2016) doi:10.1145/2872427.2883041) for visualizing very large high-dimensional datasets. Also provides very fast search for approximate nearest neighbors, outlier detection, optimized implementations of the HDBSCAN*, DBSCAN and OPTICS clustering algorithms, and plotting functions for visualizing the above.
332 Cluster Analysis & Finite Mixture Models latentnet Latent Position and Cluster Models for Statistical Networks Fit and simulate latent position and cluster models for statistical networks.
333 Cluster Analysis & Finite Mixture Models lcmm Extended Mixed Models Using Latent Classes and Latent Processes Estimation of various extensions of mixed models, including latent class mixed models, joint latent class mixed models, and mixed models for curvilinear univariate or multivariate longitudinal outcomes, using a maximum likelihood estimation method.
334 Cluster Analysis & Finite Mixture Models longclust Model-Based Clustering and Classification for Longitudinal Data Clustering or classification of longitudinal data based on a mixture of multivariate t or Gaussian distributions with a Cholesky-decomposed covariance structure.
335 Cluster Analysis & Finite Mixture Models mcclust Process an MCMC Sample of Clusterings Implements methods for processing a sample of (hard) clusterings, e.g. the MCMC output of a Bayesian clustering model. Among them are methods that find a single best clustering to represent the sample, which are based on the posterior similarity matrix or a relabelling algorithm.
336 Cluster Analysis & Finite Mixture Models mclust (core) Gaussian Mixture Modelling for Model-Based Clustering, Classification, and Density Estimation Gaussian finite mixture models fitted via EM algorithm for model-based clustering, classification, and density estimation, including Bayesian regularization, dimension reduction for visualisation, and resampling-based inference.
337 Cluster Analysis & Finite Mixture Models MetabolAnalyze Probabilistic latent variable models for metabolomic data Fits probabilistic principal components analysis, probabilistic principal components and covariates analysis and mixtures of probabilistic principal components models to metabolomic spectral data.
338 Cluster Analysis & Finite Mixture Models mixAK Multivariate Normal Mixture Models and Mixtures of Generalized Linear Mixed Models Including Model Based Clustering Contains a mixture of statistical methods including the MCMC methods to analyze normal mixtures. Additionally, model based clustering methods are implemented to perform classification based on (multivariate) longitudinal (or otherwise correlated) data. The basis for such clustering is a mixture of multivariate generalized linear mixed models.
339 Cluster Analysis & Finite Mixture Models mixdist Finite Mixture Distribution Models This package contains functions for fitting finite mixture distribution models to grouped data and conditional data by the method of maximum likelihood using a combination of a Newton-type algorithm and the EM algorithm.
340 Cluster Analysis & Finite Mixture Models mixer Random graph clustering Routines for the analysis (unsupervised clustering) of networks using MIXtures of Erdos-Renyi random graphs.
341 Cluster Analysis & Finite Mixture Models mixPHM Mixtures of Proportional Hazard Models Fits multiple variable mixtures of various parametric proportional hazard models using the EM-Algorithm. Proportionality restrictions can be imposed on the latent groups and/or on the variables. Several survival distributions can be specified. Missing values and censored values are allowed. Independence is assumed over the single variables.
342 Cluster Analysis & Finite Mixture Models mixRasch Mixture Rasch Models with JMLE Estimates Rasch models and mixture Rasch models, including the dichotomous Rasch model, the rating scale model, and the partial credit model.
343 Cluster Analysis & Finite Mixture Models mixreg Functions to fit mixtures of regressions Fits mixtures of (possibly multivariate) regressions (which has been described as doing ANCOVA when you don’t know the levels).
344 Cluster Analysis & Finite Mixture Models MixSim Simulating Data to Study Performance of Clustering Algorithms Simulates mixtures of Gaussian distributions with different levels of overlap between mixture components. Pairwise overlap, defined as the sum of two misclassification probabilities, measures the degree of interaction between components and can be readily employed to control the clustering complexity of datasets simulated from mixtures. These datasets can then be used for systematic performance investigation of clustering and finite mixture modeling algorithms. Other capabilities of ‘MixSim’ include computing the exact overlap for Gaussian mixtures, simulating Gaussian and non-Gaussian data, simulating outliers and noise variables, calculating various measures of agreement between two partitionings, and constructing parallel distribution plots for the graphical display of finite mixture models.
345 Cluster Analysis & Finite Mixture Models mixsmsn Fitting Finite Mixture of Scale Mixture of Skew-Normal Distributions Functions to fit finite mixture of scale mixture of skew-normal (FM-SMSN) distributions.
346 Cluster Analysis & Finite Mixture Models mixtools Tools for Analyzing Finite Mixture Models Analyzes finite mixture models for various parametric and semiparametric settings. This includes mixtures of parametric distributions (normal, multivariate normal, multinomial, gamma), various Reliability Mixture Models (RMMs), mixtures-of-regressions settings (linear regression, logistic regression, Poisson regression, linear regression with changepoints, predictor-dependent mixing proportions, random effects regressions, hierarchical mixtures-of-experts), and tools for selecting the number of components (bootstrapping the likelihood ratio test statistic and model selection criteria). Bayesian estimation of mixtures-of-linear-regressions models is available as well as a novel data depth method for obtaining credible bands. This package is based upon work supported by the National Science Foundation under Grant No. SES-0518772.
347 Cluster Analysis & Finite Mixture Models mixture Mixture Models for Clustering and Classification An implementation of all 14 Gaussian parsimonious clustering models (GPCMs) for model-based clustering and model-based classification.
348 Cluster Analysis & Finite Mixture Models MOCCA Multi-objective optimization for collecting cluster alternatives This package provides methods to analyze cluster alternatives based on multi-objective optimization of cluster validation indices.
349 Cluster Analysis & Finite Mixture Models movMF Mixtures of von Mises-Fisher Distributions Fit and simulate mixtures of von Mises-Fisher distributions.
350 Cluster Analysis & Finite Mixture Models mritc MRI Tissue Classification Various methods for MRI tissue classification.
351 Cluster Analysis & Finite Mixture Models NbClust Determining the Best Number of Clusters in a Data Set Provides 30 indices for determining the optimal number of clusters in a data set and proposes the best clustering scheme from the different results obtained.
352 Cluster Analysis & Finite Mixture Models nor1mix Normal (1-d) Mixture Models (S3 Classes and Methods) One-dimensional normal mixture model classes for, e.g., density estimation or research and teaching on clustering algorithms; provides the widely used Marron-Wand densities. Includes efficient random number generation and graphics; now fits mixtures to data by ML (maximum likelihood) or EM estimation.
353 Cluster Analysis & Finite Mixture Models optpart Optimal Partitioning of Similarity Relations Contains a set of algorithms for creating partitions and coverings of objects largely based on operations on (dis)similarity relations (or matrices). There are several iterative re-assignment algorithms optimizing different goodness-of-clustering criteria. In addition, there are covering algorithms ‘clique’ which derives maximal cliques, and ‘maxpact’ which creates a covering of maximally compact sets. Graphical analyses and conversion routines are also included.
354 Cluster Analysis & Finite Mixture Models ORIClust Order-restricted Information Criterion-based Clustering Algorithm ORIClust is a user-friendly R-based software package for gene clustering. Clusters are given by genes matched to prespecified profiles across various ordered treatment groups. It is particularly useful for analyzing data obtained from short time-course or dose-response microarray experiments.
355 Cluster Analysis & Finite Mixture Models pdfCluster Cluster analysis via nonparametric density estimation Performs cluster analysis via nonparametric density estimation. Operationally, the kernel method is used throughout to estimate the density. Diagnostic methods for evaluating the quality of the clustering are available. The package also includes a routine to estimate the probability density function obtained by the kernel method, given a set of data of arbitrary dimension.
356 Cluster Analysis & Finite Mixture Models pendensity Density Estimation with a Penalized Mixture Approach Estimation of univariate (conditional) densities using penalized B-splines with automatic selection of optimal smoothing parameter.
357 Cluster Analysis & Finite Mixture Models pgmm Parsimonious Gaussian Mixture Models Carries out model-based clustering or classification using parsimonious Gaussian mixture models.
358 Cluster Analysis & Finite Mixture Models pmclust Parallel Model-Based Clustering using Expectation-Gathering-Maximization Algorithm for Finite Mixture Gaussian Model Aims to utilize model-based clustering (unsupervised) for high-dimensional and ultra-large data, especially in a distributed manner. The code employs pbdMPI to perform an expectation-gathering-maximization algorithm for finite mixture Gaussian models. Unstructured dispersion matrices are assumed in the Gaussian models. The implementation defaults to the single-program multiple-data (SPMD) programming model, and the code can be executed through pbdMPI independently of most MPI applications. See the High Performance Statistical Computing website for more information, documents and examples.
359 Cluster Analysis & Finite Mixture Models poLCA Polytomous variable Latent Class Analysis Latent class analysis and latent class regression models for polytomous outcome variables. Also known as latent structure analysis.
360 Cluster Analysis & Finite Mixture Models prabclus Functions for Clustering of Presence-Absence, Abundance and Multilocus Genetic Data Distance-based parametric bootstrap tests for clustering with spatial neighborhood information. Some distance measures, clustering of presence-absence, abundance and multilocus genetic data for species delimitation, and nearest-neighbor-based noise detection. Try package?prabclus for an overview.
361 Cluster Analysis & Finite Mixture Models prcr Person-Centered Analysis Provides an easy-to-use yet adaptable set of tools to conduct person-centered analysis using a two-step clustering procedure. As described in Bergman and El-Khouri (1999) doi:10.1002/(SICI)1521-4036(199910)41:6<753::AID-BIMJ753>3.0.CO;2-K, hierarchical clustering is performed to determine the initial partition for the subsequent k-means clustering procedure.
362 Cluster Analysis & Finite Mixture Models PReMiuM Dirichlet Process Bayesian Clustering, Profile Regression Bayesian clustering using a Dirichlet process mixture model. This model is an alternative to regression models, non-parametrically linking a response vector to covariate data through cluster membership. The package allows Bernoulli, Binomial, Poisson, Normal, survival and categorical responses, as well as Normal and discrete covariates. It also allows for fixed effects in the response model, where a spatial CAR (conditional autoregressive) term can also be included. Additionally, predictions may be made for the response, and missing values for the covariates are handled. Several samplers and label switching moves are implemented along with diagnostic tools to assess convergence. A number of R functions for post-processing of the output are also provided. In addition to fitting mixtures, it may be of interest to determine which covariates actively drive the mixture components; this is implemented in the package as variable selection.
363 Cluster Analysis & Finite Mixture Models profdpm Profile Dirichlet Process Mixtures This package facilitates profile inference (inference at the posterior mode) for a class of product partition models (PPM). The Dirichlet process mixture is currently the only available member of this class. These methods search for the maximum a posteriori (MAP) estimate for the data partition in a PPM.
364 Cluster Analysis & Finite Mixture Models protoclust Hierarchical Clustering with Prototypes Performs minimax linkage hierarchical clustering. Every cluster has an associated prototype element that represents that cluster as described in Bien, J., and Tibshirani, R. (2011), “Hierarchical Clustering with Prototypes via Minimax Linkage,” accepted for publication in The Journal of the American Statistical Association, DOI: 10.1198/jasa.2011.tm10183.
365 Cluster Analysis & Finite Mixture Models psychomix Psychometric Mixture Models Psychometric mixture models based on ‘flexmix’ infrastructure. At the moment Rasch mixture models with different parameterizations of the score distribution (saturated vs. mean/variance specification), Bradley-Terry mixture models, and MPT mixture models are implemented. These mixture models can be estimated with or without concomitant variables. See vignette(‘raschmix’, package = ‘psychomix’) for details on the Rasch mixture models.
366 Cluster Analysis & Finite Mixture Models pvclust Hierarchical Clustering with P-Values via Multiscale Bootstrap Resampling An implementation of multiscale bootstrap resampling for assessing the uncertainty in hierarchical cluster analysis. It provides AU (approximately unbiased) p-value as well as BP (bootstrap probability) value for each cluster in a dendrogram.
367 Cluster Analysis & Finite Mixture Models randomLCA Random Effects Latent Class Analysis Fits standard and random effects latent class models. The single level random effects model is described in Qu et al doi:10.2307/2533043 and the two level random effects model in Beath and Heller doi:10.1177/1471082X0800900302. Examples are given for their use in diagnostic testing.
368 Cluster Analysis & Finite Mixture Models rjags Bayesian Graphical Models using MCMC Interface to the JAGS MCMC library.
369 Cluster Analysis & Finite Mixture Models Rmixmod (core) Supervised, Unsupervised, Semi-Supervised Classification with MIXture MODelling (Interface of MIXMOD Software) Interface to the MIXMOD software for supervised, unsupervised and semi-supervised classification with MIXture MODelling.
370 Cluster Analysis & Finite Mixture Models RPMM Recursively Partitioned Mixture Model Recursively Partitioned Mixture Model for Beta and Gaussian Mixtures. This is a model-based clustering algorithm that returns a hierarchy of classes, similar to hierarchical clustering, but also similar to finite mixture models.
371 Cluster Analysis & Finite Mixture Models seriation Infrastructure for Ordering Objects Using Seriation Infrastructure for seriation with an implementation of several seriation/sequencing techniques to reorder matrices, dissimilarity matrices, and dendrograms. Also provides (optimally) reordered heatmaps, color images and clustering visualizations like dissimilarity plots, and visual assessment of cluster tendency plots (VAT and iVAT).
372 Cluster Analysis & Finite Mixture Models sigclust Statistical Significance of Clustering SigClust is a statistical method for testing the significance of clustering results. SigClust can be applied to assess the statistical significance of splitting a data set into two clusters. For more than two clusters, SigClust can be used iteratively.
373 Cluster Analysis & Finite Mixture Models skmeans Spherical k-Means Clustering Algorithms to compute spherical k-means partitions. Features several methods, including a genetic and a fixed-point algorithm and an interface to the CLUTO vcluster program.
374 Cluster Analysis & Finite Mixture Models som Self-Organizing Map Self-Organizing Map (with application in gene clustering).
375 Cluster Analysis & Finite Mixture Models sparcl Perform sparse hierarchical clustering and sparse k-means clustering Implements the sparse clustering methods of Witten and Tibshirani (2010): “A framework for feature selection in clustering”; published in Journal of the American Statistical Association 105(490): 713-726.
376 Cluster Analysis & Finite Mixture Models tclust Robust Trimmed Clustering Provides functions for robust trimmed clustering. The methods are described in Garcia-Escudero (2008) doi:10.1214/07-AOS515, Fritz et al. (2012) doi:10.18637/jss.v047.i12 and others.
377 Cluster Analysis & Finite Mixture Models teigen Model-Based Clustering and Classification with the Multivariate t Distribution Fits mixtures of multivariate t-distributions (with eigen-decomposed covariance structure) via the expectation conditional-maximization algorithm under a clustering or classification paradigm.
378 Cluster Analysis & Finite Mixture Models treeClust Cluster Distances Through Trees Create a measure of inter-point dissimilarity useful for clustering mixed data, and, optionally, perform the clustering.
379 Cluster Analysis & Finite Mixture Models trimcluster Cluster analysis with trimming Trimmed k-means clustering.
380 Cluster Analysis & Finite Mixture Models wle Weighted Likelihood Estimation An approach to robustness via weighted likelihood.
381 Differential Equations adaptivetau Tau-Leaping Stochastic Simulation Implements adaptive tau leaping to approximate the trajectory of a continuous-time stochastic process as described by Cao et al. (2007) The Journal of Chemical Physics doi:10.1063/1.2745299. This package is based upon work supported by NSF DBI-0906041 and NIH K99-GM104158 to Philip Johnson and NIH R01-AI049334 to Rustom Antia.
382 Differential Equations bvpSolve (core) Solvers for Boundary Value Problems of Differential Equations Functions that solve boundary value problems (‘BVP’) of systems of ordinary differential equations (‘ODE’) and differential algebraic equations (‘DAE’). The functions provide an interface to the FORTRAN functions ‘twpbvpC’, ‘colnew/colsys’, and an R-implementation of the shooting method.
383 Differential Equations cOde Automated C Code Generation for ‘deSolve’, ‘bvpSolve’ and ‘Sundials’ Generates all necessary C functions allowing the user to work with the compiled-code interface of ode() and bvptwp(). The implementation supports “forcings” and “events”. Also provides functions to symbolically compute Jacobians, sensitivity equations and adjoint sensitivities, which form the basis for sensitivity analysis. As an alternative to ‘deSolve’, the Sundials ‘CVODES’ solver is implemented for computation of model sensitivities.
384 Differential Equations CollocInfer Collocation Inference for Dynamic Systems These functions implement collocation inference for continuous-time and discrete-time stochastic processes. They provide model-based smoothing, gradient-matching, generalized profiling and forward prediction error methods.
385 Differential Equations deSolve (core) Solvers for Initial Value Problems of Differential Equations (‘ODE’, ‘DAE’, ‘DDE’) Functions that solve initial value problems of a system of first-order ordinary differential equations (‘ODE’), of partial differential equations (‘PDE’), of differential algebraic equations (‘DAE’), and of delay differential equations. The functions provide an interface to the FORTRAN functions ‘lsoda’, ‘lsodar’, ‘lsode’, ‘lsodes’ of the ‘ODEPACK’ collection, to the FORTRAN functions ‘dvode’, ‘zvode’ and ‘daspk’ and a C-implementation of solvers of the ‘Runge-Kutta’ family with fixed or variable time steps. The package contains routines designed for solving ‘ODEs’ resulting from 1-D, 2-D and 3-D partial differential equations (‘PDE’) that have been converted to ‘ODEs’ by numerical differencing.
386 Differential Equations deTestSet Test Set for Differential Equations Solvers and test set for stiff and non-stiff differential equations, and differential algebraic equations.
387 Differential Equations dMod Dynamic Modeling and Parameter Estimation in ODE Models The framework provides functions to generate ODEs of reaction networks, parameter transformations, observation functions, residual functions, etc. The framework follows the paradigm that derivative information should be used for optimization whenever possible. Therefore, all major functions produce and can handle expressions for symbolic derivatives.
388 Differential Equations ecolMod “A practical guide to ecological modelling - using R as a simulation platform” Figures, data sets and examples from the book “A practical guide to ecological modelling - using R as a simulation platform” by Karline Soetaert and Peter MJ Herman (2009). Springer. All figures from chapter x can be generated by “demo(chapx)”, where x = 1 to 11. The R-scripts of the model examples discussed in the book are in subdirectory “examples”, ordered per chapter. Solutions to model projects are in the same subdirectories.
389 Differential Equations FME A Flexible Modelling Environment for Inverse Modelling, Sensitivity, Identifiability and Monte Carlo Analysis Provides functions to help in fitting models to data and to perform Monte Carlo, sensitivity and identifiability analysis. It is intended to work with models written as a set of differential equations that are solved either by an integration routine from package ‘deSolve’ or a steady-state solver from package ‘rootSolve’. However, the methods can also be used with other types of functions.
390 Differential Equations GillespieSSA Gillespie’s Stochastic Simulation Algorithm (SSA) GillespieSSA provides a simple-to-use, intuitive, and extensible interface to several stochastic simulation algorithms for generating simulated trajectories of finite-population continuous-time models. Currently it implements Gillespie’s exact stochastic simulation algorithm (Direct method) and several approximate methods (Explicit tau-leap, Binomial tau-leap, and Optimized tau-leap). The package also contains a library of template models that can be run as demo models and can easily be customized and extended. Currently the following models are included: decaying-dimerization reaction set, linear chain system, logistic growth model, Lotka predator-prey model, Rosenzweig-MacArthur predator-prey model, Kermack-McKendrick SIR model, and a metapopulation SIRS model.
391 Differential Equations mkin Kinetic Evaluation of Chemical Degradation Data Calculation routines based on the FOCUS Kinetics Report (2006, 2014). Includes a function for conveniently defining differential equation models, model solution based on eigenvalues if possible or using numerical solvers, and a choice of the optimisation methods made available by the ‘FME’ package. If a C compiler (on Windows: ‘Rtools’) is installed, differential equation models are solved using compiled C functions. Please note that no warranty is implied for correctness of results or fitness for a particular purpose.
392 Differential Equations nlmeODE Non-linear mixed-effects modelling in nlme using differential equations This package combines the odesolve and nlme packages for mixed-effects modelling using differential equations.
393 Differential Equations odeintr C++ ODE Solvers Compiled on-Demand Wraps the Boost odeint library for integration of differential equations.
394 Differential Equations PBSddesolve Solver for Delay Differential Equations Routines for solving systems of delay differential equations by interfacing numerical routines written by Simon N. Wood, with contributions by Benjamin J. Cairns. These numerical routines first appeared in Simon Wood’s ‘solv95’ program. This package includes a vignette and a complete user’s guide. ‘PBSddesolve’ originally appeared on CRAN under the name ‘ddesolve’. That version is no longer supported. The current name emphasizes a close association with other PBS packages, particularly ‘PBSmodelling’.
395 Differential Equations PBSmodelling GUI Tools Made Easy: Interact with Models and Explore Data Provides software to facilitate the design, testing, and operation of computer models. It focuses particularly on tools that make it easy to construct and edit a customized graphical user interface (GUI). Although our simplified GUI language depends heavily on the R interface to the Tcl/Tk package, a user does not need to know Tcl/Tk. Examples illustrate models built with other R packages, including PBSmapping, PBSddesolve, and BRugs. A complete user’s guide ‘PBSmodelling-UG.pdf’ shows how to use this package effectively.
396 Differential Equations phaseR Phase Plane Analysis of One and Two Dimensional Autonomous ODE Systems phaseR is an R package for the qualitative analysis of one- and two-dimensional autonomous ODE systems using phase plane methods. Programs are available to identify and classify equilibrium points, plot the direction field, and plot trajectories for multiple initial conditions. In the one-dimensional case, a program is also available to plot the phase portrait; in the two-dimensional case, a program is additionally available to plot nullclines. Many example systems are provided for the user.
397 Differential Equations pomp Statistical Inference for Partially Observed Markov Processes Tools for working with partially observed Markov process (POMP) models (also known as stochastic dynamical systems, hidden Markov models, and nonlinear, non-Gaussian, state-space models). The package provides facilities for implementing POMP models, simulating them, and fitting them to time series data by a variety of frequentist and Bayesian methods. It is also a versatile platform for implementation of inference methods for general POMP models.
398 Differential Equations pracma Practical Numerical Math Functions Provides a large number of functions from numerical analysis and linear algebra, numerical optimization, differential equations, time series, plus some well-known special mathematical functions. Uses ‘MATLAB’ function names where appropriate to simplify porting.
399 Differential Equations primer Functions and data for A Primer of Ecology with R Functions are primarily functions for systems of ordinary differential equations, difference equations, and eigenanalysis and projection of demographic matrices; data are for examples.
400 Differential Equations QPot Quasi-Potential Analysis for Stochastic Differential Equations Tools to 1) simulate and visualize stochastic differential equations and 2) determine stability of equilibria using the ordered-upwind method to compute the quasi-potential.
401 Differential Equations ReacTran Reactive Transport Modelling in 1d, 2d and 3d Routines for developing models that describe reaction and advective-diffusive transport in one, two or three dimensions. Includes transport routines in porous media, in estuaries, and in bodies with variable shape.
402 Differential Equations rODE Ordinary Differential Equation (ODE) Solvers Written in R Using S4 Classes Shows physics, math and engineering students how an ODE solver is made and how effective R classes can be for the construction of the equations that describe natural phenomena. Inspiration for this work comes from the book “Computer Simulations in Physics” by Harvey Gould, Jan Tobochnik, and Wolfgang Christian. Book link: http://www.compadre.org/osp/items/detail.cfm?ID=7375.
403 Differential Equations rodeo A Code Generator for ODE-Based Models Provides an R6 class and several utility methods to facilitate the implementation of models based on ordinary differential equations. The heart of the package is a code generator that creates compiled ‘Fortran’ (or ‘R’) code which can be passed to a numerical solver. There is direct support for solvers contained in packages ‘deSolve’ and ‘rootSolve’.
404 Differential Equations rootSolve (core) Nonlinear Root Finding, Equilibrium and Steady-State Analysis of Ordinary Differential Equations Routines to find the root of nonlinear functions, and to perform steady-state and equilibrium analysis of ordinary differential equations (ODE). Includes routines that: (1) generate gradient and Jacobian matrices (full and banded), (2) find roots of non-linear equations by the ‘Newton-Raphson’ method, (3) estimate steady-state conditions of a system of (differential) equations in full, banded or sparse form, using the ‘Newton-Raphson’ method or by dynamically running, (4) solve the steady-state conditions for uni- and multi-component 1-D, 2-D, and 3-D partial differential equations that have been converted to ordinary differential equations by numerical differencing (using the method-of-lines approach). Includes Fortran code.
405 Differential Equations rpgm Fast Simulation of Normal/Exponential Random Variables and Stochastic Differential Equations / Poisson Processes Faster simulation of some random variables than the usual native functions, including rnorm() and rexp(), using the Ziggurat method (reference: Marsaglia and Tsang (2000) doi:10.18637/jss.v005.i08), and fast simulation of stochastic differential equations / Poisson processes.
406 Differential Equations scaRabee Optimization Toolkit for Pharmacokinetic-Pharmacodynamic Models scaRabee is a port of the Scarabee toolkit originally written as a Matlab-based application. It provides a framework for simulation and optimization of pharmacokinetic-pharmacodynamic models at the individual and population level. It is built on top of the neldermead package, which provides the direct search algorithm proposed by Nelder and Mead for model optimization.
407 Differential Equations sde (core) Simulation and Inference for Stochastic Differential Equations Companion package to the book Simulation and Inference for Stochastic Differential Equations With R Examples, ISBN 978-0-387-75838-1, Springer, NY.
408 Differential Equations Sim.DiffProc Simulation of Diffusion Processes A package for symbolic and numerical computations on scalar and multivariate systems of stochastic differential equations. It provides users with a wide range of tools to simulate, estimate, analyze, and visualize the dynamics of these systems in both the Ito and Stratonovich forms. Statistical analysis of SDEs with parallel Monte Carlo and moment-equation methods. The package has enabled researchers in different domains to use these equations to model practical problems in financial and actuarial modeling and other areas of application, e.g., modeling and simulating the first-passage-time problem in shallow water using the attractive center (Boukhetala K, 1996).
409 Differential Equations simecol Simulation of Ecological (and Other) Dynamic Systems An object oriented framework to simulate ecological (and other) dynamic systems. It can be used for differential equations, individual-based (or agent-based) and other models as well. The package helps to organize scenarios (to avoid copy and paste) and aims to improve readability and usability of code.
410 Probability Distributions actuar (core) Actuarial Functions and Heavy Tailed Distributions Functions and data sets for actuarial science: modeling of loss distributions; risk theory and ruin theory; simulation of compound models, discrete mixtures and compound hierarchical models; credibility theory. Support for many additional probability distributions to model insurance loss amounts and loss frequency: 19 continuous heavy tailed distributions; the Poisson-inverse Gaussian discrete distribution; zero-truncated and zero-modified extensions of the standard discrete distributions. Support for phase-type distributions commonly used to compute ruin probabilities.
411 Probability Distributions AdMit Adaptive Mixture of Student-t Distributions Provides functions to perform the fitting of an adaptive mixture of Student-t distributions to a target density through its kernel function as described in Ardia et al. (2009) doi:10.18637/jss.v029.i03. The mixture approximation can then be used as the importance density in importance sampling or as the candidate density in the Metropolis-Hastings algorithm to obtain quantities of interest for the target density itself.
412 Probability Distributions agricolae Statistical Procedures for Agricultural Research The original idea was presented in the thesis “A statistical analysis tool for agricultural research”, submitted for the degree of Master of Science at the National Engineering University (UNI), Lima, Peru. Some experimental data for the examples come from the CIP and other research. Agricolae offers extensive functionality for experimental design, especially for agricultural and plant breeding experiments, which can also be useful for other purposes. It supports planning of lattice, Alpha, Cyclic, Complete Block, Latin Square, Graeco-Latin Square, augmented block, factorial, split-plot and strip-plot designs. There are also various analysis facilities for experimental data, e.g. treatment comparison procedures, several non-parametric comparison tests, biodiversity indices and consensus clustering.
413 Probability Distributions ald The Asymmetric Laplace Distribution Provides the density, distribution function, quantile function, random number generator, likelihood function, moments and maximum likelihood estimators for a given sample, all for the three-parameter Asymmetric Laplace Distribution defined in Koenker and Machado (1999). This is a special case of the skewed family of distributions available in Galarza (2016) http://www.ime.unicamp.br/sites/default/files/rp07-16.pdf, useful for quantile regression.
414 Probability Distributions AtelieR A GTK GUI for teaching basic concepts in statistical inference, and doing elementary Bayesian tests A collection of statistical simulation and computation tools with a GTK GUI, to help teach statistical concepts and compute probabilities. Two domains are covered: I. Understanding (Central Limit Theorem and the Normal Distribution, Distribution of a sample mean, Distribution of a sample variance, Probability calculator for common distributions), and II. Elementary Bayesian Statistics (Bayesian inference on proportions, contingency tables, means and variances, with informative and noninformative priors).
415 Probability Distributions bayesm Bayesian Inference for Marketing/Micro-Econometrics Covers many important models used in marketing and micro-econometrics applications. The package includes: Bayes Regression (univariate or multivariate dep var), Bayes Seemingly Unrelated Regression (SUR), Binary and Ordinal Probit, Multinomial Logit (MNL) and Multinomial Probit (MNP), Multivariate Probit, Negative Binomial (Poisson) Regression, Multivariate Mixtures of Normals (including clustering), Dirichlet Process Prior Density Estimation with normal base, Hierarchical Linear Models with normal prior and covariates, Hierarchical Linear Models with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a Dirichlet Process prior and covariates, Hierarchical Negative Binomial Regression Models, Bayesian analysis of choice-based conjoint data, Bayesian treatment of linear instrumental variables models, Analysis of Multivariate Ordinal survey data with scale usage heterogeneity (as in Rossi et al, JASA (01)), and Bayesian Analysis of Aggregate Random Coefficient Logit Models as in BLP (see Jiang, Manchanda, Rossi 2009). For further reference, consult our book, Bayesian Statistics and Marketing by Rossi, Allenby and McCulloch (Wiley 2005) and Bayesian Non- and Semi-Parametric Methods and Applications (Princeton U Press 2014).
416 Probability Distributions benchden 28 benchmark densities from Berlinet/Devroye (1994) Full implementation of the 28 distributions introduced as benchmarks for nonparametric density estimation by Berlinet and Devroye (1994). Includes densities, cdfs, quantile functions and generators for samples as well as additional information on features of the densities. Also contains the 4 histogram densities used in Rozenholc/Mildenberger/Gather (2010).
417 Probability Distributions BiasedUrn Biased Urn Model Distributions Statistical models of biased sampling in the form of univariate and multivariate noncentral hypergeometric distributions, including Wallenius’ noncentral hypergeometric distribution and Fisher’s noncentral hypergeometric distribution (also called extended hypergeometric distribution). See vignette(“UrnTheory”) for explanation of these distributions.
418 Probability Distributions BivarP Estimating the Parameters of Some Bivariate Distributions Parameter estimation for bivariate distribution functions modeled as an Archimedean copula function. The input data may contain right-censored values. The marginal distributions used are two-parameter distributions. Methods for density, distribution, survival, and random sample generation.
419 Probability Distributions bmixture Bayesian Estimation for Finite Mixture of Distributions Provides statistical tools for Bayesian estimation of finite mixtures of distributions, mainly mixtures of Gamma, Normal and t-distributions. The package implements recent improvements from the Bayesian literature on finite mixtures of distributions, including Mohammadi et al. (2013) doi:10.1007/s00180-012-0323-3 and Mohammadi and Salehi-Rad (2012) doi:10.1080/03610918.2011.588358.
420 Probability Distributions bridgedist An Implementation of the Bridge Distribution with Logit-Link as in Wang and Louis (2003) An implementation of the bridge distribution with logit-link in R. In Wang and Louis (2003) doi:10.1093/biomet/90.4.765, such a univariate bridge distribution was derived as the distribution of the random intercept that ‘bridged’ a marginal logistic regression and a conditional logistic regression. The conditional and marginal regression coefficients are a scalar multiple of each other. Such is not the case if the random intercept distribution was Gaussian.
421 Probability Distributions CDVine Statistical Inference of C- And D-Vine Copulas Functions for statistical inference of canonical vine (C-vine) and D-vine copulas. Tools for bivariate exploratory data analysis and for bivariate as well as vine copula selection are provided. Models can be estimated either sequentially or by joint maximum likelihood estimation. Sampling algorithms and plotting methods are also included. Data is assumed to lie in the unit hypercube (so-called copula data).
422 Probability Distributions cmvnorm The Complex Multivariate Gaussian Distribution Various utilities for the complex multivariate Gaussian distribution.
423 Probability Distributions coga Convolution of Gamma Distributions Convolution of gamma distributions in R. The convolution of gamma distributions is the distribution of the sum of independent gamma random variables, all of which may have different parameters. The package can calculate the density and distribution function and perform simulation.
424 Probability Distributions CompGLM Conway-Maxwell-Poisson GLM and distribution functions The package contains a function (which uses a similar interface to the ‘glm’ function) for the fitting of a Conway-Maxwell-Poisson GLM. There are also various methods for analysis of the model fit. The package also contains functions for the Conway-Maxwell-Poisson distribution in a similar interface to functions ‘dpois’, ‘ppois’ and ‘rpois’. The functions are generally quick, since the workhorse functions are written in C++ (thanks to the Rcpp package).
425 Probability Distributions CompLognormal Functions for actuarial scientists Computes the probability density function, cumulative distribution function, quantile function, and random numbers of any composite model based on the lognormal distribution.
426 Probability Distributions compoisson Conway-Maxwell-Poisson Distribution Provides routines for density and moments of the Conway-Maxwell-Poisson distribution as well as functions for fitting the COM-Poisson model for over/under-dispersed count data.
427 Probability Distributions Compositional Compositional Data Analysis Regression, classification, contour plots, hypothesis testing and fitting of distributions for compositional data are some of the functions included. The standard textbook for such data is John Aitchison (1986), “The Statistical Analysis of Compositional Data”, Chapman & Hall.
428 Probability Distributions Compounding Computing Continuous Distributions Computes continuous distributions obtained by compounding a continuous and a discrete distribution.
429 Probability Distributions CompQuadForm Distribution Function of Quadratic Forms in Normal Variables Computes the distribution function of quadratic forms in normal variables using Imhof’s method, Davies’s algorithm, Farebrother’s algorithm or Liu et al.’s algorithm.
430 Probability Distributions condMVNorm Conditional Multivariate Normal Distribution Computes conditional multivariate normal probabilities, random deviates and densities.
431 Probability Distributions copBasic General Bivariate Copula Theory and Many Utility Functions Extensive functions for bivariate copula (bicopula) computations and related operations concerning oft cited bicopula theory described by Nelsen (2006), Joe (2014), and other selected works. The lower, upper, product, and select other bicopula are implemented. Arbitrary bicopula expressions include the diagonal, survival copula, the dual of a copula, co-copula, numerical bicopula density, and maximum likelihood estimation. Level curves (sets), horizontal and vertical sections also are supported. Numerical derivatives and inverses of a bicopula are provided; simulation by the conditional distribution method thus is supported. Bicopula composition, convex combination, and products are provided. Support extends to Kendall Function as well as the Lmoments thereof, Kendall Tau, Spearman Rho and Footrule, Gini Gamma, Blomqvist Beta, Hoeffding Phi, Schweizer-Wolff Sigma, tail dependency (including pseudo-polar representation) and tail order, skewness, and bivariate Lmoments. Evaluators of positively/negatively quadrant dependency, left increasing and right decreasing are available. Kullback-Leibler divergence, Vuong’s procedure, Spectral Measure, and Lcomoments for copula inference are available. Quantile and median regressions for V with respect to U and U with respect to V are available. Empirical copulas (EC) are supported.
432 Probability Distributions copula (core) Multivariate Dependence with Copulas Classes (S4) of commonly used elliptical, Archimedean, extreme-value and other copula families, as well as their rotations, mixtures and asymmetrizations. Nested Archimedean copulas, related tools and special functions. Methods for density, distribution, random number generation, bivariate dependence measures, Rosenblatt transform, Kendall distribution function, perspective and contour plots. Fitting of copula models with potentially partly fixed parameters, including standard errors. Serial independence tests, copula specification tests (independence, exchangeability, radial symmetry, extreme-value dependence, goodness-of-fit) and model selection based on cross-validation. Empirical copula, smoothed versions, and non-parametric estimators of the Pickands dependence function.
433 Probability Distributions csn Closed Skew-Normal Distribution Provides functions for computing the density and the log-likelihood function of closed-skew normal variates, and for generating random vectors sampled from this distribution. See Gonzalez-Farias, G., Dominguez-Molina, J., and Gupta, A. (2004). The closed skew normal distribution, Skew-elliptical distributions and their applications: a journey beyond normality, Chapman and Hall/CRC, Boca Raton, FL, pp. 25-42.
434 Probability Distributions Davies The Davies Quantile Function Various utilities for the Davies distribution.
435 Probability Distributions degreenet Models for Skewed Count Distributions Relevant to Networks Likelihood-based inference for skewed count distributions used in network modeling. “degreenet” is a part of the “statnet” suite of packages for network analysis.
436 Probability Distributions Delaporte Statistical Functions for the Delaporte Distribution Provides probability mass, distribution, quantile, random-variate generation, and method-of-moments parameter-estimation functions for the Delaporte distribution. The Delaporte is a discrete probability distribution which can be considered the convolution of a negative binomial distribution with a Poisson distribution. Alternatively, it can be considered a counting distribution with both Poisson and negative binomial components. It has been studied in actuarial science as a frequency distribution which has more variability than the Poisson, but less than the negative binomial.
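The negative-binomial/Poisson convolution described here is easy to check numerically. Below is a minimal Python sketch (the package itself is R; the function name `delaporte_pmf` and the alpha/beta/lambda parameterisation, with a negative binomial of size alpha and success probability 1/(1+beta), are illustrative assumptions, not this package's API):

```python
import numpy as np
from scipy.stats import nbinom, poisson

def delaporte_pmf(k, alpha, beta, lam):
    """Delaporte pmf as the convolution of a negative binomial
    (size alpha, success prob 1/(1+beta)) and a Poisson(lam) component."""
    j = np.arange(k + 1)
    # sum over all ways to split the count k between the two components
    return float(np.sum(nbinom.pmf(j, alpha, 1.0 / (1.0 + beta)) *
                        poisson.pmf(k - j, lam)))

# a proper pmf: the probabilities over a wide support sum to ~1
total = sum(delaporte_pmf(k, alpha=2.0, beta=1.5, lam=3.0) for k in range(200))
```

Summing the pmf over a wide support recovering 1 confirms that the convolution yields a proper counting distribution.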
437 Probability Distributions denstrip Density strips and other methods for compactly illustrating distributions Graphical methods for compactly illustrating probability distributions, including density strips, density regions, sectioned density plots and varying width strips.
438 Probability Distributions dirmult Estimation in Dirichlet-Multinomial distribution Estimate parameters in Dirichlet-Multinomial and compute profile log-likelihoods.
439 Probability Distributions disclap Discrete Laplace Exponential Family Discrete Laplace exponential family for models such as a generalized linear model.
440 Probability Distributions DiscreteInverseWeibull Discrete Inverse Weibull Distribution Probability mass function, distribution function, quantile function, random generation and parameter estimation for the discrete inverse Weibull distribution.
441 Probability Distributions DiscreteLaplace Discrete Laplace Distributions Probability mass function, distribution function, quantile function, random generation and estimation for the skew discrete Laplace distributions.
442 Probability Distributions DiscreteWeibull Discrete Weibull Distributions (Type 1 and 3) Probability mass function, distribution function, quantile function, random generation and parameter estimation for the type I and III discrete Weibull distributions.
443 Probability Distributions distr (core) Object Oriented Implementation of Distributions S4-classes and methods for distributions.
444 Probability Distributions distrDoc Documentation for ‘distr’ Family of R Packages Provides documentation in form of a common vignette to packages ‘distr’, ‘distrEx’, ‘distrMod’, ‘distrSim’, ‘distrTEst’, ‘distrTeach’, and ‘distrEllipse’.
445 Probability Distributions distrEllipse S4 Classes for Elliptically Contoured Distributions Distribution (S4-)classes for elliptically contoured distributions (based on package ‘distr’).
446 Probability Distributions distrEx Extensions of Package ‘distr’ Extends package ‘distr’ by functionals, distances, and conditional distributions.
447 Probability Distributions DistributionUtils Distribution Utilities This package contains utilities which are of use in the packages I have developed for dealing with distributions. Currently these packages are GeneralizedHyperbolic, VarianceGamma, SkewHyperbolic and NormalLaplace. Each of these packages requires DistributionUtils. Functionality includes sample skewness and kurtosis, log-histogram, tail plots, moments by integration, changing the point about which a moment is calculated, functions for testing distributions using inversion tests and the Massart inequality. Also includes an implementation of the incomplete Bessel K function.
448 Probability Distributions distrMod Object Oriented Implementation of Probability Models Implements S4 classes for probability models based on packages ‘distr’ and ‘distrEx’.
449 Probability Distributions distrSim Simulation Classes Based on Package ‘distr’ S4-classes for setting up a coherent framework for simulation within the distr family of packages.
450 Probability Distributions distrTeach Extensions of Package ‘distr’ for Teaching Stochastics/Statistics in Secondary School Provides flexible examples of LLN and CLT for teaching purposes in secondary school.
451 Probability Distributions distrTEst Estimation and Testing Classes Based on Package ‘distr’ Evaluation (S4-)classes based on package distr for evaluating procedures (estimators/tests) at data/simulation in a unified way.
452 Probability Distributions e1071 Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien Functions for latent class analysis, short time Fourier transform, fuzzy clustering, support vector machines, shortest path computation, bagged clustering, naive Bayes classifier, …
453 Probability Distributions emdbook Support Functions and Data for “Ecological Models and Data” Auxiliary functions and data sets for “Ecological Models and Data”, a book presenting maximum likelihood estimation and related topics for ecologists (ISBN 978-0-691-12522-0).
454 Probability Distributions emg Exponentially Modified Gaussian (EMG) Distribution Provides basic distribution functions for the exponentially modified Gaussian distribution, the sum (convolution) of a Gaussian and an exponential random variable.
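The EMG density has a closed form as the convolution of a normal and an exponential. A hedged Python sketch (the emg package is R; here the standard formula is checked against SciPy's equivalent `exponnorm` parameterisation, whose shape is K = 1/(sigma*lambda)):

```python
import numpy as np
from scipy.special import erfc
from scipy.stats import exponnorm

def emg_pdf(x, mu, sigma, lam):
    """Closed-form density of the exponentially modified Gaussian:
    the sum of Normal(mu, sigma^2) and Exponential(rate lam) variates."""
    z = (mu + lam * sigma ** 2 - x) / (np.sqrt(2.0) * sigma)
    return (0.5 * lam
            * np.exp(0.5 * lam * (2.0 * mu + lam * sigma ** 2 - 2.0 * x))
            * erfc(z))

x = np.linspace(-5.0, 20.0, 7)
mine = emg_pdf(x, mu=1.0, sigma=1.2, lam=0.7)
# SciPy parameterises the same density with shape K = 1/(sigma*lam)
ref = exponnorm.pdf(x, 1.0 / (1.2 * 0.7), loc=1.0, scale=1.2)
```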
455 Probability Distributions EnvStats Package for Environmental Statistics, Including US EPA Guidance Graphical and statistical analyses of environmental data, with focus on analyzing chemical concentrations and physical parameters, usually in the context of mandated environmental monitoring. Major environmental statistical methods found in the literature and regulatory guidance documents, with extensive help that explains what these methods do, how to use them, and where to find them in the literature. Numerous built-in data sets from regulatory guidance documents and environmental statistics literature. Includes scripts reproducing analyses presented in the book “EnvStats: An R Package for Environmental Statistics” (Millard, 2013, Springer, ISBN 978-1-4614-8455-4, http://www.springer.com/book/9781461484554).
456 Probability Distributions evd Functions for Extreme Value Distributions Extends simulation, distribution, quantile and density functions to univariate and multivariate parametric extreme value distributions, and provides fitting functions which calculate maximum likelihood estimates for univariate and bivariate maxima models, and for univariate and bivariate threshold models.
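As a language-neutral illustration of the block-maxima workflow this package supports, here is a Python sketch using SciPy's `genextreme` (parameter names and sign conventions follow SciPy, not evd; the data sizes are arbitrary assumptions):

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(2)

# block maxima: 60 "years" of 365 daily values, keep each block's maximum
daily = rng.gumbel(loc=10.0, scale=2.0, size=(60, 365))
maxima = daily.max(axis=1)

# ML fit of the generalized extreme value distribution to the block maxima;
# for Gumbel-distributed data the fitted shape should be near 0
shape, loc, scale = genextreme.fit(maxima)
```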
457 Probability Distributions evdbayes Bayesian Analysis in Extreme Value Theory Provides functions for the Bayesian analysis of extreme value models, using MCMC methods.
458 Probability Distributions evir Extreme Values in R Functions for extreme value theory, which may be divided into the following groups: exploratory data analysis, block maxima, peaks over thresholds (univariate and bivariate), point processes, GEV/GPD distributions.
459 Probability Distributions ExtDist Extending the Range of Functions for Probability Distributions A consistent, unified and extensible framework for estimation of parameters for probability distributions, including parameter estimation procedures that allow for weighted samples; the current set of distributions included are: the standard beta, the four-parameter beta, Burr, gamma, Gumbel, Johnson SB and SU, Laplace, logistic, normal, symmetric truncated normal, truncated normal, symmetric-reflected truncated beta, standard symmetric-reflected truncated beta, triangular, uniform, and Weibull distributions; decision criteria and selections based on these decision criteria.
460 Probability Distributions extraDistr Additional Univariate and Multivariate Distributions Density, distribution function, quantile function and random generation for a number of univariate and multivariate distributions. This package implements the following distributions: Bernoulli, beta-binomial, beta-negative binomial, beta prime, Bhattacharjee, Birnbaum-Saunders, bivariate normal, bivariate Poisson, categorical, Dirichlet, Dirichlet-multinomial, discrete gamma, discrete Laplace, discrete normal, discrete uniform, discrete Weibull, Frechet, gamma-Poisson, generalized extreme value, Gompertz, generalized Pareto, Gumbel, half-Cauchy, half-normal, half-t, Huber density, inverse chi-squared, inverse-gamma, Kumaraswamy, Laplace, logarithmic, Lomax, multivariate hypergeometric, multinomial, negative hypergeometric, non-standard t, non-standard beta, normal mixture, Poisson mixture, Pareto, power, reparametrized beta, Rayleigh, shifted Gompertz, Skellam, slash, triangular, truncated binomial, truncated normal, truncated Poisson, Tukey lambda, Wald, zero-inflated binomial, zero-inflated negative binomial, zero-inflated Poisson.
461 Probability Distributions extremefit Estimation of Extreme Conditional Quantiles and Probabilities Extreme value theory, nonparametric kernel estimation, tail conditional probabilities, extreme conditional quantile, adaptive estimation, quantile regression, survival probabilities.
462 Probability Distributions FAdist Distributions that are Sometimes Used in Hydrology Probability distributions that are sometimes useful in hydrology.
463 Probability Distributions FatTailsR Kiener Distributions and Fat Tails in Finance Kiener distributions K1, K2, K3, K4 and K7 to characterize distributions with left and right, symmetric or asymmetric fat tails in market finance, neuroscience and other disciplines. Two algorithms to estimate distribution parameters, quantiles, value-at-risk and expected shortfall with high accuracy. Includes power hyperbolas and power hyperbolic functions.
464 Probability Distributions fBasics Rmetrics - Markets and Basic Statistics Provides a collection of functions to explore and to investigate basic properties of financial returns and related quantities. The covered fields include techniques of explorative data analysis and the investigation of distributional properties, including parameter estimation and hypothesis testing. In addition, several utility functions for data handling and management are provided.
465 Probability Distributions fCopulae (core) Rmetrics - Bivariate Dependence Structures with Copulae Provides a collection of functions to manage, to investigate and to analyze bivariate financial returns by Copulae. Included are the families of Archimedean, Elliptical, Extreme Value, and Empirical Copulae.
466 Probability Distributions fExtremes Rmetrics - Modelling Extreme Events in Finance Provides functions for analysing and modelling extreme events in financial time series. The topics include: (i) data pre-processing, (ii) explorative data analysis, (iii) peak over threshold modelling, (iv) block maxima modelling, (v) estimation of VaR and CVaR, and (vi) the computation of the extreme index.
467 Probability Distributions fgac Generalized Archimedean Copula Bivariate data fitting involves two stochastic components: the marginal distributions and the dependency structure. The dependency structure is modeled through a copula. An algorithm is implemented for seven families of copulas (generalized Archimedean copulas); the best fit can be obtained by examining all copula options (totally positive of order 2 and stochastically increasing models).
468 Probability Distributions fitdistrplus Help to Fit of a Parametric Distribution to Non-Censored or Censored Data Extends the fitdistr() function (of the MASS package) with several functions to help the fit of a parametric distribution to non-censored or censored data. Censored data may contain left censored, right censored and interval censored values, with several lower and upper bounds. In addition to maximum likelihood estimation (MLE), the package provides moment matching (MME), quantile matching (QME) and maximum goodness-of-fit estimation (MGE) methods (available only for non-censored data). Weighted versions of MLE, MME and QME are available.
469 Probability Distributions flexsurv Flexible Parametric Survival and Multi-State Models Flexible parametric models for time-to-event data, including the Royston-Parmar spline model, generalized gamma and generalized F distributions. Any user-defined parametric distribution can be fitted, given at least an R function defining the probability density or hazard. There are also tools for fitting and predicting from fully parametric multi-state models.
470 Probability Distributions FMStable Finite Moment Stable Distributions This package implements some basic procedures for dealing with log maximally skew stable distributions, which are also called finite moment log stable distributions.
471 Probability Distributions fpow Computing the noncentrality parameter of the noncentral F distribution Returns the noncentrality parameter of the noncentral F distribution given the probabilities of type I and type II error and the degrees of freedom of the numerator and the denominator. It may be useful for computing minimal detectable differences for general ANOVA models. This program is documented in the paper: A. Baharev, S. Kemeny, On the computation of the noncentral F and noncentral beta distribution; Statistics and Computing, 2008, 18 (3), 333-340.
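The same quantity can be sketched by root-finding: the power of the F test is increasing in the noncentrality parameter, so the value solving power = 1 - beta can be bracketed and bisected. A Python illustration (the function name `min_ncp` and the bracketing interval are invented; fpow itself is an R package):

```python
from scipy.stats import f as f_dist, ncf
from scipy.optimize import brentq

def min_ncp(alpha, beta, dfn, dfd):
    """Noncentrality parameter of the noncentral F distribution that yields
    power 1 - beta at significance level alpha (root-finding sketch)."""
    fcrit = f_dist.ppf(1.0 - alpha, dfn, dfd)   # central-F critical value
    gap = lambda nc: ncf.sf(fcrit, dfn, dfd, nc) - (1.0 - beta)
    return brentq(gap, 1e-8, 1e4)               # power is increasing in nc

nc = min_ncp(alpha=0.05, beta=0.2, dfn=3, dfd=20)
```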
472 Probability Distributions frmqa The Generalized Hyperbolic Distribution, Related Distributions and Their Applications in Finance A collection of R and C++ functions to work with the generalized hyperbolic distribution, related distributions and their applications in financial risk management and quantitative analysis.
473 Probability Distributions gambin Fit the Gambin Model to Species Abundance Distributions Fits unimodal and multimodal gambin distributions to species-abundance distributions from ecological data. ‘gambin’ is short for ‘gamma-binomial’. The main function is fit_abundances(), which estimates the ‘alpha’ parameter(s) of the gambin distribution using maximum likelihood. Functions are also provided to generate the gambin distribution and for calculating likelihood statistics.
474 Probability Distributions gamlss.dist (core) Distributions for Generalized Additive Models for Location Scale and Shape A set of distributions which can be used for modelling the response variables in Generalized Additive Models for Location Scale and Shape, Rigby and Stasinopoulos (2005), doi:10.1111/j.1467-9876.2005.00510.x. The distributions can be continuous, discrete or mixed distributions. Extra distributions can be created by transforming any continuous distribution defined on the real line to a distribution defined on the range 0 to infinity or 0 to 1, using a “log” or a “logit” transformation respectively.
475 Probability Distributions gamlss.mx Fitting Mixture Distributions with GAMLSS The main purpose of this package is to allow fitting of mixture distributions with GAMLSS models.
476 Probability Distributions gaussDiff Difference measures for multivariate Gaussian probability density functions A collection of difference measures for multivariate Gaussian probability density functions, such as the Euclidean mean, the Mahalanobis distance, the Kullback-Leibler divergence, the J-coefficient, the Minkowski L2-distance, the chi-square divergence and the Hellinger coefficient.
477 Probability Distributions gb Generalized Lambda Distribution and Generalized Bootstrapping This package collects algorithms and functions for fitting data to a generalized lambda distribution via moment matching methods, and generalized bootstrapping.
478 Probability Distributions GB2 Generalized Beta Distribution of the Second Kind: Properties, Likelihood, Estimation Package GB2 explores the Generalized Beta distribution of the second kind. Density, cumulative distribution function, quantiles and moments of the distributions are given. Functions for the full log-likelihood, the profile log-likelihood and the scores are provided. Formulas for various indicators of inequality and poverty under the GB2 are implemented. The GB2 is fitted by the methods of maximum pseudo-likelihood estimation using the full and profile log-likelihood, and non-linear least squares estimation of the model parameters. Various plots for the visualization and analysis of the results are provided. Variance estimation of the parameters is provided for the method of maximum pseudo-likelihood estimation. A mixture distribution based on the compounding property of the GB2 is presented (denoted as “compound” in the documentation). This mixture distribution is based on the discretization of the distribution of the underlying random scale parameter. The discretization can be left or right tail. Density, cumulative distribution function, moments and quantiles for the mixture distribution are provided. The compound mixture distribution is fitted using the method of maximum pseudo-likelihood estimation. The fit can also incorporate the use of auxiliary information. In this new version of the package, the mixture case is complemented with new functions for variance estimation by linearization and comparative density plots.
479 Probability Distributions GenBinomApps Clopper-Pearson Confidence Interval and Generalized Binomial Distribution Density, distribution function, quantile function and random generation for the Generalized Binomial Distribution. Functions to compute the Clopper-Pearson Confidence Interval and the required sample size. Enhanced model for burn-in studies, where failures are tackled by countermeasures.
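The Clopper-Pearson interval is defined through beta quantiles, which makes a short sketch possible. A Python illustration (`clopper_pearson` is an invented name, not this package's API):

```python
from scipy.stats import beta as beta_dist

def clopper_pearson(k, n, conf=0.95):
    """Exact (Clopper-Pearson) confidence interval for a binomial proportion,
    obtained from quantiles of the beta distribution."""
    a = 1.0 - conf
    lo = 0.0 if k == 0 else beta_dist.ppf(a / 2.0, k, n - k + 1)
    hi = 1.0 if k == n else beta_dist.ppf(1.0 - a / 2.0, k + 1, n - k)
    return lo, hi

lo, hi = clopper_pearson(4, 20)   # 4 successes in 20 trials
```

The boundary cases k = 0 and k = n pin the corresponding endpoint at 0 or 1, which is why they are handled explicitly.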
480 Probability Distributions GeneralizedHyperbolic The Generalized Hyperbolic Distribution This package provides functions for the hyperbolic and related distributions. Density, distribution and quantile functions and random number generation are provided for the hyperbolic distribution, the generalized hyperbolic distribution, the generalized inverse Gaussian distribution and the skew-Laplace distribution. Additional functionality is provided for the hyperbolic distribution, normal inverse Gaussian distribution and generalized inverse Gaussian distribution, including fitting of these distributions to data. Linear models with hyperbolic errors may be fitted using hyperblmFit.
481 Probability Distributions GenOrd Simulation of Discrete Random Variables with Given Correlation Matrix and Marginal Distributions A gaussian copula based procedure for generating samples from discrete random variables with prescribed correlation matrix and marginal distributions.
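The Gaussian-copula construction described here can be sketched directly: draw correlated normals, push them through the normal CDF to get uniforms, then through each marginal quantile function. Note that GenOrd additionally adjusts the latent correlation so the prescribed correlation holds on the discrete scale; the sketch below omits that correction, and all names are illustrative:

```python
import numpy as np
from scipy.stats import norm, poisson

rng = np.random.default_rng(7)

def gaussian_copula_discrete(n, rho, ppf1, ppf2):
    """Correlated discrete pair via a Gaussian copula: draw bivariate
    normals, map to uniforms, then through each marginal quantile function."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    u = norm.cdf(z)
    return ppf1(u[:, 0]), ppf2(u[:, 1])

x, y = gaussian_copula_discrete(5000, 0.6,
                                lambda u: poisson.ppf(u, 3.0),
                                lambda u: poisson.ppf(u, 1.0))
```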
482 Probability Distributions geoR Analysis of Geostatistical Data Geostatistical analysis including traditional, likelihood-based and Bayesian methods.
483 Probability Distributions ghyp A Package on Generalized Hyperbolic Distribution and Its Special Cases Detailed functionality for working with the univariate and multivariate Generalized Hyperbolic distribution and its special cases (Hyperbolic (hyp), Normal Inverse Gaussian (NIG), Variance Gamma (VG), skewed Student-t and Gaussian distribution). In particular, it contains fitting procedures, an AIC-based model selection routine, and functions for the computation of density, quantile, probability, random variates, expected shortfall and some portfolio optimization and plotting routines as well as the likelihood ratio test. In addition, it contains the Generalized Inverse Gaussian distribution.
484 Probability Distributions GIGrvg Random Variate Generator for the GIG Distribution Generator and density function for the Generalized Inverse Gaussian (GIG) distribution.
485 Probability Distributions gld Estimation and Use of the Generalised (Tukey) Lambda Distribution The generalised lambda distribution, or Tukey lambda distribution, provides a wide variety of shapes with one functional form. This package provides random numbers, quantiles, probabilities, densities and density quantiles for four different parameterisations of the distribution. It provides the density function, distribution function, and Quantile-Quantile plots. It implements a variety of estimation methods for the distribution, including diagnostic plots. Estimation methods include the starship (all 4 parameterisations) and a number of methods for only the FKML parameterisation. These include maximum likelihood, maximum product of spacings, Titterington’s method, Moments, L-Moments, Trimmed L-Moments and Distributional Least Absolutes.
486 Probability Distributions GLDEX Fitting Single and Mixture of Generalised Lambda Distributions (RS and FMKL) using Various Methods The fitting algorithms considered in this package have two major objectives. One is to provide a smoothing device to fit distributions to data, using weighted and unweighted discretised approaches based on the bin width of the histogram. The other is to provide a definitive fit to the data set using maximum likelihood and quantile matching estimation. Other methods such as moment matching, the starship method and L-moment matching are also provided. Diagnostics on goodness of fit can be done via QQ-plots, KS-resample tests and by comparing the mean, variance, skewness and kurtosis of the data with those of the fitted distribution.
487 Probability Distributions glogis Fitting and Testing Generalized Logistic Distributions Tools for the generalized logistic distribution (Type I, also known as skew-logistic distribution), encompassing basic distribution functions (p, q, d, r, score), maximum likelihood estimation, and structural change methods.
488 Probability Distributions GMD Generalized Minimum Distance of distributions GMD is a package for non-parametric distance measurement between two discrete frequency distributions.
489 Probability Distributions GSM Gamma Shape Mixture Implementation of a Bayesian approach for estimating a mixture of gamma distributions in which the mixing occurs over the shape parameter. This family provides a flexible and novel approach for modeling heavy-tailed distributions; it is computationally efficient and only requires the specification of a prior distribution for a single parameter.
490 Probability Distributions gumbel The Gumbel-Hougaard Copula Provides probability functions (cumulative distribution and density functions), simulation function (Gumbel copula multivariate simulation) and estimation functions (Maximum Likelihood Estimation, Inference For Margins, Moment Based Estimation and Canonical Maximum Likelihood).
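The Gumbel-Hougaard copula has a simple closed form, C(u,v) = exp(-((-log u)^theta + (-log v)^theta)^(1/theta)) with theta >= 1; theta = 1 gives independence and Kendall's tau equals 1 - 1/theta. A Python sketch of the CDF only (illustrative; the gumbel package itself is R):

```python
import numpy as np

def gumbel_copula_cdf(u, v, theta):
    """Gumbel-Hougaard copula CDF, theta >= 1; theta = 1 reduces to the
    independence copula C(u, v) = u * v."""
    t = (-np.log(u)) ** theta + (-np.log(v)) ** theta
    return np.exp(-t ** (1.0 / theta))

c_ind = gumbel_copula_cdf(0.3, 0.6, 1.0)   # independence: u * v = 0.18
c_dep = gumbel_copula_cdf(0.3, 0.6, 2.5)   # positive dependence
```

For theta > 1 the copula is positively quadrant dependent, so C(u,v) exceeds the independence value u*v.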
491 Probability Distributions HAC Estimation, Simulation and Visualization of Hierarchical Archimedean Copulae (HAC) Provides estimation of the structure and the parameters, sampling methods and structural plots of Hierarchical Archimedean Copulae (HAC).
492 Probability Distributions hermite Generalized Hermite Distribution Probability functions and other utilities for the generalized Hermite distribution.
493 Probability Distributions HI Simulation from distributions supported by nested hyperplanes Simulation from distributions supported by nested hyperplanes, using the algorithm described in Petris & Tardella, “A geometric approach to transdimensional Markov chain Monte Carlo”, Canadian Journal of Statistics, v.31, n.4, (2003). Also random direction multivariate Adaptive Rejection Metropolis Sampling.
494 Probability Distributions HistogramTools Utility Functions for R Histograms Provides a number of utility functions useful for manipulating large histograms. This includes methods to trim, subset, merge buckets, merge histograms, convert to CDF, and calculate information loss due to binning. It also provides a protocol buffer representation of the default R histogram class to allow histograms over large data sets to be computed and manipulated in a MapReduce environment.
495 Probability Distributions hyper2 The Hyperdirichlet Distribution, Mark 2 A suite of routines for the hyperdirichlet distribution; supersedes the hyperdirichlet package for most purposes.
496 Probability Distributions HyperbolicDist The hyperbolic distribution This package provides functions for the hyperbolic and related distributions. Density, distribution and quantile functions and random number generation are provided for the hyperbolic distribution, the generalized hyperbolic distribution, the generalized inverse Gaussian distribution and the skew-Laplace distribution. Additional functionality is provided for the hyperbolic distribution, including fitting of the hyperbolic to data.
497 Probability Distributions ihs Inverse Hyperbolic Sine Distribution Density, distribution function, quantile function and random generation for the inverse hyperbolic sine distribution. This package also provides a function that can fit data to the inverse hyperbolic sine distribution using maximum likelihood estimation.
498 Probability Distributions kernelboot Smoothed Bootstrap and Random Generation from Kernel Densities Smoothed bootstrap and functions for random generation from univariate and multivariate kernel densities. It does not estimate kernel densities.
499 Probability Distributions kolmim An Improved Evaluation of Kolmogorov’s Distribution Provides an alternative, more efficient evaluation of extreme probabilities of Kolmogorov’s goodness-of-fit measure, Dn, when compared to the original implementation of Wang, Marsaglia, and Tsang. These probabilities are used in Kolmogorov-Smirnov tests when comparing two samples.
500 Probability Distributions KScorrect Lilliefors-Corrected Kolmogorov-Smirnov Goodness-of-Fit Tests Implements the Lilliefors-corrected Kolmogorov-Smirnov test for use in goodness-of-fit tests, suitable when population parameters are unknown and must be estimated by sample statistics. P-values are estimated by simulation. Can be used with a variety of continuous distributions, including normal, lognormal, univariate mixtures of normals, uniform, loguniform, exponential, gamma, and Weibull distributions. Functions to generate random numbers and calculate density, distribution, and quantile functions are provided for use with the log uniform and mixture distributions.
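The simulation-based p-value idea is straightforward to sketch: estimate the parameters, compute the KS statistic, then recompute the statistic on many samples drawn from the fitted null to build its reference distribution. A Python illustration for the normal case (the function name and simulation size are invented, not this package's API):

```python
import numpy as np
from scipy.stats import norm, kstest

rng = np.random.default_rng(1)

def lilliefors_normal(x, n_sim=500):
    """KS test of normality with estimated parameters: the standard KS
    p-value is invalid here, so the null distribution of the statistic
    is simulated instead (the Lilliefors correction)."""
    mu, sd = x.mean(), x.std(ddof=1)
    d_obs = kstest(x, norm(mu, sd).cdf).statistic
    d_null = np.empty(n_sim)
    for i in range(n_sim):
        y = rng.normal(size=x.size)          # draw from the null model
        d_null[i] = kstest(y, norm(y.mean(), y.std(ddof=1)).cdf).statistic
    return d_obs, (d_null >= d_obs).mean()   # simulated p-value

d_obs, p_val = lilliefors_normal(rng.normal(5.0, 2.0, size=100))
```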
501 Probability Distributions LambertW Probabilistic Models to Analyze and Gaussianize Heavy-Tailed, Skewed Data Lambert W x F distributions are a generalized framework to analyze skewed, heavy-tailed data. It is based on an input/output system, where the output random variable (RV) Y is a non-linearly transformed version of an input RV X ~ F with similar properties to X, but slightly skewed (heavy-tailed). The transformed RV Y has a Lambert W x F distribution. This package contains functions to model and analyze skewed, heavy-tailed data the Lambert Way: simulate random samples, estimate parameters, compute quantiles, and plot/print results nicely. Probably the most important function is ‘Gaussianize’, which works similarly to ‘scale’, but actually makes the data Gaussian. A do-it-yourself toolkit allows users to define their own Lambert W x ‘MyFavoriteDistribution’ and use it in their analysis right away.
502 Probability Distributions LearnBayes Functions for Learning Bayesian Inference LearnBayes contains a collection of functions helpful in learning the basic tenets of Bayesian statistical inference. It contains functions for summarizing basic one and two parameter posterior distributions and predictive distributions. It contains MCMC algorithms for summarizing posterior distributions defined by the user. It also contains functions for regression models, hierarchical models, Bayesian tests, and illustrations of Gibbs sampling.
503 Probability Distributions lhs Latin Hypercube Samples Provides a number of methods for creating and augmenting Latin Hypercube Samples.
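A basic Latin hypercube sample is easy to construct: stratify each coordinate into n equal bins, place one jittered point per bin, and permute the bins independently per dimension. A Python sketch (illustrative only, not the lhs API):

```python
import numpy as np

def latin_hypercube(n, d, seed=0):
    """n points in [0, 1)^d with exactly one point in each of the n
    equal-width strata of every coordinate (a basic Latin hypercube sample)."""
    rng = np.random.default_rng(seed)
    jitter = rng.random((n, d))                        # position within stratum
    strata = np.column_stack([rng.permutation(n) for _ in range(d)])
    return (strata + jitter) / n

pts = latin_hypercube(10, 3)
```

The defining property, one point per stratum in every dimension, can be checked by binning each column.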
504 Probability Distributions LIHNPSD Poisson Subordinated Distribution A Poisson Subordinated Distribution to capture major leptokurtic features in log-return time series of financial data.
505 Probability Distributions lmom L-Moments Functions related to L-moments: computation of L-moments and trimmed L-moments of distributions and data samples; parameter estimation; L-moment ratio diagram; plot vs. quantiles of an extreme-value distribution.
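Sample L-moments can be computed from unbiased probability-weighted moments. A Python sketch of Hosking's estimators for the first four L-moments (illustrative; the Lmoments package itself is R):

```python
import numpy as np

def sample_lmoments(x):
    """First four sample L-moments from unbiased probability-weighted
    moments b0..b3 (Hosking's estimators)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    i = np.arange(1, n + 1)
    b0 = x.mean()
    b1 = np.sum((i - 1) * x) / (n * (n - 1))
    b2 = np.sum((i - 1) * (i - 2) * x) / (n * (n - 1) * (n - 2))
    b3 = np.sum((i - 1) * (i - 2) * (i - 3) * x) / (n * (n - 1) * (n - 2) * (n - 3))
    return np.array([b0,
                     2 * b1 - b0,
                     6 * b2 - 6 * b1 + b0,
                     20 * b3 - 30 * b2 + 12 * b1 - b0])

# for Uniform(0, 1): l1 = 1/2, l2 = 1/6, l3 = l4 = 0
lm = sample_lmoments(np.random.default_rng(3).uniform(size=4000))
```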
506 Probability Distributions lmomco (core) L-Moments, Censored L-Moments, Trimmed L-Moments, L-Comoments, and Many Distributions Extensive functions for L-moments (LMs) and probability-weighted moments (PWMs), parameter estimation for distributions, LM computation for distributions, and L-moment ratio diagrams. Maximum likelihood and maximum product of spacings estimation are also available. LMs for right-tail and left-tail censoring by known or unknown threshold and by indicator variable are available. Asymmetric (asy) trimmed LMs (TL-moments, TLMs) are supported. LMs of residual (resid) and reversed (rev) resid life are implemented along with 13 quantile function operators for reliability and survival analyses. Exact analytical bootstrap estimates of order statistics, LMs, and variances- covariances of LMs are provided. The Harri-Coble Tau34-squared Normality Test is available. Distribution support with “L” (LMs), “TL” (TLMs) and added (+) support for right-tail censoring (RC) encompasses: Asy Exponential (Exp) Power [L], Asy Triangular [L], Cauchy [TL], Eta-Mu [L], Exp. [L], Gamma [L], Generalized (Gen) Exp Poisson [L], Gen Extreme Value [L], Gen Lambda [L,TL], Gen Logistic [L), Gen Normal [L], Gen Pareto [L+RC, TL], Govindarajulu [L], Gumbel [L], Kappa [L], Kappa-Mu [L], Kumaraswamy [L], Laplace [L], Linear Mean Resid. Quantile Function [L], Normal [L], 3-p log-Normal [L], Pearson Type III [L], Rayleigh [L], Rev-Gumbel [L+RC], Rice/Rician [L], Slash [TL], 3-p Student t [L], Truncated Exponential [L], Wakeby [L], and Weibull [L]. Multivariate sample L-comoments (LCMs) are implemented to measure asymmetric associations.
507 Probability Distributions Lmoments L-Moments and Quantile Mixtures Contains functions to estimate L-moments and trimmed L-moments from the data. Also contains functions to estimate the parameters of the normal polynomial quantile mixture and the Cauchy polynomial quantile mixture from L-moments and trimmed L-moments.
508 Probability Distributions logitnorm Functions for the Logitnormal Distribution Density, distribution, quantile and random generation function for the logitnormal distribution. Estimation of the mode and the first two moments. Estimation of distribution parameters.
509 Probability Distributions loglognorm Double log normal distribution functions Provides r, d, p, q functions for the double log-normal distribution.
510 Probability Distributions marg Approximate marginal inference for regression-scale models Likelihood inference based on higher-order approximations for linear non-normal regression models.
511 Probability Distributions MASS Support Functions and Datasets for Venables and Ripley’s MASS Functions and datasets to support Venables and Ripley, “Modern Applied Statistics with S” (4th edition, 2002).
512 Probability Distributions mbbefd Maxwell Boltzmann Bose Einstein Fermi Dirac Distribution and Destruction Rate Modelling Distributions that are typically used for exposure rating in general insurance, in particular to price reinsurance contracts. The vignettes show code snippets to fit the distribution to empirical data.
513 Probability Distributions mc2d Tools for Two-Dimensional Monte-Carlo Simulations A complete framework to build and study Two-Dimensional Monte-Carlo simulations, aka Second-Order Monte-Carlo simulations. Also includes various distributions (pert, triangular, Bernoulli, empirical discrete and continuous).
514 Probability Distributions mclust Gaussian Mixture Modelling for Model-Based Clustering, Classification, and Density Estimation Gaussian finite mixture models fitted via EM algorithm for model-based clustering, classification, and density estimation, including Bayesian regularization, dimension reduction for visualisation, and resampling-based inference.
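The EM algorithm underlying such mixture fits is compact in the one-dimensional, two-component case: alternate computing component responsibilities (E-step) with weighted parameter updates (M-step). A Python sketch with a fixed iteration count (illustrative only, not mclust's interface):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
x = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(5.0, 1.0, 300)])

# two-component 1-D Gaussian mixture fitted by EM
w = np.array([0.5, 0.5])                  # mixing weights
mu = np.array([x.min(), x.max()])         # crude initial means
sd = np.array([1.0, 1.0])
for _ in range(50):
    # E-step: responsibility of each component for each observation
    dens = w * norm.pdf(x[:, None], mu, sd)
    r = dens / dens.sum(axis=1, keepdims=True)
    # M-step: weighted maximum-likelihood updates of the parameters
    nk = r.sum(axis=0)
    w = nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / nk
    sd = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
```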
515 Probability Distributions MCMCpack Markov Chain Monte Carlo (MCMC) Package Contains functions to perform Bayesian inference using posterior simulation for a number of statistical models. Most simulation is done in compiled C++ written in the Scythe Statistical Library Version 1.0.3. All models return coda mcmc objects that can then be summarized using the coda package. Some useful utility functions such as density functions, pseudo-random number generators for statistical distributions, a general purpose Metropolis sampling algorithm, and tools for visualization are provided.
516 Probability Distributions mgpd Functions for the multivariate generalized Pareto distribution (MGPD of Type II) Extends distribution and density functions to parametric multivariate generalized Pareto distributions (MGPD of Type II), and provides fitting functions which calculate maximum likelihood estimates for bivariate and trivariate models. (Help files are still in progress.)
517 Probability Distributions minimax Minimax distribution family The minimax family of distributions is a two-parameter family like the beta family, but computationally a lot more tractable.
518 Probability Distributions MitISEM Mixture of Student t Distributions using Importance Sampling and Expectation Maximization Flexible multivariate function approximation using an adapted mixture of Student t distributions. The mixture of t distributions is obtained using an importance-sampling-weighted Expectation Maximization algorithm.
519 Probability Distributions MittagLeffleR The Mittag-Leffler Distribution Calculates Mittag-Leffler probabilities and the Mittag-Leffler function, generates Mittag-Leffler random variables, and fits the Mittag-Leffler distribution to data. Based on the algorithm by Garrappa, R. (2015) doi:10.1137/140971191.
520 Probability Distributions MixedTS Mixed Tempered Stable Distribution Provides functions for the univariate Mixed Tempered Stable distribution.
521 Probability Distributions mixtools Tools for Analyzing Finite Mixture Models Analyzes finite mixture models for various parametric and semiparametric settings. This includes mixtures of parametric distributions (normal, multivariate normal, multinomial, gamma), various Reliability Mixture Models (RMMs), mixtures-of-regressions settings (linear regression, logistic regression, Poisson regression, linear regression with changepoints, predictor-dependent mixing proportions, random effects regressions, hierarchical mixtures-of-experts), and tools for selecting the number of components (bootstrapping the likelihood ratio test statistic and model selection criteria). Bayesian estimation of mixtures-of-linear-regressions models is available as well as a novel data depth method for obtaining credible bands. This package is based upon work supported by the National Science Foundation under Grant No. SES-0518772.
522 Probability Distributions MM The multiplicative multinomial distribution Various utilities for the Multiplicative Multinomial distribution.
523 Probability Distributions mnormpow Multivariate Normal Distributions with Power Integrand Computes the integral of f(x)*x_i^k on a product of intervals, where f is the density of a Gaussian law. This is a small alteration of the mnormt code from A. Genz and A. Azzalini.
524 Probability Distributions mnormt (core) The Multivariate Normal and t Distributions Functions are provided for computing the density and the distribution function of multivariate normal and “t” random variables, and for generating random vectors sampled from these distributions. Probabilities are computed via non-Monte Carlo methods; different routines are used in the case d=1, d=2, d>2, if d denotes the number of dimensions.
525 Probability Distributions modeest Mode Estimation This package provides estimators of the mode of univariate unimodal data or univariate unimodal distributions.
526 Probability Distributions moments Moments, cumulants, skewness, kurtosis and related tests Functions to calculate: moments, Pearson’s kurtosis, Geary’s kurtosis and skewness; tests related to them (Anscombe-Glynn, D’Agostino, Bonett-Seier).
527 Probability Distributions movMF Mixtures of von Mises-Fisher Distributions Fit and simulate mixtures of von Mises-Fisher distributions.
528 Probability Distributions msm Multi-State Markov and Hidden Markov Models in Continuous Time Functions for fitting continuous-time Markov and hidden Markov multi-state models to longitudinal data. Designed for processes observed at arbitrary times in continuous time (panel data) but some other observation schemes are supported. Both Markov transition rates and the hidden Markov output process can be modelled in terms of covariates, which may be constant or piecewise-constant in time.
529 Probability Distributions mvprpb Orthant Probability of the Multivariate Normal Distribution Computes orthant probabilities of the multivariate normal distribution.
530 Probability Distributions mvrtn Mean and Variance of Truncated Normal Distribution Mean, variance, and random variates for left/right truncated normal distributions.
531 Probability Distributions mvtnorm (core) Multivariate Normal and t Distributions Computes multivariate normal and t probabilities, quantiles, random deviates and densities.
532 Probability Distributions nCDunnett Noncentral Dunnett’s Test Distribution Computes the noncentral Dunnett’s test distribution (pdf, cdf and quantile) and generates random numbers.
533 Probability Distributions Newdistns Computes Pdf, Cdf, Quantile and Random Numbers, Measures of Inference for 19 General Families of Distributions Computes the probability density function, cumulative distribution function, quantile function, random numbers and measures of inference for the following general families of distributions (each family defined in terms of an arbitrary cdf G): Marshall Olkin G distributions, exponentiated G distributions, beta G distributions, gamma G distributions, Kumaraswamy G distributions, generalized beta G distributions, beta extended G distributions, gamma G distributions, gamma uniform G distributions, beta exponential G distributions, Weibull G distributions, log gamma G I distributions, log gamma G II distributions, exponentiated generalized G distributions, exponentiated Kumaraswamy G distributions, geometric exponential Poisson G distributions, truncated-exponential skew-symmetric G distributions, modified beta G distributions, and exponentiated exponential Poisson G distributions.
534 Probability Distributions nor1mix Normal (1-d) Mixture Models (S3 Classes and Methods) One-dimensional normal mixture model classes for, e.g., density estimation or clustering-algorithm research and teaching; provides the widely used Marron-Wand densities. Efficient random number generation and graphics; now fits data by ML (Maximum Likelihood) or EM estimation.
535 Probability Distributions NormalGamma Normal-gamma convolution model The functions in this package compute the density of the sum of Gaussian and gamma random variables, estimate the parameters, and correct the noise effect in a gamma-signal and Gaussian-noise model. This package has been used to implement the background correction method for Illumina microarray data presented in Plancade S., Rozenholc Y. and Lund E., “Generalization of the normal-exponential model: exploration of a more accurate parameterization for the signal distribution on Illumina BeadArrays”, BMC Bioinformatics 2012, 13(329).
536 Probability Distributions NormalLaplace The Normal Laplace Distribution This package provides functions for the normal Laplace distribution. It is currently under development and provides only limited functionality. Density, distribution and quantile functions, random number generation, and moments are provided.
537 Probability Distributions normalp Routines for Exponential Power Distribution Collection of utilities related to the Exponential Power distribution, also known as the General Error Distribution (see Mineo, A.M. and Ruggieri, M. (2005), “A Software Tool for the Exponential Power Distribution: The normalp Package”, Journal of Statistical Software, Vol. 12, Issue 4).
538 Probability Distributions npde Normalised prediction distribution errors for nonlinear mixed-effect models Routines to compute normalised prediction distribution errors, a metric designed to evaluate non-linear mixed-effect models such as those used in pharmacokinetics and pharmacodynamics.
539 Probability Distributions ORDER2PARENT Estimate parent distributions with data of several order statistics This package uses B-spline based nonparametric smooth estimators to estimate parent distributions given observations on multiple order statistics.
540 Probability Distributions OrdNor Concurrent Generation of Ordinal and Normal Data with Given Correlation Matrix and Marginal Distributions Implementation of a procedure for generating samples from a mixed distribution of ordinal and normal random variables with pre-specified correlation matrix and marginal distributions.
541 Probability Distributions ParetoPosStable Computing, Fitting and Validating the PPS Distribution Statistical functions to describe a Pareto Positive Stable (PPS) distribution and fit it to real data. Graphical and statistical tools to validate the fits are included.
542 Probability Distributions PDQutils PDQ Functions via Gram Charlier, Edgeworth, and Cornish Fisher Approximations A collection of tools for approximating the ‘PDQ’ functions (respectively, the cumulative distribution, density, and quantile) of probability distributions via classical expansions involving moments and cumulants.
543 Probability Distributions PearsonDS (core) Pearson Distribution System Implementation of the Pearson distribution system, including full support for the (d,p,q,r)-family of functions for probability distributions and fitting via method of moments and maximum likelihood method.
544 Probability Distributions PhaseType Inference for Phase-type Distributions Functions to perform Bayesian inference on absorption time data for Phase-type distributions. Plans to expand this to include frequentist inference and simulation tools.
545 Probability Distributions poibin The Poisson Binomial Distribution This package implements both the exact and approximation methods for computing the cdf of the Poisson binomial distribution. It also provides the pmf, quantile function, and random number generation for the Poisson binomial distribution.
546 Probability Distributions poilog Poisson lognormal and bivariate Poisson lognormal distribution Functions for obtaining the density, random deviates and maximum likelihood estimates of the Poisson lognormal distribution and the bivariate Poisson lognormal distribution.
547 Probability Distributions poistweedie Poisson-Tweedie exponential family models Simulation of Poisson-Tweedie models.
548 Probability Distributions polyaAeppli Implementation of the Polya-Aeppli distribution Functions for evaluating the probability mass function, cumulative distribution function, quantile function and random variate generation for the Polya-Aeppli distribution, also known as the geometric compound Poisson distribution.
549 Probability Distributions poweRlaw Analysis of Heavy Tailed Distributions An implementation of maximum likelihood estimators for a variety of heavy tailed distributions, including both the discrete and continuous power law distributions. Additionally, a goodness-of-fit based approach is used to estimate the lower cut-off for the scaling region.
550 Probability Distributions qmap Statistical Transformations for Post-Processing Climate Model Output Empirical adjustment of the distribution of variables originating from (regional) climate model simulations using quantile mapping.
551 Probability Distributions QRM Provides R-Language Code to Examine Quantitative Risk Management Concepts Accompanying package to the book Quantitative Risk Management: Concepts, Techniques and Tools by Alexander J. McNeil, Rudiger Frey, and Paul Embrechts.
552 Probability Distributions randaes Random number generator based on AES cipher The deterministic part of the Fortuna cryptographic pseudorandom number generator, described by Schneier & Ferguson in “Practical Cryptography”.
553 Probability Distributions random True Random Numbers using RANDOM.ORG The true random number service provided by the RANDOM.ORG website created by Mads Haahr samples atmospheric noise via radio tuned to an unused broadcasting frequency together with a skew correction algorithm due to John von Neumann. More background is available in the included vignette based on an essay by Mads Haahr. In its current form, the package offers functions to retrieve random integers, randomized sequences and random strings.
554 Probability Distributions randtoolbox Toolbox for Pseudo and Quasi Random Number Generation and RNG Tests Provides (1) pseudo random generators - general linear congruential generators, multiple recursive generators and generalized feedback shift register (SF-Mersenne Twister algorithm and WELL generators); (2) quasi random generators - the Torus algorithm, the Sobol sequence, the Halton sequence (including the Van der Corput sequence); and (3) some RNG tests - the gap test, the serial test, the poker test. The package depends on the rngWELL package but it can be provided without this dependency on demand to the maintainer. For true random number generation, use the ‘random’ package; for Latin Hypercube Sampling (a hybrid QMC method), use the ‘lhs’ package. A number of RNGs and tests for RNGs are also provided by ‘RDieHarder’, all available on CRAN. There is also a small stand-alone package ‘rngwell19937’ for the WELL19937a RNG.
555 Probability Distributions RDieHarder R interface to the dieharder RNG test suite The RDieHarder package provides an R interface to the dieharder suite of random number generators and tests that was developed by Robert G. Brown and David Bauer, extending earlier work by George Marsaglia and others.
556 Probability Distributions ReIns Functions from “Reinsurance: Actuarial and Statistical Aspects” Functions from the book “Reinsurance: Actuarial and Statistical Aspects” (2017) by Hansjoerg Albrecher, Jan Beirlant and Jef Teugels http://wiley.com/WileyCDA/WileyTitle/productCd-0470772689.html.
557 Probability Distributions reliaR (core) Package for some probability distributions A collection of utilities for some reliability models/probability distributions.
558 Probability Distributions Renext Renewal Method for Extreme Values Extrapolation Peaks Over Threshold (POT) or ‘methode du renouvellement’. The distribution for the exceedances can be chosen, and heterogeneous data (including historical data or block data) can be used in a Maximum-Likelihood framework.
559 Probability Distributions retimes Reaction Time Analysis Reaction time analysis by maximum likelihood.
560 Probability Distributions revdbayes Ratio-of-Uniforms Sampling for Bayesian Extreme Value Analysis Provides functions for the Bayesian analysis of extreme value models. The ‘rust’ package https://cran.r-project.org/package=rust is used to simulate a random sample from the required posterior distribution. The functionality of ‘revdbayes’ is similar to the ‘evdbayes’ package https://cran.r-project.org/package=evdbayes, which uses Markov Chain Monte Carlo (‘MCMC’) methods for posterior simulation. Also provided are functions for making inferences about the extremal index, using the K-gaps model of Suveges and Davison (2010) doi:10.1214/09-AOAS292. See the ‘revdbayes’ website for more information, documentation and examples.
561 Probability Distributions rlecuyer R Interface to RNG with Multiple Streams Provides an interface to the C implementation of the random number generator with multiple independent streams developed by L’Ecuyer et al (2002). The main purpose of this package is to enable the use of this random number generator in parallel R applications.
562 Probability Distributions RMKdiscrete Sundry Discrete Probability Distributions Sundry discrete probability distributions and helper functions.
563 Probability Distributions RMTstat Distributions, Statistics and Tests derived from Random Matrix Theory Functions for working with the Tracy-Widom laws and other distributions related to the eigenvalues of large Wishart matrices. The tables for computing the Tracy-Widom densities and distribution functions were computed by Momar Dieng’s MATLAB package “RMLab” (formerly available on his homepage at http://math.arizona.edu/~momar/research.htm ). This package is part of a collaboration between Iain Johnstone, Zongming Ma, Patrick Perry, and Morteza Shahram. It will soon be replaced by a package with more accuracy and built-in support for relevant statistical tests.
564 Probability Distributions rngwell19937 Random number generator WELL19937a with 53 or 32 bit output Long period linear random number generator WELL19937a by F. Panneton, P. L’Ecuyer and M. Matsumoto. The initialization algorithm allows the generator to be seeded with a numeric vector of arbitrary length and uses MRG32k5a by P. L’Ecuyer to achieve good quality of the initialization. The output function may be set to provide numbers from the interval (0,1) with 53 (the default) or 32 random bits. WELL19937a is of a similar type to the Mersenne Twister and has the same period. WELL19937a is slightly slower than the Mersenne Twister, but has better equidistribution and “bit-mixing” properties and recovers faster from states with prevailing zeros. All WELL generators with orders 512, 1024, 19937 and 44497 can be found in the randtoolbox package.
565 Probability Distributions rstream Streams of Random Numbers Unified object oriented interface for multiple independent streams of random numbers from different sources.
566 Probability Distributions RTDE Robust Tail Dependence Estimation Robust tail dependence estimation for bivariate models. This package is based on two papers by the authors: ‘Robust and bias-corrected estimation of the coefficient of tail dependence’ and ‘Robust and bias-corrected estimation of probabilities of extreme failure sets’. This work was supported by a research grant (VKR023480) from VILLUM FONDEN and an international project for scientific cooperation (PICS-6416).
567 Probability Distributions rtdists Response Time Distributions Provides response time distributions (density/PDF, distribution function/CDF, quantile function, and random generation): (a) Ratcliff diffusion model (Ratcliff & McKoon, 2008, doi:10.1162/neco.2008.12-06-420) based on C code by Andreas and Jochen Voss and (b) linear ballistic accumulator (LBA; Brown & Heathcote, 2008, doi:10.1016/j.cogpsych.2007.12.002) with different distributions underlying the drift rate.
568 Probability Distributions Runuran R Interface to the UNU.RAN Random Variate Generators Interface to the UNU.RAN library for Universal Non-Uniform RANdom variate generators. It thus allows non-uniform random number generators to be built from quite arbitrary distributions. In particular, it provides an algorithm for fast numerical inversion for distributions with a given density function. In addition, the package contains densities, distribution functions and quantiles for a number of distributions.
569 Probability Distributions s20x Functions for University of Auckland Course STATS 201/208 Data Analysis A set of functions used in teaching STATS 201/208 Data Analysis at the University of Auckland. The functions are designed to make parts of R more accessible to a large undergraduate population who are mostly not statistics majors.
570 Probability Distributions sadists Some Additional Distributions Provides the density, distribution, quantile and generation functions of some obscure probability distributions, including the doubly non-central t, F, Beta, and Eta distributions; the lambda-prime and K-prime; the upsilon distribution; the (weighted) sum of non-central chi-squares to a power; the (weighted) sum of log non-central chi-squares; the product of non-central chi-squares to powers; the product of doubly non-central F variables; the product of independent normals.
571 Probability Distributions SCI Standardized Climate Indices Such as SPI, SRI or SPEI Functions for generating Standardized Climate Indices (SCI). SCI is a transformation of (smoothed) climate (or environmental) time series that removes seasonality and forces the data to take values of the standard normal distribution. SCI was originally developed for precipitation. In this case it is known as the Standardized Precipitation Index (SPI).
572 Probability Distributions setRNG Set (Normal) Random Number Generator and Seed SetRNG provides utilities to help set and record the setting of the seed and the uniform and normal generators used when a random experiment is run. The utilities can be used in other functions that do random experiments to simplify recording and/or setting all the necessary information for reproducibility. See the vignette and reference manual for examples.
573 Probability Distributions sfsmisc Utilities from ‘Seminar fuer Statistik’ ETH Zurich Useful utilities [‘goodies’] from Seminar fuer Statistik ETH Zurich, quite a few related to graphics; some were ported from S-plus.
574 Probability Distributions sgt Skewed Generalized T Distribution Tree Density, distribution function, quantile function and random generation for the skewed generalized t distribution. This package also provides a function that can fit data to the skewed generalized t distribution using maximum likelihood estimation.
575 Probability Distributions skellam Densities and Sampling for the Skellam Distribution Functions for the Skellam distribution, including: density (pmf), cdf, quantiles and regression.
576 Probability Distributions SkewHyperbolic The Skew Hyperbolic Student t-Distribution Functions are provided for the density function, distribution function, quantiles and random number generation for the skew hyperbolic t-distribution. There are also functions that fit the distribution to data. There are functions for the mean, variance, skewness, kurtosis and mode of a given distribution and to calculate moments of any order about any centre. To assess goodness of fit, there are functions to generate a Q-Q plot, a P-P plot and a tail plot.
577 Probability Distributions skewt The Skewed Student-t Distribution Density, distribution function, quantile function and random generation for the skewed t distribution of Fernandez and Steel.
578 Probability Distributions sld Estimation and Use of the Quantile-Based Skew Logistic Distribution The skew logistic distribution is a quantile-defined generalisation of the logistic distribution (van Staden and King 2015). Provides random numbers, quantiles, probabilities, densities and density quantiles for the distribution, as well as Quantile-Quantile plots and method of L-Moments estimation (including asymptotic standard errors).
579 Probability Distributions smoothmest Smoothed M-estimators for 1-dimensional location Some M-estimators for 1-dimensional location: the bisquare estimator, ML for the Cauchy distribution, the estimators obtained by applying the smoothing principle introduced in Hampel, Hennig and Ronchetti (2011) to these, to the Huber M-estimator, and to the median (the main function is smoothm), and the Pitman estimator.
580 Probability Distributions SMR Externally Studentized Midrange Distribution Computes the studentized midrange distribution (pdf, cdf and quantile) and generates random numbers.
581 Probability Distributions sn The Skew-Normal and Related Distributions Such as the Skew-t Build and manipulate probability distributions of the skew-normal family and some related ones, notably the skew-t family, and provide related statistical methods for data fitting and model diagnostics, in the univariate and the multivariate case.
582 Probability Distributions sparseMVN Multivariate Normal Functions for Sparse Covariance and Precision Matrices Computes multivariate normal (MVN) densities, and samples from MVN distributions, when the covariance or precision matrix is sparse.
583 Probability Distributions spd Semi Parametric Distribution The Semi Parametric Piecewise Distribution blends the Generalized Pareto Distribution for the tails with a kernel based interior.
584 Probability Distributions stabledist Stable Distribution Functions Density, Probability and Quantile functions, and random number generation for (skew) stable distributions, using the parametrizations of Nolan.
585 Probability Distributions STAR Spike Train Analysis with R Functions to analyze neuronal spike trains from a single neuron or from several neurons recorded simultaneously.
586 Probability Distributions statmod Statistical Modeling A collection of algorithms and functions to aid statistical modeling. Includes growth curve comparisons, limiting dilution analysis (aka ELDA), mixed linear models, heteroscedastic regression, inverse-Gaussian probability calculations, Gauss quadrature and a secure convergence algorithm for nonlinear models. Includes advanced generalized linear model functions that implement secure convergence, dispersion modeling and Tweedie power-law families.
587 Probability Distributions SuppDists Supplementary Distributions Ten distributions supplementing those built into R. Inverse Gauss, Kruskal-Wallis, Kendall’s Tau, Friedman’s chi squared, Spearman’s rho, maximum F ratio, the Pearson product moment correlation coefficient, Johnson distributions, normal scores and generalized hypergeometric distributions. In addition two random number generators of George Marsaglia are included.
588 Probability Distributions symmoments Symbolic central and noncentral moments of the multivariate normal distribution Symbolic central and non-central moments of the multivariate normal distribution. Computes a standard representation, LaTeX code, and values at specified mean and covariance matrices.
589 Probability Distributions tmvtnorm Truncated Multivariate Normal and Student t Distribution Random number generation for the truncated multivariate normal and Student t distribution. Computes probabilities, quantiles and densities, including one-dimensional and bivariate marginal densities. Computes first and second moments (i.e. mean and covariance matrix) for the double-truncated multinormal case.
590 Probability Distributions tolerance Statistical Tolerance Intervals and Regions Statistical tolerance limits provide the limits between which we can expect to find a specified proportion of a sampled population with a given level of confidence. This package provides functions for estimating tolerance limits (intervals) for various univariate distributions (binomial, Cauchy, discrete Pareto, exponential, two-parameter exponential, extreme value, hypergeometric, Laplace, logistic, negative binomial, negative hypergeometric, normal, Pareto, Poisson-Lindley, Poisson, uniform, and Zipf-Mandelbrot), Bayesian normal tolerance limits, multivariate normal tolerance regions, nonparametric tolerance intervals, tolerance bands for regression settings (linear regression, nonlinear regression, nonparametric regression, and multivariate regression), and analysis of variance tolerance intervals. Visualizations are also available for most of these settings.
591 Probability Distributions trapezoid The Trapezoidal Distribution The trapezoid package provides dtrapezoid, ptrapezoid, qtrapezoid, and rtrapezoid functions for the trapezoidal distribution.
592 Probability Distributions triangle Provides the Standard Distribution Functions for the Triangle Distribution Provides the “r, q, p, and d” distribution functions for the triangle distribution.
593 Probability Distributions truncnorm Truncated normal distribution r/d/p/q functions for the truncated normal distribution.
594 Probability Distributions TSA Time Series Analysis Contains R functions and datasets detailed in the book “Time Series Analysis with Applications in R (second edition)” by Jonathan Cryer and Kung-Sik Chan.
595 Probability Distributions tsallisqexp Tsallis q-Exp Distribution The Tsallis distribution, also known as the q-exponential family distribution. Provides d, p, q, r distribution functions, plus fitting and testing functions. Project initiated by Paul Higbie and based on Cosma Shalizi’s code.
596 Probability Distributions TTmoment Sampling and Calculating the First and Second Moments for the Doubly Truncated Multivariate t Distribution Computes the first two moments of the truncated multivariate t (TMVT) distribution under double truncation. Applies the slice sampling algorithm to generate random variates from the TMVT distribution.
597 Probability Distributions tweedie Evaluation of Tweedie Exponential Family Models Maximum likelihood computations for Tweedie families, including the series expansion (Dunn and Smyth, 2005; doi:10.1007/s11222-005-4070-y) and the Fourier inversion (Dunn and Smyth, 2008; doi:10.1007/s11222-007-9039-6), and related methods.
598 Probability Distributions VarianceGamma The Variance Gamma Distribution This package provides functions for the variance gamma distribution: density, distribution and quantile functions, random number generation, and fitting of the variance gamma to data. Also functions for computing moments of the variance gamma distribution of any order about any location. In addition, there are functions for checking the validity of parameters and for interchanging different parameterizations of the variance gamma distribution.
599 Probability Distributions VGAM (core) Vector Generalized Linear and Additive Models An implementation of about 6 major classes of statistical regression models. At the heart of it are the vector generalized linear and additive model (VGLM/VGAM) classes, and the book “Vector Generalized Linear and Additive Models: With an Implementation in R” (Yee, 2015) doi:10.1007/978-1-4939-2818-7 gives details of the statistical framework and VGAM package. Currently only fixed-effects models are implemented, i.e., no random-effects models. Many (150+) models and distributions are estimated by maximum likelihood estimation (MLE) or penalized MLE, using Fisher scoring. VGLMs can be loosely thought of as multivariate GLMs. VGAMs are data-driven VGLMs (i.e., with smoothing). The other classes are RR-VGLMs (reduced-rank VGLMs), quadratic RR-VGLMs, reduced-rank VGAMs, and RCIMs (row-column interaction models); these classes perform constrained and unconstrained quadratic ordination (CQO/UQO) models in ecology, as well as constrained additive ordination (CAO). Note that these functions are subject to change; see the NEWS and ChangeLog files for latest changes.
600 Probability Distributions VineCopula Statistical Inference of Vine Copulas Provides tools for the statistical analysis of vine copula models. The package includes tools for parameter estimation, model selection, simulation, goodness-of-fit tests, and visualization. Tools for estimation, selection and exploratory data analysis of bivariate copula models are also provided.
601 Probability Distributions vines Multivariate Dependence Modeling with Vines Implementation of the vine graphical model for building high-dimensional probability distributions as a factorization of bivariate copulas and marginal density functions. This package provides S4 classes for vines (C-vines and D-vines) and methods for inference, goodness-of-fit tests, density/distribution function evaluation, and simulation.
602 Probability Distributions zipfR Statistical Models for Word Frequency Distributions Statistical models and utilities for the analysis of word frequency distributions. The utilities include functions for loading, manipulating and visualizing word frequency data and vocabulary growth curves. The package also implements several statistical models for the distribution of word frequencies in a population. (The name of this package derives from the most famous word frequency distribution, Zipf’s law.)
603 Econometrics AER (core) Applied Econometrics with R Functions, data sets, examples, demos, and vignettes for the book Christian Kleiber and Achim Zeileis (2008), Applied Econometrics with R, Springer-Verlag, New York. ISBN 978-0-387-77316-2. (See the vignette “AER” for a package overview.)
604 Econometrics aod Analysis of Overdispersed Data This package provides a set of functions to analyse overdispersed counts or proportions. Most of the methods are already available elsewhere but are scattered in different packages. The proposed functions should be considered as complements to more sophisticated methods such as generalized estimating equations (GEE) or generalized linear mixed effect models (GLMM).
605 Econometrics apt Asymmetric Price Transmission Asymmetric price transmission between two time series is assessed. Several functions are available for linear and nonlinear threshold cointegration and, furthermore, for symmetric and asymmetric error correction models. A graphical user interface is also included for the major functions in the package, so users can use them in a more intuitive way.
606 Econometrics bayesm Bayesian Inference for Marketing/Micro-Econometrics Covers many important models used in marketing and micro-econometrics applications. The package includes: Bayes Regression (univariate or multivariate dep var), Bayes Seemingly Unrelated Regression (SUR), Binary and Ordinal Probit, Multinomial Logit (MNL) and Multinomial Probit (MNP), Multivariate Probit, Negative Binomial (Poisson) Regression, Multivariate Mixtures of Normals (including clustering), Dirichlet Process Prior Density Estimation with normal base, Hierarchical Linear Models with normal prior and covariates, Hierarchical Linear Models with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a Dirichlet Process prior and covariates, Hierarchical Negative Binomial Regression Models, Bayesian analysis of choice-based conjoint data, Bayesian treatment of linear instrumental variables models, Analysis of Multivariate Ordinal survey data with scale usage heterogeneity (as in Rossi et al, JASA (01)), and Bayesian Analysis of Aggregate Random Coefficient Logit Models as in BLP (see Jiang, Manchanda, Rossi 2009). For further reference, consult our book, Bayesian Statistics and Marketing by Rossi, Allenby and McCulloch (Wiley 2005) and Bayesian Non- and Semi-Parametric Methods and Applications (Princeton U Press 2014).
607 Econometrics betareg Beta Regression Beta regression for modeling beta-distributed dependent variables, e.g., rates and proportions. In addition to maximum likelihood regression (for both mean and precision of a beta-distributed response), bias-corrected and bias-reduced estimation as well as finite mixture models and recursive partitioning for beta regressions are provided.
608 Econometrics BMA Bayesian Model Averaging Package for Bayesian model averaging and variable selection for linear models, generalized linear models and survival models (Cox regression).
609 Econometrics BMS Bayesian Model Averaging Library Bayesian model averaging for linear models with a wide choice of (customizable) priors. Built-in priors include coefficient priors (fixed, flexible and hyper-g priors) and five kinds of model priors; model sampling is by enumeration or various MCMC approaches. Post-processing functions allow for inferring posterior inclusion and model probabilities, various moments, and coefficient and predictive densities. Plotting functions are available for posterior model size, MCMC convergence, predictive and coefficient densities, best-models representation, and BMA comparison.
610 Econometrics boot Bootstrap Functions (Originally by Angelo Canty for S) Functions and datasets for bootstrapping from the book “Bootstrap Methods and Their Application” by A. C. Davison and D. V. Hinkley (1997, CUP), originally written by Angelo Canty for S.
611 Econometrics bootstrap Functions for the Book “An Introduction to the Bootstrap” Software (bootstrap, cross-validation, jackknife) and data for the book “An Introduction to the Bootstrap” by B. Efron and R. Tibshirani, 1993, Chapman and Hall. This package is primarily provided for projects already based on it, and for support of the book. New projects should preferentially use the recommended package “boot”.
612 Econometrics brglm Bias Reduction in Binomial-Response Generalized Linear Models Fit generalized linear models with binomial responses using either an adjusted-score approach to bias reduction or maximum penalized likelihood where penalization is by Jeffreys invariant prior. These procedures return estimates with improved frequentist properties (bias, mean squared error) that are always finite even in cases where the maximum likelihood estimates are infinite (data separation). Fitting takes place by fitting generalized linear models on iteratively updated pseudo-data. The interface is essentially the same as ‘glm’. More flexibility is provided by the fact that custom pseudo-data representations can be specified and used for model fitting. Functions are provided for the construction of confidence intervals for the reduced-bias estimates.
613 Econometrics CADFtest A Package to Perform Covariate Augmented Dickey-Fuller Unit Root Tests Hansen’s (1995) Covariate-Augmented Dickey-Fuller (CADF) test. The only required argument is y, the Tx1 time series to be tested. If no stationary covariate X is passed to the procedure, then an ordinary ADF test is performed. The p-values of the test are computed using the procedure illustrated in Lupi (2009).
614 Econometrics car (core) Companion to Applied Regression Functions and Datasets to Accompany J. Fox and S. Weisberg, An R Companion to Applied Regression, Second Edition, Sage, 2011.
615 Econometrics CDNmoney Components of Canadian Monetary and Credit Aggregates Components of Canadian Credit Aggregates and Monetary Aggregates with continuity adjustments.
616 Econometrics censReg Censored Regression (Tobit) Models Maximum Likelihood estimation of censored regression (Tobit) models with cross-sectional and panel data.
617 Econometrics clubSandwich Cluster-Robust (Sandwich) Variance Estimators with Small-Sample Corrections Provides several cluster-robust variance estimators (i.e., sandwich estimators) for ordinary and weighted least squares linear regression models, including the bias-reduced linearization estimator introduced by Bell and McCaffrey (2002) http://www.statcan.gc.ca/pub/12-001-x/2002002/article/9058-eng.pdf and developed further by Pustejovsky and Tipton (2017) doi:10.1080/07350015.2016.1247004. The package includes functions for estimating the variance-covariance matrix and for testing single- and multiple-contrast hypotheses based on Wald test statistics. Tests of single regression coefficients use Satterthwaite or saddle-point corrections. Tests of multiple-contrast hypotheses use an approximation to Hotelling’s T-squared distribution. Methods are provided for a variety of fitted models, including lm() and mlm objects, glm(), ivreg (from package ‘AER’), plm() (from package ‘plm’), gls() and lme() (from ‘nlme’), robu() (from ‘robumeta’), and rma.uni() and rma.mv() (from ‘metafor’).
618 Econometrics clusterSEs Calculate Cluster-Robust p-Values and Confidence Intervals Calculate p-values and confidence intervals using cluster-adjusted t-statistics (based on Ibragimov and Muller (2010) doi:10.1198/jbes.2009.08046), pairs cluster bootstrapped t-statistics, and wild cluster bootstrapped t-statistics (the latter two techniques based on Cameron, Gelbach, and Miller (2008) doi:10.1162/rest.90.3.414). Procedures are included for use with GLM, ivreg, plm (pooling or fixed effects), and mlogit models.
619 Econometrics crch Censored Regression with Conditional Heteroscedasticity Different approaches to censored or truncated regression with conditional heteroscedasticity are provided. First, continuous distributions can be used for the (right and/or left censored or truncated) response with separate linear predictors for the mean and variance. Second, cumulative link models for ordinal data (obtained by interval-censoring continuous data) can be employed for heteroscedastic extended logistic regression (HXLR). In the latter type of models, the intercepts depend on the thresholds that define the intervals.
620 Econometrics decompr Global-Value-Chain Decomposition Two global-value-chain decompositions are implemented. Firstly, the Wang-Wei-Zhu (Wang, Wei, and Zhu, 2013) algorithm splits bilateral gross exports into 16 value-added components. Secondly, the Leontief decomposition (default) derives the value added origin of exports by country and industry, which is also based on Wang, Wei, and Zhu (Wang, Z., S.-J. Wei, and K. Zhu. 2013. “Quantifying International Production Sharing at the Bilateral and Sector Levels.”).
621 Econometrics dlsem Distributed-Lag Linear Structural Equation Modelling Inference functionalities for distributed-lag linear structural equation models. Endpoint-constrained quadratic, quadratic decreasing and gamma lag shapes are available.
622 Econometrics dynlm Dynamic Linear Regression Dynamic linear models and time series regression.
623 Econometrics Ecdat Data Sets for Econometrics Data sets for econometrics.
624 Econometrics effects Effect Displays for Linear, Generalized Linear, and Other Models Graphical and tabular effect displays, e.g., of interactions, for various statistical models with linear predictors.
625 Econometrics erer Empirical Research in Economics with R Functions, datasets, and sample code related to the book ‘Empirical Research in Economics: Growing up with R’ by Dr. Changyou Sun. Marginal effects for binary or ordered choice models can be calculated. Static and dynamic Almost Ideal Demand System (AIDS) models can be estimated. A typical event analysis in finance can be conducted with several included functions.
626 Econometrics expsmooth Data Sets from “Forecasting with Exponential Smoothing” Data sets from the book “Forecasting with exponential smoothing: the state space approach” by Hyndman, Koehler, Ord and Snyder (Springer, 2008).
627 Econometrics ExtremeBounds Extreme Bounds Analysis (EBA) An implementation of Extreme Bounds Analysis (EBA), a global sensitivity analysis that examines the robustness of determinants in regression models. The package supports both Leamer’s and Sala-i-Martin’s versions of EBA, and allows users to customize all aspects of the analysis.
628 Econometrics fma Data Sets from “Forecasting: Methods and Applications” by Makridakis, Wheelwright & Hyndman (1998) All data sets from “Forecasting: methods and applications” by Makridakis, Wheelwright & Hyndman (Wiley, 3rd ed., 1998).
629 Econometrics forecast (core) Forecasting Functions for Time Series and Linear Models Methods and tools for displaying and analysing univariate time series forecasts including exponential smoothing via state space models and automatic ARIMA modelling.
630 Econometrics frm Regression Analysis of Fractional Responses Estimation and specification analysis of one- and two-part fractional regression models and calculation of partial effects.
631 Econometrics frontier Stochastic Frontier Analysis Maximum Likelihood Estimation of Stochastic Frontier Production and Cost Functions. Two specifications are available: the error components specification with time-varying efficiencies (Battese and Coelli, 1992) and a model specification in which the firm effects are directly influenced by a number of variables (Battese and Coelli, 1995).
632 Econometrics fxregime Exchange Rate Regime Analysis Exchange rate regression and structural change tools for estimating, testing, dating, and monitoring (de facto) exchange rate regimes.
633 Econometrics gam Generalized Additive Models Functions for fitting and working with generalized additive models, as described in chapter 7 of “Statistical Models in S” (Chambers and Hastie (eds), 1991), and “Generalized Additive Models” (Hastie and Tibshirani, 1990).
634 Econometrics gamlss Generalised Additive Models for Location Scale and Shape Functions for fitting the Generalized Additive Models for Location Scale and Shape introduced by Rigby and Stasinopoulos (2005), doi:10.1111/j.1467-9876.2005.00510.x. The models use a distributional regression approach where all the parameters of the conditional distribution of the response variable are modelled using explanatory variables.
635 Econometrics geepack Generalized Estimating Equation Package Generalized estimating equations solver for parameters in mean, scale, and correlation structures, through mean link, scale link, and correlation link. Can also handle clustered categorical responses.
636 Econometrics gets General-to-Specific (GETS) Modelling and Indicator Saturation Methods Automated General-to-Specific (GETS) modelling of the mean and variance of a regression, and indicator saturation methods for detecting and testing for structural breaks in the mean.
637 Econometrics glmx Generalized Linear Models Extended Extended techniques for generalized linear models (GLMs), especially for binary responses, including parametric links and heteroskedastic latent variables.
638 Econometrics gmm Generalized Method of Moments and Generalized Empirical Likelihood A complete suite for estimating models based on moment conditions. It includes the two-step Generalized Method of Moments (Hansen 1982; doi:10.2307/1912775), the iterated GMM and continuously updated estimator (Hansen, Heaton and Yaron 1996; doi:10.2307/1392442) and several methods that belong to the Generalized Empirical Likelihood family of estimators (Smith 1997; doi:10.1111/j.0013-0133.1997.174.x, Kitamura 1997; doi:10.1214/aos/1069362388, Newey and Smith 2004; doi:10.1111/j.1468-0262.2004.00482.x, and Anatolyev 2005; doi:10.1111/j.1468-0262.2005.00601.x).
639 Econometrics gmnl Multinomial Logit Models with Random Parameters An implementation of the maximum simulated likelihood method for the estimation of multinomial logit models with random coefficients. Specifically, it allows estimating models with continuous heterogeneity such as the mixed multinomial logit and the generalized multinomial logit. It also allows estimating models with discrete heterogeneity such as the latent class and the mixed-mixed multinomial logit model.
640 Econometrics gvc Global Value Chains Tools Several tools for Global Value Chain (‘GVC’) analysis are implemented.
641 Econometrics Hmisc Harrell Miscellaneous Contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, importing and annotating datasets, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of R objects to LaTeX and html code, and recoding variables.
642 Econometrics ineq Measuring Inequality, Concentration, and Poverty Inequality, concentration, and poverty measures. Lorenz curves (empirical and theoretical).
643 Econometrics intReg Interval Regression Estimating interval regression models. Supports both common and observation-specific boundaries.
644 Econometrics ivbma Bayesian Instrumental Variable Estimation and Model Determination via Conditional Bayes Factors Allows one to incorporate instrument and covariate uncertainty into instrumental variable regression.
645 Econometrics ivfixed Instrumental Fixed Effect Panel Data Models Fits an instrumental least squares dummy variable model.
646 Econometrics ivlewbel Uses Heteroscedasticity to Estimate Mismeasured and Endogenous Regressor Models GMM estimation of triangular systems using heteroscedasticity-based instrumental variables, as in Lewbel (2012).
647 Econometrics ivpack Instrumental Variable Estimation This package contains functions for carrying out instrumental variable estimation of causal effects and power analyses for instrumental variable studies.
648 Econometrics ivpanel Instrumental Panel Data Models Fits instrumental panel data models: the fixed effects, random effects and between models.
649 Econometrics ivprobit Instrumental Variables Probit Model Fits an instrumental variables probit model using the generalized least squares estimator.
650 Econometrics LARF Local Average Response Functions for Instrumental Variable Estimation of Treatment Effects Provides instrumental variable estimation of treatment effects when both the endogenous treatment and its instrument are binary. Applicable to both binary and continuous outcomes.
651 Econometrics lavaan Latent Variable Analysis Fit a variety of latent variable models, including confirmatory factor analysis, structural equation modeling and latent growth curve models.
652 Econometrics lfe Linear Group Fixed Effects Transforms away factors with many levels prior to doing an OLS. Useful for estimating linear models with multiple group fixed effects, and for estimating linear models that use factors with many levels as pure control variables. Includes support for instrumental variables, conditional F statistics for weak instruments, robust and multi-way clustered standard errors, as well as limited mobility bias correction.
653 Econometrics LinRegInteractive Interactive Interpretation of Linear Regression Models Interactive visualization of effects, response functions and marginal effects for different kinds of regression models. In this version linear regression models, generalized linear models, generalized additive models and linear mixed-effects models are supported. Major features are the interactive approach and the handling of the effects of categorical covariates: if two or more factors are used as covariates every combination of the levels of each factor is treated separately. The automatic calculation of marginal effects and a number of possibilities to customize the graphical output are useful features as well.
654 Econometrics lme4 Linear Mixed-Effects Models using ‘Eigen’ and S4 Fit linear and generalized linear mixed-effects models. The models and their components are represented using S4 classes and methods. The core computational algorithms are implemented using the ‘Eigen’ C++ library for numerical linear algebra and ‘RcppEigen’ “glue”.
655 Econometrics lmtest (core) Testing Linear Regression Models A collection of tests, data sets, and examples for diagnostic checking in linear regression models. Furthermore, some generic tools for inference in parametric models are provided.
656 Econometrics margins Marginal Effects for Model Objects An R port of Stata’s ‘margins’ command, which can be used to calculate marginal (or partial) effects from model objects.
657 Econometrics MASS Support Functions and Datasets for Venables and Ripley’s MASS Functions and datasets to support Venables and Ripley, “Modern Applied Statistics with S” (4th edition, 2002).
658 Econometrics matchingMarkets Analysis of Stable Matchings Implements structural estimators to correct for the sample selection bias from observed outcomes in matching markets. This includes one-sided matching of agents into groups as well as two-sided matching of students to schools. The package also contains algorithms to find stable matchings in the three most common matching problems: the stable roommates problem, the college admissions problem, and the house allocation problem.
659 Econometrics Matrix Sparse and Dense Matrix Classes and Methods A rich hierarchy of matrix classes, including triangular, symmetric, and diagonal matrices, both dense and sparse and with pattern, logical and numeric entries. Numerous methods for and operations on these matrices, using ‘LAPACK’ and ‘SuiteSparse’ libraries.
660 Econometrics Mcomp Data from the M-Competitions The 1001 time series from the M-competition (Makridakis et al. 1982) doi:10.1002/for.3980010202 and the 3003 time series from the IJF-M3 competition (Makridakis and Hibon, 2000) doi:10.1016/S0169-2070(00)00057-1.
661 Econometrics meboot Maximum Entropy Bootstrap for Time Series Maximum entropy density based dependent data bootstrap. An algorithm is provided to create a population of time series (ensemble) without assuming stationarity. The reference paper (Vinod, H.D., 2004) explains how the algorithm satisfies the ergodic theorem and the central limit theorem.
662 Econometrics mfx Marginal Effects, Odds Ratios and Incidence Rate Ratios for GLMs Estimates probit, logit, Poisson, negative binomial, and beta regression models, returning their marginal effects, odds ratios, or incidence rate ratios as an output.
663 Econometrics mgcv Mixed GAM Computation Vehicle with Automatic Smoothness Estimation Generalized additive (mixed) models, some of their extensions and other generalized ridge regression with multiple smoothing parameter estimation by (Restricted) Marginal Likelihood, Generalized Cross Validation and similar. Includes a gam() function, a wide variety of smoothers, JAGS support and distributions beyond the exponential family.
664 Econometrics mhurdle Multiple Hurdle Tobit Models Estimation of models with zero left-censored variables. Null values may be caused by a selection process, insufficient resources or infrequency of purchase.
665 Econometrics micEcon Microeconomic Analysis and Modelling Various tools for microeconomic analysis and microeconomic modelling, e.g. estimating quadratic, Cobb-Douglas and Translog functions, calculating partial derivatives and elasticities of these functions, and calculating Hessian matrices, checking curvature and preparing restrictions for imposing monotonicity of Translog functions.
666 Econometrics micEconAids Demand Analysis with the Almost Ideal Demand System (AIDS) Functions and tools for analysing consumer demand with the Almost Ideal Demand System (AIDS) suggested by Deaton and Muellbauer (1980).
667 Econometrics micEconCES Analysis with the Constant Elasticity of Substitution (CES) Function Tools for economic analysis and economic modelling with a Constant Elasticity of Substitution (CES) function.
668 Econometrics micEconSNQP Symmetric Normalized Quadratic Profit Function Production analysis with the Symmetric Normalized Quadratic (SNQ) profit function.
669 Econometrics midasr Mixed Data Sampling Regression Methods and tools for mixed frequency time series data analysis. Allows estimation, model selection and forecasting for MIDAS regressions.
670 Econometrics mlogit Multinomial Logit Model Estimation of the multinomial logit model.
671 Econometrics mnlogit Multinomial Logit Model Time- and memory-efficient estimation of multinomial logit models by maximum likelihood. Numerical optimization is performed by the Newton-Raphson method using an optimized, parallel C++ library to achieve fast computation of Hessian matrices. Motivated by large-scale multiclass classification problems in econometrics and machine learning.
672 Econometrics MNP R Package for Fitting the Multinomial Probit Model Fits the Bayesian multinomial probit model via Markov chain Monte Carlo. The multinomial probit model is often used to analyze the discrete choices made by individuals recorded in survey data. Examples where the multinomial probit model may be useful include the analysis of product choice by consumers in market research and the analysis of candidate or party choice by voters in electoral studies. The MNP package can also fit the model with different choice sets for each individual, and complete or partial individual choice orderings of the available alternatives from the choice set. The estimation is based on the efficient marginal data augmentation algorithm that is developed by Imai and van Dyk (2005). “A Bayesian Analysis of the Multinomial Probit Model Using the Data Augmentation,” Journal of Econometrics, Vol. 124, No. 2 (February), pp. 311-334. doi:10.1016/j.jeconom.2004.02.002 Detailed examples are given in Imai and van Dyk (2005). “MNP: R Package for Fitting the Multinomial Probit Model.” Journal of Statistical Software, Vol. 14, No. 3 (May), pp. 1-32. doi:10.18637/jss.v014.i03.
673 Econometrics MSBVAR Markov-Switching, Bayesian, Vector Autoregression Models Provides methods for estimating frequentist and Bayesian Vector Autoregression (VAR) models and Markov-switching Bayesian VAR (MSBVAR). Functions for reduced form and structural VAR models are also available. Includes methods for generating posterior inferences for these models, forecasts, impulse responses (using likelihood-based error bands), and forecast error decompositions. Also includes utility functions for plotting forecasts and impulse responses, and generating draws from Wishart and singular multivariate normal densities. Current version includes functionality to build and evaluate models with Markov switching.
674 Econometrics multiwayvcov Multi-Way Standard Error Clustering Exports two functions implementing multi-way clustering using the method suggested by Cameron, Gelbach, & Miller (2011) and cluster (or block) bootstrapping for estimating variance-covariance matrices. Normal one and two-way clustering matches the results of other common statistical packages. Missing values are handled transparently and rudimentary parallelization support is provided.
675 Econometrics mvProbit Multivariate Probit Models Tools for estimating multivariate probit models, calculating conditional and unconditional expectations, and calculating marginal effects on conditional and unconditional expectations.
676 Econometrics nlme Linear and Nonlinear Mixed Effects Models Fit and compare Gaussian linear and nonlinear mixed-effects models.
677 Econometrics nnet Feed-Forward Neural Networks and Multinomial Log-Linear Models Software for feed-forward neural networks with a single hidden layer, and for multinomial log-linear models.
678 Econometrics nonnest2 Tests of Non-Nested Models Testing non-nested models via theory supplied by Vuong (1989) doi:10.2307/1912557. Includes tests of model distinguishability and of model fit that can be applied to both nested and non-nested models. Also includes functionality to obtain confidence intervals associated with AIC and BIC. This material is based on work supported by the National Science Foundation under Grant Number SES-1061334.
679 Econometrics np Nonparametric Kernel Smoothing Methods for Mixed Data Types Nonparametric (and semiparametric) kernel methods that seamlessly handle a mix of continuous, unordered, and ordered factor data types. We would like to gratefully acknowledge support from the Natural Sciences and Engineering Research Council of Canada (NSERC:www.nserc.ca), the Social Sciences and Humanities Research Council of Canada (SSHRC:www.sshrc.ca), and the Shared Hierarchical Academic Research Computing Network (SHARCNET:www.sharcnet.ca).
680 Econometrics ordinal Regression Models for Ordinal Data Implementation of cumulative link (mixed) models also known as ordered regression models, proportional odds models, proportional hazards models for grouped survival times and ordered logit/probit/… models. Estimation is via maximum likelihood and mixed models are fitted with the Laplace approximation and adaptive Gauss-Hermite quadrature. Multiple random effect terms are allowed and they may be nested, crossed or partially nested/crossed. Restrictions of symmetry and equidistance can be imposed on the thresholds (cut-points/intercepts). Standard model methods are available (summary, anova, drop-methods, step, confint, predict etc.) in addition to profile methods and slice methods for visualizing the likelihood function and checking convergence.
681 Econometrics OrthoPanels Dynamic Panel Models with Orthogonal Reparameterization of Fixed Effects Implements the orthogonal reparameterization approach recommended by Lancaster (2002) to estimate dynamic panel models with fixed effects (and optionally panel-specific intercepts). The approach uses a likelihood-based estimator and produces estimates that are asymptotically unbiased as N goes to infinity, with T as low as 2.
682 Econometrics pampe Implementation of the Panel Data Approach Method for Program Evaluation Implements the Panel Data Approach Method for program evaluation as developed in Hsiao, Ching and Ki Wan (2012). pampe estimates the effect of an intervention or treatment by comparing the evolution of the outcome for an affected unit to its evolution had it not been affected.
683 Econometrics panelAR Estimation of Linear AR(1) Panel Data Models with Cross-Sectional Heteroskedasticity and/or Correlation The package estimates linear models on panel data structures in the presence of AR(1)-type autocorrelation as well as panel heteroskedasticity and/or contemporaneous correlation. First, AR(1)-type autocorrelation is addressed via a two-step Prais-Winsten feasible generalized least squares (FGLS) procedure, where the autocorrelation coefficients may be panel-specific. A number of common estimators for the autocorrelation coefficient are supported. In case of panel heteroskedasticity, one can choose to use a sandwich-type robust standard error estimator with OLS or a panel weighted least squares estimator after the two-step Prais-Winsten estimator. Alternatively, if panels are both heteroskedastic and contemporaneously correlated, the package supports panel-corrected standard errors (PCSEs) as well as the Parks-Kmenta FGLS estimator.
684 Econometrics Paneldata Linear Models for Panel Data Linear models for panel data: the fixed effects model and the random effects model.
685 Econometrics PANICr PANIC Tests of Nonstationarity A methodology that makes use of the factor structure of large dimensional panels to understand the nature of nonstationarity inherent in data. This is referred to as PANIC, Panel Analysis of Nonstationarity in Idiosyncratic and Common Components. PANIC (2004) doi:10.1111/j.1468-0262.2004.00528.x includes valid pooling methods that allow panel tests to be constructed. PANIC (2004) can detect whether the nonstationarity in a series is pervasive, or variable specific, or both. PANIC (2010) doi:10.1017/s0266466609990478 includes two new tests on the idiosyncratic component that estimate the pooled autoregressive coefficient and sample moment, respectively. The PANIC model approximates the number of factors based on Bai and Ng (2002) doi:10.1111/1468-0262.00273.
686 Econometrics pco Panel Cointegration Tests Computation of the Pedroni (1999) panel cointegration test statistics. Reported are the empirical and the standardized values.
687 Econometrics pcse Panel-Corrected Standard Error Estimation in R This package contains a function to estimate panel-corrected standard errors. Data may contain balanced or unbalanced panels.
688 Econometrics pder Panel Data Econometrics with R Data sets for the Panel Data Econometrics with R book.
689 Econometrics pdR Threshold Model and Unit Root Tests in Panel Data Threshold model, panel version of the Hylleberg et al. (1990) doi:10.1016/0304-4076(90)90080-D seasonal unit root tests, and the panel unit root test of Chang (2002) doi:10.1016/S0304-4076(02)00095-7.
690 Econometrics pglm Panel Generalized Linear Models Estimation of panel models for GLM-like models: this includes binomial models (logit and probit), count models (Poisson and negbin), and ordered models (logit and probit).
691 Econometrics phtt Panel Data Analysis with Heterogeneous Time Trends The package provides estimation procedures for panel data with large dimensions n, T, and general forms of unobservable heterogeneous effects. Particularly, the estimation procedures are those of Bai (2009) and Kneip, Sickles, and Song (2012), which complement one another very well: both models assume the unobservable heterogeneous effects to have a factor structure. The method of Bai (2009) assumes that the factors are stationary, whereas the method of Kneip et al. (2012) allows the factors to be non-stationary. Additionally, the ‘phtt’ package provides a wide range of dimensionality criteria in order to estimate the number of the unobserved factors simultaneously with the remaining model parameters.
692 Econometrics plm (core) Linear Models for Panel Data A set of estimators and tests for panel data econometrics.
693 Econometrics pscl Political Science Computational Laboratory Bayesian analysis of item-response theory (IRT) models, roll call analysis; computing highest density regions; maximum likelihood estimation of zero-inflated and hurdle models for count data; goodness-of-fit measures for GLMs; data sets used in writing and teaching at the Political Science Computational Laboratory; seats-votes curves.
694 Econometrics psidR Build Panel Data Sets from PSID Raw Data Makes it easy to build panel data in wide format from Panel Study of Income Dynamics (PSID) delivered raw data. Deals with data downloaded and pre-processed by ‘Stata’ or ‘SAS’, or can optionally download directly from the PSID server using the ‘SAScii’ package. ‘psidR’ takes care of merging data from each wave onto a cross-period index file, so that individuals can be followed over time. The user must specify which years they are interested in, and the PSID variable names (e.g. ER21003) for each year (they differ in each year). There are different panel data designs and sample subsetting criteria implemented (“SRC”, “SEO”, “immigrant” and “latino” samples).
695 Econometrics pwt Penn World Table (Versions 5.6, 6.x, 7.x) The Penn World Table provides purchasing power parity and national income accounts converted to international prices for 189 countries for some or all of the years 1950-2010.
696 Econometrics pwt8 Penn World Table (Version 8.x) The Penn World Table 8.x provides information on relative levels of income, output, inputs, and productivity for 167 countries between 1950 and 2011.
697 Econometrics pwt9 Penn World Table (Version 9.x) The Penn World Table 9.x provides information on relative levels of income, output, inputs, and productivity for 182 countries between 1950 and 2014.
698 Econometrics quantreg Quantile Regression Estimation and inference methods for models of conditional quantiles: Linear and nonlinear parametric and non-parametric (total variation penalized) models for conditional quantiles of a univariate response and several methods for handling censored survival data. Portfolio selection methods based on expected shortfall risk are also included.
699 Econometrics Rchoice Discrete Choice (Binary, Poisson and Ordered) Models with Random Parameters An implementation of simulated maximum likelihood method for the estimation of Binary (Probit and Logit), Ordered (Probit and Logit) and Poisson models with random parameters for cross-sectional and longitudinal data.
700 Econometrics rdd Regression Discontinuity Estimation Provides the tools to undertake estimation in Regression Discontinuity Designs. Both sharp and fuzzy designs are supported. Estimation is accomplished using local linear regression. A provided function will utilize Imbens-Kalyanaraman optimal bandwidth calculation. A function is also included to test the assumption of no-sorting effects.
701 Econometrics rddtools Toolbox for Regression Discontinuity Design (‘RDD’) Set of functions for Regression Discontinuity Design (‘RDD’), for data visualisation, estimation and testing.
702 Econometrics rdlocrand Local Randomization Methods for RD Designs The regression discontinuity (RD) design is a popular quasi-experimental design for causal inference and policy evaluation. Under the local randomization approach, RD designs can be interpreted as randomized experiments inside a window around the cutoff. This package provides tools to perform randomization inference for RD designs under local randomization: rdrandinf() to perform hypothesis testing using randomization inference, rdwinselect() to select a window around the cutoff in which randomization is likely to hold, rdsensitivity() to assess the sensitivity of the results to different window lengths and null hypotheses, and rdrbounds() to construct Rosenbaum bounds for sensitivity to unobserved confounders.
703 Econometrics rdrobust Robust Data-Driven Statistical Inference in Regression-Discontinuity Designs Regression-discontinuity (RD) designs are quasi-experimental research designs popular in social, behavioral and natural sciences. The RD design is usually employed to study the (local) causal effect of a treatment, intervention or policy. This package provides tools for data-driven graphical and analytical statistical inference in RD designs: rdrobust() to construct local-polynomial point estimators and robust confidence intervals for average treatment effects at the cutoff in Sharp, Fuzzy and Kink RD settings, rdbwselect() to perform bandwidth selection for the different procedures implemented, and rdplot() to conduct exploratory data analysis (RD plots).
704 Econometrics reldist Relative Distribution Methods Tools for the comparison of distributions. This includes nonparametric estimation of the relative distribution PDF and CDF and numerical summaries as described in “Relative Distribution Methods in the Social Sciences” by Mark S. Handcock and Martina Morris, Springer-Verlag, 1999, ISBN 0387987789.
705 Econometrics REndo Fitting Linear Models with Endogenous Regressors using Latent Instrumental Variables Fits linear models with endogenous regressors using latent instrumental variable approaches. The methods included in the package are Lewbel’s (1997) doi:10.2307/2171884 higher moments approach, Lewbel’s (2012) doi:10.1080/07350015.2012.643126 heteroskedasticity approach, Park and Gupta’s (2012) doi:10.1287/mksc.1120.0718 joint estimation method that uses a Gaussian copula, and Kim and Frees’s (2007) doi:10.1007/s11336-007-9008-1 multilevel generalized method of moments approach that deals with endogeneity in a multilevel setting. These are statistical techniques that address the endogeneity problem without requiring external instrumental variables. This version includes an omitted variable test in the multilevel estimation, reported by the summary() function for multilevelIV() results; resolves the error “Error in listIDs[, 1] : incorrect number of dimensions” when using the multilevelIV() function; and provides a new simulated dataset, dataMultilevelIV, on which to exemplify the multilevelIV() function.
706 Econometrics rms Regression Modeling Strategies Regression modeling, testing, estimation, validation, graphics, prediction, and typesetting by storing enhanced model design attributes in the fit. ‘rms’ is a collection of functions that assist with and streamline modeling. It also contains functions for binary and ordinal logistic regression models, ordinal models for continuous Y with a variety of distribution families, and the Buckley-James multiple regression model for right-censored responses, and implements penalized maximum likelihood estimation for logistic and ordinary linear models. ‘rms’ works with almost any regression model, but it was especially written to work with binary or ordinal regression models, Cox regression, accelerated failure time models, ordinary linear models, the Buckley-James model, generalized least squares for serially or spatially correlated observations, generalized linear models, and quantile regression.
707 Econometrics RSGHB Functions for Hierarchical Bayesian Estimation: A Flexible Approach Functions for estimating models using a Hierarchical Bayesian (HB) framework. The flexibility comes in allowing the user to specify the likelihood function directly instead of assuming predetermined model structures. Types of models that can be estimated with this code include the family of discrete choice models (Multinomial Logit, Mixed Logit, Nested Logit, Error Components Logit and Latent Class) as well as ordered response models like ordered probit and ordered logit. In addition, the package allows for flexibility in specifying parameters as either fixed (non-varying across individuals) or random with continuous distributions. Parameter distributions supported include normal, positive/negative log-normal, positive/negative censored normal, and the Johnson SB distribution. Kenneth Train’s Matlab and Gauss code for doing Hierarchical Bayesian estimation has served as the basis for a few of the functions included in this package. These Matlab/Gauss functions have been rewritten to be optimized within R. Considerable code has been added to increase the flexibility and usability of the code base. Train’s original Gauss and Matlab code can be found here: http://elsa.berkeley.edu/Software/abstracts/train1006mxlhb.html See Train’s chapter on HB in Discrete Choice with Simulation here: http://elsa.berkeley.edu/books/choice2.html; and his paper on using HB with non-normal distributions here: http://eml.berkeley.edu//~train/trainsonnier.pdf.
708 Econometrics rUnemploymentData Data and Functions for USA State and County Unemployment Data Contains data and visualization functions for USA unemployment data. Data comes from the US Bureau of Labor Statistics (BLS). State data is in ?df_state_unemployment and covers 2000-2013. County data is in ?df_county_unemployment and covers 1990-2013. Choropleth maps of the data can be generated with ?state_unemployment_choropleth() and ?county_unemployment_choropleth() respectively.
709 Econometrics sampleSelection Sample Selection Models Two-step estimation and maximum likelihood estimation of Heckman-type sample selection models: standard sample selection models (Tobit-2) and endogenous switching regression models (Tobit-5).
710 Econometrics sandwich (core) Robust Covariance Matrix Estimators Model-robust standard error estimators for cross-sectional, time series, clustered, panel, and longitudinal data.
711 Econometrics segmented Regression Models with Break-Points / Change-Points Estimation Given a regression model, segmented ‘updates’ the model by adding one or more segmented (i.e., piece-wise linear) relationships. Several variables with multiple breakpoints are allowed.
712 Econometrics sem Structural Equation Models Functions for fitting general linear structural equation models (with observed and latent variables) using the RAM approach, and for fitting structural equations in observed-variable models by two-stage least squares.
713 Econometrics SemiParSampleSel Semi-Parametric Sample Selection Modelling with Continuous or Discrete Response Routine for fitting continuous or discrete response copula sample selection models with semi-parametric predictors, including linear and nonlinear effects.
714 Econometrics semsfa Semiparametric Estimation of Stochastic Frontier Models Semiparametric estimation of stochastic frontier models following a two-step procedure: in the first step, semiparametric or nonparametric regression techniques are used to relax parametric restrictions on the functional form representing technology; in the second step, variance parameters are obtained by pseudolikelihood estimators or by the method of moments.
715 Econometrics sfa Stochastic Frontier Analysis Stochastic Frontier Analysis introduced by Aigner, Lovell and Schmidt (1976) and Battese and Coelli (1992, 1995).
716 Econometrics simpleboot Simple Bootstrap Routines Simple bootstrap routines
717 Econometrics SparseM Sparse Linear Algebra Some basic linear algebra functionality for sparse matrices is provided: including Cholesky decomposition and backsolving as well as standard R subsetting and Kronecker products.
718 Econometrics spatialprobit Spatial Probit Models Bayesian Estimation of Spatial Probit and Tobit Models.
719 Econometrics spdep Spatial Dependence: Weighting Schemes, Statistics and Models A collection of functions to create spatial weights matrix objects from polygon ‘contiguities’, from point patterns by distance and tessellations, for summarizing these objects, and for permitting their use in spatial data analysis, including regional aggregation by minimum spanning tree; a collection of tests for spatial ‘autocorrelation’, including global ‘Morans I’, ‘APLE’, ‘Gearys C’, ‘Hubert/Mantel’ general cross product statistic, Empirical Bayes estimates and ‘Assuncao/Reis’ Index, ‘Getis/Ord’ G and multicoloured join count statistics, local ‘Moran’s I’ and ‘Getis/Ord’ G, ‘saddlepoint’ approximations and exact tests for global and local ‘Moran’s I’; and functions for estimating spatial simultaneous ‘autoregressive’ (‘SAR’) lag and error models, impact measures for lag models, weighted and ‘unweighted’ ‘SAR’ and ‘CAR’ spatial regression models, semi-parametric and Moran ‘eigenvector’ spatial filtering, ‘GM SAR’ error models, and generalized spatial two stage least squares models.
720 Econometrics spfrontier Spatial Stochastic Frontier Models A set of tools for estimation of various spatial specifications of stochastic frontier models.
721 Econometrics sphet Estimation of spatial autoregressive models with and without heteroskedastic innovations Generalized Method of Moment estimation of Cliff-Ord-type spatial autoregressive models with and without heteroskedastic innovations
722 Econometrics splm Econometric Models for Spatial Panel Data ML and GM estimation and diagnostic testing of econometric models for spatial panel data.
723 Econometrics ssfa Spatial Stochastic Frontier Analysis Spatial Stochastic Frontier Analysis (SSFA) is an original method for controlling the spatial heterogeneity in Stochastic Frontier Analysis (SFA) models, for cross-sectional data, by splitting the inefficiency term into three terms: the first one related to spatial peculiarities of the territory in which each single unit operates, the second one related to the specific production features and the third one representing the error term.
724 Econometrics strucchange Testing, Monitoring, and Dating Structural Changes Testing, monitoring and dating structural changes in (linear) regression models. strucchange features tests/methods from the generalized fluctuation test framework as well as from the F test (Chow test) framework. This includes methods to fit, plot and test fluctuation processes (e.g., CUSUM, MOSUM, recursive/moving estimates) and F statistics, respectively. It is possible to monitor incoming data online using fluctuation processes. Finally, the breakpoints in regression models with structural changes can be estimated together with confidence intervals. Emphasis is always given to methods for visualizing the data.
725 Econometrics survival Survival Analysis Contains the core survival analysis routines, including definition of Surv objects, Kaplan-Meier and Aalen-Johansen (multi-state) curves, Cox models, and parametric accelerated failure time models.
726 Econometrics systemfit Estimating Systems of Simultaneous Equations Fitting simultaneous systems of linear and nonlinear equations using Ordinary Least Squares (OLS), Weighted Least Squares (WLS), Seemingly Unrelated Regressions (SUR), Two-Stage Least Squares (2SLS), Weighted Two-Stage Least Squares (W2SLS), and Three-Stage Least Squares (3SLS).
727 Econometrics truncreg Truncated Gaussian Regression Models Estimation of models for truncated Gaussian variables by maximum likelihood.
728 Econometrics tsDyn Nonlinear Time Series Models with Regime Switching Implements nonlinear autoregressive (AR) time series models. For univariate series, a non-parametric approach is available through additive nonlinear AR. Parametric modeling and testing for regime switching dynamics is available when the transition is either direct (TAR: threshold AR) or smooth (STAR: smooth transition AR, LSTAR). For multivariate series, one can estimate a range of TVAR or threshold cointegration TVECM models with two or three regimes. Tests can be conducted for TVAR as well as for TVECM (Hansen and Seo 2002 and Seo 2006).
729 Econometrics tseries (core) Time Series Analysis and Computational Finance Time series analysis and computational finance.
730 Econometrics tsfa Time Series Factor Analysis Extraction of Factors from Multivariate Time Series. See ?00tsfa-Intro for more details.
731 Econometrics urca (core) Unit Root and Cointegration Tests for Time Series Data Unit root and cointegration tests encountered in applied econometric analysis are implemented.
732 Econometrics vars VAR Modelling Estimation, lag selection, diagnostic testing, forecasting, causality analysis, forecast error variance decomposition and impulse response functions of VAR models and estimation of SVAR and SVEC models.
733 Econometrics VGAM Vector Generalized Linear and Additive Models An implementation of about 6 major classes of statistical regression models. At the heart of it are the vector generalized linear and additive model (VGLM/VGAM) classes, and the book “Vector Generalized Linear and Additive Models: With an Implementation in R” (Yee, 2015) doi:10.1007/978-1-4939-2818-7 gives details of the statistical framework and the VGAM package. Currently only fixed-effects models are implemented, i.e., no random-effects models. Many (150+) models and distributions are estimated by maximum likelihood estimation (MLE) or penalized MLE, using Fisher scoring. VGLMs can be loosely thought of as multivariate GLMs. VGAMs are data-driven VGLMs (i.e., with smoothing). The other classes are RR-VGLMs (reduced-rank VGLMs), quadratic RR-VGLMs, reduced-rank VGAMs, and RCIMs (row-column interaction models). These classes perform constrained and unconstrained quadratic ordination (CQO/UQO) in ecology, as well as constrained additive ordination (CAO). Note that these functions are subject to change; see the NEWS and ChangeLog files for latest changes.
734 Econometrics wahc Autocorrelation and Heteroskedasticity Correction in Fixed Effect Panel Data Model Fits the fixed effect panel data model with heteroskedasticity and autocorrelation correction.
735 Econometrics wbstats Programmatic Access to Data and Statistics from the World Bank API Tools for searching and downloading data and statistics from the World Bank Data API (http://data.worldbank.org/developers/api-overview) and the World Bank Data Catalog API (http://data.worldbank.org/developers/data-catalog-api).
736 Econometrics wooldridge 105 Data Sets from “Introductory Econometrics: A Modern Approach” by Jeffrey M. Wooldridge Those new to econometrics and R may find themselves challenged by data management tasks inherent to both. The wooldridge data package aims to lighten the task by efficiently loading any data set from the text with a single command. Collectively, all data sets have been compressed to 62.73% of their original size. Most sets have robust documentation including page numbers on which they are used, original data sources, original year of publication, and notes which chronicle their history while offering ideas for further exploration and research. To resurrect a data set, one can pass its name to the ‘data()’ function or just define it as the ‘data =’ argument of the model function. The data will lazily load and, provided the syntax is correct, model estimates shall spring forth from the otherwise lifeless abyss of your R console! If the syntax is an issue, the wooldridge-vignette displays solutions to examples from each chapter of the text, providing a relevant introduction to econometric modeling with R. The vignette closes with an Appendix of recommended sources for R and econometrics. Note: Data sets are from the 5th edition (Wooldridge 2013, ISBN-13:978-1-111-53104-1), and are compatible with all others.
737 Econometrics xts eXtensible Time Series Provide for uniform handling of R’s different time-based data classes by extending zoo, maximizing native format information preservation and allowing for user level customization and extension, while simplifying cross-class interoperability.
738 Econometrics Zelig Everyone’s Statistical Software A framework that brings together an abundance of common statistical models found across packages into a unified interface, and provides a common architecture for estimation and interpretation, as well as bridging functions to absorb increasingly more models into the package. Zelig allows each individual package, for each statistical model, to be accessed by a common uniformly structured call and set of arguments. Moreover, Zelig automates all the surrounding building blocks of a statistical work-flow: procedures and algorithms that may be essential to one user’s application but which the original package developer did not use in their own research and might not themselves support. These include bootstrapping, jackknifing, and re-weighting of data. In particular, Zelig automatically generates predicted and simulated quantities of interest (such as relative risk ratios, average treatment effects, first differences and predicted and expected values) to interpret and visualize complex models.
739 Econometrics zoo (core) S3 Infrastructure for Regular and Irregular Time Series (Z’s Ordered Observations) An S3 class with methods for totally ordered indexed observations. It is particularly aimed at irregular time series of numeric vectors/matrices and factors. zoo’s key design goals are independence of a particular index/date/time class and consistency with ts and base R by providing methods to extend standard generics.
740 Econometrics zTree Functions to Import Data from ‘z-Tree’ into R Read ‘.xls’ and ‘.sbj’ files which are written by the Microsoft Windows program ‘z-Tree’. The latter is a software for developing and carrying out economic experiments (see http://www.ztree.uzh.ch/ for more information).
741 Analysis of Ecological and Environmental Data ade4 (core) Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) doi:10.18637/jss.v022.i04.
742 Analysis of Ecological and Environmental Data adehabitat Analysis of Habitat Selection by Animals A collection of tools for the analysis of habitat selection by animals.
743 Analysis of Ecological and Environmental Data amap Another Multidimensional Analysis Package Tools for Clustering and Principal Component Analysis (With robust methods, and parallelized functions).
744 Analysis of Ecological and Environmental Data analogue Analogue and Weighted Averaging Methods for Palaeoecology Fits Modern Analogue Technique and Weighted Averaging transfer function models for prediction of environmental data from species data, and related methods used in palaeoecology.
745 Analysis of Ecological and Environmental Data aod Analysis of Overdispersed Data This package provides a set of functions to analyse overdispersed counts or proportions. Most of the methods are already available elsewhere but are scattered in different packages. The proposed functions should be considered as complements to more sophisticated methods such as generalized estimating equations (GEE) or generalized linear mixed effect models (GLMM).
746 Analysis of Ecological and Environmental Data ape Analyses of Phylogenetics and Evolution Functions for reading, writing, plotting, and manipulating phylogenetic trees, analyses of comparative data in a phylogenetic framework, ancestral character analyses, analyses of diversification and macroevolution, computing distances from DNA sequences, reading and writing nucleotide sequences as well as importing from BioConductor, and several tools such as Mantel’s test, generalized skyline plots, graphical exploration of phylogenetic data (alex, trex, kronoviz), estimation of absolute evolutionary rates and clock-like trees using mean path lengths and penalized likelihood, dating trees with non-contemporaneous sequences, translating DNA into AA sequences, and assessing sequence alignments. Phylogeny estimation can be done with the NJ, BIONJ, ME, MVR, SDM, and triangle methods, and several methods handling incomplete distance matrices (NJ, BIONJ, MVR*, and the corresponding triangle method). Some functions call external applications (PhyML, Clustal, T-Coffee, Muscle) whose results are returned into R.
747 Analysis of Ecological and Environmental Data aqp Algorithms for Quantitative Pedology A collection of algorithms related to modeling of soil resources, soil classification, soil profile aggregation, and visualization.
748 Analysis of Ecological and Environmental Data BiodiversityR Package for Community Ecology and Suitability Analysis Graphical User Interface (via the R-Commander) and utility functions (often based on the vegan package) for statistical analysis of biodiversity and ecological communities, including species accumulation curves, diversity indices, Renyi profiles, GLMs for analysis of species abundance and presence-absence, distance matrices, Mantel tests, and cluster, constrained and unconstrained ordination analysis. A book on biodiversity and community ecology analysis is available for free download from the website. In 2012, methods for (ensemble) suitability modelling and mapping were expanded in the package.
749 Analysis of Ecological and Environmental Data boussinesq Analytic Solutions for (ground-water) Boussinesq Equation This package is a collection of R functions implemented from published and available analytic solutions for the One-Dimensional Boussinesq Equation (ground-water). In particular, the function “beq.lin” is the analytic solution of the linearized form of Boussinesq Equation between two different head-based boundary (Dirichlet) conditions; “beq.song” is the non-linear power-series analytic solution of the motion of a wetting front over a dry bedrock (Song et al., 2007, see complete reference on function documentation). Bugs/comments/questions/collaboration of any kind are warmly welcomed.
750 Analysis of Ecological and Environmental Data bReeze Functions for Wind Resource Assessment A collection of functions to analyse, visualize and interpret wind data and to calculate the potential energy production of wind turbines.
751 Analysis of Ecological and Environmental Data CircStats Circular Statistics, from “Topics in Circular Statistics” (2001) Circular Statistics, from “Topics in Circular Statistics” (2001) S. Rao Jammalamadaka and A. SenGupta, World Scientific.
752 Analysis of Ecological and Environmental Data circular Circular Statistics Circular Statistics, from “Topics in Circular Statistics” (2001) S. Rao Jammalamadaka and A. SenGupta, World Scientific.
753 Analysis of Ecological and Environmental Data cluster (core) “Finding Groups in Data”: Cluster Analysis Extended Rousseeuw et al. Methods for Cluster analysis. Much extended the original from Peter Rousseeuw, Anja Struyf and Mia Hubert, based on Kaufman and Rousseeuw (1990) “Finding Groups in Data”.
754 Analysis of Ecological and Environmental Data cocorresp Co-Correspondence Analysis Methods Fits predictive and symmetric co-correspondence analysis (CoCA) models to relate one data matrix to another data matrix. More specifically, CoCA maximises the weighted covariance between the weighted averaged species scores of one community and the weighted averaged species scores of another community. CoCA attempts to find patterns that are common to both communities.
755 Analysis of Ecological and Environmental Data Distance Distance Sampling Detection Function and Abundance Estimation A simple way of fitting detection functions to distance sampling data for both line and point transects. Adjustment term selection, left and right truncation as well as monotonicity constraints and binning are supported. Abundance and density estimates can also be calculated (via a Horvitz-Thompson-like estimator) if survey area information is provided.
756 Analysis of Ecological and Environmental Data diveMove Dive Analysis and Calibration Utilities to represent, visualize, filter, analyse, and summarize time-depth recorder (TDR) data. Miscellaneous functions for handling location data are also provided.
757 Analysis of Ecological and Environmental Data dse Dynamic Systems Estimation (Time Series Package) Tools for multivariate, linear, time-invariant, time series models. This includes ARMA and state-space representations, and methods for converting between them. It also includes simulation methods and several estimation functions. The package has functions for looking at model roots, stability, and forecasts at different horizons. The ARMA model representation is general, so that VAR, VARX, ARIMA, ARMAX, ARIMAX can all be considered to be special cases. Kalman filter and smoother estimates can be obtained from the state space model, and state-space model reduction techniques are implemented. An introduction and User’s Guide is available in a vignette.
758 Analysis of Ecological and Environmental Data dsm Density Surface Modelling of Distance Sampling Data Density surface modelling of line transect data. A Generalized Additive Model-based approach is used to calculate spatially-explicit estimates of animal abundance from distance sampling (also presence/absence and strip transect) data. Several utility functions are provided for model checking, plotting and variance estimation.
759 Analysis of Ecological and Environmental Data DSpat Spatial Modelling for Distance Sampling Data Fits inhomogeneous Poisson process spatial models to line transect sampling data and provides estimate of abundance within a region.
760 Analysis of Ecological and Environmental Data dyn Time Series Regression Time series regression. The dyn class interfaces ts, irts(), zoo() and zooreg() time series classes to lm(), glm(), loess(), quantreg::rq(), MASS::rlm(), MCMCpack::MCMCregress(), randomForest::randomForest() and other regression functions, allowing those functions to be used with time series including specifications that may contain lags, diffs and missing values.
761 Analysis of Ecological and Environmental Data dynatopmodel Implementation of the Dynamic TOPMODEL Hydrological Model A native R implementation and enhancement of the Dynamic TOPMODEL semi-distributed hydrological model. Includes some pre-processing and output routines.
762 Analysis of Ecological and Environmental Data dynlm Dynamic Linear Regression Dynamic linear models and time series regression.
763 Analysis of Ecological and Environmental Data e1071 Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien Functions for latent class analysis, short time Fourier transform, fuzzy clustering, support vector machines, shortest path computation, bagged clustering, naive Bayes classifier, …
764 Analysis of Ecological and Environmental Data earth Multivariate Adaptive Regression Splines Build regression models using the techniques in Friedman’s papers “Fast MARS” and “Multivariate Adaptive Regression Splines”. (The term “MARS” is trademarked and thus not used in the name of the package.)
765 Analysis of Ecological and Environmental Data eco Ecological Inference in 2x2 Tables Implements the Bayesian and likelihood methods proposed in Imai, Lu, and Strauss (2008 doi:10.1093/pan/mpm017) and (2011 doi:10.18637/jss.v042.i05) for ecological inference in 2 by 2 tables as well as the method of bounds introduced by Duncan and Davis (1953). The package fits both parametric and nonparametric models using either the Expectation-Maximization algorithms (for likelihood models) or the Markov chain Monte Carlo algorithms (for Bayesian models). For all models, the individual-level data can be directly incorporated into the estimation whenever such data are available. Along with in-sample and out-of-sample predictions, the package also provides a functionality which allows one to quantify the effect of data aggregation on parameter estimation and hypothesis testing under the parametric likelihood models.
766 Analysis of Ecological and Environmental Data ecodist Dissimilarity-Based Functions for Ecological Analysis Dissimilarity-based analysis functions including ordination and Mantel test functions, intended for use with spatial and community data.
767 Analysis of Ecological and Environmental Data EcoHydRology A community modeling foundation for Eco-Hydrology Provides a flexible foundation on which scientists, engineers, and policy makers can base teaching exercises, as well as for more applied use in modelling complex eco-hydrological interactions.
768 Analysis of Ecological and Environmental Data EnvStats Package for Environmental Statistics, Including US EPA Guidance Graphical and statistical analyses of environmental data, with focus on analyzing chemical concentrations and physical parameters, usually in the context of mandated environmental monitoring. Major environmental statistical methods found in the literature and regulatory guidance documents, with extensive help that explains what these methods do, how to use them, and where to find them in the literature. Numerous built-in data sets from regulatory guidance documents and environmental statistics literature. Includes scripts reproducing analyses presented in the book “EnvStats: An R Package for Environmental Statistics” (Millard, 2013, Springer, ISBN 978-1-4614-8455-4, http://www.springer.com/book/9781461484554).
769 Analysis of Ecological and Environmental Data equivalence Provides Tests and Graphics for Assessing Tests of Equivalence Provides statistical tests and graphics for assessing tests of equivalence. Such tests have similarity as the alternative hypothesis instead of the null. Sample data sets are included.
770 Analysis of Ecological and Environmental Data evd Functions for Extreme Value Distributions Extends simulation, distribution, quantile and density functions to univariate and multivariate parametric extreme value distributions, and provides fitting functions which calculate maximum likelihood estimates for univariate and bivariate maxima models, and for univariate and bivariate threshold models.
771 Analysis of Ecological and Environmental Data evdbayes Bayesian Analysis in Extreme Value Theory Provides functions for the Bayesian analysis of extreme value models, using MCMC methods.
772 Analysis of Ecological and Environmental Data evir Extreme Values in R Functions for extreme value theory, which may be divided into the following groups: exploratory data analysis, block maxima, peaks over thresholds (univariate and bivariate), point processes, and GEV/GPD distributions.
773 Analysis of Ecological and Environmental Data extRemes Extreme Value Analysis Functions for performing extreme value analysis.
774 Analysis of Ecological and Environmental Data fast Implementation of the Fourier Amplitude Sensitivity Test (FAST) The Fourier Amplitude Sensitivity Test (FAST) is a method to determine global sensitivities of a model on parameter changes with relatively few model runs. This package implements this sensitivity analysis method.
775 Analysis of Ecological and Environmental Data FD Measuring functional diversity (FD) from multiple traits, and other tools for functional ecology FD is a package to compute different multidimensional FD indices. It implements a distance-based framework to measure FD that allows any number and type of functional traits, and can also consider species relative abundances. It also contains other useful tools for functional ecology.
776 Analysis of Ecological and Environmental Data flexmix Flexible Mixture Modeling A general framework for finite mixtures of regression models using the EM algorithm is implemented. The package provides the E-step and all data handling, while the M-step can be supplied by the user to easily define new models. Existing drivers implement mixtures of standard linear models, generalized linear models and model-based clustering.
777 Analysis of Ecological and Environmental Data forecast Forecasting Functions for Time Series and Linear Models Methods and tools for displaying and analysing univariate time series forecasts including exponential smoothing via state space models and automatic ARIMA modelling.
778 Analysis of Ecological and Environmental Data fso Fuzzy Set Ordination Fuzzy set ordination is a multivariate analysis used in ecology to relate the composition of samples to possible explanatory variables. While differing in theory and method, in practice its use is similar to ‘constrained ordination’. The package contains plotting and summary functions as well as the analyses.
779 Analysis of Ecological and Environmental Data gam Generalized Additive Models Functions for fitting and working with generalized additive models, as described in chapter 7 of “Statistical Models in S” (Chambers and Hastie (eds), 1991), and “Generalized Additive Models” (Hastie and Tibshirani, 1990).
780 Analysis of Ecological and Environmental Data gamair Data for “GAMs: An Introduction with R” Data sets and scripts used in the book “Generalized Additive Models: An Introduction with R”, Wood (2006) CRC.
781 Analysis of Ecological and Environmental Data hydroGOF Goodness-of-Fit Functions for Comparison of Simulated and Observed Hydrological Time Series S3 functions implementing both statistical and graphical goodness-of-fit measures between observed and simulated values, mainly oriented to be used during the calibration, validation, and application of hydrological models. Missing values in observed and/or simulated values can be removed before computations. Comments / questions / collaboration of any kind are very welcome.
782 Analysis of Ecological and Environmental Data HydroMe R codes for estimating water retention and infiltration model parameters using experimental data Version 2 of the HydroMe package. It estimates the parameters of infiltration and water retention models by curve fitting. The models considered are those commonly used in soil science. This version adds new models for the water retention characteristic curve and fixes errors present in HydroMe v.1.
783 Analysis of Ecological and Environmental Data hydroPSO Particle Swarm Optimisation, with focus on Environmental Models This package implements a state-of-the-art version of the Particle Swarm Optimisation (PSO) algorithm (SPSO-2011 and SPSO-2007 capable). hydroPSO can be used as a replacement for the ‘optim’ R function for (global) optimization of non-smooth and non-linear functions. However, the main focus of hydroPSO is the calibration of environmental and other real-world models that need to be executed from the system console. hydroPSO is model-independent, allowing the user to easily interface any computer simulation model with the calibration engine (PSO). hydroPSO communicates with the model through the model’s own input and output files, without requiring access to the model’s source code. Several PSO variants and controlling options are included to fine-tune the performance of the calibration engine to different calibration problems. An advanced sensitivity analysis function together with user-friendly plotting summaries facilitate the interpretation and assessment of the calibration results. hydroPSO is parallel-capable, to alleviate the computational burden of complex models with “long” execution time. Bug reports/comments/questions are very welcome (in English, Spanish or Italian).
784 Analysis of Ecological and Environmental Data hydroTSM Time Series Management, Analysis and Interpolation for Hydrological Modelling S3 functions for management, analysis, interpolation and plotting of time series used in hydrology and related environmental sciences. In particular, this package is highly oriented to hydrological modelling tasks. The focus of this package is on providing a collection of tools useful for the daily work of hydrologists (although an effort was made to optimise each function as much as possible, functionality has had priority over speed). Bugs / comments / questions / collaboration of any kind are very welcome, and in particular datasets that can be included in this package for academic purposes.
785 Analysis of Ecological and Environmental Data Interpol.T Hourly interpolation of multiple temperature daily series Hourly interpolation of daily minimum and maximum temperature series. Carries out interpolation on multiple series at once. Requires some hourly series for calibration (alternatively, a default calibration table can be used).
786 Analysis of Ecological and Environmental Data ipred Improved Predictors Improved predictive models by indirect classification and bagging for classification, regression and survival problems as well as resampling based estimators of prediction error.
787 Analysis of Ecological and Environmental Data ismev An Introduction to Statistical Modeling of Extreme Values Functions to support the computations carried out in ‘An Introduction to Statistical Modeling of Extreme Values’ by Stuart Coles. The functions may be divided into the following groups; maxima/minima, order statistics, peaks over thresholds and point processes.
788 Analysis of Ecological and Environmental Data labdsv (core) Ordination and Multivariate Analysis for Ecology A variety of ordination and community analyses useful in analysis of data sets in community ecology. Includes many of the common ordination methods, with graphical routines to facilitate their interpretation, as well as several novel analyses.
789 Analysis of Ecological and Environmental Data latticeDensity Density estimation and nonparametric regression on irregular regions This package contains functions that compute the lattice-based density estimator of Barry and McIntyre, which accounts for point processes in two-dimensional regions with irregular boundaries and holes. The package also implements two-dimensional non-parametric regression for similar regions.
790 Analysis of Ecological and Environmental Data lme4 Linear Mixed-Effects Models using ‘Eigen’ and S4 Fit linear and generalized linear mixed-effects models. The models and their components are represented using S4 classes and methods. The core computational algorithms are implemented using the ‘Eigen’ C++ library for numerical linear algebra and ‘RcppEigen’ “glue”.
791 Analysis of Ecological and Environmental Data maptree Mapping, pruning, and graphing tree models Functions with example data for graphing, pruning, and mapping models from hierarchical clustering, and classification and regression trees.
792 Analysis of Ecological and Environmental Data marked Mark-Recapture Analysis for Survival and Abundance Estimation Functions for fitting various models to capture-recapture data including fixed and mixed-effects Cormack-Jolly-Seber (CJS) for survival estimation and POPAN structured Jolly-Seber models for abundance estimation. Includes a CJS model that concurrently estimates and corrects for tag loss. Hidden Markov model (HMM) implementations of CJS and multistate models with and without state uncertainty.
793 Analysis of Ecological and Environmental Data MASS (core) Support Functions and Datasets for Venables and Ripley’s MASS Functions and datasets to support Venables and Ripley, “Modern Applied Statistics with S” (4th edition, 2002).
794 Analysis of Ecological and Environmental Data mclust Gaussian Mixture Modelling for Model-Based Clustering, Classification, and Density Estimation Gaussian finite mixture models fitted via EM algorithm for model-based clustering, classification, and density estimation, including Bayesian regularization, dimension reduction for visualisation, and resampling-based inference.
795 Analysis of Ecological and Environmental Data mda Mixture and Flexible Discriminant Analysis Mixture and flexible discriminant analysis, multivariate adaptive regression splines (MARS), BRUTO, …
796 Analysis of Ecological and Environmental Data mefa Multivariate Data Handling in Ecology and Biogeography A framework package aimed at providing a standardized computational environment for specialist work via object classes to represent the data coded by samples, taxa and segments (i.e. subpopulations, repeated measures). It supports easy processing of the data along with cross tabulation and relational data tables for samples and taxa. An object of class ‘mefa’ is a project-specific compendium of the data and can be easily used in further analyses. Methods are provided for extraction, aggregation, conversion, plotting, summary and reporting of ‘mefa’ objects. Reports can be generated in plain text or LaTeX format. The vignette contains worked examples.
797 Analysis of Ecological and Environmental Data metacom Analysis of the ‘Elements of Metacommunity Structure’ Functions to analyze coherence, boundary clumping, and turnover following the pattern-based metacommunity analysis of Leibold and Mikkelson 2002 doi:10.1034/j.1600-0706.2002.970210.x. The package also includes functions to visualize ecological networks, and to calculate modularity as a replacement to boundary clumping.
798 Analysis of Ecological and Environmental Data mgcv (core) Mixed GAM Computation Vehicle with Automatic Smoothness Estimation Generalized additive (mixed) models, some of their extensions and other generalized ridge regression with multiple smoothing parameter estimation by (Restricted) Marginal Likelihood, Generalized Cross Validation and similar. Includes a gam() function, a wide variety of smoothers, JAGS support and distributions beyond the exponential family.
799 Analysis of Ecological and Environmental Data mrds Mark-Recapture Distance Sampling Animal abundance estimation via conventional, multiple covariate and mark-recapture distance sampling (CDS/MCDS/MRDS). Detection function fitting is performed via maximum likelihood. Also included are diagnostics and plotting for fitted detection functions. Abundance estimation is via a Horvitz-Thompson-like estimator.
800 Analysis of Ecological and Environmental Data nlme Linear and Nonlinear Mixed Effects Models Fit and compare Gaussian linear and nonlinear mixed-effects models.
801 Analysis of Ecological and Environmental Data nsRFA Non-supervised Regional Frequency Analysis A collection of statistical tools for objective (non-supervised) applications of the Regional Frequency Analysis methods in hydrology. The package refers to the index-value method and, more precisely, helps the hydrologist to: (1) regionalize the index-value; (2) form homogeneous regions with similar growth curves; (3) fit distribution functions to the empirical regional growth curves.
802 Analysis of Ecological and Environmental Data oce Analysis of Oceanographic Data Supports the analysis of Oceanographic data, including ‘ADCP’ measurements, measurements made with ‘argo’ floats, ‘CTD’ measurements, sectional data, sea-level time series, coastline and topographic data, etc. Provides specialized functions for calculating seawater properties such as potential temperature in either the ‘UNESCO’ or ‘TEOS-10’ equation of state. Produces graphical displays that conform to the conventions of the Oceanographic literature.
803 Analysis of Ecological and Environmental Data openair Tools for the Analysis of Air Pollution Data Tools to analyse, interpret and understand air pollution data. Data are typically hourly time series and both monitoring data and dispersion model output can be analysed. Many functions can also be applied to other data, including meteorological and traffic data.
804 Analysis of Ecological and Environmental Data ouch Ornstein-Uhlenbeck Models for Phylogenetic Comparative Hypotheses Fit and compare Ornstein-Uhlenbeck models for evolution along a phylogenetic tree.
805 Analysis of Ecological and Environmental Data party A Laboratory for Recursive Partytioning A computational toolbox for recursive partitioning. The core of the package is ctree(), an implementation of conditional inference trees which embed tree-structured regression models into a well defined theory of conditional inference procedures. This non-parametric class of regression trees is applicable to all kinds of regression problems, including nominal, ordinal, numeric, censored as well as multivariate response variables and arbitrary measurement scales of the covariates. Based on conditional inference trees, cforest() provides an implementation of Breiman’s random forests. The function mob() implements an algorithm for recursive partitioning based on parametric models (e.g. linear models, GLMs or survival regression) employing parameter instability tests for split selection. Extensible functionality for visualizing tree-structured regression models is available. The methods are described in Hothorn et al. (2006) doi:10.1198/106186006X133933, Zeileis et al. (2008) doi:10.1198/106186008X319331 and Strobl et al. (2007) doi:10.1186/1471-2105-8-25.
806 Analysis of Ecological and Environmental Data pastecs Package for Analysis of Space-Time Ecological Series Regulation, decomposition and analysis of space-time series. The pastecs library is a PNEC-Art4 and IFREMER (Benoit Beliaeff Benoit.Beliaeff@ifremer.fr) initiative to bring PASSTEC 2000 (http://www.obs-vlfr.fr/~enseigne/anado/passtec/passtec.htm) functionalities to R.
807 Analysis of Ecological and Environmental Data pgirmess Data Analysis in Ecology Miscellaneous functions for data analysis in ecology, with special emphasis on spatial data.
808 Analysis of Ecological and Environmental Data popbio Construction and Analysis of Matrix Population Models Construct and analyze projection matrix models from a demography study of marked individuals classified by age or stage. The package covers methods described in Matrix Population Models by Caswell (2001) and Quantitative Conservation Biology by Morris and Doak (2002).
809 Analysis of Ecological and Environmental Data prabclus Functions for Clustering of Presence-Absence, Abundance and Multilocus Genetic Data Distance-based parametric bootstrap tests for clustering with spatial neighborhood information, some distance measures, clustering of presence-absence, abundance and multilocus genetic data for species delimitation, and nearest-neighbor based noise detection. Try package?prabclus for an overview.
810 Analysis of Ecological and Environmental Data primer Functions and data for A Primer of Ecology with R Functions are primarily functions for systems of ordinary differential equations, difference equations, and eigenanalysis and projection of demographic matrices; data are for examples.
811 Analysis of Ecological and Environmental Data pscl Political Science Computational Laboratory Bayesian analysis of item-response theory (IRT) models, roll call analysis; computing highest density regions; maximum likelihood estimation of zero-inflated and hurdle models for count data; goodness-of-fit measures for GLMs; data sets used in writing and teaching at the Political Science Computational Laboratory; seats-votes curves.
812 Analysis of Ecological and Environmental Data pvclust Hierarchical Clustering with P-Values via Multiscale Bootstrap Resampling An implementation of multiscale bootstrap resampling for assessing the uncertainty in hierarchical cluster analysis. It provides AU (approximately unbiased) p-value as well as BP (bootstrap probability) value for each cluster in a dendrogram.
813 Analysis of Ecological and Environmental Data qualV Qualitative Validation Methods Qualitative methods for the validation of dynamic models. It contains (i) an orthogonal set of deviance measures for absolute, relative and ordinal scale and (ii) approaches accounting for time shifts. The first approach transforms time to take time delays and speed differences into account. The second divides the time series into interval units according to their main features and finds the longest common subsequence (LCS) using a dynamic programming algorithm.
814 Analysis of Ecological and Environmental Data quantreg Quantile Regression Estimation and inference methods for models of conditional quantiles: Linear and nonlinear parametric and non-parametric (total variation penalized) models for conditional quantiles of a univariate response and several methods for handling censored survival data. Portfolio selection methods based on expected shortfall risk are also included.
815 Analysis of Ecological and Environmental Data quantregGrowth Growth Charts via Regression Quantiles Fits non-crossing regression quantiles as a function of linear covariates and smooth terms via B-splines with difference penalties.
816 Analysis of Ecological and Environmental Data randomForest Breiman and Cutler’s Random Forests for Classification and Regression Classification and regression based on a forest of trees using random inputs.
817 Analysis of Ecological and Environmental Data Rcapture Loglinear Models for Capture-Recapture Experiments Estimation of abundance and other demographic parameters for closed populations, open populations and the robust design in capture-recapture experiments using loglinear models.
818 Analysis of Ecological and Environmental Data rioja Analysis of Quaternary Science Data Functions for the analysis of Quaternary science data, including constrained clustering, WA, WAPLS, IKFA, MLRC and MAT transfer functions, and stratigraphic diagrams.
819 Analysis of Ecological and Environmental Data RMark R Code for Mark Analysis An interface to the software package MARK that constructs input files for MARK and extracts the output. MARK was developed by Gary White and is freely available at http://www.phidot.org/software/mark/downloads/ but is not open source.
820 Analysis of Ecological and Environmental Data RMAWGEN Multi-Site Auto-Regressive Weather GENerator S3 and S4 functions are implemented for spatial multi-site stochastic generation of daily time series of temperature and precipitation. These tools make use of Vector AutoRegressive models (VARs). The weather generator model is then saved as an object and is calibrated by daily instrumental “Gaussianized” time series through the ‘vars’ package tools. Once the model is obtained, it can be used for weather generation and adapted to work with several climatic monthly time series.
821 Analysis of Ecological and Environmental Data rpart Recursive Partitioning and Regression Trees Recursive partitioning for classification, regression and survival trees. An implementation of most of the functionality of the 1984 book by Breiman, Friedman, Olshen and Stone.
822 Analysis of Ecological and Environmental Data rtop Interpolation of Data with Variable Spatial Support Geostatistical interpolation of data with irregular spatial support such as runoff related data or data from administrative units.
823 Analysis of Ecological and Environmental Data seacarb Seawater Carbonate Chemistry Calculates parameters of the seawater carbonate system and assists the design of ocean acidification perturbation experiments.
824 Analysis of Ecological and Environmental Data seas Seasonal analysis and graphics, especially for climatology Capable of deriving seasonal statistics, such as “normals”, and analysis of seasonal data, such as departures. This package also has graphics capabilities for representing seasonal data, including boxplots for seasonal parameters, and bars for summed normals. There are many specific functions related to climatology, including precipitation normals, temperature normals, cumulative precipitation departures and precipitation interarrivals. However, this package is designed to represent any time-varying parameter with a discernible seasonal signal, such as found in hydrology and ecology.
825 Analysis of Ecological and Environmental Data secr Spatially Explicit Capture-Recapture Functions to estimate the density and size of a spatially distributed animal population sampled with an array of passive detectors, such as traps, or by searching polygons or transects. Models incorporating distance-dependent detection are fitted by maximizing the likelihood. Tools are included for data manipulation and model selection.
826 Analysis of Ecological and Environmental Data segmented Regression Models with Break-Points / Change-Points Estimation Given a regression model, segmented ‘updates’ the model by adding one or more segmented (i.e., piece-wise linear) relationships. Several variables with multiple breakpoints are allowed.
827 Analysis of Ecological and Environmental Data sensitivity Global Sensitivity Analysis of Model Outputs A collection of functions for factor screening, global sensitivity analysis and reliability sensitivity analysis. Most of the functions have to be applied to models with scalar output, but several functions support multi-dimensional outputs.
828 Analysis of Ecological and Environmental Data simba A Collection of functions for similarity analysis of vegetation data Besides functions for the calculation of similarity and multiple-plot similarity measures with binary data (for instance presence/absence species data), the package contains simple wrapper functions for reshaping species lists into matrices and vice versa, functions for further processing of similarity data (Mantel-like permutation procedures), and other useful tools for vegetation analysis.
829 Analysis of Ecological and Environmental Data simecol Simulation of Ecological (and Other) Dynamic Systems An object oriented framework to simulate ecological (and other) dynamic systems. It can be used for differential equations, individual-based (or agent-based) and other models as well. The package helps to organize scenarios (to avoid copy and paste) and aims to improve readability and usability of code.
830 Analysis of Ecological and Environmental Data siplab Spatial Individual-Plant Modelling A platform for experimenting with spatially explicit individual-based vegetation models.
831 Analysis of Ecological and Environmental Data soiltexture Functions for Soil Texture Plot, Classification and Transformation “The Soil Texture Wizard” is a set of R functions designed to produce texture triangles (also called texture plots, texture diagrams, texture ternary plots), and to classify and transform soil texture data. These functions allow plotting of virtually any soil texture triangle (classification) in any triangle geometry (isosceles, right-angled triangles, etc.). This set of functions is expected to be useful to people using soil texture data from different soil texture classifications or different particle size systems. Many (> 15) texture triangles from all around the world are predefined in the package. A simple text-based graphical user interface is provided: soiltexture_gui().
832 Analysis of Ecological and Environmental Data SPACECAP A Program to Estimate Animal Abundance and Density using Bayesian Spatially-Explicit Capture-Recapture Models SPACECAP is a user-friendly software package for estimating animal densities using closed-model capture-recapture sampling based on photographic captures, using Bayesian spatially-explicit capture-recapture models. This approach offers advantages such as substantially reducing the problems posed by individual heterogeneity in capture probabilities in conventional capture-recapture analyses. It also offers non-asymptotic inferences, which are more appropriate for the small samples of capture data typical of photo-capture studies.
833 Analysis of Ecological and Environmental Data SpatialExtremes Modelling Spatial Extremes Tools for the statistical modelling of spatial extremes using max-stable processes, copula or Bayesian hierarchical models. More precisely, this package allows (conditional) simulations from various parametric max-stable models, analysis of the extremal spatial dependence, the fitting of such processes using composite likelihoods or least squares (simple max-stable processes only), model checking and selection, and prediction. Other approaches (although not completely in agreement with extreme value theory) are available, such as the use of (spatial) copulas and Bayesian hierarchical models assuming the so-called conditional independence assumption. The latter approach is handled through an (efficient) Gibbs sampler. Some key references: Davison et al. (2012) doi:10.1214/11-STS376, Padoan et al. (2010) doi:10.1198/jasa.2009.tm08577, Dombry et al. (2013) doi:10.1093/biomet/ass067.
834 Analysis of Ecological and Environmental Data StreamMetabolism Calculate Single Station Metabolism from Diurnal Oxygen Curves Provides functions to calculate Gross Primary Productivity, Net Ecosystem Production, and Ecosystem Respiration from single-station diurnal oxygen curves.
835 Analysis of Ecological and Environmental Data strucchange Testing, Monitoring, and Dating Structural Changes Testing, monitoring and dating structural changes in (linear) regression models. strucchange features tests/methods from the generalized fluctuation test framework as well as from the F test (Chow test) framework. This includes methods to fit, plot and test fluctuation processes (e.g., CUSUM, MOSUM, recursive/moving estimates) and F statistics, respectively. It is possible to monitor incoming data online using fluctuation processes. Finally, the breakpoints in regression models with structural changes can be estimated together with confidence intervals. Emphasis is always given to methods for visualizing the data.
836 Analysis of Ecological and Environmental Data surveillance Temporal and Spatio-Temporal Modeling and Monitoring of Epidemic Phenomena Statistical methods for the modeling and monitoring of time series of counts, proportions and categorical data, as well as for the modeling of continuous-time point processes of epidemic phenomena. The monitoring methods focus on aberration detection in count data time series from public health surveillance of communicable diseases, but applications could just as well originate from environmetrics, reliability engineering, econometrics, or social sciences. The package implements many typical outbreak detection procedures such as the (improved) Farrington algorithm, or the negative binomial GLR-CUSUM method of Hohle and Paul (2008) doi:10.1016/j.csda.2008.02.015. A novel CUSUM approach combining logistic and multinomial logistic modeling is also included. The package contains several real-world data sets, the ability to simulate outbreak data, and to visualize the results of the monitoring in a temporal, spatial or spatio-temporal fashion. A recent overview of the available monitoring procedures is given by Salmon et al. (2016) doi:10.18637/jss.v070.i10. For the retrospective analysis of epidemic spread, the package provides three endemic-epidemic modeling frameworks with tools for visualization, likelihood inference, and simulation. ‘hhh4’ estimates models for (multivariate) count time series following Paul and Held (2011) doi:10.1002/sim.4177 and Meyer and Held (2014) doi:10.1214/14-AOAS743. ‘twinSIR’ models the susceptible-infectious-recovered (SIR) event history of a fixed population, e.g., epidemics across farms or networks, as a multivariate point process as proposed by Hohle (2009) doi:10.1002/bimj.200900050. ‘twinstim’ estimates self-exciting point process models for a spatio-temporal point pattern of infective events, e.g., time-stamped geo-referenced surveillance data, as proposed by Meyer et al. (2012) doi:10.1111/j.1541-0420.2011.01684.x. A recent overview of the implemented space-time modeling frameworks for epidemic phenomena is given by Meyer et al. (2017) doi:10.18637/jss.v077.i11.
837 Analysis of Ecological and Environmental Data tiger TIme series of Grouped ERrors Temporally resolved groups of typical differences (errors) between two time series are determined and visualized.
838 Analysis of Ecological and Environmental Data topmodel Implementation of the hydrological model TOPMODEL in R Set of hydrological functions including an R implementation of the hydrological model TOPMODEL, which is based on the 1995 FORTRAN version by Keith Beven. From version 0.7.0, the package has been in maintenance mode. New functions for hydrological analysis are now developed as part of the RHydro package. RHydro can be found on R-Forge and is built on a set of dedicated S4 classes.
839 Analysis of Ecological and Environmental Data tseries Time Series Analysis and Computational Finance Time series analysis and computational finance.
840 Analysis of Ecological and Environmental Data unmarked Models for Data from Unmarked Animals Fits hierarchical models of animal abundance and occurrence to data collected using survey methods such as point counts, site occupancy sampling, distance sampling, removal sampling, and double observer sampling. Parameters governing the state and observation processes can be modeled as functions of covariates.
841 Analysis of Ecological and Environmental Data untb ecological drift under the UNTB A collection of utilities for biodiversity data. Includes the simulation of ecological drift under Hubbell’s Unified Neutral Theory of Biodiversity, and the calculation of various diagnostics such as Preston curves. Now includes functionality provided by Francois Munoz and Andrea Manica.
842 Analysis of Ecological and Environmental Data vegan (core) Community Ecology Package Ordination methods, diversity analysis and other functions for community and vegetation ecologists.
843 Analysis of Ecological and Environmental Data vegetarian Jost Diversity Measures for Community Data This package computes diversity for community data sets using the methods outlined by Jost (2006, 2007). While there are differing opinions on the ideal way to calculate diversity (e.g. Magurran 2004), this method offers the advantage of providing diversity numbers equivalents, independent alpha and beta diversities, and the ability to incorporate ‘order’ (q) as a continuous measure of the importance of rare species in the metrics. The functions provided in this package largely correspond with the equations offered by Jost in the cited papers. The package computes alpha diversities, beta diversities, gamma diversities, and similarity indices. Confidence intervals for diversity measures are calculated using a bootstrap method described by Chao et al. (2008). For datasets with many samples (sites, plots), sim.table creates tables of all pairwise comparisons possible, and for grouped samples sim.groups calculates pairwise combinations of within- and between-group comparisons.
844 Analysis of Ecological and Environmental Data VGAM Vector Generalized Linear and Additive Models An implementation of about 6 major classes of statistical regression models. At the heart of it are the vector generalized linear and additive model (VGLM/VGAM) classes, and the book “Vector Generalized Linear and Additive Models: With an Implementation in R” (Yee, 2015) doi:10.1007/978-1-4939-2818-7 gives details of the statistical framework and the VGAM package. Currently only fixed-effects models are implemented, i.e., no random-effects models. Many (150+) models and distributions are estimated by maximum likelihood estimation (MLE) or penalized MLE, using Fisher scoring. VGLMs can be loosely thought of as multivariate GLMs. VGAMs are data-driven VGLMs (i.e., with smoothing). The other classes are RR-VGLMs (reduced-rank VGLMs), quadratic RR-VGLMs, reduced-rank VGAMs, and RCIMs (row-column interaction models); these classes fit constrained and unconstrained quadratic ordination (CQO/UQO) models in ecology, as well as constrained additive ordination (CAO). Note that these functions are subject to change; see the NEWS and ChangeLog files for the latest changes.
845 Analysis of Ecological and Environmental Data wasim Visualisation and analysis of output files of the hydrological model WASIM Helpful tools for data processing and visualisation of results of the hydrological model WASIM-ETH.
846 Analysis of Ecological and Environmental Data zoo S3 Infrastructure for Regular and Irregular Time Series (Z’s Ordered Observations) An S3 class with methods for totally ordered indexed observations. It is particularly aimed at irregular time series of numeric vectors/matrices and factors. zoo’s key design goals are independence of a particular index/date/time class and consistency with ts and base R by providing methods to extend standard generics.
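To illustrate zoo's stated design goal of index-class independence for irregular series, here is a minimal sketch (the dates and values are invented) that regularises an irregular series to a daily grid and interpolates the gaps:

```r
library(zoo)

dates <- as.Date("2020-01-01") + c(0, 1, 3, 7)     # irregular spacing
z <- zoo(c(1.2, 1.5, 1.1, 1.8), order.by = dates)  # totally ordered observations

# regularise to a daily grid and linearly interpolate the missing days
grid <- zoo(, seq(start(z), end(z), by = "day"))   # empty zoo carrying only the index
zd <- na.approx(merge(z, grid))
```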
847 Design of Experiments (DoE) & Analysis of Experimental Data acebayes Optimal Bayesian Experimental Design using the ACE Algorithm Optimal Bayesian experimental design using the approximate coordinate exchange (ACE) algorithm.
848 Design of Experiments (DoE) & Analysis of Experimental Data agricolae (core) Statistical Procedures for Agricultural Research The original idea was presented in the thesis “A statistical analysis tool for agricultural research”, submitted for the degree of Master of Science at the National Engineering University (UNI), Lima, Peru. Some experimental data for the examples come from the CIP and other research. Agricolae offers extensive functionality for experimental design, especially for agricultural and plant breeding experiments, which can also be useful for other purposes. It supports planning of lattice, Alpha, Cyclic, Complete Block, Latin Square, Graeco-Latin Square, augmented block, factorial, split-plot and strip-plot designs. There are also various analysis facilities for experimental data, e.g. treatment comparison procedures, several non-parametric comparison tests, biodiversity indices and consensus clustering.
849 Design of Experiments (DoE) & Analysis of Experimental Data agridat Agricultural Datasets Datasets from books, papers, and websites related to agriculture. Example graphics and analyses are included. Data come from small-plot trials, multi-environment trials, uniformity trials, yield monitors, and more.
850 Design of Experiments (DoE) & Analysis of Experimental Data AlgDesign (core) Algorithmic Experimental Design Algorithmic experimental designs. Calculates exact and approximate theory experimental designs for D, A, and I criteria. Very large designs may be created. Experimental designs may be blocked, or blocked designs created from a candidate list, using several criteria. The blocking can be done when whole and within plot factors interact.
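A minimal sketch of an exact D-optimal design with AlgDesign's optFederov (the two-factor candidate grid, model and run count below are invented for illustration):

```r
library(AlgDesign)

cand <- expand.grid(x1 = c(-1, 0, 1), x2 = c(-1, 0, 1))  # 9 candidate points
des <- optFederov(~ quad(.), data = cand, nTrials = 7,   # full quadratic model, 7 runs
                  criterion = "D")
des$design                                               # the selected runs
```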
851 Design of Experiments (DoE) & Analysis of Experimental Data ALTopt Optimal Experimental Designs for Accelerated Life Testing Creates optimal (D, U and I) designs for accelerated life testing with right censoring or interval censoring. It uses a generalized linear model (GLM) approach to derive the asymptotic variance-covariance matrix of the regression coefficients. The failure time distribution is assumed to follow a Weibull distribution with a known shape parameter, and log-linear link functions are used to model the relationship between failure time parameters and stress variables. The acceleration model may have multiple stress factors, although most ALTs involve only two or fewer stress factors. The ALTopt package also provides several plotting functions, including contour plots, Fraction of Use Space (FUS) plots and Variance Dispersion graphs of Use Space (VDUS) plots.
852 Design of Experiments (DoE) & Analysis of Experimental Data asd Simulations for Adaptive Seamless Designs Runs simulations for adaptive seamless designs, with and without early outcomes, for treatment selection and subpopulation type designs.
853 Design of Experiments (DoE) & Analysis of Experimental Data BatchExperiments Statistical Experiments on Batch Computing Clusters Extends the BatchJobs package to run statistical experiments on batch computing clusters. For further details see the project web page.
854 Design of Experiments (DoE) & Analysis of Experimental Data BayesMAMS Designing Bayesian Multi-Arm Multi-Stage Studies Calculating Bayesian sample sizes for multi-arm trials where several experimental treatments are compared to a common control, perhaps even at multiple stages.
855 Design of Experiments (DoE) & Analysis of Experimental Data bcrm Bayesian Continual Reassessment Method for Phase I Dose-Escalation Trials Implements a wide variety of one and two-parameter Bayesian CRM designs. The program can run interactively, allowing the user to enter outcomes after each cohort has been recruited, or via simulation to assess operating characteristics.
856 Design of Experiments (DoE) & Analysis of Experimental Data BHH2 Useful Functions for Box, Hunter and Hunter II Functions and data sets reproducing some examples in Box, Hunter and Hunter II. Useful for statistical design of experiments, especially factorial experiments.
857 Design of Experiments (DoE) & Analysis of Experimental Data binseqtest Exact Binary Sequential Designs and Analysis For a series of binary responses, create stopping boundary with exact results after stopping, allowing updating for missing assessments.
858 Design of Experiments (DoE) & Analysis of Experimental Data bioOED Sensitivity Analysis and Optimum Experiment Design for Microbial Inactivation Extends the bioinactivation package with functions for Sensitivity Analysis and Optimum Experiment Design.
859 Design of Experiments (DoE) & Analysis of Experimental Data blocksdesign Nested and Crossed Block Designs for Factorial, Fractional Factorial and Unstructured Treatment Sets The ‘blocksdesign’ package constructs nested block and D-optimal factorial designs for any unstructured or factorial treatment model of any size. The nested block designs can have repeated nesting down to any required depth of nesting, with either a simple set of nested blocks or a crossed row-and-column blocks design at each level of nesting. The block design at each level of nesting is optimized for D-efficiency within the blocks of each preceding set of blocks. The block sizes in any particular block classification are always as nearly equal as possible and never differ by more than a single plot. Outputs include a table showing the allocation of treatments to blocks, a plan layout showing the allocation of treatments within blocks (unstructured treatment designs only), the achieved D- and A-efficiency factors for the block and treatment design (factorial treatment designs only) and, where feasible, an A-efficiency upper bound for the block design (unstructured treatment designs only).
860 Design of Experiments (DoE) & Analysis of Experimental Data blockTools Block, Assign, and Diagnose Potential Interference in Randomized Experiments Blocks units into experimental blocks, with one unit per treatment condition, by creating a measure of multivariate distance between all possible pairs of units. Maximum, minimum, or an allowable range of differences between units on one variable can be set. Randomly assign units to treatment conditions. Diagnose potential interference between units assigned to different treatment conditions. Write outputs to .tex and .csv files.
861 Design of Experiments (DoE) & Analysis of Experimental Data BOIN Bayesian Optimal INterval (BOIN) Design for Single-Agent and Drug- Combination Phase I Clinical Trials The Bayesian optimal interval (BOIN) design is a novel phase I clinical trial design for finding the maximum tolerated dose (MTD). It can be used to design both single-agent and drug-combination trials. The BOIN design is motivated by the top priority and concern of clinicians when testing a new drug, which is to effectively treat patients and minimize the chance of exposing them to subtherapeutic or overly toxic doses. The prominent advantage of the BOIN design is that it achieves simplicity and superior performance at the same time. The BOIN design is algorithm-based and can be implemented in a simple way similar to the traditional 3+3 design. The BOIN design yields an average performance that is comparable to that of the continual reassessment method (CRM, one of the best model-based designs) in terms of selecting the MTD, but has a substantially lower risk of assigning patients to subtherapeutic or overly toxic doses.
862 Design of Experiments (DoE) & Analysis of Experimental Data BsMD Bayes Screening and Model Discrimination Bayes screening and model discrimination follow-up designs.
863 Design of Experiments (DoE) & Analysis of Experimental Data choiceDes Design Functions for Choice Studies Functions to design discrete choice models (DCMs) and other types of choice studies (including MaxDiff and other trade-offs).
864 Design of Experiments (DoE) & Analysis of Experimental Data CombinS Construction Methods of some Series of PBIB Designs Constructs series of partially balanced incomplete block designs (PBIB) based on the combinatory method (S) introduced by Rezgui et al. (2014) doi:10.3844/jmssp.2014.45.48, and gives their associated U-type designs.
865 Design of Experiments (DoE) & Analysis of Experimental Data conf.design (core) Construction of factorial designs This small library contains a series of simple tools for constructing and manipulating confounded and fractional factorial designs.
866 Design of Experiments (DoE) & Analysis of Experimental Data crmPack Object-Oriented Implementation of CRM Designs Implements a wide range of model-based dose escalation designs, ranging from classical and modern continual reassessment methods (CRMs) based on dose-limiting toxicity endpoints to dual-endpoint designs taking into account a biomarker/efficacy outcome. The focus is on Bayesian inference, making it very easy to setup a new design with its own JAGS code. However, it is also possible to implement 3+3 designs for comparison or models with non-Bayesian estimation. The whole package is written in a modular form in the S4 class system, making it very flexible for adaptation to new models, escalation or stopping rules.
867 Design of Experiments (DoE) & Analysis of Experimental Data crossdes (core) Construction of Crossover Designs Contains functions for the construction of carryover balanced crossover designs. In addition contains functions to check given designs for balance.
868 Design of Experiments (DoE) & Analysis of Experimental Data Crossover Analysis and Search of Crossover Designs Package Crossover provides different crossover designs from combinatorial or search algorithms as well as from literature and a GUI to access them.
869 Design of Experiments (DoE) & Analysis of Experimental Data dae Functions Useful in the Design and ANOVA of Experiments The content falls into the following groupings: (i) Data, (ii) Factor manipulation functions, (iii) Design functions, (iv) ANOVA functions, (v) Matrix functions, (vi) Projector and canonical efficiency functions, and (vii) Miscellaneous functions. There is a vignette describing how to use the Design functions for randomizing and assessing designs available in the file ‘daeDesignNotes.pdf’. The ANOVA functions facilitate the extraction of information when the ‘Error’ function has been used in the call to ‘aov’.
870 Design of Experiments (DoE) & Analysis of Experimental Data daewr Design and Analysis of Experiments with R Contains Data frames and functions used in the book “Design and Analysis of Experiments with R”.
871 Design of Experiments (DoE) & Analysis of Experimental Data designGG Computational tool for designing genetical genomics experiments The package provides R scripts for designing genetical genomics experiments.
872 Design of Experiments (DoE) & Analysis of Experimental Data designGLMM Finding Optimal Block Designs for a Generalised Linear Mixed Model Use simulated annealing to find optimal designs for Poisson regression models with blocks.
873 Design of Experiments (DoE) & Analysis of Experimental Data designmatch Matched Samples that are Balanced and Representative by Design Includes functions for the construction of matched samples that are balanced and representative by design. Among others, these functions can be used for matching in observational studies with treated and control units, with cases and controls, in related settings with instrumental variables, and in discontinuity designs. Also, they can be used for the design of randomized experiments, for example, for matching before randomization. By default, ‘designmatch’ uses the ‘GLPK’ optimization solver, but its performance is greatly enhanced by the ‘Gurobi’ optimization solver and its associated R interface. For their installation, please follow the instructions at http://user.gurobi.com/download/gurobi-optimizer and http://www.gurobi.com/documentation/7.0/refman/r_api_overview.html. We have also included directions in the gurobi_installation file in the inst folder.
874 Design of Experiments (DoE) & Analysis of Experimental Data desirability Function Optimization and Ranking via Desirability Functions S3 classes for multivariate optimization using the desirability function by Derringer and Suich (1980).
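A hedged sketch of combining two responses with the desirability package's S3 classes (the limits and the candidate point below are invented; newdata columns are matched to the individual desirability objects by position):

```r
library(desirability)

dYield <- dMax(low = 70, high = 95)  # larger-is-better yield
dCost  <- dMin(low = 10, high = 40)  # smaller-is-better cost
overall <- dOverall(dYield, dCost)   # geometric mean of the two desirabilities

# overall desirability of one candidate setting (yield = 85, cost = 20)
D <- predict(overall, data.frame(y = 85, c = 20))
```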
875 Design of Experiments (DoE) & Analysis of Experimental Data desplot Plotting Field Plans for Agricultural Experiments A function for plotting maps of agricultural field experiments that are laid out in grids.
876 Design of Experiments (DoE) & Analysis of Experimental Data dfcomb Phase I/II Adaptive Dose-Finding Design for Combination Studies Phase I/II adaptive dose-finding design for combination studies. Several methods are proposed depending on the type of combinations: (1) the combination of two cytotoxic agents, and (2) combination of a molecularly targeted agent with a cytotoxic agent.
877 Design of Experiments (DoE) & Analysis of Experimental Data dfcrm Dose-finding by the continual reassessment method This package provides functions to run the CRM and TITE-CRM in phase I trials and calibration tools for trial planning purposes.
878 Design of Experiments (DoE) & Analysis of Experimental Data dfmta Phase I/II Adaptive Dose-Finding Design for MTA Phase I/II adaptive dose-finding design for single-agent Molecularly Targeted Agent (MTA), according to the paper “Phase I/II Dose-Finding Design for Molecularly Targeted Agent: Plateau Determination using Adaptive Randomization”.
879 Design of Experiments (DoE) & Analysis of Experimental Data dfpk Bayesian Dose-Finding Designs using Pharmacokinetics (PK) for Phase I Clinical Trials Provides statistical methods involving PK measures for the dose-allocation process during Phase I clinical trials. These methods incorporate pharmacokinetics (PK) into dose-finding designs in several ways, including covariate models, dependent-variable models and hierarchical models. The package provides functions to generate data under several scenarios and to run simulations whose objective is to determine the maximum tolerated dose (MTD).
880 Design of Experiments (DoE) & Analysis of Experimental Data DiceDesign Designs of Computer Experiments Space-Filling Designs and Uniformity Criteria.
881 Design of Experiments (DoE) & Analysis of Experimental Data DiceEval Construction and Evaluation of Metamodels Estimation, validation and prediction of models of different types: linear models, additive models, MARS, PolyMARS and Kriging.
882 Design of Experiments (DoE) & Analysis of Experimental Data DiceKriging Kriging Methods for Computer Experiments Estimation, validation and prediction of kriging models. Important functions: km, print.km, plot.km, predict.km.
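A minimal sketch of the km/predict.km workflow named above, on a toy two-input test function invented here (the covariance kernel shown is DiceKriging's default):

```r
library(DiceKriging)

X <- expand.grid(x1 = seq(0, 1, length.out = 4),
                 x2 = seq(0, 1, length.out = 4))           # 16 design points
y <- apply(X, 1, function(x) sin(2 * pi * x[1]) + x[2]^2)  # toy response

fit <- km(design = X, response = y, covtype = "matern5_2")
p <- predict(fit, newdata = data.frame(x1 = 0.5, x2 = 0.5), type = "UK")
p$mean  # kriging mean at the new point
```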
883 Design of Experiments (DoE) & Analysis of Experimental Data DiceView Plot methods for computer experiments design and surrogate View 2D/3D sections or contours of computer experiments designs, surrogates or test functions.
884 Design of Experiments (DoE) & Analysis of Experimental Data docopulae Optimal Designs for Copula Models A direct approach to optimal designs for copula models based on the Fisher information. Provides flexible functions for building joint PDFs, evaluating the Fisher information and finding optimal designs. It includes an extensible solution to summation and integration called ‘nint’, functions for transforming, plotting and comparing designs, as well as a set of tools for common low-level tasks.
885 Design of Experiments (DoE) & Analysis of Experimental Data DoE.base (core) Full Factorials, Orthogonal Arrays and Base Utilities for DoE Packages Package DoE.base creates full factorial experimental designs and designs based on orthogonal arrays for (industrial) experiments. Additionally, it provides utility functions for the class design, which is also used by other packages for designed experiments.
886 Design of Experiments (DoE) & Analysis of Experimental Data DoE.MIParray Creation of Arrays by Mixed Integer Programming ‘CRAN’ package ‘DoE.base’ and non-‘CRAN’ packages ‘gurobi’ and ‘Rmosek’ (newer version than that on ‘CRAN’) are enhanced with functionality for the creation of optimized arrays for experimentation, where optimization is in terms of generalized minimum aberration. It is also possible to optimally extend existing arrays to larger run size. Optimization requires the availability of at least one of the commercial products ‘Gurobi’ or ‘Mosek’ (free academic licenses available for both). For installing ‘Gurobi’ and its R package ‘gurobi’, follow instructions at http://www.gurobi.com/downloads/gurobi-optimizer and http://www.gurobi.com/documentation/7.5/refman/r_api_overview.html. For installing ‘Mosek’ and its R package ‘Rmosek’, follow instructions at https://www.mosek.com/downloads/ and http://docs.mosek.com/8.1/rmosek/install-interface.html.
887 Design of Experiments (DoE) & Analysis of Experimental Data DoE.wrapper (core) Wrapper Package for Design of Experiments Functionality Various kinds of designs for (industrial) experiments can be created. The package uses, and sometimes enhances, design generation routines from other packages. So far, response surface designs from package rsm, Latin hypercube samples from packages lhs and DiceDesign, and D-optimal designs from package AlgDesign have been implemented.
888 Design of Experiments (DoE) & Analysis of Experimental Data DoseFinding Planning and Analyzing Dose Finding Experiments The DoseFinding package provides functions for the design and analysis of dose-finding experiments (with focus on pharmaceutical Phase II clinical trials). It provides functions for: multiple contrast tests, fitting non-linear dose-response models (using Bayesian and non-Bayesian estimation), calculating optimal designs and an implementation of the MCPMod methodology.
889 Design of Experiments (DoE) & Analysis of Experimental Data dynaTree Dynamic Trees for Learning and Design Inference by sequential Monte Carlo for dynamic tree regression and classification models with hooks provided for sequential design and optimization, fully online learning with drift, variable selection, and sensitivity analysis of inputs. Illustrative examples from the original dynamic trees paper are facilitated by demos in the package; see demo(package=“dynaTree”).
890 Design of Experiments (DoE) & Analysis of Experimental Data easypower Sample Size Estimation for Experimental Designs Power analysis is used in the estimation of sample sizes for experimental designs. Most programs and R packages will only output the highest recommended sample size to the user. Often the user input can be complicated and computing multiple power analyses for different treatment comparisons can be time consuming. This package simplifies the user input and allows the user to view all of the sample size recommendations or just the ones they want to see. The calculations used to calculate the recommended sample sizes are from the ‘pwr’ package.
891 Design of Experiments (DoE) & Analysis of Experimental Data edesign Maximum Entropy Sampling An implementation of maximum entropy sampling for spatial data is provided. An exact branch-and-bound algorithm as well as greedy and dual greedy heuristics are included.
892 Design of Experiments (DoE) & Analysis of Experimental Data EngrExpt Data sets from “Introductory Statistics for Engineering Experimentation” Datasets from Nelson, Coffin and Copeland “Introductory Statistics for Engineering Experimentation” (Elsevier, 2003) with sample code.
893 Design of Experiments (DoE) & Analysis of Experimental Data experiment experiment: R package for designing and analyzing randomized experiments The package provides various statistical methods for designing and analyzing randomized experiments. One main functionality of the package is the implementation of randomized-block and matched-pair designs based on possibly multivariate pre-treatment covariates. The package also provides the tools to analyze various randomized experiments including cluster randomized experiments, randomized experiments with noncompliance, and randomized experiments with missing data.
894 Design of Experiments (DoE) & Analysis of Experimental Data ez Easy Analysis and Visualization of Factorial Experiments Facilitates easy analysis of factorial experiments, including purely within-Ss designs (a.k.a. “repeated measures”), purely between-Ss designs, and mixed within-and-between-Ss designs. The functions in this package aim to provide simple, intuitive and consistent specification of data analysis and visualization. Visualization functions also include design visualization for pre-analysis data auditing, and correlation matrix visualization. Finally, this package includes functions for non-parametric analysis, including permutation tests and bootstrap resampling. The bootstrap function obtains predictions either by cell means or by more advanced/powerful mixed effects models, yielding predictions and confidence intervals that may be easily visualized at any level of the experiment’s design.
895 Design of Experiments (DoE) & Analysis of Experimental Data FMC Factorial Experiments with Minimum Level Changes Generate cost effective minimally changed run sequences for symmetrical as well as asymmetrical factorial designs.
896 Design of Experiments (DoE) & Analysis of Experimental Data FrF2 (core) Fractional Factorial Designs with 2-Level Factors Regular and non-regular Fractional Factorial 2-level designs can be created. Furthermore, analysis tools for Fractional Factorial designs with 2-level factors are offered (main effects and interaction plots for all factors simultaneously, cube plot for looking at the simultaneous effects of three factors, full or half normal plot, alias structure in a more readable format than with the built-in function alias).
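For instance, a regular half fraction of a 2^4 experiment and the alias information mentioned above can be generated as follows (the run size and factor count are illustrative):

```r
library(FrF2)

plan <- FrF2(nruns = 8, nfactors = 4)  # regular 2^(4-1) design, resolution IV
summary(plan)                          # design info, including the alias structure
```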
897 Design of Experiments (DoE) & Analysis of Experimental Data FrF2.catlg128 Catalogues of resolution IV 128 run 2-level fractional factorials up to 33 factors that do have 5-letter words This package provides catalogues of resolution IV regular fractional factorial designs in 128 runs for up to 33 2-level factors. The catalogues are complete, excluding resolution IV designs without 5-letter words, because these do not add value for a search for clear designs. The previous package version 1.0 with complete catalogues up to 24 runs (24 runs and a namespace added later) can be downloaded from the author's website.
898 Design of Experiments (DoE) & Analysis of Experimental Data GAD GAD: Analysis of variance from general principles This package analyses complex ANOVA models with any combination of orthogonal/nested and fixed/random factors, as described by Underwood (1997). There are two restrictions: (i) data must be balanced; (ii) fixed nested factors are not allowed. Homogeneity of variances is checked using Cochran’s C test and ‘a posteriori’ comparisons of means are done using Student-Newman-Keuls (SNK) procedure.
899 Design of Experiments (DoE) & Analysis of Experimental Data geospt Geostatistical Analysis and Design of Optimal Spatial Sampling Networks Estimation of the variogram through trimmed mean, radial basis functions (optimization, prediction and cross-validation), summary statistics from cross-validation, pocket plot, and design of optimal sampling networks through sequential and simultaneous points methods.
900 Design of Experiments (DoE) & Analysis of Experimental Data granova Graphical Analysis of Variance This small collection of functions provides what we call elemental graphics for display of anova results. The term elemental derives from the fact that each function is aimed at construction of graphical displays that afford direct visualizations of data with respect to the fundamental questions that drive the particular anova methods. The two main functions are granova.1w (a graphic for one way anova) and granova.2w (a corresponding graphic for two way anova). These functions were written to display data for any number of groups, regardless of their sizes (however, very large data sets or numbers of groups can be problematic). For these two functions a specialized approach is used to construct data-based contrast vectors for which anova data are displayed. The result is that the graphics use straight lines, and when appropriate flat surfaces, to facilitate clear interpretations while being faithful to the standard effect tests in anova. The graphic results are complementary to standard summary tables for these two basic kinds of analysis of variance; numerical summary results of analyses are also provided as side effects. Two additional functions are granova.ds (for comparing two dependent samples), and granova.contr (which provides graphic displays for a priori contrasts). All functions provide relevant numerical results to supplement the graphic displays of anova data. The graphics based on these functions should be especially helpful for learning how the methods have been applied to answer the question(s) posed. This means they can be particularly helpful for students and non-statistician analysts. But these methods should be quite generally helpful for work-a-day applications of all kinds, as they can help to identify outliers, clusters or patterns, as well as highlight the role of non-linear transformations of data. 
In the case of granova.1w and granova.ds especially, several arguments are provided to facilitate flexibility in the construction of graphics that accommodate diverse features of data, according to their corresponding display requirements. See the help files for individual functions.
901 Design of Experiments (DoE) & Analysis of Experimental Data GroupSeq A GUI-Based Program to Compute Probabilities Regarding Group Sequential Designs A graphical user interface to compute group sequential designs based on normally distributed test statistics, particularly critical boundaries, power, drift, and confidence intervals of such designs. All computations are based on the alpha spending approach by Lan-DeMets with various alpha spending functions being available to choose among.
902 Design of Experiments (DoE) & Analysis of Experimental Data gsbDesign Group Sequential Bayes Design Group Sequential Operating Characteristics for Clinical, Bayesian two-arm Trials with known Sigma and Normal Endpoints.
903 Design of Experiments (DoE) & Analysis of Experimental Data gsDesign Group Sequential Design Derives group sequential designs and describes their properties.
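A hedged sketch of deriving a group sequential design with gsDesign (the number of looks, error rates and spending-function choice below are illustrative):

```r
library(gsDesign)

d <- gsDesign(k = 3, test.type = 2,  # 3 analyses, two-sided symmetric bounds
              alpha = 0.025, beta = 0.1,
              sfu = sfOF)            # O'Brien-Fleming-type alpha spending
d$upper$bound                        # efficacy z-boundaries at each look
```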
904 Design of Experiments (DoE) & Analysis of Experimental Data gset Group Sequential Design in Equivalence Studies Calculates equivalence and futility boundaries based on the exact bivariate t test statistics for group sequential designs in studies with equivalence hypotheses.
905 Design of Experiments (DoE) & Analysis of Experimental Data hiPOD hierarchical Pooled Optimal Design Based on hierarchical modeling, this package provides a few practical functions to find and present the optimal designs for a pooled NGS design.
906 Design of Experiments (DoE) & Analysis of Experimental Data ibd Incomplete Block Designs This package contains several utility functions related to incomplete block designs. It includes a function to generate efficient incomplete block designs with given numbers of treatments, blocks and block size, and a function to generate an incomplete block design with a specified concurrence matrix. There are functions to generate balanced treatment incomplete block designs and incomplete block designs for test versus control treatment comparisons with a specified concurrence matrix. The package also allows performing analysis of variance of data and computing least-square means of factors from experiments using a connected incomplete block design. Tests of hypotheses of treatment contrasts in an incomplete block design setup are supported.
907 Design of Experiments (DoE) & Analysis of Experimental Data ICAOD Imperialist Competitive Algorithm for Optimal Designs Finds locally D-optimal, minimax D-optimal, standardized maximin D-optimal, optim-on-the-average and multiple-objective optimal designs for nonlinear models. Different Fisher information matrices can also be set by the user. There are also useful functions for verifying the optimality of designs with respect to different criteria via the equivalence theorem. ICA is a meta-heuristic evolutionary algorithm inspired by the socio-political process of humans. See Masoudi et al. (2016) doi:10.1016/j.csda.2016.06.014.
908 Design of Experiments (DoE) & Analysis of Experimental Data idefix Efficient Designs for Discrete Choice Experiments Generates efficient designs for discrete choice experiments based on the multinomial logit model, and individually adapted designs for the mixed multinomial logit model. Crabbe M, Akinc D and Vandebroek M (2014) doi:10.1016/j.trb.2013.11.008.
909 Design of Experiments (DoE) & Analysis of Experimental Data JMdesign Joint Modeling of Longitudinal and Survival Data - Power Calculation Performs power calculations for joint modeling of longitudinal and survival data with k-th order trajectories when the variance-covariance matrix, Sigma_theta, is unknown.
910 Design of Experiments (DoE) & Analysis of Experimental Data LDOD Finding Locally D-optimal Designs for Some Nonlinear and Generalized Linear Models Provides functions for finding locally D-optimal designs for Logistic, Negative Binomial, Poisson, Michaelis-Menten, Exponential, Log-Linear, Emax, Richards, Weibull and Inverse Quadratic regression models, as well as functions for automatically constructing the Fisher information matrix and Frechet derivative from some input variables, without user interference.
911 Design of Experiments (DoE) & Analysis of Experimental Data lhs Latin Hypercube Samples Provides a number of methods for creating and augmenting Latin Hypercube Samples.
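For readers unfamiliar with the technique, the stratification idea behind Latin hypercube samples such as those produced by 'lhs' can be sketched in a few lines. This is a conceptual illustration in Python/NumPy, not the R package's API:

```python
import numpy as np

def latin_hypercube(n, d, rng=None):
    """Draw an n-point Latin hypercube sample in [0, 1)^d: each of the
    d columns puts exactly one point in each of the n equal-width bins."""
    rng = np.random.default_rng(rng)
    u = rng.random((n, d))                              # jitter inside bins
    perms = np.stack([rng.permutation(n) for _ in range(d)], axis=1)
    return (perms + u) / n                              # bin index + jitter

# A 5-point sample in two dimensions; every marginal stratum is hit once.
sample = latin_hypercube(5, 2, rng=42)
```

Shuffling the bin indices independently per dimension is what distinguishes this from a simple grid: the marginal stratification is preserved while the pairing of coordinates is randomised.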
912 Design of Experiments (DoE) & Analysis of Experimental Data MAMS Designing Multi-Arm Multi-Stage Studies Designing multi-arm multi-stage studies with (asymptotically) normal endpoints and known variance.
913 Design of Experiments (DoE) & Analysis of Experimental Data MaxPro Maximum Projection Designs Generate a maximum projection (MaxPro) design, a MaxPro Latin hypercube design or improve an initial design based on the MaxPro criterion. Details of the MaxPro criterion can be found in: Joseph, V. R., Gul, E., and Ba, S. (2015) “Maximum Projection Designs for Computer Experiments”, Biometrika.
914 Design of Experiments (DoE) & Analysis of Experimental Data MBHdesign Spatial Designs for Ecological and Environmental Surveys Provides spatially balanced designs from a set of (contiguous) potential sampling locations in a study region. Accommodates, without detrimental effects on spatial balance, sites that the researcher wishes to include in the survey for reasons other than the current randomisation (legacy sites).
915 Design of Experiments (DoE) & Analysis of Experimental Data minimalRSD Minimally Changed CCD and BBD Generates central composite designs (CCD) with full as well as fractional factorial points (half replicate) and Box-Behnken designs (BBD) with a minimally changed run sequence.
916 Design of Experiments (DoE) & Analysis of Experimental Data minimaxdesign Minimax and Minimax Projection Designs Provides two main functions: mMcPSO() and miniMaxPro(), which generate minimax designs and minimax projection designs using a hybrid clustering - particle swarm optimization (PSO) algorithm. These designs can be used in a variety of settings, e.g., as space-filling designs for computer experiments or as sensor allocation designs. A detailed description of the two designs and the employed algorithms can be found in Mak and Joseph (2017) doi:10.1080/10618600.2017.1302881.
917 Design of Experiments (DoE) & Analysis of Experimental Data mixexp Design and Analysis of Mixture Experiments Functions for creating designs for mixture experiments, making ternary contour plots, and making mixture effect plots.
918 Design of Experiments (DoE) & Analysis of Experimental Data mkssd Efficient multi-level k-circulant supersaturated designs Generates efficient balanced non-aliased multi-level k-circulant supersaturated designs by interchanging the elements of the generator vector. The package tries to generate a supersaturated design whose chi-square efficiency exceeds a user-specified efficiency level (mef), and displays the progress of the generation through a progress bar. A progress of 100% means that one full round of interchange is completed; more than one full round (typically 4-5 rounds) may be required for larger designs.
919 Design of Experiments (DoE) & Analysis of Experimental Data mxkssd Efficient mixed-level k-circulant supersaturated designs Generates efficient balanced mixed-level k-circulant supersaturated designs by interchanging the elements of the generator vector. The package tries to generate a supersaturated design whose EfNOD efficiency exceeds a user-specified efficiency level (mef), and displays the progress of the generation through a progress bar. A progress of 100% means that one full round of interchange is completed; more than one full round (typically 4-5 rounds) may be required for larger designs.
920 Design of Experiments (DoE) & Analysis of Experimental Data OBsMD Objective Bayesian Model Discrimination in Follow-Up Designs Implements the objective Bayesian methodology proposed by Consonni and Deldossi to choose the optimal follow-up experiment that best discriminates between competing models. G. Consonni, L. Deldossi (2014) Objective Bayesian Model Discrimination in Follow-up Experimental Designs, Test. doi:10.1007/s11749-015-0461-3.
921 Design of Experiments (DoE) & Analysis of Experimental Data odr Optimal Design and Statistical Power of Cost-Efficient Multilevel Randomized Trials Calculates the optimal sample allocation that minimizes the variance of the treatment effect in a multilevel randomized trial under a fixed budget and cost structure, and performs power analyses with and without accommodating costs and budget. The reference for the proposed methods is: Shen, Z., & Kelcey, B. (under review). Optimal design of cluster randomized trials under condition- and unit-specific cost structures. 2018 American Educational Research Association (AERA) annual conference.
922 Design of Experiments (DoE) & Analysis of Experimental Data OPDOE OPtimal Design Of Experiments Experimental Design
923 Design of Experiments (DoE) & Analysis of Experimental Data optbdmaeAT Optimal Block Designs for Two-Colour cDNA Microarray Experiments Computes A-, MV-, D- and E-optimal or near-optimal block designs for two-colour cDNA microarray experiments using the linear fixed effects and mixed effects models where the interest is in a comparison of all possible elementary treatment contrasts. The algorithms used in this package are based on the treatment exchange and array exchange algorithms of Debusho, Gemechu and Haines (2016, unpublished). The package also provides an optional method of using the graphical user interface (GUI) R package tcltk to ensure that it is user friendly.
924 Design of Experiments (DoE) & Analysis of Experimental Data optDesignSlopeInt Optimal Designs for Estimating the Slope Divided by the Intercept Compute optimal experimental designs that measure the slope divided by the intercept.
925 Design of Experiments (DoE) & Analysis of Experimental Data OptGS Near-Optimal and Balanced Group-Sequential Designs for Clinical Trials with Continuous Outcomes Functions to find near-optimal multi-stage designs for continuous outcomes.
926 Design of Experiments (DoE) & Analysis of Experimental Data OptimalDesign Algorithms for D-, A-, and IV-Optimal Designs Algorithms for D-, A- and IV-optimal designs of experiments. Some of the functions in this package require the ‘gurobi’ software and its accompanying R package. For their installation, please follow the instructions at <www.gurobi.com> and the file gurobi_inst.txt, respectively.
927 Design of Experiments (DoE) & Analysis of Experimental Data OptimaRegion Confidence Regions for Optima Computes confidence regions on the location of response surface optima.
928 Design of Experiments (DoE) & Analysis of Experimental Data OptInterim Optimal Two and Three Stage Designs for Single-Arm and Two-Arm Randomized Controlled Trials with a Long-Term Binary Endpoint Optimal two and three stage designs monitoring time-to-event endpoints at a specified timepoint
929 Design of Experiments (DoE) & Analysis of Experimental Data optrcdmaeAT Optimal Row-Column Designs for Two-Colour cDNA Microarray Experiments Computes A-, MV-, D- and E-optimal or near-optimal row-column designs for two-colour cDNA microarray experiments using the linear fixed effects and mixed effects models where the interest is in a comparison of all pairwise treatment contrasts. The algorithms used in this package are the array exchange and treatment exchange algorithms of Debusho, Gemechu and Haines (2016, unpublished), adjusted for the row-column design setup. The package also provides an optional method of using the graphical user interface (GUI) R package tcltk to ensure that it is user friendly.
930 Design of Experiments (DoE) & Analysis of Experimental Data osDesign Design and analysis of observational studies The osDesign package serves for planning an observational study. Currently, functionality is focused on the two-phase and case-control designs. Functions in this package provide Monte Carlo based evaluation of operating characteristics, such as power, for estimators of the components of a logistic regression model.
931 Design of Experiments (DoE) & Analysis of Experimental Data PBIBD Partially Balanced Incomplete Block Designs It constructs four series of PBIB designs and also assists in calculating the efficiencies of PBIB designs with any number of associate classes. This will help researchers in adopting a PBIB design and calculating the efficiencies of any PBIB design very quickly and efficiently.
932 Design of Experiments (DoE) & Analysis of Experimental Data PGM2 Nested Resolvable Designs and their Associated Uniform Designs Construction method for nested resolvable designs from a projective geometry defined on a Galois field of order 2. The obtained resolvable designs are used to build uniform designs. The presented results are based on https://eudml.org/doc/219563 and A. Boudraa et al. (see references).
933 Design of Experiments (DoE) & Analysis of Experimental Data ph2bayes Bayesian Single-Arm Phase II Designs An implementation of Bayesian single-arm phase II design methods for binary outcome based on posterior probability and predictive probability.
934 Design of Experiments (DoE) & Analysis of Experimental Data ph2bye Phase II Clinical Trial Design Using Bayesian Methods Calculate the Bayesian posterior/predictive probability and determine the sample size and stopping boundaries for single-arm Phase II design.
935 Design of Experiments (DoE) & Analysis of Experimental Data pid Process Improvement using Data A collection of scripts and data files for the statistics text: “Process Improvement using Data”. The package contains code for designed experiments, data sets and other convenience functions used in the book.
936 Design of Experiments (DoE) & Analysis of Experimental Data pipe.design Dual-Agent Dose Escalation for Phase I Trials using the PIPE Design Implements the Product of Independent beta Probabilities dose Escalation (PIPE) design for dual-agent Phase I trials as described in Mander AP, Sweeting MJ (2015) doi:10.1002/sim.6434.
937 Design of Experiments (DoE) & Analysis of Experimental Data planor (core) Generation of Regular Factorial Designs Automatic generation of regular factorial designs, including fractional designs, orthogonal block designs, row-column designs and split-plots.
938 Design of Experiments (DoE) & Analysis of Experimental Data plgp Particle Learning of Gaussian Processes Sequential Monte Carlo inference for fully Bayesian Gaussian process (GP) regression and classification models by particle learning (PL). The sequential nature of inference and the active learning (AL) hooks provided facilitate thrifty sequential design (by entropy) and optimization (by improvement) for classification and regression models, respectively. This package essentially provides a generic PL interface, and functions (arguments to the interface) which implement the GP models and AL heuristics. Functions for a special, linked, regression/classification GP model and an integrated expected conditional improvement (IECI) statistic are provided for optimization in the presence of unknown constraints. Separable and isotropic Gaussian, and single-index correlation functions are supported. See the examples section of ?plgp and demo(package=“plgp”) for an index of demos.
939 Design of Experiments (DoE) & Analysis of Experimental Data PopED Population (and Individual) Optimal Experimental Design Optimal experimental designs for both population and individual studies based on nonlinear mixed-effect models. Often this is based on a computation of the Fisher Information Matrix. This package was developed for pharmacometric problems, and examples and predefined models are available for these types of systems.
940 Design of Experiments (DoE) & Analysis of Experimental Data powerAnalysis Power Analysis in Experimental Design Basic functions for power analysis and effect size calculation.
941 Design of Experiments (DoE) & Analysis of Experimental Data powerbydesign Power Estimates for ANOVA Designs Functions for bootstrapping the power of ANOVA designs based on estimated means and standard deviations of the conditions. Please refer to the documentation of the boot.power.anova() function for further details.
942 Design of Experiments (DoE) & Analysis of Experimental Data powerGWASinteraction Power Calculations for GxE and GxG Interactions for GWAS Analytical power calculations for GxE and GxG interactions for case-control studies of candidate genes and genome-wide association studies (GWAS). This includes power calculation for four two-step screening and testing procedures. It can also calculate power for GxE and GxG without any screening.
943 Design of Experiments (DoE) & Analysis of Experimental Data PwrGSD Power in a Group Sequential Design Tools for the evaluation of interim analysis plans for sequentially monitored trials on a survival endpoint; tools to construct efficacy and futility boundaries and to derive the power of a sequential design at a specified alternative; and a template for evaluating the performance of candidate plans at a set of time-varying alternatives.
944 Design of Experiments (DoE) & Analysis of Experimental Data qtlDesign Design of QTL experiments Tools for the design of QTL experiments
945 Design of Experiments (DoE) & Analysis of Experimental Data qualityTools Statistical Methods for Quality Science Contains methods associated with the Define, Measure, Analyze, Improve and Control (i.e. DMAIC) cycle of the Six Sigma quality management methodology. It covers distribution fitting, normal and non-normal process capability indices, techniques for measurement systems analysis, especially gage capability indices and Gage Repeatability and Reproducibility (i.e. Gage R&R) studies, factorial and fractional factorial designs, as well as response surface methods including the use of desirability functions. Improvement via Six Sigma is a project-based strategy that covers five phases: Define - Pareto chart; Measure - probability and quantile-quantile plots, process capability indices for various distributions, and Gage R&R; Analyze - Pareto chart, multi-vari chart, dot plot; Improve - full and fractional factorial, response surface and mixture designs, the desirability approach for simultaneous optimization of more than one response variable, and normal, Pareto and Lenth plots of effects as well as interaction plots; Control - quality control charts can be found in the ‘qcc’ package. The focus is on teaching the statistical methodology used in the quality sciences.
946 Design of Experiments (DoE) & Analysis of Experimental Data RcmdrPlugin.DoE R Commander Plugin for (industrial) Design of Experiments The package provides a platform-independent GUI for design of experiments. It is implemented as a plugin to the R-Commander, which is a more general graphical user interface for statistics in R based on tcl/tk. DoE functionality can be accessed through the menu Design that is added to the R-Commander menus.
947 Design of Experiments (DoE) & Analysis of Experimental Data rodd Optimal Discriminating Designs A collection of functions for the numerical construction of optimal discriminating designs. Currently, T-optimal designs (which maximize the lower bound for the power of the F-test for regression model discrimination), KL-optimal designs (for lognormal errors) and their robust analogues can be calculated with the package.
948 Design of Experiments (DoE) & Analysis of Experimental Data RPPairwiseDesign Resolvable partially pairwise balanced design and Space-filling design via association scheme Using some association schemes to obtain a new series of resolvable partially pairwise balanced designs (RPPBD) and space-filling designs.
949 Design of Experiments (DoE) & Analysis of Experimental Data rsm (core) Response-Surface Analysis Provides functions to generate response-surface designs, fit first- and second-order response-surface models, make surface plots, obtain the path of steepest ascent, and do canonical analysis. A good reference on these methods is Chapter 10 of Wu, C-F J and Hamada, M (2009) “Experiments: Planning, Analysis, and Parameter Design Optimization” ISBN 978-0-471-69946-0.
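The canonical-analysis step that 'rsm' automates can be illustrated by hand: fit a second-order model to central-composite-design data by least squares, then solve for the stationary point. A minimal Python sketch on hypothetical toy data (the design matrix, response surface and noise level below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
# Coded settings of a two-factor rotatable CCD (axial distance sqrt(2)):
a = np.sqrt(2.0)
X = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1],
              [-a, 0], [a, 0], [0, -a], [0, a],
              [0, 0], [0, 0], [0, 0]], dtype=float)
# Hypothetical noisy response with a true maximum at (0.4, -0.2):
y = (10 - (X[:, 0] - 0.4)**2 - (X[:, 1] + 0.2)**2
     + rng.normal(0, 0.05, len(X)))

# Second-order model y = b0 + b'x + x'Bx, fitted by least squares.
x1, x2 = X[:, 0], X[:, 1]
M = np.column_stack([np.ones(len(X)), x1, x2, x1 * x2, x1**2, x2**2])
beta, *_ = np.linalg.lstsq(M, y, rcond=None)
b = beta[1:3]
B = np.array([[beta[4], beta[3] / 2], [beta[3] / 2, beta[5]]])

# Canonical analysis: stationary point and curvature classification.
x_s = -0.5 * np.linalg.solve(B, b)     # should land near (0.4, -0.2)
eigvals = np.linalg.eigvalsh(B)        # all negative => a maximum
```

The signs of the eigenvalues of B classify the stationary point (all negative: maximum; all positive: minimum; mixed: saddle), which is exactly the summary a canonical analysis reports.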
950 Design of Experiments (DoE) & Analysis of Experimental Data rsurface Design of Rotatable Central Composite Experiments and Response Surface Analysis Produces tables with the level of replication (number of replicates) and the experimental uncoded values of the quantitative factors to be used for rotatable central composite design (CCD) experimentation, following Mead et al. (2012) doi:10.1017/CBO9781139020879, together with a 2-D contour plot of the corresponding variance of the predicted response (design_ccd()), and analyzes CCD data with response surface methodology (ccd_analysis()). A rotatable CCD provides values of the variance of the predicted response that are concentrically distributed around the average treatment combination used in the experimentation, which, with uniform precision (implied by the use of several replicates at the average treatment combination), greatly improves the search for an optimum response. These properties of a rotatable CCD represent undeniable advantages over the classical factorial design, as discussed by Panneton et al. (1999) doi:10.13031/2013.13267 and Mead et al. (2012) doi:10.1017/CBO9781139020879.018, among others.
951 Design of Experiments (DoE) & Analysis of Experimental Data SensoMineR Sensory data analysis with R An R package for analysing sensory data.
952 Design of Experiments (DoE) & Analysis of Experimental Data seqDesign Simulation and Group Sequential Monitoring of Randomized Two-Stage Treatment Efficacy Trials with Time-to-Event Endpoints A modification of the preventive vaccine efficacy trial design of Gilbert, Grove et al. (2011, Statistical Communications in Infectious Diseases) is implemented, with application generally to individual-randomized clinical trials with multiple active treatment groups and a shared control group, and a study endpoint that is a time-to-event endpoint subject to right-censoring. The design accounts for the issues that the efficacy of the treatment/vaccine groups may take time to accrue while the multiple treatment administrations/vaccinations are given; there is interest in assessing the durability of treatment efficacy over time; and group sequential monitoring of each treatment group for potential harm, non-efficacy/efficacy futility, and high efficacy is warranted. The design divides the trial into two stages of time periods, where each treatment is first evaluated for efficacy in the first stage of follow-up, and, if and only if it shows significant treatment efficacy in stage one, it is evaluated for longer-term durability of efficacy in stage two. The package produces plots and tables describing operating characteristics of a specified design including an unconditional power for intention-to-treat and per-protocol/as-treated analyses; trial duration; probabilities of the different possible trial monitoring outcomes (e.g., stopping early for non-efficacy); unconditional power for comparing treatment efficacies; and distributions of numbers of endpoint events occurring after the treatments/vaccinations are given, useful as input parameters for the design of studies of the association of biomarkers with a clinical outcome (surrogate endpoint problem). The code can be used for a single active treatment versus control design and for a single-stage design.
953 Design of Experiments (DoE) & Analysis of Experimental Data sFFLHD Sequential Full Factorial-Based Latin Hypercube Design Gives design points from a sequential full factorial-based Latin hypercube design, as described in Duan, Ankenman, Sanchez, and Sanchez (2015, Technometrics, doi:10.1080/00401706.2015.1108233).
954 Design of Experiments (DoE) & Analysis of Experimental Data simrel Linear Model Data Simulation and Design of Computer Experiments Facilitates data simulation from a random regression model where the data properties can be controlled by a few input parameters. The data simulation is based on the concept of relevant latent components and relevant predictors, and was developed for the purpose of testing methods for variable selection for prediction. Included are also functions for designing computer experiments in order to investigate the effects of the data properties on the performance of the tested methods. The design is constructed using the Multi-level Binary Replacement (MBR) design approach which makes it possible to set up fractional designs for multi-factor problems with potentially many levels for each factor.
955 Design of Experiments (DoE) & Analysis of Experimental Data skpr (core) Design of Experiments Suite: Generate and Evaluate Optimal Designs Generates and evaluates D, I, A, Alias, E, T, and G optimal designs. Supports generation and evaluation of split/split-split/…/N-split plot designs. Includes parametric and Monte Carlo power evaluation functions, and supports calculating power for censored responses. Provides a framework to evaluate power using functions provided in other packages or written by the user. Includes a Shiny graphical user interface that displays the underlying code used to create and evaluate the design to improve ease-of-use and make analyses more reproducible.
956 Design of Experiments (DoE) & Analysis of Experimental Data SLHD Maximin-Distance (Sliced) Latin Hypercube Designs Generates optimal Latin Hypercube Designs (LHDs) for computer experiments with quantitative factors and optimal Sliced Latin Hypercube Designs (SLHDs) for computer experiments with both quantitative and qualitative factors. Details of the algorithm can be found in Ba, S., Brenneman, W. A. and Myers, W. R. (2015), “Optimal Sliced Latin Hypercube Designs,” Technometrics. The key function in this package is maximinSLHD().
957 Design of Experiments (DoE) & Analysis of Experimental Data soptdmaeA Sequential Optimal Designs for Two-Colour cDNA Microarray Experiments Computes sequential A-, MV-, D- and E-optimal or near-optimal block and row-column designs for two-colour cDNA microarray experiments using the linear fixed effects and mixed effects models where the interest is in a comparison of all possible elementary treatment contrasts. The package also provides an optional method of using the graphical user interface (GUI) R package ‘tcltk’ to ensure that it is user friendly.
958 Design of Experiments (DoE) & Analysis of Experimental Data sp23design Design and Simulation of seamless Phase II-III Clinical Trials Provides methods for generating, exploring and executing seamless Phase II-III designs of Lai, Lavori and Shih using generalized likelihood ratio statistics. Includes pdf and source files that describe the entire R implementation with the relevant mathematical details.
959 Design of Experiments (DoE) & Analysis of Experimental Data ssize.fdr Sample Size Calculations for Microarray Experiments This package contains a set of functions that calculate appropriate sample sizes for one-sample t-tests, two-sample t-tests, and F-tests for microarray experiments, based on desired power while controlling for false discovery rates. For all tests, the standard deviations (variances) among genes can be assumed fixed or random; this is also true for effect sizes among genes in one-sample and two-sample experiments. Functions also output a chart of power versus sample size, a table of power at different sample sizes, and a table of critical test values at different sample sizes.
960 Design of Experiments (DoE) & Analysis of Experimental Data ssizeRNA Sample Size Calculation for RNA-Seq Experimental Design We propose a procedure for sample size calculation while controlling the false discovery rate for RNA-seq experimental design. Our procedure depends on the Voom method proposed for RNA-seq data analysis by Law et al. (2014) doi:10.1186/gb-2014-15-2-r29 and the sample size calculation method proposed for microarray experiments by Liu and Hwang (2007) doi:10.1093/bioinformatics/btl664. We develop a set of functions that calculate appropriate sample sizes for the two-sample t-test for RNA-seq experiments with a fixed or varying set of parameters. The outputs also contain a plot of power versus sample size, a table of power at different sample sizes, and a table of critical test values at different sample sizes. To install this package, please use ‘source(“http://bioconductor.org/biocLite.R”); biocLite(“ssizeRNA”)’.
961 Design of Experiments (DoE) & Analysis of Experimental Data support.CEs Basic Functions for Supporting an Implementation of Choice Experiments Provides seven basic functions that support an implementation of choice experiments.
962 Design of Experiments (DoE) & Analysis of Experimental Data TEQR Target Equivalence Range Design The TEQR package contains software to calculate the operating characteristics for the TEQR and the ACT designs. The TEQR (toxicity equivalence range) design is a toxicity-based cumulative cohort design with added safety rules. The ACT (activity constrained for toxicity) design is also a cumulative cohort design with additional safety rules; its unique feature is that dose is escalated based on lack of activity rather than on lack of toxicity, and is de-escalated only if an unacceptable level of toxicity is experienced.
963 Design of Experiments (DoE) & Analysis of Experimental Data tgp Bayesian Treed Gaussian Process Models Bayesian nonstationary, semiparametric nonlinear regression and design by treed Gaussian processes (GPs) with jumps to the limiting linear model (LLM). Special cases also implemented include Bayesian linear models, CART, treed linear models, stationary separable and isotropic GPs, and GP single-index models. Provides 1-d and 2-d plotting functions (with projection and slice capabilities) and tree drawing, designed for visualization of tgp-class output. Sensitivity analysis and multi-resolution models are supported. Sequential experimental design and adaptive sampling functions are also provided, including ALM, ALC, and expected improvement. The latter supports derivative-free optimization of noisy black-box functions.
964 Design of Experiments (DoE) & Analysis of Experimental Data ThreeArmedTrials Design and Analysis of Clinical Non-Inferiority or Superiority Trials with Active and Placebo Control Design and analyze three-arm non-inferiority or superiority trials which follow a gold-standard design, i.e. trials with an experimental treatment, an active, and a placebo control.
965 Design of Experiments (DoE) & Analysis of Experimental Data toxtestD Experimental design for binary toxicity tests Calculates sample size and dose allocation for binary toxicity tests, using the Fish Embryo Toxicity Test as an example. An optimal test design is obtained by running (i) spoD (calculate the number of individuals to test under control conditions), (ii) setD (estimate the minimal sample size per treatment given the user's precision requirements) and (iii) doseD (construct an individual dose scheme).
966 Design of Experiments (DoE) & Analysis of Experimental Data unrepx Analysis and Graphics for Unreplicated Experiments Provides half-normal plots, reference plots, and Pareto plots of effects from an unreplicated experiment, along with various pseudo-standard-error measures, simulated reference distributions, and other tools. Many of these methods are described in Daniel C. (1959) doi:10.1080/00401706.1959.10489866 and/or Lenth R.V. (1989) doi:10.1080/00401706.1989.10488595, but some new approaches are added and integrated in one package.
967 Design of Experiments (DoE) & Analysis of Experimental Data vdg Variance Dispersion Graphs and Fraction of Design Space Plots Facilities for constructing variance dispersion graphs, fraction-of-design-space plots and similar graphics for exploring the properties of experimental designs. The design region is explored via random sampling, which allows for more flexibility than traditional variance dispersion graphs. A formula interface is leveraged to provide access to complex model formulae. Graphics can be constructed simultaneously for multiple experimental designs and/or multiple model formulae. Instead of using pointwise optimization to find the minimum and maximum scaled prediction variance curves, which can be inaccurate and time consuming, this package uses quantile regression as an alternative.
968 Design of Experiments (DoE) & Analysis of Experimental Data Vdgraph Variance dispersion graphs and Fraction of design space plots for response surface designs Uses a modification of the published FORTRAN code in “A Computer Program for Generating Variance Dispersion Graphs” by G. Vining, Journal of Quality Technology, Vol. 25 No. 1 January 1993, to produce variance dispersion graphs. Also produces fraction of design space plots, and contains data frames for several minimal run response surface designs.
969 Design of Experiments (DoE) & Analysis of Experimental Data VdgRsm Plots of Scaled Prediction Variances for Response Surface Designs Functions for creating variance dispersion graphs, fraction of design space plots, and contour plots of scaled prediction variances for second-order response surface designs in spherical and cuboidal regions. Also, some standard response surface designs can be generated.
970 Design of Experiments (DoE) & Analysis of Experimental Data VNM Finding Multiple-Objective Optimal Designs for the 4-Parameter Logistic Model Provides tools for finding multiple-objective optimal designs for estimating the shape of the dose-response curve, the ED50 (the dose producing an effect midway between the expected responses at the extreme doses) and the MED (the minimum effective dose level) for the 2-, 3- and 4-parameter logistic models, and for evaluating their efficiencies for the three objectives. The acronym VNM stands for the V-algorithm using the Newton-Raphson method to search for multiple-objective optimal designs.
971 Extreme Value Analysis copula Multivariate Dependence with Copulas Classes (S4) of commonly used elliptical, Archimedean, extreme-value and other copula families, as well as their rotations, mixtures and asymmetrizations. Nested Archimedean copulas, related tools and special functions. Methods for density, distribution, random number generation, bivariate dependence measures, Rosenblatt transform, Kendall distribution function, perspective and contour plots. Fitting of copula models with potentially partly fixed parameters, including standard errors. Serial independence tests, copula specification tests (independence, exchangeability, radial symmetry, extreme-value dependence, goodness-of-fit) and model selection based on cross-validation. Empirical copula, smoothed versions, and non-parametric estimators of the Pickands dependence function.
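The core copula idea — separating the dependence structure from the marginal distributions — can be sketched conceptually. This is a plain Python/SciPy illustration, not the 'copula' package's interface; the correlation value and margins below are arbitrary choices:

```python
import numpy as np
from scipy.stats import norm, expon

rng = np.random.default_rng(7)
rho = 0.8                                   # dependence on the latent scale
cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal([0.0, 0.0], cov, size=10_000)

# Gaussian copula: Phi maps the latent normals to dependent uniforms,
# then any inverse CDFs impose whatever margins are wanted.
u = norm.cdf(z)
x = expon.ppf(u[:, 0], scale=2.0)           # exponential margin (mean 2)
y = norm.ppf(u[:, 1], loc=5.0, scale=1.0)   # normal margin N(5, 1)
```

The resulting (x, y) pairs have the chosen margins exactly, while rank-based dependence measures (Spearman's rho, Kendall's tau) are inherited from the latent correlation.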
972 Extreme Value Analysis evd (core) Functions for Extreme Value Distributions Extends simulation, distribution, quantile and density functions to univariate and multivariate parametric extreme value distributions, and provides fitting functions which calculate maximum likelihood estimates for univariate and bivariate maxima models, and for univariate and bivariate threshold models.
973 Extreme Value Analysis evdbayes Bayesian Analysis in Extreme Value Theory Provides functions for the Bayesian analysis of extreme value models, using MCMC methods.
974 Extreme Value Analysis evir (core) Extreme Values in R Functions for extreme value theory, which may be divided into the following groups: exploratory data analysis, block maxima, peaks over thresholds (univariate and bivariate), point processes, and GEV/GPD distributions.
975 Extreme Value Analysis extremefit Estimation of Extreme Conditional Quantiles and Probabilities Extreme value theory, nonparametric kernel estimation, tail conditional probabilities, extreme conditional quantile, adaptive estimation, quantile regression, survival probabilities.
976 Extreme Value Analysis extRemes Extreme Value Analysis Functions for performing extreme value analysis.
977 Extreme Value Analysis extremeStat Extreme Value Statistics and Quantile Estimation Code to fit, plot and compare several (extreme value) distribution functions. Can also compute (truncated) distribution quantile estimates and draw a plot with return periods on a linear scale.
978 Extreme Value Analysis fExtremes Rmetrics - Modelling Extreme Events in Finance Provides functions for analysing and modelling extreme events in financial time series. The topics include: (i) data pre-processing, (ii) explorative data analysis, (iii) peak over threshold modelling, (iv) block maxima modelling, (v) estimation of VaR and CVaR, and (vi) the computation of the extreme index.
979 Extreme Value Analysis ismev An Introduction to Statistical Modeling of Extreme Values Functions to support the computations carried out in ‘An Introduction to Statistical Modeling of Extreme Values’ by Stuart Coles. The functions may be divided into the following groups: maxima/minima, order statistics, peaks over thresholds and point processes.
980 Extreme Value Analysis lmom L-Moments Functions related to L-moments: computation of L-moments and trimmed L-moments of distributions and data samples; parameter estimation; L-moment ratio diagram; plot vs. quantiles of an extreme-value distribution.
981 Extreme Value Analysis lmomco L-Moments, Censored L-Moments, Trimmed L-Moments, L-Comoments, and Many Distributions Extensive functions for L-moments (LMs) and probability-weighted moments (PWMs), parameter estimation for distributions, LM computation for distributions, and L-moment ratio diagrams. Maximum likelihood and maximum product of spacings estimation are also available. LMs for right-tail and left-tail censoring by known or unknown threshold and by indicator variable are available. Asymmetric (asy) trimmed LMs (TL-moments, TLMs) are supported. LMs of residual (resid) and reversed (rev) resid life are implemented along with 13 quantile function operators for reliability and survival analyses. Exact analytical bootstrap estimates of order statistics, LMs, and variances-covariances of LMs are provided. The Harri-Coble Tau34-squared Normality Test is available. Distribution support with “L” (LMs), “TL” (TLMs) and added (+) support for right-tail censoring (RC) encompasses: Asy Exponential (Exp) Power [L], Asy Triangular [L], Cauchy [TL], Eta-Mu [L], Exp. [L], Gamma [L], Generalized (Gen) Exp Poisson [L], Gen Extreme Value [L], Gen Lambda [L,TL], Gen Logistic [L], Gen Normal [L], Gen Pareto [L+RC, TL], Govindarajulu [L], Gumbel [L], Kappa [L], Kappa-Mu [L], Kumaraswamy [L], Laplace [L], Linear Mean Resid. Quantile Function [L], Normal [L], 3-p log-Normal [L], Pearson Type III [L], Rayleigh [L], Rev-Gumbel [L+RC], Rice/Rician [L], Slash [TL], 3-p Student t [L], Truncated Exponential [L], Wakeby [L], and Weibull [L]. Multivariate sample L-comoments (LCMs) are implemented to measure asymmetric associations.
982 Extreme Value Analysis lmomRFA Regional Frequency Analysis using L-Moments Functions for regional frequency analysis using the methods of J. R. M. Hosking and J. R. Wallis (1997), “Regional frequency analysis: an approach based on L-moments”.
983 Extreme Value Analysis mev Multivariate Extreme Value Distributions Exact simulation from max-stable processes and multivariate extreme value distributions for various parametric models. Threshold selection methods.
984 Extreme Value Analysis POT Generalized Pareto Distribution and Peaks Over Threshold Functions to perform Peaks Over Threshold analysis in univariate and bivariate cases. A user’s guide is available.
985 Extreme Value Analysis QRM Provides R-Language Code to Examine Quantitative Risk Management Concepts Accompanying package to the book Quantitative Risk Management: Concepts, Techniques and Tools by Alexander J. McNeil, Rudiger Frey, and Paul Embrechts.
986 Extreme Value Analysis ReIns Functions from “Reinsurance: Actuarial and Statistical Aspects” Functions from the book “Reinsurance: Actuarial and Statistical Aspects” (2017) by Hansjoerg Albrecher, Jan Beirlant and Jef Teugels http://wiley.com/WileyCDA/WileyTitle/productCd-0470772689.html.
987 Extreme Value Analysis Renext Renewal Method for Extreme Values Extrapolation Peaks Over Threshold (POT) or ‘methode du renouvellement’. The distribution for the exceedances can be chosen, and heterogeneous data (including historical data or block data) can be used in a Maximum-Likelihood framework.
988 Extreme Value Analysis revdbayes Ratio-of-Uniforms Sampling for Bayesian Extreme Value Analysis Provides functions for the Bayesian analysis of extreme value models. The ‘rust’ package https://cran.r-project.org/package=rust is used to simulate a random sample from the required posterior distribution. The functionality of ‘revdbayes’ is similar to the ‘evdbayes’ package https://cran.r-project.org/package=evdbayes, which uses Markov Chain Monte Carlo (‘MCMC’) methods for posterior simulation. Also provided are functions for making inferences about the extremal index, using the K-gaps model of Suveges and Davison (2010) doi:10.1214/09-AOAS292. See the ‘revdbayes’ website for more information, documentation and examples.
989 Extreme Value Analysis RTDE Robust Tail Dependence Estimation Robust tail dependence estimation for bivariate models. This package is based on two papers by the authors: ‘Robust and bias-corrected estimation of the coefficient of tail dependence’ and ‘Robust and bias-corrected estimation of probabilities of extreme failure sets’. This work was supported by a research grant (VKR023480) from VILLUM FONDEN and an international project for scientific cooperation (PICS-6416).
990 Extreme Value Analysis SpatialExtremes Modelling Spatial Extremes Tools for the statistical modelling of spatial extremes using max-stable processes, copulas or Bayesian hierarchical models. More precisely, this package allows (conditional) simulations from various parametric max-stable models, analysis of extremal spatial dependence, the fitting of such processes using composite likelihoods or least squares (simple max-stable processes only), model checking, selection and prediction. Other approaches (although not completely in agreement with extreme value theory) are available, such as the use of (spatial) copulas and Bayesian hierarchical models assuming the so-called conditional independence assumption. The latter approaches are handled through an (efficient) Gibbs sampler. Some key references: Davison et al. (2012) doi:10.1214/11-STS376, Padoan et al. (2010) doi:10.1198/jasa.2009.tm08577, Dombry et al. (2013) doi:10.1093/biomet/ass067.
991 Extreme Value Analysis texmex Statistical Modelling of Extreme Values Statistical extreme value modelling of threshold excesses, maxima and multivariate extremes. Univariate models for threshold excesses and maxima are the Generalised Pareto and Generalised Extreme Value models, respectively. These models may be fitted by maximum (optionally penalised) likelihood or Bayesian estimation, and both classes of models may be fitted with covariates in any or all model parameters. Model diagnostics support the fitting process. Graphical output for visualising fitted models and return level estimates is provided. For serially dependent sequences, the intervals declustering algorithm of Ferro and Segers (2003) doi:10.1111/1467-9868.00401 is provided, with diagnostic support to aid selection of threshold and declustering horizon. Multivariate modelling is performed via the conditional approach of Heffernan and Tawn (2004) doi:10.1111/j.1467-9868.2004.02050.x, with graphical tools for threshold selection and to diagnose estimation convergence.
992 Extreme Value Analysis VGAM Vector Generalized Linear and Additive Models An implementation of about 6 major classes of statistical regression models. At the heart of it are the vector generalized linear and additive model (VGLM/VGAM) classes, and the book “Vector Generalized Linear and Additive Models: With an Implementation in R” (Yee, 2015) doi:10.1007/978-1-4939-2818-7 gives details of the statistical framework and the VGAM package. Currently only fixed-effects models are implemented, i.e., no random-effects models. Many (150+) models and distributions are estimated by maximum likelihood estimation (MLE) or penalized MLE, using Fisher scoring. VGLMs can be loosely thought of as multivariate GLMs. VGAMs are data-driven VGLMs (i.e., with smoothing). The other classes are RR-VGLMs (reduced-rank VGLMs), quadratic RR-VGLMs, reduced-rank VGAMs and RCIMs (row-column interaction models); these classes fit constrained and unconstrained quadratic ordination (CQO/UQO) models in ecology, as well as constrained additive ordination (CAO). Note that these functions are subject to change; see the NEWS and ChangeLog files for the latest changes.
993 Empirical Finance actuar Actuarial Functions and Heavy Tailed Distributions Functions and data sets for actuarial science: modeling of loss distributions; risk theory and ruin theory; simulation of compound models, discrete mixtures and compound hierarchical models; credibility theory. Support for many additional probability distributions to model insurance loss amounts and loss frequency: 19 continuous heavy tailed distributions; the Poisson-inverse Gaussian discrete distribution; zero-truncated and zero-modified extensions of the standard discrete distributions. Support for phase-type distributions commonly used to compute ruin probabilities.
994 Empirical Finance AmericanCallOpt This package includes pricing functions for selected American call options with underlying assets that generate payouts This package includes a set of pricing functions for American call options. The following cases are covered: Pricing of an American call using the standard binomial approximation; Hedge parameters for an American call with a standard binomial tree; Binomial pricing of an American call with continuous payout from the underlying asset; Binomial pricing of an American call with an underlying stock that pays proportional dividends in discrete time; Pricing of an American call on futures using a binomial approximation; Pricing of a currency futures American call using a binomial approximation; Pricing of a perpetual American call. Note that this material is for educational purposes only; the code is not optimized for computational efficiency, as it is meant to illustrate standard cases of analytical and numerical solution.
995 Empirical Finance backtest Exploring Portfolio-Based Conjectures About Financial Instruments The backtest package provides facilities for exploring portfolio-based conjectures about financial instruments (stocks, bonds, swaps, options, et cetera).
996 Empirical Finance bayesGARCH Bayesian Estimation of the GARCH(1,1) Model with Student-t Innovations Provides the bayesGARCH() function which performs the Bayesian estimation of the GARCH(1,1) model with Student’s t innovations as described in Ardia (2008) doi:10.1007/978-3-540-78657-3.
997 Empirical Finance BCC1997 Calculation of Option Prices Based on a Universal Solution Calculates the prices of European options based on the universal solution provided by Bakshi, Cao and Chen (1997) doi:10.1111/j.1540-6261.1997.tb02749.x. This solution considers stochastic volatility, stochastic interest rates and random jumps. Please cite their work if this package is used.
998 Empirical Finance BenfordTests Statistical Tests for Evaluating Conformity to Benford’s Law Several specialized statistical tests and support functions for determining if numerical data could conform to Benford’s law.
999 Empirical Finance betategarch Simulation, Estimation and Forecasting of Beta-Skew-t-EGARCH Models Simulation, estimation and forecasting of first-order Beta-Skew-t-EGARCH models with leverage (one-component, two-component, skewed versions).
1000 Empirical Finance bizdays Business Days Calculations and Utilities Business days calculations based on a list of holidays and nonworking weekdays. Quite useful for fixed income and derivatives pricing.
1001 Empirical Finance BLModel Black-Litterman Posterior Distribution Posterior distribution in the Black-Litterman model is computed from a prior distribution given in the form of a time series of asset returns and a continuous distribution of views provided by the user as an external function.
1002 Empirical Finance BurStFin Burns Statistics Financial A suite of functions for finance, including the estimation of variance matrices via a statistical factor model or Ledoit-Wolf shrinkage.
1003 Empirical Finance BurStMisc Burns Statistics Miscellaneous Script search, corner, genetic optimization, permutation tests, write expect test.
1004 Empirical Finance CADFtest A Package to Perform Covariate Augmented Dickey-Fuller Unit Root Tests Hansen’s (1995) Covariate-Augmented Dickey-Fuller (CADF) test. The only required argument is y, the Tx1 time series to be tested. If no stationary covariate X is passed to the procedure, then an ordinary ADF test is performed. The p-values of the test are computed using the procedure illustrated in Lupi (2009).
1005 Empirical Finance car Companion to Applied Regression Functions and Datasets to Accompany J. Fox and S. Weisberg, An R Companion to Applied Regression, Second Edition, Sage, 2011.
1006 Empirical Finance ccgarch Conditional Correlation GARCH Models Functions for estimating and simulating the family of CC-GARCH models.
1007 Empirical Finance ChainLadder Statistical Methods and Models for Claims Reserving in General Insurance Various statistical methods and models which are typically used for the estimation of outstanding claims reserves in general insurance, including those to estimate the claims development result as required under Solvency II.
1008 Empirical Finance copula Multivariate Dependence with Copulas Classes (S4) of commonly used elliptical, Archimedean, extreme-value and other copula families, as well as their rotations, mixtures and asymmetrizations. Nested Archimedean copulas, related tools and special functions. Methods for density, distribution, random number generation, bivariate dependence measures, Rosenblatt transform, Kendall distribution function, perspective and contour plots. Fitting of copula models with potentially partly fixed parameters, including standard errors. Serial independence tests, copula specification tests (independence, exchangeability, radial symmetry, extreme-value dependence, goodness-of-fit) and model selection based on cross-validation. Empirical copula, smoothed versions, and non-parametric estimators of the Pickands dependence function.
1009 Empirical Finance covmat Covariance Matrix Estimation We implement a collection of techniques for estimating covariance matrices. Covariance matrices can be built using missing data. Stambaugh Estimation and FMMC methods can be used to construct such matrices. Covariance matrices can be built by denoising or shrinking the eigenvalues of a sample covariance matrix. Such techniques work by exploiting the tools in Random Matrix Theory to analyse the distribution of eigenvalues. Covariance matrices can also be built assuming that data has many underlying regimes. Each regime is allowed to follow a Dynamic Conditional Correlation model. Robust covariance matrices can be constructed by multivariate cleaning and smoothing of noisy data.
1010 Empirical Finance CreditMetrics Functions for Calculating the CreditMetrics Risk Model A set of functions for computing the CreditMetrics risk model.
1011 Empirical Finance credule Credit Default Swap Functions Provides functions to bootstrap credit curves from market quotes (credit default swap, CDS, spreads) and to price credit default swaps.
1012 Empirical Finance crp.CSFP CreditRisk+ Portfolio Model Modelling credit risks based on the concept of “CreditRisk+”, First Boston Financial Products, 1997 and “CreditRisk+ in the Banking Industry”, Gundlach & Lehrbass, Springer, 2003.
1013 Empirical Finance data.table Extension of ‘data.frame’ Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns, a fast friendly file reader and parallel file writer. Offers a natural and flexible syntax, for faster development.
1014 Empirical Finance derivmkts Functions and R Code to Accompany Derivatives Markets A set of pricing and expository functions that should be useful in teaching a course on financial derivatives.
1015 Empirical Finance dlm Bayesian and Likelihood Analysis of Dynamic Linear Models Maximum likelihood, Kalman filtering and smoothing, and Bayesian analysis of Normal linear state space models, also known as Dynamic Linear Models.
1016 Empirical Finance Dowd Functions Ported from ‘MMR2’ Toolbox Offered in Kevin Dowd’s Book Measuring Market Risk ‘Kevin Dowd’s’ book Measuring Market Risk is widely read by students and practitioners in the area of risk measurement. ‘MATLAB’ may have been the most suitable language when he originally wrote the functions, but with the growing popularity of R that is no longer entirely the case. As ‘Dowd’s’ code was not intended to be error-free and was mainly for reference, some functions in this package have inherited those errors. An attempt will be made in future releases to identify and correct them. ‘Dowd’s’ original code can be downloaded from www.kevindowd.org/measuring-market-risk/. It should be noted that ‘Dowd’ offers both ‘MMR2’ and ‘MMR1’ toolboxes; only ‘MMR2’, the more recent version, was ported to R, and the two have mostly similar functions. The toolbox mainly contains different parametric and non-parametric methods for the measurement of market risk, as well as backtesting of risk measurement methods.
1017 Empirical Finance dse Dynamic Systems Estimation (Time Series Package) Tools for multivariate, linear, time-invariant, time series models. This includes ARMA and state-space representations, and methods for converting between them. It also includes simulation methods and several estimation functions. The package has functions for looking at model roots, stability, and forecasts at different horizons. The ARMA model representation is general, so that VAR, VARX, ARIMA, ARMAX, ARIMAX can all be considered to be special cases. Kalman filter and smoother estimates can be obtained from the state space model, and state-space model reduction techniques are implemented. An introduction and User’s Guide is available in a vignette.
1018 Empirical Finance dyn Time Series Regression Time series regression. The dyn class interfaces the ts, irts(), zoo() and zooreg() time series classes to lm(), glm(), loess(), quantreg::rq(), MASS::rlm(), MCMCpack::MCMCregress(), randomForest::randomForest() and other regression functions, allowing those functions to be used with time series, including specifications that may contain lags, diffs and missing values.
1019 Empirical Finance dynlm Dynamic Linear Regression Dynamic linear models and time series regression.
1020 Empirical Finance ESG ESG - A package for asset projection The package presents a “Scenarios” class containing general parameters, risk parameters and projection results. Risk parameters are gathered together into a ParamsScenarios sub-object. The general process for using this package is to set all needed parameters in a Scenarios object, use the customPathsGeneration method to perform the projection, then use the xxx_PriceDistribution() methods to get asset prices.
1021 Empirical Finance factorstochvol Bayesian Estimation of (Sparse) Latent Factor Stochastic Volatility Models Markov chain Monte Carlo (MCMC) sampler for fully Bayesian estimation of latent factor stochastic volatility models. Sparsity can be achieved through the usage of Normal-Gamma priors on the factor loading matrix.
1022 Empirical Finance fame Interface for FAME Time Series Database Read and write FAME databases.
1023 Empirical Finance fAssets (core) Rmetrics - Analysing and Modelling Financial Assets Provides a collection of functions to manage, to investigate and to analyze data sets of financial assets from different points of view.
1024 Empirical Finance FatTailsR Kiener Distributions and Fat Tails in Finance Kiener distributions K1, K2, K3, K4 and K7 to characterize distributions with left and right, symmetric or asymmetric fat tails in market finance, neuroscience and other disciplines. Two algorithms to estimate distribution parameters, quantiles, value-at-risk and expected shortfall with high accuracy. Includes power hyperbolas and power hyperbolic functions.
1025 Empirical Finance fBasics (core) Rmetrics - Markets and Basic Statistics Provides a collection of functions to explore and to investigate basic properties of financial returns and related quantities. The covered fields include techniques of explorative data analysis and the investigation of distributional properties, including parameter estimation and hypothesis testing. In addition, there are several utility functions for data handling and management.
1026 Empirical Finance fBonds (core) Rmetrics - Pricing and Evaluating Bonds It implements the Nelson-Siegel and the Nelson-Siegel-Svensson term structures.
1027 Empirical Finance fCopulae (core) Rmetrics - Bivariate Dependence Structures with Copulae Provides a collection of functions to manage, to investigate and to analyze bivariate financial returns by Copulae. Included are the families of Archimedean, Elliptical, Extreme Value, and Empirical Copulae.
1028 Empirical Finance fExoticOptions (core) Rmetrics - Pricing and Evaluating Exotic Options Provides a collection of functions to evaluate barrier options, Asian options, binary options, currency translated options, lookback options, multiple asset options and multiple exercise options.
1029 Empirical Finance fExtremes (core) Rmetrics - Modelling Extreme Events in Finance Provides functions for analysing and modelling extreme events in financial time series. The topics include: (i) data pre-processing, (ii) explorative data analysis, (iii) peak over threshold modelling, (iv) block maxima modelling, (v) estimation of VaR and CVaR, and (vi) the computation of the extreme index.
1030 Empirical Finance fgac Generalized Archimedean Copula Bivariate data fitting involves two stochastic components: the marginal distributions and the dependency structure. The dependency structure is modeled through a copula. An algorithm was implemented considering seven families of copulas (Generalized Archimedean Copulas); the best fit can be obtained by examining all copula options (totally positive of order 2 and stochastically increasing models).
1031 Empirical Finance fGarch (core) Rmetrics - Autoregressive Conditional Heteroskedastic Modelling Provides a collection of functions to analyze and model heteroskedastic behavior in financial time series models.
1032 Empirical Finance fImport (core) Rmetrics - Importing Economic and Financial Data Provides a collection of utility functions to download and manage data sets from the Internet or from other sources.
1033 Empirical Finance financial Solving financial problems in R Time value of money, cash flows and other financial functions.
1034 Empirical Finance FinancialMath Financial Mathematics for Actuaries Contains financial math functions and introductory derivative functions included in the Society of Actuaries and Casualty Actuarial Society ‘Financial Mathematics’ exam, and some topics in the ‘Models for Financial Economics’ exam.
1035 Empirical Finance FinAsym Classifies implicit trading activity from market quotes and computes the probability of informed trading This package accomplishes two tasks: a) it classifies implicit trading activity from quotes in OTC markets using the algorithm of Lee and Ready (1991); b) based on information for trade initiation, the package computes the probability of informed trading of Easley and O’Hara (1987).
1036 Empirical Finance finreportr Financial Data from U.S. Securities and Exchange Commission Download and display company financial data from the U.S. Securities and Exchange Commission’s EDGAR database. It contains a suite of functions with web scraping and XBRL parsing capabilities that allows users to extract data from EDGAR in an automated and scalable manner. See https://www.sec.gov/edgar/searchedgar/companysearch.html for more information.
1037 Empirical Finance fmdates Financial Market Date Calculations Implements common date calculations relevant for specifying the economic nature of financial market contracts that are typically defined by International Swaps and Derivatives Association (ISDA, http://www2.isda.org) legal documentation. This includes methods to check whether dates are business days in certain locales, functions to adjust and shift dates, and time length (or day counter) calculations.
1038 Empirical Finance fMultivar (core) Rmetrics - Analysing and Modeling Multivariate Financial Return Distributions Provides a collection of functions to manage, to investigate and to analyze bivariate and multivariate data sets of financial returns.
1039 Empirical Finance fNonlinear (core) Rmetrics - Nonlinear and Chaotic Time Series Modelling Provides a collection of functions for testing various aspects of univariate time series including independence and neglected nonlinearities. Further provides functions to investigate the chaotic behavior of time series processes and to simulate different types of chaotic time series maps.
1040 Empirical Finance fOptions (core) Rmetrics - Pricing and Evaluating Basic Options Provides a collection of functions to value basic options. This includes the generalized Black-Scholes option, options on futures and options on commodity futures.
1041 Empirical Finance forecast Forecasting Functions for Time Series and Linear Models Methods and tools for displaying and analysing univariate time series forecasts including exponential smoothing via state space models and automatic ARIMA modelling.
1042 Empirical Finance fPortfolio (core) Rmetrics - Portfolio Selection and Optimization Provides a collection of functions to optimize portfolios and to analyze them from different points of view.
1043 Empirical Finance fracdiff Fractionally Differenced ARIMA aka ARFIMA(p,d,q) Models Maximum likelihood estimation of the parameters of a fractionally differenced ARIMA(p,d,q) model (Haslett and Raftery, Appl. Statistics, 1989).
1044 Empirical Finance fractal Fractal Time Series Modeling and Analysis Stochastic fractal and deterministic chaotic time series analysis.
1045 Empirical Finance FRAPO Financial Risk Modelling and Portfolio Optimisation with R Accompanying package of the book ‘Financial Risk Modelling and Portfolio Optimisation with R’, second edition. The data sets used in the book are contained in this package.
1046 Empirical Finance fRegression (core) Rmetrics - Regression Based Decision and Prediction A collection of functions for linear and non-linear regression modelling. It implements a wrapper for several regression models available in the base and contributed packages of R.
1047 Empirical Finance frmqa The Generalized Hyperbolic Distribution, Related Distributions and Their Applications in Finance A collection of R and C++ functions to work with the generalized hyperbolic distribution, related distributions and their applications in financial risk management and quantitative analysis.
1048 Empirical Finance fTrading (core) Rmetrics - Trading and Rebalancing Financial Instruments A collection of functions for trading and rebalancing financial instruments. It implements various technical indicators to analyse time series such as moving averages or stochastic oscillators.
1049 Empirical Finance GCPM Generalized Credit Portfolio Model Analyze the default risk of credit portfolios. Commonly known models, like CreditRisk+ or the CreditMetrics model, are implemented in their very basic settings. The portfolio loss distribution can be obtained either by simulation or, in the case of the classic CreditRisk+ model, analytically. The models account only for losses caused by defaults, i.e. migration risk is not included. The package structure is kept flexible, especially with respect to distributional assumptions, in order to quantify the sensitivity of risk figures to these assumptions. Therefore the package can be used to determine the credit risk of a given portfolio as well as to quantify model sensitivities.
1050 Empirical Finance GetHFData Download and Aggregate High Frequency Trading Data from Bovespa Downloads and aggregates high frequency trading data for Brazilian instruments directly from Bovespa ftp site ftp://ftp.bmf.com.br/MarketData/.
1051 Empirical Finance gets General-to-Specific (GETS) Modelling and Indicator Saturation Methods Automated General-to-Specific (GETS) modelling of the mean and variance of a regression, and indicator saturation methods for detecting and testing for structural breaks in the mean.
1052 Empirical Finance GetTDData Get Data for Brazilian Bonds (Tesouro Direto) Downloads and aggregates data for Brazilian government issued bonds directly from the website of Tesouro Direto http://www.tesouro.fazenda.gov.br/tesouro-direto-balanco-e-estatisticas.
1053 Empirical Finance GEVStableGarch ARMA-GARCH/APARCH Models with GEV and Stable Distributions Package for simulation and estimation of ARMA-GARCH/APARCH models with GEV and stable distributions.
1054 Empirical Finance ghyp A Package on Generalized Hyperbolic Distribution and Its Special Cases Detailed functionality for working with the univariate and multivariate Generalized Hyperbolic distribution and its special cases (Hyperbolic (hyp), Normal Inverse Gaussian (NIG), Variance Gamma (VG), skewed Student-t and Gaussian distributions). In particular, it contains fitting procedures, an AIC-based model selection routine, functions for the computation of density, quantile, probability, random variates and expected shortfall, some portfolio optimization and plotting routines, as well as the likelihood ratio test. In addition, it contains the Generalized Inverse Gaussian distribution.
1055 Empirical Finance gmm Generalized Method of Moments and Generalized Empirical Likelihood A complete suite to estimate models based on moment conditions. It includes the two-step Generalized Method of Moments (Hansen 1982; doi:10.2307/1912775), the iterated GMM and continuous updated estimator (Hansen, Eaton and Yaron 1996; doi:10.2307/1392442) and several methods that belong to the Generalized Empirical Likelihood family of estimators (Smith 1997; doi:10.1111/j.0013-0133.1997.174.x, Kitamura 1997; doi:10.1214/aos/1069362388, Newey and Smith 2004; doi:10.1111/j.1468-0262.2004.00482.x, and Anatolyev 2005 doi:10.1111/j.1468-0262.2005.00601.x).
1056 Empirical Finance gogarch Generalized Orthogonal GARCH (GO-GARCH) Models Implementation of the GO-GARCH model class.
1057 Empirical Finance GUIDE GUI for DErivatives in R A nice GUI for financial DErivatives in R.
1058 Empirical Finance highfrequency Tools for Highfrequency Data Analysis Provides functionality to manage, clean and match highfrequency trades and quotes data, calculate various liquidity measures, estimate and forecast volatility, and investigate microstructure noise and intraday periodicity.
1059 Empirical Finance IBrokers R API to Interactive Brokers Trader Workstation Provides native R access to Interactive Brokers Trader Workstation API.
1060 Empirical Finance InfoTrad Calculates the Probability of Informed Trading (PIN) Estimates the probability of informed trading (PIN) initially introduced by Easley et al. (1996) doi:10.1111/j.1540-6261.1996.tb04074.x. The contribution of the package is that it uses the likelihood factorizations of Easley et al. (2010) doi:10.1017/S0022109010000074 (EHO factorization) and Lin and Ke (2011) doi:10.1016/j.finmar.2011.03.001 (LK factorization). Moreover, the package offers different estimation algorithms: specifically, the grid-search algorithm proposed by Yan and Zhang (2012) doi:10.1016/j.jbankfin.2011.08.003, and the hierarchical agglomerative clustering approach proposed by Gan et al. (2015) doi:10.1080/14697688.2015.1023336 and later extended by Ersan and Alici (2016) doi:10.1016/j.intfin.2016.04.001.
1061 Empirical Finance lgarch Simulation and Estimation of Log-GARCH Models Simulation and estimation of univariate and multivariate log-GARCH models. The main functions of the package are: lgarchSim(), mlgarchSim(), lgarch() and mlgarch(). The first two functions simulate from a univariate and a multivariate log-GARCH model, respectively, whereas the latter two estimate a univariate and multivariate log-GARCH model, respectively.
1062 Empirical Finance lifecontingencies Financial and Actuarial Mathematics for Life Contingencies Classes and methods that allow the user to manage life tables and actuarial tables (including multiple-decrement tables). Moreover, functions are provided to easily perform demographic, financial and actuarial calculations for life contingency insurance.
1063 Empirical Finance lmtest Testing Linear Regression Models A collection of tests, data sets, and examples for diagnostic checking in linear regression models. Furthermore, some generic tools for inference in parametric models are provided.
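As a quick illustration of the diagnostics that 'lmtest' provides, here is a minimal sketch (assuming the package is installed; the simulated data are hypothetical):

```r
library(lmtest)

# Simulate a simple linear model and run two standard diagnostic tests
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)
fit <- lm(y ~ x)

bptest(fit)  # Breusch-Pagan test for heteroscedasticity
dwtest(fit)  # Durbin-Watson test for autocorrelation in the residuals
```

Both functions return standard "htest" objects, so the test statistic and p-value print in the usual hypothesis-test layout.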
1064 Empirical Finance longmemo Statistics for Long-Memory Processes (Jan Beran) Data and Functions Datasets and Functionality from the textbook Jan Beran (1994). Statistics for Long-Memory Processes; Chapman & Hall.
1065 Empirical Finance LSMonteCarlo American options pricing with Least Squares Monte Carlo method The package provides functions for calculating prices of American put options with the Least Squares Monte Carlo method. The option types are plain vanilla American put, Asian American put, and Quanto American put. The pricing algorithms include variance reduction techniques such as Antithetic Variates and Control Variates. Additional functions are given to derive “price surfaces” at different volatilities and strikes, create 3-D plots, quickly generate Geometric Brownian motion, and calculate prices of European options with the Black & Scholes analytical solution.
1066 Empirical Finance maRketSim Market simulator for R maRketSim is a market simulator for R. It was initially designed around the bond market, with plans to expand to stocks. maRketSim is built around the idea of portfolios of fundamental objects. Therefore it is slow in its current incarnation, but allows you the flexibility of seeing exactly what is in your final results, since the objects are retained.
1067 Empirical Finance markovchain Easy Handling Discrete Time Markov Chains Functions and S4 methods to create and manage discrete time Markov chains more easily. In addition, functions to perform statistical (fitting and drawing random variates) and probabilistic (analysis of their structural properties) analysis are provided.
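A minimal sketch of the S4 interface (the two-state chain below is a made-up example, not from the package documentation):

```r
library(markovchain)

# Define a two-state chain with a transition matrix (rows sum to 1)
mc <- new("markovchain",
          states = c("bull", "bear"),
          transitionMatrix = matrix(c(0.9, 0.1,
                                      0.2, 0.8),
                                    nrow = 2, byrow = TRUE))

steadyStates(mc)                  # long-run state distribution
rmarkovchain(n = 5, object = mc)  # draw a random path of length 5
```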
1068 Empirical Finance MarkowitzR Statistical Significance of the Markowitz Portfolio A collection of tools for analyzing significance of Markowitz portfolios.
1069 Empirical Finance matchingMarkets Analysis of Stable Matchings Implements structural estimators to correct for the sample selection bias from observed outcomes in matching markets. This includes one-sided matching of agents into groups as well as two-sided matching of students to schools. The package also contains algorithms to find stable matchings in the three most common matching problems: the stable roommates problem, the college admissions problem, and the house allocation problem.
1070 Empirical Finance MSBVAR Markov-Switching, Bayesian, Vector Autoregression Models Provides methods for estimating frequentist and Bayesian Vector Autoregression (VAR) models and Markov-switching Bayesian VAR (MSBVAR). Functions for reduced form and structural VAR models are also available. Includes methods for generating posterior inferences for these models, forecasts, impulse responses (using likelihood-based error bands), and forecast error decompositions. Also includes utility functions for plotting forecasts and impulse responses, and generating draws from Wishart and singular multivariate normal densities. Current version includes functionality to build and evaluate models with Markov switching.
1071 Empirical Finance MSGARCH Markov-Switching GARCH Models Fit (by Maximum Likelihood or MCMC/Bayesian), simulate, and forecast various Markov-Switching GARCH models as described in Ardia et al. (2017) https://ssrn.com/abstract=2845809.
1072 Empirical Finance mvtnorm Multivariate Normal and t Distributions Computes multivariate normal and t probabilities, quantiles, random deviates and densities.
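For example, an orthant probability under a bivariate normal (a standard textbook case, not specific to finance):

```r
library(mvtnorm)

# P(X1 <= 0, X2 <= 0) for a standard bivariate normal with correlation 0.5;
# the closed form is 1/4 + asin(0.5)/(2*pi), roughly 0.333
sigma <- matrix(c(1,   0.5,
                  0.5, 1), nrow = 2)
pmvnorm(upper = c(0, 0), sigma = sigma)

dmvnorm(c(0, 0), sigma = sigma)  # joint density at the origin
rmvnorm(3, sigma = sigma)        # three random draws
```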
1073 Empirical Finance NetworkRiskMeasures Risk Measures for (Financial) Networks Implements some risk measures for (financial) networks, such as DebtRank, Impact Susceptibility, Impact Diffusion and Impact Fluidity.
1074 Empirical Finance nlme Linear and Nonlinear Mixed Effects Models Fit and compare Gaussian linear and nonlinear mixed-effects models.
1075 Empirical Finance NMOF Numerical Methods and Optimization in Finance Functions, examples and data from the book “Numerical Methods and Optimization in Finance” by M. ‘Gilli’, D. ‘Maringer’ and E. Schumann (2011), ISBN 978-0123756626. The package provides implementations of several optimisation heuristics, such as Differential Evolution, Genetic Algorithms and Threshold Accepting. There are also functions for the valuation of financial instruments, such as bonds and options, and functions that help with stochastic simulations.
1076 Empirical Finance obAnalytics Limit Order Book Analytics Data processing, visualisation and analysis of Limit Order Book event data.
1077 Empirical Finance opefimor Option Pricing and Estimation of Financial Models in R Companion package to the book Option Pricing and Estimation of Financial Models in R, Wiley, Chichester. ISBN: 978-0-470-74584-7.
1078 Empirical Finance OptHedging Estimation of value and hedging strategy of call and put options Estimation of value and hedging strategy of call and put options, based on optimal hedging and Monte Carlo method, from Chapter 3 of ‘Statistical Methods for Financial Engineering’, by Bruno Remillard, CRC Press, (2013).
1079 Empirical Finance OptionPricing Option Pricing with Efficient Simulation Algorithms Efficient Monte Carlo Algorithms for the price and the sensitivities of Asian and European Options under Geometric Brownian Motion.
1080 Empirical Finance pa Performance Attribution for Equity Portfolios A package that provides tools for conducting performance attribution for equity portfolios. The package uses two methods: the Brinson method and a regression-based analysis.
1081 Empirical Finance parma Portfolio Allocation and Risk Management Applications Provision of a set of models and methods for use in the allocation and management of capital in financial portfolios.
1082 Empirical Finance pbo Probability of Backtest Overfitting Following the method of Bailey et al., computes for a collection of candidate models the probability of backtest overfitting, the performance degradation and probability of loss, and the stochastic dominance.
1083 Empirical Finance PerformanceAnalytics (core) Econometric tools for performance and risk analysis Collection of econometric functions for performance and risk analysis. This package aims to aid practitioners and researchers in utilizing the latest research in analysis of non-normal return streams. In general, it is most tested on return (rather than price) data on a regular scale, but most functions will work with irregular return data as well, and increasing numbers of functions will work with P&L or price data where possible.
1084 Empirical Finance pinbasic Fast and Stable Estimation of the Probability of Informed Trading (PIN) Utilities for fast and stable estimation of the probability of informed trading (PIN) in the model introduced by Easley et al. (2002) doi:10.1111/1540-6261.00493 are implemented. Since the basic model developed by Easley et al. (1996) doi:10.1111/j.1540-6261.1996.tb04074.x is nested in the former due to equating the intensity of uninformed buys and sells, functions can also be applied to this simpler model structure, if needed. State-of-the-art factorization of the model likelihood function as well as most recent algorithms for generating initial values for optimization routines are implemented. In total, two likelihood factorizations and three methodologies for starting values are included. Furthermore, functions for simulating datasets of daily aggregated buys and sells, calculating confidence intervals for the probability of informed trading and posterior probabilities of trading days’ conditions are available.
1085 Empirical Finance portfolio Analysing equity portfolios Classes for analysing and implementing equity portfolios.
1086 Empirical Finance PortfolioEffectHFT High Frequency Portfolio Analytics by PortfolioEffect R interface to PortfolioEffect cloud service for backtesting high frequency trading (HFT) strategies, intraday portfolio analysis and optimization. Includes auto-calibrating model pipeline for market microstructure noise, risk factors, price jumps/outliers, tail risk (high-order moments) and price fractality (long memory). Constructed portfolios could use client-side market data or access HF intraday price history for all major US Equities. See https://www.portfolioeffect.com/ for more information on the PortfolioEffect high frequency portfolio analytics platform.
1087 Empirical Finance PortfolioOptim Small/Large Sample Portfolio Optimization Two functions for financial portfolio optimization by linear programming are provided. One function implements Benders decomposition algorithm and can be used for very large data sets. The other, applicable for moderate sample sizes, finds optimal portfolio which has the smallest distance to a given benchmark portfolio.
1088 Empirical Finance portfolioSim Framework for simulating equity portfolio strategies Classes that serve as a framework for designing equity portfolio simulations.
1089 Empirical Finance PortRisk Portfolio Risk Analysis Risk Attribution of a portfolio with Volatility Risk Analysis.
1090 Empirical Finance quantmod Quantitative Financial Modelling Framework Specify, build, trade, and analyse quantitative financial trading strategies.
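A minimal sketch of the quantmod workflow; getSymbols() normally fetches live data over the internet, so this example substitutes the OHLC sample series bundled with 'xts':

```r
library(quantmod)

# getSymbols("AAPL") would download live quotes; use bundled sample data instead
data(sample_matrix, package = "xts")
prices <- as.xts(sample_matrix)

head(dailyReturn(prices))               # simple daily returns from the close
chartSeries(prices, TA = "addSMA(20)")  # candlestick chart with a 20-day SMA
```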
1091 Empirical Finance QuantTools Enhanced Quantitative Trading Modelling Download and organize historical market data from multiple sources like Yahoo (https://finance.yahoo.com), Google (https://www.google.com/finance), Finam (https://www.finam.ru/profile/moex-akcii/sberbank/export/), MOEX (https://www.moex.com/en/derivatives/contracts.aspx) and IQFeed (https://www.iqfeed.net/symbolguide/index.cfm?symbolguide=lookup). Code your trading algorithms in modern C++11 with a powerful event-driven tick processing API, including trading costs and exchange communication latency, and transform detailed data seamlessly into R. In just a few lines of code you will be able to visualize every step of your trading model, from tick data to multi-dimensional heat maps.
1092 Empirical Finance ragtop Pricing Equity Derivatives with Extensions of Black-Scholes Algorithms to price American and European equity options, convertible bonds and a variety of other financial derivatives. It uses an extension of the usual Black-Scholes model in which jump to default may occur at a probability specified by a power-law link between stock price and hazard rate as found in the paper by Takahashi, Kobayashi, and Nakagawa (2001) doi:10.3905/jfi.2001.319302. We use ideas and techniques from Andersen and Buffum (2002) doi:10.2139/ssrn.355308 and Linetsky (2006) doi:10.1111/j.1467-9965.2006.00271.x.
1093 Empirical Finance Rbitcoin R & bitcoin integration Utilities related to Bitcoin. Unified markets API interface (bitstamp, kraken, btce, bitmarket). Both public and private API calls. Integration of data structures for all markets. Supports SSL. Read Rbitcoin documentation (command: ?btc) for more information.
1094 Empirical Finance Rblpapi R Interface to ‘Bloomberg’ An R Interface to ‘Bloomberg’ is provided via the ‘Blp API’.
1095 Empirical Finance Rcmdr R Commander A platform-independent basic-statistics GUI (graphical user interface) for R, based on the tcltk package.
1096 Empirical Finance RcppQuantuccia R Bindings to the ‘Quantuccia’ Header-Only Essentials of ‘QuantLib’ ‘QuantLib’ bindings are provided for R using ‘Rcpp’ and the header-only ‘Quantuccia’ variant (put together by Peter Caspers) offering an essential subset of ‘QuantLib’. See the included file ‘AUTHORS’ for a full list of contributors to both ‘QuantLib’ and ‘Quantuccia’.
1097 Empirical Finance restimizeapi Functions for Working with the ‘www.estimize.com’ Web Services Provides the user with functions to develop their trading strategy, uncover actionable trading ideas, and monitor consensus shifts with crowdsourced earnings and economic estimate data directly from <www.estimize.com>. Further information regarding the web services this package invokes can be found at <www.estimize.com/api>.
1098 Empirical Finance riskSimul Risk Quantification for Stock Portfolios under the T-Copula Model Implements efficient simulation procedures to estimate tail loss probabilities and conditional excess for a stock portfolio. The log-returns are assumed to follow a t-copula model with generalized hyperbolic or t marginals.
1099 Empirical Finance rmgarch Multivariate GARCH Models Feasible multivariate GARCH models including DCC, GO-GARCH and Copula-GARCH.
1100 Empirical Finance RND Risk Neutral Density Extraction Package Extract the implied risk neutral density from options using various methods.
1101 Empirical Finance rpatrec Recognising Visual Charting Patterns in Time Series Data Generating visual charting patterns and noise, smoothing to find a signal in noisy time series and enabling users to apply their findings to real life data.
1102 Empirical Finance rpgm Fast Simulation of Normal/Exponential Random Variables and Stochastic Differential Equations / Poisson Processes Faster simulation of some random variables than with the usual native functions, including rnorm() and rexp(), using the Ziggurat method; reference: MARSAGLIA, George, TSANG, Wai Wan, et al. (2000) doi:10.18637/jss.v005.i08. Also fast simulation of stochastic differential equations / Poisson processes.
1103 Empirical Finance RQuantLib R Interface to the ‘QuantLib’ Library The ‘RQuantLib’ package makes parts of ‘QuantLib’ accessible from R. The ‘QuantLib’ project aims to provide a comprehensive software framework for quantitative finance. The goal is to provide a standard open source library for quantitative analysis, modeling, trading, and risk management of financial assets.
1104 Empirical Finance rugarch (core) Univariate GARCH Models ARFIMA, in-mean, external regressors and various GARCH flavors, with methods for fit, forecast, simulation, inference and plotting.
1105 Empirical Finance rwt Rice Wavelet Toolbox wrapper Provides a set of functions for performing digital signal processing.
1106 Empirical Finance sandwich Robust Covariance Matrix Estimators Model-robust standard error estimators for cross-sectional, time series, clustered, panel, and longitudinal data.
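A minimal sketch of heteroscedasticity-consistent standard errors, combined with coeftest() from 'lmtest' (the simulated data are hypothetical):

```r
library(sandwich)
library(lmtest)

# Simulate a regression whose errors have non-constant variance
set.seed(42)
x <- rnorm(200)
y <- 1 + x + rnorm(200, sd = abs(x))
fit <- lm(y ~ x)

coeftest(fit, vcov = vcovHC(fit, type = "HC3"))  # robust (HC3) inference
```

The same vcov argument pattern works with other estimators in the package, such as vcovHAC() for autocorrelation-robust covariances.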
1107 Empirical Finance sde Simulation and Inference for Stochastic Differential Equations Companion package to the book Simulation and Inference for Stochastic Differential Equations With R Examples, ISBN 978-0-387-75838-1, Springer, NY.
1108 Empirical Finance SharpeR Statistical Significance of the Sharpe Ratio A collection of tools for analyzing significance of trading strategies, based on the Sharpe ratio and overfit of the same.
1109 Empirical Finance sharpeRratio Moment-Free Estimation of Sharpe Ratios An efficient moment-free estimator of the Sharpe ratio, or signal-to-noise ratio, for heavy-tailed data (see https://arxiv.org/abs/1505.01333).
1110 Empirical Finance Sim.DiffProc Simulation of Diffusion Processes A package for symbolic and numerical computations on scalar and multivariate systems of stochastic differential equations. It provides users with a wide range of tools to simulate, estimate, analyze, and visualize the dynamics of these systems in both the Ito and Stratonovich forms. Statistical analysis of SDEs is supported via parallel Monte Carlo and moment-equation methods. These tools have enabled researchers in different domains to use such equations to model practical problems in financial and actuarial modelling and other areas of application, e.g., modelling and simulation of the first-passage-time problem in shallow water using the attractive center (Boukhetala K, 1996).
1111 Empirical Finance SmithWilsonYieldCurve Smith-Wilson Yield Curve Construction Constructs a yield curve by the Smith-Wilson method from a table of LIBOR and SWAP rates.
1112 Empirical Finance stochvol Efficient Bayesian Inference for Stochastic Volatility (SV) Models Efficient algorithms for fully Bayesian estimation of stochastic volatility (SV) models via Markov chain Monte Carlo (MCMC) methods.
1113 Empirical Finance strucchange Testing, Monitoring, and Dating Structural Changes Testing, monitoring and dating structural changes in (linear) regression models. strucchange features tests/methods from the generalized fluctuation test framework as well as from the F test (Chow test) framework. This includes methods to fit, plot and test fluctuation processes (e.g., CUSUM, MOSUM, recursive/moving estimates) and F statistics, respectively. It is possible to monitor incoming data online using fluctuation processes. Finally, the breakpoints in regression models with structural changes can be estimated together with confidence intervals. Emphasis is always given to methods for visualizing the data.
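A minimal sketch of break-date estimation for a mean shift (the simulated series is hypothetical):

```r
library(strucchange)

# A level shift after observation 50
set.seed(1)
y <- c(rnorm(50, mean = 0), rnorm(50, mean = 3))

bp <- breakpoints(y ~ 1)  # estimate break dates in a constant-mean model
summary(bp)
confint(bp)               # confidence interval for the break date
```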
1114 Empirical Finance TAQMNGR Manage Tick-by-Tick Transaction Data Manager of tick-by-tick transaction data that performs ‘cleaning’, ‘aggregation’ and ‘import’ in an efficient and fast way. The package engine, written in C++, exploits the ‘zlib’ and ‘gzstream’ libraries to handle gzipped data without need to uncompress them. ‘Cleaning’ and ‘aggregation’ are performed according to Brownlees and Gallo (2006) doi:10.1016/j.csda.2006.09.030. Currently, TAQMNGR processes raw data from WRDS (Wharton Research Data Service, https://wrds-web.wharton.upenn.edu/wrds/).
1115 Empirical Finance tawny Clean Covariance Matrices Using Random Matrix Theory and Shrinkage Estimators for Portfolio Optimization Portfolio optimization typically requires an estimate of a covariance matrix of asset returns. There are many approaches for constructing such a covariance matrix, some using the sample covariance matrix as a starting point. This package provides implementations for two such methods: random matrix theory and shrinkage estimation. Each method attempts to clean or remove noise related to the sampling process from the sample covariance matrix.
1116 Empirical Finance termstrc Zero-coupon Yield Curve Estimation The package offers a wide range of functions for term structure estimation based on static and dynamic coupon bond and yield data sets. The implementation focuses on the cubic splines approach of McCulloch (1971, 1975) and the Nelson and Siegel (1987) method with extensions by Svensson (1994), Diebold and Li (2006) and De Pooter (2007). We propose a weighted constrained optimization procedure with analytical gradients and a globally optimal start parameter search algorithm. Extensive summary statistics and plots are provided to compare the results of the different estimation methods. Several demos are available using data from European government bonds and yields.
1117 Empirical Finance TFX R API to TrueFX(tm) Connects R to TrueFX(tm) for free streaming real-time and historical tick-by-tick market data for dealable interbank foreign exchange rates with millisecond detail.
1118 Empirical Finance tidyquant Tidy Quantitative Financial Analysis Bringing financial analysis to the ‘tidyverse’. The ‘tidyquant’ package provides a convenient wrapper to various ‘xts’, ‘zoo’, ‘quantmod’, ‘TTR’ and ‘PerformanceAnalytics’ package functions and returns the objects in the tidy ‘tibble’ format. The main advantage is being able to use quantitative functions with the ‘tidyverse’ functions including ‘purrr’, ‘dplyr’, ‘tidyr’, ‘ggplot2’, ‘lubridate’, etc. See the ‘tidyquant’ website for more information, documentation and examples.
1119 Empirical Finance timeDate (core) Rmetrics - Chronological and Calendar Objects The ‘timeDate’ class fulfils the conventions of the ISO 8601 standard as well as of the ANSI C and POSIX standards. Beyond these standards it provides the “Financial Center” concept, which allows one to handle data records collected in different time zones and to merge them so that the time stamps are always correct with respect to your personal financial center, or alternatively to the GMT reference time. It can thus also handle time stamps from historical data records from the same time zone, even if the financial centers changed daylight saving time at different calendar dates.
1120 Empirical Finance timeSeries (core) Rmetrics - Financial Time Series Objects Provides a class and various tools for financial time series. This includes basic functions such as scaling and sorting, subsetting, mathematical operations and statistical functions.
1121 Empirical Finance timsac Time Series Analysis and Control Package Functions for statistical analysis, prediction and control of time series.
1122 Empirical Finance tis Time Indexes and Time Indexed Series Functions and S3 classes for time indexes and time indexed series, which are compatible with FAME frequencies.
1123 Empirical Finance TSdbi Time Series Database Interface Provides a common interface to time series databases. The objective is to define a standard interface so users can retrieve time series data from various sources with a simple, common, set of commands, and so programs can be written to be portable with respect to the data source. The SQL implementations also provide a database table design, so users needing to set up a time series database have a reasonably complete way to do this easily. The interface provides for a variety of options with respect to the representation of time series in R. The interface, and the SQL implementations, also handle vintages of time series data (sometimes called editions or real-time data). There is also a (not yet well tested) mechanism to handle multilingual data documentation. Comprehensive examples of all the ‘TS*’ packages are provided in the vignette Guide.pdf with the ‘TSdata’ package.
1124 Empirical Finance tsDyn Nonlinear Time Series Models with Regime Switching Implements nonlinear autoregressive (AR) time series models. For univariate series, a non-parametric approach is available through additive nonlinear AR. Parametric modeling and testing for regime switching dynamics is available when the transition is either direct (TAR: threshold AR) or smooth (STAR: smooth transition AR, LSTAR). For multivariate series, one can estimate a range of TVAR or threshold cointegration TVECM models with two or three regimes. Tests can be conducted for TVAR as well as for TVECM (Hansen and Seo 2002 and Seo 2006).
1125 Empirical Finance tseries (core) Time Series Analysis and Computational Finance Time series analysis and computational finance.
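A minimal sketch of a unit-root test from 'tseries' (the simulated series is hypothetical):

```r
library(tseries)

# A random walk has a unit root; its first differences are white noise
set.seed(1)
rw <- cumsum(rnorm(250))

adf.test(rw)        # Augmented Dickey-Fuller: should not reject a unit root
adf.test(diff(rw))  # differenced series: should reject the unit root
```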
1126 Empirical Finance tseriesChaos Analysis of nonlinear time series Routines for the analysis of nonlinear time series. This work is largely inspired by the TISEAN project, by Rainer Hegger, Holger Kantz and Thomas Schreiber: http://www.mpipks-dresden.mpg.de/~tisean/
1127 Empirical Finance tsfa Time Series Factor Analysis Extraction of Factors from Multivariate Time Series. See ?00tsfa-Intro for more details.
1128 Empirical Finance TTR Technical Trading Rules Functions and data to construct technical trading rules with R.
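A minimal sketch of two common TTR indicators on a made-up price path:

```r
library(TTR)

set.seed(1)
price <- 100 + cumsum(rnorm(100))  # hypothetical price series

sma20 <- SMA(price, n = 20)  # 20-period simple moving average
rsi14 <- RSI(price, n = 14)  # 14-period relative strength index
tail(cbind(price, sma20, rsi14))
```

Indicators return NA until enough observations have accumulated to fill the lookback window.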
1129 Empirical Finance tvm Time Value of Money Functions Functions for managing cashflows and interest rate curves.
1130 Empirical Finance urca (core) Unit Root and Cointegration Tests for Time Series Data Unit root and cointegration tests encountered in applied econometric analysis are implemented.
1131 Empirical Finance vars VAR Modelling Estimation, lag selection, diagnostic testing, forecasting, causality analysis, forecast error variance decomposition and impulse response functions of VAR models and estimation of SVAR and SVEC models.
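A minimal sketch using the Canada macro dataset shipped with 'vars':

```r
library(vars)

data(Canada)  # quarterly Canadian macroeconomic series bundled with 'vars'

VARselect(Canada, lag.max = 4, type = "const")  # lag-order selection criteria
fit <- VAR(Canada, p = 2, type = "const")       # estimate a VAR(2)
summary(fit)
irf(fit, impulse = "e", response = "prod", n.ahead = 8)  # impulse response
```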
1132 Empirical Finance VarSwapPrice Pricing a variance swap on an equity index Computes a portfolio of European options that replicates the cost of capturing the realised variance of an equity index.
1133 Empirical Finance vrtest Variance Ratio tests and other tests for Martingale Difference Hypothesis A collection of statistical tests for the martingale difference hypothesis.
1134 Empirical Finance wavelets A package of functions for computing wavelet filters, wavelet transforms and multiresolution analyses This package contains functions for computing and plotting discrete wavelet transforms (DWT) and maximal overlap discrete wavelet transforms (MODWT), as well as their inverses. Additionally, it contains functionality for computing and plotting wavelet transform filters that are used in the above decompositions as well as multiresolution analyses.
1135 Empirical Finance waveslim Basic wavelet routines for one-, two- and three-dimensional signal processing Basic wavelet routines for time series (1D), image (2D) and array (3D) analysis. The code provided here is based on wavelet methodology developed in Percival and Walden (2000); Gencay, Selcuk and Whitcher (2001); the dual-tree complex wavelet transform (DTCWT) from Kingsbury (1999, 2001) as implemented by Selesnick; and Hilbert wavelet pairs (Selesnick 2001, 2002). All figures in chapters 4-7 of GSW (2001) are reproducible using this package and R code available at the book website(s) below.
1136 Empirical Finance wavethresh Wavelets Statistics and Transforms Performs 1, 2 and 3D real and complex-valued wavelet transforms, nondecimated transforms, wavelet packet transforms, nondecimated wavelet packet transforms, multiple wavelet transforms, complex-valued wavelet transforms, wavelet shrinkage for various kinds of data, locally stationary wavelet time series, nonstationary multiscale transfer function modeling, density estimation.
1137 Empirical Finance XBRL Extraction of Business Financial Information from ‘XBRL’ Documents Functions to extract business financial information from an Extensible Business Reporting Language (‘XBRL’) instance file and the associated collection of files that defines its ‘Discoverable’ Taxonomy Set (‘DTS’).
1138 Empirical Finance xts (core) eXtensible Time Series Provide for uniform handling of R’s different time-based data classes by extending zoo, maximizing native format information preservation and allowing for user level customization and extension, while simplifying cross-class interoperability.
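A minimal sketch of xts construction and index-aware operations (the values and dates are made up):

```r
library(xts)

dates <- as.Date("2020-01-01") + 0:4
x <- xts(c(1, 3, 2, 5, 4), order.by = dates)

lag(x, k = 1)          # shift values forward one period; index is preserved
merge(x, lag(x))       # align both series on the shared date index
apply.weekly(x, mean)  # aggregate to weekly means
```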
1139 Empirical Finance ycinterextra Yield curve or zero-coupon prices interpolation and extrapolation Yield curve or zero-coupon prices interpolation and extrapolation using the Nelson-Siegel, Svensson, Smith-Wilson models, and Hermite cubic splines.
1140 Empirical Finance YieldCurve Modelling and estimation of the yield curve Modelling the yield curve with some parametric models. The models implemented are: Nelson-Siegel, Diebold-Li and Svensson. The package also includes the data of the term structure of interest rate of Federal Reserve Bank and European Central Bank.
1141 Empirical Finance Zelig Everyone’s Statistical Software A framework that brings together an abundance of common statistical models found across packages into a unified interface, and provides a common architecture for estimation and interpretation, as well as bridging functions to absorb increasingly more models into the package. Zelig allows each individual package, for each statistical model, to be accessed by a common uniformly structured call and set of arguments. Moreover, Zelig automates all the surrounding building blocks of a statistical workflow: procedures and algorithms that may be essential to one user’s application but which the original package developer did not use in their own research and might not themselves support. These include bootstrapping, jackknifing, and re-weighting of data. In particular, Zelig automatically generates predicted and simulated quantities of interest (such as relative risk ratios, average treatment effects, first differences and predicted and expected values) to interpret and visualize complex models.
1142 Empirical Finance zoo (core) S3 Infrastructure for Regular and Irregular Time Series (Z’s Ordered Observations) An S3 class with methods for totally ordered indexed observations. It is particularly aimed at irregular time series of numeric vectors/matrices and factors. zoo’s key design goals are independence of a particular index/date/time class and consistency with ts and base R by providing methods to extend standard generics.
1143 Functional Data Analysis classiFunc Classification of Functional Data Efficient implementation of k-nearest neighbor estimator and a kernel estimator for functional data classification.
1144 Functional Data Analysis covsep Tests for Determining if the Covariance Structure of 2-Dimensional Data is Separable Functions for testing if the covariance structure of 2-dimensional data (e.g. samples of surfaces X_i = X_i(s,t)) is separable, i.e. if covariance(X) = C_1 x C_2. A complete description of the implemented tests can be found in the paper arXiv:1505.02023.
1145 Functional Data Analysis dbstats Distance-Based Statistics Prediction methods where explanatory information is coded as a matrix of distances between individuals. Distances can either be input directly as a distance matrix, a squared-distance matrix or an inner-products matrix, or computed from observed predictors.
1146 Functional Data Analysis denseFLMM Functional Linear Mixed Models for Densely Sampled Data Estimation of functional linear mixed models for densely sampled data based on functional principal component analysis.
1147 Functional Data Analysis fda (core) Functional Data Analysis These functions were developed to support functional data analysis as described in Ramsay, J. O. and Silverman, B. W. (2005) Functional Data Analysis. New York: Springer. They were ported from earlier versions in Matlab and S-PLUS. An introduction appears in Ramsay, J. O., Hooker, Giles, and Graves, Spencer (2009) Functional Data Analysis with R and Matlab (Springer). The package includes data sets and script files working many examples including all but one of the 76 figures in this latter book. Matlab versions of the code and sample analyses are no longer distributed through CRAN, as they were when the book was published. For those, ftp from http://www.psych.mcgill.ca/misc/fda/downloads/FDAfuns/ There you find a set of .zip files containing the functions and sample analyses, as well as two .txt files giving instructions for installation and some additional information. The changes from Version 2.4.1 are fixes of bugs in density.fd and removal of functions create.polynomial.basis, polynompen, and polynomial. These were deleted because the monomial basis does the same thing and because there were errors in the code.
1148 Functional Data Analysis fda.usc (core) Functional Data Analysis and Utilities for Statistical Computing Routines for exploratory and descriptive analysis of functional data such as depth measurements, atypical curve detection, regression models, supervised classification, unsupervised classification and functional analysis of variance.
1149 Functional Data Analysis fdakma Functional Data Analysis: K-Mean Alignment Performs simultaneous clustering and alignment of a unidimensional or multidimensional functional dataset by means of k-mean alignment.
1150 Functional Data Analysis fdapace (core) Functional Data Analysis and Empirical Dynamics Provides implementation of various methods of Functional Data Analysis (FDA) and Empirical Dynamics. The core of this package is Functional Principal Component Analysis (FPCA), a key technique for functional data analysis, for sparsely or densely sampled random trajectories and time courses, via the Principal Analysis by Conditional Estimation (PACE) algorithm or numerical integration. PACE is useful for the analysis of data that have been generated by a sample of underlying (but usually not fully observed) random trajectories. It does not rely on pre-smoothing of trajectories, which is problematic if functional data are sparsely sampled. PACE provides options for functional regression and correlation, for Longitudinal Data Analysis, the analysis of stochastic processes from samples of realized trajectories, and for the analysis of underlying dynamics. The core computational algorithms are implemented using the ‘Eigen’ C++ library for numerical linear algebra and ‘RcppEigen’ “glue”.
1151 Functional Data Analysis fdaPDE Functional Data Analysis and Partial Differential Equations; Statistical Analysis of Functional and Spatial Data, Based on Regression with Partial Differential Regularizations An implementation of regression models with partial differential regularizations, making use of the Finite Element Method. The models efficiently handle data distributed over irregularly shaped domains and can comply with various conditions at the boundaries of the domain. A priori information about the spatial structure of the phenomenon under study can be incorporated in the model via the differential regularization.
1152 Functional Data Analysis fdasrvf (core) Elastic Functional Data Analysis Performs alignment, PCA, and modeling of multidimensional and unidimensional functions using the square-root velocity framework (Srivastava et al., 2011 <arXiv:1103.3817> and Tucker et al., 2014 doi:10.1016/j.csda.2012.12.001). This framework allows for elastic analysis of functional data through phase and amplitude separation.
1153 Functional Data Analysis fdatest Interval Testing Procedure for Functional Data Implementation of the Interval Testing Procedure for functional data in different frameworks (i.e., one or two-population frameworks, functional linear models) by means of different basis expansions (i.e., B-spline, Fourier, and phase-amplitude Fourier). The current version of the package requires functional data evaluated on a uniform grid; it automatically projects each function on a chosen functional basis; it performs the entire family of multivariate tests; and, finally, it provides the matrix of the p-values of the previous tests and the vector of the corrected p-values. The functional basis, the coupled or uncoupled scenario, and the kind of test can be chosen by the user. The package also provides a plotting function that creates a graphical output of the procedure: the p-value heat-map, the plot of the corrected p-values, and the plot of the functional data.
1154 Functional Data Analysis FDboost (core) Boosting Functional Regression Models Regression models for functional data, i.e., scalar-on-function, function-on-scalar and function-on-function regression models, are fitted by a component-wise gradient boosting algorithm.
1155 Functional Data Analysis fdcov Analysis of Covariance Operators Provides a variety of tools for the analysis of covariance operators.
1156 Functional Data Analysis fds Functional Data Sets Functional data sets.
1157 Functional Data Analysis flars Functional LARS Variable selection algorithm for functional linear regression with scalar response variable and mixed scalar/functional predictors.
1158 Functional Data Analysis fpca Restricted MLE for Functional Principal Components Analysis A geometric approach to maximum likelihood estimation for functional principal components.
1159 Functional Data Analysis freqdom Frequency Domain Based Analysis: Dynamic PCA Implementation of dynamic principal component analysis (DPCA), simulation of VAR and VMA processes and frequency domain tools. These frequency domain methods for dimensionality reduction of multivariate time series were introduced by David Brillinger in his book Time Series (1974). We follow implementation guidelines as described in Hormann, Kidzinski and Hallin (2016), Dynamic Functional Principal Components, doi:10.1111/rssb.12076.
1160 Functional Data Analysis freqdom.fda Functional Time Series: Dynamic Functional Principal Components Implementations of functional dynamic principal components analysis, with related graphic tools and frequency domain methods. These methods directly use the multivariate dynamic principal components implementation, following the guidelines from Hormann, Kidzinski and Hallin (2016), Dynamic Functional Principal Components, doi:10.1111/rssb.12076.
1161 Functional Data Analysis ftsa (core) Functional Time Series Analysis Functions for visualizing, modeling, forecasting and hypothesis testing of functional time series.
1162 Functional Data Analysis ftsspec Spectral Density Estimation and Comparison for Functional Time Series Functions for estimating spectral density operator of functional time series (FTS) and comparing the spectral density operator of two functional time series, in a way that allows detection of differences of the spectral density operator in frequencies and along the curve length.
1163 Functional Data Analysis Funclustering A package for functional data clustering This package proposes a model-based clustering algorithm for multivariate functional data. The parametric mixture model, based on the assumption of normality of the principal components resulting from a multivariate functional PCA, is estimated by an EM-like algorithm. The main advantage of the proposed algorithm is its ability to take into account the dependence among curves.
1164 Functional Data Analysis funcy (core) Functional Clustering Algorithms Unified framework to cluster functional data according to one of seven models. All models are based on the projection of the curves onto a basis. The main function funcit() calls wrapper functions for the existing algorithms, so that input parameters are the same. A list is returned with each entry representing the same or extended output for the corresponding method. Method specific as well as general visualization tools are available.
1165 Functional Data Analysis funData An S4 Class for Functional Data S4 classes for univariate and multivariate functional data with utility functions.
1166 Functional Data Analysis funFEM Clustering in the Discriminative Functional Subspace The funFEM algorithm (Bouveyron et al., 2014) allows clustering of functional data by modeling the curves within a common and discriminative functional subspace.
1167 Functional Data Analysis funHDDC Model-based clustering in group-specific functional subspaces The package provides the funHDDC algorithm (Bouveyron & Jacques, 2011), which allows clustering of functional data by modeling each group within a specific functional subspace.
1168 Functional Data Analysis geofd Spatial Prediction for Function Value Data Kriging based methods are used for predicting functional data (curves) with spatial dependence.
1169 Functional Data Analysis GPFDA Apply Gaussian Processes in Functional Data Analysis Uses functional regression as the mean structure and a Gaussian process as the covariance structure.
1170 Functional Data Analysis growfunctions Bayesian Non-Parametric Dependent Models for Time-Indexed Functional Data Estimates a collection of time-indexed functions under either of Gaussian process (GP) or intrinsic Gaussian Markov random field (iGMRF) prior formulations where a Dirichlet process mixture allows sub-groupings of the functions to share the same covariance or precision parameters. The GP and iGMRF formulations both support any number of additive covariance or precision terms, respectively, expressing either or both of multiple trend and seasonality.
1171 Functional Data Analysis MFPCA Multivariate Functional Principal Component Analysis for Data Observed on Different Dimensional Domains Calculate a multivariate functional principal component analysis for data observed on different dimensional domains. The estimation algorithm relies on univariate basis expansions for each element of the multivariate functional data. Multivariate and univariate functional data objects are represented by S4 classes for this type of data implemented in the package ‘funData’.
1172 Functional Data Analysis pcdpca Dynamic Principal Components for Periodically Correlated Functional Time Series Extends multivariate and functional dynamic principal components to periodically correlated multivariate time series. This package allows you to compute true dynamic principal components in the presence of periodicity. We follow implementation guidelines as described in Kidzinski, Kokoszka and Jouzdani (2017), Principal component analysis of periodically correlated functional time series <arXiv:1612.00040>.
1173 Functional Data Analysis rainbow Rainbow Plots, Bagplots and Boxplots for Functional Data Functions and data sets for functional data display and outlier detection.
1174 Functional Data Analysis refund (core) Regression with Functional Data Methods for regression for functional data, including function-on-scalar, scalar-on-function, and function-on-function regression. Some of the functions are applicable to image data.
1175 Functional Data Analysis refund.shiny Interactive Plotting for Functional Data Analyses Interactive plotting for functional data analyses.
1176 Functional Data Analysis refund.wave Wavelet-Domain Regression with Functional Data Methods for regressing scalar responses on functional or image predictors, via transformation to the wavelet domain and back.
1177 Functional Data Analysis RFgroove Importance Measure and Selection for Groups of Variables with Random Forests Variable selection tools for groups of variables and functional data based on a new grouped variable importance with random forests.
1178 Functional Data Analysis roahd Robust Analysis of High Dimensional Data A collection of methods for the robust analysis of univariate and multivariate functional data, possibly in high-dimensional cases, and hence with attention to computational efficiency and simplicity of use.
1179 Functional Data Analysis sparseFLMM Functional Linear Mixed Models for Irregularly or Sparsely Sampled Data Estimation of functional linear mixed models for irregularly or sparsely sampled data based on functional principal component analysis.
1180 Functional Data Analysis switchnpreg Switching nonparametric regression models for a single curve and functional data Functions for estimating the parameters from the latent state process and the functions corresponding to the J states as proposed by De Souza and Heckman (2013).
1181 Functional Data Analysis warpMix Mixed Effects Modeling with Warping for Functional Data Using B-Spline Mixed effects modeling with warping for functional data using B-splines. Warping coefficients are treated as random effects, and the warping functions are general functions whose parameters represent the projection onto a B-spline basis of a part of the warping functions. Warped data are modelled by a linear mixed-effects functional model; the noise is Gaussian and independent of the warping functions.
1182 Statistical Genetics adegenet Exploratory Analysis of Genetic and Genomic Data Toolset for the exploration of genetic and genomic data. Adegenet provides formal (S4) classes for storing and handling various genetic data, including genetic markers with varying ploidy and hierarchical population structure (‘genind’ class), alleles counts by populations (‘genpop’), and genome-wide SNP data (‘genlight’). It also implements original multivariate methods (DAPC, sPCA), graphics, statistical tests, simulation tools, distance and similarity measures, and several spatial methods. A range of both empirical and simulated datasets is also provided to illustrate various methods.
1183 Statistical Genetics ape Analyses of Phylogenetics and Evolution Functions for reading, writing, plotting, and manipulating phylogenetic trees, analyses of comparative data in a phylogenetic framework, ancestral character analyses, analyses of diversification and macroevolution, computing distances from DNA sequences, reading and writing nucleotide sequences as well as importing from BioConductor, and several tools such as Mantel’s test, generalized skyline plots, graphical exploration of phylogenetic data (alex, trex, kronoviz), estimation of absolute evolutionary rates and clock-like trees using mean path lengths and penalized likelihood, dating trees with non-contemporaneous sequences, translating DNA into AA sequences, and assessing sequence alignments. Phylogeny estimation can be done with the NJ, BIONJ, ME, MVR, SDM, and triangle methods, and several methods handling incomplete distance matrices (NJ, BIONJ, MVR*, and the corresponding triangle method). Some functions call external applications (PhyML, Clustal, T-Coffee, Muscle) whose results are returned into R.
1184 Statistical Genetics Biodem Biodemography Functions The Biodem package provides a number of functions for Biodemographic analysis.
1185 Statistical Genetics bqtl Bayesian QTL Mapping Toolkit QTL mapping toolkit for inbred crosses and recombinant inbred lines. Includes maximum likelihood and Bayesian tools.
1186 Statistical Genetics dlmap Detection Localization Mapping for QTL QTL mapping in a mixed model framework with separate detection and localization stages. The first stage detects the number of QTL on each chromosome based on the genetic variation due to grouped markers on the chromosome; the second stage uses this information to determine the most likely QTL positions. The mixed model can accommodate general fixed and random effects, including spatial effects in field trials and pedigree effects. Applicable to backcrosses, doubled haploids, recombinant inbred lines, F2 intercrosses, and association mapping populations.
1187 Statistical Genetics gap (core) Genetic Analysis Package It is designed as an integrated package for genetic data analysis of both population and family data. Currently, it contains functions for sample size calculations of both population-based and family-based designs, probability of familial disease aggregation, kinship calculation, statistics in linkage analysis, and association analysis involving genetic markers including haplotype analysis with or without environmental covariates.
1188 Statistical Genetics GenABEL Genome-Wide SNP Association Analysis A package for genome-wide association analysis between quantitative or binary traits and single-nucleotide polymorphisms (SNPs).
1189 Statistical Genetics genetics (core) Population Genetics Classes and methods for handling genetic data. Includes classes to represent genotypes and haplotypes at single markers up to multiple markers on multiple chromosomes. Functions include allele frequencies, flagging homo/heterozygotes, flagging carriers of certain alleles, estimating and testing for Hardy-Weinberg disequilibrium, estimating and testing for linkage disequilibrium, …
1190 Statistical Genetics hapassoc Inference of Trait Associations with SNP Haplotypes and Other Attributes using the EM Algorithm The following R functions are used for inference of trait associations with haplotypes and other covariates in generalized linear models. The functions are developed primarily for data collected in cohort or cross-sectional studies. They can accommodate uncertain haplotype phase and handle missing genotypes at some SNPs.
1191 Statistical Genetics haplo.ccs Estimate Haplotype Relative Risks in Case-Control Data ‘haplo.ccs’ estimates haplotype and covariate relative risks in case-control data by weighted logistic regression. Diplotype probabilities, which are estimated by EM computation with progressive insertion of loci, are utilized as weights.
1192 Statistical Genetics haplo.stats (core) Statistical Analysis of Haplotypes with Traits and Covariates when Linkage Phase is Ambiguous Routines for the analysis of indirectly measured haplotypes. The statistical methods assume that all subjects are unrelated and that haplotypes are ambiguous (due to unknown linkage phase of the genetic markers). The main functions are: haplo.em(), haplo.glm(), haplo.score(), and haplo.power(); all of which have detailed examples in the vignette.
1193 Statistical Genetics HardyWeinberg Statistical Tests and Graphics for Hardy-Weinberg Equilibrium Contains tools for exploring Hardy-Weinberg equilibrium for diallelic genetic marker data. All classical tests (chi-square, exact, likelihood-ratio and permutation tests) for Hardy-Weinberg equilibrium are included in the package, as well as functions for power computation and for the simulation of marker data under equilibrium and disequilibrium. Routines for dealing with markers on the X-chromosome are included. Functions for testing equilibrium in the presence of missing data by using multiple imputation are also provided. Implements several graphics for exploring the equilibrium status of a large set of diallelic markers: ternary plots with acceptance regions, log-ratio plots and Q-Q plots.
1194 Statistical Genetics hierfstat Estimation and Tests of Hierarchical F-Statistics Allows the estimation of hierarchical F-statistics from haploid or diploid genetic data with any number of levels in the hierarchy, following the algorithm of Yang (Evolution, 1998, 52(4):950-956; doi:10.2307/2411227). Functions are also given to test, via randomisations, the significance of each F and variance component, using the likelihood-ratio statistic G.
1195 Statistical Genetics hwde Models and Tests for Departure from Hardy-Weinberg Equilibrium and Independence Between Loci Fits models for genotypic disequilibria, as described in Huttley and Wilson (2000), Weir (1996) and Weir and Wilson (1986). Contrast terms are available that account for first order interactions between loci. Also implements, for a single locus in a single population, a conditional exact test for Hardy-Weinberg equilibrium.
1196 Statistical Genetics ibdreg Regression Methods for IBD Linkage With Covariates A method to test genetic linkage with covariates by regression methods, with response IBD sharing for relative pairs. Accounts for correlations of IBD statistics and covariates for relative pairs within the same pedigree.
1197 Statistical Genetics LDheatmap Graphical Display of Pairwise Linkage Disequilibria Between SNPs Produces a graphical display, as a heat map, of measures of pairwise linkage disequilibria between SNPs. Users may optionally include the physical locations or genetic map distances of each SNP on the plot.
1198 Statistical Genetics luca Likelihood inference from case-control data Under Covariate Assumptions (LUCA) Likelihood inference in case-control studies of a rare disease under independence or simple dependence of genetic and non-genetic covariates
1199 Statistical Genetics ouch Ornstein-Uhlenbeck Models for Phylogenetic Comparative Hypotheses Fit and compare Ornstein-Uhlenbeck models for evolution along a phylogenetic tree.
1200 Statistical Genetics pbatR P2BAT This package provides data analysis via the pbat program, and an alternative internal implementation of the power calculations via simulation only. For analysis, this package provides a frontend to the PBAT software, automatically reading in the output from the pbat program and displaying the corresponding figure when appropriate (i.e. PBAT-logrank). It includes support for multiple processes and clusters. Before analysis, users must download PBAT (developed by Christoph Lange) and accept its license, available on the PBAT webpage. Both the data analysis and power calculations have command line and graphical interfaces using tcltk.
1201 Statistical Genetics phangorn Phylogenetic Reconstruction and Analysis Package contains methods for estimation of phylogenetic trees and networks using Maximum Likelihood, Maximum Parsimony, distance methods and Hadamard conjugation. Allows comparing trees and performing model selection, and offers visualizations for trees and split networks.
1202 Statistical Genetics qtl Tools for Analyzing QTL Experiments Analysis of experimental crosses to identify genes (called quantitative trait loci, QTLs) contributing to variation in quantitative traits.
1203 Statistical Genetics rmetasim An Individual-Based Population Genetic Simulation Environment An interface between R and the metasim simulation engine. The simulation environment is documented in: Strand, A. (2002) Metasim 1.0: an individual-based environment for simulating population genetics of complex population dynamics. Mol. Ecol. Notes, doi:10.1046/j.1471-8286.2002.00208.x. Please see the vignettes CreatingLandscapes and Simulating for ideas on how to use the package. See the rmetasim vignette for an overview and for important changes to the code in the most recent version.
1204 Statistical Genetics seqinr Biological Sequences Retrieval and Analysis Exploratory data analysis and data visualization for biological sequence (DNA and protein) data. Seqinr includes utilities for sequence data management under the ACNUC system described in Gouy, M. et al. (1984) Nucleic Acids Res. 12:121-127 doi:10.1093/nar/12.1Part1.121.
1205 Statistical Genetics snp.plotter snp.plotter Creates plots of p-values using single SNP and/or haplotype data. Main features of the package include options to display a linkage disequilibrium (LD) plot and the ability to plot multiple datasets simultaneously. Plots can be created using global and/or individual haplotype p-values along with single SNP p-values. Images are created as either PDF or EPS files.
1206 Statistical Genetics SNPmaxsel Maximally selected statistics for SNP data This package implements asymptotic methods related to maximally selected statistics, with applications to SNP data.
1207 Statistical Genetics stepwise Stepwise detection of recombination breakpoints A stepwise approach to identifying recombination breakpoints in a sequence alignment.
1208 Statistical Genetics tdthap TDT tests for extended haplotypes Transmission/disequilibrium tests for extended marker haplotypes
1209 Statistical Genetics untb ecological drift under the UNTB A collection of utilities for biodiversity data. Includes the simulation of ecological drift under Hubbell’s Unified Neutral Theory of Biodiversity, and the calculation of various diagnostics such as Preston curves. Now includes functionality provided by Francois Munoz and Andrea Manica.
1210 Statistical Genetics wgaim Whole Genome Average Interval Mapping for QTL Detection using Mixed Models Integrates sophisticated mixed modelling methods with a whole genome approach to detecting significant QTL in linkage maps.
1211 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization ade4 Analysis of Ecological Data : Exploratory and Euclidean Methods in Environmental Sciences Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) doi:10.18637/jss.v022.i04.
1212 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization animation A Gallery of Animations in Statistics and Utilities to Create Animations Provides functions for animations in statistics, covering topics in probability theory, mathematical statistics, multivariate statistics, non-parametric statistics, sampling survey, linear models, time series, computational statistics, data mining and machine learning. These functions may be helpful in teaching statistics and data analysis. Also provided in this package are a series of functions to save animations to various formats, e.g. Flash, ‘GIF’, HTML pages, ‘PDF’ and videos. ‘PDF’ animations can be inserted into ‘Sweave’ / ‘knitr’ easily.
1213 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization ape Analyses of Phylogenetics and Evolution Functions for reading, writing, plotting, and manipulating phylogenetic trees, analyses of comparative data in a phylogenetic framework, ancestral character analyses, analyses of diversification and macroevolution, computing distances from DNA sequences, reading and writing nucleotide sequences as well as importing from BioConductor, and several tools such as Mantel’s test, generalized skyline plots, graphical exploration of phylogenetic data (alex, trex, kronoviz), estimation of absolute evolutionary rates and clock-like trees using mean path lengths and penalized likelihood, dating trees with non-contemporaneous sequences, translating DNA into AA sequences, and assessing sequence alignments. Phylogeny estimation can be done with the NJ, BIONJ, ME, MVR, SDM, and triangle methods, and several methods handling incomplete distance matrices (NJ, BIONJ, MVR*, and the corresponding triangle method). Some functions call external applications (PhyML, Clustal, T-Coffee, Muscle) whose results are returned into R.
1214 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization aplpack Another Plot PACKage: stem.leaf, bagplot, faces, spin3R, plotsummary, plothulls, and some slider functions A set of functions for drawing some special plots: stem.leaf plots a stem-and-leaf display, stem.leaf.backback plots back-to-back stem-and-leaf displays, bagplot plots a bagplot, skyline.hist plots several histograms of a one-dimensional data set in one plot, plotsummary plots a graphical summary of a data set with one or more variables, plothulls plots sequential hulls of a bivariate data set, faces plots Chernoff faces, spin3R allows inspection of a 3-dimensional point cloud, and slider functions support interactive graphics.
1215 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization ash David Scott’s ASH Routines David Scott’s ASH routines ported from S-PLUS to R.
1216 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization biclust BiCluster Algorithms The main function biclust provides several algorithms to find biclusters in two-dimensional data: Cheng and Church, Spectral, Plaid Model, Xmotifs and Bimax. In addition, the package provides methods for data preprocessing (normalization and discretisation), visualisation, and validation of bicluster solutions.
1217 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization Cairo R graphics device using cairo graphics library for creating high-quality bitmap (PNG, JPEG, TIFF), vector (PDF, SVG, PostScript) and display (X11 and Win32) output Cairo graphics device that can be used to create high-quality vector (PDF, PostScript and SVG) and bitmap output (PNG, JPEG, TIFF), and high-quality rendering in displays (X11 and Win32). Since it uses the same back-end for all output, copying across formats is WYSIWYG. Files are created without the dependence on X11 or other external programs. This device supports alpha channel (semi-transparent drawing) and resulting images can contain transparent and semi-transparent regions. It is ideal for use in server environments (file output) and as a replacement for other devices that don’t have Cairo’s capabilities such as alpha support or anti-aliasing. Backends are modular such that any subset of backends is supported.
1218 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization cairoDevice Embeddable Cairo Graphics Device Driver This device uses Cairo and GTK to draw to the screen, file (png, svg, pdf, and ps) or memory (arbitrary GdkDrawable or Cairo context). The screen device may be embedded into RGtk2 interfaces and supports all interactive features of other graphics devices, including getGraphicsEvent().
1219 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization cba Clustering for Business Analytics Implements clustering techniques such as Proximus and Rock, utility functions for efficient computation of cross distances and data manipulation.
1220 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization colorspace Color Space Manipulation Carries out mapping between assorted color spaces including RGB, HSV, HLS, CIEXYZ, CIELUV, HCL (polar CIELUV), CIELAB and polar CIELAB. Qualitative, sequential, and diverging color palettes based on HCL colors are provided along with an interactive palette picker (with either a Tcl/Tk or a shiny GUI).
1221 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization diagram Functions for Visualising Simple Graphs (Networks), Plotting Flow Diagrams Visualises simple graphs (networks) based on a transition matrix, with utilities to plot flow diagrams, webs, electrical networks, etc. Supports the book “A practical guide to ecological modelling - using R as a simulation platform” by Karline Soetaert and Peter M.J. Herman (2009), Springer, and the book “Solving Differential Equations in R” by Karline Soetaert, Jeff Cash and Francesca Mazzia (2012), Springer. Includes demo(flowchart), demo(plotmat), demo(plotweb).
1222 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization dichromat Color Schemes for Dichromats Collapse red-green or green-blue distinctions to simulate the effects of different types of color-blindness.
1223 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization gclus Clustering Graphics Orders panels in scatterplot matrices and parallel coordinate displays by some merit index. Package contains various indices of merit, ordering functions, and enhanced versions of pairs and parcoord which color panels according to their merit level.
1224 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization ggplot2 (core) Create Elegant Data Visualisations Using the Grammar of Graphics A system for ‘declaratively’ creating graphics, based on “The Grammar of Graphics”. You provide the data, tell ‘ggplot2’ how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.
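The ggplot2 entry above describes a workflow: supply the data, map variables to aesthetics, and choose graphical primitives. A minimal sketch of that workflow, assuming ggplot2 is installed and using R's built-in mtcars data set (the column names wt and mpg belong to that data set, not to ggplot2):

```r
library(ggplot2)

# Map the wt and mpg columns to the x and y aesthetics, then pick
# points as the graphical primitive; ggplot2 takes care of the
# scales, axes, and coordinate system automatically.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

# p is a plot object (data plus mappings); it is only rendered
# when printed, e.g. with print(p).
stopifnot(inherits(p, "ggplot"))
```

Further layers (smoothers, facets, themes) are added with the same `+` syntax, which is what "declaratively creating graphics" refers to.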
1225 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization gplots Various R Programming Tools for Plotting Data Various R programming tools for plotting data, including: - calculating and plotting locally smoothed summary functions (‘bandplot’, ‘wapply’), - enhanced versions of standard plots (‘barplot2’, ‘boxplot2’, ‘heatmap.2’, ‘smartlegend’), - manipulating colors (‘col2hex’, ‘colorpanel’, ‘redgreen’, ‘greenred’, ‘bluered’, ‘redblue’, ‘rich.colors’), - calculating and plotting two-dimensional data summaries (‘ci2d’, ‘hist2d’), - enhanced regression diagnostic plots (‘lmplot2’, ‘residplot’), - formula-enabled interface to ‘stats::lowess’ function (‘lowess’), - displaying textual data in plots (‘textplot’, ‘sinkplot’), - plotting a matrix where each cell contains a dot whose size reflects the relative magnitude of the elements (‘balloonplot’), - plotting “Venn” diagrams (‘venn’), - displaying Open-Office style plots (‘ooplot’), - plotting multiple data on same region, with separate axes (‘overplot’), - plotting means and confidence intervals (‘plotCI’, ‘plotmeans’), - spacing points in an x-y plot so they don’t overlap (‘space’).
1226 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization gridBase Integration of base and grid graphics Integration of base and grid graphics
1227 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization hexbin Hexagonal Binning Routines Binning and plotting functions for hexagonal bins. Now uses and relies on grid graphics and formal (S4) classes and methods.
1228 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization IDPmisc Utilities of Institute of Data Analyses and Process Design (www.idp.zhaw.ch) The IDPmisc package contains different high-level graphics functions for displaying large datasets, displaying circular data in a very flexible way, finding local maxima, brewing color ramps, drawing nice arrows, zooming 2D-plots, creating figures with differently colored margin and plot region. In addition, the package contains auxiliary functions for data manipulation like omitting observations with irregular values or selecting data by logical vectors, which include NAs. Other functions are especially useful in spectroscopy and analyses of environmental data: robust baseline fitting, finding peaks in spectra.
1229 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization igraph Network Analysis and Visualization Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.
1230 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization iplots iPlots - interactive graphics for R Interactive plots for R
1231 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization JavaGD Java Graphics Device Graphics device routing all graphics commands to a Java program. The actual functionality of the JavaGD depends on the Java-side implementation. Simple AWT and Swing implementations are included.
1232 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization klaR Classification and Visualization Miscellaneous functions for classification and visualization developed at the Fakultaet Statistik, Technische Universitaet Dortmund.
1233 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization lattice (core) Trellis Graphics for R A powerful and elegant high-level data visualization system inspired by Trellis graphics, with an emphasis on multivariate data. Lattice is sufficient for typical graphics needs, and is also flexible enough to handle most nonstandard requirements. See ?Lattice for an introduction.
1234 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization latticeExtra Extra Graphical Utilities Based on Lattice Building on the infrastructure provided by the lattice package, this package provides several new high-level functions and methods, as well as additional utilities such as panel and axis annotation functions.
1235 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization misc3d Miscellaneous 3D Plots A collection of miscellaneous 3d plots, including isosurfaces.
1236 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization onion Octonions and Quaternions Quaternions and octonions are four- and eight-dimensional extensions of the complex numbers. They are normed division algebras over the real numbers and find applications in spatial rotations (quaternions) and string theory and relativity (octonions). The quaternions are noncommutative and the octonions nonassociative. See RKS Hankin 2006, Rnews Volume 6/2: 49-51, and the package vignette, for more details.
1237 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization playwith A GUI for interactive plots using GTK+ A GTK+ graphical user interface for editing and interacting with R plots.
1238 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization plotrix (core) Various Plotting Functions Lots of plots, various labeling, axis and color scaling functions.
1239 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization RColorBrewer (core) ColorBrewer Palettes Provides color schemes for maps (and other graphics) designed by Cynthia Brewer as described at http://colorbrewer2.org
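A minimal sketch of typical usage (assuming the package is installed; the palette name is one of the standard ColorBrewer qualitative palettes):

```r
library(RColorBrewer)

display.brewer.all()          # preview all ColorBrewer palettes
pal <- brewer.pal(5, "Set2")  # five qualitative colors as hex strings
barplot(rep(1, 5), col = pal) # apply the palette to a plot
```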
1240 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization rggobi Interface Between R and ‘GGobi’ A command-line interface to ‘GGobi’, an interactive and dynamic graphics package. ‘Rggobi’ complements the graphical user interface of ‘GGobi’ providing a way to fluidly transition between analysis and exploration, as well as automating common tasks.
1241 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization rgl (core) 3D Visualization Using OpenGL Provides medium to high level functions for 3D interactive graphics, including functions modelled on base graphics (plot3d(), etc.) as well as functions for constructing representations of geometric objects (cube3d(), etc.). Output may be on screen using OpenGL, or to various standard 3D file formats including WebGL, PLY, OBJ, STL as well as 2D image formats, including PNG, Postscript, SVG, PGF.
1242 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization RGraphics Data and Functions from the Book R Graphics, Second Edition Data and Functions from the book R Graphics, Second Edition. There is a function to produce each figure in the book, plus several functions, classes, and methods defined in Chapter 8.
1243 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization RGtk2 R Bindings for Gtk 2.8.0 and Above Facilities in the R language for programming graphical interfaces using Gtk, the Gimp Tool Kit.
1244 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization RSvgDevice An R SVG graphics device A graphics device for R that uses the w3.org xml standard for Scalable Vector Graphics.
1245 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization RSVGTipsDevice An R SVG Graphics Device with Dynamic Tips and Hyperlinks A graphics device for R that uses the w3.org xml standard for Scalable Vector Graphics. This version supports tooltips with 1 to 3 lines, hyperlinks, and line styles.
1246 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization scagnostics Compute scagnostics - scatterplot diagnostics Calculates graph theoretic scagnostics. Scagnostics describe various measures of interest for pairs of variables, based on their appearance on a scatterplot. They are a useful tool for discovering interesting or unusual scatterplots from a scatterplot matrix, without having to look at every individual plot.
1247 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization scatterplot3d 3D Scatter Plot Plots a three dimensional (3D) point cloud.
1248 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization seriation Infrastructure for Ordering Objects Using Seriation Infrastructure for seriation with an implementation of several seriation/sequencing techniques to reorder matrices, dissimilarity matrices, and dendrograms. Also provides (optimally) reordered heatmaps, color images and clustering visualizations like dissimilarity plots, and visual assessment of cluster tendency plots (VAT and iVAT).
1249 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization tkrplot TK Rplot Simple mechanism for placing R graphics in a Tk widget.
1250 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization vcd (core) Visualizing Categorical Data Visualization techniques, data sets, summary and inference procedures aimed particularly at categorical data. Special emphasis is given to highly extensible grid graphics. The package was originally inspired by the book “Visualizing Categorical Data” by Michael Friendly and is now the main support package for a new book, “Discrete Data Analysis with R” by Michael Friendly and David Meyer (2015).
1251 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization vioplot Violin plot A violin plot is a combination of a box plot and a kernel density plot.
1252 Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization xgobi Interface to the XGobi and XGvis programs for graphical data analysis Interface to the XGobi and XGvis programs for graphical data analysis.
1253 High-Performance and Parallel Computing with R aprof Amdahl’s Profiler, Directed Optimization Made Easy Assists the evaluation of whether and where to focus code optimization, using Amdahl’s law and visual aids based on line profiling. Amdahl’s profiler organises profiling output files (including memory profiling) in a visually appealing way. It is meant to help to balance development vs. execution time by helping to identify the most promising sections of code to optimize and projecting potential gains. The package is an addition to R’s standard profiling tools and is not a wrapper for them.
1254 High-Performance and Parallel Computing with R batch Batching Routines in Parallel and Passing Command-Line Arguments to R Functions to allow you to easily pass command-line arguments into R, and functions to aid in submitting your R code in parallel on a cluster and joining the results afterward (e.g. multiple parameter values for simulations running in parallel, splitting up a permutation test in parallel, etc.). See ‘parseCommandArgs(…)’ for the main example of how to use this package.
1255 High-Performance and Parallel Computing with R BatchExperiments Statistical Experiments on Batch Computing Clusters Extends the BatchJobs package to run statistical experiments on batch computing clusters. For further details see the project web page.
1256 High-Performance and Parallel Computing with R BatchJobs Batch Computing with R Provides Map, Reduce and Filter variants to generate jobs on batch computing systems like PBS/Torque, LSF, SLURM and Sun Grid Engine. Multicore and SSH systems are also supported. For further details see the project web page.
1257 High-Performance and Parallel Computing with R batchtools Tools for Computation on Batch Systems As a successor of the packages ‘BatchJobs’ and ‘BatchExperiments’, this package provides a parallel implementation of the Map function for high performance computing systems managed by schedulers ‘IBM Spectrum LSF’ (http://www-03.ibm.com/systems/spectrum-computing/products/lsf/), ‘OpenLava’ (http://www.openlava.org/), ‘Univa Grid Engine’/‘Oracle Grid Engine’ (http://www.univa.com/), ‘Slurm’ (http://slurm.schedmd.com/), ‘TORQUE/PBS’ (http://www.adaptivecomputing.com/products/open-source/torque/), or ‘Docker Swarm’ (https://docs.docker.com/swarm/). A multicore and socket mode allow parallelization on a local machine, and multiple machines can be hooked up via SSH to create a makeshift cluster. Moreover, the package provides an abstraction mechanism to define large-scale computer experiments in a well-organized and reproducible way.
1258 High-Performance and Parallel Computing with R bayesm Bayesian Inference for Marketing/Micro-Econometrics Covers many important models used in marketing and micro-econometrics applications. The package includes: Bayes Regression (univariate or multivariate dep var), Bayes Seemingly Unrelated Regression (SUR), Binary and Ordinal Probit, Multinomial Logit (MNL) and Multinomial Probit (MNP), Multivariate Probit, Negative Binomial (Poisson) Regression, Multivariate Mixtures of Normals (including clustering), Dirichlet Process Prior Density Estimation with normal base, Hierarchical Linear Models with normal prior and covariates, Hierarchical Linear Models with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a Dirichlet Process prior and covariates, Hierarchical Negative Binomial Regression Models, Bayesian analysis of choice-based conjoint data, Bayesian treatment of linear instrumental variables models, Analysis of Multivariate Ordinal survey data with scale usage heterogeneity (as in Rossi et al, JASA (01)), Bayesian Analysis of Aggregate Random Coefficient Logit Models as in BLP (see Jiang, Manchanda, Rossi 2009). For further reference, consult our book, Bayesian Statistics and Marketing by Rossi, Allenby and McCulloch (Wiley 2005) and Bayesian Non- and Semi-Parametric Methods and Applications (Princeton U Press 2014).
1259 High-Performance and Parallel Computing with R bcp Bayesian Analysis of Change Point Problems Provides an implementation of the Barry and Hartigan (1993) product partition model for the normal errors change point problem using Markov Chain Monte Carlo. It also extends the methodology to regression models on a connected graph (Wang and Emerson, 2015); this allows estimation of change point models with multivariate responses. Parallel MCMC, previously available in bcp v.3.0.0, is currently not implemented.
1260 High-Performance and Parallel Computing with R biglars Scalable Least-Angle Regression and Lasso Least-angle regression, lasso and stepwise regression for numeric datasets in which the number of observations is greater than the number of predictors. The functions can be used with the ff library to accommodate datasets that are too large to be held in memory.
1261 High-Performance and Parallel Computing with R biglm Bounded Memory Linear and Generalized Linear Models Regression for data too large to fit in memory.
1262 High-Performance and Parallel Computing with R bigmemory Manage Massive Matrices with Shared Memory and Memory-Mapped Files Create, store, access, and manipulate massive matrices. Matrices are allocated to shared memory and may use memory-mapped files. Packages ‘biganalytics’, ‘bigtabulate’, ‘synchronicity’, and ‘bigalgebra’ provide advanced functionality.
1263 High-Performance and Parallel Computing with R bnlearn Bayesian Network Structure Learning, Parameter Learning and Inference Bayesian network structure learning, parameter learning and inference. This package implements constraint-based (GS, IAMB, Inter-IAMB, Fast-IAMB, MMPC, Hiton-PC), pairwise (ARACNE and Chow-Liu), score-based (Hill-Climbing and Tabu Search) and hybrid (MMHC and RSMAX2) structure learning algorithms for discrete, Gaussian and conditional Gaussian networks, along with many score functions and conditional independence tests. The Naive Bayes and the Tree-Augmented Naive Bayes (TAN) classifiers are also implemented. Some utility functions (model comparison and manipulation, random data generation, arc orientation testing, simple and advanced plots) are included, as well as support for parameter estimation (maximum likelihood and Bayesian) and inference, conditional probability queries and cross-validation. Development snapshots with the latest bugfixes are available from http://www.bnlearn.com.
1264 High-Performance and Parallel Computing with R caret Classification and Regression Training Misc functions for training and plotting classification and regression models.
1265 High-Performance and Parallel Computing with R cudaBayesreg CUDA Parallel Implementation of a Bayesian Multilevel Model for fMRI Data Analysis Compute Unified Device Architecture (CUDA) is a software platform for massively parallel high-performance computing on NVIDIA GPUs. This package provides a CUDA implementation of a Bayesian multilevel model for the analysis of brain fMRI data. An fMRI data set consists of time series of volume data in 4D space. Typically, volumes are collected as slices of 64 x 64 voxels. Analysis of fMRI data often relies on fitting linear regression models at each voxel of the brain. The volume of the data to be processed, and the type of statistical analysis to perform in fMRI analysis, call for high-performance computing strategies. In this package, the CUDA programming model uses a separate thread for fitting a linear regression model at each voxel in parallel. The global statistical model implements a Gibbs Sampler for hierarchical linear models with a normal prior. This model has been proposed by Rossi, Allenby and McCulloch in ‘Bayesian Statistics and Marketing’, Chapter 3, and is referred to as ‘rhierLinearModel’ in the R-package bayesm. A notebook equipped with an NVIDIA ‘GeForce 8400M GS’ card having Compute Capability 1.1 has been used in the tests. The data sets used in the package’s examples are available in the separate package cudaBayesregData.
1266 High-Performance and Parallel Computing with R data.table Extension of ‘data.frame’ Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns, a fast friendly file reader and parallel file writer. Offers a natural and flexible syntax, for faster development.
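A minimal sketch of the syntax described above (assuming ‘data.table’ is installed; the column names are illustrative):

```r
library(data.table)

dt <- data.table(grp = c("a", "a", "b"), x = 1:3)
dt[, .(total = sum(x)), by = grp]  # fast grouped aggregation
dt[, y := x * 2]                   # add a column by reference, without copying
```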
1267 High-Performance and Parallel Computing with R dclone Data Cloning and MCMC Tools for Maximum Likelihood Methods Low level functions for implementing maximum likelihood estimating procedures for complex models using data cloning and Bayesian Markov chain Monte Carlo methods. Sequential and parallel MCMC support for JAGS, WinBUGS and OpenBUGS.
1268 High-Performance and Parallel Computing with R doFuture A Universal Foreach Parallel Adaptor using the Future API of the ‘future’ Package Provides a ‘%dopar%’ adaptor such that any type of futures can be used as backends for the ‘foreach’ framework.
1269 High-Performance and Parallel Computing with R doMC Foreach Parallel Adaptor for ‘parallel’ Provides a parallel backend for the %dopar% function using the multicore functionality of the parallel package.
1270 High-Performance and Parallel Computing with R doMPI Foreach Parallel Adaptor for the Rmpi Package Provides a parallel backend for the %dopar% function using the Rmpi package.
1271 High-Performance and Parallel Computing with R doRedis Foreach parallel adapter for the rredis package A Redis parallel backend for the %dopar% function
1272 High-Performance and Parallel Computing with R doRNG Generic Reproducible Parallel Backend for ‘foreach’ Loops Provides functions to perform reproducible parallel foreach loops, using independent random streams as generated by L’Ecuyer’s combined multiple-recursive generator [L’Ecuyer (1999), doi:10.1287/opre.47.1.159]. It makes it easy to convert standard %dopar% loops into fully reproducible loops, independently of the number of workers, the task scheduling strategy, or the chosen parallel environment and associated foreach backend.
1273 High-Performance and Parallel Computing with R doSNOW Foreach Parallel Adaptor for the ‘snow’ Package Provides a parallel backend for the %dopar% function using the snow package of Tierney, Rossini, Li, and Sevcikova.
1274 High-Performance and Parallel Computing with R drake Data Frames in R for Make A solution for reproducible code and high-performance computing.
1275 High-Performance and Parallel Computing with R ff Memory-Efficient Storage of Large Data on Disk and Fast Access Functions The ff package provides data structures that are stored on disk but behave (almost) as if they were in RAM by transparently mapping only a section (pagesize) in main memory - the effective virtual memory consumption per ff object. ff supports R’s standard atomic data types ‘double’, ‘logical’, ‘raw’ and ‘integer’ and non-standard atomic types boolean (1 bit), quad (2 bit unsigned), nibble (4 bit unsigned), byte (1 byte signed with NAs), ubyte (1 byte unsigned), short (2 byte signed with NAs), ushort (2 byte unsigned), single (4 byte float with NAs). For example ‘quad’ allows efficient storage of genomic data as an ‘A’,‘T’,‘G’,‘C’ factor. The unsigned types support ‘circular’ arithmetic. There is also support for close-to-atomic types ‘factor’, ‘ordered’, ‘POSIXct’, ‘Date’ and custom close-to-atomic types. ff not only has native C-support for vectors, matrices and arrays with flexible dimorder (major column-order, major row-order and generalizations for arrays). There is also an ffdf class not unlike data.frames and import/export filters for csv files. ff objects store raw data in binary flat files in native encoding, and complement this with metadata stored in R as physical and virtual attributes. ff objects have well-defined hybrid copying semantics, which gives rise to certain performance improvements through virtualization. ff objects can be stored and reopened across R sessions. ff files can be shared by multiple ff R objects (using different data en/de-coding schemes) in the same process or from multiple R processes to exploit parallelism. A wide choice of finalizer options allows working with ‘permanent’ files as well as creating/removing ‘temporary’ ff files completely transparently to the user. On certain OS/Filesystem combinations, creating the ff files works without notable delay thanks to sparse file allocation.
Several access optimization techniques such as Hybrid Index Preprocessing and Virtualization are implemented to achieve good performance even with large datasets, for example virtual matrix transpose without touching a single byte on disk. Further, to reduce disk I/O, ‘logicals’ and non-standard data types are stored natively and compactly in binary flat files, i.e. logicals take up exactly 2 bits to represent TRUE, FALSE and NA. Beyond basic access functions, the ff package also provides compatibility functions that facilitate writing code for ff and ram objects and support for batch processing on ff objects (e.g. as.ram, as.ff, ffapply). ff interfaces closely with functionality from package ‘bit’: chunked looping, fast bit operations and coercions between different objects that can store subscript information (‘bit’, ‘bitwhich’, ff ‘boolean’, ri range index, hi hybrid index). This makes it possible to work interactively with selections of large datasets and quickly modify selection criteria. Further high-performance enhancements can be made available upon request.
1276 High-Performance and Parallel Computing with R ffbase Basic Statistical Functions for Package ‘ff’ Extends the out of memory vectors of ‘ff’ with statistical functions and other utilities to ease their usage.
1277 High-Performance and Parallel Computing with R flowr Streamlining Design and Deployment of Complex Workflows This framework allows you to design and implement complex pipelines, and deploy them on your institution’s computing cluster. This has been built keeping in mind the needs of bioinformatics workflows. However, it is easily extendable to any field where a series of steps (shell commands) are to be executed in a (work)flow.
1278 High-Performance and Parallel Computing with R foreach Provides Foreach Looping Construct for R Support for the foreach looping construct. Foreach is an idiom that allows for iterating over elements in a collection, without the use of an explicit loop counter. This package in particular is intended to be used for its return value, rather than for its side effects. In that sense, it is similar to the standard lapply function, but doesn’t require the evaluation of a function. Using foreach without side effects also facilitates executing the loop in parallel.
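The looping idiom can be sketched as follows (assuming ‘foreach’ is installed; swapping %do% for %dopar% with a registered backend runs the same loop in parallel):

```r
library(foreach)

# Iterate over a collection and combine the return values with c()
squares <- foreach(i = 1:4, .combine = c) %do% i^2
squares  # 1 4 9 16
```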
1280 High-Performance and Parallel Computing with R future Unified Parallel and Distributed Processing in R for Everyone The purpose of this package is to provide a lightweight and unified Future API for sequential and parallel processing of R expressions via futures. The simplest way to evaluate an expression in parallel is to use ‘x %<-% { expression }’ with ‘plan(multiprocess)’. This package implements sequential, multicore, multisession, and cluster futures. With these, R expressions can be evaluated on the local machine, in parallel on a set of local machines, or distributed on a mix of local and remote machines. Extensions to this package implement additional backends for processing futures via compute cluster schedulers etc. Because of its unified API, there is no need to modify code in order to switch from sequential processing on the local machine to, say, distributed processing on a remote compute cluster. Another strength of this package is that global variables and functions are automatically identified and exported as needed, making it straightforward to tweak existing code to make use of futures.
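A minimal sketch of the future assignment described above (assuming ‘future’ is installed; note that ‘plan(multiprocess)’ is deprecated in recent versions in favor of ‘plan(multisession)’):

```r
library(future)
plan(multisession)    # evaluate futures in background R sessions

x %<-% { sum(1:10) }  # creates a future; evaluated asynchronously
x                     # blocks until the value is ready: 55
```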
1280 High-Performance and Parallel Computing with R future.BatchJobs A Future API for Parallel and Distributed Processing using BatchJobs Implementation of the Future API on top of the ‘BatchJobs’ package. This allows you to process futures, as defined by the ‘future’ package, in parallel out of the box, not only on your local machine or ad-hoc cluster of machines, but also via high-performance compute (‘HPC’) job schedulers such as ‘LSF’, ‘OpenLava’, ‘Slurm’, ‘SGE’, and ‘TORQUE’ / ‘PBS’, e.g. ‘y <- future_lapply(files, FUN = process)’.
1282 High-Performance and Parallel Computing with R GAMBoost Generalized Linear and Additive Models by Likelihood Based Boosting This package provides routines for fitting generalized linear and generalized additive models by likelihood-based boosting, using penalized B-splines.
1283 High-Performance and Parallel Computing with R gcbd ‘GPU’/CPU Benchmarking in Debian-Based Systems ‘GPU’/CPU benchmarking on Debian-package based systems. This package benchmarks performance of a few standard linear algebra operations (such as a matrix product and QR, SVD and LU decompositions) across a number of different ‘BLAS’ libraries as well as a ‘GPU’ implementation. To do so, it takes advantage of the ability to ‘plug and play’ different ‘BLAS’ implementations easily on a Debian and/or Ubuntu system. The current version supports - ‘Reference BLAS’ (‘refblas’) which are un-accelerated as a baseline - Atlas which are tuned but typically configured single-threaded - Atlas39 which are tuned and configured for multi-threaded mode - ‘Goto Blas’ which are accelerated and multi-threaded - ‘Intel MKL’ which is a commercial accelerated and multithreaded version. As for ‘GPU’ computing, we use the CRAN package ‘gputools’. For ‘Goto Blas’, the ‘gotoblas2-helper’ script from the ISM in Tokyo can be used. For ‘Intel MKL’ we use the Revolution R packages from Ubuntu 9.10.
1284 High-Performance and Parallel Computing with R gmatrix GPU Computing in R A general framework for utilizing R to harness the power of NVIDIA GPUs. The “gmatrix” and “gvector” classes allow for easy management of the separate device and host memory spaces. Numerous numerical operations are implemented for these objects on the GPU. These operations include matrix multiplication, addition, subtraction, the kronecker product, the outer product, comparison operators, logical operators, trigonometric functions, indexing, sorting, random number generation and many more.
1284 High-Performance and Parallel Computing with R gpuR GPU Functions for R Objects Provides GPU enabled functions for R objects in a simple and approachable manner. New gpu* and vcl* classes have been provided to wrap typical R objects (e.g. vector, matrix), in both host and device spaces, to mirror typical R syntax without the need to know OpenCL.
1285 High-Performance and Parallel Computing with R gputools A Few GPU Enabled Functions Provides R interfaces to a handful of common functions implemented using the Nvidia CUDA toolkit. Some of the functions require at least GPU Compute Capability 1.3. Thanks to Craig Stark at UC Irvine for donating time on his lab’s Mac.
1286 High-Performance and Parallel Computing with R GUIProfiler Graphical User Interface for Rprof() Show graphically the results of profiling R functions by tracking their execution time.
1287 High-Performance and Parallel Computing with R h2o R Interface for H2O R scripting functionality for H2O, the open source math engine for big data that computes parallel distributed machine learning algorithms such as generalized linear models, gradient boosting machines, random forests, and neural networks (deep learning) within various cluster environments.
1288 High-Performance and Parallel Computing with R HadoopStreaming Utilities for using R scripts in Hadoop streaming Provides a framework for writing map/reduce scripts for use in Hadoop Streaming. Also facilitates operating on data in a streaming fashion, without Hadoop.
1289 High-Performance and Parallel Computing with R harvestr A Parallel Simulation Framework Functions for easy and reproducible simulation.
1290 High-Performance and Parallel Computing with R HistogramTools Utility Functions for R Histograms Provides a number of utility functions useful for manipulating large histograms. This includes methods to trim, subset, merge buckets, merge histograms, convert to CDF, and calculate information loss due to binning. It also provides a protocol buffer representations of the default R histogram class to allow histograms over large data sets to be computed and manipulated in a MapReduce environment.
1291 High-Performance and Parallel Computing with R inline Functions to Inline C, C++, Fortran Function Calls from R Functionality to dynamically define R functions and S4 methods with inlined C, C++ or Fortran code supporting .C and .Call calling conventions.
1292 High-Performance and Parallel Computing with R LaF Fast Access to Large ASCII Files Methods for fast access to large ASCII files. Currently the following file formats are supported: comma separated format (CSV) and fixed width format. It is assumed that the files are too large to fit into memory, although the package can also be used to efficiently access files that do fit into memory. Methods are provided to access and process files blockwise. Furthermore, an opened file can be accessed as one would an ordinary data.frame. The LaF vignette gives an overview of the functionality provided.
1293 High-Performance and Parallel Computing with R latentnet Latent Position and Cluster Models for Statistical Networks Fit and simulate latent position and cluster models for statistical networks.
1294 High-Performance and Parallel Computing with R lga Tools for linear grouping analysis (LGA) Tools for linear grouping analysis. Three user-level functions: gap, rlga and lga.
1295 High-Performance and Parallel Computing with R Matching Multivariate and Propensity Score Matching with Balance Optimization Provides functions for multivariate and propensity score matching and for finding optimal balance based on a genetic search algorithm. A variety of univariate and multivariate metrics to determine if balance has been obtained are also provided.
1297 High-Performance and Parallel Computing with R MonetDB.R Connect MonetDB to R Allows you to pull data from MonetDB into R. Includes a DBI implementation and a dplyr backend.
1298 High-Performance and Parallel Computing with R nws R functions for NetWorkSpaces and Sleigh Provides coordination and parallel execution facilities, as well as limited cross-language data exchange, using the netWorkSpaces server developed by REvolution Computing.
1299 High-Performance and Parallel Computing with R OpenCL Interface allowing R to use OpenCL This package provides an interface to OpenCL, allowing R to leverage the computing power of GPUs and other HPC accelerator devices.
1300 High-Performance and Parallel Computing with R orloca Operations Research LOCational Analysis Models Objects and methods to handle and solve the min-sum location problem, also known as the Fermat-Weber problem. The min-sum location problem searches for a point such that the weighted sum of the distances to the demand points is minimized. See “The Fermat-Weber location problem revisited” by Brimberg, Mathematical Programming, 1, pg. 71-76, 1995. doi:10.1007/BF01592245. General global optimization algorithms are used to solve the problem, along with the ad hoc Weiszfeld method, see “Sur le point pour lequel la Somme des distances de n points donnes est minimum”, by Weiszfeld, Tohoku Mathematical Journal, First Series, 43, pg. 355-386, 1937.
1300 High-Performance and Parallel Computing with R parSim Parallel Simulation Studies Perform flexible simulation studies using one or multiple computer cores. The package is set up to be usable on high-performance clusters in addition to being run locally, see examples on https://github.com/SachaEpskamp/parSim.
1301 High-Performance and Parallel Computing with R partDSA Partitioning Using Deletion, Substitution, and Addition Moves A novel tool for generating a piecewise constant estimation list of increasingly complex predictors based on an intensive and comprehensive search over the entire covariate space.
1302 High-Performance and Parallel Computing with R pbapply Adding Progress Bar to ’*apply’ Functions A lightweight package that adds progress bar to vectorized R functions (’*apply’). The implementation can easily be added to functions where showing the progress is useful (e.g. bootstrap). The type and style of the progress bar (with percentages or remaining time) can be set through options. Supports several parallel processing backends.
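As a quick illustration of the entry above, a minimal sketch (assuming the ‘pbapply’ package is installed) of swapping a plain lapply() for its progress-bar counterpart:

```r
# Minimal sketch, assuming the 'pbapply' package is installed.
library(pbapply)

# pblapply() is a drop-in replacement for lapply() that shows a progress bar.
res <- pblapply(1:50, function(i) {
  Sys.sleep(0.01)  # simulate some work per element
  i^2
})

# The bar type and style can be changed globally via options:
pboptions(type = "timer")  # e.g., show estimated remaining time
```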
1303 High-Performance and Parallel Computing with R pbdBASE Programming with Big Data Base Wrappers for Distributed Matrices An interface to and extensions for the ‘PBLAS’ and ‘ScaLAPACK’ numerical libraries. This enables R to utilize distributed linear algebra for codes written in the ‘SPMD’ fashion. This interface is deliberately low-level and mimics the style of the native libraries it wraps. For a much higher level way of managing distributed matrices, see the ‘pbdDMAT’ package.
1304 High-Performance and Parallel Computing with R pbdDEMO Programming with Big Data Demonstrations and Examples Using ‘pbdR’ Packages A set of demos of ‘pbdR’ packages, together with a useful, unifying vignette.
1305 High-Performance and Parallel Computing with R pbdDMAT ‘pbdR’ Distributed Matrix Methods A set of classes for managing distributed matrices, and a collection of methods for computing linear algebra and statistics. Computation is handled mostly by routines from the ‘pbdBASE’ package, which itself relies on the ‘ScaLAPACK’ and ‘PBLAS’ numerical libraries for distributed computing.
1306 High-Performance and Parallel Computing with R pbdMPI Programming with Big Data Interface to MPI An efficient interface to MPI by utilizing S4 classes and methods with a focus on Single Program/Multiple Data (‘SPMD’) parallel programming style, which is intended for batch parallel execution.
1307 High-Performance and Parallel Computing with R pbdNCDF4 Programming with Big Data Interface to Parallel Unidata NetCDF4 Format Data Files This package adds collective parallel read and write capability to the R package ncdf4 version 1.8. Typical use is as a parallel NetCDF4 file reader in SPMD style programming. Each R process reads and writes its own data in a synchronized collective mode, resulting in faster parallel performance. Performance improvement is conditional on a parallel file system.
1308 High-Performance and Parallel Computing with R pbdPROF Programming with Big Data ― MPI Profiling Tools MPI profiling tools.
1309 High-Performance and Parallel Computing with R pbdSLAP Programming with Big Data Scalable Linear Algebra Packages Utilizing scalable linear algebra packages mainly including BLACS, PBLAS, and ScaLAPACK in double precision via pbdMPI based on ScaLAPACK version 2.0.2.
1310 High-Performance and Parallel Computing with R peperr Parallelised Estimation of Prediction Error Package peperr is designed for prediction error estimation through resampling techniques, possibly accelerated by parallel execution on a compute cluster. Newly developed model fitting routines can be easily incorporated.
1311 High-Performance and Parallel Computing with R permGPU Using GPUs in Statistical Genomics Can be used to carry out permutation resampling inference in the context of RNA microarray studies.
1312 High-Performance and Parallel Computing with R PGICA Parallel Group ICA Algorithm A Group ICA algorithm that can run in parallel on an SGE platform or on multi-core PCs.
1313 High-Performance and Parallel Computing with R pls Partial Least Squares and Principal Component Regression Multivariate regression methods Partial Least Squares Regression (PLSR), Principal Component Regression (PCR) and Canonical Powered Partial Least Squares (CPPLS).
1314 High-Performance and Parallel Computing with R pmclust Parallel Model-Based Clustering using Expectation-Gathering-Maximization Algorithm for Finite Mixture Gaussian Model Aims to utilize model-based clustering (unsupervised) for high-dimensional and ultra-large data, especially in a distributed manner. The code employs pbdMPI to perform an expectation-gathering-maximization algorithm for finite mixture Gaussian models. Unstructured dispersion matrices are assumed in the Gaussian models. The implementation defaults to the single-program/multiple-data (SPMD) programming model. The code can be executed through pbdMPI and is independent of most MPI applications. See the High Performance Statistical Computing website for more information, documents and examples.
1315 High-Performance and Parallel Computing with R profr An alternative display for profiling information profr provides an alternative data structure and visual rendering for the profiling information generated by Rprof.
1316 High-Performance and Parallel Computing with R proftools Profile Output Processing Tools for R Tools for examining Rprof profile output.
1317 High-Performance and Parallel Computing with R pvclust Hierarchical Clustering with P-Values via Multiscale Bootstrap Resampling An implementation of multiscale bootstrap resampling for assessing the uncertainty in hierarchical cluster analysis. It provides AU (approximately unbiased) p-value as well as BP (bootstrap probability) value for each cluster in a dendrogram.
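A sketch of typical ‘pvclust’ usage (assuming the package is installed): it clusters the columns of a data matrix and annotates each cluster with AU and BP values.

```r
# Sketch, assuming 'pvclust' is installed; pvclust() clusters the
# *columns* of the supplied data, here a built-in demo data set.
library(pvclust)

fit <- pvclust(USArrests, method.hclust = "ward.D2",
               method.dist = "euclidean", nboot = 100)

plot(fit)                   # dendrogram annotated with AU and BP values
pvrect(fit, alpha = 0.95)   # highlight clusters with AU >= 0.95
```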
1318 High-Performance and Parallel Computing with R randomForestSRC Random Forests for Survival, Regression, and Classification (RF-SRC) A unified treatment of Breiman’s random forests for survival, regression and classification problems based on Ishwaran and Kogalur’s random survival forests (RSF) package. Now extended to include multivariate and unsupervised forests. Also includes quantile regression forests for univariate and multivariate training/testing settings. The package runs in both serial and parallel (OpenMP) modes.
1319 High-Performance and Parallel Computing with R Rborist Extensible, Parallelizable Implementation of the Random Forest Algorithm Scalable decision tree training and prediction.
1320 High-Performance and Parallel Computing with R Rcpp Seamless R and C++ Integration The ‘Rcpp’ package provides R functions as well as C++ classes which offer a seamless integration of R and C++. Many R data types and objects can be mapped back and forth to C++ equivalents which facilitates both writing of new code as well as easier integration of third-party libraries. Documentation about ‘Rcpp’ is provided by several vignettes included in this package, via the ‘Rcpp Gallery’ site at http://gallery.rcpp.org, the paper by Eddelbuettel and Francois (2011, doi:10.18637/jss.v040.i08), the book by Eddelbuettel (2013, doi:10.1007/978-1-4614-6868-4) and the paper by Eddelbuettel and Balamuta (2017, doi:10.7287/peerj.preprints.3188v1); see ‘citation(“Rcpp”)’ for details.
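The R-to-C++ mapping described above can be sketched with Rcpp's inline compilation helper (assuming ‘Rcpp’ and a working C++ toolchain are available):

```r
# Sketch, assuming 'Rcpp' and a C++ compiler are available.
library(Rcpp)

# cppFunction() compiles the C++ source and exposes it as an R function;
# NumericVector maps an R numeric vector to a C++ class.
cppFunction('
double sumC(NumericVector x) {
  double total = 0;
  for (int i = 0; i < x.size(); ++i) total += x[i];
  return total;
}
')

sumC(c(1, 2, 3))  # behaves like sum()
```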
1321 High-Performance and Parallel Computing with R RcppParallel Parallel Programming Tools for ‘Rcpp’ High level functions for parallel programming with ‘Rcpp’. For example, the ‘parallelFor()’ function can be used to convert the work of a standard serial “for” loop into a parallel one and the ‘parallelReduce()’ function can be used for accumulating aggregate or other values.
1322 High-Performance and Parallel Computing with R Rdsm Threads Environment for R Provides a threads-type programming environment for R. The package gives the R programmer a clearer, more concise shared-memory world view, and in some cases superior performance as well. In addition, it enables parallel processing on very large, out-of-core matrices.
1323 High-Performance and Parallel Computing with R rgenoud R Version of GENetic Optimization Using Derivatives A genetic algorithm plus derivative optimizer.
1324 High-Performance and Parallel Computing with R Rhpc Permits *apply() Style Dispatch for ‘HPC’ Provides *apply()-style functions that dispatch via ‘MPI’, offering a better ‘HPC’ environment in R. The package supports long vectors and can handle moderately large data.
1325 High-Performance and Parallel Computing with R RhpcBLASctl Control the Number of Threads on ‘BLAS’ Controls the number of threads used by ‘BLAS’ libraries (e.g., ‘GotoBLAS’, ‘ACML’ and ‘MKL’) and, where possible, the number of threads used by ‘OpenMP’. Can also report the number of logical and physical cores if feasible.
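The thread-control API amounts to a few calls (a sketch, assuming ‘RhpcBLASctl’ is installed and the underlying BLAS supports runtime thread control):

```r
# Sketch, assuming 'RhpcBLASctl' is installed.
library(RhpcBLASctl)

get_num_procs()           # number of logical cores, if detectable
get_num_cores()           # number of physical cores, if detectable

blas_set_num_threads(1)   # pin BLAS to a single thread
omp_set_num_threads(2)    # and OpenMP to two threads, where supported
```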
1326 High-Performance and Parallel Computing with R RInside C++ Classes to Embed R in C++ Applications The ‘RInside’ package makes it easier to have “R inside” your C++ application by providing a C++ wrapper class around the R interpreter. As R itself is embedded into your application, a shared library build of R is required. This works on Linux, OS X and even on Windows provided you use the same tools used to build R itself. Numerous examples are provided in the subdirectories of the examples/ directory of the installed package: standard, mpi (for parallel computing), qt (showing how to embed ‘RInside’ inside a Qt GUI application), wt (showing how to build a “web-application” using the Wt toolkit), armadillo (for ‘RInside’ use with ‘RcppArmadillo’) and eigen (for ‘RInside’ use with ‘RcppEigen’). The examples use GNUmakefile(s) with GNU extensions, so a GNU make is required (and will use the GNUmakefile automatically). Doxygen-generated documentation of the C++ classes is available at the ‘RInside’ website as well.
1327 High-Performance and Parallel Computing with R rJava Low-Level R to Java Interface Low-level interface to Java VM very much like .C/.Call and friends. Allows creation of objects, calling methods and accessing fields.
1328 High-Performance and Parallel Computing with R rlecuyer R Interface to RNG with Multiple Streams Provides an interface to the C implementation of the random number generator with multiple independent streams developed by L’Ecuyer et al (2002). The main purpose of this package is to enable the use of this random number generator in parallel R applications.
1329 High-Performance and Parallel Computing with R Rmpi (core) Interface (Wrapper) to MPI (Message-Passing Interface) An interface (wrapper) to MPI APIs. It also provides an interactive R manager and worker environment.
1330 High-Performance and Parallel Computing with R RProtoBuf R Interface to the ‘Protocol Buffers’ ‘API’ (Version 2 or 3) Protocol Buffers are a way of encoding structured data in an efficient yet extensible format. Google uses Protocol Buffers for almost all of its internal ‘RPC’ protocols and file formats. Additional documentation is available in two included vignettes, one of which corresponds to our ‘JSS’ paper (2016, doi:10.18637/jss.v071.i02). Either version 2 or 3 of the ‘Protocol Buffers’ ‘API’ is supported.
1331 High-Performance and Parallel Computing with R rredis “Redis” Key/Value Database Client R client interface to the “Redis” key-value database.
1332 High-Performance and Parallel Computing with R rslurm Submit R Calculations to a Slurm Cluster Functions that simplify submitting R scripts to a Slurm workload manager, in part by automating the division of embarrassingly parallel calculations across cluster nodes.
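The division of an embarrassingly parallel calculation across nodes can be sketched as follows (assuming ‘rslurm’ is installed and a Slurm cluster is available; the parameter grid is purely illustrative):

```r
# Sketch, assuming 'rslurm' and access to a Slurm workload manager.
library(rslurm)

# One row of parameters per independent task.
pars <- data.frame(mean = 1:10, sd = 1)

sjob <- slurm_apply(function(mean, sd) rnorm(1000, mean, sd),
                    pars, jobname = "demo",
                    nodes = 2, cpus_per_node = 2)

res <- get_slurm_out(sjob, outtype = "raw")  # collect results when done
cleanup_files(sjob)                          # remove temporary job files
```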
1333 High-Performance and Parallel Computing with R Sim.DiffProc Simulation of Diffusion Processes A package for symbolic and numerical computations on scalar and multivariate systems of stochastic differential equations. It provides users with a wide range of tools to simulate, estimate, analyze, and visualize the dynamics of these systems in both Itô and Stratonovich forms. Statistical analysis of SDEs is supported with parallel Monte Carlo and moment-equation methods. The package has enabled researchers in different domains to use these equations to model practical problems in financial and actuarial modeling and other areas of application, e.g., modeling and simulating first-passage-time problems in shallow water using the attractive center (Boukhetala K, 1996).
1334 High-Performance and Parallel Computing with R snow (core) Simple Network of Workstations Support for simple parallel computing in R.
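snow's cluster API (makeCluster(), parLapply(), stopCluster()) is mirrored by base R's ‘parallel’ package, so the pattern can be sketched without installing anything extra:

```r
# Sketch of the snow-style API; base R's 'parallel' package provides
# compatible functions, so this runs without installing snow itself.
library(parallel)

cl <- makeCluster(2)                        # two local worker processes
res <- parLapply(cl, 1:4, function(x) x^2)  # apply over the cluster
stopCluster(cl)

unlist(res)  # 1 4 9 16
```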
1335 High-Performance and Parallel Computing with R snowfall Easier cluster computing (based on snow) Usability wrapper around snow for easier development of parallel R programs. This package offers, e.g., extended error checks and additional functions. All functions also work in sequential mode if no cluster is present or wished. The package is also designed as a connector to the cluster management tool sfCluster, but can also be used without it.
1336 High-Performance and Parallel Computing with R snowFT Fault Tolerant Simple Network of Workstations Extension of the snow package supporting fault tolerant and reproducible applications, as well as supporting easy-to-use parallel programming - only one function is needed. Dynamic cluster size is also available.
1337 High-Performance and Parallel Computing with R speedglm Fitting Linear and Generalized Linear Models to Large Data Sets Fitting linear models and generalized linear models to large data sets by updating algorithms.
1338 High-Performance and Parallel Computing with R sprint Simple Parallel R INTerface SPRINT (Simple Parallel R INTerface) is a parallel framework for R. It provides a High Performance Computing (HPC) harness which allows R scripts to run on HPC clusters. SPRINT contains a library of selected R functions that have been parallelized. Functions are named after the original R function with the added prefix ‘p’, i.e. the parallel version of cor() in SPRINT is called pcor(). Calls to the parallel R functions are included directly in standard R scripts. SPRINT contains functions for correlation (pcor), partitioning around medoids (ppam), apply (papply), permutation testing (pmaxT), bootstrapping (pboot), random forest (prandomForest), rank product (pRP) and hamming distance (pstringdistmatrix).
1339 High-Performance and Parallel Computing with R sqldf Manipulate R Data Frames Using SQL The sqldf() function is typically passed a single argument which is an SQL select statement where the table names are ordinary R data frame names. sqldf() transparently sets up a database, imports the data frames into that database, performs the SQL select or other statement and returns the result using a heuristic to determine which class to assign to each column of the returned data frame. The sqldf() or read.csv.sql() functions can also be used to read filtered files into R even if the original files are larger than R itself can handle. ‘RSQLite’, ‘RH2’, ‘RMySQL’ and ‘RPostgreSQL’ backends are supported.
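A minimal sketch of the usage pattern described above (assuming ‘sqldf’ and its default ‘RSQLite’ backend are installed):

```r
# Sketch, assuming 'sqldf' (with its default 'RSQLite' backend) is installed.
library(sqldf)

# The table name in the SQL statement is an ordinary R data frame name;
# sqldf() loads 'iris' into a temporary database behind the scenes.
sqldf("select Species, count(*) as n from iris group by Species")
```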
1340 High-Performance and Parallel Computing with R STAR Spike Train Analysis with R Functions to analyze neuronal spike trains from a single neuron or from several neurons recorded simultaneously.
1341 High-Performance and Parallel Computing with R tm Text Mining Package A framework for text mining applications within R.
1342 High-Performance and Parallel Computing with R toaster Big Data in-Database Analytics that Scales with Teradata Aster Distributed Platform A consistent set of tools to perform in-database analytics on the Teradata Aster Big Data Discovery Platform. toaster (a.k.a. ‘to Aster’) embraces a simple 2-step approach: compute in Aster, then visualize and analyze in R. Its ‘compute’ functions use a combination of parallel SQL, SQL-MR and SQL-GR executing in the Aster database, a highly scalable parallel and distributed analytical platform. Then ‘create’ functions visualize results with boxplots, scatterplots, histograms, heatmaps, word clouds, maps, networks, or slope graphs. Advanced options such as faceting, coloring, labeling, and others are supported with most visualizations.
1343 High-Performance and Parallel Computing with R varSelRF Variable Selection using Random Forests Variable selection from random forests using both backwards variable elimination (for the selection of small sets of non-redundant variables) and selection based on the importance spectrum (somewhat similar to scree plots; for the selection of large, potentially highly-correlated variables). Main applications in high-dimensional data (e.g., microarray data, and other genomics and proteomics applications).
1344 Machine Learning & Statistical Learning ahaz Regularization for semiparametric additive hazards regression Computationally efficient procedures for regularized estimation with the semiparametric additive hazards regression model.
1345 Machine Learning & Statistical Learning arules Mining Association Rules and Frequent Itemsets Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides interfaces to C implementations of the association mining algorithms Apriori and Eclat by C. Borgelt.
1346 Machine Learning & Statistical Learning BayesTree Bayesian Additive Regression Trees This is an implementation of BART:Bayesian Additive Regression Trees, by Chipman, George, McCulloch (2010).
1347 Machine Learning & Statistical Learning biglasso Extending Lasso Model Fitting to Big Data Extends lasso and elastic-net model fitting to ultrahigh-dimensional, multi-gigabyte data sets that cannot be loaded into memory. It is much more memory- and computation-efficient than existing lasso-fitting packages like ‘glmnet’ and ‘ncvreg’, thus allowing for very powerful big data analysis even on an ordinary laptop.
1348 Machine Learning & Statistical Learning bigRR Generalized Ridge Regression (with special advantage for p >> n cases) The package fits large-scale (generalized) ridge regression for various distributions of response. The shrinkage parameters (lambdas) can be pre-specified or estimated using an internal update routine (fitting a heteroscedastic effects model, or HEM). It allows shrinking any subset of parameters in the model. It has a special computational advantage for cases where the number of shrinkage parameters exceeds the number of observations. For example, the package is very useful for fitting large-scale omics data, such as high-throughput genotype data (genomics), gene expression data (transcriptomics), metabolomics data, etc.
1349 Machine Learning & Statistical Learning bmrm Bundle Methods for Regularized Risk Minimization Package Bundle methods for minimization of convex and non-convex risk under L1 or L2 regularization. Implements the algorithm proposed by Teo et al. (JMLR 2010) as well as the extension proposed by Do and Artieres (JMLR 2012). The package comes with many loss functions for machine learning, which makes it powerful for big data analysis. Applications include structured prediction, linear SVM, multi-class SVM, f-beta optimization, ROC optimization, ordinal regression, quantile regression, epsilon-insensitive regression, least mean squares, logistic regression, and least absolute deviation regression (see package examples), all with L1 and L2 regularization.
1350 Machine Learning & Statistical Learning Boruta Wrapper Algorithm for All Relevant Feature Selection An all relevant feature selection wrapper algorithm. It finds relevant features by comparing original attributes’ importance with importance achievable at random, estimated using their permuted copies.
1351 Machine Learning & Statistical Learning bst Gradient Boosting Functional gradient descent algorithm for a variety of convex and non-convex loss functions, for both classical and robust regression and classification problems.
1352 Machine Learning & Statistical Learning C50 C5.0 Decision Trees and Rule-Based Models C5.0 decision trees and rule-based models for pattern recognition.
1353 Machine Learning & Statistical Learning caret Classification and Regression Training Misc functions for training and plotting classification and regression models.
1354 Machine Learning & Statistical Learning CORElearn Classification, Regression and Feature Evaluation A suite of machine learning algorithms written in C++ with the R interface contains several learning techniques for classification and regression. Predictive models include e.g., classification and regression trees with optional constructive induction and models in the leaves, random forests, kNN, naive Bayes, and locally weighted regression. All predictions obtained with these models can be explained and visualized with the ‘ExplainPrediction’ package. This package is especially strong in feature evaluation where it contains several variants of Relief algorithm and many impurity based attribute evaluation functions, e.g., Gini, information gain, MDL, and DKM. These methods can be used for feature selection or discretization of numeric attributes. The OrdEval algorithm and its visualization is used for evaluation of data sets with ordinal features and class, enabling analysis according to the Kano model of customer satisfaction. Several algorithms support parallel multithreaded execution via OpenMP. The top-level documentation is reachable through ?CORElearn.
1355 Machine Learning & Statistical Learning CoxBoost Cox models by likelihood based boosting for a single survival endpoint or competing risks This package provides routines for fitting Cox models by likelihood-based boosting for a single endpoint or in the presence of competing risks.
1356 Machine Learning & Statistical Learning Cubist Rule- And Instance-Based Regression Modeling Regression modeling using rules with added instance-based corrections.
1357 Machine Learning & Statistical Learning darch Package for Deep Architectures and Restricted Boltzmann Machines The darch package is built on the basis of the code from G. E. Hinton and R. R. Salakhutdinov (available under Matlab Code for deep belief nets). This package is for generating neural networks with many layers (deep architectures) and training them with the method introduced by the publications “A fast learning algorithm for deep belief nets” (G. E. Hinton, S. Osindero, Y. W. Teh (2006) doi:10.1162/neco.2006.18.7.1527) and “Reducing the dimensionality of data with neural networks” (G. E. Hinton, R. R. Salakhutdinov (2006) doi:10.1126/science.1127647). This method includes pre-training with the contrastive divergence method published by G. E. Hinton (2002) doi:10.1162/089976602760128018 and fine-tuning with commonly known training algorithms like backpropagation or conjugate gradients. Additionally, supervised fine-tuning can be enhanced with maxout and dropout, two recently developed techniques to improve fine-tuning for deep learning.
1358 Machine Learning & Statistical Learning deepnet Deep Learning Toolkit in R Implements some deep learning architectures and neural network algorithms, including BP, RBM, DBN, deep autoencoders and so on.
1359 Machine Learning & Statistical Learning e1071 (core) Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien Functions for latent class analysis, short time Fourier transform, fuzzy clustering, support vector machines, shortest path computation, bagged clustering, naive Bayes classifier, …
1360 Machine Learning & Statistical Learning earth Multivariate Adaptive Regression Splines Build regression models using the techniques in Friedman’s papers “Fast MARS” and “Multivariate Adaptive Regression Splines”. (The term “MARS” is trademarked and thus not used in the name of the package.)
1361 Machine Learning & Statistical Learning effects Effect Displays for Linear, Generalized Linear, and Other Models Graphical and tabular effect displays, e.g., of interactions, for various statistical models with linear predictors.
1362 Machine Learning & Statistical Learning elasticnet Elastic-Net for Sparse Estimation and Sparse PCA This package provides functions for fitting the entire solution path of the Elastic-Net and also provides functions for estimating sparse Principal Components. The Lasso solution paths can be computed by the same function. First version: 2005-10.
1363 Machine Learning & Statistical Learning ElemStatLearn Data Sets, Functions and Examples from the Book: “The Elements of Statistical Learning, Data Mining, Inference, and Prediction” by Trevor Hastie, Robert Tibshirani and Jerome Friedman Useful when reading the above-mentioned book, referred to in the documentation as ‘the book’.
1364 Machine Learning & Statistical Learning evclass Evidential Distance-Based Classification Different evidential distance-based classifiers, which provide outputs in the form of Dempster-Shafer mass functions. The methods are: the evidential K-nearest neighbor rule and the evidential neural network.
1365 Machine Learning & Statistical Learning evtree Evolutionary Learning of Globally Optimal Trees Commonly used classification and regression tree methods like the CART algorithm are recursive partitioning methods that build the model in a forward stepwise search. Although this approach is known to be an efficient heuristic, the results of recursive tree methods are only locally optimal, as splits are chosen to maximize homogeneity at the next step only. An alternative way to search over the parameter space of trees is to use global optimization methods like evolutionary algorithms. The ‘evtree’ package implements an evolutionary algorithm for learning globally optimal classification and regression trees in R. CPU and memory-intensive tasks are fully computed in C++ while the ‘partykit’ package is leveraged to represent the resulting trees in R, providing unified infrastructure for summaries, visualizations, and predictions.
1366 Machine Learning & Statistical Learning FCNN4R Fast Compressed Neural Networks for R Provides an interface to kernel routines from the FCNN C++ library. FCNN is based on a completely new Artificial Neural Network representation that offers unmatched efficiency, modularity, and extensibility. FCNN4R provides standard teaching (backpropagation, Rprop, simulated annealing, stochastic gradient) and pruning algorithms (minimum magnitude, Optimal Brain Surgeon), but it is first and foremost an efficient computational engine. Users can easily implement their algorithms by taking advantage of fast gradient computing routines, as well as network reconstruction functionality (removing weights and redundant neurons, reordering inputs, merging networks). Networks can be exported to C functions in order to integrate them into virtually any software solution.
1367 Machine Learning & Statistical Learning frbs Fuzzy Rule-Based Systems for Classification and Regression Tasks An implementation of various learning algorithms based on fuzzy rule-based systems (FRBSs) for dealing with classification and regression tasks. Moreover, it allows the construction of an FRBS model defined by human experts. FRBSs are based on the concept of fuzzy sets, proposed by Zadeh in 1965, which aims at representing the reasoning of human experts in a set of IF-THEN rules, to handle real-life problems in, e.g., control, prediction and inference, data mining, bioinformatics data processing, and robotics. FRBSs are also known as fuzzy inference systems and fuzzy models. During the modeling of an FRBS, there are two important steps that need to be conducted: structure identification and parameter estimation. Nowadays, there exists a wide variety of algorithms to generate fuzzy IF-THEN rules automatically from numerical data, covering both steps. Approaches that have been used in the past are, e.g., heuristic procedures, neuro-fuzzy techniques, clustering methods, genetic algorithms, squares methods, etc. Furthermore, in this version we provide a universal framework named ‘frbsPMML’, which is adopted from the Predictive Model Markup Language (PMML), for representing FRBS models. PMML is an XML-based language that provides a standard for describing models produced by data mining and machine learning algorithms. Therefore, an FRBS model can be exported to and imported from ‘frbsPMML’. Finally, this package aims to implement the most widely used standard procedures, thus offering a standard package for FRBS modeling to the R community.
1368 Machine Learning & Statistical Learning GAMBoost Generalized linear and additive models by likelihood based boosting This package provides routines for fitting generalized linear and generalized additive models by likelihood-based boosting, using penalized B-splines.
1369 Machine Learning & Statistical Learning gamboostLSS Boosting Methods for ‘GAMLSS’ Boosting models for fitting generalized additive models for location, shape and scale (‘GAMLSS’) to potentially high dimensional data.
1370 Machine Learning & Statistical Learning gbm (core) Generalized Boosted Regression Models An implementation of extensions to Freund and Schapire’s AdaBoost algorithm and Friedman’s gradient boosting machine. Includes regression methods for least squares, absolute loss, t-distribution loss, quantile regression, logistic, multinomial logistic, Poisson, Cox proportional hazards partial likelihood, AdaBoost exponential loss, Huberized hinge loss, and Learning to Rank measures (LambdaMart).
1371 Machine Learning & Statistical Learning ggRandomForests Visually Exploring Random Forests Graphic elements for exploring Random Forests using the ‘randomForest’ or ‘randomForestSRC’ package for survival, regression and classification forests and ‘ggplot2’ package plotting.
1372 Machine Learning & Statistical Learning glmnet Lasso and Elastic-Net Regularized Generalized Linear Models Extremely efficient procedures for fitting the entire lasso or elastic-net regularization path for linear regression, logistic and multinomial regression models, Poisson regression and the Cox model. Two recent additions are the multiple-response Gaussian, and the grouped multinomial regression. The algorithm uses cyclical coordinate descent in a path-wise fashion, as described in the paper linked to via the URL below.
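The path-wise workflow described above can be sketched in a few lines (assuming ‘glmnet’ is installed): fit the full regularization path, then cross-validate to choose the penalty.

```r
# Sketch, assuming 'glmnet' is installed; x must be a numeric matrix.
library(glmnet)

x <- as.matrix(mtcars[, -1])  # predictors
y <- mtcars$mpg               # response

fit   <- glmnet(x, y)         # entire lasso path (gaussian family by default)
cvfit <- cv.glmnet(x, y)      # k-fold cross-validation over lambda

coef(cvfit, s = "lambda.min") # coefficients at the best lambda
```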
1373 Machine Learning & Statistical Learning glmpath L1 Regularization Path for Generalized Linear Models and Cox Proportional Hazards Model A path-following algorithm for L1 regularized generalized linear models and Cox proportional hazards model
1374 Machine Learning & Statistical Learning GMMBoost Likelihood-based Boosting for Generalized mixed models Likelihood-based Boosting for Generalized mixed models
1375 Machine Learning & Statistical Learning gmum.r GMUM Machine Learning Group Package Direct R interface to Support Vector Machine libraries (‘LIBSVM’ and ‘SVMLight’) and efficient C++ implementations of Growing Neural Gas and models developed by ‘GMUM’ group (Cross Entropy Clustering and 2eSVM).
1376 Machine Learning & Statistical Learning gradDescent Gradient Descent for Regression Tasks An implementation of various learning algorithms based on Gradient Descent for dealing with regression tasks. The variants of the gradient descent algorithm are: Mini-Batch Gradient Descent (MBGD), which uses the training data partially to reduce the computation load; Stochastic Gradient Descent (SGD), which uses a random data point in learning to reduce the computation load drastically; Stochastic Average Gradient (SAG), which is an SGD-based algorithm that minimizes stochastic steps toward an average; Momentum Gradient Descent (MGD), which is an optimization to speed up gradient descent learning; Accelerated Gradient Descent (AGD), which is an optimization to accelerate gradient descent learning; Adagrad, which is a gradient-descent-based algorithm that accumulates previous costs to do adaptive learning; Adadelta, which is a gradient-descent-based algorithm that uses a Hessian approximation to do adaptive learning; RMSprop, which is a gradient-descent-based algorithm that combines the adaptive learning abilities of Adagrad and Adadelta; and Adam, which is a gradient-descent-based algorithm that uses mean and variance moments to do adaptive learning.
1377 Machine Learning & Statistical Learning grplasso Fitting user specified models with Group Lasso penalty Fits user specified (GLM-) models with Group Lasso penalty
1378 Machine Learning & Statistical Learning grpreg Regularization Paths for Regression Models with Grouped Covariates Efficient algorithms for fitting the regularization path of linear or logistic regression models with grouped penalties. This includes group selection methods such as group lasso, group MCP, and group SCAD as well as bi-level selection methods such as the group exponential lasso, the composite MCP, and the group bridge.
1379 Machine Learning & Statistical Learning h2o R Interface for H2O R scripting functionality for H2O, the open source math engine for big data that computes parallel distributed machine learning algorithms such as generalized linear models, gradient boosting machines, random forests, and neural networks (deep learning) within various cluster environments.
1380 Machine Learning & Statistical Learning hda Heteroscedastic Discriminant Analysis Functions to perform dimensionality reduction for classification if the covariance matrices of the classes are unequal.
1381 Machine Learning & Statistical Learning hdi High-Dimensional Inference Implementation of multiple approaches to perform inference in high-dimensional models.
1382 Machine Learning & Statistical Learning hdm High-Dimensional Metrics Implementation of selected high-dimensional statistical and econometric methods for estimation and inference. Efficient estimators and uniformly valid confidence intervals for various low-dimensional causal/structural parameters that appear in high-dimensional approximately sparse models are provided. The package includes functions for fitting heteroscedasticity-robust Lasso regressions with non-Gaussian errors and for instrumental variable (IV) and treatment effect estimation in a high-dimensional setting. Moreover, the methods enable valid post-selection inference and rely on a theoretically grounded, data-driven choice of the penalty.
1383 Machine Learning & Statistical Learning ICEbox Individual Conditional Expectation Plot Toolbox Implements Individual Conditional Expectation (ICE) plots, a tool for visualizing the model estimated by any supervised learning algorithm. ICE plots refine Friedman’s partial dependence plot by graphing the functional relationship between the predicted response and a covariate of interest for individual observations. Specifically, ICE plots highlight the variation in the fitted values across the range of a covariate of interest, suggesting where and to what extent such heterogeneities may exist.
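The ICE idea itself is easy to sketch in base R (a hypothetical illustration, not the ICEbox API): for each observation, trace the fitted model's prediction as the covariate of interest sweeps over a grid while all other covariates stay fixed.

```r
# Illustrative ICE sketch in base R (not the ICEbox API). An interaction
# term makes the individual curves heterogeneous, which a partial
# dependence plot alone would hide.
set.seed(1)
d <- data.frame(x1 = runif(100), x2 = runif(100))
d$y <- d$x1 * d$x2 + rnorm(100, sd = 0.05)
fit <- lm(y ~ x1 * x2, data = d)

grid <- seq(0, 1, length.out = 21)
ice <- sapply(grid, function(g) {
  d2 <- d
  d2$x1 <- g                                 # vary only the covariate of interest
  predict(fit, newdata = d2)
})                                           # 100 x 21: one ICE curve per row

matplot(grid, t(ice), type = "l", lty = 1, col = "grey",
        xlab = "x1", ylab = "prediction")
lines(grid, colMeans(ice), lwd = 2)          # averaging the rows gives the PD plot
```

The spread of the grey curves around the bold average is exactly the heterogeneity the package's plots are designed to reveal.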
1384 Machine Learning & Statistical Learning ipred Improved Predictors Improved predictive models by indirect classification and bagging for classification, regression and survival problems as well as resampling based estimators of prediction error.
1385 Machine Learning & Statistical Learning kernlab (core) Kernel-Based Machine Learning Lab Kernel-based machine learning methods for classification, regression, clustering, novelty detection, quantile regression and dimensionality reduction. Among other methods ‘kernlab’ includes Support Vector Machines, Spectral Clustering, Kernel PCA, Gaussian Processes and a QP solver.
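One of the methods listed, kernel PCA, can be sketched in base R to show the underlying computation (an illustration under simplifying assumptions; kernlab's kpca() is the proper implementation): build a kernel matrix, double-centre it in feature space, and take its leading eigenvectors.

```r
# Minimal kernel PCA sketch in base R (illustrative; see kernlab::kpca).
# RBF kernel on two concentric rings, then double-centring and eigen.
set.seed(1)
theta <- runif(100, 0, 2 * pi)
r <- sample(c(1, 3), 100, replace = TRUE)          # two concentric rings
X <- cbind(r * cos(theta), r * sin(theta)) + matrix(rnorm(200, sd = 0.1), 100)

rbf <- function(A) { D <- as.matrix(dist(A))^2; exp(-D / 2) }
K <- rbf(X)
n <- nrow(K)
H <- diag(n) - matrix(1 / n, n, n)                 # centring matrix
Kc <- H %*% K %*% H                                # kernel centred in feature space
e <- eigen(Kc, symmetric = TRUE)
scores <- e$vectors[, 1:2] %*% diag(sqrt(pmax(e$values[1:2], 0)))
```

The nonlinear projection separates the rings, which ordinary PCA on X cannot do.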
1386 Machine Learning & Statistical Learning klaR Classification and visualization Miscellaneous functions for classification and visualization developed at the Fakultaet Statistik, Technische Universitaet Dortmund
1387 Machine Learning & Statistical Learning lars Least Angle Regression, Lasso and Forward Stagewise Efficient procedures for fitting an entire lasso sequence with the cost of a single least squares fit. Least angle regression and infinitesimal forward stagewise regression are related to the lasso, as described in the paper below.
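The lasso objective that lars solves can be illustrated with a tiny coordinate-descent solver in base R for a single penalty value (note: lars itself uses the least angle regression path algorithm, not coordinate descent; this is only a sketch of the optimization problem).

```r
# Illustrative lasso via coordinate descent, base R only (lars itself
# uses the LAR path algorithm). Columns of x are standardized.
set.seed(1)
n <- 100; p <- 5
x <- scale(matrix(rnorm(n * p), n, p))
y <- drop(x %*% c(3, -2, 0, 0, 0) + rnorm(n))
y <- y - mean(y)

soft <- function(z, g) sign(z) * pmax(abs(z) - g, 0)   # soft-threshold operator
lasso_cd <- function(x, y, lambda, iters = 200) {
  b <- rep(0, ncol(x)); r <- y
  for (it in 1:iters) for (j in 1:ncol(x)) {
    r <- r + x[, j] * b[j]                   # partial residual without column j
    b[j] <- soft(crossprod(x[, j], r) / nrow(x), lambda) /
            (crossprod(x[, j]) / nrow(x))
    r <- r - x[, j] * b[j]
  }
  b
}
b <- lasso_cd(x, y, lambda = 0.5)            # shrinks and selects coefficients
```

With a moderate penalty the true signal variables survive (shrunken) while the noise coefficients are driven to zero.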
1388 Machine Learning & Statistical Learning lasso2 L1 constrained estimation aka ‘lasso’ Routines and documentation for solving regression problems while imposing an L1 constraint on the estimates, based on the algorithm of Osborne et al. (1998)
1389 Machine Learning & Statistical Learning LiblineaR Linear Predictive Models Based on the ‘LIBLINEAR’ C/C++ Library A wrapper around the ‘LIBLINEAR’ C/C++ library for machine learning (available at http://www.csie.ntu.edu.tw/~cjlin/liblinear). ‘LIBLINEAR’ is a simple library for solving large-scale regularized linear classification and regression. It currently supports L2-regularized classification (such as logistic regression, L2-loss linear SVM and L1-loss linear SVM) as well as L1-regularized classification (such as L2-loss linear SVM and logistic regression) and L2-regularized support vector regression (with L1- or L2-loss). The main features of LiblineaR include multi-class classification (one-vs-the rest, and Crammer & Singer method), cross validation for model selection, probability estimates (logistic regression only) or weights for unbalanced data. The estimation of the models is particularly fast as compared to other libraries.
1390 Machine Learning & Statistical Learning LogicForest Logic Forest Two classification ensemble methods based on logic regression models. Logic Forest uses a bagging approach to construct an ensemble of logic regression models. LBoost uses a combination of boosting and cross-validation to construct an ensemble of logic regression models. Both methods are used for classification of binary responses based on binary predictors and for identification of important variables and variable interactions predictive of a binary outcome.
1391 Machine Learning & Statistical Learning LogicReg Logic Regression Routines for fitting Logic Regression models.
1392 Machine Learning & Statistical Learning LTRCtrees Survival Trees to Fit Left-Truncated and Right-Censored and Interval-Censored Survival Data Recursive partition algorithms designed for fitting survival tree with left-truncated and right censored (LTRC) data, as well as interval-censored data. The LTRC trees can also be used to fit survival tree with time-varying covariates.
1393 Machine Learning & Statistical Learning maptree Mapping, pruning, and graphing tree models Functions with example data for graphing, pruning, and mapping models from hierarchical clustering, and classification and regression trees.
1394 Machine Learning & Statistical Learning mboost (core) Model-Based Boosting Functional gradient descent algorithm (boosting) for optimizing general risk functions utilizing component-wise (penalised) least squares estimates or regression trees as base-learners for fitting generalized linear, additive and interaction models to potentially high-dimensional data.
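The component-wise boosting idea behind mboost can be sketched in base R under simplifying assumptions (simple linear base-learners, squared-error loss): at each step, fit every candidate base-learner to the current residuals and update only the best-fitting one by a small step length.

```r
# Component-wise L2-boosting sketch in base R (illustrative; mboost
# implements this properly with penalised base-learners and trees).
set.seed(1)
n <- 200; p <- 10
x <- scale(matrix(rnorm(n * p), n, p))
y <- drop(x %*% c(2, -1, rep(0, 8)) + rnorm(n))

nu <- 0.1                                    # step length (shrinkage)
beta <- rep(0, p)
f <- rep(mean(y), n)                         # start from the offset
for (m in 1:300) {
  u <- y - f                                 # negative gradient = residuals
  cf <- drop(crossprod(x, u)) / colSums(x^2) # least-squares coef per column
  rss <- sapply(1:p, function(j) sum((u - x[, j] * cf[j])^2))
  j <- which.min(rss)                        # best-fitting base-learner
  beta[j] <- beta[j] + nu * cf[j]
  f <- f + nu * x[, j] * cf[j]
}
```

Because only one coordinate moves per iteration, stopping early yields sparse coefficient vectors, which is what makes boosting usable for high-dimensional data.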
1395 Machine Learning & Statistical Learning mlr Machine Learning in R Interface to a large number of classification and regression techniques, including machine-readable parameter descriptions. There is also an experimental extension for survival analysis, clustering and general, example-specific cost-sensitive learning. Generic resampling, including cross-validation, bootstrapping and subsampling. Hyperparameter tuning with modern optimization techniques, for single- and multi-objective problems. Filter and wrapper methods for feature selection. Extension of basic learners with additional operations common in machine learning, also allowing for easy nested resampling. Most operations can be parallelized.
1396 Machine Learning & Statistical Learning MXM Feature Selection (Including Multiple Solutions) and Bayesian Networks Feature selection methods for identifying minimal, statistically-equivalent and equally-predictive feature subsets. Bayesian network algorithms and related functions are also included. The package name ‘MXM’ stands for “Mens eX Machina”, meaning “Mind from the Machine” in Latin. Reference: Feature Selection with the R Package MXM: Discovering Statistically Equivalent Feature Subsets, Lagani, V. and Athineou, G. and Farcomeni, A. and Tsagris, M. and Tsamardinos, I. (2017). Journal of Statistical Software, 80(7). doi:10.18637/jss.v080.i07.
1397 Machine Learning & Statistical Learning ncvreg Regularization Paths for SCAD and MCP Penalized Regression Models Efficient algorithms for fitting regularization paths for linear or logistic regression models penalized by MCP or SCAD, with optional additional L2 penalty.
1398 Machine Learning & Statistical Learning nnet (core) Feed-Forward Neural Networks and Multinomial Log-Linear Models Software for feed-forward neural networks with a single hidden layer, and for multinomial log-linear models.
1399 Machine Learning & Statistical Learning OneR One Rule Machine Learning Classification Algorithm with Enhancements Implements the One Rule (OneR) Machine Learning classification algorithm (Holte, R.C. (1993) doi:10.1023/A:1022631118932) with enhancements for sophisticated handling of numeric data and missing values together with extensive diagnostic functions. It is useful as a baseline for machine learning models and the rules are often helpful heuristics.
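The One Rule algorithm is simple enough to sketch in base R (an illustration, not the package's API): for every predictor, predict the majority class within each of its values, then keep the single predictor whose rule set has the best training accuracy.

```r
# OneR sketch in base R (illustrative; the OneR package adds numeric
# binning, missing-value handling, and diagnostics).
one_r <- function(data, target) {
  y <- data[[target]]
  preds <- setdiff(names(data), target)
  acc <- sapply(preds, function(p) {
    tab <- table(data[[p]], y)
    sum(apply(tab, 1, max)) / length(y)      # majority class per value
  })
  best <- names(which.max(acc))
  tab <- table(data[[best]], y)
  rules <- colnames(tab)[apply(tab, 1, which.max)]
  names(rules) <- rownames(tab)
  list(feature = best, rules = rules, accuracy = max(acc))
}

d <- iris[c("Petal.Length", "Sepal.Length", "Species")]
d$Petal.Length <- cut(d$Petal.Length, 3)     # OneR works on discretized inputs
d$Sepal.Length <- cut(d$Sepal.Length, 3)
fit <- one_r(d, "Species")                   # selects Petal.Length
```

On iris the single rule on binned petal length already classifies roughly 95% of the training data, which is why OneR makes such a useful baseline.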
1400 Machine Learning & Statistical Learning opusminer OPUS Miner Algorithm for Filtered Top-k Association Discovery Provides a simple R interface to the OPUS Miner algorithm (implemented in C++) for finding the top-k productive, non-redundant itemsets from transaction data. The OPUS Miner algorithm uses the OPUS search algorithm to efficiently discover the key associations in transaction data, in the form of self-sufficient itemsets, using either leverage or lift. See http://i.giwebb.com/index.php/research/association-discovery/ for more information in relation to the OPUS Miner algorithm.
1401 Machine Learning & Statistical Learning pamr Pam: prediction analysis for microarrays Some functions for sample classification in microarrays
1402 Machine Learning & Statistical Learning party A Laboratory for Recursive Partytioning A computational toolbox for recursive partitioning. The core of the package is ctree(), an implementation of conditional inference trees which embed tree-structured regression models into a well defined theory of conditional inference procedures. This non-parametric class of regression trees is applicable to all kinds of regression problems, including nominal, ordinal, numeric, censored as well as multivariate response variables and arbitrary measurement scales of the covariates. Based on conditional inference trees, cforest() provides an implementation of Breiman’s random forests. The function mob() implements an algorithm for recursive partitioning based on parametric models (e.g. linear models, GLMs or survival regression) employing parameter instability tests for split selection. Extensible functionality for visualizing tree-structured regression models is available. The methods are described in Hothorn et al. (2006) doi:10.1198/106186006X133933, Zeileis et al. (2008) doi:10.1198/106186008X319331 and Strobl et al. (2007) doi:10.1186/1471-2105-8-25.
1403 Machine Learning & Statistical Learning partykit A Toolkit for Recursive Partytioning A toolkit with infrastructure for representing, summarizing, and visualizing tree-structured regression and classification models. This unified infrastructure can be used for reading/coercing tree models from different sources (‘rpart’, ‘RWeka’, ‘PMML’) yielding objects that share functionality for print()/plot()/predict() methods. Furthermore, new and improved reimplementations of conditional inference trees (ctree()) and model-based recursive partitioning (mob()) from the ‘party’ package are provided based on the new infrastructure.
1404 Machine Learning & Statistical Learning pdp Partial Dependence Plots A general framework for constructing partial dependence (i.e., marginal effect) plots from various types of machine learning models in R.
1405 Machine Learning & Statistical Learning penalized L1 (Lasso and Fused Lasso) and L2 (Ridge) Penalized Estimation in GLMs and in the Cox Model Fitting possibly high dimensional penalized regression models. The penalty structure can be any combination of an L1 penalty (lasso and fused lasso), an L2 penalty (ridge) and a positivity constraint on the regression coefficients. The supported regression models are linear, logistic and Poisson regression and the Cox Proportional Hazards model. Cross-validation routines allow optimization of the tuning parameters.
1406 Machine Learning & Statistical Learning penalizedLDA Penalized Classification using Fisher’s Linear Discriminant Implements the penalized LDA proposal of “Witten and Tibshirani (2011), Penalized classification using Fisher’s linear discriminant, to appear in Journal of the Royal Statistical Society, Series B”.
1407 Machine Learning & Statistical Learning penalizedSVM Feature Selection SVM using penalty functions This package provides feature selection SVM using penalty functions. The smoothly clipped absolute deviation (SCAD), ‘L1-norm’, ‘Elastic Net’ (‘L1-norm’ and ‘L2-norm’) and ‘Elastic SCAD’ (SCAD and ‘L2-norm’) penalties are available. The tuning parameters can be found using either a fixed grid or an interval search.
1408 Machine Learning & Statistical Learning plotmo Plot a Model’s Response and Residuals Plot model surfaces for a wide variety of models using partial dependence plots and other techniques. Also plot model residuals and other information on the model.
1409 Machine Learning & Statistical Learning quantregForest Quantile Regression Forests Quantile Regression Forests is a tree-based ensemble method for estimation of conditional quantiles. It is particularly well suited for high-dimensional data. Predictor variables of mixed classes can be handled. The package is dependent on the package ‘randomForest’, written by Andy Liaw.
1410 Machine Learning & Statistical Learning randomForest (core) Breiman and Cutler’s Random Forests for Classification and Regression Classification and regression based on a forest of trees using random inputs.
1411 Machine Learning & Statistical Learning randomForestSRC Random Forests for Survival, Regression, and Classification (RF-SRC) A unified treatment of Breiman’s random forests for survival, regression and classification problems based on Ishwaran and Kogalur’s random survival forests (RSF) package. Now extended to include multivariate and unsupervised forests. Also includes quantile regression forests for univariate and multivariate training/testing settings. The package runs in both serial and parallel (OpenMP) modes.
1412 Machine Learning & Statistical Learning ranger A Fast Implementation of Random Forests A fast implementation of Random Forests, particularly suited for high dimensional data. Ensembles of classification, regression, survival and probability prediction trees are supported. Data from genome-wide association studies can be analyzed efficiently. In addition to data frames, datasets of class ‘gwaa.data’ (R package ‘GenABEL’) can be directly analyzed.
1413 Machine Learning & Statistical Learning rattle Graphical User Interface for Data Science in R The R Analytic Tool To Learn Easily (Rattle) provides a Gnome (RGtk2) based interface to R functionality for data science. The aim is to provide a simple and intuitive interface that allows a user to quickly load data from a CSV file (or via ODBC), transform and explore the data, build and evaluate models, and export models as PMML (predictive modelling markup language) or as scores. All of this while knowing little about R. All R commands are logged and commented through the log tab. Thus they are available to the user as a script file or as an aide for the user to learn R or to copy-and-paste directly into R itself. Rattle also exports a number of utility functions and the graphical user interface, invoked as rattle(), does not need to be run to deploy these.
1414 Machine Learning & Statistical Learning Rborist Extensible, Parallelizable Implementation of the Random Forest Algorithm Scalable decision tree training and prediction.
1415 Machine Learning & Statistical Learning RcppDL Deep Learning Methods via Rcpp This package is based on the C++ code from Yusuke Sugomori, which implements basic machine learning methods with many layers (deep learning), including dA (Denoising Autoencoder), SdA (Stacked Denoising Autoencoder), RBM (Restricted Boltzmann machine) and DBN (Deep Belief Nets).
1416 Machine Learning & Statistical Learning rda Shrunken Centroids Regularized Discriminant Analysis Shrunken Centroids Regularized Discriminant Analysis for classification in high-dimensional data.
1417 Machine Learning & Statistical Learning rdetools Relevant Dimension Estimation (RDE) in Feature Spaces The package provides functions for estimating the relevant dimension of a data set in feature spaces, applications to model selection, graphical illustrations and prediction.
1418 Machine Learning & Statistical Learning REEMtree Regression Trees with Random Effects for Longitudinal (Panel) Data This package estimates regression trees with random effects as a way to use data mining techniques to describe longitudinal or panel data.
1419 Machine Learning & Statistical Learning relaxo Relaxed Lasso Relaxed Lasso is a generalisation of the Lasso shrinkage technique for linear regression. Both variable selection and parameter estimation are achieved by regular Lasso, yet the two steps do not necessarily use the same penalty parameter. The results include all standard Lasso solutions but often allow for sparser models while having similar or even slightly better predictive performance if many predictor variables are present. The package depends on the LARS package.
1420 Machine Learning & Statistical Learning rgenoud R Version of GENetic Optimization Using Derivatives A genetic algorithm plus derivative optimizer.
1421 Machine Learning & Statistical Learning rgp R genetic programming framework RGP is a simple modular Genetic Programming (GP) system built in pure R. In addition to general GP tasks, the system supports Symbolic Regression by GP through the familiar R model formula interface. GP individuals are represented as R expressions, an (optional) type system enables domain-specific function sets containing functions of diverse domain- and range types. A basic set of genetic operators for variation (mutation and crossover) and selection is provided.
1422 Machine Learning & Statistical Learning RLT Reinforcement Learning Trees Random forest with a variety of additional features for regression, classification and survival analysis. The features include: parallel computing with OpenMP, embedded model for selecting the splitting variable (based on Zhu, Zeng & Kosorok, 2015), subject weight, variable weight, tracking subjects used in each tree, etc.
1423 Machine Learning & Statistical Learning Rmalschains Continuous Optimization using Memetic Algorithms with Local Search Chains (MA-LS-Chains) in R An implementation of an algorithm family for continuous optimization called memetic algorithms with local search chains (MA-LS-Chains). Memetic algorithms are hybridizations of genetic algorithms with local search methods. They are especially suited for continuous optimization.
1424 Machine Learning & Statistical Learning rminer Data Mining Classification and Regression Methods Facilitates the use of data mining algorithms in classification and regression (including time series forecasting) tasks by presenting a short and coherent set of functions. Versions: 1.4.2 new NMAE metric, “xgboost” and “cv.glmnet” models (16 classification and 18 regression models); 1.4.1 new tutorial and more robust version; 1.4 - new classification and regression models/algorithms, with a total of 14 classification and 15 regression methods, including: Decision Trees, Neural Networks, Support Vector Machines, Random Forests, Bagging and Boosting; 1.3 and 1.3.1 - new classification and regression metrics (improved mmetric function); 1.2 - new input importance methods (improved Importance function); 1.0 - first version.
1425 Machine Learning & Statistical Learning rnn Recurrent Neural Network Implementation of a Recurrent Neural Network in R.
1426 Machine Learning & Statistical Learning ROCR Visualizing the Performance of Scoring Classifiers ROC graphs, sensitivity/specificity curves, lift charts, and precision/recall plots are popular examples of trade-off visualizations for specific pairs of performance measures. ROCR is a flexible tool for creating cutoff-parameterized 2D performance curves by freely combining two from over 25 performance measures (new performance measures can be added using a standard interface). Curves from different cross-validation or bootstrapping runs can be averaged by different methods, and standard deviations, standard errors or box plots can be used to visualize the variability across the runs. The parameterization can be visualized by printing cutoff values at the corresponding curve positions, or by coloring the curve according to cutoff. All components of a performance plot can be quickly adjusted using a flexible parameter dispatching mechanism. Despite its flexibility, ROCR is easy to use, with only three commands and reasonable default values for all optional parameters.
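The cutoff-parameterized curves that ROCR draws can be illustrated in base R for the ROC case: sweep a threshold over the classifier scores and trace the resulting (false positive rate, true positive rate) pairs.

```r
# Base-R sketch of a cutoff-parameterized ROC curve (illustrative;
# ROCR generalizes this to 25+ performance measures and averaging).
set.seed(1)
labels <- rep(c(0, 1), each = 100)
scores <- rnorm(200, mean = labels)          # positives score higher on average

cuts <- sort(unique(scores), decreasing = TRUE)
tpr <- sapply(cuts, function(t) mean(scores[labels == 1] >= t))
fpr <- sapply(cuts, function(t) mean(scores[labels == 0] >= t))

# Area under the curve by the trapezoidal rule
auc <- sum(diff(c(0, fpr)) * (tpr + c(0, head(tpr, -1))) / 2)
plot(c(0, fpr), c(0, tpr), type = "l", xlab = "FPR", ylab = "TPR")
```

Every point on the curve corresponds to one cutoff, which is exactly the parameterization ROCR lets you print or colour along the curve.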
1427 Machine Learning & Statistical Learning RoughSets Data Analysis Using Rough Set and Fuzzy Rough Set Theories Implementations of algorithms for data analysis based on the rough set theory (RST) and the fuzzy rough set theory (FRST). We not only provide implementations for the basic concepts of RST and FRST but also popular algorithms that derive from those theories. The methods included in the package can be divided into several categories based on their functionality: discretization, feature selection, instance selection, rule induction and classification based on nearest neighbors. RST was introduced by Zdzisław Pawlak in 1982 as a sophisticated mathematical tool to model and process imprecise or incomplete information. By using the indiscernibility relation for objects/instances, RST does not require additional parameters to analyze the data. FRST is an extension of RST. The FRST combines concepts of vagueness and indiscernibility that are expressed with fuzzy sets (as proposed by Zadeh, in 1965) and RST.
1428 Machine Learning & Statistical Learning rpart (core) Recursive Partitioning and Regression Trees Recursive partitioning for classification, regression and survival trees. An implementation of most of the functionality of the 1984 book by Breiman, Friedman, Olshen and Stone.
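The key primitive of recursive partitioning, an exhaustive search for the best single split, can be sketched in base R (an illustration only; rpart implements the full CART machinery with pruning, surrogate splits, and multiple response types).

```r
# Best single split for a regression tree, base R sketch (illustrative;
# rpart implements the complete 1984 CART algorithm).
best_split <- function(x, y) {
  cuts <- sort(unique(x))
  cuts <- head(cuts, -1) + diff(cuts) / 2    # candidate midpoints
  sse <- sapply(cuts, function(cc) {
    l <- y[x <= cc]; r <- y[x > cc]          # left/right child responses
    sum((l - mean(l))^2) + sum((r - mean(r))^2)
  })
  cuts[which.min(sse)]                       # split minimizing within-node SSE
}

set.seed(1)
x <- runif(200)
y <- ifelse(x < 0.4, 1, 5) + rnorm(200, sd = 0.3)
s <- best_split(x, y)                        # recovers the jump near 0.4
```

A tree grower simply applies this search recursively to each child node until a stopping rule fires.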
1429 Machine Learning & Statistical Learning RPMM Recursively Partitioned Mixture Model Recursively Partitioned Mixture Model for Beta and Gaussian Mixtures. This is a model-based clustering algorithm that returns a hierarchy of classes, similar to hierarchical clustering, but also similar to finite mixture models.
1430 Machine Learning & Statistical Learning RSNNS Neural Networks in R using the Stuttgart Neural Network Simulator (SNNS) The Stuttgart Neural Network Simulator (SNNS) is a library containing many standard implementations of neural networks. This package wraps the SNNS functionality to make it available from within R. Using the ‘RSNNS’ low-level interface, all of the algorithmic functionality and flexibility of SNNS can be accessed. Furthermore, the package contains a convenient high-level interface, so that the most common neural network topologies and learning algorithms integrate seamlessly into R.
1431 Machine Learning & Statistical Learning RWeka R/Weka Interface An R interface to Weka (Version 3.9.1). Weka is a collection of machine learning algorithms for data mining tasks written in Java, containing tools for data pre-processing, classification, regression, clustering, association rules, and visualization. Package ‘RWeka’ contains the interface code, the Weka jar is in a separate package ‘RWekajars’. For more information on Weka see http://www.cs.waikato.ac.nz/ml/weka/.
1432 Machine Learning & Statistical Learning RXshrink Maximum Likelihood Shrinkage via Generalized Ridge or Least Angle Regression Identify and display TRACEs for a specified shrinkage path and determine the extent of shrinkage most likely, under normal distribution theory, to produce an optimal reduction in MSE Risk in estimates of regression (beta) coefficients.
1433 Machine Learning & Statistical Learning sda Shrinkage Discriminant Analysis and CAT Score Variable Selection Provides an efficient framework for high-dimensional linear and diagonal discriminant analysis with variable selection. The classifier is trained using James-Stein-type shrinkage estimators and predictor variables are ranked using correlation-adjusted t-scores (CAT scores). Variable selection error is controlled using false non-discovery rates or higher criticism.
1434 Machine Learning & Statistical Learning SIS Sure Independence Screening Variable selection techniques are essential tools for model selection and estimation in high-dimensional statistical models. Through this publicly available package, we provide a unified environment to carry out variable selection using iterative sure independence screening (SIS) and all of its variants in generalized linear models and the Cox proportional hazards model.
1435 Machine Learning & Statistical Learning spa Implements The Sequential Predictions Algorithm Implements the Sequential Predictions Algorithm
1436 Machine Learning & Statistical Learning stabs Stability Selection with Error Control Resampling procedures to assess the stability of selected variables with additional finite sample error control for high-dimensional variable selection procedures such as Lasso or boosting. Both, standard stability selection (Meinshausen & Buhlmann, 2010, doi:10.1111/j.1467-9868.2010.00740.x) and complementary pairs stability selection with improved error bounds (Shah & Samworth, 2013, doi:10.1111/j.1467-9868.2011.01034.x) are implemented. The package can be combined with arbitrary user specified variable selection approaches.
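The resampling scheme behind stability selection can be sketched in base R (illustrative; the stand-in selector here simply keeps the q variables most correlated with the response, whereas stabs wraps real selectors such as the lasso or boosting): rerun the selector on many subsamples and keep the variables whose selection frequency exceeds a threshold.

```r
# Stability selection sketch in base R (illustrative; see the stabs
# package for the error-controlled version with lasso/boosting).
set.seed(1)
n <- 100; p <- 20; q <- 3
x <- matrix(rnorm(n * p), n, p)
y <- drop(x %*% c(2, 2, rep(0, p - 2)) + rnorm(n))

freq <- rowMeans(replicate(100, {
  i <- sample(n, n %/% 2)                     # subsample half the data
  sel <- order(-abs(cor(x[i, ], y[i])))[1:q]  # stand-in base selector
  tabulate(sel, nbins = p) > 0                # which variables were chosen
}))
stable <- which(freq >= 0.75)                 # selection-frequency threshold
```

The two signal variables are selected in essentially every subsample, while no noise variable comes close to the threshold.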
1437 Machine Learning & Statistical Learning SuperLearner Super Learner Prediction Implements the super learner prediction method and contains a library of prediction algorithms to be used in the super learner.
1438 Machine Learning & Statistical Learning svmpath The SVM Path Algorithm Computes the entire regularization path for the two-class svm classifier with essentially the same cost as a single SVM fit.
1439 Machine Learning & Statistical Learning tensorflow R Interface to ‘TensorFlow’ Interface to ‘TensorFlow’ https://www.tensorflow.org/, an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more ‘CPUs’ or ‘GPUs’ in a desktop, server, or mobile device with a single ‘API’. ‘TensorFlow’ was originally developed by researchers and engineers working on the Google Brain Team within Google’s Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.
1440 Machine Learning & Statistical Learning tgp Bayesian Treed Gaussian Process Models Bayesian nonstationary, semiparametric nonlinear regression and design by treed Gaussian processes (GPs) with jumps to the limiting linear model (LLM). Special cases also implemented include Bayesian linear models, CART, treed linear models, stationary separable and isotropic GPs, and GP single-index models. Provides 1-d and 2-d plotting functions (with projection and slice capabilities) and tree drawing, designed for visualization of tgp-class output. Sensitivity analysis and multi-resolution models are supported. Sequential experimental design and adaptive sampling functions are also provided, including ALM, ALC, and expected improvement. The latter supports derivative-free optimization of noisy black-box functions.
1441 Machine Learning & Statistical Learning tree Classification and Regression Trees Classification and regression trees.
1442 Machine Learning & Statistical Learning varSelRF Variable Selection using Random Forests Variable selection from random forests using both backwards variable elimination (for the selection of small sets of non-redundant variables) and selection based on the importance spectrum (somewhat similar to scree plots; for the selection of large, potentially highly-correlated variables). Main applications in high-dimensional data (e.g., microarray data, and other genomics and proteomics applications).
1443 Machine Learning & Statistical Learning vcrpart Tree-Based Varying Coefficient Regression for Generalized Linear and Ordinal Mixed Models Recursive partitioning for varying coefficient generalized linear models and ordinal linear mixed models. Special features are coefficient-wise partitioning, non-varying coefficients and partitioning of time-varying variables in longitudinal regression.
1444 Machine Learning & Statistical Learning wsrf Weighted Subspace Random Forest for Classification A parallel implementation of Weighted Subspace Random Forest. The Weighted Subspace Random Forest algorithm was proposed in the International Journal of Data Warehousing and Mining by Baoxun Xu, Joshua Zhexue Huang, Graham Williams, Qiang Wang, and Yunming Ye (2012) doi:10.4018/jdwm.2012040103. The algorithm can classify very high-dimensional data with random forests built using small subspaces. A novel variable weighting method is used for variable subspace selection in place of the traditional random variable sampling. This new approach is particularly useful in building models from high-dimensional data.
1445 Machine Learning & Statistical Learning xgboost Extreme Gradient Boosting Extreme Gradient Boosting, which is an efficient implementation of the gradient boosting framework from Chen & Guestrin (2016) doi:10.1145/2939672.2939785. This package is its R interface. The package includes an efficient linear model solver and tree learning algorithms. The package can automatically do parallel computation on a single machine which could be more than 10 times faster than existing gradient boosting packages. It supports various objective functions, including regression, classification and ranking. The package is made to be extensible, so that users are also allowed to define their own objectives easily.
1446 Medical Image Analysis adaptsmoFMRI Adaptive Smoothing of FMRI Data This package contains R functions for estimating the blood oxygenation level dependent (BOLD) effect by using functional Magnetic Resonance Imaging (fMRI) data, based on adaptive Gauss Markov random fields, for real as well as simulated data. The implemented simulations make use of efficient Markov Chain Monte Carlo methods.
1447 Medical Image Analysis adimpro (core) Adaptive Smoothing of Digital Images Implements tools for manipulation of digital images and the Propagation Separation approach by Polzehl and Spokoiny (2006) doi:10.1007/s00440-005-0464-1 for smoothing digital images, see Polzehl and Tabelow (2007) doi:10.18637/jss.v019.i01.
1448 Medical Image Analysis brainwaver Basic wavelet analysis of multivariate time series with a visualisation and parametrisation using graph theory This package computes the correlation matrix for each scale of a wavelet decomposition, namely the one performed by the R package waveslim (Whitcher, 2000). A hypothesis test is applied to each entry of one matrix in order to construct an adjacency matrix of a graph. The graph obtained is finally analysed using the small-world theory (Watts and Strogatz, 1998) and using the computation of efficiency (Latora, 2001), tested using simulated attacks. The brainwaver project is complementary to the camba project for brain-data preprocessing. A collection of scripts (with a makefile) is available to download along with the brainwaver package, see information on the webpage mentioned below.
1449 Medical Image Analysis arf3DS4 (core) Activated Region Fitting, fMRI data analysis (3D) Activated Region Fitting (ARF) is an analysis method for fMRI data.
1450 Medical Image Analysis bayesImageS Bayesian Methods for Image Segmentation using a Potts Model Various algorithms for segmentation of 2D and 3D images, such as computed tomography and satellite remote sensing. This package implements Bayesian image analysis using the hidden Potts model with external field prior. Latent labels are sampled using chequerboard updating or Swendsen-Wang. Algorithms for the smoothing parameter include pseudolikelihood, path sampling, the exchange algorithm, approximate Bayesian computation (ABC-MCMC and ABC-SMC), and Bayesian indirect likelihood (BIL).
1451 Medical Image Analysis bayesm Bayesian Inference for Marketing/Micro-Econometrics Covers many important models used in marketing and micro-econometrics applications. The package includes: Bayes Regression (univariate or multivariate dep var), Bayes Seemingly Unrelated Regression (SUR), Binary and Ordinal Probit, Multinomial Logit (MNL) and Multinomial Probit (MNP), Multivariate Probit, Negative Binomial (Poisson) Regression, Multivariate Mixtures of Normals (including clustering), Dirichlet Process Prior Density Estimation with normal base, Hierarchical Linear Models with normal prior and covariates, Hierarchical Linear Models with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a Dirichlet Process prior and covariates, Hierarchical Negative Binomial Regression Models, Bayesian analysis of choice-based conjoint data, Bayesian treatment of linear instrumental variables models, analysis of multivariate ordinal survey data with scale usage heterogeneity (as in Rossi et al., JASA (01)), and Bayesian analysis of aggregate random coefficient logit models as in BLP (see Jiang, Manchanda, Rossi 2009). For further reference, consult our book, Bayesian Statistics and Marketing by Rossi, Allenby and McCulloch (Wiley 2005), and Bayesian Non- and Semi-Parametric Methods and Applications (Princeton U Press 2014).
1452 Medical Image Analysis brainR Helper Functions to ‘misc3d’ and ‘rgl’ Packages for Brain Imaging This includes functions for creating 3D and 4D images using ‘WebGL’, ‘rgl’, and ‘JavaScript’ commands. This package relies on the X toolkit (‘XTK’, https://github.com/xtk/X#readme).
1453 Medical Image Analysis brainwaver Basic wavelet analysis of multivariate time series with a visualisation and parametrisation using graph theory This package computes the correlation matrix for each scale of a wavelet decomposition, namely the one performed by the R package waveslim (Whitcher, 2000). A hypothesis test is applied to each entry of one matrix in order to construct an adjacency matrix of a graph. The graph obtained is then analysed using small-world theory (Watts and Strogatz, 1998) and the computation of efficiency (Latora, 2001), tested using simulated attacks. The brainwaver project is complementary to the camba project for brain-data preprocessing. A collection of scripts (with a makefile) is available to download along with the brainwaver package; see the information on the webpage mentioned below.
1454 Medical Image Analysis cudaBayesreg CUDA Parallel Implementation of a Bayesian Multilevel Model for fMRI Data Analysis Compute Unified Device Architecture (CUDA) is a software platform for massively parallel high-performance computing on NVIDIA GPUs. This package provides a CUDA implementation of a Bayesian multilevel model for the analysis of brain fMRI data. An fMRI data set consists of time series of volume data in 4D space. Typically, volumes are collected as slices of 64 x 64 voxels. Analysis of fMRI data often relies on fitting linear regression models at each voxel of the brain. The volume of the data to be processed, and the type of statistical analysis to perform in fMRI analysis, call for high-performance computing strategies. In this package, the CUDA programming model uses a separate thread for fitting a linear regression model at each voxel in parallel. The global statistical model implements a Gibbs Sampler for hierarchical linear models with a normal prior. This model has been proposed by Rossi, Allenby and McCulloch in ‘Bayesian Statistics and Marketing’, Chapter 3, and is referred to as ‘rhierLinearModel’ in the R-package bayesm. A notebook equipped with an NVIDIA ‘GeForce 8400M GS’ card having Compute Capability 1.1 has been used in the tests. The data sets used in the package’s examples are available in the separate package cudaBayesregData.
1455 Medical Image Analysis DATforDCEMRI (core) Deconvolution Analysis Tool for Dynamic Contrast Enhanced MRI This package performs voxel-wise deconvolution analysis of DCE-MRI contrast agent concentration versus time data and generates the Impulse Response Function, which can be used to approximate commonly utilized kinetic parameters such as Ktrans and ve. An interactive advanced voxel diagnosis tool (AVDT) is also provided to facilitate easy navigation of voxel-wise data.
1456 Medical Image Analysis dcemriS4 (core) A Package for Image Analysis of DCE-MRI (S4 Implementation) A collection of routines and documentation that allows one to perform voxel-wise quantitative analysis of dynamic contrast-enhanced MRI (DCE-MRI) and diffusion-weighted imaging (DWI) data, with emphasis on oncology applications.
1457 Medical Image Analysis divest (core) Get Images Out of DICOM Format Quickly Provides tools to convert DICOM-format files to NIfTI-1 format.
1458 Medical Image Analysis dpmixsim (core) Dirichlet Process Mixture model simulation for clustering and image segmentation The package implements a Dirichlet Process Mixture (DPM) model for clustering and image segmentation. The DPM model is a Bayesian nonparametric methodology that relies on MCMC simulations for exploring mixture models with an unknown number of components. The code implements conjugate models with normal structure (conjugate normal-normal DP mixture model). The package’s applications are oriented towards the classification of magnetic resonance images according to tissue type or region of interest.
1459 Medical Image Analysis dti (core) Analysis of Diffusion Weighted Imaging (DWI) Data Diffusion Weighted Imaging (DWI) is a Magnetic Resonance Imaging modality that measures the diffusion of water in tissues such as the human brain. The package contains R-functions to process diffusion-weighted data. The functionality includes diffusion tensor imaging (DTI), diffusion kurtosis imaging (DKI), modeling for high angular resolution diffusion weighted imaging (HARDI) using Q-ball reconstruction and tensor mixture models, several methods for structural adaptive smoothing including POAS and msPOAS, and streamline fiber tracking for tensor and tensor mixture models. The package provides functionality to manipulate and visualize results in 2D and 3D.
1460 Medical Image Analysis edfReader (core) Reading EDF(+) and BDF(+) Files Reads European Data Format files EDF and EDF+, see http://www.edfplus.info, BioSemi Data Format files BDF, see http://www.biosemi.com/faq/file_format.htm, and BDF+ files, see http://www.teuniz.net/edfbrowser/bdfplus%20format%20description.html. The files are read in two steps: first the header is read and then the signals (using the header object as a parameter).
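The two-step reading process described above (header first, then signals) looks like this in practice; the file name is a placeholder, and 'edfReader' must be installed:

```r
library(edfReader)

# Step 1: read the file header ("recording.edf" is an illustrative path)
hdr <- readEdfHeader("recording.edf")

# Step 2: read the signals, using the header object as a parameter
sig <- readEdfSignals(hdr)
```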
1461 Medical Image Analysis eegkit (core) Toolkit for Electroencephalography Data Analysis and visualization tools for electroencephalography (EEG) data. Includes functions for plotting (a) EEG caps, (b) single- and multi-channel EEG time courses, and (c) EEG spatial maps. Also includes smoothing and Independent Component Analysis functions for EEG data analysis, and a function for simulating event-related potential EEG data.
1462 Medical Image Analysis fmri (core) Analysis of fMRI Experiments Contains R-functions to perform an fMRI analysis as described in Tabelow et al. (2006) doi:10.1016/j.neuroimage.2006.06.029, Polzehl et al. (2010) doi:10.1016/j.neuroimage.2010.04.241, Tabelow and Polzehl (2011) doi:10.18637/jss.v044.i11.
1463 Medical Image Analysis fslr Wrapper Functions for FSL (‘FMRIB’ Software Library) from Functional MRI of the Brain (‘FMRIB’) Wrapper functions that interface with ‘FSL’ http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/, a powerful and commonly-used ‘neuroimaging’ software, using system commands. The goal is to be able to interface with ‘FSL’ completely in R, where you pass R objects of class ‘nifti’, implemented by package ‘oro.nifti’, and the function executes an ‘FSL’ command and returns an R object of class ‘nifti’ if desired.
1464 Medical Image Analysis gdimap (core) Generalized Diffusion Magnetic Resonance Imaging Diffusion anisotropy has been used to characterize white matter neuronal pathways in the human brain, and infer global connectivity in the central nervous system. The package implements algorithms to estimate and visualize the orientation of neuronal pathways in model-free methods (q-space imaging methods). For estimating fibre orientations two methods have been implemented. One method implements fibre orientation detection through local maxima extraction. A second more robust method is based on directional statistical clustering of ODF voxel data. Fibre orientations in multiple fibre voxels are estimated using a mixture of von Mises-Fisher (vMF) distributions. This statistical estimation procedure is used to resolve crossing fibre configurations. Reconstruction of orientation distribution function (ODF) profiles may be performed using the standard generalized q-sampling imaging (GQI) approach, Garyfallidis’ GQI (GQI2) approach, or Aganj’s variant of the Q-ball imaging (CSA-QBI) approach. Procedures for the visualization of RGB-maps, line-maps and glyph-maps of real diffusion magnetic resonance imaging (dMRI) data-sets are included in the package.
1465 Medical Image Analysis KATforDCEMRI (core) Kinetic analysis and visualization of DCE-MRI data Package for kinetic analysis of longitudinal voxel-wise Dynamic Contrast Enhanced MRI data. Includes tools for visualization and exploration of voxel-wise parametric maps.
1466 Medical Image Analysis mmand (core) Mathematical Morphology in Any Number of Dimensions Provides tools for performing mathematical morphology operations, such as erosion and dilation, on data of arbitrary dimensionality. Can also be used for finding connected components, resampling, filtering, smoothing and other image processing-style operations.
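A small sketch of the erosion/dilation and connected-component operations mentioned above, on a toy binary image (assumes 'mmand' is installed):

```r
library(mmand)

# A small binary image: a 3x3 square of ones in a 9x9 field
x <- matrix(0, nrow = 9, ncol = 9)
x[4:6, 4:6] <- 1

# A 3x3 box-shaped structuring element
k <- shapeKernel(c(3, 3), type = "box")

dilate(x, k)       # dilation grows the square
erode(x, k)        # erosion shrinks it
components(x, k)   # labels connected components
```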
1467 Medical Image Analysis Morpho (core) Calculations and Visualisations Related to Geometric Morphometrics A toolset for Geometric Morphometrics and mesh processing. This includes (among other things) mesh deformations based on reference points, permutation tests, detection of outliers, processing of sliding semi-landmarks and semi-automated surface landmark placement.
1468 Medical Image Analysis mritc (core) MRI Tissue Classification Various methods for MRI tissue classification.
1469 Medical Image Analysis neuroim (core) Data Structures and Handling for Neuroimaging Data A collection of data structures that represent volumetric brain imaging data. The focus is on basic data handling for 3D and 4D neuroimaging data. In addition, there are functions to read and write NIFTI files and limited support for reading AFNI files.
1470 Medical Image Analysis neuRosim (core) Functions to Generate fMRI Data Including Activated Data, Noise Data and Resting State Data The package allows users to generate fMRI time series or 4D data. Some high-level functions are created for fast data generation with only a few arguments and a diversity of functions to define activation and noise. For more advanced users it is possible to use the low-level functions and manipulate the arguments.
1471 Medical Image Analysis occ (core) Estimates PET neuroreceptor occupancies This package provides a generic function for estimating positron emission tomography (PET) neuroreceptor occupancies from the total volumes of distribution of a set of regions of interest. Fitting methods include the simple ‘reference region’ and ‘ordinary least squares’ (sometimes known as the occupancy plot) methods, as well as the more efficient ‘restricted maximum likelihood estimation’.
1472 Medical Image Analysis oro.dicom (core) Rigorous - DICOM Input / Output Data input/output functions for data that conform to the Digital Imaging and Communications in Medicine (DICOM) standard, part of the Rigorous Analytics bundle.
1473 Medical Image Analysis oro.nifti (core) Rigorous - NIfTI + ANALYZE + AFNI : Input / Output Functions for the input/output and visualization of medical imaging data that follow either the ANALYZE, NIfTI or AFNI formats. This package is part of the Rigorous Analytics bundle.
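A minimal sketch of NIfTI input/output and visualization with 'oro.nifti' (the file name is illustrative; the package must be installed):

```r
library(oro.nifti)

# Read a NIfTI volume ("t1" is a placeholder; the .nii/.nii.gz extension is added)
img <- readNIfTI("t1", reorient = FALSE)

orthographic(img)            # three-plane (axial/coronal/sagittal) display
writeNIfTI(img, "t1_copy")   # write the volume back out
```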
1474 Medical Image Analysis PET Simulation and Reconstruction of PET Images This package implements several analytic/direct and iterative reconstruction methods by Peter Toft. It also offers the possibility to simulate PET data.
1475 Medical Image Analysis PTAk Principal Tensor Analysis on k Modes A multiway method to decompose a tensor (array) of any order, as a generalisation of SVD also supporting non-identity metrics and penalisations. 2-way SVD with these extensions is also available. The package also includes some other multiway methods: PCAn (Tucker-n) and PARAFAC/CANDECOMP with these extensions.
1476 Medical Image Analysis RNifti (core) Fast R and C++ Access to NIfTI Images Provides very fast access to images stored in the NIfTI-1 file format http://www.nitrc.org/docman/view.php/26/64/nifti1.h, with seamless synchronisation between compiled C and interpreted R code. Not to be confused with ‘RNiftyReg’, which provides tools for image registration.
1477 Medical Image Analysis RNiftyReg (core) Image Registration Using the ‘NiftyReg’ Library Provides an ‘R’ interface to the ‘NiftyReg’ image registration tools http://sourceforge.net/projects/niftyreg/. Linear and nonlinear registration are supported, in two and three dimensions.
1478 Medical Image Analysis Rvcg (core) Manipulations of Triangular Meshes Based on the ‘VCGLIB’ API Operations on triangular meshes based on ‘VCGLIB’. This package integrates nicely with the R-package ‘rgl’ to render the meshes processed by ‘Rvcg’. The Visualization and Computer Graphics Library (VCG for short) is an open-source, portable, C++ templated library for the manipulation, processing and OpenGL display of triangle and tetrahedral meshes. The library, comprising more than 100k lines of code, is released under the GPL license, and it is the base of most of the software tools of the Visual Computing Lab of the Italian National Research Council Institute ISTI http://vcg.isti.cnr.it, such as ‘metro’ and ‘MeshLab’. The ‘VCGLIB’ source is pulled from trunk https://github.com/cnr-isti-vclab/vcglib and patched to work with options determined by the configure script as well as to work with the header files included by ‘RcppEigen’.
1479 Medical Image Analysis tractor.base (core) Read, Manipulate and Visualise Magnetic Resonance Images Functions for working with magnetic resonance images. Analyze, NIfTI-1, NIfTI-2 and MGH format images can be read and written; DICOM files can only be read.
1480 Medical Image Analysis waveslim Basic wavelet routines for one-, two- and three-dimensional signal processing Basic wavelet routines for time series (1D), image (2D) and array (3D) analysis. The code provided here is based on wavelet methodology developed in Percival and Walden (2000); Gencay, Selcuk and Whitcher (2001); the dual-tree complex wavelet transform (DTCWT) from Kingsbury (1999, 2001) as implemented by Selesnick; and Hilbert wavelet pairs (Selesnick 2001, 2002). All figures in chapters 4-7 of GSW (2001) are reproducible using this package and R code available at the book website(s) below.
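A brief sketch of the 1D wavelet routines described above: a forward discrete wavelet transform and its inverse (assumes 'waveslim' is installed):

```r
library(waveslim)

x  <- rnorm(256)                          # a 1D signal (dyadic length suits n.levels)
wt <- dwt(x, wf = "la8", n.levels = 4)    # DWT with the LA(8) filter of Daubechies
xr <- idwt(wt)                            # inverse transform reconstructs the signal

max(abs(x - xr))                          # reconstruction error near machine precision
```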
1481 Meta-Analysis altmeta Alternative Meta-Analysis Methods Provides alternative statistical methods for meta-analysis, including new heterogeneity tests and measures that are robust to outliers.
1482 Meta-Analysis bamdit Bayesian Meta-Analysis of Diagnostic Test Data Functions for Bayesian meta-analysis of diagnostic test data based on a scale-mixture bivariate random-effects model.
1483 Meta-Analysis bayesmeta Bayesian Random-Effects Meta-Analysis A collection of functions for deriving the posterior distribution of the two parameters in a random-effects meta-analysis and for evaluating joint and marginal posterior probability distributions, predictive distributions, shrinkage effects, posterior predictive p-values, etc.
1484 Meta-Analysis bmeta Bayesian Meta-Analysis and Meta-Regression Provides a collection of functions for conducting Bayesian meta-analyses in R. The package includes functions for computing various effect size or outcome measures (e.g. odds ratios, mean differences and incidence rate ratios) for different types of data based on MCMC simulations. Users can fit fixed- and random-effects models with different priors to the data. Meta-regression can be carried out if effects of additional covariates are observed. Furthermore, the package provides functions for creating posterior distribution plots and forest plots to display the main model output. Traceplots and some other diagnostic plots are also available for assessing model fit and performance.
1485 Meta-Analysis bspmma bspmma: Bayesian Semiparametric Models for Meta-Analysis Some functions for nonparametric and semiparametric Bayesian models for random-effects meta-analysis.
1486 Meta-Analysis CAMAN Finite Mixture Models and Meta-Analysis Tools - Based on C.A.MAN Tools for the analysis of finite semiparametric mixtures. These are useful when data is heterogeneous, e.g. in pharmacokinetics or meta-analysis. The NPMLE and VEM algorithms (flexible support size) and EM algorithms (fixed support size) are provided for univariate and bivariate data.
1487 Meta-Analysis CIAAWconsensus Isotope Ratio Meta-Analysis Calculation of consensus values for atomic weights, isotope amount ratios, and isotopic abundances, with the associated uncertainties, using a multivariate meta-regression approach for consensus building.
1488 Meta-Analysis clubSandwich Cluster-Robust (Sandwich) Variance Estimators with Small-Sample Corrections Provides several cluster-robust variance estimators (i.e., sandwich estimators) for ordinary and weighted least squares linear regression models, including the bias-reduced linearization estimator introduced by Bell and McCaffrey (2002) http://www.statcan.gc.ca/pub/12-001-x/2002002/article/9058-eng.pdf and developed further by Pustejovsky and Tipton (2017) doi:10.1080/07350015.2016.1247004. The package includes functions for estimating the variance- covariance matrix and for testing single- and multiple-contrast hypotheses based on Wald test statistics. Tests of single regression coefficients use Satterthwaite or saddle-point corrections. Tests of multiple-contrast hypotheses use an approximation to Hotelling’s T-squared distribution. Methods are provided for a variety of fitted models, including lm() and mlm objects, glm(), ivreg (from package ‘AER’), plm() (from package ‘plm’), gls() and lme() (from ‘nlme’), robu() (from ‘robumeta’), and rma.uni() and rma.mv() (from ‘metafor’).
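The variance estimation and single-coefficient testing described above can be sketched as follows; the clustering variable here is purely illustrative, and 'clubSandwich' must be installed:

```r
library(clubSandwich)

# An ordinary lm() fit; mtcars$cyl serves as an illustrative clustering variable
fit <- lm(mpg ~ wt + hp, data = mtcars)

# CR2: the bias-reduced linearization cluster-robust covariance matrix
V <- vcovCR(fit, cluster = mtcars$cyl, type = "CR2")

# Satterthwaite-corrected t-tests of the individual regression coefficients
coef_test(fit, vcov = V)
```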
1489 Meta-Analysis compute.es Compute Effect Sizes This package contains several functions for calculating the most widely used effect sizes (ES), along with their variances, confidence intervals and p-values. The output includes ES’s of d (mean difference), g (unbiased estimate of d), r (correlation coefficient), z’ (Fisher’s z), and OR (odds ratio and log odds ratio). In addition, NNT (number needed to treat), U3, CLES (Common Language Effect Size) and Cliff’s Delta are computed. This package uses recommended formulas as described in The Handbook of Research Synthesis and Meta-Analysis (Cooper, Hedges, & Valentine, 2009).
1490 Meta-Analysis ConfoundedMeta Sensitivity Analyses for Unmeasured Confounding in Meta-Analyses Conducts sensitivity analyses for unmeasured confounding in random-effects meta-analysis per Mathur & VanderWeele (in preparation). Given output from a random-effects meta-analysis with a relative risk outcome, computes point estimates and inference for: (1) the proportion of studies with true causal effect sizes more extreme than a specified threshold of scientific significance; and (2) the minimum bias factor and confounding strength required to reduce to less than a specified threshold the proportion of studies with true effect sizes of scientifically significant size. Creates plots and tables for visualizing these metrics across a range of bias values. Provides tools to easily scrape study-level data from a published forest plot or summary table to obtain the needed estimates when these are not reported.
1491 Meta-Analysis CopulaREMADA Copula Mixed Effect Models for Bivariate and Trivariate Meta-Analysis of Diagnostic Test Accuracy Studies Provides functions to implement copula mixed models for bivariate and trivariate meta-analysis of diagnostic test accuracy studies.
1492 Meta-Analysis CPBayes Bayesian Meta Analysis for Studying Cross-Phenotype Genetic Associations A Bayesian meta-analysis method for studying cross-phenotype genetic associations. It uses summary-level data across multiple phenotypes to simultaneously measure the evidence of aggregate-level pleiotropic association and estimate an optimal subset of traits associated with the risk locus. CPBayes is based on a spike-and-slab prior and is implemented using the Gibbs sampling Markov chain Monte Carlo technique.
1493 Meta-Analysis CRTSize Sample Size Estimation Functions for Cluster Randomized Trials Sample size estimation in cluster (group) randomized trials. Contains traditional power-based methods, empirical smoothing (Rotondi and Donner, 2009), and updated meta-analysis techniques (Rotondi and Donner, 2012).
1494 Meta-Analysis dosresmeta Multivariate Dose-Response Meta-Analysis Estimates dose-response relations from summarized dose-response data and combines them according to the principles of (multivariate) random-effects models.
1495 Meta-Analysis EasyStrata Evaluation of stratified genome-wide association meta-analysis results A pipelining tool that facilitates the evaluation and visualisation of stratified genome-wide association meta-analysis (GWAMA) results. It provides (i) statistical methods to test for and account for between-strata differences and to clump genome-wide results into independent loci and (ii) extended graphical features (e.g., Manhattan, Miami and QQ plots) tailored for stratified GWAMA results.
1496 Meta-Analysis ecoreg Ecological Regression using Aggregate and Individual Data Estimating individual-level covariate-outcome associations using aggregate data (“ecological inference”) or a combination of aggregate and individual-level data (“hierarchical related regression”).
1497 Meta-Analysis effsize Efficient Effect Size Computation A collection of functions to compute the standardized effect sizes for experiments (Cohen d, Hedges g, Cliff delta, Vargha-Delaney A). The computation algorithms have been optimized to allow efficient computation even with very large data sets.
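The four effect size families listed above can be computed directly from two samples; a short sketch on simulated data (assumes 'effsize' is installed):

```r
library(effsize)

# Two illustrative samples
treatment <- rnorm(50, mean = 10)
control   <- rnorm(50, mean = 12)

cohen.d(treatment, control)                            # Cohen's d with confidence interval
cohen.d(treatment, control, hedges.correction = TRUE)  # Hedges' g (small-sample corrected)
cliff.delta(treatment, control)                        # Cliff's delta
VD.A(treatment, control)                               # Vargha-Delaney A
```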
1498 Meta-Analysis epiR Tools for the Analysis of Epidemiological Data Tools for the analysis of epidemiological data. Contains functions for directly and indirectly adjusting measures of disease frequency, quantifying measures of association on the basis of single or multiple strata of count data presented in a contingency table, and computing confidence intervals around incidence risk and incidence rate estimates. Miscellaneous functions for use in meta-analysis, diagnostic test interpretation, and sample size calculations.
1499 Meta-Analysis esc Effect Size Computation for Meta Analysis Implementation of the web-based ‘Practical Meta-Analysis Effect Size Calculator’ from David B. Wilson (http://www.campbellcollaboration.org/escalc/html/EffectSizeCalculator-Home.php) in R. Based on the input, the effect size can be returned as standardized mean difference, Cohen’s f, Hedges’ g, Pearson’s r or Fisher’s transformation z, odds ratio or log odds, or eta squared effect size.
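For example, a standardized mean difference can be computed from group summary statistics as follows; the numbers are illustrative, and 'esc' must be installed:

```r
library(esc)

# Hedges' g from group means, standard deviations and sample sizes
esc_mean_sd(grp1m = 10.2, grp1sd = 2.1, grp1n = 50,
            grp2m = 12.4, grp2sd = 2.5, grp2n = 53,
            es.type = "g")
```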
1500 Meta-Analysis etma Epistasis Test in Meta-Analysis A traditional meta-regression-based method has been developed for use with meta-analysis data, but it faces the challenge of inconsistent estimates. This package proposes a new statistical method to detect epistasis using incomplete summary information; it has been shown not only to yield consistent evidence but also to increase power compared with the traditional method (a detailed tutorial is available on the website).
1501 Meta-Analysis exactmeta Exact fixed effect meta analysis Performs exact fixed-effect meta-analysis for rare-event data without the need for artificial continuity corrections.
1502 Meta-Analysis extfunnel Additional Funnel Plot Augmentations Contains the function extfunnel(), which produces a funnel plot with additional augmentations such as statistical significance contours and heterogeneity contours.
1503 Meta-Analysis forestmodel Forest Plots from Regression Models Produces forest plots using ‘ggplot2’ from models produced by functions such as stats::lm(), stats::glm() and survival::coxph().
1504 Meta-Analysis forestplot Advanced Forest Plot Using ‘grid’ Graphics A forest plot that allows for multiple confidence intervals per row, custom fonts for each text element, custom confidence intervals, text mixed with expressions, and more. The aim is to extend the use of forest plots beyond meta-analyses. This is a more general version of the original ‘rmeta’ package’s forestplot() function and relies heavily on the ‘grid’ package.
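A minimal sketch of a forest plot with per-study intervals and a summary row, on a log (ratio) scale; the estimates are illustrative, and 'forestplot' must be installed:

```r
library(forestplot)

forestplot(labeltext = c("Study A", "Study B", "Summary"),
           mean  = c(0.60, 0.85, 0.70),            # point estimates (e.g. risk ratios)
           lower = c(0.40, 0.62, 0.56),            # lower confidence limits
           upper = c(0.90, 1.15, 0.87),            # upper confidence limits
           is.summary = c(FALSE, FALSE, TRUE),     # last row drawn as a summary diamond
           zero = 1, xlog = TRUE)                  # null line at 1, logarithmic x-axis
```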
1505 Meta-Analysis gap Genetic Analysis Package It is designed as an integrated package for genetic data analysis of both population and family data. Currently, it contains functions for sample size calculations of both population-based and family-based designs, probability of familial disease aggregation, kinship calculation, statistics in linkage analysis, and association analysis involving genetic markers including haplotype analysis with or without environmental covariates.
1506 Meta-Analysis gemtc Network Meta-Analysis Using Bayesian Methods Network meta-analyses (mixed treatment comparisons) in the Bayesian framework using JAGS. Includes methods to assess heterogeneity and inconsistency, and a number of standard visualizations.
1507 Meta-Analysis getmstatistic Quantifying Systematic Heterogeneity in Meta-Analysis Quantifying systematic heterogeneity in meta-analysis using R. The M statistic aggregates heterogeneity information across multiple variants to identify systematic heterogeneity patterns and their direction of effect in meta-analysis. Its primary use is to identify outlier studies, which either show “null” effects or consistently show stronger or weaker genetic effects than average across the panel of variants examined in a GWAS meta-analysis. In contrast to conventional heterogeneity metrics (Q-statistic, I-squared and tau-squared), which measure random heterogeneity at individual variants, M measures systematic (non-random) heterogeneity across multiple independently associated variants. Systematic heterogeneity can arise in a meta-analysis due to differences in the characteristics of participating studies, including ancestry, allele frequencies, phenotype definition, age of disease onset, family history, gender, linkage disequilibrium and quality control thresholds. See https://magosil86.github.io/getmstatistic/ for statistical theory, documentation and examples.
1508 Meta-Analysis gmeta Meta-Analysis via a Unified Framework of Confidence Distribution An implementation of an all-in-one function for a wide range of meta-analysis problems. It contains three functions. The gmeta() function unifies all standard meta-analysis methods, as well as several newly developed ones, under a framework of combining confidence distributions (CDs). Specifically, the package can perform classical p-value combination methods (such as the methods of Fisher, Stouffer, Tippett, etc.), fit fixed-effect and random-effects meta-analysis models, and synthesize 2x2 tables. Furthermore, it can perform robust meta-analysis, which provides protection against model misspecification and limits the impact of any unknown outlying studies. In addition, the package implements two exact meta-analysis methods for synthesizing 2x2 tables with rare events (e.g., zero total events). The np.gmeta() function summarizes information obtained from multiple studies and makes inference for study-level parameters with no distributional assumption. Specifically, it can construct confidence intervals for unknown, fixed study-level parameters via confidence distributions, and can perform estimation via asymptotic confidence distributions whether or not tie or near-tie conditions exist. The plot.gmeta() function visualizes individual and combined CDs through extended forest plots. Compared to version 2.2-6, version 2.3-0 contains the new function np.gmeta().
1509 Meta-Analysis hetmeta Heterogeneity Measures in Meta-Analysis Assesses the presence of statistical heterogeneity and quantifies its impact in the context of meta-analysis. It includes tests for heterogeneity as well as other statistical measures (R_b, I^2, R_I).
1510 Meta-Analysis HSROC Meta-Analysis of Diagnostic Test Accuracy when Reference Test is Imperfect Implements a model for joint meta-analysis of sensitivity and specificity of the diagnostic test under evaluation, while taking into account the possibly imperfect sensitivity and specificity of the reference test. This hierarchical model accounts for both within and between study variability. Estimation is carried out using a Bayesian approach, implemented via a Gibbs sampler. The model can be applied in situations where more than one reference test is used in the selected studies.
1511 Meta-Analysis ipdmeta Tools for subgroup analyses with multiple trial data using aggregate statistics This package provides functions to estimate an IPD linear mixed effects model for a continuous outcome and any categorical covariate from study summary statistics. There are also functions for estimating the power of a treatment-covariate interaction test in an individual patient data meta-analysis from aggregate data.
1512 Meta-Analysis joint.Cox Penalized Likelihood Estimation and Dynamic Prediction under the Joint Frailty-Copula Models Between Tumour Progression and Death for Meta-Analysis Perform the Cox regression and dynamic prediction methods under the joint frailty-copula model between tumour progression and death for meta-analysis. A penalized likelihood is employed for estimating model parameters, where the baseline hazard functions are approximated by smoothing splines. The methods are applicable for meta-analytic data combining several studies. The methods can analyze data having information on both terminal event time (e.g., time-to-death) and non-terminal event time (e.g., time-to-tumour progression). See Emura et al. (2015) doi:10.1177/0962280215604510 and Emura et al. (2017) doi:10.1177/0962280216688032 for details. Survival data from ovarian cancer patients are also available.
1513 Meta-Analysis MAc Meta-Analysis with Correlations An integrated meta-analysis package for conducting a correlational research synthesis. One of its unique features is its integration of user-friendly functions to facilitate statistical analyses at each stage of a meta-analysis with correlations. It uses recommended procedures as described in The Handbook of Research Synthesis and Meta-Analysis (Cooper, Hedges, & Valentine, 2009).
1514 Meta-Analysis MAd Meta-Analysis with Mean Differences A collection of functions for conducting a meta-analysis with mean differences data. It uses recommended procedures as described in The Handbook of Research Synthesis and Meta-Analysis (Cooper, Hedges, & Valentine, 2009).
1515 Meta-Analysis mada Meta-Analysis of Diagnostic Accuracy Provides functions for diagnostic meta-analysis. Next to basic analysis and visualization the bivariate Model of Reitsma et al. (2005) that is equivalent to the HSROC of Rutter & Gatsonis (2001) can be fitted. A new approach based to diagnostic meta-analysis of Holling et al. (2012) is also available. Standard methods like summary, plot and so on are provided.
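Fitting the bivariate Reitsma et al. (2005) model takes a data frame of per-study 2x2 counts; a sketch on made-up data (assumes 'mada' is installed):

```r
library(mada)

# Per-study counts: true positives, false negatives, false positives, true negatives
# (the numbers are illustrative)
dat <- data.frame(TP = c(25, 40, 33, 52, 18),
                  FN = c( 5,  8,  7, 10,  4),
                  FP = c(10, 14,  9, 20,  6),
                  TN = c(60, 80, 70, 95, 50))

fit <- reitsma(dat)   # bivariate random-effects model of Reitsma et al. (2005)
summary(fit)          # pooled (logit) sensitivity/specificity and HSROC parameters
```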
1516 Meta-Analysis MAVIS Meta Analysis via Shiny Interactive shiny application for running a meta-analysis, provides support for both random effects and fixed effects models with the ‘metafor’ package. Additional support is included for calculating effect sizes plus support for single case designs, graphical output, and detecting publication bias.
1517 Meta-Analysis meta (core) General Package for Meta-Analysis User-friendly general package providing standard methods for meta-analysis and supporting Schwarzer, Carpenter, and Rucker doi:10.1007/978-3-319-21416-0, “Meta-Analysis with R” (2015): - fixed effect and random effects meta-analysis; - several plots (forest, funnel, Galbraith / radial, L’Abbe, Baujat, bubble); - statistical tests and trim-and-fill method to evaluate bias in meta-analysis; - import data from ‘RevMan 5’; - prediction interval, Hartung-Knapp and Paule-Mandel method for random effects model; - cumulative meta-analysis and leave-one-out meta-analysis; - meta-regression (if R package ‘metafor’ is installed); - generalised linear mixed models (if R packages ‘metafor’, ‘lme4’, ‘numDeriv’, and ‘BiasedUrn’ are installed).
1518 Meta-Analysis meta4diag Meta-Analysis for Diagnostic Test Studies Bayesian inference analysis for bivariate meta-analysis of diagnostic test studies using integrated nested Laplace approximation with INLA. A purpose-built graphical user interface is available. The installation of the R package INLA is compulsory for successful usage. The INLA package can be obtained from http://www.r-inla.org. We recommend the testing version, which can be downloaded by running: install.packages(“INLA”, repos=“http://www.math.ntnu.no/inla/R/testing”).
1519 Meta-Analysis MetaAnalyser An Interactive Visualisation of Meta-Analysis as a Physical Weighing Machine An interactive application to visualise meta-analysis data as a physical weighing machine. The interface is based on the Shiny web application framework, but can be run locally with the user’s own data.
1520 Meta-Analysis MetABEL Meta-analysis of genome-wide SNP association results A package for meta-analysis of genome-wide association scans between quantitative or binary traits and SNPs.
1521 Meta-Analysis metaBMA Bayesian Model Averaging for Random and Fixed Effects Meta-Analysis Computes the posterior model probabilities for four meta-analysis models (null model vs. alternative model assuming either fixed- or random-effects, respectively). These posterior probabilities are used to estimate the overall mean effect size as the weighted average of the mean effect size estimates of the random- and fixed-effect model as proposed by Gronau, Van Erp, Heck, Cesario, Jonas, & Wagenmakers (2017, doi:10.1080/23743603.2017.1326760). The user can define a wide range of noninformative or informative priors for the mean effect size and the heterogeneity coefficient. Funding for this research was provided by the Berkeley Initiative for Transparency in the Social Sciences, a program of the Center for Effective Global Action (CEGA), with support from the Laura and John Arnold Foundation.
1522 Meta-Analysis metacart Meta-CART: A Flexible Approach to Identify Moderators in Meta-Analysis Fits meta-CART by integrating classification and regression trees (CART) into meta-analysis. Meta-CART is a flexible approach to identify interaction effects between moderators in meta-analysis. The methods are described in Dusseldorp et al. (2014) doi:10.1037/hea0000018 and Li et al. (2017) doi:10.1111/bmsp.12088.
1523 Meta-Analysis metacor Meta-analysis of correlation coefficients Implements the DerSimonian-Laird (DSL) and Olkin-Pratt (OP) meta-analytical approaches with correlation coefficients as effect sizes.
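The DSL approach that ‘metacor’ implements can be illustrated with a short sketch: correlations are Fisher-z-transformed, a between-study variance tau-squared is estimated with the DerSimonian-Laird moment estimator, and the weighted mean is back-transformed. This is a minimal Python illustration of the general method under standard textbook formulas, not the package’s R interface; the function name is hypothetical.

```python
import math

def dsl_meta_correlation(rs, ns):
    """Illustrative DerSimonian-Laird random-effects pooling of
    correlations via Fisher's z-transform (not the metacor API)."""
    k = len(rs)
    zs = [0.5 * math.log((1 + r) / (1 - r)) for r in rs]     # Fisher z
    vs = [1.0 / (n - 3) for n in ns]                         # sampling variance of z
    w = [1.0 / v for v in vs]                                # fixed-effect weights
    z_fe = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    q = sum(wi * (zi - z_fe) ** 2 for wi, zi in zip(w, zs))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                       # DL moment estimator
    w_re = [1.0 / (v + tau2) for v in vs]                    # random-effects weights
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    return math.tanh(z_re)                                   # back-transform to r

# e.g. pooling correlations 0.3, 0.5, 0.4 from studies of size 50, 80, 60
pooled_r = dsl_meta_correlation([0.3, 0.5, 0.4], [50, 80, 60])
```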
1524 Meta-Analysis MetaDE MetaDE: Microarray meta-analysis for differentially expressed gene detection The MetaDE package implements 12 major meta-analysis methods for differential expression analysis.
1525 Meta-Analysis metafor (core) Meta-Analysis Package for R A comprehensive collection of functions for conducting meta-analyses in R. The package includes functions to calculate various effect sizes or outcome measures, fit fixed-, random-, and mixed-effects models to such data, carry out moderator and meta-regression analyses, and create various types of meta-analytical plots (e.g., forest, funnel, radial, L’Abbe, Baujat, GOSH plots). For meta-analyses of binomial and person-time data, the package also provides functions that implement specialized methods, including the Mantel-Haenszel method, Peto’s method, and a variety of suitable generalized linear (mixed-effects) models (i.e., mixed-effects logistic and Poisson regression models). Finally, the package provides functionality for fitting meta-analytic multivariate/multilevel models that account for non-independent sampling errors and/or true effects (e.g., due to the inclusion of multiple treatment studies, multiple endpoints, or other forms of clustering). Network meta-analyses and meta-analyses accounting for known correlation structures (e.g., due to phylogenetic relatedness) can also be conducted.
1526 Meta-Analysis metaforest Exploring Heterogeneity in Meta-Analysis using Random Forests A requirement of classic meta-analysis is that the studies being aggregated are conceptually similar, and ideally, close replications. However, in many fields, there is substantial heterogeneity between studies on the same topic. Similar research questions are studied in different laboratories, using different methods, instruments, and samples. Classic meta-analysis lacks the power to assess more than a handful of univariate moderators, or to investigate interactions between moderators, and non-linear effects. MetaForest, by contrast, has substantial power to explore heterogeneity in meta-analysis. It can identify important moderators from a larger set of potential candidates, even with as few as 20 studies (Van Lissa, in preparation). This is an appealing quality, because many meta-analyses have small sample sizes. Moreover, MetaForest yields a measure of variable importance which can be used to identify important moderators, and offers partial prediction plots to explore the shape of the marginal relationship between moderators and effect size.
1527 Meta-Analysis metafuse Fused Lasso Approach in Regression Coefficient Clustering Fused lasso method to cluster and estimate regression coefficients of the same covariate across different data sets when a large number of independent data sets are combined. Package supports Gaussian, binomial, Poisson and Cox PH models.
1528 Meta-Analysis metagear Comprehensive Research Synthesis Tools for Systematic Reviews and Meta-Analysis Functionalities for facilitating systematic reviews, data extractions, and meta-analyses. It includes a GUI (graphical user interface) to help screen the abstracts and titles of bibliographic data; tools to assign screening effort across multiple collaborators/reviewers and to assess inter-reviewer reliability; tools to help automate the download and retrieval of journal PDF articles from online databases; figure and image extractions from PDFs; web scraping of citations; automated and manual data extraction from scatter-plot and bar-plot images; PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagrams; simple imputation tools to fill gaps in incomplete or missing study parameters; generation of random effect sizes for Hedges’ d, log response ratio, odds ratio, and correlation coefficients for Monte Carlo experiments; covariance equations for modelling dependencies among multiple effect sizes (e.g., effect sizes with a common control); and finally summaries that replicate analyses and outputs from widely used but no longer updated meta-analysis software. Funding for this package was supported by National Science Foundation (NSF) grants DBI-1262545 and DEB-1451031.
1529 Meta-Analysis metagen Inference in Meta Analysis and Meta Regression Provides methods for making inference in the random effects meta regression model such as point estimates and confidence intervals for the heterogeneity parameter and the regression coefficients vector. Inference methods are based on different approaches to statistical inference. Methods from three different schools are included: methods based on the method of moments approach, methods based on likelihood, and methods based on generalised inference. The package also includes tools to run extensive simulation studies in parallel on high performance clusters in a modular way. This allows extensive testing of custom inferential methods with all implemented state-of-the-art methods in a standardised way. Tools for evaluating the performance of both point and interval estimates are provided. Also, a large collection of different pre-defined plotting functions is implemented in a ready-to-use fashion.
1530 Meta-Analysis MetaIntegrator Meta-Analysis of Gene Expression Data A pipeline for the meta-analysis of gene expression data. We have assembled several analysis and plot functions to perform integrated multi-cohort analysis of gene expression data (meta-analysis). Methodology described in: http://biorxiv.org/content/early/2016/08/25/071514.
1531 Meta-Analysis metaLik Likelihood Inference in Meta-Analysis and Meta-Regression Models First- and higher-order likelihood inference in meta-analysis and meta-regression models.
1532 Meta-Analysis metaMA Meta-analysis for MicroArrays Combines either p-values or modified effect sizes from different studies to find differentially expressed genes.
1533 Meta-Analysis metamisc Diagnostic and Prognostic Meta-Analysis Meta-analysis of diagnostic and prognostic modeling studies. Summarize estimates of diagnostic test accuracy and prediction model performance. Validate, update and combine published prediction models. Develop new prediction models with data from multiple studies.
1534 Meta-Analysis metansue Meta-Analysis of Studies with Non Statistically-Significant Unreported Effects Revisited version of MetaNSUE, a novel meta-analytic method that allows an unbiased inclusion of studies with Non Statistically-Significant Unreported Effects (NSUEs). Briefly, the method first calculates the interval where the unreported effects (e.g. t-values) should be according to the threshold of statistical significance used in each study. Afterwards, maximum likelihood techniques are used to impute the expected effect size of each study with NSUEs, accounting for between-study heterogeneity and potential covariates. Multiple imputations of the NSUEs are then randomly created based on the expected value, variance and statistical significance bounds. Finally, a restricted-maximum likelihood random-effects meta-analysis is separately conducted for each set of imputations, and estimations from these meta-analyses are pooled. Please read the reference in ‘metansue’ for details of the procedure.
1535 Meta-Analysis metap Meta-Analysis of Significance Values The canonical way to perform meta-analysis involves using effect sizes. When they are not available this package provides a number of methods for meta-analysis of significance values including the methods of Edgington, Fisher, Stouffer, Tippett, and Wilkinson; a number of data-sets to replicate published results; and a routine for graphical display.
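Two of the combination methods ‘metap’ offers, Fisher’s and Stouffer’s, reduce to a few lines. The sketch below is an illustrative Python rendering of the textbook formulas, not the package’s own R code; it uses only the standard library, since for even degrees of freedom the chi-square survival function has a closed form.

```python
import math
from statistics import NormalDist

def fisher_combine(pvals):
    """Fisher's method: X = -2*sum(log p) ~ chi-square with 2k df under H0.
    For even df 2k: P(X > x) = exp(-x/2) * sum_{i<k} (x/2)^i / i!."""
    k = len(pvals)
    x = -2.0 * sum(math.log(p) for p in pvals)
    return math.exp(-x / 2) * sum((x / 2) ** i / math.factorial(i) for i in range(k))

def stouffer_combine(pvals):
    """Stouffer's method: sum of normal quantiles divided by sqrt(k)."""
    nd = NormalDist()
    z = sum(nd.inv_cdf(1 - p) for p in pvals) / math.sqrt(len(pvals))
    return 1 - nd.cdf(z)

# e.g. combining two moderately small p-values
combined = fisher_combine([0.04, 0.08])
```

With a single p-value both methods return it unchanged, which is a handy sanity check.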
1536 Meta-Analysis MetaPath Perform the Meta-Analysis for Pathway Enrichment Analysis (MAPE) Performs the Meta-analysis for Pathway Enrichment (MAPE) methods introduced by Shen and Tseng (2010). It includes functions to automatically perform MAPE_G (integrating multiple studies at the gene level), MAPE_P (integrating multiple studies at the pathway level) and MAPE_I (a hybrid method integrating the MAPE_G and MAPE_P methods). In the simulation and real data analyses in the paper, MAPE_G and MAPE_P have complementary advantages and detection power depending on the data structure. In general, the integrative form MAPE_I is recommended. In the case that MAPE_G (or MAPE_P) detects almost no pathways, the integrative MAPE_I does not improve performance and MAPE_P (or MAPE_G) should be used. Reference: Shen, Kui, and George C Tseng. Meta-analysis for pathway enrichment analysis when combining multiple microarray studies. Bioinformatics (Oxford, England) 26, no. 10 (April 2010): 1316-1323. doi:10.1093/bioinformatics/btq148. http://www.ncbi.nlm.nih.gov/pubmed/20410053.
1537 Meta-Analysis MetaPCA MetaPCA: Meta-analysis in the Dimension Reduction of Genomic data MetaPCA implements simultaneous dimension reduction using PCA when multiple studies are combined. We propose two basic ideas to find a common PC subspace: an eigenvalue maximization approach and an angle minimization approach, and we extend the concept to incorporate Robust PCA and Sparse PCA in the meta-analysis realm.
1538 Meta-Analysis metaplotr Creates CrossHairs Plots for Meta-Analyses Creates crosshairs plots to summarize and analyse meta-analysis results. In due time this package will contain code that will create other kinds of meta-analysis graphs.
1539 Meta-Analysis metaplus Robust Meta-Analysis and Meta-Regression Performs meta-analysis and meta-regression using standard and robust methods with confidence intervals based on the profile likelihood. Robust methods are based on alternative distributions for the random effect, either the t-distribution (Lee and Thompson, 2008 doi:10.1002/sim.2897 or Baker and Jackson, 2008 doi:10.1007/s10729-007-9041-8) or mixtures of normals (Beath, 2014 doi:10.1002/jrsm.1114).
1540 Meta-Analysis MetaQC MetaQC: Objective Quality Control and Inclusion/Exclusion Criteria for Genomic Meta-Analysis MetaQC implements our proposed quantitative quality control measures: (1) internal homogeneity of co-expression structure among studies (internal quality control; IQC); (2) external consistency of co-expression structure correlating with pathway database (external quality control; EQC); (3) accuracy of differentially expressed gene detection (accuracy quality control; AQCg) or pathway identification (AQCp); (4) consistency of differential expression ranking in genes (consistency quality control; CQCg) or pathways (CQCp). (See the reference for detailed explanation.) For each quality control index, the p-values from statistical hypothesis testing are minus-log-transformed and PCA biplots are applied to assist visualization and decision-making. Results generate systematic suggestions to exclude problematic studies in microarray meta-analysis and potentially can be extended to GWAS or other types of genomic meta-analysis. The identified problematic studies can be scrutinized to identify technical and biological causes (e.g. sample size, platform, tissue collection, preprocessing etc.) of their bad quality or irreproducibility for the final inclusion/exclusion decision.
1541 Meta-Analysis metaRNASeq Meta-analysis of RNA-seq data Implementation of two p-value combination techniques (inverse normal and Fisher methods). A vignette is provided to explain how to perform a meta-analysis from two independent RNA-seq experiments.
1542 Meta-Analysis metaSEM Meta-Analysis using Structural Equation Modeling A collection of functions for conducting meta-analysis using a structural equation modeling (SEM) approach via the ‘OpenMx’ package. It also implements the two-stage SEM approach to conduct meta-analytic structural equation modeling on correlation and covariance matrices.
1543 Meta-Analysis metasens Advanced Statistical Methods to Model and Adjust for Bias in Meta-Analysis The following methods are implemented to evaluate how sensitive the results of a meta-analysis are to potential bias in meta-analysis and to support Schwarzer et al. (2015) doi:10.1007/978-3-319-21416-0, Chapter 5 “Small-Study Effects in Meta-Analysis”: - Copas selection model described in Copas & Shi (2001) doi:10.1177/096228020101000402; - limit meta-analysis by Rucker et al. (2011) doi:10.1093/biostatistics/kxq046; - upper bound for outcome reporting bias by Copas & Jackson (2004) doi:10.1111/j.0006-341X.2004.00161.x.
1544 Meta-Analysis MetaSKAT Meta Analysis for SNP-Set (Sequence) Kernel Association Test Functions for meta-analysis of the burden test, SKAT, and SKAT-O. These methods use summary-level score statistics to carry out gene-based meta-analysis for rare variants.
1545 Meta-Analysis MetaSubtract Subtracting Summary Statistics of One or more Cohorts from Meta-GWAS Results If results from a meta-GWAS are used for validation in one of the cohorts that was included in the meta-analysis, this will yield biased (i.e. too optimistic) results. The validation cohort needs to be independent from the meta-GWAS results. MetaSubtract will subtract the results of the respective cohort from the meta-GWAS results analytically without having to redo the meta-GWAS analysis using the leave-one-out methodology. It can handle different meta-analyses methods and takes into account if single or double genomic control correction was applied to the original meta-analysis. It can be used for whole GWAS, but also for a limited set of SNPs or other genetic markers.
1546 Meta-Analysis metatest Fit and test metaregression models Fits meta-regression models and generates a number of statistics: in addition to t- and z-tests, likelihood ratio, Bartlett-corrected likelihood ratio, and permutation tests are performed on the model coefficients.
1547 Meta-Analysis Metatron Meta-analysis for Classification Data and Correction to Imperfect Reference Enables meta-analysis of primary studies with classification outcomes in order to systematically evaluate the accuracy of classifiers, namely, diagnostic tests. It provides functions to fit the bivariate model of Reitsma et al. (2005). Moreover, if the reference employed in the classification process is not a gold standard, its deficit can be detected and its influence on the underestimation of the diagnostic test’s accuracy can be corrected, as described in Botella et al. (2013).
1548 Meta-Analysis metavcov Variance-Covariance Matrix for Multivariate Meta-Analysis Computes the variance-covariance matrix for multivariate meta-analysis. Effect sizes include correlation (r), mean difference (MD), standardized mean difference (SMD), log odds ratio (logOR), log risk ratio (logRR), and risk difference (RD).
1549 Meta-Analysis metaviz Rainforest Plots and Visual Funnel Plot Inference for Meta-Analysis Creates rainforest plots (proposed by Schild & Voracek, 2015 doi:10.1002/jrsm.1125), a variant and enhancement of the classic forest plot for meta-analysis. In addition, functionalities for visual funnel plot inference are provided. In the near future, the ‘metaviz’ package will be extended by further, established as well as novel, plotting options for visualizing meta-analytic data.
1550 Meta-Analysis mmeta Multivariate Meta-Analysis A novel multivariate meta-analysis.
1551 Meta-Analysis MultiMeta Meta-analysis of Multivariate Genome Wide Association Studies Allows running a meta-analysis of multivariate Genome Wide Association Studies (GWAS) and easily visualizing results through custom plotting functions. The multivariate setting implies that results for each single nucleotide polymorphism (SNP) include several effect sizes (also known as “beta coefficients”, one for each trait), as well as related variance values, but also covariance between the betas. The main goal of the package is to provide combined beta coefficients across different cohorts, together with the combined variance/covariance matrix. The method is inverse-variance based, thus each beta is weighted by the inverse of its variance-covariance matrix before taking the average across all betas. The default options of the main function will work with files obtained from the GEMMA multivariate option for GWAS (Zhou & Stephens, 2014). It will work with any other output, as long as columns are formatted to have the corresponding names. The package also provides several plotting functions for QQ-plots, Manhattan plots and custom summary plots.
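The inverse-variance weighting described for ‘MultiMeta’ — each cohort’s beta vector weighted by the inverse of its variance-covariance matrix before averaging — can be sketched as follows. This is an illustrative Python/NumPy rendering of the fixed-effect multivariate formula, not the package’s R implementation; the function name and interface are hypothetical.

```python
import numpy as np

def ivw_multivariate(betas, covs):
    """Fixed-effect multivariate inverse-variance meta-analysis:
    pooled beta = (sum W_i)^-1 * sum(W_i @ beta_i), with W_i = V_i^-1.
    betas: list of 1-D effect-size vectors (one per cohort)
    covs:  list of matching variance-covariance matrices"""
    ws = [np.linalg.inv(v) for v in covs]      # per-cohort weight matrices
    v_pooled = np.linalg.inv(sum(ws))          # combined variance/covariance
    beta_pooled = v_pooled @ sum(w @ b for w, b in zip(ws, betas))
    return beta_pooled, v_pooled

# e.g. two cohorts, two traits each; identity covariances make the
# pooled beta the plain average and halve the variance
b, v = ivw_multivariate([np.array([1.0, 2.0]), np.array([3.0, 4.0])],
                        [np.eye(2), np.eye(2)])
```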
1552 Meta-Analysis mvmeta Multivariate and Univariate Meta-Analysis and Meta-Regression Collection of functions to perform fixed and random-effects multivariate and univariate meta-analysis and meta-regression.
1553 Meta-Analysis mvtmeta Multivariate meta-analysis This package contains functions to run fixed effects or random effects multivariate meta-analysis.
1554 Meta-Analysis netmeta Network Meta-Analysis using Frequentist Methods A comprehensive set of functions providing frequentist methods for network meta-analysis and supporting Schwarzer et al. (2015) doi:10.1007/978-3-319-21416-0, Chapter 8 “Network Meta-Analysis”: - frequentist network meta-analysis following Rucker (2012) doi:10.1002/jrsm.1058; - net heat plot and design-based decomposition of Cochran’s Q according to Krahn et al. (2013) doi:10.1186/1471-2288-13-35; - measures characterizing the flow of evidence between two treatments by Konig et al. (2013) doi:10.1002/sim.6001; - ranking of treatments (frequentist analogue of SUCRA) according to Rucker & Schwarzer (2015) doi:10.1186/s12874-015-0060-8; - partial order of treatment rankings (‘poset’) and Hasse diagram for ‘poset’ (Carlsen & Bruggemann, 2014) doi:10.1002/cem.2569; - split direct and indirect evidence to check consistency (Dias et al., 2010) doi:10.1002/sim.3767; - league table with network meta-analysis results; - automated drawing of network graphs described in Rucker & Schwarzer (2016) doi:10.1002/jrsm.1143.
1555 Meta-Analysis nmaINLA Network Meta-Analysis using Integrated Nested Laplace Approximations Performs network meta-analysis using integrated nested Laplace approximations (‘INLA’). Includes methods to assess the heterogeneity and inconsistency in the network. Contains more than ten different network meta-analysis datasets. The ‘INLA’ package can be obtained from http://www.r-inla.org. We recommend the testing version.
1556 Meta-Analysis pcnetmeta Patient-Centered Network Meta-Analysis Performs arm-based network meta-analysis for datasets with binary, continuous, and count outcomes using the Bayesian methods of Zhang et al (2014) doi:10.1177/1740774513498322 and Lin et al (2017) doi:10.18637/jss.v080.i05.
1557 Meta-Analysis psychmeta Psychometric Meta-Analysis Toolkit Tools for computing bare-bones and psychometric meta-analyses and for generating psychometric data for use in meta-analysis simulations. Supports bare-bones, individual-correction, and artifact-distribution methods for meta-analyzing correlations and d values. Includes tools for converting effect sizes, computing sporadic artifact corrections, reshaping meta-analytic databases, computing multivariate corrections for range variation, and more.
1558 Meta-Analysis psychometric Applied Psychometric Theory Contains functions useful for correlation theory, meta-analysis (validity-generalization), reliability, item analysis, inter-rater reliability, and classical utility.
1559 Meta-Analysis PubBias Performs simulation study to look for publication bias, using a technique described by Ioannidis and Trikalinos; Clin Trials. 2007;4(3):245-53 I adapted a method designed by Ioannidis and Trikalinos, which compares the observed number of positive studies in a meta-analysis with the expected number, if the summary measure of effect, averaged over the individual studies, were assumed true. An excess of observed positive studies over the expected number is taken as evidence of publication bias. The observed number of positive studies, at a given level of statistical significance, is calculated by applying Fisher’s exact test to the reported 2x2 table data of each constituent study, doubling the one-sided P-value to make a two-sided test. The corresponding expected number of positive studies is obtained by summing the statistical powers of each study. The statistical power depends on a given measure of effect; here, the pooled odds ratio of the meta-analysis is used. By simulating each constituent study with the given odds ratio, and the same numbers of treated and non-treated as in the real study, the power of the study is estimated as the proportion of simulated studies that are positive, again by Fisher’s exact test. The simulated numbers of events in the treated and untreated groups are generated by binomial sampling: in the untreated group, the binomial proportion is the percentage of actual events reported in the study; in the treated group, it is the untreated percentage multiplied by the risk ratio derived from the assumed common odds ratio. The significance level for judging a positive study may be varied, and a large difference between the expected and observed numbers of positive studies around the 0.05 significance level constitutes evidence of publication bias. The difference between the observed and expected is tested by chi-square. A chi-square P-value below 0.05 is suggestive of publication bias; however, a less stringent level of 0.1 is often used in studies of publication bias, as the number of published studies is usually small.
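The per-study power simulation that the ‘PubBias’ description outlines can be sketched as follows. This is an illustrative Python rendering under stated assumptions (positivity judged by a doubled one-sided Fisher exact p-value; the treated-arm event probability derived from the assumed common odds ratio), with hypothetical function names; the package’s actual R implementation may differ in detail.

```python
import math
import random

def fisher_one_sided(a, b, c, d):
    """One-sided (upper-tail) Fisher exact p-value for the 2x2 table
    [[a, b], [c, d]], computed from the hypergeometric distribution."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = math.comb(n, col1)
    return sum(math.comb(row1, k) * math.comb(n - row1, col1 - k)
               for k in range(a, min(row1, col1) + 1)) / denom

def estimate_power(n_treat, n_ctrl, p_ctrl, odds_ratio,
                   alpha=0.05, nsim=1000, seed=1):
    """Monte Carlo power of one study under an assumed common odds ratio:
    binomial sampling in each arm, a 'positive' result being a doubled
    one-sided Fisher p below alpha."""
    odds_t = odds_ratio * p_ctrl / (1 - p_ctrl)
    p_treat = odds_t / (1 + odds_t)   # treated-arm event probability from the OR
    rng = random.Random(seed)
    hits = 0
    for _ in range(nsim):
        a = sum(rng.random() < p_treat for _ in range(n_treat))
        c = sum(rng.random() < p_ctrl for _ in range(n_ctrl))
        p = 2 * fisher_one_sided(a, n_treat - a, c, n_ctrl - c)
        hits += p < alpha
    return hits / nsim
```

Summing such power estimates over the constituent studies gives the expected number of positive studies, which is then compared with the observed count.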
1560 Meta-Analysis RandMeta Efficient Numerical Algorithm for Exact Inference in Meta Analysis A novel numerical algorithm that provides functionality for estimating the exact 95% confidence interval of the location parameter in the random effects model, and is much faster than the naive method. Works best when the number of studies is between 6 and 20.
1561 Meta-Analysis ratesci Confidence Intervals for Comparisons of Binomial or Poisson Rates Computes confidence intervals for the rate (or risk) difference (“RD”) or rate ratio (or relative risk, “RR”) for binomial proportions or Poisson rates, or for odds ratio (“OR”, binomial only). Also confidence intervals for a single binomial or Poisson rate, and intervals for matched pairs. Includes asymptotic score methods including skewness corrections, which have been developed in Laud (2017, in press) from Miettinen & Nurminen (1985) doi:10.1002/sim.4780040211 and Gart & Nam (1988) doi:10.2307/2531848. Also includes MOVER methods (Method Of Variance Estimates Recovery), derived from the Newcombe method but using equal-tailed Jeffreys intervals, and generalised for incorporating prior information. Also methods for stratified calculations (e.g. meta-analysis), either assuming fixed effects or incorporating stratum heterogeneity.
1562 Meta-Analysis RcmdrPlugin.EZR R Commander Plug-in for the EZR (Easy R) Package EZR (Easy R) adds a variety of statistical functions, including survival analyses, ROC analyses, meta-analyses, sample size calculation, and so on, to the R Commander. EZR enables easy point-and-click access to statistical functions, especially for medical statistics. EZR is platform-independent and runs on Windows, Mac OS X, and UNIX. Its complete manual is available only in Japanese (Chugai Igakusha, ISBN: 978-4-498-10901-8 or Nankodo, ISBN: 978-4-524-26158-1), but a report introducing EZR was published in Bone Marrow Transplantation (Nature Publishing Group) as an open-access article, which can serve as a simple manual and can be freely downloaded from the journal website. This report has been cited in more than 1,000 scientific articles.
1563 Meta-Analysis RcmdrPlugin.RMTCJags R MTC Jags ‘Rcmdr’ Plugin Mixed Treatment Comparison is a methodology to directly and/or indirectly compare health strategies (drugs, treatments, devices). This package provides an ‘Rcmdr’ plugin to perform Mixed Treatment Comparison for binary outcome using BUGS code from Bristol University (Lu and Ades).
1564 Meta-Analysis rma.exact Exact Confidence Intervals for Random Effects Meta-Analyses Computes an exact CI for the population mean under a random effects model. The routines implement the algorithm described in Michael, Thornton, Xie, and Tian (2017) https://haben-michael.github.io/research/Exact_Inference_Meta.pdf.
1565 Meta-Analysis rmeta Meta-analysis Functions for simple fixed and random effects meta-analysis for two-sample comparisons and cumulative meta-analyses. Draws standard summary plots, funnel plots, and computes summaries and tests for association and heterogeneity.
1566 Meta-Analysis robumeta Robust Variance Meta-Regression Functions for conducting robust variance estimation (RVE) meta-regression using both large and small sample RVE estimators under various weighting schemes. These methods are distribution free and provide valid point estimates, standard errors and hypothesis tests even when the degree and structure of dependence between effect sizes is unknown. Also included are functions for conducting sensitivity analyses under correlated effects weighting and producing RVE-based forest plots.
1567 Meta-Analysis SAMURAI Sensitivity Analysis of a Meta-analysis with Unpublished but Registered Analytical Investigations This package contains R functions to gauge the impact of unpublished studies upon the meta-analytic summary effect of a set of published studies. (Credits: The research leading to these results has received funding from the European Union’s Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 282574.)
1568 Meta-Analysis SCMA Single-Case Meta-Analysis Perform meta-analysis of single-case experiments, including calculating various effect size measures (SMD, PND, PEM and NAP) and probability combining (additive and multiplicative method).
1569 Meta-Analysis selectMeta Estimation of Weight Functions in Meta Analysis Publication bias, the fact that studies identified for inclusion in a meta analysis do not represent all studies on the topic of interest, is commonly recognized as a threat to the validity of the results of a meta analysis. One way to explicitly model publication bias is via selection models or weighted probability distributions. This package provides implementations of several parametric and nonparametric weight functions. The novelty in Rufibach (2011) is the proposal of a non-increasing variant of the nonparametric weight function of Dear & Begg (1992). The new approach potentially offers more insight into the selection process than other methods, but is more flexible than parametric approaches. To maximize the log-likelihood function proposed by Dear & Begg (1992) under a monotonicity constraint, we use the differential evolution algorithm proposed by Ardia et al (2010a, b) and implemented in Mullen et al (2009). In addition, we offer a method to compute a confidence interval for the overall effect size theta, adjusted for selection bias, as well as a function that computes the simulation-based p-value to assess the null hypothesis of no selection as described in Rufibach (2011, Section 6).
1570 Meta-Analysis seqMeta Meta-Analysis of Region-Based Tests of Rare DNA Variants Computes the necessary information to meta-analyze region-based tests for rare genetic variants (e.g. SKAT, T1) in individual studies, and performs the meta-analysis.
1571 Meta-Analysis surrosurv Evaluation of Failure Time Surrogate Endpoints in Individual Patient Data Meta-Analyses Provides functions for the evaluation of surrogate endpoints when both the surrogate and the true endpoint are failure time variables. The approaches implemented are: (1) the two-step approach (Burzykowski et al, 2001) doi:10.1111/1467-9876.00244 with a copula model (Clayton, Plackett, Hougaard) at the first step and a linear regression of log-hazard ratios at the second step (either adjusted or not for measurement error); (2) mixed proportional hazard models estimated via mixed Poisson GLM (Rotolo et al, 2017 doi:10.1177/0962280217718582).
1572 Meta-Analysis TFisher Optimal Thresholding Fisher’s P-Value Combination Method We provide the cumulative distribution function (CDF), quantile, and statistical power calculator for a collection of thresholding Fisher’s p-value combination methods, including Fisher’s p-value combination method, the truncated product method and, in particular, the soft-thresholding Fisher’s p-value combination method, which is proven to be optimal in some contexts of signal detection. The p-value calculator for the omnibus version of these tests is also included. For reference, please see Hong Zhang and Zheyang Wu. “Optimal Thresholding of Fisher’s P-value Combination Tests for Signal Detection”, submitted.
1573 Meta-Analysis weightr Estimating Weight-Function Models for Publication Bias Estimates the Vevea and Hedges (1995) doi:10.1007/BF02294384 weight-function model. By specifying arguments, users can also estimate the modified model described in Vevea and Woods (2005) doi:10.1037/1082-989X.10.4.428, which may be more practical with small datasets. Users can also specify moderators to estimate a linear model. The package functionality allows users to easily extract the results of these analyses as R objects for other uses. In addition, the package includes a function to launch both models as a Shiny application. Although the Shiny application is also available online, this function allows users to launch it locally if they choose.
1574 Meta-Analysis xmeta A Toolbox for Multivariate Meta-Analysis A toolbox for meta-analysis. This package includes a collection of functions for (1) implementing robust multivariate meta-analysis of continuous or binary outcomes; and (2) a bivariate Egger’s test for detecting publication bias.
1575 Multivariate Statistics abind Combine Multidimensional Arrays Combine multidimensional arrays into a single array. This is a generalization of ‘cbind’ and ‘rbind’. Works with vectors, matrices, and higher-dimensional arrays. Also provides functions ‘adrop’, ‘asub’, and ‘afill’ for manipulating, extracting and replacing data in arrays.
1576 Multivariate Statistics ade4 (core) Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) doi:10.18637/jss.v022.i04.
1577 Multivariate Statistics amap Another Multidimensional Analysis Package Tools for Clustering and Principal Component Analysis (With robust methods, and parallelized functions).
1578 Multivariate Statistics aplpack Another Plot PACKage: stem.leaf, bagplot, faces, spin3R, plotsummary, plothulls, and some slider functions A set of functions for drawing some special plots: stem.leaf plots a stem and leaf plot, stem.leaf.backback plots back-to-back stem and leaf plots, bagplot plots a bagplot, skyline.hist plots several histograms of a one-dimensional data set in one plot, plotsummary plots a graphical summary of a data set with one or more variables, plothulls plots sequential hulls of a bivariate data set, faces plots Chernoff faces, spin3R allows inspection of a 3-dimensional point cloud, and slider functions support interactive graphics.
1579 Multivariate Statistics ash David Scott’s ASH Routines David Scott’s ASH routines ported from S-PLUS to R.
1580 Multivariate Statistics bayesm Bayesian Inference for Marketing/Micro-Econometrics Covers many important models used in marketing and micro-econometrics applications. The package includes: Bayes Regression (univariate or multivariate dep var), Bayes Seemingly Unrelated Regression (SUR), Binary and Ordinal Probit, Multinomial Logit (MNL) and Multinomial Probit (MNP), Multivariate Probit, Negative Binomial (Poisson) Regression, Multivariate Mixtures of Normals (including clustering), Dirichlet Process Prior Density Estimation with normal base, Hierarchical Linear Models with normal prior and covariates, Hierarchical Linear Models with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a Dirichlet Process prior and covariates, Hierarchical Negative Binomial Regression Models, Bayesian analysis of choice-based conjoint data, Bayesian treatment of linear instrumental variables models, Analysis of Multivariate Ordinal survey data with scale usage heterogeneity (as in Rossi et al, JASA (01)), and Bayesian Analysis of Aggregate Random Coefficient Logit Models as in BLP (see Jiang, Manchanda, Rossi 2009). For further reference, consult our book, Bayesian Statistics and Marketing by Rossi, Allenby and McCulloch (Wiley 2005) and Bayesian Non- and Semi-Parametric Methods and Applications (Princeton U Press 2014).
1581 Multivariate Statistics ca Simple, Multiple and Joint Correspondence Analysis Computation and visualization of simple, multiple and joint correspondence analysis.
1582 Multivariate Statistics calibrate Calibration of Scatterplot and Biplot Axes Package for drawing calibrated scales with tick marks on (non-orthogonal) variable vectors in scatterplots and biplots.
1583 Multivariate Statistics car Companion to Applied Regression Functions and Datasets to Accompany J. Fox and S. Weisberg, An R Companion to Applied Regression, Second Edition, Sage, 2011.
1584 Multivariate Statistics caret Classification and Regression Training Misc functions for training and plotting classification and regression models.
1585 Multivariate Statistics class Functions for Classification Various functions for classification, including k-nearest neighbour, Learning Vector Quantization and Self-Organizing Maps.
1586 Multivariate Statistics clue Cluster Ensembles CLUster Ensembles.
1587 Multivariate Statistics cluster (core) “Finding Groups in Data”: Cluster Analysis Extended Rousseeuw et al. Methods for cluster analysis, much extended from the original by Peter Rousseeuw, Anja Struyf and Mia Hubert, based on Kaufman and Rousseeuw (1990) “Finding Groups in Data”.
1588 Multivariate Statistics clusterGeneration Random Cluster Generation (with Specified Degree of Separation) We developed the clusterGeneration package to provide functions for generating random clusters, generating random covariance/correlation matrices, calculating a separation index (data and population version) for pairs of clusters or cluster distributions, and 1-D and 2-D projection plots to visualize clusters. The package also contains a function to generate random clusters based on factorial designs with factors such as degree of separation, number of clusters, number of variables, number of noisy variables.
1589 Multivariate Statistics clusterSim Searching for Optimal Clustering Procedure for a Data Set Distance measures (GDM1, GDM2, Sokal-Michener, Bray-Curtis, for symbolic interval-valued data), cluster quality indices (Calinski-Harabasz, Baker-Hubert, Hubert-Levine, Silhouette, Krzanowski-Lai, Hartigan, Gap, Davies-Bouldin), data normalization formulas, data generation (typical and non-typical data), HINoV method, replication analysis, linear ordering methods, spectral clustering, agreement indices between two partitions, plot functions (for categorical and symbolic interval-valued data). (MILLIGAN, G.W., COOPER, M.C. (1985) doi:10.1007/BF02294245, HUBERT, L., ARABIE, P. (1985) doi:10.1007/BF01908075, RAND, W.M. (1971) doi:10.1080/01621459.1971.10482356, JAJUGA, K., WALESIAK, M. (2000) doi:10.1007/978-3-642-57280-7_11, MILLIGAN, G.W., COOPER, M.C. (1988) doi:10.1007/BF01897163, CORMACK, R.M. (1971) doi:10.2307/2344237, JAJUGA, K., WALESIAK, M., BAK, A. (2003) doi:10.1007/978-3-642-55721-7_12, CARMONE, F.J., KARA, A., MAXWELL, S. (1999) doi:10.2307/3152003, DAVIES, D.L., BOULDIN, D.W. (1979) doi:10.1109/TPAMI.1979.4766909, CALINSKI, T., HARABASZ, J. (1974) doi:10.1080/03610927408827101, HUBERT, L. (1974) doi:10.1080/01621459.1974.10480191, TIBSHIRANI, R., WALTHER, G., HASTIE, T. (2001) doi:10.1111/1467-9868.00293, KRZANOWSKI, W.J., LAI, Y.T. (1988) doi:10.2307/2531893, BRECKENRIDGE, J.N. (2000) doi:10.1207/S15327906MBR3502_5, WALESIAK, M., DUDEK, A. (2008) doi:10.1007/978-3-540-78246-9_11).
1590 Multivariate Statistics clustvarsel Variable Selection for Gaussian Model-Based Clustering Variable selection for Gaussian model-based clustering as implemented in the ‘mclust’ package. The methodology makes it possible to find the (locally) optimal subset of variables in a data set that carry group/cluster information. A greedy or headlong search can be used, either in a forward-backward or backward-forward direction, with or without sub-sampling at the hierarchical clustering stage for starting ‘mclust’ models. By default the algorithm uses a sequential search, but parallelisation is also available.
1591 Multivariate Statistics clv Cluster Validation Techniques The package contains most of the popular internal and external cluster validation methods, ready to use with most of the outputs produced by functions from the “cluster” package. It also contains functions and usage examples for a cluster stability approach that can be applied to algorithms implemented in the “cluster” package as well as to user-defined clustering algorithms.
1592 Multivariate Statistics cocorresp Co-Correspondence Analysis Methods Fits predictive and symmetric co-correspondence analysis (CoCA) models to relate one data matrix to another data matrix. More specifically, CoCA maximises the weighted covariance between the weighted averaged species scores of one community and the weighted averaged species scores of another community. CoCA attempts to find patterns that are common to both communities.
1593 Multivariate Statistics concor Concordance The four functions svdcp (cp for column partitioned), svdbip or svdbip2 (bip for bi-partitioned), and svdbips (s for a simultaneous optimization of one set of r solutions), correspond to a “SVD by blocks” notion, supposing each block depends on relative subspaces, rather than on two whole spaces as the usual SVD does. The other functions, based on this notion, operate on two column-partitioned data matrices x and y defining two sets of subsets xi and yj of variables, and estimate a link between xi and yj for each pair (xi, yj) relative to the links associated with all the other pairs.
1594 Multivariate Statistics copula Multivariate Dependence with Copulas Classes (S4) of commonly used elliptical, Archimedean, extreme-value and other copula families, as well as their rotations, mixtures and asymmetrizations. Nested Archimedean copulas, related tools and special functions. Methods for density, distribution, random number generation, bivariate dependence measures, Rosenblatt transform, Kendall distribution function, perspective and contour plots. Fitting of copula models with potentially partly fixed parameters, including standard errors. Serial independence tests, copula specification tests (independence, exchangeability, radial symmetry, extreme-value dependence, goodness-of-fit) and model selection based on cross-validation. Empirical copula, smoothed versions, and non-parametric estimators of the Pickands dependence function.
1595 Multivariate Statistics corpcor Efficient Estimation of Covariance and (Partial) Correlation Implements a James-Stein-type shrinkage estimator for the covariance matrix, with separate shrinkage for variances and correlations. The details of the method are explained in Schafer and Strimmer (2005) doi:10.2202/1544-6115.1175 and Opgen-Rhein and Strimmer (2007) doi:10.2202/1544-6115.1252. The approach is both computationally as well as statistically very efficient, it is applicable to “small n, large p” data, and always returns a positive definite and well-conditioned covariance matrix. In addition to inferring the covariance matrix the package also provides shrinkage estimators for partial correlations and partial variances. The inverse of the covariance and correlation matrix can be efficiently computed, as well as any arbitrary power of the shrinkage correlation matrix. Furthermore, functions are available for fast singular value decomposition, for computing the pseudoinverse, and for checking the rank and positive definiteness of a matrix.
1596 Multivariate Statistics covRobust Robust Covariance Estimation via Nearest Neighbor Cleaning The cov.nnve() function implements robust covariance estimation by the nearest neighbor variance estimation (NNVE) method of Wang and Raftery (2002) doi:10.1198/016214502388618780.
1597 Multivariate Statistics cramer Multivariate nonparametric Cramer-Test for the two-sample-problem Provides an R routine for the so-called two-sample Cramer test. This nonparametric two-sample test on equality of the underlying distributions can be applied to multivariate as well as univariate data. It offers two possibilities to approximate the critical value, both of which are included in this package.
1598 Multivariate Statistics cwhmisc Miscellaneous Functions for Math, Plotting, Printing, Statistics, Strings, and Tools Miscellaneous useful or interesting functions.
1599 Multivariate Statistics delt Estimation of Multivariate Densities Using Adaptive Partitions We implement methods for estimating multivariate densities. We include a discretized kernel estimator, an adaptive histogram (a greedy histogram and a CART-histogram), stagewise minimization, and bootstrap aggregation.
1600 Multivariate Statistics denpro Visualization of Multivariate Functions, Sets, and Data We provide tools to (1) visualize multivariate density functions and density estimates with level set trees, (2) visualize level sets with shape trees, (3) visualize multivariate data with tail trees, (4) visualize scales of multivariate density estimates with mode graphs and branching maps, and (5) visualize anisotropic spread with 2D volume functions and 2D probability content functions. Level set trees visualize mode structure, shape trees visualize shapes of level sets of unimodal densities, and tail trees visualize connected data sets. The kernel estimator is implemented but the package may also be applied for visualizing other density estimates.
1601 Multivariate Statistics desirability Function Optimization and Ranking via Desirability Functions S3 classes for multivariate optimization using the desirability function by Derringer and Suich (1980).
1602 Multivariate Statistics dr Methods for Dimension Reduction for Regression Functions, methods, and datasets for fitting dimension reduction regression, using slicing (methods SAVE and SIR), Principal Hessian Directions (phd, using residuals and the response), and an iterative IRE. Partial methods, that condition on categorical predictors are also available. A variety of tests, and stepwise deletion of predictors, is also included. Also included is code for computing permutation tests of dimension. Adding additional methods of estimating dimension is straightforward. For documentation, see the vignette in the package. With version 3.0.4, the arguments for dr.step have been modified.
1603 Multivariate Statistics e1071 Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien Functions for latent class analysis, short time Fourier transform, fuzzy clustering, support vector machines, shortest path computation, bagged clustering, naive Bayes classifier, …
1604 Multivariate Statistics earth Multivariate Adaptive Regression Splines Build regression models using the techniques in Friedman’s papers “Fast MARS” and “Multivariate Adaptive Regression Splines”. (The term “MARS” is trademarked and thus not used in the name of the package.)
1605 Multivariate Statistics ellipse Functions for drawing ellipses and ellipse-like confidence regions This package contains various routines for drawing ellipses and ellipse-like confidence regions, implementing the plots described in Murdoch and Chow (1996), A graphical display of large correlation matrices, The American Statistician 50, 178-180. There are also routines implementing the profile plots described in Bates and Watts (1988), Nonlinear Regression Analysis and its Applications.
1606 Multivariate Statistics energy E-Statistics: Multivariate Inference via the Energy of Data E-statistics (energy) tests and statistics for multivariate and univariate inference, including distance correlation, one-sample, two-sample, and multi-sample tests for comparing multivariate distributions, are implemented. Measuring and testing multivariate independence based on distance correlation, partial distance correlation, multivariate goodness-of-fit tests, clustering based on energy distance, testing for multivariate normality, distance components (disco) for non-parametric analysis of structured data, and other energy statistics/methods are implemented.
1607 Multivariate Statistics eRm Extended Rasch Modeling Fits Rasch models (RM), linear logistic test models (LLTM), rating scale model (RSM), linear rating scale models (LRSM), partial credit models (PCM), and linear partial credit models (LPCM). Missing values are allowed in the data matrix. Additional features are the ML estimation of the person parameters, Andersen’s LR-test, item-specific Wald test, Martin-Lof-Test, nonparametric Monte-Carlo Tests, itemfit and personfit statistics including infit and outfit measures, various ICC and related plots, automated stepwise item elimination, simulation module for various binary data matrices.
1608 Multivariate Statistics FactoMineR Multivariate Exploratory Data Analysis and Data Mining Exploratory data analysis methods to summarize, visualize and describe datasets. The main principal component methods are available, those with the largest potential in terms of applications: principal component analysis (PCA) when variables are quantitative, correspondence analysis (CA) and multiple correspondence analysis (MCA) when variables are categorical, Multiple Factor Analysis when variables are structured in groups, etc. and hierarchical cluster analysis. F. Husson, S. Le and J. Pages (2017) doi:10.1201/b10345-2.
1609 Multivariate Statistics FAiR Factor Analysis in R This package estimates factor analysis models using a genetic algorithm, which permits a general mechanism for restricted optimization with arbitrary restrictions that are chosen at run time with the help of a GUI. Importantly, inequality restrictions can be imposed on functions of multiple parameters, which provides new avenues for testing and generating theories with factor analysis models. This package also includes an entirely new estimator of the common factor analysis model called semi-exploratory factor analysis, which is a general alternative to exploratory and confirmatory factor analysis. Finally, this package integrates many other packages that estimate sample covariance matrices and thus provides many alternatives to the traditional sample covariance calculation. Note that you need to have the Gtk run time library installed on your system to use this package; see the URL below for detailed installation instructions. Most users would only need to understand the first twenty-four pages of the PDF manual.
1610 Multivariate Statistics fastICA FastICA Algorithms to Perform ICA and Projection Pursuit Implementation of FastICA algorithm to perform Independent Component Analysis (ICA) and Projection Pursuit.
1611 Multivariate Statistics feature Local Inferential Feature Significance for Multivariate Kernel Density Estimation Local inferential feature significance for multivariate kernel density estimation.
1612 Multivariate Statistics fgac Generalized Archimedean Copula Bivariate data fitting is done via two stochastic components: the marginal distributions and the dependency structure. The dependency structure is modeled through a copula. An algorithm was implemented considering seven families of copulas (Generalized Archimedean Copulas); the best fit can be obtained by examining all copula options (totally positive of order 2 and stochastically increasing models).
1613 Multivariate Statistics fpc Flexible Procedures for Clustering Various methods for clustering and cluster validation. Fixed point clustering. Linear regression clustering. Clustering by merging Gaussian mixture components. Symmetric and asymmetric discriminant projections for visualisation of the separation of groupings. Cluster validation statistics for distance based clustering including corrected Rand index. Cluster-wise cluster stability assessment. Methods for estimation of the number of clusters: Calinski-Harabasz, Tibshirani and Walther’s prediction strength, Fang and Wang’s bootstrap stability. Gaussian/multinomial mixture fitting for mixed continuous/categorical variables. Variable-wise statistics for cluster interpretation. DBSCAN clustering. Interface functions for many clustering methods implemented in R, including estimating the number of clusters with kmeans, pam and clara. Modality diagnosis for Gaussian mixtures. For an overview see package?fpc.
1614 Multivariate Statistics fso Fuzzy Set Ordination Fuzzy set ordination is a multivariate analysis used in ecology to relate the composition of samples to possible explanatory variables. While differing in theory and method, in practice its use is similar to ‘constrained ordination’. The package contains plotting and summary functions as well as the analyses.
1615 Multivariate Statistics gclus Clustering Graphics Orders panels in scatterplot matrices and parallel coordinate displays by some merit index. Package contains various indices of merit, ordering functions, and enhanced versions of pairs and parcoord which color panels according to their merit level.
1616 Multivariate Statistics GenKern Functions for generating and manipulating binned kernel density estimates Computes generalised kernel density estimates (KDEs).
1617 Multivariate Statistics geometry Mesh Generation and Surface Tesselation Makes the qhull library (www.qhull.org) available in R, in a similar manner as in Octave and MATLAB. Qhull computes convex hulls, Delaunay triangulations, halfspace intersections about a point, Voronoi diagrams, furthest-site Delaunay triangulations, and furthest-site Voronoi diagrams. It runs in 2-d, 3-d, 4-d, and higher dimensions. It implements the Quickhull algorithm for computing the convex hull. Qhull does not support constrained Delaunay triangulations, or mesh generation of non-convex objects, but the package does include some R functions that allow for this. Currently the package only gives access to Delaunay triangulation and convex hull computation.
1618 Multivariate Statistics geozoo Zoo of Geometric Objects Geometric objects defined in ‘geozoo’ can be simulated or displayed in the R package ‘tourr’.
1619 Multivariate Statistics gmodels Various R Programming Tools for Model Fitting Various R programming tools for model fitting.
1620 Multivariate Statistics GPArotation GPA Factor Rotation Gradient Projection Algorithm Rotation for Factor Analysis. See ?GPArotation.Intro for more details.
1621 Multivariate Statistics hddplot Use Known Groups in High-Dimensional Data to Derive Scores for Plots Cross-validated linear discriminant calculations determine the optimum number of features. Test and training scores from successive cross-validation steps determine, via a principal components calculation, a low-dimensional global space onto which test scores are projected, in order to plot them. Further functions are included that are intended for didactic use. The package implements, and extends, methods described in J.H. Maindonald and C.J. Burden (2005) https://journal.austms.org.au/V46/CTAC2004/Main/home.html.
1622 Multivariate Statistics Hmisc Harrell Miscellaneous Contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, importing and annotating datasets, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of R objects to LaTeX and html code, and recoding variables.
1623 Multivariate Statistics homals Gifi Methods for Optimal Scaling Performs a homogeneity analysis (multiple correspondence analysis) and various extensions. Rank restrictions on the category quantifications can be imposed (nonlinear PCA). The categories are transformed by means of optimal scaling with options for nominal, ordinal, and numerical scale levels (for rank-1 restrictions). Variables can be grouped into sets, in order to emulate regression analysis and canonical correlation analysis.
1624 Multivariate Statistics hybridHclust Hybrid Hierarchical Clustering Hybrid hierarchical clustering via mutual clusters. A mutual cluster is a set of points closer to each other than to all other points. Mutual clusters are used to enrich top-down hierarchical clustering.
1625 Multivariate Statistics ICS Tools for Exploring Multivariate Data via ICS/ICA Implementation of Tyler, Critchley, Duembgen and Oja’s (JRSS B, 2009, doi:10.1111/j.1467-9868.2009.00706.x) and Oja, Sirkia and Eriksson’s (AJS, 2006, http://www.ajs.or.at/index.php/ajs/article/view/vol35,%20no2%263%20-%207) method of two different scatter matrices to obtain an invariant coordinate system or independent components, depending on the underlying assumptions.
1626 Multivariate Statistics ICSNP Tools for Multivariate Nonparametrics Tools for multivariate nonparametrics; implemented are location tests based on marginal ranks, spatial median and spatial sign computation, Hotelling’s T-test, and estimates of shape.
1627 Multivariate Statistics iplots iPlots - interactive graphics for R Interactive plots for R.
1628 Multivariate Statistics JADE Blind Source Separation Methods Based on Joint Diagonalization and Some BSS Performance Criteria Cardoso’s JADE algorithm as well as his functions for joint diagonalization are ported to R. Also several other blind source separation (BSS) methods, like AMUSE and SOBI, and some criteria for performance evaluation of BSS algorithms, are given.
1629 Multivariate Statistics kernlab Kernel-Based Machine Learning Lab Kernel-based machine learning methods for classification, regression, clustering, novelty detection, quantile regression and dimensionality reduction. Among other methods ‘kernlab’ includes Support Vector Machines, Spectral Clustering, Kernel PCA, Gaussian Processes and a QP solver.
1630 Multivariate Statistics KernSmooth Functions for Kernel Smoothing Supporting Wand & Jones (1995) Functions for kernel smoothing (and density estimation) corresponding to the book: Wand, M.P. and Jones, M.C. (1995) “Kernel Smoothing”.
1631 Multivariate Statistics kknn Weighted k-Nearest Neighbors Weighted k-Nearest Neighbors for Classification, Regression and Clustering.
1632 Multivariate Statistics klaR Classification and visualization Miscellaneous functions for classification and visualization developed at the Fakultaet Statistik, Technische Universitaet Dortmund.
1633 Multivariate Statistics knncat Nearest-neighbor Classification with Categorical Variables Scale categorical variables in such a way as to make NN classification as accurate as possible. The code also handles continuous variables and prior probabilities, and does intelligent variable selection and estimation of both error rates and the right number of NN’s.
1634 Multivariate Statistics kohonen Supervised and Unsupervised Self-Organising Maps Functions to train self-organising maps (SOMs). Also interrogation of the maps and prediction using trained maps are supported. The name of the package refers to Teuvo Kohonen, the inventor of the SOM.
1635 Multivariate Statistics ks Kernel Smoothing Kernel smoothers for univariate and multivariate data, including density functions, density derivatives, cumulative distributions, modal clustering, discriminant analysis, significant modal regions and two-sample hypothesis testing.
1636 Multivariate Statistics lattice Trellis Graphics for R A powerful and elegant high-level data visualization system inspired by Trellis graphics, with an emphasis on multivariate data. Lattice is sufficient for typical graphics needs, and is also flexible enough to handle most nonstandard requirements. See ?Lattice for an introduction.
1637 Multivariate Statistics ltm Latent Trait Models under IRT Analysis of multivariate dichotomous and polytomous data using latent trait models under the Item Response Theory approach. It includes the Rasch, the Two-Parameter Logistic, the Birnbaum’s Three-Parameter, the Graded Response, and the Generalized Partial Credit Models.
1638 Multivariate Statistics mAr Multivariate AutoRegressive analysis R functions for multivariate autoregressive analysis.
1639 Multivariate Statistics MASS (core) Support Functions and Datasets for Venables and Ripley’s MASS Functions and datasets to support Venables and Ripley, “Modern Applied Statistics with S” (4th edition, 2002).
1640 Multivariate Statistics Matrix Sparse and Dense Matrix Classes and Methods A rich hierarchy of matrix classes, including triangular, symmetric, and diagonal matrices, both dense and sparse and with pattern, logical and numeric entries. Numerous methods for and operations on these matrices, using ‘LAPACK’ and ‘SuiteSparse’ libraries.
1641 Multivariate Statistics matrixcalc Collection of functions for matrix calculations A collection of functions to support matrix calculations for probability, econometric and numerical analysis. There are additional functions that are comparable to APL functions which are useful for actuarial models such as pension mathematics. This package is used for teaching and research purposes at the Department of Finance and Risk Engineering, New York University, Polytechnic Institute, Brooklyn, NY 11201.
1642 Multivariate Statistics mclust Gaussian Mixture Modelling for Model-Based Clustering, Classification, and Density Estimation Gaussian finite mixture models fitted via EM algorithm for model-based clustering, classification, and density estimation, including Bayesian regularization, dimension reduction for visualisation, and resampling-based inference.
1643 Multivariate Statistics MCMCpack Markov Chain Monte Carlo (MCMC) Package Contains functions to perform Bayesian inference using posterior simulation for a number of statistical models. Most simulation is done in compiled C++ written in the Scythe Statistical Library Version 1.0.3. All models return coda mcmc objects that can then be summarized using the coda package. Some useful utility functions such as density functions, pseudo-random number generators for statistical distributions, a general purpose Metropolis sampling algorithm, and tools for visualization are provided.
1644 Multivariate Statistics mda Mixture and Flexible Discriminant Analysis Mixture and flexible discriminant analysis, multivariate adaptive regression splines (MARS), BRUTO, …
1645 Multivariate Statistics mice Multivariate Imputation by Chained Equations Multiple imputation using Fully Conditional Specification (FCS) implemented by the MICE algorithm as described in Van Buuren and Groothuis-Oudshoorn (2011) doi:10.18637/jss.v045.i03. Each variable has its own imputation model. Built-in imputation models are provided for continuous data (predictive mean matching, normal), binary data (logistic regression), unordered categorical data (polytomous logistic regression) and ordered categorical data (proportional odds). MICE can also impute continuous two-level data (normal model, pan, second-level variables). Passive imputation can be used to maintain consistency between variables. Various diagnostic plots are available to inspect the quality of the imputations.
1646 Multivariate Statistics misc3d Miscellaneous 3D Plots A collection of miscellaneous 3d plots, including isosurfaces.
1647 Multivariate Statistics mitools Tools for multiple imputation of missing data Tools to perform analyses and combine results from multiple-imputation datasets.
1648 Multivariate Statistics mix Estimation/Multiple Imputation for Mixed Categorical and Continuous Data Estimation/multiple imputation programs for mixed categorical and continuous data.
1649 Multivariate Statistics mnormt The Multivariate Normal and t Distributions Functions are provided for computing the density and the distribution function of multivariate normal and “t” random variables, and for generating random vectors sampled from these distributions. Probabilities are computed via non-Monte Carlo methods; different routines are used in the case d=1, d=2, d>2, if d denotes the number of dimensions.
1650 Multivariate Statistics MNP R Package for Fitting the Multinomial Probit Model Fits the Bayesian multinomial probit model via Markov chain Monte Carlo. The multinomial probit model is often used to analyze the discrete choices made by individuals recorded in survey data. Examples where the multinomial probit model may be useful include the analysis of product choice by consumers in market research and the analysis of candidate or party choice by voters in electoral studies. The MNP package can also fit the model with different choice sets for each individual, and complete or partial individual choice orderings of the available alternatives from the choice set. The estimation is based on the efficient marginal data augmentation algorithm that is developed by Imai and van Dyk (2005). “A Bayesian Analysis of the Multinomial Probit Model Using the Data Augmentation,” Journal of Econometrics, Vol. 124, No. 2 (February), pp. 311-334. doi:10.1016/j.jeconom.2004.02.002 Detailed examples are given in Imai and van Dyk (2005). “MNP: R Package for Fitting the Multinomial Probit Model.” Journal of Statistical Software, Vol. 14, No. 3 (May), pp. 1-32. doi:10.18637/jss.v014.i03.
1651 Multivariate Statistics monomvn Estimation for Multivariate Normal and Student-t Data with Monotone Missingness Estimation of multivariate normal and Student-t data of arbitrary dimension where the pattern of missing data is monotone. Through the use of parsimonious/shrinkage regressions (plsr, pcr, lasso, ridge, etc.), where standard regressions fail, the package can handle a nearly arbitrary amount of missing data. The current version supports maximum likelihood inference and a full Bayesian approach employing scale-mixtures for Gibbs sampling. Monotone data augmentation extends this Bayesian approach to arbitrary missingness patterns. A fully functional standalone interface to the Bayesian lasso (from Park & Casella), Normal-Gamma (from Griffin & Brown), Horseshoe (from Carvalho, Polson, & Scott), and ridge regression with model selection via Reversible Jump, and Student-t errors (from Geweke) is also provided.
1652 Multivariate Statistics mvnmle ML estimation for multivariate normal data with missing values Finds the maximum likelihood estimate of the mean vector and variance-covariance matrix for multivariate normal data with missing values.
1653 Multivariate Statistics mvnormtest Normality test for multivariate variables Generalization of the Shapiro-Wilk test to multivariate variables.
1654 Multivariate Statistics mvoutlier Multivariate Outlier Detection Based on Robust Methods Various Methods for Multivariate Outlier Detection.
1655 Multivariate Statistics mvtnorm Multivariate Normal and t Distributions Computes multivariate normal and t probabilities, quantiles, random deviates and densities.
1656 Multivariate Statistics nFactors Parallel Analysis and Non Graphical Solutions to the Cattell Scree Test Indices, heuristics and strategies to help determine the number of factors/components to retain: 1. Acceleration factor (af with or without Parallel Analysis); 2. Optimal Coordinates (noc with or without Parallel Analysis); 3. Parallel analysis (components, factors and bootstrap); 4. lambda > mean(lambda) (Kaiser, CFA and related); 5. Cattell-Nelson-Gorsuch (CNG); 6. Zoski and Jurs multiple regression (b, t and p); 7. Zoski and Jurs standard error of the regression coefficient (sescree); 8. Nelson R2; 9. Bartlett chi-squared; 10. Anderson chi-squared; 11. Lawley chi-squared and 12. Bentler-Yuan chi-squared.
1657 Multivariate Statistics pan Multiple Imputation for Multivariate Panel or Clustered Data Multiple imputation for multivariate panel or clustered data.
1658 Multivariate Statistics paran Horn’s Test of Principal Components/Factors paran is an implementation of Horn’s technique for numerically and graphically evaluating the components or factors retained in a principal components analysis (PCA) or common factor analysis (FA). Horn’s method contrasts eigenvalues produced through a PCA or FA on a number of random data sets of uncorrelated variables, with the same number of variables and observations as the experimental or observational data set, to produce eigenvalues for components or factors that are adjusted for the sample error-induced inflation. Components with adjusted eigenvalues greater than one are retained. paran may also be used to conduct parallel analysis following Glorfeld’s (1995) suggestions to reduce the likelihood of over-retention.
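Horn's procedure, as described in the ‘paran’ and ‘nFactors’ entries, compares observed eigenvalues against eigenvalues from random uncorrelated data of the same shape. A minimal NumPy sketch of the idea (a conceptual illustration with a made-up helper name, not the packages' implementations):

```python
import numpy as np

def parallel_analysis(X, n_sim=200, seed=0):
    """Horn's parallel analysis: retain components whose observed
    eigenvalue exceeds the mean eigenvalue of random data of the
    same dimensions (equivalently, whose adjusted eigenvalue > 1)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]
    sims = np.empty((n_sim, p))
    for i in range(n_sim):
        R = rng.standard_normal((n, p))  # uncorrelated data, same shape
        sims[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(R, rowvar=False)))[::-1]
    random_mean = sims.mean(axis=0)
    n_retain = int(np.sum(obs > random_mean))
    return n_retain, obs, random_mean
```

On data with a single strong common factor plus noise, the procedure should retain exactly one component.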
1659 Multivariate Statistics party A Laboratory for Recursive Partytioning A computational toolbox for recursive partitioning. The core of the package is ctree(), an implementation of conditional inference trees which embed tree-structured regression models into a well defined theory of conditional inference procedures. This non-parametric class of regression trees is applicable to all kinds of regression problems, including nominal, ordinal, numeric, censored as well as multivariate response variables and arbitrary measurement scales of the covariates. Based on conditional inference trees, cforest() provides an implementation of Breiman’s random forests. The function mob() implements an algorithm for recursive partitioning based on parametric models (e.g. linear models, GLMs or survival regression) employing parameter instability tests for split selection. Extensible functionality for visualizing tree-structured regression models is available. The methods are described in Hothorn et al. (2006) doi:10.1198/106186006X133933, Zeileis et al. (2008) doi:10.1198/106186008X319331 and Strobl et al. (2007) doi:10.1186/1471-2105-8-25.
1660 Multivariate Statistics pcaPP Robust PCA by Projection Pursuit Provides functions for robust PCA by projection pursuit. The methods are described in Croux et al. (2006) doi:10.2139/ssrn.968376, Croux et al. (2013) doi:10.1080/00401706.2012.727746, Todorov and Filzmoser (2013) doi:10.1007/978-3-642-33042-1_31.
1661 Multivariate Statistics PearsonICA Independent component analysis using score functions from the Pearson system The Pearson-ICA algorithm is a mutual information-based method for blind separation of statistically independent source signals. It has been shown that the minimization of mutual information leads to iterative use of score functions, i.e. derivatives of log densities. The Pearson system allows adaptive modeling of score functions. The flexibility of the Pearson system makes it possible to model a wide range of source distributions including asymmetric distributions. The algorithm is designed especially for problems with asymmetric sources but it works for symmetric sources as well.
1662 Multivariate Statistics pls Partial Least Squares and Principal Component Regression Multivariate regression methods Partial Least Squares Regression (PLSR), Principal Component Regression (PCR) and Canonical Powered Partial Least Squares (CPPLS).
1663 Multivariate Statistics plsgenomics PLS Analyses for Genomics Routines for PLS-based genomic analyses, implementing PLS methods for classification with microarray data and prediction of transcription factor activities from combined ChIP-chip analysis. The >=1.2-1 versions include two new classification methods for microarray data: GSIM and Ridge PLS. The >=1.3 versions include a new classification method combining variable selection and compression in the logistic regression context: logit-SPLS; and an adaptive version of the sparse PLS.
1664 Multivariate Statistics poLCA Polytomous variable Latent Class Analysis Latent class analysis and latent class regression models for polytomous outcome variables. Also known as latent structure analysis.
1665 Multivariate Statistics polycor Polychoric and Polyserial Correlations Computes polychoric and polyserial correlations by quick “two-step” methods or ML, optionally with standard errors; tetrachoric and biserial correlations are special cases.
1666 Multivariate Statistics ppls Penalized Partial Least Squares This package contains linear and nonlinear regression methods based on Partial Least Squares and Penalization Techniques. Model parameters are selected via cross-validation, and confidence intervals and tests for the regression coefficients can be conducted via jackknifing.
1667 Multivariate Statistics prim Patient Rule Induction Method (PRIM) Patient Rule Induction Method (PRIM) for bump hunting in high-dimensional data.
1668 Multivariate Statistics proxy Distance and Similarity Measures Provides an extensible framework for the efficient calculation of auto- and cross-proximities, along with implementations of the most popular ones.
1669 Multivariate Statistics psy Various procedures used in psychometry Kappa, ICC, Cronbach’s alpha, scree plot, MTMM.
1670 Multivariate Statistics PTAk Principal Tensor Analysis on k Modes A multiway method to decompose a tensor (array) of any order, as a generalisation of SVD, also supporting non-identity metrics and penalisations. 2-way SVD with these extensions is also available. The package also includes some other multiway methods: PCAn (Tucker-n) and PARAFAC/CANDECOMP with these extensions.
1671 Multivariate Statistics rda Shrunken Centroids Regularized Discriminant Analysis Shrunken Centroids Regularized Discriminant Analysis for the classification purpose in high dimensional data.
1672 Multivariate Statistics relaimpo Relative importance of regressors in linear models relaimpo provides several metrics for assessing relative importance in linear models. These can be printed, plotted and bootstrapped. The recommended metric is lmg, which provides a decomposition of the model explained variance into non-negative contributions. There is a version of this package available that additionally provides a new and also recommended metric called pmvd. If you are a non-US user, you can download this extended version from Ulrike Groemping’s web site.
1673 Multivariate Statistics rggobi Interface Between R and ‘GGobi’ A command-line interface to ‘GGobi’, an interactive and dynamic graphics package. ‘Rggobi’ complements the graphical user interface of ‘GGobi’ providing a way to fluidly transition between analysis and exploration, as well as automating common tasks.
1674 Multivariate Statistics rgl 3D Visualization Using OpenGL Provides medium to high level functions for 3D interactive graphics, including functions modelled on base graphics (plot3d(), etc.) as well as functions for constructing representations of geometric objects (cube3d(), etc.). Output may be on screen using OpenGL, or to various standard 3D file formats including WebGL, PLY, OBJ, STL as well as 2D image formats, including PNG, Postscript, SVG, PGF.
1675 Multivariate Statistics robustbase Basic Robust Statistics “Essential” Robust Statistics. Tools for analyzing data with robust methods, including regression methodology (with model selection) and multivariate statistics, where we strive to cover the book “Robust Statistics, Theory and Methods” by ‘Maronna, Martin and Yohai’; Wiley 2006.
1676 Multivariate Statistics ROCR Visualizing the Performance of Scoring Classifiers ROC graphs, sensitivity/specificity curves, lift charts, and precision/recall plots are popular examples of trade-off visualizations for specific pairs of performance measures. ROCR is a flexible tool for creating cutoff-parameterized 2D performance curves by freely combining two from over 25 performance measures (new performance measures can be added using a standard interface). Curves from different cross-validation or bootstrapping runs can be averaged by different methods, and standard deviations, standard errors or box plots can be used to visualize the variability across the runs. The parameterization can be visualized by printing cutoff values at the corresponding curve positions, or by coloring the curve according to cutoff. All components of a performance plot can be quickly adjusted using a flexible parameter dispatching mechanism. Despite its flexibility, ROCR is easy to use, with only three commands and reasonable default values for all optional parameters.
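The cutoff-parameterized curves ROCR draws come from sweeping a threshold over classifier scores and recording a pair of performance measures at each cutoff. A minimal sketch of the ROC case (a conceptual illustration in Python, not ROCR's `prediction`/`performance` API):

```python
import numpy as np

def roc_curve(scores, labels):
    """TPR and FPR at every cutoff, sweeping scores from high to low.
    labels is a 0/1 vector; higher scores mean 'more positive'."""
    order = np.argsort(-np.asarray(scores, float))
    labels = np.asarray(labels)[order]
    tps = np.cumsum(labels)          # true positives at each cutoff
    fps = np.cumsum(1 - labels)      # false positives at each cutoff
    tpr = tps / labels.sum()
    fpr = fps / (len(labels) - labels.sum())
    # prepend the (0, 0) point corresponding to an infinite cutoff
    return np.concatenate([[0.0], fpr]), np.concatenate([[0.0], tpr])

def auc(fpr, tpr):
    """Area under the curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
```

A classifier that ranks every positive above every negative yields an AUC of exactly 1.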
1677 Multivariate Statistics rpart Recursive Partitioning and Regression Trees Recursive partitioning for classification, regression and survival trees. An implementation of most of the functionality of the 1984 book by Breiman, Friedman, Olshen and Stone.
1678 Multivariate Statistics rrcov Scalable Robust Estimators with High Breakdown Point Robust Location and Scatter Estimation and Robust Multivariate Analysis with High Breakdown Point.
1679 Multivariate Statistics sca Simple Component Analysis Simple Component Analysis (SCA) often provides much more interpretable components than Principal Components (PCA) while still representing much of the variability in the data.
1680 Multivariate Statistics scatterplot3d 3D Scatter Plot Plots a three dimensional (3D) point cloud.
1681 Multivariate Statistics sem Structural Equation Models Functions for fitting general linear structural equation models (with observed and latent variables) using the RAM approach, and for fitting structural equations in observed-variable models by two-stage least squares.
1682 Multivariate Statistics SensoMineR Sensory data analysis with R An R package for analysing sensory data.
1683 Multivariate Statistics seriation Infrastructure for Ordering Objects Using Seriation Infrastructure for seriation with an implementation of several seriation/sequencing techniques to reorder matrices, dissimilarity matrices, and dendrograms. Also provides (optimally) reordered heatmaps, color images and clustering visualizations like dissimilarity plots, and visual assessment of cluster tendency plots (VAT and iVAT).
1684 Multivariate Statistics simba A Collection of functions for similarity analysis of vegetation data Besides functions for the calculation of similarity and multiple-plot similarity measures with binary data (for instance presence/absence species data), the package contains simple wrapper functions for reshaping species lists into matrices and vice versa, functions for further processing of similarity data (Mantel-like permutation procedures), and other utilities for vegetation analysis.
1685 Multivariate Statistics smatr (Standardised) Major Axis Estimation and Testing Routines This package provides methods of fitting bivariate lines in allometry using the major axis (MA) or standardised major axis (SMA), and for making inferences about such lines. The available methods of inference include confidence intervals and one-sample tests for slope and elevation, testing for a common slope or elevation amongst several allometric lines, constructing a confidence interval for a common slope or elevation, and testing for no shift along a common axis, amongst several samples.
1686 Multivariate Statistics sn The Skew-Normal and Related Distributions Such as the Skew-t Build and manipulate probability distributions of the skew-normal family and some related ones, notably the skew-t family, and provide related statistical methods for data fitting and model diagnostics, in the univariate and the multivariate case.
1687 Multivariate Statistics spam SPArse Matrix Set of functions for sparse matrix algebra. Differences with other sparse matrix packages are: (1) we only support (essentially) one sparse matrix format; (2) it is based on transparent and simple structure(s); (3) it is tailored for MCMC calculations within G(M)RF; and (4) it is fast and scalable (with the extension package spam64).
1688 Multivariate Statistics SparseM Sparse Linear Algebra Some basic linear algebra functionality for sparse matrices is provided: including Cholesky decomposition and backsolving as well as standard R subsetting and Kronecker products.
1689 Multivariate Statistics SpatialNP Multivariate Nonparametric Methods Based on Spatial Signs and Ranks Test and estimates of location, tests of independence, tests of sphericity and several estimates of shape all based on spatial signs, symmetrized signs, ranks and signed ranks. For details, see Oja and Randles (2004) doi:10.1214/088342304000000558 and Oja (2010) doi:10.1007/978-1-4419-0468-3.
1690 Multivariate Statistics superpc Supervised principal components Supervised principal components for regression and survival analysis. Especially useful for high-dimensional data, including microarray data.
1691 Multivariate Statistics trimcluster Cluster analysis with trimming Trimmed k-means clustering.
1692 Multivariate Statistics tsfa Time Series Factor Analysis Extraction of Factors from Multivariate Time Series. See ?00tsfa-Intro for more details.
1693 Multivariate Statistics vcd Visualizing Categorical Data Visualization techniques, data sets, summary and inference procedures aimed particularly at categorical data. Special emphasis is given to highly extensible grid graphics. The package was originally inspired by the book “Visualizing Categorical Data” by Michael Friendly and is now the main support package for a new book, “Discrete Data Analysis with R” by Michael Friendly and David Meyer (2015).
1694 Multivariate Statistics vegan (core) Community Ecology Package Ordination methods, diversity analysis and other functions for community and vegetation ecologists.
1695 Multivariate Statistics VGAM Vector Generalized Linear and Additive Models An implementation of about 6 major classes of statistical regression models. At the heart of it are the vector generalized linear and additive model (VGLM/VGAM) classes, and the book “Vector Generalized Linear and Additive Models: With an Implementation in R” (Yee, 2015) doi:10.1007/978-1-4939-2818-7 gives details of the statistical framework and VGAM package. Currently only fixed-effects models are implemented, i.e., no random-effects models. Many (150+) models and distributions are estimated by maximum likelihood estimation (MLE) or penalized MLE, using Fisher scoring. VGLMs can be loosely thought of as multivariate GLMs. VGAMs are data-driven VGLMs (i.e., with smoothing). The other classes are RR-VGLMs (reduced-rank VGLMs), quadratic RR-VGLMs, reduced-rank VGAMs, and RCIMs (row-column interaction models); these classes fit constrained and unconstrained quadratic ordination (CQO/UQO) models in ecology, as well as constrained additive ordination (CAO). Note that these functions are subject to change; see the NEWS and ChangeLog files for latest changes.
1696 Multivariate Statistics VIM Visualization and Imputation of Missing Values New tools for the visualization of missing and/or imputed values are introduced, which can be used for exploring the data and the structure of the missing and/or imputed values. Depending on this structure, the corresponding methods may help to identify the mechanism generating the missing values and allow exploration of the data, including the missing values. In addition, the quality of imputation can be visually explored using various univariate, bivariate, multiple and multivariate plot methods. A graphical user interface available in the separate package VIMGUI allows easy handling of the implemented plot methods.
1697 Multivariate Statistics xgobi Interface to the XGobi and XGvis programs for graphical data analysis Interface to the XGobi and XGvis programs for graphical data analysis.
1698 Multivariate Statistics YaleToolkit Data exploration tools from Yale University This collection of data exploration tools was developed at Yale University for the graphical exploration of complex multivariate data; barcode and gpairs now have their own packages. The new big.read.table() provided here may be useful for large files when only a subset is needed.
1699 Natural Language Processing alineR Alignment of Phonetic Sequences Using the ‘ALINE’ Algorithm Functions are provided to calculate the ‘ALINE’ Distance between words as per (Kondrak 2000) and (Downey, Hallmark, Cox, Norquest, & Lansing, 2008, doi:10.1080/09296170802326681). The score is based on phonetic features represented using the Unicode-compliant International Phonetic Alphabet (IPA). Parameterized feature weights are used to determine the optimal alignment, and functions are provided to estimate optimum values using a genetic algorithm and supervised learning. See (Downey, Sun, and Norquest 2017, https://journal.r-project.org/archive/2017/RJ-2017-005/index.html).
1700 Natural Language Processing boilerpipeR Interface to the Boilerpipe Java Library Generic Extraction of main text content from HTML files; removal of ads, sidebars and headers using the boilerpipe (http://code.google.com/p/boilerpipe/) Java library. The extraction heuristics from boilerpipe show a robust performance for a wide range of web site templates.
1701 Natural Language Processing corpora Statistics and data sets for corpus frequency data Utility functions and data sets for the statistical analysis of corpus frequency data, used in the SIGIL statistics course.
1702 Natural Language Processing gsubfn Utilities for strings and function arguments gsubfn is like gsub but can take a replacement function or certain other objects instead of the replacement string. Matches and back references are input to the replacement function and replaced by the function output. gsubfn can be used to split strings based on content rather than delimiters and for quasi-Perl-style string interpolation. The package also has facilities for translating formulas to functions and allowing such formulas in function calls instead of functions. This can be used with R functions such as apply, sapply, lapply, optim, integrate, xyplot, Filter and any other function that expects another function as an input argument, or with functions like cat or SQL calls that may involve strings where substitution is desirable.
1703 Natural Language Processing gutenbergr Download and Process Public Domain Works from Project Gutenberg Download and process public domain works in the Project Gutenberg collection http://www.gutenberg.org/. Includes metadata for all Project Gutenberg works, so that they can be searched and retrieved.
1704 Natural Language Processing hunspell High-Performance Stemmer, Tokenizer, and Spell Checker for R A spell checker and morphological analyzer library designed for languages with rich morphology and complex word compounding or character encoding. The package can check and analyze individual words as well as search for incorrect words within a text, LaTeX, HTML or XML document. Use the ‘devtools’ package to spell check R documentation with ‘hunspell’.
1705 Natural Language Processing kernlab Kernel-Based Machine Learning Lab Kernel-based machine learning methods for classification, regression, clustering, novelty detection, quantile regression and dimensionality reduction. Among other methods ‘kernlab’ includes Support Vector Machines, Spectral Clustering, Kernel PCA, Gaussian Processes and a QP solver.
1706 Natural Language Processing KoNLP Korean NLP Package POS Tagger and Morphological Analyzer for Korean text based research. It provides tools for corpus linguistics research such as Keystroke converter, Hangul automata, Concordance, and Mutual Information. It also provides a convenient interface for users to apply, edit and add morphological dictionary selectively.
1707 Natural Language Processing koRpus An R Package for Text Analysis A set of tools to analyze texts. Includes, amongst others, functions for automatic language detection, hyphenation, several indices of lexical diversity (e.g., type token ratio, HD-D/vocd-D, MTLD) and readability (e.g., Flesch, SMOG, LIX, Dale-Chall). Basic import functions for language corpora are also provided, to enable frequency analyses (supports Celex and Leipzig Corpora Collection file formats) and measures like tf-idf. Support for additional languages can be added on-the-fly or by plugin packages. Note: For full functionality a local installation of TreeTagger is recommended. ‘koRpus’ also includes a plugin for the R GUI and IDE RKWard, providing graphical dialogs for its basic features. The respective R package ‘rkward’ cannot be installed directly from a repository, as it is a part of RKWard. To make full use of this feature, please install RKWard from https://rkward.kde.org (plugins are detected automatically). Due to some restrictions on CRAN, the full package sources are only available from the project homepage. To ask for help, report bugs, request features, or discuss the development of the package, please subscribe to the koRpus-dev mailing list (http://korpusml.reaktanz.de).
1708 Natural Language Processing languageR Data sets and functions with “Analyzing Linguistic Data: A practical introduction to statistics” Data sets exemplifying statistical methods, and some facilitatory utility functions used in “Analyzing Linguistic Data: A practical introduction to statistics using R”, Cambridge University Press, 2008.
1709 Natural Language Processing lda Collapsed Gibbs Sampling Methods for Topic Models Implements latent Dirichlet allocation (LDA) and related models. This includes (but is not limited to) sLDA, corrLDA, and the mixed-membership stochastic blockmodel. Inference for all of these models is implemented via a fast collapsed Gibbs sampler written in C. Utility functions for reading/writing data typically used in topic models, as well as tools for examining posterior distributions are also included.
1710 Natural Language Processing lsa Latent Semantic Analysis The basic idea of latent semantic analysis (LSA) is that texts have a higher-order (= latent semantic) structure which, however, is obscured by word usage (e.g. through the use of synonyms or polysemy). By using conceptual indices that are derived statistically via a truncated singular value decomposition (a two-mode factor analysis) over a given document-term matrix, this variability problem can be overcome.
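The truncated SVD that underlies LSA can be sketched directly: factor a document-term matrix and keep only the top k singular dimensions, so documents with related (but not identical) vocabularies land near each other in the latent space. This is a conceptual NumPy illustration with a made-up toy corpus, not the ‘lsa’ package's API:

```python
import numpy as np

def lsa_embed(dtm, k):
    """Truncated SVD of a document-term matrix: rows of U * s are
    k-dimensional latent representations of the documents."""
    U, s, Vt = np.linalg.svd(dtm, full_matrices=False)
    return U[:, :k] * s[:k], Vt[:k]   # doc embeddings, term loadings

# Hypothetical tiny corpus: docs 1-2 share one vocabulary block,
# doc 3 uses a disjoint vocabulary.
dtm = np.array([
    [2, 1, 0, 0],   # doc 1
    [1, 2, 0, 0],   # doc 2
    [0, 0, 2, 1],   # doc 3
], dtype=float)
docs, terms = lsa_embed(dtm, k=2)
```

In the 2-dimensional latent space, documents 1 and 2 point in the same direction (cosine similarity near 1), while document 3 is orthogonal to them.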
1711 Natural Language Processing maxent Low-memory Multinomial Logistic Regression with Support for Text Classification maxent is an R package with tools for low-memory multinomial logistic regression, also known as maximum entropy. The focus of this maximum entropy classifier is to minimize memory consumption on very large datasets, particularly sparse document-term matrices represented by the tm package. The classifier is based on an efficient C++ implementation written by Dr. Yoshimasa Tsuruoka.
1712 Natural Language Processing monkeylearn Accesses the Monkeylearn API for Text Classifiers and Extractors Allows using some services of Monkeylearn http://monkeylearn.com/ which is a Machine Learning platform on the cloud for text analysis (classification and extraction).
1713 Natural Language Processing movMF Mixtures of von Mises-Fisher Distributions Fit and simulate mixtures of von Mises-Fisher distributions.
1714 Natural Language Processing mscstexta4r R Client for the Microsoft Cognitive Services Text Analytics REST API R Client for the Microsoft Cognitive Services Text Analytics REST API, including Sentiment Analysis, Topic Detection, Language Detection, and Key Phrase Extraction. An account MUST be registered at the Microsoft Cognitive Services website https://www.microsoft.com/cognitive-services/ in order to obtain a (free) API key. Without an API key, this package will not work properly.
1715 Natural Language Processing mscsweblm4r R Client for the Microsoft Cognitive Services Web Language Model REST API R Client for the Microsoft Cognitive Services Web Language Model REST API, including Break Into Words, Calculate Conditional Probability, Calculate Joint Probability, Generate Next Words, and List Available Models. A valid account MUST be registered at the Microsoft Cognitive Services website https://www.microsoft.com/cognitive-services/ in order to obtain a (free) API key. Without an API key, this package will not work properly.
1716 Natural Language Processing openNLP Apache OpenNLP Tools Interface An interface to the Apache OpenNLP tools (version 1.5.3). The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text written in Java. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. See http://opennlp.apache.org/ for more information.
1717 Natural Language Processing ore An R Interface to the Onigmo Regular Expression Library Provides an alternative to R’s built-in functionality for handling regular expressions, based on the Onigmo library. Offers first-class compiled regex objects, partial matching and function-based substitutions, amongst other features.
1718 Natural Language Processing phonics Phonetic Spelling Algorithms Provides a collection of phonetic algorithms including Soundex, Metaphone, NYSIIS, Caverphone, and others.
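Soundex, the oldest algorithm in the list above, illustrates the phonetic-coding family well: keep the first letter, map the remaining consonants to digit classes, drop vowels, and collapse adjacent repeats. A simplified sketch (it ignores the H/W separator rule of full American Soundex, and is not the ‘phonics’ implementation):

```python
def soundex(word):
    """Simplified American Soundex: first letter + three digits."""
    codes = {}
    for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6")]:
        for ch in letters:
            codes[ch] = digit
    word = word.lower()
    first = word[0].upper()
    digits = []
    prev = codes.get(word[0], "")    # letters coded like the first are skipped
    for ch in word[1:]:
        d = codes.get(ch, "")
        if d and d != prev:          # drop vowels, collapse adjacent repeats
            digits.append(d)
        prev = d
    return (first + "".join(digits) + "000")[:4]
```

Names that sound alike map to the same code, e.g. "Robert" and "Rupert" both encode to R163.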
1720 Natural Language Processing qdap Bridging the Gap Between Qualitative Data and Quantitative Analysis Automates many of the tasks associated with quantitative discourse analysis of transcripts, including frequency counts of sentence types, words, sentences, turns of talk, syllables, and other assorted analyses. The package provides parsing tools for preparing transcript data. Many functions enable the user to aggregate data by any number of grouping variables, providing analysis and seamless integration with other R packages that undertake higher level analysis and visualization of text. This affords the user a more efficient and targeted analysis. ‘qdap’ is designed for transcript analysis; however, many functions are applicable to other areas of Text Mining/Natural Language Processing.
1721 Natural Language Processing quanteda Quantitative Analysis of Textual Data A fast, flexible, and comprehensive framework for quantitative text analysis in R. Provides functionality for corpus management, creating and manipulating tokens and ngrams, exploring keywords in context, forming and manipulating sparse matrices of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and distances, applying content dictionaries, applying supervised and unsupervised machine learning, visually representing text and text analyses, and more.
1722 Natural Language Processing RcmdrPlugin.temis Graphical Integrated Text Mining Solution An ‘R Commander’ plug-in providing an integrated solution to perform a series of text mining tasks such as importing and cleaning a corpus, and analyses like terms and documents counts, vocabulary tables, terms co-occurrences and documents similarity measures, time series analysis, correspondence analysis and hierarchical clustering. Corpora can be imported from spreadsheet-like files, directories of raw text files, ‘Twitter’ queries, as well as from ‘Dow Jones Factiva’, ‘LexisNexis’, ‘Europresse’ and ‘Alceste’ files.
1723 Natural Language Processing rel Reliability Coefficients Derives point estimates with confidence intervals for Bennett et al.’s S, Cohen’s kappa, Conger’s kappa, Fleiss’ kappa, Gwet’s AC, intraclass correlation coefficients, Krippendorff’s alpha, Scott’s pi, the standard error of measurement, and weighted kappa.
1724 Natural Language Processing RKEA R/KEA Interface An R interface to KEA (Version 5.0). KEA (for Keyphrase Extraction Algorithm) allows for extracting keyphrases from text documents. It can be either used for free indexing or for indexing with a controlled vocabulary. For more information see http://www.nzdl.org/Kea/.
1725 Natural Language Processing RTextTools Automatic Text Classification via Supervised Learning RTextTools is a machine learning package for automatic text classification that makes it simple for novice users to get started with machine learning, while allowing experienced users to easily experiment with different settings and algorithm combinations. The package includes nine algorithms for ensemble classification (svm, slda, boosting, bagging, random forests, glmnet, decision trees, neural networks, maximum entropy), comprehensive analytics, and thorough documentation.
1726 Natural Language Processing RWeka R/Weka Interface An R interface to Weka (Version 3.9.1). Weka is a collection of machine learning algorithms for data mining tasks written in Java, containing tools for data pre-processing, classification, regression, clustering, association rules, and visualization. Package ‘RWeka’ contains the interface code, the Weka jar is in a separate package ‘RWekajars’. For more information on Weka see http://www.cs.waikato.ac.nz/ml/weka/.
1727 Natural Language Processing skmeans Spherical k-Means Clustering Algorithms to compute spherical k-means partitions. Features several methods, including a genetic and a fixed-point algorithm and an interface to the CLUTO vcluster program.
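The idea behind spherical k-means can be sketched in a few lines: documents and centroids live on the unit sphere, assignment maximises cosine similarity (the dot product of unit vectors), and each centroid update is the normalised mean of its members. The toy Python below (deterministic first-k initialisation, plain lists rather than sparse matrices) illustrates the algorithm only; it is not the ‘skmeans’ implementation.

```python
import math

def _unit(v):
    """Project a vector onto the unit sphere (leave zero vectors alone)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def spherical_kmeans(vectors, k, iters=50):
    """Toy spherical k-means with cosine-similarity assignment."""
    data = [_unit(v) for v in vectors]
    centroids = [list(v) for v in data[:k]]  # deterministic init
    labels = [0] * len(data)
    for _ in range(iters):
        # Assign each point to the centroid with the largest dot product
        # (equivalent to cosine similarity, since everything is unit length).
        labels = [
            max(range(k),
                key=lambda c: sum(a * b for a, b in zip(v, centroids[c])))
            for v in data
        ]
        # Recompute each non-empty centroid as the normalised member mean.
        for c in range(k):
            members = [v for v, l in zip(data, labels) if l == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = _unit(mean)
    return labels, centroids

# Two obvious directions in 2-D separate into two clusters.
labels, _ = spherical_kmeans([[1, 0], [0, 1], [0.9, 0.1], [0.1, 0.9]], 2)
```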
1728 Natural Language Processing SnowballC Snowball stemmers based on the C libstemmer UTF-8 library An R interface to the C libstemmer library that implements Porter’s word stemming algorithm for collapsing words to a common root to aid comparison of vocabulary. Currently supported languages are Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Portuguese, Romanian, Russian, Spanish, Swedish and Turkish.
1729 Natural Language Processing stm Estimation of the Structural Topic Model The Structural Topic Model (STM) allows researchers to estimate topic models with document-level covariates. The package also includes tools for model selection, visualization, and estimation of topic-covariate regressions.
1730 Natural Language Processing stringdist Approximate String Matching and String Distance Functions Implements an approximate string matching version of R’s native ‘match’ function. Can calculate various string distances based on edits (Damerau-Levenshtein, Hamming, Levenshtein, optimal string alignment), qgrams (q-gram, cosine, Jaccard distance) or heuristic metrics (Jaro, Jaro-Winkler). An implementation of soundex is provided as well. Distances can be computed between character vectors while taking proper care of encoding or between integer vectors representing generic sequences.
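One member of the edit-distance family listed above, optimal string alignment, can be sketched in Python as a dynamic program over the two strings. It extends Levenshtein distance with a transposition of adjacent characters, but (unlike full Damerau-Levenshtein) no substring is edited more than once; this is an illustration of the metric, not the ‘stringdist’ package’s C implementation.

```python
def osa_distance(a, b):
    """Optimal string alignment distance between strings a and b.

    Allowed edits: insertion, deletion, substitution, and transposition
    of two adjacent characters (each applied to a substring at most once).
    """
    # d[i][j] = distance between a[:i] and b[:j].
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[len(a)][len(b)]
```

The classic pair "CA"/"ABC" distinguishes the metrics: optimal string alignment gives 3, while unrestricted Damerau-Levenshtein gives 2.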
1731 Natural Language Processing stringi Character String Processing Facilities Allows for fast, correct, consistent, portable, as well as convenient character string/text processing in every locale and any native encoding. Owing to the use of the ICU library, the package provides R users with platform-independent functions known to Java, Perl, Python, PHP, and Ruby programmers. Available features include: pattern searching (e.g., with ICU Java-like regular expressions or the Unicode Collation Algorithm), random string generation, case mapping, string transliteration, concatenation, Unicode normalization, date-time formatting and parsing, etc.
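Unicode normalization, one of the ICU-backed features listed above, is easy to demonstrate outside R: the same accented character can be spelled as one code point or as a base letter plus a combining mark, and the two spellings compare unequal until both are normalised to a common form. Python’s standard library exposes the same NFC/NFD forms:

```python
import unicodedata

# U+00E9 is precomposed "é"; "e" + U+0301 (combining acute) looks identical
# on screen but is a different code-point sequence.
composed = "\u00e9"
decomposed = "e\u0301"

assert composed != decomposed                                   # raw comparison fails
assert unicodedata.normalize("NFC", decomposed) == composed     # compose
assert unicodedata.normalize("NFD", composed) == decomposed     # decompose
```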
1732 Natural Language Processing tau Text Analysis Utilities Utilities for text analysis.
1733 Natural Language Processing tesseract Open Source OCR Engine for R Bindings to ‘Tesseract’: An OCR engine with unicode (UTF-8) support that can recognize over 100 languages out of the box.
1734 Natural Language Processing text2vec Modern Text Mining Framework for R Fast and memory-friendly tools for text vectorization, topic modeling (LDA, LSA), word embeddings (GloVe), similarities. This package provides a source-agnostic streaming API, which allows researchers to perform analysis of collections of documents which are larger than available RAM.