NPL
Neurological Programs and Libraries
|
Classes | |
struct | npl::RegrResult |
class | npl::StudentsT |
Student's T-distribution. A cache of the Probability Density Function and cumulative density function is created using the analytical PDF. More... | |
class | npl::Classifier |
Base class for all ND classifiers. More... | |
class | npl::KMeans |
K-means classifier. More... | |
class | npl::ExpMax |
Expectation Maximization With Gaussian Mixture Model. More... | |
Functions | |
double | npl::sample_var (const Ref< const VectorXd > vec) |
Computes the statistical variance of a column vector. More... | |
void | npl::regressOutLS (VectorXd &signal, const MatrixXd &X, const MatrixXd &covInv) |
Removes the effects of X from signal (y). Note that this takes both X and the pseudoinverse of X because the bulk of the computation will be on the pseudoinverse. More... | |
template<typename T > | |
void | npl::fillGaussian (Ref< T > m) |
double | npl::gaussianPDF (double mean, double sd, double x) |
1D Gaussian distribution function More... | |
double | npl::gaussianCDF (double mean, double sd, double x) |
1D Gaussian cumulative distribution function More... | |
double | npl::gammaPDF_MS (double mean, double sd, double x) |
PDF for the gamma distribution, if mean is negative then it is assumed that x should be negated as well. Unlike gammaPDF, this takes the mean and standard deviation (_MS) More... | |
double | npl::mutualInformation (size_t len, double *a, double *b, size_t mbin) |
Computes mutual information between signal a and signal b which are of length len. Marginal bins used is mbin. More... | |
double | npl::correlation (size_t len, double *a, double *b) |
Computes correlation between signal a and signal b which are of length len. More... | |
double | npl::sample_var (int count, double sum, double sumsqr) |
Takes a count, sum and sumsqr and returns the sample variance. This is slightly different than the variance definition and I can never remember the exact formulation. More... | |
double | npl::sample_corr (int count, double sum1, double sum2, double sumsq1, double sumsq2, double s1s2) |
Computes the sample correlation. More... | |
void | npl::regress (RegrResult *out, const Ref< const VectorXd > y, const Ref< const MatrixXd > X, const Ref< const VectorXd > covInv, const Ref< const MatrixXd > Xinv, const StudentsT &distrib) |
Computes the Ordinary Least Square predictors, beta for. More... | |
void | npl::regress (RegrResult *out, const Ref< const VectorXd > y, const Ref< const MatrixXd > X) |
Computes the Ordinary Least Square predictors, beta for. More... | |
void | npl::randomizePowerIterationSVD (const Ref< const MatrixXd > A, size_t subsize, size_t poweriters, MatrixXd &U, VectorXd &E, MatrixXd &V) |
void | npl::randomizePowerIterationSVD (const Ref< const MatrixXd > A, double tol, size_t startrank, size_t maxrank, size_t poweriters, MatrixXd &U, VectorXd &E, MatrixXd &V) |
MatrixXd | npl::pca (const Ref< const MatrixXd > X, double varth=1, int odim=-1) |
Computes the Principal Components of input matrix X. More... | |
MatrixXd | npl::rpiPCA (const Ref< const MatrixXd > X, double varth, int odim) |
Computes the Principal Components of input matrix X using the randomized power iteration SVD algorithm. More... | |
MatrixXd | npl::symICA (const Ref< const MatrixXd > Xin, MatrixXd *unmix=NULL) |
Computes the Independent Components of input matrix X using symmetric decorrlation. Note that you should run PCA on X before running ICA. More... | |
MatrixXd | npl::asymICA (const Ref< const MatrixXd > Xin, MatrixXd *unmix=NULL) |
Computes the Independent Components of input matrix X using sequential component extraction. Note that you should run PCA on X before running ICA. More... | |
MatrixXd | npl::pseudoInverse (const Ref< const MatrixXd > X) |
Computes the pseudoinverse of the input matrix. More... | |
VectorXd | npl::shootingRegr (const Ref< const MatrixXd > X, const Ref< const VectorXd > y, double gamma) |
Performs LASSO regression using the 'shooting' algorithm of. More... | |
VectorXd | npl::activeShootingRegr (const Ref< const MatrixXd > X, const Ref< const VectorXd > y, double gamma) |
Performs LASSO regression using the 'activeShooting' algorithm of. More... | |
MatrixXd | npl::pcacov (const Ref< const MatrixXd > cov, double varth) |
Computes the Principal Components of input matrix X using the covariance matrix. More... | |
void | npl::expMax1D (const Ref< const VectorXd > data, vector< std::function< double(double, double, double)>> pdfs, Ref< VectorXd > mean, Ref< VectorXd > sd, Ref< VectorXd > prior, std::string plotfile="") |
Computes the mean and standard deviation of multiple distributions based on 1D data. The probability distribution functions should be passed in through a vector of function objects (pdfs) taking mu/sd/x. More... | |
void | npl::gaussGammaMixtureModel (const Ref< const VectorXd > data, Ref< VectorXd > mu, Ref< VectorXd > sd, Ref< VectorXd > prior, std::string plotfile) |
Computes the mean and standard deviation of multiple distributions based on 1D data. This is a special version in which a negative gamma, a gaussian and positive gamma. The means of the negative and positive gamma are relative to the center of the gaussian. More... | |
void | npl::approxKMeans (const Ref< const MatrixXd > samples, size_t nclass, MatrixXd &means) |
Approximates k-means using the algorithm of: More... | |
int | npl::fastSearchFindDP (const Eigen::MatrixXf &samples, double thresh, double outthresh, Eigen::VectorXi &classes, bool brute=false) |
Algorithm of unsupervised learning (clustering) based on density. More... | |
int | npl::findDensityPeaks (const Eigen::MatrixXf &samples, double thresh, Eigen::VectorXf &rho, Eigen::VectorXf &delta, Eigen::VectorXi &parent) |
Computes Density and Peak computation for Fast Search and Find of Density Peaks algorithm. More... | |
int | npl::findDensityPeaks_brute (const Eigen::MatrixXf &samples, double thresh, Eigen::VectorXf &rho, Eigen::VectorXf &delta, Eigen::VectorXi &parent) |
Computes Density and Peak computation for Fast Search and Find of Density Peaks algorithm. This is a slower, non-bin based version. More... | |
VectorXd npl::activeShootingRegr | ( | const Ref< const MatrixXd > | X, |
const Ref< const VectorXd > | y, | ||
double | gamma | ||
) |
Performs LASSO regression using the 'activeShooting' algorithm of.
Peng, J., Wang, P., Zhou, N., & Zhu, J. (2009). Partial Correlation Estimation by Joint Sparse Regression Models. Journal of the American Statistical Association, 104(486), 735–746. doi:10.1198/jasa.2009.0126
Essentially solves the equation:
y = X * beta
where beta is mostly 0's
X | Design matrix |
y | Measured value |
gamma | Weight of regularization (larger values forces sparser model) |
void npl::approxKMeans | ( | const Ref< const MatrixXd > | samples, |
size_t | nclass, | ||
MatrixXd & | means | ||
) |
Approximates k-means using the algorithm of:
'Fast Approximate k-Means via Cluster Closures' by Wang et al
This is not really meant to be used by outside users, but as an initializer
samples | Matrix of samples, one sample per row, |
nclass | Number of classes to break samples up into |
means | Estimated mid-points/means |
MatrixXd npl::asymICA | ( | const Ref< const MatrixXd > | Xin, |
MatrixXd * | unmix = NULL |
||
) |
Computes the Independent Components of input matrix X using sequential component extraction. Note that you should run PCA on X before running ICA.
Outputs reduced dimension (fewer cols) in output
Note that this whole problem is the transpose of the version listed on wikipedia.
In: I number of components In: X RxC matrix with rows representing C-D samples for p in 1 to I: wp = random weight while wp changes: wp = g(X wp)X/R - wp SUM(g'(X wp))/R wp = wp - SUM wp^T wj wj normalize(wp)
Output: W = [w0 w1 w2 ... ] Output: S = XW, where each column is a dimension, each row a sample
Xin | RxC matrix where each row is a sample, each column a dimension (or feature). The number of columns in the output will be fewer because there will be fewer features. Columns should be zero-mean and uncorrelated with one another. |
unmix | output unmixing matrix, can be null if you don't need it |
double npl::correlation | ( | size_t | len, |
double * | a, | ||
double * | b | ||
) |
Computes correlation between signal a and signal b which are of length len.
len | Length of signal and and b |
a | Signal a |
b | Signal b |
void npl::expMax1D | ( | const Ref< const VectorXd > | data, |
vector< std::function< double(double, double, double)>> | pdfs, | ||
Ref< VectorXd > | mean, | ||
Ref< VectorXd > | sd, | ||
Ref< VectorXd > | prior, | ||
std::string | plotfile = "" |
||
) |
Computes the mean and standard deviation of multiple distributions based on 1D data. The probability distribution functions should be passed in through a vector of function objects (pdfs) taking mu/sd/x.
data | Data points to fit |
pdfs | Probability distribution functions of the form pdf(mu, sd, x) and returning the probability density at x |
mean | Output mean of each distribution, value when called will be used for initialization so it should be pre-set and pre-allocated |
sd | Output standard deviation of each distribution, value when called will be used for initialization so it should be pre-set and pre-allocated |
prior | Output prior probability of each distribution, initial value is not used but it should be pre-allocated |
plotfile | file to plot histogram in (svg) |
int npl::fastSearchFindDP | ( | const Eigen::MatrixXf & | samples, |
double | thresh, | ||
double | outthresh, | ||
Eigen::VectorXi & | classes, | ||
bool | brute = false |
||
) |
Algorithm of unsupervised learning (clustering) based on density.
see "Clustering by fast search and find of density peaks" by Rodriguez, a. Laio, A.
samples | Samples, S x D matrix with S is the number of samples and D is the dimensionality. This must match the internal dimension count. |
thresh | Threshold distance for density calculation |
outthresh | threshold for outlier, ratio of standard devation. Should be > 2 because you want to be well outside the center of the distribution. |
classes | Output classes |
brute | whether ther use slower brute force method for density calculation (for testing purposes only) |
return -1 if maximum number of iterations hit, 0 otherwise (converged)
void npl::fillGaussian | ( | Ref< T > | m | ) |
Definition at line 60 of file statistics.h.
int npl::findDensityPeaks | ( | const Eigen::MatrixXf & | samples, |
double | thresh, | ||
Eigen::VectorXf & | rho, | ||
Eigen::VectorXf & | delta, | ||
Eigen::VectorXi & | parent | ||
) |
Computes Density and Peak computation for Fast Search and Find of Density Peaks algorithm.
samples | Samples, S x D matrix with S is the number of samples and D is the dimensionality. This must match the internal dimension count. |
thresh | Threshold for density calculation |
rho | Point densities |
delta | Distance to nearest peak |
parent | Index (point) that is the nearest peak |
int npl::findDensityPeaks_brute | ( | const Eigen::MatrixXf & | samples, |
double | thresh, | ||
Eigen::VectorXf & | rho, | ||
Eigen::VectorXf & | delta, | ||
Eigen::VectorXi & | parent | ||
) |
Computes Density and Peak computation for Fast Search and Find of Density Peaks algorithm. This is a slower, non-bin based version.
samples | Samples, S x D matrix with S is the number of samples and D is the dimensionality. This must match the internal dimension count. |
thresh | Threshold for density calculation |
rho | Point densities |
delta | Distance to nearest peak |
parent | Index (point) that is the nearest peak |
double npl::gammaPDF_MS | ( | double | mean, |
double | sd, | ||
double | x | ||
) |
PDF for the gamma distribution, if mean is negative then it is assumed that x should be negated as well. Unlike gammaPDF, this takes the mean and standard deviation (_MS)
mean = k theta var = k theta theta
theta = var / mean k = mean / theta
prob(mu, sd, x) = x^{k-1}exp(-x/theta)/(gamma(k) theta^k) log prob(mu, sd, x) = (k-1)log(x)-x/theta-log(gamma(k)) - k*log(theta)
mean | Mean to compute density of |
sd | Standard deviation of distribution |
x | X position to compute density of (<0 produces 0 probability unless mu is negative |
void npl::gaussGammaMixtureModel | ( | const Ref< const VectorXd > | data, |
Ref< VectorXd > | mu, | ||
Ref< VectorXd > | sd, | ||
Ref< VectorXd > | prior, | ||
std::string | plotfile | ||
) |
Computes the mean and standard deviation of multiple distributions based on 1D data. This is a special version in which a negative gamma, a gaussian and positive gamma. The means of the negative and positive gamma are relative to the center of the gaussian.
data | Data points to fit |
mu | Output mean of each distribution, value when called will be used for initialization so it should be pre-set and pre-allocated |
sd | Output standard deviation of each distribution, value when called will be used for initialization so it should be pre-set and pre-allocated |
prior | Output prior probability of each distribution, initial value is not used but it should be pre-allocated |
plotfile | file to plot histogram in (svg) |
double npl::gaussianCDF | ( | double | mean, |
double | sd, | ||
double | x | ||
) |
1D Gaussian cumulative distribution function
mean | Mean of gaussian distribution |
sd | Standard deviation |
x | Position to get probability of |
double npl::gaussianPDF | ( | double | mean, |
double | sd, | ||
double | x | ||
) |
1D Gaussian distribution function
mean | Mean of gaussian distribution |
sd | Standard deviation |
x | Position to get probability of |
double npl::mutualInformation | ( | size_t | len, |
double * | a, | ||
double * | b, | ||
size_t | mbin | ||
) |
Computes mutual information between signal a and signal b which are of length len. Marginal bins used is mbin.
len | Length of signal and and b |
a | Signal a |
b | Signal b |
mbin | Bins to use in marginal distribution (mbin*mbin) used in joint |
MatrixXd npl::pca | ( | const Ref< const MatrixXd > | X, |
double | varth = 1 , |
||
int | odim = -1 |
||
) |
Computes the Principal Components of input matrix X.
Outputs reduced dimension (fewer cols) in output. Note that prior to this, the columns of X should be 0 mean otherwise the first component will be the mean
X | RxC matrix where each column row is a sample, each column a dimension (or feature). The number of columns in the output will be fewer because there will be fewer features |
varth | Variance threshold. This is the ratio (0-1) of variance to include in the output. This is used to determine the dimensionality of the output. If this is 1 then all variance will be included. If this < 1 and odim > 0 then whichever gives a larger output dimension will be selected. |
odim | Threshold for output dimensions. If this is <= 0 then it is ignored, if it is > 0 then max(dim(varth), odim) is used as the output dimension. |
MatrixXd npl::pcacov | ( | const Ref< const MatrixXd > | cov, |
double | varth | ||
) |
Computes the Principal Components of input matrix X using the covariance matrix.
Outputs projections in the columns of a matrix.
cov | Covariance matrix of XtX |
varth | Variance threshold. Don't include dimensions after this percent of the variance has been explained. |
MatrixXd npl::pseudoInverse | ( | const Ref< const MatrixXd > | X | ) |
Computes the pseudoinverse of the input matrix.
void npl::randomizePowerIterationSVD | ( | const Ref< const MatrixXd > | A, |
size_t | subsize, | ||
size_t | poweriters, | ||
MatrixXd & | U, | ||
VectorXd & | E, | ||
MatrixXd & | V | ||
) |
A | M x N |
subsize | Columns in projection matrix, |
poweriters | Number of power iterations to perform |
U | Output Left Singular Vectors |
E | OUtput Singular Values |
V | Output Right Singular Vectors |
void npl::randomizePowerIterationSVD | ( | const Ref< const MatrixXd > | A, |
double | tol, | ||
size_t | startrank, | ||
size_t | maxrank, | ||
size_t | poweriters, | ||
MatrixXd & | U, | ||
VectorXd & | E, | ||
MatrixXd & | V | ||
) |
void npl::regress | ( | RegrResult * | out, |
const Ref< const VectorXd > | y, | ||
const Ref< const MatrixXd > | X, | ||
const Ref< const VectorXd > | covInv, | ||
const Ref< const MatrixXd > | Xinv, | ||
const StudentsT & | distrib | ||
) |
Computes the Ordinary Least Square predictors, beta for.
Returning beta. This is the same as the other regress function, but allows for cacheing of pseudoinverse of X
out | Struct with Regression Results. |
y | response variables |
X | independent variables |
covInv | Inverse of covariance matrix, to compute us pseudoinverse(X^TX) |
Xinv | Pseudo inverse of X. Compute with pseudoInverse(X) |
distrib | Pre-computed students' T distribution. Example: auto v = students_t_cdf(X.rows()-1, .1, 1000); |
void npl::regress | ( | RegrResult * | out, |
const Ref< const VectorXd > | y, | ||
const Ref< const MatrixXd > | X | ||
) |
Computes the Ordinary Least Square predictors, beta for.
Returning beta. This is the same as the other regress function, but allows for cacheing of pseudoinverse of X
out | Struct with Regression Results. |
y | response variables |
X | independent variables |
|
inline |
Removes the effects of X from signal (y). Note that this takes both X and the pseudoinverse of X because the bulk of the computation will be on the pseudoinverse.
Each column of X should be a regressor, Xinv should be the pseudoinverse of X
Beta in OLS may be computed with the pseudoinverse (P): B = Py where P is the pseudoinverse: P = VE^{-1}U^*
signal | response term (y), will be modified to remove the effects of X |
X | Design matrix, or independent values in colums |
covInv | the pseudoinverse of XtX |
Definition at line 53 of file statistics.h.
MatrixXd npl::rpiPCA | ( | const Ref< const MatrixXd > | X, |
double | varth, | ||
int | odim | ||
) |
Computes the Principal Components of input matrix X using the randomized power iteration SVD algorithm.
Outputs reduced dimension (fewer cols) in output. Note that prior to this, the columns of X should be 0 mean otherwise the first component will be the mean
X | RxC matrix where each column row is a sample, each column a dimension (or feature). The number of columns in the output will be fewer because there will be fewer features |
varth | Variance threshold. This is the ratio (0-1) of variance to include in the output. This is used to determine the dimensionality of the output. If this is 1 then all variance will be included. If this < 1 and odim > 0 then whichever gives a larger output dimension will be selected. |
odim | Threshold for output dimensions. If this is <= 0 then it is ignored, if it is > 0 then max(dim(varth), odim) is used as the output dimension. |
|
inline |
Computes the sample correlation.
count | Number of samples |
sum1 | Sum of group 1 |
sum2 | Sum of group 2 |
sumsq1 | Sum of the sqaure of the elements of group1 |
sumsq2 | Sum of the sqaure of the elements of group2 |
s1s2 | Elements of group1*elements of group 2 |
Definition at line 179 of file statistics.h.
double npl::sample_var | ( | const Ref< const VectorXd > | vec | ) |
Computes the statistical variance of a column vector.
vec | Input samples |
|
inline |
Takes a count, sum and sumsqr and returns the sample variance. This is slightly different than the variance definition and I can never remember the exact formulation.
count | Number of samples |
sum | sum of samples |
sumsqr | sum of square of the samples |
Definition at line 161 of file statistics.h.
VectorXd npl::shootingRegr | ( | const Ref< const MatrixXd > | X, |
const Ref< const VectorXd > | y, | ||
double | gamma | ||
) |
Performs LASSO regression using the 'shooting' algorithm of.
Fu, W. J. (1998). Penalized Regressions: The Bridge versus the Lasso. Journal of Computational and Graphical Statistics, 7(3), 397. doi:10.2307/1390712
Essentially solves the equation:
y = X * beta
where beta is mostly 0's
X | Design matrix |
y | Measured value |
gamma | Weight of regularization (larger values forces sparser model) |
MatrixXd npl::symICA | ( | const Ref< const MatrixXd > | Xin, |
MatrixXd * | unmix = NULL |
||
) |
Computes the Independent Components of input matrix X using symmetric decorrlation. Note that you should run PCA on X before running ICA.
Same number of cols as output
Note that this whole problem is the transpose of the version listed on wikipedia.
In: I number of components In: X RxC matrix with rows representing C-D samples for p in 1 to I: wp = random weight while wp changes: wp = g(X wp)X/R - wp SUM(g'(X wp))/R wp = wp - SUM wp^T wj wj normalize(wp)
Output: W = [w0 w1 w2 ... ] Output: S = XW, where each column is a dimension, each row a sample
Xin | RxC matrix where each row is a sample, each column a dimension (or feature). The number of columns in the output will be fewer because there will be fewer features. Columns should be zero-mean and uncorrelated with one another. |
unmix | output unmixing matrix, can be null if you don't need it |