Statistical Data Analysis

Classical statistical techniques are a necessary complement to any data analysis and should be used in knowledge discovery in databases, machine learning, and pattern recognition.

Experimental design

Experimental design techniques were first developed in agriculture, where data collection takes years and every bit of information counts. We can best help our customers if we are involved from the early stages of data collection, to make it as cost-effective and efficient as possible. We can work closely with engineers to solve difficult calibration and normalization problems. We use well-developed tools from statistical learning theory to predict the minimum data set sizes required to train a learning machine and to predict its performance on unseen data accurately.
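As an illustration of this kind of sizing calculation, here is a minimal sketch that assumes the two-sided Hoeffding bound, one standard result from statistical learning theory, and independent test examples. It is not necessarily the method used in any given engagement. The function name hoeffding_test_set_size and the parameters epsilon (tolerated gap between observed and true error rate) and delta (allowed probability of exceeding that gap) are hypothetical names chosen for this example.

```python
import math


def hoeffding_test_set_size(epsilon, delta):
    """Smallest test set size n satisfying the two-sided Hoeffding bound.

    For n independent test examples the bound states
        P(|observed error - true error| >= epsilon) <= 2 * exp(-2 * n * epsilon**2).
    Requiring the right-hand side to be at most delta and solving for n
    gives n >= ln(2 / delta) / (2 * epsilon**2).
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must be in (0, 1)")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


if __name__ == "__main__":
    # Error rate known to within +/- 0.05, with 95% confidence.
    print(hoeffding_test_set_size(epsilon=0.05, delta=0.05))
```

With epsilon = 0.05 and delta = 0.05 the sketch asks for 738 test examples. The bound is distribution-free and therefore conservative: tighter estimates are often possible when more is known about the problem.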

Exploratory data analysis

We use the tools of classical statistics, including Principal Component Analysis and clustering, to check the sanity of the data and to visualize its intrinsic structure.
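To make the Principal Component Analysis step concrete, the following is a small sketch that extracts only the first principal component by power iteration on the sample covariance matrix. It uses the Python standard library only and is meant for small illustrative data sets. The function name first_principal_component, the starting vector, and the fixed iteration count are assumptions made for this example, and a numerical library would normally be used in practice.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, variance, explained_ratio) for the first principal component.

    rows is a list of equal-length numeric lists (one list per observation).
    The direction is a unit vector whose sign is arbitrary.
    """
    n = len(rows)
    if n < 2:
        raise ValueError("need at least two observations")
    d = len(rows[0])

    # Center each variable on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix (divisor n - 1).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    # Power iteration. The start vector (1, 2, ..., d) is an arbitrary choice;
    # it fails only if it happens to be orthogonal to the leading eigenvector.
    v = [float(j + 1) for j in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance along the current direction
        v = [x / norm for x in w]

    # Rayleigh quotient v' C v gives the variance along the direction found.
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    total = sum(cov[i][i] for i in range(d))
    ratio = variance / total if total > 0.0 else 0.0
    return v, variance, ratio


if __name__ == "__main__":
    points = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
    direction, variance, ratio = first_principal_component(points)
    print(direction, variance, ratio)
```

An explained ratio close to 1 indicates that the data lie almost entirely along a single direction, which is one quick sanity check before any learning machine is trained.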

Confidence intervals and hypothesis testing

The last step of data analysis consists of assessing with what confidence certain claims can be made. Examples in machine learning include finding with what confidence we can assert that one learning machine will make better predictions than another, or that the error rate of a learning machine will be less than a certain value. Examples in classical statistics include finding how much we can trust a given correlation between variables, or the invariance of a variable with respect to a given parameter change. We can answer these questions using well-known hypothesis testing methods, including the t-test and the analysis of variance. We also design our own tests as needed.
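The two machine learning questions above can be sketched with simple distribution-free tools. This example deliberately does not use the t-test or the analysis of variance: it substitutes a one-sided Hoeffding bound for the error rate claim and an exact McNemar (sign) test for the comparison of two learning machines evaluated on the same test examples. The function names and parameters are hypothetical, and the test examples are assumed to be independent.

```python
import math


def hoeffding_error_upper_bound(errors, n, delta):
    """Upper bound on the true error rate, valid with confidence 1 - delta.

    Uses the one-sided Hoeffding bound:
        true error <= errors / n + sqrt(ln(1 / delta) / (2 * n)).
    """
    if n <= 0 or not 0 <= errors <= n:
        raise ValueError("need n > 0 and 0 <= errors <= n")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return errors / n + math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def mcnemar_exact_p(only_a_wrong, only_b_wrong):
    """Two-sided exact McNemar p-value for two classifiers on the same examples.

    only_a_wrong: examples misclassified by machine A but not by machine B.
    only_b_wrong: examples misclassified by machine B but not by machine A.
    Under the null hypothesis that both machines have the same error rate,
    each discordant example is equally likely to fall in either group.
    """
    n = only_a_wrong + only_b_wrong
    if n == 0:
        return 1.0  # no discordant examples, hence no evidence of a difference
    k = min(only_a_wrong, only_b_wrong)
    # Binomial(n, 1/2) probability of a split at least this lopsided on one side.
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


if __name__ == "__main__":
    # 50 errors on 1000 test examples, 95% confidence.
    print(hoeffding_error_upper_bound(errors=50, n=1000, delta=0.05))
    # Machine A alone is wrong on 10 examples, machine B alone on 2.
    print(mcnemar_exact_p(only_a_wrong=10, only_b_wrong=2))
```

A small p-value from the second function suggests that the difference between the two machines is unlikely to be due to chance alone. Only the examples on which the machines disagree carry information for this comparison.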

ClopiNet

955 Creston Road
Berkeley, CA 94708, USA
+1 (510) 524-6211
Email: info at clopinet.com