Statistics

Multidimensional statistics video lectures

Multidimensional statistics is useful in many fields. It relies on measure theory and probability theory, and gives the first insights into machine learning. As an example, you will learn to build confidence intervals for your estimators.
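
As a first taste, here is a minimal Python sketch of such a confidence interval (NumPy, SciPy and the simulated data are assumptions of this sketch, not part of the lectures): a 95% interval for the mean of an i.i.d. sample, based on the Student t distribution.

    import numpy as np
    from scipy import stats

    # Simulated i.i.d. sample (illustrative data only)
    rng = np.random.default_rng(0)
    sample = rng.normal(loc=2.0, scale=1.5, size=100)

    n = sample.size
    mean = sample.mean()
    sem = sample.std(ddof=1) / np.sqrt(n)  # standard error of the mean

    # 95% confidence interval: mean +/- t_{n-1, 0.975} * sem
    t_crit = stats.t.ppf(0.975, df=n - 1)
    print(f"95% CI for the mean: [{mean - t_crit * sem:.3f}, {mean + t_crit * sem:.3f}]")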

I Basic statistics

PDF | Video
In this lecture we present some basic results from statistics underlying machine learning theory; in forthcoming lectures we shall present more involved results from multivariate statistics. A short hypothesis-testing sketch follows the outline below.

Outline:
Statistical estimation
    Statistical framework
    Mean squared error of an estimator
    Estimators of expectation and variance
Confidence interval
Hypothesis testing
    Hypothesis tests
    Hypothesis tests for Gaussian random variables
    Statistic and test statistic
    The p-value
Likelihood
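
As an illustration of the test-statistic and p-value items above, the following sketch (again assuming NumPy, SciPy and simulated data) runs a one-sample Student t-test of the null hypothesis that the mean is zero.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    sample = rng.normal(loc=0.3, scale=1.0, size=50)  # true mean is 0.3

    # Test H0: mean = 0 against the two-sided alternative
    t_stat, p_value = stats.ttest_1samp(sample, popmean=0.0)
    print(f"test statistic t = {t_stat:.3f}, p-value = {p_value:.4f}")
    # Reject H0 at level 5% when the p-value is below 0.05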

II Linear fitting and regression

PDF | Video
Linear fitting and regression are classical problems in statistics which illustrate the basic ideas underlying machine learning theory. Indeed, linear fitting is one of the simplest examples of a machine learning problem. A least-squares sketch follows the outline below.

Outline:
Unidimensional input
    Linear fitting
    Linear regression
    Linear prediction
Multidimensional input
    Multidimensional linear fitting
    Multidimensional linear regression
    Multidimensional linear prediction
Gaussian model
    Maximum likelihood estimators
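
A minimal sketch of multidimensional linear regression by ordinary least squares, in Python with NumPy (the simulated design and coefficients are assumptions of the sketch). Under the Gaussian model of the last item, the least-squares estimator is also the maximum likelihood estimator.

    import numpy as np

    rng = np.random.default_rng(2)
    n, p = 200, 3
    X = rng.normal(size=(n, p))                    # multidimensional input
    beta_true = np.array([1.0, -2.0, 0.5])
    y = X @ beta_true + 0.3 * rng.normal(size=n)   # linear model with Gaussian noise

    # Ordinary least squares: minimize ||y - X1 beta||^2 over beta
    X1 = np.column_stack([np.ones(n), X])          # prepend an intercept column
    beta_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)

    # Under the Gaussian model, the MLE of the noise variance
    # is the mean squared residual
    residuals = y - X1 @ beta_hat
    print("beta_hat:", beta_hat, "sigma2_mle:", np.mean(residuals**2))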

III Logistic regression

PDF | Video
Logistic regression is also one of the simplest examples of a machine learning problem, and it illustrates the ideas underlying some machine learning algorithms. It adapts linear regression to the case where the output variable is discrete. A short sketch follows the outline below.

Outline:
Binary logistic regression
    Binary logistic regression model
    Binary logistic prediction
Multiclass logistic regression
    Multiclass logistic regression model
    Multiclass logistic prediction
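
A minimal sketch of binary logistic regression in Python (scikit-learn and the simulated data are assumptions of this sketch): labels are drawn from a logistic model, then the coefficients are fit and class probabilities predicted.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(3)
    X = rng.normal(size=(200, 2))
    # Binary labels from a logistic model: P(Y=1 | x) = 1 / (1 + exp(-(w . x)))
    logits = 1.5 * X[:, 0] - 1.0 * X[:, 1]
    y = (rng.random(200) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

    clf = LogisticRegression().fit(X, y)           # maximum likelihood fit
    print("coefficients:", clf.coef_, "intercept:", clf.intercept_)
    print("P(Y=1 | x):", clf.predict_proba(X[:3])[:, 1])  # first three points

In recent scikit-learn versions the same estimator fits the multinomial (softmax) model when the labels take more than two values.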

IV Principal component and factor analysis

PDF
We shall present two techniques, called ‘principal component analysis’ and ‘factor analysis’, which aim to reduce possible redundancy among observed random variables Y1, Y2, ..., Yp by using a smaller number of components (or factors). The objective is the same, but each technique has its own specificities, as we shall see. A PCA sketch follows the outline below.

Outline:
Principal component analysis (PCA)
    PCA for random observed variables
    PCA for deterministic observed variables
Factor analysis
    Existence of factor analysis model
    Scaling in factor analysis
    Rotation of factors
    Interpretation of factors
    Estimating loadings
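
A minimal PCA sketch in Python with NumPy (the simulated two-factor data are an assumption of this sketch): PCA is computed as the eigendecomposition of the empirical covariance matrix of the centered observations.

    import numpy as np

    rng = np.random.default_rng(4)
    n, p = 300, 4
    factors = rng.normal(size=(n, 2))              # two underlying factors
    loadings = rng.normal(size=(2, p))
    Y = factors @ loadings + 0.1 * rng.normal(size=(n, p))  # redundant observations

    # PCA: eigendecomposition of the empirical covariance matrix
    Yc = Y - Y.mean(axis=0)                        # center the observations
    cov = Yc.T @ Yc / (n - 1)
    eigval, eigvec = np.linalg.eigh(cov)           # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1]               # sort descending
    eigval, eigvec = eigval[order], eigvec[:, order]

    print("explained variance ratios:", np.round(eigval / eigval.sum(), 3))
    components = Yc @ eigvec[:, :2]                # scores on the top 2 components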

Mohamed Kadhem KARRAY

My research activities at Orange aim to evaluate the performance of communication networks by combining information theory, queueing theory, stochastic geometry, and machine and deep learning. Recently, I prepared video lectures on "Data science: From multivariate statistics to machine and deep learning", available on my YouTube channel. I also taught at Ecole Normale Supérieure, Ecole Polytechnique, and Ecole Centrale Paris, and prepared several mathematical books.