Contents

Content Description:
Today's currency is data. However, data is only valuable if we can extract useful information from it, which is the general aim of data analysis. This course surveys the foundations of data analysis, covering concepts from statistical inference, regression analysis, classification analysis, clustering analysis, and dimensionality reduction. Topics include, but are not limited to:
  • Models, Statistical Inference, and General Techniques
    • Fundamental Concepts in Inference
    • Parametric Inference
    • Hypothesis Testing and p-values
    • The Bootstrap
    • Data Splitting, Cross-Validation
  • Regression Modelling
    • Simple Linear Regression
    • Multiple Regression
    • Further Regression Methods
    • Generalized Linear Models
    • Regression Trees
  • Classification Modelling
    • Decision-Theoretic Introduction: Error Rates and Bayes Optimality
    • Logistic Regression
    • Classification Trees
    • Support Vector Machines
    • Further Classification Methods
  • Neural Networks
  • Basic Techniques of Unsupervised Learning
    • Dimension Reduction (Matrix Factorization)
    • Association Rules
  • Clustering Methods
    • Hierarchical Clustering
    • Model-based Clustering
    • Evaluation and Validation of Clustering Results
    • Density-based Clustering
    • Self-Organizing Maps