
# Statistical Methods for Data Science A.Y. 2017/18

## Instructors

• Daniele Tantari, Scuola Normale Superiore
• Salvatore Ruggieri, Università di Pisa

## Classes

| Day of week | Hour | Room |
|---|---|---|
| Monday | 16:00 - 18:00 | Fib-L1 |
| Tuesday | 9:00 - 11:00 | Fib-N1 |

## Office hours

• Prof. Tantari: Tuesday, 11:00 - 15:00, Scuola Normale Superiore, room 93 (please send an email in advance).
• Prof. Ruggieri: Tuesday, 14:00 - 17:00, Department of Computer Science, room 321/DO.

## Textbooks

The following textbooks are mandatory:

• [B1] F.M. Dekking, C. Kraaikamp, H.P. Lopuhaä, L.E. Meester. A Modern Introduction to Probability and Statistics. Springer, 2005.
• [B2] P. Dalgaard. Introductory Statistics with R. 2nd edition, Springer, 2008.

The following optional textbook reviews the mathematical prerequisites of the course:

• [B3] J. Ward, J. Abdey. Mathematics and Statistics. University of London, 2013. Chapters 4-8 of Part 1 present basic calculus (derivatives and integrals).

## Written exam

The written exam consists of open questions and exercises. Example texts: sample1, sample2. The exam lasts 2 hours. No teaching material may be consulted during the exam. Registration is mandatory.

| Date | Hour | Room |
|---|---|---|
| 22/1/2019 | 9:00 - 11:00 | Fib-L1 |
| 12/2/2019 | 9:00 - 11:00 | Fib-L1 |

## Class calendar (final)

| # | Date and time | Room | Topic | Learning material | Instructor |
|---|---|---|---|---|---|
| 1 | 19.02 16:00-18:00 | L1 | Introduction. Probability and independence. | [B1] Chpts. 1-3 | Tantari |
| 2 | 20.02 9:00-11:00 | N1 | R basics. | [B2] Chpts. 1, 2.1, 2.4; slides; script1.R | Ruggieri |
| 3 | 27.02 9:00-11:00 | N1 | Discrete and continuous random variables. | [B1] Chpts. 4-5 | Tantari |
| 4 | 06.03 9:00-11:00 | N1 | Simulation. Expectation and variance. | [B1] Chpts. 6-7; noteSim | Tantari |
| 5 | 12.03 16:00-18:00 | L1 | R basics and distributions. | [B2] Chpts. 2.2, 3-4; script2.R | Ruggieri |
| 6 | 13.03 9:00-11:00 | N1 | R programming and graphics. | [B2] Chpts. 2.3, 3-4; exercise.R; script3.R | Ruggieri |
| 7 | 19.03 16:00-18:00 | L1 | Computations with random variables. Covariance. | [B1] Chpts. 8-10 | Tantari |
| 8 | 20.03 9:00-11:00 | N1 | Sum of random variables. Law of large numbers. | [B1] Chpts. 11, 13 | Tantari |
| 9 | 26.03 16:00-18:00 | L1 | The central limit theorem. Graphical summaries. | [B1] Chpts. 14, 15 | Tantari |
| 10 | 27.03 9:00-11:00 | N1 | Numerical summaries. Poisson process. | [B1] Chpts. 12, 16; Rcode; slides | Tantari |
| 11 | 16.04 16:00-18:00 | L1 | Examples on CLT. Data preprocessing. | [B2] Chpt. 10; dataprep.r; script4.R | Ruggieri |
| 12 | 17.04 9:00-11:00 | N1 | Unbiased estimators. Efficiency and MSE. | [B1] Chpts. 17, 19, 20 | Tantari |
| 13 | 23.04 16:00-18:00 | L1 | Maximum likelihood. | [B1] Chpt. 21 | Tantari |
| 14 | 24.04 9:00-11:00 | N1 | Fisher information. Linear regression and least squares. | [B1] Chpt. 22; fisher | Tantari |
| 15 | 30.04 16:00-18:00 | L1 | Examples on and MSE. Power-laws. | Newman's paper; roc_adult.R; script5.R | Ruggieri |
| 16 | 02.05 14:00-16:00 | A1 | Project and data presentation. | | Tantari+Ruggieri |
| 17 | 07.05 16:00-18:00 | L1 | Confidence intervals: Gaussian, Student's t, large-sample method. | [B1] Chpts. 23, 24 | Tantari |
| 18 | 08.05 9:00-11:00 | N1 | Empirical and parametric bootstrap. Application to confidence intervals. | [B1] Chpts. 18, 23 | Tantari |
| 19 | 14.05 16:00-18:00 | L1 | Hypothesis testing. | [B1] Chpts. 25-26 | Tantari |
| 20 | 15.05 9:00-11:00 | N1 | Hypothesis testing. Bootstrap. Project tutoring. | [B2] Chpt. 5.1; script6.R | Ruggieri |
| 21 | 21.05 16:00-18:00 | L1 | Hypothesis testing. t-test and application to linear regression. | [B1] Chpt. 27 | Tantari |
| 22 | 22.05 9:00-11:00 | N1 | Hypothesis testing: correlation and Fisher transformation, comparing samples. | [B1] Chpt. 28; CorrNotes | Tantari |
| 23 | 28.05 16:00-18:00 | L1 | Hypothesis testing: F-test, K-S, chi-square. | K-S | Tantari |
| 24 | 29.05 9:00-11:00 | N1 | Hypothesis testing, parameter estimation. | [B2] Chpts. 5.2-5.7, 6; script7.R | Ruggieri |

## Previous years

mds/smd/2018.txt · Last modified: 24/02/2021 at 15:46 (7 months ago) by Salvatore Ruggieri