====== Statistical Methods for Data Science A.Y. 2019/20 ======

=====Instructor=====

  * **Salvatore Ruggieri**
    * Università di Pisa
    * [[http://pages.di.unipi.it/ruggieri/]]
    * [[salvatore.ruggieri@unipi.it]]
  * **Office hours**
    * Tuesday, 14:00 - 17:00, Department of Computer Science, room 321/DO.
    * **Office hours are held only via Skype. Skype contact: salvatore.ruggieri**

=====Classes=====

^ Day of Week ^ Hour ^ Room ^
| Tuesday | 16:00 - 18:00 | <del>Fib-L1</del> Distance Learning |
| Wednesday | 9:00 - 11:00 | <del>Fib-A1</del> Distance Learning |

=====Pre-requisites=====

Students should be comfortable with most of the calculus topics covered in:

  * **[P]** J. Ward, J. Abdey. **Mathematics and Statistics**. University of London, 2013. __Chapters 1-8 of Part 1__.

Extra lessons to refresh these notions may be scheduled in the first part of the course.

=====Mandatory Teaching Material=====

The following are //mandatory textbooks//:

  * **[T]** F.M. Dekking, C. Kraaikamp, H.P. Lopuhaä, L.E. Meester. **A Modern Introduction to Probability and Statistics**. Springer, 2005.
  * **[R]** P. Dalgaard. **Introductory Statistics with R**. 2nd edition, Springer, 2008.

=====Software=====

  * [[https://cran.r-project.org/|R]]
  * [[https://www.rstudio.com/|RStudio]]

=====Preliminary program and calendar=====

  * [[https://esami.unipi.it/esami2/programma.php?c=44017&aa=2019|Preliminary program]].
  * [[https://didattica.di.unipi.it/en/master-programme-in-data-science-and-business-informatics/academic-calendar-2019-2020/|Calendar of lessons]].

=====Student project=====

  * The project can be done in groups of at most 3 students.
  * The project (report + code) must be delivered by the end of July.
  * The oral discussion must take place by the September session, and it will cover both the project and all topics of the course.
  * The project replaces the written exam, but **students have to [[https://esami.unipi.it/esami2/|register for the written dates]] in order to fill in the student questionnaire**.
  * Groups ready to discuss should send the project to the teacher, together with their available time slots for the oral discussion.
  * {{ :mds:smd:smd.project.2020.pdf | Project presentation slides}}, [[http://apa.di.unipi.it/smd/video/2020_project.flv|project info audio-video (.flv)]] and [[http://apa.di.unipi.it/smd/video/2020_project_data.flv|project data audio-video (.flv)]].
  * [[https://drive.google.com/drive/folders/1HytbG8cQbQtgVTrdUXqyKQIv1Rz-EKU1?usp=sharing|Google Drive project directory]] (accessible only to authorized students)
=====Written exam=====

__//There are no mid-terms//.__ The exam consists of a written part and an oral part. The written part consists of open questions and exercises on the topics of the course. Each question is assigned a number of points, for a total of 30 points. Students are admitted to the oral part if they receive a grade of at least 18 points. Example written texts: **{{ :mds:smd:smdsample.pdf | sample1}}**, **{{ :mds:smd:smdsample2.pdf | sample2}}**. The oral part consists of a critical discussion of the written part, and of open questions and problem solving on the topics of the course.\\
**Online exams:** during the COVID-19 restrictions, both the written part and the oral part will be held online. For the written part, students will connect to [[ https://www.unipi.it/index.php/docenti2/item/17671-corsi-online | Google Meet]] (room code: 500PP) and must turn on both microphone and webcam. Each sheet must include the student's name, surname and student ID, and must be signed. A picture of the sheets must be sent to [[ruggieri@di.unipi.it]].

Registration for exams is mandatory (**check the registration deadline!**): [[https://esami.unipi.it/esami2/|register here]]

^ Date ^ Hour ^ Room ^ Notes ^
| 19/01/2021 | 16:00 - 18:00 | Online exam | |
| 09/02/2021 | 16:00 - 18:00 | Online exam | |
=====Class calendar=====

**Distance-learning lessons**: see instructions for [[ https://www.unipi.it/index.php/docenti2/item/17671-corsi-online | Google Meet]] and use the room code: 500PP.

^ # ^ Date ^ Room ^ Topic ^ Learning material ^
|1| 25.02 16:00-18:00 | L1 | Introduction. Probability and independence. | **[T]** Chpts. 1-3 |
|2| 26.02 9:00-11:00 | A1 | R basics. | **[R]** Chpts. 1,2.1,2.2 {{ :mds:smd:r_intro.pdf | slides}} {{ :mds:smd:2019smdr1.r | script1.R}} |
|3| 03.03 16:00-18:00 | L1 | Discrete random variables. | **[T]** Chpt. 4 **[R]** Chpt. 3 {{ :mds:smd:2019smdr2.r | script2.R}} |
|4| 04.03 9:00-11:00 | A1 | Continuous random variables. Simulation. | **[T]** Chpts. 5, 6.1-6.2 **[R]** Chpt. 3 {{ :mds:smd:2019smdr3.r | script3.R}} |
|5| 10.03 16:00-18:00 | Distance-learning | Refresher: derivatives and integrals. [[http://apa.di.unipi.it/smd/video/rec01_20200310.flv|rec01 audio-video (.flv)]] | **[P]** Chpts. 1-8 {{ :mds:smd:2018smdrmath.r | scriptMath.R}} |
|6| 11.03 9:00-11:00 | Distance-learning | Expectation and variance. R data access. [[http://apa.di.unipi.it/smd/video/rec02_20200311.flv|rec02 audio-video (.flv)]] | **[T]** Chpt. 7 **[R]** Chpt. 2.4 {{ :mds:smd:2019smdr4.r | script4.R}} |
|7| 17.03 16:00-18:00 | Distance-learning | R programming. Project presentation. [[http://apa.di.unipi.it/smd/video/rec03_20200317.flv|rec03 audio-video (.flv)]] and [[http://apa.di.unipi.it/smd/video/2020_project.flv|project info audio-video (.flv)]] | **[R]** Chpt. 2.3 {{ :mds:smd:r_intro_exercise.r | exercise.R}} {{ :mds:smd:2019smdr5.zip | script5.zip}} |
|8| 18.03 9:00-11:00 | Distance-learning | Project presentation. Power laws and Zipf laws. [[http://apa.di.unipi.it/smd/video/rec04_20200318.flv|rec04 audio-video (.flv)]] | [[https://arxiv.org/pdf/cond-mat/0412004.pdf | Newman's paper]] Sect I, II, III(A,B,E,F) {{ :mds:smd:2019smdr6.r | script6.R}} |
|9| 24.03 16:00-18:00 | Distance-learning | Computations with random variables. Joint distributions. [[http://apa.di.unipi.it/smd/video/rec05_20200324.flv|rec05 audio-video (.flv)]] | **[T]** Chpts. 8-9 {{ :mds:smd:2019smdr7.zip | script7.zip}} |
|10| 25.03 9:00-11:00 | Distance-learning | Covariance. Sum of random variables. [[http://apa.di.unipi.it/smd/video/rec06_20200325.flv|rec06 audio-video (.flv)]] | **[T]** Chpts. 10-11 {{ :mds:smd:2019smdr8.r | script8.R}} |
|11| 31.03 16:00-18:00 | Distance-learning | Law of large numbers. The central limit theorem. [[http://apa.di.unipi.it/smd/video/rec07_20200331.flv|rec07 audio-video (.flv)]] | **[T]** Chpts. 13-14 {{ :mds:smd:2019smdr9.r | script9.R}} |
|12| 01.04 9:00-11:00 | Distance-learning | Graphical summaries. [[http://apa.di.unipi.it/smd/video/rec08_20200401.flv|rec08 audio-video (.flv)]] | **[T]** Chpt. 15 {{ :mds:smd:2019smdr10.r | script10.R}} |
|13| 07.04 16:00-18:00 | Distance-learning | Numerical summaries. Data preprocessing in R. Q&A on the project. [[http://apa.di.unipi.it/smd/video/rec09_20200407.flv|rec09 audio-video (.flv)]], [[http://apa.di.unipi.it/smd/video/2020_project_data.flv|project data audio-video (.flv)]] | **[T]** Chpt. 16, **[R]** Chpts. 4,10 {{ :mds:smd:2019smdr11.r | script11.R}}, {{ :mds:smd:dataprep.r | dataprep.R}} |
|14| 08.04 9:00-11:00 | Distance-learning | Unbiased estimators. Efficiency and MSE. [[http://apa.di.unipi.it/smd/video/rec10_20200408.flv|rec10 audio-video (.flv)]] | **[T]** Chpts. 17.1-17.3, 19, 20 {{ :mds:smd:2019smdr12.r | script12.R}} |
|<del>XX</del>| <del>15.04 9:00-11:00</del> | | No lesson on this date. Students work on the project on their own. | |
|15| 21.04 16:00-18:00 | Distance-learning | Maximum likelihood. Fisher information. [[http://apa.di.unipi.it/smd/video/rec11_20200421.flv|rec11 audio-video (.flv)]] | **[T]** Chpt. 21 {{ :mds:smd:notes1.pdf |}} {{ :mds:smd:2019smdr13.r | script13.R}} |
|16| 22.04 9:00-11:00 | Distance-learning | Simple linear and polynomial regression. Least squares. [[http://apa.di.unipi.it/smd/video/rec12_20200422.flv|rec12 audio-video (.flv)]] | **[T]** Chpts. 17.4,22 **[R]** Chpts. 6,12.1 {{ :mds:smd:2019smdr14.r | script14.R}} |
|17| 28.04 16:00-18:00 | Distance-learning | Multiple, non-linear, and logistic regression. [[http://apa.di.unipi.it/smd/video/rec13_20200428.flv|rec13 audio-video (.flv)]] | **[R]** Chpts. 13,16.1-16.2 {{ :mds:smd:notes2.pdf |}} {{ :mds:smd:2019smdr15.r | script15.R}} |
|18| 29.04 9:00-11:00 | Distance-learning | Confidence intervals: Gaussian, Student's t, large-sample method. [[http://apa.di.unipi.it/smd/video/rec14_20200429.flv|rec14 audio-video (.flv)]] | **[T]** Chpts. 23.1,23.2,23.4, 24.3,24.4 {{ :mds:smd:2019smdr16.r | script16.R}} |
|19| 05.05 16:00-18:00 | Distance-learning | Confidence intervals in linear regression. Empirical bootstrap. Application to confidence intervals. [[http://apa.di.unipi.it/smd/video/rec15_20200505.flv|rec15 audio-video (.flv)]] | **[T]** Chpts. 18.1,18.2,23.3 {{ :mds:smd:notes2.pdf |}} {{ :mds:smd:2019smdr17.r | script17.R}} |
|20| 06.05 9:00-11:00 | Distance-learning | Parametric bootstrap. Hypothesis testing. [[http://apa.di.unipi.it/smd/video/rec16_20200506.flv|rec16 audio-video (.flv)]] | **[T]** Chpts. 18.3,25 {{ :mds:smd:2019smdr18.r | script18.R}} |
|21| 12.05 16:00-18:00 | Distance-learning | One-sample t-test and application to linear regression. [[http://apa.di.unipi.it/smd/video/rec17_20200512.flv|rec17 audio-video (.flv)]] | **[T]** Chpts. 26-27, **[R]** Chpts. 5.1,5.2 {{ :mds:smd:notes2.pdf |}} {{ :mds:smd:2019smdr19.r | script19.R}} |
|22| 13.05 9:00-11:00 | Distance-learning | Goodness of fit: chi-square, K-S. Fitting power laws. [[http://apa.di.unipi.it/smd/video/rec18_20200513.flv|rec18 audio-video (.flv)]] | {{ :mds:smd:ks.pdf | K-S}} {{ :mds:smd:2019smdr20.r | script20.R}} |
|<del>XX</del>| <del>19.05 16:00-18:00</del> | | No lesson on this date. Students work on the project on their own. | |
|23| 20.05 9:00-11:00 | Distance-learning | Hypothesis testing: F-test, comparing two samples. [[http://apa.di.unipi.it/smd/video/rec19_20200520.flv|rec19 audio-video (.flv)]] | **[T]** Chpt. 28, **[R]** Chpts. 5.3-5.7 {{ :mds:smd:2019smdr21.r | script21.R}} |
|<del>XX</del>| <del>26.05 16:00-18:00</del> | | No lesson on this date. Students work on the project on their own. | |
|24| 27.05 9:00-11:00 | Distance-learning | Project tutoring. [[http://apa.di.unipi.it/smd/video/rec20_20200527.flv|rec20 audio-video (.flv)]] | |

=====Previous years=====

  * [[mds:smd:2019|Statistical Methods for Data Science A.Y. 2018/19]]
  * [[mds:smd:2018|Statistical Methods for Data Science A.Y. 2017/18]]
  * [[mds:smd:2017|Statistical Methods for Data Science A.Y. 2016/17]]