Degree programme: DATA SCIENCE AND BUSINESS INFORMATICS

Code: 500PP

Credits (CFU): 6

Period: Second semester

A student who successfully completes the course will have a solid knowledge of the main concepts and tools of statistical analysis, including the definition of a statistical model, the inference of its parameters with confidence intervals, and the use of hypothesis testing, as well as some basic knowledge of the statistics of linear time series. Finally, the student will be able to use the R language to perform statistical analyses.

The student will be assessed on his/her demonstrated ability to discuss the main course contents using the appropriate terminology, and to apply the main statistical methods in different contexts. The written exam consists of a 2-hour test. The oral exam consists of a discussion of the written exam and open questions on the topics of the course.

Methods:

- Final oral exam
- Final written exam

Students taking the exam in the first sessions have the option of preparing a small research project in place of the written exam.

The student will be able to understand the main concepts of statistical analysis and to choose and apply the appropriate tool to the case under study. The student will also be able to use the R language to perform statistical analyses.

Attending students will do a group project on the statistical analysis of a large dataset, for which a series of questions will be proposed. The project will assess skills in the choice and use of existing statistical tests.

Basic knowledge of calculus. Basic knowledge of probability is useful but not indispensable.

Delivery: face to face

Learning activities:

- attending lectures
- participation in discussions
- individual study
- group project

Attendance: Advised

Teaching methods:

- Lectures
- Lab sessions

The program covers the basic methodologies, techniques and tools of statistical analysis. This includes basic knowledge of probability theory, random variables, convergence theorems, statistical models, estimation theory, and hypothesis testing. Other topics covered include the bootstrap, expectation-maximization, and basic time series analysis. Finally, the program covers the use of the R language for statistical analysis.

- F.M. Dekking, C. Kraaikamp, H.P. Lopuhaä, L.E. Meester. **A Modern Introduction to Probability and Statistics**. Springer, 2005.
- P. Dalgaard. **Introductory Statistics with R**. 2nd edition, Springer, 2008.

Non-attending students cannot do the project. Everything else remains unchanged.

The exam consists of a written part and an oral part. The written part includes open questions and exercises. The oral part consists of open questions on the topics of the course. Attending students may replace the written part with a project to be done in groups throughout the course.