Course syllabus
BIG DATA ANALYTICS
FOSCA GIANNOTTI
Academic year: 2017/18
Degree programme: INFORMATICA PER L'ECONOMIA E PER L'AZIENDA (BUSINESS INFORMATICS)
Code: 599AA
Credits (CFU): 6
Period: First semester
Language: Italian

Modules
BIG DATA ANALYTICS
Sector: INF/01
Type: Lectures
Hours: 48
Teachers: FOSCA GIANNOTTI, ROBERTO TRASARTI
Learning outcomes
Knowledge

This course puts to work the main data analytics technologies and competences (data mining, machine learning, social network analytics, visual analytics) in carrying out a complete big data analytics project: from acquiring and analyzing big data from multiple sources, in order to discover the patterns and models that explain certain phenomena, through to validating and presenting the findings. Students gain experience in different domains (mobility and transportation, urban planning, demographics, economics, social relationships, opinion and sentiment, etc.) and with the analytical and mining methods that can be used in them.

An introduction to scalable analytics is also given, using the “map-reduce” paradigm and Hadoop technology. Students will be able to solve programming tasks on analytical problems using the scalable technology introduced in class.

Assessment criteria of knowledge

Students are required to be able to formulate, design and implement an individual big data analytics project, and to solve simple programming tasks on analytical problems using the scalable technology introduced in class.

Skills

Module 1: Big Data Analytics and Social Mining. What can be learnt from big data in different domains. Use cases are presented and discussed in domains such as mobility and transportation, urban planning, demographics, economics, social relationships, and opinion and sentiment, together with the analytical and mining methods that can be used.

Module 2: Scalable Data Analytics Technologies. The focus is on managing the pipeline of the analytical process to build scalable, robust data science applications. It includes an introduction to Hadoop, Spark and Mahout, and real case examples of managing scalability.

Teaching methods

The course is organized into two intertwined modules:

Module 1: Big Data Analytics and Social Mining. What can be learnt from big data in different domains. Use cases are presented and discussed in domains such as mobility and transportation, urban planning, demographics, economics, social relationships, and opinion and sentiment, together with the analytical and mining methods that can be used.

Module 2: Scalable Data Analytics Technologies. The focus is on managing the pipeline of the analytical process to build scalable, robust data science applications. It includes an introduction to Hadoop, Spark and Mahout, and real case examples of managing scalability.

Syllabus

Objective: In our digital society, every human activity is mediated by information technologies. Every activity therefore leaves digital traces behind that can be stored in some repository: phone call records, transaction records, web search logs, movement trajectories, social media texts and tweets. Every minute, an avalanche of “big data” is produced by humans, consciously or not, and it represents a novel, accurate digital proxy of social activities at global scale. Big data provide an unprecedented “social microscope”, a novel opportunity to understand the complexity of our societies, and a paradigm shift for the social sciences. The objective of the course is twofold. First, it introduces the emergent field of big data analytics and social mining, which aims at acquiring and analyzing big data from multiple sources in order to discover the patterns and models of human behavior that explain social phenomena. Second, it introduces the technological scenario of scalable analytics.

Bibliography

Several research papers will be provided in order to discuss new trends and developments in big data application scenarios. Some fundamental white papers and reference books are the following:
  • F. Giannotti et al. A planetary nervous system for social mining and collective awareness. The European Physical Journal Special Topics 214 (1), 49-75, 2012.
  • F. Giannotti et al. Big Data Analytics: towards a European research agenda. http://www.ercim.eu/news/387-ercim-white-paper-on-big-data-analytics
  • Agrawal et al. Challenges and Opportunities with Big Data 2011-1 (2011). Cyber Center Technical Reports. Paper 1. http://docs.lib.purdue.edu/cctech/1
  • Data, data everywhere. The Economist, Special Report on Big Data, February 2010.
  • Foster Provost, Tom Fawcett. Data Science for Business. O'Reilly Media.
  • Andrea Ceron, Luigi Curini, Stefano Iacus. Social Media e Sentiment Analysis: L'evoluzione dei fenomeni sociali attraverso la rete.

Technology references:
  • https://spark.apache.org/docs/latest/
  • https://spark.apache.org/docs/latest/mllib-guide.html
  • http://hadoop.apache.org/docs/stable/
  • https://cwiki.apache.org/confluence/display/Hive/LanguageManual
  • https://pig.apache.org/

Assessment methods
  • A written exam on simple programming tasks on analytical problems, using the scalable technology introduced in class.
  • A project summarized in two reports (max. 10 pages): report 1, data understanding and project formulation; report 2, model construction, validation and execution on a big data platform (70%).
  • An oral exam consisting of a discussion of the project (30%).
Work placement

Students can find placement as Data Scientists, Chief Data Officers and Data Analysts.

Last updated 09/01/2018 19:20