Data mining
Code 309AA
Credits 9
Learning outcomes
Objectives
Recent advances in processing power, storage capacity, and interconnectivity are creating unprecedented quantities of digital data. Data mining, the science of extracting useful knowledge from such huge data repositories, has emerged as an interdisciplinary field in computer science. Data mining techniques have been widely applied to problems in industry, science, engineering and government, and data mining is expected to have a profound impact on our society. The objective of this course is to provide:
1. an introduction to the basic concepts of data mining and the knowledge discovery process, and associated analytical models and algorithms;
2. an account of advanced techniques for analysis and mining of novel forms of data;
3. an account of the main application areas and prototypical case studies.
Syllabus
- Concepts of data mining and the knowledge discovery process
- Data preprocessing and exploratory data analysis
- Frequent patterns and association rules
- Classification: decision trees and Bayesian methods
- Cluster analysis: partition-based, hierarchical and density-based clustering
- Experiments with data mining toolkits
- Mining time-series and spatio-temporal data
- Mining sequential data, mining large graphs and networks
- Data mining languages, standards and system architectures
- Social impact of data mining
- Privacy-preserving data mining
- Applications (hints):
  - Retail industry, marketing, CRM
  - Telecommunication industry
  - Financial data analysis, risk analysis
  - Fraud detection
  - Public administration and health
  - Mobility and transportation
Course Structure
9 credits (6 on foundations and 3 on advanced topics and applications). The course is taught in English. The exam consists of a written test, a data mining project and an oral examination.