University of Pisa - Teaching evaluation and exam registration

Course syllabus

ANALISI GENETICHE E GENOMICHE (Genetic and Genomic Analyses)

ROBERTO MARANGONI

Academic year: 2023/24
Degree programme: BIOLOGIA MOLECOLARE E CELLULARE (Molecular and Cell Biology)
Code: 176EE
Credits (CFU): 6
Period: First semester
Language: Italian

Modules

Module: ANALISI GENETICHE E GENOMICHE
Sector: BIO/18
Type: Lectures
Hours: (not shown)
Teachers: ROBERTO MARANGONI, ROBERTO SILVESTRI


Learning outcomes

Knowledge

The course aims to train graduates with specific competence in the topics and methods that characterize genetics, genomics and transcriptomics, in their applications to basic and applied research. Students who successfully complete the course will be able to describe the molecular methods used for the amplification, genotyping and sequencing of DNA samples, and will understand how genetic markers can be used to discover novel genes involved in Mendelian traits as well as in complex phenotypes.

Assessment criteria of knowledge

Knowledge is assessed through a written and an oral examination.

Skills

At the end of the course students will be able to apply a method for aligning the reads produced by Next Generation Sequencing (NGS) instruments, to design molecular tests for DNA analysis, and to place a given genomic, genetic or molecular biology method in the context of its use and application, with particular reference to biomedicine and diagnostics. They will be able to interpret the most recent scientific literature on the genetics of Mendelian and complex diseases. Students will also learn to use the R environment, and in particular the Bioconductor suite, to build analysis pipelines for genome-wide data.

Assessment criteria of skills

The acquired skills will be verified during the examination through open questions or targeted exercises.

Behaviors

At the end of the course students will have the necessary foundation (to be developed further) to design a genetic and molecular study of Mendelian and complex diseases.

Assessment criteria of behaviors

The acquired behaviors will be verified through targeted exercises at the exam.

Prerequisites

Knowledge of basic genetics (undergraduate level) and molecular biology is essential. Some statistical concepts, such as the chi-square test, analysis of variance and simple linear regression, are also expected. Familiarity with the R environment is useful but not indispensable.

Teaching methods

Delivery: face to face

Learning activities:

attending lectures
preparation of oral/written report
individual study

Attendance: Advised

Teaching methods:

Lectures

Syllabus

"Biological" module:

Introduction to the module.

Refresher on the basics of genetics. Differences between polymorphism and mutation. Classification and general characteristics of DNA variants.

Minisatellites: characteristics and possible applications; the first development of DNA fingerprinting. Microsatellites: instability and generation through the slippage-misalignment mechanism. Evolution of DNA fingerprinting in the microsatellite era. Biological consequences of microsatellite instability and diseases linked to triplet expansion.

SNPs and INS/DELs. Discovery methods: single-strand conformation polymorphism (SSCP), high-resolution melting (HRM), the Sanger sequencing reaction. Genotyping: dot blot, RFLP (restriction fragment length polymorphism) and PCR-RFLP, ASO-PCR (or ARMS-PCR), OLA (Oligonucleotide Ligation Assay), TaqMan Allelic Discrimination Assay. Microarrays for genotyping: Single Base Extension (general principle) and the Illumina® BeadArray; hybridization with Affymetrix arrays. A short tour of dbSNP, gnomAD, the UCSC Genome Browser and Ensembl, the most widely used databases for browsing genetic variation.

Segmental duplications, mandatory and optional. Mechanisms of formation: unequal crossing-over, whole-genome duplications. Chromosomal rearrangements. The CYP2D6 locus as an example of mandatory and polymorphic duplications and interstitial deletions in the human genome, and its role in the metabolism of antidepressants and other drugs. The GSTM1, GSTT1 and TP53 loci as further examples of mandatory and polymorphic duplications and interstitial deletions. Genotyping of polymorphic interstitial deletions and small insertions by gel electrophoresis of PCR products. Analysis by Multiplex Ligation-dependent Probe Amplification (MLPA, also called Multiplex Oligonucleotide Ligation Assay) and by TaqMan assay (real-time quantitative PCR).

How are polymorphisms related to each other within genomes? Haplotypes and linkage disequilibrium (LD): definitions. The concept of LD blocks.
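As an illustration of the LD definitions listed above, the following minimal Python sketch computes the usual pairwise measures for two biallelic loci: D = p_AB - p_A·p_B, its normalised form D', and r² = D² / (p_A(1 - p_A)·p_B(1 - p_B)). The function name and the haplotype frequencies are hypothetical, chosen only for illustration; they are not taken from the course material.

```python
from dataclasses import dataclass


@dataclass
class LDResult:
    d: float        # raw disequilibrium coefficient D
    d_prime: float  # |D| scaled by its maximum possible value
    r2: float       # squared correlation between the two loci


def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float) -> LDResult:
    """Pairwise LD from the A-B haplotype frequency and the two allele frequencies."""
    d = p_ab - p_a * p_b
    # The maximum attainable |D| depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return LDResult(d=d, d_prime=d_prime, r2=r2)


# Hypothetical frequencies: haplotype A-B at 0.40, allele A at 0.50, allele B at 0.60.
example = linkage_disequilibrium(p_ab=0.40, p_a=0.50, p_b=0.60)
print(example)
```

Two SNPs with r² close to 1 carry nearly the same information, which is the idea behind genotyping only haplotype-tagging SNPs rather than every polymorphism.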
Calculation of the strength of LD between two polymorphisms using the parameters D and r². The forces that shape haplotype variability and modulate the intensity of linkage disequilibrium. Molecular and non-molecular methods to define a haplotype. Haplotype-tagging SNPs: a visual example. Why we do not need to genotype all the polymorphisms in the genome. The Ensembl database for evaluating LD among SNPs.

A practical use of SNP arrays: detecting deletions/duplications in genomes. The example of Duchenne muscular dystrophy: from the early probes (Southern blot) to microsatellites, MLPA analysis and SNP arrays. Use of SNP arrays for mapping inherited and de novo deletions in probands affected by Mendelian/oligogenic traits. Combining different patients with the same phenotype to refine the minimal critical region for a disease. More examples of deletions detected by SNP arrays.

Introduction to linkage analysis for mapping Mendelian traits. History (Huntington's disease). The polymorphism information content (PIC) and heterozygosity as a proxy for PIC; why this parameter is important for linkage studies. Detecting recombinants in kindreds for calculating LOD scores in dominant traits. Example of the calculation of a LOD score in two families. The multipoint LOD score (MLS). Introduction to linkage analysis for recessive traits. Autozygosity mapping exploiting segments of identity by descent (IBD) in families. Example: mapping the cystic fibrosis gene by exploiting IBD segments at population level.

Introduction to Next Generation Sequencing. The pyrosequencing reaction and the 454 system (454/GS FLX, emulsion PCR). Ion Torrent. Illumina®: bridge amplification and sequencing by synthesis. Single-molecule NGS methods: Oxford Nanopore and Pacific Biosciences. Applications: whole-genome sequencing (WGS), whole-exome sequencing (WES), RNA sequencing (RNA-seq).
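Returning to the linkage-analysis topics listed above, the calculation of a LOD score in two families can be sketched as follows. This is a simplified illustration that assumes phase-known, fully informative meioses, so that each family contributes log10[θ^k (1 - θ)^(n - k) / 0.5^n] for k recombinants out of n meioses, and the scores of independent families add up. The family counts are invented for the example and do not come from the course.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known meioses at recombination fraction theta."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    non_recombinants = meioses - recombinants
    if theta == 0.0:
        # A single recombinant excludes complete linkage.
        return float("-inf") if recombinants > 0 else meioses * math.log10(2)
    log_likelihood = (recombinants * math.log10(theta)
                      + non_recombinants * math.log10(1 - theta))
    # Subtract the log-likelihood under free recombination (theta = 0.5).
    return log_likelihood - meioses * math.log10(0.5)


# Hypothetical data: (recombinants, informative meioses) for two families.
families = [(1, 10), (0, 8)]

# Sum the LOD scores over families on a grid of theta values and keep the maximum.
grid = [i / 100 for i in range(1, 51)]
totals = {t: sum(lod_score(k, n, t) for k, n in families) for t in grid}
best_theta = max(totals, key=totals.get)
print(f"max LOD = {totals[best_theta]:.2f} at theta = {best_theta:.2f}")
```

A maximum total LOD above 3 is the conventional threshold for declaring linkage; real pedigrees with unknown phase or missing genotypes need a full likelihood calculation rather than this counting shortcut.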
Graphic illustration of the mapping of reads in a genome browser. Uneven coverage of the genome. Relationship between depth and the fraction of the genome left unsequenced. Relationship between the number of reads and the fraction of the genome left unsequenced. Relationship between depth and coverage. Relationship between depth and the number of genetic variants detectable in a genome.

Possible quantitative use of RNA-seq. RNA-seq for the detection of mRNA isoforms. NGS to detect CNVs. The 1000 Genomes, ExAC and gnomAD projects. The NHLBI Exome Sequencing Project. Example of autozygosity mapping using NGS. Exome sequencing vs. whole-genome sequencing: what information can each provide? The ENCODE project.
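The relationships between number of reads, depth and unsequenced genome fraction listed above can be sketched with the classical Lander-Waterman idealisation. It assumes reads fall uniformly at random, so the per-base depth is roughly Poisson with mean c = N·L/G, and the fraction of bases with no read is about e^(-c). The genome size, read length and function names below are illustrative assumptions; real data show uneven coverage, as the syllabus itself notes.

```python
import math


def mean_depth(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth c = N * L / G."""
    return n_reads * read_length / genome_size


def fraction_unsequenced(depth: float) -> float:
    """Expected fraction of bases covered by no read (Poisson probability of zero)."""
    return math.exp(-depth)


def fraction_with_at_least(depth: float, k: int) -> float:
    """Expected fraction of bases covered by at least k reads."""
    below_k = sum(math.exp(-depth) * depth ** i / math.factorial(i) for i in range(k))
    return 1.0 - below_k


# Hypothetical setup: a 3.1 Gb genome sequenced with 150 bp reads.
GENOME_SIZE = 3_100_000_000
READ_LENGTH = 150

for n_reads in (20_000_000, 100_000_000, 600_000_000):
    c = mean_depth(n_reads, READ_LENGTH, GENOME_SIZE)
    print(f"reads={n_reads:>11,}  depth={c:6.2f}  "
          f"unsequenced={fraction_unsequenced(c):.3e}  "
          f"covered>=10x={fraction_with_at_least(c, 10):.3f}")
```

The second helper hints at why depth also limits the number of detectable variants: a variant call usually needs several independent reads at the position, not just one.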

"Computational" module:

This module focuses on bioinformatics methods for the analysis and processing of NGS data. The fundamental tool is the Bioconductor suite, which makes it possible to implement almost all the pipelines normally used for the analysis of molecular data. The main points that will be covered are:
- Introduction to the R environment (fundamental elements of the language, types of variables, constructs)

- The "tidy R" environment: characteristics and usability for bioinformatics applications

- The Bioconductor suite: installation and fundamental characteristics

- Acquisition of NGS data

- An example of a mapping and genotyping workflow

- Examples of workflows for genome-wide NGS data analysis

Bibliography

Recommended textbook:

"Genetica molecolare umana", by Tom Strachan & Andrew P. Read (Zanichelli)

Slides and other material supplied by the teachers

Information for non-attending students

All the material can be found on e-learning and on the page

http://www.stefanolandi.eu

Assessment methods

Written and oral examination

Additional web pages

See e-learning and Moodle

Notes

Composition of the examination board:
- President: Roberto Marangoni (course holder)
- Member: Roberto Silvestri (co-lecturer)

Composition of the substitute board:
- Substitute president: Roberto Silvestri (co-lecturer)
- Substitute member: Stefano Landi

Last updated 19/12/2023 18:41