Data Flows in Next Generation Sequencing: Replication, Durability and Metrology
Synopsis
This workshop was held on 16 March 2011 at the e-Science Institute, Edinburgh, as part of the e-SI mini-theme Genomic and Environmental Science Data Flows. The event was organized by Dr. Ruth McNally, Dr. Adrian Mackenzie and Jennifer Tomomitsu, ESRC Cesagen, in association with the eSI Thematic Programme.
Making and using genomics data through Next Generation Sequencing (NGS) is a multidisciplinary enterprise that requires collaboration between domain experts and technical experts. As an innovative scientific enterprise, it entails experiments and experimental forms of action. This workshop explored the arrangements that bring specific application areas and specific technical areas together for this e-science.
The objectives were to:
- develop an awareness of the problems, obstacles, friction points or gaps that hinder transformations or reshaping of data flows to do better e-science in NGS;
- identify practices and devices in the conduct of NGS that sustain collaborative development;
- develop an awareness of some alternative ways of thinking about data flows in genomics;
- develop alternative socio-technical models that open up new avenues for interdisciplinary collaboration on devices and practices for research with high throughput data flows.
One outcome will be maps of the trajectories of NGS data, from production through assembly, storage, analysis, visualisation, modelling and publication. We will seek to enrich the descriptions of how data are characterised at different places and in different domains, with particular reference to the qualities of replication, durability and metrology. The data flows developed in the workshop will then be validated in the broader community.
Workshop Agenda
| 1030-1100 | Registration with refreshments | |
| 1100-1120 | Welcome and introduction | Dr. Adrian Mackenzie and Dr. Ruth McNally, ESRC Cesagen, Lancaster University |
| 1120-1150 | Next generation sequence data archiving in the European Nucleotide Archive | Dr. Guy Cochrane, leader of EBI European Nucleotide Archive (ENA) team |
| 1150-1210 | Breakout 1 - NGS data production | |
| 1210-1310 | The arc of an experiment in GenePool | Professor Mark Blaxter, Principal Investigator at GenePool, NERC / MRC Next Generation Sequencing and Genomics Facility, Edinburgh University |
| | One Terabase a run and beyond... | Dr. Matt Clark, Head of Technology Development, BBSRC Genome Analysis Centre (TGAC), Norwich |
| 1315-1400 | Lunch | |
| 1400-1420 | Breakout 2 - Data size, speed and cost | |
| 1420-1520 | Genome Content Management: A Tale of Small RNA | Dr. Will Spooner, Technical Director, Eagle Genomics open-source bioinformatics services for genome content management |
| | Open Source NGS Annotation Pipelines in the Cloud | Professor Carole Goble, co-Director of myGrid e-Science Consortium, Manchester University |
| 1520-1540 | Breakout 3 - Workflows and data pipelines | |
| 1540-1600 | Refreshments break | |
| 1600-1700 | The 1000 Genomes Project: A Large Data Problem | Dr. Laura Clarke, Technical Leader for 1000 Genomes Project Data Management Group, European Bioinformatics Institute |
| | 1000 Genomes Data Processing Pipeline | Dr. Shane McCarthy, co-developer of 1000 Genomes Project Data Processing Pipeline, Wellcome Trust Sanger Institute |
| 1720-1740 | Breakout 4 - Alignment, annotation and visualisation | |
| 1740-1810 | Sharing (NGS) Data | Dr. Chris Taylor, Senior Software Engineer, European Bioinformatics Institute |
| 1810-1830 | Roundtable discussion of emergent issues and future work/projects/research | |