Data Flows in Next Generation Sequencing: Replication, Durability and Metrology

This workshop was held on 16 March 2011 at the e-Science Institute, Edinburgh, as part of the e-SI mini-theme Genomic and Environmental Science Data Flows. The event was organised by Dr. Ruth McNally, Dr. Adrian Mackenzie and Jennifer Tomomitsu, ESRC Cesagen, in association with the eSI Thematic Programme.

Making and using genomics data through Next Generation Sequencing (NGS) is a multidisciplinary enterprise that requires collaboration between domain experts and technical experts. As an innovative scientific enterprise, it entails experiments and experimental forms of action. This workshop explored the arrangements that bring specific application areas and specific technical areas together for this e-science.

The objectives were to:

  • develop an awareness of the problems, obstacles, friction points or gaps that hinder transformations or reshaping of data flows to do better e-science in NGS;
  • identify practices and devices in the conduct of NGS that sustain collaborative development;
  • develop an awareness of some alternative ways of thinking about data flows in genomics;
  • develop alternative socio-technical models that open up new avenues for interdisciplinary collaboration on devices and practices for research with high throughput data flows.

One outcome will be maps of the trajectories of NGS data, from production through assembly, storage, analysis, visualisation, modelling and publication. We will seek to enrich the descriptions of how data are characterised at different places and in different domains, with particular reference to the qualities of replication, durability and metrology. The data flows developed in the workshop will then be validated in the broader community.

Workshop Agenda

1030-1100 Registration with refreshments
1100-1120 Welcome and introduction Dr. Adrian Mackenzie and Dr. Ruth McNally, ESRC Cesagen, Lancaster University
1120-1150 Next generation sequence data archiving in the European Nucleotide Archive Dr. Guy Cochrane, leader of EBI European Nucleotide Archive (ENA) team
1150-1210 Breakout 1 - NGS data production
1210-1310 The arc of an experiment in GenePool Professor Mark Blaxter, Principal Investigator at GenePool, NERC / MRC Next Generation Sequencing and Genomics Facility, Edinburgh University
One Terabase a run and beyond... Dr. Matt Clark, Head of Technology Development, BBSRC Genome Analysis Centre (TGAC), Norwich
1315-1400 Lunch
1400-1420 Breakout 2 - Data size, speed and cost
1420-1520 Genome Content Management: A Tale of Small RNA Dr. Will Spooner, Technical Director, Eagle Genomics (open-source bioinformatics services for genome content management)
Open Source NGS Annotation Pipelines in the Cloud Professor Carole Goble, co-Director of myGrid e-Science Consortium, Manchester University
1520-1540 Breakout 3 - Workflows and data pipelines
1540-1600 Refreshments break
1600-1700 The 1000 Genomes Project: A Large Data Problem Dr. Laura Clarke, Technical Leader for 1000 Genomes Project Data Management Group, European Bioinformatics Institute
1000 Genomes Data Processing Pipeline Dr. Shane McCarthy, co-developer of 1000 Genomes Project Data Processing Pipeline, Wellcome Trust Sanger Institute
1720-1740 Breakout 4 - Alignment, annotation and visualisation
1740-1810 Sharing (NGS) Data Dr. Chris Taylor, Senior Software Engineer, European Bioinformatics Institute
1810-1830 Roundtable discussion of emergent issues and future work/projects/research