This wiki service has now been shut down and archived

Monday: Introduction


Data-Intensive Research Workshop: Monday Introduction: Soliciting comments

Please add any comment you would like to make about Monday's DIR Workshop programme. Comments can relate to a specific talk or be general. Please separate entries with headings and add your signature. There is a separate page for comments on the Research Village.


Purpose of Monday's Open Programme

If the content or organisation for today doesn't work for you, it is probably my fault. What I hope will happen is that everyone will feel welcome and find at least some of the day interesting and informative. I hope useful ideas will emerge and that those who can only manage today will go home with new knowledge relevant to their work.

I hope you'll all encounter new people as well as meet people you know, and that conversations about DIR will start and then continue all week and beyond. I have put headings below to group comments by the sessions they refer to. Please enjoy the day. --MalcolmAtkinson 15:10, 5 March 2010 (UTC)

Registration and Opening (Dave Robertson)

Welcome and Setting the Agenda for the DIR Workshop (Malcolm Atkinson)

Strategies for exploiting large data (Alex Szalay)

An interesting point Alex made is that data volume, integrated over the size of data sets, is increasing: large data sets are getting larger, but the number of small data sets is also increasing fast.

Galaxy Zoo is a good example of scaling interaction: by involving 10,000-20,000 volunteers, a much larger amount of data can be analysed. It also made it possible to validate that, in this case, using members of the public was no worse than using experts.

Alex described a paradigm for interacting with turbulence simulations, in which people can essentially experiment with a simulation in a box, putting in their own probes and setting parameters for the system. Such interaction facilities are necessary because almost nobody can deal with the volume of data and amount of computation required on their own.

"Software is becoming a new kind of experiment", so "how do we build a scalable architecture?" This links to Amdahl's law, with IO as the main component. Current supercomputers cannot cope with data sets larger than 20 terabytes because of their architecture. --Jvhemert 11:29, 15 March 2010 (UTC)

Learning from Data in Online Advertising and Games (Thore Graepel)

Research Village

Please add your comments on this other page.

Introduction to Data Analysis (Chris Williams)

Introduction to Databases for Research Data (Stratis Viglas)

Introduction to Languages and Workflows (Geoffrey Fox)

Soaring through clouds with Meandre (Bernie A'cs & Xavier Llorà)

General thoughts and Items not linked with a particular talk

This is an archived website, preserved and hosted by the School of Physics and Astronomy at the University of Edinburgh. The School of Physics and Astronomy takes no responsibility for the content, accuracy or freshness of this website. Please email webmaster [at] ph [dot] ed [dot] ac [dot] uk for enquiries about this archive.