This wiki service has now been shut down and archived

Programming Paradigms

From ESIWiki

Data-Intensive Research: Programming Paradigms: Soliciting comments

Please add any comments you would like to make about this theme after the introduction below by the organisers, Shantenu Jha and Geoffrey Fox. A comment can relate to a specific talk or breakout session, or be general. Please separate entries with headings (two = signs) that flag new topics, or subheadings (three = signs), and add your signature.

Data-Intensive Research: Programming Paradigms

As the scale of data increases along several dimensions (volume, distribution, complexity, etc.), we need to rethink how we handle data programmatically at its different stages (management, production and analysis).

There have been several recent advances towards addressing these challenges programmatically, for example Sawzall, Pig and Dryad, as well as the many variants of MapReduce. Many of these approaches re-establish the primacy of data-parallelism.
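As a rough illustration of the data-parallel style these systems share, here is a minimal single-process sketch of a MapReduce-style word count in Python. It is not the API of Hadoop, Sawzall, Pig or Dryad; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example. The point is that the map step touches each record independently, so a real framework could spread it across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values emitted for one key."""
    return key, sum(values)


def word_count(records):
    # Each record is mapped independently of the others; this independence
    # is what makes the computation data-parallel.
    mapped = chain.from_iterable(map(map_phase, records))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


counts = word_count(["big data", "data parallel data"])
print(counts)
```

In this sketch everything runs in one process; a distributed system would additionally partition the input, schedule the map and reduce tasks, and handle failures.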

Some questions that the Programming Paradigm cross-cutting theme will explore:

  • Advantages and applicability of programmatic approaches compared with others.
  • A mapping between existing approaches and application requirements. What is missing? How can the unmet requirements be addressed?
  • Many approaches are tied to a specific infrastructure (e.g. Hadoop on HDFS). Is this lack of interoperability and extensibility a limitation, and can it be overcome? Or does it reflect how applications are developed?
  • How does the way we store and manage data (distributed versus local, structured versus table) influence our ability to process it?

Over the week there are a number of talks that address different aspects of data-intensive programming paradigms, including:

  • Roger Barga (Microsoft Research) on Emerging Trends and Converging Technologies in Data Intensive Scalable Computing;
  • Joel Saltz (Medical image processing & caBIG); and
  • Xavier Llora (Experience with SEASR & Meandre)

as well as several talks that overlap with the Analysis Paradigm:

  • Thore Graepel (Microsoft Research) on Analyzing large-scale complex data streams from online services;
  • Chris Williams (University of Edinburgh) on The complexity dimension in data analysis; and
  • Andrew McCallum (University of Massachusetts Amherst) on Discovering patterns in text and relational data with Bayesian latent-variable models.

Add further comments here with headings like this

and subtopics with headings like this

This is an archived website, preserved and hosted by the School of Physics and Astronomy at the University of Edinburgh. The School of Physics and Astronomy takes no responsibility for the content, accuracy or freshness of this website. Please email webmaster [at] ph [dot] ed [dot] ac [dot] uk for enquiries about this archive.