Track B

Back to November SAB Workshop

Reporter: Iain Coleman

Extreme e-Science

AIM: To write the e-Science BlueBook

QUESTIONS:

  • What are e-Science’s grand challenges?
  • What are the steps to address those challenges?
  • How do we need to change to meet these challenges?
  • What is the road map to get e-Science ready for 21st Century challenges?

NOTES:

Hard to have a discussion when you're not sure what the subject is: what is e-Science? What are we trying to extremise?

Is e-Science the same thing as e-Research?

e-Research is enabling research through innovative computational/informational resources. A lot of it is stuff researchers will do anyway, regardless of a set of things we call e-Research. Important point is that e-Research is innovative - computer scientists and application scientists have both got to agree that what they are doing is innovative. A shared view of what's an interesting thing to work on. Key point: can both computer scientist and application scientist publish in their own community?

Emphasis on e-Science is methodology, not just research.

Many scientists out there doing things with computers that are inefficient and outdated. e-Science is about bringing ways of doing things into another domain - reduce time to insight, get scientists making progress more quickly because they're not spending time on substandard code. This is fine, but it's not a discipline in itself. It's an interdisciplinary effort.

At the extreme end, you increasingly need both computer scientist and application scientist to be doing really innovative stuff.

Extreme is solving the problem you will need to have solved in five years' time.

Example: Japanese technology that links up CCTV to PDAs to allow you to see through buildings. Definitely extreme, but not bigger/faster - it's conceptually extreme.

"Axes of Extremity"

Whiteboard Image: Challenges and Actions


Large Data

  • Massive computation/data handling in real time.

Access/integration of data

  • Integration of many processors
  • Refining scales so they are useful to individuals: will it rain in my immediate area in the next hour?

Distributedness/Heterogeneity

  • Biggest challenge is access to data. Lots of data that average scientist cannot access in any simple way, for no good reason. If only data were available in an annotated way that you could just get, maybe a lot of time would be saved in research. Must be easily available in an automated way so you can easily explore data and find new relationships.
  • Extremely unmarked-up/non-digital data. E.g. materials science and engineering has hardly any databases and is largely unaware of e-Science methods.
  • Google model: does an amazing job of finding data, but doesn't answer how you correlate disease with temperature. Often you want to search on things other than keywords.
  • Extreme interoperability - different research processes, perhaps across different domains.
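The point above about searching on more than keywords (correlating disease with temperature) can be illustrated with a minimal sketch. It assumes two hypothetical datasets keyed by (region, week); the dataset names and all values are made-up placeholders, not real measurements. The sketch joins the datasets on their shared keys and computes a Pearson correlation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def join_on_key(left, right):
    """Pair up values from two datasets that share the same (region, week) key."""
    shared = sorted(left.keys() & right.keys())
    return [left[k] for k in shared], [right[k] for k in shared]


# Placeholder values for illustration only (not real data).
temperature = {("north", 1): 10.0, ("north", 2): 12.0,
               ("south", 1): 18.0, ("south", 2): 20.0}
cases = {("north", 1): 5, ("north", 2): 6,
         ("south", 1): 9, ("south", 2): 10,
         ("east", 1): 7}  # no matching temperature record, so it is dropped

temps, counts = join_on_key(temperature, cases)
r = pearson(temps, counts)
print(f"{len(temps)} matched records, correlation = {r:.2f}")
```

The hard part in practice is the join itself: it only works when both datasets are annotated with compatible keys, which is the access and mark-up problem described in the list.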

Extreme computation.

Extreme representation/visualisation.

Extreme data input (e.g. sensor nets). Lots of data isn't nicely wrapped - how do you deal with ad hoc data formats?

Virtualisation.

Extremity of meaning - how rich our semantics are


What are the steps to address these challenges? What are the limits of what has been done so far?

Need research/experiments that will drive these extremes, as the e-Science pilot projects did. Grand challenges driven by big science.

Military may do some of this.

Commercial interests might solve some of these.

Require lots of effort, therefore lots of money - could come from publicly funded research, but at these extremes the bigger money of military need or market forces may be the solution.

Tension between democratising for the many and extending for the few. Does extension eventually feed into democratisation?

Now becoming economically feasible to roll out large sensor networks - pushes issue of real-time data.

Impediments

Communication difficulties between computer scientists and application scientists.

Shared resources versus need to focus on big challenges: democratisation versus elitism.

Proper rewards

Recognition of interdisciplinarity

Cannot "manage" innovative process - needs creativity of a few

Ensure persistence of infrastructure across the dimensions

Fragmentation

  • Of compute resources
  • Of data

Need a sensible funding stream to support joint research by computer scientists and application scientists. Need somebody to do the work to scope research and prepare grant proposals. A long-standing problem. Need to change value and reward system. Need to create panels of the right kinds of people, with computer scientists as well as experimentalists, modellers etc - this is what the e-Science programme tried to do. Is it a chicken and egg problem?

Conservatives in science don't buy the e-Science agenda. Lots of rhetoric about importance of interdisciplinarity, but RAE exercises oppose this, and there are forces within research councils against it. Bureaucracy favours research silos. It's not necessarily conscious opposition, just neglect - there's no mechanism in the system to do what we want to do. Had there been an RAE e-Science panel, people would have got credit for this work and been encouraged to do more of it.

Need metrics to measure and value contributions to e-Science. UK e-Science programme almost managed this by providing pot of money for collaborative work, but unfortunately put up artificial walls between e-Science and grid.


Google encourages workforce to work 20% of their time on anything they like - produced Google Sky. Could something similar work in academia?

Models that work in our community aren't command models - Google started as some folk in a garage, competing with lots of other search engines.

Enabling collaboration between computer scientists and application scientists is a grand challenge. If we solve that problem we could transform science. Not a scientific grand challenge, but a sociological one.

Scientific grand challenges are hard to prescribe.

Providing a near-perfect representation of a piece of nature can be a grand challenge for any discipline, depending on which bit of nature you pick - e.g. virtual human for medics. These involve all the problems we've talked about. It's therefore not hard to give examples of extreme e-Science. Generic grand challenge: virtualising a part of the real world.

"Mixed Reality" - combining information from simulation with something that is happening in real time, whether that's doing an experiment or driving a car or doing brain surgery. Refining your model according to real-time information. Model has many possibilities which you narrow down.
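One way to picture "narrowing down" a model with real-time information is a simple Bayesian reweighting of candidate models. The sketch below is only an illustration under stated assumptions: the candidate rates, the linear model `rate * t`, the noise level and the readings are all hypothetical, and a real system would use a far richer simulation.

```python
import math


def update_beliefs(beliefs, predict, observation, noise_std=1.0):
    """Reweight candidate models by how well each predicts a new observation.

    beliefs: dict mapping a candidate parameter to its current probability.
    predict: function mapping a candidate parameter to a predicted value.
    """
    weighted = {}
    for params, prior in beliefs.items():
        error = observation - predict(params)
        # Gaussian likelihood (an assumption about the measurement noise)
        likelihood = math.exp(-0.5 * (error / noise_std) ** 2)
        weighted[params] = prior * likelihood
    total = sum(weighted.values())
    if total == 0.0:
        # Every candidate underflowed to zero; keep the previous beliefs.
        return dict(beliefs)
    return {params: w / total for params, w in weighted.items()}


# Hypothetical candidates for an unknown rate, all equally likely at the start.
candidates = [0.5, 1.0, 1.5, 2.0]
beliefs = {rate: 1.0 / len(candidates) for rate in candidates}

# Placeholder (time, measured value) pairs standing in for a live data feed.
readings = [(1.0, 1.4), (2.0, 3.1), (3.0, 4.6)]

for t, measured in readings:
    beliefs = update_beliefs(
        beliefs, lambda rate, t=t: rate * t, measured, noise_std=0.5
    )

best = max(beliefs, key=beliefs.get)
print(f"most plausible rate: {best}")
```

Each new reading shrinks the weight of candidates that predict poorly, so the set of plausible models narrows as data arrives.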

What's happened to solutions that were brought up by e-Science projects? What barriers came up?

Projects that are driving these extremes: we have to ensure that we learn from them, that the understanding is captured in some larger sphere.

If you've succeeded in something that pushes one axis of extremity, how do you later stretch it along other axes?

Essential to ensure sustainability - take infrastructure that is built for specific project and make it persistent.

Many communities determine the science they do not by what they want to do, but by what fits into the facilities available to them. Different facilities would let them ask different questions.

Communicating across disciplines to avoid unnecessary duplication of effort.

Can't do extreme IT without controlling the infrastructure. If you don't build your own systems, you won't be doing extreme work. (Though there are cases where this doesn't apply, like climateprediction.net)

Distinction between "niche" grids like TeraGrid and general, cloud computing grids. There is a place for both.

Overcome fragmentation with common interfaces. Need intelligence in some infrastructure layer so user doesn't need to know where a job is being submitted to. Not just computing science barrier - fragmentation of funding models also a problem.
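The idea of a common interface with intelligence in an infrastructure layer can be sketched as a small broker that chooses where a job runs. This is a toy illustration only: the `Resource` and `Broker` classes, the resource names and the "most free cores" policy are all hypothetical and do not describe any real grid middleware.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A compute resource exposing the same submit() call as every other."""
    name: str
    free_cores: int
    queue: list = field(default_factory=list)

    def submit(self, job_name, cores):
        self.free_cores -= cores
        self.queue.append(job_name)
        return f"{self.name}:{job_name}"


class Broker:
    """Single entry point: the user never names a target resource."""

    def __init__(self, resources):
        self.resources = list(resources)

    def submit(self, job_name, cores):
        candidates = [r for r in self.resources if r.free_cores >= cores]
        if not candidates:
            raise RuntimeError(f"no resource can currently run {job_name!r}")
        # Illustrative policy: pick the resource with the most free cores.
        target = max(candidates, key=lambda r: r.free_cores)
        return target.submit(job_name, cores)


broker = Broker([Resource("campus-cluster", 16), Resource("national-grid", 128)])
ticket = broker.submit("climate-run", 64)
print(ticket)
```

The user only ever calls `broker.submit`; swapping the placement policy or adding a resource does not change the calling code, which is the point of a common interface.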

What can we do to help process of collaboration between computer scientists and application scientists? Should we set up institutions where they all work together? Is proximity important? OeRC is along these lines, but the academics are still affiliated to subject departments. Should we go a step further, establish interdisciplinary institutes to which people owe their primary allegiance? Could this force the issue of developing appropriate RAE recognition/reward mechanisms?

Road Map: Action items / targets / milestones

Whiteboard Image: Roadmap


Role of grand challenges (different from a pilot project). Specify end result, then ask people to come up with solutions.

Identify and articulate grand challenges that will be recognised as such by both computer scientists and application scientists. Accurate virtual models on many scales:

  • Virtual Human - real-time diagnostic tool for clinicians
  • Virtual City
  • Virtual Planet

(Easier for this group to articulate grand challenges in physical and environmental sciences than in other fields)

Requirements analysis and review for extreme e-Science.

Bring in both computer scientists and application scientists.

Need right infrastructure at right time.

Requires action on funding/reward mechanisms.

Factors we don't control: industry, politics. Need industry to provide increasing compute power, need political backing.

Middleware and software: as it's leading edge, this will mostly be written by the scientists doing the research. Making it more generally useable is another task. To get computer scientists involved in research, it is a requirement that existing middleware/systems are not sufficient, so they have an interesting research challenge.

Academic community can't necessarily solve grand challenges alone: backing from industry/military?

Timescales? EU framework funding ends 2014 - would like substantial output by then.

We have a lot of the component technologies near to hand - can articulate a very positive agenda for e-Science that anyone would recognise as worthwhile. Easier to sell to taxpayer/politicians, and easier to produce results, than (for example) LHC.

Need good marketing/articulation of what grand challenges have to offer.
