Meandre: Semantic-Driven Data-Intensive Flows in the Clouds

by Llorà, X., Ács, B., Auvil, L., Capitanu, B., Welge, M.E., Goldberg, D.E. (2008). This paper has been accepted at the 4th IEEE International Conference on e-Science. An early draft of the paper can be found as IlliGAL technical report 2008013. You can download the pdf here. More information is also available at the Meandre website as part of the SEASR project. Abstract: Data-intensive flow computing allows efficient processing of large volumes of data otherwise unapproachable. This paper introduces a new semantic-driven data-intensive flow infrastructure which: (1) provides a robust and transparent scalable solution from a laptop to large-scale clusters, (2) creates a unified solution for batch and interactive tasks in high-performance computing environments, and (3) encourages reusing and sharing components. Banking on virtualization and cloud computing techniques, the Meandre infrastructure is able to create and dispose of Meandre clusters on demand, being transparent to the final user. This paper also presents a prototype of such a clustered infrastructure and some results obtained using it. ...

Nov 15, 2008 · 1 min · 162 words · Xavier Llorà

ZooKeeper and distributed applications

Lately I have been exploring different alternatives for coordinating the execution of distributed applications. Yes, you guessed it right, I am working on the distribution of the execution of Meandre flows. Chopping the data-intensive flow and mapping the chunks onto a set of distributed processors requires several elements (graph analysis, resource management, etc.). However, the basic problem that needs to be solved first is providing a reliable and scalable coordination system. ...

May 22, 2008 · 1 min · 205 words · Xavier Llorà

Meandre: Semantic-Driven Data-Intensive Flow Engine

Finally, we have finished setting up the website for Meandre, a semantic-driven data-intensive flow engine. Meandre provides basic infrastructure for data-intensive computation. It provides, among others, tools for creating components and flows, a high-level language to describe flows, and a multicore and distributed execution environment based on a service-oriented paradigm. We are currently gearing up for a first alpha release. You can visit the Meandre site here. I will be posting in the Meandre blog about our current steps toward getting the release out of the door. The Meandre infrastructure is being built to support the SEASR project ...

Apr 19, 2008 · 1 min · 100 words · Xavier Llorà

[BDCSG2008] Data-Intensive Scalable Computing (Randy Bryant)

Randy opens fire by reviewing models of parallelism and how Google’s Map-Reduce model (the core of Yahoo’s Hadoop) is changing the picture. He emphasizes that data is an integral part of the computational process (something that has been largely disregarded). The Map-Reduce model can greatly help because of its fault-tolerant capabilities. Now he is reviewing the two traditional parallel programming models (shared memory and message passing) and how these differ from Map-Reduce (and how this increases the I/O). Initiatives like Hadoop make it possible to cut down the cost of accessing large-scale computing. ...

Mar 26, 2008 · 1 min · 89 words · Xavier Llorà

DITA+ALG+DISCUS = VAST contest entry

DITA and ALG at NCSA have joined forces with the DISCUS team to enter the 2007 VAST contest. You can find a podcast of the entry to the contest here, and a description of the VAST contest below. Visual Analytics is the science of analytical reasoning supported by highly interactive visual interfaces. People use visual analytics tools and techniques to synthesize information into knowledge; derive insight from massive, dynamic, and often conflicting data; detect the expected and discover the unexpected; provide timely, defensible, and understandable assessments; and communicate assessments effectively for action. The issues stimulating this body of research provide a grand challenge in science: turning information overload into the opportunity of the decade. ...

Jul 29, 2007 · 1 min · 161 words · Xavier Llorà