Meandre: Semantic-Driven Data-Intensive Flows in the Clouds

by Llorà, X., Ács, B., Auvil, L., Capitanu, B., Welge, M.E., Goldberg, D.E. (2008). This paper has been accepted at the 4th IEEE International Conference on e-Science. An early draft of the paper can be found as IlliGAL technical report 2008013. You can download the PDF here. More information is also available at the Meandre website as part of the SEASR project. Abstract: Data-intensive flow computing allows efficient processing of large volumes of data otherwise unapproachable. This paper introduces a new semantic-driven data-intensive flow infrastructure which: (1) provides a robust and transparent scalable solution from a laptop to large-scale clusters, (2) creates a unified solution for batch and interactive tasks in high-performance computing environments, and (3) encourages reusing and sharing components. Banking on virtualization and cloud computing techniques, the Meandre infrastructure is able to create and dispose of Meandre clusters on demand, transparently to the final user. This paper also presents a prototype of such clustered infrastructure and some results obtained using it. ...

Nov 15, 2008 · 1 min · 162 words · Xavier Llorà

Free online survey service

Pier Luca Lanzi sent an email the other day asking for help with the SigEvolution newsletter by taking an online survey. The survey was hosted at SurveyMonkey.com. I had never run into these guys before, but after digging a bit, the idea is pretty sweet. Need to run a survey? Just register on their site, create the survey, grab the link to it, and spread it around. Also, they allow you to upload surveys to their servers. As I said, a pretty interesting option if you want to run a survey and do not want to stand up your own version of it. ...

Nov 14, 2008 · 1 min · 98 words · Xavier Llorà

Fast mutation implementation for genetic algorithms in Python

The other day I was playing around to see how much I could squeeze out of a genetic algorithm written in Python. The code below shows the example I used. The first part implements a simple two-loop version of a traditional random allele mutation. The second part is coded using numpy 2D arrays. The code also measures the time spent on both implementations using cProfile.

    from numpy import *

    pop_size = 2000
    l = 200
    z = zeros((pop_size,l))

    # Two-loop version: visit every allele and mutate it with probability 0.5
    def mutate () :
        for i in xrange(pop_size):
            for j in xrange(l) :
                if random.random()<0.5 :
                    z[i,j] = random.random()

    import cProfile
    cProfile.run('mutate()')

    # Matrix version: r is the mutation mask, v holds the new allele values
    def mutate_matrix () :
        r = random.random(size=(pop_size,l))<0.5
        v = random.random(size=(pop_size,l))
        k = r*v + logical_not(r)*z

    cProfile.run('mutate_matrix()')

If you run the code listed above you may get something similar to ...
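The two mutation strategies described above can also be sketched in current Python and NumPy. This is only an illustrative sketch under a few assumptions that are not in the original post: it uses `numpy.random.default_rng` and `np.where`, the function names `mutate_loop` and `mutate_vectorized` are made up for the example, both functions return a new population instead of modifying `z` in place, and the population is smaller than the original 2000 x 200 so the loop version finishes quickly.

```python
import time

import numpy as np


def mutate_loop(z, rate, rng):
    """Two-loop mutation: replace each allele with probability `rate`."""
    out = z.copy()
    rows, cols = out.shape
    for i in range(rows):
        for j in range(cols):
            if rng.random() < rate:
                out[i, j] = rng.random()
    return out


def mutate_vectorized(z, rate, rng):
    """Same mutation expressed with whole-array operations."""
    mask = rng.random(z.shape) < rate   # True where an allele mutates
    values = rng.random(z.shape)        # candidate replacement alleles
    return np.where(mask, values, z)    # pick new value or keep the old one


if __name__ == "__main__":
    rng = np.random.default_rng()
    population = np.zeros((200, 50))    # assumed size, smaller than the post's

    for fn in (mutate_loop, mutate_vectorized):
        start = time.perf_counter()
        fn(population, 0.5, rng)
        elapsed = time.perf_counter() - start
        print(f"{fn.__name__}: {elapsed:.4f} s")
```

`np.where(mask, values, z)` plays the role of `r*v + logical_not(r)*z` in the matrix version: both select the fresh random value where the mask is true and the existing allele elsewhere.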

Nov 13, 2008 · 2 min · 256 words · Xavier Llorà

Synchronizing Mac OS X Mail rules across machines

If you switch between Macs and do not use MobileMe, but still want to carry your mail rules around, you can play a little trick. The file that contains the mail rules is located at ~/Library/Mail/MessageRules.plist. You can just manually copy it from one box to another (you may want to make sure that Mail is not running on the target machine), or you can create a little cron entry that uses rsync to keep them in sync. ...
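As a rough sketch of the rsync idea, the small Python helper below builds and runs the copy command. Several things here are assumptions rather than part of the original post: the host name `otherbox` is hypothetical, the helper names `rsync_command` and `sync_rules` are made up, the remote account is assumed to use the same relative path under its home directory, and the `-a` and `--update` flags are just one reasonable choice.

```python
import subprocess

# Path of the rules file relative to the home directory (from the post).
RULES_RELATIVE = "Library/Mail/MessageRules.plist"


def rsync_command(remote_host, home="~"):
    """Build the rsync argument list that pushes the local rules file."""
    local = f"{home}/{RULES_RELATIVE}"
    # A relative remote path is resolved against the remote home directory.
    remote = f"{remote_host}:{RULES_RELATIVE}"
    # -a keeps times and permissions; --update skips a newer remote copy.
    return ["rsync", "-a", "--update", local, remote]


def sync_rules(remote_host, home):
    """Run the copy; raises CalledProcessError if rsync fails."""
    subprocess.run(rsync_command(remote_host, home), check=True)
```

Calling `sync_rules("otherbox", "/Users/yourname")` from a script scheduled through cron would push the rules file on each run; an absolute home path is passed because `subprocess.run` does not expand `~` when given an argument list.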

Oct 23, 2008 · 1 min · 78 words · Xavier Llorà

On the road again for Internet2 and Bamboo

I got to New Orleans yesterday for the Internet2 fall meeting. I was invited to give a talk about the work we are doing on the SEASR project at NCSA. SEASR fosters collaboration by empowering scholars to share data and research in virtual work environments. The SEASR project is funded by the Andrew W. Mellon Foundation. For the last part of the week I will be in San Francisco joining the Project Bamboo workshop, again representing the SEASR project and seeking a better understanding of possible synergies. ...

Oct 13, 2008 · 1 min · 85 words · Xavier Llorà