Notes

Meandre is going Scala

After quite a bit of experimenting with different alternatives, Meandre is moving into Scala. Scala is a general purpose programming language designed to express common programming patterns in a concise, elegant, and type-safe way. This is not a radical process, but a gradual one while I am starting to revisit the infrastructure for the next major release. Scala also generates code for the JVM making mix and match trivial. I started fuzzing around with Scala back when I started the development of Meandre during the summer of 2007, however I did fall back to Java since that was what most of the people in the group was comfortable with. I was fascinated with Scala fusion of object oriented programming and functional programming. Time went by and the codebase has grown to a point that I cannot stand anymore cutting through the weeds of Java when I have to extend the infrastructure or do bug fixing—not to mention its verbosity even for writing trivial code. This summer I decided to go on a quest to get me out of the woods. I do not mind relying on the JVM and the large collection of libraries available, but I would also like to get my sanity back. Yes, I tested some of the usual suspects for the JVM (Jython, JRuby, Clojure, and Groovy) but not quite what I wanted. For instance, I wrote most of the Meandre infrastructure services using Jython (much more concise than Java), but still not quite happy to jump on that boat. Clojure is also interesting (functional programming) but it would be hard to justify for the group to move into it since not everybody may feel comfortable with a pure functional language. I also toyed with some not-so-usual ones like Erlang and Haskell, but again, I ended up with no real argument that could justify such a decision. So, as I started doing back in 2007, I went back to my original idea of using Scala and its mixed object-oriented- and functional-programming- paradigm. To test it seriously, I started developing the distributed execution engine for Meandre in Scala using its Earlang-inspired actors. And, boom, suddenly I found myself spending more time thinking that writing/debugging threaded/networking code :D. Yes, I regret my 2007 decision instead of running with my original intuition, but better late than never. With a working seed of the distributed engine working and tested (did I mention that scalacheck and specs are really powerful tools for behavior driven development?), I finally decided to start gravitating the Meandre infrastructure development effort from Java to Scala-–did I mention that Scala is Martin Odersky’s child? Yes, such a decision has some impact on my colleagues, but I envision that the benefits will eventually weight out the initial resistance and step learning curve. At least, the last two group meetings nobody jumped off the window while presenting the key elements of Scala, and demonstrating how concise and elegant it made the first working seed of the distributed execution engine :D. We even got in discussions about the benefits of using Scala if it delivered everything I showed. I am lucky to work with such smart guys. If you want to take a peek at the distributed execution engine (a.k.a. Snowfield) at SEASR’s Fisheye. Oh, one last thing. Are you using Atlassian’s Fisheye? Do you want syntax highlighting for Scala? I tweaked the Java definitions to make it highlight Scala code. Remember to drop the scala.def file on $FISHEYE_HOME/syntax directory add an entry on the filename.map to make it highlight anything with extension .scala. ...

Liquid: RDF endpoint for FluidDB

A while ago I wrote some thoughts about how to map RDF to and from FluidDB. There I explored how you could map RDF onto FluidDB, and how to get it back. That got me thinking about how to get a simple endpoint you could query for RDF. Imagine that you could pull FluidDB data in RDF, then I could just get all the flexibility of SPARQL for free. With this idea in my mind I just went and grabbed Meandre, the JFLuidDB library started by Ross Jones, and build a few components. The main goal was to be able to get an object, list of the tags, and express the result in RDF. FluidDB helps the mapping since objects are uniquely identified by URIs. For instance, the unique object 5ff74371-455b-4299-83f9-ba13ae898ad1 (FluidDB relies on UUID version four with the form xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx) is uniquely identified by http://sandbox.fluidinfo.com/objects/5ff74371-455b-4299-83f9-ba13ae898ad1 (or a url of the form http://sandbox.fluidinfo.com/objects/xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx), in case you are using the sandbox or http://fluiddb.fluidinfo.com/objects/5ff74371-455b-4299-83f9-ba13ae898ad1 if you are using the main instance. Same story for tags. The tag fluiddb/about can be uniquely identified by the URI http://sandbox.fluidinfo.com/tags/fluiddb/about, or http://fluiddb.fluidinfo.com/tags/fluiddb/about. ...

Liquid: RDF meandering in FluidDB

Meandre (NCSA pushed data-intensive computing infrastructure) relies on RDF to describe components, flows, locations and repositories. RDF has become the central piece that makes possible Meandre’s flexibility and reusability. However, one piece still remains largely sketchy and still has no clear optimal solution: How can we facilitate to anybody sharing, publishing and annotating flows, components, locations and repositories? More importantly, how can that be done in the cloud in an open-ended fashion and allow anybody to annotate and comment on each of the afore mentioned pieces? ...

LCS & GBML Central under inspection

It is very common that NCSA machines get continual attacks over the net. Today I got a note from the security team that LCS & GBML Central may have been compromised. We are looking into it. As a precaution the automatic measures has been fired, that means, that the box is currently unroutable. Hope it can be fixed soon. I’ll keep you posted about the progress.

Squeezing for cycles

Sometimes thinking a bit helps to rush decisions that may lead to weird places. Today I was going over a simple genetic algorithm for numeric optimization written in C. The code is nothing special, tournament selection without replacement, SBX crossover operator, and polynomial mutation. To the point, I was running a simple OneMax-like problem (in this case, minimize the value of the sum of all the genes), and I was quite surprised the guy was taking so long for. ...