Easy, reliable, and flexible storage for Python

A while ago I wrote a little post about alternative column stores. One that I mentioned was Tokyo Cabinet (and its associated server, Tokyo Tyrant). Tokyo Cabinet is a key-value store written in C, with bindings for multiple languages (including Python and Java). It can maintain databases in memory or keep them on disk (you can pick between hash-based or B-tree-based stores). Having heard a bunch of good things, I finally gave it a try. I just installed both Cabinet and Tyrant (you may find useful installation instructions here, using the usual configure, make, make install cycle). Another nice feature of Tyrant is that it also supports HTTP gets and puts. With all that said, I just wanted to check how easy it was to use from Python. And the answer was: very easy. Joseph Turian’s examples got me running in less than 2 minutes (see the piece of code below) when dealing with a particular database. Using Tyrant over HTTP is quite simple too (see the PeteSearch blog post). ...

Aug 13, 2009 · 1 min · 198 words · Xavier Llorà

Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre

Below you may find the slides I used during GECCO 2009 to present the paper titled “Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre”. An early preprint in the form of a technical report is available as IlliGAL TR No. 2009001, and the full paper is available at the ACM Digital Library.

Jul 14, 2009 · 1 min · 53 words · Xavier Llorà

Meandre overview slides

On May 26th I gave a seminar about Meandre’s basics at the Computer Science department at the University of Illinois. The talk was part of the Cloud Computing Seminars. I merged together slides I have been using to talk about Meandre, and tried to give it an easy-to-grasp overview flavor. You can view them below.

Apr 2, 2009 · 1 min · 56 words · Xavier Llorà

Squeezing for cycles

Sometimes thinking a bit helps avoid rushed decisions that may lead to weird places. Today I was going over a simple genetic algorithm for numeric optimization written in C. The code is nothing special: tournament selection without replacement, the SBX crossover operator, and polynomial mutation. To the point, I was running a simple OneMax-like problem (in this case, minimize the value of the sum of all the genes), and I was quite surprised the guy was taking so long for. ...
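For readers unfamiliar with the selection scheme named above, here is a minimal Python sketch of tournament selection without replacement on the sum-of-genes minimization problem. It is not the C code from the post; the function names, the tournament size of 2, the population of 8 individuals with 5 genes each, and the random seed are all assumptions made for illustration.

```python
import random


def fitness(genes):
    # OneMax-like objective described in the post: the sum of all genes.
    # Lower is better, since the problem is a minimization.
    return sum(genes)


def tournament_without_replacement(population, size, rng):
    """Return len(population) winners.

    Each round shuffles the population and splits it into disjoint groups
    of `size` individuals, so every individual competes exactly `size`
    times over the whole selection step.
    """
    n = len(population)
    if n % size != 0:
        raise ValueError("population size must be a multiple of the tournament size")
    winners = []
    for _ in range(size):
        order = list(range(n))
        rng.shuffle(order)
        for start in range(0, n, size):
            group = order[start:start + size]
            # The individual with the smallest gene sum wins the tournament.
            best = min(group, key=lambda i: fitness(population[i]))
            winners.append(population[best])
    return winners


# Hypothetical setup: 8 individuals with 5 real-valued genes each.
rng = random.Random(42)
population = [[rng.random() for _ in range(5)] for _ in range(8)]
selected = tournament_without_replacement(population, size=2, rng=rng)
```

With this scheme the best individual wins every tournament it enters, so it is copied exactly `size` times, while the worst individual is never selected.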

Apr 2, 2009 · 6 min · 1137 words · Xavier Llorà

Usages of R

R has gained a lot of traction in the scientific community for data analysis, modeling, and exploratory work. I just ran into a post by Michael E. Driscoll in his Data Evolution blog about how R is used at Google and Facebook. Nothing new, but what got my attention was ParallelR. If you have been using R for large problems, I am pretty sure you have been wishing that there were some parallelization capabilities. ParallelR targets the problem, and it is definitely an option to check out. ...

Feb 23, 2009 · 1 min · 86 words · Xavier Llorà