Efficient storage for Python

Have you ever run into a situation where your analysis or simulation data is too large to fit in memory? Has the flat file format you use for storing your data sets become so big that it slows everything to a crawl? If you answered yes, you may want to give the HDF5 library a spin. HDF5 files are not a replacement for relational databases. They are geared toward storing complex data objects and a wide variety of metadata. The format is also optimized for efficient storage and retrieval. The underlying library is written in C. If you are a Python user, PyTables provides a very efficient wrapper for HDF5 files. It gives you access to the HDF5 API, is nicely integrated with NumPy, and provides natural naming conventions. In other words, you can quickly store and retrieve your arrays and matrices in HDF5 files, giving you a very interesting persistence layer. For instance, you can do a simple table scan by: ...

Jul 1, 2008 · 2 min · 256 words · Xavier Llorà

The next generation of databases

Yesterday I was reading an interview with Brian Aker (MySQL director of technology), which I found via Slashdot, when something caught my attention: "On the second side of this which may actually be more exciting is the issue of–instead of the structured data world of the relational database but the semi–the semi-structured world. You look at what is being done today with CouchDB, you look at Amazon ScaleDB, to a lesser extent but to a similar extent you–not ScaleDB, SimpleDB–to a lesser extent or a similar extent Tokyo Cabinet, those databases are really kind of fascinating because those databases are redefining really how we access data and how we are going to be searching and using data. So there’s a whole world out there that’s just starting to open up in that direction." ...

Jun 5, 2008 · 3 min · 433 words · Xavier Llorà

Visualizing content from metadata stores

Last Friday, together with the ALG and DITA people, we gave a brief presentation to NCSA’s cyberarch group on our common efforts to create a generic framework for querying and visualizing content stored in metadata stores. The framework uses Mulgara, SOAP, XSLTs, and custom Java code to render content using Prefuse and JFreeChart. You can download the slides here.

Apr 15, 2007 · 1 min · 54 words · Xavier Llorà

List of papers to be presented at IWLCS 2006

The list of papers to be presented at the Ninth International Workshop on Learning Classifier Systems (IWLCS 2006) can be found here.

May 5, 2006 · 1 min · 22 words · Xavier Llorà