Loading RDF/XML files into Virtuoso’s metadata store

Bernie just put together this beauty to load small RDF/XML files into Virtuoso’s metadata store (we are testing the open source version):

    DB.DBA.RDF_LOAD_RDFXML(http_get('URI to the RDF/XML file'), '', 'Name of the graph in the store');

We have tested loading a 5 million triple RDF/XML file and the results are pretty nice (it took around 6 minutes to load on a dual Pentium 4 Extreme Edition at 3GHz with 4GB of RAM and a slow 7500rpm ext3fs disk). When pushing to larger files, the stream version of this is a must to reduce memory consumption. ...

Jul 20, 2007 · 1 min · 91 words · Xavier Llorà

Crawlers on the loose

Recently, our lab web server has seen a large increase in bot visits. Besides the usual Google, Yahoo, MSN, etc. bots, a new one caught my attention because it is visiting pretty frequently: Cuill. From the little information they have on their site, one of the co-founders is a UIUC CS alumna, Anna Patterson. “Cuill Inc. (pronounced [kool]) is a startup company that is pioneering a new approach to Search. The company was founded by Anna, Russell, and Tom. Our offices are located on a quiet street in Menlo Park, CA. If you’d like to learn more about Cuill or the people behind it, please get in touch.” ...

Jul 20, 2007 · 1 min · 109 words · Xavier Llorà

Google Notebook

Google Notebook has a pretty cool add-on for Firefox. It lets you clip and organize information from across the web in a single online location that’s accessible from any computer.

Jul 18, 2007 · 1 min · 31 words · Xavier Llorà

Reset user’s password on a MediaWiki

I needed to reset the password for a user on a MediaWiki site. Luckily, I ran into this post, “Reset a user password on MediaWiki - Greg’s Postgres stuff”, which shows how to do it. The five-cent summary for a MySQL-powered site, where 123 is the user’s ID (it appears both in the salt prefix and in the WHERE clause):

    UPDATE user SET user_password = md5(CONCAT('123-', md5('newpassword'))) WHERE user_id = 123;
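
To check what that statement stores, the same hash can be reproduced outside the database. The following is a minimal Python sketch, assuming the password text is UTF-8 encoded and that the site uses the salted MD5 scheme shown in the SQL; the function name `mediawiki_salted_md5` is made up for illustration and is not part of MediaWiki.

```python
import hashlib


def mediawiki_salted_md5(user_id: int, password: str) -> str:
    """Return md5("<user_id>-" + md5(password)) as lowercase hex.

    Hypothetical helper mirroring the SQL expression
    md5(CONCAT('<user_id>-', md5('<password>'))).
    UTF-8 encoding of the password is an assumption.
    """
    inner = hashlib.md5(password.encode("utf-8")).hexdigest()
    salted = f"{user_id}-{inner}"
    return hashlib.md5(salted.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Same inputs as the SQL statement: user 123, password 'newpassword'.
    print(mediawiki_salted_md5(123, "newpassword"))
```

Because the user ID is part of the salted string, the same password produces a different hash for every user, which is why the ID has to match in both places in the SQL.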

May 14, 2007 · 1 min · 51 words · Xavier Llorà

Uniform sampling of a data set

Sometimes you need a uniformly sampled subset of a dataset stored in a file. The Perl script below does the job: it keeps each line independently with probability <sample_proportion>.

    if ( $#ARGV != 1 ) {
        # Exactly two arguments are expected: the file and the proportion.
        print "Wrong number of arguments\n\t" .
              "uniform-sampler.pl <file> <sample_proportion>\n";
    }
    else {
        srand();
        open(FILE, $ARGV[0]) or die "File $ARGV[0] could not be opened";
        while ( $line = <FILE> ) {
            # Keep the line with probability <sample_proportion>.
            if ( rand() < $ARGV[1] ) {
                print $line;
            }
        }
        close FILE;
    }
    1;
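
For comparison, the same per-line sampling can be sketched in Python. This is an illustrative equivalent, not the original script: the name `uniform_sample` and the optional `rng` parameter (added so a run can be made repeatable with a seed) are assumptions introduced here.

```python
import random
import sys
from typing import Iterable, Iterator, Optional


def uniform_sample(lines: Iterable[str], proportion: float,
                   rng: Optional[random.Random] = None) -> Iterator[str]:
    """Yield each line independently with probability `proportion`."""
    if rng is None:
        rng = random.Random()  # seeded from the OS, like Perl's srand()
    for line in lines:
        # random() returns a float in [0.0, 1.0), so a proportion of 1.0
        # keeps every line and 0.0 keeps none.
        if rng.random() < proportion:
            yield line


if __name__ == "__main__":
    if len(sys.argv) != 3:
        sys.exit("usage: uniform_sampler.py <file> <sample_proportion>")
    with open(sys.argv[1]) as handle:
        sys.stdout.writelines(uniform_sample(handle, float(sys.argv[2])))
```

Since every line is kept or dropped independently, the size of the output is itself random: it is close to the proportion times the number of lines on average, but not exactly that in any single run.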

May 11, 2007 · 1 min · 74 words · Xavier Llorà