Sometimes you need to sample a dataset, for example to extract a uniformly sampled subset from a dataset stored in a file. The Perl script below does the job: it reads the file line by line and prints each line with probability <sample_proportion>.
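use strict;
use warnings;

# $#ARGV is the index of the last argument, so 1 means exactly two arguments
if ( $#ARGV != 1 ) {
    print "Wrong number of arguments\n\t" .
          "uniform-sampler.pl <file> <sample_proportion>\n";
}
else {
    srand();    # seed the random number generator
    open(my $fh, '<', $ARGV[0])
        or die "File $ARGV[0] could not be opened: $!";
    while ( my $line = <$fh> ) {
        # rand() returns a value in [0, 1), so each line is kept
        # independently with probability <sample_proportion>
        if ( rand() < $ARGV[1] ) {
            print $line;
        }
    }
    close $fh;
}
1;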
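
Because each line is kept independently, the size of the sample is random rather than an exact fraction of the file. As a rough sketch, write N for the number of lines in the file, p for <sample_proportion>, and K for the number of lines printed. These symbols are introduced here for illustration only and do not appear in the script, and the result assumes rand() behaves like an ideal uniform generator on [0, 1):

```latex
K \sim \mathrm{Binomial}(N, p), \qquad
\mathbb{E}[K] = N p, \qquad
\mathrm{Var}(K) = N p \, (1 - p)
```

For example, with N = 10000 lines and p = 0.1, the script prints about 1000 lines on average, with a standard deviation of 30 lines.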