The other day I was playing around to see how much performance I could squeeze out of a genetic algorithm written in Python. The code below shows the example I used. The first part implements a simple two-loop version of a traditional random allele mutation; the second part is vectorized using numpy 2D arrays. The code also measures the time spent in both implementations using cProfile.
from numpy import *

pop_size = 2000  # number of individuals in the population
l = 200          # number of alleles per individual
z = zeros((pop_size, l))

def mutate():
    # loop-based version: visit every allele and, with probability 0.5,
    # replace it with a new random value
    for i in xrange(pop_size):
        for j in xrange(l):
            if random.random() < 0.5:
                z[i, j] = random.random()

import cProfile
cProfile.run('mutate()')

def mutate_matrix():
    # vectorized version: build a boolean mutation mask and a matrix of
    # candidate values, then combine them in one array expression
    # (written back into z so it actually mutates the population)
    r = random.random(size=(pop_size, l)) < 0.5
    v = random.random(size=(pop_size, l))
    z[:] = r * v + logical_not(r) * z

cProfile.run('mutate_matrix()')
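As a side note, the same masked update can be written with numpy's where, which picks elements from v where the mask is True and keeps z elsewhere. A minimal sketch of this equivalent formulation (not the version profiled below):

def mutate_where():
    # where(condition, a, b) selects from a where condition holds, from b otherwise
    r = random.random(size=(pop_size, l)) < 0.5
    v = random.random(size=(pop_size, l))
    z[:] = where(r, v, z)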
If you run the code listed above, you will get something similar to this:
$ python scan.py
599933 function calls in 0.857 CPU seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.857 0.857 <string>:1(<module>)
1 0.615 0.615 0.857 0.857 scan.py:7(mutate)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
599930 0.242 0.000 0.242 0.000 {method 'random_sample' of 'mtrand.RandomState' objects}
3 function calls in 0.082 CPU seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.002 0.002 0.082 0.082 <string>:1(<module>)
1 0.080 0.080 0.080 0.080 scan.py:16(mutate_matrix)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
Also, if you do the simple math (0.857 / 0.082), the numpy-based version is about 10.45 times faster than the simple loop-based implementation. Yup, sometimes the easy way out is not the best, and giving it some thought helps :D
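If you want to sanity-check the numbers without the profiler overhead, timeit gives plain wall-clock figures instead. A minimal sketch, assuming it is appended to the same script (the exact speedup will of course vary with your machine and numpy version):

import timeit
# run each implementation once and print the elapsed time in seconds
print timeit.timeit('mutate()', 'from __main__ import mutate', number=1)
print timeit.timeit('mutate_matrix()', 'from __main__ import mutate_matrix', number=1)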