The other day I was playing around to see how much performance I could squeeze out of a genetic algorithm written in Python. The code below shows the example I used. The first part implements a simple two-loop version of traditional random allele mutation. The second part is written using numpy 2D arrays. The code also measures the time spent in both implementations using cProfile.

import numpy as np

pop_size = 2000
l = 200
z = np.zeros((pop_size, l))

def mutate():
    # Loop version: visit every allele and, with probability 0.5,
    # replace it with a fresh random value.
    for i in range(pop_size):
        for j in range(l):
            if np.random.random() < 0.5:
                z[i, j] = np.random.random()

import cProfile
cProfile.run('mutate()')

def mutate_matrix():
    # Vectorized version: r is a boolean mask selecting which alleles
    # mutate, v holds candidate replacement values, and the masked sum
    # combines old and new values in a single numpy expression.
    r = np.random.random(size=(pop_size, l)) < 0.5
    v = np.random.random(size=(pop_size, l))
    k = r * v + np.logical_not(r) * z

cProfile.run('mutate_matrix()')
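One detail worth noting: mutate_matrix builds the mutated matrix k but never writes it back, so z itself is left unchanged. For profiling that is fine, but a real mutation step needs the write-back. Here is a minimal sketch of an in-place variant using np.where, which is equivalent to the masked sum above; mutate_matrix_inplace is just an illustrative name, not part of the timed script:

def mutate_matrix_inplace():
    # np.where picks v where the mask r is True and keeps z elsewhere;
    # assigning through z[:] writes the result back into the population.
    r = np.random.random(size=(pop_size, l)) < 0.5
    v = np.random.random(size=(pop_size, l))
    z[:] = np.where(r, v, z)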

If you run the code listed above, you should get output similar to this:

$ python scan.py 
         599933 function calls in 0.857 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.857    0.857 <string>:1(<module>)
        1    0.615    0.615    0.857    0.857 scan.py:7(mutate)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   599930    0.242    0.000    0.242    0.000 {method 'random_sample' of 'mtrand.RandomState' objects}


         3 function calls in 0.082 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.002    0.002    0.082    0.082 <string>:1(<module>)
        1    0.080    0.080    0.080    0.080 scan.py:16(mutate_matrix)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects} 

Also, if you do the math, the numpy-based version is 0.857/0.082 ≈ 10.45 times faster than the simple loop-based implementation. Yup, sometimes the easy way out is not the best, and giving it some thought helps :D
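If you want a speedup figure that is less sensitive to a single run, timeit averages over repeated calls. A minimal sketch, assuming it is appended to the script above so both functions are in scope (the globals= argument needs Python 3.5+):

import timeit

# Time a few repetitions of each version instead of trusting one profile.
loop_time = timeit.timeit('mutate()', globals=globals(), number=5)
vec_time = timeit.timeit('mutate_matrix()', globals=globals(), number=5)
print('speedup: %.2fx' % (loop_time / vec_time))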