Improving Performance - Optimizing code
- Speed
- Memory usage
- I/O (disk, network, database)
Problems
Optimization strategy
The 3 rules of optimization
- Don't do it!
- Don't do it!
- Don't do it yet!
Premature optimization is the root of all evil. ~ Donald Knuth
Locate the source of the problem
- I/O is expensive: database access, file access, GUI updates.
- If memory is full, swapping starts and speed drops sharply.
Optimizing tactics
- Choose the right data structure (dictionary? set? list?)
- Sorting: Decorate, Sort, Undecorate (DSU), a.k.a. the Schwartzian Transform.
- String concatenation: avoid repeated concatenation; collect the pieces and join them once.
- Loops: instead of for-loops and list comprehensions that build full lists, use generators and iterators.
- Delay expanding lazy iterables such as range, map, and filter.
- Caching results, memoizing (see the sketch below).
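As an illustration of the last point, a minimal memoization sketch using functools.lru_cache; the expensive() function here is just a stand-in for your own slow code:

from functools import lru_cache

@lru_cache(maxsize=None)
def expensive(key):
    # imagine a slow computation, a file read, or a database query here
    return sum(ord(ch) for ch in key) ** 2

print(expensive('wombat'))   # computed
print(expensive('wombat'))   # served from the cache; the body does not run again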
Read more performance tips
DSU: Decorate Sort Undecorate
In Perl it is called the Schwartzian transform. Note that sorted() and list.sort() already do the decoration internally when you pass a key= function (the key is computed once per element), so writing DSU explicitly is mostly useful for understanding what happens under the hood.
animals = ['chicken', 'cow', 'snail', 'elephant']
print(sorted(animals))
print(sorted(animals, key=len))
decorated = [(len(w), w) for w in animals]
print(decorated)
decorated.sort()
result = [d[1] for d in decorated]
print(result)
# at once
print([d[1] for d in sorted([(len(w), w) for w in animals])])
['chicken', 'cow', 'elephant', 'snail']
['cow', 'snail', 'chicken', 'elephant']
[(7, 'chicken'), (3, 'cow'), (5, 'snail'), (8, 'elephant')]
['cow', 'snail', 'chicken', 'elephant']
['cow', 'snail', 'chicken', 'elephant']
Profile code
Always profile before starting to optimize!
Slow example
This code does some work that a client deemed "too slow". The actual content is not that interesting.
import random

# f() and g() exist only to burn some CPU time on every pass of the sort
def f():
    n = 0
    for i in range(30):
        n += random.random()
    return n

def g():
    return random.random() * 30

def main(n):
    text = get_str(n)
    #print(text)
    text_sorted = sort(text)
    return text_sorted

# a bubble-sort style algorithm with the artificial extra work mixed in
def sort(s):
    chars = list(s)
    for i in reversed(range(len(chars))):
        a = f()
        b = g()
        for j in range(i, len(chars)-1):
            swap(chars, j)
    return ''.join(chars)

# build a string of n-1 random uppercase letters, one concatenation at a time
def get_str(n):
    text = ''
    for i in range(1, n):
        text += chr(65 + random.randrange(0, 26))
    return text

# swap two neighboring elements if they are out of order
def swap(lst, loc):
    if lst[loc] > lst[loc + 1]:
        lst[loc], lst[loc + 1] = lst[loc + 1], lst[loc]

if __name__ == '__main__':
    print(main(1000))
profile slow code
- profile
import slow
import profile
profile.run('slow.main(1000)')
537471 function calls in 3.078 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
999 0.003 0.000 0.003 0.000 :0(chr)
1 0.000 0.000 0.000 0.000 :0(join)
1000 0.003 0.000 0.003 0.000 :0(len)
31968 0.083 0.000 0.083 0.000 :0(random)
1999 0.009 0.000 0.009 0.000 :0(range)
1 0.001 0.001 0.001 0.001 :0(setprofile)
1 0.000 0.000 3.076 3.076 <string>:1(<module>)
0 0.000 0.000 profile:0(profiler)
1 0.000 0.000 3.078 3.078 profile:0(slow.main(1000))
999 0.009 0.000 0.012 0.000 random.py:173(randrange)
999 0.005 0.000 0.008 0.000 slow.py:10(g)
1 0.000 0.000 3.076 3.076 slow.py:14(main)
1 1.410 1.410 3.053 3.053 slow.py:21(sort)
1 0.008 0.008 0.023 0.023 slow.py:31(get_str)
498501 1.456 0.000 1.456 0.000 slow.py:37(swap)
999 0.090 0.000 0.171 0.000 slow.py:4(f)
cProfile slow code
- cProfile
import slow
import cProfile
cProfile.run('slow.main(1000)')
537470 function calls in 0.325 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.325 0.325 <string>:1(<module>)
999 0.002 0.000 0.002 0.000 random.py:173(randrange)
999 0.000 0.000 0.000 0.000 slow.py:10(g)
1 0.000 0.000 0.325 0.325 slow.py:14(main)
1 0.119 0.119 0.322 0.322 slow.py:21(sort)
1 0.001 0.001 0.003 0.003 slow.py:31(get_str)
498501 0.189 0.000 0.189 0.000 slow.py:37(swap)
999 0.008 0.000 0.010 0.000 slow.py:4(f)
999 0.000 0.000 0.000 0.000 {chr}
1000 0.000 0.000 0.000 0.000 {len}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
1 0.000 0.000 0.000 0.000 {method 'join' of 'str' objects}
31968 0.003 0.000 0.003 0.000 {method 'random' of '_random.Random' objects}
1999 0.003 0.000 0.003 0.000 {range}
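Both profile and cProfile can sort and trim their reports. A minimal sketch using the standard pstats module (the slow.stats filename is an arbitrary choice for this example):

import cProfile
import pstats
import slow

# save the raw profiling data to a file instead of printing it
cProfile.run('slow.main(1000)', 'slow.stats')

# load the data, sort by cumulative time, show the 10 most expensive entries
stats = pstats.Stats('slow.stats')
stats.sort_stats('cumulative').print_stats(10)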
Benchmarking
import timeit
from functools import reduce
import random
chars = []
for i in range(200):
chars.append(chr(65 + random.randrange(0, 26)))
print(timeit.timeit('string = "".join(chars)',
setup="from __main__ import chars", number=10000))
print(timeit.timeit('reduce(lambda x, y: x+y, chars)',
setup="from __main__ import chars, reduce", number=10000))
0.01576369699614588
0.15464225399773568
Benchmarking subs
import timeit
def one_by_one():
import random
text = ""
for i in range(200):
text += chr(65 + random.randrange(0, 26))
return text
def at_once():
import random
chars = []
for i in range(200):
chars.append(chr(65 + random.randrange(0, 26)))
text = ''.join(chars)
return text
print(timeit.timeit('one_by_one()',
setup="from __main__ import one_by_one", number=10000))
print(timeit.timeit('at_once()',
setup="from __main__ import at_once", number=10000))
1.5248507579963189
1.5566942970035598
Here the two variants measure almost the same: for only 200 characters, the cost of generating the random characters dominates the cost of building the string.
Counting words - which implementation is faster?
- collections
- defaultdict
- Counter
- timeit
- try
- except
- In this example we have 4 functions counting the number of appearances of each word in a list that is already in memory.
- We use timeit to benchmark them.
- repeat is the number of repetitions of each string.
- different is the number of different strings.
from collections import defaultdict
from collections import Counter
import timeit
def generate_list_of_words(number, repeat):
#words = ['Wombat', 'Rhino', 'Sloth', 'Tarantula', 'Sloth', 'Rhino', 'Sloth']
words = []
for ix in range(number):
for _ in range(repeat):
words.append(str(ix))
return words
def plain_counter(words):
counter = {}
for word in words:
if word not in counter:
counter[word] = 0
counter[word] += 1
return counter
def counter_with_exceptions(words):
counter = {}
for word in words:
try:
counter[word] += 1
except KeyError:
counter[word] = 1
return counter
def counter_with_counter(words):
counter = Counter()
for word in words:
counter[word] += 1
return counter
def counter_with_default_dict(words):
counter = defaultdict(int)
for word in words:
counter[word] += 1
return counter
def main():
#words = generate_list_of_words(1000, 1)
#counter = plain_counter(words)
#counter = counter_with_counter(words)
#counter = counter_with_default_dict(words)
#counter = counter_with_exceptions(words)
#for word in sorted(counter.keys()):
# print("{}:{}".format(word, counter[word]))
for repeat in [1, 10, 20, 50]:
different = int(1000 / repeat)
print(f'repeat {repeat} different {different}')
for name in ['plain_counter', 'counter_with_counter', 'counter_with_default_dict', 'counter_with_exceptions']:
print("{:26} {}".format(name, timeit.timeit(f'{name}(words)',
number=10000,
setup=f'from __main__ import {name}, generate_list_of_words; words = generate_list_of_words({different}, {repeat})')))
print()
if __name__ == "__main__":
main()
repeat 1 different 1000
plain_counter 0.6091844770126045
counter_with_counter 1.232734862016514
counter_with_default_dict 0.7378899219911546
counter_with_exceptions 1.4480015779845417
repeat 10 different 100
plain_counter 0.4949962190585211
counter_with_counter 0.7886336819501594
counter_with_default_dict 0.4284116430208087
counter_with_exceptions 0.4748374510090798
repeat 20 different 50
plain_counter 0.4847069630632177
counter_with_counter 0.7627606929745525
counter_with_default_dict 0.4116779019823298
counter_with_exceptions 0.407719356007874
repeat 50 different 20
plain_counter 0.4709314970532432
counter_with_counter 0.7357207209570333
counter_with_default_dict 0.3903243549866602
counter_with_exceptions 0.36094399297144264
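One variant the benchmark above does not include: passing the whole list to the Counter constructor instead of looping in Python. In CPython the counting loop then runs in C, so it is usually the fastest option; a minimal sketch (the function name counter_with_constructor is our own):

from collections import Counter

def counter_with_constructor(words):
    # Counter counts the whole iterable in a single call;
    # CPython implements this counting loop in C
    return Counter(words)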
for loop or reduce to add numbers?
- for
- functools
- reduce
import timeit
from functools import reduce
def add_in_loop(num):
total = 0
for ix in range(num+1):
total += ix
return total
def add_with_reduce(num):
total = reduce(lambda x, y: x + y, range(num+1))
return total
def main():
#num = 4
#print(add_in_loop(num))
#print(add_with_reduce(num))
for num in [10, 1000]:
print(f'num {num}')
for name in ['add_in_loop', 'add_with_reduce']:
print("{:16} {}".format(name, timeit.timeit(f'{name}({num})',
number=100000,
setup=f'from __main__ import {name}')))
print()
if __name__ == "__main__":
main()
num 10
add_in_loop 0.023712733993306756
add_with_reduce 0.05284293496515602
num 1000
add_in_loop 1.57034146389924
add_with_reduce 3.0021417930256575
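For this particular task the built-in sum beats both, because its summing loop also runs in C, and the closed-form formula beats everything; a minimal sketch that could be dropped into the same harness:

def add_with_sum(num):
    # the loop over the range runs inside the C implementation of sum
    return sum(range(num+1))

def add_with_formula(num):
    # Gauss' closed form for 0 + 1 + ... + num
    return num * (num + 1) // 2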
Levenshtein distance
- editdistance: Levenshtein distance implemented in C
- python-Levenshtein: implemented in C
- pylev
- pyxdameraulevenshtein
- weighted-levenshtein
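For comparison with the C implementations above, here is a minimal pure-Python sketch of the classic Wagner-Fischer dynamic-programming algorithm (the function name levenshtein is our own):

def levenshtein(a, b):
    # keep only the previous and the current row of the DP matrix
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution (free on a match)
            ))
        prev = cur
    return prev[-1]

print(levenshtein('kitten', 'sitting'))   # 3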
Generate words
import sys
import random
import string
# TODO: set min, max word length
# TODO: set filename
# TODO: set character types
# TODO: allow spaces?
def main():
filename = "words.txt"
min_len = 6
max_len = 6
if len(sys.argv) != 2:
exit(f"Usage: {sys.argv[0]} WORD_COUNT")
count = int(sys.argv[1])
with open(filename, 'w') as fh:
for _ in range(count):
word = ''
length = random.randrange(min_len, max_len+1)
for _ in range(length):
word += random.choice(string.ascii_lowercase)
fh.write(word + "\n")
main()
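On Python 3.6 or later the inner character loop can be replaced with a single call to random.choices; a minimal sketch (the helper name random_word is our own):

import random
import string

def random_word(length):
    # draw `length` characters with replacement in one call
    return ''.join(random.choices(string.ascii_lowercase, k=length))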
Levenshtein - pylev
import sys
import pylev
def main():
if len(sys.argv) != 2:
exit(f"Usage: {sys.argv[0]} filename")
filename = sys.argv[1]
outfile = 'out.txt'
rows = []
with open(filename) as fh:
for row in fh:
rows.append(row.rstrip("\n"))
with open(outfile, 'w') as fh:
for a in rows:
for b in rows:
dist = pylev.levenshtein(a, b)
fh.write(f"{a},{b},{dist}\n")
main()
Levenshtein - editdistance
import sys
import editdistance
def main():
if len(sys.argv) != 2:
exit(f"Usage: {sys.argv[0]} filename")
filename = sys.argv[1]
outfile = 'out.txt'
rows = []
with open(filename) as fh:
for row in fh:
rows.append(row.rstrip("\n"))
with open(outfile, 'w') as fh:
for a in rows:
for b in rows:
dist = editdistance.eval(a, b)
fh.write(f"{a},{b},{dist}\n")
main()
Editdistance benchmark
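A minimal timeit sketch comparing the two libraries on one pair of words (the words and the repetition count are arbitrary choices for this example):

import timeit

setup = '''
import pylev
import editdistance
a = "chicken"
b = "elephant"
'''

print("pylev       ", timeit.timeit('pylev.levenshtein(a, b)', setup=setup, number=10000))
print("editdistance", timeit.timeit('editdistance.eval(a, b)', setup=setup, number=10000))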
A Tool to Generate text files
import sys
import string
import random
import argparse
import os
# Generate N files of SIZE random characters each
def get_args():
parser = argparse.ArgumentParser()
parser.add_argument('--dir', help="Directory where to create the files", default=".")
parser.add_argument('--files', type=int, help="Number of files to create", default=1)
parser.add_argument('--size', type=int, help="Size of files", default=10)
args = parser.parse_args()
return args
def main():
args = get_args()
chars = list(string.ascii_lowercase) + [' '] * 5 + ['\n']
for ix in range(args.files):
all_chars = []
        for _ in range(args.size):
            # random.choice(chars) would be simpler than a 1-element sample
            all_chars.extend(random.sample(chars, 1))
#print(len(all_chars))
#print(all_chars)
filename = os.path.join(args.dir, str(ix) + '.txt')
with open(filename, 'w') as fh:
fh.write(''.join(all_chars))
def old_main():
if len(sys.argv) < 2:
exit(f"Usage: {sys.argv[0]} NUMBER_OF_ROWS")
row_count = int(sys.argv[1])
min_width = 30
max_width = 50
filename = 'data.log'
chars = list(string.ascii_lowercase) + [' '] * 5
all_chars = chars * max_width
with open(filename, 'w') as fh:
for i in range(row_count):
width = random.randrange(min_width, max_width+1)
row = ''.join(random.sample(all_chars, width))
fh.write(row + "\n")
main()
Count characters
# updates the parallel lists chars and counter in place
def add_char(chars, counter, ch, cnt=1):
    for ix in range(len(chars)):
        if chars[ix] == ch:
            counter[ix] += cnt
            break
    else:
        # the for-loop's else branch runs when there was no break:
        # ch has not been seen before
        chars.append(ch)
        counter.append(cnt)
def count_in_file(filename):
#print(filename)
chars = []
counter = []
with open(filename) as fh:
for row in fh:
for ch in row:
#print(ch)
if ch == ' ':
continue
if ch == '\n':
continue
add_char(chars, counter, ch)
#print(chars)
#print(counter)
return chars, counter
def merge(chars1, counter1, chars2, counter2):
chars = []
counter = []
for ix in range(len(chars1)):
add_char(chars, counter, chars1[ix], cnt=counter1[ix])
for ix in range(len(chars2)):
add_char(chars, counter, chars2[ix], cnt=counter2[ix])
return chars, counter
def print_results(chars, counter):
print("Results")
for ix in range(len(chars)):
print("{} {}".format(chars[ix], counter[ix]))
def count_in(filenames):
total_chars = []
total_counter = []
for filename in filenames:
chars, counter = count_in_file(filename)
total_chars, total_counter = merge(total_chars, total_counter, chars, counter)
return total_chars, total_counter
if __name__ == '__main__':
import sys
chars, counter = count_in(sys.argv[1:])
print_results(chars, counter)
import count_characters as count
import cProfile
import sys
cProfile.run('chars, counter = count.count_in(sys.argv[1:])')
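Profiling this code usually points at add_char and its linear scan over the chars list. For comparison, a sketch of the same counting done with a plain dict (the name count_in_file_dict is our own), where each lookup is a single hash operation instead of a scan:

def count_in_file_dict(filename):
    counter = {}
    with open(filename) as fh:
        for row in fh:
            for ch in row:
                if ch in (' ', '\n'):
                    continue
                # dict lookup is O(1) on average instead of scanning a list
                counter[ch] = counter.get(ch, 0) + 1
    return counter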
Memory leak
import random

def alloc():
    a = {
        'data': str(random.random()) + "a" * 10000000,
    }
    b = {
        'data': str(random.random()) + "b" * 10000000,
    }
    # a and b now reference each other, so when alloc() returns their
    # reference counts never drop to zero and the ~10 MB strings stay
    # in memory until the cyclic garbage collector runs
    a['other'] = b
    b['other'] = a
import sys
from mymem import alloc
if len(sys.argv) < 2:
exit(f"Usage: {sys.argv[0]} N")
count = int(sys.argv[1])
for _ in range(count):
alloc()
input("End the script")
Garbage collection
import sys
from mymem import alloc
import gc
if len(sys.argv) < 2:
exit(f"Usage: {sys.argv[0]} N")
count = int(sys.argv[1])
for _ in range(count):
alloc()
input("Run gc")
# break the a <-> b cycles; the collector frees the big strings
gc.collect()
input("End the script")
Weak reference
import random
import weakref

# Plain dicts cannot be weakly referenced in CPython, so this version
# wraps the data in a minimal class, which can be.
class Data:
    def __init__(self, data):
        self.data = data
        self.other = None

def alloc():
    a = Data(str(random.random()) + "a" * 10000000)
    b = Data(str(random.random()) + "b" * 10000000)
    # A weakref.ref does not increase the reference count, so when alloc()
    # returns, both objects become unreachable and are freed immediately:
    # no leak, even without the cyclic garbage collector.
    # Dereference with a.other(), which returns the object, or None once
    # the object has been collected.
    a.other = weakref.ref(b)
    b.other = weakref.ref(a)
import sys
from weakmymem import alloc
if len(sys.argv) < 2:
exit(f"Usage: {sys.argv[0]} N")
count = int(sys.argv[1])
for _ in range(count):
alloc()
input("End the script")
Exercise: benchmark list-comprehension, map, for
-
Create several functions that accept a list of numbers from 1 to 1000 and calculate their square:
-
A function with a
for-loop. -
A function that uses
map. -
A function that uses list-comprehension.
-
Feel free to have any other calucaltion and measure that.
-
Send me the code and the results!
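One possible skeleton for the benchmark, in the style of the timeit examples above; the function bodies are the point of the exercise, so treat these only as one way to write them:

import timeit

def squares_for(numbers):
    result = []
    for n in numbers:
        result.append(n * n)
    return result

def squares_map(numbers):
    return list(map(lambda n: n * n, numbers))

def squares_listcomp(numbers):
    return [n * n for n in numbers]

numbers = list(range(1, 1001))
for name in ['squares_for', 'squares_map', 'squares_listcomp']:
    print("{:18} {}".format(name, timeit.timeit(
        f'{name}(numbers)',
        number=10000,
        setup=f'from __main__ import {name}, numbers')))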
Exercise: Benchmark Levenshtein
- Take the various implementations of the Levenshtein distance calculation and check which one is the fastest.
Exercise: sort files
Write a script that, given a path to a directory, will print the files sorted by date.
If you don't have one large folder, then use os.walk to get the path to the files of a whole directory tree.
- Write a simple solution.
- Profile.
- Use DSU.
Exercise: compare split words
We have three ways of splitting a string into words: using split, using re.split, and by going over it character-by-character.
Which one is the fastest?
import sys
import re
def split_to_words_by_regex(text):
    # \s+ matches any run of whitespace; drop the empty strings that
    # re.split leaves at the edges so the result matches str.split()
    return [word for word in re.split(r'\s+', text) if word]

def split_to_words_by_split(text):
    return text.split()

def split_to_words_by_chars(text):
    words = []
    word = ''
    for ch in text:
        if ch.isspace():
            # any whitespace character ends the current word; skip empty
            # words so consecutive separators don't produce empty strings
            if word:
                words.append(word)
            word = ''
        else:
            word += ch
    if word:
        words.append(word)
    return words
if __name__ == '__main__':
if len(sys.argv) < 2:
exit(f"Usage: {sys.argv[0]} FILENAME")
filename = sys.argv[1]
with open(filename) as fh:
text = fh.read()
res1 = split_to_words_by_split(text)
res2 = split_to_words_by_chars(text)
res3 = split_to_words_by_regex(text)
#print(res1)
#print(res2)
assert res1 == res2
assert res1 == res3
Exercise: count words
Given a file, count how many times each word appears. Have two implementations: one using two lists and one using a dictionary. Profile the code and benchmark the two solutions.
See examples/perf/count_words_two_lists.py and examples/dictionary/count_words.py