Improving Performance - Optimizing code
- Speed
- Memory usage
- I/O (disk, network, database)
Problems
Optimization strategy
The 3 rules of optimization
- Don't do it!
- Don't do it!
- Don't do it yet!
Premature optimization is the root of all evil. ~ Donald Knuth
Locate the source of the problem
- I/O is expensive: database access, file access, GUI updates.
- If memory is full, swapping starts and speed drops sharply.
Optimizing tactics
- Choose the right data structure (dictionary? set? list?)
- Sorting: Decorate, Sort, Undecorate (DSU), a.k.a. the Schwartzian Transform.
- String concatenation: avoid repeated concatenation; collect the pieces and join them once.
- Loops: instead of for-loops and list comprehensions that build full lists, use generators and iterators.
- Delay expanding lazy iterables such as range, map, and filter.
- Caching results, memoizing (see the sketch below).
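As an illustration of the last point, a minimal memoization sketch using functools.lru_cache; the expensive() function here is just a stand-in for your own slow code:

from functools import lru_cache

@lru_cache(maxsize=None)
def expensive(key):
    # imagine a slow computation, a file read, or a database query here
    return sum(ord(ch) for ch in key) ** 2

print(expensive('wombat'))   # computed
print(expensive('wombat'))   # served from the cache; the body does not run again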
Read more performance tips
DSU: Decorate Sort Undecorate
In Perl it is called the Schwartzian transform. Note that sorted() and list.sort() already do the decoration internally when you pass a key= function (the key is computed once per element), so writing DSU explicitly is mostly useful for understanding what happens under the hood.
animals = ['chicken', 'cow', 'snail', 'elephant']
print(sorted(animals))
print(sorted(animals, key=len))
decorated = [(len(w), w) for w in animals]
print(decorated)
decorated.sort()
result = [d[1] for d in decorated]
print(result)
# at once
print([d[1] for d in sorted([(len(w), w) for w in animals])])
['chicken', 'cow', 'elephant', 'snail']
['cow', 'snail', 'chicken', 'elephant']
[(7, 'chicken'), (3, 'cow'), (5, 'snail'), (8, 'elephant')]
['cow', 'snail', 'chicken', 'elephant']
['cow', 'snail', 'chicken', 'elephant']
Profile code
Always profile before starting to optimize!
Slow example
This code does some work that a client deemed "too slow". The actual content is not that interesting.
import random

# f() and g() exist only to burn some CPU time on every pass of the sort
def f():
    n = 0
    for i in range(30):
        n += random.random()
    return n

def g():
    return random.random() * 30

def main(n):
    text = get_str(n)
    #print(text)
    text_sorted = sort(text)
    return text_sorted

# a bubble-sort style algorithm with the artificial extra work mixed in
def sort(s):
    chars = list(s)
    for i in reversed(range(len(chars))):
        a = f()
        b = g()
        for j in range(i, len(chars)-1):
            swap(chars, j)
    return ''.join(chars)

# build a string of n-1 random uppercase letters, one concatenation at a time
def get_str(n):
    text = ''
    for i in range(1, n):
        text += chr(65 + random.randrange(0, 26))
    return text

# swap two neighboring elements if they are out of order
def swap(lst, loc):
    if lst[loc] > lst[loc + 1]:
        lst[loc], lst[loc + 1] = lst[loc + 1], lst[loc]

if __name__ == '__main__':
    print(main(1000))
profile slow code
- profile
import slow
import profile
profile.run('slow.main(1000)')
537471 function calls in 3.078 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
999 0.003 0.000 0.003 0.000 :0(chr)
1 0.000 0.000 0.000 0.000 :0(join)
1000 0.003 0.000 0.003 0.000 :0(len)
31968 0.083 0.000 0.083 0.000 :0(random)
1999 0.009 0.000 0.009 0.000 :0(range)
1 0.001 0.001 0.001 0.001 :0(setprofile)
1 0.000 0.000 3.076 3.076 <string>:1(<module>)
0 0.000 0.000 profile:0(profiler)
1 0.000 0.000 3.078 3.078 profile:0(slow.main(1000))
999 0.009 0.000 0.012 0.000 random.py:173(randrange)
999 0.005 0.000 0.008 0.000 slow.py:10(g)
1 0.000 0.000 3.076 3.076 slow.py:14(main)
1 1.410 1.410 3.053 3.053 slow.py:21(sort)
1 0.008 0.008 0.023 0.023 slow.py:31(get_str)
498501 1.456 0.000 1.456 0.000 slow.py:37(swap)
999 0.090 0.000 0.171 0.000 slow.py:4(f)
cProfile slow code
- cProfile
import slow
import cProfile
cProfile.run('slow.main(1000)')
537470 function calls in 0.325 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.325 0.325 <string>:1(<module>)
999 0.002 0.000 0.002 0.000 random.py:173(randrange)
999 0.000 0.000 0.000 0.000 slow.py:10(g)
1 0.000 0.000 0.325 0.325 slow.py:14(main)
1 0.119 0.119 0.322 0.322 slow.py:21(sort)
1 0.001 0.001 0.003 0.003 slow.py:31(get_str)
498501 0.189 0.000 0.189 0.000 slow.py:37(swap)
999 0.008 0.000 0.010 0.000 slow.py:4(f)
999 0.000 0.000 0.000 0.000 {chr}
1000 0.000 0.000 0.000 0.000 {len}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
1 0.000 0.000 0.000 0.000 {method 'join' of 'str' objects}
31968 0.003 0.000 0.003 0.000 {method 'random' of '_random.Random' objects}
1999 0.003 0.000 0.003 0.000 {range}
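Both profile and cProfile can sort and trim their reports. A minimal sketch using the standard pstats module (the slow.stats filename is an arbitrary choice for this example):

import cProfile
import pstats
import slow

# save the raw profiling data to a file instead of printing it
cProfile.run('slow.main(1000)', 'slow.stats')

# load the data, sort by cumulative time, show the 10 most expensive entries
stats = pstats.Stats('slow.stats')
stats.sort_stats('cumulative').print_stats(10)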
Benchmarking
import timeit
from functools import reduce
import random
chars = []
for i in range(200):
chars.append(chr(65 + random.randrange(0, 26)))
print(timeit.timeit('string = "".join(chars)',
setup="from __main__ import chars", number=10000))
print(timeit.timeit('reduce(lambda x, y: x+y, chars)',
setup="from __main__ import chars, reduce", number=10000))
0.01576369699614588
0.15464225399773568
Benchmarking subs
import timeit
def one_by_one():
import random
text = ""
for i in range(200):
text += chr(65 + random.randrange(0, 26))
return text
def at_once():
import random
chars = []
for i in range(200):
chars.append(chr(65 + random.randrange(0, 26)))
text = ''.join(chars)
return text
print(timeit.timeit('one_by_one()',
setup="from __main__ import one_by_one", number=10000))
print(timeit.timeit('at_once()',
setup="from __main__ import at_once", number=10000))
1.5248507579963189
1.5566942970035598
Here the two variants measure almost the same: for only 200 characters, the cost of generating the random characters dominates the cost of building the string.
Counting words - which implementation is faster?
- collections
- defaultdict
- Counter
- timeit
- try
- except
- In this example we have 4 functions counting the number of appearances of each word in a list that is already in memory.
- We use timeit to benchmark them.
- repeat is the number of repetitions of each string.
- different is the number of different strings.
from collections import defaultdict
from collections import Counter
import timeit
def generate_list_of_words(number, repeat):
#words = ['Wombat', 'Rhino', 'Sloth', 'Tarantula', 'Sloth', 'Rhino', 'Sloth']
words = []
for ix in range(number):
for _ in range(repeat):
words.append(str(ix))
return words
def plain_counter(words):
counter = {}
for word in words:
if word not in counter:
counter[word] = 0
counter[word] += 1
return counter
def counter_with_exceptions(words):
counter = {}
for word in words:
try:
counter[word] += 1
except KeyError:
counter[word] = 1
return counter
def counter_with_counter(words):
counter = Counter()
for word in words:
counter[word] += 1
return counter
def counter_with_default_dict(words):
counter = defaultdict(int)
for word in words:
counter[word] += 1
return counter
def main():
#words = generate_list_of_words(1000, 1)
#counter = plain_counter(words)
#counter = counter_with_counter(words)
#counter = counter_with_default_dict(words)
#counter = counter_with_exceptions(words)
#for word in sorted(counter.keys()):
# print("{}:{}".format(word, counter[word]))
for repeat in [1, 10, 20, 50]:
different = int(1000 / repeat)
print(f'repeat {repeat} different {different}')
for name in ['plain_counter', 'counter_with_counter', 'counter_with_default_dict', 'counter_with_exceptions']:
print("{:26} {}".format(name, timeit.timeit(f'{name}(words)',
number=10000,
setup=f'from __main__ import {name}, generate_list_of_words; words = generate_list_of_words({different}, {repeat})')))
print()
if __name__ == "__main__":
main()
repeat 1 different 1000
plain_counter 0.6091844770126045
counter_with_counter 1.232734862016514
counter_with_default_dict 0.7378899219911546
counter_with_exceptions 1.4480015779845417
repeat 10 different 100
plain_counter 0.4949962190585211
counter_with_counter 0.7886336819501594
counter_with_default_dict 0.4284116430208087
counter_with_exceptions 0.4748374510090798
repeat 20 different 50
plain_counter 0.4847069630632177
counter_with_counter 0.7627606929745525
counter_with_default_dict 0.4116779019823298
counter_with_exceptions 0.407719356007874
repeat 50 different 20
plain_counter 0.4709314970532432
counter_with_counter 0.7357207209570333
counter_with_default_dict 0.3903243549866602
counter_with_exceptions 0.36094399297144264
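One variant the benchmark above does not include: passing the whole list to the Counter constructor instead of looping in Python. In CPython the counting loop then runs in C, so it is usually the fastest option; a minimal sketch (the function name counter_with_constructor is our own):

from collections import Counter

def counter_with_constructor(words):
    # Counter counts the whole iterable in a single call;
    # CPython implements this counting loop in C
    return Counter(words)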
for loop or reduce to add numbers?
- for
- functools
- reduce
import timeit
from functools import reduce
def add_in_loop(num):
total = 0
for ix in range(num+1):
total += ix
return total
def add_with_reduce(num):
total = reduce(lambda x, y: x + y, range(num+1))
return total
def main():
#num = 4
#print(add_in_loop(num))
#print(add_with_reduce(num))
for num in [10, 1000]:
print(f'num {num}')
for name in ['add_in_loop', 'add_with_reduce']:
print("{:16} {}".format(name, timeit.timeit(f'{name}({num})',
number=100000,
setup=f'from __main__ import {name}')))
print()
if __name__ == "__main__":
main()
num 10
add_in_loop 0.023712733993306756
add_with_reduce 0.05284293496515602
num 1000
add_in_loop 1.57034146389924
add_with_reduce 3.0021417930256575
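For this particular task the built-in sum beats both, because its summing loop also runs in C, and the closed-form formula beats everything; a minimal sketch that could be dropped into the same harness:

def add_with_sum(num):
    # the loop over the range runs inside the C implementation of sum
    return sum(range(num+1))

def add_with_formula(num):
    # Gauss' closed form for 0 + 1 + ... + num
    return num * (num + 1) // 2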
Levenshtein distance
- editdistance: Levenshtein distance implemented in C
- python-Levenshtein: implemented in C
- pylev
- pyxdameraulevenshtein
- weighted-levenshtein
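For comparison with the C implementations above, here is a minimal pure-Python sketch of the classic Wagner-Fischer dynamic-programming algorithm (the function name levenshtein is our own):

def levenshtein(a, b):
    # keep only the previous and the current row of the DP matrix
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution (free on a match)
            ))
        prev = cur
    return prev[-1]

print(levenshtein('kitten', 'sitting'))   # 3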
Generate words
import sys
import random
import string
# TODO: set min, max word length
# TODO: set filename
# TODO: set character types
# TODO: allow spaces?
def main():
filename = "words.txt"
min_len = 6
max_len = 6
if len(sys.argv) != 2:
exit(f"Usage: {sys.argv[0]} WORD_COUNT")
count = int(sys.argv[1])
with open(filename, 'w') as fh:
for _ in range(count):
word = ''
length = random.randrange(min_len, max_len+1)
for _ in range(length):
word += random.choice(string.ascii_lowercase)
fh.write(word + "\n")
main()
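On Python 3.6 or later the inner character loop can be replaced with a single call to random.choices; a minimal sketch (the helper name random_word is our own):

import random
import string

def random_word(length):
    # draw `length` characters with replacement in one call
    return ''.join(random.choices(string.ascii_lowercase, k=length))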
Levenshtein - pylev
import sys
import pylev
def main():
if len(sys.argv) != 2:
exit(f"Usage: {sys.argv[0]} filename")
filename = sys.argv[1]
outfile = 'out.txt'
rows = []
with open(filename) as fh:
for row in fh:
rows.append(row.rstrip("\n"))
with open(outfile, 'w') as fh:
for a in rows:
for b in rows:
dist = pylev.levenshtein(a, b)
fh.write(f"{a},{b},{dist}\n")
main()
Levenshtein - editdistance
import sys
import editdistance
def main():
if len(sys.argv) != 2:
exit(f"Usage: {sys.argv[0]} filename")
filename = sys.argv[1]
outfile = 'out.txt'
rows = []
with open(filename) as fh:
for row in fh:
rows.append(row.rstrip("\n"))
with open(outfile, 'w') as fh:
for a in rows:
for b in rows:
dist = editdistance.eval(a, b)
fh.write(f"{a},{b},{dist}\n")
main()
Editdistance benchmark
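A minimal timeit sketch comparing the two libraries on one pair of words (the words and the repetition count are arbitrary choices for this example):

import timeit

setup = '''
import pylev
import editdistance
a = "chicken"
b = "elephant"
'''

print("pylev       ", timeit.timeit('pylev.levenshtein(a, b)', setup=setup, number=10000))
print("editdistance", timeit.timeit('editdistance.eval(a, b)', setup=setup, number=10000))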
A Tool to Generate text files
import sys
import string
import random
import argparse
import os
# Generate N files of SIZE random characters each
def get_args():
parser = argparse.ArgumentParser()
parser.add_argument('--dir', help="Directory where to create the files", default=".")
parser.add_argument('--files', type=int, help="Number of files to create", default=1)
parser.add_argument('--size', type=int, help="Size of files", default=10)
args = parser.parse_args()
return args
def main():
args = get_args()
chars = list(string.ascii_lowercase) + [' '] * 5 + ['\n']
for ix in range(args.files):
all_chars = []
        for _ in range(args.size):
            # random.choice(chars) would be simpler than a 1-element sample
            all_chars.extend(random.sample(chars, 1))
#print(len(all_chars))
#print(all_chars)
filename = os.path.join(args.dir, str(ix) + '.txt')
with open(filename, 'w') as fh:
fh.write(''.join(all_chars))
def old_main():
if len(sys.argv) < 2:
exit(f"Usage: {sys.argv[0]} NUMBER_OF_ROWS")
row_count = int(sys.argv[1])
min_width = 30
max_width = 50
filename = 'data.log'
chars = list(string.ascii_lowercase) + [' '] * 5
all_chars = chars * max_width
with open(filename, 'w') as fh:
for i in range(row_count):
width = random.randrange(min_width, max_width+1)
row = ''.join(random.sample(all_chars, width))
fh.write(row + "\n")
main()
Count characters
# updates the parallel lists chars and counter in place
def add_char(chars, counter, ch, cnt=1):
    for ix in range(len(chars)):
        if chars[ix] == ch:
            counter[ix] += cnt
            break
    else:
        # the for-loop's else branch runs when there was no break:
        # ch has not been seen before
        chars.append(ch)
        counter.append(cnt)
def count_in_file(filename):
#print(filename)
chars = []
counter = []
with open(filename) as fh:
for row in fh:
for ch in row:
#print(ch)
if ch == ' ':
continue
if ch == '\n':
continue
add_char(chars, counter, ch)
#print(chars)
#print(counter)
return chars, counter
def merge(chars1, counter1, chars2, counter2):
chars = []
counter = []
for ix in range(len(chars1)):
add_char(chars, counter, chars1[ix], cnt=counter1[ix])
for ix in range(len(chars2)):
add_char(chars, counter, chars2[ix], cnt=counter2[ix])
return chars, counter
def print_results(chars, counter):
print("Results")
for ix in range(len(chars)):
print("{} {}".format(chars[ix], counter[ix]))
def count_in(filenames):
total_chars = []
total_counter = []
for filename in filenames:
chars, counter = count_in_file(filename)
total_chars, total_counter = merge(total_chars, total_counter, chars, counter)
return total_chars, total_counter
if __name__ == '__main__':
import sys
chars, counter = count_in(sys.argv[1:])
print_results(chars, counter)
import count_characters as count
import cProfile
import sys
cProfile.run('chars, counter = count.count_in(sys.argv[1:])')
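Profiling this code usually points at add_char and its linear scan over the chars list. For comparison, a sketch of the same counting done with a plain dict (the name count_in_file_dict is our own), where each lookup is a single hash operation instead of a scan:

def count_in_file_dict(filename):
    counter = {}
    with open(filename) as fh:
        for row in fh:
            for ch in row:
                if ch in (' ', '\n'):
                    continue
                # dict lookup is O(1) on average instead of scanning a list
                counter[ch] = counter.get(ch, 0) + 1
    return counter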
Memory leak
import random

def alloc():
    a = {
        'data': str(random.random()) + "a" * 10000000,
    }
    b = {
        'data': str(random.random()) + "b" * 10000000,
    }
    # a and b now reference each other, so when alloc() returns their
    # reference counts never drop to zero and the ~10 MB strings stay
    # in memory until the cyclic garbage collector runs
    a['other'] = b
    b['other'] = a
import sys
from mymem import alloc
if len(sys.argv) < 2:
exit(f"Usage: {sys.argv[0]} N")
count = int(sys.argv[1])
for _ in range(count):
alloc()
input("End the script")
Garbage collection
import sys
from mymem import alloc
import gc
if len(sys.argv) < 2:
exit(f"Usage: {sys.argv[0]} N")
count = int(sys.argv[1])
for _ in range(count):
alloc()
input("Run gc")
# break the a <-> b cycles; the collector frees the big strings
gc.collect()
input("End the script")
Weak reference
import random
import weakref

# Plain dicts cannot be weakly referenced in CPython, so this version
# wraps the data in a minimal class, which can be.
class Data:
    def __init__(self, data):
        self.data = data
        self.other = None

def alloc():
    a = Data(str(random.random()) + "a" * 10000000)
    b = Data(str(random.random()) + "b" * 10000000)
    # A weakref.ref does not increase the reference count, so when alloc()
    # returns, both objects become unreachable and are freed immediately:
    # no leak, even without the cyclic garbage collector.
    # Dereference with a.other(), which returns the object, or None once
    # the object has been collected.
    a.other = weakref.ref(b)
    b.other = weakref.ref(a)
import sys
from weakmymem import alloc
if len(sys.argv) < 2:
exit(f"Usage: {sys.argv[0]} N")
count = int(sys.argv[1])
for _ in range(count):
alloc()
input("End the script")
Exercise: benchmark list-comprehension, map, for
-
Create several functions that accept a list of numbers from 1 to 1000 and calculate their square:
-
A function with a
for-loop. -
A function that uses
map. -
A function that uses list-comprehension.
-
Feel free to have any other calucaltion and measure that.
-
Send me the code and the results!
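One possible skeleton for the benchmark, in the style of the timeit examples above; the function bodies are the point of the exercise, so treat these only as one way to write them:

import timeit

def squares_for(numbers):
    result = []
    for n in numbers:
        result.append(n * n)
    return result

def squares_map(numbers):
    return list(map(lambda n: n * n, numbers))

def squares_listcomp(numbers):
    return [n * n for n in numbers]

numbers = list(range(1, 1001))
for name in ['squares_for', 'squares_map', 'squares_listcomp']:
    print("{:18} {}".format(name, timeit.timeit(
        f'{name}(numbers)',
        number=10000,
        setup=f'from __main__ import {name}, numbers')))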
Exercise: Benchmark Levenshtein
- Take the various implementations of the Levenshtein distance calculation and check which one is the fastest.
Exercise: sort files
Write a script that, given a path to a directory, will print the files sorted by date.
If you don't have one large folder, then use os.walk to get the path to the files of a whole directory tree.
- Write a simple solution.
- Profile.
- Use DSU.
Exercise: compare split words
We have three ways of splitting a string into words: using split, using re.split, and by going over it character-by-character.
Which one is the fastest?
import sys
import re
def split_to_words_by_regex(text):
    # \s+ matches any run of whitespace; drop the empty strings that
    # re.split leaves at the edges so the result matches str.split()
    return [word for word in re.split(r'\s+', text) if word]

def split_to_words_by_split(text):
    return text.split()

def split_to_words_by_chars(text):
    words = []
    word = ''
    for ch in text:
        if ch.isspace():
            # any whitespace character ends the current word; skip empty
            # words so consecutive separators don't produce empty strings
            if word:
                words.append(word)
            word = ''
        else:
            word += ch
    if word:
        words.append(word)
    return words
if __name__ == '__main__':
if len(sys.argv) < 2:
exit(f"Usage: {sys.argv[0]} FILENAME")
filename = sys.argv[1]
with open(filename) as fh:
text = fh.read()
res1 = split_to_words_by_split(text)
res2 = split_to_words_by_chars(text)
res3 = split_to_words_by_regex(text)
#print(res1)
#print(res2)
assert res1 == res2
assert res1 == res3
Exercise: count words
Given a file, count how many times each word appears. Have two implementations: one using two lists and one using a dictionary. Profile the code and benchmark the two solutions.
See examples/perf/count_words_two_lists.py and examples/dictionary/count_words.py