A Generator Expression is doing basically the same thing as a List Comprehension does, but the GE does it lazily. The difference is quite similar to the difference between range and xrange.
A List Comprehension, just like the plain range
function, executes
immediately and returns a list.
A Generator Expression, just like xrange
returns an object that
can be iterated over.
The comparision is not perfect though, because in an object returned by the generator expression, we cannot access an element by index.
The difference between the two kinds of expressions is that the
List comprehension is enclosed in square brackets []
while the Generator expression is enclosed in plain parentheses ()
.
l = [n*2 for n in range(1000)] # List comprehension
g = (n*2 for n in range(1000)) # Generator expression
Type
The types of the resulting values are list
and generator
respectively:
print(type(l)) # 'list'
print(type(g)) # 'generator'
Size in memory
The size of the objects is 9032 bytes (the list), and 80 bytes (the generator):
print(sys.getsizeof(l)) # 9032
print(sys.getsizeof(g)) # 80
Access by Index
We can access the elements of the list
, but if we try to access the elements
of the generator
we get a TypeError
:
print(l[4]) # 8
print(g[4]) # TypeError: 'generator' object has no attribute '__getitem__'
Loop over
Finally, but most importantly, we can iterate over either of them:
for v in l:
pass
for v in g:
pass
The full example
examples/python/generator_expression.py
#!/usr/bin/env python
from __future__ import print_function
import sys
l = [n*2 for n in range(1000)] # List comprehension
g = (n*2 for n in range(1000)) # Generator expression
print(type(l)) # <type 'list'>
print(type(g)) # <type 'generator'>
print(sys.getsizeof(l)) # 9032
print(sys.getsizeof(g)) # 80
print(l[4]) # 8
#print(g[4]) # TypeError: 'generator' object has no attribute '__getitem__'
for v in l:
pass
for v in g:
pass
Comments
Scenerio.From a large data collection 'else where 'you have to collect certain items and store at your end subject to complex conditions you impose. On-the-go nature of expression will help you save 'temporary storage' at your end compared to list comprehension which will make a list. Of course list comprehension can use if clause to 'filter' but your requirement may be too complex to code in relational and logical operations.
The idea is expression helps on-the-go decision making. For example getting an item from one source and deciding to keep it or not for you then and there will be huge memory saving,
Thanks. So it takes more memory to iterate over a list comprehension. Which one is faster to iterate over, all things being equal?
You never got a reply to your question but it's a good one.
My guess is that their speed for small data sets is identical or close to it. Start forcing OS swapping with huge lists and that goes out the window (obviously).
Python generators remind me of Unix pipes.