
Published 2021-10-14 21:52:15
Python - Learn Generators Now!
Maximize performance and conserve memory when working with large datasets by utilizing generators in Python.
In this lesson, you'll dive into the concepts of generators and discover how to create generator functions and expressions.
Explore the difference between the yield statement and the return statement and learn about advanced generator methods such as send(), throw(), and close().
Additionally, discover how to measure memory consumption and execution time using the sys and timeit modules.
Generators in Python
In Python, you can create your own iterator classes or functions to iterate over items in collections such as lists, dictionaries, tuples, or sets.
Python also provides generators: functions and expressions that produce the items of an iterator one at a time, without you having to write the iterator class yourself.
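To make the comparison concrete, here is a minimal sketch of the same sequence written first as a hand-made iterator class and then as a generator function. The names CountUp and count_up are made up for this illustration and are not used elsewhere in the lesson.

```python
class CountUp:
    """Hand-written iterator producing 1..limit (hypothetical example class)."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        # An iterator returns itself from __iter__
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration  # signals that there are no more items
        self.current += 1
        return self.current


def count_up(limit):
    """Generator function producing the same sequence (hypothetical name)."""
    current = 0
    while current < limit:
        current += 1
        yield current


print(list(CountUp(3)))   # [1, 2, 3]
print(list(count_up(3)))  # [1, 2, 3]
```

Both versions give the same result; the generator function simply lets Python handle the bookkeeping that the class does by hand.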
It's important to keep in mind that memory is a limited resource, especially when working with large datasets.
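A list, for instance, requires memory to store its items. On a typical 64-bit CPython build, an empty list uses 56 bytes for the list structure and an additional 8 bytes for each item reference; the exact numbers vary between Python versions and platforms.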
To check the size of a list, you can use the sys module from the standard library, as shown in this example:
>>> import sys
>>> sys.getsizeof([])
56
>>> sys.getsizeof([1])
64
>>> sys.getsizeof([1, 2])
72
Lists consume memory, and working with large datasets can put a strain on your computer's memory.
For large or even endless sequences of data, generators can be a more resource-efficient solution because they produce one item at a time instead of storing them all.
In this section, you'll explore how to use generators and learn the syntax and rules for using them.
It's important to note that there are performance trade-offs to consider when using generators, which will be demonstrated through benchmarking examples.
Create a Generator
We will create a function with an infinite loop that generates integers. When it comes to generators, there are some key concepts to keep in mind.
One of these is the yield statement, which is similar to a return statement but hands back one value at a time and then pauses the function. To retrieve the next value, we call the built-in next() function on the generator.
The generator preserves its state and keeps track of where it last stopped.
Example:
# The infinite loop function
def infinitive():
    num = 0  # We start counting from zero
    while True:  # Never-ending loop
        yield num  # Instead of using return, we use the yield statement
        num += 1  # On each pass, we add one
        yield 'I\'m the generator'  # A second yield statement that gives us a message

# Calling the function creates a generator object, which we assign to a variable
inf = infinitive()
# Each call to next() returns the next yielded value. Our yield statements only return one item at a time.
for _ in range(8):
    print(next(inf))
Output:
0
I'm the generator
1
I'm the generator
2
I'm the generator
3
I'm the generator
The infinitive function is a generator function that uses an infinite loop to produce a sequence of values. It starts by initializing the num variable to 0, and then repeatedly yields the value of num followed by the string "I'm the generator". The while True loop ensures that this process continues indefinitely.
Calling infinitive() creates a generator object, which is assigned to the inf variable. Each call to next(inf) resumes the generator and retrieves the next value in the sequence. Because the loop never ends, this generator never runs out of values.
The output of the code shows that the generator yields the values of num in sequence, each followed by the string "I'm the generator".
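Because this generator never ends, looping over it directly would run forever. One common way to take just the first few values is itertools.islice from the standard library. A minimal sketch, repeating the generator function so that the snippet runs on its own:

```python
from itertools import islice


def infinitive():
    num = 0
    while True:
        yield num
        num += 1
        yield "I'm the generator"


# Take only the first four values from the endless generator
first_four = list(islice(infinitive(), 4))
print(first_four)  # [0, "I'm the generator", 1, "I'm the generator"]
```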
❕ The return statement ends the execution of the function, whereas the yield statement only pauses it, so the function can resume later where it left off.
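A tiny side-by-side sketch can make this difference visible. The function names with_return and with_yield are made up for the illustration:

```python
def with_return():
    return 1
    return 2  # never reached: the first return ended the function


def with_yield():
    yield 1
    yield 2  # reached on the second next() call


print(with_return())  # 1
print(with_return())  # 1 again: every call starts from the top

numbers = with_yield()
print(next(numbers))  # 1
print(next(numbers))  # 2: execution resumed right after the first yield
```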
Exhausting a Generator
Generators become exhausted when they run out of items to yield. Calling next() on an exhausted generator raises a StopIteration exception.
In the following example, we will create another function, named finitive. Instead of a while True loop, it has a list whose items it yields one by one.
def finitive():
    nums = [1, 2, 3, 4, 5]
    for num in nums:
        yield num
        # if num == len(nums):
        #     yield 'I\'m exhausted'

finite = finitive()
print((next(finite)))
print((next(finite)))
print((next(finite)))
print((next(finite)))
print((next(finite)))
print((next(finite)))
As we can see in the output, the sixth call to next() raises StopIteration since there are no more items to return.
1
2
3
4
5
Traceback (most recent call last):
  File "/Users/almat18/Desktop/python_projects/generators/genex_1.py", line 24, in <module>
    print((next(finite)))
StopIteration
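In everyday code you rarely see this traceback, because a for loop handles StopIteration for you, and next() accepts an optional default value that it returns instead of raising. A short sketch using an equivalent generator (one_to_five is a made-up name so the snippet is self-contained):

```python
def one_to_five():
    for num in [1, 2, 3, 4, 5]:
        yield num


# A for loop stops cleanly when the generator is exhausted
for value in one_to_five():
    print(value)

# next() with a default returns that default instead of raising StopIteration
numbers = one_to_five()
for _ in range(5):
    next(numbers)
print(next(numbers, 'exhausted'))  # exhausted
```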
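Let's now add an if statement that checks whether we have reached the end of the list by comparing the current number with len(nums). This works here only because the list holds the numbers 1 to 5, so the last value equals the length of the list.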
def finite_sequence():
    nums = [1, 2, 3, 4, 5]
    for num in nums:
        yield num
        if num == len(nums):
            yield 'I\'m exhausted'

finite = finite_sequence()
print((next(finite)))
print((next(finite)))
print((next(finite)))
print((next(finite)))
print((next(finite)))
print((next(finite)))
Output:
1
2
3
4
5
I'm exhausted
So far, we have learned how to create a simple generator using the yield keyword. In the next section, we will look at a more compact way to create generators, called generator expressions.
Generator Expressions
Generator expressions are very similar to list comprehensions (there is a separate article teaching you about lists in Python).
List comprehension: huge_list = [num for num in range(1000000)]
Generator expression (uses parentheses): huge_gen = (num for num in range(1000000))
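One difference is worth seeing in action: a list comprehension builds all of its items immediately and can be read as often as you like, while a generator expression produces items only when asked and can be consumed only once. A small sketch with a shorter range, so the results are easy to check by hand:

```python
small_list = [num * 2 for num in range(5)]
small_gen = (num * 2 for num in range(5))

print(small_list)       # [0, 2, 4, 6, 8]
print(sum(small_list))  # 20
print(sum(small_list))  # 20, the list can be reused

print(sum(small_gen))   # 20
print(sum(small_gen))   # 0, the generator is already exhausted
```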
Let's now benchmark the two. For this exercise, we are going to use a module called timeit.
We're going to sum the numbers produced in each run, so for something like sum(num for num in range(3)) the result would be, guess what? 3, since it adds 0+1+2=3.
To give it a bit more work, we will use a range of 100. By default, timeit.timeit() runs the statement one million times and returns the total time in seconds.
So let's race List vs Generator:
import timeit

result_list = (timeit.timeit('sum([num for num in range(100)])'))
result_gen = (timeit.timeit('sum((num for num in range(100)))'))
if result_list < result_gen:
    print('🥇 The winner is the List who made it in {} 🥇 '.format(result_list))
    print('The generator made it in {}'.format(result_gen))
else:
    print('This one will never happen 😂')
And the winner is:
🥇 The winner is the List who made it in 2.7239761609962443 🥇
The generator made it in 3.715018342998519
The conclusion is that, in this benchmark, the list beats the generator on speed. The exact numbers depend on your machine and Python version.
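A single measurement can be noisy. A slightly more careful sketch, under the assumption that a smaller number of runs and a few repeats are enough for a rough comparison, uses timeit.repeat and takes the best of several measurements. No output is shown here because the timings differ on every machine.

```python
import timeit

# number: how many times each statement runs per measurement
# repeat: how many measurements to take
list_times = timeit.repeat('sum([num for num in range(100)])', number=10_000, repeat=5)
gen_times = timeit.repeat('sum(num for num in range(100))', number=10_000, repeat=5)

# The minimum is usually the measurement least disturbed by other activity
print('list comprehension:   {:.4f} s'.format(min(list_times)))
print('generator expression: {:.4f} s'.format(min(gen_times)))
```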
But let's have a look at the memory consumption with our old friend sys.getsizeof()
Let's enrich our code a little bit:
We will simply add the following line of code:
print('Mr.List the speed Winner takes {} bytes, but the slower 🥇 Mr.Generator🥇 takes {} bytes'.format(sys.getsizeof([num for num in range(1000)]), sys.getsizeof((num for num in range(1000)))))
import timeit
import sys

result_list = (timeit.timeit('sum([num for num in range(100)])'))
result_gen = (timeit.timeit('sum((num for num in range(100)))'))
if result_list < result_gen:
    print('🥇 The winner is the List who made it in {} 🥇 '.format(result_list))
    print('The generator made it in {}'.format(result_gen))
else:
    print('This one will never happen 😂')
print('Mr.List the speed Winner takes {} bytes, but the slower 🥇 Mr.Generator🥇 takes {} bytes'.format(sys.getsizeof([num for num in range(1000)]), sys.getsizeof((num for num in range(1000)))))
The output is:
🥇 The winner is the List who made it in 2.775996265998401 🥇
The generator made it in 3.4153991410057642
Mr.List the speed Winner takes 8856 bytes, but the slower 🥇 Mr.Generator🥇 takes 104 bytes
So as we can see, the generator expression takes just 104 bytes compared to the list, which takes 8856 bytes. When it comes to memory usage, the generator is the clear winner. In short, choosing a generator over a list gives you much lower memory consumption, and the trade-off is somewhat slower iteration.
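To see how the two scale, the sketch below measures both for several sizes. Exact byte counts vary between Python versions and platforms, so none are shown; the point is that the list grows with the number of items while the generator object stays the same size. Keep in mind that sys.getsizeof() reports only the container itself, not the integers it refers to. The sizes dictionary is just a helper for this illustration.

```python
import sys

sizes = {}
for size in (10, 1_000, 100_000):
    list_bytes = sys.getsizeof([num for num in range(size)])
    gen_bytes = sys.getsizeof(num for num in range(size))
    sizes[size] = (list_bytes, gen_bytes)
    print('{:>7} items: list {} bytes, generator {} bytes'.format(size, list_bytes, gen_bytes))
```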
Advanced Generator Methods
- .send()
- .throw()
- .close()
.send() Method And How To Use It
In this section, we will look at the .send() method.
Use case: We have a function that checks whether the number we provide is odd. If it is, we get True back; if it is even, we get None; and if a non-integer value is provided, we get a message saying so.
def is_odd_number(num):
    if type(num) != int:
        return ('{} not an integer'.format(num))
    if (num % 2) != 0:
        return True
    if (num % 2) == 0:
        return None
We will also generate a list with some integer items, so we can add the following code:
my_list = [x for x in range(10)] # a list having integer items
A generator pays off at this point, especially if we deal with large data sets. So let's add a simple generator function that we will call check_numbers_gen():
def is_odd_number(num):
    if type(num) != int:
        return ('{} not an integer'.format(num))
    if (num % 2) != 0:
        return True
    if (num % 2) == 0:
        return None

my_list = [x for x in range(10)] # a list having integer items

def check_numbers_gen():
    num = 0
    print('The generator is initially wakening up')
    while True:
        num = (yield is_odd_number(num))
        print('GENERATOR: Received from the MAIN Loop:', num)
There are some key concepts here:
- num variable: first, we assign an integer value of zero to the num variable.
- while True: the loop keeps the generator alive for as long as the main program keeps sending values; between values, the generator is paused.
- num = (yield is_odd_number(num)) is the key expression here. The generator first calls is_odd_number(num) and yields the result to the caller, then pauses. When the caller invokes send(), the value passed in becomes the result of the yield expression and is assigned to the num variable. On the next pass through the loop, that new value is checked. This means we can feed the generator our own values, in this case from our list.
- print out num: with the print('GENERATOR: ...') call, we check what item we've received from the MAIN loop.
gen = check_numbers_gen() # you create the generator object
next(gen) # wakening up the generator, our program needs to advance to the first yield.
Based on our list, we need to send in the list items to the generator with the send() method.
for item in my_list:
    next_value = gen.send(item) # send the item in; next_value receives what the generator yields back
    print('MAIN: next_value BOOL: {} and the Item is: {} '.format(next_value, item)) # show the result next to the item we sent
- We are looping through our list
- The gen.send() method sends a value to our generator, and we assign what it returns to the next_value variable
- next_value holds the result the generator yielded for that item. We print it together with the item to get a visual result.
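The same priming and sending steps can be seen in a smaller, self-contained sketch. The generator below, with the made-up name running_total, keeps a sum of everything sent to it and yields the new total each time:

```python
def running_total():
    total = 0
    while True:
        # Pause here and hand `total` to the caller;
        # when the caller uses send(x), the yield expression evaluates to x
        received = yield total
        total += received


acc = running_total()
print(next(acc))     # 0, runs to the first yield (priming the generator)
print(acc.send(10))  # 10
print(acc.send(5))   # 15
```

Calling send() on a generator that has not been advanced to its first yield raises a TypeError (unless the value sent is None), which is why the next() call comes first.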
Complete program
def is_odd_number(num):
    if type(num) != int:
        return ('{} not an integer'.format(num))
    if (num % 2) != 0:
        # print('{} is an odd number'.format(num))
        return True
    if (num % 2) == 0:
        return None

def check_numbers_gen():
    num = 0
    print('The generator is initially wakening up')
    while True:
        num = (yield is_odd_number(num))
        print('GENERATOR: Received from the MAIN Loop:', num)

my_list = [30, 25, 53, 66, 79, 45]
# instantiating the generator and assigning it to gen
gen = check_numbers_gen()
next(gen)
# MAIN Loop
for item in my_list:
    next_value = gen.send(item) # assign the value the generator yields back to next_value
    print('MAIN: next_value BOOL: {} and the Item is: {} '.format(next_value, item))
Output:
The generator is initially wakening up
GENERATOR: Received from the MAIN Loop: 30
MAIN: next_value BOOL: None and the Item is: 30
GENERATOR: Received from the MAIN Loop: 25
MAIN: next_value BOOL: True and the Item is: 25
GENERATOR: Received from the MAIN Loop: 53
MAIN: next_value BOOL: True and the Item is: 53
GENERATOR: Received from the MAIN Loop: 66
MAIN: next_value BOOL: None and the Item is: 66
GENERATOR: Received from the MAIN Loop: 79
MAIN: next_value BOOL: True and the Item is: 79
GENERATOR: Received from the MAIN Loop: 45
MAIN: next_value BOOL: True and the Item is: 45
We let the MAIN loop show all the yielded values, including None, but this can be rewritten if you don't want that in your output. In the MAIN loop, we can add a condition that filters out the None values. In addition, we will add one non-integer value to check what message we get from the is_odd_number() function:
my_list = [30, 25, 53, 66, 79, 45, 'automobile']
# MAIN Loop
for item in my_list:
    next_value = gen.send(item) # let's assign the next item to next_value item
    if next_value is not None:
        print('MAIN: next_value BOOL: {} and the Item is: {} '.format(next_value, item))
Output (MAIN lines only):
MAIN: next_value BOOL: True and the Item is: 25
MAIN: next_value BOOL: True and the Item is: 53
MAIN: next_value BOOL: True and the Item is: 79
MAIN: next_value BOOL: True and the Item is: 45
MAIN: next_value BOOL: automobile not an integer and the Item is: automobile
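.throw() Method And How To Use It
The throw() method raises an exception inside the generator, at the yield where it is paused. It is useful for signalling that a certain condition has not been met. In the following example, we will modify our main loop to include an additional check for non-integer values. We already have a similar conditional check in the is_odd_number(num) function. Because check_numbers_gen() does not catch the exception, the ValueError propagates back to the main loop and the generator stops. Example: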
# MAIN Loop
for item in my_list:
    if type(item) != int:
        gen.throw(ValueError('Not an integer value'))
    next_value = gen.send(item) # let's assign the next item to next_value item
    if next_value is not None:
        print('MAIN: next_value BOOL: {} and the Item is: {} '.format(next_value, item))
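In the loop above, the generator does not handle the exception, so it travels back out of gen.throw() into the main loop. A generator can also catch the thrown exception at its paused yield and carry on. The following sketch uses a made-up generator, tolerant_echo, to show that variant:

```python
def tolerant_echo():
    message = 'ready'
    while True:
        try:
            # Execution pauses here; throw() raises the exception at this line
            received = yield message
            message = 'got {}'.format(received)
        except ValueError as error:
            # Recover and report the problem through the next yielded value
            message = 'recovered from: {}'.format(error)


echo = tolerant_echo()
print(next(echo))                           # ready
print(echo.send(42))                        # got 42
print(echo.throw(ValueError('bad input')))  # recovered from: bad input
print(echo.send(7))                         # got 7
```

Like send(), throw() returns the next value the generator yields, which is why its result can be printed here.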
.close() Method And How To Use It
The close() method is used to terminate a generator. For example, if we're searching for a specific value in a list, let's say 53, we may want to stop the main application from requesting the next value from the generator once we find it. The close() method raises a GeneratorExit exception inside the generator at the paused yield, which ends it. Any later next() or send() call on the closed generator raises StopIteration, which is what the gen.send(item) line runs into in this example:
# MAIN Loop
for item in my_list:
    if type(item) != int:
        gen.throw(ValueError('Not an integer value'))
    if item == 53:
        gen.close()
    next_value = gen.send(item) # let's assign the next item to next_value item
    if next_value is not None:
        print('MAIN: next_value BOOL: {} and the Item is: {} '.format(next_value, item))
Output:
StopIteration
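Inside the generator, close() raises GeneratorExit at the paused yield; the StopIteration comes from calling send() on the generator after it has been closed. A sketch with a made-up generator, echo_until_closed, shows both steps and how a break avoids the error:

```python
def echo_until_closed(log):
    try:
        while True:
            received = yield
            log.append(received)
    finally:
        # Runs when close() raises GeneratorExit at the paused yield
        log.append('closed')


log = []
collector = echo_until_closed(log)
next(collector)  # advance to the first yield

for item in [30, 25, 53, 66]:
    if item == 53:
        collector.close()
        break  # stop sending: a closed generator would raise StopIteration
    collector.send(item)

print(log)  # [30, 25, 'closed']

try:
    collector.send(66)
except StopIteration:
    print('send() after close() raises StopIteration')
```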
What you have learned
In this lesson, you have learned the following:
- What generators are and how to use them
- How to write your own generator functions and generator expressions
- The difference between yield and return
- Advanced generator methods: send(), throw() and close()
- How to benchmark generator expressions against list comprehensions with sys and timeit