For this example, I'm going to use a program from the previous chapter, which showed us how to append things to a double-ended queue. It's a rather simplistic script which doesn't do a hell of a lot, so, it's perfect for showing you the results without too much noise.
import collections
doubleEndedQueue = collections.deque('123456')
print("Deque: {}".format(doubleEndedQueue))
doubleEndedQueue.append('1')
print("Deque: {}".format(doubleEndedQueue))
doubleEndedQueue.appendleft('6')
print("Deque: {}".format(doubleEndedQueue))
When we call cProfile on the preceding program, it should first run through the entire program for us and display the output on the console before displaying the tabled stats that it's recorded for this particular bit of code.
$ python3.6 -m cProfile 04_appendDeque.py
Deque: deque(['1', '2', '3', '4', '5', '6'])
Deque: deque(['1', '2', '3', '4', '5', '6', '1'])
Deque: deque(['6', '1', '2', '3', '4', '5', '6', '1'])
11 function calls in 0.000 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 04_appendDeque.py:1(<module>)
1 0.000 0.000 0.000 0.000 {built-in method builtins.exec}
3 0.000 0.000 0.000 0.000 {built-in method builtins.print}
1 0.000 0.000 0.000 0.000 {method 'append' of 'collections.deque' objects}
1 0.000 0.000 0.000 0.000 {method 'appendleft' of 'collections.deque' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
3 0.000 0.000 0.000 0.000 {method 'format' of 'str' objects}
As you can see, the preceding code takes next to no time to execute. We can see from this that the method append and appendleft were called a grand total of once each, and that they took a miniscule amount of time to execute. You can also see on the last line of our table that we called the format function thrice, and that it again took a minimal amount of time.
Let's see what happens if we try using the cProfile on a slightly more advanced program like this:
import threading
import random
import time
def myWorker():
for i in range(5):
print("Starting wait time")
time.sleep(random.randint(1,5))
print("Completed Wait")
thread1 = threading.Thread(target=myWorker)
thread2 = threading.Thread(target=myWorker)
thread3 = threading.Thread(target=myWorker)
thread1.start()
thread2.start()
thread3.start()
thread1.join()
thread2.join()
thread3.join()
If we were to execute cProfile on the last program, which doesn't look too mind bogglingly complex compared to our first program, then you should see a far bigger table rendered out to your console.
For this particular example, the output is far more insightful, and gives us an indication of where some of the slowdowns occur within our codebase so that we can further optimize and improve these speeds.
5157 function calls (5059 primitive calls) in 17.354 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 17.354 17.354 01_timeit.py:1(<module>)
…
You should notice that the number of function calls in the preceding output is exponentially higher than that of our first program--a staggering 5157 function calls as opposed to 10.
This should give you some indication as to how drastically the performance of two programs can differ, and the sheer magnitude of the work that goes on in the background without our knowledge. cProfile, and other tools like this, give us an incredibly important insight into the inner workings of our systems, and arming ourselves with this knowledge allows us to actively start improving performances across all of our systems.