Summary

In this workshop, we covered:

1. Understand how to profile Python code and identify bottlenecks

  • Measure the time of cells, functions, and programs to find bottlenecks e.g., using timeit and line_profiler.

  • Visualise the profiled code e.g., using SnakeViz and pyinstrument.

  • Log profiling information e.g., using Eliot.

  • Consider how fast the code could go e.g., by estimating its complexity with Big O notation.
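As a minimal sketch of the timing step above, the standard-library `timeit` module can compare two ways of doing the same job. The function `count_with_loop` and the test data are hypothetical, chosen only for illustration; the measured times will vary between machines.

```python
import timeit


def count_with_loop(items):
    """Count items manually (a deliberately slow, hypothetical baseline)."""
    count = 0
    for _ in items:
        count += 1
    return count


data = list(range(10_000))

# Run each candidate 200 times and record the total elapsed seconds.
loop_seconds = timeit.timeit(lambda: count_with_loop(data), number=200)
builtin_seconds = timeit.timeit(lambda: len(data), number=200)

print(f"loop:  {loop_seconds:.6f} s for 200 calls")
print(f"len(): {builtin_seconds:.6f} s for 200 calls")
```

Both candidates visit the same data, so the difference in the printed times points at the manual loop as the bottleneck. For line-by-line detail, a profiler such as line_profiler would be the next step.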

2. Understand how to choose the most appropriate data structure, algorithm, and libraries for a problem

  • Make use of the built-in functions e.g., use len rather than counting the items in an object in a loop.

  • Use appropriate data structures e.g., append to lists rather than concatenating, use dictionaries for fast look-ups, cache results in dictionaries or tuples to reduce repeated calculations.

  • Make use of the standard library (much of which is implemented in optimised C) e.g., the math module.

  • See whether there is an algorithm or library that already optimally solves your problem e.g., faster sorting algorithms.
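The data-structure points above can be sketched in a few lines. The names `squares`, `square_of`, `fib_cache`, and `fib` are hypothetical and used only for illustration. This is one possible way to apply the ideas, not the only one.

```python
# Build a list by appending in place rather than concatenating,
# which would copy the whole list on every iteration.
squares = []
for n in range(5):
    squares.append(n * n)

# A dictionary gives average constant-time look-up by key.
square_of = {n: n * n for n in range(5)}

# Cache results in a dictionary so repeated inputs are not recomputed.
fib_cache = {}


def fib(n):
    """Return the n-th Fibonacci number, caching intermediate results."""
    if n < 2:
        return n
    if n not in fib_cache:
        fib_cache[n] = fib(n - 1) + fib(n - 2)
    return fib_cache[n]


print(squares)       # [0, 1, 4, 9, 16]
print(square_of[3])  # 9
print(fib(30))       # 832040
```

Without the cache, the recursive `fib` would recompute the same values many times over. The standard library's `functools.lru_cache` decorator offers the same idea ready-made.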

3. Improve the execution time of Python code using:

  • Vectorisation

    • Take advantage of broadcasting for different shaped arrays.

    • Use vectorised functions where you can e.g., NumPy ufuncs.

  • Compilers

    • Speed up numerical functions by compiling them with Numba's @njit decorator (nopython mode).
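As a minimal sketch of the vectorisation points above, the example below combines broadcasting with a NumPy ufunc. It assumes NumPy is installed; the array names `column`, `row`, `table`, and `roots` are hypothetical.

```python
import numpy as np

column = np.arange(3).reshape(3, 1)  # shape (3, 1)
row = np.arange(4)                   # shape (4,)

# Broadcasting stretches both arrays to shape (3, 4) with no explicit loop.
table = column * 10 + row

# np.sqrt is a ufunc: it is applied element-wise in compiled code.
roots = np.sqrt(table)

print(table.shape)  # (3, 4)
print(table[2, 3])  # 23
```

The same result could be produced with two nested Python loops, but the vectorised form pushes the iteration into NumPy's compiled routines.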