Testing Python Data Science Code
As data science grows larger and more complex, there is more data to collect, sort, clean, and model. A growing pain point is that a lot can go wrong when data engineering and development practices are sloppy. This advanced-level course shows data scientists, Python developers, and data analysts how to test data science code written in Python. Veteran data science trainer and consultant Miki Tebeka covers testing techniques, with a focus on issues specific to data science code, such as floating point errors, statistical testing, working with large datasets, and choosing a baseline. After presenting a testing overview, Miki dives into testing with pytest and hypothesis. He explains how to use techniques such as schemas, truth values, and approximate testing in data validation. Miki then goes over regression testing and demonstrates how to test Jupyter Notebooks.
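To give a flavor of the floating point and approximate testing topics mentioned above, here is a minimal sketch. It is not taken from the course: the `mean` function and the test name are hypothetical, and the tolerance is an arbitrary choice. It uses only the standard library's `math.isclose`; pytest offers `pytest.approx` for the same purpose.

```python
import math


def mean(values):
    # Hypothetical function under test (an assumption for illustration).
    return sum(values) / len(values)


def test_mean_is_close():
    # Exact equality is fragile with binary floating point:
    # 0.1 + 0.2 does not compare equal to 0.3.
    assert 0.1 + 0.2 != 0.3

    result = mean([0.1, 0.2, 0.3])
    # Compare within a relative tolerance instead of using ==.
    assert math.isclose(result, 0.2, rel_tol=1e-9)
```

Saved in a file whose name starts with `test_`, a function like this would be collected and run by pytest. The right tolerance depends on the computation being tested.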