Data science happens in code. The ability to write reproducible, robust, scalable code is key to a data science project's success, and it is absolutely essential for those working with production code. This practical book bridges the gap between data science and software engineering, and clearly explains how to apply the best practices from software engineering to data science.

Examples are provided in Python, drawn from popular packages such as NumPy and pandas. If you want to write better data science code, this guide covers the essential topics that are often missing from introductory data science or coding classes, including how to:

- Understand data structures and object-oriented programming
- Clearly and skillfully document your code
- Package and share your code
- Integrate data science code with a larger code base
- Write APIs
- Create secure code
- Apply best practices to common tasks such as testing, error handling, and logging
- Work more effectively with software engineers
- Write more efficient, maintainable, and robust code in Python
- Put your data science projects into production
- And more

About the Author

Catherine Nelson is a freelance data scientist and writer. She is currently working on the forthcoming O'Reilly book "Software Engineering for Data Scientists". Previously, she was a Principal Data Scientist at SAP Concur, where she delivered production machine learning applications and developed innovative new features using NLP. She is also co-author of the O'Reilly publication "Building Machine Learning Pipelines", and she is an organizer for Seattle PyLadies, supporting women who code in Python. In her previous career as a geophysicist, she studied ancient volcanoes and explored for oil in Greenland. Catherine has a PhD in geophysics from Durham University and a Masters of Earth Sciences from Oxford University.