Jupyter notebooks (and similar products such as Google Colab) have become the Swiss Army knife of data scientists. They allow engineers to write and execute code in their browsers. However, these environments do not analyze the code being written and offer no help in finding potential programming issues, which makes bugs harder to detect. In this post, we explain why using tools to detect bugs is important, and we present a new tool that integrates modern code analysis techniques with Jupyter notebooks.
Why verify Python code?
Despite its popularity (Python is ranked 1st in the TIOBE index), Python is a very error-prone language. For example, an indentation error can cause an instruction to execute when it is not supposed to. The lack of static typing makes potential bugs hard to detect before the code runs, because programmers can make wrong assumptions about the type of a variable.
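The two pitfalls described above can be illustrated with a minimal sketch. The function names `count_positive` and `total_price` are hypothetical and exist only for this example.

```python
def count_positive(values):
    """Intended to count positive values, but the increment is mis-indented."""
    positives = []
    count = 0
    for value in values:
        if value > 0:
            positives.append(value)
        count += 1  # bug: meant to be inside the `if`, so it runs for every value
    return count


def total_price(quantity, unit_price):
    """Assumes both arguments are numbers; nothing checks this before runtime."""
    return quantity * unit_price


# Only one value is positive, yet every element is counted.
buggy_count = count_positive([3, -1, -2])

# A string quantity does not raise an error: str * int repeats the string.
repeated = total_price("3", 2)
```

Both calls run without raising an exception, which is why this kind of bug tends to surface late unless a tool or a test flags it first.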
Developers spend a lot of time writing and debugging code. Beyond the initial implementation, the largest cost of operating a system is its maintenance: it is estimated that maintenance accounts for 80% of the cost of operating a system, while initial development accounts for 20%. Maintenance costs also grow as code quality drops: poorly written or poorly tested software will require fixes and updates soon after its release. For these reasons, it is very important to ensure that code is clean, maintainable, and tested.
Many tools help developers detect potential problems even before they execute their code. The next section presents the existing ecosystem.
Existing ecosystem for Python code analysis
Most IDEs come with some code analysis capabilities that validate the syntax of your code. For example, PyCharm checks for both syntax and semantic issues. There are also multiple open-source tools that check Python code from a semantic (e.g. pylint), security (e.g. bandit), or style (e.g. black) perspective. These tools are often integrated into IDEs through custom plugins and extensions.
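To give a feel for what analyzing code without executing it means, here is a minimal sketch built on Python's standard `ast` module. It is not how pylint or bandit are implemented. The `RISKY_CALLS` set and the `find_risky_calls` function are assumptions made for this illustration, and real tools apply far more rules.

```python
import ast

# Illustrative rule set chosen for this sketch, not taken from any real tool.
RISKY_CALLS = {"eval", "exec"}


def find_risky_calls(source):
    """Return sorted (line, name) pairs for direct calls to names in RISKY_CALLS."""
    findings = []
    # ast.parse builds a syntax tree from the text; the code is never run.
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in RISKY_CALLS
        ):
            findings.append((node.lineno, node.func.id))
    return sorted(findings)


sample = "x = input()\nresult = eval(x)\n"
issues = find_risky_calls(sample)  # one finding: the eval call on line 2
```

Because the sample is only parsed, the `input()` call in it never prompts the user. That property is what lets such checks run safely while code is still being written.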
However, these static analysis tools are not integrated with Jupyter notebooks, so developers working in notebooks cannot benefit from them.
Checking code in Jupyter notebooks
As Python developers who regularly use Jupyter for data analysis, we wanted to bring these tools into the Jupyter ecosystem and help developers catch bugs quickly. We implemented a Chrome plugin that analyzes Python code in Jupyter notebooks and reports issues while developers are writing code.
The tool runs static analyzers such as Pylint and Bandit to detect syntax, semantic, and security errors in Python code and reports them directly in the Jupyter notebook. The plugin is currently compatible with Jupyter notebooks and will soon support other platforms such as Google Colab and Amazon SageMaker.
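This post does not describe the extension's internals, so the following is only a rough sketch of the general idea: pull the code cells out of a notebook and check each one without running it. It relies on the notebook file being JSON with a top-level `cells` list, and it uses Python's built-in parser as a stand-in for a full analyzer. The function name `check_notebook_syntax` is hypothetical.

```python
import ast
import json


def check_notebook_syntax(notebook_json):
    """Return (cell_index, line, message) for each code cell that fails to parse."""
    notebook = json.loads(notebook_json)
    problems = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        if isinstance(source, list):
            # Cell source may be stored as a list of lines.
            source = "".join(source)
        try:
            ast.parse(source)
        except SyntaxError as error:
            problems.append((index, error.lineno, error.msg))
    return problems


# A tiny in-memory notebook, made up for this example.
example = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["x = 1\n", "print(x)\n"]},
        {"cell_type": "code", "source": ["def broken(:\n", "    pass\n"]},
    ]
})
problems = check_notebook_syntax(example)  # only the third cell is reported
```

A real integration would need to handle IPython-specific syntax such as `%` magics and `!` shell commands, which the plain Python parser rejects, and would pass each cell to analyzers like Pylint or Bandit instead of stopping at a syntax check.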
Python is the most popular programming language today but writing flawless Python code is hard. Thankfully, multiple tools exist to help developers detect sub-optimal code. By interfacing these tools with Jupyter notebooks, our Chrome extension helps developers detect issues as they write code and fix them quickly, before shipping code into production.
About the author
Julien Delange is the CEO of Codiga, a company that helps developers write better code faster. Julien is an experienced software developer who has worked at Twitter and Amazon Web Services, and he is the author of the book Technical Debt in Practice, published by MIT Press.