Tools of the trade: Using Python Interactive in VS Code

[Image: IMG_0293]

This is the ninth post in a series of blog posts on the theme of “Tools of the trade”. The series covers software tools, statistical concepts, data science techniques, and related items. In all cases, the topic contributes to accomplishing data science tasks. The target audience is engineers, analysts, and managers who want to build their knowledge and skills in data science, particularly those in the Microsoft Dynamics ecosystem.

This post describes my preferred way of doing interactive Python development in VS Code.

What is it?

The Python Interactive window in VS Code is a way to develop using Jupyter-like cells. A ‘magic’ comment of “#%%” in a *.py file starts a new cell and adds a Run Cell | Run Below | Debug Cell menu above it.

[Screenshot: PreProcess.py open in the VS Code editor, with the Run Cell | Run Below | Debug Cell menu above a cell containing print('hello world')]
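To make the cell marker concrete, here is a minimal sketch of what a cell-based *.py file might look like. The variable names and cell titles are my own illustrations, not taken from the screenshot. It assumes, as I understand the feature, that cells run in one shared session so names defined in an earlier cell are still available later.

```python
# %% Cell 1: define some data
# The "# %%" comment (the "#%%" spelling works as well) starts a new cell.
# Any text after the marker is treated as an optional cell title.
greeting = "hello world"
print(greeting)

# %% Cell 2: reuse state from the previous cell
# Assumption: both cells run in the same interactive session, so
# `greeting` from Cell 1 is still defined here.
shout = greeting.upper()
print(shout)
```

Because the markers are ordinary comments, the same file still runs top to bottom as a regular script with `python` on the command line.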

Running the cell opens the interactive window, which shows the results of the execution along with a REPL (read-eval-print loop) command prompt for entering and executing Python statements directly.

How do I use it?

This is my preferred Python development approach for exploring datasets, creating plots, and initial data wrangling and ML work. Productizing or automating the code involves a more structured coding approach.
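As a rough sketch of that exploratory style, the file below splits loading, wrangling, and summarizing into separate cells so each step can be rerun on its own. The dataset is made up purely for illustration (the column names `day` and `load_kw` are hypothetical), and it sticks to the standard library so it runs anywhere. In real work I would more likely reach for pandas and a plotting library.

```python
# %% Load a small, made-up dataset (hypothetical values, for illustration only)
import csv
import io
import statistics

RAW_CSV = """day,load_kw
1,120.5
2,133.0
3,
4,128.25
"""

# Parse the CSV text into a list of dictionaries, one per row.
rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
print(rows[:2])

# %% Wrangle: drop rows with a missing reading and convert types
clean = [
    {"day": int(r["day"]), "load_kw": float(r["load_kw"])}
    for r in rows
    if r["load_kw"]  # an empty string means the reading is missing
]
print(len(rows), "rows read,", len(clean), "rows kept")

# %% Explore: summary statistics
loads = [r["load_kw"] for r in clean]
mean_load = statistics.mean(loads)
print("mean:", mean_load, "max:", max(loads))
```

After running the first cell once, the later cells can be edited and rerun repeatedly without reloading the data, which is most of the appeal for initial data wrangling.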

Discussion

My first several years working on data science projects for school and work were R-based. I came to like the RStudio development environment for interactive development. As shown in the screenshot below for some NCAA basketball modeling work, there’s a section of the app for script-based coding, a REPL prompt, variable information, and plots. Each section has multiple tabs to get to more information as needed.

As I’ve started working on Azure ML-based projects at work in the last couple of years, I’ve spent more time using Python. Although Jupyter notebooks can now be executed in VS Code and most samples are notebook-based, I’ve never been a fan of them. One issue is that a notebook interleaves code and results in a single vertical stream, so I find that I’m scrolling a lot and it’s easy to lose context.

I write a fair amount of straight Python code. This is needed when setting up automated execution or productizing some code, but I find it limiting when doing initial interactive work with a dataset.

The interactive mode is a good compromise that comes closest to the RStudio experience. The example layout shown below, from my electrical utility time series posts, shows content similar to the screenshot above.

I only recently discovered the ‘#%%’ magic sequence and there’s significant functionality in the linked page below that I haven’t explored. Check it out if you need to do interactive Python work!

References

Working with Jupyter code cells in the Python Interactive window (visualstudio.com)

Picture details:  Minnesota autumn, 10/6/20, Canon PowerShot G3 X, f/4.5, 1/250, ISO-160