Supplementary Tools for Data Scientists
Seth Weidman

The "Why"

Most data scientists are familiar with the basic tools for data cleaning, data manipulation, and machine learning - especially the excellent pandas, NumPy, and scikit-learn libraries. However, many supplementary libraries complement these core libraries so well that data scientists who are serious about their careers would do well to learn them.

AJAX and jQuery

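Many data scientists know Flask. By itself, Flask lets you build very simple applications. AJAX lets a page exchange data with the server without a full reload, and jQuery simplifies both those requests and updates to the page, so together they let you build significantly more involved applications that are more interactive and visually richer.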

I had wanted to build a program that would solve Sudoku puzzles for some time, mostly for philosophical reasons. This turned out to be a perfect project for learning AJAX and jQuery. Here are two Medium posts explaining in detail how to use AJAX and jQuery to build this Sudoku app:

  • Part 1 covering HTML and CSS.
  • Part 2 covering jQuery, AJAX, D3.js, and a bit of Flask.

Also, check out the GitHub repo here, as well as the final web app here.

A screenshot of the Sudoku app
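
To make the Flask-plus-AJAX split a little more concrete, here is a minimal sketch of the kind of JSON endpoint a jQuery AJAX call could talk to. It is not the code from the Sudoku app: the `/check` route, the `grid` payload key, and the helper names `is_well_formed`, `find_conflicts`, and `create_app` are all hypothetical, chosen only for illustration. Flask is imported inside `create_app` so the pure-Python helpers can be tried without it installed.

```python
def is_well_formed(grid):
    """Return True if grid is a 9x9 list of ints in 0..9 (0 means an empty cell)."""
    return (
        isinstance(grid, list)
        and len(grid) == 9
        and all(
            isinstance(row, list)
            and len(row) == 9
            and all(
                isinstance(v, int) and not isinstance(v, bool) and 0 <= v <= 9
                for v in row
            )
            for row in grid
        )
    )


def find_conflicts(grid):
    """Return sorted [row, col] pairs whose digit repeats in its row, column, or 3x3 box."""
    conflicts = set()
    for r in range(9):
        for c in range(9):
            value = grid[r][c]
            if value == 0:
                continue
            # Every cell sharing a row, column, or box with (r, c).
            peers = (
                [(r, k) for k in range(9)]
                + [(k, c) for k in range(9)]
                + [(3 * (r // 3) + i, 3 * (c // 3) + j)
                   for i in range(3) for j in range(3)]
            )
            if any((pr, pc) != (r, c) and grid[pr][pc] == value
                   for pr, pc in peers):
                conflicts.add((r, c))
    return [list(cell) for cell in sorted(conflicts)]


def create_app():
    """Build a small Flask app exposing the check as a JSON endpoint (hypothetical route)."""
    from flask import Flask, jsonify, request  # imported lazily; see note above

    app = Flask(__name__)

    @app.route("/check", methods=["POST"])
    def check():
        payload = request.get_json(silent=True) or {}
        grid = payload.get("grid")
        if not is_well_formed(grid):
            return jsonify(error="expected a 9x9 grid of integers 0-9"), 400
        return jsonify(conflicts=find_conflicts(grid))

    return app
```

With this running via `create_app().run(debug=True)`, a jQuery `$.ajax` POST to `/check` with `contentType: "application/json"` could send the current board and use the returned `conflicts` list to highlight cells, all without reloading the page.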

Selenium

When I was teaching the Summer 2017 data science immersive program at Metis, I decided to give the lecture on the Selenium library since I hadn’t used it much before. I prepared a short demo of how one could use Selenium - which lets you interact with websites in all the ways you normally would, such as by clicking on things and entering text in forms - to make a reservation on OpenTable.
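
As a rough illustration of what "clicking on things and entering text in forms" looks like in code, here is a minimal sketch using Selenium's Python bindings. It is not the OpenTable demo from the lecture: the URL, the field names, and the helper names `fill_and_submit` and `run_demo` are placeholders made up for this example, and `run_demo` assumes Chrome is installed locally.

```python
def fill_and_submit(driver, url, fields, submit_locator):
    """Open url, type text into each field, click the submit element.

    fields maps a (by, value) locator tuple to the text to type;
    submit_locator is a (by, value) tuple for the element to click.
    Returns the driver's current URL after the click.
    """
    driver.get(url)
    for (by, value), text in fields.items():
        element = driver.find_element(by, value)
        element.clear()          # remove any pre-filled value
        element.send_keys(text)  # type as a user would
    driver.find_element(*submit_locator).click()
    return driver.current_url


def run_demo():
    """Drive a real browser against a placeholder form (hypothetical page)."""
    # Imported here so fill_and_submit can be tried with a stand-in driver.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumption: Chrome is available
    try:
        return fill_and_submit(
            driver,
            "https://example.com/reserve",  # placeholder URL
            {
                (By.NAME, "party_size"): "2",   # hypothetical field names
                (By.NAME, "date"): "2017-08-01",
            },
            (By.CSS_SELECTOR, "button[type='submit']"),
        )
    finally:
        driver.quit()  # always close the browser
```

Keeping the locators as `(by, value)` tuples means the same helper can be pointed at a different site by changing only the data passed in, which is convenient when a page's markup changes.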

The students - as Metis students so often do - took this and ran with it, doing amazing projects using Selenium in the week and a half following the lecture.

Here is the GitHub repo containing the Selenium OpenTable example and the two projects that the students did. I was very proud of them!

Me with my students just after we spoke at the ChiPy Meetup.