Pageturner: Insight Data Science Project


I completed the Silicon Valley Insight Data Science Fellowship program in 2015. Part of the program involved quickly developing an app and then demonstrating it at various companies in the area. For this, I built a first-pass version of Pageturner, an app that listens to the user play music and scrolls or turns the pages of the sheet music accordingly. It can also find and display sheet music from a library based on what the user is playing.

Background and Overview

There are three major algorithmic components to this app: (1) auditory note recognition, (2) optical musical notation recognition, and (3) sequence alignment (with errors). The auditory note recognition component is based on a previous exploration described in an earlier blog post, with two kinds of changes: the underlying model was modified, and the code was reworked to use the HTML5 microphone API and to perform the recognition task entirely in the browser (to avoid latency problems with slow internet connections). The optical musical notation recognition component relies heavily on the existing tool Audiveris, with only some minor additional processing to locate the measures in the original image file. As such, the main challenges in building the demonstration Pageturner app were developing the sequence alignment component and putting all of the pieces together into a functioning web-based app.
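To make the sequence alignment component more concrete, here is a minimal Python sketch of global alignment in the style of Needleman-Wunsch, the algorithm this post names further down. It is an illustration under stated assumptions rather than the app's actual code: the function name, the note-name strings, and the match, mismatch and gap scores are all hypothetical choices made for this example.

```python
def needleman_wunsch(played, score, match=1, mismatch=-1, gap=-1):
    """Globally align two note sequences.

    Returns (alignment_score, aligned_played, aligned_score), where None
    marks a gap. The scoring values are illustrative assumptions.
    """
    n, m = len(played), len(score)
    # dp[i][j] holds the best score for aligning played[:i] with score[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if played[i - 1] == score[j - 1] else mismatch
            dp[i][j] = max(
                dp[i - 1][j - 1] + sub,  # match or wrong note
                dp[i - 1][j] + gap,      # extra note played
                dp[i][j - 1] + gap,      # note in the score was skipped
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    aligned_played, aligned_score = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if played[i - 1] == score[j - 1] else mismatch
            if dp[i][j] == dp[i - 1][j - 1] + sub:
                aligned_played.append(played[i - 1])
                aligned_score.append(score[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            aligned_played.append(played[i - 1])
            aligned_score.append(None)
            i -= 1
        else:
            aligned_played.append(None)
            aligned_score.append(score[j - 1])
            j -= 1
    return dp[n][m], aligned_played[::-1], aligned_score[::-1]


# The player skips the F4 that appears in the written score.
total, a_played, a_score = needleman_wunsch(
    ["C4", "D4", "E4", "G4"],
    ["C4", "D4", "E4", "F4", "G4"],
)
```

In this example the skipped note shows up as a None in the aligned played sequence, which is the kind of tolerance for errors that lets a follower keep its place when the performer misses or adds a note.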

Recorded Demo

In the live demos, I described the sequence alignment algorithm in some detail and pulled out a melodica for some real-time playing and following. The recording below gives a bit of a sense of the playing and following part of the demo.

Video: recorded Pageturner demo with melodica

General Overview Schematics

The figure below outlines the various processes performed on the frontend, on the backend, and offline for the case of song recognition.

The next figure outlines the same for the case of note following. Note that the logistic regression model is implemented on the frontend in both cases, but the Needleman-Wunsch algorithm runs on the backend in the former case and on the frontend in the latter.
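One reason a logistic regression model is easy to hand-code on the frontend is that inference is only a dot product followed by a sigmoid. The Python sketch below illustrates that idea with one independent logistic model per note. The post does not describe the features or the model layout, so everything here is an assumption for illustration: the function names, the three toy features, and the weights and biases are all made up.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_note_probabilities(features, weights, biases):
    """Return the estimated probability that each note is sounding.

    Assumes one independent logistic model per note (a hypothetical layout).
    """
    probs = {}
    for note, w in weights.items():
        z = biases[note] + sum(wi * xi for wi, xi in zip(w, features))
        probs[note] = sigmoid(z)
    return probs


# Hypothetical toy model: three spectral-energy features, two candidate notes.
WEIGHTS = {"C4": [4.0, -2.0, -2.0], "E4": [-2.0, 4.0, -2.0]}
BIASES = {"C4": -1.0, "E4": -1.0}

# Energy concentrated in the first feature should favor C4 over E4.
probs = predict_note_probabilities([1.0, 0.0, 0.0], WEIGHTS, BIASES)
```

Because the trained weights are just arrays of numbers, the same few lines translate directly to JavaScript, which is consistent with running the recognition step in the browser.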