My research is in applied mathematics. Specifically, I work at the intersection of network theory and machine learning. Working with high-dimensional and often noisy datasets, I seek to extract salient information from each data point to inform meaningful comparisons between the data points. Viewing my data as networks is essential to the techniques that I create. This network point of view introduces natural questions such as "Which data points cluster together?" and "What relationship does network structure have to the high-dimensional data domain at hand?" These kinds of questions are naturally approached with theory from the fields of numerical linear algebra, statistical learning, complex networks, and machine learning.
To be more precise, I work with data based on cultural artifacts, specifically sets of songs. To analyze a dataset, I begin by analyzing each data point individually and creating aligned hierarchies for it. These aligned hierarchies are created by identifying repeated structure at a variety of sizes, and they are smaller, coarser representations of the original data. I then use the aligned hierarchies to create a network representation of the whole dataset, and I perform the analysis on that network.
Currently, I am working with data based on Mazurkas, so my work is relevant to the field of Music Information Retrieval (MIR). My approach treats the data, in this case the songs, as a complex network on which mathematical theory and machine learning principles can naturally be applied. This makes my work of interest to both the mathematical and machine learning communities. Additionally, this network approach frees me to find structure without limiting myself to well-known and well-studied musical objects such as chords or codas.
Put simply, my present goal is to compare songs without listening to them. In this work, my song comparisons are task-dependent: I may be looking for exact matches of a recording, for cover songs of a specific song, or for remixes of all or part of a particular song. In MIR, these different comparisons are called tasks. My approach, regardless of task, is to build a representation space for the dataset from the individual multiscale signatures that represent each song. To create the aligned hierarchies for a song, my algorithm finds repeats in the song matrices at a variety of sizes.
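To make the idea of finding repeats at a variety of sizes concrete, here is a minimal sketch in Python. It is not the aligned-hierarchies algorithm itself. It assumes a song has already been reduced to a sequence of discrete labels (one per beat or frame), builds a binary self-similarity matrix from that sequence, and looks for runs of matches along diagonals. The function names `find_diagonal_repeats` and `repeats_at_all_sizes` and the toy sequence are hypothetical and used only for illustration.

```python
import numpy as np


def find_diagonal_repeats(binary_sim, length):
    """Return start pairs (i, j), i < j, where a run of `length` matches
    lies along a diagonal of a binary self-similarity matrix.

    A run starting at (i, j) means the segment beginning at time i is
    repeated by the segment beginning at time j.
    """
    n = binary_sim.shape[0]
    repeats = []
    for i in range(n - length + 1):
        for j in range(i + 1, n - length + 1):
            if all(binary_sim[i + k, j + k] for k in range(length)):
                repeats.append((i, j))
    return repeats


def repeats_at_all_sizes(sequence):
    """Collect repeats of every length that occurs in a label sequence.

    Assumption: the song is given as a sequence of discrete labels.
    Real audio would need a feature-based, thresholded matrix instead.
    """
    seq = np.asarray(sequence)
    # Entry (i, j) is True when time steps i and j carry the same label.
    binary_sim = seq[:, None] == seq[None, :]
    found = {}
    for length in range(1, len(seq)):
        repeats = find_diagonal_repeats(binary_sim, length)
        if repeats:
            found[length] = repeats
    return found


# Toy example: the pattern 0, 1, 2 occurs twice, followed by a new label.
toy_song = [0, 1, 2, 0, 1, 2, 3]
repeats_by_size = repeats_at_all_sizes(toy_song)
```

In this toy example the length-3 repeat starting at times 0 and 3 also shows up as shorter repeats contained inside it. Deciding which of these overlapping repeats to keep, and how to align them, is the part of the work that this sketch leaves out.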
We note that time-sequenced data, like a recording of a performance of a song, has a beginning and an end, represented by the knob and arrow, respectively. Additionally, in this image, detected relevant structure and repetitions are denoted with matching colored boxes.
In this representation, we think of each repetition as a location that we are visiting. In this example, the song begins, then we visit location 1, leave and return to location 1, leave location 1, visit location 2, and so on. More compactly, we can write [1,1,2,1,2,3,2].
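One way to read the compact sequence above as a network is to count how often each location is followed by each other location. The sketch below does this in Python. It is only an illustration of the "visiting locations" picture, not the construction used in my research, and the helper name `transition_counts` is hypothetical.

```python
from collections import Counter


def transition_counts(visits):
    """Count each consecutive (from_location, to_location) pair.

    The result can be read as a weighted, directed network whose nodes
    are locations and whose edge weights are visit-to-visit transitions.
    """
    return Counter(zip(visits, visits[1:]))


# The visit sequence from the example above.
visits = [1, 1, 2, 1, 2, 3, 2]
edges = transition_counts(visits)

for (source, target), weight in sorted(edges.items()):
    print(f"{source} -> {target}: {weight}")
```

For this sequence, the transition from location 1 to location 2 occurs twice, and every other observed transition occurs once.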
To learn more about my research, please download my research statement.