5. Music Structure Analysis

5.1 Introduction

General goal: Divide an audio recording into temporal segments corresponding to musical parts, and group these segments into musically meaningful categories.

Examples:

  • Stanzas of a folk song
  • Intro, verse, chorus, bridge, outro sections of a pop song
  • Exposition, development, recapitulation, coda of a sonata
  • Musical form ABACADA ... of a rondo

Challenge: Musical structure arises from several different principles, and a segmentation method has to account for each of them.

  • Homogeneity: Consistency in tempo, instrumentation, key, ...
  • Novelty: Sudden changes, surprising elements ...
  • Repetition: Repeating themes, motives, rhythmic patterns ...

Feature Representations: Convert an audio recording into a mid-level representation that captures certain musical properties while suppressing others.

The features are extracted from the entire recording, and the structure is then derived from the resulting feature sequence.

5.2 Self-Similarity Matrix (SSM)

General idea: Compare each element of the feature sequence with each other element of the feature sequence based on a suitable similarity measure.
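The general idea can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it assumes the feature sequence is a list of equal-length numeric vectors and picks cosine similarity as the "suitable similarity measure", which is only one possible choice. The names `cosine_similarity`, `self_similarity_matrix`, and `toy_features` are made up for this sketch, and pure Python is used only to keep it self-contained; an array library would be the usual choice for real recordings.

```python
import math


def cosine_similarity(x, y):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return dot / (norm_x * norm_y)


def self_similarity_matrix(features):
    """Return the N x N matrix S with S[n][m] = similarity of frame n and frame m."""
    n = len(features)
    return [[cosine_similarity(features[i], features[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical toy sequence: two frames of a part "A", two of "B", then "A" again.
toy_features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
ssm = self_similarity_matrix(toy_features)
for row in ssm:
    print(" ".join(f"{value:.1f}" for value in row))
```

The printed matrix is symmetric with ones on the main diagonal. The two "A" parts show up as blocks of high similarity both on and off the diagonal.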

e.g. Brahms Hungarian Dance No. 5 (Ormandy)

Figure: How the SSM is calculated

The corresponding idealized SSM, which exaggerates the important structures and can be approximated by applying SSM enhancement, looks like this:

Figure: Idealized SSM exaggerating the features

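  • Blocks: Homogeneity (a segment whose features stay similar throughout)
  • Paths: Repetition (stripes parallel to the main diagonal, where one segment recurs later in time)
  • Corners of Blocks: Novelty (boundaries where one homogeneous segment ends and the next begins)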
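One common way to locate block corners, which these notes do not cover, is to slide a checkerboard kernel along the main diagonal of the SSM and read off the peaks of the resulting novelty curve. The sketch below assumes an unweighted kernel of +1 and -1 entries and a toy SSM built from hypothetical frame labels. The names `checkerboard_kernel`, `novelty_curve`, and `half_width` are made up for illustration.

```python
def checkerboard_kernel(half_width):
    """2L x 2L kernel: +1 in the two diagonal quadrants, -1 in the off-diagonal ones."""
    size = 2 * half_width
    return [[1.0 if (i < half_width) == (j < half_width) else -1.0
             for j in range(size)]
            for i in range(size)]


def novelty_curve(ssm, half_width):
    """Correlate the kernel with the SSM along its main diagonal.

    novelty[c] is high when frames c - half_width .. c - 1 and frames
    c .. c + half_width - 1 are each self-similar but dissimilar to each other.
    Positions too close to the start or end are left at 0.0.
    """
    n = len(ssm)
    size = 2 * half_width
    kernel = checkerboard_kernel(half_width)
    novelty = [0.0] * n
    for center in range(half_width, n - half_width + 1):
        start = center - half_width
        novelty[center] = sum(kernel[i][j] * ssm[start + i][start + j]
                              for i in range(size) for j in range(size))
    return novelty


# Hypothetical idealized SSM: frames with the same label are fully similar.
labels = "AAABBB"
toy_ssm = [[1.0 if a == b else 0.0 for b in labels] for a in labels]

novelty = novelty_curve(toy_ssm, half_width=2)
boundary = max(range(len(novelty)), key=novelty.__getitem__)
print(novelty)   # [0.0, 0.0, 2.0, 8.0, 2.0, 0.0]
print(boundary)  # 3, the first frame of part B
```

A larger `half_width` responds to coarser structural changes and a smaller one to finer changes. On real recordings the curve is usually smoothed and peak-picked rather than reduced to a single maximum.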

Based on these structures in the SSM, a recording can be divided into labeled parts such as A1, A2, B, ...

Tool: MATLAB Similarity Matrix Toolbox

5.3 Audio Thumbnailing

General goal: Determine the most representative section ("Thumbnail", often assumed to be the most repetitive segment) of a given music recording.
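As a rough illustration of "most repetitive segment", the sketch below scores each fixed-length segment by how well it matches every other non-overlapping segment along a straight diagonal of the SSM. This is a deliberately crude stand-in and not the path-family approach mentioned in the TODO below: it assumes a fixed thumbnail length and no tempo variation between repetitions. The names `diagonal_score`, `repetition_score`, and `crude_thumbnail` are hypothetical, as is the toy label sequence.

```python
def diagonal_score(ssm, s, t, length):
    """Mean similarity along the diagonal aligning [s, s+length) with [t, t+length)."""
    return sum(ssm[s + k][t + k] for k in range(length)) / length


def repetition_score(ssm, s, length):
    """Sum of diagonal scores between the segment at s and all non-overlapping segments."""
    n = len(ssm)
    total = 0.0
    for t in range(n - length + 1):
        if abs(t - s) >= length:  # skip the segment itself and overlapping shifts
            total += diagonal_score(ssm, s, t, length)
    return total


def crude_thumbnail(ssm, length):
    """Return (start index of the best-scoring segment, list of all scores)."""
    n = len(ssm)
    scores = [repetition_score(ssm, s, length) for s in range(n - length + 1)]
    best = max(range(len(scores)), key=scores.__getitem__)  # first maximum on ties
    return best, scores


# Hypothetical frame labels: the motif "abc" occurs three times.
labels = "abcxabcyabc"
toy_ssm = [[1.0 if a == b else 0.0 for b in labels] for a in labels]

best_start, scores = crude_thumbnail(toy_ssm, length=3)
print(best_start, labels[best_start:best_start + 3])
```

Here the three occurrences of "abc" tie for the highest score, and the first one is returned. A more realistic method would also let the segment length vary and tolerate repetitions that are stretched or compressed in time.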

TODO: path family