by Frederic Font

Audio Commons Datasets

During the development of the Audio Commons project we have created a number of data collections of interest for research purposes. The complete list of these data collections, with information about their availability and preservation policies, can be found in our Data Management Plan. The plan includes 33 entries covering a wide range of topics. What follows is a selection of those entries that are of more general interest and are openly available. Please check the Data Management Plan for full details.

  • Requirements survey: Responses to a survey we launched to collect user requirements for the design of Audio Commons services. This includes the responses of 600+ participants. Available in Zenodo.
  • Tempo, key and pitch ground truth data for music samples: Ground truth data used for the development and evaluation of the tools for the analysis of music samples (Audio Commons Audio Extractor). Available in Zenodo.
  • MediaEval AcousticBrainz Genre: Genre and subgenre annotations of music recordings extracted from four different online metadata sources, including editorial metadata databases maintained by music experts and enthusiasts (AllMusic and Discogs) as well as collaborative music tagging platforms (Last.fm and Tagtraum). Full dataset descriptions and links available here.
  • Freesound and Jamendo content analyzed with Audio Commons analysis tools: This includes the results of extracting musical properties from the Freesound and Jamendo catalogs using the Audio Commons audio analysis tools. Freesound analysis available here, Jamendo analysis available here.
  • Timbral Characterisation Tool Development Dataset: This dataset contains data generated for the development and evaluation of timbral models. Data comprise Max-based listening test interfaces, audio files, test results in the form of CSV files, and documentation. Available in Zenodo.
  • FSDKaggle2018: Audio dataset containing 11,073 audio files annotated with labels from 41 general audio categories from Google’s AudioSet Ontology. Available in Zenodo.



Cover image credits: IMG_7039 by Marcin Ignac, posted on Flickr under CC-BY-NC-ND license.