This page contains a list of data collections relevant to computational ethnomusicology. If you are interested in including your own collection, please fill out the request form.

The COFLA corpus is a collection of audio descriptors and metadata for more than 1800 flamenco recordings, created for the development and evaluation of computational methods and studies targeting flamenco music. It includes three annotated subsets: Cante2midi (singing melody), Cante100 (style) and CanteFAN (repeated patterns). Audio files are shared on request for research purposes.

CompMusic project Datasets
Several datasets covering various music cultures have been created and made publicly available within the CompMusic project: Indian Music Tonic Dataset, Carnatic Varnam Dataset, Carnatic Music Rhythm Dataset, Hindustani Music Rhythm Dataset (DS), Mridangam Stroke DS, Mridangam Tani-avarthanam DS, Tabla Solo DS, Turkish Makam Symbolic Phrase DS, Turkish şarki vocal DS, Turkish makam acapella sections DS, Turkish Makam Audio-Score Alignment DS, Turkish Makam Section DS, Turkish Makam Tonic DS, Turkish Makam Melodic Phrase DS, Beijing Opera Percussion Instrument DS, Beijing Opera Percussion Pattern DS.

Meertens Tune Collections
The Meertens Tune Collections contain 7000+ Dutch folk song recordings, 4000+ digitized notated Dutch folk songs (based on transcriptions and song books), and 2000+ digitized notations of instrumental music from the Netherlands (18th century).

Turkish Makam Music Symbolic Data Collection: SymbTr is a collection of machine-readable symbolic scores intended for computational studies of Turkish makam music. SymbTr is currently the biggest machine-readable collection of Turkish makam music. The latest version of the SymbTr collection consists of 2200 pieces from 155 makams, 88 usuls and 56 forms, with about 865,000 musical notes and 80 hours of nominal playback time. SymbTr scores are provided in text, MusicXML, PDF, MIDI and mu2 formats. An ABC version of this dataset is available here.

Tbilisi State Conservatory Recordings of Artem Erkomaishvili (1966)
This dataset consists of a collection of annotated music recordings performed by Artem Erkomaishvili (1887–1967), one of the last representatives of the master chanters of Georgian music. The recordings were made at the Tbilisi State Conservatory in 1966, and the original audio material is hosted at the Folklore Department, Tbilisi State Conservatory. The website provides segment annotations as well as fundamental frequency (F0) annotations for each of the roughly 100 recordings in a simple CSV format. Furthermore, visualizations and sonifications of the F0 trajectories are provided.
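
As a rough illustration of how F0 annotations in a simple CSV format might be used, the sketch below reads a trajectory and converts it to cents. The column layout (time in seconds, then F0 in Hz, with 0 marking unvoiced frames), the 55 Hz reference and the sample values are all assumptions made for this example; they are not taken from the dataset, so check the dataset's documentation before relying on them.

```python
import csv
import io
import math


def read_f0_csv(handle):
    """Read (time_sec, f0_hz) pairs from a CSV handle.

    Assumes two numeric columns per row; rows that do not parse
    (for example a header line) are skipped.
    """
    trajectory = []
    for row in csv.reader(handle):
        if len(row) < 2:
            continue
        try:
            time_sec, f0_hz = float(row[0]), float(row[1])
        except ValueError:
            continue  # header or malformed row
        trajectory.append((time_sec, f0_hz))
    return trajectory


def hz_to_cents(f0_hz, ref_hz=55.0):
    """Convert a frequency in Hz to cents above ref_hz.

    Returns None for non-positive values, which this sketch treats
    as unvoiced frames (an assumption, not a dataset convention).
    """
    if f0_hz <= 0:
        return None
    return 1200.0 * math.log2(f0_hz / ref_hz)


# Made-up values standing in for a real annotation file.
sample = io.StringIO("0.00,0.0\n0.01,110.0\n0.02,220.0\n")
trajectory = read_f0_csv(sample)
cents = [hz_to_cents(f0) for _, f0 in trajectory]
```

With a real file, `open(path, newline="")` would replace the in-memory `io.StringIO` object. A cent scale is used here because it makes intervals comparable regardless of the absolute pitch of a recording.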

Digital resources for musicology: This website provides links to substantial open-access projects. Digital Resources in Musicology (DRM) is organised topically and provides a rapid search tool for specialties within heterogeneous collections.
