A collection of high-quality public recordings of Bach's sonatas and partitas for solo violin (BWV 1001–1006)

salu133445/bach-violin-dataset

Bach Violin Dataset

The Bach Violin Dataset is a collection of high-quality public recordings of Bach's sonatas and partitas for solo violin (BWV 1001–1006). The dataset consists of 6.5 hours of professional recordings by 17 violinists, made in a variety of recording setups. It also provides the reference scores and estimated alignments between the recordings and the scores. The dataset can be downloaded from Zenodo or GitHub.

Contents

  • All recordings were collected from the web, and their source URLs are listed in audio.csv. The files are in either MP3 or OPUS format, depending on the source.
  • Each folder in the audio directory, except the misc folder, corresponds to a collection of recordings with similar recording setups. (Due to copyright concerns, audio files in the shunske-sato and young-talents collections must be downloaded from YouTube.)
  • The reference score for each recording is listed in info.csv. The scores are in MusicXML format, which can be opened with MuseScore, music21, or MusPy.
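
For a quick look at a score without installing any of the tools above, the following is a minimal sketch that uses only the Python standard library. It assumes the .mxl files follow the standard compressed MusicXML layout (a ZIP archive whose META-INF/container.xml names the main score file); the helper names read_mxl_root and count_notes are illustrative and not part of this repository.

```python
import os
import xml.etree.ElementTree as ET
import zipfile


def read_mxl_root(path):
    """Return the root XML element of the score inside a compressed .mxl file.

    `path` may be a filename or a binary file-like object.
    """
    with zipfile.ZipFile(path) as archive:
        # META-INF/container.xml lists the main score file of the archive.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        # Match on the tag suffix so a namespaced container also works.
        rootfile = next(
            (el for el in container.iter() if el.tag.endswith("rootfile")),
            None,
        )
        if rootfile is None:
            raise ValueError("no rootfile entry found in container.xml")
        return ET.fromstring(archive.read(rootfile.get("full-path")))


def count_notes(score_root):
    """Count <note> elements (rests and chord tones included)."""
    return sum(1 for _ in score_root.iter("note"))


if __name__ == "__main__":
    # Example path taken from the dataset layout; skipped if the file is absent.
    example = "scores/bwv1001/bwv1001_mov1.mxl"
    if os.path.exists(example):
        root = read_mxl_root(example)
        print(root.tag, count_notes(root))
```

For anything beyond inspection (durations, voices, pitch spelling), a dedicated library such as music21 or MusPy is likely the better choice.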

Below is the file organization of the dataset.

├─ README                                  README file
├─ audio.csv                               Metadata of the audio files
├─ info.csv                                Metadata of the processed files
├─ audio
│  ├─ emil-telmanyi
│  │  ├─ emil-telmanyi_bwv1001.mp3         Recording
│  │  └─ ...
│  └─ ...
├─ scores
│  ├─ bwv1001
│  │  ├─ bwv1001.mxl                       Reference score (whole piece)
│  │  ├─ bwv1001_mov1.mxl                  Reference score (single movement)
│  │  └─ ...
│  └─ ...
├─ notes
│  ├─ emil-telmanyi
│  │  ├─ emil-telmanyi_bwv1001_mov1.csv    Score as a note sequence
│  │  └─ ...
│  └─ ...
├─ alignments
│  ├─ emil-telmanyi
│  │  ├─ emil-telmanyi_bwv1001_mov1.csv    Estimated alignment
│  │  └─ ...
│  └─ ...
└─ tempos
   ├─ emil-telmanyi
   │  ├─ emil-telmanyi_bwv1001_mov1.txt    Average tempo
   │  └─ ...
   └─ ...
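
The naming pattern in the tree above can be turned into a small path helper. This is only a sketch: it assumes every collection follows the same pattern as the emil-telmanyi examples shown, and the function names dataset_paths and find_audio are hypothetical, not part of the dataset's source code.

```python
from pathlib import Path


def dataset_paths(root, collection, piece, movement):
    """Build the per-movement file paths implied by the layout above."""
    root = Path(root)
    stem = f"{collection}_{piece}_mov{movement}"
    return {
        "score": root / "scores" / piece / f"{piece}_mov{movement}.mxl",
        "notes": root / "notes" / collection / f"{stem}.csv",
        "alignment": root / "alignments" / collection / f"{stem}.csv",
        "tempo": root / "tempos" / collection / f"{stem}.txt",
    }


def find_audio(root, collection, piece):
    """List audio files for a whole piece, whatever their extension.

    Recordings come in MP3 or OPUS format, so the extension is globbed
    rather than hard-coded.
    """
    folder = Path(root) / "audio" / collection
    return sorted(folder.glob(f"{collection}_{piece}.*"))


if __name__ == "__main__":
    # "." assumes the script is run from the dataset's top-level directory.
    for kind, path in dataset_paths(".", "emil-telmanyi", "bwv1001", 1).items():
        status = "found" if path.exists() else "missing"
        print(f"{kind:10s} {path} ({status})")
```

Note that recordings are stored per piece while scores, notes, alignments, and tempos are stored per movement, which is why the audio lookup takes no movement number.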

Source code

The source code for creating this dataset can be found in the src directory, and more details are given in its README. An introduction to the alignment process is available here.

License

All audio files in this dataset are public recordings collected from various sources. The license for each audio file can be found in its parent directory. All derived alignments retain the same licenses as their corresponding audio files. All reference scores are in the public domain. All code is licensed under the MIT License.

Citation

Please cite the following paper if you use the dataset or code provided.

Hao-Wen Dong, Cong Zhou, Taylor Berg-Kirkpatrick, and Julian McAuley, "Deep Performer: Score-to-Audio Music Performance Synthesis," Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022

Paper

Deep Performer: Score-to-Audio Music Performance Synthesis
Hao-Wen Dong, Cong Zhou, Taylor Berg-Kirkpatrick, and Julian McAuley
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022
[homepage] [paper] [reviews]
