Download
You can download the segmented and aligned data for each language by clicking the corresponding link in the table below.
If you use any part of the corpus in your work, please cite the following paper:
A. Stan, O. Watts, Y. Mamiya, M. Giurgiu, R. A. J. Clark, J. Yamagishi, S. King, TUNDRA: A Multilingual Corpus of Found Data for TTS Research Created with Light Supervision, In Proc. Interspeech, Lyon, France, August 2013
Please read the README file before downloading the corpus. It contains a detailed description of the archives, as well as the licence information.
You can also download the tools used to create this corpus from HERE
| Language | Code | Title | Author | Segmented audio and text | Chapter-level annotation |
|---|---|---|---|---|---|
| Bulgarian | BG | Zhetvariat | Yordan Yovkov | DOWNLOAD [969MB] | DOWNLOAD |
| Danish | DA | Grimms eventyr I udvalg | Grimm Brothers | DOWNLOAD [194MB] | DOWNLOAD |
| Dutch | NL | Anna Karenina | Leo Tolstoy | DOWNLOAD [1.1GB] | DOWNLOAD |
| English | EN | Living Alone | Stella Benson | DOWNLOAD [562MB] | DOWNLOAD |
| Finnish | FI | Rautatie | Juhani Aho | DOWNLOAD [606MB] | DOWNLOAD |
| French | FR | Candide | Voltaire | DOWNLOAD [520MB] | DOWNLOAD |
| German | DE | Das Bildnis des Dorian Gray | Oscar Wilde | DOWNLOAD [953MB] | DOWNLOAD |
| Hungarian | HU | Egri csillagok | Geza Gardonyi | DOWNLOAD [1.2GB] | DOWNLOAD |
| Italian | IT | Galatea | Anton Giulio Barrili | DOWNLOAD [668MB] | DOWNLOAD |
| Polish | PL | Siedem wybranych opowiadan | Wladyslaw Orkan | DOWNLOAD [629MB] | DOWNLOAD |
| Portuguese | PT | Senhora | Jose de Alencar | DOWNLOAD [1.2GB] | DOWNLOAD |
| Romanian* | RM | Mara | Ioan Slavici | DOWNLOAD [1.5GB] | DOWNLOAD |
| Russian | RU | Ucheniye Khrista | Leo Tolstoy | DOWNLOAD [358MB] | DOWNLOAD |
| Spanish** | ES | Don Quijote de la Mancha | Miguel de Cervantes | - | DOWNLOAD |
** Only the first 35 chapters from the first part were used for alignment. The data cannot be redistributed, so please download the files from the original source listed on the About page.
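
Before downloading several languages, it can help to estimate how much disk space the archives need. The following is a minimal sketch that adds up the archive sizes listed in the table above. The names `SIZES`, `size_in_mb` and `total_mb` are hypothetical helpers, not part of the corpus tools, and the conversion assumes decimal units (1 GB = 1000 MB), which the page does not state. The listed sizes are rounded, so the result is only approximate.

```python
import re

# Archive sizes copied from the table above, keyed by the corpus language code.
# Spanish (ES) is omitted because its audio is not redistributed.
SIZES = {
    "BG": "969MB", "DA": "194MB", "NL": "1.1GB", "EN": "562MB",
    "FI": "606MB", "FR": "520MB", "DE": "953MB", "HU": "1.2GB",
    "IT": "668MB", "PL": "629MB", "PT": "1.2GB", "RM": "1.5GB",
    "RU": "358MB",
}

# Assumption: decimal units (1 GB = 1000 MB); the page does not say which.
MB_PER_UNIT = {"MB": 1, "GB": 1000}


def size_in_mb(label: str) -> int:
    """Convert a label such as '969MB' or '1.1GB' to whole megabytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(MB|GB)", label)
    if match is None:
        raise ValueError(f"unrecognised size label: {label!r}")
    value, unit = match.groups()
    return round(float(value) * MB_PER_UNIT[unit])


def total_mb(codes) -> int:
    """Sum the listed archive sizes for the given language codes."""
    return sum(size_in_mb(SIZES[code]) for code in codes)


if __name__ == "__main__":
    everything = total_mb(SIZES)
    print(f"All languages: about {everything / 1000:.1f} GB")
    print(f"English and French only: about {total_mb(['EN', 'FR'])} MB")
```

The figures cover only the compressed "Segmented audio and text" archives; the chapter-level annotation files and any space needed after extraction are not included, since the table gives no sizes for them.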

This work is licensed under a Creative Commons Attribution 3.0 Unported License. The underlying audio and text are subject to their source licenses, so please check the links before using the data.