ÌròyìnSpeech – ELRA Catalogue

Last view: 2025-07-06


ÌròyìnSpeech

ISLRN: 012-405-700-001-6

ID: ELRA-S0492

A modern, high-fidelity, multi-speaker, Yorùbá read speech corpus suitable for Speech Synthesis, Automatic Speech Recognition and Computational Linguistics research. The subject matter is drawn from the Broadcast News domain as well as fictional texts, delivering a multi-purpose, contemporary speech dataset.
The corpus consists of 34,000 read sentences, amounting to 42 hours of audio recorded at 48 kHz in 16-bit linear PCM WAV format, for approximately 12.5 gigabytes.
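
The format stated above (48 kHz, 16-bit linear PCM WAV) can be checked on any individual recording once the corpus has been obtained. The following is a minimal sketch using only Python's standard `wave` module. The function names `describe_wav` and `matches_listing` and the example file name are hypothetical and not part of the ELRA distribution, and the channel count is simply reported because the listing does not state it.

```python
import wave


def describe_wav(path):
    """Read basic format facts from the header of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,  # bytes -> bits
            "channels": wav.getnchannels(),
            "duration_s": frames / rate,
        }


def matches_listing(info):
    """True if the file has the sample rate and bit depth given in the listing."""
    return info["sample_rate_hz"] == 48000 and info["bits_per_sample"] == 16
```

For example, `matches_listing(describe_wav("example.wav"))` would report whether a given file (the file name is a placeholder) agrees with the catalogue entry. Summing `duration_s` over all files would give a total to compare against the listed 42 hours.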


MEMBER                                        Academic      Commercial
Licence: Non Commercial Use - ELRA END USER   0.00 €        11200.00 €
Licence: Commercial Use - ELRA VAR            11200.00 €    11200.00 €

NON MEMBER                                    Academic      Commercial
Licence: Non Commercial Use - ELRA END USER   0.00 €        12000.00 €
Licence: Commercial Use - ELRA VAR            12000.00 €    12000.00 €

Distribution

Availability start date: 17/05/2024

Contact Person: Valérie Mapelli

Audio

Monolingual audio corpus

Languages

Yoruba

Linguality

Linguality type: Monolingual

Size

34,000 Sentences

Effective speech duration

42 Hours

Metadata

Created: 17/05/2024

Last Updated: 17/05/2024

Metadata Language: French, English (fr, en)

Version

Version: 1.0
