Resource Type:
Corpus: | |
Lexical/Conceptual: | |
Tool/Service: | |
Language Description: |
Media Type:
Text: | |
Audio: | |
Image: | |
Video: | |
Text Numerical: | |
Text N-Gram: |
42 Language Resources (Page 1 of 3)
« Previous | Next »Order by:
- Arabic
ID: ELRA-W0030
ISLRN: 365-777-769-398-7The corpus was developed in the course of a research project at the University of Essex, in collaboration with the Open University. The corpus contains Al-Hayat newspaper articles with value added for Language Engineering and Information Retrieval applications development purposes. The data have ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
480.00 €
|
960.00 €
|
Licence: Commercial Use - ELRA VAR |
960.00 €
|
960.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
720.00 €
|
1440.00 €
|
Licence: Commercial Use - ELRA VAR |
1440.00 €
|
1440.00 €
|
- Arabic
- English
- French
ID: ELRA-W0323
ISLRN: 482-848-308-105-6The annotated tweet corpus in Arabizi, French and English was built by ELDA on behalf of INSA Rouen Normandie (Normandie Université, LITIS team), in the framework of the SAPhIRS project (System for the Analysis of Information Propagation in Social Networks), funded by the DGE (Direction Générale ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
|
7000.00 €
|
Licence: Commercial Use - ELRA VAR |
7000.00 €
|
7000.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
|
10000.00 €
|
Licence: Commercial Use - ELRA VAR |
10000.00 €
|
10000.00 €
|
- Arabic
ID: ELRA-S0384
ISLRN: 866-568-447-697-8This speech corpus has been developed as part of a PhD work carried out by Nawar Halabi at the University of Southampton. The corpus was recorded through a Neumann TLM 103 Studio Microphone by one male speaker in South Levantine Arabic (Damascian accent) in a professional studio. The transcript w...
MEMBER | academic | commercial |
---|---|---|
Licence: Commercial Use - ELRA VAR |
9000.00 €
| |
Licence: Attribution - CC-BY |
0.00 €
|
0.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Commercial Use - ELRA VAR |
11200.00 €
| |
Licence: Attribution - CC-BY |
0.00 €
|
0.00 €
|
- Arabic
ID: ELRA-S0315
ISLRN: 919-064-571-056-1A-SpeechDB© is an Arabic speech database suited for training acoustic models for Arabic phoneme-based speaker-independent automatic speech recognition systems. The database contains about 20 hours of continuous speech recorded through one desktop omni microphone by 205 native speakers from Egypt ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
750.00 €
|
5000.00 €
|
Licence: Commercial Use - ELRA VAR |
5000.00 €
|
5000.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
1000.00 €
|
7500.00 €
|
Licence: Commercial Use - ELRA VAR |
7500.00 €
|
7500.00 €
|
- Arabic
ID: ELRA-S0308
ISLRN: 336-447-731-497-3The Egyptian Arabic Speecon database is divided into 2 sets: 1) The first set comprises the recordings of 550 adult Egyptian speakers of Modern Standard Arabic as spoken in Egypt (273 males, 277 females), recorded over 4 microphone channels in 4 recording environments (office, entertainment, car...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
50000.00 €
|
67000.00 €
|
Licence: Commercial Use - ELRA VAR |
67000.00 €
|
67000.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
60000.00 €
|
75000.00 €
|
Licence: Commercial Use - ELRA VAR |
75000.00 €
|
75000.00 €
|
- Arabic
- Bulgarian
- Chinese
- Croatian
- Czech
- French
- German
- Hausa
- Japanese
- Korean
- Polish
- Portuguese
- Russian
- Spanish; Castilian
- Swahili (macrolanguage)
- Swedish
- Tamil
- Thai
- Turkish
- Ukrainian
- Vietnamese
ID: ELRA-S0400
ISLRN: 331-592-378-424-7The GlobalPhone 2000 Speaker Package contains transcribed read speech spoken by 2000 native speakers in 22 languages. The data are sampled from the GlobalPhone Speech and Text Data available in the ELRA Catalogue, i.e.: Arabic (ELRA-S0192), Bulgarian (ELRA-S0319), Chinese-Mandarin (ELRA-S0193), C...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
1200.00 €
|
6000.00 €
|
Licence: Commercial Use - ELRA VAR |
6000.00 €
|
6000.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
1400.00 €
|
7200.00 €
|
Licence: Commercial Use - ELRA VAR |
7200.00 €
|
7200.00 €
|
- Arabic
ID: ELRA-S0192
ISLRN: 720-473-726-952-8The GlobalPhone corpus developed in collaboration with the Karlsruhe Institute of Technology (KIT) was designed to provide read speech data for the development and evaluation of large continuous speech recognition systems in the most widespread languages of the world, and to provide a uniform, mu...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
600.00 €
|
3000.00 €
|
Licence: Commercial Use - ELRA VAR |
3000.00 €
|
3000.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
700.00 €
|
3600.00 €
|
Licence: Commercial Use - ELRA VAR |
3600.00 €
|
3600.00 €
|
Special offers are also available. Check here for details.
- Arabic
- Bulgarian
- Chinese
- Croatian
- Czech
- French
- German
- Hausa
- Japanese
- Korean
- Polish
- Portuguese
- Russian
- Spanish; Castilian
- Swahili (macrolanguage)
- Swedish
- Tamil
- Thai
- Turkish
- Ukrainian
- Vietnamese
ID: ELRA-S0399
ISLRN: 204-945-263-927-6The GlobalPhone Multilingual Model Package contains about 22 hours of transcribed read speech spoken by native speakers in 22 languages. The data are sampled from the GlobalPhone Speech and Text Data available in the ELRA Catalogue, i.e.: Arabic (ELRA-S0192), Bulgarian (ELRA-S0319), Chinese-Manda...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
1200.00 €
|
6000.00 €
|
Licence: Commercial Use - ELRA VAR |
6000.00 €
|
6000.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
1400.00 €
|
7200.00 €
|
Licence: Commercial Use - ELRA VAR |
7200.00 €
|
7200.00 €
|
- Arabic
ID: ELRA-W0049
ISLRN: 124-139-628-259-2This corpus contains 102,960 vowelised, lemmatised and tagged words (58 texts from Le Monde Diplomatique Arabic, see also ELRA-W0036-04). To each text are associated 3 files : - raw text in Arabic, - vowelized text in Arabic, - one XML file containing the morphological annotation of the text. ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
185.00 €
|
975.00 €
|
Licence: Commercial Use - ELRA VAR |
975.00 €
|
975.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
400.00 €
|
2000.00 €
|
Licence: Commercial Use - ELRA VAR |
2000.00 €
|
2000.00 €
|
- Arabic
- English
- French
ID: ELRA-E0045
ISLRN: 364-018-517-901-2The MAURDOR project consists in evaluating systems for automatic processing of written documents. Collected written documents are scanned documents (printed, typewritten or manuscripts). In order to get images for the evaluation of automatic analysis systems, 10,000 original documents were c...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
500.00 €
|
10000.00 €
|
Licence: Evaluation Use - ELRA EVALUATION |
5000.00 €
| |
Licence: Commercial Use - ELRA VAR |
10000.00 €
|
10000.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
750.00 €
|
15000.00 €
|
Licence: Evaluation Use - ELRA EVALUATION |
7500.00 €
| |
Licence: Commercial Use - ELRA VAR |
15000.00 €
|
15000.00 €
|
- Arabic
ID: ELRA-S0404
ISLRN: 938-639-614-524-5The MGB-5 Moroccan Dialect comprises 14 hours of Moroccan Arabic speech extracted from 93 YouTube videos distributed across seven genres: comedy, cooking, family/children, fashion, drama, sports, and science clips. Given that dialectal Arabic does not have a clearly defined orthography, differ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
|
1500.00 €
|
Licence: Commercial Use - ELRA VAR |
1500.00 €
|
1500.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
|
2000.00 €
|
Licence: Commercial Use - ELRA VAR |
2000.00 €
|
2000.00 €
|
- Arabic
ID: ELRA-W0078
ISLRN: 398-979-151-557-0The NE3L project (Named Entities 3 Languages) consisted in annotating several corpora with different languages with named entities. Text format data were extracted from newspapers and deal with various topics. 3 different languages were annotated: Arabic, Chinese and Russian. For this project, 5...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
5000.00 €
|
5000.00 €
|
Licence: Commercial Use - ELRA VAR |
5000.00 €
|
5000.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
5000.00 €
|
5000.00 €
|
Licence: Commercial Use - ELRA VAR |
5000.00 €
|
5000.00 €
|
- Arabic
ID: ELRA-S0219
ISLRN: 479-507-036-103-9This corpus was produced within the NEMLAR project (http://www.nemlar.org). Two other resources, produced within the same project, are also available: NEMLAR Written Corpus (ELRA-W0042) and the NEMLAR Speech Synthesis Corpus (ELRA-S0220). The Nemlar Broadcast News Speech Corpus consists of about...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
150.00 €
|
500.00 €
|
Licence: Commercial Use - ELRA VAR |
2000.00 €
|
2000.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
300.00 €
|
1000.00 €
|
Licence: Commercial Use - ELRA VAR |
4000.00 €
|
4000.00 €
|
Special offers are also available. Check here for details.
- Arabic
ID: ELRA-S0220
ISLRN: 361-216-121-305-9This corpus was produced within the NEMLAR project (http://www.nemlar.org). Two other resources, produced within the same project, are also available: NEMLAR Written Corpus (ELRA-W0042) and the NEMLAR Broadcast News Speech Corpus (ELRA-S0219). The NEMLAR Speech Synthesis Corpus contains the reco...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
500.00 €
|
1250.00 €
|
Licence: Commercial Use - ELRA VAR |
5000.00 €
|
5000.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
1000.00 €
|
2500.00 €
|
Licence: Commercial Use - ELRA VAR |
10000.00 €
|
10000.00 €
|
Special offers are also available. Check here for details.
- Arabic
ID: ELRA-W0042
ISLRN: 050-693-158-326-9This corpus was produced within the NEMLAR project (http://www.nemlar.org). Two other resources, produced within the same project, are also available: NEMLAR Broadcast News Speech Corpus (ELRA-S0219) and the NEMLAR Speech Synthesis Corpus (ELRA-S0220). The NEMLAR Written Corpus consists of about...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
150.00 €
|
250.00 €
|
Licence: Commercial Use - ELRA VAR |
1000.00 €
|
1000.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
300.00 €
|
500.00 €
|
Licence: Commercial Use - ELRA VAR |
2000.00 €
|
2000.00 €
|
Special offers are also available. Check here for details.
- Arabic
ID: ELRA-S0157
ISLRN: 663-177-513-755-1The NetDC Arabic BNSC (Broadcast News Speech Corpus) is a corpus developed by ELDA in the framework of the European-funded project Network of Data Centres (NetDC). The project was done in collaboration with the LDC (Linguistic Data Consortium), which has produced a similar corpus from the news br...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
100.00 €
|
1350.00 €
|
Licence: Commercial Use - ELRA VAR |
1350.00 €
|
1350.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
200.00 €
|
2700.00 €
|
Licence: Commercial Use - ELRA VAR |
2700.00 €
|
2700.00 €
|
- Arabic
ID: ELRA-W0127
ISLRN: 305-450-745-774-1Normalized Arabic Fragments for Inestimable Stemming (NAFIS) is an Arabic stemming gold standard corpus composed by a collection of sentences, selected to be representative of Arabic stemming tasks and manually annotated. Indeed, NAFIS is: Comprehensive: The content of NAFIS can be generalized...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
|
0.00 €
|
Licence: Commercial Use - ELRA VAR |
0.00 €
|
0.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
|
0.00 €
|
Licence: Commercial Use - ELRA VAR |
0.00 €
|
0.00 €
|
- Arabic
ID: ELRA-S0190
ISLRN: 627-343-367-534-7The OrienTel Arabic as spoken in Israel database comprises 750 Arabic speakers (375 males, 375 females) recorded over the Israeli fixed and mobile telephone network. This database is partitioned into 2 DVDs. The speech databases made within the OrienTel project were validated by SPEX, the Netherl...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
37875.00 €
|
40000.00 €
|
Licence: Commercial Use - ELRA VAR |
40000.00 €
|
40000.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
39843.00 €
|
43125.00 €
|
Licence: Commercial Use - ELRA VAR |
43125.00 €
|
43125.00 €
|
- Arabic
ID: ELRA-S0221
ISLRN: 036-535-444-454-5The OrienTel Egypt MCA (Modern Colloquial Arabic) database comprises 750 Egyptian speakers (398 males, 352 females) recorded over the Egyptian fixed and mobile telephone network. This database is partitioned into 1 CD and 1 DVD. The speech databases made within the OrienTel project were validated...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
18000.00 €
|
24000.00 €
|
Licence: Commercial Use - ELRA VAR |
24000.00 €
|
24000.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
22500.00 €
|
30000.00 €
|
Licence: Commercial Use - ELRA VAR |
30000.00 €
|
30000.00 €
|
This resource is also available in a bundle. Check here for bundled pricing.
- Arabic
ID: ELRA-S0222
ISLRN: 830-378-677-910-7The OrienTel Egypt MSA (Modern Standard Arabic) database comprises 500 Egyptian speakers (254 males, 246 females) recorded over the Egyptian fixed and mobile telephone network. This database is partitioned into 1 CD and 1 DVD. The speech databases made within the OrienTel project were validated b...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
12000.00 €
|
16000.00 €
|
Licence: Commercial Use - ELRA VAR |
16000.00 €
|
16000.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
15000.00 €
|
20000.00 €
|
Licence: Commercial Use - ELRA VAR |
20000.00 €
|
20000.00 €
|
This resource is also available in a bundle. Check here for bundled pricing.
« Previous | Next »