1685 Language Resources (Page 46 of 85)


- English
- Polish
ID: ELRA-W0176
ISLRN: 238-889-529-582-8
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. The Polish-English parallel corpus is composed of three ...
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 | 0.00 € | 0.00 € |


- German
ID: ELRA-S0395
ISLRN: 157-037-166-491-1
The Nautilus Speaker Characterization (NSC) Corpus comprises clean microphone recordings of conversational speech from 300 German speakers (126 males and 174 females) aged 18 to 35 years, with no marked dialect/accent. The recordings were performed in the acoustically-isolated room "Nautilus" (wh...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € |


- Arabic
ID: ELRA-W0078
ISLRN: 398-979-151-557-0
The NE3L project (Named Entities 3 Languages) consisted of annotating several corpora in different languages with named entities. Text-format data were extracted from newspapers and deal with various topics. Three languages were annotated: Arabic, Chinese and Russian. For this project, 5...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |


- Chinese
ID: ELRA-W0079
ISLRN: 187-154-782-686-9
The NE3L project (Named Entities 3 Languages) consisted of annotating several corpora in different languages with named entities. Text-format data were extracted from newspapers and deal with various topics. Three languages were annotated: Arabic, Chinese and Russian. For this project, 5...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |


- Russian
ID: ELRA-W0080
ISLRN: 024-620-556-146-2
The NE3L project (Named Entities 3 Languages) consisted of annotating several corpora in different languages with named entities. Text-format data were extracted from newspapers and deal with various topics. Three languages were annotated: Arabic, Chinese and Russian. For this project, 5...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |


- Arabic
ID: ELRA-S0219
ISLRN: 479-507-036-103-9
This corpus was produced within the NEMLAR project (http://www.nemlar.org). Two other resources, produced within the same project, are also available: NEMLAR Written Corpus (ELRA-W0042) and the NEMLAR Speech Synthesis Corpus (ELRA-S0220). The Nemlar Broadcast News Speech Corpus consists of about...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 150.00 € | 500.00 € |
Licence: Commercial Use - ELRA VAR | 2000.00 € | 2000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 300.00 € | 1000.00 € |
Licence: Commercial Use - ELRA VAR | 4000.00 € | 4000.00 € |

Special offers are also available. Check here for details.


- Arabic
ID: ELRA-S0220
ISLRN: 361-216-121-305-9
This corpus was produced within the NEMLAR project (http://www.nemlar.org). Two other resources, produced within the same project, are also available: NEMLAR Written Corpus (ELRA-W0042) and the NEMLAR Broadcast News Speech Corpus (ELRA-S0219). The NEMLAR Speech Synthesis Corpus contains the reco...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 500.00 € | 1250.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 1000.00 € | 2500.00 € |
Licence: Commercial Use - ELRA VAR | 10000.00 € | 10000.00 € |

Special offers are also available. Check here for details.


- Arabic
ID: ELRA-W0042
ISLRN: 050-693-158-326-9
This corpus was produced within the NEMLAR project (http://www.nemlar.org). Two other resources, produced within the same project, are also available: NEMLAR Broadcast News Speech Corpus (ELRA-S0219) and the NEMLAR Speech Synthesis Corpus (ELRA-S0220). The NEMLAR Written Corpus consists of about...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 150.00 € | 250.00 € |
Licence: Commercial Use - ELRA VAR | 1000.00 € | 1000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 300.00 € | 500.00 € |
Licence: Commercial Use - ELRA VAR | 2000.00 € | 2000.00 € |

Special offers are also available. Check here for details.


- Nepali (macrolanguage)
ID: ELRA-W0076
ISLRN: 325-796-965-405-9
The Nepali Monolingual written corpus is one of the 3 resources that constitute the Nepali National Corpus. The Nepali National Corpus was produced in 2006 in the framework of the project Bhasha Sanchar (“language communication”), also known as Nelralec, for Nepali Language Resources and Localiza...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |


- Nepali (macrolanguage)
ID: ELRA-S0368
ISLRN: 688-800-566-571-0
The Nepali Spoken Corpus is one of the 3 resources that constitute the Nepali National Corpus. The Nepali National Corpus was produced in 2006 in the framework of the project Bhasha Sanchar (“language communication”), also known as Nelralec, for Nepali Language Resources and Localization for Educ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |


- Arabic
ID: ELRA-S0157
ISLRN: 663-177-513-755-1
The NetDC Arabic BNSC (Broadcast News Speech Corpus) is a corpus developed by ELDA in the framework of the European-funded project Network of Data Centres (NetDC). The project was done in collaboration with the LDC (Linguistic Data Consortium), which has produced a similar corpus from the news br...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 100.00 € | 1350.00 € |
Licence: Commercial Use - ELRA VAR | 1350.00 € | 1350.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 200.00 € | 2700.00 € |
Licence: Commercial Use - ELRA VAR | 2700.00 € | 2700.00 € |


- English
- French
ID: ELRA-T0362
ISLRN: 761-442-215-246-0
Extended version of ELRA-T0090 GEOBASE. The terms were selected and collated by Dr M.S.N. CARPENTER during the course of his translation activities over the past ten years. The terms have been validated by publication in the scientific literature. Conceived as a bilingual terminological resource,...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 3420.00 € | 4788.00 € |
Licence: Commercial Use - ELRA VAR | 4788.00 € | 4788.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 4788.00 € | 6840.00 € |
Licence: Commercial Use - ELRA VAR | 6840.00 € | 6840.00 € |


- English
ID: ELRA-L0045
ISLRN: 044-694-748-731-5
This is Oxford University Press's most comprehensive single-volume dictionary, with 170,000 entries covering all varieties of English worldwide. The NODE data set constitutes a fully integrated range of formal data types suitable for language engineering and NLP applications: It is available in X...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 6125.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 8750.00 € |


- English
ID: ELRA-L0047
ISLRN: 869-866-137-463-6
The New Oxford Thesaurus of English is a completely new top-of-the-range thesaurus offering more alternative and opposite words than any of its competitors. The synonyms are arranged in order of "relevance" to the look-up word, starting with an individually tagged core synonym, and followed by la...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 4900.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 7000.00 € |


- English
ID: ELRA-L0046
ISLRN: 003-258-865-840-0
The DIMAP version of NODE (first edition) is a machine-tractable version of the machine-readable dictionary files in the DIMAP dictionary maintenance programs, adding syntactic and semantic information in the conversion. In addition, DIMAP provides several mechanisms that will allow research into...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 7000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 10000.00 € |


- Spanish; Castilian
ID: ELRA-S0444
ISLRN: 469-588-696-069-6
1,630 native Spanish speakers of non-Spanish nationality, such as Mexicans and Colombians, participated in the recording with authentic accents. The recorded script was designed by linguists and covers a wide range of topics including generic, interactive, in-vehicle and home. The text is manually proofr...
MEMBER | academic | commercial |
---|---|---|
Licence: Commercial Use - ELRA VAR | 180975.00 € | 180975.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Commercial Use - ELRA VAR | 180975.00 € | 180975.00 € |

Special offers are also available. Check here for details.


- Arabic
ID: ELRA-W0127
ISLRN: 305-450-745-774-1
Normalized Arabic Fragments for Inestimable Stemming (NAFIS) is an Arabic stemming gold standard corpus composed of a collection of sentences, selected to be representative of Arabic stemming tasks and manually annotated. Indeed, NAFIS is: Comprehensive: The content of NAFIS can be generalized...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 € |


- Norwegian
ID: ELRA-S0301
ISLRN: 184-180-634-505-7
EUROM1 is the first really multilingual speech database produced in Europe. Equivalent corpora for each of the European languages were collected with the same number of speakers selected in the same way, and recorded in the same conditions with common file formats. Initially eight European countr...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 800.00 € | 800.00 € |
Licence: Commercial Use - ELRA VAR | 800.00 € | 800.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 1600.00 € | 1600.00 € |
Licence: Commercial Use - ELRA VAR | 1600.00 € | 1600.00 € |


- Norwegian
ID: ELRA-S0081
ISLRN: 231-756-812-990-0
The Norwegian SpeechDat(II) FDB-1000 comprises 1016 Norwegian speakers (517 males, 499 females) recorded over the Norwegian fixed telephone network. The FDB-1000 database is partitioned into 4 CDs. The speech databases made within the SpeechDat(II) project were validated by SPEX, the Netherlands,...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 15000.00 € | 18000.00 € |
Licence: Commercial Use - ELRA VAR | 18000.00 € | 18000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 25000.00 € | 25000.00 € |
Licence: Commercial Use - ELRA VAR | 25000.00 € | 25000.00 € |


- Portuguese
ID: ELRA-W0089
ISLRN: 412-883-442-173-8
NPChunks is a training corpus containing approximately 1,000 sentences, with a total of 24,243 tokens, selected randomly from the written part of the CINTIL corpus. For more information on the CINTIL corpus, see ELRA-W0050, ISLRN: 176-775-844-396-0. The corpus is PoS-annotated at token level, ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 € |