20 Language Resources

 Amharic-English bilingual corpus    
  • Amharic
  • English

ID: ELRA-W0074

ISLRN: 590-255-335-719-0

The Amharic-English bilingual corpus contains parallel text from legal and news domains in Amharic script, in transliterated form and in English. The corpus comprises 232,653 words in Amharic and 291,701 words in English. This parallel corpus contains documents from two domains, namely legal...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 2000.00 €
Licence: Commercial Use - ELRA VAR: 2000.00 € / 2000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 4000.00 €
Licence: Commercial Use - ELRA VAR: 4000.00 € / 4000.00 €
 Basque WordNet    
  • Basque
  • English

ID: ELRA-M0049

ISLRN: 699-845-639-511-8

The Basque WordNet is a lexical database including information about Basque words. It is an extension of WordNet 1.6, a lexical database for English developed at Princeton University. The Basque WordNet is tightly aligned to the English WordNet. The Basque WordNet models nouns, verbs and ...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 300.00 € / 3000.00 €
Licence: Commercial Use - ELRA VAR: 4500.00 € / 4500.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 600.00 € / 6000.00 €
Licence: Commercial Use - ELRA VAR: 9000.00 € / 9000.00 €
 Bulgarian WordNet    
  • Bulgarian
  • English

ID: ELRA-M0041

ISLRN: 941-120-951-927-7

The Bulgarian WordNet is a network of lexical-semantic relations, an electronic thesaurus with a structure modelled on that of the Princeton WordNet and those constructed in the EuroWordNet and BalkaNet projects. Bulgarian WordNet describes the meaning of a lexical unit by placing it within a network ...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 300.00 € / 3000.00 €
Licence: Commercial Use - ELRA VAR: 4500.00 € / 4500.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 600.00 € / 6000.00 €
Licence: Commercial Use - ELRA VAR: 9000.00 € / 9000.00 €
 Czech WordNet    
  • Czech
  • English

ID: ELRA-M0047

ISLRN: 009-714-127-860-1

The Czech WordNet was developed by the Centre of Natural Language Processing at the Faculty of Informatics, Masaryk University, Czech Republic. The Czech WordNet captures nouns, verbs, adjectives, and partly adverbs, and contains 28,201 word senses (synsets). Every synset encodes the equivale...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 250.00 € / 2000.00 €
Licence: Commercial Use - ELRA VAR: 3000.00 € / 3000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 475.00 € / 4000.00 €
Licence: Commercial Use - ELRA VAR: 6000.00 € / 6000.00 €
 English-Persian parallel corpus    
  • English
  • Persian

ID: ELRA-W0118

ISLRN: 074-825-114-781-7

The English-Persian parallel corpus contains more than 200,000 aligned sentences across a variety of text types from the domains of art, law, culture, science, religion, literature, medicine, idioms, politics and others. It is an extension of the English-Persian parallel corpus already distribute...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 1000.00 € / 5000.00 €
Licence: Commercial Use - ELRA VAR: 5000.00 € / 5000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 1200.00 € / 6000.00 €
Licence: Commercial Use - ELRA VAR: 6000.00 € / 6000.00 €
 English-Persian parallel Corpus    
  • English
  • Persian

ID: ELRA-W0051

ISLRN: 671-618-321-687-7

Please refer to ELRA-W0118 for the latest version of this corpus. This version consists of about 3,500,000 English and Persian (Farsi) words aligned at sentence level (about 100,000 sentences, distributed over 50,021 entries). The format of the files is Unicode. It has been originally created wi...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 500.00 € / 2500.00 €
Licence: Commercial Use - ELRA VAR: 2500.00 € / 2500.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 600.00 € / 3000.00 €
Licence: Commercial Use - ELRA VAR: 3000.00 € / 3000.00 €
 English-Punjabi Code-Mixed Social Media Content    
  • English
  • Panjabi; Punjabi

ID: ELRA-W0319

ISLRN: 695-759-706-170-8

The English-Punjabi Code-Mixed Social Media Content corpus is composed of 893,615 parallel sentences of English-Punjabi distributed over the following domains: - 82,341 parallel sentences of English-Punjabi code-mixed Agriculture Domain Data, - 59,158 parallel sentences of English-P...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 0.00 €
Licence: Commercial Use - ELRA VAR: 0.00 € / 0.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 0.00 €
Licence: Commercial Use - ELRA VAR: 0.00 € / 0.00 €
 EnToFrNE - a Parallel English-French Lexicon of Named Entities    
  • English
  • French

ID: ELRA-M0052

ISLRN: 233-270-965-120-8

In any text document, there are particular terms that represent specific entities that are more informative and have a unique context. These entities are known as named entities, which more specifically refer to terms that represent real-world objects like people, places, organizations, and so on...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 600.00 € / 2000.00 €
Licence: Commercial Use - ELRA VAR: 2000.00 € / 2000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 1200.00 € / 4000.00 €
Licence: Commercial Use - ELRA VAR: 4000.00 € / 4000.00 €
 EUROPARL Corpus Parallel Corpora: Portuguese-English    
  • English
  • Portuguese

ID: ELRA-W0090

ISLRN: 435-502-922-727-2

The EUROPARL Corpus (Portuguese-English subpart of the parallel corpora) was extracted from the proceedings of the European Parliament. It contains transcriptions of sessions dating from 1996 to 2011, with a total of approximately 58,324,562 tokens of European Portuguese (L1) and 49,216,896...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 0.00 €
Licence: Commercial Use - ELRA VAR: 0.00 € / 0.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 0.00 €
Licence: Commercial Use - ELRA VAR: 0.00 € / 0.00 €
 EuroWordNet Czech    
  • Czech
  • English

ID: ELRA-M0021

ISLRN: 724-939-553-229-9

A. Available Wordnets. Following the announcement of the EuroWordNet databases in the last issue of the ELRA Newsletter (Vol.4 N.2), we are happy to announce that the list of EuroWordNet languages has grown. The following wordnets are now available via ELRA: ELRA ref. Language Synsets Word...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 128.24 € / 1923.60 €
Licence: Evaluation Use - ELRA EVALUATION: 256.48 €
Licence: Commercial Use - ELRA VAR: 3206.00 € / 3206.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 256.48 € / 3847.20 €
Licence: Evaluation Use - ELRA EVALUATION: 512.96 €
Licence: Commercial Use - ELRA VAR: 6412.00 € / 6412.00 €

Special offers are also available. Check here for details.

 EuroWordNet Dutch    
  • Dutch; Flemish
  • English

ID: ELRA-M0016

ISLRN: 463-146-267-453-0

A. Available Wordnets. Following the announcement of the EuroWordNet databases in the last issue of the ELRA Newsletter (Vol.4 N.2), we are happy to announce that the list of EuroWordNet languages has grown. The following wordnets are now available via ELRA: ELRA ref. Language Synsets Word...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 440.15 € / 6602.25 €
Licence: Evaluation Use - ELRA EVALUATION: 880.30 €
Licence: Commercial Use - ELRA VAR: 11003.75 € / 11003.75 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 880.30 € / 13204.50 €
Licence: Evaluation Use - ELRA EVALUATION: 1760.60 €
Licence: Commercial Use - ELRA VAR: 22007.50 € / 22007.50 €

Special offers are also available. Check here for details.

 EuroWordNet Estonian    
  • English
  • Estonian

ID: ELRA-M0022

ISLRN: 953-046-779-755-6

A. Available Wordnets. Following the announcement of the EuroWordNet databases in the last issue of the ELRA Newsletter (Vol.4 N.2), we are happy to announce that the list of EuroWordNet languages has grown. The following wordnets are now available via ELRA: ELRA ref. Language Synsets Word...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 93.17 € / 1397.55 €
Licence: Evaluation Use - ELRA EVALUATION: 186.34 €
Licence: Commercial Use - ELRA VAR: 2329.25 € / 2329.25 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 186.34 € / 2795.10 €
Licence: Evaluation Use - ELRA EVALUATION: 372.68 €
Licence: Commercial Use - ELRA VAR: 4658.50 € / 4658.50 €

Special offers are also available. Check here for details.

 EuroWordNet French    
  • English
  • French

ID: ELRA-M0020

ISLRN: 473-160-472-670-7

A. Available Wordnets. Following the announcement of the EuroWordNet databases in the last issue of the ELRA Newsletter (Vol.4 N.2), we are happy to announce that the list of EuroWordNet languages has grown. The following wordnets are now available via ELRA: ELRA ref. Language Synsets Word...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 227.45 € / 3411.75 €
Licence: Evaluation Use - ELRA EVALUATION: 454.90 €
Licence: Commercial Use - ELRA VAR: 5686.25 € / 5686.25 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 454.90 € / 6823.50 €
Licence: Evaluation Use - ELRA EVALUATION: 909.80 €
Licence: Commercial Use - ELRA VAR: 11372.50 € / 11372.50 €

Special offers are also available. Check here for details.

 EuroWordNet German    
  • English
  • German

ID: ELRA-M0019

ISLRN: 874-838-254-228-5

A. Available Wordnets. Following the announcement of the EuroWordNet databases in the last issue of the ELRA Newsletter (Vol.4 N.2), we are happy to announce that the list of EuroWordNet languages has grown. The following wordnets are now available via ELRA: ELRA ref. Language Synsets Word...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 151.32 € / 2269.80 €
Licence: Evaluation Use - ELRA EVALUATION: 302.64 €
Licence: Commercial Use - ELRA VAR: 3783.00 € / 3783.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 302.64 € / 4539.60 €
Licence: Evaluation Use - ELRA EVALUATION: 605.28 €
Licence: Commercial Use - ELRA VAR: 7566.00 € / 7566.00 €

Special offers are also available. Check here for details.

 EuroWordNet Spanish    
  • English
  • Spanish; Castilian

ID: ELRA-M0017

ISLRN: 938-684-369-235-6

A. Available Wordnets. Following the announcement of the EuroWordNet databases in the last issue of the ELRA Newsletter (Vol.4 N.2), we are happy to announce that the list of EuroWordNet languages has grown. The following wordnets are now available via ELRA: ELRA ref. Language Synsets Word...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 233.70 € / 3505.50 €
Licence: Evaluation Use - ELRA EVALUATION: 467.40 €
Licence: Commercial Use - ELRA VAR: 5842.50 € / 5842.50 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 467.40 € / 7011.00 €
Licence: Evaluation Use - ELRA EVALUATION: 934.80 €
Licence: Commercial Use - ELRA VAR: 11685.00 € / 11685.00 €

Special offers are also available. Check here for details.

 ItalWordNet (Italian WordNet)    
  • English
  • Italian

ID: ELRA-M0042

ISLRN: 532-206-426-067-2

ItalWordNet (Italian WordNet) is an updated version of the EuroWordNet Italian database. The ItalWordNet database was produced within a national Italian programme called SI-TAL. It contains a total of 49,360 synsets. Unlike the EuroWordNet database, the ItalWordNet is provided in XML format. Howe...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 400.00 € / 4000.00 €
Licence: Commercial Use - ELRA VAR: 6000.00 € / 6000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 800.00 € / 8000.00 €
Licence: Commercial Use - ELRA VAR: 12000.00 € / 12000.00 €
 Multilingual Corpus    
  • Chinese
  • English
  • Korean

ID: ELRA-W0035

ISLRN: 731-151-596-869-3

Multilingual parallel corpus produced by KAIST KORTERM, containing 60,000 expressions in Korean, Chinese and English.

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 750.00 € / 3000.00 €
Licence: Commercial Use - ELRA VAR: 3000.00 € / 3000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 1500.00 € / 6000.00 €
Licence: Commercial Use - ELRA VAR: 6000.00 € / 6000.00 €
 Parallel Corpora & Domains (bilingual and multilingual)    
  • Arabic
  • Chinese
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Hebrew
  • Italian
  • Japanese
  • Korean
  • Modern Greek (1453-)
  • Northern Sami
  • Norwegian
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swedish
  • Turkish

ID: ELRA-W0336

ISLRN: 471-919-856-164-1

Parallel corpora for nearly 400 language pairs and numerous multilingual combinations, including 10 million bilingual segments and 90 million tokens in 20 languages: Arabic, Chinese (Simplified), Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Italian, Japanese, Korean, North Sami...

MEMBER (academic / commercial)
Licence: Commercial Use - ELRA VAR: 0.10 € / 0.10 €
NON MEMBER (academic / commercial)
Licence: Commercial Use - ELRA VAR: 0.11 € / 0.11 €

Special offers are also available. Check here for details.

 Persian 1984 corpus (Multext-East framework)    
  • Persian

ID: ELRA-W0054

ISLRN: 851-240-629-673-1

This corpus contains the Persian (Farsi) translation of a part of the novel “1984” (G. Orwell) annotated in the Multext-East framework (Multilingual Text Tools and Corpora for Eastern and Central European Languages). The aim of the Multext-East project was to develop standardized language resourc...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 45.00 € / 2000.00 €
Licence: Commercial Use - ELRA VAR: 2000.00 € / 2000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 100.00 € / 5000.00 €
Licence: Commercial Use - ELRA VAR: 5000.00 € / 5000.00 €