Resource Type:

Corpus:
Lexical/Conceptual:
Tool/Service:
Language Description:

Media Type:

Text:
Audio:
Image:
Video:
Text Numerical:
Text N-Gram:

19 Language Resources

Order by:

 CLEF Question Answering Test Suites (2003-2008) – Evaluation Package    
  • Bulgarian
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Italian
  • Portuguese
  • Romanian; Moldavian; Moldovan
  • Spanish; Castilian

ID: ELRA-E0038

ISLRN: 394-993-527-034-7

The Cross-Language Evaluation Forum (CLEF) promotes R&D in multilingual information access (MLIA) by (i) developing an infrastructure for the testing, tuning and evaluation of information retrieval systems operating on European languages in both monolingual and cross-language contexts, and (ii) c...

MEMBERacademiccommercial
Licence: Evaluation Use - ELRA EVALUATION
150.00 € submit
500.00 € submit
NON MEMBERacademiccommercial
Licence: Evaluation Use - ELRA EVALUATION
300.00 € submit
1000.00 € submit

Special offers are also available. Check here for details.

 Collins Multilingual database (MLD) - WordBank    
  • Arabic
  • Bengali
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Hindi
  • Italian
  • Japanese
  • Korean
  • Malayalam
  • Modern Greek (1453-)
  • Norwegian
  • Polish
  • Portuguese
  • Romanian; Moldavian; Moldovan
  • Russian
  • Spanish; Castilian
  • Swedish
  • Tamil
  • Thai
  • Turkish
  • Ukrainian
  • Vietnamese

ID: ELRA-T0376

ISLRN: 990-814-402-335-7

The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank) and a multilingual set of sentences in 28 languages (the PhraseBank, distributed separately under reference ELRA-T0377). The WordBank contains 10,000 words...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
2400.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
3600.00 € submit
 General Romanian-English bilingual corpus (Processed)    
  • English
  • Romanian; Moldavian; Moldovan

ID: ELRA-W0193

ISLRN: 206-680-247-212-6

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Romanian – English corpus built from a Wikipedia dump.

MEMBERacademiccommercial
Licence: Attribution, Share Alike - CC-BY-SA-3.0
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Attribution, Share Alike - CC-BY-SA-3.0
0.00 € submit
0.00 € submit
 Letter of rights for persons arrested and or detained (Processed)    
  • Bulgarian
  • English
  • French
  • Latvian
  • Modern Greek (1453-)
  • Polish
  • Romanian; Moldavian; Moldovan

ID: ELRA-W0308

ISLRN: 604-574-272-897-7

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Collection of transaltion units (1906 in total) in 21 la...

MEMBERacademiccommercial
Licence: Other - Open Under-PSI
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Other - Open Under-PSI
0.00 € submit
0.00 € submit
 Letter of rights for persons arrested on the basis of a European Arrest Warrant (Processed)    
  • Bulgarian
  • Dutch; Flemish
  • English
  • French
  • German
  • Italian
  • Latvian
  • Modern Greek (1453-)
  • Polish
  • Romanian; Moldavian; Moldovan

ID: ELRA-W0301

ISLRN: 175-028-844-014-3

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Letter of rights for persons arrested on the basis of a ...

MEMBERacademiccommercial
Licence: Attribution - CC-BY-4.0
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Attribution - CC-BY-4.0
0.00 € submit
0.00 € submit
 MULTIGLOSS Multilingual Glossaries - L1-English pair    
  • Afrikaans
  • Arabic
  • Azerbaijani
  • Bulgarian
  • Catalan; Valencian
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Estonian
  • Finnish
  • French
  • German
  • Hebrew
  • Hindi
  • Hungarian
  • Icelandic
  • Indonesian
  • Italian
  • Japanese
  • Korean
  • Latin
  • Latvian
  • Lithuanian
  • Malay (macrolanguage)
  • Modern Greek (1453-)
  • Norwegian
  • Persian
  • Polish
  • Portuguese
  • Romanian; Moldavian; Moldovan
  • Russian
  • Serbian
  • Slovak
  • Slovenian
  • Spanish; Castilian
  • Swedish
  • Thai
  • Turkish
  • Ukrainian
  • Urdu
  • Vietnamese
  • Western Frisian

ID: ELRA-M0112-01

ISLRN: 098-079-939-987-5

A series of innovative multilingual word-to-sense glossaries, based on a human-edited word-to-sense bilingual index of each language to English, which is linked automatically to the translation equivalents in 45 target languages. Each word and expression in every language is translated via its...

MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
2500.00 € submit
2500.00 € submit
NON MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
2625.00 € submit
2625.00 € submit

Special offers are also available. Check here for details.

 MULTIGLOSS Multilingual Glossaries - L1-English pair + 1 language    
  • Afrikaans
  • Arabic
  • Azerbaijani
  • Bulgarian
  • Catalan; Valencian
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Estonian
  • Finnish
  • French
  • German
  • Hebrew
  • Hindi
  • Hungarian
  • Icelandic
  • Indonesian
  • Italian
  • Japanese
  • Korean
  • Latin
  • Latvian
  • Lithuanian
  • Malay (macrolanguage)
  • Modern Greek (1453-)
  • Norwegian
  • Persian
  • Polish
  • Portuguese
  • Romanian; Moldavian; Moldovan
  • Russian
  • Serbian
  • Slovak
  • Slovenian
  • Spanish; Castilian
  • Swedish
  • Thai
  • Turkish
  • Ukrainian
  • Urdu
  • Vietnamese
  • Western Frisian

ID: ELRA-M0112-02

ISLRN: 610-290-284-705-6

A series of innovative multilingual word-to-sense glossaries, based on a human-edited word-to-sense bilingual index of each language to English, which is linked automatically to the translation equivalents in 45 target languages. Each word and expression in every language is translated via its...

MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
3750.00 € submit
3750.00 € submit
NON MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
3937.50 € submit
3937.50 € submit

Special offers are also available. Check here for details.

 Parallel texts from Swedish Labour market agency. Part 2 (Processed)    
  • English
  • Finnish
  • French
  • German
  • Polish
  • Romanian; Moldavian; Moldovan
  • Spanish; Castilian
  • Swedish

ID: ELRA-W0300

ISLRN: 949-454-272-492-9

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Same as part 1, but with the Readme-file. (Processed)

MEMBERacademiccommercial
Licence: Other - Public Domain
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Other - Public Domain
0.00 € submit
0.00 € submit
 Parallel texts from Swedish Labour market agency (Processed)    
  • English
  • Finnish
  • French
  • German
  • Romanian; Moldavian; Moldovan
  • Spanish; Castilian
  • Swedish

ID: ELRA-W0302

ISLRN: 496-669-153-886-4

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Parallel texts, all in pdf files, have been gathered fro...

MEMBERacademiccommercial
Licence: Other - Public Domain
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Other - Public Domain
0.00 € submit
0.00 € submit
 Parallel texts from Swedish Social Security Authority (Processed)    
  • Croatian
  • English
  • Finnish
  • French
  • German
  • Italian
  • Polish
  • Romanian; Moldavian; Moldovan
  • Spanish; Castilian
  • Swedish

ID: ELRA-W0303

ISLRN: 002-471-002-734-6

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Parallel texts, email templates and forms in pdf file fo...

MEMBERacademiccommercial
Licence: Other - Public Domain
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Other - Public Domain
0.00 € submit
0.00 € submit
 Parallel texts from Swedish Work environment Authority (Processed)    
  • Bulgarian
  • Czech
  • English
  • Estonian
  • Finnish
  • French
  • German
  • Hungarian
  • Italian
  • Latvian
  • Lithuanian
  • Modern Greek (1453-)
  • Polish
  • Romanian; Moldavian; Moldovan
  • Spanish; Castilian
  • Swedish

ID: ELRA-W0304

ISLRN: 448-438-055-941-1

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Parallel texts from the Swedish Work Environment authori...

MEMBERacademiccommercial
Licence: Other - Public Domain
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Other - Public Domain
0.00 € submit
0.00 € submit
 ROCO Romanian journalistic corpus    
  • Romanian; Moldavian; Moldovan

ID: ELRA-W0085

ISLRN: 312-617-089-348-7

ROCO is a Romanian journalistic corpus containing approximately 7.1 million tokens, the number of types being 231,626. It is rich in proper names, numerals and named entities. The corpus contains morphosyntactic information (MSD annotations) which has been assigned automatically with the high...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
3000.00 € submit
Licence: Commercial Use - ELRA VAR
3000.00 € submit
3000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
5000.00 € submit
Licence: Commercial Use - ELRA VAR
5000.00 € submit
5000.00 € submit
 Romanian-English corpus with studies, reports and statistical data in the field of culture from the National Institute for Cultural Research and Training website (Processed)    
  • English
  • Romanian; Moldavian; Moldovan

ID: ELRA-W0270

ISLRN: 131-157-185-289-5

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Romanian-English corpus with studies, reports and statis...

MEMBERacademiccommercial
Licence: Other - Open Under-PSI
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Other - Open Under-PSI
0.00 € submit
0.00 € submit
 Romanian - English literature corpus (Processed)    
  • English
  • Romanian; Moldavian; Moldovan

ID: ELRA-W0192

ISLRN: 050-476-818-226-7

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Bilingual Romanian – English literature corpus built fro...

MEMBERacademiccommercial
Licence: Attribution - CC-BY-4.0
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Attribution - CC-BY-4.0
0.00 € submit
0.00 € submit
 Romanian – English New Criminal Procedure Code (Processed)    
  • English
  • Romanian; Moldavian; Moldovan

ID: ELRA-W0170

ISLRN: 085-350-774-090-4

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. The New Civil Procedure Code in Romanian and English (bi...

MEMBERacademiccommercial
Licence: Attribution - CC-BY-4.0
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Attribution - CC-BY-4.0
0.00 € submit
0.00 € submit
 Romanian - English news corpus (Processed)    
  • English
  • Romanian; Moldavian; Moldovan

ID: ELRA-W0194

ISLRN: 100-905-126-706-7

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Bilingual Romanian – English news corpus built from Sout...

MEMBERacademiccommercial
Licence: Attribution - CC-BY-4.0
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Attribution - CC-BY-4.0
0.00 € submit
0.00 € submit
 Romanian Ombudsman archive (Processed)    
  • English
  • Romanian; Moldavian; Moldovan

ID: ELRA-W0206

ISLRN: 422-693-047-625-3

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Parallel aligned corpus in tmx format built from the Rom...

MEMBERacademiccommercial
Licence: Attribution - CC-BY-4.0
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Attribution - CC-BY-4.0
0.00 € submit
0.00 € submit
 ROMBAC - Romanian balanced corpus    
  • Romanian; Moldavian; Moldovan

ID: ELRA-W0088

ISLRN: 162-192-982-061-0

ROMBAC is a Romanian corpus containing equal shares of texts from 5 different genres: journalism, legalese, fiction, medicine and biographical data for Romanian literary personalities. For each genre, texts have been selected containing around 7,000,000 words, so that the entire corpus counts aro...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
5000.00 € submit
Licence: Commercial Use - ELRA VAR
5000.00 € submit
5000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
500.00 € submit
8000.00 € submit
Licence: Commercial Use - ELRA VAR
8000.00 € submit
8000.00 € submit
 The MWN.PT - MultiWordnet of Portuguese    
  • English
  • Hebrew
  • Italian
  • Latin
  • Portuguese
  • Romanian; Moldavian; Moldovan
  • Spanish; Castilian

ID: ELRA-M0050

ISLRN: 431-556-012-743-8

MWN.PT - MultiWordnet of Portuguese (version 1) spans over 17,200 manually validated concepts/synsets, linked under the semantic relations of hyponymy and hypernymy. These concepts are made of over 21,000 word senses/word forms and 16,000 lemmas from both European and American variants of Portugu...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
100.00 € submit
1000.00 € submit
Licence: Commercial Use - ELRA VAR
1500.00 € submit
1500.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
200.00 € submit
2000.00 € submit
Licence: Commercial Use - ELRA VAR
3000.00 € submit
3000.00 € submit