17 Language Resources

 Bilingual collection of reports of the Greek Public Power Corporation (Processed)    
  • English
  • Modern Greek (1453-)

ID: ELRA-W0244

ISLRN: 456-799-985-207-6

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. A bilingual collection of translation units extracted fr...

Member (Licence: Attribution, Share Alike - CC-BY-SA-4.0): academic 0.00 €, commercial 0.00 €
Non-member (Licence: Attribution, Share Alike - CC-BY-SA-4.0): academic 0.00 €, commercial 0.00 €
 Bulgarian Event Corpus    
  • Bulgarian

ID: ELRA-W0329

ISLRN: 832-960-876-604-2

The Bulgarian Event Corpus is composed of 324,905 tokens appropriate for training Named Entity Recognition (NER), Named Entity Linking (NEL) and Event Recognition models for Bulgarian in a multidomain context within Humanities. The texts are domain related. They include documents from the area of So...

Member (Licence: Attribution, Share Alike - CC-BY-SA-3.0): academic 0.00 €, commercial 0.00 €
Non-member: academic 0.00 € (Licence: ? - CC-BY-SA-3.0), commercial 0.00 € (Licence: Attribution, Share Alike - CC-BY-SA-3.0)
 Bulgarian Treebank Corpus    
  • Bulgarian

ID: ELRA-W0328

ISLRN: 761-430-854-533-2

The Bulgarian Treebank Corpus is composed of 156,149 tokens (11,138 sentences) coming from three main sources in the domain of Grammar Notebooks (1,391 sentences), News (6,698 sentences), Other (3,049 sentences). It is available with syntactical and morphological annotation on a sentence basis in...

Member (Licence: Attribution, Share Alike - CC-BY-SA-3.0): academic 0.00 €, commercial 0.00 €
Non-member (Licence: Attribution, Share Alike - CC-BY-SA-3.0): academic 0.00 €, commercial 0.00 €
 Bulgarian Valency Frame Lexicon    
  • Bulgarian

ID: ELRA-L0132

ISLRN: 188-702-981-369-5

The Bulgarian Valency Frame Lexicon is composed of 9547 lexical entries organized by frames with 960 mappings to Princeton WordNet available in XML format. It is a treebank-driven resource of extracted valency frames from BulTreeBank. The frames were manually curated. The frames followed the surf...

Member (Licence: Attribution, Share Alike - CC-BY-SA-3.0): academic 0.00 €, commercial 0.00 €
Non-member (Licence: Attribution, Share Alike - CC-BY-SA-3.0): academic 0.00 €, commercial 0.00 €
 Corpus of State-related content from the Latvian Web (Processed)    
  • English
  • Latvian

ID: ELRA-W0169

ISLRN: 636-211-843-827-4

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Latvian Web, home pages of ministries and state public s...

Member (Licence: Attribution, Share Alike - CC-BY-SA-4.0): academic 0.00 €, commercial 0.00 €
Non-member (Licence: Attribution, Share Alike - CC-BY-SA-4.0): academic 0.00 €, commercial 0.00 €
 Corpus on Finance and Economics from Bank of Latvia (Processed)    
  • English
  • Latvian

ID: ELRA-W0216

ISLRN: 389-271-130-137-6

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Contents of web site https://makroekonomika.lv/ -- Latv...

Member (Licence: Attribution, Share Alike - CC-BY-SA-4.0): academic 0.00 €, commercial 0.00 €
Non-member (Licence: Attribution, Share Alike - CC-BY-SA-4.0): academic 0.00 €, commercial 0.00 €
 ECPC Corpus (European Comparable and Parallel Corpora of Parliamentary Speeches Archive) – set 1    
  • English
  • Spanish; Castilian

ID: ELRA-W0128

ISLRN: 036-939-425-010-1

The European Comparable and Parallel Corpora of Parliamentary Speeches Archive (ECPC), compiled at the Universitat Jaume I (Spain), is a collection of XML metatextually tagged corpora containing speeches from three European chambers (the European Parliament, the British House of Commons, and the ...

Member (Licence: Attribution, Non Commercial Use, Share Alike - CC-BY-NC-SA): academic 0.00 €, commercial 0.00 €
Non-member (Licence: Attribution, Non Commercial Use, Share Alike - CC-BY-NC-SA): academic 0.00 €, commercial 0.00 €
 English-Danish EASTIN-CL Multilingual Ontology of Assistive Technology (Processed)    
  • Danish
  • English

ID: ELRA-M0075

ISLRN: 034-297-263-067-2

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. EASTIN-CL Multilingual Ontology of Assistive Technology ...

Member (Licence: Attribution, Share Alike - CC-BY-SA-3.0): academic 0.00 €, commercial 0.00 €
Non-member (Licence: Attribution, Share Alike - CC-BY-SA-3.0): academic 0.00 €, commercial 0.00 €
 English-Estonian EASTIN-CL Multilingual Ontology of Assistive Technology (Processed)    
  • English
  • Estonian

ID: ELRA-M0073

ISLRN: 367-945-013-309-2

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. EASTIN-CL Multilingual Ontology of Assistive Technology ...

Member (Licence: Attribution, Share Alike - CC-BY-SA-3.0): academic 0.00 €, commercial 0.00 €
Non-member (Licence: Attribution, Share Alike - CC-BY-SA-3.0): academic 0.00 €, commercial 0.00 €
 English-Latvian EASTIN-CL Multilingual Ontology of Assistive Technology (Processed)    
  • English
  • Latvian

ID: ELRA-M0076

ISLRN: 704-517-283-753-9

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. EASTIN-CL Multilingual Ontology of Assistive Technology ...

Member (Licence: Attribution, Share Alike - CC-BY-SA-3.0): academic 0.00 €, commercial 0.00 €
Non-member (Licence: Attribution, Share Alike - CC-BY-SA-3.0): academic 0.00 €, commercial 0.00 €
 English-Lithuanian EASTIN-CL Multilingual Ontology of Assistive Technology (Processed)    
  • English
  • Lithuanian

ID: ELRA-M0074

ISLRN: 133-724-111-130-7

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. EASTIN-CL Multilingual Ontology of Assistive Technology ...

Member (Licence: Attribution, Share Alike - CC-BY-SA-3.0): academic 0.00 €, commercial 0.00 €
Non-member (Licence: Attribution, Share Alike - CC-BY-SA-3.0): academic 0.00 €, commercial 0.00 €
 General Romanian-English bilingual corpus (Processed)    
  • English
  • Romanian; Moldavian; Moldovan

ID: ELRA-W0193

ISLRN: 206-680-247-212-6

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Romanian – English corpus built from a Wikipedia dump.

Member (Licence: Attribution, Share Alike - CC-BY-SA-3.0): academic 0.00 €, commercial 0.00 €
Non-member (Licence: Attribution, Share Alike - CC-BY-SA-3.0): academic 0.00 €, commercial 0.00 €
 German Political Speeches Corpus    
  • German

ID: ELRA-W0330

ISLRN: 381-445-879-769-5

This corpus consists of a collection of political speeches in German crawled from the online archive of the German Presidency (Bundespräsident) and the Chancellery (Bundesregierung). For the German Presidency the speeches are available from July 1, 1984 to February 17, 2012 and the corpus con...

Member (Licence: Attribution, Share Alike - CC-BY-SA): academic 0.00 €, commercial 0.00 €
Non-member (Licence: Attribution, Share Alike - CC-BY-SA): academic 0.00 €, commercial 0.00 €
 International Agreements (Processed)    
  • English
  • Latvian

ID: ELRA-W0158

ISLRN: 810-722-062-476-6

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. International Agreements have been translated into natio...

Member (Licence: Attribution, Share Alike - CC-BY-SA-4.0): academic 0.00 €, commercial 0.00 €
Non-member (Licence: Attribution, Share Alike - CC-BY-SA-4.0): academic 0.00 €, commercial 0.00 €
 Parallel Corpora for 6 Indian Languages    
  • Bengali
  • English
  • Hindi
  • Malayalam
  • Tamil
  • Telugu
  • Urdu

ID: ELRA-W0320

ISLRN: 657-350-757-058-6

The Parallel Corpora for 6 Indian Languages contains data sets for Bengali (540,000 words – 20,000 parallel sentences), Hindi (1,200,000 words – 37,000 parallel sentences), Malayalam (660,000 words – 29,000 parallel sentences), Tamil (747,000 words – 35,000 parallel sentences), Telugu (951,000 wo...

Member (Licence: Attribution, Share Alike - CC-BY-SA-3.0): academic 0.00 €, commercial 0.00 €
Non-member (Licence: Attribution, Share Alike - CC-BY-SA-3.0): academic 0.00 €, commercial 0.00 €
 Parallel Corpus from the Web Site of the MFA of Latvia (Processed)    
  • English
  • Latvian

ID: ELRA-W0159

ISLRN: 486-155-178-937-9

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. The Corpus has been built from the News and Press Releas...

Member (Licence: Attribution, Share Alike - CC-BY-SA-4.0): academic 0.00 €, commercial 0.00 €
Non-member (Licence: Attribution, Share Alike - CC-BY-SA-4.0): academic 0.00 €, commercial 0.00 €
 Website of the President of the Republic of Lithuania (Processed)    
  • English
  • Lithuanian

ID: ELRA-W0160

ISLRN: 967-335-099-703-2

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. http://prezidentas.lt/ website in English-Lithuanian lan...

Member (Licence: Attribution, Share Alike - CC-BY-SA-4.0): academic 0.00 €, commercial 0.00 €
Non-member (Licence: Attribution, Share Alike - CC-BY-SA-4.0): academic 0.00 €, commercial 0.00 €