12 Language Resources
- English
- Modern Greek (1453-)
ID: ELRA-W0244
ISLRN: 456-799-985-207-6
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. A bilingual collection of translation units extracted fr...
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |
- Bulgarian
ID: ELRA-W0329
ISLRN: 832-960-876-604-2
The Bulgarian Event Corpus is composed of 324,905 tokens appropriate for training Named Entity Recognition (NER), Named Entity Linking (NEL) and Event Recognition models for Bulgarian in a multidomain context within Humanities. The texts are domain related. They include documents from the area of So...
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: ? - CC-BY-SA-3.0 |
0.00 €
| |
Licence: Attribution, Share Alike - CC-BY-SA-3.0 |
0.00 €
|
- Bulgarian
ID: ELRA-W0328
ISLRN: 761-430-854-533-2
The Bulgarian Treebank Corpus is composed of 156,149 tokens (11,138 sentences) coming from three main sources in the domain of Grammar Notebooks (1,391 sentences), News (6,698 sentences), Other (3,049 sentences). It is available with syntactical and morphological annotation on a sentence basis in...
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |
- English
- Latvian
ID: ELRA-W0169
ISLRN: 636-211-843-827-4
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Latvian Web, home pages of ministries and state public s...
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |
- English
- Latvian
ID: ELRA-W0216
ISLRN: 389-271-130-137-6
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Contents of web site https://makroekonomika.lv/ -- Latv...
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |
- English
- Spanish; Castilian
ID: ELRA-W0128
ISLRN: 036-939-425-010-1
The European Comparable and Parallel Corpora of Parliamentary Speeches Archive (ECPC), compiled at the Universitat Jaume I (Spain), is a collection of XML metatextually tagged corpora containing speeches from three European chambers (the European Parliament, the British House of Commons, and the ...
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, Share Alike - CC-BY-NC-SA | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, Share Alike - CC-BY-NC-SA | 0.00 € | 0.00 € |
- English
- Romanian; Moldavian; Moldovan
ID: ELRA-W0193
ISLRN: 206-680-247-212-6
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Romanian – English corpus built from a Wikipedia dump.
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |
- German
ID: ELRA-W0330
ISLRN: 381-445-879-769-5
This corpus consists of a collection of political speeches in German crawled from the online archive of the German presidency (Bundespräsident) and the Chancellery (Bundesregierung). For the German Presidency the speeches are available from July 1, 1984 to February 17, 2012 and the corpus con...
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA | 0.00 € | 0.00 € |
- English
- Latvian
ID: ELRA-W0158
ISLRN: 810-722-062-476-6
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. International Agreements have been translated into natio...
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |
- Bengali
- English
- Hindi
- Malayalam
- Tamil
- Telugu
- Urdu
ID: ELRA-W0320
ISLRN: 657-350-757-058-6
The Parallel Corpora for 6 Indian Languages contains data sets for Bengali (540,000 words – 20,000 parallel sentences), Hindi (1,200,000 words – 37,000 parallel sentences), Malayalam (660,000 words – 29,000 parallel sentences), Tamil (747,000 words – 35,000 parallel sentences), Telugu (951,000 wo...
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |
- English
- Latvian
ID: ELRA-W0159
ISLRN: 486-155-178-937-9
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. The Corpus has been built from the News and Press Releas...
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |
- English
- Lithuanian
ID: ELRA-W0160
ISLRN: 967-335-099-703-2
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. http://prezidentas.lt/ website in English-Lithuanian lan...
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |