22 Language Resources (Page 1 of 2)
- English
- Modern Greek (1453-)
ID: ELRA-W0244
ISLRN: 456-799-985-207-6
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. A bilingual collection of translation units extracted fr...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |
- Bulgarian
ID: ELRA-W0329
ISLRN: 832-960-876-604-2
The Bulgarian Event Corpus is composed of 324,905 tokens appropriate for training Named Entity Recognition (NER), Named Entity Linking (NEL) and Event Recognition models for Bulgarian in a multidomain context within Humanities. The texts are domain related. They include documents from the area of So...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: ? - CC-BY-SA-3.0 | 0.00 € | |
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € |
- Bulgarian
ID: ELRA-W0328
ISLRN: 761-430-854-533-2
The Bulgarian Treebank Corpus is composed of 156,149 tokens (11,138 sentences) coming from three main sources in the domain of Grammar Notebooks (1,391 sentences), News (6,698 sentences), Other (3,049 sentences). It is available with syntactical and morphological annotation on a sentence basis in...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |
- Bulgarian
ID: ELRA-L0132
ISLRN: 188-702-981-369-5
The Bulgarian Valency Frame Lexicon is composed of 9,547 lexical entries organized by frames with 960 mappings to Princeton WordNet available in XML format. It is a treebank-driven resource of extracted valency frames from BulTreeBank. The frames were manually curated. The frames followed the surf...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |
- English
- Latvian
ID: ELRA-W0169
ISLRN: 636-211-843-827-4
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Latvian Web, home pages of ministries and state public s...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |
- English
- Latvian
ID: ELRA-W0216
ISLRN: 389-271-130-137-6
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Contents of web site https://makroekonomika.lv/ -- Latv...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |
- English
- Spanish; Castilian
ID: ELRA-W0128
ISLRN: 036-939-425-010-1
The European Comparable and Parallel Corpora of Parliamentary Speeches Archive (ECPC), compiled at the Universitat Jaume I (Spain), is a collection of XML metatextually tagged corpora containing speeches from three European chambers (the European Parliament, the British House of Commons, and the ...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, Share Alike - CC-BY-NC-SA | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, Share Alike - CC-BY-NC-SA | 0.00 € | 0.00 € |
- Danish
- English
ID: ELRA-M0075
ISLRN: 034-297-263-067-2
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. EASTIN-CL Multilingual Ontology of Assistive Technology ...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |
- English
- Estonian
ID: ELRA-M0073
ISLRN: 367-945-013-309-2
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. EASTIN-CL Multilingual Ontology of Assistive Technology ...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |
- English
- Latvian
ID: ELRA-M0076
ISLRN: 704-517-283-753-9
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. EASTIN-CL Multilingual Ontology of Assistive Technology ...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |
- English
- Lithuanian
ID: ELRA-M0074
ISLRN: 133-724-111-130-7
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. EASTIN-CL Multilingual Ontology of Assistive Technology ...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |
- English
- Romanian; Moldavian; Moldovan
ID: ELRA-W0193
ISLRN: 206-680-247-212-6
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Romanian – English corpus built from a Wikipedia dump.

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |
- German
ID: ELRA-W0330
ISLRN: 381-445-879-769-5
This corpus consists of a collection of political speeches in German crawled from the online archive of the German Presidency (Bundespräsident) and the Chancellery (Bundesregierung). For the German Presidency, the speeches are available from July 1, 1984 to February 17, 2012 and the corpus con...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA | 0.00 € | 0.00 € |
- Catalan; Valencian
ID: ELRA-S0407
ISLRN: 780-617-066-913-1
Glissando-ca includes more than 12 hours of speech in Catalan, recorded under optimal acoustic conditions, orthographically transcribed, phonetically aligned and annotated with prosodic information (location of the stressed syllables and prosodic phrasing). The corpus was recorded by 8 profession...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, Share Alike - CC-BY-NC-SA | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, Share Alike - CC-BY-NC-SA | 0.00 € | 0.00 € |
- Spanish; Castilian
ID: ELRA-S0406
ISLRN: 024-286-962-247-6
Glissando-sp includes more than 12 hours of speech in Spanish, recorded under optimal acoustic conditions, orthographically transcribed, phonetically aligned and annotated with prosodic information (location of the stressed syllables and prosodic phrasing). The corpus was recorded by 8 profession...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, Share Alike - CC-BY-NC-SA | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, Share Alike - CC-BY-NC-SA | 0.00 € | 0.00 € |
- English
- Latvian
ID: ELRA-W0158
ISLRN: 810-722-062-476-6
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. International Agreements have been translated into natio...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |
- French
ID: ELRA-S0379
ISLRN: 371-240-320-910-4
The JV_TDM corpus provides a phonetic annotation of 37 chapters of the original French version of “Around the World in 80 Days” by Jules Verne read by a single speaker. Each chapter has been annotated in a separate .TextGrid file. The audio files are not included in this release. They are availab...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, Share Alike - CC-BY-NC-SA | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, Share Alike - CC-BY-NC-SA | 0.00 € | 0.00 € |
- Bengali
- English
- Hindi
- Malayalam
- Tamil
- Telugu
- Urdu
ID: ELRA-W0320
ISLRN: 657-350-757-058-6
The Parallel Corpora for 6 Indian Languages contains data sets for Bengali (540,000 words – 20,000 parallel sentences), Hindi (1,200,000 words – 37,000 parallel sentences), Malayalam (660,000 words – 29,000 parallel sentences), Tamil (747,000 words – 35,000 parallel sentences), Telugu (951,000 wo...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |
- English
- Latvian
ID: ELRA-W0159
ISLRN: 486-155-178-937-9
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. The Corpus has been built from the News and Press Releas...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-4.0 | 0.00 € | 0.00 € |
- Persian
ID: ELRA-S0393
ISLRN: 068-845-898-304-0
This single-speaker speech corpus of about 2.5 hours has been developed using the same methodologies as the PhD work carried out by Nawar Halabi at the University of Southampton. The corpus was recorded in Persian (Tehrani accent) by one male speaker using a professional studio, through a "Blubb...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, Share Alike - CC-BY-NC-SA | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 4000.00 € | 4000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, Share Alike - CC-BY-NC-SA | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |