Resource Type:
Corpus: | ![]() |
Lexical/Conceptual: | ![]() |
Tool/Service: | ![]() |
Language Description: | ![]() |
Media Type:
Text: | ![]() |
Audio: | ![]() |
Image: | ![]() |
Video: | ![]() |
Text Numerical: | ![]() |
Text N-Gram: | ![]() |
2 Language Resources
Order by:
![Corpus](/static/from_media/css/sexybuttons/images/icons/silk/database_yellow.png)
![text](/static/from_media/css/sexybuttons/images/icons/silk/page_white_text_media_type.png)
- Bengali
- English
- Hindi
- Malayalam
- Tamil
- Telugu
- Urdu
ID: ELRA-W0320
ISLRN: 657-350-757-058-6The Parallel Corpora for 6 Indian Languages contains data sets for Bengali (540,000 words – 20,000 parallel sentences), Hindi (1,200,000 words – 37 000 parallel sentences), Malayalam (660,000 words – 29,000 parallel sentences), Tamil (747,000 words – 35,000 parallel sentences), Telugu (951,000 wo...
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 |
0.00 €
![]() |
0.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 |
0.00 €
![]() |
0.00 €
![]() |
![Corpus](/static/from_media/css/sexybuttons/images/icons/silk/database_yellow.png)
![text](/static/from_media/css/sexybuttons/images/icons/silk/page_white_text_media_type.png)
- Assamese
- Bengali
- English
- Gujarati
- Hindi
- Kannada
- Kashmiri
- Malayalam
- Marathi
- Oriya (macrolanguage)
- Panjabi; Punjabi
- Sinhala; Sinhalese
- Tamil
- Telugu
- Urdu
ID: ELRA-W0037
ISLRN: 039-846-040-604-0The EMILLE/CIIL Corpus consists of three components: monolingual, parallel and annotated corpora. There are fourteen monolingual corpora, including both written and (for some languages) spoken data for fourteen South Asian languages: Assamese, Bengali, Gujarati, Hindi, Kannada, Kashmiri, Malayala...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
![]() |