Search and Browse – ELRA Catalogue

Urdu

ID: ELRA-S0403

The CLE Pakistan Urdu Speech Corpus consists of phonetically rich Urdu sentences extracted from the CLE Urdu Digest Corpus and additional sentences covering telephone numbers, addresses and personal names. This speech corpus is recorded with a variety of microphone types (built-in laptop, hands-free,...

MEMBER	academic	commercial
Licence: Non Commercial Use - ELRA END USER	12000.00 €	18000.00 €
Licence: Commercial Use - ELRA VAR	18000.00 €	18000.00 €

NON MEMBER	academic	commercial
Licence: Non Commercial Use - ELRA END USER	15600.00 €	23400.00 €
Licence: Commercial Use - ELRA VAR	23400.00 €	23400.00 €

MULTIGLOSS Multilingual Glossaries - L1-English pair text

Afrikaans
Arabic
Azerbaijani
Bulgarian
Catalan; Valencian
Chinese
Croatian
Czech
Danish
Dutch; Flemish
English
Estonian
Finnish
French
German
Hebrew
Hindi
Hungarian
Icelandic
Indonesian
Italian
Japanese
Korean
Latin
Latvian
Lithuanian
Malay (macrolanguage)
Modern Greek (1453-)
Norwegian
Persian
Polish
Portuguese
Romanian; Moldavian; Moldovan
Russian
Serbian
Slovak
Slovenian
Spanish; Castilian
Swedish
Thai
Turkish
Ukrainian
Urdu
Vietnamese
Western Frisian

ID: ELRA-M0112-01

ISLRN: 098-079-939-987-5

A series of innovative multilingual word-to-sense glossaries, based on a human-edited word-to-sense bilingual index of each language to English, which is linked automatically to the translation equivalents in 45 target languages. Each word and expression in every language is translated via its...

MEMBER	academic	commercial
Licence: Commercial Use - ELRA VAR	2500.00 €	2500.00 €

NON MEMBER	academic	commercial
Licence: Commercial Use - ELRA VAR	2625.00 €	2625.00 €

Special offers are also available.

MULTIGLOSS Multilingual Glossaries - L1-English pair + 1 language text

Afrikaans
Arabic
Azerbaijani
Bulgarian
Catalan; Valencian
Chinese
Croatian
Czech
Danish
Dutch; Flemish
English
Estonian
Finnish
French
German
Hebrew
Hindi
Hungarian
Icelandic
Indonesian
Italian
Japanese
Korean
Latin
Latvian
Lithuanian
Malay (macrolanguage)
Modern Greek (1453-)
Norwegian
Persian
Polish
Portuguese
Romanian; Moldavian; Moldovan
Russian
Serbian
Slovak
Slovenian
Spanish; Castilian
Swedish
Thai
Turkish
Ukrainian
Urdu
Vietnamese
Western Frisian

ID: ELRA-M0112-02

ISLRN: 610-290-284-705-6

A series of innovative multilingual word-to-sense glossaries, based on a human-edited word-to-sense bilingual index of each language to English, which is linked automatically to the translation equivalents in 45 target languages. Each word and expression in every language is translated via its...

MEMBER	academic	commercial
Licence: Commercial Use - ELRA VAR	3750.00 €	3750.00 €

NON MEMBER	academic	commercial
Licence: Commercial Use - ELRA VAR	3937.50 €	3937.50 €

Special offers are also available.

Parallel Corpora for 6 Indian Languages text

Bengali
English
Hindi
Malayalam
Tamil
Telugu
Urdu

ID: ELRA-W0320

ISLRN: 657-350-757-058-6

The Parallel Corpora for 6 Indian Languages contains data sets for Bengali (540,000 words – 20,000 parallel sentences), Hindi (1,200,000 words – 37,000 parallel sentences), Malayalam (660,000 words – 29,000 parallel sentences), Tamil (747,000 words – 35,000 parallel sentences), Telugu (951,000 wo...

MEMBER	academic	commercial
Licence: Attribution, Share Alike - CC-BY-SA-3.0	0.00 €	0.00 €

NON MEMBER	academic	commercial
Licence: Attribution, Share Alike - CC-BY-SA-3.0	0.00 €	0.00 €

The EMILLE/CIIL Corpus text

Assamese
Bengali
English
Gujarati
Hindi
Kannada
Kashmiri
Malayalam
Marathi
Oriya (macrolanguage)
Panjabi; Punjabi
Sinhala; Sinhalese
Tamil
Telugu
Urdu

ID: ELRA-W0037

ISLRN: 039-846-040-604-0

The EMILLE/CIIL Corpus consists of three components: monolingual, parallel and annotated corpora. There are fourteen monolingual corpora, including both written and (for some languages) spoken data for fourteen South Asian languages: Assamese, Bengali, Gujarati, Hindi, Kannada, Kashmiri, Malayala...

MEMBER	academic	commercial
Licence: Non Commercial Use - ELRA END USER	0.00 €

NON MEMBER	academic	commercial
Licence: Non Commercial Use - ELRA END USER	0.00 €

The EMILLE Lancaster Corpus text

Bengali
English
Gujarati
Hindi
Panjabi; Punjabi
Sinhala; Sinhalese
Tamil
Urdu

ID: ELRA-W0038

ISLRN: 438-045-014-925-0

The EMILLE Lancaster Corpus consists of three components: monolingual, parallel and annotated corpora. There are monolingual corpora for seven South Asian languages: Bengali, Gujarati, Hindi, Punjabi, Sinhala, Tamil, Urdu. The EMILLE monolingual corpora contain approximately 58,880,000 words (i...

MEMBER	academic	commercial
Licence: Commercial Use - ELRA VAR		7500.00 €

NON MEMBER	academic	commercial
Licence: Commercial Use - ELRA VAR		12000.00 €


6 Language Resources