1685 Language Resources (Page 46 of 85)


 Natolin European Centre Dataset (Processed)    
  • English
  • Polish

ID: ELRA-W0176

ISLRN: 238-889-529-582-8

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. The Polish-English parallel corpus is composed of three ...

MEMBER (academic / commercial)
Licence: Attribution - CC-BY-4.0: 0.00 € / 0.00 €
NON MEMBER (academic / commercial)
Licence: Attribution - CC-BY-4.0: 0.00 € / 0.00 €
 Nautilus Speaker Characterization (NSC) Corpus    
  • German

ID: ELRA-S0395

ISLRN: 157-037-166-491-1

The Nautilus Speaker Characterization (NSC) Corpus comprises clean microphone recordings of conversational speech from 300 German speakers (126 males and 174 females) aged 18 to 35 years, with no marked dialect/accent. The recordings were performed in the acoustically-isolated room "Nautilus" (wh...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
0.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
0.00 €
 NE3L named entities Arabic corpus    
  • Arabic

ID: ELRA-W0078

ISLRN: 398-979-151-557-0

The NE3L project (Named Entities 3 Languages) consisted of annotating several corpora in different languages with named entities. Text format data were extracted from newspapers and deal with various topics. 3 different languages were annotated: Arabic, Chinese and Russian. For this project, 5...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 5000.00 € / 5000.00 €
Licence: Commercial Use - ELRA VAR: 5000.00 € / 5000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 5000.00 € / 5000.00 €
Licence: Commercial Use - ELRA VAR: 5000.00 € / 5000.00 €
 NE3L named entities Chinese corpus    
  • Chinese

ID: ELRA-W0079

ISLRN: 187-154-782-686-9

The NE3L project (Named Entities 3 Languages) consisted of annotating several corpora in different languages with named entities. Text format data were extracted from newspapers and deal with various topics. 3 different languages were annotated: Arabic, Chinese and Russian. For this project, 5...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 5000.00 € / 5000.00 €
Licence: Commercial Use - ELRA VAR: 5000.00 € / 5000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 5000.00 € / 5000.00 €
Licence: Commercial Use - ELRA VAR: 5000.00 € / 5000.00 €
 NE3L named entities Russian corpus    
  • Russian

ID: ELRA-W0080

ISLRN: 024-620-556-146-2

The NE3L project (Named Entities 3 Languages) consisted of annotating several corpora in different languages with named entities. Text format data were extracted from newspapers and deal with various topics. 3 different languages were annotated: Arabic, Chinese and Russian. For this project, 5...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 5000.00 € / 5000.00 €
Licence: Commercial Use - ELRA VAR: 5000.00 € / 5000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 5000.00 € / 5000.00 €
Licence: Commercial Use - ELRA VAR: 5000.00 € / 5000.00 €
 NEMLAR Broadcast News Speech Corpus    
  • Arabic

ID: ELRA-S0219

ISLRN: 479-507-036-103-9

This corpus was produced within the NEMLAR project (http://www.nemlar.org). Two other resources, produced within the same project, are also available: NEMLAR Written Corpus (ELRA-W0042) and the NEMLAR Speech Synthesis Corpus (ELRA-S0220). The Nemlar Broadcast News Speech Corpus consists of about...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 150.00 € / 500.00 €
Licence: Commercial Use - ELRA VAR: 2000.00 € / 2000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 300.00 € / 1000.00 €
Licence: Commercial Use - ELRA VAR: 4000.00 € / 4000.00 €

Special offers are also available. Check here for details.

 NEMLAR Speech Synthesis Corpus    
  • Arabic

ID: ELRA-S0220

ISLRN: 361-216-121-305-9

This corpus was produced within the NEMLAR project (http://www.nemlar.org). Two other resources, produced within the same project, are also available: NEMLAR Written Corpus (ELRA-W0042) and the NEMLAR Broadcast News Speech Corpus (ELRA-S0219). The NEMLAR Speech Synthesis Corpus contains the reco...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 500.00 € / 1250.00 €
Licence: Commercial Use - ELRA VAR: 5000.00 € / 5000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 1000.00 € / 2500.00 €
Licence: Commercial Use - ELRA VAR: 10000.00 € / 10000.00 €

Special offers are also available. Check here for details.

 NEMLAR Written Corpus    
  • Arabic

ID: ELRA-W0042

ISLRN: 050-693-158-326-9

This corpus was produced within the NEMLAR project (http://www.nemlar.org). Two other resources, produced within the same project, are also available: NEMLAR Broadcast News Speech Corpus (ELRA-S0219) and the NEMLAR Speech Synthesis Corpus (ELRA-S0220). The NEMLAR Written Corpus consists of about...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 150.00 € / 250.00 €
Licence: Commercial Use - ELRA VAR: 1000.00 € / 1000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 300.00 € / 500.00 €
Licence: Commercial Use - ELRA VAR: 2000.00 € / 2000.00 €

Special offers are also available. Check here for details.

 Nepali Monolingual written corpus    
  • Nepali (macrolanguage)

ID: ELRA-W0076

ISLRN: 325-796-965-405-9

The Nepali Monolingual written corpus is one of the 3 resources that constitute the Nepali National Corpus. The Nepali National Corpus was produced in 2006 in the framework of the project Bhasha Sanchar (“language communication”), also known as Nelralec, for Nepali Language Resources and Localiza...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 0.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 0.00 €
 Nepali Spoken Corpus    
  • Nepali (macrolanguage)

ID: ELRA-S0368

ISLRN: 688-800-566-571-0

The Nepali Spoken Corpus is one of the 3 resources that constitute the Nepali National Corpus. The Nepali National Corpus was produced in 2006 in the framework of the project Bhasha Sanchar (“language communication”), also known as Nelralec, for Nepali Language Resources and Localization for Educ...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 0.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 0.00 €
 NetDC Arabic BNSC (Broadcast News Speech Corpus)    
  • Arabic

ID: ELRA-S0157

ISLRN: 663-177-513-755-1

The NetDC Arabic BNSC (Broadcast News Speech Corpus) is a corpus developed by ELDA in the framework of the European-funded project Network of Data Centres (NetDC). The project was done in collaboration with the LDC (Linguistic Data Consortium), which has produced a similar corpus from the news br...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 100.00 € / 1350.00 €
Licence: Commercial Use - ELRA VAR: 1350.00 € / 1350.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 200.00 € / 2700.00 €
Licence: Commercial Use - ELRA VAR: 2700.00 € / 2700.00 €
 NEWBASE - Extended version of ELRA-T0090 GEOBASE    
  • English
  • French

ID: ELRA-T0362

ISLRN: 761-442-215-246-0

Extended version of ELRA-T0090 GEOBASE. The terms were selected and collated by Dr M.S.N. CARPENTER during the course of his translation activities over the past ten years. The terms have been validated by publication in the scientific literature. Conceived as a bilingual terminological resource,...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 3420.00 € / 4788.00 €
Licence: Commercial Use - ELRA VAR: 4788.00 € / 4788.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 4788.00 € / 6840.00 €
Licence: Commercial Use - ELRA VAR: 6840.00 € / 6840.00 €
 New Oxford Dictionary of English, 2nd Edition    
  • English

ID: ELRA-L0045

ISLRN: 044-694-748-731-5

This is Oxford University Press's most comprehensive single-volume dictionary, with 170,000 entries covering all varieties of English worldwide. The NODE data set constitutes a fully integrated range of formal data types suitable for language engineering and NLP applications: It is available in X...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
6125.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
8750.00 €
 New Oxford Thesaurus of English    
  • English

ID: ELRA-L0047

ISLRN: 869-866-137-463-6

The New Oxford Thesaurus of English is a completely new top-of-the-range thesaurus offering more alternative and opposite words than any of its competitors. The synonyms are arranged in order of "relevance" to the look-up word, starting with an individually tagged core synonym, and followed by la...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
4900.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
7000.00 €
 NODE+DIMAP    
  • English

ID: ELRA-L0046

ISLRN: 003-258-865-840-0

The DIMAP version of NODE (first edition) is a machine-tractable version of the machine-readable dictionary files in the DIMAP dictionary maintenance programs, adding syntactic and semantic information in the conversion. In addition, DIMAP provides several mechanisms that will allow research into...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
7000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
10000.00 €
 Non-Hispanic Spanish Speech Data by Mobile Phone - 762 Hours    
  • Spanish; Castilian

ID: ELRA-S0444

ISLRN: 469-588-696-069-6

1,630 native Spanish speakers of non-Spanish nationality, such as Mexicans and Colombians, participated in the recording with authentic accents. The recorded script was designed by linguists and covers a wide range of topics including generic, interactive, in-vehicle and home. The text is manually proofr...

MEMBER (academic / commercial)
Licence: Commercial Use - ELRA VAR: 180975.00 € / 180975.00 €
NON MEMBER (academic / commercial)
Licence: Commercial Use - ELRA VAR: 180975.00 € / 180975.00 €

Special offers are also available. Check here for details.

 Normalized Arabic Fragments for Inestimable Stemming (NAFIS)    
  • Arabic

ID: ELRA-W0127

ISLRN: 305-450-745-774-1

Normalized Arabic Fragments for Inestimable Stemming (NAFIS) is an Arabic stemming gold standard corpus composed of a collection of sentences, selected to be representative of Arabic stemming tasks and manually annotated. Indeed, NAFIS is: Comprehensive: The content of NAFIS can be generalized...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 0.00 €
Licence: Commercial Use - ELRA VAR: 0.00 € / 0.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 0.00 €
Licence: Commercial Use - ELRA VAR: 0.00 € / 0.00 €
 Norwegian EUROM1    
  • Norwegian

ID: ELRA-S0301

ISLRN: 184-180-634-505-7

EUROM1 is the first really multilingual speech database produced in Europe. Equivalent corpora for each of the European languages were collected with the same number of speakers selected in the same way, and recorded in the same conditions with common file formats. Initially eight European countr...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 800.00 € / 800.00 €
Licence: Commercial Use - ELRA VAR: 800.00 € / 800.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 1600.00 € / 1600.00 €
Licence: Commercial Use - ELRA VAR: 1600.00 € / 1600.00 €
 Norwegian SpeechDat(II) FDB-1000    
  • Norwegian

ID: ELRA-S0081

ISLRN: 231-756-812-990-0

The Norwegian SpeechDat(II) FDB-1000 comprises 1016 Norwegian speakers (517 males, 499 females) recorded over the Norwegian fixed telephone network. The FDB-1000 database is partitioned into 4 CDs. The speech databases made within the SpeechDat(II) project were validated by SPEX, the Netherlands,...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 15000.00 € / 18000.00 €
Licence: Commercial Use - ELRA VAR: 18000.00 € / 18000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 25000.00 € / 25000.00 €
Licence: Commercial Use - ELRA VAR: 25000.00 € / 25000.00 €
 NPChunks    
  • Portuguese

ID: ELRA-W0089

ISLRN: 412-883-442-173-8

NPChunks is a training corpus containing approximately 1,000 sentences, with a total of 24,243 tokens, selected randomly from the written part of the CINTIL corpus. For more information on the CINTIL corpus, see ELRA-W0050, ISLRN: 176-775-844-396-0. The corpus is PoS-annotated at token level, ...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 0.00 €
Licence: Commercial Use - ELRA VAR: 0.00 € / 0.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 0.00 €
Licence: Commercial Use - ELRA VAR: 0.00 € / 0.00 €
