corpus: meaning, definition, pronunciation and examples

C1
UK /ˈkɔː.pəs/, US /ˈkɔːr.pəs/

Academic, technical, formal

Quick answer

What does “corpus” mean?

A collection of written or spoken texts, especially one used for linguistic research or analysis.

Definition

Meaning and Definition

  • In linguistics: a systematic collection of authentic language data stored electronically for analysis.
  • In law: the principal or capital of an estate.
  • In anatomy/biology: the main part of an organ or structure.

Dialectal Variation

British vs American Usage

Differences

No significant difference in meaning or usage. Both varieties use the term identically in academic contexts.

Connotations

Neutral and technical in both varieties. Carries strong academic/professional associations.

Frequency

Equally low-frequency in general use, but standard in linguistics, computational linguistics, and legal/financial writing in both regions.

Grammar

How to Use “corpus” in a Sentence

  • [verb] + corpus: build/compile/create/analyse/use/access/search a corpus
  • corpus + [verb]: The corpus contains/shows/illustrates/provides...
  • [adjective] + corpus: a large/small/electronic/annotated/historical corpus
  • corpus + of + [noun]: a corpus of texts/speech/letters/novels

Vocabulary

Collocations

strong
linguistic corpus, text corpus, corpus linguistics, large corpus, electronic corpus, corpus data, build a corpus, analyse a corpus, reference corpus, parallel corpus
medium
corpus analysis, corpus evidence, corpus size, corpus-based, search the corpus, compile a corpus, spoken corpus, written corpus, historical corpus, specialised corpus
weak
corpus of texts, corpus of documents, entire corpus, main corpus, small corpus, manage a corpus, access the corpus, annotated corpus, raw corpus, balanced corpus

Examples

Examples of “corpus” in a Sentence

noun

British English

  • The researcher compiled a corpus of 19th-century newspaper editorials.
  • The court ruled that the trust's corpus could not be diminished.
  • The linguistic corpus is freely available online for academic use.

American English

  • Her analysis was based on a large corpus of American television dialogue.
  • The endowment's corpus generates the income for our scholarships.
  • Corpus linguistics relies on evidence from real language use.

Usage

Meaning in Context

Business

Rare. Might appear in legal/financial contexts referring to the capital or principal of a fund or trust (e.g., 'the corpus of the endowment').

Academic

Very common in linguistics, computational linguistics, language studies. Refers to a collection of texts used for research (e.g., 'The British National Corpus is a key resource.').

Everyday

Extremely rare. An everyday speaker would say 'a collection of texts' or 'a database of writing'.

Technical

Standard in linguistics, NLP (Natural Language Processing), lexicography, and legal/financial documentation, with a distinct meaning in each field.

Vocabulary

Synonyms of “corpus”

Strong

text collection, language database, text archive

Watch out

Common Mistakes When Using “corpus”

  • Using 'corpus' in everyday speech where 'collection' or 'set' is appropriate.
  • Stressing the second syllable, as in /kərˈpʊs/, instead of the first: /ˈkɔː.pəs/ (UK) or /ˈkɔːr.pəs/ (US).
  • Being unsure of the plural: 'corpuses' is acceptable, but 'corpora' /ˈkɔːr.pər.ə/ is the traditional Latin plural and is more common in academia.
  • Using it as a synonym for 'corpse'.

FAQ

Frequently Asked Questions

Both 'corpora' and 'corpuses' are correct plurals. 'Corpora' is the original Latin plural and is more common in formal academic writing, especially in linguistics. 'Corpuses' is an accepted English plural and is also widely used.

A corpus is a particular kind of text collection, and not every text database qualifies as one. A 'corpus' specifically refers to a structured, representative, and often annotated collection of authentic language texts compiled for linguistic analysis. A general text 'database' might lack the design principles (balance, representativeness, annotation) of a linguistic corpus.

Using 'corpus' in everyday conversation is highly unusual and would sound overly technical. In everyday situations, you would use words like 'collection', 'set', 'bunch', or 'archive' (e.g., 'a collection of poems', 'an archive of letters').

In law and finance, 'corpus' refers to the principal or capital sum of an estate, trust, or endowment, as distinct from the interest or income generated from it. For example, 'The will stipulated that the corpus of the inheritance should be preserved for the grandchildren.'

'Corpus' is usually academic, technical, or formal in register.

In British English, 'corpus' is pronounced /ˈkɔː.pəs/; in American English, it is pronounced /ˈkɔːr.pəs/.

Phrases

Idioms & Phrases

  • No common idioms; the word itself is technical.

Learning

Memory Aids

Mnemonic

Think of 'corpus' like the 'corpse' of language – it's the collected 'body' of texts used for study. Both 'corpus' and 'corpse' come from Latin for 'body'.

Conceptual Metaphor

A CORPUS IS A BODY (of evidence/data). A CORPUS IS A REPOSITORY/VAULT (of linguistic material).

Practice

Quiz

Fill in the gap
To study how grammar is used in informal writing, the PhD student decided to ______ of blog posts and social media comments.
Multiple Choice

In which context is the word 'corpus' LEAST likely to be used correctly?