corpus: meaning, definition, pronunciation and examples
C1
Academic, technical, formal
Quick answer
What does “corpus” mean?
A collection of written or spoken texts, especially one used for linguistic research or analysis.
Definition
Meaning and Definition
A collection of written or spoken texts, especially one used for linguistic research or analysis.
In linguistics: a systematic collection of authentic language data stored electronically for analysis. In law: the principal or capital of an estate. In anatomy/biology: the main part of an organ or structure.
Dialectal Variation
British vs American Usage
Differences
No significant difference in meaning or usage. Both varieties use the term identically in academic contexts.
Connotations
Neutral and technical in both varieties. Carries strong academic/professional associations.
Frequency
Equally low-frequency in general use, but standard in linguistics, computational linguistics, and legal/financial writing in both regions.
Grammar
How to Use “corpus” in a Sentence
- [verb] + corpus: build/compile/create/analyse/use/access/search a corpus
- corpus + [verb]: The corpus contains/shows/illustrates/provides...
- [adjective] + corpus: a large/small/electronic/annotated/historical corpus
- corpus + of + [noun]: a corpus of texts/speech/letters/novels
Vocabulary
Collocations
Examples
Examples of “corpus” in a Sentence
noun
British English
- The researcher compiled a corpus of 19th-century newspaper editorials.
- The court ruled that the trust's corpus could not be diminished.
- The linguistic corpus is freely available online for academic use.
American English
- Her analysis was based on a large corpus of American television dialogue.
- The endowment's corpus generates the income for our scholarships.
- Corpus linguistics relies on evidence from real language use.
Usage
Meaning in Context
Business
Rare. Might appear in legal/financial contexts referring to the capital or principal of a fund or trust (e.g., 'the corpus of the endowment').
Academic
Very common in linguistics, computational linguistics, and language studies. Refers to a collection of texts used for research (e.g., 'The British National Corpus is a key resource.').
Everyday
Extremely rare. An everyday speaker would say 'a collection of texts' or 'a database of writing'.
Technical
Standard in linguistics, NLP (Natural Language Processing), lexicography, and legal/financial documentation with distinct meanings for each field.
Vocabulary
Synonyms of “corpus”
Strong
Watch out
Common Mistakes When Using “corpus”
- Using 'corpus' in everyday speech where 'collection' or 'set' is appropriate.
- Mispronouncing it with the stress on the second syllable, as /kərˈpʊs/, instead of /ˈkɔː.pəs/ (British) or /ˈkɔːr.pəs/ (American).
- Confusing the plural forms: 'corpuses' is acceptable, but 'corpora' /ˈkɔːr.pər.ə/ is the traditional Latin plural and is more common in academia.
- Using it as a synonym for 'corpse'.
FAQ
Frequently Asked Questions
Is the plural of 'corpus' 'corpora' or 'corpuses'?
Both are correct. 'Corpora' is the original Latin plural and is more common in formal academic writing, especially in linguistics. 'Corpuses' is an accepted English plural and is also widely used.
What is the difference between a corpus and a database?
A corpus can be regarded as a specialised kind of text database, but not every database is a corpus. A 'corpus' specifically refers to a structured, representative, and often annotated collection of authentic language texts compiled for linguistic analysis. A general text 'database' might not follow the design principles (balance, representativeness, annotation) of a linguistic corpus.
Can I use 'corpus' in everyday conversation?
It is highly unusual and would sound overly technical. In everyday situations, you would use words like 'collection', 'set', 'bunch', or 'archive' (e.g., 'a collection of poems', 'an archive of letters').
What does 'corpus' mean in law and finance?
It refers to the principal or capital sum of an estate, trust, or endowment, as distinct from the interest or income generated from it. For example, 'The will stipulated that the corpus of the inheritance should be preserved for the grandchildren.'
What does 'corpus' mean?
A collection of written or spoken texts, especially one used for linguistic research or analysis.
What register does 'corpus' belong to?
'Corpus' is usually academic, technical, and formal in register.
How is 'corpus' pronounced?
In British English it is pronounced /ˈkɔː.pəs/, and in American English it is pronounced /ˈkɔːr.pəs/.
Phrases
Idioms & Phrases
- None directly; the word itself is technical.
Learning
Memory Aids
Mnemonic
Think of 'corpus' like the 'corpse' of language – it's the collected 'body' of texts used for study. Both 'corpus' and 'corpse' come from Latin for 'body'.
Conceptual Metaphor
A CORPUS IS A BODY (of evidence/data). A CORPUS IS A REPOSITORY/VAULT (of linguistic material).
Practice
Quiz
In which context is the word 'corpus' LEAST likely to be used correctly?