A quick intro to language analysis

This page gives you an idea of what it takes to solve an olympiad problem (thanks to Joseph Jojoe, in Year 11, for suggesting it). The main skill you’ll need is the ability to analyse patterns in language data – in short, the skill of language analysis. This is the basis for linguistics, so once you’ve developed your language-analysis skills, you’re a linguist (= one who does linguistics) – ready to study language and even to do good research.


It’s all about looking behind the raw data to find how the language system works. Imagine a strange language in which adding an -s makes some words singular and others plural; how can that be? What’s going on? In case you haven’t noticed, that strange language is English, where an odd old sock smell+s [singular], but many old sock+s [plural] smell even worse. Why? You could start with the difference between nouns and verbs, and write some rules. Then you might notice some exceptions (e.g. smelled or will among the verbs and mice among the nouns), and wonder whether your units should be endings like -s or abstract features like ‘plural’. But the basic work is done: you’ve found the rules, so, as in any good science, you can now make testable predictions: given the noun gug and the verb chortle, you can predict A gug chortle+s but Gug+s chortle.
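To make the idea of a testable prediction concrete, here is a minimal Python sketch of the -s rule just described. It assumes only regular nouns and verbs in the present tense with a third-person subject, so exceptions such as mice or will are deliberately left out; the function name `agree` is invented for illustration.

```python
def agree(noun: str, verb: str, plural: bool) -> str:
    """Build a simple present-tense clause using the regular -s rule.

    Assumption: both words are regular, so exceptions such as
    'mice' or 'will' are not handled.
    """
    if plural:
        # On a noun, -s marks the plural; the verb stays bare.
        return f"{noun}+s {verb}"
    # With a singular subject, the -s goes on the verb instead.
    return f"a {noun} {verb}+s"


# Predictions for words the rule has never seen before.
print(agree("gug", "chortle", plural=False))  # a gug chortle+s
print(agree("gug", "chortle", plural=True))   # gug+s chortle
```

The point of the sketch is that the rule is general: it produces the right forms for gug and chortle even though neither word was in the data used to find the rule.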

In short, you have to ‘crack the code’, using the data as a source of clues to what lies behind – just like a detective investigating a crime, a doctor diagnosing an illness, a scientist studying an unfamiliar substance or species, or an explorer finding a way through unknown territory. Welcome to the science of linguistics!

The olympiad

An olympiad problem, unlike the English example above, presents unfamiliar data, and the challenge for you is to discover the system that lies behind the data; so it’s not a test of your modern foreign language (MFL) skills.

Here’s an example, taken from a language spoken in the north-east of the Sudan, between the Nile and the Red Sea, by about a million people who are mostly nomads. They call the language ti bedawie, but it’s usually called Beja, based on its Arabic name. The language isn’t used in writing, so the first step in analysing it is to develop a way of writing it down; here we’ll skip that step and go straight into the grammar. Here are some Beja words:



kaam      a camel
uukaam    the camel
ragad     a leg
iragad    the leg
ikaami    my camel
meek      a donkey
______    the donkey
laga      a calf
______    the calf

Your task is to crack the code to the extent of being able to fill the two empty cells. If you look at the data (i.e. the words that are given to you), the answer may leap out at you; if so, lucky you! If not, here’s how to work it out.

  1. What’s the missing bit of information? Each empty box needs the Beja for ‘the X’, where X is actually given in the cell above – meek for ‘donkey’ and laga for ‘calf’; so what’s missing is the way to express the meaning of English the.
  2. If you look through the earlier forms that translate the X, it’s easy to work out that the meaning ‘the’ is expressed by a prefix attached to the noun; but in some cases, this prefix is uu- (uukaam) and in others it is i- (iragad, ikaami).
  3. Why does the form of this prefix vary?
    1. Is it because of the choice of the noun (as with gender in languages such as French)? No, this can’t be so because uukaam and ikaami both contain the same noun (meaning ‘camel’) but have different prefixes.
    2. What else might be relevant? If you compare the alternatives, they vary in length: uu, i. So maybe it has something to do with the length of the word to which the prefix is added? This is more promising because uu-kaam has just one syllable after the prefix whereas iragad and ikaami have two. At least all the data are consistent with the following rule:
      • Add uu- before one syllable and i- before more than one syllable.
  4. You can now fill the empty cells: uumeek and ilaga. Job done!
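The rule found in step 3 can be written as a short Python sketch and checked against the data. It rests on one simplifying assumption that is not part of the original analysis: a syllable is counted as one unbroken run of vowel letters, so the long vowel written aa counts once. The names `count_syllables` and `definite` are invented for illustration.

```python
import re

VOWELS = "aeiou"


def count_syllables(stem: str) -> int:
    """Count syllables as runs of vowel letters (a simplifying assumption)."""
    return len(re.findall(f"[{VOWELS}]+", stem))


def definite(stem: str) -> str:
    """Apply the rule: uu- before one syllable, i- before more than one."""
    prefix = "uu" if count_syllables(stem) == 1 else "i"
    return prefix + stem


# Check the rule against the forms given in the data
# (what follows the prefix -> the attested prefixed form).
ATTESTED = {"kaam": "uukaam", "ragad": "iragad", "kaami": "ikaami"}
for stem, form in ATTESTED.items():
    assert definite(stem) == form

# Predict the two missing forms.
print(definite("meek"), definite("laga"))  # uumeek ilaga
```

The loop over the attested forms is the important part: a proposed rule has to account for all the data you were given before you trust its predictions for the empty cells.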

Hopefully that wasn’t too difficult; but of course the actual problems in the competition can be very much harder – as hard as any problem in (say) the Mathematics Olympiad.

If you think language analysis sounds cool, here are some links for you to follow: