Saturday, January 07, 2017

Of words and pens

In Algerian Arabic, this is a stilu ستيلو - a word instantly recognizable as a borrowing from French stylo:

In Standard Arabic, on the other hand, as any Algerian learns in primary school, it's a qalam قَلَمٌ. This, as it happens, may also be a borrowing, though a much older one; compare ancient Greek kálamos κάλαμος "reed, reed-pen", which apparently has an Indo-European etymology. Clearly, either pre-modern Algerians were so sunk in illiteracy as to have forgotten the word for a pen altogether, or they replaced a pre-existing word for pen with a French borrowing - right?

Well, no. In the Middle Ages, there weren't too many fountain pens or biros around. Classical Arabic qalam referred to something more like these:

Any Algerian who went to Qur'anic school up to the 1960s or so will remember this - a simple reed pen anyone can make using nothing more complicated than a sharp knife. (The Algerian version was a bit different than those in the picture, as it happens - usually people would use a quarter-circumference of a large reed, not the whole circumference of a small one.) More than that, they will remember what it's called: qləm قلم. There are probably people in Algeria who still use these, and very likely they still call them that.

But no one calls a modern industrial pen qləm. When industrial pens were introduced, sometime in the 19th century, ordinary Algerians ended up classing them as a new object, quite distinct from the reed pen despite their similar function, and deserving of an unrelated name. The guardians of Standard Arabic, on the other hand, decided to extend the reference of qalam to cover both. It may be no coincidence that French distinguishes calame from stylo, like Algerian Arabic, whereas English, like Standard Arabic, treats both as different types of pen.

Historical linguists regularly use lexical reconstruction to shed light on technological history, an approach called "Wörter und Sachen". This approach has been very fruitful in many cases. But, as this case illustrates, there are some pitfalls to watch out for: whether something counts as the same object or as a new one is a rather culture-bound question, and if investigators impose their own ideas about this on the situation they are investigating, they will get the wrong answer.

Tuesday, December 27, 2016

Too strong to get out

At four, my nephew speaks English (his dominant language) very well. He still shows some interesting divergences from the standard of those around him, though. Some are influenced by German (a close second): he uses "mine" as a determiner in English (like German "mein") rather than "my", saying things like "mine house". Others seem to result from language-internal overgeneralization, as when he said:
  • If I push the Lego box then the carpet will destroy. [intended meaning: be destroyed]
Presumably, he's interpreted "destroy" as a labile verb, like "open" or "burn".

At first blush, I thought the following sentence was another example of overgeneralization:

  • I'm too strong to get out, so you can't. [intended meaning: I'm too strong for anyone to get me out]
However, reflection suggests that this ought to be perfectly grammatical in English, since "get out" is already labile. "This stump is too heavy to pull out" works fine, so why not "I'm too strong to get out"? Yet, for me at least, the clause immediately receives a pragmatically absurd interpretation with "I" as the subject of "get out", and the obviously intended interpretation is barely accessible even when I've consciously concluded that it should be grammatically acceptable.

In terms of the classic Chomskyan analysis of control, the two interpretations correspond to different unpronounced pronouns PRO:

  1. I_i'm too strong [PRO_i to get out]
  2. I_i'm too strong [PRO_arb to get PRO_i out]
A lot of linguists really dislike the idea of an unpronounced pronoun. Whatever its psychological merits, though, this analysis has the advantage of suggesting why the first interpretation comes more easily than the second here: it only involves one empty pronoun, whereas the desired interpretation needs two. So if anything is going wrong in this sentence, it's not so much the syntax as the pragmatics: an adult speaker might be more aware that listeners could have trouble processing a clause of this form, and avoid it in favour of something less ambiguous. That would need empirical checking though.

Thursday, December 08, 2016

How Tunisia ruined its PISA performance

PISA 2015 is an OECD-run survey intended to evaluate education systems worldwide by giving the same test to (almost) all students of the same grade across a large number of countries and comparing the results. This year's results have gotten a lot of coverage, notably for the dismal performance of all the Arabic-speaking countries participating. The UAE did least badly in terms of combined scores, managing 48th place out of 70; it was trailed by Qatar (59th), Jordan (61st), Lebanon (65th), Tunisia (66th), and, most ignominiously, Algeria at 69th place, barely beating the Dominican Republic.

Laudably, PISA have made their science tests publicly available online in many languages, including four Arabic versions labelled Israel, Qatar, Tunisia, and the UAE - don't ask me what happened to Algeria, Jordan, and Lebanon. Browsing through these, one immediately notices that the Tunisian translation (unlike the Gulf ones) has a remarkable number of grammatical errors, typos, and phrasings so awkward as to be barely comprehensible. For instance:

  • Bird Migration 1: "يستعملون العدّ الذي يقوم به المتطوّعين" - wrong case: should be المتطوّعون
  • Bird Migration 1: extremely awkward phrasing: "هجرة الطيور هي حركة موسمية كبيرة، يتنقل أثناءها الطيور نحو أماكن تكاثرها أو هي تعود منها." ("Bird migration is a great seasonal movement, during which birds move to the places of their reproduction or they come back from them.") Contrast the clearer phrasing in the Qatar version: "هجرة الطيور الموسمية هي انتقال واسع النطاق للطيور من وإلى مناطق تكاثرها. وفي كل عام يتولى متطوعون إحصاء عدد الطيور المهاجرة في مواقع محددة."
  • Bird Migration 3: the bird's name is "الزقزوق الذهبي" in the text, but in the question it turns into "الزقزاق الذهبي".
  • Running in Hot Weather 1: Garden path title: anyone looking at "العدو في الطقس الحار" is going to read it as "the enemy in hot weather", at least until the context is established. Contrast the Qatari translation "الجري في الجو الحار", using a better known, graphically unambiguous term for "running".
  • Running in Hot Weather 1: Grammatical error in "يدل على ذلك {كمية العرق | ضياع الماء | درجة حرارة الجسم} العداء بعد ساعة من السباق": for the sentence to make sense (even in dialectal Arabic!), none of the alternatives should contain the definite article, since they form part of an idafa genitive. Contrast the Qatari version, which avoids the problem by putting "للعداء".
  • Running in Hot Weather 2: Garden path sentence: "شرب الماء خلال السباق يمكن أن يكون له تأثير على حصول تجفّف وضربة حرارة بالنسبة إلى العداء. أيّهما؟" Anyone reading this will start by reading the first word as šariba "he drank", giving "he drank water during the race, it can have an effect..." and only after the fifth word will they be in a position to read it, as intended, as "Drinking water during the race can have an effect on the occurrence of dehydration and heatstroke for the runner. Which of the two?" Having gotten that far, they'll still be given pause by the need to decide the intended referents of "Which of the two?" Contrast, yet again, the much easier to read Qatari version: "ماهو تأثير شرب المياه خلال الجري على تعرض العداء للجفاف وضربة الشمس ؟" ("What is the effect of drinking water during the race on the runner's exposure to dehydration and heatstroke?")

I could keep going, and no doubt more fluent Arabic speakers can find problems I haven't even noticed, but the pattern is clear: Compared to Qatari students, to say nothing of Western ones, Tunisian students were systematically disadvantaged in the PISA 2015 science tests by bad translation.

Whose fault is this? Clearly there was a failure at the level of PISA's international verification, which should have eliminated such problems. But the translations themselves are carried out at the national level (PISA 2012 Technical Report, Ch. 5). In other words, this mess was produced by Tunisian translators under the direction of the Tunisian government.

How is that possible? Simple: in Tunisia, appallingly enough, science is taught in French from the start of secondary school onwards. Science teachers have little need to keep up their Standard Arabic proficiency. Which raises the question of why this test, targeted at 15-year-olds, was administered in Arabic there to begin with.

Wednesday, November 30, 2016

Siwi vocabulary for addressing animals

Probably every language has a certain number of forms used specifically for addressing animals, especially domestic ones. In response to a recent query by Mark Dingemanse, I gathered together all the ones I happened to have recorded for Siwi - the list below is definitely not exhaustive, but should at least be suggestive. Note the sounds used - clicks do not usually form part of Siwi phonology!

To chicks:
didididididi: eat!

To cats:
ərrrr: come!
ǀǀǀǀǀ: come!
pss: move!

To dogs:
ʘʘʘʘʘʘʘ: follow me!

To goats:
əšš: go!
ħəww: go!
xətt: go!
kškškškškš: eat!

To donkeys:
ǁǁǁǁ: giddy-ap! (?)

The interesting question here is: to what extent are these arbitrary, reflecting an emergent cross-species convention just as most human lexemes do, versus to what extent do they reflect innate properties of animal perception and communication? How do they compare to those you've encountered, if any?

Tuesday, November 08, 2016

Some Dellys etymologies via Andalus

Looking through Corriente's etymological dictionary of Andalusi Arabic, I keep coming across explanations for obscure Dellys words whose origins had been a mystery to me. Corriente's etymologies are not always to be trusted - I've found several errors, most egregiously the attribution of kurānah كُرانة "frog" to Romance rather than to Berber - but the work remains very valuable. Here are a few etymologies that struck me.

  • l-ənjbaṛ لنجبار "maize" was originally anjibār أنجبار "snake-weed" (Persicaria bistorta), whose flowers look vaguely similar. This in turn comes from Persian angbār انگبار, which Corriente seems to derive from rang-bār رنگبار "many-coloured".
  • skənjbir سكنجبير "ginger" derives from some sort of popular confusion between two Arabic words: zanjabīl زنجبيل "ginger" and sakanjabīn سكنجبين "oxymel" (a mixture of honey and vinegar used medicinally). I assume the connection is that both are good for colds, but a quick search didn't turn up any actual evidence that oxymel was used for that purpose. Sakanjabīn is apparently from Persian سرکه انگبین serke angabin (Corriente gives the form sik angubēn) "vinegar honey", while zanjabīl is apparently, again via Persian, from Sanskrit शृङ्गवेर ‎śṛṅgavera.
  • fərnəħ فرنح "smile, laugh (of a baby)": cp. Andalusi farnas فرنس, Moroccan fərnəs فرنس; possibly, Corriente suggests, from Greek euphrosynē εὐφροσύνη "joy".
  • bu-mnir بومنير "seal" was very hard to elicit, since seals have been locally extinct for decades (they've nearly disappeared from the entire Mediterranean, in fact). However, it turns out to be correct after all: cf. Andalusi bul marīn بل مرين "sea lion", Maltese bumerin "seal". Corriente seems to take this as Romance *pollo marino "sea-chicken", but the first part of that at least is clearly implausible in light of the comparative evidence as well as of common sense; the second might be tenable, but I'm not sure.

On a not entirely unrelated note: for anyone who wants to explore the maritime terminology of Dellys in greater depth than I've ever been able to elicit, El-Bahri.net is a wonderful and unexpected resource.

Friday, November 04, 2016

Lingua Franca and Sabir in "Four Months in Algeria" (1859)

I recently finished reading Four Months in Algeria, a travel diary by the English Rev. J. W. Blakesley published in 1859. It's mostly rather superficial - he couldn't speak Arabic, and spent most of his time with French soldiers and German settlers - but enlivened by occasional insights. It contains little content of linguistic interest, but it does contain two brief passages in the pidgin still used for communication between North Africans and Europeans when neither spoke the other's language - call it Lingua Franca, or Sabir. Since it would take a brave creolist to plough through the whole thing just in the slender hopes of finding such material, I reproduce them here.

The first passage (p. 340) comes from the author's description of his journey from El Aria to a place called Embadis, both in the east of Algeria, during the month of Ramadan; it shows a curious combination of French, Arabic, and "classic" Lingua Franca:

The poor muleteers had not tasted food during the whole day ; and as soon as ever the sun dipped, they produced one or two flat cakes, and ate them with avidity, not however without first offering me a share. I of course declined to diminish their scanty store, and reminded them that I had breakfasted at El Aria. "Toi makasch tiene carême ; toujours mangiaria," said one of the poor fellows, in the polyglot dialect which is growing up out of the intercourse between the natives and the illiterate European settlers of the interior.*
* There are a few Arabic words which the European children habitually make use of at Guelma, even when playing with each other. Makasch, no, shuiya, gently, I found invariably took the place of the corresponding French terms. On the other hand the Arabs constantly use the words ora, hour, and buono or bueno, good, to one another. Iauh, yes, a Kabyle word, pronounced exactly like the German affirmation, is also very common among the lower orders of Europeans.

In this passage, "toi" (you), "carême" (fast), and "toujours" (still) are French, while "tiene" (have) is Spanish, and "mangiaria" (eat, or perhaps food?) is Lingua Franca (from Italian), and "makasch", being used as a simple negator, is Algerian Arabic makaš ماكاش "there is no" (I discuss the latter's history here). Despite the diversity of the lexical sources drawn on, however, the grammar - simple SVO with no subject-verb agreement - matches better with Lingua Franca than with any of the lexifiers.

The second (p. 419), from a country as yet unconquered by the French, shows no such admixture, corresponding perfectly to earlier descriptions of Lingua Franca in which it often appears as little more than Italian minus the morphology:

More than once have I found in Algeria the conventional civility of the Arab to an European change into an unmistakeable expression of goodwill, when it appeared that I was an Englishman ; and in Tunis a notification of the fact at once drew forth a "Buono Inglese ; non buono Francese," from the mouth of a native.

Tuesday, September 27, 2016

Two funny adjectives (?) in Algerian Arabic

In Algerian Arabic, as in any other Arabic variety, adjectives follow the noun. However, there is one exception to this rule: invariant quja قوجا or qŭjna قُجنا, "a huge". Thus we say ṛajəl kbir راجل كبير "a big man", but quja ṛajəl قوجا راجل "a great big man". Not only does this "adjective" precede the noun it modifies, it requires it to be made indefinite: you can say šrit quja ktab شريت قوجا كتاب "I bought a huge book", but if you want to say "I bought the huge book", there's nothing you can do but use a different adjective. *šrit quja l-ktab or *šrit əl-quja ktab or *šrit əl-quja l-ktab are all impossible. You can make quja قوجا follow the noun, but you have to use a different construction, equally unique to this "adjective": ṛajəl quja mən huwwa راجل قوجا من هو "a great big man", daṛ quja mən hiyya دار قوجا من هي "a huge house".

The origin of quja قوجا is clear: it comes from Turkish koca "large; husband", which in turn is apparently an early adaptation of Persian xɑje خواجه "master, gentleman". In Turkish, all adjectives are prenominal, so one could take that to explain its position in Algerian Arabic; but the indefiniteness requirement cannot be explained the same way, since a quick search suggests that Turkish koca has no problem combining with definite nouns (one finds phrases like bu koca dünya "this huge world"). However, it looks like Algerian quja has followed a trajectory very similar to Iraqi and Khaliji xôš خوش. It is not obvious to me why obligatorily indefinite prenominal adjectives should even be possible in a language that otherwise strictly requires adjectives to be postposed, much less why they should have to be indefinite in order to stay prenominal - but that's what it looks like....

The word məskin مسكين "poor (pitiable)" is not so unusual, lexically speaking; it's just about pan-Arabic. It combines just fine with definite nouns, and takes normal agreement (f. məskina مسكينة, pl. msakən مساكن). However, it has almost the opposite idiosyncrasy: it doesn't take the definite article, which would be obligatory with any normal adjective whose head is definite (and, if it comes to that, with a noun in apposition to a definite phrase as well). Thus we say bwəʕlam məskin maqdərš yji بوعلام مسكين ماقدرش يجي "poor Boualem couldn't come", even though we would say bwəʕlam əṭ-ṭwil بوعلام الطويل for "tall Boualem" (Boualem the-tall). Why? No idea. Suggestions are welcome!