Wednesday, November 30, 2016

Siwi vocabulary for addressing animals

Probably every language has a certain number of forms used specifically for addressing animals, particularly domestic ones. In response to a recent query by Mark Dingemanse, I gathered together all the ones I happened to have recorded for Siwi - the list below is definitely not exhaustive, but should at least be suggestive. Note the sounds used - clicks do not usually form part of Siwi phonology!

To chicks:
didididididi: eat!

To cats:
ərrrr: come!
ǀǀǀǀǀ: come!
pss: move!

To dogs:
ʘʘʘʘʘʘʘ: follow me!

To goats:
əšš: go!
ħəww: go!
xətt: go!
kškškškškš: eat!

To donkeys:
ǁǁǁǁ: giddy-ap! (?)

The interesting question here is: to what extent are these arbitrary, reflecting an emergent cross-species convention just as most human lexemes do, versus to what extent do they reflect innate properties of animal perception and communication? How do they compare to those you've encountered, if any?

Tuesday, June 24, 2014

From Figuig to Igli: Berber in the Algerian-Morocco borderland

The number of good Berber descriptive dictionaries has been slowly but steadily increasing in recent years, but Hassane Benamara's new Dictionnaire amazigh-français : Parler de Figuig et ses régions (Rabat: IRCAM, 2013), a copy of which I was lucky enough to be lent recently, is surely one of the best. Apart from being quite unusually large (800 pages), it incorporates examples, multiple senses, pictures of items difficult to describe, and an appendix with encyclopedic information on culturally specific words such as festivals and children's games. It incorporates a few neologisms useful for schooling, but takes a fairly inclusive attitude towards Arabic loanwords. There are barely 15,000 people in Figuig, but, astonishingly enough, this is actually the second dictionary of Figuig Berber published by a native speaker; the first, Ali Sahli's معجم أمازيغي-عربي (خاص بلهجة أهالي فجيج) (Oujda: Al Anwar Al Maghribia, 2008), was a good effort, but is substantially shorter and uses a less accurate transcription. (There's even another linguist from Figuig, Mohamed Yeou, threatening to make a third dictionary – if he goes ahead with the project, he'll have a high hurdle to clear.)

Across the border in Algeria, the situation is rather different. A number of towns across a wide area around Bechar and Ain Sefra speak Berber varieties closely related to that of Figuig, collectively, if imprecisely, termed "Shelha". Some of them seem to be shifting to Arabic (on my latest trip, I was told that in Lahmar they had stopped speaking Berber with their children, and for Igli I had heard the same much earlier). But little effort – and no official effort, as far as I know – is being made to document them. The only (very) partial exceptions of which I am aware are Igli and Boussemghoun.

For Igli (population 7000), I have already described the local Scouts' efforts to put together an online dictionary. More recently, however, I came across a laudable local attempt at approaching the problem academically: Fatima Mouili's The Berber Speech of Igli, Language towards Extinction. After a very brief summary of Igli grammar and phonology, unfortunately made frequently illegible by font problems, the author discusses the reasons for language shift. Corresponding to my impressions for the region, including Tabelbala, she cites emigration and the desire to ensure educational success as important drivers; others are more surprising, including the immigration of refugees expelled by the French from a nearby village during the Algerian War of Independence. Apparently, her thesis discusses similar issues, for those with 59€ to spare...

For Boussemghoun (population 4000), a few articles and a book by Mohamed Benali may be cited, all focusing – as far as I can see – exclusively on the sociolinguistic situation of Berber in the town. A local Berber-language poet billed as "the Ait Menguellet of Boussemghoun", Bashir Oulhaj, has a considerable presence on YouTube, e.g. here; he's even been interviewed by Figuig News. The town seems to be treated as the centre for Amazigh identity in the region; the HCA has even organised a symposium there. Nevertheless, little if any descriptive work has been published on its variety of Berber.

Taken together, there are probably more speakers of Berber in southwestern Algeria than in and around Figuig. Why the difference, then? Is it because linguistics is better represented in Moroccan universities than in Algerian ones? (Notwithstanding some interesting work coming out of Algeria, I think that is fair – it would be hard to think of any linguist working in Algeria with a profile comparable to Abdelkader Fassi Fehri, for example.) Or is it because the Amazigh movement in Morocco is less closely associated with one side in the "culture war"? (Benali observes that, while most Semghounis wanted Berber to be taught in schools, they rejected the installation of an HCA office because they distrusted its politics.) Or are there more specific, purely local factors explaining the difference? That would be worth a study in itself – though perhaps not as much so as the Berber varieties in question!

Thursday, December 26, 2013

Does Arabic have the most words? Don't believe the hype.

For some time, I've been hearing rumours (from Arabs, of course) that Arabic has the largest number of words of any language. Recently I found one vector for this rumour: Comparison of the Number of Words in Languages of the World, a poster put together by Azzam Aldakhil which has the merit of at least giving the sources for its figures, namely Muʕjam ʕAjā'ib al-Lughah by Shawqī Ḥamādah, 2000. (In a follow-up comment he gives the page numbers, 83-84.) This poster claims that "Arabic has 25 times as many words as English".

Unfortunately for this claim, if you go to the book cited, what you actually find is a calculation of the number of possible roots in Arabic, without regard to whether or not the root actually has a meaning. Such a count includes huge numbers of unused roots such as بزح bzḥ or قذب qḏb, while at the same time lumping together all words derived from the same root; كتاب book, كاتب writer, and مكتب office are three words, but only one root. The result of such a calculation might tell us something about the potential for expanding Arabic, but absolutely nothing about the state of the Arabic language. And since in practice both Arabic and the languages it is being compared to on that poster allow arbitrarily long words without real roots, if only in loanwords, it doesn't even tell us much about its potential.
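To make the difference between the two kinds of count concrete, here is a rough Python sketch. It is not the calculation in Ḥamādah's book: the alphabet size of 28, the function name possible_roots, and the choice of root lengths are my own assumptions for illustration. It counts every possible consonant sequence of a given length, meaningful or not, and then shows how the three words above collapse into a single root.

```python
from math import perm

# Assumption for illustration: 28 letters, each usable as a root consonant.
ALPHABET_SIZE = 28


def possible_roots(length, allow_repeats=True):
    """Count consonant sequences of the given length, ignoring meaning entirely."""
    if allow_repeats:
        return ALPHABET_SIZE ** length
    # Without repeated consonants: ordered selections of distinct letters.
    return perm(ALPHABET_SIZE, length)


# Three distinct words from the text, all derived from the same root k-t-b.
words_to_roots = {"كتاب": "ktb", "كاتب": "ktb", "مكتب": "ktb"}
distinct_words = len(words_to_roots)
distinct_roots = len(set(words_to_roots.values()))

if __name__ == "__main__":
    for n in (3, 4):
        print(f"possible {n}-consonant sequences: {possible_roots(n)}")
    print(f"{distinct_words} words, but only {distinct_roots} root")
```

Even restricted to three-consonant sequences, a count of this kind yields 21,952 possibilities, most of which (like بزح or قذب) correspond to no actual word, while a word count and a root count of the same vocabulary can differ by a large factor.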

Both the number of Classical Arabic roots with actual meanings and the number of words can be estimated from the classic dictionaries: according to Sakhr's statistics, there seem to be around 10,000 roots, and up to 200,000 distinct words. Roots don't play such a major role in the lexicography of most non-Semitic languages, so it's difficult to compare the number of roots cross-linguistically. But in terms of words, that would be slightly fewer than English (250,000 in the OED, although the poster cites 600,000) and slightly more than French (over 100,000 excluding proper nouns, according to the Académie Française).

However, such comparisons can hardly fail to be misleading. For one thing, English is much more hospitable towards dialectal and colloquial usages than Arabic is – the OED is full of words marked as Scottish or Northern or slang or whatnot, the equivalents of which would never be accepted by an Arabic dictionary. For another thing, the whole enterprise of counting words across languages runs into apparently insuperable problems, especially when it comes to compounds, which Arabic dictionaries do not normally treat as words. If you include compounds, then compound-friendly languages like German or Turkish or Inuktitut are automatically going to beat all the rest – and all the available statistics that I've seen for, say, English happen to include compounds.

So the best answer is that we don't really know, and that word count, even if we could measure it better, is not a very good measure of a language's expressive power anyway. Some missing words make a genuine difference, as I've discussed here before. But is English really missing out by not having distinct words for male camels (جمل) vs. female camels (ناقة)? Is Arabic really missing out by not having a special word for cornpone, or for scones?

Sunday, February 10, 2013

Kabyle vocab 1: Verbs of motion

I've been taking advantage of being in Paris to attend some Kabyle classes. However, the classes are in French - as are all the textbooks - and I find that I memorise vocabulary more easily when English equivalents are presented. So I'm going to experiment with writing up vocabulary lists and posting them online periodically, on the theory that these might be useful to Anglophone learners other than myself, and that putting them together will be good for my memory. For today, the theme will be verbs of motion. I find that knowing facts about a word's wider connections makes me more likely to remember it, but that may just be me, so if you don't, feel free to ignore them...

Go: ṛuḥ "go!", yeţṛuḥ(u) "he goes", iṛuḥ "he went". This verb, obviously, is borrowed from colloquial Arabic ṛuḥ (like its Siwi counterpart ṛuḥ, iteṛṛaḥ, iṛaḥ); it is quite commonly used, but there is a more purist alternative:

Go: ddu "go!", iṯeddu "he goes", yedda "he went". This verb is also used with the same meaning in Tashelhiyt; it's probably related to Tamasheq idaw, itidaw, ǎddew "accompany, go with". Example: Tom yebɣa ad yeddu ɣer Japun.

Come: as "come!", yeţţas "he comes", yusa "he came". This nearly pan-Berber verb is usually combined with the particle -d "hither (towards here)"; in Siwi, that particle has fused with the stem, yielding héd, itased, yused. Example: Yusa-d ɣer Japun asmi ay yella d agrud.

Pass: ɛeddi "pass!", yeţɛeddi / yeţɛedday "he passes", iɛedda "he passed". This verb, widespread in both Berber and dialectal Arabic, is from Arabic عدا "he passed", as the generally un-Berber ɛ betrays. Siwi retains fel, iteffal, yefla "pass / depart"; the rarer cognate verb (fel, yeffal, ifel) in Kabyle means "go over". Example: ɛeddaɣ fell-as deg wezniq.

Arrive: aweḍ "arrive!", yeţţaweḍ "he arrives", yebbʷeḍ (yuweḍ) "he arrived". Siwi instead uses an Arabic loan mraq, imerraq, yemraq; but it retains a causative of the original root, siweṭ. Example: aql-ik tuwḍeḍ-d zik.

Go up: ali "go up!", yeţţali "he goes up", yuli "he went up". The similarity to Arabic على is probably just a coincidence, since the Tashelhiyt equivalent is eɣli. Siwi uses an equally Berber but unrelated form wen, itewwan, yuna, also found in Tashelhiyt (awen); Kabyle retains a causative of this root, ssiwen "go up (eg road)", and a commoner noun, asawen "(up) a rising slope". Example: La ttalyeɣ isunan.

Go down: aḏer "go down!", yeţţaḏer "he goes down", yuḏer "he went down". Siwi again uses an equally Berber but unrelated form ggez, iteggez, yeggez, also found in Tashelhiyt (ggʷez). Example: La ttadreɣ isunan.

Go in: ḵcem "go in!", iḵeččem "he goes in", yeḵcem "he went in". The same verb is used in Tashelhiyt; Siwi uses a cognate form kim, itekkam, ikim. Example: Ttxil-k, kcem-d.

Go out: ffeɣ "go out!", iṯeffeɣ "he goes out", yeffeɣ "he went out". The same verb is used in Tashelhiyt, and (with a trivial regular vowel change) in Siwi f̣f̣eɣ, itef̣f̣aɣ, yef̣f̣aɣ. Example: Zemreɣ ad ffɣeɣ ad urareɣ?

Or, in a form more suitable for quick self-testing:

go up: ali
go down: aḏer
go in: ḵcem
go out: ffeɣ
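
For anyone who prefers to drill these at the keyboard, the list above can be turned into a tiny quiz. This is only a sketch: the names VERBS, check and quiz are made up for the purpose, and it covers just the four pairs listed here. Answers are Unicode-normalised before comparison, on the assumption that letters like ḏ and ḵ may be typed either as single characters or as a base letter plus a combining mark.

```python
import random
import unicodedata

# English gloss -> Kabyle imperative, copied from the list above.
VERBS = {
    "go up": "ali",
    "go down": "aḏer",
    "go in": "ḵcem",
    "go out": "ffeɣ",
}


def _norm(text):
    """Strip whitespace and normalise to NFC so ḏ/ḵ compare equal however they were typed."""
    return unicodedata.normalize("NFC", text.strip())


def check(gloss, answer):
    """Return True if the answer matches the Kabyle form recorded for this gloss."""
    return _norm(answer) == _norm(VERBS[gloss])


def quiz():
    """Ask for each verb once, in random order, and print the score."""
    glosses = list(VERBS)
    random.shuffle(glosses)
    score = 0
    for gloss in glosses:
        if check(gloss, input(f"{gloss}: ")):
            score += 1
        else:
            print(f"  -> {VERBS[gloss]}")
    print(f"{score}/{len(VERBS)}")


if __name__ == "__main__":
    quiz()
```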

Comments and suggestions welcome, especially if you speak Kabyle!

Tuesday, October 25, 2011

Berber dictionary online

A link I've been meaning to post for a while: Amawal n Tiddukla Tadelsant Imedyazen. The guy behind it, Omar Mouffok, deserves credit for his efforts to document Kabyle dialects outside of the mainstream, like the one spoken near Blida; many entries indicate which regions the word is used in, though unfortunately a fairly impenetrable system of abbreviations is used. Translations into French, Spanish, and Arabic are given for some words, but many are only given definitions in Kabyle.

Monday, October 03, 2011

Songhay online

The Northern Songhay family is of some general interest, both for the study of language contact - all its members are astonishingly strongly influenced by Berber and/or Arabic, to the point that only a few hundred Songhay words survive and much of the grammar has been replaced - and for understanding the history of the Sahara (they suggest both that the spread of Songhay predates the Songhay Empire and that a Berber language different from Tuareg used to be spoken in much of Mali and Niger). I've recently put together a sort of homepage for Northern Songhay linguistics: Northern Songhay. It includes a more or less complete bibliography.

Anyone interested in that will also be interested in a site I recently came across, offering lexicographical data, lessons, software, and some references focused mainly on the Songhay of Gao (Koyraboro Senni). I particularly appreciated the pictorial dictionaries under "Encyclopédie".

Wednesday, September 29, 2010

Small vocabularies, or lazy linguists?

In Guy Deutscher's new book The Language Glass (which I'll be reviewing on this blog sometime soon) he claims (p. 110) that "Linguists who have described languages of small illiterate societies estimate that the average size of their lexicons is between three thousand and five thousand words." This would be rather interesting, if verified - but this statement is not sourced at the back, and is in any case too vague (what counts as "small"?) to be relied on as it stands. Does anyone have any idea where he might have got this figure?

I haven't found his source, but Bonny Sands et al.'s paper "The Lexicon in Language Attrition: The Case of N|uu" gives a nice table of Khoisan dictionaries' sizes, ranging from 1,400 for N|uu to under 6,000 for Khwe and 24,500 for Khoekhoegowab. She prudently concludes "The correlation between linguist-hours in the field and lexicon size is so close that no conclusions about lexical attrition can be drawn" - the outlier, Khoekhoegowab, is not only the biggest of the lot (with over 250,000 speakers), but had its dictionary written by a team including a native speaker over the course of twenty years. Given that "2,000 - 5,000 word forms (in English) may cover 90-97% of the vocabulary used in spoken discourse (Adolphs & Schmitt 2004)", it is not surprising that it should take disproportionately long to move beyond the 5,000 word range. However, she also points out that "Gravelle (2001) reports finding only 2,300 dictionary entries in Meyah (Papuan) after 16 years of study", suggesting that some languages may simply have unusually small vocabularies. Along similar lines, Gertrud Schneider-Blum's talk Don’t waste words – some aspects of the Tima lexicon suggested that the Tima language of Kordofan had an unusually small number of nouns due to extensive polysemy and use of idioms (I can't remember any figures, nor indeed whether she gave any).
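
The coverage point can be illustrated with a small Python sketch. This is not Adolphs & Schmitt's method, and the corpus below is a toy one invented purely for illustration; the function name coverage is likewise my own. It computes what share of the running words in a text is accounted for by the N most frequent word forms.

```python
from collections import Counter


def coverage(tokens, top_n):
    """Share of running tokens accounted for by the top_n most frequent word forms."""
    if not tokens:
        raise ValueError("need at least one token")
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(tokens)


# Toy corpus (invented): a few very frequent forms plus a tail of forms seen once each.
tokens = ["the"] * 50 + ["of"] * 25 + ["goat"] * 15 + [f"rare{i}" for i in range(10)]

if __name__ == "__main__":
    distinct = len(set(tokens))
    for n in (1, 3, distinct):
        print(f"top {n} of {distinct} forms cover {coverage(tokens, n):.0%} of tokens")
```

In this toy corpus, 3 of the 13 distinct forms already cover 90% of the tokens, and the remaining 10 forms each turn up only once. A fieldworker collecting words from speech meets the frequent forms almost immediately, while each further rare form takes longer and longer to encounter.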

I'd be interested to see other discussions of the issue of differences in lexicon size and explanations for them. My Kwarandzyey dictionary (in progress) so far stands at about 2000 words - it would be encouraging to think that I might already have done more than half the vocabulary, but I very much doubt it!