Any concrete answer to this is actually false! Arabic is far too flexible to have its vocabulary counted as a fixed number of words. There are “accurate” answers, though, about how many words there are in different Ma`ajim (plural of Mu`jam, a collection of words organized and categorized in alphabetical order). An “oh-so-specific-and-logical” answer, which is more of an assumption to me, is that Arabic has anywhere from 100,000 words (if you are very stingy), to just a few million (if you disregard all words not currently in use), to 12 million (under one restrictive rule), to 90 million, and even 500 million words! All of these numbers are the results of linguistic inquiries conducted in the past, and the number varies depending on what you count as a word.
So, why did I say we can never know? It’s because Arabic builds on a three-, four- or five-letter root. In its simplest past form the root has a meaning in itself, for example: كَـتَـبَ KaTaBa (a = short vowel). This verb means “wrote”, with no indication of gender, duality, plurality, direction of action or emphasis yet… This word can now generate at least 30 other meaningful words! Just put that small root into DOZENS OF word scales/patterns (attaching a root to a certain scale/pattern in itself implies a specific kind of word).
What on Earth does that even mean? Let’s take KaTaBa and plug it into a dozen different scales:
×You can create several verbs that all simply mean to write, but in different tenses and with different doers: كَتبتَ، كَتَبَتْ، كتبتي، كتبتم، كتبتما، كتبتن، يكتبون، تكتبن. But literally, that’s not even half of the possible words. They’re all distinct from one another, and accurate to the letter! So if you use one of them, it won’t be confused with another. There’s no redundancy among the different versions of one verb: each variation serves a certain purpose, reflecting a different gender, singular or plural, past, present, future or more, etc.
× A dozen nouns and adjectives:
1- doer = [KaTeB] = كاتب = writer (Scale: فاعِل - Fa3el)
2- object = [maKToB] = مكتوب = written/message (Scale: مَفعول - maF3ool)
3- different object = [KeTaab] = كِتاب = book (Scale: فِعال - Fe3aal)
4- another object = [maKTaBah] = مكتبة = library (Scale: maF3aLah)
Additionally: كُتاب، مَكتَب، إكّتِتاب، مُكاتبة، كُتَيّب ….. It DOESN’T stop any time soon, all kinds of words related to writing you can imagine can somehow all be expressed using a three-sound/three-letter -root, put in so many shapes and forms!
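If it helps to see the "root into scale" idea mechanically, here is a minimal Python sketch. It is only an illustration on the Latin transliteration used above, not a real morphological analyzer: the function name apply_pattern and the convention that F, 3 and L mark the three root slots are my own assumptions, and the spelling of the results follows the scale literally (so it prints maKTooB where the list above writes maKToB).

```python
def apply_pattern(root, pattern):
    """Fill a transliterated scale with the consonants of a three-letter root.

    Assumed convention (hypothetical, for illustration only): in the pattern,
    'F', '3' and 'L' stand for the first, second and third root consonant;
    every other character is copied unchanged.
    """
    if len(root) != 3:
        raise ValueError("this sketch only handles three-consonant roots")
    slots = dict(zip("F3L", root))
    return "".join(slots.get(ch, ch) for ch in pattern)


root = ("K", "T", "B")

# The four scales from the list above, with the labels used there.
scales = {
    "Fa3eL": "doer",
    "maF3ooL": "object",
    "Fe3aaL": "different object",
    "maF3aLah": "another object",
}

derived = {scale: apply_pattern(root, scale) for scale in scales}
for scale, word in derived.items():
    print(f"{scale:10} -> {word:10} ({scales[scale]})")
```

Swapping in another root, for example ("D", "R", "S"), reuses the same scales unchanged, which is the whole point of the system. Whether each result is a word people actually use is a separate question that the code cannot answer.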
So, there is no correct answer, as a lot of roots can be put into any scale and give a lot of grammatically correct words, even though a given word may never or only rarely be used locally… It would still be a valid and comprehensible word. Example: كَتُوب [KaTooB] - this word would mean someone who writes a lot. I don’t remember reading it before, but I doubt anyone who knows proper Arabic would find it difficult to understand what it means.
The other problem is when people count from معاجم [Ma`ajim]: a Mu`jam is not a collection of all the words in a language… Arabic changed enormously, yet remained untouched, meaning that Standard Arabic is still almost the same, while the different local Arabic dialects were also born, with a simpler vocabulary that is more modern and has more borrowed and Arabized words!
The correct answer: You can barely count how many scales there are, then you’ll have to count how many roots there are. Multiplying the two will give you only the maximum possible number of words from known roots, and it’s 100% not accurate, because not all of them are used (although they all have meanings and can be used!). Also, a lot of words don’t get counted (for being too similar to another word) or get counted twice in two close cases (because the word for “the two boys lived” is different from the word used for a group of women, or for a gendered human name/adjective, or an animal, etc.)!
This is more reasonable: usable English today has more words than the Arabic dialects, but almost 100% surely Arabic in its classical form, Fus7a, has a little more than English.
Thanks for asking such a hard question, I seriously had fun going through this issue; follow my blog about the Arabic language :D No self-promo intended!
First I want to tackle why the answer to this question depends on methodology.
Short answer: It depends who’s asking. For ideological reasons, Muslims want to slant this number upwards—but I’ll get to that later.
This is a very difficult question to answer, for a few reasons:
You have not defined your temporal parameters. Arabic has been written for over a thousand years.
You have not defined dialect parameters. Ethnologue classifies Arabic as a macro-language which includes about 30 languages. It is very unscientific to pretend you can arrange all of these into one dictionary and say that they are one language—or to pretend that they are “not a language,” which is a deprecation sadly heaped on spoken Arabic by its own speakers.
You have not defined lexicographic parameters. Just because one root, k - t - b, can be stretched into 1000 forms, does not mean that this root counts as 1000 words. That’s ridiculous. So you would have to create some constraints and support them with science.
With the most permissive lexicography you could arrive at a number in the millions, but this is surely based on a methodological bias and is not backed by real science.
Why, then, do people ask this question?
The only reason Muslims ask this question is because they have been taught the idea that Arabic is the divine language, and God only accepts prayer in Arabic. Here are some of the related ideas behind that language ideology:
Richness of vocabulary. They believe that having the most words means it is somehow the best language. This in itself is a linguistic misconception. Guy Deutscher, for instance, in The Unfolding of Language, shows that the average Swedish peasant had a larger vocabulary than Shakespeare. Vocabulary varies from language to language and it is no objective measure of the eloquence (or divinity) of the language. In fact, richness of vocabulary can serve to obscure meaning.
Anatomical determinism. Muslim Arabs where I live think that Westerners are anatomically unable to produce the sounds of Arabic, which is patently false. With training, anyone can produce the sounds of Arabic—which is not true of Czech, an Indo-European language.
Uniqueness of sounds. Arabs teach that Arabic is superior in the uniqueness of its sounds, but that’s not true. You can read about the Northwest Caucasian language Ubykh and its pharyngealized consonants, or the fascinating Khoisan languages of southern Africa with their click sounds.
Syntactical complexity. Classical Arabic is extremely complicated in its verb morphology, but there are many, many other languages with more complicated systems. (Again, look at the polysynthetic languages of Central Asia and the Caucasus region.)
Regularity. Part of the assumed elegance of Classical Arabic is its regularity, but this is very misleading. First of all, the Qur’an has unexplained forms that, were they not in the Qur’an, would be plainly called “irregular.” (It breaks its own rules, in a sense.) Second, Modern Standard Arabic (often conflated by native speakers with Classical Arabic) is a reconstructed language, and therefore each lexeme can regain its regularity of forms by analogy with other lexemes.
Old Arabic is globally unique in the richness of its vocabulary and expression; yet so is Piraha of Brazil, or New Zealand English, or Tunisian Arabic for that matter.
Actually this is an interesting question for somebody who doesn't know Arabic, in short: You can say that Arabic has (according to different Arabic sources listed at the end) between 90 million words and 500 million words.
Now let's delve into the detailed answer:
Someone who knows Arabic very well knows for sure that it's really hard to estimate the number of Arabic words, because the language depends on the Juthoor system (phonetic transliteration "ʒʊðu:r"; I would translate it as the Radices/Origin Verbs system). So what is this system about?
The answer is: most authentic Arabic words have a raw form, which is a raw verb of only three letters. From it are derived the nomen agentis (active participle), nomen patientis (passive participle), nomen verbi (infinitive/gerund), hyperbole (exaggeration) forms, similes, comparative and superlative adjectives, feminine and masculine forms (which differ in Arabic) and, in brief, all morphological forms and conjugations. They all share one unit, the Jather, or radix ("ʒʌðr", singular of Juthoor). Obviously this makes the compendium of words derived from the tri-literal Juthoor vast, not only from a mathematical point of view but also when looking at the linguistic reality of Arabic.
Arabic has not only tri-literal radices but also tetra-literal, fewer penta-literal and, rarely, hexa-literal radical verbs. As far as I know there are around 8600 tri-literal Juthoor in Arabic, so you can imagine that there are practically endless possibilities to create different words. But the story doesn't end here: Arabic also has diacritics/accent marks that often produce a new word although the letters of the word are unchanged. I can also add that there is a considerable number of words that do not depend on any radix; they are stand-alone words, like the word "bird" in English and the word "Kursi", meaning a chair, in Arabic.
After this clarification the questions to be asked are: How many Juthoor does the Arabic language have? How many "Stand-alone" words does the Arabic language have?
In retrospect, I have not had the opportunity to look at many of the world's languages, but I deeply believe Arabic is one of the most dynamic and plastic languages of humankind, if not the most ever. When I reflect upon the greatness of the Arabic language I cannot imagine how my ancestors managed to improvise such a great, intricate language system (sorry for sounding subjective in this part of the answer).
You can think of Arabic as a programming language with a large API which has functions/procedures called Radices(one: radix) where you pass what you intend to talk about(statement/expression) with parameters (such as specific diacritics) to have at the end one result or maybe multiple valid results!
I would say that it has at most 100,000 actual usable words.
Some people have given some accurate descriptions of the morphology of Arabic in here. However, if you look at all the judhuur (جذور) in the Arabic system that are found in lisan al-arab (i.e. the biggest Arabic dictionary, for whoever doesn’t know it), you will very rapidly realize that there are probably just at most 1000 consonant roots that are viable and proliferating.
12 million words? 90? 500???? Come on…
The patterns of word construction are not all so common. I can think of at most 100 patterns (فاعل، تفعيل، مفاعلة، استفعل, etc.) that produce commonly used words in Arabic.
When referring to words, you must think that words are carriers of meaning. And out of those many millions of possible words created through the root system and the Arabic patterns, a mere 5% (if lucky) are usable even in the most complex contexts.
Remember that words in Arabic have even tens of meanings at times, rendering the need of building/creating new words obsolete.
I am a professor and interpreter of Arabic and I’m using it in all possible domains.
I can tell you that I use almost the same words for mathematics, judicial or talk-shows.
I will give you just ONE example, so you can understand how little is the number of words needed in Arabic. Look at the meanings attributed to ONE verb from:
P.S. Being Semitic languages, Aramaic/Syriac and Hebrew have almost the same number of words also, if you count all their roots and morphological patterns: they have words by the tens of millions also :)) lol… what a joke
As for how many words, it seems rather impossible for us to know.
I just went over a 500-page book that only accounts for 3- and 4-lettered verb roots, a total of around 15,000, along with their standard conjugations, which ranged in the hundreds for each verb.
Conjugations included the following sets:
Past, present, future, and command tenses.
Tables associated with known subject, and separate tables for the unknown subject.
Singular, dual, and plural.
Masculine (M) and Feminine (F) genders, for each and every word.
And for all 14 pronouns: I, you, he, she, we dual F & M, we plural F & M, they etc...
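To get a feel for how tables like these multiply, here is a rough Python count. It is a naive sketch, not a description of the book: it uses only the figures given above (four tenses, two kinds of subject, 14 pronouns) and assumes that number and gender are already folded into the pronouns. It also assumes every combination exists, which overcounts, since a command, for example, is only addressed to a second person.

```python
from itertools import product

tenses = ["past", "present", "future", "command"]
subject_kinds = ["known subject", "unknown subject"]
pronoun_slots = range(14)  # the 14 pronouns, numbered rather than named

# Every (tense, subject kind, pronoun) combination is one cell in a table.
cells = list(product(tenses, subject_kinds, pronoun_slots))
print(len(cells))  # 4 * 2 * 14 = 112
```

That is a little over a hundred cells for a single verb pattern, before any derived patterns of the same root are added, which fits the "hundreds for each verb" range mentioned above.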
The book, published in 1991, didn’t include word counts, and it also doesn't include nouns and adjectives (for instance, there are over 5000 names and adjectives for camels and over 1000 for lions!)
Further, the language is generative: new words are continuously derived from original roots and are perfectly accepted linguistically, as long as they obey the rules of derivation and come from an original Arabic root (not Arabised from another language, as is the trend nowadays).
Historically, when linguists started deriving and recording grammatical rules from the Qur’an and ancient poetry, they devised a tool, a gauge, a single word to measure the conjugations of derived words, and believe it or not, it works perfectly well for all kinds of words in Arabic, except for original nouns/adjectives/pronouns.
This word is فعل, from the three letters ف ع ل; it means “verb” or “to do.” Besides its multiple possible pronunciations (depending on short vowel diacritics), these three letters can even be shuffled to give new sets of verbs, nouns, and adjectives (علف فلع عفل لفع), again with their own sets of diacritics and conjugations.
Conjugation in English, for instance, can go up to maybe ten words. In Arabic, it goes up to hundreds, and over a thousand for most roots, all from one single 3-lettered root.
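The shuffling of the three letters can be enumerated directly. The short Python sketch below only lists the possible orderings of ف ع ل; it does not check which of them are attested roots, and the variable names are mine.

```python
from itertools import permutations

letters = ["ف", "ع", "ل"]

# All orderings of the three letters, starting with the measure word itself.
orderings = ["".join(p) for p in permutations(letters)]
print(len(orderings))  # 3! = 6
for word in orderings:
    print(word)
```

Three letters give six orderings: the measure word itself plus five others, four of which are the ones listed above.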
Here are just a handful of hundreds of the above “measure” word:
فعل يفعل يفعلا يفعلون تفعلون فعلت فعلن فعلتا يفعلن تفعلن
Oh my god, well, I think that the number is undefined. Let's start: the Arabic language is built on 3 kinds of words: verbs, nouns, and letters (particles).
Each of them has a large spectrum of variation especially verbs and nouns that have roots, each root can be inflected in different pattern and each pattern can be conjugated in 154 forms and we can add to each form the convenient pronoun (18 pronouns) and …
The most complicated issue is the short vowels, and it is very important because they can change the meaning of the word (two words with the same letters but different short vowels are different). Words can be written fully or partly vowelled.
Example: a verb, darasa دَرَسَ (root d-r-s, دَرْس), has at least several patterns: dArasa, tadArasa, darrasa, … Each one has a different form of conjugation, can be an independent lexical entry, and has a different meaning or function. We have 22000 verbs and each has 2 participial nouns that have 18 variations … and these can be combined with pronouns.
In short: billions of words, and hundreds of billions if we consider short vowels as variations of words.
Arabic is a non-concatenative language. It can be described as a derivational language, meaning that the morphotactics depend rather on affixation, i.e. adding morphemes onto the word without changing the root, that is, preserving the core order of the verb binyanim. This results in the highly regular inflectional pattern distinguishing the language.
Deriving a full-form lexicon on a root-based algorithm e.g from a root like (كسر) [ksr] "to break" will yield roughly 30,000 conjugations/inflections which is the thorough listing of the verb/noun paradigm, this is theoretically true for any other triconsonantal sound root.
Now, not all roots are triliteral nor sound, this will reduce the number above significantly, also be informed that the Modern Standard Arabic uses roughly between 6,000 - 7,000 roots to generate words through the technique mentioned before, you do the math.
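For anyone who wants "the math" spelled out, here is the multiplication as a small Python sketch. It uses only the two figures quoted above and treats every root as if it were a sound triliteral one, so the result is a ceiling, not an estimate; as noted, the real number is significantly lower. The constant names are mine.

```python
FORMS_PER_SOUND_TRILITERAL_ROOT = 30_000  # figure quoted for a root like k-s-r
ROOT_RANGE = (6_000, 7_000)               # roots used by Modern Standard Arabic

# Upper bound only: assumes every root yields the full 30,000 forms.
low, high = (FORMS_PER_SOUND_TRILITERAL_ROOT * n for n in ROOT_RANGE)
print(f"{low:,} to {high:,} generated forms at most")
```

Note that these are inflected forms, not dictionary entries, so the figure says more about the productivity of the system than about vocabulary size.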
You also should add a considerable amount of loanwords on top of that orthographic pile.
I haven’t found a reliable answer online or in any book I have. Any number above a million is unlikely to be correct, because that would suggest that English has far fewer words than Arabic. That is hard to believe, because English has been the lingua franca of the world over the past century, and it is unlikely to have reached that position with a vocabulary far smaller than that of a language that lags behind it in that measure.
Therefore, I can say that it would be possible for Arabic to have far more words than English only if we count all the translated technical terms that aren’t in actual use in Arabic academia. All the words that have been added to Arabic dictionaries as a mere attempt to find an Arabic derivative equivalent to a word in the dictionary of another language, but that have never been used in any book or in the culture, should not be counted as actual words in the vocabulary. Likewise, we shouldn’t count any derivative of a word that is only possible due to the grammatical rules of the Arabic language but has no use or hasn’t been used.
Due to the previous concerns, I now say that we have no legitimate answer to this question.
Having said that, the grammatical rules of Arabic are so flexible and clear that when a term is created or derived accordingly, it’s almost always the case that you can easily understand the meaning of the term even if it is created on demand.
English and other European languages have a different type of process of creating new words that is other than deriving words from the grammatical rules of the language or from the common use of a new term. The other type is the process of deriving new words from combining Latin or Greek morphemes. In that case, Arabic lacks such an equivalent, therefore, it resorts to a literal translation of the morphemes of the English or any other European language if a direct translation doesn’t exist. Therefore, it is possible to find many technical translations in a technical or medical book that have never been used in academia or in practice.
For that reason, and until I can find a reliable and citeable study that has been done on this subject, I can only say that Arabic is likely to have a number of words within the vicinity of the English language size of vocabulary.
If anyone can answer that last stated concern (in the previous paragraph), please include the answer in any answer that claims Arabic to have any number of words that is more than 50 percent above the number of the words in the English vocabulary.
I don't know Arabic, so I'll have to give a general answer. Most languages have as many words as needed by the persons using them. If somebody needs more words, they invent them. They may borrow a word from another language, or they may construct one, or they may do both. When my Norwegian language needed a word for computer - because computers had been invented - we borrowed the English word that the inventors used, while Icelandic constructed a totally new word, tölva. So we both got one new word.
Then there is the question of how to count the words. Arabic, as another answer tells us, can construct new words from its "roots". Norwegian constructs new words by putting old words together, writing them as one. English puts words together, too, but writes them separately. Norwegian "bilvask" is constructed exactly like English "car wash", but written without a separation. Should we count "bil", "vask" and "bilvask" as three words, and "car wash" as just two?
And there is the question of how many words people actually use and how many are just in dictionaries. In what sense are the funny words in Shakespeare's plays actually part of English today?
Arabic is a language that has been in use for more than fourteen hundred years, in many fields, not least religion and poetry, and it is able to construct new words from existing ones. In view of the remarks I have made, I would suppose there are lots of words that have been used or could be used in Arabic, but not all of them are widely used or commonly understood. How do we get that into our counting?