Adjective comparison in African varieties of English

2. Literature review

The formation of comparative adjectives has been a topic of interest in English linguistics over the last two centuries, from both synchronic and diachronic perspectives. Most research into the diachronic evolution of comparative formation focuses on the history of synthetic comparison, the native form, and the progressive implementation of the analytic comparison. Given that English was historically a highly inflected language, the comparative system for adjectives was mostly inflectional in both Old English (Hogg 1992: 141) and Middle English (Lass 1992: 116). Although the analytic formation has existed since Old English, its use was very scarce until late Middle English, and the first attestations go back to the thirteenth century (Kytö and Romaine 1997: 330). From late Middle English until the seventeenth century, when the traditional rule that establishes a relatively stable complementary distribution between number of syllables and type of comparative arose, the two forms remained as alternatives (Lass 1999: 157). Priestley (1761 [1969]), one of the first authors to tentatively account for the choice of the analytic over the synthetic comparative form, referred to length as a determining factor for the distribution here: polysyllabic adjectives tend to take the adverb more more frequently to avoid difficulties in pronunciation (Wick 2005: 2), whereas monosyllabic adjectives tend to select the inflectional form, and this is the distribution acknowledged in most grammars of Present-day English (cf. Quirk et al. 1985: 461–463). For disyllabic adjectives, there is variation depending on their endings, as demonstrated by Kytö and Romaine (1997), among many others (see also Wick 2005). The criterion of length is already found in Sweet (1891).
For inflectional gradation, in addition to monosyllabic adjectives, Sweet (1891: 326–327) includes disyllabic ones that bear the stress on the second syllable (other than those ending in consonant clusters), as well as many with stress on the first syllable. Exceptions among the latter are adjectives ending in -ish, -s, and -st (e.g., foolish, nervous, and honest), which are frequently found in the analytic form to avoid the repetition of sibilant sounds, and those ending in -ful, -ing, or -ed (e.g., careful, boring, and tired), which also favor the periphrastic form.

Current views on comparative alternation in English also include length as a variable in the choice of comparative form, and indeed this remains the most frequently-cited criterion for learning gradation in texts of English as a Foreign Language (EFL). Typically, a distinction is made between monosyllabic, disyllabic and trisyllabic or longer adjectives. While most monosyllabic adjectives take the inflectional form, and trisyllabic or longer adjectives take the periphrastic variant, disyllabic adjectives are more frequently subject to variation (Quirk et al. 1985: 461–463). Such variation is often determined by the final segment of the adjective; for instance, the suffixes -y, -ow, -le, -er, and -re (e.g., angry, shallow, noble, clever, and mature) act as triggers for the choice of the inflectional form (Quirk et al. 1985: 462). Similarly, Huddleston and Pullum (2002: 1583) state that the main determinant in either allowing inflection or making it impossible in disyllabic adjectives is the ending of the lexical base. Hence, along with the number of syllables, the final segment of the adjective becomes a significant factor in comparative alternation. A number of more fine-grained studies claim that most adjectives ending in -ly favor the analytical form (Lindquist 1998). In line with such a claim are Bauer’s (1994: 57–78) findings on the adjectives costly, deadly, friendly, and kindly, which all favor the periphrastic form. Other studies which take the final segment of the adjective as a variable have proposed that adjectives ending in -y, other than those ending in -ly, take the inflectional form (Kytö and Romaine 2000: 307); also, adjectives ending in -le, excluding able, inflect for comparative formation (Kytö and Romaine 2000: 181).

The literature on comparative formation shows that variation between the inflectional and periphrastic comparatives is not determined solely by length but also by phonological factors, especially in disyllabic adjectives, and here a degree of disagreement arises. Mondorf (2003), in a very comprehensive study, shows that to account for the distribution of the comparative, it is necessary to not only consider phonological factors, but also morphological and syntactic ones. Hilpert (2008) goes on to confirm that both phonological predictors and structural factors are relevant, together with frequency of use (see Section 4).

Despite the lack of a clear consensus in previous research, when adopting a diachronic perspective, it is generally agreed that there is a progressive increase in the use of the periphrastic form in English, as part of a broader trend towards analyticization, observed from Old English to Modern English and continuing today (Leech et al. 2009: 264).

The expression of comparison in varieties of English around the world has received far less attention. Kortmann et al. (2020) document this in the Electronic World Atlas of the English Language (eWAVE), a comprehensive digital resource designed for the study and analysis of morphosyntactic variation of the English language worldwide. eWAVE is an interactive platform that integrates geographic, linguistic, and demographic data, allowing researchers to explore and visualize morphosyntactic features of English usage across different regions and communities around the world. Among the features included are the spread of the analytic form to theoretically synthetic domains (feature 80: regularized comparison strategies: extension of analytic marking), especially in monosyllabic adjectives, and the extension of the synthetic form to an a priori analytic domain (feature 79: regularized comparison strategies: extension of synthetic marking). A preliminary study on Asian Englishes (Bangladeshi English, Indian English, Pakistani English and Sri Lankan English) with data from GloWbE (Seoane and Suárez-Gómez 2023) shows that the analytic form of the comparative has extended to monosyllabic adjectives in all varieties, but also reports that this is more marked in Bangladeshi English, where six out of nine of the adjectives in the study (high, great, low, old, large and big) show values higher than in the other varieties analyzed. This has been interpreted as a result of transparency, in the sense that periphrastic forms are easier to learn and use than synthetic ones, and in Bangladeshi English input has been scarcer than in the other Asian varieties. This tendency towards analyticization, observed in the historical development of English, aligns with broader theories of language contact and adult second-language learning.
As Haspelmath and Michaelis (2017) acknowledge, analyticization is commonly observed in language contact scenarios influenced by European languages (e.g., European-based creoles), driven by the pursuit of increased transparency. Similarly, in adult second-language acquisition, learners often prioritize transparency to facilitate mutual intelligibility. This is often achieved through analytic structures.

In sum, there exists a rich literature on variation in the comparative formation in English, with length being the common factor for variant selection in all studies. Beyond this, Mondorf’s work (especially 2003) is perhaps the most comprehensive, in that it takes morphological, phonological, pragmatic, and lexico-semantic criteria to draw the widest and most complete picture of what determines inflectional and analytical comparison in English. That said, most research has concentrated on L1 varieties, particularly standard British and American English. Our aim here is to analyze variation in comparative formation in L2 varieties, more specifically in five African Englishes.

3. Methodology

The present section describes the methodology of the study, including data collection and analysis. The primary source is GloWbE, which was released in 2013 and is unique in that it allows for comparisons between different varieties of English, containing as it does around 1.9 billion words of web language from 20 countries (Davies 2013). Recognized as one of the largest and most diverse corpora of English, it contains texts from websites around the world, enabling researchers to study various English varieties. The corpus includes texts from different countries where English serves as a first or a second language. This global coverage provides insights into the linguistic features and usage patterns of English across different cultural and geographical contexts. Consequently, GloWbE is the most adequate corpus to analyze varieties of English worldwide. While other sources of data such as the International Corpus of English (ICE)² are also of utility for comparison of varieties of English around the world, ICE is much smaller than GloWbE and does not cover all the varieties studied in this paper. Therefore, GloWbE is currently the only source incorporating data from all the African English varieties examined here, rendering it indispensable for the objectives of this study.

In addition, we used eWAVE, an interactive database on morphosyntactic variation in spontaneous spoken English that maps 235 features from a dozen domains of grammar in 51 varieties of English and 26 English-based pidgins and creoles in eight Anglophone regions around the world (Kortmann et al. 2020). eWAVE was essential, both in terms of the choice of the varieties under analysis here (Englishes from South Africa, Nigeria, Ghana, Kenya, and Tanzania), and as a means of directly ascertaining how frequent specific features such as the synthetic and analytic marking in comparison are in different varieties of English. For example, feature 79 (regularized comparison strategies: extension of synthetic marking) illustrates the degree to which synthetic marking is found in adjectives which would typically take the analytic formation, as in He is the regularest kind of guy I know (Kortmann et al. 2020: feature 79). It is neither pervasive nor extremely rare in Tanzanian English. By contrast, it exists but is extremely rare in Black and Indian South African English, the two indigenized L2 varieties from South Africa included in eWAVE, and in White South African English, a high-contact L1 variety also included in eWAVE. The feature is absent in Nigerian and Ghanaian English. Finally, no information is available on this feature in Kenyan English. Feature 80 (regularized comparison strategies: extension of analytic marking), which deals with the degree to which analytic marking extends to contexts of synthetic marking, as in One of the most pretty sunsets (Kortmann et al. 2020: feature 80), is neither pervasive nor extremely rare in Black South African, Kenyan, and Tanzanian Englishes. It exists but is extremely rare in Indian South African and White South African English, and in Ghanaian English. Finally, it is absent in Nigerian English.

The African varieties selected for this study are all postcolonial, that is, they are varieties in countries where English is an official language which has coexisted with other local languages since it was introduced in the country. What these varieties have in common is that they have all achieved the phase of ‘nativization’ in Schneider’s ‘Dynamic Model’ (Schneider 2007: 113–238; Brato 2020: 378–380), a theoretical framework that describes the development of postcolonial Englishes from the foundation of a colony ––when English was introduced in the territory–– to the emergence of the new variety that eventually becomes the new norm. The phase of nativization is recognized as the one where English becomes entrenched in a local community as a native language. During this phase, the new variety of English is considered to undergo a significant adaptation and integration with the local linguistic and cultural norms, and this is manifested by showing heavy lexical borrowing and phonological, lexical, and grammatical innovations derived through contact with other indigenous languages. South Africa has even gone beyond this phase to move into the phase of ‘endonormative stabilization’ in Schneider’s model, which often occurs after independence and is characterized by the stabilization of the variety through codification brought about by dictionaries, writing, and grammatical descriptions. In all these varieties ––and taking into account that the language we are analyzing is language taken from the Internet–– external factors of current language change, such as Americanization and globalization, which are significant factors in the ‘Extra- and Intra-Territorial Forces model’ (Buschfeld and Kautzsch 2016), may also be in operation.

In the current analysis, seven adjectives were selected, with a focus on disyllabic ones, the group which shows most variation between analytic and synthetic comparison. In the selection, we took Mondorf (2003) as the point of departure and for comparative purposes among varieties. Firstly, the seven adjectives were chosen and classified according to their final segment: 1) disyllabic adjectives ending in <-ly> and <-y> (costly, deadly, and risky), 2) disyllabic adjectives ending in <-l>, <-le> (real and noble), and 3) disyllabic adjectives ending in <-er> (bitter and clever). Both synthetic and analytic forms of these adjectives were then searched in GloWbE.

The automatic search of the 14 strings (e.g., costlier, more costly, and the equivalent synthetic and analytic forms of the other six adjectives) in GloWbE yielded a total of 1,040 examples, which were individually reviewed to exclude false positives, such as those illustrated in (1)-(5). In (1), Bitterer is part of a proper name; in (2), more functions as a determiner, as in the noun phrase more real-life elements, rather than as a comparative adverb; in (3), more riskier is a double comparative, which combines both the synthetic and the analytic forms and whose analysis is beyond the scope of the present study; in (4), an <r> has been added to noble in the proper name Barnes & Nobler; and (5) illustrates quotations from sources which do not represent any of the geographic varieties under analysis. In the case of repeated examples, only one instance was included in the database.

(1) Andreas Bitterer, research vice president at Gartner, was quoted stating that [GloWbE ZA]

(2) Gamer’s demand of developers to include more real-life elements into games. [GloWbE ZA]

(3) The reality is, however, that the more debt that you take on, the more riskier you become for both prospective shareholders and bankers. [GloWbE ZA]

(4) E-bookstores such as Apple iBooks, Barnes & Nobler NOOKr, and AmazonrKindler [GloWbE ZA]

(5) Acts 17:11 English: World English Bible - WEB 11 Now these were more noble than those in Thessalonica. [GloWbE NG]
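
The construction of the 14 search strings and the exclusion of double comparatives can be sketched as follows. This is only an illustrative Python sketch, not the authors' actual procedure: the function names (synthetic_form, search_strings, is_double_comparative) are hypothetical, and the spelling rules are the ordinary English ones for regular -er comparatives.

```python
import re

# The seven adjectives analyzed in the study.
ADJECTIVES = ["costly", "deadly", "risky", "real", "noble", "bitter", "clever"]


def synthetic_form(adjective: str) -> str:
    """Return the regular -er comparative (hypothetical helper)."""
    if adjective.endswith("y"):
        return adjective[:-1] + "ier"  # costly -> costlier
    if adjective.endswith("e"):
        return adjective + "r"         # noble -> nobler
    return adjective + "er"            # bitter -> bitterer


def search_strings(adjectives):
    """Build one synthetic and one analytic string per adjective."""
    strings = []
    for adjective in adjectives:
        strings.append(synthetic_form(adjective))
        strings.append("more " + adjective)
    return strings


def is_double_comparative(text: str, adjectives=ADJECTIVES) -> bool:
    """Flag 'more' followed by a synthetic form, e.g. 'more riskier'."""
    forms = "|".join(re.escape(synthetic_form(a)) for a in adjectives)
    pattern = rf"\bmore\s+(?:{forms})\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


print(search_strings(ADJECTIVES))
```

Proper names, determiner uses of more, and quotations from other varieties cannot be detected this mechanically, which is why a manual review of every hit remains necessary.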

Table 1 below provides the raw numbers and percentages of tokens showing variation in the distribution of comparative forms, either synthetic or analytic:

Comparative form	Tokens and frequency
Analytic	563 (63.7%)
Synthetic	320 (36.3%)
Total	883

Table 1: Overall distribution of synthetic and analytic comparative forms in African varieties

After carrying out the manual analysis, the total number of cases was 883. Of these, 320 (36.3%) were cases of the inflectional comparative and 563 (63.7%) of the periphrastic comparative. Table 1 confirms that the comparative form in adjectives represents a clear case of morphosyntactic variation in African varieties of English. Although the analytic form is selected more frequently in the adjectives under analysis, a rate of synthetic forms of more than 36 percent in the examples clearly shows that it can be regarded as a case of language variation. If we cross-tabulate the results per adjective, we obtain the analysis set out in Table 2:

Adjective	Analytic	Synthetic	Total
Costly	185 (78.4%)	51 (21.6%)	236
Deadly	70 (66%)	36 (34%)	106
Risky	72 (37.9%)	118 (62.1%)	190
Real	143 (96%)	6 (4%)	149
Noble	28 (41.8%)	39 (58.2%)	67
Bitter	35 (97.2%)	1 (2.8%)	36
Clever	30 (30.3%)	69 (69.7%)	99
Total	563 (63.7%)	320 (36.3%)	883
χ² = 201.57, df = 6, p-value < 2.2e-16

Table 2: Overall distribution of synthetic and analytic comparatives per adjective

The overall distribution shows that periphrastic comparatives are almost twice as frequent as inflectional comparatives, in contrast to the findings reported in Hilpert (2008: 404), who in a very comprehensive examination of 247 alternating adjectives in the British National Corpus (BNC)³ reports a considerably higher number of inflectional comparatives (89.7% vs. 10.3%). If we select the adjectives analyzed by Hilpert which also figure in our list,⁴ a more balanced distribution between inflectional and analytic comparatives is observed (48.2% vs. 51.8% in Hilpert’s and 36.3% vs. 63.7% in our findings), although still very different from the distribution in our analysis, where a higher frequency of analytic structures is found, in line with what has been observed elsewhere for American English (Mondorf 2009). Table 2 shows that, when dealing with specific disyllabic adjectives, there is less of a clear trend in terms of the choice of comparative formation. Thus, whereas users clearly favor the analytic comparative with adjectives such as real, costly, deadly, and bitter, they opt more frequently for the inflectional comparative with risky, clever, and noble.
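
For readers who wish to check the association between adjective and comparative form, a Pearson chi-square test of independence on the counts in Table 2 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name pearson_chi_square is hypothetical, no continuity correction is applied, and the value it prints is not guaranteed to coincide with the statistic reported under Table 2, since the exact settings used for that test are not stated here.

```python
# Counts copied from Table 2: (analytic, synthetic) per adjective.
TABLE_2 = {
    "costly": (185, 51),
    "deadly": (70, 36),
    "risky": (72, 118),
    "real": (143, 6),
    "noble": (28, 39),
    "bitter": (35, 1),
    "clever": (30, 69),
}


def pearson_chi_square(rows):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    rows = [list(row) for row in rows]
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, df


statistic, df = pearson_chi_square(TABLE_2.values())
CRITICAL_05_DF6 = 12.592  # chi-square critical value for alpha = 0.05, df = 6
print(f"chi-square = {statistic:.2f}, df = {df}")
print("significant at 0.05:", statistic > CRITICAL_05_DF6)
```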

4. Description of the variables

This section provides a description of the independent variables which have been reported to yield variation in the choice of comparative forms in the English adjectives selected for our analysis. These operate at the levels of morphology (4.1), phonology (4.2), meaning (4.3), syntax (4.4), and region (4.5).

4.1. Morphological variables

The area of morphology is often predominant in the literature on the comparative alternation of adjectives. In fact, Mondorf (2003: 283) notes that morphological complexity may indeed be a contributing factor in the choice of the comparative form. She shows that morphologically complex adjectives, namely, those formed by more than one morpheme (e.g., careful), opt for the analytic comparative. Following Mondorf, we measure morphological complexity by means of the number of morphemes that form the adjective. This factor predicts that morphologically complex adjectives favor periphrastic comparative forms as opposed to morphologically simple adjectives (represented by monomorphemic adjectives), which favor the synthetic form.

In the present study we have analyzed both simple or monomorphemic adjectives, bitter, clever, noble, and real, and morphologically complex ones, costly and deadly (formed by a base and the suffix –ly) and risky (formed by a base and the suffix –y). The distribution of these is set out in Table 3.

	Analytic	Synthetic	Total
Simple	236 (67.2%)	115 (32.8%)	351
Complex	327 (61.5%)	205 (38.5%)	532
Total	563 (63.7%)	320 (36.3%)	883

Table 3: Distribution of synthetic and analytic comparative forms according to morphological complexity of the adjective

The data in Table 3 reflect the distribution of comparison alternation in the morphologically relevant contexts. As can be seen, although in both contexts there is alternation between the analytic and the synthetic forms, with different frequencies, both morphologically simple and complex adjectives favor the analytic comparison.
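
The figures in Table 3 follow from grouping the per-adjective counts in Table 2 by morphological complexity. The short Python sketch below illustrates that aggregation; the names aggregate and analytic_share are hypothetical helpers introduced for illustration only.

```python
from collections import defaultdict

# Counts copied from Table 2: (analytic, synthetic) per adjective.
TABLE_2 = {
    "costly": (185, 51),
    "deadly": (70, 36),
    "risky": (72, 118),
    "real": (143, 6),
    "noble": (28, 39),
    "bitter": (35, 1),
    "clever": (30, 69),
}

# Classification described in Section 4.1.
COMPLEXITY = {
    "bitter": "simple", "clever": "simple", "noble": "simple", "real": "simple",
    "costly": "complex", "deadly": "complex", "risky": "complex",
}


def aggregate(counts, grouping):
    """Sum (analytic, synthetic) counts over the groups given in `grouping`."""
    totals = defaultdict(lambda: [0, 0])
    for adjective, (analytic, synthetic) in counts.items():
        group = grouping[adjective]
        totals[group][0] += analytic
        totals[group][1] += synthetic
    return {group: tuple(pair) for group, pair in totals.items()}


def analytic_share(analytic, synthetic):
    """Percentage of analytic forms, rounded to one decimal place."""
    return round(100 * analytic / (analytic + synthetic), 1)


by_complexity = aggregate(TABLE_2, COMPLEXITY)
for group, (analytic, synthetic) in by_complexity.items():
    print(group, analytic, synthetic, analytic_share(analytic, synthetic))
```

The same helper can be reused with any other mapping from adjective to category.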

4.2. Phonological variables

Phonology is another factor that influences comparative alternation. The present section includes two main phonological factors in terms of the choice here: length and final segment.⁵

Length of words, measured in number of syllables, has traditionally been one of the most significant determinants in distinguishing between the analytic and the synthetic comparative forms (Sweet 1891: 326–327; Quirk et al. 1985: 461–463; Huddleston and Pullum 2002: 1580–1584). Generally, monosyllabic adjectives take the synthetic form and trisyllabic adjectives take the analytic one (see Hilpert 2008: 399), leaving disyllabic adjectives subject to variability (Mondorf 2003: 257). Given that the focus of our study is on disyllabic adjectives, and hence the length of the adjective will not be a determinant factor in such cases, it is important to consider the prospective length of the resulting adjectives after comparison. Those adjectives for which the addition of the suffix -er does not involve the addition of a new syllable (e.g., noble in our database) are expected to take the morphological option, whereas those for which the addition of the comparative suffix entails an extra syllable (e.g., real) are more likely to take the periphrastic form. Table 4 presents the results from this analysis and shows that the analytic comparison is clearly favored (65.6%) in those cases in which the suffix -er entails the addition of a new syllable, whereas the synthetic comparison is preferred when the suffix does not change the length of the adjective (58.2% vs. 41.8%).

	Analytic	Synthetic	Total
No extra syllable	28 (41.8%)	39 (58.2%)	67
Extra-syllable	535 (65.6%)	281 (34.4%)	816
Total	563 (63.7%)	320 (36.3%)	883

Table 4: Distribution of synthetic and analytic comparative forms in terms of prospective length of the adjective

Within phonological variation, the final segment of the adjective has also been found to be a relevant factor in the choice of the comparative form. It is generally agreed that the presence of certain suffixes can (dis)favor the synthetic form. Thus, Mondorf observes that adjectives ending in /r/ (<r>: bitter and clever in our database) and /l/ (<l, le>: real and noble) tend towards the analytic comparative (Mondorf 2003: 281; contra Kytö and Romaine 1997, who observed that adjectives ending in -le, excluding able, inflect for comparative formation, see Section 2). This tendency is justified by the so-called ‘horror aequi effect’ (Rohdenburg 2003: 236), according to which “(near-)identical and (near-)adjacent (non-coordinate) grammatical elements or structures” are universally avoided. In this context, the adjectives bitter and clever avoid the synthetic comparative so as not to repeat identical segments (e.g., clever-er). If this is the case, we would expect adjectives such as bitter, clever, real, and noble to favor the analytic comparison in our database. Adjectives ending in <ly> also show a tendency towards the analytic comparative (Lindquist 1998), as opposed to those ending in <y>, which favor the inflectional form, as already noted in Section 2.

Table 5 sets out the results of the selection of comparative form according to the final segment of the adjective. As has also been shown by Lindquist (1998) and Mondorf (2003), the final segments <r, l, le, ly> favor the analytic form, especially the <l, le>, but this is not the case for the final segment <y>, which clearly favors the synthetic form.

	Analytic	Synthetic	Total
<r>	70 (51.9%)	65 (48.1%)	135
<l, le>	171 (79.2%)	45 (20.8%)	216
<ly>	255 (74.6%)	87 (25.4%)	342
<y>	67 (35.3%)	123 (64.7%)	190
Total	563 (63.7%)	320 (36.3%)	883
χ²= 50.471, df = 3, p-value = 6.341e-11⁶

Table 5: Distribution of synthetic and analytic comparative forms according to the final segment of the adjective

4.3. Variation in meaning

The influence of the meaning of the adjective on comparative alternation has received little attention in the literature, among other reasons because “these factors do not easily lend themselves to objective annotation” (Hilpert 2008: 412). Nevertheless, the issue of meaning has been addressed by Mondorf (2003: 289) on the grounds that it can also “exert a potent role in comparative alternation.” Hence, we also include it in the present study, looking particularly at the degree of semantic complexity of an adjective, as well as the concrete vs. abstract nature inherent in its meaning.

Turning first to semantic complexity, Mondorf (2003: 289), referring to Braun (1982), confirms the relevance of the degree of semantic complexity of an adjective in the selection of the comparative form. She shows that semantically complex adjectives prefer the analytic comparative, as opposed to semantically simple adjectives, which steer towards the synthetic option. In order to measure the degree of semantic complexity of an adjective, both the length of the glosses provided in dictionaries and the availability of antonyms can be taken into account (Braun 1982: 112). To this end, we began by establishing both the number and length of glosses in the Oxford English Dictionary (OED) and then noted the number of antonyms⁷ for an adjective, using the Merriam-Webster Thesaurus. The results are set out in Table 6.

Adjective	Number of glosses	Length (number of words)	Number of antonyms
Costly	3	49	3
Deadly	16	144	9
Risky	3	23	9
Real	24	383	18
Noble	20	337	9
Bitter	15	198	6
Clever	12	137	7

Table 6: Number of glosses, length of entries and number of antonyms per adjective

Table 6 illustrates a correlation between the number and length of glosses, but this correlation is not necessarily supported by the number of antonyms of each adjective, as shown in the rank orders provided below. The first of these, illustrated in (6), arranges the adjectives from more to less semantically complex according to the number and length of glosses. In (7) the same adjectives are arranged according to the number of antonyms.

(6) real > noble > bitter > deadly > clever > costly > risky

(7) real > noble/deadly/risky > clever > bitter > costly

While the two adjectives with the highest degree of semantic complexity coincide in (6) and (7) (real and noble in both cases), the right-hand end of the hierarchy differs, with only costly found towards that end in both rank orders. If we compare (6) and (7) against the hierarchy which arranges the adjectives from highest to lowest frequency of the analytic comparative (based on data from Table 2 above), the sequence in (8) is obtained:

(8) bitter > real > costly > deadly > noble > risky > clever

There seems to be no clear relationship between degree of semantic complexity and favoring the analytic form. Whereas the most semantically complex adjective is real, and it is indeed among the most frequent ones selecting the analytic variant, the second most semantically complex adjective, noble, is among those with the lowest frequency in the selection of analytic forms. Therefore, semantic complexity cannot be considered to be a particularly influential factor of comparative alternation in the present data.
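
The three rank orders can be derived mechanically from the figures in Tables 2 and 6. The following Python sketch shows one way of doing so; the helper names rank_desc and rank_with_ties are hypothetical, and the ordering by semantic complexity is approximated here by gloss length alone, which is an assumption of the sketch rather than the article's stated procedure.

```python
# Table 6: (number of glosses, gloss length in words, number of antonyms).
TABLE_6 = {
    "costly": (3, 49, 3),
    "deadly": (16, 144, 9),
    "risky": (3, 23, 9),
    "real": (24, 383, 18),
    "noble": (20, 337, 9),
    "bitter": (15, 198, 6),
    "clever": (12, 137, 7),
}

# Table 2: (analytic, synthetic) tokens per adjective.
TABLE_2 = {
    "costly": (185, 51),
    "deadly": (70, 36),
    "risky": (72, 118),
    "real": (143, 6),
    "noble": (28, 39),
    "bitter": (35, 1),
    "clever": (30, 69),
}


def rank_desc(scores):
    """Order the keys of `scores` from highest to lowest value."""
    return sorted(scores, key=scores.get, reverse=True)


def rank_with_ties(scores):
    """Group keys that share a value, from highest to lowest value."""
    levels = sorted(set(scores.values()), reverse=True)
    return [sorted(k for k, v in scores.items() if v == level) for level in levels]


# Assumption: gloss length stands in for the combined gloss criterion.
by_gloss_length = rank_desc({adj: row[1] for adj, row in TABLE_6.items()})
by_antonyms = rank_with_ties({adj: row[2] for adj, row in TABLE_6.items()})
by_analytic_share = rank_desc(
    {adj: analytic / (analytic + synthetic)
     for adj, (analytic, synthetic) in TABLE_2.items()}
)

print(" > ".join(by_gloss_length))
print(" > ".join("/".join(group) for group in by_antonyms))
print(" > ".join(by_analytic_share))
```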

Turning now to the inherent meaning of the adjectives (whether concrete or abstract), Mondorf (2003: 289) observes that adjectives referring to abstract concepts have a notable affinity with the analytic variant. For our classification, we analyzed each example individually, identifying them as concrete when they referred to physical things or people, as with tented chalets in (9), or as abstract when they referred to ideas, qualities, or states, as with disease in (10). From Table 7 we can confirm that abstract meanings favor the analytic form more clearly than concrete ones.

(9) The standard rooms which are relatively cheap, and the tented chalets, whilst more costly, are lovely and spacious. [GloWbE ZA]

(10) The disease sprouts and goes on full offensive, becoming even deadlier. [GloWbE NG]

	Analytic	Synthetic	Total
Concrete meaning	221 (60.9%)	142 (39.1%)	363
Abstract meaning	342 (65.8%)	178 (34.2%)	520
Total	563 (63.7%)	320 (36.3%)	883

Table 7: Distribution of synthetic and analytic comparative forms in terms of meaning of the adjective

4.4. Syntactic variables

It has long been known that position in a sentence can influence the use of comparative alternation (Jespersen 1956: 348). Leech and Culpeper (1997: 366), for example, observe that the predicative and postnominal positions of adjectives favor analytic comparison and that an attributive position favors the synthetic one. This factor has been analyzed in this study, and all adjectives were marked as attributive, as with nobler (11) ––which premodifies the noun descent––, predicative ––typically found in copulative constructions–– as with cleverer (12), postnominal, as with more deadly (13), or ‘not applicable’ for the correlative comparative structures, as in (14), where priming may be playing a role: that is, the synthetic form of deadlier may have been primed by the previous use of longer.

(11) Zaynab could not overcome the fact she was of nobler descent than her husband. [GloWbE NG]

(12) The Jews are not cleverer than the Gentiles, if by clever you mean good at their jobs. [GloWbE KE]

(13) His 2015 ambition will do us no good bt something more deadly than boko haram. [GloWbE NG]

(14) The longer your computer is infected the deadlier it is. # Another great way to find the information you are desperately seeking. [GloWbE GH]

The data in Table 8 confirm the relevance of including position of the adjective in the global count, since they show variation and confirm Leech and Culpeper’s (1997) findings: both the predicative and postnominal positions of adjectives favor analytic comparison and the attributive position favors the synthetic one.

	Analytic	Synthetic	Total
Attributive	130 (48.5%)	138 (51.5%)	268
Predicative	409 (70.9%)	168 (29.1%)	577
Postnominal	19 (70.4%)	8 (29.6%)	27
Not applicable	5 (45.5%)	6 (54.5%)	11
Total	563 (63.7%)	320 (36.3%)	883

Table 8: Distribution of synthetic and analytic comparative forms according to the position of the adjective

Regarding syntax, the presence of infinitival complements and the presence of than-constituents following the adjective have both been shown to exert an effect on the selection of the comparative form. Mondorf (2003: 262) argues that the presence of to-infinitives depending on adjectives favors the analytic comparison. In all cases in our data, the to-infinitive combines with an adjective in predicative position, as illustrated in (15), where the adjective costly is used twice and complemented by the infinitives to extract and to refine. Finally, we also included the presence of a following than-constituent (16), in light of earlier studies (Leech and Culpeper 1997: 367; Hilpert 2008: 402). Considering these studies, the hypothesis is that the presence of a than-element favors the use of the analytic comparative.

(15) Every barrel we consume will be more costly to extract, more costly to refine. [GloWbE ZA]

(16) Two decades later, there was a Second World War, far costlier than the first. [GloWbE NG]

The results in Table 9 and Table 10 below show different distributions according to the type of clause following the adjective. Adjectives complemented by a to-infinitive show a strong preference for the analytic comparison, which accounts for 80 percent of the occurrences. Adjectives followed by a than-clause also take the analytic form in the majority of cases, but to a lesser extent (61.5%).

	Analytic	Synthetic	Total
No to-infinitive	539 (63.2%)	314 (36.8%)	853
To-infinitive	24 (80%)	6 (20%)	30
Total	563 (63.7%)	320 (36.3%)	883

Table 9: Distribution of synthetic and analytic comparative forms in terms of presence/absence of a to-infinitive clause complementing the adjective

	Analytic	Synthetic	Total
No than-clause	424 (64.5%)	233 (35.5%)	657
Than-clause	139 (61.5%)	87 (38.5%)	226
Total	563 (63.7%)	320 (36.3%)	883

Table 10: Distribution of synthetic and analytic comparative forms in terms of presence/absence of a than-clause following the adjective

4.5. Region

Table 11 provides information about the distribution of forms in the five African varieties individually. As can be seen, the higher frequency of analytic forms in the overall distribution reported in Section 3 is found in all five of the varieties, at broadly similar rates.

	Analytic	Synthetic	Total
South Africa [ZA]	159 (68.8%)	72 (31.2%)	231
Nigeria [NG]	115 (61%)	74 (39%)	189
Ghana [GH]	78 (61.9%)	48 (38.1%)	126
Kenya [KE]	117 (62.2%)	71 (37.8%)	188
Tanzania [TZ]	94 (63%)	55 (37%)	149
Total	563 (63.7%)	320 (36.3%)	883

Table 11: Distribution of the synthetic and analytic comparatives per variety

5. Data analysis

A multivariate approach via a logistic regression analysis using the ‘glm’ function in R (Gelman and Hill 2007) was used to predict the use of synthetic/analytic comparison in adjectives, adjusting for potential covariates. The logistic regression model (AIC = 1109.6) was fitted introducing all categorical factors with treatment coding contrasts. The model assumed a binomial distribution for the response (‘Form’), which was recoded (analytic = 0; synthetic = 1), and included seven categorical covariates: variety, morphology, meaning, position, to-infinitive, than-clause, and prospective length of the adjective. Therefore, the distribution of comparative forms found in this study cannot be attributed to lexical preferences. The results obtained in relation to the effect of the relevant covariates are summarized in Table 12 below. Positive numbers in the ‘estimate’ column represent an increase in the probability of producing the analytic form of the comparative, while negative numbers represent a decrease in the probability of this form. ‘Standard error’ refers to the accuracy of the estimate (the level of uncertainty about the coefficient), and the ‘Z-value’ is the estimate divided by its standard error, that is, the number of standard errors by which the estimate differs from zero. The last column provides the p-value of each predictor, which indicates the statistical significance: the significance level was set at 0.05.

Predictor	Estimate	Standard error	Z-value	P-value
Intercept	-1.26486	0.19958	-6.338	2.33e-10***
Variety (Reference level: South Africa)
Nigeria	0.33198	0.21545	1.541	0.1233
Kenya	0.28941	0.21741	1.331	0.1831
Tanzania	0.23882	0.23128	1.033	0.3018
Ghana	0.34357	0.24314	1.413	0.1576
Morphology (Reference level: Complex)
Simple	-0.47469	0.16903	-2.807	0.005**
Meaning (Reference level: Abstract)
Concrete	0.31873	0.15103	2.110	0.034*
Position (Reference level: Predicative)
Attributive	0.94756	0.17010	5.571	2.54e-08***
Postnominal	-0.06794	0.44189	-0.154	0.8778
Correlative forms	1.30376	0.62345	2.091	0.036*
To-infinitive (Reference level: No)
Presence of to-infinitive	-0.43640	0.47590	-0.917	0.35
Than-clause (Reference level: No)
Presence of than-clause	0.46153	0.17700	2.608	0.009**
Prospective length of the adjective (Reference level: New Syllable)
No New Syllable	1.09606	0.29838	3.673	0.0002***

Table 12: Summary of the estimated effect for the binomial regression model (p-values < 0.05 in bold type)
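The columns of Table 12 are arithmetically linked: each Z-value is the estimate divided by its standard error, and the p-value is the two-sided tail probability of that Z-value under a standard normal distribution. The sketch below recomputes both from a few rows of the table and adds the odds ratio, exp(estimate), which applies to whichever response level the fitted model treats as the outcome coded 1. It is an illustrative check under these standard Wald-test assumptions, not the authors' R code, and the helper names are hypothetical.

```python
import math

# (predictor, estimate, standard error) copied from Table 12.
ROWS = [
    ("Nigeria", 0.33198, 0.21545),
    ("Simple", -0.47469, 0.16903),
    ("Concrete", 0.31873, 0.15103),
    ("Attributive", 0.94756, 0.17010),
    ("Presence of to-infinitive", -0.43640, 0.47590),
    ("Presence of than-clause", 0.46153, 0.17700),
    ("No New Syllable", 1.09606, 0.29838),
]


def wald_z(estimate, std_error):
    """Wald statistic: the estimate expressed in standard errors from zero."""
    return estimate / std_error


def two_sided_p(z):
    """P(|Z| >= |z|) for a standard normal variable Z."""
    return math.erfc(abs(z) / math.sqrt(2))


def odds_ratio(estimate):
    """Multiplicative change in the odds of the outcome coded 1."""
    return math.exp(estimate)


for name, est, se in ROWS:
    z = wald_z(est, se)
    print(f"{name:<26} z = {z:7.3f}  p = {two_sided_p(z):.3g}  OR = {odds_ratio(est):.2f}")
```

Reading the estimates as odds ratios can make effect sizes easier to compare across predictors than the raw log-odds coefficients.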

Of the variables under analysis, morphology, meaning, position, than-clause, and prospective length of the adjective have a significant effect on the choice between analytic and synthetic forms of the adjective. Starting with morphology, African varieties show a significantly higher probability of using the synthetic form when the adjective is monomorphemic (e.g., clever, noble, bitter, and real) than with the reference variant, that is, morphologically complex adjectives formed by a base and an affix (e.g., costly, deadly, and risky in this study).

The covariate meaning is also statistically significant. More specifically, the use of synthetic forms shows a lower probability if the adjective refers to concrete entities, in comparison with the reference variant ‘abstract’.

As to position, African varieties show a significantly higher probability of using the analytic form if the adjective is in attributive position or if it appears in a correlative structure in comparison with the reference variant ‘predicative’. No preference of form was detected in those cases in which the adjective is placed postnominally. The covariate than-clause is statistically significant too, since the analytic form is more likely to occur if the adjective is followed by a than-clause, as opposed to the covariate to-infinitive, which does not have a significant effect on the selection of synthetic or analytic comparison.

Regarding prospective length of the adjective, this covariate also yields significant results. The synthetic form of the comparison shows a lower probability of occurrence if the addition of the suffix -er does not alter the number of syllables of the adjective, in comparison with those cases in which the addition of the suffix -er adds an extra syllable to the adjective.

In the regression model, the variable variety does not have a significant effect on the selection of synthetic or analytic comparison, because all five African varieties of English show similar frequencies of analytic and synthetic comparison, as shown in Table 11. Therefore, the specific African varieties (South African, Nigerian, Kenyan, Tanzanian, or Ghanaian) do not seem to be responsible for any particular selection of the comparative form.

The results of comparative alternation of disyllabic adjectives in African varieties confirm the relevance of intra-linguistic variables in the selection of the analytic or synthetic form for the comparative. The results for morphological complexity are in line with Mondorf (2003: 284), but contrary to Hilpert (2008: 408), who reports a very weak effect of this factor in the choice of the comparison form. In agreement with Mondorf’s predictions, morphologically simple adjectives such as clever, bitter, noble, and real are more likely to occur with the synthetic form, in comparison with morphologically complex adjectives. This goes against the ‘horror aequi principle’ (see Section 4.1), since those contexts which show the repetition of (near-)identical segments favor the synthetic comparative, and the adjective clever, if the ‘horror aequi effect’ applies, would favor the analytic comparison. Within morphological predictors, the prospective length of the adjective reinforces this result, as the synthetic form is more likely to be used with adjectives which after the addition of the suffix -er become morphologically more complex with the addition of a new syllable.

In terms of phonology, we also considered the final segment of the adjective. Initially, we distinguished four variants within this variable, namely <r>, <l>, <ly> and <y> adjectives (see Section 4.2), but after testing for multicollinearity, the Cramér’s V correlation matrix showed a perfect correlation between final segment and morphological complexity. For this reason, the final segment was excluded from the regression analysis. The chi-square reported in Section 4.3 for the correlation between the final segment and the comparative form shown in Table 5 yielded significant results, something which Hilpert (2008: 409) also found for British English. As in previous findings, adjectives ending in <r>, <l>, or <ly> favor the analytic comparison.
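Cramér's V measures the association between two categorical variables on a scale from 0 (independence) to 1 (one variable fully determines the other), which is why a value of 1 signals that a predictor is redundant. The sketch below computes it from a contingency table using only the standard library. The counts are invented for illustration and are not the study's data; they merely mimic a situation in which each final segment occurs with only one morphological type. The sketch assumes that no row or column total is zero.

```python
import math


def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Normalize by the smaller table dimension minus one.
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k))


# Hypothetical counts: rows = final segment (<r>, <l>, <ly>, <y>),
# columns = morphology (simple, complex).
HYPOTHETICAL = [
    [40, 0],
    [30, 0],
    [0, 50],
    [0, 20],
]

print(cramers_v(HYPOTHETICAL))
```

When two predictors are this strongly associated, a regression model cannot separate their effects, so keeping only one of them is a common remedy.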

Moving on to the predictors related to meaning, Mondorf (2003: 290) found a correlation between abstract concepts and analytic comparative, which she interprets as evidence of the greater cognitive effort involved in expressing abstract meanings being balanced by the use of the analytic variant. Nevertheless, our results do not confirm this. The data in Table 12 make it clear that it is the expression of concrete meanings that shows a lower probability of synthetic forms. In addition, we did not find a correlation between the degree of semantic complexity (taking into account number of entries and number of antonyms, see Section 4.3 above) and choice of comparative form. This was most notably the case with the adjective noble, which, in terms of number and length of entries in the dictionary and number of antonyms, was classified as a semantically complex adjective, and was therefore expected to favor the periphrastic comparative. In this study, however, noble is among the adjectives which show a lower use of the analytic comparative (see Table 2 above in Section 3 and example (8) in 4.3).

Finally, the syntactic variables in the analysis, which included the position of the adjective (whether attributive, predicative, postnominal, or in a correlative structure), the presence/absence of a than-constituent, and the presence/absence of a to-infinitive, also yielded significant results, with the exception of the to-infinitive. Regarding the position of the adjective, the attributive option and correlative structures replicating the pattern the more…the merrier favor the analytic form in comparison with the predicative. This is in line with Leech and Culpeper (1997) and Mondorf (2003). No preference was shown for adjectives in postnominal position. As to than-clauses following the adjective, these favor the analytic comparative, unlike in Hilpert’s analysis (2008: 408). Finally, the presence of a to-infinitive shows no significant effect, and thus the tendency for the synthetic comparative observed by Hilpert (2008: 408), and tentatively suggested by the correlation included in Table 9 (Section 4.4), cannot be confirmed. We are aware that the low number of examples in the database with to-infinitives (30 examples) may have conditioned these results.

6. Conclusion

The present study has analyzed adjective comparative alternation in African Englishes. Seven disyllabic adjectives were analyzed in five African varieties, taking into account predictors of variation of an extra-linguistic (e.g., region) and an intra-linguistic nature, affecting meaning, morphology, and syntax, which have been shown to yield significant results in previous studies.

The choice of synthetic or analytic comparison has traditionally been associated with the number of syllables of the adjective. This remains a relevant factor, especially in very short (monosyllabic) or very long (three syllables or more) adjectives, but more variation is found in disyllabic adjectives: whereas in some cases individual preferences may arise (e.g., bitter, see Table 2), when dealing with several adjectives, the distribution is more complex and seems to be conditioned by factors of a different nature.

Mondorf’s pioneering study (2003) served to determine the interplay of various factors in the English comparative. All these factors render cognitively complex environments which in turn favor more explicit options; in the expression of comparison this is achieved by the analytic form (more + adjective). The reduced number of adjectives included in the present study may somewhat affect the results due to the distribution of comparative forms of individual adjectives (e.g., bitter), and lexical effects cannot be ruled out. However, the results from the statistical analysis still reflect some tendencies which confirm Mondorf’s findings, in particular with adjectives in which the addition of the -er suffix would result in a morphologically complex adjective. Within such adjectives, those ending in <r, ly> are especially notable, in that they clearly favor the use of the periphrastic comparative. Other complex environments, such as the use of a than-clause following the adjective, are also seen in our study to favor the analytic option, unlike in Hilpert’s study (2008).

The correlation between cognitively complex environments and more explicit options pointed out for British English in Mondorf (2003) cannot be fully confirmed with the present results, which can perhaps be explained in terms of the reduced dataset used. This reflects previous research on English comparison in which, as pointed out in Section 2 and Section 4, different tendencies were found in different samples, different sources, and different varieties. Despite this, what the current study shares with similar work is that morphological, phonological, and syntactic factors are all seen to be involved in the selection of the synthetic or analytic comparative.

Regarding potential regional differences between the five African Englishes, no significant differences were found. An important finding here is that in African varieties the comparative is closer to American English than to British English, since the periphrastic comparative is favored more frequently than the morphological one, as also shown in Mondorf (2009). Considering that the five African varieties are the result of British colonization, we might have expected a stronger exonormative influence of British English as a consequence of colonial lag, which refers to the tendency in former British colonies to retain older forms of English, and thus a higher presence of the synthetic comparative. Such an expectation cannot be fully ruled out until a more comprehensive study is conducted. However, the fact that the five African varieties have reached the nativization phase of Schneider’s (2007) Dynamic Model (see Section 3) and that the language analyzed here is exclusively from the internet may have had a bearing on the higher frequency of analytic forms attested in the data. It is not uncommon to find that language from web-derived corpora tends to imitate the hub or hyper-central variety of Mair’s (2013) ‘World System of Englishes’, represented by standard American English and reflecting the current trend in language change commonly known as Americanization (Leech et al. 2009: chapter 11). This in turn is directly related to the external force of globalization and its effects on language, as noted by Buschfeld and Kautzsch (2016) in their model of ‘Extra- and Intra-Territorial Forces’ to account for the evolution of varieties of English around the world.

Language contact cannot be ruled out as a potential influence on this marked tendency towards analytic comparative structures, as shown by Haspelmath and Michaelis (2017) for language contact scenarios involving European languages. Regarding potential influences of the L1s, The World Atlas of Language Structures Online (WALS Online; Stassen 2013) shows a tendency for sub-Saharan African languages to mark comparison through the so-called ‘Exceed Comparative’ (Stassen 2013), which entails the addition of a lexical morpheme (a verb with the meaning to exceed or to surpass), that is, an analytic construction. A more detailed review of how comparison is formed in the most widely spoken languages in the countries under analysis would also support the tendency towards analytic comparison.

Finally, the preference for the analytic comparative may also be motivated by the fact that these African Englishes, as L2 varieties of English, would favor analytic constructions in general, since these are considered more transparent and therefore easier to learn and use than synthetic ones, as acknowledged by Haspelmath and Michaelis (2017) and shown by Seoane and Suárez-Gómez (2023) for Bangladeshi English.

More comprehensive analyses, including a wider sample of adjectives in these varieties of English and other Englishes around the globe, are necessary to confirm the tendencies attested in this preliminary study and to rule out potential lexical effects.

References

Bauer, Laurie. 1994. Watching English Change: An Introduction to the Study of Linguistic Change in Standard Englishes in the Twentieth Century. London: Routledge.

Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad and Edward Finegan. 1999. Longman Grammar of Spoken and Written English. Harlow: Pearson Education Limited.

Brato, Thorsten. 2020. Noun phrase complexity in Ghanaian English. World Englishes 39/3: 377–393.

Braun, Albert. 1982. Studien zur Syntax und Morphologie der Steigerungsformen im Englischen. Bern: Francke.

Buschfeld, Sarah and Alexander Kautzsch. 2016. Towards an integrated approach to postcolonial and non-postcolonial Englishes. World Englishes 36/1: 1–23.

Davies, Mark. 2013. The Corpus of Global Web-based English (GloWbE). https://www.english-corpora.org/glowbe/

Fuchs, Robert. 2016. The frequency of the present perfect in varieties of English around the World. In Valentin Werner, Elena Seoane and Cristina Suárez-Gómez eds. Re-assessing the Present Perfect. Berlin: De Gruyter, 223–258.

Gelman, Andrew and Jennifer Hill. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge: Cambridge University Press.

González-Díaz, Victorina. 2008. English Adjective Comparison: A Historical Perspective. Amsterdam: John Benjamins.

Haspelmath, Martin and Susanne M. Michaelis. 2017. Analytic and synthetic: typological change in varieties of European languages. In Isabelle Buchstaller and Beat Siebenhaar eds. Language Variation – European Perspectives VI: Selected Papers from the Eighth International Conference on Language Variation in Europe. Amsterdam: Benjamins, 3–22.

Hilpert, Martin. 2008. The English comparative - language structure and language use. English Language and Linguistics 12/3: 395–417.

Hogg, Richard M. 1992. Phonology and morphology. In Richard M. Hogg ed. The Cambridge History of the English Language. Vol I: The Beginnings to 1066. Cambridge: Cambridge University Press, 67–167.

Huddleston, Rodney and Geoffrey K. Pullum. 2002. The Cambridge Grammar of the English Language. Cambridge: Cambridge University Press.

Jespersen, Otto. 1956. A Modern English Grammar on Historical Principles. Copenhagen: Ejnar Munksgaard.

Kortmann, Bernd, Kerstin Lunkenheimer and Katharina Ehret. 2020. The Electronic World Atlas of Varieties of English. https://ewave-atlas.org

Kytö, Merja and Suzanne Romaine. 1997. Competing forms of adjective comparison in Modern English: What could be more quicker and easier and more effective? In Terttu Nevalainen and Leena Kahlas-Tarkka eds., 329–352.

Kytö, Merja and Suzanne Romaine. 2000. Adjective comparison in American and British English. In Laura Wright ed. The Development of Standard English 1300–1800: Theories, Descriptions, Conflicts. Cambridge: Cambridge University Press, 171–194.

Lass, Roger. 1992. Phonology and morphology. In Norman Blake ed. The Cambridge History of the English Language. Vol II: 1066–1476. Cambridge: Cambridge University Press, 23–155.

Lass, Roger. 1999. Phonology and morphology. In Roger Lass ed. The Cambridge History of the English Language. Vol III: 1476–1776. Cambridge: Cambridge University Press, 56–186.

Leech, Geoffrey and Jonathan Culpeper. 1997. The comparison of adjectives in recent British English. In Terttu Nevalainen and Leena Kahlas-Tarkka eds., 353–373.

Leech, Geoffrey, Marianne Hundt, Christian Mair and Nicholas Smith. 2009. Change in Contemporary English. Cambridge: Cambridge University Press.

Lindquist, Håkan. 1998. Livelier or more lively? Syntactic and contextual factors influencing the comparison of disyllabic adjectives. In John M. Kirk ed. Corpora Galore: Analyses and Techniques in Describing English. Amsterdam: Rodopi, 125–132.

Mair, Christian. 2013. The World System of Englishes: Accounting for the transnational importance of mobile and mediated vernaculars. English World-Wide 34/3: 253–278.

Mondorf, Britta. 2003. Support for more-support. In Günter Rohdenburg and Britta Mondorf eds. Determinants of Grammatical Variation in English. Berlin: De Gruyter, 251–304.

Mondorf, Britta. 2007. Recalcitrant problems of comparative alternation and new insights emerging from internet data. In Marianne Hundt, Nadja Nesselhauf and Carolin Biewer eds. Corpus Linguistics and the Web. Amsterdam: Rodopi, 211–232.

Mondorf, Britta. 2009. Synthetic and analytic comparatives. In Günter Rohdenburg and Julia Schlüter eds. One Language, Two Grammars? Differences between British and American English. Cambridge: Cambridge University Press, 86–107.

Nevalainen, Terttu and Leena Kahlas-Tarkka eds. 1997. To Explain the Present: Studies in the Changing English Language in Honour of Matti Rissanen. Helsinki: Société Néophilologique.

Priestley, Joseph. 1969. The Rudiments of English Grammar. Menston: Scolar Press.

OED = Oxford English Dictionary. 1989. Oxford: Oxford University Press.

Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech and Jan Svartvik. 1985. A Comprehensive Grammar of the English Language. London: Longman.

Rohdenburg, Günter. 2003. Cognitive complexity and horror aequi as factors determining the use of interrogative clause linkers in English. In Günter Rohdenburg and Britta Mondorf eds. Determinants of Grammatical Variation in English. Berlin: De Gruyter, 205–250.

Schneider, Edgar W. 2007. Postcolonial English: Varieties around the World. Cambridge: Cambridge University Press.

Seoane, Elena and Cristina Suárez-Gómez. 2023. A look at the nativization of Bangladeshi English through corpus data. Miscelánea 68: 15–37.

Stassen, Leon. 2013. Comparative Constructions. In Matthew S. Dryer and Martin Haspelmath eds. The World Atlas of Language Structures Online. Zenodo. http://wals.info/chapter/121 (25 March 2024.)

Sweet, Henry. 1955[1891]. A New English Grammar: Logical and Historical. Oxford: Clarendon Press.

Wick, Neil. 2005. Complexity in the formation of English comparatives and superlatives. https://ir.lib.uwo.ca/cgi/viewcontent.cgi?article=1009&context=bwtl (25 March 2024.)