Multilingual parallel corpus: An institutional resource for terminology development at the University of South Africa (Unisa)

Keywords: corpus, corpus compilation, terminology development, multilingual parallel corpus, indigenous language


The indigenous African languages of South Africa are not fully developed to provide specialised terminology and have been considered unsuitable for use as languages of tuition and research. This has been used as a pretext for not using these languages in the South African education system. Since 1994, however, terminology development has been one of the key priorities of democratic South Africa, and institutions of higher learning have been mandated to develop and intellectualise the indigenous languages for teaching, learning and research. In line with this mandate, this article addresses the unavailability of scientific and technical terms by illustrating how to compile a multilingual corpus from which multilingual glossaries can be derived as resources for tuition and research. Adopting a qualitative descriptive approach, the study selected suitable source texts in English and their translations into six indigenous African languages (IsiZulu, IsiXhosa, IsiNdebele, SiSwati, Tshivenda and Xitsonga) from the University's study material for inclusion in the multilingual parallel corpus. ParaConc, a software tool suited to querying parallel texts, was used to align the texts and extract terms from the corpus. The study demonstrates how parallel texts can be useful in developing scientific and technical terms. The University of South Africa can become the centre of corpus compilation for the intellectualisation of the official indigenous South African languages, since it is the only university in the country that caters for all these languages.
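To make the idea of querying aligned parallel texts concrete, the following minimal Python sketch searches the English side of a small sentence-aligned corpus and returns the aligned translations. It is an illustration only and does not reproduce ParaConc or the procedure used in the study: the names `AlignedSegment` and `parallel_concordance` are hypothetical, the matching is a plain case-insensitive substring test, and the bracketed target strings are placeholders rather than real translations.

```python
from dataclasses import dataclass


@dataclass
class AlignedSegment:
    """One English source sentence with its translations, keyed by language."""
    source: str
    translations: dict[str, str]


def parallel_concordance(corpus, term):
    """Return every aligned segment whose English source contains `term`.

    Matching is a case-insensitive substring test, a simplification of
    what a dedicated concordancer offers (hypothetical helper).
    """
    needle = term.lower()
    return [seg for seg in corpus if needle in seg.source.lower()]


# Hypothetical placeholder data: the bracketed strings stand in for real
# translations, which are deliberately not invented here.
corpus = [
    AlignedSegment(
        source="The corpus is compiled from study material.",
        translations={
            "IsiZulu": "[IsiZulu segment 1]",
            "IsiXhosa": "[IsiXhosa segment 1]",
        },
    ),
    AlignedSegment(
        source="Each term is recorded in a glossary.",
        translations={
            "IsiZulu": "[IsiZulu segment 2]",
            "IsiXhosa": "[IsiXhosa segment 2]",
        },
    ),
]

hits = parallel_concordance(corpus, "glossary")
for seg in hits:
    print(seg.source)
    for language, text in seg.translations.items():
        print(f"  {language}: {text}")
```

Reading the aligned target segments side by side is what allows a terminologist to identify the candidate equivalent of an English term in each language and record it in a glossary.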



How to Cite
Moropa, K., & Nokele, B. (2023). Multilingual parallel corpus: An institutional resource for terminology development at the University of South Africa (Unisa). Research in Corpus Linguistics, 11(2), 141–165.