The Varieties of English for Specific Purposes dAtabase (VESPA): Towards a multi-L1 and multi-register learner corpus of disciplinary writing
DOI:
https://doi.org/10.32714/ricl.10.02.02
Keywords:
learner corpus, learner corpus research, English as a Foreign Language, academic writing, register variation, student writing
Abstract
The Varieties of English for Specific Purposes dAtabase (VESPA first release) is the result of an international corpus compilation project that aims to address the lack of large-scale, open access, multi-L1, multi-discipline and multi-register learner corpora. This corpus report provides a detailed description of VESPA and illustrates possible uses of the corpus for register exploration of learner data. Specifically, it first offers an overview of the makeup of the corpus and the online interface that can be used to search and download the corpus. It then gives an illustrative example of a study where multi-dimensional analysis was used to investigate the relative importance of register vis-à-vis other factors in learner academic writing. In the concluding remarks, we identify priorities for future developments in the VESPA project, including the addition of more L1 components, more disciplines and more registers, as well as the compilation of a comparable corpus of native student writing.
References
Alsop, Sian and Hilary Nesi. 2009. Issues in the development of the British Academic Written English (BAWE) corpus. Corpora 4/1: 71–83.
Biber, Douglas. 1988. Variation across Speech and Writing. Cambridge: Cambridge University Press.
Biber, Douglas. 1992. The multi-dimensional approach to linguistic analyses of genre variation: An overview of methodology and findings. Computers and the Humanities 26: 331–345.
Biber, Douglas, Randi Reppen, Shelley Staples and Jesse Egbert. 2020. Exploring the longitudinal development of grammatical complexity in the disciplinary writing of L2-English university students. International Journal of Learner Corpus Research 6/1: 38–71.
Blanchard, Daniel, Joel Tetreault, Derrick Higgins, Aoife Cahill and Martin Chodorow. 2013. TOEFL11: A corpus of non-native English. ETS Research Report Series, 2013/2: i–15. https://doi.org/10.1002/j.2333-8504.2013.tb02331.x (29 September, 2021.)
Callies, Marcus and Ekaterina Zaytseva. 2013. The Corpus of Academic Learner English (CALE) – A new resource for the assessment of writing proficiency in the academic register. Dutch Journal of Applied Linguistics 2/1: 126–132.
Ebeling, Signe O. and Hilde Hasselgård. 2015. Learners’ and native speakers’ use of recurrent word-combinations across disciplines. In Ann-Kristin H. Gujord, Susan Nacey and Silje Ragnhildstveit eds. Learner Corpus Research: LCR2013 Conference Proceedings (Bergen Language and Linguistics Studies 6), 87–106.
Ebeling, Signe O. and Alois Heuboeck. 2007. Encoding document information in a corpus of student writing: The British Academic Written English Corpus. Corpora 2/2: 241–256.
Gilquin, Gaëtanelle, Sylviane Granger and Magali Paquot. 2007. Learner corpora: The missing link in EAP pedagogy. In Paul Thompson ed. Corpus-based EAP Pedagogy. Special issue of the Journal of English for Academic Purposes 6/4: 319–335.
Granger, Sylviane, Maïté Dupont, Fanny Meunier, Hubert Naets and Magali Paquot. 2020. The International Corpus of Learner English (version 3). Louvain-la-Neuve: Presses universitaires de Louvain.
Granger, Sylviane and Magali Paquot. 2013. Language for specific purposes learner corpora. In Carol A. Chapelle ed. The Encyclopedia of Applied Linguistics. Oxford: Blackwell-Wiley.
Hasselgård, Hilde. 2014. It-clefts in English L1 and L2 academic writing. In Kristin Davidse, Caroline Gentens, Lobke Ghesquière and Lieven Vandelanotte eds. Corpus Interrogation and Grammatical Patterns. Amsterdam: John Benjamins, 295–320.
Heuboeck, Alois, Jasper Holmes and Hilary Nesi. 2008. The BAWE Corpus Manual. http://www.reading.ac.uk/AcaDepts/ll/app_ling/internal/bawe/BAWE.documentation.pdf (29 September, 2021.)
Larsson, Tove. 2019. Grammatical stance marking in student and expert production: Revisiting the informal-formal dichotomy. Register Studies 1/2: 243–268.
Larsson, Tove, Marcus Callies, Hilde Hasselgård, Natalia J. Laso, Magali Paquot, Sanne van Vuuren and Isabel Verdaguer. 2020. Adverb placement in EFL academic writing: Going beyond syntactic transfer. International Journal of Corpus Linguistics 25/2: 155–184.
Larsson, Tove and Henrik Kaatari. 2019. Extraposition in learner and expert writing: Exploring (in)formality and the impact of register. International Journal of Learner Corpus Research 5/1: 33–62.
Larsson, Tove, Magali Paquot and Douglas Biber. 2021. On the importance of register in learner writing: A multi-dimensional approach. In Elena Seoane and Douglas Biber eds. Corpus-based Approaches to Register Variation. Amsterdam: John Benjamins, 235–258.
Lee, David Y. W. and Sylvia Xiao Chen. 2009. Making a bigger deal of the smaller words: Function words and other key items in research writing by Chinese learners. Journal of Second Language Writing 18/3: 149–165.
Nesi, Hilary, Sheena Gardner, Paul Thompson and Paul Wickens. 2008. British Academic Written English Corpus. Oxford Text Archive. http://hdl.handle.net/20.500.12024/2539.
Open Cambridge Learner Corpus (v1). 2017. Distributed by Lexical Computing Limited on behalf of Cambridge University Press and Cambridge English Language Assessment.
Paquot, Magali. 2010. Academic Vocabulary in Learner Writing: From Extraction to Analysis. London: Continuum.
Paquot, Magali. 2019. The phraseological dimension in interlanguage complexity research. Second Language Research 35/1: 121–145.
Paquot, Magali, Hilde Hasselgård and Signe O. Ebeling. 2013. Writer/reader visibility in learner writing across genres: A comparison of the French and Norwegian components of the ICLE and VESPA learner corpora. In Sylviane Granger, Gaëtanelle Gilquin and Fanny Meunier eds. Twenty Years of Learner Corpus Research: Looking back, Moving ahead. Louvain-la-Neuve: Presses universitaires de Louvain, 377–387.
Paquot, Magali, Signe O. Ebeling, Alois Heuboeck and Larry Valentin. 2015. The VESPA Tagging Manual (version 2.3). Louvain-la-Neuve: Centre for English Corpus Linguistics.
Polio, Charlene. 2017. Second language writing development: A research agenda. Language Teaching 50/2: 261–275.
Römer, Ute. 2009. English in academia: Does nativeness matter? Anglistik: International Journal of English Studies 20/2: 89–100.
Römer, Ute and Matthew Brook O’Donnell. 2011. From student hard drive to web corpus (part 1): The design, compilation and genre classification of the Michigan Corpus of Upper-level Student Papers (MICUSP). Corpora 6/2: 159–177.
Staples, Shelley, Douglas Biber and Randi Reppen. 2018. Using corpus-based register analysis to explore authenticity of high-stakes language exams: A register comparison of TOEFL iBT and disciplinary writing tasks. The Modern Language Journal 102/2: 310–332.
Ströbel, Marcus, Elma Kerz and Daniel Wiechmann. 2020. The relationship between first and second language writing: Investigating the effects of first language complexity on second language complexity in advanced stages of learning. Language Learning 70/3: 732–767.
License
Submission of your paper to this journal implies that the paper is not under submission for publication elsewhere. Material which has been previously copyrighted, published, or accepted for publication will not be considered for publication in this journal. Submission of a manuscript is interpreted as a statement of certification that no part of the manuscript is copyrighted by any other publisher or under review by any other formal publication. By submitting your manuscript to us, you agree to these copyright guidelines. It is your responsibility to ensure that your manuscript does not cause any copyright infringement, defamation, or other problems.
Submitted papers are assumed to contain no proprietary material unprotected by patent or patent application; responsibility for technical content and for protection of proprietary material rests solely with the author(s) and their organizations and is not the responsibility of the journal or its editorial staff. The main author is responsible for ensuring that the article has been seen and approved by all the other authors. It is the responsibility of the author to obtain all necessary copyright release permissions for the use of any copyrighted materials in the manuscript prior to the submission.
Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under the Creative Commons Attribution (CC BY) License, which allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Article submission implies author agreement with this policy.