Looking into international research groups’ digital discursive practices: Criteria and methodological steps taken towards the compilation of the EUROPRO digital corpus

Keywords: corpus design, research project websites, Twitter, e-visibility, digital discourse, computer-mediated communication


The EUROPRO digital corpus was designed by the InterGedi research group, based at the University of Zaragoza (Spain). The main focus of InterGedi is the analysis of the textual resources that international research groups use as part of their dissemination and visibility strategies. The corpus comprises the websites of 30 international research projects funded by the European Horizon 2020 Programme (EUROPROwebs corpus). An examination of these websites showed that 20 of the projects maintained a Twitter account, and the tweets from these accounts formed the basis for the EUROPROtweets corpus. This paper sets out the criteria used to select the research project websites and the methodological steps taken to classify, label and tag the verbal component of these websites and tweets. It also discusses the challenges that the dynamic, hypermodal and hypermedial nature of these digital texts posed for the compilation of the corpus. The paper closes by underlining the potential uses and applications of EUROPRO for gaining insights into the digital discursive and professional practices that international research groups use to foster their online visibility.





How to Cite
Pascual, D., Mur-Dueñas, P., & Lorés, R. (2020). Looking into international research groups’ digital discursive practices: Criteria and methodological steps taken towards the compilation of the EUROPRO digital corpus. Research in Corpus Linguistics, 8(2), 87–102. https://doi.org/10.32714/ricl.08.02.05