The Process Corpus of English in Education: Going beyond the written text

Keywords: learner corpus research, process learner corpus, writing process, keylogging, screencasting, metadata


The Process Corpus of English in Education (PROCEED) is a learner corpus of English which, in addition to written texts, contains data that make the writing process visible, in the form of keystroke log files and screencast videos. It comes with rich metadata about each learner, including indices of exposure to the target language and cognitive measures such as working memory and fluid intelligence. It also includes an L1 component made up of similar data produced by the learners in their mother tongue. By going beyond the written product, PROCEED opens new perspectives in the study of learner writing. It makes it possible to investigate aspects such as writing fluency, the use of online resources, cognitive phenomena like automaticity and avoidance, and the theoretical modelling of the writing process. It also has applications for teaching, e.g. showing students screencast video clips from the corpus that illustrate effective writing strategies, and for testing, e.g. establishing a corpus-derived standard of writing fluency for learners at a given proficiency level.




How to Cite
Gilquin, G. (2021). The Process Corpus of English in Education: Going beyond the written text. Research in Corpus Linguistics, 10(1), 31-44.