Vocabulary learning through data-driven learning in the context of Spanish as a foreign language

Keywords: Data-driven learning; Spanish as a foreign language; vocabulary learning; empirical study


A growing number of studies have pointed to links between corpus work and second language acquisition and teaching. Some research, for example, has explored the effect of Data-driven Learning (DDL, Johns 1991) on foreign language learning. To date, however, empirical studies on this topic, particularly quantitative ones, have been scarce, above all for foreign languages other than English. The objective of the present study is therefore twofold: first, to test whether there is a statistically significant difference between the DDL approach to vocabulary learning and more traditional methods (e.g., the dictionary approach) in the context of Spanish as a foreign language (SFL); second, to gauge students’ attitudes towards DDL activities. To this end, a quasi-experimental longitudinal design was used to compare two groups of Chinese students of Spanish (experimental group, N = 16; control group, N = 16). The results of the immediate post-test indicated that DDL was more effective than the traditional method for learning Spanish vocabulary (t(30) = 6.191, p < .001, d = 2.19). The delayed post-test likewise showed that the DDL group outperformed the control group (t(30) = 2.600, p = .014, d = 0.92). Furthermore, a questionnaire administered to the experimental group corroborated the positive test results: respondents generally favored DDL and held a positive attitude towards its future application to Spanish learning. This study provides empirical support for DDL in second language acquisition and teaching, notably in SFL; at the same time, it raises some caveats and suggests directions for future work.
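As a rough consistency check on the statistics reported above, the effect sizes can be recovered from the t values. The sketch below assumes that d is Cohen's d based on the pooled standard deviation and that the tests are independent-samples t-tests with 16 students per group (so df = 16 + 16 - 2 = 30); the abstract does not state this explicitly, and the function name `cohens_d_from_t` is illustrative only.

```python
import math


def cohens_d_from_t(t: float, n1: int, n2: int) -> float:
    """Convert an independent-samples t statistic to Cohen's d.

    Assumes d is computed with the pooled standard deviation,
    in which case d = t * sqrt(1/n1 + 1/n2).
    """
    return t * math.sqrt(1 / n1 + 1 / n2)


# t values and group sizes as reported in the abstract.
immediate = cohens_d_from_t(6.191, 16, 16)
delayed = cohens_d_from_t(2.600, 16, 16)

print(round(immediate, 2))  # 2.19
print(round(delayed, 2))    # 0.92
```

Under these assumptions, both converted values match the reported effect sizes (d = 2.19 and d = 0.92).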


Allan, Rachel. 2006. Data-driven Learning and Vocabulary: Investigating the Use of Concordances with Advanced Learners of English. Dublin: Trinity College Dublin.

Anani-Sarab, Mohammad R. and Amir Kardoust. 2014. Concordance-based data-driven learning activities and learning English phrasal verbs in EFL classrooms. Issues in Language Teaching 3/1: 89–112.

Asención-Delaney, Yuly, Joseph G. Collentine, Karina Collentine, Jersus Colmenares and Luke Plonsky. 2015. El potencial de la enseñanza del vocabulario basada en corpus: Optimismo con precaución. Journal of Spanish Language Teaching 2/2: 140–151.

Aşık, Asuman, Arzu Sarlanoglu Vural and Kadriye Dilek Akpınar. 2016. Lexical awareness and development through data driven learning: Attitudes and beliefs of EFL learners. Journal of Education and Training Studies 4/3: 87–96.

Barcroft, Joe. 2005. La enseñanza del vocabulario en español como segunda lengua. Hispania 88/3: 568–582.

Bardovi-Harlig, Kathleen, Sabrina Mossman and Yunwen Su. 2017. The effect of corpus-based instruction on pragmatic routines. Language Learning and Technology 21/3: 76–103.

Bernardini, Silvia. 2000. Systematising serendipity: Proposals for concordancing large corpora with language learners. In Lou Burnard and Tony McEnery eds. Rethinking Language Pedagogy from a Corpus Perspective. Hamburg: Peter Lang, 225–234.

Bernardini, Silvia. 2004. Corpora in the classroom: An overview and some reflections on future developments. In John Sinclair ed. How to Use Corpora in Language Teaching. Amsterdam: John Benjamins, 5–36.

Boulton, Alex. 2007. But where’s the proof? The need for empirical evidence for data-driven learning. In Michael Edwardes ed. Proceedings of the BAAL Annual Conference 2007. London: Scitsiugnil Press, 13–16.

Boulton, Alex. 2008. Esprit de corpus: Promouvoir l’exploitation de corpus en apprentissage des langues. Texte et Corpus 3: 37–46.

Boulton, Alex. 2009. Testing the limits of data-driven learning: Language proficiency and training. ReCALL 21/1: 37–54.

Boulton, Alex. 2010a. Data-driven learning: Taking the computer out of the equation. Language Learning 60/3: 534–572.

Boulton, Alex. 2010b. Learning outcomes from corpus consultation. In María Moreno-Jaén, Fernando Serrano Valverde and María Calzada Pérez eds. Exploring New Paths in Language Pedagogy: Lexis and Corpus-Based Language Teaching. London: Equinox, 129–144.

Boulton, Alex. 2010c. Data-driven learning: On paper, in practice. In Tony Harris and María Moreno-Jaén eds. Corpus Linguistics in Language Teaching. Bern: Peter Lang, 17–52.

Boulton, Alex. 2017a. Research timeline: Corpora in language teaching and learning. Language Teaching 50/4: 483–506.

Boulton, Alex. 2017b. Data-driven learning and language pedagogy. In Steven L. Thorne and Stephen May eds. Language, Education and Technology. New York: Springer, 181–192.

Boulton, Alex and Tom Cobb. 2017. Corpus use in language learning: A meta-analysis. Language Learning 67/2: 1–46.

Brezina, Vaclav. 2018. Statistics in Corpus Linguistics: A Practical Guide. Cambridge: Cambridge University Press.

Calderón-Campos, Miguel. 1994. Sobre la elaboración de diccionarios monolingües de producción: Las definiciones, los ejemplos y las colocaciones léxicas. In Peter Jan Slagter ed. Aproximaciones a Cuestiones de Adquisición y Aprendizaje del Español como Lengua Extranjera o Lengua Segunda. Amsterdam: Rodopi, 105–119.

Chambers, Angela and Íde O’Sullivan. 2004. Corpus consultation and advanced learners’ writing skills in French. ReCALL 16/1: 158–172.

Chambers, Angela. 2005. Integrating corpus consultation in language studies. Language Learning and Technology 9/2: 111–125.

Chambers, Angela. 2007. Popularising corpus consultation by language learners and teachers. In Encarnación Hidalgo, Luis Quereda and Juan Santana eds. Corpora in the Foreign Language Classroom. Amsterdam: Rodopi, 3–16.

Chambers, Angela. 2010. What is data-driven learning? In Anne O’Keeffe and Michael McCarthy eds. The Routledge Handbook of Corpus Linguistics. London: Routledge, 345–358.

Chan, Tun-pei and Hsien-Chin Liou. 2005. Effects of web-based concordancing instruction on EFL students’ learning of verb-noun collocations. Computer Assisted Language Learning 18/3: 231–251.

Chang, Ji-Yeon. 2014. The use of general and specialized corpora as reference sources for academic English writing: A case study. ReCALL 26/2: 243–259.

Chang, Jung. 2001. Chinese speakers. In Michael Swan and Bernard Smith eds. Learner English: A Teacher’s Guide to Interference and other Problems. Cambridge: Cambridge University Press, 310–324.

Charles, Maggie. 2012. Proper vocabulary and juicy collocations: EAP students evaluate do-it-yourself corpus-building. English for Specific Purposes 31/2: 93–102.

Cheng, Winnie, Martin Warren and Xun-feng Xu. 2003. The language learner as language researcher: Putting corpus linguistics on the timetable. System 31/2: 173–186.

Cobb, Tom. 1997. Is there any measurable learning from hands-on concordancing? System 25/3: 301–315.

Cobb, Tom. 1999. Breadth and depth of lexical acquisition with hands-on concordancing. Computer Assisted Language Learning 12/4: 345–360.

Cotos, Elena, Stephanie Link and Sarah Huffman. 2017. Effects of DDL technology on genre learning. Language Learning and Technology 21/3: 104–130.

Crosthwaite, Peter. 2017. Retesting the limits of data-driven learning: Feedback and error correction. Computer Assisted Language Learning 30/6: 447–473.

Daskalovska, Nina. 2015. Corpus-based versus traditional learning of collocations. Computer Assisted Language Learning 28/2: 130–144.

Faul, Franz, Edgar Erdfelder, Albert-Georg Lang and Axel Buchner. 2007. G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods 39/2: 175–191.

Flowerdew, John. 1996. Concordancing in language learning. In Martha C. Pennington ed. The Power of CALL. Houston: Athelstan, 97–113.

Frankenberg-Garcia, Ana. 2014. The use of corpus examples for language comprehension and production. ReCALL 26/2: 128–146.

Gabrielatos, Costas. 2005. Corpora and language teaching: Just a fling or wedding bells? The Electronic Journal for Teaching English 8/4: 1–32.

Gaskell, Delian and Thomas Cobb. 2004. Can learners use concordance feedback for writing errors? System 32/3: 301–319.

Gavioli, Laura. 2001. The learner as researcher: Introducing corpus concordancing in the classroom. In Guy Aston ed. Learning with Corpora. Bologna: Athelstan, 108–137.

Geluso, Joe and Atsumi Yamaguchi. 2014. Discovering formulaic language through data-driven learning: Student attitudes and efficacy. ReCALL 26/2: 225–242.

Gilmore, Alex. 2009. Using online corpora to develop students’ writing skills. ELT Journal 63/4: 363–372.

Gilquin, Gaëtanelle and Sylviane Granger. 2010. How can data-driven learning be used in language teaching? In Anne O’Keeffe and Michael McCarthy eds. The Routledge Handbook of Corpus Linguistics. London: Routledge, 359–370.

Guan, Xiaowei. 2013. A study on the application of data-driven learning in vocabulary teaching and learning in China’s EFL class. Journal of Language Teaching and Research 4/1: 105–112.

Hadley, Gregory and Maggie Charles. 2017. Enhancing extensive reading with data-driven learning. Language Learning and Technology 21/3: 131–152.

Huang, Zeping. 2014. The effects of paper-based DDL on the acquisition of lexico-grammatical patterns in L2 writing. ReCALL 26/2: 163–183.

Jiao, Binkai. 2012. An empirical study on corpus-driven English vocabulary learning in China. English Language Teaching 5/4: 131–137.

Jiménez-Calderón, Francisco and Ana Sánchez-Rufat. 2017. Posibilidades de aplicación de un enfoque léxico a la enseñanza comunicativa del español. In Guadalupe Nieto Caballero ed. Nuevas Aportaciones al Estudio de la Enseñanza y Aprendizaje de Lenguas. Cáceres: Universidad de Extremadura, 11–23.

Johns, Tim. 1991. Should you be persuaded: Two examples of data-driven learning materials. English Language Research Journal 4: 1–16.

Johns, Tim, Hsingchin Lee and Lixun Wang. 2008. Integrating corpus-based CALL programs in teaching English through children’s literature. Computer Assisted Language Learning 21/5: 483–506.

Karras, Jacob N. 2016. The effects of data-driven learning upon vocabulary acquisition for secondary international school students in Vietnam. ReCALL 28/2: 166–186.

Kennedy, Claire and Tiziana Miceli. 2010. Corpus-assisted creative writing: Introducing intermediate Italian learners to a corpus as a reference resource. Language Learning and Technology 14/1: 28–44.

Koosha, Mansour and Ali A. Jafarpour. 2006. Data-driven learning and teaching collocation of prepositions: The case of Iranian EFL adult learners. Asian EFL Journal 8/4: 192–209.

Lavrakas, Paul J. 2008. Encyclopedia of Survey Research Methods. California: SAGE Publications.

Lee, Hansol, Mark Warschauer and Jang H. Lee. 2019. The effects of corpus use on second language vocabulary learning: A multilevel meta-analysis. Applied Linguistics 40/5: 721–753.

Lewis, Michael. 1993. The Lexical Approach: The State of ELT and a Way Forward. Boston: Heinle.

Li, Shuangling. 2017. Using corpora to develop learners’ collocational competence. Language Learning and Technology 21/3: 153–171.

McEnery, Tony and Andrew Wilson. 1997. Teaching and language corpora. ReCALL 9/1: 5–14.

Mizumoto, Atsushi and Kiyomi Chujo. 2015. A meta-analysis of data-driven learning approach in the Japanese EFL classroom. English Corpus Studies 22: 1–18.

Moliner, María. 2007. Diccionario del Uso del Español. Madrid: Gredos.

Moon, Soyeon and Sun-Young Oh. 2017. Unlearning overgenerated be through data-driven learning in the secondary EFL classroom. ReCALL 30/1: 48–67.

Mukherjee, Joybrato. 2006. Corpus linguistics and language pedagogy: The state of the art – and beyond. In Sabine Braun, Kurt Kohn and Joybrato Mukherjee eds. Corpus Technology and Language Pedagogy. Bern: Peter Lang, 5–24.

Nation, Paul and Paul Meara. 2010. Vocabulary. In Norbert Schmitt ed. An Introduction to Applied Linguistics. London: Hodder Education, 34–51.

Nation, Paul. 2001. Learning Vocabulary in Another Language. Cambridge: Cambridge University Press.

O’Keeffe, Anne, Michael McCarthy and Ronald Carter. 2007. From Corpus to Classroom: Language Use and Language Teaching. Cambridge: Cambridge University Press.

O’Sullivan, Íde and Angela Chambers. 2006. Learners’ writing skills in French: Corpus consultation and learner evaluation. Journal of Second Language Writing 15/1: 49–68.

Pérez-Paredes, Pascual. 2005. Data-driven learning y el aprendizaje de idiomas. Greta: Revista para Profesores de Inglés 13/1–2: 5–10.

Pérez-Paredes, Pascual. 2010. Appropriation and integration issues in corpus methods and mainstream language education. In Tony Harris and María Moreno-Jaén eds. Corpus Linguistics in Language Teaching. Bern: Peter Lang, 53–73.

Plonsky, Luke, and Frederick L. Oswald. 2014. How big is ‘big’? Interpreting effect sizes in L2 research. Language Learning 64/4: 878–912.

Römer, Ute. 2011. Corpus research applications in second language teaching. Annual Review of Applied Linguistics 31: 205–225.

Royal Spanish Academy. Online. Diccionario de la Lengua Española (22. ed.). At https://dle.rae.es/?w=diccionario. Accessed on 25/09/2019.

Royal Spanish Academy. Online. Banco de Datos CREA: Corpus de Referencia del Español Actual. At http://www.rae.es. Accessed on 25/09/2019.

Schmitt, Norbert. 2000. Vocabulary in Language Teaching. Cambridge: Cambridge University Press.

Smart, Jonathan. 2014. The role of guided induction in paper-based data-driven learning. ReCALL 26/2: 184–201.

Soruç, Adem and Bilal Tekin. 2017. Vocabulary learning through data-driven learning in an English as a second language setting. Educational Sciences: Theory and Practice 17/6: 1811–1832.

Sripicharn, Passapong. 2003. Evaluating classroom concordancing: The use of concordance-based materials by a group of Thai students. Thammasat Review 1: 203–236.

Sripicharn, Passapong. 2010. How can we prepare learners for using language corpora? In Anne O’Keeffe and Michael McCarthy eds. The Routledge Handbook of Corpus Linguistics. London: Routledge, 371–384.

Stevens, Vance. 1991. Concordance-based vocabulary exercises: A viable alternative to gap-filling. English Language Research Journal 4: 47–61.

Szudarski, Paweł. 2018. Corpus Linguistics for Vocabulary: A Guide for Research. London: Routledge.

Tekin, Bilal and Adem Soruç. 2016. Using corpus-assisted learning activities to assist vocabulary development in English. The Turkish Online Journal of Educational Technology 1270–1283.

Thompson, Paul. 2006. Assessing the contribution of corpora to EAP practice. In Zoe Kantaridou, Iris Papadopoulou and Ifigenia Mahili eds. Motivation in Learning Language for Specific and Academic Purposes. Macedonia: University of Macedonia.

Thurstun, Jennifer and Christopher N. Candlin. 1998. Concordancing and the teaching of the vocabulary of academic English. English for Specific Purposes 17/3: 267–280.

Vyatkina, Nina. 2016. Data-driven learning for beginners: The case of German verb-preposition collocations. ReCALL 28/2: 207–226.

Wesche, Marjorie and T. Sima Paribakht. 1996. Assessing second language vocabulary knowledge: Depth versus breadth. The Canadian Modern Language Review 53/1: 13–40.

Yılmaz, Enes and Adem Soruç. 2015. The use of concordance for teaching vocabulary: A data-driven learning approach. Procedia-Social and Behavioral Sciences 191: 2626–2630.

Yoon, Hyunsook and Alan Hirvela. 2004. ESL student attitudes toward corpus use in L2 writing. Journal of Second Language Writing 13/4: 257–283.

How to Cite
Yao, G. (2019). Vocabulary learning through data-driven learning in the context of Spanish as a foreign language. Research in Corpus Linguistics, 7, 18-46. https://doi.org/10.32714/ricl.07.02