COWS-L2H: A corpus of Spanish learner writing

Keywords: L2 Spanish, Spanish as a Heritage Language, learner corpus research


This paper presents the Corpus of Written Spanish of L2 and Heritage Speakers (COWS-L2H), a large corpus of compositions written by North American university students learning Spanish. The goals of this work are (1) to build a large corpus of Spanish learner writing that samples written data from Spanish learners in the context of a North American university, (2) to contribute corpus data collected not only from second language (L2) learners of Spanish but also from learners of Spanish as a heritage language (SHL), and (3) to develop one of the few Spanish learner corpora that provide longitudinal data.


How to Cite
Yamada, A., Davidson, S., Fernández-Mira, P., Carando, A., Sagae, K., & Sánchez-Gutiérrez, C. (2020). COWS-L2H: A corpus of Spanish learner writing. Research in Corpus Linguistics, 8(1), 17-32.