Centre for Languages and Literature

The Joint Faculties of Humanities and Theology | Lund University

Programme and speakers

09.15–09.20 Welcome

09.20–10.00 
Nele Põldvere, Victoria Johansson & Carita Paradis, Lund University
The new London–Lund Corpus (LLC–2): design, compilation, access

10.00–10.30 
Bas Aarts, University College London
Research on spoken English: the London–Lund experience

10.30–11.00 Fika

11.00–11.30 
Gunnel Tottie, University of Zurich
Corpus linguistics now and then: the case of negation of indefinites

11.30–12.00 
Karin Aijmer, University of Gothenburg
'They're like proper crazy like' – new uses of intensifiers in spoken English

12.00–13.30 Lunch break

13.30–14.00 
Robbie Love, University of Leeds
Building and analysing a national corpus of informal spoken English: the Spoken BNC2014

14.00–14.30 
Susan Reichelt, University of Greifswald
Combining apparent and real time approaches to language change: recent developments of kind of and sort of in spoken British English using BNClab

14.30–15.00 Fika

15.00–15.30 
Jonathan Culpeper, Lancaster University
On 'spokenness': from Early Modern to Present-day English

15.30–16.30 
Herbert Clark, Stanford University (Keynote speaker)
On the use and misuse of language corpora

16.30–16.40 Closing

Nele Põldvere, Lund University

Nele Põldvere is a doctoral candidate in English Linguistics at the Centre for Languages and Literature at Lund University, Sweden. Her thesis, to be defended in September 2019, is titled 'What's in a dialogue? On the dynamics of meaning-making in English conversation' and has two aims: (i) to further our understanding of the use and development of constructions in spoken dialogue and (ii) to compile a brand new corpus of spoken British English, the London–Lund Corpus 2. She is also involved in a project on advice-giving and uptake in conversation, together with Dr Rachele De Felice from UCL.

The new London–Lund Corpus (LLC–2): design, compilation, access

This talk reports on the compilation of the new London–Lund Corpus (LLC–2) – a corpus of contemporary spoken British English, collected 2014–2019. The size and design of LLC–2 are the same as those of the world's first corpus of spoken language, namely the London–Lund Corpus (LLC–1), with spoken data mainly from the 1960s. Besides providing a corpus of contemporary speech, LLC–2 gives researchers the opportunity to make principled diachronic comparisons of speech over the past 50 years and to detect change in communicative behaviour among speakers.

The compilation of LLC–2 has included a number of different stages such as data collection, transcription of the recordings, markup and annotation, and finally making the corpus accessible to the research community. The talk describes and critically examines the methodological decisions made in each stage. For example, it was important to strike a balance between LLC–2 as a representative collection of data of contemporary spoken English and its comparability to LLC–1. Therefore, both corpora contain the same speech situations (dialogue, mainly everyday face-to-face conversation, as well as monologue), but the specific recordings added to LLC–2 also reflect the technological advances of the last few decades, particularly with respect to speech situations such as telephone calls (e.g. Skype) and broadcast discussions and interviews (e.g. podcasts). Moreover, the transcriptions in LLC–2 are orthographic and time-aligned with the corresponding sound files, a novel feature of the corpus that makes it possible, among other things, to investigate prosody and dialogue management among speakers with great precision. The corpus, as well as metadata about the transcriptions and the speakers, will be released to the public in late 2019 from the Lund University Humanities Lab's corpus server. The release will fill an unfortunate gap in the availability of spoken corpora for linguistic analysis.

The benefits of spoken corpora in general and of LLC–2 in particular will be demonstrated in the talk through examples of case studies based on the corpus (e.g. Põldvere & Paradis, 2019a, 2019b). The case studies illustrate how LLC–2 can contribute to our understanding of meaning-making and discursive practices in real communication and provide a window into the cognitive and social processes of dialogic interaction, both from a contemporary and a back-in-time perspective.

References

Põldvere, N., & Paradis, C. (2019a). 'What and then a little robot brings it to you?' The reactive what-x construction in spoken dialogue. English Language and Linguistics. Advance online publication. doi:10.1017/S1360674319000091
Põldvere, N., & Paradis, C. (2019b). Motivations and mechanisms for the development of the reactive what-x construction in spoken dialogue. Journal of Pragmatics, 143, 65–84.

Victoria Johansson, Lund University

Victoria Johansson is associate professor in General Linguistics. Her research interests include language production in a lifelong perspective, and a central aspect of her research consists of methodological development for investigating writing in real time (by means of keystroke logging, sometimes in combination with eye tracking to capture reading patterns concurrent with writing). Important outcomes of the research concern psycholinguistic comparisons of writing and speaking in a developmental perspective, where the use and establishment of e.g. lexical expressions and syntactic constructions are discussed in regard to the writer's/speaker's cognitive effort.

Carita Paradis, Lund University

Carita Paradis is Professor of English Linguistics. She is interested in the dynamics of meaning-making in human communication and couches her research within the broad framework of Cognitive Linguistics. Central to this approach is the meaningful functioning of language in all its guises and all its uses in discourse. She uses different empirical methods – corpus methods as well as experimental techniques of different kinds – to contribute to a better understanding of what linguistic expressions reveal about human interaction, perception and cognition and, inversely, how they influence and give rise to patterns and structures in natural language use.

Bas Aarts, University College London

Bas Aarts is Professor of English Linguistics and Director of the Survey of English Usage at UCL. His research interest is English grammar. His recent publications include: Syntactic gradience (2007, OUP), Oxford modern English grammar (2011, OUP), The English verb phrase (2013, edited with J. Close, G. Leech and S. Wallis, CUP), Oxford dictionary of English grammar (2nd edition 2014; edited with S. Chalker and E. Weiner, OUP), as well as articles in books and journals. He is a founding editor of the journal English Language and Linguistics (CUP).

Research on spoken English: the London–Lund experience

In this paper I will go back to the beginning by tracing the history of the collaboration between the Survey of English Usage (which celebrates its 60th anniversary this year) and the University of Lund. I will briefly present the corpus exploration tools that we developed, and how they can be used to carry out research on both written and spoken English. I will then present the results of some recent research in the Survey of English Usage on spoken English, specifically work on the progressive construction, modal verbs and the perfect construction.

Gunnel Tottie, University of Zurich

Gunnel Tottie got her PhD at Stockholm University, Sweden in 1971 and taught at the universities of Stockholm, Lund and Uppsala before becoming a professor at the University of Zurich (1991–2002). Always a committed corpus linguist, she has published on topics in syntax, especially negation and relativization, and more recently, pragmatics.

Corpus linguistics now and then: the case of negation of indefinites

I will illustrate the progress (and the woes) of corpus linguistics with examples from my own work on two problems in the syntax of negation in English.

I first studied the variation between NO-negation and NOT-negation, as in (1) and (2), beginning with the Brown Corpus in the 1970s and adding the London–Lund Corpus in the 1980s (Tottie 1983, 1991).

(1) NO-negation: I have no dog/money/friends.
(2) NOT-negation with a or any: I don't have a dog/any problem/money/friends.

The second problem was impossible to address because of the paucity of material in the seventies and eighties: the variation between the indefinite article a/an and any as indefinite determiners of count nouns in sentences with NOT-negation, as in (3):

(3) It isn't a/any problem/There isn't a/any problem/I don't have/see a/any problem.

With the advent of mega-corpora like the Corpus of Contemporary American English (COCA), comprising 577 million words, it was tempting to study the variation between a/an- and any-negation and to try to find the factors conditioning their use. This is what I am currently working on – not without complications, due both to the sheer size and to the makeup of the corpus.

References

Tottie, Gunnel. 1983. Much about Not and Nothing. A Study of Analytic and Synthetic Negation in Contemporary American English. (Publications of the Royal Society of Letters at Lund 1983–1984:1.) Lund: Kungl. Humanistiska Vetenskapssamfundet.
Tottie, Gunnel. 1991. Negation in English Speech and Writing. San Diego, New York, London: Academic Press.

Karin Aijmer, University of Gothenburg

Karin Aijmer is professor emerita in English linguistics at the University of Gothenburg, Sweden. Her research interests focus on pragmatics, discourse analysis, modality, corpus linguistics and contrastive analysis. Her books include Conversational Routines in English: Convention and Creativity (1996), English Discourse Particles. Evidence from a Corpus (2002), The Semantic Field of Modal Certainty: a Study of Adverbs in English (with co-author, 2007) and Understanding Pragmatic Markers. A Variational Pragmatic Analysis (2013). She is co-editor of Pragmatics of Society (Handbook of Pragmatics, Mouton de Gruyter, 2011) and of A Handbook of Corpus Pragmatics (Cambridge University Press, 2014) and co-author of Pragmatics. An Advanced Resource Book for Students (Routledge, 2012).

'They're like proper crazy like' – new uses of intensifiers in spoken English

New intensifiers emerge and become fashionable, but can then lose their popularity and be replaced by other, more striking intensifiers. When Paradis (2000) revisited degree modifiers of adjectives in the 1990s, she found, for example, that the intensifiers in the London–Lund Corpus occurred with different frequency in the COLT corpus (the Bergen Corpus of London Teenagers) that she used for comparison. Moreover, she found some examples of 'new' intensifiers such as well (well weird) and enough (enough bad). The aim of my presentation is to discuss changes in the intensification system which seem to have taken place since then. The focus will be on some intensifiers (e.g. well, all, proper, pretty) which seem to have become more frequent or have emerged recently in spoken British English. The material for this study is taken from the spoken British National Corpus 2014 (Love et al. 2017). On-going changes can be observed by comparing the distribution and uses of the same intensifiers in the old BNC (BNC1994). The research questions are:

– What are the mechanisms responsible for the changes?
– What role do sociolinguistic factors such as the age and gender of the speakers play in explaining the changes?

References

Love, R., Dembry, C., Hardie, A., Brezina, V., and McEnery, T. 2017. The Spoken BNC2014 – designing and building a spoken corpus of everyday conversations. International Journal of Corpus Linguistics 22 (3): 319–344.
Paradis, C. 2000. It's well weird. Degree modifiers of adjectives revisited: The nineties. In Kirk, J. (ed.). Corpus galore. Analysis and techniques in describing English. Amsterdam: Rodopi. 147–160. 

Robbie Love, University of Leeds

Dr Robbie Love is a Research Fellow in Applied and Corpus Linguistics at the School of Education, University of Leeds, UK. His research interests span language (in) education, change in informal spoken English and corpus linguistic methods. He completed his PhD at Lancaster University, where he worked on the construction of the Spoken British National Corpus 2014 (Spoken BNC2014), which was made available publicly in 2017. His first monograph, on this topic, will be published in Routledge's Advances in Corpus Linguistics series.

Building and analysing a national corpus of informal spoken English: the Spoken BNC2014

The Spoken BNC2014 (Love et al. 2017, Love forth.) is an important component of the new British National Corpus 2014 [i], a large dataset representing current British English usage across different situations, which is being compiled by Lancaster University in collaboration with Cambridge University Press. It is the successor to the spoken component of the original British National Corpus (Crowdy 1995) and was released publicly via Lancaster University's CQPweb server (Hardie 2012) in September 2017.

In terms of corpus construction, I pay attention to other contemporary spoken corpus projects such as the London–Lund Corpus 2 (Paradis et al. 2015–), the spoken component of CorCenCC (Knight et al. 2016) and FOLK (Schmidt 2016) and consider the role of representativeness in corpus design. I argue that representativeness is ideal but that it is inevitable – due to practical constraints – that there will be some differences between the original design of a large 'national' corpus and the finished product, and that it is important to be honest, critical and realistic about representativeness. I then demonstrate the research potential of the Spoken BNC2014 with examples from recent research into adverbs (Goodman & Love 2019).

[i] http://cass.lancs.ac.uk/bnc2014/

References

Crowdy, S. (1995). The BNC spoken corpus. In G. Leech, G. Myers, & J. Thomas (Eds.), Spoken English on Computer: Transcription, Mark-Up and Annotation (pp. 224–234). Harlow: Longman.
Goodman, O., & Love, R. (2019). 1000 hours of conversations: what does it mean for ELT? 53rd Annual IATEFL Conference & Exhibition. Liverpool, UK. April 2019.
Hardie, A. (2012). CQPweb – combining power, flexibility and usability in a corpus analysis tool. International Journal of Corpus Linguistics 17(3), 380–409.
Knight, D., Neale, S., Watkins, G., Spasic, I., Morris, S., & Fitzpatrick, T. (2016, June). Crowdsourcing corpus construction: contextualizing plans for CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes – The National Corpus of Contemporary Welsh). Paper presented at the IVACS 2016 conference, Bath Spa University, UK.
Love, R. (forth). Overcoming Challenges in Corpus Construction: The Spoken British National Corpus 2014. New York: Routledge.
Love, R., Dembry, C., Hardie, A., Brezina, V., & McEnery, T. (2017). The Spoken BNC2014: Designing and building a spoken corpus of everyday conversations. International Journal of Corpus Linguistics, 22(3), 319–344.
Paradis, C., Põldvere, N., Johansson, V., & O'Hare, P. (2015–). The London–Lund Corpus 2 of spoken British English (LLC–2). Available at: http://www.sol.lu.se/en/research/forskningsprojekt/906/ (last accessed June 2019).
Schmidt, T. (2016). Good practices in the compilation of FOLK, the Research and Teaching Corpus of Spoken German. International Journal of Corpus Linguistics, 21(3), 396–418.

Susan Reichelt, University of Greifswald

Dr Susan Reichelt completed her PhD in English Language and Communication Research at Cardiff University in 2018, focusing on socio-pragmatic variation in audio-visual media. In Lancaster University's Centre for Corpus Approaches to Social Science (CASS), she was part of the ESRC-funded project 'The British National Corpus (BNC) as a sociolinguistic dataset: Exploring individual and social variation' from 2017 to 2018. She is currently working as a lecturer in the English and American Studies department at Greifswald University in Germany.

Combining apparent and real time approaches to language change: recent developments of kind of and sort of in spoken British English using BNClab

This study reports on ongoing changes in the use of the hedges sort of and kind of in spoken British English over the past twenty years. Following known sociolinguistic patterns of change in progress (cf. Bailey, 2008; Pichler et al., 2018), special focus will be put on three categories of time: age, date of birth, and date of corpus compilation.

The data used in this study stem from two subsets of the original BNC from 1994 and the newly compiled BNC2014. Both sets were, where possible, balanced across social categories of age, gender, location, and social class. Feature tokens were extracted using the online platform BNClab, which includes a concordance viewer alongside first data evaluations, visualizations, and teaching materials. The design of the subsets, highlighted further in this presentation, allows for a combination approach to change, using apparent time and real time trend analyses.

The features under investigation, the hedges sort of and kind of, are often treated as having "basically the same meaning" (Mauranen, 2004: 179; see also Aijmer, 1984: 118), yet show distributional differences across different varieties of English. In British English, sort of is often found to be the dominant variant (cf. Aijmer, 1984; Biber et al., 1999; Gries & David, 2007; Kay, 1984; Mauranen, 2004). Feature use within the two BNC datasets suggests that the variants are currently undergoing change. Kind of is increasingly encroaching on sort of – a change that becomes observable through the inclusion of the three categories of time, as mentioned above.

The talk thus highlights the need for, and the usefulness of, corpora that allow the researcher to combine apparent and real time approaches in order to gain a full picture of ongoing linguistic change.

References

Aijmer K. (1984). 'Sort of' and 'Kind of' in English conversation. Studia Linguistica 38: 118–128.
Bailey, G. 2008. Real and Apparent Time. In: Chambers, J.K. et al. eds. The Handbook of Language Variation and Change. Blackwell Publishing Ltd, pp. 312–332.
Biber D, Johansson S, Leech G, et al. (1999). Longman Grammar of Spoken and Written English, London: Pearson Education Limited.
Gries S and David C. (2007). This is kind of / sort of interesting: variation in hedging in English. In: Päivi Pahta, Irma Taavitsainen, Terttu Nevalainen & Jukka Tyrkkö (eds) Studies in Variation, Contacts and Change in English: Towards Multimedia in Corpus Studies. Helsinki: Research Unit for Variation, Contacts and Change in English (VARIENG).
Kay P. (1984). The Kind of/Sort of Construction. Tenth Annual Meeting of the Berkeley Linguistics Society. 157–171.
Mauranen, A., (2004). They're a little bit different. Observations on hedges in academic talk. In Aijmer, K. & Stenström, A., (eds.). Discourse patterns in spoken and written corpora, pp. 173–98.
Pichler, H, Wagner SE, Hesson A. 2018. Old-age language variation and change: Confronting variationist ageism. Lang Linguist Compass. 12:e12281.

Jonathan Culpeper, Lancaster University

Jonathan Culpeper is Professor of English Language and Linguistics in the Department of Linguistics and English Language at Lancaster University, UK. His research spans pragmatics, stylistics and the history of English. His most recent major publications include Second Language Pragmatics: From Theory to Research (2018, Routledge; co-authored with Alison Mackey and Naoko Taguchi) and English Language: Description, Variation and Context (second edn., 2018, Palgrave: lead editor). For five years he was co-editor-in-chief of the Journal of Pragmatics (2009–14). He is currently leading the £1 million AHRC-funded Encyclopedia of Shakespeare's Language project.

On 'spokenness': from Early Modern to Present-day English

This paper reflects on 'spokenness' in English from the early modern period to today. I begin by (a) making some general remarks on spokenness in the history of English, and (b) introducing a descriptive approach to 'writtenness' and 'spokenness', one revolving around three categories, namely, the degree to which a text is speech-like, speech-based or speech-purposed. This approach was part of the corpus-based work on spoken interaction in historical English writing that I conducted over 20 years with Merja Kytö (e.g. Culpeper and Kytö 2010). I discuss some of the problems we encountered and some of our findings, in particular those relating to what we termed 'pragmatic noise' (essentially, primary interjections, the noises – ooh's and aah's – that carry pragmatic meanings). I identify the five pragmatic noise items that occurred most frequently in all our speech-related genres but hardly occurred in our non-speech-related genres, and also briefly account for their development. In addition, I discuss at some length the case of the genre of play-texts, a complex hybrid spoken-written genre. Using corpus-based methods, I show how it has changed over the centuries, and relate some of those changes to changes in context.

References

Culpeper, Jonathan and Merja Kytö (2010) Early Modern English Dialogues: Spoken Interaction in Writing. Cambridge: Cambridge University Press.

Herbert Clark, Stanford University

Herbert H. Clark is the Albert Ray Lang Professor of Psychology Emeritus at Stanford University. He received his PhD from the Johns Hopkins University and, since 1969, has taught at Stanford. He is the author of a good many research articles, chapters, and books, including Psychology and Language (with Eve V. Clark), Arenas of Language Use, and Using Language. He is best known for his work on semantics and pragmatics (from word meaning, speech acts, reference, and common ground to disfluencies). His most recent work has been on depicting as a method of communication. He is a recipient of a John Simon Guggenheim Fellowship and a fellowship at the Center for Advanced Study in the Behavioral Sciences. He has been elected fellow of the American Academy of Arts and Sciences, member of the Society of Experimental Psychologists, and foreign member of the Koninklijke Nederlandse Academie van Wetenschappen (Royal Dutch Academy of Arts and Sciences). He is also a recipient of the Distinguished Scientific Contribution Award from the Society for Text and Discourse and an honorary doctorate from the University of Neuchâtel.

On the use and misuse of language corpora

A corpus of English conversation, edited by Jan Svartvik and Randolph Quirk, was a pioneering work when it appeared in 1980. Although many corpora have been published since then, the London–Lund Corpus (LLC) remains unique in its quality and detail. It has proven especially valuable for studying features of the rough and tumble language of spontaneous conversation. Still, like most language corpora, the LLC had to ignore the gestural side of language use, even though gestures are an essential part of deictic expressions such as I, you, we, those guys, over there, the other side, now, and a moment ago. The problem is that many investigators have proceeded as if language use were all speech and no gesture, and that has left their models wanting.