My research is primarily focused on developing corpus-driven methodology. I am especially interested in how we can describe and explain the interaction of the different elements of linguistic structure: for example, how grammatical semantics interacts with lexical semantics, metaphor, and metonymy, and how these phenomena in turn interact with pragmatics. These are old problems in linguistics, but approaching them with corpus-driven methods is a relatively new project. The advantage of corpus-driven methodology is that it lets us treat the data quantitatively. This allows us to test our hypotheses and, using multivariate statistics, to make generalisations about complex interactions in a way that is impossible using intuition alone.

I am currently working on two major lines of research: applying multivariate usage-feature analysis (also called behavioural profile analysis) to

(i) pragmatic structures such as epistemic stance and evidentiality, and

(ii) conceptual structures such as the cultural model of FATE and the emotion concept of ANGER.


R and Multivariate Statistics

One of my main concerns is working out how to employ multivariate statistical analysis to better understand our data. I work primarily with R and often give workshops on statistics for corpus linguistics. Information, including upcoming workshops, can be found on the presentations page.

I focus on exploratory techniques such as:

– Binary correspondence analysis
– Multiple correspondence analysis
– Agglomerative cluster analysis
– K-means cluster analysis

and on confirmatory modelling such as:

– Loglinear regression
– Binary logistic regression
– Multinomial logistic regression
– Ordinal logistic regression
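
As a rough illustration of the confirmatory side, the sketch below fits a binary logistic regression by plain gradient descent. It is written in Python using only the standard library and is not one of the R scripts used in the workshops; in R the same kind of model would normally be fitted with `glm(..., family = binomial)`. The data are invented for illustration: a single binary usage feature (`x`) and a binary outcome (`y`), for instance a choice between two constructions. The helper names `fit_logistic` and `predict` are hypothetical and do not come from any library.

```python
import math


def sigmoid(z):
    """Logistic function: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, x):
    """Predicted probability of outcome 1; weights[0] is the intercept."""
    return sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))


def fit_logistic(rows, labels, lr=0.5, steps=5000):
    """Fit intercept and slopes by gradient descent on the mean log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict(weights, x) - y  # derivative of log-loss w.r.t. log-odds
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


# Invented toy data (an assumption, not real corpus counts):
# feature absent (x = 0): outcome 1 in 2 of 8 observations
# feature present (x = 1): outcome 1 in 6 of 8 observations
rows = [[0]] * 8 + [[1]] * 8
labels = [1, 1, 0, 0, 0, 0, 0, 0] + [1, 1, 1, 1, 1, 1, 0, 0]

weights = fit_logistic(rows, labels)
odds_ratio = math.exp(weights[1])  # effect of the feature on the odds of outcome 1

print("P(y=1 | x=0):", round(predict(weights, [0]), 3))
print("P(y=1 | x=1):", round(predict(weights, [1]), 3))
print("odds ratio:", round(odds_ratio, 2))
```

Because this toy model has one binary predictor plus an intercept, the fitted probabilities should approach the observed proportions (2/8 and 6/8), and the exponentiated slope approximates the odds ratio between the two contexts. With several interacting predictors, which is the usual case in usage-feature analysis, the coefficients no longer reduce to simple proportions, and that is where a fitted model becomes more informative than inspecting the counts.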

The scripts and R commands I teach with can be found in the download section for students.


