Workshop convened by Cathleen Kantner and Amelie Kutter at the Free University Berlin, 27-28 May 2010, in collaboration with the Integrated Project RECON and the Jean Monnet Centre of Excellence (JMCE) at the Free University Berlin.
Abstract: In the digital era, large bodies of electronic data hold great promise for the exploration of researchers’ questions. Social scientists have started to use computer-aided and linguistic methods for the investigation of social and political issues, drawing on large text corpora. In the process, they have invariably been confronted with a host of problems related to the compilation of an issue-specific corpus; the retrieval of content-related information and its frequency-based generalisation; the down-sizing of corpora; the detailed qualitative-linguistic analysis of smaller sub-corpora; and the triangulation of text analytical methods for the qualitative and quantitative exploration of texts. At the same time, computational and corpus linguists have developed and applied new tools for the computational retrieval, annotation, and representation of linguistic information in large text corpora. However, they are faced with the challenge of communicating these innovations to a broader scientific audience that has an increasing interest in these tools of analysis, but often operates with different (content-related) categories. The objective of this workshop is to explore the common methodological interests of computational and corpus linguists working on computer-aided linguistic analysis, on the one hand, and social scientists, who have used computational and corpus-linguistic tools to advance text analysis in the social sciences, on the other. We hope to foster interdisciplinary cooperation focusing on method-related issues.