evidence, part 2
Continuing yesterday's thoughts, and following Cameron's addition of the VP debate to his data set, I wanted to talk a little bit about what I see as the significance of this kind of analysis.
This is important for me because one of the basic questions that I find myself asking about network studies is whether this kind of study is merely descriptive. In other words, what benefits are there to having this sort of evidence? What kinds of claims and/or strategies can build on network analysis?
We tend to think of language as something over which we have complete control. But anyone who writes over a fair period of time knows that this isn't the case. In my case, I can no longer remember the specific language of articles that I myself have written any better than I can remember others'. And yet, there are certain features--of style, semantics, vocabulary, etc.--that remain relatively constant, and which I do recognize when I go back and read my writing. "Relatively" because we absorb all those things as we come into contact with others' language, and that contact nudges us in various directions. I may use a word more often because I like it, or avoid certain sentence constructions because I find them confusing. But the deeper the patterns, the slower the change, and the less conscious control we have over them. We may have immediate control over something that we are writing at the present moment, but we don't think about every single word to an equal degree. We take any variety of shortcuts--language use is at heart a vast network of shortcuts and connotations, and we use those shortcuts and patterns as a means of conserving our communicative energies.
And so the virtue of doing a large-scale, statistical analysis of a set of textual data is that it may reveal those shortcuts, those subconscious preoccupations that emerge over the long term in the language we use. As I think I already mentioned, this kind of analysis is limited by small samples, and it's likewise limited by textual performances that are as highly scripted as the debates undoubtedly are. In other words, both conditions allow for more conscious, deliberate control over the text.
And yet, there are things that can be said here. When I see, for example, the prominence of the phrase "hard work," my sense is that W is basically asking for the political equivalent of an "A for effort." Given how quick they've been to accuse the Dems of "demeaning the sacrifice" of our troops, I think that they realize that, in the face of a very limited amount of success, they have to argue not that we've been successful, but rather that we've tried really, really hard. Of course, my gut response is that they've made such a big deal out of the bankruptcy of this tactic when it comes to teachers that they have no right to rely on it themselves. If teachers are to be judged purely on the basis of their students' test scores (i.e., quantitative results) regardless of how hard teachers work, then they should not shy away from the same sort of accountability themselves. If it's not enough for teachers to "work hard" and "fail," then it's hypocritical of them not to abide by the same standard.
Now, a healthy dose of this is partisan interpretation on my part, I know, but the patterns unearthed by analyses like Cameron's, I would argue, give us avenues for that kind of interpretation, avenues that do carry some quantitative justification behind them. This doesn't mean that all language use can simply be reduced to statistical patterns--far from it, in fact--but rather that it is a mix of conscious art and subconscious pattern, and that to date, we've (and that's a disciplinary "we") been less inclined to pursue the latter element. I would say that the work of Don Foster is a notable exception to this, and I'm sure that there are others whose work I have yet to encounter, particularly in linguistics.