
Collin's Clever CCCC Cluster Cloud

Speaking of CCCC, or of the CCCCCCCC referenced in my title (8 Cs!), Derek and I were yappin tonight about how we might go about indexing the CCCC Program using TagCrowd, a tool I came across via Jill and recommended to Jenny. It overlaps a fair bit with what we're doing over at CCCOA, but one difference is that TagCrowd allows you to upload a file, whereupon it generates a cloud of frequent terms.

So here's what I did:

1. I went to the searchable program for the 2007 CCCC, and searched for all panels under the 106 Area Cluster (Information Technologies).

2. I added each of the 50 or so panels to my "Convention Schedule," and then hit the button to email it to myself. The result is a window with all of the panels & descriptions in a text file. Copy and paste into TextEdit.

3. I stripped out all of the speaker information, including titles. I could have left the titles in, but it would have taken longer (and been a little more debatable in terms of focus).

4. Find/Replace on 2-word phrases (new media, social software, etc.) and on variants (online and on-line), making each phrase a single word and standardizing the variants to one spelling. (I thought, too, about just deleting "speaker," which appears in the prose with some frequency.)

5. TagCrowd the file, and voila!

Tagcloud for Area Cluster 106 (Information Technologies)

You can look at the bigger graphic over at Flickr, but here's a cloud of the 100 most frequently used terms in CCCC proposals for the 106 cluster. "Speaker" and "presentation" are throwaways, and you could argue the same for "discuss" ("In this presentation, Speaker X will discuss...."). Looks pretty sensible to me--I'd say that blogging and Facebook are the flavors of the year. I may have caused the word "remix" to drop out of the cloud by not including titles--I'm not sure.
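For anyone who'd rather script steps 4 and 5 than do them by hand, here's a rough Python sketch of the same idea: collapse the two-word phrases, standardize the variants, and count what's left. This is not how TagCrowd works internally (I don't know that), and the names `PHRASES`, `VARIANTS`, `normalize`, and `top_terms` are made up for illustration. The phrase list only includes the examples mentioned above.

```python
import re
from collections import Counter

# Hypothetical maps: only the examples named in the post are included.
PHRASES = {"new media": "newmedia", "social software": "socialsoftware"}
VARIANTS = {"on-line": "online"}


def normalize(text):
    """Lowercase the text, join known phrases, and standardize variants."""
    text = text.lower()
    for old, new in {**PHRASES, **VARIANTS}.items():
        text = text.replace(old, new)
    return text


def top_terms(text, n=100):
    """Return the n most frequent words as (word, count) pairs."""
    # Normalize first, otherwise "on-line" would split into "on" and "line".
    words = re.findall(r"[a-z]+", normalize(text))
    return Counter(words).most_common(n)
```

Calling `top_terms(open("panels.txt").read())` on the pasted text file (the filename is an assumption) would give the raw material for a 100-term cloud; sizing and drawing the terms is a separate job.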

One caveat is that not all the panels included prose descriptions--that may just be a matter of time, though. Again, I'm not certain.

One thing I do know, though, is that this whole process took me less than an hour, and it would be child's play to go back in and do it for each cluster, as well as all of the "focuses" and "emphases." Not that I have the time, energy, or schedule to allow me to do so. But it's a fun little experiment, nonetheless.

(I should mention, if anyone sees fit to do some of these, that TagCrowd allows one to create a blacklist of terms that won't be included. In addition to speaker, presentation, and discuss, I'd probably (were I to redo this one) add become, consider, examine, important, include, and panel. They function here as mostly empty proposal jargon.)
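If you were counting terms yourself rather than leaning on TagCrowd, the same exclusion idea might look something like the sketch below. The names `EXCLUDED` and `filtered_counts` are hypothetical, and the matching is exact, which is a simplification: "speakers" or "discusses" would still slip through unless you added them too.

```python
import re
from collections import Counter

# The nine throwaway terms suggested above.
EXCLUDED = {
    "speaker", "presentation", "discuss", "become", "consider",
    "examine", "important", "include", "panel",
}


def filtered_counts(text, excluded=EXCLUDED):
    """Count lowercase words, skipping any that appear in the excluded set."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in excluded)
```

`filtered_counts(text).most_common(100)` then gives the top 100 terms with the jargon left out.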

That's all.