Collin Vs. Blog: CCC Online Archives

A CCCO status report

As promised. My clothes are still drying, and I'm at a stopping point in various other tasks, and so...

Over at CCC Online, we've been making steady progress on archiving back issues. We've reached the point--Volume 50 in 98-99--in the archives where CCC began adding abstracts to every article. From now on, we'll be generating abstracts ourselves, and that promises to slow us down a little. In fact, if there's anyone out there who's interested in writing abstracts for us, drop me a note.

But that means that we're 8 years deep into the archives now, and we've got a pretty solid work process established. Some of the work is unavoidably time-intensive, but it's gotten a lot easier than it was, say, last summer, when we were still figuring out exactly how to manage it all. One of the things that Derek and I talked about tonight was starting up a development weblog for the site, a place where we could solicit feedback, talk about some of the work that goes on behind the scenes, try out ideas for new features, experiment with some alternative visualizations, etc. It might also be a space where we could invite (and/or carry out) some collaboration. Not a week goes by where one of us doesn't say that it would be cool to do X, Y, or Z, but we don't always have a clear sense of how valuable (or not) X, Y, or Z might be, or whether or not someone might make use of it.

So look for that.

With the release of the latest issue of CCC, we feel like another piece of the CCC Online puzzle is starting to take shape. On the front page, the upper right-hand section was originally billed as a "feature space," and to this point, it's been occupied by a truncated version of the "about the site" statement that I wrote. Today, it went "live" in the sense that we've added a few different things to the site meant to enhance the reading experience of the journal:

"Performing Writing, Performing Literacy" (by Fishman et al.) argues for the connection between student perceptions of writing and performance, and makes reference to videos of two of the article's authors, which we've placed on the site.
"Who Owns Writing?" was Doug Hesse's Chair's Address at the 2005 CCCC, and included not only video elements in the form of a PP deck (whose slides appear in the print version), but also certain audio features unavailable in print form, so we've included a video of the Address with the slides spliced in at various points.
We've also made Kathi Yancey's 2004 CCCC Chair's Address available on the site, as it is referenced in the Interchange between her and David Laurence.

It's overly optimistic to imagine that there's going to be digital content with each subsequent issue, although I do hope. And maybe these kinds of hybrid publications will start to encourage all of us not to think of paper and electronic publications as an either/or situation. One of our early Spring projects will be to develop a fixed location for such content, including some essays/sites developed for the first iteration of CCCO.

That's about it for right now. One of the things that we're hoping to accomplish in the next year or so is to update the "feature space" on the front page with a little more frequency, and we've got a few ideas in that regard. Any suggestions you might have are welcome.

Posted by cgbrooke at 12:42 AM | Permalink | Comments (7)

If only they would feed me

(CCC Online fed through Bloglines)

Here's one of the top items on my wishlist for our field, and here's what we've done to get there. Although there aren't a lot of subscribers yet, one of the things that using MT allows us to do with CCC Online is to publish RSS/Atom feeds of new issues of the journal.

Imagine with me for a minute. Rather than having to subscribe to all the journals, to guess when they're coming out, to borrow them from colleagues, or to hear about a relevant article months after its release when someone else cites it in a paper, imagine being able to just have a folder in Bloglines, or a feed page in Safari, or a bookmark in Firefox, that simply allows you to browse the most recent articles from the various journals in our field. Imagine that, rather than asking our graduate students to figure out on their own what the journals are, we could just give them an OPML file that contained the feeds of all those journals. Imagine having all of those abstracts at your fingertips, and being able to bookmark them for later, email them to a friend you know would be interested, etc.
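For the curious, here is roughly what putting together such an OPML file could look like. This is only a minimal sketch in Python: the function name build_opml, the journal names, and the feed addresses are all made up for illustration, and it is not how we produce anything on the site.

```python
import xml.etree.ElementTree as ET


def build_opml(title, feeds):
    """Return an OPML 2.0 document (as a string) listing the given feeds.

    `feeds` is a list of (journal_name, feed_url) pairs.
    """
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for name, url in feeds:
        # One <outline> element per journal feed.
        ET.SubElement(body, "outline", type="rss", text=name, title=name, xmlUrl=url)
    return ET.tostring(opml, encoding="unicode")


# Hypothetical feed addresses, for illustration only.
FEEDS = [
    ("CCC Online", "http://example.org/ccc/atom.xml"),
    ("Another Journal", "http://example.org/another/index.rdf"),
]

if __name__ == "__main__":
    print(build_opml("Journals in our field", FEEDS))
```

Saved to a file, output like this is the kind of thing most feed readers can import in one step, which is the whole appeal.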

This is already possible with CCC Online. And in fact, it's theoretically possible for those journals that are oligopublished, like Computers and Composition or Rhetoric Review--I know that the big kids are slowly moving in a 2.0 direction. To generate a feed of new articles for CCC takes us (and this is including all of the other site features that we build in) maybe an hour or so an issue. Leave off the tagging, and the internal linking/trackbacking, and we'd be talking maybe 15 minutes, 4 times a year.

That's 1 hour. 1 hour per year.

For an hour's worth of work a year, a journal could make that metadata available in a much broader fashion and much more conveniently to the entire field. It really is that simple. Really. Just copy and paste, and a little bit of elementary design on the front end.

Maybe part of it is that I've been living with this idea for the last year or so, because it seems bone-crushingly obvious to me. It requires so little effort, and what effort is required is distributed so broadly that it's negligible. And the benefit is so clear and present--to have the last year's worth of articles in the field at our fingertips? Genius. There's no reason why publishers couldn't hop on to this as well: feeds for various subject areas, including books and chapters from edited collections.

Every once in a while, there are complaints about the flood of information we're faced with, even in a field as relatively small as ours is. We need to poke ourselves in the head, though, with the sharp fact that this is true for every discipline, much less every field of endeavor, and there are solutions out there, solutions that are pretty easy to implement and that could really transform the way we handle that flood.

That is all.

Posted by cgbrooke at 11:14 PM | Permalink | TrackBacks (1)

Turning Ten

It's hard to say whether it was that I had fallen behind, or Derek had gotten ahead, but either way, I spent most of this afternoon catching up to him. The result? I tagged and linked up a year and a half's worth of CCC articles, six issues. As always, you can visit CCC Online, and see for yourself.

One of the peculiarities of working on the site is that our archive necessarily moves in both directions--as new issues are released, we add them to the site, of course, but we're also moving steadily backwards, at roughly an issue a week, or a volume per month. The process is slightly different for either direction, and there are certain parts of the process that simply get more intensive with each new issue that we add.

But the big news is that, as of a few minutes ago, I compiled the data for Volume 47, Issue 1, which was originally published in February of 1996. Technically, it's 10.75 volumes (since there was a 2-issue volume that shifted publication from the calendar year to the academic year), but it's definitely 10 years. So in a strange way, today is the archive's 10th birthday. In less than a year, we've managed to archive 10 years' worth of the journal, interlink the issues forwards and backwards, and generate a keyword index for those 10 years, using del.icio.us.

Not bad. Not bad at all. The further along we've gotten, the more conscious I am of some of the limitations of the approach we're taking, but for a cottage project, it's pretty darn good.

In honor of the archive's 10th birthday, then, here are the top ten tags from the past ten years of the journals (203 essays indexed). As I've discussed before, we generate the tags by parsing each article for nouns and noun phrases, and then take the most frequent to use as tags. We try to keep variations minimal, but there are some obvious synonyms that we haven't combined as well--there's probably an article in all of the tiny decisions I've had to make in compiling this data...

Top Ten Tags for CCC, 1996-2006

Students (159)
Writing (129)
Composition (78)
Language (37)
Literacy (36)
Discourse (33)
Rhetoric (33)
Pedagogy (32)
Community (28)
Work (28)
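
The tagging described above (take the most frequent noun phrases in each article, then count how many articles carry each tag) can be sketched in a few lines of Python. This is an assumption-laden illustration, not our actual workflow: it presumes the noun phrases have already been extracted somehow, and the names top_tags and tag_index are invented for the example.

```python
from collections import Counter


def top_tags(noun_phrases, n=5):
    """Pick the n most frequent phrases from one article as its tags.

    Phrases are lowercased and stripped so that trivial variations
    ("Students" vs. "students") are counted together.
    """
    normalized = [p.strip().lower() for p in noun_phrases if p.strip()]
    return [phrase for phrase, _ in Counter(normalized).most_common(n)]


def tag_index(tags_by_article):
    """Count how many articles carry each tag, most common first.

    `tags_by_article` maps an article id to its list of tags.
    """
    counts = Counter()
    for tags in tags_by_article.values():
        counts.update(set(tags))  # each article counts once per tag
    return counts.most_common()
```

Note that nothing here merges synonyms; those tiny decisions still have to be made by hand.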

(I should mention that Derek and I will be talking about the site more generally next month at CCCC, as part of the Computer Connection. You'll find us there on Thursday at 1:45, appropriately enough during the C Session.)

That's all.

Posted by cgbrooke at 8:05 PM | Permalink

A talk in search of a stage

Those of you who follow my scholarly career as closely and in as much painstaking detail as I myself do will notice that one particular vector that I've followed with some consistency is the exploration, experimentation (with), and implementation (PPT) of various visual/spatial tools for writing. In other words, I find myself drawn, time and again, to different ways of writing, different means of expressing the kind of thinking that I do as a scholar.

Funny thing about this is that I didn't realize this myself until fairly recently. I've been working first with hypertexts and webtexts and later with new media more broadly conceived for over 10 years now, and I think that one of the things that drives that work is an underlying conviction on my part that electracy allows me to write in ways that feel more comfortable to me than do those supported by pencils, typewriters, and word processors. It's early in the morning, so forgive me for waxing a little philosophical here.

Anyhow, over the past year or so, I've been using Keynote as a composing tool, mostly for talks that I've given, but also as a means of visually and spatially writing before I commit my ideas to sentences and paragraphs. When people started screencasting, I was excited about the possibility of being able to do even more with it. But I've had trouble finding the right combination of tools for myself. Enter ProfCast, a $35 app that allows you to record voice on top of Keynote or PPT slides while preserving the timing of the slides. Finally, it allows you to publish the results as podcasts/screencasts (with RSS feeds to boot). The idea behind it is that it's a tool that would allow professors giving PPT-assisted lectures to record both the voice and slides, and package them together for their students.

So what we have linked below is my first crack at a screencast written in Keynote, then scripted, and recorded with ProfCast. It's about 12 minutes long, and runs a little larger than 8 MB (8.1, I think). It's an MPEG-4 file, and I was able to view it on my machine using QuickTime without any trouble. The slides are vector-based for the most part, so you can watch it full-screen without any fuzziness--in fact, it's probably better displayed large than small, so I recommend downloading it to watch it.

It's a little rough around the edges, but not bad for a first try, and the ideas in it are ones that I've been batting around in different forms (and different forums) for the past year. Enjoy.

Posted by cgbrooke at 4:17 AM | Permalink | Comments (4) | TrackBacks (1)

The CCC Top Ten

Okay, not really.

The list below is of the top ten CCC articles as measured by the number of times that they are cited by subsequent CCC articles. It's not exactly the be-all, end-all of bibliometrics, but rather a single dimension of what would have to be a much more extensive data set (if I wanted to start making substantial (and substantiable) claims).

But from my perspective, it's another of those little pieces that makes CCC Online interesting for me to work on. I've set up a page that will update as we add/index more content both forwards and backwards in time.
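
For anyone wondering what a count like this involves, here is a minimal sketch in Python. It assumes the citation data is available as a simple mapping from each article to the articles it cites; the function name most_cited and that data shape are assumptions for illustration, not a description of how the page on the site is built.

```python
from collections import Counter


def most_cited(citations, n=10):
    """Rank articles by how many other articles in the set cite them.

    `citations` maps each citing article's id to the ids it cites.
    A citing article counts at most once per cited article, and an
    article citing itself is ignored.
    """
    counts = Counter()
    for citing, cited_list in citations.items():
        for cited in set(cited_list):
            if cited != citing:
                counts[cited] += 1
    return counts.most_common(n)
```

Because the ranking is recomputed from whatever is in the mapping, it shifts as more issues are indexed in either direction.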

If I have a little time in the next day or two, I'll add a column in the table with the year that the article was published as well, although rolling over the links will flash their month/year combo. Interesting to note, perhaps, that the most recent article on this list comes from December 1997 (Ball & Lardner). It makes sense that there would be some lag between publication and subsequent citations, but the majority of articles on this list are more than a decade old. I leave it to you to hypothesize what this might mean...

At the very least, I suppose, a list like this would be a place to begin for someone new to the field--there are worse ways of figuring out where to start.

That's all...

Posted by cgbrooke at 2:33 AM | Permalink | Comments (5)

CCCO thoughts

Over at if:book, Ray Cha relays and recommends an upcoming chapter from Clifford Lynch, about moving beyond "reader-centric views of scholarly literature." It has much in common with Franco Moretti's work on literary history, and is worth reading for that reason alone.

But I'm also on the lookout for ways to articulate just what it is we're trying to do with CCC Online, and Lynch's piece fits the bill. Namely...

We would also see an explosion in services that provided access to this literature in new and creative ways. Such services would also incorporate specialized vocabulary databases, gazetteers, factual databases, ontologies, and other auxiliary tools to enhance indexing and retrieval. They would rapidly transcend access to address navigation and analysis. One path here leads towards more-customized rehosting of scholarly literatures and underlying evidence into new usage and analysis environments attuned to the specific scholarly practices of various disciplines.
We would also see a move beyond federation and indexing to actual text mining and analysis, to the extraction of hypotheses and correlations that would help to drive ongoing scholarly inquiry. Indeed, the literature would be embedded in a computational context that reorganized and re-evaluated the existing body of knowledge as new literature became available.

That excerpt separates nicely into what I think we're already doing at the site, although not perhaps to the extent that Lynch imagines it, and the second half, which in many ways is the prize that we've got our long-term eyes on. If you don't think we're watching projects like this and this, well, you don't know us very well. Heh.

I'm less worried about the potential objections that Cha raises at the end of his post--"Purists will undoubtedly frown upon the use of computation that cannot be replicated by humans in scholarly research"--than I am about getting to the point where such objections can be raised. In other words, I believe that such work, if it can generate compelling results, will override knee-jerk complaints. I think it's also going to be necessary, in our own field at least, to be very careful to qualify the value of this work appropriately. Not that that's always been enough, especially when it comes to quasi-statistical work, which tends to run afoul of the old "me humanities. me hate math." goofiness.

Two other points. First is one that I'm guessing some people will not appreciate, and that's that, to an extent, this work is fairly easily decoupled from the "open access" that appears to drive Lynch's piece. That is, the value of data mining is offered as a consequence of open access, and while that is true at a very large scale, I think it possible to do quite a bit in this area without it, honestly. We've been able to provide the metadata we wanted without having to open up the journal's content, even if we might have preferred it otherwise. And I think that some pretty entrenched attitudes will need to change for what Lynch describes to be more than a thought experiment. Not that they shouldn't change, but I'm not sure how far they actually need to, for this at least.

Second point is that we use a fairly small, fairly simple suite of tools to do what we're doing now. We had to cobble stuff together, and we've done so fairly successfully, but it shouldn't go unmentioned that a couple of good programmers would go a long way towards making this a lot more doable. Personally, I have enough ability to tweak, and I'm pretty good at making MT modules do what I want them to, but we spent a fair bit of time just cobbling. I'm conscious of how much more efficient our system could be.

And yeah, it's only one journal that we're working on, and all things considered, we really have to pace things more slowly than I'd like. But it's also our flagship journal, and if nothing else, we tackled the biggest job first, in designing and testing it on CCC. There's going to be some real value in what we're doing, even if it doesn't hit the scale that Lynch imagines. And we're a pretty solid model for how to accomplish these goals on both a small scale and approaching it from the bottom up.

That is all.

Posted by cgbrooke at 2:24 AM | Permalink

The off-4Cson

My title only works when you understand that 4Cs is pronounced "four seas." I'm just saying.

Unless you happen to be involved with the behind-the-scenes work of the conference, there are basically 2 times during the year when the dreams of rhetcompers turn to their annual conference. The first, and most elaborate, is March, when the conference itself happens. The second, though, is right now, the week when notifications are made for the following March. Word on the street is that the conference acceptance rate is now hovering somewhere around 33%--that there were maybe 600 accepted out of about 1800 submissions. Hard to know exactly how those numbers play out--some proposals are for 3-5 person panels and some individual--but still, it's a pretty big deal.

So it was a little aggravating this week, as everyone was receiving their notifications, to not receive my own. I did hear about a panel that I'm chairing, but that only served to confirm that my email was indeed working. As I noted a couple of years ago,

Notification is always something of an odd season around grad programs--on the one hand, CCCC is selective enough that you expect a little bit of congratulations; on the other, no one really asks anyone else, for fear that they didn't get accepted.

So I pretty much just kept my mouth shut, and vowed to give it a few days, figuring that if I hadn't heard by this weekend, I'd fire off a Monday email to see. Well, my patience was rewarded with the news today that Deb Holdstein, Derek, and I will be doing a Featured Session at the 2007 CCCC, one where we talk about the relationship between the journal, both print and online, and the discipline. Rock. Roll.

I think the plan is to do some revising to the text of our abstract, and perhaps even the title, so once that's done, I'll be sure to post it here. In the meantime, I'm just going to sit back and bask in the glow of the fact that our work on CCCO is going to be featured in disciplinary primetime. CCCC was really the last piece of a speaking puzzle for the year that includes four (four!) different conferences and perhaps a job talk or two. Hence the whole lot of talking that I referred to a couple of days back. And hence the recent addition to my speakerly arsenal. I can't guarantee that I'll be good, but I'll almost certainly be better.

That's the plan, anyway.

Posted by cgbrooke at 11:54 PM | Permalink | Comments (7)

One step forward...to "sorta" and beyond!!

You may recall how, once upon a time, certain of us (blogeurs) were, shall we say, disinterpellated by particular long-time members of a disciplinary listserv? Well, you'll be pleased to know that, compared to that lovely episode, the following marks a real step forward. In the process of discussing some recent upgrades to CompPile, one loyal user remarks that it would be nice if that site included the 7 most recent years of scholarship:

can you find a way to update to more recent years? I know that the CCCC project is doing that, sorta, but I never do get around to checking it after the great convenience of comppile. Maybe some kind of link, so as not to duplicate effort?

Now, I'm not exactly sure what the "CCCC project" is, but since our site shares 3 of those C's, and we are a project, I can only surmise that "sorta" is meant as a grudging acknowledgment of our efforts over the past 2 years. We sorta belong, at long long last! Why, we might even rate a link, if we're lucky.

Yes, I'm chock full of sarcasm, because the inconvenience of, say, bookmarking our site is apparently too much to ask of this user. I can only imagine that it's too much, because once you arrive at our site, there are only 10 or so different ways that you might search for scholarship:

by typing an author's name into the search bar
by typing a word or two from the title into the search bar
by typing a keyword into the search bar
by using the search bar to track down something in a bibliography
by following a link from something that has cited the thing you're looking for
by following a link from something that the thing you're looking for has cited
by using the drop down menu that links to the last 15 years of issues
by exploring the CCCC categories, each of which contains dozens of articles
by clicking on a tag, and seeing all of the other articles that are similarly tagged
by visiting delicious, where all such tags can be ordered by frequency or alphabetically

I don't talk a lot about CompPile, because I really respect the efforts of the individuals who maintain it. The model that they're working with, on the other hand, is unsustainable, except through Herculean effort, and it only scratches the surface of what databases could be allowing us to do in this field. Heck, we're only scratching the surface, but at the very least, we're getting beyond the "bob for apples" model of search that still seems to dominate a lot of the discussions I see.

Mainly, I have to remind myself that they're not responsible for what said loyal user posts to the list. And I'm content to work along, to improve our site, and to make it a tool that rewards the efforts of both new and experienced researchers. Heck, if we keep at it, by the end of the decade, he might even acknowledge us by name.

Snarkography complete.

Posted by cgbrooke at 6:02 PM | Permalink | Comments (5)

If Tuesday began with the letters CSH...

Then I could tell you that those letters stand for "crushing seasonal headache." The news round these parts is that the temperature reached into the low 60s today, which has been good for The Melt, but bad for My Head. When seasons change, the corresponding shift in pressure typically renders me unable to focus for 2-3 days at a time, bringing with it dull, throbbing headaches of the sort that quite literally make my eyeballs sore. Needless to say, sleep becomes something of a chore, rivaled only by the effort that goes into being awake. Not the happiest of times.

I've been giving some thought to the presentation I'll be giving at CCCC this year. Inspired in part by last week's snarky little entry, which itself prompted me to add "snark alert" to my categories, I've been dialing back my expectations for what I'll accomplish in this presentation. It's hard, having been working on CCCOA for two-plus years now, to imagine that there aren't folks in our field who remain unfamiliar with it, and yet, my guess is that this is actually a fair description of most folks in our field. The speed of change in the 'sphere--and on the net more generally--outpaces that of the run-of-the-mill discipline, perhaps exponentially. And so, what I think I need to do in my talk is to actually introduce the site and what it contributes.

Right now, I'm thinking of an unofficial subtitle for my talk that would be something like "13 Ways of Looking at a Journal." Mostly it would be an introduction to the site, running from the most basic and obvious features to some of the trickier stuff we've built into it, and finally to a couple of disciplinary questions that a site like this can provide us the evidence to work on.

I've been thinking about this a little harder after seeing Tim Burke's post about what he describes as "search as alchemy." To wit,

But there are other times where I want search to be alchemy, to turn the lead of an inquiry into unexpected gold. I’m hoping that the rush to simplify, speed up, demystify and digitize search doesn’t leave that alchemy behind.

It seems like such an obvious point to me, that academic search functions in much different ways than "regular" search, but what's come clear to us over the past couple of years is that we need to figure out better ways of getting the word out, to make the case that CCCOA is a site for search, yes, but also a site of invention. I think that message is both clear and obvious to many of you, my fair readers, but to the field-at-large, it still needs saying.

So I think that's part of what I'll be saying next week.

Posted by cgbrooke at 7:27 PM | Permalink | Comments (6)

Re/Visions are Live

I'm assuming that the issues themselves are going into the mail soon, but if you visit the NCTE site (which I seem to be doing a lot lately), you'll find the most recent issue of CCC available, which includes the Re/Visions piece from Anne, Jeff, and me.

The issue index is here, and the article itself is available here. You'll need to be a subscriber to download it, though. If you want a free copy of the Janangelo article, it's available on the front page of the CCC Online Archive.

I'm just heading out; otherwise, I wouldn't violate the rule against deictic linking. Sorry about that.

Posted by cgbrooke at 6:56 PM | Permalink | Comments (4)

Collin Vs. Blog

December 14, 2005

A CCCO status report

February 2, 2006

If only they would feed me

February 27, 2006

Turning Ten

March 5, 2006

A talk in search of a stage

July 20, 2006

The CCC Top Ten

August 10, 2006

CCCO thoughts

September 15, 2006

The off-4Cson

March 5, 2007

One step forward...to "sorta" and beyond!!

March 13, 2007

If Tuesday began with the letters CSH...

November 30, 2007

Re/Visions are Live