You Report. We Decide?

“It’s one of the real black marks on the history of higher education,” Leon Botstein, the long-time President of Bard College, recently told The New Yorker’s Alice Gregory, “that an entire industry that’s supposedly populated by the best minds in the country … is bamboozled by a third-rate news magazine.” He was objecting, of course, to the often-criticized but widely influential rankings of colleges and universities by US News & World Report.

Two stories, and a cautionary note.

Wired

Seeing Wired magazine’s annual “wired campus” rankings in the same way Botstein viewed those from US News, some years ago several of us college and university CIOs conspired to disrupt Wired’s efforts. As I later wrote, the issue wasn’t that some campuses had different (and perhaps better or worse) IT than others. Rather, for the most part these differences bore little relevance to the quality of those campuses’ education or the value they provided to students.

We persuaded almost 100 key campuses to withhold IT data from Wired. After meeting with us to see whether compromise was possible (it wasn’t), and after an abortive attempt to bypass campus officials and gather data directly from students, the magazine discontinued its ratings. Success.

But, as any good pessimist knows, every silver lining has a cloud. Wired had published not only summary ratings, but also, to its credit, the data (if not the calculations) upon which the ratings were based. Although the ratings were questionable, and some of the data seemed suspect, the data nevertheless had some value. Rather than look at ratings, someone at Campus A could see how A’s reported activity in a specific area compared with that of its peer, Campus B.

Partly to replace the data Wired had gathered and made available, and so extend A’s ability to see what B was doing, EDUCAUSE started the Core Data Survey (now the Core Data Service, CDS). This gathered much of the same information Wired had, and more. (Disclosure: I served on the committee that helped EDUCAUSE design the initial CDS, and revised it a couple of years later, and have long been a supporter of the effort.)

Unlike Wired, EDUCAUSE does not make individual campus data publicly available. Rather, participating campuses can compare their own data to those of all or subsets of other campuses, using whatever data and comparison algorithm they think appropriate. I can report from personal experience that this is immensely useful, if only because it stimulates and focuses discussions among campuses that appear to have made different choices.

But back to Botstein. EDUCAUSE doesn’t just make CDS data available to participating campuses. It also uses CDS data to develop and publish “Free IT Performance Metrics,” which it describes as “Staffing, financials, and services data [campuses] can use for modifications, enhancements, and strategic planning.” The heart of Botstein’s complaint about US News & World Report isn’t that the magazine is third rate–that’s simply Botstein being Botstein–but rather that US News believes the same rating algorithm can be validly used to compare campuses.

Which raises the obvious question: Might EDUCAUSE-developed “performance metrics” fall into that same trap? Are there valid performance metrics for IT that are uniformly applicable across higher education?

Many campuses have been bedeviled and burned by McKinseys, BCGs, Accentures, Bains, PWCs, and other management consultants. These firms often give CFOs, Provosts, and Presidents detailed “norms” and “standards” for things like number of users per help-desk staffer, the fraction of operating budgets devoted to IT, or laptop-computer life expectancy. These can then become targets for IT organizations, CIOs, or staff in budget negotiations or performance appraisal.

Some of those “norms” are valid. But many of them involve inappropriate extrapolation from corporate or other different environments, or implicitly equate all campus types. Language is important: “norms,” “metrics,” “benchmarks,” “averages,” “common,” “typical,” and “standards” don’t mean the same thing. So far EDUCAUSE has skirted the problem, but it needs to be careful to avoid asserting uniform validity when there’s no evidence for it.

US News

A second story illustrates a different, more serious risk. A few years ago a major research university–I’ll call it Lake Desert University or LDU–was distressed about its US News ranking. To LDU’s leaders, faculty, and students the ranking seemed much too low: Lake Desert generally ranked higher elsewhere.

A member of the provost’s staff–Pat, let’s say–was directed to figure out what was wrong. Pat spent considerable time looking at US News data and talking to its analysts. An important component of the US News ranking algorithm, Pat learned, was class size. The key metric was the fraction of campus-based classes with enrollments smaller than 20.

Pat, a graduate of LDU, knew that there were lots of small classes at Lake Desert–the university’s undergraduate experience was organized around tutorials with 4-5 students–and so it seemed puzzling that LDU wasn’t being credited for that. Delving more deeply, Pat found the problem. Whoever had completed LDU’s US News questionnaire had read the instructions very literally, decided that “tutorials” weren’t “classes,” and so excluded them from the reporting counts. Result: few small classes, and a poor US News ranking.

US News analysts told Pat that tutorials should have been counted as classes. The following year, Lake Desert included them. Its fraction-of-small-classes metric went up substantially. Its ranking jumped way up. The Provost sent Pat a case of excellent French wine.

In LDU’s case, understanding the algorithm and looking at the survey responses unearthed a misunderstanding. Correcting this involved no dishonesty (although some of LDU’s public claims about the “improvement” in its ranking neglected to say that the improvement had resulted from data reclassification rather than substantive progress).

Caution

But not all cases are as benign as LDU’s. As I wrote above, there were questions not only about Wired’s ranking algorithm, but about some of the data campuses provided. Lake Desert correcting its survey responses in consultation with analysts is one thing; a campus misrepresenting its IT services to get a higher ranking is another. But it can be hard to distinguish the two.

Auditing is one way to address this problem, but audits are expensive and difficult. Publishing individual responses is another–both Wired and US News have done this, and EDUCAUSE shares them with survey respondents–but that only corrects the problem if respondents spend time looking at other responses, and are willing to become whistleblowers when they find misrepresentation. Most campuses don’t have the time to look at other campuses’ responses, or the willingness to call out their peers.

If survey responses are used to create ratings, and those ratings become measures of performance, then those whose performance is being measured have an incentive to tailor their survey responses accordingly. If the tailoring involves just care within the rules, that’s fine. But if it involves stretching or misrepresenting the truth, it’s not.

More generally, it’s important to closely connect the collection of data to their evaluative use. Whoever reports should decide.

 

 

 
