
Notes From (or is it To?) the Dark Side

“Why are you at NBC?” people ask. “What are you doing over there?” too, and “Is it different on the dark side?” A year into the gig seems a good time to think about those. Especially that “dark side” metaphor. For example, which side is “dark”?

This is a longer-than-usual post. I’ll take up the questions in order: first Why, then What, then Different; use the links to skip ahead if you prefer.

Why are you at NBC?

This is the first time I’ve worked at a for-profit company since, let’s see, the summer of 1967: an MIT alumnus arranged an undergraduate summer job at Honeywell’s Mexico City facility. Part of that summer I learned a great deal about the configuration and construction of custom control panels, especially for big production lines. I think of this every time I see photos of big control panels, such as those at older nuclear plants—I recognize the switch types, those square toggle buttons that light up. (Another part of the summer, after the guy who hired me left and no one could figure out what I should do, I made a 43½-foot paper-clip chain.)

One nice Honeywell perk was an employee discount on a Pentax 35mm SLR with 40mm and 135mm lenses, which I still have in a box somewhere, and which still works when I replace the camera’s light-meter battery. (The Pentax brand belonged to Honeywell back then, not Ricoh.) Excellent camera, served me well for years, through two darkrooms and a lot of Tri-X film. I haven’t used it since I began taking digital photos, though.

I digress. Except, it strikes me, not really. One interesting thing about digital photos, especially if you store them online and make most of them publicly visible (like this one, taken on the rim of spectacular Bryce Canyon, from my Backdrops collection), is that sometimes the people who find your pictures download them and use them for their own purposes. My photos carry a Creative Commons license specifying that although they are my intellectual property, they can be used for nonprofit purposes so long as they are attributed to me (an option not available, apparently, if I post them on Facebook instead).

So long as those who use my photos comply with the CC license requirement, I don’t require that they tell me, although now and then they do. But if people want to use one of my photos commercially, they’re supposed to ask my permission, and I can ask for a use fee. No one has done that for me—I’m keeping the day job—but it’s happened for our son.

I hadn’t thought much about copyright, permissions, and licensing for personal photos (as opposed to archival, commercial, or institutional ones) back when I first began dealing with “takedown notices” sent to the University of Chicago under the Digital Millennium Copyright Act (DMCA). There didn’t seem to be much of a parallel between commercialized intellectual property, like the music tracks that accounted for most early DMCA notices, and my photos, which I was putting online mostly because it was fun to share them.

Neither did I think about either photos or music while serving on a faculty committee rewriting the University’s Statute 18, the provision governing patents in the University’s founding documents.

The issues for the committee were fundamentally two, both driven somewhat by the evolution of “textbooks”.

First, where is the line between faculty inventions, which belong to the University (or did at the time), and creations, which belong to creators—between patentable inventions and copyrightable creations, in other words? This was an issue because textbooks had always been treated as creations, but many textbooks had come to include software (back then, CDs tucked into the back cover), and software had always been treated as an invention.

Second, who owns intellectual property that grows out of the instructional process? Traditionally, the rights and revenues associated with textbooks, even textbooks based on University classes, belonged entirely to faculty members. But some faculty members were extrapolating this tradition to cover other class-based material, such as videos of lectures. They were personally selling those materials and the associated rights to outside entities, some of which were in effect competitors (in some cases, they were other universities!).

As you can see by reading the current Statute 18, the faculty committee really didn’t resolve any of this. Gradually, though, it came to be understood that textbooks, even textbooks including software, were still faculty intellectual property, whereas instructional material other than that explicitly included in traditional textbooks was the University’s to exploit, sell, or license.

With the latter well established, the University joined Fathom, one of the early efforts to commercialize online instructional material, and put together some excellent online materials. Unfortunately, Fathom, like its first-generation peers, failed to generate revenues exceeding its costs. Once it blew through its venture capital, which had mostly come from Columbia University, Fathom folded. (Poetic justice: so did one of the profit-making institutions whose use of University teaching materials prompted the Statute 18 review.)

Gradually this all got me interested in the thicket of issues surrounding campus online distribution and use of copyrighted materials and other intellectual property, and especially the messy question of how campuses should think about copyright infringement occurring within and distributed from their networks. The DMCA had established the dual principles that (a) network operators, including campuses, could be held liable for infringement by their network users, but (b) they could escape this liability (find “safe harbor”) by responding appropriately to complaints from copyright holders. Several of us research-university CIOs worked together to develop efficient mechanisms for handling and responding to DMCA notices, and to help the industry understand those and the limits on what they might expect campuses to do.

As one byproduct of that, I found myself testifying before a Congressional committee. As another, I found myself negotiating with the entertainment industry, under US Education Department auspices, to develop regulations implementing the so-called “peer to peer” provisions of the Higher Education Opportunity Act of 2008.

That was one of several threads that led to my joining EDUCAUSE in 2009. One of several initiatives in the Policy group was to build better, more open communications between higher education and the entertainment industry with regard to copyright infringement, DMCA, and the HEOA requirements.

I didn’t think at the time about how this might interact with EDUCAUSE’s then-parallel efforts to illuminate policy issues around online and nontraditional education, but there are important connections. Through massive open online courses (MOOCs) and other mechanisms, colleges and universities are using the Internet to reach distant students, first to build awareness (in which case it’s okay for what they provide to be freely available) but eventually to find new revenues, that is, to monetize their intellectual property (in which case it isn’t).

If online campus content is to be sold rather than given away, then campuses face the same issues as the entertainment industry: They must protect their content from those who would use it without permission, and take appropriate action to deter or address infringement.

Campuses are generally happy to make their research freely available (except perhaps for inventions), as UChicago’s Statute 18 makes clear, provided that researchers are properly credited. (I also served on UChicago’s faculty Intellectual Property Committee, which among other things adjudicated who-gets-credit conflicts among faculty and other researchers.) But instruction is another matter altogether. If campuses don’t take this seriously, I’m afraid, then as goes music, so goes online higher education.

Much as campus tumult and changes in the late Sixties led me to abandon engineering for policy analysis, and quantitative policy analysis led me into large-scale data analysis, and large-scale data analysis led me into IT, and IT led me back into policy analysis, intellectual-property issues led me to NBCUniversal.

I’d liked the people I met during the HEOA negotiations, and the company seemed seriously committed to rethinking its relationships with higher education. I thought it would be interesting, at this stage in my career, to do something very different in a different kind of place. Plus, less travel (see screwup #3 in my 2007 EDUCAUSE award address).

So here I am, with an office amidst lobbyists and others who focus on legislation and regulation, with a Peacock ID card that gets me into the Universal lot, WRC-TV, and 30 Rock (but not SNL), and with a 401(k) instead of a 403(b).

What are you doing over there?

NBCUniversal’s goals for higher education are relatively simple. First, it would like students to use legitimate sources to get online content more, and illegitimate “pirate” sources less. Second, it would like campuses to reduce the volume of infringing material made available from their networks to illegal downloaders worldwide.

My roles are also two. First, there’s eagerness among my colleagues (and their counterparts in other studios) to better understand higher education, and how campuses might think about issues and initiatives. Second, the company clearly wants to change its approach to higher education, but doesn’t know what approaches might make sense. Apparently I can help with both.

To lay the foundation for specific projects—five so far, which I’ll describe briefly below—I looked at data from DMCA takedown notices.

Curiously, it turned out, no one had done much to analyze detected infringement from campus networks (as measured by DMCA notices sent to them), or to delve into the ethical puzzle: Why do students behave one way with regard to misappropriating music, movies, and TV shows, and very different ways with regard to arguably similar options such as shoplifting or plagiarism? I’ve written about some of the underlying policy issues in Story of S, but here I decided to focus first on detected infringement.

It turns out that virtually all takedown notices for music are sent by the Recording Industry Association of America, RIAA (the Zappa Trust and various other entities send some, but they’re a drop in the bucket).

Most takedown notices for movies and some for TV are sent by the Motion Picture Association of America, MPAA, on behalf of major studios (again, with some smaller entities such as Lucasfilm wading in separately). NBCUniversal and Fox send out notices involving their movies and TV shows.

I’ve now analyzed data from the major senders for both a twelve-month period (Nov 2011–Oct 2012) and a more recent two-month period (Feb–Mar 2013). For the more recent period, I obtained very detailed data on each of 40,000 or so notices sent to campuses. Here are some observations from the data:

  • Almost all the notices went to 4-year campuses that have at least 100 dormitory beds (according to IPEDS). To a modest extent, the bigger the campus the more notices, but the correlation isn’t especially strong.
  • Over half of all campuses—even of campuses with dorms—didn’t get any notices. To some extent this is because there are lots and lots of very small campuses, and they fly under the infringement-detection radar. But I’ve learned from talking to a fair number of campuses that, much to my surprise, many heavily filter or even block peer-to-peer traffic at their commodity Internet border firewall—usually because the commodity bandwidth p2p uses is expensive, especially for movies, rather than to deal with infringement per se. Outsourced dorm networks also have an effect, but I don’t think they’re sufficiently widespread yet to explain the data.
  • Several campuses have out-of-date or incorrect “DMCA agent” addresses registered at the Library of Congress. Compounding that, it turns out some notice senders use “abuse” or other standard DNS addresses rather than the registered agent addresses.
  • Among campuses that received notices, a few campuses stand out for receiving the lion’s share, even adjusting for their enrollment. For example, the top 100 or so recipient campuses got about three quarters of the total, and a handful of campuses stand out sharply even within that group: the top three campuses (the leftmost blue bars in the graph below) accounted for well over 10% of the notices. (I found the same skewness in the 2012 study.) With a few interesting exceptions (interesting because I know or suspect what changed), the high-notice groups have been the same for the two periods.
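The skew in that last bullet is easy to check once you have per-campus counts. Here’s a minimal Python sketch of the top-N share computation; the numbers are invented, and real counts would come from the notice logs:

```python
# Hypothetical sketch: how concentrated are DMCA notice counts across campuses?
# The 'counts' list stands in for per-campus notice totals (invented numbers).

def top_share(counts, n):
    """Fraction of all notices received by the n highest-count campuses."""
    ranked = sorted(counts, reverse=True)
    total = sum(ranked)
    return sum(ranked[:n]) / total if total else 0.0

counts = [5200, 4100, 3800, 900, 850, 700, 420, 300, 120, 60, 10, 0, 0]
print(round(top_share(counts, 3), 2))  # share held by the top three campuses
```

With skew like the real data showed, a handful of campuses dominate the total even though most campuses receive few or no notices.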

The detection process, in general, is that copyright holders choose a list of music, movie, or TV titles they believe likely to be infringed. Their contractors then use BitTorrent tracker sites and other user tools to find illicit sources for those titles.

For the most part the studios and associations simply look for titles that are currently popular in theaters or from legitimate sources. It’s hard to see that process introducing a bias that would affect some campuses so much differently than others. I’ve also spent considerable time looking at how a couple of contractors verify that titles being offered illicitly (that is, listed for download on a BitTorrent tracker site such as The Pirate Bay) are actually the titles being supplied (rather than, say, malware, advertising, or porn), and at how they figure out where to send the resulting takedown notices. That process too seems pretty straightforward and unbiased.

Sender choices clearly can influence how notice counts vary from time to time: for example, adding a newly popular title to the search list can lead to a jump in detections and hence notices. But it’s hard to see how the choice of titles would influence how notice counts vary from institution to institution.

This all leads me to believe that takedown notices tell us something incomplete but useful about campus policies and practices, especially at the extremes. The analysis led directly to two projects focused on specific groups of campuses, and indirectly to three others.

Role Model Campuses

Based on the results of the data analysis, I communicated individually with CIOs at 22 campuses that received some but relatively few notices: specifically, campuses that (a) received at least one notice (and so are on the radar) but (b) fewer than 300, and fewer than 20 per thousand student headcount, (c) have at least 7,500 headcount students, and (d) have at least 10,000 dorm beds (per IPEDS) or sufficient dorm beds to house half their headcount. (These are Group 4, the purple bars in the graph below. The solid bars represent total notices sent, and the hollow bars represent incidence, or notices per thousand headcount students. Click on the graph to see it larger.)
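That screening rule is mechanical enough to sketch in code. Everything here is hypothetical: the field names and sample rows are invented, and the real inputs would come from notice logs and IPEDS:

```python
# Hypothetical sketch of the Group 4 ("role model") screen described above.

def is_role_model(c):
    per_thousand = c["notices"] / (c["headcount"] / 1000)
    return (
        c["notices"] >= 1                      # on the radar at all
        and c["notices"] < 300                 # but relatively few notices
        and per_thousand < 20                  # ...even adjusted for size
        and c["headcount"] >= 7500             # big enough to matter
        and (c["beds"] >= 10000 or c["beds"] >= c["headcount"] / 2)
    )

campuses = [
    {"name": "A", "notices": 120, "headcount": 20000, "beds": 11000},
    {"name": "B", "notices": 0,   "headcount": 30000, "beds": 15000},
    {"name": "C", "notices": 900, "headcount": 25000, "beds": 12000},
]
print([c["name"] for c in campuses if is_role_model(c)])  # → ['A']
```

Campus B fails because it received no notices at all (and so may simply be invisible to detection), and C fails on sheer volume; only A passes all five tests.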

I’ve asked each of those campuses whether they’d be willing to document their practices in an open “role models” database developed jointly by the campuses and hosted by a third party such as a higher-education association (as EDUCAUSE did after the HEOA regulations took effect). The idea is to make a collection of diverse effective practices available to other campuses that might want to enhance their own.

High Volume Campuses

Separately, I communicated privately with CIOs at 13 campuses that received exceptionally many notices, even adjusting for their enrollment (Group 1, the blue bars in the graph). I’ve looked in some detail at the data for those campuses, some large and some small, and in some cases that’s led to suggestions.

For example, in a few cases I discovered that virtually all of a high-volume campus’s notices were split evenly among a small number of consecutive IP addresses. In those cases, I’ve suggested that those IP addresses might be the front-end to something like a campus wireless network. Filtering or blocking p2p (or just BitTorrent) traffic on those few IP addresses (or the associated network devices) might well shrink the campus’s role as a distributor without affecting legitimate p2p or BitTorrent users (who tend to be managing servers with static addresses).
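That pattern is straightforward to test for once you have per-notice IP addresses. A sketch, with invented addresses and a made-up helper name, of flagging a small run of consecutive IPs that accounts for most of a campus’s notices:

```python
# Hypothetical sketch: flag cases where most of a campus's notices hit a
# small run of consecutive IP addresses (e.g., NAT front-ends for a campus
# wireless network). Addresses and counts are invented.
import ipaddress
from collections import Counter

def concentrated_block(ips, span=8, threshold=0.9):
    """Return the first address of a `span`-address run holding at least
    `threshold` of all notices, or None if no such run exists."""
    counts = Counter(int(ipaddress.ip_address(ip)) for ip in ips)
    total = len(ips)
    for start in counts:
        in_run = sum(counts.get(start + i, 0) for i in range(span))
        if in_run / total >= threshold:
            return str(ipaddress.ip_address(start))
    return None

# 95 of 100 notices fall on three consecutive addresses:
ips = (["192.0.2.10"] * 40 + ["192.0.2.11"] * 35
       + ["192.0.2.12"] * 20 + ["198.51.100.7"] * 5)
print(concentrated_block(ips))  # → 192.0.2.10
```

A hit like this suggests filtering p2p on just those few addresses, rather than campus-wide.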

Symposia

Back when I was at EDUCAUSE, we worked with NBCUniversal to host a DC meeting between senior campus staff from a score of campuses nationwide and some industry staff closely involved with the detection and notification for online infringement. The meeting was energetic and frank, and participants from both sides went away with a better sense of the other’s bona fides and seriousness. This was the first time campus staff had gotten a close look at the takedown-notice process since a Common Solutions Group meeting in Ann Arbor some years earlier; back then the industry’s practices were much less refined.

Based on the NBCUniversal/EDUCAUSE experience, we’re organizing a series of regional “Symposia” along these lines on campuses in various cities across the US. The objectives are to open new lines of communication and to build trust. The invitees are IT and student-affairs staff from local campuses, plus several representatives from industry, especially the groups that actually search for infringement on the Internet. The first was in New York, the second in Minneapolis, the third will be in Philadelphia, and others will follow in the West, the South, and elsewhere in the Midwest.

Research

We’re funding a study within a major state university system to gather two kinds of data. Initially the researchers are asking each campus to describe the measures it takes to “effectively combat” copyright infringement: its communications with students, its policies for dealing with violations, and the technologies it uses. The data from the first phase will help enhance a matrix we’ve drafted outlining the different approaches taken by different campuses, complementing what will emerge from the “role models” project.

Based on the initial data, the researchers and NBCUniversal will choose two campuses to participate in the pilot phase of the Campus Online Entertainment Initiative (which I’ll describe next). In advance of that pilot, the researchers will gather data from a sample of students on each campus, asking about their attitudes toward and use of illicit and legitimate online sources for music, movies, and video. They’ll then repeat that data collection after the pilot term.

Campus Online Entertainment Initiative

Last but least in neither ambition nor complexity, we’re crafting a program that will attempt to address both goals I listed earlier: encouraging campuses to take effective steps to reduce distribution of infringing material from their networks, and helping students to appreciate (and eventually prefer) legitimate sources for online entertainment.

Working with Universal Studios and some of its peers, we’ll encourage students on participating campuses to use legitimate sources by making a wealth of material available coherently and attractively—through a single source that works across diverse devices, and at a substantial discount or with similar incentives.

Participating campuses, in turn, will maintain or implement policies and practices likely to shrink the volume of infringing material available from their networks. In some cases the participating campuses will already be like those in the “role models” group; in others they’ll be “high volume” or other campuses willing to adopt more effective practices.

I’m managing these projects from NBCUniversal’s Washington offices, but with substantial collaboration from company colleagues here, in Los Angeles, and in New York; from Comcast colleagues in Philadelphia; and from people in other companies. Interestingly, and to my surprise, pulling this all together has been much like managing projects at a research university. That’s a good segue to the next question.

Is it different on the dark side?

Newly hired, I go out to WRC, the local NBC affiliate in Washington, to get my NBCUniversal ID and to go through HR orientation. Initially it’s all familiar: the same ID photo technology, the same RFID keycard, the same ugly tile and paint on the hallways, the same tax forms to be completed by hand.

But wait: Employee Relations is next door to the (now defunct) Chris Matthews Show. And the benefits part of orientation is a video hosted by Jimmy Fallon and Brian Williams. And there’s the possibility of something called a “bonus”, whatever that is.

Around my new office, in a spiffy modern building at 300 New Jersey Avenue, everyone seems to have two screens. That’s just as it was in higher-education IT. But wait: here one of them is a TV. People watch TV all day as they work.

Toto, we’re not in higher education any more.

It’s different over here, and not just because there’s a beautiful view of the Capitol from our conference rooms. Certain organizational functions seem to work better, perhaps because they should and in the corporate environment can be implemented by decree: HR processes, a good unified travel arrangement and expense system, catering, office management. Others don’t: there’s something slightly out of date about the office IT, especially the central/individual balance and security, and there’s an awful lot of paper.

Some things are just different, rather than better or not: the culture is heavily oriented to face-to-face and telephone interaction, even though it’s a widely distributed organization where most people are at their desks most of the time. There’s remarkably little email, and surprisingly little use of workstation-based videoconferencing. People dress a bit differently (a maître d’ told me, “that’s not a Washington tie”).

But differences notwithstanding, mostly things feel much the same as they did at EDUCAUSE, UChicago, and MIT.

Where I work is generally happy: people talk to one another, gossip a bit, have pizza on Thursdays, complain about the quality of coffee, and are in and out a lot. It’s not an operational group, and so there’s not the bustle that comes with that, but it’s definitely busy (especially with everyone around me working on the Comcast/Time Warner merger). The place is teamly, in that people work with one another based on what’s right substantively, and rarely appeal to authority to reach decisions. Who trusts whom seems at least as important as who outranks whom, or whose boss is more powerful. Conversely, it’s often hard to figure out exactly how to get something done, and lots of effort goes into following interpersonal networks. That’s all very familiar.

I’d never realized how much like a research university a modern corporation can be. Where I work is NBCUniversal, which is the overarching corporate umbrella (“Old Main”, “Mass Hall”, “Building 10”, “California Hall”, “Boulder”) for 18 other companies including news, entertainment, Universal Studios, theme parks, the Golf Channel, and Telemundo (which are remarkably like schools and departments in their varied autonomy).

Meanwhile NBCUniversal is owned by Comcast—think “System Central Office”. Sure, these are all corporate entities, and they have concrete metrics by which to measure success: revenue, profit, subscribers, viewership, market share. But the relationships among organizations, activities, and outcomes aren’t as coherent and unitary as I’d expected.

Dark or Green?

So, am I on the dark side, or have I left it behind for greener pastures? Curiously, I hear both from my friends and colleagues in higher education: Some of them think my move is interesting and logical, some think it odd and disappointing. Curiouser still, I hear both from my new colleagues in the industry: Some think I was lucky to have worked all those decades in higher education, while others think I’m lucky to have escaped. None of those views seems quite right, and none seems quite wrong.

The point, I suppose, is that simple judgments like “dark” and “greener” underrepresent the complexity of organizational and individual value, effectiveness, and life. Broad-brush characterizations, especially characterizations embodying the ecological fallacy, “…the impulse to apply group or societal level characteristics onto individuals within that group,” do none of us any good.

It’s so easy to fall into the ecological-fallacy trap; so important, if we’re to make collective progress, not to.

Comments or questions? Write me: greg@gjackson.us

(The quote is from Charles Ess & Fay Sudweeks, Culture, technology, communication: towards an intercultural global village, SUNY Press 2001, p 90. Everything in this post, and for that matter all my posts, represents my own views, not those of my current or past employers, or of anyone else.)

3|5|2014 11:44a est

Perceived Truths as Policy Paradoxes

The quote I was going to use to introduce this topic — “You’re entitled to your own opinion, but not to your own facts” — itself illustrates my theme for today: that truths are often less than well founded, and so can turn policy discussions weird.

I’d always heard the quote attributed to Pat Moynihan, an influential sociologist who co-wrote Beyond the Melting Pot with Nathan Glazer, directed the MIT-Harvard Joint Center for Urban Studies shortly before I worked there (and left behind a closet full of Scotch, which stemmed from his perhaps apocryphal rule that no meeting extend beyond 4pm without a bottle on the table), and later served as a widely respected Senator from New York. The collective viziers of Wikipedia have found other attributions for the quote, however. (This has me once again looking for the source of “There go my people, I must go join them, for I am their leader,” supposedly Mahatma Gandhi but apparently some French general — but I digress.) The quote will need to stand on its own.

Here’s the Scott Jaschik item from Inside Higher Ed that triggered today’s Rumination:

A new survey from ACT shows the continued gap between those who teach in high school and those who teach in college when it comes to their perceptions of the college preparation of today’s students. Nearly 90 percent of high school teachers told ACT that their students are either “well” or “very well” prepared for college-level work in their subject area after leaving their courses. But only 26 percent of college instructors reported that their incoming students are either “well” or “very well” prepared for first-year credit-bearing courses in their subject area. The percentages are virtually unchanged from a similar survey in 2009.

This is precisely what Moynihan (or whoever) had in mind: two parties to an important discussion each bearing their own data, and therefore unable to agree on the problem or how to address it. The teachers presumably think the professors have unreasonable expectations, or don’t work very hard to bring their students along; the professors presumably think the teachers aren’t doing their job. Each side therefore believes the problem lies with the other, and has data to prove it. Collaboration is unlikely, progress ditto. This is what Moynihan had observed about the federal social policy process.

The ACT survey reminded me of a similar finding that emerged back when I was doing college-choice research. I can’t locate a citation, but I recall hearing about a study that surveyed students who had been admitted to several different colleges.

The clever wrinkle in the study was that the students received several different survey queries, each purporting to be from one of the colleges to which he or she had been admitted, and each asking the student about the reasons for accepting or declining the admission offer. Here’s what they found: students told the institution they’d accepted that the reason was excellent academic quality, but they told the institutions they’d declined that the reason was better financial aid from the one they’d accepted.

More recently, I was talking to a colleague in another media company who was concerned about the volume of copyright infringement on a local campus. According to the company, the campus was hosting a great deal of copyright infringement, as measured by the volume of requests for infringing material being sent out by BitTorrent. But according to the campus, a scan of the campus network identified very few hosts running peer-to-peer applications. The colleague thought the campus was blowing smoke; the campus thought the company’s statistics were wrong.

Although these three examples seem similar — parties disagreeing about facts — in fact they’re a bit different.

  • In the teacher/professor example, the different conclusions presumably stem from different (and unshared) definitions of “prepared for college-level work”.
  • In the accept/decline example, the different explanations possibly stem from students’ not wanting to offend the declined institution by questioning its quality, or wanting to think of their actual choice as good rather than cheap.
  • In the infringement/application case, the different explanations stem from divergent metrics.

We’ve seen similar issues arise around institutional attributes in higher education. Do ratings like those from US News & World Report gather their own data, for example, or rely on presumably neutral sources such as the National Center for Education Statistics? This is critical where results have major reputational effects — consider George Washington University’s inflation of class-rank admissions data, and similar earlier issues with Claremont McKenna, Emory, Villanova, and others.

I’d been thinking about this because in my current job it’s quite important to understand patterns of copyright infringement on campuses. It would be good to figure out which campuses seem to have relatively low infringement rates, and to explore and document their policies and practices so that other campuses might benefit. For somewhat different reasons, it would be good to figure out which campuses seem to have relatively high infringement rates, so that they could be encouraged to adopt different policies and practices.

But here we run into the accept/decline problem. If the point to data collection is to identify and celebrate effective practice, there are lots of incentives for campuses to participate. But if the point is to identify and pressure less effective campuses, the incentives are otherwise.

Compounding the problem, there are different ways to measure the problem:

  • One can rely on externally generated complaints, whose volume can vary for reasons having nothing to do with the volume of infringement,
  • one can rely on internal assessments of network traffic, which can be inadvertently selective, and/or
  • one can rely on external measures such as the volume of queries to known sources of infringement;

I’m sure there are others — and that’s without getting into the religious wars about copyright, middlemen, and so forth that I addressed in an earlier post.

There’s no full solution to this problem. But there are two things that help: collaboration and openness.

  • By “collaboration,” I mean that parties to questions of policy or practice should work together to define and ideally collect data; that way, arguments can focus on substance.
  • By “openness,” I mean that wherever possible raw data, perhaps anonymized, should accompany analysis and advocacy based on those data.

As an example of what this means, here are some thoughts on one of my upcoming challenges — figuring out how to identify campuses that might be models for others to follow, and also campuses that should probably follow them. Achieving this is important, but done improperly it can easily come to resemble the “top 25” lists from the RIAA and MPAA that became so controversial and counterproductive a few years ago. Those lists became controversial partly because their methodology was suspect, partly because the underlying data were never available, and partly because they ignored the other end of the continuum, that is, institutions that had somehow managed to elicit very few Digital Millennium Copyright Act (DMCA) notices.

It’s clear there are various sources of data, even without internal access to campus network data:

  • counts of DMCA notices sent by various copyright holders (some of which send notices methodically, following reasonably robust and consistent procedures, and some of which don’t),
  • counts of queries involving major infringing sites, and/or
  • network volume measures for major infringing protocols.

Those last two yield voluminous data, and so usually require sampling or data reduction of some kind. And not all such queries, or all traffic using those protocols, involve infringement. It’s also clear, from earlier studies, that there’s substantial variation in these counts over time and even across similar campuses.

This means it will be important for my database, if I can create one, to include several different measures, especially counts from different sources for different materials, and to do that over a reasonable period of time. Integrating all this into a single dataset will require lots of collaboration among the providers. Moreover, the raw data necessarily will identify individual institutions, and releasing them that way would probably cause more opposition than support. Clumping them all together would bypass that problem, but also cover up important variation. So it makes much more sense to disguise rather than clump — that is, to identify institutions by a code name and enough attributes to describe them but not to identify them.
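A minimal sketch of what “disguise rather than clump” might look like in practice; the institution record, the attribute buckets, and the keyed-hash scheme are all hypothetical illustrations, not an established method:

```python
# Replace each institution's name with a stable code name, and keep only
# coarse attribute buckets that describe it without identifying it.
# All records, bucket edges, and the key below are invented for illustration.
import hashlib
import hmac

SECRET_KEY = b"held-by-the-data-steward"  # never released with the data

def code_name(institution: str) -> str:
    # A keyed hash yields the same code for the same school across
    # datasets, but can't be reversed without the key.
    digest = hmac.new(SECRET_KEY, institution.encode(), hashlib.sha256)
    return "INST-" + digest.hexdigest()[:8].upper()

def size_bucket(enrollment: int) -> str:
    # Coarse buckets describe the campus without pinpointing it.
    if enrollment < 5_000:
        return "small"
    if enrollment < 20_000:
        return "medium"
    return "large"

def disguise(record: dict) -> dict:
    # Emit only the code name, descriptive attributes, and the measure.
    return {
        "id": code_name(record["name"]),
        "type": record["type"],  # e.g. "research", "liberal arts"
        "size": size_bucket(record["enrollment"]),
        "dmca_notices": record["dmca_notices"],
    }

raw = {"name": "Example State University", "type": "research",
       "enrollment": 31_000, "dmca_notices": 212}
print(disguise(raw))
```

The code name is stable across datasets (so measures from different sources can still be joined), while the released record carries no direct identifier.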

It’ll then be important to be transparent: to lay out the detailed methodology used to “rank” campuses (as, for example, US News now does), and to share the disguised data so others can try different methodologies.

At a more general level, what I draw from the various examples is this: If organizations are to set policy and frame practice based on data — to become “data-driven organizations,” in the current parlance — then they must put serious effort into the source, quality, and accessibility of data. That’s especially true for “big data,” even though many current “big data” advocates wrongly believe that volume somehow compensates for quality.

If we’re going to have productive debates about policy and practice in connection with copyright infringement or anything else, we need to listen to Moynihan: To have our own opinions, but to share our data.

Story of S, and the Mythology of the Lost Generation

Dinner talk turned from Argo and Zero Dark Thirty to movies more generally. A 21-year-old college senior—I’ll call her “S”—recognized most of the films we were discussing. She had seen several, but others she hadn’t, which was a bit surprising, since S was an arts major, wanted to be a screenwriter, and was enthusiastic about her first choice for graduate school: the screenwriting program at a major California institution focused on the movie industry.

S had older brothers in the movie business, and she already had begun writing. What she needed, S said, was broader and deeper exposure to what made good screenplays. Graduate school would provide “deeper.” Her plan for “broader” was to watch as many well-regarded classics as possible, and apparently we were helping her map out that strategy.

But many of the films she wanted to see weren’t available on cable in her dormitory, even as pay-per-view. “Buying” or “renting” them online she found too expensive and awkward, especially given the number of films she wanted to see. So S was doing what unfortunately many students (and others) do: looking for movies on the Internet, and then streaming or downloading the least expensive version she could find. Since S’s college dormitory provided good Internet connectivity, S used that to download or stream her movies. Usually, she said, the least expensive version was an unauthorized copy, a so-called “pirate” version.

Some of us challenged her: Didn’t S realize that downloading or streaming “pirated” copies was against the law? Was she not concerned about the possible consequences? As a budding screenwriter, would she want others to do as she was doing, and deprive her of royalties? Didn’t it just seem wrong to take something without the owner’s permission?

S listened carefully—she was pretty sharp—but she didn’t seem convinced. Indeed, she seemed to feel that her choice to use unauthorized copies was reasonable, given the limited and unsatisfactory alternatives provided by the movie industry.

In so believing, S was echoing the persistent mythology of the lost generation. I first heard Cary Sherman, the President of the Recording Industry Association of America (RIAA), use “the lost generation” to describe the approximately 25 million students who became digital consumers between two milestones: Napster‘s debut in 1999, which made sharing of MP3s ripped from CDs easy, and Apple’s discontinuing digital rights management (DRM) for most iTunes music in 2009, which made buying tracks legally almost as easy and convenient.

Even without the illusion that infringing materials were “free,” there were ample incentives to infringe during that period: illegal mechanisms were comprehensive and easy to use, for the most part, whereas legal mechanisms did not exist, were inflexible and awkward, and/or did not include many widely-desired items.

Because of this, many members of the lost generation adopted a mythology comprising some subset of

  • digital materials are priced too high, since it costs money to manufacture CDs and DVDs but the Internet is free,
  • profits flow to middlemen rather than artists, and so artists aren’t hurt by infringement,
  • DRM is just the industry’s mechanism for controlling users and rationing information,
  • people who stream or download unauthorized copies wouldn’t have bought legal copies anyway, and so copyright holders don’t lose any revenue because of unauthorized copying,
  • there’s no way to sample material before buying it, and so unauthorized sources are the only easy way to explore new or arcane stuff,
  • the entertainment industry has no interest in serving customers, as evidenced by its keeping so much material unavailable,
  • copyright is wrong, since information should be free and users should just pay what they think it’s worth, and
  • (the illegitimate moral leap S and others make) therefore it’s “okay” to copy and share digital materials without permission.

Unfortunately, the lost generation’s beliefs, most of which have always been exaggerated or invalid, have been passed down to successor generations, a process accelerated rather than slowed by the current industry emphasis on monitoring and penalizing network users.

Why does the mythology persist?

There are the obvious technical and financial arguments: if illegal technology is more convenient than legal, and illegal content costs less than legal, then it’s not surprising that illegal stuff remains prominent.

But in addition, as the Captain might observe, what we have here is failure to communicate:

  • There’s lots of evidence that convenient, comprehensive services like Netflix, Amazon Prime Instant Video, Hulu, Pandora, and Spotify draw users to them even when there are illegal “free” alternatives. But for this to happen, users must know about those services. S clearly didn’t—we asked her specifically—and that’s a marketing failure.
  • Shoplifting and plagiarism are relatively rare, at least among individuals like S. Yet they have the same appealing features as “pirate” music and video. Somehow S and her peers have come to understand that shoplifting, plagiarism, and various similar choices are unethical, immoral, or socially counterproductive. Yet they don’t put copyright infringement in the same category. That’s a social, educational, and parental failure.
  • For all kinds of arguably irremediable licensing, contractual, competitive, and anti-trust reasons, it remains stubbornly difficult to “give the lady what she wants“: in S’s case, a comprehensive, reasonably priced, convenient service from which she could obtain all the movies she wanted. Whether this is customers not conveying their wants to providers (in part because they can bypass the latter), or whether this is providers stuck on obsolete delivery models, it’s a business failure.
  • Colleges and universities are supposed at least to tell their students about copyright infringement, and to implement technologies and other mechanisms to “effectively combat” it. S had no idea that the consequences of being caught downloading or streaming unauthorized copies were anything beyond being told to stop. So far as she knew, no one, at least no one at her college, had ever gotten in trouble for that. And she’d never heard anything from her college—which was also her Internet service provider—about the issue. That’s a policy failure.

To be fair, S’s dinner comments endorsed only a small subset of the lost generation’s tenets, she seemed generally interested in the streaming services we told her about, and she was now thinking about the consequences of being caught downloading or streaming unauthorized copies—and about how lots of people doing that might affect her future earnings. So there was progress.

But ganging up on 21-year-olds at dinner parties is a very inefficient way to counteract the mythology of the lost generation. We—and by this I mean everyone: users, parents, schools, artists, producers, network providers—need to find much better ways to communicate about copyright infringement, to help potential infringers understand the choices they are making, and to provide and use better legal services.

Especially until we do that last, this will be hard, and progress will be slow. But it’s progress we need if the intellectual-property economy is to endure.

Three Fallacies: Optimal Diet, Best Practices, and Key Indicators

Just before writing this (and then losing most of it to a Chrome freeze, and then rewriting it), I had a sort-of-Ploughman’s lunch: a couple of Wasa Wholegrain crackers spread with about 1 ounce of nice smelly Buttermilk Blue cheese, and a Pink Lady apple, and a glass of water.

For yesterday’s lunch I mixed some canned white tuna with nonfat Greek yogurt and mayo, and put it on Wasa crackers.

Which lunch was better? How might I measure that?

Here are some data, using Peapod prices (even though I didn’t buy these ingredients there): cost per serving, plus nutrition info from the standard labels:

            servings  cost/serving  fat (g)  calories  fiber (g)  protein (g)
cheese         1         $1.16        8.0      100        1.0        6.0
crackers       2         $0.18        0.0       40        2.0        1.0
apple          1         $0.75        0.3       95        4.4        0.5
Today                    $2.26        8.3      275        9.4        8.5
tuna           1         $0.95        1.0       70        0.0       14.0
crackers       2         $0.18        0.0       40        2.0        1.0
yogurt         1         $0.28        0.0       15        0.0        2.6
mayo           1         $0.18       10.0       90        0.0        0.0
Yesterday                $1.76       11.0      255        4.0       18.6

(Nutrition values are per serving; the Today and Yesterday rows total across servings.)

Today’s lunch cost $2.26, about a quarter more than yesterday’s. Were I focused on cost, therefore, I couldn’t rate today’s lunch as highly as yesterday’s. I have excellent, robust cost indicators for making this judgment, and so it’s pretty clear how to assess practice if minimizing lunch cost is my goal.

Then again, I’m of the age where I need to be careful what I eat (note “need to be”, not “am”), and so maybe minimizing cost isn’t the right goal. Instead, perhaps I should look at nutritional indicators. Today’s lunch had 8.3 grams of fat, mostly from the cheese, and 275 calories. Yesterday’s was similar in those respects, with 11 grams of fat, mostly from the mayo, and 255 calories. So fat and calories don’t give me a clear indication which lunch is better.

I’m told that fiber is good, though. Today’s lunch is better than yesterday’s fiber-wise: 9.4 versus 4 grams. Then again, protein is also good, and here the indicator tilts the other way: yesterday’s 18.6 grams of protein trumps today’s 8.5 grams.

I’ve got measures of my two lunches’ nutritional attributes — not as robust as my cost measures (see, for example, this recent Fox News story), but still pretty good. However, unlike my single measure of cost, I have multiple measures of nutrition, and they’re divergent: even the few measures on the standard label value the lunches differently. I may choose different indicators than someone else — and I may choose differently some other day if, say, my cholesterol levels change.

Trying to incorporate both nutritional goals (as standards to be met) and cost (as an outcome to be minimized) into an optimal diet, George Stigler, in a 1945 article, tried to determine the least expensive nutritionally adequate diet for a 70-kg male economist — that is, himself. Using 1944 products, prices, and then-current Recommended Daily Allowances (which included calories, protein, calcium, iron, and five vitamins), and after laborious analysis, he proposed an optimal daily diet comprising about 23 ounces of wheat flour (!), 5 ounces of cabbage, 1 ounce of spinach, 6 ounces of pancake flour, and 1 ounce of pork liver. Stigler estimated this diet would cost 16¢/day, which would be $2.08 in 2011 dollars. (Stigler never tried this diet, and neither did his son, who was a faculty member at UChicago during my tenure there.)

Seven years later, George Dantzig developed the Simplex algorithm for solving linear-programming problems like this. Today there are simple online or sophisticated spreadsheet tools available to explore the now-famous Diet Problem — for example, Stefan Warner’s web tool, or a more comprehensive Excel-based one developed by Samir Khan.

Playing around with the various tools, one thing becomes clear immediately: results vary dramatically depending on exactly how one bounds the problem. As we already know from my lunches, the “optimal” diet depends on what foods are considered, and on which nutritional requirements one chooses to impose on them.

Thus, for example, Warner’s default settings include 20 foods, use 2008 RDAs, allow no more than 2 servings of any one food, and impose requirements for 10 nutrients (minima for calories, fat, carbohydrates, protein, vitamin C, sodium, fiber, vitamin A, and calcium, and a maximum for cholesterol — Warner adds fat and fiber to Stigler’s 9 nutrients, and omits iron). These settings yield a daily diet comprising 1.6 servings of spaghetti with sauce plus 2 servings each of broccoli, potatoes, banana, wheat bread, lowfat milk, eggs, and white rice. According to the web tool, the default Warner diet costs $2.87/day, or $3.14 in 2011 dollars.

Using Khan’s Excel-based tool, its more limited list of foods, and its longer list of 14 nutrients (most of which have both a minimum and a maximum), a cost-optimized diet runs $10.71/day: 3.1 servings of lentils, 2.9 of bagels, 2.3 of roast chicken, 2.2 of Brussels sprouts, 2.1 of oatmeal, 1.7 of 1% milk, 1.2 of oranges, and 1 of broccoli.
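To see how sensitive these results are to problem bounds, here is a toy version of the Diet Problem. All foods, prices, and nutrient values are invented for illustration (they are not Stigler’s, Warner’s, or Khan’s data), and a brute-force grid search stands in for a real linear-programming solver:

```python
# Toy Diet Problem: every number here is made up for illustration.
# A real solver would use linear programming (Dantzig's Simplex)
# rather than this brute-force search over serving counts.
from itertools import product

#             cost($)  calories  protein(g)  fiber(g)   (per serving)
FOODS = {
    "flour":   (0.30, 455, 13, 3),
    "cabbage": (0.20,  22,  1, 2),
    "liver":   (0.90, 190, 29, 0),
}

def cheapest_diet(requirements, max_servings=8):
    """Return (cost, servings) for the cheapest mix meeting every
    {attribute_index: minimum} requirement."""
    best = None
    names = list(FOODS)
    for servings in product(range(max_servings + 1), repeat=len(names)):
        totals = [sum(n * FOODS[f][i] for n, f in zip(servings, names))
                  for i in range(4)]
        if all(totals[i] >= m for i, m in requirements.items()):
            if best is None or totals[0] < best[0]:
                best = (totals[0], dict(zip(names, servings)))
    return best

# Same foods, different bounds on the problem:
loose = cheapest_diet({1: 2500, 2: 70})          # calorie and protein minima only
strict = cheapest_diet({1: 2500, 2: 70, 3: 25})  # ...plus a fiber minimum
print("without fiber constraint:", loose)
print("with fiber constraint:   ", strict)
```

Adding the single fiber requirement changes both the cost and the composition of the “optimal” diet, which is the point: the answer depends on how the problem is bounded.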

No one who’s ever paid close attention to dietary recommendations is surprised. The best diet depends on what one has at hand, and what one means by “best” — what micro-economists would call one’s “utility function.” Given this, it’s hard to make sense of all-in-one, impersonal key indicators like the ANDI (Aggregate Nutrient Density Index) numbers prominently displayed at Whole Foods.

Back to lunch. Although yesterday’s lunch was good, I liked today’s much more.  It was satisfying and tasty. The crackers were firm and crisp, an excellent base for the strongly-flavored cheese. The apple tied it all together very nicely, adding sweetness and juiciness. In gastronomic terms, the lunch rated very highly.

But maybe that’s just because I was in the mood for cheese today. Not only is there no obvious robust measure of gastronomic appeal, but if there were, it might vary both from person to person and time to time.

So which lunch was better? It depends on what I choose to value.
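That dependence can be made mechanical. Here is a small sketch using the per-serving numbers from the lunch table above; each criterion is, in effect, a different utility function:

```python
# Per-serving numbers from the lunch table:
# (servings, cost, fat g, calories, fiber g, protein g)
today = {
    "cheese":   (1, 1.16, 8.0, 100, 1.0, 6.0),
    "crackers": (2, 0.18, 0.0,  40, 2.0, 1.0),
    "apple":    (1, 0.75, 0.3,  95, 4.4, 0.5),
}
yesterday = {
    "tuna":     (1, 0.95,  1.0, 70, 0.0, 14.0),
    "crackers": (2, 0.18,  0.0, 40, 2.0,  1.0),
    "yogurt":   (1, 0.28,  0.0, 15, 0.0,  2.6),
    "mayo":     (1, 0.18, 10.0, 90, 0.0,  0.0),
}

COLUMNS = ["cost", "fat", "calories", "fiber", "protein"]

def totals(lunch):
    """Sum each attribute over the lunch, scaled by servings.
    (Summed costs come out a cent above the table's rounded totals.)"""
    return [sum(item[0] * item[i] for item in lunch.values())
            for i in range(1, 6)]

def pick(criterion, maximize=False):
    """Name of the lunch that wins on one attribute."""
    i = COLUMNS.index(criterion)
    scores = {"today": totals(today)[i], "yesterday": totals(yesterday)[i]}
    return max(scores, key=scores.get) if maximize else min(scores, key=scores.get)

print("cheapest lunch:", pick("cost"))                   # yesterday
print("most fiber:    ", pick("fiber", maximize=True))   # today
print("most protein:  ", pick("protein", maximize=True)) # yesterday
```

Three defensible criteria, three different winners: the data are the same, and only the utility function changes.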

As For Diet, So For IT “Best Practices” & “Key Performance Indicators”

Best practices in IT depend on context and goals even more than diet does. Yet somehow we’ve come to believe there’s a one-size-fits-all optimum out there — documented by an ANDI for IT — and that if we can find it all will be well. We’ve spent large amounts of time and money on the quest for these “best practices,” often hiring consultants to “optimize” IT practice.

A recent cursory survey of research-university CIOs, for example, found that at least 18 of them had undertaken optimization projects, most involving outside firms such as Bain, McKinsey, PricewaterhouseCoopers, Accenture, and their kith. (I’ve been through two of these: an early effort at MIT involving CSC Index, and a more recent one at UChicago involving McKinsey.)

Here’s what I think: too often our quest for best practices — especially when it’s a quest guided by outside entities — is based on goals that may not be what we want, or at least may not be all that we want. What Washington College wants IT to achieve may differ from what George Washington University wants IT to achieve, and probably neither has the same goals as the Chicago City Colleges or the University of Phoenix or the EdX consortium.

Consider IT support, for example. Users seem to be most satisfied when they can choose their own technologies, and the institution provides a knowledgeable IT support person they trust — ideally down the hall, so that sticking one’s head out the door and saying “Pat, can you come help me please?” brings an expert Pat running.

Unfortunately, that “local” support model is expensive, both because it requires flexible IT support staff who are skilled with diverse technologies and because it inhibits economy of scale. Conversely, central administrators and their consultants focus on costs, and so tend to value support strategies that reduce cost: standardization, tiered support, and even consolidated outsourcing.

But cost-reduction strategies are almost precisely antithetical to user-satisfaction strategies. Neither is “right” in any objective sense; indeed, the point is that although it’s quite possible to develop each strategy and measure whether it’s working, whether that’s “right” depends on the original goal.

Measurement Isn’t the Same as Evaluation

The point of all this should be obvious: it’s pointless to talk about “best practices” or to use “key performance indicators” until those involved at least understand and appreciate each other’s goals — in lunch terms, until they know who wants cheap, who wants nutritious, and who wants tasty, and maybe have negotiated some compromises. Don’t read that sentence as anti-data: data are good, and the more of them and the higher their quality the better. But having data on some attribute doesn’t mean that attribute signifies value — or that the absence of data signifies the absence of value. Language is important: a “datum” is value-neutral, for example, whereas a “score” isn’t. “Index” tilts toward “score”, “indicator” toward “datum”. “Practice” is neutral, “best practice” isn’t. And so on.

Also, “those involved,” in higher-education IT, typically entails an awkward triangle: IT organizations provide services to users, who seek maximum service levels, but IT organizations get resources from central administration, which seeks minimum expenditure. Goal divergence results, one of many reasons there’s such frustration within higher-education IT these days. Without mutual understanding and agreement on goals, there’s no such thing as “best practice,” no matter how many “key performance indicators” are available.

That I hate watermelon and love bacon can’t govern our family diet. The bacon is intrinsically controversial (tasty but fatty, so even without talking to anyone else I’m conflicted), whereas the watermelon admits compromise (my wife and son eat it, and I don’t).

All of this is making me hungry…

The Importance of Being Enterprise

…as Oscar Wilde well might have titled an essay about campus-wide IT, had there been such a thing back then.

Enterprise IT accounts for the lion’s share of campus IT staffing, expenditure, and risk. Yet it receives curiously little attention in national discussions of IT’s strategic higher-education role. Perhaps that should change. Two questions arise:

  • What does “Enterprise” mean within higher-education IT?
  • Why might the importance of Enterprise IT evolve?

What does “Enterprise IT” mean?

Here are some higher-education spending data from the federal Integrated Postsecondary Education Data System (IPEDS), omitting hospitals, auxiliaries, and the like:

Broadly speaking, colleges and universities deploy resources with goals and purposes that relate either to their substantive mission or to the underlying instrumental infrastructure and administration.

  • Substantive purposes and goals comprise some combination of education, research, and community service. These correspond to the bottom three categories in the IPEDS graph above. Few institutions focus predominantly on research—Rockefeller University, for example. Most research universities pursue all three missions, most community colleges emphasize the first and third, and most liberal-arts colleges focus on the first.
  • Instrumental activities are those that equip, organize, and administer colleges and universities for optimal progress toward their mission—the top two categories in the IPEDS graph. In some cases, these core activities advance institutional mission by providing a common infrastructure for mission-oriented work. In other cases, they do it by providing campus-wide or departmental staffing, management, and processes to expedite that work. In still other cases, they do it through collaboration with other institutions or by contracting for outside services.

Education, research, and community service all use IT substantively to some extent. This includes technologies that directly or indirectly serve teaching and learning, technologies that directly enable research, and technologies that provide information and services to outside communities—examples of all three include classroom technologies, learning management systems, technologies tailored to specific research data collection or analysis, research data repositories, library systems, and so forth.

Instrumental functions rely much more heavily on IT. Administrative processes rely increasingly on IT-based automation, standardization, and outsourcing. Mission-oriented IT applications share core infrastructure, services, and support. Core IT includes infrastructure such as networks and data centers, storage and computational clouds, and desktop and mobile devices; administrative systems ranging from financial, HR, student-record, and other back office systems to learning-management and library systems; and communications, messaging, collaboration, and social-media systems.

In a sense, then, there are six technology domains within college and university IT:

  • the three substantive domains (education, research, and community service), and
  • the three instrumental domains (infrastructure, administration, and communications).

Especially in the instrumental domains, “IT” includes not only technology, but also the services, support, and staffing associated with it. Each domain therefore has technology, service, support, and strategic components.

Based on this, here is a working definition: in higher education,

“Enterprise” IT comprises the IT-related infrastructure, applications, services, and staff
whose primary institutional role is instrumental rather than substantive.

To explore Enterprise IT, framed thus, entails focusing on technology, services, and support as they relate to campus IT infrastructure, administrative systems, and communications mechanisms, plus their strategic, management, and policy contexts.

Why Might the Importance of Enterprise IT Evolve?

Three reasons: magnitude, change, and overlap.

Magnitude

According to data from EDUCAUSE’s Core Data Service (CDS) and the federal Integrated Postsecondary Education Data System (IPEDS), the typical college or university spends just shy of 5% of its operating budget on IT. This varies a bit across institutional types:

We lack good data breaking down IT expenditures further. However, we do have CDS data on how IT staff distribute across different IT functions. Here is a summary graph, combining education and research into “academic” (community service accounts for very little dedicated IT effort):

Thus my assertion above that Enterprise IT accounts for the lion’s share of IT staffing. Even if we omit the “Management” component, Enterprise IT comprises 60-70% of staffing if IT support is included, and almost half without it. The distribution is even more skewed for expenditure, since hardware, applications, services, and maintenance costs are disproportionately greater in Administration and Infrastructure.

Why, given the magnitude of Enterprise relative to other college and university IT, has it not been more prominent in strategic discussion? There are at least two explanations:

  • relatively slow change in Enterprise IT, at least compared to other IT domains (rapidly-changing domains rightly receive more attention than stable ones), and
  • overlap—if not competition—between higher-education and vendor initiatives in the Enterprise space.

Change

Enterprise IT is changing thematically, driven by mobility, cloud, and other fundamental changes in information technology. It also is changing specifically, as concrete challenges arise.

Consider, as one way to approach the former, these five thematic metamorphoses:

  • In systems and applications, maintenance is giving way to renewal. At one time colleges and universities developed their own administrative systems, equipped their own data centers, and deployed their own networks. In-house development has given way to outside products and services installed and managed on campus, and more recently to the same products and services delivered in or from the cloud.
  • In procurement and deployment, direct administration and operations are giving way to negotiation with outside providers and oversight of the resulting services. Whereas once IT staff needed to have intricate knowledge of how systems worked, today that can be less useful than effective negotiation, monitoring, and mediation.
  • In data stewardship and archiving, segregated data and systems are giving way to integrated warehouses and tools. Historical data used to remain within administrative systems. The cost of keeping them “live” became too high, and so they moved to cheaper, less flexible, and even more compartmentalized media. The plunging price of storage and the emergence of sophisticated data warehouses and business-intelligence systems reversed this. Over time, storage-based barriers to data integration have gradually fallen.
  • In management support, unidimensional reporting is giving way to multivariate analytics. Where once summary statistics emerged separately from different business domains, and drawing inferences about their interconnections required administrative experience and intuition, today connections can be made at the record level deep within integrated data warehouses. Speculating about relationships between trends is giving way to exploring the implications of documented correlations.
  • In user support, authority is giving way to persuasion. Where once users had to accept institutional choices if they wanted IT support, today they choose their own devices, expect campus IT organizations to support them, and bypass central systems if support is not forthcoming. To maintain the security and integrity of core systems, IT staff can no longer simply require that users behave appropriately; rather, they must persuade users to do so. This means that IT staff increasingly become advocates rather than controllers. The required skillsets, processes, and administrative structures have been changing accordingly.
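The fourth shift above, from separate summary reports to record-level connections, can be sketched with two hypothetical extracts keyed on a shared student id (all fields and values are invented):

```python
# Two hypothetical extracts from separate administrative systems,
# keyed on a shared student id. All data are invented for illustration.
bursar = {
    "s001": {"balance_due": 0},
    "s002": {"balance_due": 1250},
    "s003": {"balance_due": 40},
}
registrar = {
    "s001": {"credits": 16},
    "s002": {"credits": 6},
    "s003": {"credits": 15},
}

# Record-level join: one row per student, drawing from both systems.
joined = {
    sid: {**bursar[sid], **registrar[sid]}
    for sid in bursar.keys() & registrar.keys()
}

# A question neither system's summary reports could answer alone:
# which part-time students also carry an outstanding balance?
part_time_with_balance = [
    sid for sid, row in joined.items()
    if row["credits"] < 12 and row["balance_due"] > 0
]
print(part_time_with_balance)  # ['s002']
```

The correlation is documented at the record level rather than inferred from two separate summary tables, which is the shift the bullet describes.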

Beyond these broad thematic changes, a fourfold confluence is about to accelerate change in Enterprise IT: major systems approaching end-of-life, the growing importance of analytics, extensive mobility supported by third parties, and the availability of affordable, capable cloud-based infrastructure, services, and applications.

Systems Approaching End-of-Life

In the mid-1990s, many colleges and universities invested heavily in administrative-systems suites, often (if inaccurately) called “Enterprise Resource Planning” systems or “ERP.” Here, again drawing on CDS, are implementation data on Student, Finance, and HR/Payroll systems for non-specialized colleges and universities:

The pattern of implementation varies slightly across institution types. Here, for example, are implementation dates for Finance systems across four broad college and university groups:

Although these systems have generally been updated regularly since they were implemented, they are approaching the end of their functional life. That is, although they technically can operate into the future, the functionality of turn-of-the-century administrative systems likely falls short of what institutions currently require. Such functional obsolescence typically happens after about 20 years.

The general point holds across higher education: A great many administrative systems will reach their 20-year anniversaries over the next several years.

Moreover, many commercial administrative-systems providers end support for older products, even if those products have been maintained and updated. This typically happens as new products with different functionality and/or architecture establish themselves in the market.

These two milestones—functional obsolescence and loss of vendor support—mean that many institutions will be considering restructuring or replacement of their core administrative systems over the next few years. This, in turn, means that administrative-systems stability will give way to 1990s-style uncertainty and change.

Growing Importance of Analytics

Partly as a result of mid-1990s systems replacements, institutions have accumulated extensive historical data from their operations. They have complemented and integrated these by implementing flexible data-warehousing and business-intelligence systems.

Over the past decade, the increasing availability of sophisticated data-mining tools has given new purpose to data warehouses and business-intelligence systems that until now have largely provided simple reports. This has laid the foundation for the explosive growth of analytic management approaches (if, for the present, more rhetorical than real) in colleges and universities, and in the state and federal agencies that fund and/or regulate them.

As analytics become prominent in areas ranging from administrative planning to student feedback, administrative systems need to become better integrated across organizational units and data sources. The resulting datasets need to become much more widely accessible while complying with privacy requirements. Neither of these is easy to achieve. Achieving them together is more difficult still.

Mobility Supported by Third Parties

Until about five years ago campus communications—infrastructure and services both—were largely provided and controlled by institutions. This is no longer the case.

Much networking has moved from campus-provided wired and WiFi facilities to cellular and other connectivity provided by third parties, largely because those third parties also provide the mobile end-user devices students, faculty, and staff favor.

Separately, campus-provided email and collaboration systems have given way to “free” third-party email, productivity, and social-media services funded by advertising rather than institutional revenue. That mobile devices and their networking are largely outside campus control is triggering fundamental rethinking of instruction, assessment, identity, access, and security processes. This rethinking, in turn, is triggering re-engineering of core systems.

Affordable, Capable Cloud

Colleges and universities have long owned and managed IT themselves, based on two assumptions: that campus infrastructure needs are so idiosyncratic that they can only be satisfied internally, and that campuses are more sophisticated technologically than other organizations.

Both assumptions held well into the 1990s. That has changed. “Outside” technology has caught up to and surpassed campus technology, and campuses have gradually recognized and begun to avoid the costs of idiosyncrasy.

As a result, outside services ranging from commercially hosted applications to cloud infrastructure are rapidly supplanting campus-hosted services. This has profound implications for IT staffing—both levels and skillsets.

The upshot is that Enterprise, already the largest component of higher-education IT, is entering a period of dramatic change.

Beyond change in IT, the academy itself is evolving dramatically. For example, online enrollment is becoming increasingly common. As the Sloan Foundation reports, the fraction of students taking some or all of their coursework online is increasing steadily.

This has implications not only for pedagogy and learning environments, but also for the infrastructure and applications necessary to serve remote and mobile students.

Changes in the IT and academic enterprises are one reason Enterprise IT needs more attention. A second is the panoply of entities that try to influence Enterprise IT.

Overlap

One might expect colleges and universities to have relatively consistent requirements for administrative systems, and therefore that the market for those would consist largely of a few major widely-used products. The facts are otherwise, as data from the recent EDUCAUSE Center for Applied Research (ECAR) research report The 2011 Enterprise Application Market in Higher Education make clear.

The closest we come to a compact market is for learning management systems, where 94% of installed systems come from the top 5 vendors. Even in this area, however, there are 24 vendors and open-source groups. At the other extreme is web content management, where 89 active companies and groups compete and the top providers account for just over a third of the market.

One way major vendors compete under circumstances like these is by seeking entrée into the informal networks through which institutions share information and experiences. They do this, in many cases, by inviting campus CIOs or administrative-systems heads to join advisory groups or participate in vendor-sponsored conferences.

That these groups are usually more about promoting product than seeking strategic or technical advice is clear. They are typically hosted and managed by corporate marketing groups, not technical groups. In some cases the advisory groups comprise only a few members, in some cases they are quite large, and in a few cases there are various advisory tiers. CIOs from large colleges and universities are often invited to various such groups. For the most part these groups have very little effect on vendor marketing, and even less on technical architecture and direction.

So why do CIOs attend corporate advisory board meetings? The value to CIOs, aside from getting to know marketing heads, is that these groups’ meetings provide a venue for engaging enterprise issues with peers. The problem is that the number of meetings and their oddly overlapping memberships lead to scattershot conversations inevitably colored by the hosts’ marketing goals and technical choices. It is neither efficient nor effective for higher education to let vendors control discussions of Enterprise IT.

Before corporate advisory bodies became so prevalent, there were groups within higher-education IT that focused on Enterprise IT and especially on administrative systems and network infrastructure. Starting with 1950s workshops on the use of punch cards in higher education, CUMREC hosted meetings and publications focused on the business use of information technology. CAUSE emerged from CUMREC in the late 1960s, and remained focused on administrative systems. EDUCOM came into existence in the mid-1960s, and its focus evolved to complement those of CAUSE and CUMREC by addressing joint procurement, networking, academic technologies, copyright, and in general taking a broad, inclusive approach to IT. Within EDUCOM, the Net@EDU initiative focused on networking much the way CUMREC focused on business systems.

As these various groups melded into a few larger entities, especially EDUCAUSE, Enterprise IT remained a focus, but it was only one of many. Especially as the Y2K challenge prompted increased attention to administrative systems and intensive communications demands prompted major investments in networking, the prominence of Enterprise IT issues in collective work diffused further. Internet2 became the focal point for networking engagements, and corporate advisory groups became the focal point for administrative-systems engagements. More recently, entities such as Gartner, the Chronicle of Higher Education, and edu1world have tried to become influential in the Enterprise IT space.

The results of the overlap among vendor groups and associations, unfortunately, are scattershot attention and dissipated energy in the higher-education Enterprise IT space. Neither serves higher education well. Overlap thus joins accelerated change as a major argument for refocusing and reenergizing Enterprise IT.

The Importance of Enterprise IT

Enterprise IT, through its emphasis on core institutional activities, is central to the success of higher education. Yet the community’s work in the domain has yet to coalesce into an effective whole. Perhaps this is because we have been extremely respectful of divergent traditions, communities, and past achievements.

We must not be disrespectful, but it is time to change this: to focus explicitly on what Enterprise IT needs in order to continue advancing higher education, to recognize its strategic importance, and to restore its prominence.

9/25/12 gj-a  

The Rock, and The Hard Place

Looking into the near-term future—say, between now and 2020—we in higher-education IT have to address two big challenges. Neither admits easy progress. But if we don’t address them, we’ll find ourselves caught between a rock and a hard place.

  • The first challenge, the rock, is to deliver high-quality, effective e-learning and curriculum at scale. We know how to do part of that, but key pieces are missing, and it’s not clear how we will find them.
  • The second challenge, the hard place, is to recognize that enterprise cloud services and personal devices will make campus-based IT operations the last rather than the first resort. This means everything about our IT base, from infrastructure through support, will be changing just as we need to rely on it.

“But wait,” I can hear my generation of IT leaders (and maybe the next) say, “aren’t we already meeting those challenges?”

If we compare today’s e-learning and enterprise IT with that of the recent past, those leaders might rightly suggest, immense change is evident:

  • Learning management systems, electronic reserves, video jukeboxes, collaboration environments, streamed and recorded video lectures, online tutors—none were common even in 2000, and they’re commonplace today.
  • Commercial administrative systems, virtualized servers, corporate-style email, web front ends—ditto.

That’s progress and achievement we all recognize, applaud, and celebrate. But that progress and achievement overcame past challenges. We can’t rest on our laurels.

We’re not yet meeting the two broad future challenges, I believe, because in each case fundamental and hard-to-predict change lies ahead. The progress we’ve made so far, however impressive, won’t steer us between the rock of e-learning and the hard place of enterprise IT.

The fundamental change that lies ahead for e-learning
is the transition from campus-based to distance education

Back in the 1990s, Cliff Adelman, then at the US Department of Education, did a pioneering study of student “swirl,” that is, students moving through several institutions, perhaps with work intervals along the way, before earning degrees.

“The proportion of undergraduate students attending more than one institution,” he wrote, “swelled from 40 percent to 54 percent … during the 1970s and 1980s, with even more dramatic increases in the proportion of students attending more than two institutions.” Adelman predicted that “…we will easily surpass a 60 percent multi-institutional attendance rate by the year 2000.”

Moving from campus to campus for classes is one step; taking classes at home is the next. And so distance education, long constrained by the slow pace and awkward pedagogy of correspondence courses, has come into its own. At first it was relegated to “nontraditional” or “experimental” institutions—Empire State College, Western Governors University, UNext/Cardean (a cautionary tale for another day), Kaplan. Then it went mainstream.

At first this didn’t work: fathom.com, for example, a collaboration among several first-tier research universities led by Columbia, found no market for its high-quality online offerings. (Its Executive Director has just written a thoughtful essay on MOOCs, drawing on her fathom.com experience.)

Today, though, a great many traditional colleges and universities successfully bring instruction and degree programs to distant students. Within the recent past these traditional institutions have expanded into non-degree efforts like OpenCourseWare and into broadcast efforts like the MOOC-based Coursera and edX. In 2008, 3.7% of students took all their coursework through distance education, and 20.4% took at least one class that way.

Learning management systems, electronic reserves, video jukeboxes, collaboration environments, streamed and recorded video lectures, online tutors, the innovations that helped us overcome past challenges—little of that progress was designed for swirling students who do not set foot on campus.

We know how to deliver effective instruction to motivated students at a distance. But among the policy issues we have yet to resolve, we don’t yet know how to

  • confirm their identity,
  • assess their readiness,
  • guide their progress,
  • measure their achievement,
  • standardize course content,
  • construct and validate curriculum across diverse campuses, or
  • certify degree attainment

in this imminent world. Those aren’t just IT problems, of course. But solving them will almost certainly challenge IT.

The fundamental change that lies ahead for enterprise technologies
is the transition from campus IT to cloud and personal IT

The locus of control over all three principal elements of campus IT—servers and services, networks, and end-user devices and applications—is shifting rapidly from the institution to customers and third parties.

As recently as ten years ago, most campus IT services, everything from administrative systems through messaging and telephone systems to research technologies, were provided by campus entities using campus-based facilities, sometimes centralized and sometimes not. The same was true for the wired and then wireless networks that provided access to services, and for the desktop and laptop computers faculty, students, and staff used.

Today shared services are migrating rapidly to servers and systems that reside physically and organizationally elsewhere—the “cloud”—and the same is happening for dedicated services such as research computing. It’s also happening for networks, as carrier-provided cellular technologies compete with campus-provided wired and WiFi networking, and for end-user devices, as highly mobile personal tablets and phones supplant desktop and laptop computers.

As I wrote in an earlier post about “Enterprise IT,” the scale of enterprise infrastructure and services within IT and the shift in their locus of control have major implications for the organizations that have provided them. Campus IT organizations grew up around locally-designed services running on campus-owned equipment managed by internal staff. Organization, staffing, and even funding models followed accordingly. Even in academic computing and user support, “heavy metal” experience was valued highly. The shifting locus of control makes other skills at least as valuable: the ability to negotiate with suppliers, to engage effectively with customers (indeed, to think of them as “customers” rather than “users”), to manage spending and investments under constraint, and to explain decisions clearly.

To be sure, IT organizations still require highly skilled technical staff, for example to fine-tune high-performance computing and networking, to ensure that information is kept secure, to integrate systems efficiently, and to identify and authenticate individuals remotely. But these technologies differ greatly from traditional heavy metal, and so must enterprise IT.

The rock, IT, and the hard place

In the long run, it seems to me that the campus IT organization must evolve rapidly to center on seven core activities.

Two of those are substantive:

  • making sure that researchers have the technologies they need, and
  • making sure that teaching and learning benefit from the best thinking about IT applications and effectiveness.

Four others are more general:

  • negotiating and overseeing relationships with outside providers;
  • specifying or doing what is necessary for robust integration among outside and internal services;
  • striking the right personal/institutional balance between security and privacy for networks, systems, and data; and last but not least
  • providing support to customers (both individuals and partner entities).

The seventh core activity, which should diminish over time, is

  • operating and supporting legacy systems.

Creative, energetic, competent staff are the sine qua non of that kind of forward-looking organization. It’s very hard to do good IT without good, dedicated people, and those are increasingly difficult to find and keep. Not least, this is because colleges and universities compete poorly with the stock options, pay, glitz, and technology the private sector can offer. Therein lies another challenge: promoting loyalty and high morale among staff who know they could be making more elsewhere.

To the extent the rock of e-learning and the hard place of enterprise IT frame our future, we not only need to rethink our organizations and what they do; we also need to rethink how we prepare, promote, and choose leaders for higher-education IT on campus and elsewhere—the topic, fortuitously, of a recent ECAR report, and of widespread rethinking within EDUCAUSE.

We’ve been through this before, and risen to the challenge.

  • Starting around 1980, minicomputers and then personal computers brought IT out of the data center and into every corner of higher education, changing data center, IT organization, and campus in ways we could not even imagine.
  • Then in the 1990s campus, regional, and national networks connected everything, with similarly widespread consequences.

We can rise to the challenges again, too, but only if we understand their timing and their transformative implications.

The Ghost is Ready, but the Meat is Raw

Old joke. Someone writes a computer program (creates an app?) that translates from English into Russian (say) and vice versa. Works fine on simple stuff, so the next test is a bit harder: “the spirit is willing, but the flesh is weak.”  The program/app translates the phrase into Russian, then the tester takes the result, feeds it back into the program/app, and translates it back into English. Result: “The ghost is ready, but the meat is raw.”

(The starting phrase is from Matthew 26:41 – the King James version has “indeed” before “willing”, ASV doesn’t, and weirdly enough, if you try this in Google Translate, the joke falls flat, because you get an accurate translation to Russian and back, except for some reason you end up with an extra “indeed” in the final version. It’s almost as though Google Translate has figured out where the quotation came from, and then substituted the King James version for the ASV one, but not quite correctly. Spooky. But I digress.)

Old joke, yes. Tired, even. But, as usual, it’s a metaphor, in this case for a problem that will only become larger as higher education outsources or contracts for ever more of its activity: we think we’re doing the right thing when we contract with outside providers, but the actual effect of the contract, once it takes effect, isn’t quite what we expected. If we’re lucky, we figure this out before we’re irrevocably committed. If we’re unlucky, we box ourselves in.

Two examples.

1. Microsoft Site Licensing

About a decade ago, several of us were at an Internet2 meeting. A senior Microsoft manager spoke about relations with higher education (although looking back, I can’t see why Microsoft would present at I2. Maybe it wasn’t an I2 meeting, but let’s just say it was — never let truth get in the way of a good story). At the time, instead of buying a copy of Office for each computer, as Microsoft licenses required, many students, staff, and faculty simply installed Microsoft Office on multiple machines from one purchased copy — or even copied the installation disks and passed them around. That may save money, but it’s copyright infringement, and illegal.

Microsoft’s response to this problem had been threefold:

  • it began incorporating copy protection and other digital-rights-management (DRM) mechanisms into its installation media so that they couldn’t be copied,
  • it began berating campuses for tolerating the illegal copying (and in some cases attempted to audit compliance with licenses by searching campus computers for illegally obtained software), and
  • it sought to centralize campus procurement of Microsoft software by tailoring and refining its so-called “Select” volume-discount program to encourage campuses to license software campus-wide.

Problem was, the “Select” agreement required campuses to count how many copies of software they licensed, and to maintain records that would enable Microsoft to determine whether each installed copy on campus was properly licensed. This entailed elaborate bookkeeping and tracking mechanisms, exposed campuses to audit risk, and its costs into the future were unpredictable. The volume-discount “Select” program was clearly a step forward, but it fell far short of actually appealing to campuses.

So the several of us in the Internet2 session (or wherever it was) took the Microsoft manager aside afterwards, told him Microsoft needed a more attractive licensing model for campuses, and suggested what that might be.

To our surprise, Microsoft followed up, and the rump-group discussions evolved into the initial version of the Microsoft Campus Agreement. The Campus Agreement (since replaced by Enrollment for Education Solutions, EES) was a true site license: glossing over some complexities and details, its general terms were that campuses would pay Microsoft based on their size and the number of different products they wished to license, and in return would be permitted to use as many copies of those products as they liked.

Most important from the campus perspective, the Campus Agreement included no requirement to track or count individual copies of the licensed products, thereby making all copies legal; in fact, campuses could make their own copies of installation media. Most important from the Microsoft perspective, Campus Agreement pricing was set so that the typical campus would still pay Microsoft about as much as Microsoft had been receiving from that campus’s central or departmental users for Select or individual copies; that is, Microsoft’s revenue from campuses would not decline.

The Campus Agreement did entail a fundamental change that was less appealing. In effect, campuses were paying to rent software, with Microsoft agreeing to provide updates at no additional cost, rather than campuses buying copies and then periodically paying to update them. Although it included a few other lines, for the most part the Campus Agreement covered Microsoft’s operating-system and Office products.

Win-win, right? Lots of campuses signed up for the Campus Agreement. It largely eliminated talk about “piracy” of MS-Office products in higher education (enhanced DRM played an important role in this too), and it stabilized costs for most Microsoft client software. It was very popular with students, faculty, and staff, especially since the Campus Agreement allowed institutionally-provided software to be installed on home computers.

But at least one campus, which I’ll call Pi University, balked. The Campus Agreement, PiU’s golf-loving CIO pointed out, had a provision no one had read carefully: if PiU withdrew from the Campus Agreement, he said, it might be required to list and pay for all the software copies that PiU or its students, faculty, and staff had acquired under the Campus Agreement — that is, to buy what it had been renting. The PiU CIO said that he had no way to comply with such a provision, and that therefore PiU could not in good faith sign an agreement that included it.

Some of us thought the PiU CIO’s point was valid but inconsequential. First, some of us didn’t believe that Microsoft would ever enforce the buy-what-you’d-rented clause, so that it presented little actual risk. Second, some of us pointed out that since there was no requirement that campuses document how many copies they distributed, and in general the distribution would be independent of Microsoft, a campus leaving the Campus Agreement could simply cite any arbitrary number of copies as the basis for its exit payment. Therefore, even if Microsoft enforced the clause, estimating the associated payment was entirely under the campus’s control. Those of us who believed these arguments went forward with the Campus Agreement; Pi University didn’t.

So the ghost was ready (higher education had gotten most of what it wanted), but the meat was raw (what we wanted turned out to be problematic in ways no one had really thought through).

Now let’s turn to a more current case.

2. Outsourcing Campus Bookstores

In February 2012 EDUCAUSE agreed to work with Internet2 on an electronic textbooks pilot. This was to be the third in a series of pilots: Indiana University had undertaken one for the fall of 2011, it and a few other campuses had worked with Internet2 on a second proof-of-concept pilot for the spring of 2012, and the third pilot was to include a broader array of institutions.

Driving these efforts were the observations that textbook prices figured prominently in spiraling out-of-pocket college-attendance costs, that electronic textbooks might help attenuate those prices, and that electronic textbooks also might enable campuses to move from individual student purchases to more efficient site licenses, perhaps bypassing unnecessary intermediaries.

A small team planned the pilot, and began soliciting participation in mid-March. By April 7, the initial deadline, 70 institutions had expressed interest. Over 100 people joined an informational webinar two days later, and it looks as though about 25 institutions will manage to participate and help higher education, publishers, and e-reader providers understand their joint future better.

The ghost/meat example here isn’t the etext pilot itself. Rather, it’s something that caused many interested institutions to withdraw from the pilot: campus bookstore outsourcing.

According to the National Association of College Stores (NACS), there are about 4500 bookstores serving US higher education (probably coincidentally, that’s about the number of degree-granting institutions in the US, of which about two thirds are nonspecialized institutions enrolling more than just a few students). Many stores counted by NACS are simply stores near campuses rather than located on or formally associated with them.

Of the campus-located, campus-associated stores, over 820 are operated under outsourcing contracts by Follett Higher Education Group and about 600 are operated by Barnes & Noble College Booksellers. Another 140 stores are members of the Independent College Bookstore Association (ICBA), and the remainder — I can’t find a good count — are either independent, campus-operated, or operated by some other entity.

The arrangements for outsourced bookstores vary from campus to campus, but they have some features in common. The most prominent of those is the overall deal, which is generally that in return for some degree of exclusivity or special access granted by the campus, the store pays the campus a fee of some kind. The exclusivity or special access may be confined to textbook adoptions, or it may extend to clothing and other items with the campus logo or to computer hardware and software. The payment to the campus may be negotiated explicitly, or it may be a percentage of sales or profit. Some outsourced stores are in campus-owned buildings and pay rent, some own a building part of which is rented to campus offices or activities, and some are freestanding; the associated space payments further complicate the relationship between outsourced stores and campuses but do not change its fundamental dependence on the exchange of exclusivity for fees.

For the most part outsourcing bookstores seems to serve campuses well. Managing orders, inventories, sales, and returns for textbooks and insignia items requires skill and experience with high-volume, low-margin retail, which campus administrators rarely have. Moreover, until recently bookstore operations generally had little impact on campus operations and vice versa.

Because bookstore operations generally stood apart from academic and programmatic activities on campus, negotiating contracts with bookstores generally emphasized “business” issues. Since these for the most part involved money and space, negotiations and contract approvals often remained on the “business” side of campus administration, along with apparently similar issues like dining halls, fleet maintenance, janitorial service, lab supply, and so forth. Again, this served campuses well: the campus administrators most attuned to operations and finance (chief finance officers, chief administrative officers, heads of auxiliary services) were the right ones to address bookstore issues.

Over the past few years this changed, first gradually and then more abruptly.

  • First, having bookstores handle hardware and software sales to students (and in some cases departments) came into conflict with campus desires to guide individual choices and maximize support efficiency through standardization and incentives, none of which aligned well with bookstores’ need to maximize profit from IT sales — an important goal, with campus bookstore sales essentially flat since 2005-2006 despite 10%+ enrollment growth.
  • Second, the high price of textbooks drew attention as a major component of growing college costs, and campuses sought to regain some control over it – NACS reports that the average student spends $483 on texts and related materials, that the average textbook price rose from $56 in 2006-2007 to $62 in 2009-2010, and that the typical margin on textbooks is about 22% for new texts and 35% for used ones.
  • Third, as textbooks have begun to migrate from static paper volumes to interactive electronic form, they have come to resemble software more than sweatshirts in that individual student purchases through bookstores may not be the optimal way to distribute or procure them.

That last point — that bookstores may not be the right medium for selling and buying textbooks — potentially threatens the traditional bookstore model, and therefore the outsourcing industry based on it. Not surprisingly, bookstores have responded aggressively to this threat, both offensively and defensively. On the offensive front (I mean this in the sense “trying to advance”, rather than “trying to offend”), the major bookstore chains have invested in e-reader technology, and have begun experimenting extensively with alternative pricing and delivery models. On the defensive front, they have tried to extend past exclusivity clauses to include electronic texts and other new materials.

Many campuses expressed interest in the EDUCAUSE/Internet2 EText Pilot, going so far as to add themselves to a list, make preliminary commitments, and attend the webinar. Filled with enthusiasm, many webinar attendees began talking up the pilot on their campuses, and many of them then ran into a wall: they learned, often only when they double-checked with their counsel in the final stages of applying, that their bookstore contracts — Barnes & Noble and Follett both — precluded their participation in even a pilot exploration of alternative etext approaches, since the right to distribute electronic textbooks was reserved exclusively for the outsourced bookstore.

The CIO from one campus — I’ll call it Omega University — discovered that a recent renewal of the bookstore contract provided that during the 15-year term of the contract, “the Bookstore shall be the University’s  …exclusive seller of all required, recommended or suggested course materials, course packs and tools, as well as materials published or distributed electronically, or sold over the Internet.”  The OmegaU CIO was outraged: “In my mind,” he wrote, “the terms exclusive and over the Internet can’t even be in the same sentence!  And to restrict faculty use of technology for next 15 years is just insane.”

If the last decade has taught us anything, it is that the evolutionary cycle for electronic products is very short, requiring near-constant reappraisal of business models, pricing, and partnerships. That someone on campus signed a contract fixing electronic distribution mechanisms for 15 years may be an extreme case, but we’ve learned even from less pernicious cases that exclusivity arrangements bound to old business models will drastically constrain progress.

And so the ghost’s readiness again yielded raw meat: technological progress translated well-intentioned, longstanding bookstore contracts that had served campuses well into obstacles impeding even the consideration of important changes.

3. So What Do We Do?

It’s important to draw the right inference from all this.

The problem isn’t simply Microsoft trying to lock customers into the Campus Agreement or bookstore operators being avaricious; rather, they’re acting in self-interest, albeit self-interest that in each case is a bit short-sighted.

The compounding problem is that we in higher education often make decisions too narrowly. In the case of the Campus Agreement, we were so focused on the important move from per-copy to site licensing, a major win, that we didn’t pay sufficient negotiating time or effort to the so-called exit clauses — which, in retrospect, could certainly have been written in much less problematic ways still acceptable to Microsoft. In the case of bookstore contracts, we failed to recognize that what had been a distinct, narrow set of activities readily handled within business and finance was being driven by technology into new domains requiring foresight and expertise generally found elsewhere on campus.

Sadly, there’s no simple solution to this problem. It’s hard to take everything into account or involve every possible constituency in a decision and still get it done, and decisions must get done. Perhaps the best solution we can hope for is better, more transparent discussion of both past decisions and future opportunities, so that we learn collectively and openly from our mistakes, take joint responsibility for our shared technological future, and translate accurately back and forth between what we want and what we get.

Notes on Barter, Privacy, Data, & the Meaning of “Free”

It’s been an interesting few weeks:

  • Facebook’s upcoming $100-billion IPO has users wondering why owners get all the money while users provide all the assets.
  • Google’s revision of privacy policies has users thinking that something important has changed even though they don’t know what.
  • Google has used a loophole in Apple’s browser to gather data about iPhone users.
  • Apple has allowed app developers to download users’ address books.
  • And over in one of EDUCAUSE’s online discussion groups, the offer of a free book has somehow led security officers to do linguistic analysis of the word “free” as part of a privacy argument.

Lurking under all, I think, are the unheralded and misunderstood resurgence of a sometimes triangular barter economy, confusion about different revenue models, and, yes, disagreement about what the word “free” means.

Let’s approach the issue obliquely, starting, in the best academic tradition, with a small-scale research problem. Here’s the hypothetical question, which I might well have asked back when I was a scholar of student choice: Is there a relationship between selectivity and degree completion at 4-year colleges and universities?

As a faculty member in the late 1970s, I’d have gone to the library and used reference tools to locate articles or reports on the subject. If I were unaffiliated and living in Chicago (which I wasn’t back then), I might have gone to the Chicago Public Library, found in its catalog a 2004 report by Laura Horn, and have had that publication pulled from closed-stack storage so I could read it.

By starting with that baseline, of course, I’m merely reminiscing. These days I can obtain the data myself, and do some quick analysis. I know the relevant data are in the Integrated Postsecondary Education Data System (IPEDS). And those IPEDS data are available online, so I can

(a) download data on 2010 selectivity, undergraduate enrollment, and bachelor’s degrees awarded for the 2,971 US institutions that grant four-year degrees and import those data into Excel,

(b) eliminate the 101 system offices and similar entities missing relevant data, the 1,194 that granted fewer than 100 degrees, the 15 institutions reporting suspiciously high degree/enrollment rates, the one that reported no degrees awarded (Miami-Dade College, in case you’re interested), and the 220 that reported no admit rate, and then

(c) for the remaining 1,440 colleges and universities, create a graph of degree completion (somewhat normalized) as a function of selectivity (ditto).

The graph doesn’t tell me much–scatter plots rarely do for large datasets–but a quick regression analysis tells me there’s a modestly positive relationship: 1% higher selectivity (according to my constructed index) translates on average into 1.4% greater completion (ditto). The download, data cleaning, graphing, and analysis take me about 45 minutes all told.
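For readers who want to see what that “quick regression analysis” amounts to, here is a minimal sketch. The numbers below are made up for illustration; the real analysis used the IPEDS download cleaned as described in steps (a)–(c), and the slope formula is just the standard closed-form ordinary least squares.

```python
# Hypothetical sketch of the analysis described above: regress a
# completion index on a selectivity index across institutions.
# The data points here are fictional stand-ins for the cleaned
# IPEDS records; only the method matches the text.

def ols_slope(x, y):
    """Ordinary least-squares slope of y on x: cov(x, y) / var(x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x

# Selectivity index (higher = more selective) and completion index,
# one pair per (fictional) institution.
selectivity = [10, 20, 30, 40, 50, 60, 70, 80]
completion = [22, 31, 38, 55, 58, 74, 80, 95]

slope = ols_slope(selectivity, completion)
print(f"1% higher selectivity ~ {slope:.2f}% greater completion")
```

A scatter plot of the same two columns is what the text's "graph" would show; the slope is the single number the regression adds.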

Or I might just use a search engine. When I do that, using “degree completion by selectivity” as the search term, a highly-ranked Google result takes me to an excerpt from a College Board report.

Curiously, that report tells me that “…selectivity is highly correlated with graduation rates,” which is a rather different conclusion than IPEDS gave me. The footnotes help explain this: the College Board includes two-year institutions in its analysis, considers only full-time, first-time students, excludes returning students and transfers, and otherwise chooses its data in ways I didn’t.

The difference between my graph and the College Board’s conclusion is excellent fodder for a discussion of how to evaluate what one finds online — in the quote often (but perhaps mistakenly) attributed to Daniel Patrick Moynihan, “Everyone is entitled to his own opinion, but not his own facts.” Which gets me thinking about one of the high points in my graduate studies, a Harvard methodology seminar wherein Mike Smith, who was eventually to become US Undersecretary of Education, taught Moynihan what regression analysis is, which in turn reminds me of the closet full of Scotch at the Joint Center for Urban Studies kept full because Moynihan required that no meeting at the Joint go past 4pm without a bottle of Scotch on the table. But I digress.

Since I was logged in with my Google account when I did the search, some of the results might even have been tailored to what Google had learned about me from previous searches. At the very least, the information was tailored to previous searches from the computer I used here in my DC office.

Which brings me to the linguistic dispute among security officers.

A recent EDUCAUSE webinar presenter, during Data Privacy Month, was Matt Ivester, creator of JuicyCampus and author of lol…OMG!: What Every Student Needs to Know About Online Reputation Management, Digital Citizenship and Cyberbullying.

“In honor of Data Privacy Day,” the book’s website announced around the same time, “the full ebook of lol…OMG! (regularly $9.99) is being made available for FREE!” Since Ivester was going to be a guest presenter for EDUCAUSE, we encouraged webinar participants to avail themselves of this offer and to download the book.

One place we did that was in a discussion group we host for IT security professionals. A participant in that discussion group immediately took Ivester to task:

…you can’t download the free book without logging in to Amazon. And, near as I can tell, it’s Kindle- or Kindle-apps-only. In honor of Data Privacy Day. The irony, it drips.

“Pardon the rant,” another participant responded, “but what is the irony here?” Another elaborated:

I intend to download the book but, despite the fact that I can understand why free distribution is being done this way, I still find it ironic that I must disclose information in order to get something that’s being made available at no charge in honor of DPD.

The discussion grew lively, and eventually devolved into a discussion of the word “free”. If one must disclose personal information in order to download a book at no monetary cost, is the book “free”?

If words like “free”, “cost”, and “price” refer only to money, the answer is Yes. But money came into existence only to simplify barter economies. In a sense, today’s Internet economy involves a new form of barter that replaces money: If we disclose information about ourselves, then we receive something in return; conversely, vendors offer “free” products in order to obtain information about us.

In a recent post, Ed Bott presented graphs illustrating the different business models behind Microsoft, Apple, and Google. According to Bott, Microsoft is selling software, Apple is selling hardware, and Google is selling advertising.

More to the point here, Microsoft and Apple still focus on traditional binary transactions, confined to themselves and buyers of their products.

Google is different. Google’s triangle trade (which Facebook also follows) offers “free” services to individuals, collects information about those individuals in return, and then uses that information to tailor advertising that it then sells to vendors in return for money. In the triangle, the user of search results pays no money to Google, so in that limited sense it’s “free”. Thus the objection in the Security discussion group: if one directly exchanges something of value for the “free” information, then it’s not free.

Except for my own time, all three answers to my “How does selectivity relate to degree completion?” question were “free”, in the sense that I paid no money explicitly for them. All of them cost someone something. But not all no-cost-to-the-user online data is funded through Google-like triangles.

In the case of the Chicago Public Library, my Chicago property taxes plus probably some federal and Illinois grants enabled the library to acquire, catalog, store, and retrieve the Horn report. They also built the spectacular Harold Washington Library where I’d go read it.

In the case of IPEDS, my federal tax dollars paid the bill.

In both cases, however, what I paid was unrelated to how much I used the resources, and involved almost no disclosure of my identity or other attributes.

In contrast, the “free” search Google provided involved my giving something of value to Google, namely something about my searches. The same was true for the Ivester fans who downloaded his “free” book from Amazon.

Not that there’s anything wrong with that, as Jerry Seinfeld might say: by allowing Google and Amazon to tailor what they show me based on what they know about me, I get search results or purchase suggestions that are more likely to interest me. That is, not only does Google get value from my disclosure; I also get value from what Google does with that information.

The problem–this is what takes us back to security–is twofold.

  • First, an awful lot of users don’t understand how the disclosure-for-focus exchange works, in large part because the other party to the exchange isn’t terribly forthright about it. Sure, I can learn why Google is displaying those particular ads (that’s the “Why these ads?” link in tiny print atop the right column in search results), and if I do that I discover that I can tailor what information Google uses. But unless I make that effort the exchange happens automatically, and each search gets added to what Google will use to customize my future ads.
  • Second, and much more problematic, the entities that collect information about us increasingly share what they know. This varies depending on whether they’ve learned about us directly through things like credit applications or indirectly through what we search for on the Web, what we purchase from vendors like Amazon, or what we share using social media like Facebook or Twitter. Some companies take pains to assure us they don’t share what they know, but in many cases initial assurances get softened over time (or, as appears to have happened with Apple, are violated through technical or process failures). This is routinely true for Facebook, and many seem to believe it’s what’s behind the recent changes in Google’s privacy policy.

Indeed, companies like Acxiom are in the business of aggregating data about individuals and making them available. Data so collected can help banks combat identity theft by enabling them to test whether credit applicants are who they claim to be. If they fall into the wrong hands, however, the same data can enable subtle forms of redlining or even promote identity theft.

Vendors’ collecting data about us becomes a privacy issue whose substance depends on whether

  • we know what’s going on,
  • data are kept and/or shared, and
  • we can opt out.

Once we agree to disclose in return for “free” goods, however, the exchange becomes a security issue, because the same data can enable impersonation. It becomes a policy issue because the same data can enable inappropriate or illegal activity.

The solution to all this isn’t turning back the clock — the new barter economy is here to stay. What we need are transparency, options, and broad-based educational campaigns to help people understand the deal and choose according to their preferences.

As either Stan Delaplane or Calvin Trillin once observed about “market price” listings on restaurant menus (or didn’t — I’m damned if I can find anything authoritative, or for that matter any mention whatsoever of this, but I know I read it), “When you learn for the first time that the lobster you just ate cost $50, the only reasonable response is to offer half”.

Unfortunately, in today’s barter economy we pay the price before we get the lobster…

Impact of “Adult” and Generic Top-Level Internet Domains on Colleges and Universities

(This is a copy of one of my EDUCAUSE blog posts)

Internet domains in the new “adult” .xxx domain recently became available. So did arbitrary generic top-level domains (gTLDs) beyond the existing .com, .net, .org, .edu, .gov, and so forth. Both initiatives affect higher education. The effects of these initiatives thus far have been modest, but they have been entirely negative. So far as we know, no college or university has benefited from either initiative. Rather, institutions have been exposed to risk and incurred costs without receiving any value in return. On behalf of its members, EDUCAUSE proposes that procedures for issuing and managing generic top-level domains be tightened to reduce their unintended negative effects on colleges and universities.

I discussed the initiatives themselves more fully in an August 2011 post. Now that the initiatives are fully launched, this post provides some additional information and recommendations. I comment first on the risks arising from the .xxx domain, then on the costs institutions have incurred to mitigate those risks, and finally on some issues arising around generic top-level domains. I conclude with a few recommendations for ICANN and gTLD registrars, and one for colleges and universities.

Risks from the .xxx domain

Colleges and universities typically have .edu domains, and use these for their official business. In addition, many institutions have claimed relevant .com, .org, .biz, .info, or .net domains. Stanford University, for example, uses “stanford.edu” for its Web presence, but it also has licensed “stanford.com” and “stanford.org”. Similarly, many institutions have claimed relevant domains in selected country top-level domains (cTLDs) such as .us, .mx, .uk, or .cn, typically those where the institution has branch campuses. The goals in these cases typically have been simply to avoid confusion.

The .xxx domain does more than simply increase the number of top-level domains that might lead to confusion. Institutions worry that purveyors of adult material might explicitly seek to market their wares by associating those wares with college or university names, much as Playboy magazine once did with its “Women of the Ivy League” and similar features, and that this might reflect negatively on a college or university’s reputation. That is, the risks introduced by the .xxx domain go well beyond those already arising from other top-level domains.

The risk is not hypothetical. As Hawaii News Now reported in early February,

The University of Hawaii is demanding the operator of a pornographic web site stop using the school’s name or face legal action. The web site, called universityofhawaii.xxx, claims to feature what it describes as “hot nude Hawaiian college girls.”  It is full of graphic pictures of men and women having sex on beaches and at other tropical locations.

This is precisely the kind of embarrassment many institutions worried about.

Before the new domain .xxx went live, institutions had the opportunity to block use of their identity in .xxx domains through a so-called “Sunrise B” registration — but only if the institution’s identity (name, team name, nickname, etc.) had been trademarked, and the identity to be blocked precisely matched the trademark. Once the new domain went live, Sunrise B registrations were no longer available, and the only recourse for an institution was to register the potentially offending domain itself once the “Landrush” and “General Availability” periods began– or, if the domain had already been registered, to persuade the registrant to reassign or relinquish it.

Sunrise B, regular registration, and persuasion all entail costs, which brings me to the next section.

Costs to Mitigate .xxx Risks

As I wrote above, institutions could have filed Sunrise B registrations for .xxx domains, and a few institutions did so successfully. Typically a successful Sunrise B registration cost $199 for ten years — but the fee was the same even if the registration was unsuccessful. Some institutions tried to obtain Sunrise B registrations for non-trademarked names, but this did not succeed. One college I’ll call “Alpha” paid $1,000 in an attempt to register five names, only three of which corresponded to registered trademarks. The registrar approved only the trademarked three, but did not refund the $400 Alpha had paid for the other two. Another college, “Beta”, paid $199 for one successful Sunrise B registration, and then obtained General Availability registrations for four others at $99/year per domain.

It’s interesting to note that the cost of a .xxx registration varies from registrar to registrar, from a reported low of $79/year to a reported high of $103; also, some registrars offered only 10-year Sunrise B registrations, while others offered a perpetual option.

An informal survey of EDUCAUSE members found that successful Sunrise B (trademark blocking) registrations varied from none to a high of 22, whereas Landrush and General Availability registrations varied from none to 11. The typical response was 1-3 Sunrise B registrations and about 4 regular registrations. A quick Web search finds myriad other instances of colleges and universities registering .xxx domains.

The names being registered or blocked typically are variations on the institution’s name plus variations on team names. Most institutions report that the process for registering .xxx domains is straightforward and efficient. Although most institutions complained that they should not have to pay to defend their names, few complained about the actual amount of the fees.

Domain squatters — individuals or entities who register domains with the intention of reselling rather than using them — have been a long-time problem. In the Hawaii case, it’s reported that the entity that registered uhawaii.xxx demanded $100,000 to relinquish it. We have reports of other institutions being approached by .xxx squatters, but in each case the institution simply refused to deal with the squatter.

Generic Top-Level Domains

Although the idea of a generic top-level domain in a college’s or university’s name is appealing, the logistics of applying for and managing one have kept most institutions from pursuing this option. As one colleague put it,

There are two problems.  First, I have been unable to find a third party to do the registrar function for us (and we are unable to do it ourselves). It seems no one has yet figured out that there is a business opportunity in doing this. Also, the application itself needs to be a multi-hundred page submission to meet the requirements of the guidebook.  I’m actually hoping that will change over time for trademark holders.  If I hold the trademark for [institution name], I don’t see why I need to answer most of the questions in the guidebook.

Unless these two issues are addressed, it is unlikely most colleges or universities will pursue their own gTLD.

Recommendations

Other than the Hawaii case, the new .xxx and gTLD initiatives have mostly caused colleges and universities to divert administrative effort and funds to blocking or registering domains. Even so, we believe that ICANN could impose some simple requirements on new domains such as .xxx that would greatly reduce problems for higher education without materially complicating matters for registrars in those gTLDs.

  1. Automatically impose a Sunrise B block on any domain within a gTLD that corresponds to a registered trademark. That is, if “alphagroup” is a registered trademark, then, for example, the registrar for the .xxx domain should automatically refuse to issue alphagroup.xxx to any entity other than the trademark holder. The simplest way to achieve this would be to require that applicants for a domain affirm, under penalty of perjury, that they have searched the relevant trademark databases and that the domain name they seek does not conflict with any registered trademark. The registrar should then be required to randomly spot-audit some fraction of applications to ensure that affirmations are valid.
  2. Automatically impose a Sunrise B block on any domain name within a gTLD that corresponds to a domain within the .edu, .gov, .mil, or any other similarly regulated gTLDs. That is, if there is already a domain bigstate.edu, then the registrar for the .xxx domain, for example, should reject an application for bigstate.xxx, and similarly for other gTLDs.
  3. For gTLDs designated for potentially offensive material, such as .xxx, impose a waiting period between application and registration during which the application is public and other entities may object to the registration of a particular domain. If someone objects formally to the registration, invoke an arbitration or mediation process to resolve the dispute in a timely way.
  4. For gTLD applications, reject any gTLD suffix that conflicts with a registered trademark unless it is being sought by the trademark holder.
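The check in recommendation 2 is mechanically simple, which is part of the argument that it would not materially burden registrars. Here is a hedged sketch; the protected-name table is a made-up stand-in for whatever feed of existing .edu/.gov/.mil registrations a registrar would actually consult.

```python
# Sketch of recommendation 2: before issuing a label in a new gTLD
# (such as .xxx), refuse it if the same label is already registered
# in a protected TLD. PROTECTED is hypothetical illustration data,
# not a real registry feed.

PROTECTED = {
    "edu": {"bigstate", "omegau"},
    "gov": {"bigstate"},
}

def may_register(label):
    """Return True only if `label` collides with no protected domain."""
    return all(label not in names for names in PROTECTED.values())

assert may_register("alphagroup")      # no collision: issue it
assert not may_register("bigstate")    # bigstate.edu exists: block it
```

In practice the lookup would run against live registry data rather than a static table, but the decision itself is a single set-membership test per protected TLD.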

If these requirements had been imposed on the .xxx domain, most of its negative effects on colleges and universities would have been mitigated. Some institutions would still have wanted to claim some .xxx domains as a defensive strategy, but at least they would not have been required to devote extra effort and money to defending names already trademarked.

This leads to one important recommendation for colleges and universities:

  1. Colleges and universities should, wherever possible, trademark their institution’s official name and the variations and nicknames in common use, and do the same for team names, named schools, departments, and institutes, and distinctive mottos or slogans.

EDUCAUSE will be continuing to monitor this situation, and to file comments and make recommendations that might produce progress.

What Should We Learn from Megaupload?

Seizure Notice from megaupload.comHere’s how the New York Times broke the story:

In what the federal authorities on Thursday called one of the largest criminal copyright cases ever brought, the Justice Department and the Federal Bureau of Investigation seized the Web site Megaupload and charged seven people connected with it with running an international enterprise based on Internet piracy.

Since then, we’ve learned variously that the site’s principal barricaded himself inside a “safe room” at his New Zealand mansion to avoid arrest, where police eventually arrested him as he brandished a shotgun; that federal authorities plan to delete all of the data stored on Megaupload’s servers and so harm scores of innocent cloud-storage users; and that the raid on Megaupload was just a ploy to move SOPA and PIPA ahead by demonstrating that copyright pirates were getting rich–or maybe a ploy to do the opposite by demonstrating that existing law already is sufficient to deal with them.

Tempting as it is to identify princes and villains in this story, I’m not going to do that. Rather, I think there are some useful policy and practice reminders we should take away from this incident, no matter how it turns out.

Good Cloud=Bad Cloud, and vice versa. It’s tempting to say that Megaupload, by virtue of its blatant encouragement of unauthenticated file sharing, represents evil, while Dropbox, Box, Xythos, and other services that prevent or constrain sharing represent good. But it’s not that simple. Whatever its sins, Megaupload also provided very convenient, accessible cloud-based storage for digital files. Whatever their virtues, most other cloud-based storage services also enable individuals to share digital material in violation of copyright. It’s not technology that’s good or bad, it’s what people do with it.

There Is No Technical Solution. There’s only one way to clearly separate infringement-permitting sites from infringement-forbidding sites, and that’s by requiring that the latter inspect every single file they store and make sure that it does not infringe upon someone’s copyright. Some commentators have proposed this as the standard for whether file-sharing sites can operate. Yet if we’ve learned anything from the past several years of copyright wars, it’s that clearly distinguishing infringing from noninfringing material by purely technical means simply doesn’t work. We can identify music and maybe movies (as the amazing Shazam apps illustrate, and Audible Magic has tried to do for networks). But identifying a digital file in no way establishes whether that file is infringing. So long as cloud storage exists–indeed, so long as it is possible to make files in one place accessible from another–it will be possible for copyrighted or otherwise illegal material to be shared. To argue that because of this neither cloud storage nor file sharing should be permitted, as the entertainment industry implicitly does, is pointless and futile. The cloud–risks, benefits, ambivalence, and all–is here to stay.

Cease-and-Desist. The Digital Millennium Copyright Act has many, many failings. But its authors got one thing right: if a copyright holder believes its material is being distributed or used without permission on an Internet service, it must, before taking any other action, notify the offending site and request that the material be removed. The site, in turn, may take reasonable steps to confirm that the complaint is valid before causing the material to be removed–and the site may choose various ways to do so. Only if a site fails to act appropriately on complaints can it be held liable (other than for its own infringements, of course). Only through legal process (albeit rather minimal process) can the copyright holder obtain the infringing individual’s identity, and only by suing him or her can the copyright holder obtain redress. This is the basic fairness required by the United States’s commitment to due process and checks and balances. The recent Stop Online Piracy Act (SOPA) and Protect IP Act (PIPA) proposed to trample on this basic fairness, and that is part of why they stopped moving forward. Unfortunately, the Megaupload case reminds us that even without SOPA and PIPA it is possible for law enforcement to bypass due process. Users of Megaupload should have been warned that many files had been found to infringe copyright, and users should have had the opportunity to delete and/or move their files before law enforcement seized the company and disabled access.

The Long Arm of the Law. Here I defer to others wiser than I, and confine myself to two simple observations about jurisdiction. First, it’s quite clear that copyright and related laws vary dramatically from country to country. Whether they are enforced varies even more. It has thus been tempting to move questionable activities to other jurisdictions, for reasons ranging from  a copyright pirate’s or a pornographer’s desire to avoid prosecution to a government’s desire to not have another government poking into its digital affairs. Second, however, the complex border-hopping nature of today’s technology means that safe harbor is hard to find, because placing data “in” a particular country may not mean that at all–as Megaupload clearly learned when its Virginia-based servers enabled U.S. law enforcement to reach its multinational venues. So long as international variation exists, so will attempts to avoid jurisdiction, but the clear lesson of Megaupload is that the variation will have more effect on lawyers’ employment than on a priori protection.

Backup, Backup, Backup. If anyone is harmed when Megaupload data is deleted, it won’t be because of Megaupload’s crimes or law enforcement’s actions. Rather, it will be because individuals or companies did not back up their data appropriately. If data are important, then they should be stored in a way that protects them from plausible failures. LOCKSS (“lots of copies keeps stuff safe”) is the classic way to protect data: making sure that they exist in at least two distinct locations. It’s that term “distinct locations” that requires attention. So far as backup is concerned, “distinct locations” differ from each other not only geographically, but also technologically and organizationally. Keeping one’s data on two different devices in the same place is better than having only one copy, but it’s less wise than having those devices in two different places–or, better still, having more than two in more than two places. Less obviously, if those devices all store data the same way (for example, magnetically), that’s not as good as using diverse technologies. Even less obviously–and this is the key lesson for users of cloud services–the fact that a cloud provider backs up its data store isn’t the same as having one’s own copy. If, for example, the provider goes bankrupt, or is affected by a court order, or seized by law enforcement, all the data it holds–master copy and backups alike–may become inaccessible. LOCKSS!
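The LOCKSS principle is easy to automate, which makes failing to follow it all the harder to excuse. Here is a minimal sketch: copy an important file into several destination directories, each standing in for a distinct location (a second device, another site, a different provider). The paths and file names are hypothetical.

```python
# Minimal LOCKSS-style replication: one source file, several copies
# in distinct locations. Temporary directories stand in here for
# what would really be separate devices, sites, or providers.
import shutil
import tempfile
from pathlib import Path

def replicate(source, destinations):
    """Copy `source` into every destination directory; return the copy paths."""
    copies = []
    for dest in destinations:
        dest = Path(dest)
        dest.mkdir(parents=True, exist_ok=True)  # each copy gets its own location
        copies.append(Path(shutil.copy2(source, dest)))  # copy2 preserves metadata
    return copies

# Demo with throwaway directories standing in for "distinct locations".
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "thesis.txt"
    original.write_text("irreplaceable data")
    backups = replicate(original, [Path(tmp) / "siteA", Path(tmp) / "siteB"])
    assert all(p.read_text() == "irreplaceable data" for p in backups)
```

The point of the sketch is organizational, not technical: if all the destinations belong to one provider, a single bankruptcy, court order, or seizure can still take every copy away at once.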

I have my own Internet domain. It’s hosted on a commercial service, Hostmonster.  On occasion, I use my site to make things I’ve written available either generally, for example a recent paper of how information technology might help transform higher education, or with restrictions, such as a collection of downloaded comic strips I keep online for my own use and entertainment. I also use it to host my personal blog. Both of these functions require that Hostmonster provide file sharing on my behalf–indeed, if you just clicked on the link to my paper, I just shared a digital copyrighted document with you (and notice that I didn’t share the comic strips, which I’m not licensed to redistribute).

One early consequence of the Megaupload arrests was a move by several Internet services to eliminate or constrain file sharing. Yet file sharing is here to stay, a central building block of today’s networked society and economy. The key lesson we must draw from Megaupload is not that file sharing is evil and must be stamped out, but rather that the value and cost of file sharing, like everything else, depend not on what it is but rather on how we use it.

(This post also appears among my EDUCAUSE blog posts)