Posts Tagged ‘“information technology”’

Revisiting IT Policy #3: Harassment

The so-called “star wars” campuses of the mid-1980s (Brown, Carnegie Mellon, Dartmouth, and MIT) invented (or at least believe they invented–IT folklore runs rampant) much of what we take for granted and appreciate today in daily electronic life: single signon, secure authentication, instant messaging, cloud storage, interactive online help, automatic updates, group policy, and on and on.

They also invented things we appreciate less. One of those is online harassment, which takes many forms.

Early in my time as MIT’s academic-computing head, harassment seemed to be getting worse. Partly this was because the then-new Athena computing environment interconnected students in unprecedentedly extensive ways, and partly because the Institute approached harassment purely as a disciplinary matter–that is, trying to identify and punish offenders.

Those cases rarely satisfied disciplinary requirements, so few complaints resulted in disciplinary proceedings. Fewer still led to disciplinary action, and of course all of that was confidential.

Stopit

Working with Mary Rowe, who was then the MIT “Ombuds”, we developed a different approach. Rather than focus on evidence and punishment, we focused on two more general goals: making it as simple as possible for victims of harassment to make themselves known, and persuading offenders to change their behavior.

The former required a reporting and handling mechanism that would work discreetly and quickly. The latter required something other than threats.

Satisfying the first requirement was relatively simple. We created an email alias (stopit@mit.edu) to receive and handle harassment (and, in due course, other) complaints. Email sent to that address went to a small number of senior IT and Ombuds staff, collectively known as the Stopits. The duty Stopit–often me–responded promptly to each complaint, saying that we would do what we could to end the harassment.

We publicized Stopit widely online, in person, and with posters. In the poster and other materials, we gave three criteria for harassment:

  • Did the incident cause stress that affected your ability, or the ability of others, to work or study?
  • Was it unwelcome behavior?
  • Would a reasonable person of your gender/race/religion subjected to this find it unacceptable?

Anyone who felt in danger, we noted, should immediately communicate with campus police or the dean on call, and we also gave contact information for other hotlines and resources. Otherwise, we asked that complainants share whatever specifics they could with us, and promised discretion under most circumstances.

To satisfy the second requirement, we had to persuade offenders to stop–a very different goal, and this is the key point, from bringing them to justice. MIT is a laissez-faire, almost libertarian place, where much that would be problematic elsewhere is tolerated, and where there is a high bar to formal action.

As I wrote in an MIT Faculty Newsletter article at the time, we knew that directly accusing offenders would trigger demands for proof and long, futile arguments about the subtle difference between criticism and negative comments–which are common and expected at the Institute–and harassment. Prosecution wouldn’t address the problem.

UYA

And so we came up with the so-called “UYA” note.

“Someone using your account…”, the note began, and then went on to describe the alleged behavior. “If you did not do this,” the note went on, “…then quite possibly someone has managed to access your account without permission, and you should take immediate steps to change your password and not share it with anyone.” The note then concluded by saying “If the incident described was indeed your doing, we ask that you avoid such incidents in the future, since they can have serious disciplinary or legal consequences”.

Almost all recipients of UYA notes wrote back to say that their accounts had indeed been compromised, and that they had changed their passwords to make sure their accounts would not be used this way again. In virtually all such cases, the harassment then ceased.

Did we believe that most harassment involved compromised accounts, and that the alleged offenders were innocent? Of course not. In many cases we could see, in logs, that the offender was logged in and doing academic work at the very workstation and time whence the offending messages originated. But the UYA note gave offenders a way to back off without confession or concession. Most offenders took advantage of that. Our goal was to stop the harassment, and mostly the UYA note achieved that.

There was occasional pushback, usually the offender arguing that the incident was described accurately but did not constitute harassment. Here again, though, the offending behavior almost always ceased. And in a few cases there was pushback of the “yeah, it’s me, and you can’t make me stop” variety. In those, the Stopits referred the incident into MIT’s disciplinary process. And usually, regardless of whether the offender was punished, the harassment stopped.

So Stopit and UYA notes worked.

Looking back, though, they neglected some important issues, and those remain problematic. In fact, the two teaching cases I mentioned in the Faculty Newsletter article and have used in myriad class discussions since–Judy and Michael–reflect two such issues: the difference between harassment and a hostile work environment, and jurisdictional ambiguity.

Work Environment

Judy Hamilton complains that images displayed on monitors in a public computing facility make it impossible for her to work comfortably. This really isn’t harassment, since the offending behavior isn’t directed at her. Rather, the offender’s behavior made it uncomfortable for Judy to work even though the offender was unaware of Judy or her reaction.

The UYA note worked: the offender claimed that he’d done nothing wrong, and that he had every right to display whatever images he chose so long as they weren’t illegal, but nevertheless he chose to stop.

But it was not correct to suggest that he was harassing Judy, as we did at the time. Most groups that have discussed this case over the years come to that conclusion, and instead say this should have been handled as a hostile-work-environment case. It’s an important distinction to keep in mind.

Jurisdiction

Michael Zareny, on the other hand, is interacting directly with Jack Oiler, and there’s really no work environment involved. Jack feels harassed, but it’s not clear Michael’s behavior satisfies the harassment criteria. Jack appears to be annoyed, rather than impaired, by Michael’s comments. In any case the interaction between the two would be deemed unfortunate, rather than unacceptable, by many of Jack’s peers.

Or, and this is a key point, the interaction would be seen that way by Jack’s peers at MIT. There’s an old Cambridge joke: At Harvard people are nice to you and don’t mean it, and MIT people aren’t nice to you and don’t mean it. The cultural norms are different. What is unacceptable to someone at Harvard might not be to someone at MIT. So arises the first jurisdictional ambiguity.

In the event, the Michael situation turned out to be even more complicated. When Kim tried to send a UYA note to Michael, it turned out that there was no Michael Zareny at MIT. Rather, Michael Zareny was a student elsewhere, and his sole MIT connection was interacting with Jack Oiler in a newsgroup.

There thus wasn’t much Kim could do, especially since Michael’s own college declined to take any action because the problematic behavior hadn’t involved its campus or IT.

Looking Ahead

The point to all this is straightforward, and it’s relevant beyond the issue of harassment. In today’s interconnected world, it’s rare for problematic online behavior to occur within the confines of a single institution. As a result, taking effective action generally requires various entities to act consistently and collaboratively to gather data from complainants and dissuade offenders.

Yet the relevant policies are rarely consistent from campus to campus, let alone between campuses and ISPs, corporations, or other outside entities. And although campuses are generally willing to collaborate, this often proves difficult for FERPA, privacy, and other reasons.

It’s clear, especially with all the recent attention to online bullying and intimidation, that harassment and similarly antisocial behavior remain a problem for online communities. It’s hard to see how this will improve unless campuses and other institutions work together. If they don’t do that, then external rules–which most of us would prefer to avoid–may well make it a legal requirement.

You Report. We Decide?

“It’s one of the real black marks on the history of higher education,” Leon Botstein, the long-time President of Bard College, recently told The New Yorker’s Alice Gregory, “that an entire industry that’s supposedly populated by the best minds in the country … is bamboozled by a third-rate news magazine.” He was objecting, of course, to the often criticized but widely influential rankings of colleges and universities by U.S. News & World Report.

Two stories, and a cautionary note.

Wired

Seeing Wired magazine’s annual “wired campus” rankings in the same way Botstein viewed those from US News, some years ago several of us college and university CIOs conspired to disrupt Wired’s efforts. As I later wrote, the issue wasn’t that some campuses had different (and perhaps better or worse) IT than others. Rather, for the most part these differences bore little relevance to the quality of those campuses’ education or the value they provided to students.

We persuaded almost 100 key campuses to withhold IT data from Wired. After meeting with us to see whether compromise was possible (it wasn’t) and an abortive attempt to bypass campus officials and gather data directly from students, the magazine discontinued its ratings. Success.

But, as any good pessimist knows, every silver lining has a cloud. Wired had published not only summary ratings, but also, to its credit, the data (if not the calculations) upon which the ratings were based. Although the ratings were questionable, and some of the data seemed suspect, the latter nevertheless had some value. Rather than look at ratings, someone at Campus A could look and see how A’s reported specific activity compared to its peer Campus B’s.

Partly to replace the data Wired had gathered and made available, and so extend A’s ability to see what B was doing, EDUCAUSE started the Core Data Survey (now the Core Data Service, CDS). This gathered much of the same information Wired had, and more. (Disclosure: I served on the committee that helped EDUCAUSE design the initial CDS, and revised it a couple of years later, and have long been a supporter of the effort.)

Unlike Wired, EDUCAUSE does not make individual campus data publicly available. Rather, participating campuses can compare their own data to those of all or subsets of other campuses, using whatever data and comparison algorithm they think appropriate. I can report from personal experience that this is immensely useful, if only because it stimulates and focuses discussions among campuses that appear to have made different choices.

But back to Botstein. EDUCAUSE doesn’t just make CDS data available to participating campuses. It also uses CDS data to develop and publish “Free IT Performance Metrics,” which it describes as “Staffing, financials, and services data [campuses] can use for modifications, enhancements, and strategic planning.” The heart of Botstein’s complaint about U.S. News & World Report isn’t that the magazine is third rate–that’s simply Botstein being Botstein–but rather that US News believes the same rating algorithm can be validly used to compare campuses.

Which raises the obvious question: Might EDUCAUSE-developed “performance metrics” fall into that same trap? Are there valid performance metrics for IT that are uniformly applicable across higher education?

Many campuses have been bedeviled and burned by McKinseys, BCGs, Accentures, Bains, PWCs, and other management consultants. These firms often give CFOs, Provosts, and Presidents detailed “norms” and “standards” for things like number of users per help-desk staffer, the fraction of operating budgets devoted to IT, or laptop-computer life expectancy. These can then become targets for IT organizations, CIOs, or staff in budget negotiations or performance appraisal.

Some of those “norms” are valid. But many of them involve inappropriate extrapolation from corporate or other different environments, or implicitly equate all campus types. Language is important: “norms,” “metrics,” “benchmarks,” “averages,” “common”, “typical,” and “standards” don’t mean the same thing. So far EDUCAUSE has skirted the problem, but it needs to be careful to avoid asserting uniform validity when there’s no evidence for it.

US News

A second story illustrates a different, more serious risk. A few years ago a major research university–I’ll call it Lake Desert University or LDU–was distressed about its US News ranking. To LDU’s leaders, faculty, and students the ranking seemed much too low: Lake Desert generally ranked higher elsewhere.

A member of the provost’s staff–Pat, let’s say–was directed to figure out what was wrong. Pat spent considerable time looking at US News data and talking to its analysts. An important component of the US News ranking algorithm, Pat learned, was class size. The key metric was the fraction of campus-based classes with enrollments smaller than 20.

Pat, a graduate of LDU, knew that there were lots of small classes at Lake Desert–the university’s undergraduate experience was organized around tutorials with 4-5 students–and so it seemed puzzling that LDU wasn’t being credited for that. Delving more deeply, Pat found the problem. Whoever had completed LDU’s US News questionnaire had read the instructions very literally, decided that “tutorials” weren’t “classes”, and so excluded them from the reporting counts. Result: few small classes, and a poor US News ranking.
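To see how much a single classification choice can move a metric like this, here is a minimal sketch in Python. The enrollment figures are invented for illustration and are not LDU's data; the only thing taken from the story is the metric itself, the fraction of classes with enrollments smaller than 20.

```python
def small_class_fraction(enrollments, threshold=20):
    """Fraction of classes whose enrollment is below the threshold."""
    if not enrollments:
        return 0.0
    small = sum(1 for n in enrollments if n < threshold)
    return small / len(enrollments)


# Hypothetical enrollments, invented for this example.
lectures = [120, 80, 45, 30, 25, 18, 15]   # conventional classes
tutorials = [4, 5, 4, 5, 5, 4, 5, 4]       # tutorials of 4-5 students

# Literal reading of the questionnaire: tutorials are not "classes".
without_tutorials = small_class_fraction(lectures)

# Reading the analysts endorsed: tutorials count as classes.
with_tutorials = small_class_fraction(lectures + tutorials)

print(f"tutorials excluded: {without_tutorials:.0%} small classes")
print(f"tutorials included: {with_tutorials:.0%} small classes")
```

Nothing about the teaching changes between the two calls; only the definition of "class" does.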

US News analysts told Pat that tutorials should have been counted as classes. The following year, Lake Desert included them. Its fraction-of-small-classes metric went up substantially. Its ranking jumped way up. The Provost sent Pat a case of excellent French wine.

In LDU’s case, understanding the algorithm and looking at the survey responses unearthed a misunderstanding. Correcting this involved no dishonesty (although some of LDU’s public claims about the “improvement” in its ranking neglected to say that the improvement had resulted from data reclassification rather than substantive progress).

Caution

But not all cases are as benign as LDU’s. As I wrote above, there were questions not only about Wired’s ranking algorithm, but about some of the data campuses provided. Lake Desert correcting its survey responses in consultation with analysts is one thing; a campus misrepresenting its IT services to get a higher ranking is another. But it can be hard to distinguish the two.

Auditing is one way to address this problem, but audits are expensive and difficult. Publishing individual responses is another–both Wired and US News have done this, and EDUCAUSE shares them with survey respondents–but that only corrects the problem if respondents spend time looking at other responses, and are willing to become whistleblowers when they find misrepresentation. Most campuses don’t have the time to look at other campuses’ responses, or the willingness to call out their peers.

If survey responses are used to create ratings, and those ratings become measures of performance, then those whose performance is being measured have incentive to tailor their survey responses accordingly. If the tailoring involves just care within the rules, that’s fine. But if it involves stretching or misrepresenting the truth, it’s not.

More generally, it’s important to closely connect the collection of data to their evaluative use. Who reports, should decide.


Notes on “Swag”

(…with apologies to Susan Sontag, of course.)

Visiting the trade show at the EDUCAUSE conference requires strategy. At one time it was simple: collect every pen being given away (having some conversations with vendors in the process), so that back home the kid could give them to his friends at school. Kid grew up, though, and there came “No more pens, Dad, please.”

After that I usually walked around with Ira Fuchs, who had an excellent eye for the interestingly novel product. But Ira hasn’t been attending, so I’ve taken to observing two things: how vendors staff their booths, and what they give away–the swag.

Who

The interesting thing about staffing is what it tells us vendors assume about higher-education IT, and especially what they assume about our procurement decisions. I track two variables: whether booths are staffed by people who know something about the product and higher education, and whether they’re chosen for reasons other than expertise.

This year the booth staff seem reasonably attuned to product and customer, and, with the exception of some game barkers, two people dressed up as giant blue bears, and two women dressed like 1950s flight attendants, most of them pretty much looked like the attendees, except with logos on their shirts.

To be more precise, the place wasn’t full of what are sometimes called Demo Dollies, attractive young women with no product knowledge deployed on the assumption that they will attract men to their booths (and therefore on the assumption that men are making the key decisions). That there aren’t many of them is good, since a few years back things were quite different, reaching a nadir with the infamous catwomen. We don’t want industry thinking of higher education as a market easily influenced by Demo Dollies.

What

The interesting thing about swag–the stuff that vendors give away–is that it tells us something about the resources vendors are committing to higher education, the resources they think are available from higher education, or both. There are two dimensions to swag: how swanky it is, and how creative it is.

I spent some time on this year’s tradeshow floor looking for swag that rose above the commonplace, and here’s what struck me: there wasn’t much. There were lots of pens (which I’m still not allowed to bring home), lots of candy, and lots of small USB thumb drives, all of course bearing vendor logos. I count those as neither swanky nor creative.

The growing swag sector is stuff made out of foam or soft plastic. This includes baseballs, footballs, various kinds of phone-propper-uppers, can holders, and a few creatures and cartoon characters. Some of this related in some way to the vendor’s product or slogan or brand, but most of it didn’t. The foam stuff was mildly creative, except it’s less and less so each year; there was lots more of that this year than last.

There were also various items that weren’t intrinsically creative, and also not swanky, but were distinctive, if only because few vendors offered them.

There were keychain carabiners (which I always look for, since I keep leaving them in rental cars–and this year only two vendors had them), earphones, t-shirts (remarkably few of those compared to previous years, when they were ubiquitous), USB chargers, corkscrews, can openers, pens that light up, baseball caps, and kitchen utensils (my personal favorite, I think). Several vendors told me the one to get was a jump rope with blinking handles, but I couldn’t find it. Next year.

(I’ve uploaded photos of the distinctive swag to an album on my Facebook account.)

So…

…here’s the thing. That most of the available swag was low-end and uncreative may disappoint those who take lots home for friends or family or colleagues or whomever. It also may mean that vendors selling to higher education aren’t as flush as they once were, or think we aren’t; both of those are probably somewhat true, and neither is especially good news.

Combined with the dearth of Demo Dollies, though, I see the situation somewhat more positively. It seems to me that even though they may be less flush, this year’s vendors are taking the higher-education market seriously, using knowledgeable staff rather than artifice to engage customers, who may also be less flush, and sell product wisely.

That, as Martha Stewart would say, is a good thing!

Mythology, Belief, Analytics, & Behavior

I’m at loose ends after graduating. The Dean for Student Affairs, whom I’ve gotten to know through a year of complicated political and educational advocacy, wants to know more about MIT’s nascent pass/fail experiment, under which first-year students receive written rather than graded evaluations of their work.

MIT being MIT, “know more” means data: the Dean wants quantitative analysis of patterns in the evaluations. I’m hired to read a semester’s worth, assign each a “Usefulness” score and a “Positiveness” score, and then summarize the results statistically.

Two surprises. First, Usefulness turns out to be much higher than anyone had expected–mostly because evaluations contain lots of “here’s what you can do to improve” advice, rather than lots of terse “you would have gotten a B+” comments, as had been predicted. Second, Positiveness is distributed remarkably like grades had been for the pre-pass/fail cohort, rather than skewing higher, as had been predicted. Even so, many faculty continue to believe both predictions (that is, they think written evaluations are both generally useless and inappropriately positive).

A byproduct of the assignment is my first exposure to MIT’s glass-house computer facility, an IBM 360 located in the then-new Building 39. In due course I learn that Jay Forrester, an MIT faculty member, had patented the use of 3-D arrays of magnetic cores for computer memory (the read-before-write use of cores, which enabled Forrester’s breakthrough, had been patented by An Wang, another faculty member, of the eponymous calculators and word processors). IBM bought Wang’s patent, but not Forrester’s, and after protracted legal action eventually settled with Forrester in 1964 for $13 million.

According to MIT mythology, under the Institute’s intellectual-property policy half of the settlement came to the Institute, and that money built Building 39. Only later do I wonder whether the Forrester/IBM/39 mythology is true. But not for long: never let truth stand in the way of a good story.

Belief in mythology is durable, and not just because mythology often involves memorable, simple stories. This is important because belief so heavily drives behavior. That belief resists even data-driven contradiction–data analysis rarely yields memorable, simple stories–is one reason analytics so often prove curiously ineffective in modifying institutional behavior.

Two examples, both involving the messy question of copyright infringement by students and what, if anything, campuses should do about it.

44%

I’m having lunch with a very smart, experienced, and impressive senior officer from an entertainment-industry association, whom I’ll call Stan. The only reason universities invest heavily in campus networks, Stan tells me, is to enable students to download and share ever more copyright-infringing movies, TV shows, and music. That’s why campuses remain major distributors of “pirated” entertainment, he says, and therefore why it’s appropriate to subject higher education generally to regulations and sanctions such as the “peer to peer” regulations from the 2008 Higher Education Opportunity Act.

That Stan believes this results partly from a rhetorical problem with high-performance networks, such as the research networks within and interconnecting colleges and universities. High-performance networks–even those used by broadcasters–usually are engineered to cope with peak loads. Since peaks are occasional, most of the time most network capacity goes unused. If one doesn’t understand this–as Stan doesn’t–then one assumes that the “unused” capacity is in fact being used, but for purposes not being disclosed.

And, as it happens, there’s mythology to fill in the gap: According to a 2005 MPAA study, Stan tells me, higher education accounts for almost half of all copyright infringement. So MPAA, and therefore Stan, knows what campuses aren’t telling us: they’re upgrading campus networks to enable infringement.

But Stan is wrong. There are two big problems with his belief.

First, shortly after MPAA asserted, both publicly and in letters to campus presidents, that 44% of all copyright infringement emanates from college campuses, which is where Stan’s “almost half” comes from, MPAA learned that its data contractor had made a huge arithmetic error. The correct estimate should have been more like 10-15%. But the corrected estimate was never publicized as extensively as the erroneous one: the errors that statisticians make live after them; the corrections are oft interred with their bones.

Second, if Stan’s belief is correct, then there should be little difference among campuses in the incidence of copyright infringement, at least among campuses with research-capable networking. Yet this isn’t the case. As I’ve found researching three years of data on the question, the distribution of detected infringement is highly skewed. Most campuses are responsible for little or no distribution of infringing material, presumably because they’re using Packetlogic, Palo Alto firewalls, or similar technologies to manage traffic. Conversely, a few campuses account for the lion’s share of detected infringement.

So there are ample data and analytics contradicting Stan’s belief, and none supporting it. But his belief persists, and colors how he engages the issues.

Targeting

I’m having dinner with the CIO from an eminent research university; I’ll call her Samantha, and her campus Helium (the same name it has in the infringement-data post I cited above). We’re having dinner just as I’m completing my 2013 study, in which Helium has surpassed Hydrogen as the largest campus distributor of copyright-infringing movies, TV shows, and music.

In fact, Helium accounts for 7% of all detected infringement from the 5,000 degree-granting colleges and universities in the United States. I’m thinking that Samantha will want to know this, that she will try to figure out what Helium is doing–or not doing–to stand out as such a sore thumb among peer campuses, and perhaps make some policy or practice changes to bring Helium into closer alignment.

But no: Samantha explains to me that the data are entirely inaccurate. Most of the infringement notices Helium receives are duplicates, she tells me, and in any case the only reason Helium receives so many is that the entertainment industry intentionally targets Helium in its detection and notification processes. Since the data are wrong, she says, there’s no need to change anything at Helium.

I offer to share detailed data with Helium’s network-security staff so that they can look more closely at the issue, but Samantha declines the offer. Nothing changes, and in 2014 Helium is again one of the top recipients of infringement notices (although Hydrogen regains the lead it had held in 2012).

The data Samantha declines to see tell an interesting story, though. The vast majority of Helium’s notices, it turns out, are associated with eight IP addresses. That is, each of those eight IP addresses is cited in hundreds of notices, which may account for Samantha’s comment about “duplicates”. Here’s what’s interesting: the eight addresses are consecutive, and they each account for about the same number of notices. That suggests technology at work, not individuals.
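The pattern described above (a handful of numerically consecutive addresses, each cited in about the same number of notices) is easy to check for mechanically. What follows is only a sketch under assumptions: the notice list is invented, the addresses come from the reserved documentation ranges, and the 20% tolerance and minimum run length of four are arbitrary choices of mine, not anything from the actual study.

```python
import ipaddress
from collections import Counter


def consecutive_runs(addresses):
    """Group IPv4 addresses into runs of numerically consecutive addresses."""
    ints = sorted({int(ipaddress.IPv4Address(a)) for a in addresses})
    runs, current = [], []
    for value in ints:
        if current and value != current[-1] + 1:
            runs.append(current)
            current = []
        current.append(value)
    if current:
        runs.append(current)
    return [[str(ipaddress.IPv4Address(v)) for v in run] for run in runs]


def evenly_spread(counts, tolerance=0.2):
    """True if every count is within `tolerance` (as a fraction) of the mean."""
    mean = sum(counts) / len(counts)
    return all(abs(c - mean) <= tolerance * mean for c in counts)


# Hypothetical notice log: one cited source address per notice.
notices = (
    ["192.0.2.16"] * 210 + ["192.0.2.17"] * 198 + ["192.0.2.18"] * 205
    + ["192.0.2.19"] * 201 + ["198.51.100.7"] * 3 + ["203.0.113.40"] * 1
)

per_ip = Counter(notices)
for run in consecutive_runs(per_ip):
    counts = [per_ip[ip] for ip in run]
    # Arbitrary rule of thumb: four or more adjacent, evenly loaded addresses.
    if len(run) >= 4 and evenly_spread(counts):
        print("possible shared address pool:", run[0], "to", run[-1], counts)
```

A flagged run proves nothing on its own; it simply tells network staff which addresses deserve a closer look.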

As in Stan’s case, it helps to know something about how campus networks work. Lots of traffic distributed evenly across a small number of IP addresses sounds an awful lot like load balancing, so perhaps those addresses are the front end to some large group of users. “Front end to some large group of users” sounds like an internal network using Network Address Translation (NAT) for its external connections.

NAT issues numerous internal IP addresses to users, and then technologically translates those internal addresses traceably into a much smaller set of external addresses. Most campuses use NAT to conserve their limited allocation of external IP addresses, especially for their campus wireless networks. NAT logs, if kept properly, enable campuses to trace connections from insiders to outside and vice versa, and so to resolve those apparent “duplicates”.
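As a rough illustration of what tracing through NAT logs involves, here is a minimal sketch. Everything about the log format is an assumption made for the example: I assume each translation record carries the internal address, the external address and port, and the interval during which the mapping was active, and that a notice supplies an external address, port, and timestamp. Real NAT devices log in their own formats, and `NatEntry` and `find_internal` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class NatEntry:
    """One hypothetical NAT translation record."""
    internal_ip: str
    external_ip: str
    external_port: int
    start: datetime
    end: datetime


def find_internal(log, external_ip, external_port, when):
    """Return the internal address that held external_ip:external_port at `when`."""
    for entry in log:
        if (entry.external_ip == external_ip
                and entry.external_port == external_port
                and entry.start <= when <= entry.end):
            return entry.internal_ip
    return None


# Invented records: two internal users sharing one external address.
log = [
    NatEntry("10.20.3.41", "192.0.2.16", 40001,
             datetime(2014, 2, 10, 21, 0), datetime(2014, 2, 10, 21, 30)),
    NatEntry("10.20.7.9", "192.0.2.16", 40002,
             datetime(2014, 2, 10, 21, 5), datetime(2014, 2, 10, 22, 0)),
]

# Two notices citing the same external address resolve to different insiders.
first = find_internal(log, "192.0.2.16", 40001, datetime(2014, 2, 10, 21, 15))
second = find_internal(log, "192.0.2.16", 40002, datetime(2014, 2, 10, 21, 15))
print(first, second)
```

If the logs are not retained, or lack timestamps precise enough to match a notice, the lookup returns nothing and the "duplicates" stay unresolved.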

So although it’s true that there are lots of duplicate IP addresses among the notices Helium receives, this probably stems from Helium’s use of NAT on its campus wireless. Helium’s data are not incorrect. If Helium were to manage NAT properly, it could figure out where the infringement is coming from, and address it.

Samantha’s belief that copyright holders target specific campuses, like Stan’s that campuses expand networks to encourage infringement, has a source–in this case, a presentation some years back from an industry association to a group of IT staff from a score of research universities. (I attended this session.) Back then, we learned, the association did target campuses, not out of animus, but simply as a data-collection mechanism. The association would choose a campus, look for infringing material being published from the campus’s network, send notices, and then move on to another campus.

Since then, however, the industry had changed its methodology, in large part because the BitTorrent protocol replaced earlier ones as the principal medium for download-based infringement. Because of how BitTorrent works, the industry’s methodology shifted from searching particular networks to searching BitTorrent indexes for particularly popular titles and then seeing which networks were making those titles available.

I spent lots of time recently with the industry’s contractors looking closely at that methodology. It appears to treat campus networks equivalently to each other and to commercial networks, and so it’s unlikely that Helium was being targeted as Samantha asserted.

If Samantha had taken the infringement data to her security staff, they probably would have discovered the same thing I did, and either used NAT data to identify offenders, or perhaps to justify policy changes for the wireless network. Same goes for exploring the methodology. But instead Samantha relied on her belief that the data were incorrect and/or targeted.

Promoting Analytic Effectiveness

Because of Stan’s and Samantha’s belief in mythology, their organizations’ behavior remains largely uninformed by analytics and data.

A key tenet in decision analysis holds that information has no value (other than the intrinsic value of knowledge) unless the decisions an individual or an institution has before it will turn out differently depending on the information. That is, unless decisions depend on the results of data analysis, it’s not worth collecting or analyzing data.
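The tenet can be made concrete with the textbook calculation of the expected value of perfect information. The sketch below is illustrative only: the two states, the two actions, and every payoff are invented, and the function names are mine rather than anything from a particular library.

```python
def expected_value(payoffs, probs):
    """Expected payoff of one action across the possible states."""
    return sum(p * v for p, v in zip(probs, payoffs))


def best_action_value(actions, probs):
    """Value of the best single action chosen before learning the state."""
    return max(expected_value(payoffs, probs) for payoffs in actions.values())


def value_with_perfect_information(actions, probs):
    """Expected value when the best action is chosen after learning the state."""
    return sum(
        probs[s] * max(payoffs[s] for payoffs in actions.values())
        for s in range(len(probs))
    )


def value_of_information(actions, probs):
    return value_with_perfect_information(actions, probs) - best_action_value(actions, probs)


# Two hypothetical states: the data are accurate, or the data are wrong.
probs = [0.5, 0.5]

# Case 1: the right action depends on the state, so information is worth something.
decision_depends = {"change policy": [10, -5], "do nothing": [-10, 0]}

# Case 2: one action is best in both states, so the analysis cannot change the decision.
decision_fixed = {"change policy": [10, 1], "do nothing": [-10, 0]}

print(value_of_information(decision_depends, probs))
print(value_of_information(decision_fixed, probs))
```

In the second case the same action wins whichever state holds, so learning the state is worth nothing operationally, which is exactly the situation when a decision maker has already decided what the data must say.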

Colleges, universities, and other academic institutions have difficulty accepting this, since the intrinsic value of information is central to their existence. But what’s valuable intrinsically isn’t necessarily valuable operationally.

Generic praise for “data-based decision making” or “analytics” won’t change this. Neither will post-hoc documentation that decisions are consistent with data. Rather, what we need are good, simple stories that will help mythology evolve: case studies of how colleges and universities have successfully and prospectively used data analysis to change their behavior for the better. Simply using data analysis doesn’t suffice, and neither does better behavior: we need stories that vividly connect the two.

Ironically, the best way to combat mythology is with–wait for it–mythology…

Revisiting IT Policy #2: Campus DMCA Notices

Under certain provisions from the Digital Millennium Copyright Act, copyright holders send a “notification of claimed infringement” (sometimes called a “DMCA” or “takedown” notice) to Internet service providers, such as college or university networks, when they find infringing material available from the provider’s network. I analyzed counts of infringement notices from the four principal senders to colleges and universities over three time periods (Nov 2011-Oct 2012, Feb/Mar 2013, and Feb/Mar 2014).

In all three periods, most campuses received no notices, even campuses with dormitories. Among campuses receiving notices, the distribution is highly skewed: a few campuses account for a disproportionately large fraction of the notices. Five campuses consistently top the distribution in each year, but beyond these there is substantial fluctuation from year to year.

The volume of notices sent to campuses varies somewhat positively with their size, although some important and interesting exceptions keep the correlation small. The incidence of detected infringement varies strongly with how residential campuses are. It varies less predictably with proxy measures of student-body affluence.

I elaborate on these points below.

Patterns

The estimated total number of notices for the twelve months ending October 2012 was 243,436. The actual number of notices in February/March 2013 was 39,753, and the corresponding number a year later was 20,278.

The general pattern was the same in each time period.

  • According to the federal Integrated Postsecondary Education Data System (IPEDS), from which I obtained campus attributes, there are 4,904 degree-granting campuses in the United States. Of these, over 80% received no infringement notices in any of the three time periods.
  • 90% of infringement notices went to campuses with dormitories.
  • Of the 801 institutions that received at least one notice in one period, 607 received at least one notice in two periods, and 437 did so in all three. The distribution was highly skewed among the campuses that received at least one infringement notice. The top two recipients in each period were the same: they alone accounted for 12% of all notices in 2012, and 10% in 2013 and 2014.
  • In 2012, 10 institutions accounted for a third of all notices, and 41 accounted for two thirds. In 2013, the distribution was only a little less skewed: 22 institutions accounted for a third of all notices, and 94 accounted for two thirds. In 2014, 22 institutions also accounted for a third of all notices, and 99 accounted for two thirds.
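Statements such as "10 institutions accounted for a third of all notices" come from ranking campuses by notice count and accumulating from the top. Here is a minimal sketch of that calculation; the function name and the ten counts are hypothetical and are not the study's data.

```python
def campuses_needed(counts, share):
    """Smallest number of top-ranked campuses whose notices reach `share` of the total."""
    total = sum(counts)
    running = 0
    for rank, count in enumerate(sorted(counts, reverse=True), start=1):
        running += count
        if running >= share * total:
            return rank
    return len(counts)

# Hypothetical notice counts for ten campuses (illustrative only).
counts = [500, 300, 80, 40, 30, 20, 10, 10, 5, 5]

print(campuses_needed(counts, 1 / 3))  # campuses needed to reach one third
print(campuses_needed(counts, 2 / 3))  # campuses needed to reach two thirds
```

The more skewed the distribution, the smaller these two numbers are relative to the number of campuses receiving any notices.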

Campus Type

In 2014, just 590 of the 4,904 campuses received infringement notices. Here is a breakdown by institutional control and type:

[Image: 2014 infringement notices by institutional control and type]

Here are the same data, this time broken down by campus size and residential character (using dormitory beds per enrolled student to measure the latter; the categories are quintiles):

[Image: 2014 infringement notices by campus size and residential quintile]

About a third of all notices went to very large campuses in the middle residential quintile. In keeping with the classic Pareto ratio, the largest 20% of campuses account for 80% of all notices (and enroll ¾ of all students). Although about half of the largest group is nonresidential (mostly community colleges, plus some state colleges), only a few of them received notices.

Campus Distributions

The top two among the 100 campuses that received the most notices in Feb/Mar 2014 received over 1,000 notices each in the two months. The next highest campus received 615. As the graph below shows, the top 100 campuses accounted for two thirds of the notices; the next 600 campuses accounted for the remaining third (click on this graph, or the others below, to see it full size):

[Graph: distribution of Feb/Mar 2014 notices across recipient campuses]

Below is a more detailed distribution for the top 30 recipient campuses, with comparisons to 2012 and 2013 data. To enable valid comparison, this chart shows the fraction of notices received by each campus in each year, rather than the total. The solid red bars are the campus’s 2014 share, and the lighter blue and green bars are the 2012 and 2013 shares. The hollow bar for each campus is the incidence of detected infringement, defined as the number of 2014 notices per thousand headcount students.

[Chart: share of notices in 2012, 2013, and 2014, and 2014 incidence, for the top 30 recipient campuses]

As in earlier analyses, there is an important distinction between campuses whose high volume of notices stems largely from their size, and those where it stems from a combination of size and incidence—that is, the ratio of notices received to enrollment.

In the graph, Carbon and Nitrogen are examples of the former: they are both very large public urban universities enrolling over 50,000 students, but with relatively low incidence of around 7 notices per thousand students. They stand in marked contrast to incidences of 20-60 notices per thousand students at Lithium, Boron, Neon, Magnesium, Aluminum, and Silicon, each of which enrolls 10,000-25,000 students; all are private except Aluminum.
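Incidence, as defined above, is simply notices per thousand headcount students. The sketch below shows the arithmetic with two hypothetical campuses whose numbers are invented to echo the contrast between size-driven and incidence-driven volume; they are not the actual figures for any campus in the study.

```python
def incidence_per_thousand(notices, headcount):
    """Notices received per thousand headcount students."""
    return 1000 * notices / headcount

# Hypothetical campuses (illustrative numbers only).
campuses = {
    "large_low_incidence": {"notices": 350, "headcount": 50_000},
    "midsize_high_incidence": {"notices": 600, "headcount": 15_000},
}

for name, data in campuses.items():
    rate = incidence_per_thousand(data["notices"], data["headcount"])
    print(f"{name}: {rate:.1f} notices per thousand students")
```

Both campuses would rank high on raw volume, but only the second stands out once enrollment is taken into account.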

Changes over Time

The overall volume of infringement notices varies from time to time depending on how much effort copyright holders devote to searching for infringement (effort costs money), and to a lesser extent based on which titles they use to seed searches. The volume of notices sent to campuses varies accordingly. However, the distribution of notices across campuses should not be affected by the total volume. To analyze trends, therefore, it is important to use a metric independent of total volume.

As in the preceding section, I used the fraction of all campus notices each campus received for each period. The top two campuses were the same in all three years: Hydrogen was highest in 2012 and 2014, and Helium was highest in 2013.
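To see why shares are a volume-independent metric, consider a sketch in which total volume halves between two periods while the distribution across campuses stays the same. The campus labels and counts are hypothetical.

```python
def shares(counts):
    """Fraction of all notices in a period that each campus received."""
    total = sum(counts.values())
    return {campus: n / total for campus, n in counts.items()}

# Hypothetical counts: period_2 has half the total volume of period_1,
# spread across the same campuses in the same proportions.
period_1 = {"A": 600, "B": 300, "C": 100}
period_2 = {"A": 300, "B": 150, "C": 50}

print(shares(period_1))
print(shares(period_2))
```

Raw counts would suggest every campus improved in the second period; the shares show that nothing about the distribution changed.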

Only five campuses received at least 1.5% of all notices in more than one year:

[Graph: campuses receiving at least 1.5% of all notices in more than one year]

These campuses consistently stand at the top of the list, account for a substantial fraction of all infringement notices, and except for Beryllium have incidence over 20. As I argue below, it makes sense for copyright holders to engage them directly, to help them understand how different they are from their peers, and perhaps to persuade them to better “effectively combat” infringement from their networks by adopting policies and practices from their low-incidence peers.

Aside from these five campuses, there is great year-to-year variation in how many notices campuses receive. Below, for example, is a similar graph for the approximately 50 campuses receiving 0.5%-1.5% of all notices in at least one of the three years. Such year-to-year variation makes engagement much more difficult to target efficiently and much less likely to have discernible effects.

[Graph: campuses receiving 0.5%-1.5% of all notices in at least one of the three years]

Relationships

Size

All else equal, if infringement is the same across campuses and campuses take equally effective measures to prevent it from reaching the Internet, then the volume of detected infringement should generally vary with campus size. That this is only moderately the case implies that student behavior varies from campus to campus and/or that campuses’ “effectively combat” measures are different and have different effects.

Here are data for the 100 campuses receiving the most infringement notices in 2014:

[Graph: campus size versus notice volume for the 100 campuses receiving the most notices in 2014]

It appears visually that the overall correlation between campus size and notice volume is modest (and indeed r=0.29) because such a large volume of notices went to Hydrogen and Helium, which are not the largest campuses.

However, the correlation is slightly lower if those two campuses are omitted. This is because Lithium has the next highest volume, yet is of average size, and Manganese, the largest campus in the group, with over 70,000 students, had very low incidence of 2 notices per thousand students. (I’ve spoken at length with the CIO and network-security head at Manganese, and learned that its anti-infringement measures comprise a full array of policies and practices: blocking of peer-to-peer protocols at the campus border, with well-established exception procedures; active followthrough on infringement notices received; and direct outreach to students on the issue.)
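The comparison described here (a correlation computed with and without the highest-volume campuses) can be sketched as follows. The Pearson formula is standard, but the helper names and the seven data points are hypothetical and will not reproduce the r values reported above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def drop_top_by_volume(sizes, notices, k):
    """Remove the k campuses with the most notices; return the remaining pairs."""
    ranked = sorted(zip(sizes, notices), key=lambda pair: pair[1], reverse=True)
    kept = ranked[k:]
    return [s for s, _ in kept], [n for _, n in kept]

# Hypothetical data: enrollment in thousands, and notices received.
sizes = [20, 25, 15, 30, 40, 50, 70]
notices = [1100, 1050, 600, 200, 250, 350, 140]

r_all = pearson_r(sizes, notices)
kept_sizes, kept_notices = drop_top_by_volume(sizes, notices, 2)
r_trimmed = pearson_r(kept_sizes, kept_notices)
print(f"all campuses: r={r_all:.2f}; without top two: r={r_trimmed:.2f}")
```

With so few points a single campus can move r substantially, which is why it helps to check the result both ways before interpreting it.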

Residence

If students live on campus, then typically their network connection is through the campus network, their detectable infringement is attributed to the campus, and that’s where the infringement notice goes. If students live off campus, then they do not use the campus network, and infringement notices go to their ISP. This is why most infringement notices go to campuses with dorms, even though the behavior of their students probably resembles that of their nonresidential peers.

For the same reason, we might expect that residentially intensive campuses (measured by the ratio of dormitory beds to total enrollment) would have a higher incidence of detectable infringement, all else equal, than less residential campuses. Here are data for the 100 campuses receiving the most infringement notices:

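[Graph: residential intensity versus incidence of detected infringement for the 100 campuses receiving the most notices]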

The relationship is positive, as expected, and relatively strong (r=.58). It’s important, though, to remember that this relationship between campus attributes (residential intensity and the incidence of detected infringement) does not necessarily imply a relationship between student attributes such as living in dorms and distributing infringing material. Drawing inferences about individuals from data about groups is the “ecological fallacy.”

Affluence

One hears arguments that infringement varies with affluence, that is, that students with less money are more likely to infringe. There’s no way to assess that directly with these data, since they do not identify individuals. However, IPEDS campus data include the fraction of students receiving Federal grant aid, which varies inversely with income. The higher this fraction, the less affluent, on average, the student body should be. So it’s interesting to see how infringement (measured by incidence rather than volume) varies with this metric:

[Graph: fraction of students receiving Federal grant aid versus incidence of detected infringement]

The relationship is slightly negative (r=-.12), in large part because of Polonium, a small private college with few financial-aid recipients that received 83 notices per 1000 students in 2014. (Its incidence was similar in 2012, but much lower in 2013.) Even without Polonium, however, the relationship is small.

For the same reason, we might expect a greater incidence of detected infringement on less expensive campuses. The data:

[Graph: tuition versus incidence of detected infringement]

Once again the relationship is the opposite (r=.54), largely because most campuses have both low tuition and low incidence.

Campus Interactions

Following the 2012 and 2013 studies, I communicated directly with IT leaders at several campuses with especially high volumes of infringement notices. All save one (Hydrogen) of these interactions were informative, and several appear to have influenced campus policies and practices for the better.

  • Helium. Almost all of Helium’s notices are associated with a small, consecutive group of IP addresses, presumably the external addresses for a NAT-mediated campus wireless network. I learned from discussions with Helium’s CIO that the university does not retain NAT logs long enough to identify wireless users when infringement notices are received; as a result, few infringement notices reach offenders, and so they have little impact directly or indirectly. Helium apparently understands and recognizes the problem, but replacing its wireless logging systems is not a high priority project.
  • Hydrogen. Despite diverse direct, indirect, and political efforts to engage IT leaders at Hydrogen, I was never able to open discussions with them. I do not understand why the university receives so many notices (unlike Helium’s, they are not concentrated), and was therefore unable to provide advice to the campus. It is also unclear whether the notices sent to Hydrogen are associated with its small-city main campus or with its more urban branch campus.
  • Krypton. Krypton used to provide guests up to 14 days of totally unrestricted and anonymous use of its wired and wireless networks. I believe that this led to its high rate of detected infringement. More recently, Krypton implemented a separate guest wireless network, which is still anonymous but apparently is either more restricted or is routed to an external ISP. I believe that this change is why Krypton is no longer in the top 20 group in 2014. (Krypton still offers unrestricted 14-day access to its wired network.)
  • Lithium. The network-security staff at Lithium told me that there are plans to implement better filtering and blocking on their network, but that implementation has been delayed.
  • Nitrogen. Nitrogen enrolls over 50,000 students, more than almost any other campus. As I pointed out above, although Nitrogen’s infringement notice counts are substantial, they are actually relatively low when adjusted for enrollment.
  • Gallium. I discussed Gallium’s high infringement volume with its CIO in early 2013. She appeared to be surprised that the counts were so high, and that they were not all associated with Gallium affiliate campuses, as the university had previously believed. Although the CIO was noncommittal about next steps, it appears that something changed for the better.
  • Palladium. The Palladium CIO attended a Symposium I hosted in March 2013, and while there he committed to implementing better controls at the University. The CIO appears to have followed through on this commitment.
  • No Alias. Although it doesn’t appear in the graph, No Alias is an interesting story. It ranked very high in the 2012 study. NA, it turns out, provides exit connections for the Tor network, which means that some traffic that appears to originate at NA in fact originates from anonymous users elsewhere. Most of NA’s 2012 notices were associated with the Tor connections, and I suggested to NA’s security officer that perhaps No Alias might impose some modest filters on those. It appears that this may have happened, and may be why NA dropped out of the top group.

I also interacted with several other campuses that ranked high in 2013. In many of these conversations I was able to point IT staff to specific problems or opportunities, such as better configuring firewalls. Most of these campuses moved out of the top group.

And So…

The 2014 DMCA notice data reinforce earlier implications (from both data and direct interactions) for campus/industry interactions. Copyright holders should interact directly with the few institutions that rank consistently high, and with large residential institutions that rank consistently low. In addition, copyright holders should seek opportunities to better understand how best to influence student behavior, both during and after college.

Conversely, campuses that receive disproportionately many notices, and so give higher education a bad reputation with regard to copyright infringement, should consult peers at the other end of the distribution, and identify reasonable ways to improve their policies and practices.

9|4|14 gj-c

 

Revisiting IT Policy #1: Network Neutrality

The last time I wrote about network neutrality, higher education was deeply involved in the debate, especially through the Association of Research Libraries and EDUCAUSE, whose policy group I then headed. We supported a proposal by the then Federal Communications Commission (FCC) chairman, Julius Genachowski, to require public non-managed last-mile networks to transmit end-user Internet traffic neutrally.

We worried that otherwise those networks might favor commercial over intellectual content, and so make it difficult for off-campus students to access course, library, and other campus content, and for campus entities such as libraries to access content on other campuses or in central shared repositories. (The American Library Association had similar worries on behalf of public libraries and their patrons.) Almost as a footnote, we opposed so-called “paid prioritization”, an ill-defined concept, rarely implemented, but now reborn as “Internet fast lanes”.

Although courts overturned the FCC on neutrality, for the most part its key principle has held: traffic should flow across the Internet without regard for its source, its destination, or its content.

But the paid-prioritization footnote is pushing its way back into the main text. It’s doing so in a particularly arcane way, but one that may have serious implications for higher education. Understanding this requires some definitions. After addressing those (as Steve Worona points out, an excellent Wired article has even more on how the Internet, peering, and content delivery networks work), I’ll turn to current issues and higher education’s interests.

What Is Network Neutrality?

To be “neutral”, in the FCC’s earlier formulation, a network must transmit public Internet traffic equivalently without regard for its source, its destination, or its content. Public Internet traffic means traffic that involves externally accessible IP addresses. A network can discriminate on the basis of type–for example, treat streaming video differently from email. But a neutral network cannot discriminate on source, destination, or content within a given type of traffic. A network can treat special traffic such as cable TV programming or cable-based telephony–“managed services”, in the jargon–differently than regular public Internet traffic, although this is controversial since the border is murky. More controversial still, given current trends, is the exclusion of cellular wireless Internet traffic (but not WiFi) from neutrality requirements.
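One way to picture the distinction is as a rule for assigning priority to traffic. The sketch below is only an illustration of the definition, not a description of how any real network schedules packets; the `Packet` type, the priority table, and the provider names are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    source: str
    destination: str
    traffic_type: str  # for example "video" or "email"

# Hypothetical priority classes keyed only by traffic type (lower is served first).
TYPE_PRIORITY = {"video": 1, "email": 2}

def neutral_priority(packet):
    """Priority depends on traffic type alone, never on source or destination."""
    return TYPE_PRIORITY.get(packet.traffic_type, 3)

def non_neutral_priority(packet, favored_sources):
    """Counter-example: traffic of the same type is treated differently by source."""
    base = TYPE_PRIORITY.get(packet.traffic_type, 3)
    return base - 1 if packet.source in favored_sources else base

a = Packet("provider-a", "end-user", "video")
b = Packet("provider-b", "end-user", "video")

print(neutral_priority(a) == neutral_priority(b))
print(non_neutral_priority(a, {"provider-a"}) == non_neutral_priority(b, {"provider-a"}))
```

Discriminating by type is allowed under the definition; what the second function does, favoring one source within a type, is what neutrality rules out.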

Pipes

The word “transmit” is important, because it’s different from “send” and “receive”. Users connect computers, servers, phones, television sets, and other devices to networks. They choose and pay for the capacity of their connection (the “pipe”, in the usual but imperfect plumbing analogy) to send and receive network traffic. Not all pipes are the same, and it’s perfectly acceptable for a network to provide lower-quality pipes–slower, for example–to end users who pay less, and to charge customers differently depending on where they are located. But a neutral network must provide the same quality of service to those who pay for the same size, quality, and location of “pipe”.

A user who is mostly going to send and receive small amounts of text (such as email) can get by with very modest and inexpensive capacity. One who is going to view video needs more capacity, one who is going to use two-way videoconferencing needs even more, and a commercial entity that is going to transmit multiple video streams to many customers needs lots. Sometimes the capacity of connections is fixed–one pays for a given capacity regardless of whether one uses it all–and sometimes their capacity and cost adjust dynamically with use. But in all cases one is merely paying for a connection to the network, not for how quickly traffic will get to or arrive from elsewhere. That last depends on how much someone is paying at the other end, and on how well the intervening networks interconnect. Whether one can pay for service quality other than the quality of one’s own connection is central to the current debate.

Users

It’s also important to consider two different (although sometimes overlapping) kinds of users: “end users” and “providers”. In general, providers deliver services to end users, sometimes content (for example, Netflix, the New York Times, or Google Search), sometimes storage (OneDrive, Dropbox), sometimes communications (Gmail, Xfinity Connect), and sometimes combinations of these and other functionality (Office Online, Google Apps).

The key distinctions between providers and end users are scale and revenue flow. The typical provider serves thousands if not millions of end users; the typical end user uses more than a few but rarely more than a few hundred providers. End users provide revenue to providers, either directly or by being counted; providers receive revenue (or sometimes other value such as fame) from end users or advertisers, and use it to fund the services they provide.

Roles

Networks (and therefore network operators) can play different roles in transmission: “first mile”, “last mile”, “backbone”, and “peering”. Providers connect to first-mile networks. End users do the same to last-mile networks. (First-mile and last-mile networks are mirror images of each other, of course, and can swap roles, but there’s always one of each for any traffic.) Sometimes first-mile networks connect directly to last-mile networks, and sometimes they interconnect indirectly using backbones, which in turn can interconnect with other backbones. Peering is how first-mile, last-mile, and backbone networks interconnect.

To use another imperfect analogy, first-mile networks are on-ramps to backbone freeways, last-mile networks are off-ramps, and peering is where freeways interconnect. But here’s why the analogy is imperfect: sometimes providers connect directly to backbones, and sometimes first-mile and last-mile networks have their own direct peering interconnections, bypassing backbones. Sometimes, as the Wired article points out, providers pay last-mile networks to host their servers, and sometimes special content-distribution systems such as Akamai do roughly the same. Those imperfections account for much of the current controversy.

Consider how I connect the Mac on my desk in Comcast’s downtown office (where a few of us from NBCUniversal also work) to hostmonster.com, where this blog lives. I connect to the office wireless, which gives me a private (10.x.x.x) IP address. That goes to an internal (also private) router in Philadelphia, which then connects to Comcast’s public network. Comcast, as the company’s first-mile network, takes the traffic to Pennsylvania, then to Illinois, then back east to Virginia. There Comcast has a peering connection to Cogent, which is Hostmonster’s first-mile network provider. Cogent carries my traffic from Virginia to Illinois, Missouri, Colorado, and Utah, where Hostmonster is located and connects to Cogent.

If Comcast and Cogent did not have a direct connection, then my traffic would flow through a backbone such as Level3. If Hostmonster placed its servers in Comcast data centers, my traffic would be all-Comcast. As I’ll note repeatedly, this issue–how first-mile, last-mile, and backbones peer, and how content providers deal with this–is driving much of today’s network-neutrality debate. So is the increasing consolidation of the last-mile network business.
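The private (10.x.x.x) address mentioned above falls in one of the ranges reserved for private networks, which is why it cannot be reached directly from the public Internet. Python’s standard `ipaddress` module can check this; the specific addresses below are arbitrary examples, not the actual ones involved.

```python
import ipaddress

def is_private_address(text):
    """True if the address falls in a range reserved for private use."""
    return ipaddress.ip_address(text).is_private

office_address = "10.20.30.40"  # hypothetical address in the 10.x.x.x range
public_address = "8.8.8.8"      # a well-known public address, used only as an example

print(is_private_address(office_address))
print(is_private_address(public_address))

# The whole 10.x.x.x block is a single reserved network, 10.0.0.0/8.
print(ipaddress.ip_address(office_address) in ipaddress.ip_network("10.0.0.0/8"))
```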

Public/Private

“Public” networks are treated differently than “private” ones. Generally speaking, if a network is open to the general public, and charges them fees to use it, then it’s a public network. If access is mostly restricted to a defined, closed community and does not charge use fees, then it’s a private network. The distinction between public and private networks comes mostly from the Communications Assistance for Law Enforcement Act (CALEA), which took effect in 1995. CALEA required “telecommunications carriers” to assist police and other law enforcement, notably by enabling court-approved wiretaps.

Even for traditional telephones, it was not entirely clear which “telecommunications carriers” were covered–for example, what about campus-run internal telephone exchanges?–and as CALEA extended to the Internet the distinction became murkier. Eventually “open to the general public, and charges them fees” provided a practical distinction, useful beyond CALEA.

Most campus networks are private by this definition. So are my home network, the network here in the DC Comcast office, and the one in my local Starbucks. To take the roadway analogy a step further, home driveways, the extensive network of roads within gated residential communities (even a large one such as Kiawah Island), and roadways within large industrial facilities (such as US Steel’s Gary plant) are private. City streets, state highways, and Interstates are public. (Note that the meaning of “public network” in Windows, MacOS, or other security settings is different.)

Neutrality

In practice, and in most of the public debate until recently, the term “network neutrality” has meant this: except in certain narrow cases (such as illegal uses), a neutral-network operator does not prioritize traffic over the last mile to or from an end user according to the source of the traffic, who the end user is, or the content of the traffic. Note the important qualification: “over the last mile”.

An end user with a smaller, cheaper connection will receive traffic more slowly than one who pays for a faster connection, and the same is true for providers sending traffic. The difference may be more pronounced for some types of traffic (such as video) than for others (email). Other than this, however, a neutral network treats all traffic the same. In particular, the network operator does not manipulate the traffic for its own purposes (such as degrading a competitor’s service), and does not treat end users or providers differently except to the extent they pay for the speed or other qualities of their own network connections.

“Public” networks often claim to be neutral, at least to some degree; “private” ones rarely do. Most legislative and regulatory efforts to promote network neutrality focus on public networks.

Enough definition. What does this all mean for higher education, and in particular how is that meaning different from what I wrote about back in 2011?

The Rebirth of Paid Prioritization

Where once the debate centered on last-mile neutrality for Internet traffic to and from end users, which is relatively straightforward and largely accepted, it has now expanded to include both Internet and “managed services” over the full path from provider to end user, which is much more complicated and ambiguous.

An early indicator was AT&T’s proposal to let providers subsidize the delivery of their traffic to AT&T cellular-network end users, specifically by allowing providers to pay the data costs associated with their services to end users. That is, providers would pay for how traffic was delivered and charged to end users. This differs fundamentally from the principle that the service end users receive depends only on what end users themselves pay for. Since cellular networks are not required to be neutral, AT&T’s proposal violated no law or regulation, but it nevertheless triggered opposition: It implied that AT&T’s customers would receive traffic (ads, downloads, or whatever) from some providers more advantageously–that is, more cheaply–than equivalent traffic from other providers. End users would have no say in this, other than to change carriers. Thus far AT&T’s proposal has attracted few providers, but this may be changing.

Then came the running battles between Netflix, a major provider, and last-mile providers such as Comcast and Verizon. Netflix argued that end users were receiving its traffic less expeditiously than other providers’ traffic, that this violated neutrality principles, and that last-mile providers were responsible for remedying this. The last-mile providers rejected this argument: in their view the problem arose because Netflix’s first-mile network (as it happens, Cogent, the same one Hostmonster uses) was unwilling to pay for peering connections capable of handling Netflix’s traffic (which can amount to more than a quarter of all Internet traffic some evenings). In the last-mile networks’ view, Netflix’s first-mile provider was responsible for fixing the problem at its (and therefore presumably Netflix’s) expense. The issue is, who pays to ensure sufficient peering capacity? Returning to the highway metaphor, who pays for sufficient interchange ramps between toll roads, especially when most truck traffic is in one direction?

In the event, Netflix gave in, and arranged (and paid for) direct first-mile connections to Comcast, Verizon, and other last-mile providers. But Netflix continues to press its case, and its position has relevance for higher education.

Colleges and Universities

Colleges and universities have traditionally taken two positions on network neutrality. Representing end users, including their campus community and distant students served over the Internet, higher education has taken a strong position in support of the FCC’s network-neutrality proposals, and even urged that they be extended to cover cellular networks. As operators of networks funded and designed to support campuses’ instructional, research, and administrative functions, however, higher education also has taken the position that campus networks, like home, company, and other private networks, should continue to be exempted from network-neutrality provisions.

These remain valid positions for higher education to take in the current debate, and indeed the principles recently posted by EDUCAUSE and various other organizations do precisely that. But the emergence of concrete paid-prioritization services may require more nuanced positions and advocacy. This is partly because the FCC’s positions have shifted, and partly because the technology and the debate have evolved.

Why should colleges and universities care about this new network-neutrality battleground? Because in addition to representing end users and operating private networks, campuses are increasingly providing instruction to distant students over the Internet. Massive open online courses (MOOCs) and other distance-education services often involve streamed or two-way video. They therefore require high-quality end-to-end network connections.

In most cases, campus network traffic to distant students flows over the commercial Internet, rather than over Internet2 or regional research and education (R&E) networks. Whether it reaches students expeditiously depends not only on the campus’s first-mile connection (“first mile” rather than “last mile” because the campus is now a provider rather than simply representing end users), but also on how the campus’s Internet service provider connects to backbones and/or to students’ last-mile networks–and of course on whether distant students have paid for good enough connections. This is similar to Netflix’s situation.

Unlike Netflix, however, individual campuses probably cannot afford to pay for direct connections to all of their students’ last-mile networks, or to place servers in distant data centers. They thus depend on their first-mile networks’ willingness to peer effectively with backbone and last-mile networks. Yet campuses are rarely major customers of their ISPs, and therefore have little leverage to influence ISPs’ backbone and peering choices. Alternatively, campuses can in theory use their existing connections to R&E networks to deliver instruction. But this is only possible if those R&E networks peer directly and capably with key backbone and last-mile providers. R&E networks generally have not done this.

Here’s what this all means: Higher education needs to continue supporting its historical positions promoting last-mile neutrality and seeking private-network exemptions for campus networks. But colleges and universities also need to work together to make sure their instructional traffic will continue to reach distant students. One way to achieve this is by opposing paid prioritization, of course. But FCC and other regulations may permit limited paid prioritization, or technology may as usual stay one step ahead of regulation. Higher education must figure out the best ways to deal with that, and collaborate to put them into practice.

 

 

 

 

The evil that men do lives after them. The good is oft interred with their bones.

Lunch with an old friend, beautiful day in Washington, seated outdoors enjoying surprisingly excellent hamburgers. We’re going to talk about our kids, and what we’re doing this summer, and maybe even about working together on a project some day (as we did decades ago).

But as is so often the case for those of us who work in IT, first there’s a technical question about calendars on his iPhone. He’s not clear on the distinction between the iCloud calendar and the one installed by his campus IT group.

I clarify that one is personal and the other enterprise. That segues into a discussion of calendar/email/contacts services (somewhat inexplicably, his campus still uses Notes), and then into IT services and help desks.

My friend observes that his campus provides an excellent array of IT equipment, software (Notes excepted),  and services. But it also has one of those “your call will be handled by the next available representative” queuing systems on its IT help desk.

“I really hate that,” my friend says, as I swipe some of his sweet-potato fries. Because he so dislikes the queuing system, he says, he can’t think positively about his campus’s IT, no matter how good the rest of it is. The evil that men do lives after them; the good is oft interred with their bones. (Why is the Bard on my mind? Because at home we’ve been watching the excellent BBC/PBS Shakespeare Uncovered series on Netflix.)

It’s a familiar refrain. I’ve just been rereading a 1999 article with advice for new CIOs, where I had this to say:

Information technology most often succeeds when it is invisible–when people do not realize they are using it and focus on larger goals. When you and your staff do things right, even spectacularly, no one will notice. This is immensely frustrating. The only comments you are ever going to hear–from the big bosses, from faculty, from staff, from the student newspaper–will be negative, sometimes vitriolically so. This will drive you crazy. No one outside IT at the institution will sympathize.

We like to think this is peculiar to IT. It isn’t.

Case in point: Registrars. During my tenure at the University of Chicago, we replaced an old terminal-based student system for staff only with a highly flexible, modern web-based system directly accessible by students, faculty, and staff. Students used to wait in line to give their class choices to Registrar clerks, who would then set class lists and enter data in the system manually. Grading, transcripts, and other processes were similar. No one was happy except the Registrar, whose staff and budget necessarily remained large.

The new system (now-defunct SCT‘s now-defunct Matrix product) changed everything: no more waiting in line, simpler scheduling, later deadlines for grades, online transcript requests, you name it. Asked about specifics, almost everyone described almost everything as better.

But no one seemed to feel any better about the University than they had before.

At lunch, my friend pointed to this apparent conundrum as an interesting parallel to “two-factor theory,” the suggestion by Frederick Herzberg that job satisfaction and job dissatisfaction are independent of each other. The Registrar’s customers were less dissatisfied, but that did not mean they were more satisfied.

Messier case in point: Business travel. Time was, one made business-travel arrangements by calling (or having one’s assistant call) a travel agency or travel office to make reservations and get a travel advance, and one accounted for the advance and/or got reimbursed for out-of-pocket expense by filling out (or having one’s assistant fill out) a form, attaching paper receipts to it, mailing it somewhere, and eventually receiving a check.

Today it’s much more typical to make one’s own reservations through an employer-provided website, to pay expenses with a credit card that charges the employer directly, to account for expenses through the same dedicated website, and to have any reimbursement deposited directly. This all goes much faster, and is much more cost-effective for the employer.

For those of us who like rolling our own, it’s also much more appealing. But for those who don’t, and who don’t have assistants, it’s more awkward and burdensome.

We implemented a modern travel system (Concur) while I was at UChicago. I know anecdotally that most users liked its speed and convenience, but the public reaction consisted largely of complaints (most of which really weren’t about the travel system, but rather about the loss of departmental secretaries as the University did away with them in favor of centralized clerical support).

Coincidentally, my current employer switched to Concur from a paper-based system shortly before I arrived, and I observe the same pattern: widespread private appreciation completely overwhelmed by isolated objection (much of which is actually about changes in policy, such as having to justify non-preferred hotels, rather than the system itself).

What to do? For the most part we can’t use Mark Antony’s technique: through sarcasm (“Brutus is an honourable man“–imagine the air quotes), he discredits assertions of Caesar’s evil. However, it’s unwise for us to treat our customers’ complaints sarcastically.

Rather, a principal strategy for those of us in domains where dissatisfaction automatically overwhelms satisfaction must be to minimize the former. For example, I wrote,

One way to gain unproductive visibility is by unnecessarily constraining choice. To avoid this, wherever possible use carrots rather than sticks to encourage standardization, so that homogeneity is the product of aggregated free choice rather than central mandate… Try to keep institutional options open. Avoid strategies, vendors, architectures, and technologies that constrain choice. Seek interoperability. Wherever possible, have spillover vendors… Think carefully ahead about likely small disasters, many of which are caused by backhoes doing minor excavation, contractors oblivious to wiring closets, incompetent hacking, vandalism, or broken pipes.

But although minimizing unproductive visibility is important, it’s not enough. Mark Antony didn’t rely entirely on discrediting Brutus; he also cited Caesar’s good:

He was my friend, faithful and just to me… He hath brought many captives home to Rome, whose ransoms did the general coffers fill… When that the poor have cried, Caesar hath wept…

Mark Antony understood that discrediting Brutus and extolling Caesar aren’t the same thing. But it was necessary for him to do the former in order to succeed at the latter.

So let it be with IT. We need to recognize more explicitly that maximizing the good things we in IT do to satisfy our customers and campuses (or other organizations) is important, but those good things are different from and do not counterbalance the unproductively visible ways we dissatisfy them.

Notes From (or is it To?) the Dark Side

“Why are you at NBC?,” people ask. “What are you doing over there?,” too, and “Is it different on the dark side?” A year into the gig seems a good time to think about those. Especially that “dark side” metaphor.  For example, which side is “dark”?

This is a longer-than-usual post. I’ll take up the questions in order: first Why, then What, then Different.

Why are you at NBC?

This is the first time I’ve worked at a for-profit company since, let’s see, the summer of 1967: an MIT alumnus arranged an undergraduate summer job at Honeywell‘s Mexico City facility. Part of that summer I learned a great deal about the configuration and construction of custom control panels, especially for big production lines. I think of this every time I see photos of big control panels, such as those at older nuclear plants–I recognize the switch types, those square toggle buttons that light up. (Another part of the summer, after the guy who hired me left and no one could figure out what I should do, I made a 43½-foot paper-clip chain.)

One nice Honeywell perk was an employee discount on a Pentax 35mm SLR with 40mm and 135mm lenses, which I still have in a box somewhere, and which still works when I replace the camera’s light-meter battery. (The Pentax brand belonged to Honeywell back then, not Ricoh.) Excellent camera, served me well for years, through two darkrooms and a lot of Tri-X film. I haven’t used it since I began taking digital photos, though.

I digress. Except, it strikes me, not really. One interesting thing about digital photos, especially if you store them online and make most of them publicly visible (like this one, taken on the rim of spectacular Bryce Canyon, from my Backdrops collection), is that sometimes the people who find your pictures download them and use them for their own purposes. My photos carry a Creative Commons license specifying that although they are my intellectual property, they can be used for nonprofit purposes so long as they are attributed to me (an option not available, apparently, if I post them on Facebook instead).

So long as those who use my photos comply with the CC license requirement, I don’t require that they tell me, although now and then they do. But if people want to use one of my photos commercially, they’re supposed to ask my permission, and I can ask for a use fee. No one has done that for me—I’m keeping the day job—but it’s happened for our son.

I hadn’t thought much about copyright, permissions, and licensing for personal photos (as opposed to archival, commercial, or institutional ones) back when I first began dealing with “takedown notices” sent to the University of Chicago under the Digital Millennium Copyright Act (DMCA). There didn’t seem to be much of a parallel between commercialized intellectual property, like the music tracks that accounted for most early DMCA notices, and my photos, which I was putting online mostly because it was fun to share them.

Neither did I think about either photos or music while serving on a faculty committee rewriting the University’s Statute 18, the provision governing patents in the University’s founding documents.

The issues for the committee were fundamentally two, both driven somewhat by the evolution of “textbooks”.

First, where is the line between faculty inventions, which belong to the University (or did at the time), and creations, which belong to creators—between patentable inventions and copyrightable creations, in other words? This was an issue because textbooks had always been treated as creations, but many textbooks had come to include software (back then, CDs tucked into the back cover), and software had always been treated as an invention.

Second, who owns intellectual property that grows out of the instructional process? Traditionally, the rights and revenues associated with textbooks, even textbooks based on University classes, belonged entirely to faculty members. But some faculty members were extrapolating this tradition to cover other class-based material, such as videos of lectures. They were personally selling those materials and the associated rights to outside entities, some of which were in effect competitors (in some cases, they were other universities!).

As you can see by reading the current Statute 18, the faculty committee really didn’t resolve any of this. Gradually, though, it came to be understood that textbooks, even textbooks including software, were still faculty intellectual property, whereas instructional material other than that explicitly included in traditional textbooks was the University’s to exploit, sell, or license.

With the latter well established, the University joined Fathom, one of the early efforts to commercialize online instructional material, and put together some excellent online materials. Unfortunately, Fathom, like its first-generation peers, failed to generate revenues exceeding its costs. Once it blew through its venture capital, which had mostly come from Columbia University, Fathom folded. (Poetic justice: so did one of the profit-making institutions whose use of University teaching materials prompted the Statute 18 review.)

Gradually this all got me interested in the thicket of issues surrounding campus online distribution and use of copyrighted materials and other intellectual property, and especially the messy question of how campuses should think about copyright infringement occurring within and distributed from their networks. The DMCA had established the dual principles that (a) network operators, including campuses, could be held liable for infringement by their network users, but (b) they could escape this liability (find “safe harbor”) by responding appropriately to complaints from copyright holders. Several of us research-university CIOs worked together to develop efficient mechanisms for handling and responding to DMCA notices, and to help the industry understand those and the limits on what they might expect campuses to do.

As one byproduct of that, I found myself testifying before a Congressional committee. As another, I found myself negotiating with the entertainment industry, under US Education Department auspices, to develop regulations implementing the so-called “peer to peer” provisions of the Higher Education Opportunity Act of 2008.

That was one of several threads that led to my joining EDUCAUSE in 2009. One of several initiatives in the Policy group was to build better, more open communications between higher education and the entertainment industry with regard to copyright infringement, DMCA, and the HEOA requirements.

I didn’t think at the time about how this might interact with EDUCAUSE’s then-parallel efforts to illuminate policy issues around online and nontraditional education, but there are important relevancies. Through massively open online courses (MOOCs) and other mechanisms, colleges and universities are using the Internet to reach distant students, first to build awareness (in which case it’s okay for what they provide to be freely available) but eventually to find new revenues, that is, to monetize their intellectual property (in which case it isn’t).

If online campus content is to be sold rather than given away, then campuses face the same issues as the entertainment industry: They must protect their content from those who would use it without permission, and take appropriate action to deter or address infringement.

Campuses are generally happy to make their research freely available (except perhaps for inventions), as UChicago’s Statute 18 makes clear, provided that researchers are properly credited. (I also served on UChicago’s faculty Intellectual Property Committee, which among other things adjudicated who-gets-credit conflicts among faculty and other researchers.) But instruction is another matter altogether. If campuses don’t take this seriously, I’m afraid, then as goes music, so goes online higher education.

Much as campus tumult and changes in the late Sixties led me to abandon engineering for policy analysis, and quantitative policy analysis led me into large-scale data analysis, and large-scale data analysis led me into IT, and IT led me back into policy analysis, intellectual-property issues led me to NBCUniversal.

I’d liked the people I met during the HEOA negotiations, and the company seemed seriously committed to rethinking its relationships with higher education. I thought it would be interesting, at this stage in my career, to do something very different in a different kind of place. Plus, less travel (see screwup #3 in my 2007 EDUCAUSE award address).

So here I am, with an office amidst lobbyists and others who focus on legislation and regulation, with a Peacock ID card that gets me into the Universal lot, WRC-TV, and 30 Rock (but not SNL), and with a 401k instead of a 403b.

What are you doing over there?

NBCUniversal’s goals for higher education are relatively simple. First, it would like students to use legitimate sources to get online content more, and illegitimate “pirate” sources less. Second, it would like campuses to reduce the volume of infringing material made available from their networks to illegal downloaders worldwide.

My roles are also two. First, there’s eagerness among my colleagues (and their counterparts in other studios) to better understand higher education, and how campuses might think about issues and initiatives. Second, the company clearly wants to change its approach to higher education, but doesn’t know what approaches might make sense. Apparently I can help with both.

To lay foundation for specific projects—five so far, which I’ll describe briefly below—I looked at data from DMCA takedown notices.

Curiously, it turned out, no one had done much to analyze detected infringement from campus networks (as measured by DMCA notices sent to them), or to delve into the ethical puzzle: Why do students behave one way with regard to misappropriating music, movies, and TV shows, and very different ways with regard to arguably similar options such as shoplifting or plagiarism? I’ve written about some of the underlying policy issues in Story of S, but here I decided to focus first on detected infringement.

It turns out that virtually all takedown notices for music are sent by the Recording Industry Association of America, RIAA (the Zappa Trust and various other entities send some, but they’re a drop in the bucket).

Most takedown notices for movies and some for TV are sent by the Motion Picture Association of America, MPAA, on behalf of major studios (again, with some smaller entities such as Lucasfilm wading in separately). NBCUniversal and Fox send out notices involving their movies and TV shows.

I’ve now analyzed data from the major senders for both a twelve-month period (Nov 2011-Oct 2012) and a more recent two-month period (Feb-Mar 2013). For the more recent period, I obtained very detailed data on each of 40,000 or so notices sent to campuses. Here are some observations from the data:

  • Almost all the notices went to 4-year campuses that have at least 100 dormitory beds (according to IPEDS). To a modest extent, the bigger the campus the more notices, but the correlation isn’t especially large.
  • Over half of all campuses—even of campuses with dorms—didn’t get any notices. To some extent this is because there are lots and lots of very small campuses, and they fly under the infringement-detection radar. But I’ve learned from talking to a fair number of campuses that, much to my surprise, many heavily filter or even block peer-to-peer traffic at their commodity Internet border firewall—usually because the commodity bandwidth p2p uses is expensive, especially for movies, rather than to deal with infringement per se. Outsourced dorm networks also have an effect, but I don’t think they’re sufficiently widespread yet to explain the data.
  • Several campuses have out-of-date or incorrect “DMCA agent” addresses registered at the Library of Congress. Compounding that, it turns out some notice senders use “abuse” or other standard DNS addresses rather than the registered agent addresses.
  • Among campuses that received notices, a few campuses stand out for receiving the lion’s share, even adjusting for their enrollment. For example, the top 100 or so recipient campuses got about three quarters of the total, and a handful of campuses stand out sharply even within that group: the top three campuses (the leftmost blue bars in the graph below) accounted for well over 10% of the notices. (I found the same skewness in the 2012 study.) With a few interesting exceptions (interesting because I know or suspect what changed), the high-notice groups have been the same for the two periods.
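The concentration described in the last observation is simple to measure. Here is a minimal sketch, assuming notice counts are available per campus; the function name `top_share`, the campus names, and the counts are all invented for illustration and are not the actual data.

```python
def top_share(notice_counts, k):
    """Fraction of all notices received by the k campuses with the most notices."""
    ranked = sorted(notice_counts.values(), reverse=True)
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


# Hypothetical counts, made up purely to show the calculation.
counts = {
    "Campus A": 500,
    "Campus B": 300,
    "Campus C": 100,
    "Campus D": 60,
    "Campus E": 40,
}

print(f"Top 2 campuses: {top_share(counts, 2):.0%} of notices")
```

With real data, comparing `top_share` for a few values of `k` across the two periods would show whether the same skew persists.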

The detection process, in general, is that copyright holders choose a list of music, movie, or TV titles they believe likely to be infringed. Their contractors then use BitTorrent tracker sites and other user tools to find illicit sources for those titles.

For the most part the studios and associations simply look for titles that are currently popular in theaters or from legitimate sources. It’s hard to see that process introducing a bias that would affect some campuses so much differently than others. I’ve also spent considerable time looking at how a couple of contractors verify that titles being offered illicitly (that is, listed for download on a BitTorrent tracker site such as The Pirate Bay) are actually the titles being supplied (rather than, say, malware, advertising, or porn), and at how they figure out where to send the resulting takedown notices. That process too seems pretty straightforward and unbiased.

Sender choices clearly can influence how notice counts vary from time to time: for example, adding a newly popular title to the search list can lead to a jump in detections and hence notices. But it’s hard to see how the choice of titles would influence how notice counts vary from institution to institution.

This all leads me to believe that takedown notices tell us something incomplete but useful about campus policies and practices, especially at the extremes. The analysis led directly to two projects focused on specific groups of campuses, and indirectly to three others.

Role Model Campuses

Based on the results of the data analysis, I communicated individually with CIOs at 22 campuses that received some but relatively few notices: specifically, campuses that (a) received at least one notice (and so are on the radar) but (b) fewer than 300 and fewer than 20 per thousand student headcount, (c) have at least 7,500 headcount students, and (d) have at least 10,000 dorm beds (per IPEDS) or sufficient dorm beds to house half their headcount. (These are Group 4, the purple bars in the graph below. The solid bars represent total notices sent, and the hollow bars represent incidence, or notices per thousand headcount students.)
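For readers who want to apply the same screen to their own data, the four criteria translate directly into a filter. This is only a sketch: the `Campus` record, its field names, and the sample campuses are hypothetical, and the thresholds are simply the ones stated above.

```python
from dataclasses import dataclass


@dataclass
class Campus:
    name: str
    notices: int    # takedown notices received in the period
    headcount: int  # student headcount
    dorm_beds: int  # dormitory beds (for example, as reported to IPEDS)


def is_role_model_candidate(c: Campus) -> bool:
    """Apply criteria (a) through (d) from the text to one campus."""
    if c.headcount <= 0:
        return False
    per_thousand = 1000 * c.notices / c.headcount
    return (
        c.notices >= 1              # (a) at least one notice, so on the radar
        and c.notices < 300         # (b) few notices in total...
        and per_thousand < 20       #     ...and few per thousand headcount
        and c.headcount >= 7500     # (c) at least 7,500 headcount students
        and (c.dorm_beds >= 10000 or c.dorm_beds >= c.headcount / 2)  # (d)
    )


# Invented campuses, for illustration only.
campuses = [
    Campus("Example U", notices=45, headcount=12000, dorm_beds=6500),
    Campus("Quiet College", notices=0, headcount=9000, dorm_beds=5000),
    Campus("Busy State", notices=450, headcount=30000, dorm_beds=16000),
    Campus("Commuter U", notices=40, headcount=20000, dorm_beds=3000),
]

print([c.name for c in campuses if is_role_model_candidate(c)])
# prints ['Example U']
```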

I’ve asked each of those campuses whether they’d be willing to document their practices in an open “role models” database developed jointly by the campuses and hosted by a third party such as a higher-education association (as EDUCAUSE did after the HEOA regulations took effect). The idea is to make a collection of diverse effective practices available to other campuses that might want to enhance their practices.

High Volume Campuses

Separately, I communicated privately with CIOs at 13 campuses that received exceptionally many notices, even adjusting for their enrollment (Group 1, the blue bars in the graph). I’ve looked in some detail at the data for those campuses, some large and some small, and in some cases that’s led to suggestions.

For example, in a few cases I discovered that virtually all of a high-volume campus’s notices were split evenly among a small number of consecutive IP addresses. In those cases, I’ve suggested that those IP addresses might be the front-end to something like a campus wireless network. Filtering or blocking p2p (or just BitTorrent) traffic on those few IP addresses (or the associated network devices) might well shrink the campus’s role as a distributor without affecting legitimate p2p or BitTorrent users (who tend to be managing servers with static addresses).
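A campus could check its own notices for this pattern with something like the following sketch. It assumes each notice has already been reduced to the IPv4 address it cites; the function names, the thresholds (`min_share`, `max_run`), and the sample addresses (drawn from the 192.0.2.0/24 documentation range) are assumptions for illustration, not part of any contractor's or campus's actual tooling.

```python
from collections import Counter
from ipaddress import ip_address


def consecutive_runs(notice_ips):
    """Group cited addresses into runs of consecutive IPs.

    Returns (first_ip, last_ip, run_length, share_of_notices) tuples.
    """
    counts = Counter(int(ip_address(ip)) for ip in notice_ips)
    total = sum(counts.values())
    runs, current = [], []
    for addr in sorted(counts):
        if current and addr != current[-1] + 1:
            runs.append(current)
            current = []
        current.append(addr)
    if current:
        runs.append(current)
    return [
        (
            str(ip_address(run[0])),
            str(ip_address(run[-1])),
            len(run),
            sum(counts[a] for a in run) / total,
        )
        for run in runs
    ]


def flag_concentration(notice_ips, min_share=0.8, max_run=8):
    """Short runs of consecutive addresses that account for most notices."""
    return [
        run for run in consecutive_runs(notice_ips)
        if run[2] <= max_run and run[3] >= min_share
    ]


# Invented sample: 90 of 100 notices cite three adjacent addresses.
sample = (
    ["192.0.2.10"] * 30 + ["192.0.2.11"] * 31 + ["192.0.2.12"] * 29
    + ["192.0.2.77"] * 5 + ["192.0.2.200"] * 5
)
print(flag_concentration(sample))
```

A flagged run is only a hint that the addresses might front a NAT or wireless gateway; confirming that still takes a look at the campus network itself.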

Symposia

Back when I was at EDUCAUSE, we worked with NBCUniversal to host a DC meeting between senior campus staff from a score of campuses nationwide and some industry staff closely involved with the detection and notification for online infringement. The meeting was energetic and frank, and participants from both sides went away with a better sense of the other’s bona fides and seriousness. This was the first time campus staff had gotten a close look at the takedown-notice process since a Common Solutions Group meeting in Ann Arbor some years earlier; back then the industry’s practices were much less refined.

Based on the NBCUniversal/EDUCAUSE experience, we’re organizing a series of regional “Symposia” along these lines on campuses in various cities across the US. The objectives are to open new lines of communication and to build trust. The invitees are IT and student-affairs staff from local campuses, plus several representatives from industry, especially the groups that actually search for infringement on the Internet. The first was in New York, the second in Minneapolis, the third will be in Philadelphia, and others will follow in the West, the South, and elsewhere in the Midwest.

Research

We’re funding a study within a major state university system to gather two kinds of data. Initially the researchers are asking each campus to describe the measures it takes to “effectively combat” copyright infringement: its communications with students, its policies for dealing with violations, and the technologies it uses. The data from the first phase will help enhance a matrix we’ve drafted outlining the different approaches taken by different campuses, complementing what will emerge from the “role models” project.

Based on the initial data, the researchers and NBCUniversal will choose two campuses to participate in the pilot phase of the Campus Online Entertainment Initiative (which I’ll describe next). In advance of that pilot, the researchers will gather data from a sample of students on each campus, asking about their attitudes toward and use of illicit and legitimate online sources for music, movies, and video. They’ll then repeat that data collection after the pilot term.

Campus Online Entertainment Initiative

Last but least in neither ambition nor complexity, we’re crafting a program that will attempt to address both goals I listed earlier: encouraging campuses to take effective steps to reduce distribution of infringing material from their networks, and helping students to appreciate (and eventually prefer) legitimate sources for online entertainment.

Working with Universal Studios and some of its peers, we’ll encourage students on participating campuses to use legitimate sources by making a wealth of material available coherently and attractively–through a single source that works across diverse devices, and at a substantial discount or with similar incentives.

Participating campuses, in turn, will maintain or implement policies and practices likely to shrink the volume of infringing material available from their networks. In some cases the participating campuses will already be like those in the “role models” group; in others they’ll be “high volume” or other campuses willing to  adopt more effective practices.

I’m managing these projects from NBCUniversal’s Washington offices, but with substantial collaboration from company colleagues here, in Los Angeles, and in New York; from Comcast colleagues in Philadelphia; and from people in other companies. Interestingly, and to my surprise, pulling this all together has been much like managing projects at a research university. That’s a good segue to the next question.

Is it different on the dark side?

Newly hired, I go out to WRC, the local NBC affiliate in Washington, to get my NBCUniversal ID and to go through HR orientation. Initially it’s all familiar: the same ID photo technology, the same RFID keycard, the same ugly tile and paint on the hallways, the same tax forms to be completed by hand.

But wait: Employee Relations is next door to the (now defunct) Chris Matthews Show. And the benefits part of orientation is a video hosted by Jimmy Fallon and Brian Williams. And there’s the possibility of something called a “bonus”, whatever that is.

Around my new office, in a spiffy modern building at 300 New Jersey Avenue, everyone seems to have two screens. That’s just as it was in higher-education IT. But wait: here one of them is a TV. People watch TV all day as they work.

Toto, we’re not in higher education any more.

It’s different over here, and not just because there’s a beautiful view of the Capitol from our conference rooms. Certain organizational functions seem to work better, perhaps because they should and in the corporate environment can be implemented by decree: HR processes, a good unified travel arrangement and expense system, catering, office management. Others don’t: there’s something slightly out of date about the office IT, especially the central/individual balance and security, and there’s an awful lot of paper.

Some things are just different, rather than better or not: the culture is heavily oriented to face-to-face and telephone interaction, even though it’s a widely distributed organization where most people are at their desks most of the time. There’s remarkably little email, and surprisingly little use of workstation-based videoconferencing. People dress a bit differently (a maitre d’ told me, “that’s not a Washington tie”).

But differences notwithstanding, mostly things feel much the same as they did at EDUCAUSE, UChicago, and MIT.

Where I work is generally happy, people talk to one another, gossip a bit, have pizza on Thursdays, complain about the quality of coffee, and are in and out a lot. It’s not an operational group, and so there’s not the bustle that comes with that, but it’s definitely busy (especially with everyone around me working on the Comcast/Time Warner Cable merger). The place is teamly, in that people work with one another based on what’s right substantively, and rarely appeal to authority to reach decisions. Who trusts whom seems at least as important as who outranks whom, or whose boss is more powerful. Conversely, it’s often hard to figure out exactly how to get something done, and lots of effort goes into following interpersonal networks. That’s all very familiar.

I’d never realized how much like a research university a modern corporation can be. Where I work is NBCUniversal, which is the overarching corporate umbrella (“Old Main”, “Mass Hall”, “Building 10”, “California Hall”, “Boulder”) for 18 other companies including news, entertainment, Universal Studios, theme parks, the Golf Channel, and Telemundo (which are remarkably like schools and departments in their varied autonomy).

Meanwhile NBCUniversal is owned by Comcast—think “System Central Office”. Sure, these are all corporate entities, and they have concrete metrics by which to measure success: revenue, profit, subscribers, viewership, market share. But the relationships among organizations, activities, and outcomes aren’t as coherent and unitary as I’d expected.

Dark or Green?

So, am I on the dark side, or have I left it behind for greener pastures? Curiously, I hear both from my friends and colleagues in higher education: Some of them think my move is interesting and logical, some think it odd and disappointing. Curiouser still, I hear both from my new colleagues in the industry: Some think I was lucky to have worked all those decades in higher education, while others think I’m lucky to have escaped. None of those views seems quite right, and none seems quite wrong.

The point, I suppose, is that simple judgments like “dark” and “greener” underrepresent the complexity of organizational and individual value, effectiveness, and life. Broad-brush characterizations, especially characterizations embodying the ecological fallacy, “…the impulse to apply group or societal level characteristics onto individuals within that group,” do none of us any good.

It’s so easy to fall into the ecological-fallacy trap; so important, if we’re to make collective progress, not to.

Comments or questions? Write me: greg@gjackson.us

(The quote is from Charles Ess & Fay Sudweeks, Culture, technology, communication: towards an intercultural global village, SUNY Press 2001, p 90. Everything in this post, and for that matter all my posts, represents my own views, not those of my current or past employers, or of anyone else.)

3|5|2014 11:44a est

Perceived Truths as Policy Paradoxes

The quote I was going to use to introduce this topic, “You’re entitled to your own opinion, but not to your own facts,” itself illustrates my theme for today: that truths are often less than well founded, and so can turn policy discussions weird.

I’d always heard the quote attributed to Pat Moynihan, an influential sociologist who co-wrote Beyond the Melting Pot with Nathan Glazer, directed the MIT-Harvard Joint Center for Urban Studies shortly before I worked there (and left behind a closet full of Scotch, which stemmed from his perhaps apocryphal rule that no meeting extend beyond 4pm without a bottle on the table), and later served as a widely respected Senator from New York. The collective viziers of Wikipedia have found other attributions for the quote, however. (This has me once again looking for the source of “There go my people, I must go join them, for I am their leader,” supposedly Mahatma Gandhi but apparently some French general; but I digress.) The quote will need to stand on its own.

Here’s the Scott Jaschik item from Inside Higher Ed that triggered today’s Rumination:

A new survey from ACT shows the continued gap between those who teach in high school and those who teach in college when it comes to their perceptions of the college preparation of today’s students. Nearly 90 percent of high school teachers told ACT that their students are either “well” or “very well” prepared for college-level work in their subject area after leaving their courses. But only 26 percent of college instructors reported that their incoming students are either “well” or “very well” prepared for first-year credit-bearing courses in their subject area. The percentages are virtually unchanged from a similar survey in 2009.

This is precisely what Moynihan (or whoever) had in mind: two parties to an important discussion each bearing their own data, and therefore unable to agree on the problem or how to address it. The teachers presumably think the professors have unreasonable expectations, or don’t work very hard to bring their students along; the professors presumably think the teachers aren’t doing their job. Each side therefore believes the problem lies on the other, and has data to prove that. Collaboration is unlikely, progress ditto. This is what Moynihan had observed about the federal social policy process.

The ACT survey reminded me of a similar finding that emerged back when I was doing college-choice research. I can’t locate a citation, but I recall hearing about a study that surveyed students who had been admitted to several different colleges.

The clever wrinkle in the study was that each student received several different survey queries, each purporting to be from one of the colleges to which he or she had been admitted, and each asking about the reasons for accepting or declining the admission offer. Here’s what they found: students told the institution they’d accepted that the reason was excellent academic quality, but they told the institutions they’d declined that the reason was better financial aid from the one they’d accepted.

More recently, I was talking to a colleague in another media company who was concerned about the volume of copyright infringement on a local campus. According to the company, the campus was hosting a great deal of copyright infringement, as measured by the volume of requests for infringing material being sent out by BitTorrent. But according to the campus, a scan of the campus network identified very few hosts running the peer-to-peer applications. The colleague thought the campus was blowing smoke; the campus thought the company’s statistics were wrong.

Although these three examples seem similar — parties disagreeing about facts — in fact they’re a bit different.

  • In the teacher/professor example, the different conclusions presumably stem from different (and unshared) definitions of “prepared for college-level work”.
  • In the accept/decline example, the different explanations possibly stem from students’ not wanting to offend the declined institution by questioning its quality, or wanting to think of their actual choice as good rather than cheap.
  • In the infringement/application case, the different explanations stem from divergent metrics.

We’ve seen similar issues arise around institutional attributes in higher education. Do ratings like those from US News & World Report gather their own data, for example, or rely on presumably neutral sources such as the National Center for Education Statistics? This is critical where results have major reputational effects: consider George Washington University’s inflation of class-rank admissions data, and similar earlier issues with Claremont McKenna, Emory, Villanova, and others.

I’d been thinking about this because in my current job it’s quite important to understand patterns of copyright infringement on campuses. It would be good to figure out which campuses seem to have relatively low infringement rates, and to explore and document their policies and practices so that other campuses might benefit. For somewhat different reasons, it would be good to figure out which campuses seem to have relatively high infringement rates, so that they could be encouraged to adopt different policies and practices.

But here we run into the accept/decline problem. If the point of data collection is to identify and celebrate effective practice, there are lots of incentives for campuses to participate. But if the point is to identify and pressure less effective campuses, the incentives run the other way.

Compounding the problem, there are different ways to measure the problem:

  • One can rely on externally generated complaints, whose volume can vary for reasons having nothing to do with the volume of infringement,
  • one can rely on internal assessments of network traffic, which can be inadvertently selective, and/or
  • one can rely on external measures such as the volume of queries to known sources of infringement.

I’m sure there are others (and that’s without getting into the religious wars about copyright, middlemen, and so forth that I addressed in an earlier post).

There’s no full solution to this problem. But there are two things that help: collaboration and openness.

  • By “collaboration,” I mean that parties to questions of policy or practice should work together to define and ideally collect data; that way, arguments can focus on substance.
  • By “openness,” I mean that wherever possible raw data, perhaps anonymized, should accompany analysis and advocacy based on those data.

As an example of what this means, here are some thoughts for one of my upcoming challenges: figuring out how to identify campuses that might be models for others to follow, and also campuses that should probably follow them. Achieving this is important, but improperly done it can easily come to resemble the “top 25” lists from RIAA and MPAA that became so controversial and counterproductive a few years ago. The “top 25” lists became controversial partly because their methodology was suspect, partly because the underlying data were never available, and partly because they ignored the other end of the continuum, that is, institutions that had somehow managed to elicit very few Digital Millennium Copyright Act (DMCA) notices.

It’s clear there are various sources of data, even without internal access to campus network data:

  • counts of DMCA notices sent by various copyright holders (some of which send notices methodically, following reasonably robust and consistent procedures, and some of which don’t),
  • counts of queries involving major infringing sites, and/or
  • network volume measures for major infringing protocols.

Those last two yield voluminous data, and so usually require sampling or data reduction of some kind. And not all queries or protocols they follow involve infringement. It’s also clear, from earlier studies, that there’s substantial variation in these counts over time and even across similar campuses.

This means it will be important for my database, if I can create one, to include several different measures, especially counts from different sources for different materials, and to do that over a reasonable period of time. Integrating all this into a single dataset will require lots of collaboration among the providers. Moreover, the raw data necessarily will identify individual institutions, and releasing them that way would probably cause more opposition than support. Clumping them all together would bypass that problem, but also cover up important variation. So it makes much more sense to disguise rather than clump — that is, to identify institutions by a code name and enough attributes to describe them but not to identify them.
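To make “disguise rather than clump” concrete, here is a minimal sketch in Python, not a finished design. It assumes each campus record carries a name, a couple of descriptive attributes, and several measures from different sources; the field names, the size bands, the `Campus-001` code format, and all the sample numbers are hypothetical and invented purely for illustration.

```python
import random

# Hypothetical records: every name and number here is made up for illustration.
SAMPLE = [
    {"name": "Example State University", "control": "public", "enrollment": 31000,
     "measures": {"dmca_notices": 120, "site_queries": 5400}},
    {"name": "Sample College", "control": "private", "enrollment": 2400,
     "measures": {"dmca_notices": 3, "site_queries": 210}},
    {"name": "Placeholder Tech", "control": "private", "enrollment": 11000,
     "measures": {"dmca_notices": 45, "site_queries": 1900}},
]


def size_band(enrollment):
    """Coarsen exact enrollment into a broad band (thresholds are assumptions)."""
    if enrollment < 5000:
        return "small"
    if enrollment < 20000:
        return "medium"
    return "large"


def disguise(records, seed=None):
    """Return (disguised_rows, private_key).

    disguised_rows keep every measure but replace the campus name with a
    code name and the exact enrollment with a broad band.
    private_key maps code names back to real names and is never released.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)  # code numbers should not reveal the input order
    rows, key = [], {}
    for i, rec in enumerate(shuffled, start=1):
        code = f"Campus-{i:03d}"
        key[code] = rec["name"]
        rows.append({
            "code": code,
            "control": rec["control"],
            "size": size_band(rec["enrollment"]),
            "measures": dict(rec["measures"]),  # copy, so the originals stay untouched
        })
    return rows, key


if __name__ == "__main__":
    rows, _key = disguise(SAMPLE, seed=2014)
    for row in rows:
        print(row)
```

The per-campus variation survives, which clumping would erase, while the names stay in a key held privately. One caution: with only a handful of campuses in a given band, even coarse attributes can point to a single institution, so the bands would need checking against the real population before anything is shared.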

It’ll then be important to be transparent: to lay out the detailed methodology used to “rank” campuses (as, for example, US News now does), and to share the disguised data so others can try different methodologies.

At a more general level, what I draw from the various examples is this: If organizations are to set policy and frame practice based on data (to become “data-driven organizations,” in the current parlance), then they must put serious effort into the source, quality, and accessibility of data. That’s especially true for “big data,” even though many current “big data” advocates wrongly believe that volume somehow compensates for quality.

If we’re going to have productive debates about policy and practice in connection with copyright infringement or anything else, we need to listen to Moynihan: To have our own opinions, but to share our data.

Story of S, and the Mythology of the Lost Generation

Dinner talk turned from Argo and Zero Dark Thirty to movies more generally. A 21-year-old college senior (I’ll call her “S”) recognized most of the films we were discussing. She had seen several, but others she hadn’t, which was a bit surprising, since S was an arts major, wanted to be a screenwriter, and was enthusiastic about her first choice for graduate school: the screenwriting program at a major California institution focused on the movie industry.

S had older brothers in the movie business, and she already had begun writing. What she needed, S said, was broader and deeper exposure to what made good screenplays. Graduate school would provide “deeper.” Her plan for “broader” was to watch as many well-regarded classics as possible, and apparently we were helping her map out that strategy.

But many of the films she wanted to see weren’t available on cable in her dormitory, even as pay-per-view. “Buying” or “renting” them online she found too expensive and awkward, especially given the number of films she wanted to see. So S was doing what unfortunately many students (and others) do: looking for movies on the Internet, and then streaming or downloading the least expensive version she could find. Since S’s college dormitory provided good Internet connectivity, S used that to download or stream her movies. Usually, she said, the least expensive version was an unauthorized copy, a so-called “pirate” version.

Some of us challenged her: Didn’t S realize that downloading or streaming “pirated” copies was against the law? Was she not concerned about the possible consequences? As a budding screenwriter, would she want others to do as she was doing, and deprive her of royalties? Didn’t it just seem wrong to take something without the owner’s permission?

S listened carefully—she was pretty sharp—but she didn’t seem convinced. Indeed, she seemed to feel that her choice to use unauthorized copies was reasonable, given the limited and unsatisfactory alternatives provided by the movie industry.

In so believing, S was echoing the persistent mythology of the lost generation. I first heard Cary Sherman, the President of the Recording Industry Association of America (RIAA), use “the lost generation” to describe the approximately 25 million students who became digital consumers between two milestones: Napster’s debut in 1999, which made sharing of MP3s ripped from CDs easy, and Apple’s discontinuing digital rights management (DRM) for most iTunes music in 2009, which made buying tracks legally almost as easy and convenient.

Even without the illusion that infringing materials were “free,” there were ample incentives to infringe during that period: illegal mechanisms were comprehensive and easy to use, for the most part, whereas legal mechanisms did not exist, were inflexible and awkward, and/or did not include many widely-desired items.

Because of this, many members of the lost generation adopted a mythology comprising some subset of

  • digital materials are priced too high, since it costs money to manufacture CDs and DVDs but the Internet is free,
  • profits flow to middlemen rather than artists, and so artists aren’t hurt by infringement,
  • DRM is just the industry’s mechanism for controlling users and rationing information,
  • people who stream or download unauthorized copies wouldn’t have bought legal copies anyway, and so copyright holders don’t lose any revenue because of unauthorized copying,
  • there’s no way to sample material before buying it, and so unauthorized sources are the only easy way to explore new or arcane stuff,
  • the entertainment industry has no interest in serving customers, as evidenced by its keeping so much material unavailable,
  • copyright is wrong, since information should be free and users should just pay what they think it’s worth, and
  • (the illegitimate moral leap S and others make) therefore it’s “okay” to copy and share digital materials without permission.

Unfortunately, the lost generation’s beliefs, most of which have always been exaggerated or invalid, have been passed down to successor generations, a process accelerated rather than slowed by the current industry emphasis on monitoring and penalizing network users.

Why does the mythology persist?

There are the obvious technical and financial arguments: if illegal technology is more convenient than legal, and illegal content costs less than legal, then it’s not surprising that illegal stuff remains prominent.

But in addition, as the Captain might observe, what we have here is failure to communicate:

  • There’s lots of evidence that convenient, comprehensive services like Netflix, Amazon Prime Instant Video, Hulu, Pandora, and Spotify draw users to them even when there are illegal “free” alternatives. But for this to happen, users must know about those services. S clearly didn’t—we asked her specifically—and that’s a marketing failure.
  • Shoplifting and plagiarism are relatively rare, at least among individuals like S. Yet they have the same appealing features as “pirate” music and video. Somehow S and her peers have come to understand that shoplifting, plagiarism, and various similar choices are unethical, immoral, or socially counterproductive. Yet they don’t put copyright infringement in the same category. That’s a social, educational, and parental failure.
  • For all kinds of arguably irremediable licensing, contractual, competitive, and anti-trust reasons, it remains stubbornly difficult to “give the lady what she wants”: in S’s case, a comprehensive, reasonably priced, convenient service from which she could obtain all the movies she wanted. Whether this is customers not conveying their wants to providers (in part because they can bypass the latter), or whether this is providers stuck on obsolete delivery models, it’s a business failure.
  • Colleges and universities are supposed at least to tell their students about copyright infringement, and to implement technologies and other mechanisms to “effectively combat” it. S had no idea that the consequences of being caught downloading or streaming unauthorized copies were anything beyond being told to stop. So far as she knew, no one, at least no one at her college, had ever gotten in trouble for that. And she’d never heard anything from her college—which was also her Internet service provider—about the issue. That’s a policy failure.

To be fair, S’s dinner comments endorsed only a small subset of the lost generation’s tenets, she seemed generally interested in the streaming services we told her about, and she was now thinking about the consequences of being caught downloading or streaming unauthorized copies—and about how lots of people doing that might affect her future earnings. So there was progress.

But ganging up on 21-year-olds at dinner parties is a very inefficient way to counteract the mythology of the lost generation. We (and by this I mean everyone: users, parents, schools, artists, producers, network providers) need to find much better ways to communicate about copyright infringement, to help potential infringers understand the choices they are making, and to provide and use better legal services.

Especially until we do that last, this will be hard, and progress will be slow. But it’s progress we need if the intellectual-property economy is to endure.