Archive for April, 2013

Perceived Truths as Policy Paradoxes

The quote I was going to use to introduce this topic, “You’re entitled to your own opinion, but not to your own facts,” itself illustrates my theme for today: that truths are often less than well founded, and so can turn policy discussions weird.

I’d always heard the quote attributed to Pat Moynihan, an influential sociologist who co-wrote Beyond the Melting Pot with Nathan Glazer, directed the MIT-Harvard Joint Center for Urban Studies shortly before I worked there (and left behind a closet full of Scotch, which stemmed from his perhaps apocryphal rule that no meeting extend beyond 4pm without a bottle on the table), and later served as a widely respected Senator from New York. The collective viziers of Wikipedia have found other attributions for the quote, however. (This has me once again looking for the source of “There go my people, I must go join them, for I am their leader,” supposedly Mahatma Gandhi but apparently some French general, but I digress.) The quote will need to stand on its own.

Here’s the Scott Jaschik item from Inside Higher Ed that triggered today’s Rumination:

A new survey from ACT shows the continued gap between those who teach in high school and those who teach in college when it comes to their perceptions of the college preparation of today’s students. Nearly 90 percent of high school teachers told ACT that their students are either “well” or “very well” prepared for college-level work in their subject area after leaving their courses. But only 26 percent of college instructors reported that their incoming students are either “well” or “very well” prepared for first-year credit-bearing courses in their subject area. The percentages are virtually unchanged from a similar survey in 2009.

This is precisely what Moynihan (or whoever) had in mind: two parties to an important discussion each bearing their own data, and therefore unable to agree on the problem or how to address it. The teachers presumably think the professors have unreasonable expectations, or don’t work very hard to bring their students along; the professors presumably think the teachers aren’t doing their job. Each side therefore believes the problem lies with the other, and has data to prove that. Collaboration is unlikely, progress ditto. This is what Moynihan had observed about the federal social policy process.

The ACT survey reminded me of a similar finding that emerged back when I was doing college-choice research. I can’t locate a citation, but I recall hearing about a study that surveyed students who had been admitted to several different colleges.

The clever wrinkle in the study was that each student received several different survey queries, each purporting to be from one of the colleges to which he or she had been admitted, and each asking about the reasons for accepting or declining the admission offer. Here’s what they found: students told the institution they’d accepted that the reason was excellent academic quality, but they told the institutions they’d declined that the reason was better financial aid from the one they’d accepted.

More recently, I was talking to a colleague at another media company who was concerned about the volume of copyright infringement on a local campus. According to the company, the campus was hosting a great deal of copyright infringement, as measured by the volume of requests for infringing material being sent out by BitTorrent. But according to the campus, a scan of the campus network identified very few hosts running the peer-to-peer applications. The colleague thought the campus was blowing smoke; the campus thought the company’s statistics were wrong.

Although these three examples seem similar — parties disagreeing about facts — in fact they’re a bit different.

  • In the teacher/professor example, the different conclusions presumably stem from different (and unshared) definitions of “prepared for college-level work.”
  • In the accept/decline example, the different explanations possibly stem from students’ not wanting to offend the declined institution by questioning its quality, or wanting to think of their actual choice as good rather than cheap.
  • In the infringement/application case, the different explanations stem from divergent metrics.

We’ve seen similar issues arise around institutional attributes in higher education. Do ratings like those from US News & World Report gather their own data, for example, or rely on presumably neutral sources such as the National Center for Education Statistics? This is critical where results have major reputational effects: consider George Washington University’s inflation of class-rank admissions data, and similar earlier issues with Claremont McKenna, Emory, Villanova, and others.

I’d been thinking about this because in my current job it’s quite important to understand patterns of copyright infringement on campuses. It would be good to figure out which campuses seem to have relatively low infringement rates, and to explore and document their policies and practices so that other campuses might benefit. For somewhat different reasons, it would be good to figure out which campuses seem to have relatively high infringement rates, so that they could be encouraged to adopt different policies and practices.

But here we run into the accept/decline problem. If the point of data collection is to identify and celebrate effective practice, there are lots of incentives for campuses to participate. But if the point is to identify and pressure less effective campuses, the incentives run the other way.

Compounding the problem, there are different ways to measure the problem:

  • One can rely on externally generated complaints, whose volume can vary for reasons having nothing to do with the volume of infringement,
  • one can rely on internal assessments of network traffic, which can be inadvertently selective, and/or
  • one can rely on external measures such as the volume of queries to known sources of infringement.

I’m sure there are others, and that’s without getting into the religious wars about copyright, middlemen, and so forth (which I addressed in an earlier post).

There’s no full solution to this problem. But there are two things that help: collaboration and openness.

  • By “collaboration,” I mean that parties to questions of policy or practice should work together to define and ideally collect data; that way, arguments can focus on substance.
  • By “openness,” I mean that wherever possible raw data, perhaps anonymized, should accompany analysis and advocacy based on those data.

As an example of what this means, here are some thoughts for one of my upcoming challenges: figuring out how to identify campuses that might be models for others to follow, and also campuses that should probably follow them. Achieving this is important, but improperly done it can easily come to resemble the “top 25” lists from RIAA and MPAA that became so controversial and counterproductive a few years ago. The “top 25” lists became controversial partly because their methodology was suspect, partly because the underlying data were never available, and partly because they ignored the other end of the continuum, that is, institutions that had somehow managed to elicit very few Digital Millennium Copyright Act (DMCA) notices.

It’s clear there are various sources of data, even without internal access to campus network data:

  • counts of DMCA notices sent by various copyright holders (some of which send notices methodically, following reasonably robust and consistent procedures, and some of which don’t),
  • counts of queries involving major infringing sites, and/or
  • network volume measures for major infringing protocols.

Those last two yield voluminous data, and so usually require sampling or data reduction of some kind. And not all of the queries or protocol traffic they track involves infringement. It’s also clear, from earlier studies, that there’s substantial variation in these counts over time and even across similar campuses.

This means it will be important for my database, if I can create one, to include several different measures, especially counts from different sources for different materials, and to do that over a reasonable period of time. Integrating all this into a single dataset will require lots of collaboration among the providers. Moreover, the raw data necessarily will identify individual institutions, and releasing them that way would probably cause more opposition than support. Clumping them all together would bypass that problem, but also cover up important variation. So it makes much more sense to disguise rather than clump — that is, to identify institutions by a code name and enough attributes to describe them but not to identify them.

It’ll then be important to be transparent: to lay out the detailed methodology used to “rank” campuses (as, for example, US News now does), and to share the disguised data so others can try different methodologies.
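The “disguise rather than clump” step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`institution`, `sector`, `size_band`, `period`, `dmca_notices`), the `Campus-001` code format, and the `disguise` helper are all hypothetical, not part of any existing dataset or tool.

```python
import random


def disguise(records, attributes, measures, seed=None):
    """Replace institution names with code names (hypothetical helper).

    records    -- list of dicts, each with an "institution" key
    attributes -- coarse descriptive fields to keep (e.g. sector, size band)
    measures   -- count fields to keep (e.g. DMCA notices per period)
    Returns (disguised_rows, key); the key maps real names to code names.
    """
    names = sorted({r["institution"] for r in records})
    # Shuffle so code numbers do not reveal alphabetical order.
    random.Random(seed).shuffle(names)
    key = {name: f"Campus-{i:03d}" for i, name in enumerate(names, start=1)}

    disguised = []
    for r in records:
        row = {"code": key[r["institution"]]}
        row.update({k: r[k] for k in attributes})  # descriptive, not identifying
        row.update({k: r[k] for k in measures})    # the counts themselves
        disguised.append(row)
    return disguised, key


# Made-up sample rows, purely for illustration.
sample = [
    {"institution": "A University", "sector": "public", "size_band": "large",
     "period": "2013-01", "dmca_notices": 5},
    {"institution": "B College", "sector": "private", "size_band": "small",
     "period": "2013-01", "dmca_notices": 2},
]
shared, private_key = disguise(
    sample, attributes=("sector", "size_band", "period"),
    measures=("dmca_notices",),
)
```

Only `shared` would be released; `private_key` stays with whoever assembled the data. The attributes kept would need to be coarse enough that a combination of them does not single out one campus.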

At a more general level, what I draw from the various examples is this: If organizations are to set policy and frame practice based on data (to become “data-driven organizations,” in the current parlance), then they must put serious effort into the source, quality, and accessibility of data. That’s especially true for “big data,” even though many current “big data” advocates wrongly believe that volume somehow compensates for quality.

If we’re going to have productive debates about policy and practice in connection with copyright infringement or anything else, we need to listen to Moynihan: To have our own opinions, but to share our data.

Story of S, and the Mythology of the Lost Generation

Dinner talk turned from Argo and Zero Dark Thirty to movies more generally. A 21-year-old college senior (I’ll call her “S”) recognized most of the films we were discussing. She had seen several, but others she hadn’t, which was a bit surprising, since S was an arts major, wanted to be a screenwriter, and was enthusiastic about her first choice for graduate school: the screenwriting program at a major California institution focused on the movie industry.

S had older brothers in the movie business, and she already had begun writing. What she needed, S said, was broader and deeper exposure to what made good screenplays. Graduate school would provide “deeper.” Her plan for “broader” was to watch as many well-regarded classics as possible, and apparently we were helping her map out that strategy.

But many of the films she wanted to see weren’t available on cable in her dormitory, even as pay-per-view. “Buying” or “renting” them online she found too expensive and awkward, especially given the number of films she wanted to see. So S was doing what unfortunately many students (and others) do: looking for movies on the Internet, and then streaming or downloading the least expensive version she could find. Since S’s college dormitory provided good Internet connectivity, S used that to download or stream her movies. Usually, she said, the least expensive version was an unauthorized copy, a so-called “pirate” version.

Some of us challenged her: Didn’t S realize that downloading or streaming “pirated” copies was against the law? Was she not concerned about the possible consequences? As a budding screenwriter, would she want others to do as she was doing, and deprive her of royalties? Didn’t it just seem wrong to take something without the owner’s permission?

S listened carefully—she was pretty sharp—but she didn’t seem convinced. Indeed, she seemed to feel that her choice to use unauthorized copies was reasonable, given the limited and unsatisfactory alternatives provided by the movie industry.

In so believing, S was echoing the persistent mythology of the lost generation. I first heard Cary Sherman, the President of the Recording Industry Association of America (RIAA), use “the lost generation” to describe the approximately 25 million students who became digital consumers between two milestones: Napster’s debut in 1999, which made sharing of MP3s ripped from CDs easy, and Apple’s discontinuing digital rights management (DRM) for most iTunes music in 2009, which made buying tracks legally almost as easy and convenient.

Even without the illusion that infringing materials were “free,” there were ample incentives to infringe during that period: illegal mechanisms were comprehensive and easy to use, for the most part, whereas legal mechanisms did not exist, were inflexible and awkward, and/or did not include many widely-desired items.

Because of this, many members of the lost generation adopted a mythology comprising some subset of the following:

  • digital materials are priced too high, since it costs money to manufacture CDs and DVDs but the Internet is free,
  • profits flow to middlemen rather than artists, and so artists aren’t hurt by infringement,
  • DRM is just the industry’s mechanism for controlling users and rationing information,
  • people who stream or download unauthorized copies wouldn’t have bought legal copies anyway, and so copyright holders don’t lose any revenue because of unauthorized copying,
  • there’s no way to sample material before buying it, and so unauthorized sources are the only easy way to explore new or arcane stuff,
  • the entertainment industry has no interest in serving customers, as evidenced by its keeping so much material unavailable,
  • copyright is wrong, since information should be free and users should just pay what they think it’s worth, and
  • (the illegitimate moral leap S and others make) therefore it’s “okay” to copy and share digital materials without permission.

Unfortunately, the lost generation’s beliefs, most of which have always been exaggerated or invalid, have been passed down to successor generations, a process accelerated rather than slowed by the current industry emphasis on monitoring and penalizing network users.

Why does the mythology persist?

There are the obvious technical and financial arguments: if illegal technology is more convenient than legal, and illegal content costs less than legal, then it’s not surprising that illegal stuff remains prominent.

But in addition, as the Captain might observe, what we have here is failure to communicate:

  • There’s lots of evidence that convenient, comprehensive services like Netflix, Amazon Prime Instant Video, Hulu, Pandora, and Spotify draw users to them even when there are illegal “free” alternatives. But for this to happen, users must know about those services. S clearly didn’t—we asked her specifically—and that’s a marketing failure.
  • Shoplifting and plagiarism are relatively rare, at least among individuals like S. Yet they have the same appealing features as “pirate” music and video. Somehow S and her peers have come to understand that shoplifting, plagiarism, and various similar choices are unethical, immoral, or socially counterproductive. Yet they don’t put copyright infringement in the same category. That’s a social, educational, and parental failure.
  • For all kinds of arguably irremediable licensing, contractual, competitive, and antitrust reasons, it remains stubbornly difficult to “give the lady what she wants”: in S’s case, a comprehensive, reasonably priced, convenient service from which she could obtain all the movies she wanted. Whether this is customers not conveying their wants to providers (in part because they can bypass the latter), or whether this is providers stuck on obsolete delivery models, it’s a business failure.
  • Colleges and universities are supposed at least to tell their students about copyright infringement, and to implement technologies and other mechanisms to “effectively combat” it. S had no idea that the consequences of being caught downloading or streaming unauthorized copies were anything beyond being told to stop. So far as she knew, no one, at least no one at her college, had ever gotten in trouble for that. And she’d never heard anything from her college—which was also her Internet service provider—about the issue. That’s a policy failure.

To be fair, S’s dinner comments endorsed only a small subset of the lost generation’s tenets, she seemed generally interested in the streaming services we told her about, and she was now thinking about the consequences of being caught downloading or streaming unauthorized copies—and about how lots of people doing that might affect her future earnings. So there was progress.

But ganging up on 21-year-olds at dinner parties is a very inefficient way to counteract the mythology of the lost generation. We (and by this I mean everyone: users, parents, schools, artists, producers, network providers) need to find much better ways to communicate about copyright infringement, to help potential infringers understand the choices they are making, and to provide and use better legal services.

Especially until we do that last, this will be hard, and progress will be slow. But it’s progress we need if the intellectual-property economy is to endure.