Archive for the ‘Uncategorized’ Category

Notes on Barter, Privacy, Data, & the Meaning of “Free”

It’s been an interesting few weeks:

  • Facebook’s upcoming $100-billion IPO has users wondering why owners get all the money while users provide all the assets.
  • Google’s revision of privacy policies has users thinking that something important has changed even though they don’t know what.
  • Google has used a loophole in Apple’s browser to gather data about iPhone users.
  • Apple has allowed app developers to download users’ address books.
  • And over in one of EDUCAUSE’s online discussion groups, the offer of a free book has somehow led security officers to do linguistic analysis of the word “free” as part of a privacy argument.

Lurking under all, I think, are the unheralded and misunderstood resurgence of a sometimes triangular barter economy, confusion about different revenue models, and, yes, disagreement about what the word “free” means.

Let’s approach the issue obliquely, starting, in the best academic tradition, with a small-scale research problem. Here’s the hypothetical question, which I might well have asked back when I was a scholar of student choice: Is there a relationship between selectivity and degree completion at 4-year colleges and universities?

As a faculty member in the late 1970s, I’d have gone to the library and used reference tools to locate articles or reports on the subject. If I were unaffiliated and living in Chicago (which I wasn’t back then), I might have gone to the Chicago Public Library, found in its catalog a 2004 report by Laura Horn, and have had that publication pulled from closed-stack storage so I could read it.

By starting with that baseline, of course, I’m merely reminiscing. These days I can obtain the data myself, and do some quick analysis. I know the relevant data are in the Integrated Postsecondary Education Data System (IPEDS). And those IPEDS data are available online, so I can

(a) download data on 2010 selectivity, undergraduate enrollment, and bachelor’s degrees awarded for the 2,971 US institutions that grant four-year degrees and import those data into Excel,

(b) eliminate the 101 system offices and the like that are missing relevant data, the 1,194 that granted fewer than 100 degrees, the 15 institutions reporting suspiciously high degree/enrollment rates, the one that reported no degrees awarded (Miami-Dade College, in case you’re interested), and the 220 that reported no admit rate, and then

(c) for the remaining 1,440 colleges and universities, create a graph of degree completion (somewhat normalized) as a function of selectivity (ditto).

The graph doesn’t tell me much–scatter plots rarely do for large datasets–but a quick regression analysis tells me there’s a modestly positive relationship: 1% higher selectivity (according to my constructed index) translates on average into 1.4% greater completion (ditto). The download, data cleaning, graphing, and analysis take me about 45 minutes all told.
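
Today the same exercise can be scripted rather than clicked through in Excel. For illustration only, here is a rough sketch of that workflow in Python (pandas plus statsmodels); the file name, column names, and constructed indices are stand-ins I’ve invented, not actual IPEDS variable names, and the cleaning rules only approximate the steps described above.

    # Sketch of the download-clean-regress exercise described above.
    # Assumes a hypothetical CSV export of IPEDS data; all column names are illustrative.
    import pandas as pd
    import statsmodels.api as sm

    df = pd.read_csv("ipeds_2010_four_year.csv")

    # Rough constructed indices: selectivity as admit rate, completion as degrees/enrollment
    df["selectivity"] = df["admitted"] / df["applicants"]
    df["completion"] = df["bachelors_degrees"] / df["undergrad_enrollment"]

    # Cleaning along the lines described in the post
    df = df.dropna(subset=["selectivity", "completion"])   # system offices, missing admit rates
    df = df[df["bachelors_degrees"] >= 100]                # drop those granting fewer than 100 degrees
    df = df[df["completion"] <= 1.0]                       # drop suspiciously high degree/enrollment rates

    # Scatter plot (requires matplotlib) and a simple OLS regression
    df.plot.scatter(x="selectivity", y="completion")
    model = sm.OLS(df["completion"], sm.add_constant(df["selectivity"])).fit()
    print(model.params)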

Or I might just use a search engine. When I do that, using “degree completion by selectivity” as the search term, a highly-ranked Google result takes me to an excerpt from a College Board report.

Curiously, that report tells me that “…selectivity is highly correlated with graduation rates,” which is a rather different conclusion than IPEDS gave me. The footnotes help explain this: the College Board includes two-year institutions in its analysis, considers only full-time, first-time students, excludes returning students and transfers, and otherwise chooses its data in ways I didn’t.

The difference between my graph and the College Board’s conclusion is excellent fodder for a discussion of how to evaluate what one finds online — in the quote often (but perhaps mistakenly) attributed to Daniel Patrick Moynihan, “Everyone is entitled to his own opinion, but not his own facts.” Which gets me thinking about one of the high points in my graduate studies, a Harvard methodology seminar wherein Mike Smith, who was eventually to become US Undersecretary of Education, taught Moynihan what regression analysis is, which in turn reminds me of the closet full of Scotch at the Joint Center for Urban Studies kept full because Moynihan required that no meeting at the Joint go past 4pm without a bottle of Scotch on the table. But I digress.

Since I was logged in with my Google account when I did the search, some of the results might even have been tailored to what Google had learned about me from previous searches. At the very least, the information was tailored to previous searches from the computer I used here in my DC office.

Which brings me to the linguistic dispute among security officers.

A recent EDUCAUSE webinar presenter, during Data Privacy Month, was Matt Ivester, creator of JuicyCampus and author of lol…OMG!: What Every Student Needs to Know About Online Reputation Management, Digital Citizenship and Cyberbullying.

“In honor of Data Privacy Day,” the book’s website announced around the same time, “the full ebook of lol…OMG! (regularly $9.99) is being made available for FREE!” Since Ivester was going to be a guest presenter for EDUCAUSE, we encouraged webinar participants to avail themselves of this offer and to download the book.

One place we did that was in a discussion group we host for IT security professionals. A participant in that discussion group immediately took Ivester to task:

…you can’t download the free book without logging in to Amazon. And, near as I can tell, it’s Kindle- or Kindle-apps-only. In honor of Data Privacy Day. The irony, it drips.

“Pardon the rant,” another participant responded, “but what is the irony here?” Another elaborated:

I intend to download the book but, despite the fact that I can understand why free distribution is being done this way, I still find it ironic that I must disclose information in order to get something that’s being made available at no charge in honor of DPD.

The discussion grew lively, and eventually devolved into a debate about the word “free”. If one must disclose personal information in order to download a book at no monetary cost, is the book “free”?

If words like “free”, “cost”, and “price” refer only to money, the answer is Yes. But money came into existence only to simplify barter economies. In a sense, today’s Internet economy involves a new form of barter that replaces money: If we disclose information about ourselves, then we receive something in return; conversely, vendors offer “free” products in order to obtain information about us.

In a recent post, Ed Bott presented graphs illustrating the different business models behind Microsoft, Apple, and Google. According to Bott, Microsoft is selling software, Apple is selling hardware, and Google is selling advertising.

More to the point here, Microsoft and Apple still focus on traditional binary transactions, confined to themselves and buyers of their products.

Google is different. Google’s triangle trade (which Facebook also follows) offers “free” services to individuals, collects information about those individuals in return, and then uses that information to tailor advertising that it then sells to vendors in return for money. In the triangle, the user of search results pays no money to Google, so in that limited sense it’s “free”. Thus the objection in the Security discussion group: if one directly exchanges something of value for the “free” information, then it’s not free.

Except for my own time, all three answers to my “How does selectivity relate to degree completion?” question were “free”, in the sense I paid no money explicitly for them. All of them cost someone something. But not all no-cost-to-the-user online data is funded through Google-like triangles.

In the case of the Chicago Public Library, my Chicago property taxes plus probably some federal and Illinois grants enabled the library to acquire, catalog, store, and retrieve the Horn report. They also built the spectacular Harold Washington Library where I’d go read it.

In the case of IPEDS, my federal tax dollars paid the bill.

In both cases, however, what I paid was unrelated to how much I used the resources, and involved almost no disclosure of my identity or other attributes.

In contrast, the “free” search Google provided involved my giving something of value to Google, namely something about my searches. The same was true for the Ivester fans who downloaded his “free” book from Amazon.

Not that there’s anything wrong with that, as Jerry Seinfeld might say: by allowing Google and Amazon to tailor what they show me based on what they know about me, I get search results or purchase suggestions that are more likely to interest me. That is, not only does Google get value from my disclosure; I also get value from what Google does with that information.

The problem–this is what takes us back to security–is twofold.

  • First, an awful lot of users don’t understand how the disclosure-for-focus exchange works, in large part because the other party to the exchange isn’t terribly forthright about it. Sure, I can learn why Google is displaying those particular ads (that’s the “Why these ads?” link in tiny print atop the right column in search results), and if I do that I discover that I can tailor what information Google uses. But unless I make that effort the exchange happens automatically, and each search gets added to what Google will use to customize my future ads.
  • Second, and much more problematic, the entities that collect information about us increasingly share what they know. This varies depending on whether they’ve learned about us directly through things like credit applications or indirectly through what we search for on the Web, what we purchase from vendors like Amazon, or what we share using social media like Facebook or Twitter. Some companies take pains to assure us they don’t share what they know, but in many cases initial assurances get softened over time (or, as appears to have happened with Apple, are violated through technical or process failures). This is routinely true for Facebook, and many seem to believe it’s what’s behind the recent changes in Google’s privacy policy.

Indeed, companies like Acxiom are in the business of aggregating data about individuals and making them available. Data so collected can help banks combat identity theft by enabling them to test whether credit applicants are who they claim to be. If they fall into the wrong hands, however, the same data can enable subtle forms of redlining or even promote identity theft.

Vendors collecting data about us becomes a privacy issue whose substance depends on whether

  • we know what’s going on,
  • data are kept and/or shared, and
  • we can opt out.

Once we agree to disclose in return for “free” goods, however, the exchange becomes a security issue, because the same data can enable impersonation. It becomes a policy issue because the same data can enable inappropriate or illegal activity.

The solution to all this isn’t turning back the clock — the new barter economy is here to stay. What we need are transparency, options, and broad-based educational campaigns to help people understand the deal and choose according to their preferences.

As either Stan Delaplane or Calvin Trillin once observed about “market price” listings on restaurant menus (or didn’t — I’m damned if I can find anything authoritative, or for that matter any mention whatsoever of this, but I know I read it), “When you learn for the first time that the lobster you just ate cost $50, the only reasonable response is to offer half”.

Unfortunately, in today’s barter economy we pay the price before we get the lobster…

Transforming Higher Education through Learning Technology: Millinocket?

Down East

Note to prospective readers: This post has evolved, through extensive revision and expansion and more careful citation, into a paper available at http://gjackson.us/it-he.pdf.

You might want to read that paper, which is much better and more complete, instead of this post — unless you like the pictures here, which for the moment aren’t in the paper. Even if you read this to see the pictures, please go read the other.

“Which way to Millinocket?,” a traveler asks. “Well, you can go west to the next intersection…” the drawling down-east Mainer replies in the Dodge and Bryan story,

“…get onto the turnpike, go north through the toll gate at Augusta, ’til you come to that intersection…. well, no. You keep right on this tar road; it changes to dirt now and again. Just keep the river on your left. You’ll come to a crossroads and… let me see. Then again, you can take that scenic coastal route that the tourists use. And after you get to Bucksport… well, let me see now. Millinocket. Come to think of it, you can’t get there from here.”

PLATO and its programmed-instruction kin were supposed to transform higher education. So were the Apple II, and then the personal computer – PC and then Mac – and then the “3M” workstation (megapixel display, megabyte memory, megaflop speed) for which Project Athena was designed. So were simulated laboratories, so were BITNET and then the Internet, so were MUDs, so was Internet2, so was artificial intelligence, so was supercomputing.

Each of these most certainly has helped higher education grow, evolve, and gain efficiency and flexibility. But at its core, higher education remains very much unchanged. That may no longer suffice.

What about today’s technological changes and initiatives – social media, streaming video, multi-user virtual environments, mobile devices, the cloud? Are they to be evolutionary, or transformational? If higher education needs the latter, can we get there from here?

It’s important to start conversations about questions like these from a common understanding of which information technologies currently play a role in higher education, what that role is, and how those technologies and their roles are progressing. That’s what prompted these musings.

Information Technology

For the most part, “information technology” means a tripartite array of hardware and software:

  • end-user devices, which today range from large desktop workstations to small mobile phones, typically with some kind of display, some way to make choices and enter text, and various other capabilities variously enabled by hardware and software;
  • servers, which comprise not just racks of processors, storage, and other hardware but rather are aggregations of hardware, software, applications, and data that provide services to multiple users (when the aggregation is elsewhere, it’s often called “the cloud” today); and
  • networks, wireless or wired, which interlink local servers, remote server clouds, and end-user devices, and which typically comprise copper and glass cabling, routers and switches and optronics, and network operating system plus some authentication and logging capability.

Information technology tends to progress rapidly but unevenly, with progress or shortcomings in one domain driving or retarding progress in others.

Today, for example, the rapidly growing capability of small smartphones has taxed previously underused cellular networks. Earlier, excess capability in the wired Internet prompted innovation in major services like Google and YouTube. The success of Google and Amazon forced innovation in the design, management, and physical location of servers.

Perhaps the most striking aspects of technological progress have been its convergence and integration. Whereas once one could reasonably think separately about servers, networks, and end-user devices, today the three are not only tightly interconnected and interdependent, but increasingly their components are indistinguishable. Network switches are essentially servers, servers often comprise vast arrays of the same processors that drive end-user devices plus internal networks, and end-user devices readily tackle tasks – voice recognition, for example – that once required massive servers.

Access to Information Technology

Progress, convergence, and integration in information technology have driven dramatic and fundamental change in the information technologies faculty, students, colleges, and universities have. That progress is likely to continue.

Here, as a result, are some assumptions we can reasonably make today:

  • Households have some level of broadband access to the Internet, and at least one computer capable of using that broadband access to view and interact with Web pages, handle email and other messaging, listen to audio, and view videos of at least YouTube quality.
  • Teenagers and most adults have some kind of mobile phone, and that phone usually has the capability to handle routine Internet tasks like viewing Web pages and reading email.
  • Colleges and universities have building and campus networks operating at broadband speeds of at least 10Mb/sec, and most have wireless networks operating at 802.11b (11Mb/sec) or greater speed.
  • Server capacity has become quite inexpensive, largely because “cloud” providers have figured out how to gain and then sell economy of scale.
  • Everyone – or at least everyone between the ages of, say, 12 and 65 – has at least one authenticated online identity, including email and other online service accounts; Facebook, Twitter, Google, or other social-media accounts; online banking, financial, or credit-card access; or network credentials from a school, college or university, or employer.
  • Everyone knows how to search on the Internet for material using Google, Bing, or other search engines.
  • Most people have a digital camera, perhaps integrated into their phone and capable of both still photos and videos, and they know how to send their photos to others or offload them onto their computers or an online service.
  • Most college and university course materials are in electronic form, and so is a large fraction of library and reference material used by the typical student.
  • Most colleges and universities have readily available facilities for creating video from lectures and similarly didactic events, whether in classrooms or in other venues, and for streaming or otherwise making that video available online.

It’s striking how many of these assumptions were invalid even as recently as five years ago. Most of the assumptions were invalid a decade before that (and it’s sobering to remember that the “3M” workstation was a lofty goal as recently as 1980 and cost nearly $10,000 in the mid-1980s, yet today’s iPhone almost exceeds the 3M spec).

Looking a bit into the future, here are some further assumptions that probably will be safe:

  • Typical home networking and computers will have improved to the point they can handle streamed video and simple two-way video interactions (which means that at least one home computer will have an add-on or built-in camera).
  • Most people will know how to communicate with individuals or small groups online through synchronous social media or messaging environments, in many cases involving video.
  • Authentication and monitoring technologies will exist to enable colleges and universities to reasonably ensure that their testing and assessment of student progress is protected from fraud.
  • Pretty much everyone will have the devices and accounts necessary for ubiquitous connectivity with anybody else and to use services from almost any college, university, or other educational provider.

Technology, Teaching, and Learning

In colleges and universities, as in other organizations, information technology can promote progress by enabling administrative processes to become more efficient and by creating diverse, flexible pathways for communication and collaboration within and across different entities. That’s organizational technology, and although it’s very important, it affects higher education much the way it affects other organizations of comparable size.

Somewhat more distinctively, information technology can become learning technology, an integral part of the teaching and learning process. Learning technology sometimes replaces traditional pedagogies and learning environments, but more often it enhances and expands them.

The basic technology and middleware infrastructure necessary to enable colleges and universities to reach, teach, and assess students appears to exist already, or will before long. This brings us to the next question: What applications turn information technology into learning technology?

To answer this, it’s useful to think about four overlapping functions of learning technology.

Amplify and Extend Traditional Pedagogies, Mechanisms, and Resources

For example, by storing and distributing materials electronically, by enabling lectures and other events to be streamed or recorded, and by providing a medium for one-to-one or collective interactions among faculty and students, IT potentially expedites and extends traditional roles and transactions. Similarly, search engines and network-accessible library and reference materials vastly increase faculty and student access. The effect, although profound, nevertheless falls short of transformational. Chairs outside faculty doors give way to “learning management systems” like Blackboard or Sakai or Moodle, wearing one’s PJs to 8am lectures gives way to watching lectures from one’s room over breakfast, and library schools become information-science schools. But the enterprise remains recognizable. Even when these mechanisms go a step further, enabling true distance education whereby students never set foot on campus (in 2011, 3.7% of all students took all their coursework through distance education), the resulting services remain recognizable. Indeed, they are often simply extensions of existing institutions’ campus programs.

Make Educational Events and Materials Available Outside the Original Context

For example, the Open Courseware initiative (OCW) started as a publicly accessible repository of lecture notes, problem sets, and other material from MIT classes. It since has grown to include similar material from scores of other institutions worldwide. Similarly, the newer Khan Academy has collected a broad array of instructional videos on diverse topics, some from classes and some prepared especially for Khan, and made those available for anyone interested in learning the material. OCW, Khan, and initiatives like them provide instructional material in pure form, rather than as part of curricula or degree programs.

Enable Experience-Based Learning 

This most productively involves experience that otherwise might have been unaffordable, dangerous, or otherwise infeasible. Simulated chemistry laboratories and factories were an early example – students could learn to synthesize acetylene by trial and error without blowing up the laboratory, or to fine-tune just-in-time production processes without bankrupting real manufacturers. As computers have become more powerful, so have simulations become more complex and realistic. As simulations have moved to cloud-based servers, multi-user virtual environments have emerged, which go beyond simulation to replicate complex environments. Experiences like these were impossible to provide before the advent of powerful, inexpensive server clouds, ubiquitous networking, and graphically capable end-user devices.

Replace the Didactic Classroom Experience

This is the most controversial application of learning technology – “Why do we need faculty to teach calculus on thousands of different campuses, when it can be taught online by a computer?” – but also one that drives most discussion of how technology might transform higher education. It has emerged especially for disciplines and topics where instructors convey what they know to students through classroom lectures, readings, and tutorials. PLATO (Programmed Logic for Automated Teaching Operations) emerged from the University of Illinois in the 1960s as the first major example of computers replacing teachers, and has been followed by myriad attempts, some more successful than others, to create technology-based teaching mechanisms that tailor their instruction to how quickly students master material. (PLATO’s other major innovation was partnership with a commercial vendor, the now defunct Control Data Corporation.)

Higher Education

We now come to the $64 question: what role might trends in higher-education learning technology play in the potential transformation of higher education?

The transformational goal for higher education is to carry out its social and economic roles with greater efficiency and within the resource constraints it faces. Many believe that such transformation requires a very different structure for future higher education. What might that structure be, and what role might information technologies play in its development?

The fundamental purpose of higher education is to advance society, polity, and the economy by increasing the social, political, and economic skills and knowledge of students – what economists call “human capital”. At the postsecondary level, education potentially augments students’ human capital in four ways:

  • admission, which is to say declaring that a student has been chosen as somehow better qualified or more adaptable in some sense than other prospective students (this is part of Lester Thurow’s “job queue” idea);
  • instruction, including core and disciplinary curricula, the essentially unidirectional transmission of concrete knowledge through lectures, readings, and the like, and also the explication and amplification of that through classroom, tutorial, and extracurricular guidance and discussion (this is what we often mean by the narrow term “teaching”);
  • certification, specifically the measuring of knowledge and skill through testing and other forms of assessment; and
  • socialization, specifically learning how to become an effective member of society independently of one’s origin family, through interaction with faculty and especially with other students.

Sometimes a student gets all four together. For example, MIT marked me even before I enrolled as someone likely to play a role in technology (admission), taught me a great deal about science and engineering generally, electrical engineering in particular, and their social and economic context (instruction), documented through grades based on exams, lab work, and classroom participation that I had mastered (or failed to master) what I’d been taught (certification), and immersed me in an environment wherein data-based argument and rhetoric guided and advanced organizational life, and thereby helped me understand how to work effectively within organizations, groups, and society (socialization).

Most students attend colleges whose admissions processes amount to open admission, or involve simple norms rather than competition. That is, anyone who meets certain standards, such as high-school completion with a given GPA or test score, is admitted. In 2010, almost half of all institutions reported having no admissions criteria, and barely 11% accepted fewer than 1/4 of their applicants. Moreover, most students do not live on campus — in 2007-08, only 14% of undergraduates lived in college-owned housing. This means that most of higher education has limited admission and socialization effects. Therefore, for the most part higher education affects human capital through instruction and certification.

Instruction is an especially fertile domain for technological progress. This is because three trends converge around it:

  • ubiquitous connectivity, especially from students’ homes;
  • the rapidly growing corpus of coursework offered online, either as formal credit-bearing classes or as freestanding materials from entities like OCW or Khan; and
  • (perhaps more speculative) the growing willingness of institutions to grant credit and allow students to satisfy requirements through classes taken at other institutions or through some kind of testing or assessment.

Indeed, we can imagine a future where it becomes commonplace for students to satisfy one institution’s degree requirements with coursework from many other institutions. Further down this road, we can imagine there might be institutions that admit students, prescribe curriculum, certify progress, and grant degrees – but have no instructional faculty and do not offer courses. This, in turn, might spawn purely instructional institutions.

One problem with such a future is that socialization, a key function of higher education, gets lost. This points the way to one major technology challenge for the future: Developing online mechanisms, for students who are scattered across the nation or the world, that provide something akin to rich classroom and campus interaction. Such interaction is central to the success of, for example, elite liberal-arts colleges and major residential universities. Many advocates of distance education believe that social media such as Facebook groups can provide this socialization, but that potential has yet to be realized.

A second problem with such a future is that robust, flexible methods for assessing student learning at a distance remain either expensive or insufficient. For example, ProctorU and Kryterion are two of several commercial entities that provide remote exam proctoring, but they do so through somewhat intensive use of video observation, and that only works for rather traditional written exams. For another example, in the aftermath of 9/11 many universities figured out how to conduct doctoral thesis defenses using high-bandwidth videoconferencing facilities rather than flying in faculty from other institutions, but this simply reduced travel expense rather than changed the basic idea that several faculty members would examine one student at a time.

Millinocket

If learning technologies are to transform higher education, we must exploit opportunities and address problems. At the same time, transformed higher education cannot neglect important dimensions of human capital. In that respect, our goal should be not only to make higher education more efficient than it is today, but also better.

Drivers headed for Millinocket rarely pull over any more to ask directions of drawling downeasters. Instead, they rely on the geographic position and information systems built into their cars or phones or computers, which in turn rely on network connectivity to keep maps and traffic reports up to date. To be sure, reliance on GPS and GIS tends to insulate drivers from interaction with the diversity they pass along the road, much as Interstate highways standardized cross-country travel. So the gain from those applications is not without cost.

The same is true for learning technology: it will yield both gains and losses. Effective progress will result only if we explore and understand the technologies and their applications, decide how these relate to the structure and goals of higher education, identify obstacles and remedies, and figure out how to get there from here.

IT Demography in Higher Education: Some Reminiscence & Speculation

In oversimplified caricature, many colleges and universities have traditionally staffed the line, management, and leadership layers of their IT enterprise thus:

Students with some affinity for technology (perhaps their major, perhaps work-study, perhaps just a side interest) have approached graduation not quite sure what they should do next. They’ve had some contact with the institution’s IT organizations, perhaps having worked for some part of them or perhaps having criticized their services. Whatever the reason, working for an institutional IT organization has seemed a useful way to pay the rent while figuring out what to do next, and it’s been a good deal for the IT organizations because recent graduates are usually pretty clever, know the institution well, learn fast, and are willing to work hard for relatively meager pay.

Moreover, and partly compensating for low pay, the technologies being used and considered in higher education often have been more advanced than those out in business, so sticking around has been a good way to be at the cutting edge technologically, and colleges and universities have tended to value and reward autonomy, curiosity, and creativity.

Within four or five years of graduation, most staff who come straight into the IT organization have figured out that it’s time to move on. Sometimes a romantic relationship has turned their attention to life plans and long-term earnings, sometimes ambition has taken more focused shape and so they seek a steeper career path, sometimes their interests have sharpened and readied them for graduate school — but in any case, they have left the campus IT organization for other pastures after a few good, productive years, and have been replaced by a new crop of recent graduates.

But a few individuals have found that working in higher education suits their particular hierarchy of needs (to adapt and somewhat distort Maslow). For them, IT work in higher education has yielded several desiderata (remember I’m still caricaturing here): there’s been job security, a stimulating academic environment, a relatively flat organization that offers considerable responsibility and flexibility, and an opportunity to work with and across state-of-the-art (and sometimes even more advanced) technologies. Benefits have been pretty good, even though pay hasn’t and there have been no stock options. Individuals to whom this mix appeals have stayed in campus IT, rising to middle-management levels, sometimes getting degrees in the process, and sometimes, as they have moved into #3 or #2 positions, even moving to other campuses as opportunities present themselves.

Higher-education IT leaders — that is, CIOs, the heads of major decentralized IT organizations, and in some cases the #2s within large central organizations — typically have come from one of two sources. Some have come from within higher-education IT organizations, sometimes the institution’s own but more typically, since a given institution usually has more leadership-ready middle managers than it has available leadership positions, another institution’s. (Whereas insiders once tended to be heavy-metal computer-center directors, more recently they have come from academic technologies or networking.) Other leaders have come from faculty ranks, often (but not exclusively) in computer science or other technically-oriented disciplines. Occasionally some come from other sources, such as consulting firms or technology vendors, or even from administration elsewhere in higher education.

The traditional approach staffs IT organizations with well educated, generally clever individuals highly attuned to the institution’s culture and needs. They are willing and able to tackle complex IT projects involving messy integration among different technologies. Those individuals also cost less than comparable ones would if hired from outside. Expected turnover among line staff notwithstanding, they are loyal to the institution even in the face of financial and management challenges.

But the traditional model also tilts IT organizations toward idiosyncrasy and patchwork rather than coherent architecture and efficiency-driven implementation. It often works against the adoption of effective management techniques, and it can promote hostility toward businesslike approaches to procurement and integration and indeed the entire commercial IT marketplace. All of this has been known, but in general institutions have continued to believe that the advantages of the traditional model outweigh its shortcomings.

I saw Moneyball in early October. I liked it mostly because it’s highly entertaining, it’s a good story, it’s well written, acted, directed, and produced, and it involves both applied statistical analysis (which is my training) and baseball (my son’s passion, and mine when the Red Sox are in the playoffs). I also liked it because its focus — dramatic change in how one staffs baseball teams — led me to think about college and university IT staffing. (And yes, I know my principles list says that “all sports analogies mislead”, but never mind.)

In one early scene, the Oakland A’s scouting staff explains to Brad Pitt’s character, Billy Beane, that choosing players depends on intuition honed by decades of experience with how the game is played, and that the approach Beane is proposing — choosing them based on how games are won rather than on intuition — is dangerous and foolhardy. Later, Arliss Howard’s character, the Red Sox owner John Henry, explains that whenever one goes against long tradition all hell breaks loose, and whoever pioneers or even advocates that change is likely to get bloodied.

So now I’ll move from oversimplification and caricature to speculation. To believe in the continued validity of the traditional staffing model may be to emulate the scouts in Moneyball. But to abandon the model is risky, since it’s not clear how higher-education IT can maintain its viability in a more “businesslike” model based on externally defined architectures, service models, and metrics. After all, Billy Beane’s Oakland A’s still haven’t won the World Series.

The Beane-like critique of the traditional model isn’t that the advantage/shortcoming balance has shifted, but rather that it depends on several key assumptions whose future validity is questionable. To cite four interrelated ones:

  • With the increasing sophistication of mobile devices and cloud-based services, the locus of technological innovation has shifted away from colleges and universities. Recent graduates who want to be in the thick of things while figuring out their life plans have much better options than staying on campus — they can intern at big technology firms, or join startups, or even start their own small businesses. In short, there is now competition for young graduates interested in IT but unsure of their long-term plans.
  • As campuses have outsourced or standardized much of their IT, jobs that once included development and integration responsibility have evolved into operations, support, and maintenance — which are important, but not very interesting intellectually, and which provide little career development.  Increased outsourcing has exacerbated this, and so has increased reliance on business-based metrics for things like user support and business-based architectures for things like authentication and systems integration.
  • College and university IT departments could once offset this intellectual narrowing because technology prices were dropping faster than available funds, and the resulting financial cushion could be dedicated to providing staff with resources and flexibility to go beyond their specific jobs (okay, maybe what I mean is letting staff buy gadgets and play with them). But tightened attention to productivity and resource constraints have largely eliminated the offsetting toys and flexibility. So IT jobs in colleges and universities have lost much of their nonpecuniary attractiveness, without any commensurate increase in compensation. Because of this, line staff are less likely to choose careers in college or university IT, and without this source of replenishment the higher-education IT management layer is aging.
  • As IT has become pervasively important to higher education, so responsibility for its strategic direction has broadened. As strategic direction has broadened, so senior leadership jobs, including the CIO’s, have evolved away from hierarchical control and toward collaboration and influence. (I’ve written about this elsewhere.) At the same time, increasing attention to business-like norms and metrics has required that IT leaders possess a somewhat different skillset than usually emerges from gradual promotion within college and university IT organizations or faculty experience. This has disrupted the supply chain for college and university IT leadership, as a highly fragmented group of headhunter firms competes to identify and recruit nontraditional candidates.

I think we’re already seeing dramatic change resulting from all this. The most obvious change is rapid standardization around commercial standards to enable outsourcing — which is appealing not only intrinsically, but because it reduces dependence on an institution’s own staff. (On the minus side, it also tends to emphasize proprietary commercial rather than open-source or open-standards approaches.) I also sense much greater interest in hiring from outside higher education, both at the line and management levels, and a concomitant reappraisal of compensation levels. That, combined with flat or shrinking resources, is eliminating positions, and the elimination of positions is promoting even more rapid standardization and outsourcing.

On the plus side, this is making college and university IT departments much more efficient and businesslike. On the minus side, higher education IT organizations may be losing their ability to innovate. This is yet another instance of the difficult choice facing us in higher-education IT: Is IT simply an important, central element of educational, research, and administrative infrastructure, or is IT also the vehicle for fundamental change in how higher education works? (In Moneyball, the choice is between player recruitment as a mechanism for generating runs, and as a mechanism for exciting fans. Sure, Red Sox fans want to win. But were they more avid before or after the Curse ended with Bill James’s help?)

If it’s the latter, we need to make sure we’re equipped to enable that — something that neither the traditional model nor the evolving “businesslike” model really does.

Institutional Demography in Higher Education: A Reminder

To understand why policy debates sometimes seem to make no sense, to circle endlessly, or to become bafflingly confused, it’s important to remember that the demography of higher education isn’t politically straightforward. By “demography” I don’t mean Gen X, Gen Y, and echo booms, but rather straightforward counts of degree-granting institutions and students. And by “politically” I don’t mean Republicans and Democrats, but rather the relative importance of different constituencies with different resources and goals.

Here’s a graph (I’ll append a more detailed table at the end). The data come from the National Center for Education Statistics 2008 IPEDS surveys. They describe the 4,474 public and private degree-granting institutions in the United States, classified into the usual Carnegie categories. I’ve collapsed Carnegie and size categories: “Small” means enrollment under 2,500, and “Large” means enrollment of 20,000 or more. The categories whose labels I’ve italicized are mostly private, those I’ve underscored are mostly public, and those I’ve both italicized and underscored are split between public and private institutions.

Most of us know some key demographic facts about higher education — for example, that the largest group of students is in 2-year colleges, followed closely by research universities. We also know that an awful lot of commentary and influence in higher education comes from people in or connected with research universities, and therefore many of us have trouble thinking about other kinds of institutions, let alone new kinds.

Here are some things we tend to forget:

  • There really aren’t very many research and doctoral universities — they account for fewer than 10% of institutions even though they enroll over 25% of all students.
  • Although there are lots of big community colleges and they enroll lots of students, not all 2-year colleges are big community colleges; rather, more than half of them are small, and most of those are private.
  • Most small 4-year and master’s institutions are also private, and although they comprise almost 20% of all institutions, they enroll only 5% of all students.
  • There are a lot of specialized institutions — that is, freestanding business, health, medical, engineering, technical, design, theological, and other similar institutions — but they don’t enroll very many students.
  • Enrollment isn’t quite a Pareto distribution (that’s the classic 20-80 rule), but it’s pretty close: 33% of institutions enroll 80% of the students.
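
The 33%-enroll-80% figure in the last bullet is easy to recompute once per-institution enrollments are in hand. Here is a minimal sketch of that calculation; the enrollment numbers below are toy values, not the IPEDS 2008 data.

    # Smallest fraction of institutions (taken largest-first) that accounts for a
    # target share of total enrollment. Toy numbers only; the 33%/80% figure in the
    # post comes from the actual IPEDS 2008 data.
    def institution_share(enrollments, target_share=0.80):
        sizes = sorted(enrollments, reverse=True)
        total = sum(sizes)
        running = 0
        for count, size in enumerate(sizes, start=1):
            running += size
            if running >= target_share * total:
                return count / len(sizes)
        return 1.0

    toy = [30000, 25000, 20000, 15000] + [1000] * 20   # a few big campuses, many small ones
    print(institution_share(toy))                      # about 0.17 for these toy numbers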

What this tells us is that the politics of higher education — and, indeed, the politics of organizations, like my own EDUCAUSE, that try to represent all of higher education — are very different depending on whether we focus on students or on institutions.

If we focus on students, the politics are pretty straightforward. Big community colleges, big master’s institutions, and doctoral and research universities count, and all other institutions don’t. The group that counts is mostly public institutions, so state governments and state system offices also count. Research and doctoral universities employ lots of faculty to whom research is as important as teaching, and who are vocal about its importance, so disciplinary groups and research funders are also relevant.

From the enrollment perspective, how higher education evolves depends critically on what happens in big community colleges and in research and doctoral universities, since that’s where students are. Conversely, unless those institutions adapt to what students expect, we can expect cataclysmic change in higher education.

But community colleges and universities are at opposite ends of the cultural spectrum. To give just two examples, the former rely heavily on faculty hired ad hoc to teach specific courses, the latter rely on tenured or tenure-track faculty, and the former have no interest in research productivity or eminence whereas the latter stake their reputations on it. So the two most important sectors (in the enrollment sense) often are misaligned if not at odds about policy choices, and the changes they contemplate and implement are likely to be divergent rather than synergistic.

The other 3,000 institutions are different. Of the 4,474 institutions, almost 2/3 are private, and almost 2/3 are small, with lots of overlap: just over half of all institutions are small and private. If we focus on institutions rather than enrollment, we attend to a very large number of small, private institutions that often have different missions and challenges, typically do not work together, and, with a few exceptions, are not organized or collectively vocal.

For these institutions, the difference between survival and demise can depend on tiny changes in enrollment or financial aid or even audit policies, since their small size denies them the operating cushions and economies of scale available to their larger and public counterparts. This also means that these institutions have no excess resources to invest in innovation, so they are unlikely to adapt to changing student needs.

In a sense, focusing on enrollment tends to yield interesting and strategic (if conflicting) attention to the future, whereas focusing on institutions tends to balance (if not replace) this with a focus on tactical survival and the intricacies of current policy.

But my point isn’t about specific policy options or imperatives. Rather, it’s this: what’s important varies dramatically depending on whether we focus on institutions or students. And that, I think, not only contributes to the complexity of our conversations within the status quo of higher education, but also complicates thinking about its future.

Detailed Table

Degree-Granting Institutions by Type, Control, and Size, IPEDS 2008 Data

(Scientific) Knowledge Discovery in Open Networked Environments: Some Policy Issues

(This is an edited version of comments from “The Future of Scientific Knowledge Discovery in Open Networked Environments: A National Symposium and Workshop” held in Washington, DC, March 10-11, 2011 under the auspices of the National Academy of Sciences Board on Research Data and Information and the Computer Science and Telecommunications Board. The presentation slides can be found at http://gregj.us/oc9AtI)

First, a bit of personal history. The Coleman Report, released in 1966, was one of the first big studies of American elementary and secondary education, and especially of equality therein.

Some years after the initial report, I was working in a group that was reanalyzing data from the Coleman study. We had one of what we were told were two remaining copies of a dataset that had been collected as part of that study but had never been used. It was a 12-inch reel of half-inch magnetic tape containing the Coleman “Principals” data, which derived from a survey of principals in high schools and elementary schools.

The first challenge was to decipher the tape itself, which meant, as contemporaries among you may recall, trying every possible combination of labeling protocol, track count, and parity bit until one of them yielded data rather than gibberish. Once we did that – the tape turned out to be seven track, even parity, unlabeled – the next challenge was to make sure the codebooks (which gave the data layout – the schema, in modern terms – but not the original questions or choice sets) matched what was on the tape. By the time we did all that, we had decided that the Principals data weren’t all that relevant to our work, and so we put the analysis and the tape aside.

The 12-inch reel of tape kept moving with me, ending up in my garage. Eventually we sold our house in Arlington, Massachusetts, and moved to Lexington. I had to clear out the garage, and there seemed no more reason to keep the tape than various boxes of old Hollerith cards from my dissertation, and so off they went to a dumpster.

Unfortunately, apparently what I’d had was actually the last remaining copy of the Principals data. The other so-called “original” copy had been discarded on the assumption that our research group would keep the tape. That illustrates what can happen with the LOCKSS (lots of copies keep stuff safe) strategy for data preservation: if everybody thinks somebody else is keeping a copy, then LOCKSS does not work terribly well.

Many of us who work in the social sciences, particularly at the policy end of things, never gather our own data. The research we do is almost always based on data that were collected by someone else but typically not analyzed by them. This notion that data collectors should be separate from data analysts is actually very well established and routine in my fields of work.

The Coleman work came early in my doctoral studies. Most of my work on research projects back then (at the Huron Institute, the Center for the Study of Public Policy, and Harvard’s Administration, Planning, and Social Policy programs in education) involved secondary analysis. Later, when it came time to do my own research, I used a large secondary dataset from the National Longitudinal Study of the High School Class of 1972 (usually called NLS72 for short). This study went on for years and years and years, and is now a huge longitudinal array of data.

My research question, based on the first NLS72 followup, was whether financial aid made any difference in kids’ decisions whether to enter college. The answer is “yes”, but the more complex question is whether that effect is big enough to matter. NLS72 taught me how important the relationship was between those who gather data and those who use it, and so it’s good to be here today to reflect on what I’ve learned since then.

My current employer, EDUCAUSE, is an association of most of the higher-education institutions in the United States and a number elsewhere. Among other things, we collect data from our members on a whole raft of questions: How much do you spend on personal computers? How many helpdesk staff do you have? To whom does the CIO report? We gather all of these data, and then our members and many other folks use the Core Data Service for all sorts of purposes.

One of the things that struck us over time is that we get very few questions from people about what a data item actually means. Users apparently make their own assumptions about what each data item means, and so although they are all producing research based ostensibly on the same data from the same source, because of this interpretation problem they sometimes get very different results, even if they proceed identically from a statistical point of view.

If issues like this go unexamined, then research based on secondary, “discovered” sources can be very misleading. It’s critical, in doing such analysis, to be clear about some important attributes of data that are “discovered” rather than collected directly. I want to touch quickly on five attributes of “discovered” data that warrant attention: quality, location, format, access, and support.

Quality

The classic quality problem for secondary analysis is that people use data for a given purpose without understanding that the data collection may have been inappropriate for that purpose. There are two general issues here. One has to do with very traditional measures of data quality: whether the questions were valid and reliable, what the methodology was, and other such attributes. Since that dimension of quality is well understood, I’ll say no more about it here.

The other is something most people do not usually consider a quality issue, but any archivist would tell you is absolutely critical: you have to think about where data came from and why they were gathered – their “provenance” – because why people do things makes a difference in how they do them, and how people do things makes a difference in whether data are reusable.

One hears arguments that this is not true in the hard sciences and is completely true in the social sciences, but the reverse is equally true a great deal of the time. So the question of why someone gathered data is very important.

One key element of provenance I’ll call “primacy”, which is whether you are getting data from the people who gathered them or whether there have been intermediaries along the way. People often do not consider that. They say, “I’ve found some relevant data,” and that’s the end of it.

I was once assigned, as part of a huge Huron Institute review of federal programs for young children, to figure out what we knew about “latchkey” kids. These are kids who come home after school when their parents aren’t home, and let themselves in with a key (in the popular imagery of the time, with the key kept around their necks on a string). The question was, how many latchkey kids are there?

This was pre-Google, so I did a lot of library research and discovered that there were many studies attempting to answer this question. Curiously, though, all of them estimated about the same number of latchkey kids. I was intrigued by that, because I had done enough data work by then to find such consistency improbable.

I looked more deeply into the studies, figuring out where each researcher had gotten his or her data. It turned out that every last one of these studies traced to a single study, and that single study had been done in one town by someone who was arguing for a particular public policy and therefore was interested in showing that the number was relatively high. The original purpose of the data had been lost as they were reused by other researchers, and by the time I reviewed the finding, people thought that the latchkey-kid phenomenon was well and robustly understood based on multiple studies.

The same thing can happen with data mining. You can see multiple studies and think everyone has got separate data, but it turns out that everyone is using the same data. So provenance and primacy become very important issues.

Location

In many cases this turns into a financial issue: How do I get data from there to here? If the amount of data is small, the problem can be solved without tradeoffs. But for enormous collections of data — x-ray data from a satellite, for example, or even financial transaction data from supermarkets — how data get from there to here and where and how they are stored become policy issues, because sometimes the only way to get data from source to user and to store them is by summarizing, filtering, or otherwise “cleaning” or “compressing” the data. Large datasets gathered elsewhere often are subject to such pre-processing, especially when they involve images or substantial detection noise, and this is important for the secondary analyst to know.

Constraints on data located elsewhere arise too. There may be copyright constraints: you can use the data, but you cannot keep them, or you can use them and keep them, but you cannot publish anything unless the data collector gets to see – or, worse, approve – the results. All of these things typically have to do with where the data are located, because the conditions accompany the data from wherever they came. Unlike the original data collector, the secondary analyst has no ability to change the conditions.

Format

There are fewer and fewer libraries that actually have working lantern slide projectors. Yet there are many lantern slides that are the only records of certain events or images, and in most cases nobody has money to digitize or otherwise preserve those slides. As with lantern slides, whether data can be used depends on their format, and so format affects what data are available. There are three separate kinds of issues related to format: robustness, degradation, and description.

Robustness has to do with the chain of custody, and especially with accidental or intentional changes. Part of the reason the seven-track Coleman tapes had even parity was so that I could check each six data bits and their parity bit to make sure the number of ones was even. If it was not, something had happened to those bits in transit, and the associated byte of data could not be trusted. So one question about secondary data is: Is the format robust? Does it resist change, or at least indicate it? That is, does the data format stand up to the vagaries of time and of technology?
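
To make the parity idea concrete, here is a minimal sketch in Python (the frame values are made up, and real seven-track drives did this check in hardware):

    # Even-parity check for seven-track tape frames: six data bits plus one
    # parity bit, chosen so that the total number of ones is even.
    # The frames below are hypothetical examples, not real Coleman data.

    def parity_bit(data_bits: int) -> int:
        """Parity bit that makes the count of ones even (data_bits fits in 6 bits)."""
        return bin(data_bits).count("1") % 2

    def frame_is_valid(data_bits: int, stored_parity: int) -> bool:
        """Trust a frame only if data bits plus parity bit contain an even number of ones."""
        return (bin(data_bits).count("1") + stored_parity) % 2 == 0

    frames = [(0b101101, 0), (0b000111, 1), (0b110000, 1)]  # (data, parity) pairs
    for data, parity in frames:
        status = "ok" if frame_is_valid(data, parity) else "changed in transit"
        print(f"{data:06b} parity={parity}: {status}")

The check cannot say which bit changed, only that something did – which is exactly the “indicate change” property robustness requires.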

Degradation has to do with losing data over time, which happens to all data sets regardless of format. Error-correction mechanisms can sometimes restore data that have been lost, especially if multiple copies of the data exist, but none of those mechanisms is perfect. It’s important to know how data might have degraded, and especially what measures have been employed to combat or “reverse” degradation.

Finally, most data are useless without a description of the items in the dataset: not just how the data are recorded on the medium or what the database schema is, but also how the items were measured, what the different values mean, what was done when there were missing data, whether any data correction was done, and so on.

So, as a matter of policy, the “codebook” becomes an important piece of format. Sometimes codebooks come with data, but quite often one gets to the codebook by a path that is different from the one leading to the data, or has to infer what the codebook was by reading someone’s prior research. Both are dangerous. That’s what we had to do with the Coleman Principals data, for example, because all we had were summary tables. We had to deduce what the questions were and which questions had which values. It’s probably just as well we never used the data for analysis.

Access

Here two policy issues arise.

The first access issue is promotion: researchers trying to market their data. The risks in that should be obvious, and it’s important that secondary analysts seek out the best data sources rather than let promoted data sources find them.

As an example, as I was preparing this presentation a furor erupted over a public-radio fundraiser who had been recorded apparently offering to tilt coverage in return for sponsorship. That’s not the data issue – the data issue was the flood of “experts” offering themselves as commentators as the media frenzy erupted. Experts were available, apparently and ironically, to document pretty much any perspective on the relationship between media funding and reporting. The experts the Chronicle of Philanthropy quoted were very different from the ones Forbes quoted.

The second issue is restriction. Some data are sensitive, and people may not want them to be seen. There are regulations and standard practices to handle this, but sometimes people go further and attempt to censor specific data values rather than restrict access. The most frequent problem is the desire on the part of data collectors to control analysis and/or publications based on their data.

Most cases, of course, lie somewhere between promotion and censorship. The key policy point is that all data flow through a process in which there may be some degree of promotion or censorship, and secondary analysts ignore that at their peril.

Support

This has become a big issue for institutions. Suppose a researcher on Campus A uses data that originated with a researcher at Campus B. A whole set of issues arises. Some of them are technical issues. Some of them are coding issues. Many of these I’ve already mentioned under Location and Format above.

A wonderful New Yorker cartoon (http://gregj.us/pFj6EN) captures the issue perfectly: a car is driving in circles in front of a garage, and one mechanic says to another, “At what point does this become our problem?”

Whatever the issues are, whom does the researcher at A approach for help? For substantive questions, the answer is often doctoral students at A, but a better answer might come from the researcher at B. For technical things, A’s central IT organization might be the better source of help, but some technical questions can only be solved with guidance from the originator at B. Is support for secondary analysis an institutional role, or the researcher’s responsibility? That is, do all costs of research flow to the principal investigator, or are they part of central research infrastructure? In either case, does the responsibility for support lie with the originator – B – or with the secondary researcher? These questions often get answered financially rather than substantively, to the detriment of data quality.

When data collection carries a requirement that access to the data be preserved for some period beyond the research project, a second support question arises. I spoke with someone earlier at this meeting, for example, about the problem of faculty moving from institution to institution. Suppose that a faculty member comes to campus C and gets an NSF grant. The grant is to the institution. The researcher gathers some data, does his or her own analysis, publishes, and becomes famous. Fame has its rewards: Campus D makes an offer too good to refuse, off the faculty member goes, and now Campus C is holding the bag for providing access to the data to another researcher, at Campus E. The original principal investigator is gone, and NSF probably has redirected the funds to Campus D, so C is now paying the costs of serving E’s researcher out of C’s own funds. There’s no good answer to this one, and most of the regulations that cause the problem pretend it doesn’t exist.

A Caution about Cautions

Let me conclude by citing a favorite Frazz comic, which I don’t have permission to reproduce here – Frazz is a great strip, a worthy successor to the legendary Calvin and Hobbes. Frazz (he’s a renaissance man, an avid runner who works as a school janitor) starts reading directions: “Do not use in the bathtub”. Caulfield (a student, Frazz’s protégé, quite possibly Calvin grown up a bit) reads on: “Nor while operating a motor vehicle.” They continue reading: “And not to be used near a fire extinguisher, not recommended for unclogging plumbing, and you do not stick it in your ear and turn it on.”

Finally Caulfield says, “Okay, I think we are good,” and he puts on his helmet. Then the principal, watching the kid, who is wearing skates and about to try rocketing himself down an iced-over sidewalk by pointing a leaf blower backwards and turning it on, says “This cannot be a good idea.” To which Frazz replies, “When the warnings are that comprehensive, the implication is that they are complete.”

If there is a warning about policy advice, it is that the list I just gave cannot possibly be complete. The kinds of things we have talked about today require constant thought!

IT and Post-Institutional Higher Education: Will We Still Need Brad When He’s 54?

“There are two possible solutions,” Hercule Poirot says to the assembled suspects in Murder on the Orient Express (that’s p. 304 in the Kindle edition, but the 1974 movie starring Albert Finney is way better than the book, and it and the book are both much better than the abominable 2011 PBS version with David Suchet). “I shall put them both before you,” Poirot continues, “…to judge which solution is the right one.”

So it is for the future role, organization, and leadership of higher-education IT. There are two possible solutions. There’s a reasonably straightforward projection of how the role of IT in higher education will evolve into the mid-range future, but there’s also a more complicated one. The first assumes institutional continuity and evolutionary change. The second doesn’t.

IT Domains

How does IT serve higher education? Let me count the ways:

  1. Infrastructure for the transfer and storage of pedagogical, bibliographic, research, operational, and administrative information, in close synergy with other physical infrastructure such as plumbing, wiring, buildings, sensors, controls, roads, and vehicles. This includes not only hardware such as processors, storage, networking, and end-user devices, but also basic functionality such as database management and hosting (or virtualizing) servers.
  2. Administrative systems that manage, analyze, and display the information students, faculty, and staff need to manage their own work and that of their departments. This includes identity management, authentication, and other so-called “middleware” through which institutions define their communities.
  3. Pedagogical applications students and faculty need to enable teaching and learning, including tools for data analysis, bibliography, simulation, writing, multimedia, presentations, discussion, and guidance.
  4. Research tools faculty and students need to advance knowledge, including some tools that also serve pedagogy plus a broad array of devices and systems to measure, gather, simulate, manage, share, distill, analyze, display, and otherwise bring data to bear on scholarly questions.
  5. Community services to support interaction and collaboration, including systems for messaging, collaboration, broadcasting, and socialization both within campuses and across their boundaries.

“…A Suit of Wagon Lit Uniform…and a Pass Key…”

The straightforward projection, analogous to Poirot’s simpler solution (an unknown stranger committed the crime and escaped undetected), stems from projections of how institutions themselves might address each of the IT domains as new services and devices become available, especially cloud-based services and consumer end-user devices. The core assumptions are that the important loci of decisions are intra-institutional, and that institutions make their own choices to maximize local benefit (or, in the economic terms I mentioned in an earlier post, to maximize their individual utility).

Most current thinking in this vein goes something like this:

  • We will outsource generic services, platforms, and storage, and perhaps
  • consolidate and standardize support for core applications and
  • leave users on their own insofar as commercial devices such as phones and tablets are concerned, but
  • we must for the foreseeable future continue to have administrative systems securely dedicated and configured for our unique institutional needs, and similarly
  • we must maintain control over our pedagogical applications and research tools since they help distinguish us from the competition.

Evolution based on this thinking entails dramatic shrinkage in data-center facilities, as virtualized servers housed in or provided by commercial or collective entities replace campus-based hosting of major systems. It entails several key administrative and community-service systems being replaced by standard commercial offerings — for example, the replacement of expense-reimbursement systems by commercial products such as Concur, of dedicated payroll systems by commercial services such as ADP, and of campus messaging, calendaring, and even document-management systems by more general services such as Google’s or Microsoft’s. Finally, thinking like this typically drives consolidation and standardization of user support, bringing departmental support entities into alignment if not under the authority of central IT, and standardizing requirements and services to reduce response times and staff costs.

How might higher-education IT evolve if this is how things go? In particular, what effects would it have on IT organization and leadership?

One clear consequence of such straightforward evolution is a continuing need for central guidance and management across essentially the current array of IT domains. As I tried to suggest in a recent article, the nature of that guidance and management would change, in that control would give way to collaboration and influence. But institutions would retain responsibility for IT functions, and it would remain important for major systems to be managed or procured centrally for the general good. Although the skills required of the “chief information officer” would be different, CIOs would still be necessary, and most cross-institutional efforts would be mediated through them. Many of those efforts would involve coordinated action of various kinds, ranging from similar approaches to vendors through collective procurement to joint development.

We’d still need Brads.

“Say What You Like, Trial by Jury is a Sound System…”

If we think about the future unconventionally (as Poirot does in his second solution — spoiler in the last section below!), a somewhat more radical, extra-institutional projection emerges. What if Accenture, McKinsey, and Bain are right, and IT contributes very little to the distinctiveness of institutions — in which case colleges and universities have no business doing IT idiosyncratically or even individually?

In that case,

  • we will outsource almost all IT infrastructure, applications, services, and support, either to collective enterprises or to commercial providers, and therefore
  • we will not need data centers or staff, including server administrators, programmers, and administrative-systems technical staff, so that
  • the role of institutional IT will be largely to provide highly tailored support for research and instruction, which means that
  • in most cases there will be little to be gained from centralizing IT,
  • it will make sense for academic departments to do their own IT, and
  • we can rely on individual business units to negotiate appropriate administrative systems and services, and so
  • the balance will shift from centralized to decentralized IT organization and staffing.

What if we’re right that mobility, broadband, cloud services, and distance learning are maturing to the point where they can transform education, so that we have simultaneous and similarly radical change on the academic front?

Despite changes in technology and economics, and some organizational evolution, higher education remains largely hierarchical. Vertically-organized colleges and universities grant degrees based on curricula largely determined internally, curricula largely comprise courses offered by the institution, institutions hire their own faculty to teach their own courses, and students enroll as degree candidates in a particular institution to take the courses that institution offers and thereby earn degrees. As Jim March used to point out, higher education today (well, okay, twenty years ago, when I worked with him at Stanford) is pretty similar to its origins: groups sitting around on rocks talking about books they’ve read.

It’s never been that simple, of course. Most students take some of their coursework from other institutions, some transfer from one to another, and since the 1960s there have been examples of network-based teaching. But the model has been remarkably robust across time and borders. It depends critically on the metaphor of the “campus”, the idea that students will be in one place for their studies.

Mobility, broadband, and the cloud redefine “campus” in ways that call the entire model into question, and thereby may transform higher education. A series of challenges lies ahead on this path. If we tackle and overcome these challenges, higher education, perhaps even including its role in research, could change in very fundamental ways.

The first challenge, which is already being widely addressed in colleges, universities, and other entities, is distance education: how to deliver instruction and promote learning effectively at a distance. Some efforts to address this challenge involve extrapolating from current models (many community colleges, “laptop colleges”, and for-profit institutions are examples of this), some involve recycling existing materials (Open CourseWare, and to a large extent the Khan Academy), and some involve experimenting with radically different approaches such as game-based simulation. There has already been considerable success with effective distance education, and more seems likely in the near future.

As it becomes feasible to teach and learn at a distance, so that students can be “located” on several “campuses” at once, students will have no reason to take all their coursework from a single institution. A question arises: If coursework comes from different “campuses”, who defines curriculum? Standardizing curriculum, as is already done in some professional graduate programs, is one way to address this problem — that is, we may define curriculum extra-institutionally, “above the campus”. Such standardization requires cross-institutional collaboration, oversight from professional associations or guilds, and/or government regulation. None of this works very well today, in part because such standardization threatens institutional autonomy and distinctiveness. But effective distance teaching and learning may impel change.

As courses relate to curricula without depending on a particular institution, it becomes possible to imagine divorcing the offering of courses from the awarding of degrees. In this radical, no-longer-vertical future, some institutions might simply sell instruction and other learning resources, while others might concentrate on admitting students to candidacy, vetting their choices of and progress through coursework offered by other institutions, and awarding degrees. (Of course, some might try to continue both instructing and certifying.) To manage all this, it will clearly be necessary to gather, hold, and appraise student records in some shared or central fashion.

To the extent this projection is valid, not only does the role of IT within institutions change, but the very role of institutions in higher education changes. It remains important that local support be available for the IT components of distinctive coursework, and of course for research, but almost everything else — administrative and community services, infrastructure, general support — becomes either so standardized and/or outsourced as to require no institutional support, or becomes an activity for higher education generally rather than for colleges or universities individually. In the extreme case, the typical institution really doesn’t need a central IT organization.

In this scenario, individual colleges and universities don’t need Brads.

“…What Should We Tell the Yugo-Slavian Police?”

Poirot’s second solution to the Ratchett murder (everyone including the butler did it) requires astonishing and improbable synchronicity among a large number of widely dispersed individuals. That’s fine for a mystery novel, but rarely works out in real life.

I therefore don’t suggest that the radical scenario I sketched above will come to pass. As many scholars of higher education have pointed out, colleges and universities are organized and designed to resist change. So long as society entrusts higher education to colleges and universities and other entities like them, we are likely to see evolutionary rather than radical change. So my extreme scenario, perhaps absurd on its face, seeks only to suggest that we would do well to think beyond institutional boundaries as we promote IT in higher education and consider its transformative potential.

And more: if we’re serious about the potentially transformative role of mobility, broadband, and the cloud in higher education, we need to consider not only what IT might change but also what effects that change will have on IT itself — and especially on its role within colleges and universities and across higher education.

Individual Utility, Joint Action, and The Prisoner’s Dilemma

Back in 1977, Ken Arrow, having won the Nobel Prize five years earlier, wondered about the internal functioning of firms. “To what extent is it necessary for the efficiency of a corporation,” he wrote, “that its decisions be made at a high level where a wide degree of information is, or can be made, available? How much, on the other hand, is gained by leaving a great deal of latitude to individual departments which are closer to the situations with which they deal, even though there may be some loss due to imperfect coordination?” The answer depends somewhat on whether the firm has one goal or several, on the correlation among multiple goals, and on the degree to which different departments contribute to different goals.

In general, though, the answer is sobering for advocates of decentralization. The severally optimal choices of departments rarely combine to yield the jointly optimal choice for the overall enterprise. That’s not to say that centralization is wrong, of course. It merely means that one must balance the healthy and interesting diversity that results from decentralization against the overall inefficiency it can cause.

If we shift focus from the firm to enterprises within an economic sector, the same observations hold. To the extent enterprises pursue diverse goals primarily for their own benefit rather than for the efficiency of the entire sector, that sector will be both diverse and inefficient — perhaps to the extremes of idiosyncrasy and counterproductivity. Put differently, if the actors within a sector value individuality, they will sacrifice sector-wide efficiency; if they value sector-wide efficiency, they must sacrifice individuality.

Higher education traditionally has placed a high value on institutional individuality. Some years back a Harvard faculty colleague of mine, Harold “Doc” Howe II (who had been US Commissioner of Education under Lyndon Johnson), observed how peculiar it was that mergers and acquisitions were so rarely contemplated, let alone achieved, in higher education, even though by any rational analysis there were myriad opportunities for interesting, effective mergers. (Does the United States really need almost 4,000 nonprofit, degree-granting postsecondary institutions, not to mention 14,000 public school districts?) Among research universities, for example, Case Western Reserve University and Carnegie-Mellon University were two of the few successful mergers, there were some instances of acquisitions and subordinations (I’m not counting Brown/Pembroke, Columbia/Barnard, Tufts/Jackson, or their kin), and there were several prominent failures — for example, the failed attempts to merge the Cambridge anchors Harvard and MIT. (Wikipedia’s page on college mergers lists fewer than 100 mergers of any kind.)

If higher education isn’t going to gain efficiency through institutional aggregation, then its only option is to do so through institutional collaboration. There are lots of good examples where this has happened: I’d include athletic leagues, part of whose purpose is to negotiate effectively with networks; library collaborations, such as OCLC, that seek to reduce redundant effort; research collaborations, such as Fermilab, through which institutions share expensive facilities; and IT collaborations, such as Internet2.

That last is a bit different from the others, in that it involves a group of institutions joining forces to buy services together. Why is joint procurement like that so rare in US higher education? I think there are two tightly connected reasons:

  • US higher education has valued institutional individuality far more highly than collective efficiency — that is, it assigns less importance to collective utility (that’s a microeconomics term for the value an actor expects) than to individual utility.
  • At the same time, it has failed to make the critical distinction between what Ryan Oakes, of Accenture‘s higher-education practice, recently called “differentiating” activities (those on which institutions reasonably compete) and generic “non-differentiating” activities (those where differences among peers are irrelevant to success). As a result, institutions have behaved competitively in all but a few contexts, even in those non-differentiating areas where collaboration is the right answer.

Although it’s a bit of a caricature, the situation somewhat resembles the scenario for the Rand Corporation‘s 1950s-era game-theory test, The Prisoner’s Dilemma. Here’s a version from Wikipedia:

Two suspects are arrested by the police. The police have insufficient evidence for a conviction, and, having separated the prisoners, visit each of them to offer the same deal. If one testifies for the prosecution against the other (defects) and the other remains silent (cooperates), the defector goes free and the silent accomplice receives the full one-year sentence. If both remain silent, both prisoners are sentenced to only one month in jail for a minor charge. If each betrays the other, each receives a three-month sentence. Each prisoner must choose to betray the other or to remain silent. Each one is assured that the other would not know about the betrayal before the end of the investigation. How should the prisoners act?

The dilemma is this:

  • The optimal individual choice for each prisoner is to rat out the other — that is, to “defect” — since this guarantees him or her a sentence of no more than three months, with a shot at freedom if the other prisoner remains silent. Individuals seeking to maximize their own success (to make a “utility-maximizing rational choice”, in microeconomic terms) thus choose to defect. In decision-analytic terms, since prisoner A has no idea what prisoner B will do, A assigns a probability of .5 to each possible choice B might make. A multiplies those probabilities by the consequences to obtain the expected values of his or her two options: (3)(.5)+(0)(.5) = 1.5 months for defecting, and (12)(.5)+(1)(.5) = 6.5 months for cooperating. A chooses to defect. B does the same calculation, and also chooses to defect. Since both choose to defect, each gets a three-month sentence, and they serve a total of six months in jail.
  • The optimal choice for the two prisoners together, as measured by the total of their two sentences, is for both to remain silent, that is, to cooperate. This yields a sentence of one month for each prisoner, or two months in total. In contrast, defect/cooperate and cooperate/defect each yield twelve months (one year for one prisoner, freedom for the other) and defect/defect yields six months (three months for each). So the best joint choice is for A and B both to remain silent.

So each prisoner acting in his or her own self interest yields more individual and total prison time than each acting for their joint good — each would serve three months rather than one. But since A cannot know that B will cooperate and vice versa, each of them chooses self interest, and both end up worse off.
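
For those who like to see the arithmetic laid out, here is a minimal sketch in Python of the calculation described above (sentences in months, taken from the Wikipedia version of the dilemma quoted earlier):

    # Payoff matrix for the prisoner's dilemma described above, in months of jail.
    # Each entry maps (A's choice, B's choice) to (A's sentence, B's sentence).
    payoffs = {
        ("defect", "defect"):       (3, 3),
        ("defect", "cooperate"):    (0, 12),
        ("cooperate", "defect"):    (12, 0),
        ("cooperate", "cooperate"): (1, 1),
    }

    def expected_sentence(a_choice: str) -> float:
        """A's expected sentence when A assigns probability 0.5 to each of B's choices."""
        return sum(0.5 * payoffs[(a_choice, b)][0] for b in ("defect", "cooperate"))

    print(expected_sentence("defect"))     # 1.5 months: the individually "rational" choice
    print(expected_sentence("cooperate"))  # 6.5 months

    # The jointly best outcome minimizes the total of the two sentences.
    best = min(payoffs, key=lambda choices: sum(payoffs[choices]))
    print(best, sum(payoffs[best]))        # ('cooperate', 'cooperate') 2

The two expected-value lines reproduce the 1.5-month and 6.5-month figures above, and the last line confirms that mutual silence minimizes total jail time.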

The situation isn’t quite the same for several colleges that might negotiate together for a good deal from a vendor, mostly because no one will get anything for free. But a problem like the prisoner’s dilemma arises when one or more members of the group conclude that they can get a better deal from the vendor by themselves than what they think the group would obtain. If those members try to cut side deals, the incentive for the vendor to deal with the other members shrinks, especially if the defecting members’ deals consume a substantial fraction of the vendor’s price flexibility. The vendor prefers doing a couple of side deals to the overall deal so long as the side deals require less total discount than the group deal would. Members have every incentive to cut side deals, vendors prefer a small number of side deals to a blanket deal, and so unless all the colleges behave altruistically a joint deal is unlikely.

And so the $64 question: What would break this cycle? The answer is simple: sharing information, and committing to joint action. If the prisoners could communicate before deciding whether to defect or cooperate, their rational choice would be to cooperate. If colleges shared information about their plans and their deals, the likelihood of effective joint action would increase sharply. That would be good for the colleges and not so good for the vendor. From this perspective, it’s clear why non-disclosure clauses are so common in vendor contracts.

In the end, the only path to effective joint action is a priori collaboration — that is, agreeing to pool resources, including clout and information, and work together for the common good. So long as colleges and universities hold back from collaboration (for example, saying, as about 15% of respondents did in a recent EDUCAUSE survey, that their institutions would wait to see what others achieved before committing to collaboration), successful joint action will remain difficult.

Working Together Online: Are We There Yet?

Most of you reading this are probably too young to have seen Seven Days in May, in which Burt Lancaster, as General James Mattoon Scott, helps break up a military coup attempt within the United States. Good conspiracy story, so-so flick, old helicopters, and it harps on a weird bit of technology: every time Scott talks with his counterparts, they fire up big, old, black-and-white console TVs in their offices so they can see each other’s heads as they talk.

Let’s grant that seeing and talking is better than just talking. But the technology to do that can be confining, in that it requires much more equipment, networking, configuration, and support than a simple phone call. The difference is even greater if the phone call is standard but the video connection isn’t. The striking thing about the use of videoconferencing in Seven Days in May is that it appears to add almost no value to communication while making it much more complicated and constraining. (If the movie had been made a year later, it might have used Picturephones instead of TVs, but they weren’t much better.)

My question for today is how the balance between value and cost works out for collaboration and communication technologies we commonly consider today, putting aside one-to-one interactions as a separate case that’s pretty well understood. It seems to me that six mechanisms dominate when we work together at a distance:

  1. voice calls, perhaps augmented with material viewable online,
  2. text-based sharing and collaboration (for example,  listservs and wikis),
  3. personal video “calls” (for example, using Skype, Google Video Chat, Oovoo, Vidyo, and similar services),
  4. online presentations combined with text-based chat-like response (for example, Adobe Connect or Cisco WebEx with broadcast audio),
  5. voice calls combined with multi-user tools for synchronous editing or whiteboarding (for example, using Google Docs, Office Live, a wiki, or a group whiteboarding tool during a conference call), and
  6. large-screen video facilities (such as Cisco’s or Polycom’s hardware and services installed in well-designed facilities).

This is by no means a complete set of mechanisms. But I think it spans the space and helps distinguish where we have good value propositions and where we have work to do. (In addition to one-to-one mechanisms, I’ve ignored one-way webcasts with no audience-response mechanisms, since I think we understand those  pretty well too.)

Different mechanisms entail different costs (meaning both expense and difficulty). Their effectiveness varies depending on how they’re used. Their uses vary along two dimensions: the purpose of the interaction (presentation, communication, or collaboration) and the situation (one-to-many, few-to-few, or many-to-many — with the boundary between “few” and “many” being the familiar 7±2).

I think the nine logical combinations reduce to five interesting use cases: one-to-many presentations, few-to-few communication or collaboration, and many-to-many communication or collaboration.

I’ve found it useful to collapse all that into a matrix, with rows corresponding to the five use cases, columns corresponding to the six mechanisms, and ratings in each cell of effectiveness and cost. Here’s how my matrix turns out, coding “costs” as $ to $$$ and “effectiveness” as A=Excellent down through Good and Meh to D=Poor:



If my ratings are reasonable, what does this tell us? Let’s explore that by working through the columns.

  • Voice calls, as we know all too well, are inexpensive, but they don’t work well for large groups trying to communicate or collaborate. They work somewhat for one-to-many presentations, but their most effective use is for communications among small groups.
  • Text sharing by itself never rises above Good, but it’s better than simple voice calls for collaboration and for many-to-many communication. It’s very hard to have a conversation among more than about seven people on the phone, but it’s quite possible for much larger numbers to hold text-based online “conversations”.
  • Personal video, whose costs are higher than voice calls and text sharing (because it typically requires cameras, better networking, a certain amount of extra setup, and sometimes a service subscription), doesn’t work very well beyond few-to-few situations. It’s better than phone calls for few-to-few collaboration, I think, because being able to see faces (which is pretty much all one can see) seems to help groups reach consensus more easily. Although it costs more than voice calls, in my experience it adds little value to presentations or few-to-few communications.
  • Presentation technologies that combine display and audio broadcast with some kind of text-response capability are very widely used. In my view, they’re the best technology for one-to-many presentations. They’re less useful for few-to-few interactions, largely because in those situations voice interactions are feasible and are much richer than the typical chat box. I rate their usefulness for many-to-many collaboration similarly, but rate it lower than text sharing for this use because the typical chat mechanisms within these technologies cope poorly with large volumes of comments from lots of participants. Text-sharing mechanisms, which usually have search, threading, and archiving capabilities, cope much better with volume.
  • Voice calling combined with synchronously editable documents or whiteboards is turning out to be very useful, I think, in that it combines the richness of conversation with the visual coherence of a document or whiteboard. This makes it especially effective for few-to-few situations, and even for one-to-many presentations — although it can’t cope if too many people try to modify a document or whiteboard at the same time (in that case, more structured technologies like IdeaSpace are useful, albeit much less flexible).
  • Finally, although I’ve spent a great deal of time in online presentation, communication, and collaboration using specialized videoconferencing facilities, I’ve come to believe that they are most effective only for few-to-few communications. They’re reasonable for few-to-few collaboration, but this use case usually produces some push-and-pull between looking at other participants and working together on documents or whiteboards. They’re not very effective for presentations or many-to-many interactions because except in rare cases there are capacity limitations (although interesting counterexamples are emerging, such as some classrooms at Duke’s Fuqua business school).

What might we infer from all this?

  • First, it’s striking that some simple, inexpensive technologies remain the best or most cost-effective way to handle certain use cases. Although it’s unsurprising, this is sometimes hard to remember amidst the pressure to keep up with the technologically advanced Joneses.
  • Second, it’s been interesting to see unexpected combinations of technologies such as jointly editing documents during conference calls become popular and effective even in the absence of marketing — I’ve never seen a formal ad or recommendation for that, even as its use proliferates.
  • Third, and as unsurprising as the first two, it’s clear that good solutions to the many-to-many challenge remain elusive. Phone calls, personal video, and video facilities all fail in this situation, regardless of purpose. Hybrid and text-based tools don’t do much better. If one wants a large group to communicate effectively within itself other than by one-to-many presentations, there’s no good way to achieve that technologically. As our organizations become ever more distributed geographically and travel becomes ever more difficult and expensive, the need for many-to-many technologies is going to increase.

Of course the technologies I’ve chosen may not be the right set, there may be other important use cases, and my ratings may not be accurate. But going through the classification and rating exercise helped clarify some concerns I’d been unable to frame. I encourage others to explore their views in similar ways, and perhaps we’ll learn something by comparing notes.


GoTo, Gas Pedals, & Google: What Students Should Know, and Why That’s Not What We Teach Them

In the 1980s I began teaching a course in BASIC programming in the Harvard University Extension, part of an evening Certificate of Advanced Study program for working students trying to get ahead. Much to my surprise, students immediately filled the small assigned lecture hall to overflowing, and nearly overwhelmed my lone teaching assistant.

Within two years, the course had grown to 250+ students. They spread throughout the second-largest room in the Harvard Science Center (Lecture Hall C – the one with the orange seats, for those of you who have been there). I now had a dozen TAs, so I was in effect not only teaching the BASIC course, but also leading a seminar on the pedagogical challenge of teaching non-technical students how to write structured programs in a language that heretically allowed “GoTo” statements.

Computer Literacy?

There’s nothing very interesting or exciting about learning to program in BASIC. Although I flatter myself a good teacher, even my best efforts to render the material engaging – for example, assignments that variously involved having students act out various roles in Stuart Madnick’s deceptively simple Little Man Computer system, automating Shirley Ellis‘s song The Name Game, and modeling a defined-benefit pension system – in no way explained the course’s popularity.

So what was going on? I asked students why they were taking my course. Most often, they said something about “computer literacy”. That’s a useful (if linguistically confused) term, but in this case a misleading one.

If the computer becomes important, the analogy seems to run, then the ability to use a computer becomes important, much as the spread of printed material made reading and writing important. So far so good. For the typical 1980s employee, however, using computers in business and education centered on applications like word processors, spreadsheets, charts, and maybe statistical packages. Except for those within the computer industry, it rarely involved writing code in high-level languages.

BASIC programming thus had little direct relevance to the “computer literacy” students actually needed. The print era made reading and writing important  for the average worker and citizen. But only printers needed adeptness with the technologies of paper, ink, composition (in the Linotype sense), and presses. That’s why the analogy fails: programming, by the 1980s, was about making computer applications, not using them. That’s the opposite of what students actually needed.

Yet clearly students viewed the ability to program in BASIC – even “Shirley Shirley bo-birley…” – as somehow relevant to the evolving challenges of their jobs. If BASIC programming wasn’t directly relevant to actual computer literacy, why did they believe this? Two explanations of its indirect importance suggest themselves:

  • Perhaps ability to program was an accessible indicator of more relevant yet harder-to-measure competence. Employers might have been using programming ability, however irrelevant directly, as a shortcut measure to evaluate and sort job applicants or promotion candidates. (This is essentially a recasting of Lester Thurow‘s “job queues” theory about the relationship between educational attainment and hiring, namely that educational attainment signals the ability to learn quickly rather than provides direct training.) Applicants or employees who believed this was happening would thus perceive programming ability as a way to make themselves appear attractive, even though the skill was actually irrelevant.
  • Perhaps students learned to program simply to gain confidence that they could cope with the computer age.

I propose a third explanation:

  • As technology evolves, generations that experience the evolution tend to believe it important for the next generation to understand what came before, and advise students accordingly.

That is, we who experience technological change believe that competence with current technology benefits from understanding prior technology – a technological variant of George Santayana’s aphorism “Those who cannot remember the past are condemned to repeat it” – and send myriad direct and indirect messages to our successors and students that without historical understanding one cannot be fully competent.

Shifting Gears

My father taught me to drive on the family’s 1955 Chevy station wagon, a six-cylinder car with a three-speed, non-synchromesh, stalk-mounted-shifter manual transmission and power nothing. After a few rough sessions learning to get the car moving without bucking and stalling, to turn and shift at the same time, and to double-clutch and downshift while going downhill, I became a pretty good driver.

But my father, who had learned to drive on a Model T Ford with a planetary transmission and separate throttle and spark-advance controls, remained skeptical of my ability. He was always convinced that since I didn’t understand that latter distinction, I really wasn’t operating the car as well as I might. (Today’s “accelerator”, if I understand it correctly, combines the two functions: it tells the engine to spin faster, which is what the spark-advance lever did, and then feeds it the necessary fuel mixture, which was the throttle’s function.)

Years later it came time for our son’s first driving lesson. We were in our automatic-transmission Toyota Camry, equipped with power steering and brakes, on a not-yet-opened Cape Cod subdivision’s newly paved streets. Apparently forgetting how irrelevant the throttle/spark distinction had been to my learning to drive, I delivered a lecture on what was going on in the automatic transmission – why it didn’t need a clutch, how it was deciding when to shift gears, and so forth. Our son listened patiently, and then rapidly learned to drive the Camry very well without any regard to what I’d explained. My lecture had absolutely no effect on his competence (at least not until several years later, I like to believe, when he taught himself to drive a friend’s four-on-the-floor VW).

Technological Instruction

Which brings me to the present, and the challenge of preparing today’s students for tomorrow’s technological workplaces. What should our advice to them be, either explicitly – in the form of direct instruction or requirements – or implicitly, in the form of contextual guidance such as the guidance that induced so many students to take my BASIC course? In particular, how can we break away from the generational tendency to emphasize how we got here rather than where we’re going?

I don’t propose to answer that question fully here, but rather to sketch, through two examples, how a future-oriented perspective might differ from a generational one. The first example is cloud services, and the second is online information.

Cloud Services

I started writing this essay on my DC office computer. I’m typing these words on an old laptop I keep in my DC apartment, and I’ll probably finish it on my traveling computer or perhaps on my Chicago home computer. A big problem ensues: How do I keep these various copies synchronized? My answer is a service called Dropbox, which copies documents I save to its central servers and then disseminates them automatically to all my other computers and even my phone. What I need is to have the same document up to date wherever I’m working. Dropbox achieves this by synchronizing multiple copies of the same documents across multiple computers and other devices.

Alternatively, I might have gotten what I need – having the same document up to date wherever I’m working – by drafting this post as a Google or Live document. Rather than synchronizing local copies among my computers, I’d have been editing the same remote document from each of them.
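
To make the distinction concrete, here is a minimal, purely illustrative sketch in Python of the “synchronized copies” model: each machine compares its local files against a shared store and copies whatever has changed, so every machine ends up holding its own identical copy. (The paths and the idea of a local “store” directory are hypothetical; Dropbox obviously does far more.) The remote-document model needs no such step, because there is only one copy, edited in place on a server.

    # Toy two-way synchronization between a local folder and a shared store.
    # Paths are hypothetical; a real service adds conflict handling, deletions,
    # incremental transfer, and much more.
    import hashlib
    import shutil
    from pathlib import Path

    def digest(path: Path) -> str:
        return hashlib.sha256(path.read_bytes()).hexdigest()

    def sync(local_dir: Path, store_dir: Path) -> None:
        """Push locally changed files to the store; pull files the store has that we lack."""
        store_dir.mkdir(parents=True, exist_ok=True)
        local_dir.mkdir(parents=True, exist_ok=True)
        for f in local_dir.iterdir():
            if f.is_file():
                target = store_dir / f.name
                if not target.exists() or digest(f) != digest(target):
                    shutil.copy2(f, target)              # push a local change
        for f in store_dir.iterdir():
            if f.is_file() and not (local_dir / f.name).exists():
                shutil.copy2(f, local_dir / f.name)      # pull a missing file

    sync(Path("drafts"), Path("shared-store"))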

My instincts are that this difference between synchronized and remote documents is important, something that I, as an educator, should be sure the next generation understands. When my son asks about how to work across different machines, my inclination is to explain the difference between the options, how one is giving way to the other, and so forth. Is that valid, or is this the same generational fallacy that led my father to explain throttles and spark advance or me to explain clutches and shifting?

Online Information

When I came to the history quote above, I couldn’t remember its precise wording or who wrote it. That’s what the Internet is for, right? Finding information?

I typed “those who ignore the past are doomed”, which was the partial phrase I remembered, into Google’s search box. Among the first page of hits, the first time I tried this, were links to answers.com, wikiquote.org, answers.google.com, wikipedia.org, and www.nku.edu. The first four of those pointed me to the correct quote, usually giving the specific source including edition and page. The last, from a departmental web page at Northern Kentucky University, blithely repeated the incorrect quote (but at least ascribed it to Santayana). One of the sources (answers.com) pointed to an earlier, similar quote from Edmund Burke. The Wikipedia entry reminded me that the quote is often incorrectly ascribed to Plato.

I then typed the same search into Bing’s search box. Many links on its first page of results were the same as Google’s — answers.com and wikiquotes — but there were more links to political comments (most of them embodying incorrect variations on the quote), and one link to a conspiracy-theorist page linking the Santayana quote to George Orwell’s “He who controls the present, controls the past. He who controls the past, controls the future”.

It wasn’t hard for me to figure out which search results to heed and which to ignore. The ability to screen search results and then either to choose which to trust or to refine the search is central to success in today’s networked world. What’s the best way to inculcate that skill in those who will need it?

I’ve been working in IT since before the Digital Equipment Corporation‘s Altavista, in its original incarnation, became one of the first widely used Web search engines. The methods different search services use to locate and rank information have always been especially interesting. The early Altavista ranked pages based on how many times search words appeared in them – a method so obviously manipulable (especially by sneaking keywords into non-displayed parts of Web pages) that it rapidly gave way to more robust approaches. The links one gets from Google or Bing today come partly from some very sophisticated ranking said to be based partly on user behavior (such as whether a search seems to have succeeded) and partly on links among sites (this was Google’s original innovation, called PageRank) – but also, quite openly and separately, from advertisers paying to have their sites displayed when users search for particular terms.
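
At the risk of committing exactly the generational fallacy this essay worries about, here is a minimal sketch in Python of the link-analysis idea behind PageRank: each page repeatedly passes its score along its outbound links until the scores settle. The four-page “web” and the damping factor are illustrative only; real rankings blend this with many other signals.

    # Toy power iteration over a made-up four-page web, in the spirit of PageRank.
    links = {
        "A": ["B", "C"],
        "B": ["C"],
        "C": ["A"],
        "D": ["C"],
    }
    pages = list(links)
    damping = 0.85
    rank = {p: 1.0 / len(pages) for p in pages}

    for _ in range(50):  # iterate until the scores roughly stabilize
        new_rank = {p: (1 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            share = rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += damping * share
        rank = new_rank

    for page, score in sorted(rank.items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))   # C, which everyone links to, ranks highest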

Here again the generational issue arises. Obviously we want to teach future generations how to search effectively, and how to evaluate the quality and reliability of the information their searches yield. But do we do this by explaining the evolution of search and ranking algorithms – the generational approach based on the preceding paragraph – or by teaching more generally, as bibliographic instructors in libraries have long done, how to cross-reference, assess, and evaluate information whatever its form?

Understanding throttles and spark advance did not help me become a better driver, understanding BASIC probably didn’t help prepare my Harvard students for their future workplaces, and explaining diverse cloud mechanisms and search algorithms isn’t the best way for us to maximize our students’ technological competence. Much as I love explaining things, I think the essence of successful technological teaching is to focus on the future, on the application and consequences of technology rather than its origins.

That doesn’t mean that we should eschew history, but rather that history does not suffice as a basis for technological instruction. It’s easier to explain the past than to anticipate the future, but the latter, however risky and uncertain and detached from our personal histories, is our job.

Network Neutrality: Who’s Involved? What’s the Issue? Why Do We Give a Shortstop?

Who’s on First, Abbott and Costello’s classic routine, first reached the general public as part of the Kate Smith Radio Hour in 1938. It then appeared on almost every radio network at some time or another before reaching TV in the 1950s. (The routine’s authorship, as I’ve noted elsewhere, is more controversial than its broadcast history.) The routine can easily be found many places on the Internet – as a script, as audio recordings, or as videos. Some of its widespread availability is from widely-used commercial services (such as YouTube), some is from organized groups of fans, and some is from individuals. The sources are distributed widely across the Internet (in the IP-address sense).

I can easily find and read, listen to, or watch Who’s on First pretty much regardless of my own network location. It’s there through the Internet2 connection in my office, through my AT&T mobile phone, through my Sprint mobile hotspot, through the Comcast connections where I live, and through my local coffeeshops’ wireless in DC and Chicago.

This, most of us believe, is how the Internet should work. Users and content providers pay for Internet connections, at rates ranging from the price of a cup of coffee to thousands of dollars, and so how fast one’s connection is may vary by price and location. One may need to pay providers for access, but the network itself transmits traffic the same way no matter where it comes from, where it’s going, or what its substantive content is. This, in a nutshell, is what “network neutrality” means.

Yet network neutrality remains controversial. That’s mostly for good, traditional political reasons. Attaining network neutrality involves difficult tradeoffs among the economics of network provision, the choices available to consumers, and the public interest.

Tradeoffs become important when they affect different actors differently. That’s certainly the case for network neutrality:

  • Network operators (large multifunction ones like AT&T and Comcast, large focused ones like Verizon and Sprint, small local ones like MetroPCS, and business-oriented ones like Level3) want the flexibility to invest and charge differently depending on who wants to transmit what to whom, since they believe this is the only way to properly invest for the future.
  • Some Internet content providers (some of which, like Comcast, are also network operators) want to know that what they pay for connectivity will depend only on the volume and technical features of their material, and not vary with its content, whereas others want the ability to buy better or higher-priority transmission for their content than competitors get — or perhaps to have those competitors blocked.
  • Internet users want access to the same material on the same terms regardless of who they are or where they are on the network.

Political perspectives on network neutrality thus vary depending on who is proposing what conditions for whose network.

But network neutrality is also controversial because it’s misunderstood. Many of those involved in the debate either don’t – or won’t – understand what it means for a public network to be neutral, or indeed what the difference is between a public and a private network. That’s as true in higher education as it is anywhere else. Before taking a position on network neutrality or whose job it is to deal with it, therefore, it’s important to define what we’re talking about. Let me try to do that.

All networks discriminate. Different kinds of network traffic can entail different technical requirements, and a network may treat different technical requirements differently. E-mail, for example, can easily be transmitted in bursts – it really doesn’t matter if there’s a fifty-millisecond delay between words – whereas video typically becomes jittery and unsatisfactory if the network stream isn’t steady. A network that can handle email may not be able to handle video. One-way transmission (for example, a video broadcast or downloading a photo) can require very different handling than a two-way transmission (such as a videoconference). Perhaps even more basic, networks properly discriminate between traffic that respects network protocols – the established rules of the road, if you will – and traffic that attempts to bypass rule-based network management.

Network neutrality does not preclude discrimination. Rather, as I wrote above, a network behaves neutrally if it avoids discriminating on the basis of (a) where transmission originates, (b) where transmission is destined, and (c) the content of the transmission. The first two elements of network neutrality are relatively straightforward, but the third is much more challenging. (Some people also confuse how fast their end-user connection is with how quickly material moves across the network – that is, someone paying for a 1-megabit connection considers the Internet non-neutral if they don’t get the same download speeds as someone paying for a 26-megabit connection – but that’s a separate issue largely unrelated to neutrality.) In particular, it can be difficult to distinguish between neutral discrimination based on technical requirements and non-neutral discrimination based on a transmission’s substance. In some cases the two are inextricably linked.
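
The distinction can be put almost mechanically. Here is a minimal, purely hypothetical sketch in Python: the first rule keys only on a technical attribute of the traffic, which neutrality permits; the second keys on source and content, which it does not. (The “packet” fields and rules are invented for illustration, not any real router’s configuration.)

    # Toy policy rules illustrating neutral vs. non-neutral discrimination.
    packet = {
        "kind": "video-stream",            # technical requirement (jitter-sensitive)
        "source": "video.example.com",     # where it comes from
        "destination": "campus.example.edu",
        "topic": "whos-on-first",          # what it is about
    }

    def neutral_rule(pkt: dict) -> str:
        # Discriminates only on technical requirements: permitted under neutrality.
        return "steady, low-jitter queue" if pkt["kind"] == "video-stream" else "best-effort queue"

    def non_neutral_rule(pkt: dict) -> str:
        # Discriminates on source and content: the kind of rule neutrality forbids.
        if pkt["source"] == "video.example.com" or pkt["topic"] == "whos-on-first":
            return "blocked"
        return "best-effort queue"

    print(neutral_rule(packet), "|", non_neutral_rule(packet))

The hypothetical operators below are, in effect, different choices about which of these two kinds of rules to write.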

Consider several ways network operators might discriminate with regard to Who’s on First.

  • Alpha Networks might decide that its network simply can’t handle video streaming, and therefore might configure its systems not to transmit video streams. If a user tries to watch a YouTube version of the routine, it won’t work if the transmission involves Alpha Networks. The user will still be able to read the script or listen to an audio recording of the routine (for example, any of those listed in the Media|Audio Clips section of http://www.abbottandcostello.net/). Although refusing to carry video is clearly discrimination, it’s not discrimination based on source, destination, or content. Alpha Networks therefore does not violate network neutrality.
  • Beta Networks might be willing to transmit video streams, but only from providers that pay it to do so. Say, purely hypothetically, that the Hulu service – jointly owned by NBC and Fox – were to pay Beta Networks to carry its video streams, which include an ad-supported version of Who’s on First. Say further that Google, whose YouTube streams include many Who’s on First examples, were to decline to pay. If Beta Networks transmitted Hulu’s versions but not Google’s, it would be discriminating on the basis of source – and probably acting non-neutrally.

What if Hulu and Google use slightly different video formats? Beta might claim that carrying Hulu’s traffic but not Google’s was merely technical discrimination, and therefore neutral. Google would probably disagree. Who resolves such controversies – market behavior, the courts, industry associations, the FCC – is one of the thorniest points in the national debate about network neutrality. Onward…

  • Gamma Networks might decide that Who’s on First ridicules and thus disparages St. Louis (many performances of the routine refer to “the St. Louis team”, although others refer to the Yankees). To avoid offending customers, Gamma might refuse to transmit Who’s on First, in any form, to any user in Missouri. That would be discrimination on the basis of destination. Gamma would violate the neutrality principle.
  • Delta Networks, following Gamma’s lead, might decide that Who’s on First disparages not just St. Louis but professional baseball in general. Since baseball is the national pastime – and perhaps because it is worried about lawsuits – Delta Networks might decide that Who’s on First should not be transmitted at all, and refuse to carry the routine in any form. That would be discrimination on the basis of content. Delta would violate the neutrality principle.
  • Epsilon Networks, a competitor to Alpha, might realize that refusing to carry video disserves customers. But Epsilon faces the same financial challenges as Alpha. In particular, it can’t raise its general prices to cover the expense of transmitting video since it would then lose most of its customers (the ones who don’t care about video) to Alpha’s lesser but less expensive service. Rather than block video, Epsilon might decide to install equipment that will enable video as a specially provided service for customers who want it, and to charge those customers – but not its non-video customers – extra for the added capability. Whether an operator violates network neutrality by charging more for special network treatment of certain content – the usual term for this is “managed services” – is another one of the thorniest issues in the national debate.

As I hope these examples make clear, there are various kinds of network discrimination, and whether they violate network neutrality is sometimes straightforward and sometimes not.  Things become thornier still if networks are owned by content providers or vice versa – or, as is more typical, if there are corporate kinships between the two. Hulu, for example, is partly owned by NBC Universal, which is becoming part of Comcast. Can Comcast impose conditions on “outside” customers, such as Google’s YouTube, that it does not impose on its own corporate cousin?
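
One way to summarize the examples above is to encode the three-part neutrality test directly. The sketch below is purely illustrative – the operators and rules are the hypothetical ones from this post, not anyone’s real policy engine – and it deliberately ignores the genuinely hard cases (Beta’s format argument, Epsilon’s managed services, corporate kinships) where technical and content-based discrimination blur together.

    from dataclasses import dataclass

    @dataclass
    class Rule:
        operator: str
        on_source: bool = False        # does the rule look at where traffic comes from?
        on_destination: bool = False   # ...where it is going?
        on_content: bool = False       # ...what it says?

    def is_neutral(rule: Rule) -> bool:
        # Neutral as long as the rule ignores source, destination, and content;
        # purely technical discrimination (e.g., "no video streams at all") passes.
        return not (rule.on_source or rule.on_destination or rule.on_content)

    rules = [
        Rule("Alpha: no video streams of any kind"),
        Rule("Beta: video only from providers that pay", on_source=True),
        Rule("Gamma: no Who's on First delivered to Missouri", on_destination=True),
        Rule("Delta: no Who's on First anywhere, in any form", on_content=True),
    ]

    for rule in rules:
        print(f"{rule.operator} -> {'neutral' if is_neutral(rule) else 'non-neutral'}")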

Why do we give a darn (which, lest you didn’t read to the end of the Who’s on First script, is the shortstop’s name)? That is, why is network neutrality important to higher education? There are two principal reasons.

First, as mobility and blended learning (the combination of online and classroom education) become commonplace in higher education, it becomes very important that students be able to “attend” their college or university from venues beyond the traditional campus. To this end, colleges and universities must be able to provide education to their students and to interconnect researchers over the Internet, constrained only by the capacity of the institution’s connection to the Internet, the technical characteristics of online educational materials and environments, and the capacity of students’ connections to the Internet.

Without network neutrality, achieving transparent educational transmission from campus to widely distributed students could become very difficult. The quality of the student experience could come to depend on the politics of the network path from campus to student. To address this, each college and university would need to negotiate transmission of its materials with every network operator along the path from campus to student. If some of those network operators negotiated exclusive agreements for certain services with commercial providers – or perhaps with other colleges or universities – it could become impossible to provide online education effectively.

Second, many colleges and universities operate extensive networks of their own, or together operate specialized inter-campus networks for education, research, administrative, and campus purposes. Network traffic inconsistent with or detrimental to these purposes is managed differently than traffic that serves them. It is important that colleges and universities retain the ability to manage their networks in support of their core purposes.

Networks that are operated by and for the use of particular organizations, like most college and university networks, are private networks. Private and public networks serve different purposes, and thus are managed based on different principles. The distinction is important because the national network-neutrality debate – including the recent FCC action, and its evolving judicial, legislative, and regulatory consequences – is about public networks.

Private networks serve private purposes, and therefore need not behave neutrally. They are managed to advance private goals. Public networks, on the other hand, serve the public interest, and so – network-neutrality advocates argue – should be managed in accordance with public policy and goals. Although this seems a clear distinction, it can become murky in practice.

For example, many colleges and universities provide some form of guest access to their campus wireless networks, which anyone physically on campus may use. Are guest networks like this public or private? What if they are simply restricted versions of the campus’s regular network? Fortunately for higher education, there is useful precedent on this point. The Communications Assistance for Law Enforcement Act (CALEA), which took effect in 1995, established principles under which most college and university campus networks are treated as private networks – even if they provide a limited set of services to campus visitors (the so-called “coffee shop” criterion).

Higher education needs neutrality on public networks because those networks are increasingly central to education and research. At the same time, higher education needs to manage campus networks and private networks that interconnect them in support of education and research, and for that reason it is important that there be appropriate policy differentiation between public and private networks.

Regardless, colleges and universities need to pay for their Internet connectivity, to negotiate in good faith with their Internet providers, and to collaborate effectively on the provision and management of campus and inter-campus networks. So long as colleges and universities act effectively and responsibly as network customers, they need assurance that their traffic will flow across the Internet without regard to its source, destination, or content.

And so we come to the central question: Assuming that higher education supports network neutrality for public networks, do we care how its principles – that public networks should be neutral, and that private ones should be manageable for private purposes – are promulgated, interpreted, and enforced? Since the principles are important to us, as I outlined above, we care that they be implemented effectively, robustly, and efficiently. Since the public/private distinction seems to be relatively uncontroversial and well understood, the core issue is whether and how to address network neutrality for public networks.

There appear to be four different ideas about how to implement network neutrality.

  1. A government agency with the appropriate scope, expertise, and authority could spell out the circumstances that would constitute network neutrality, and prescribe mechanisms for correcting circumstances that fall short of them. Within the US, this would need to be a federal agency, and the only one arguably up to the task is the Federal Communications Commission. The FCC has acted in this way, but there remain questions about whether it has the appropriate authority to proceed as it has proposed.
  2. The Congress could enact laws detailing how public networks must operate to ensure network neutrality. In general, it has proven more effective for the Congress to specify a broad approach to a public-policy problem, and then to create and/or empower an appropriate government agency to figure out what guidelines, regulations, and redress mechanisms are best. Putting detail into legislation tends to invite all kinds of special negotiations and provisions, and the result is then quite hard to change.
  3. The networking industry could create an internal body to promote and enforce network neutrality, with appropriate powers to take action when its members fail to live up to neutrality principles. Voluntary self-regulatory entities like this have been successful in some contexts and not in others. Thus far, however, the networking industry is internally divided as to the wisdom of network neutrality, and without agreement on the principle it is hard to see how there could be agreement on self-regulation.
  4. Network neutrality could simply be left to the market. That is, if network neutrality is important to customers, they will buy services from neutral providers and avoid services from non-neutral providers. The problem here is that network neutrality must extend across diverse networks, and individual consumers – even if they are large organizations such as many colleges and universities – interact only with their own “last mile” provider.

Those of us in higher education who have been involved in the network-neutrality debates have come to believe that among these four approaches the first is most likely to yield success and most likely to evolve appropriately as networking and its applications evolve. This is especially true for wireless (that is, cellular) networking, where there remain legitimate questions about what level of service should be subject to neutrality principles, and what kinds of service might legitimately be considered managed, extra-cost services.

In theory, the national debate about network neutrality will unfold through four parallel processes. Two of these are already underway: the FCC has issued an order “to Preserve Internet Freedom and Openness”, and at least two network operators have filed lawsuits challenging the FCC’s authority to do that. So we already have agency and court involvement, and we can expect possible congressional actions and industry initiatives to round out the set.

One thing’s sure: This is going to become more complicated and confusing…

Lou: I get behind the plate to do some fancy catching. Tomorrow’s pitching on my team and a heavy hitter gets up. Now the heavy hitter bunts the ball. When he bunts the ball, me, being a good catcher, I’m gonna throw the guy out at first base. So I pick up the ball and throw it to who?

Bud: Now that’s the first thing you’ve said right.

Lou: I don’t even know what I’m talking about!