Tag: quantitative methods

The Value Alignment Problem’s Problem

Having recently attended a workshop and conference on beneficial artificial intelligence (AI), I found that one of the overriding concerns is how to design beneficial AI. To do this, the AI needs to be aligned with human values; this challenge is known, following Stuart Russell, as the “Value Alignment Problem.” It is a “problem” in the sense that, given the way one has to specify a value function to a machine, however one creates an AI, it may maximize one value to the detriment of other socially useful or even noninstrumental values.


Put a DA-RT in it

If you have been living under a rock, as I apparently have, then like me you may be unaware of the DA-RT controversy that is brewing in the American Political Science Association.* It turns out that some of our colleagues have been pushing for some time to write a new set of rules for qualitative scholarship that, among other things, will require that “cited data [be] available at the time of publication through a trusted digital repository” [this is from the Journal Editors’ Transparency Statement, which is what is being implemented Jan. 15]. The goal, I gather, is to enhance transparency and reproducibility. A number of journal editors have signed on, although Jeffrey Isaac, editor of Perspectives on Politics, has refused to sign onto the DA-RT agenda.

There are a number of reasons to doubt that the DA-RT agenda will solve whatever problem it aims to address. Many of them are detailed in a petition (which I have signed) to delay implementation of the DA-RT protocol, currently set for January 15, 2016. To explore how posting data is more or less a cosmetic solution that does little to enhance transparency or reproducibility, I want to run through a hypothetical scenario for interviews, arguably the qualitative method most prone to suspicion.

Regardless of the subject, IRBs nearly always insist on anonymity for interviewees. This means that, in addition to scrubbing names and identifying markers, recordings of interviews cannot be made public (if they even exist, which many IRB decisions preclude). Therein lies the central problem—meaningful transparency is impossible, and as a result reproducibility as DA-RT envisions it is deeply impaired. Even if someone were interested in reproducing a study relying on interviews, s/he would be hindered by the fact that s/he could not interview the same people as the person(s) who undertook the study (this neglects, of course, that the reproduction interviews could not be collected at the same time, introducing the possibility of contingency effects). Given this very simple and nearly universal IRB requirement, there is fundamentally nothing to stop a nefarious, ne’er-do-well academic poser from completely fabricating the interview data that gets posted to the digital repository DA-RT requires, because there is no way to verify it (e.g., call up the person who gave the interview and ask if they really said that?!).

Data Street Cred

I am not known for being a statistics whiz.  I have published quantitative work, but I am seen, rightly so, as more comfortable with qualitative work—comparing apples and oranges.  Still, I had the gumption to offer advice on Twitter about data today.  What and why?


Podcast No. 20: Interview with Phil Schrodt

The twentieth Duck of Minerva podcast features Phil Schrodt of Pennsylvania State University. The interview includes Professor Schrodt’s views on a number of interesting topics, including the history of quantitative and computational conflict studies, his “seven deadly sins” project, advice for graduate students in political science, and an explanation of his decision to take up blogging.

This is the third podcast to feature only an mp3 version. I don’t get the sense that anyone is missing the m4a (“enhanced”) podcasts, but please correct me if I am mistaken on that point.

I should reiterate an important change to procedures. From now on, the Minervacast feed will always host mp3 versions of the podcasts. The whiteoliphaunt feed will host m4a versions when they are available; otherwise it will also host mp3 versions.


Afternoon Miscellany: Latour, Podcasts, and Big Data

This post would be much more interesting if it concerned the nexus of its three subjects. Sadly, it does not.

  1. I’m working on a forum piece with Vincent Pouliot on Actor-Network Theory (ANT) — one written from the explicit perspective of outsiders. We’ve been puzzled by the apparent lack of theorization of “the body” in Latour. For example, if social relations must be ‘fixed’ by physical objects, why isn’t the human body one such object? If any of our readers are able to weigh in, I’d appreciate it.
  2. I’ve been considering discontinuing the m4a versions of the Duck of Minerva podcast. They take much more time to produce than the mp3 versions; most people seem to listen to the mp3 versions anyway. Is there a constituency in favor of retaining the m4a variants, i.e., the ones with chapter markers and static images?
  3. Henry Farrell tweeted a paper by Gary King on setting up quantitative social-science centers. Henry highlights the section on the end of the quantitative-qualitative divide. I’m sympathetic to it: I certainly feel the pull of teaming with computationally savvy colleagues to do interesting things with “big data,” and I often find myself thinking about how it would be neat to use particular data for uncovering interesting relationships. But it also strikes me as a bit cavalier about the importance of questions — and forms of empirical analysis — that don’t fit cleanly within that rubric. Nonetheless, it is right about the direction in which sociological and economic forces are driving social-scientific research.


Winecoff vs. Nexon Cage Match!

Kindred Winecoff has a pretty sweet rebuttal to my ill-tempered rant of late March. A lot of it makes sense, and I appreciate reading a graduate student’s perspective on things.

Some of his post amounts to a reiteration of my points: (over)professionalization is a rational response to market pressure, learning advanced methods that use lots of mathematical symbols is a good thing, and so forth.

On the one hand, I hope that one day Kindred will sit on a hiring committee (because I’d like to see him land a job). On the other hand, I’m a bit saddened by the prospect because his view of the academic job market is just so, well, earnest.  I hate to think what he’ll make of it when he sees how the sausage actually gets made.

I do have one quibble:

While different journals (naturally) tend to publish different types of work, it’s not clear whether that is because authors are submitting strategically, editors are dedicated to advancing their preferred research paradigms, both, or neither. There are so many journals that any discussion of them as doing any one thing — or privileging any one type of work — seems like painting with much too wide a brush.

Well, sure. I’m not critical enough to publish in Alternatives, Kindred’s not likely to storm the gates of International Political Sociology, and I doubt you’ll see me in the Journal of Conflict Resolution in the near future. But while some of my comments are applicable to all journals, regardless of orientation, others are pretty clearly geared toward the “prestige” journals that occupy a central place in academic certification in the United States.

But mostly, this kind of breaks my heart:

I’ve taken more methods classes in my graduate education than substantive classes. I don’t regret that. I’ve come to believe that the majority of coursework in a graduate education in most disciplines should be learning methods of inquiry. Theory-development should be a smaller percentage of classes and (most importantly) come from time spent working with your advisor and dissertation committee. While there are strategic reasons for this — signaling to hiring committees, etc. — there are also good practical reasons for it. The time I spent on my first few substantive classes was little more than wasted; I had no way to evaluate the quality of the work. I had no ability to question whether the theoretical and empirical assumptions the authors were making were valid. I did not even have the ability to locate what assumptions were being made, and why it was important to know what those are.

Of course, most of what we do in graduate school should be about learning methods of inquiry, albeit understood in the broadest terms. The idea that one does this only in designated methods classes, though, is a major part of the problem that I’ve complained about. As is the apparent bifurcation of “substantive” classes and “methods of inquiry.” And if you didn’t get anything useful out of your “substantive” classes because you hadn’t yet had your coursework in stochastic modeling… well, something just isn’t right there. I won’t tackle what Kindred means by “theory-development,” as I’m not sure we’re talking about precisely the same thing, but I will note that getting a better grasp of theory and theorization is not the same thing as “theory-development.”

Anyway, I’ll spot a TKO to Kindred on most of the issues.

Friday Nerd Blogging: Baseball Edition

Yesterday, long-time Duck of Minerva contributor Bill Petti taped an on-field segment for “Clubhouse Confidential” on the MLB Network! For the non-baseball nerds, MLB = Major League Baseball.

It’s great to see Bill using his methodological and analytical skills in this way. This is not his first appearance on the program to provide sabermetrical analysis.

Most of the time, Bill uses his talents in his “work with senior executives at various pharmaceutical, financial, and consumer packaged goods companies to help them align their human capital and processes with their overall strategy to drive top and bottom line results.”

I quoted that from Bill’s website only because this baseball fan is jealous and envious (path not taken and all that). Good work Bill.

One style question: should viewers be worried that Bill finds everything “interesting”?

Challenges to Qualitative Research in the Age Of Big Data

[Comic caption:] Technically, “because I didn’t have observational data.” Working with experimental data requires only calculating means and reading a table. Also, this may be the most condescending comic strip about statistics ever produced.

The excellent Silbey at the Edge of the American West is stunned by the torrents of data that future historians will be able to deal with. He predicts that the petabytes of data being captured by government organizations such as the Air Force will be a major boon for historians of the future —

(and I can’t be the only person who says “Of the future!” in a sort of breathless “better-living-through-chemistry” voice)

 — but also predicts that this torrent of data means that it will take vastly longer for historians to sort through the historical record.

He is wrong. It means precisely the opposite. It means that history is on the verge of becoming a quantified academic discipline, for two reasons. The first is that statistics is, quite literally, the art of discerning patterns within data. The second is that the history academics practice in the coming age of Big Data will not be the same discipline that contemporary historians are creating.

The sensations Silbey is feeling have already been captured by an earlier historian, Henry Adams, who wrote of his visit to the Great Exposition of Paris:

He [Adams] cared little about his experiments and less about his statesmen, who seemed to him quite as ignorant as himself and, as a rule, no more honest; but he insisted on a relation of sequence. And if he could not reach it by one method, he would try as many methods as science knew. Satisfied that the sequence of men led to nothing and that the sequence of their society could lead no further, while the mere sequence of time was artificial, and the sequence of thought was chaos, he turned at last to the sequence of force; and thus it happened that, after ten years’ pursuit, he found himself lying in the Gallery of Machines at the Great Exposition of 1900, his historical neck broken by the sudden irruption of forces totally new.

Because it is strictly impossible for the human brain to cope with large amounts of data, in the age of big data we will have to turn to the tools we’ve devised to solve exactly that problem. And those tools are statistics.

It will not be human brains that directly run through each of the petabytes of data the US Air Force collects. It will be statistical software routines. And the historical record that the modal historian of the future confronts will be one that is mediated by statistical distributions, simply because such distributions will allow historians to confront the data that appears in vast torrents with tools that are appropriate to that problem.
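To make the point concrete, here is a minimal sketch (my own illustration, not anything from the post) of what "mediation by statistical distributions" looks like in practice: a million hypothetical event records, invented with a random-number generator, reduced to a handful of distributional summaries a historian can actually interrogate.

```python
# Sketch: replace "reading every record" with distributional summaries.
# The "events" below are synthetic stand-ins for archival records.
import random
import statistics

random.seed(42)

# Hypothetical archive: one intensity score per logged event.
events = [random.gauss(mu=50, sigma=10) for _ in range(1_000_000)]

# The historian confronts this summary, not the raw torrent.
summary = {
    "n": len(events),
    "mean": statistics.fmean(events),
    "stdev": statistics.stdev(events),
    "min": min(events),
    "max": max(events),
}

for key, value in summary.items():
    if isinstance(value, float):
        print(f"{key}: {value:.2f}")
    else:
        print(f"{key}: {value}")
```

Five numbers stand in for a million observations; richer mediations (histograms, regressions) are refinements of the same move.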

[Figure caption:] Onset of menarche plotted against years, for Norway. In all seriousness, this is the sort of data that should be analyzed by historians but which many are content to abandon to the economists by default. Yet learning how to analyze demographic data is not all that hard, and the returns are immense. And no amount of reading documents, without quantifying them, could produce this sort of information.
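As a sketch of how little machinery such demographic analysis requires: the (year, mean age at menarche) pairs below are invented for illustration, not actual Norwegian data, but a plain least-squares fit on numbers like these already extracts the secular trend the figure shows.

```python
# Hypothetical observations: cohort year -> mean age at menarche (years).
data = [(1860, 15.6), (1880, 15.1), (1900, 14.6), (1920, 14.0), (1940, 13.5)]

n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n

# Ordinary least squares: slope = cov(x, y) / var(x).
slope = sum((x - mean_x) * (y - mean_y) for x, y in data) / sum(
    (x - mean_x) ** 2 for x, _ in data
)

# Slope is in years of age per calendar year; negative means earlier onset.
print(f"slope: {slope:.4f} years of age per calendar year")
print(f"i.e., roughly {abs(slope) * 100:.1f} years earlier per century")
```

On these made-up numbers the fit yields a slope of about -0.027, i.e., onset arriving roughly 2.7 years earlier per century, which is exactly the kind of claim documents alone cannot support.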

This will, in one sense, be a real gift to scholarship. Although I’m not an expert in Hitler historiography, for instance, I would place a very real bet with the universe that the statistical analysis in King et al. (2008), “Ordinary Economic Voting Behavior in the Extraordinary Election of Adolf Hitler,” tells us something very real and important about why Hitler came to power that simply cannot be deduced from the documentary record alone. The same could be said for an example closer to (my) home, Chay and Munshi (2011), “Slavery’s Legacy: Black Mobilization in the Antebellum South,” which identifies previously unexplored channels for how variations in slavery affected the post-war ability of blacks to mobilize politically.

In a certain sense, then, what I’m describing is a return of one facet of the Annales school on steroids. You want an exploration of the daily rhythms of life? Then you want quantification. Plain and simple.

By this point, most readers of the Duck have probably reached the limits of their tolerance for such statistical imperialism. And since I am a member in good standing of the Qualitative and Multi-Method Research section of APSA (which I know is probably not much better for many Duck readers!), who has, moreover, just returned from spending weeks looking in archives, let me say that I do not think the elimination of narrativist approaches is desirable or possible. First, without qualitative knowledge, quantitative approaches are hopelessly naive. Second, there are some problems that can only practically be investigated with qualitative data.

But if narrativist approaches will not be eliminated they may nevertheless lose large swathes of their habitat as the invasive species of Big Data historians emerges. Social history should be fundamentally transformed; so too should mass-level political history, or what’s left of it, since the availability of public opinion data, convincing theories of voter choice, and cheap analysis means that investigating the courses of campaigns using documents alone is pretty much professional malpractice.

The dilemma for historians is no different from the challenge that qualitative researchers in other fields have faced for some time. The first symptom, I predict, will be the retronym-ing of “qualitative” historians, in much the same way that the emergence of mobile phones created the retronym “landline.” The next symptom will be that academic conferences will in fact be dominated by the pedantic jerks who only want to talk about the benefits of different approaches to handling heteroscedasticity. But the wrong reaction to these and other pains would be a knee-jerk refusal to consider the benefits of quantitative methods.

© 2017 Duck of Minerva
