Stuff political scientists like #5 — a Large N

28 June 2011, 2300 EDT


I have been doing a lot of work with survey data lately, as well as some reading in critical theory. Maybe that inspired my deconstruction of the gendered language of stats. Or maybe I just like to work blue.

Your girlfriend has told you, “Honey, your data set is big enough for me. It’s OK if it doesn’t get you into the APSR.” She might tell you, “It is not the size of the p-value that matters, it is what you do with it.” A good theory can make up for a small N, she reassures you. But political scientists know the truth. Size matters. Political scientists like a large-N.

A large-N enables you to find a statistically significant relationship between any two variables, and to find evidence for any number of crazy arguments that are so surprising, they will get you published. Political scientists like to be surprised. Your theory might be dapper and well dressed, but without the large-N, political scientists will not swoon. They go crazy for those little asterisks.

Some qualitative researcher might come in and show that your variables are not actually causally related, but it will be too late. You will have 200 citations on Google Scholar, and their article will be in the Social Science Research Network archive forever. Your secret is safe. Go back to Europe, qually!

Political scientists also like a large-N because it gives you degrees of freedom. You can experiment with other variables in your model without worrying about multicollinearity. You aren’t tied down to one boring variable. Political scientists like to swing.

Political scientists prefer it if the standard error in your data is smooth and consistent and does not increase as the X value rises. Consider waxing or shaving your data with simple robust standard errors if you have problems with heteroskedasticity. They also like a big coefficient that slopes upward. Doesn’t everyone? And fit, don’t forget about fit. Fit makes things more enjoyable.


It is best if your large-N data does not have a lot of measurement error. You might say, a little is natural, like when I jump in the pool, but this is not acceptable in political science. You should, however, have variation in your dependent variable. Variety is good. It keeps things spicy. When a political scientist wants to get really kinky, he or she will bootstrap his or her data.

It is best if your data is normally distributed, but political scientists generally forgive that. They like data of all shapes and sizes. They just close their eyes and pretend that it is symmetrical. Binomial. Fat tails. Oooh. That just sounds dirty.

Political scientists will tell you that if your dataset is not big enough, your confidence intervals will be too wide. Paradoxically, this will drain your confidence and make it harder for you to perform in the future. But don’t worry, they have drugs for that.

Don’t leave anything to chance. Get yourself a large-N. But don’t listen to those ads on TV late at night. Those quick data fixes don’t work.