Written for various quantitative methods courses that I teach at ESPOL in Lille and Sciences Po in Paris, and in which I use some of the datasets listed below.
Links last checked and updated in May 2019.
Links also provided for R and Stata data retrieval packages. R users can also turn to the Econometrics, Official Statistics, Open Data and Social Sciences Task Views.
Most datasets are individual-level or country-level.
To merge country codes/names, use the countrycode (R) or kountry (Stata) packages.
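The idea behind both packages is to map each dataset's country names onto one shared code before merging. Below is a minimal Python sketch of that logic, not of either package's interface: the lookup table, the function names and the example rows are all hypothetical, and a real table would cover every country and far more spelling variants.

```python
# Hypothetical lookup table from lower-cased country names to
# ISO 3166-1 alpha-3 codes. Deliberately tiny, for illustration only.
NAME_TO_ISO3 = {
    "france": "FRA",
    "germany": "DEU",
    "federal republic of germany": "DEU",
    "united kingdom": "GBR",
    "great britain": "GBR",
    "united states": "USA",
    "united states of america": "USA",
}


def to_iso3(name):
    """Return the ISO3 code for a country name, or None if it is unknown."""
    return NAME_TO_ISO3.get(name.strip().lower())


def merge_by_country(left, right):
    """Inner-join two lists of dict rows on the harmonized code of 'country'."""
    right_by_code = {}
    for row in right:
        code = to_iso3(row["country"])
        if code is not None:
            right_by_code[code] = row
    merged = []
    for row in left:
        code = to_iso3(row["country"])
        if code in right_by_code:
            # Keep the left-hand spelling of the name; add the right-hand columns.
            extra = {k: v for k, v in right_by_code[code].items() if k != "country"}
            merged.append({"iso3c": code, **row, **extra})
    return merged


# Made-up rows (not real data): the two sources spell the same countries differently.
survey = [
    {"country": "Great Britain", "respondents": 1000},
    {"country": "France", "respondents": 1200},
]
macro = [
    {"country": "United Kingdom", "region": "Europe"},
    {"country": "france ", "region": "Europe"},
]

merged = merge_by_country(survey, macro)
for row in merged:
    print(row)
```

Names that the table does not recognise are silently dropped by the inner join, so in practice it is worth listing the unmatched names before trusting the merged file.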
Council of European Social Science Data Archives (CESSDA)
Consortium of European social science data archives.
See their data catalogue, which aggregates those of the consortium members.
Research Data Service (University of Edinburgh)
An example of a full-fledged university research data office.
See their full listing of data sources.
GESIS (Leibniz Institute for the Social Sciences)
German data provider that hosts many cross-country surveys.
See the surveys available via ZACAT and DBK, their data catalogues.
See also their Overview of Comparative Surveys Worldwide.
For data retrieval in R, see the gesis package.
Replication archive run by the Harvard Institute for Quantitative Social Science (IQSS).
Heavily used, especially by U.S. political scientists.
Inter-university Consortium for Political and Social Research (ICPSR)
Aggregates social science data from several hundred universities and research centres.
Possibly the largest data provider in the present list. Restricted to ICPSR members.
Includes openICPSR, which is not restricted to ICPSR members. Example datasets:
For data retrieval in R, see the icpsrdata package.
Statistical and Data Archives (IPSAportal)
Selection of datasets particularly relevant to political scientists.
Published by the International Political Science Association (IPSA).
The same website also lists some official records, media sources and special collections.
Published by the Norwegian Centre for Research Data (NSD).
Data on all Member States of the European Union.
For data retrieval in R, see the eurostat and restatapi packages.
Food and Agriculture Organization (FAO)
For data retrieval in R, see the FAOSTAT package.
International Labour Organization (ILO)
For data retrieval in R, see the Rilostat package.
International Monetary Fund (IMF)
See e.g. the Global Debt Database.
Organisation for Economic Co-operation and Development (OECD)
See e.g.
For data retrieval in R, see the OECD package.
United Nations (UN)
See UNdata as well as e.g.
World Institute for Development Economics Research (UNU-WIDER)
Part of the United Nations University (UNU).
See e.g.
World Income Inequality Database (WIID)
See also: WID, SWIID (both in ‘Economics’ section).
World Bank (WB)
See e.g.
For data retrieval in R, see the wbstats and WDI packages.
For data retrieval in Stata, see the wbopendata package.
World Health Organization (WHO)
See e.g.
There is a long list of for-profit and nonprofit sources producing measures of human rights, environmental performance, good government and so on. Only a few examples are shown below.
Economist Intelligence Unit (EIU)
‘Good governance’ indicators.
Freedom House (FH)
“Freedom of the Press” index.
Reporters Without Borders (RSF/RWB)
Transparency International (TI)
Country-level indicators on prison systems and populations.
For data retrieval in R, see the prisonbrief package.
Open data organizations, e.g. Open Knowledge International, Regards Citoyens (France) and the Sunlight Foundation (US), try to make – mostly governmental – data available to everyone via open data portals like CKAN. A few examples below.
By the Open Knowledge Network.
Searchable collection of worldwide public data.
Organizations and projects working on parliaments, parliamentarians and legislation.
Economic, socio-demographic and electoral data for Morocco, collected by a local NGO.
Examples for four countries. For more providers, see the U.S. statistical agencies and world agencies lists by Brent Moulton at Political Arithmetick. For electoral surveys from those countries (and others), see the dedicated section.
Archives de Données Issues de la Statistique Publique (ADISP)
A section of PROGEDO, which runs the Réseau Quételet.
Centre de données socio-politiques (CDSP)
Main data provider for French electoral surveys, based at Sciences Po in Paris.
Survey data available via the Réseau Quételet, qualitative data available via beQuali.
Governmental open data portal. Includes recent electoral results.
Institut national d'études démographiques (INED)
National Demographic Institute.
See e.g. the Trajectoires et Origines (TeO) survey.
Institut national de la statistique et des études économiques (INSEE)
National Statistics Institute. Large and very detailed data.
National survey data repository. Applies strict access conditions.
Make sure to use both its search engines:
For electoral data, also check the CDSP data catalogue.
Data registration agency.
By the Bundesinstitut für Bevölkerungsforschung (national demographic institute).
Forschungsdatenzentrum (FDZ)
Governmental open data portal.
Indikatoren und Karten zur Raum- und Stadtentwicklung (INKAR)
Spatial and economic data on German states (Länder).
By the Statistische Ämter des Bundes und der Länder (statistical agencies).
Statistisches Bundesamt (Destatis)
National (federal) statistical agency.
See its data catalogue, GENESIS-Online Datenbank.
General Election Results from 1918 to 2017
As it says on the tin.
Governmental open data portal.
Office for National Statistics (ONS)
See its release calendar and Nomis labour market data portal.
For data retrieval in R, see the nomisr package.
UK Data Archive (UKDA)
Formerly known as ESDS. See its survey question bank.
Example surveys:
Bureau of Labor Statistics (BLS)
See e.g.
See e.g.
See also IPUMS USA for enhanced access and documentation.
For data retrieval in R, see the tidycensus and acs packages.
Data Preservation Alliance for the Social Sciences (Data-PASS)
Includes the ICPSR, the Qualitative Data Repository (QDR), the Roper Center and others.
Governmental open data portal. The U.S. has many others — see e.g. NYC Open Data.
U.S. economic time series.
Hate Crime Laws and Statistics
Data produced by the Anti-Defamation League (ADL) and the FBI.
National Health Interview Survey (NHIS)
Published by the National Center for Health Statistics (NCHS) at the CDC.
National Longitudinal Study of Adolescent to Adult Health (Add Health)
See the guide to accessing Pew data.
For data retrieval in R, see the pewdata package.
Example datasets/publications:
Roper Center for Public Opinion Research
For data retrieval in R, see the ropercenter package.
British Election Study (BES)
See also the British Election Studies Information System (BESIS).
French Electoral Study (FES)
See this blog post (in French) about locating the data.
Includes a geocoded dataset of Afrobarometer surveys.
Survey (2014) and longitudinal (1960-2015) data for 7 Arab countries.
The data and codebooks are listed on the reports page.
Comparative Candidates Survey (CCS)
Elite surveys of national parliamentary electoral candidates.
Demographic and Health Surveys (DHS)
By USAID.
Eurobarometers (EB)
Data on all Member States of the European Union.
European Election Studies (EES)
Voter surveys, as well as manifesto, elite and media analyses.
European Social Survey (ESS)
Also includes (mostly Eurostat) multilevel data on the countries covered by the survey.
For data retrieval in R, see the essurvey package.
European Values Study (EVS)
See the download guide to get it from GESIS/ZACAT.
For data retrieval in R, see the gesis package.
Integrated Public Use Microdata Series (IPUMS)
U.S. and international population data, including censuses and health surveys.
For data retrieval in R, see the ipumsr package.
Making Electoral Democracy Work (MEDW)
Electoral surveys for selected regions in Belgium, France, Germany, Spain and Sweden.
Social Stratification in Eastern Europe after 1989 (SSEE)
Surveys conducted in 1993-4 in Bulgaria, the Czech Republic, Hungary, Poland, Russia and Slovakia.
Survey of Health, Ageing and Retirement in Europe (SHARE)
Cross-national European panel.
World Values Survey (WVS)
Includes WVS/EVS integrated files.
This section focuses on (economic, institutional, political) country-level data, and so excludes region-level datasets such as Regions of Russia (RoR).
Archived page from the Digital Activism Research Project (DARP).
Main dataset produced by the project: Global Digital Activism Data Set (GDADS).
Also available via ICPSR.
Party manifestos across several democracies.
Comparative Agendas Project (CAP)
Includes the U.S. Policy Agendas project.
Comparative Political Data Set (CPDS)
Politics and expenditure levels in European and OECD countries.
Comparative Study of Electoral Systems (CSES)
See also: DES (below), IDEA.
Democratic Electoral Systems (DES)
See also: CSES (above), IDEA.
International Country Risk Guide (ICRG)
Political, financial and economic risk.
Inter-Parliamentary Union (IPU)
PARLINE database on parliaments and electoral systems.
Extensive data on EU and most OECD parties, elections and cabinets.
Aggregates several datasets on political parties.
Political Constraint Index (POLCON)
Measure of political risk (see methodological note).
Political Data Yearbook (PDYi)
Published by the European Consortium for Political Research (ECPR).
Quality of Government (QOG)
Aggregates many datasets on institutions, development, and much, much more.
See also Markus Kainu's interactive access point to QOG metadata.
For data retrieval in R, see the rqog package.
Unfortunately, the QOG-related Stata commands are currently out of date.
Carefully crafted democracy indices.
All Minorities at Risk Project (AMAR)
Data on important ethnic groups.
Correlates of War (COW)
Militarized Interstate Disputes, and more.
Coups in the World, 1950-Present
Informally known as the ‘Powell and Thyne’ dataset, from its authors.
Global Terrorism Database (GTD)
Worldwide terrorist events.
Historical Terrorist Groups (1860–1969)
Joshua Tschantret's list of such groups. See also: paper, appendix.
Integrated Network for Societal Conflict Research (INSCR)
Political regime characteristics, plus state conflict and fragility measures. Includes:
Peace Research Institute Oslo (PRIO)
Armed conflicts, conflict geography.
Social Conflict Analysis Database (SCAD)
All sorts of conflict, in Africa plus a few more countries.
Part of the Climate Change and African Political Stability (CCAPS) research program.
CCAPS has several additional datasets focused on Africa.
Stockholm International Peace Research Institute (SIPRI)
Worldwide (dis)armament.
Data on jihadi plots in the West, foreign fighters, and nuclear terrorist events.
Uppsala Conflict Data Program (UCDP)
Armed conflict onsets and terminations.
Bureau for Research and Economic Analysis of Development (BREAD)
Household surveys, and much more, compiled by a non-profit.
Centre d'études prospectives et d'informations internationales (CEPII)
Several datasets, including “square” gravity data of trade flows.
Center for International Development (CID)
Various development economics data, partly documented at this address.
Comparative Welfare Entitlements Dataset (CWED)
Structure and level of social policy benefits (see also: SOCX, OECD).
Informally known as the ‘Scruggs’ dataset, from its main author.
Data for development economics, sometimes with brief literature reviews.
See e.g. posts on ethnic and linguistic diversity (“fractionalization”).
European Central Bank (ECB)
See its Statistical Data Warehouse.
For data retrieval in R, see the ecb package.
European Commission Annual Macro-Economic Database (AMECO)
By its Directorate General for Economic and Financial Affairs.
For data retrieval in R, see the ameco package (N.B. last updated 2018).
IMF working paper, with fiscal crises data for many countries from 1970 to 2015.
Various highly aggregated indexes on e.g. globalization or youth labour markets.
Maddison Historical Statistics
GDP estimates over almost 3,000 years.
For data retrieval in R, see the maddison package (N.B. last updated 2013).
National Bureau of Economic Research (NBER)
Development and trade datasets.
[U.S.] Panel Study of Income Dynamics (PSID)
“Longest running longitudinal household survey in the world,” started in 1968.
For data retrieval in R, see the psidR package.
Penn World Table (PWT)
Precise estimation of real gross domestic products.
For data retrieval in R, see the pwt, pwt8 and pwt9 packages.
Standardized World Income Inequality Database (SWIID)
Highly comparable income inequality data.
See also: WIID (UNU-WIDER), WID (below).
World Inequality Database (WID)
Between-country and within-country wealth and income inequality.
See also: SWIID (above), WIID (UNU-WIDER).
Berlin Social Science Center (WZB), “Data Sources” (n.d.); ECPR Standing Group on Public Opinion and Voting Behaviour, “National Election Studies” (n.d.); Edelman, “Using Internet Data for Economic Research” (2012); Faoro et al., “Data Resources for Studies in Comparative Politics” (n.d.); Franzese, Jr., “Empirical Strategies for Various Manifestations of Multilevel Data” (2005); Pennings, Keman and Kleinnijenhuis, Doing Research in Political Science: An Introduction to Comparative Methods and Statistics (2005, p. 57); Smith, “Resources for Conducting Cross-National Survey Research” (2015); Emiliano Grossman and Nicolas Sauger, and many other colleagues and students, with a special mention to Felix von Nostitz.
The best ever survey data analysis repository, by Anthony J. Damico.
R code to download and set up many of the surveys featured on this page, and more.
The best data newsletter ever, by Jeremy Singer-Vine.
The Economist's Graphic Detail blog
Daily charts by a magazine with a long empirical tradition.
Several newspapers, e.g. The Financial Times and The Guardian, run similar sections.
Beautiful, instructive visualizations of worldwide data. Rest in peace, Hans Rosling.
A valuable complement to Gapminder, by Max Roser.
My bookmarks contain more (but less organised) links to social science, economic and health data, as well as some readings on e.g. data availability, the sociology of quantification, and statistical measurement.
Although they focus less (or not at all) on social science, scientific data repositories like Figshare, OSF and Zenodo might also be worth a look, as well as Nature’s Scientific Data journal.
Last, there’s much more to it than just academic research data: try e.g. Awesome Public Datasets, Data.World, Google Public Data and Dataset Search, the Guardian Data Blog data index, or the /r/datasets and /r/dataisbeautiful Reddit channels.
P.S. I am not interested in documenting closed-source business data (re)sellers like Qlik DataMarket, Quandl and Statista.