Replication Data for: BIGPROD Data Sample

  • Sajad Ashouri (Creator)
  • Arash Hajikhani (Contributor)
  • Arho Suominen (Contributor)
  • Angela Jäger (Contributor)
  • Torben Schubert (Contributor)
  • Lukas Pukelis (Contributor)
  • Scott Cunningham (Contributor)
  • Cees Van Beers (Contributor)
  • Serdar Turkeli (Contributor)



This data sample, in support of the article "Indicators on firm level innovation activities from web scraped data", contains data on companies' innovative behavior measured at the firm level, based on web scraped data from medium-high and high-technology companies in the European Union and the United Kingdom. The data are retrieved from individual company websites and cover 96,921 companies in total. They provide information on various aspects of innovation, most significantly the research and development orientation of the company at the company and product level, the company’s collaborative activities, the company’s products, and its use of standards. In addition to the web scraped data, the dataset aggregates a variety of firm-level indicators, including patenting activities. In total, the dataset includes 28 variables with unique identifiers, which enable linking to other databases such as financial data.
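
As an illustration of how the unique identifiers could be used to link the sample to another source, the following is a minimal sketch only. The column names (firm_id, rd_orientation, uses_standards, revenue_eur) and all values are hypothetical placeholders, not the dataset's actual variables or contents, and the join shown is a plain left join written with the Python standard library.

```python
import csv
import io

# Hypothetical stand-ins for the real files. Column names and values are
# assumptions made for illustration; they are not taken from the dataset.
SAMPLE_CSV = """firm_id,rd_orientation,uses_standards
F001,0.42,1
F002,0.10,0
F003,0.77,1
"""

FINANCIAL_CSV = """firm_id,revenue_eur
F001,1200000
F003,560000
"""


def read_by_id(text, id_column="firm_id"):
    """Index CSV rows by their identifier column."""
    return {row[id_column]: row for row in csv.DictReader(io.StringIO(text))}


def left_join(sample, other):
    """Keep every firm in the sample; attach matching fields when present."""
    joined = []
    for firm_id, row in sample.items():
        merged = dict(row)
        merged.update(other.get(firm_id, {}))
        joined.append(merged)
    return joined


sample = read_by_id(SAMPLE_CSV)
financial = read_by_id(FINANCIAL_CSV)
linked = left_join(sample, financial)

for row in linked:
    # Firms without a financial record keep their web scraped fields only.
    print(row["firm_id"], row.get("revenue_eur", "no match"))
```

A left join is used here so that no firm from the sample is dropped when the external source has no matching record; whether that is the right choice depends on the analysis.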

Terms of reuse

CC0 Public Domain Dedication
Date made available: 4 Oct 2021

Keywords


  • Big data
  • Web scraped data
  • Text data
  • Firm-level data
  • Micro data

JEL classifications

  • O40 - Economic Growth and Aggregate Productivity: General
  • D24 - Production; Cost; Capital; Capital, Total Factor, and Multifactor Productivity; Capacity
  • O00 - Economic Development, Technological Change, and Growth
  • C30 - Multiple or Simultaneous Equation Models; Multiple Variables: General
