Moviegalaxies – Social networks in movies

  • Jermain Kaminski (Creator)
  • Michael Schober (Creator)
  • Raymond Albaladejo (Creator)
  • Oleksandr Zastupailo (Creator)
  • César Hidalgo (Creator)



This repository contains network graphs and network metadata from Moviegalaxies, a website providing network graph data from about 773 films (1915–2012). The data includes individual network graph data in Graph Exchange XML Format and descriptive statistics on measures such as clustering coefficient, degree, density, diameter, modularity, average path length, the total number of edges, and the total number of nodes.
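As a rough illustration of the descriptive statistics listed above, the following sketch reads a small Graph Exchange XML Format (GEXF) document using only Python's standard library and computes the number of nodes, the number of edges, the density, and the average degree. The sample graph, the character names, and the function name basic_stats are made up for this example and are not taken from the data set. The real files may use a different GEXF version or additional attributes, which is why tags are matched with a namespace wildcard (this requires Python 3.8 or later).

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written GEXF sample; NOT taken from the data set.
SAMPLE_GEXF = """<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
  <graph defaultedgetype="undirected">
    <nodes>
      <node id="a" label="ALICE"/>
      <node id="b" label="BOB"/>
      <node id="c" label="CAROL"/>
      <node id="d" label="DAVE"/>
    </nodes>
    <edges>
      <edge id="0" source="a" target="b" weight="3"/>
      <edge id="1" source="a" target="c" weight="1"/>
      <edge id="2" source="b" target="c" weight="2"/>
      <edge id="3" source="c" target="d" weight="1"/>
    </edges>
  </graph>
</gexf>"""


def basic_stats(gexf_text):
    """Return simple statistics for an undirected GEXF graph given as text."""
    root = ET.fromstring(gexf_text)
    # "{*}" matches the tag in any namespace, so the GEXF version does not matter.
    nodes = root.findall(".//{*}node")
    edges = root.findall(".//{*}edge")
    n, m = len(nodes), len(edges)
    # Density of an undirected simple graph: 2m / (n(n-1)).
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    # Every edge contributes to the degree of two nodes.
    average_degree = 2 * m / n if n else 0.0
    return {
        "nodes": n,
        "edges": m,
        "density": density,
        "average_degree": average_degree,
    }


stats = basic_stats(SAMPLE_GEXF)
print(stats)
```

Measures such as clustering coefficient, diameter, modularity, and average path length need a full graph library and are not covered by this sketch.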

Methods: We created a movie script parser and used the same-scene appearance of characters as a proxy for connectedness (each co-appearance counts as one degree unit per scene). Technical documentation will follow with the next version. Even after multiple manual checks, the data set may still contain minor errors. You are welcome to check back regularly, as we plan to expand and improve our database soon (both movies and series). Last but not least, we would be very grateful to know how you make use of our data. In recent years, we have learned about a variety of use cases, ranging from school and university education to robotics and museums. Such examples motivated us to continue. Thank you and enjoy!
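The co-appearance rule described in the methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the parser used to build the data set: the scene lists, the character names, and the function names co_appearance_edges and weighted_degree are hypothetical, and the sketch assumes the script has already been split into scenes with a known cast for each.

```python
from collections import Counter
from itertools import combinations

# Hypothetical parsed script: each scene lists the characters appearing in it.
SCENES = [
    ["ALICE", "BOB"],
    ["ALICE", "BOB", "CAROL"],
    ["CAROL", "DAVE"],
    ["ALICE"],
]


def co_appearance_edges(scenes):
    """Count, for every pair of characters, the scenes they share."""
    weights = Counter()
    for scene in scenes:
        # Deduplicate and sort so each pair has one canonical key per scene.
        cast = sorted(set(scene))
        for pair in combinations(cast, 2):
            weights[pair] += 1  # one unit per shared scene
    return weights


def weighted_degree(weights):
    """Sum the edge weights attached to each character."""
    degree = Counter()
    for (a, b), w in weights.items():
        degree[a] += w
        degree[b] += w
    return degree


edges = co_appearance_edges(SCENES)
degree = weighted_degree(edges)
print(dict(edges))
print(dict(degree))
```

A scene with a single character adds no edge, so characters who only ever appear alone would not show up in the edge list under this assumption.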
Date made available: 18 Jul 2018
Publisher: Harvard Dataverse
Temporal coverage: 1915–2012
