

Not sure if this would classify as a comment or an answer, but it's useful information nonetheless.

In reading this question I have to point this out: have you ever heard of this paper?

Arvind Narayanan and Vitaly Shmatikov. "Robust De-anonymization of Large Datasets (How to Break Anonymity of the Netflix Prize Dataset)". The University of Texas at Austin, February 5, 2008.

It's quite a famous paper and was even on the news when it got published. From the abstract:

"We present a new class of statistical de-anonymization attacks against high-dimensional micro-data, such as individual preferences, recommendations, transaction records and so on. Our techniques are robust to perturbation in the data and tolerate some mistakes in the adversary's background knowledge. We apply our de-anonymization methodology to the Netflix Prize dataset, which contains anonymous movie ratings of 500,000 subscribers of Netflix, the world's largest online movie rental service. We demonstrate that an adversary who knows only a little bit about an individual subscriber can easily identify this subscriber's record in the dataset. Using the >>>Internet Movie Database<<< as the source of background knowledge, we successfully identified the Netflix records of known users, uncovering their apparent political preferences and other potentially sensitive information."

I looked at their citations for clues, but the only thing they cite verbatim is the Internet Movie Database itself (marked above). However, going through the full text of that article, you may be able to glean some clues as to how they got their data and replicate their approach, so this could potentially help you.

As was alluded to in one of the comments, you should check out the source mentioned there, which would allow you to download the data either manually via FTP or through a terminal interface. I am pretty sure the restrictions on the data only pertain to commercial usage, but you should verify that before diving in head first.

The themoviedb API is free but has API request limits, and it does not natively include the IMDb ID in a typical search, which can make integrating data from multiple sources difficult.
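Since a typical themoviedb search does not return the IMDb ID, one workaround is a second request per movie. Below is a rough sketch of that two-step lookup, not an official client. It assumes the v3 base URL, the `/search/movie` and `/movie/{id}/external_ids` endpoints, an `api_key` query parameter, and that hitting the request limit produces an HTTP 429 with a `Retry-After` value in seconds. The helper names are made up for illustration, so check the current themoviedb documentation before relying on any of it.

```python
import json
import time
import urllib.error
import urllib.parse
import urllib.request

TMDB_BASE = "https://api.themoviedb.org/3"  # assumed v3 base URL


def build_url(path, api_key, **params):
    """Build a request URL with the API key passed as a query parameter."""
    query = urllib.parse.urlencode({"api_key": api_key, **params})
    return f"{TMDB_BASE}{path}?{query}"


def first_tmdb_id(search_payload):
    """Return the themoviedb id of the first search hit, or None if no hits."""
    results = search_payload.get("results") or []
    return results[0]["id"] if results else None


def extract_imdb_id(external_ids_payload):
    """Return the IMDb ID (e.g. 'tt0133093'), or None if missing or empty."""
    return external_ids_payload.get("imdb_id") or None


def fetch_json(url, max_retries=3):
    """GET a URL and decode JSON, waiting and retrying on HTTP 429."""
    for attempt in range(max_retries + 1):
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                return json.load(resp)
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == max_retries:
                raise
            # Assumes Retry-After is given in seconds; otherwise back off
            # exponentially (1s, 2s, 4s, ...).
            time.sleep(float(err.headers.get("Retry-After", 2 ** attempt)))


def imdb_id_for_title(title, api_key):
    """Search by title, then look up the IMDb ID of the first hit."""
    search = fetch_json(build_url("/search/movie", api_key, query=title))
    tmdb_id = first_tmdb_id(search)
    if tmdb_id is None:
        return None
    ids = fetch_json(build_url(f"/movie/{tmdb_id}/external_ids", api_key))
    return extract_imdb_id(ids)
```

Calling `imdb_id_for_title("The Matrix", "YOUR_KEY")` would cost two requests per title, so for a large catalogue the request limit becomes the bottleneck and caching the title-to-ID mapping locally is probably worthwhile. Taking the first search hit is also a simplification; matching on release year as well would reduce wrong joins.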
