De-anonymization
De-anonymization (also spelt deanonymization) is a data mining strategy in which anonymous data is cross-referenced with other sources of data to re-identify the source of the anonymous data.
More and more data is becoming publicly available over the Internet. Before release, such data is typically anonymized by applying techniques such as removing personally identifiable information (PII), for example names, addresses and social security numbers, to protect the privacy of its sources. This assurance of privacy allows the government to legally share limited data sets with third parties without requiring written permission. Such data has proved very valuable to researchers, particularly in health care. However, as the Netflix Prize contest dramatically revealed, so much data remains available even after anonymization that a specific individual's identity can be rediscovered.
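The cross-referencing described above can be illustrated with a minimal sketch. It assumes a release from which names have been removed but which still carries a few indirect attributes (here ZIP code, birth date and sex), and a second, public source that lists the same attributes alongside names. All records, field names and the `reidentify` function are hypothetical and chosen only for illustration; real attacks, including the one on the Netflix data, use different attributes and more tolerant matching.

```python
from collections import defaultdict

# Hypothetical "anonymized" release: names removed, indirect attributes kept.
anonymized = [
    {"zip": "02138", "birth_date": "1945-07-31", "sex": "F", "diagnosis": "hypertension"},
    {"zip": "02139", "birth_date": "1962-01-15", "sex": "M", "diagnosis": "asthma"},
    {"zip": "02139", "birth_date": "1962-01-15", "sex": "M", "diagnosis": "diabetes"},
]

# Hypothetical public source that pairs the same attributes with names.
public = [
    {"name": "Alice Example", "zip": "02138", "birth_date": "1945-07-31", "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_date": "1962-01-15", "sex": "M"},
    {"name": "Carol Example", "zip": "02140", "birth_date": "1980-03-02", "sex": "F"},
]

# Attributes shared by both data sets (an assumption of this sketch).
SHARED_FIELDS = ("zip", "birth_date", "sex")


def key(record):
    """Build the join key from the shared attributes."""
    return tuple(record[field] for field in SHARED_FIELDS)


def reidentify(anonymized_rows, public_rows):
    """Return (name, anonymized_row) pairs where the join key is unique on both sides."""
    anon_by_key = defaultdict(list)
    for row in anonymized_rows:
        anon_by_key[key(row)].append(row)

    public_by_key = defaultdict(list)
    for row in public_rows:
        public_by_key[key(row)].append(row)

    matches = []
    for k, anon_group in anon_by_key.items():
        public_group = public_by_key.get(k, [])
        # Only an unambiguous one-to-one match counts as a re-identification.
        if len(anon_group) == 1 and len(public_group) == 1:
            matches.append((public_group[0]["name"], anon_group[0]))
    return matches


matches = reidentify(anonymized, public)
for name, row in matches:
    print(name, "->", row["diagnosis"])
```

In this toy data only the first anonymized record is linked to a name. The other two share the same attribute values, so the match is ambiguous and the sketch leaves them unlinked, which is why the uniqueness of attribute combinations matters to how much a release reveals.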