A group of researchers has released a data set covering nearly 70,000 users of the online dating site OkCupid. The information was collected by Danish researchers who never contacted OkCupid or its users about using it.
Commenting on this, Rob Sobers, director at Varonis, said: "We have to live under the assumption that, if we make data public, it can and will be scraped and collected and archived permanently. The profile data is public, so technically this is not a hack or a breach. Anyone could easily get their hands on any individual user profile that is in the dump. However, what the researchers did was compile it all into one big structured data set, which makes it easy for both good guys and bad guys to analyse.
They should have stripped the usernames from the data dump to anonymise it. It was poor judgment not to do that. They claimed they left the usernames in the dump so that they could back-fill the dataset with more information in the future. But they could have used an anonymous unique ID and kept the mapping of anonymous IDs to usernames private, which would have solved that problem.
It’s helpful to create data dumps for studies. OkCupid does this themselves. They often release really interesting findings about their users based on aggregate data. But today we have to be more security and privacy conscious—publishing data dumps with PII and sensitive information without adequate de-identification isn’t a good thing.
I don’t think the researchers here were after bragging rights; it seems like they were just naïve vis-à-vis the privacy implications of compiling OkCupid’s data into an easy-to-exploit format without any prior notice to OkCupid or the people involved."
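The approach Sobers describes, replacing usernames with anonymous unique IDs and keeping the ID-to-username mapping private, could look roughly like the Python sketch below. It is an illustration only, not the researchers' code: the record fields, the sample usernames, and the `pseudonymise` helper are all hypothetical.

```python
import secrets


def pseudonymise(records, key="username"):
    """Replace each record's username with a random anonymous ID.

    Returns (public_records, private_mapping). Only public_records
    would be published; private_mapping stays with the researchers
    so they can back-fill the data set later.
    """
    mapping = {}  # username -> anonymous ID (keep this private)
    public = []
    for rec in records:
        name = rec[key]
        if name not in mapping:
            # Random ID, so it cannot be derived from the username.
            mapping[name] = secrets.token_hex(8)
        # Copy every field except the username itself.
        out = {k: v for k, v in rec.items() if k != key}
        out["user_id"] = mapping[name]
        public.append(out)
    return public, mapping


# Hypothetical sample records, invented for illustration.
records = [
    {"username": "alice_example", "age": 29, "answer": "yes"},
    {"username": "bob_example", "age": 34, "answer": "no"},
    {"username": "alice_example", "age": 29, "answer": "maybe"},
]

public, mapping = pseudonymise(records)
```

The same user receives the same ID across records, so the published data can still be analysed per user. Removing usernames is only a first step: other fields, such as free-text answers or location, may still allow individuals to be re-identified.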