AI researchers at Microsoft accidentally leaked 38TB of company data

The AI team was uploading training data so that other researchers could train image-recognition models when it accidentally exposed the data, which included “secrets, private keys, passwords, and over 30,000 internal Microsoft Teams messages,” according to cloud security platform Wiz, which first noticed the exposure.

Microsoft, in its own report on the leak, said that “no customer data was exposed, and no other internal services were put at risk because of this issue,” and that customers did not need to take any action.

The link to the files was created using an Azure feature called “SAS tokens,” which lets users create shareable links. Wiz discovered the exposed data on June 22 and alerted Microsoft. The token was revoked the next day, and Microsoft says it has fixed the issue and adjusted SAS tokens that were more permissive than intended.

“The information that was exposed consisted of information unique to two former Microsoft employees and these former employees’ workstations,” the company said. “No customer data was exposed, and no other Microsoft services were put at risk because of this issue. Customers do not need to take any additional action to remain secure. Like any secret, SAS tokens need to be created and handled appropriately. As always, we highly encourage customers to follow our best practices when using SAS tokens to minimize the risk of unintended access or abuse.”
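For readers wondering what handling a SAS token “appropriately” might look like in practice, here is a minimal sketch of a pre-share check. It relies only on the fact that a SAS link carries its granted permissions and expiry time in the `sp` and `se` query parameters of the URL. The function name `audit_sas_url`, the seven-day lifetime limit, the set of letters treated as risky, and the example URL are all assumptions made for illustration; they are not taken from Microsoft's guidance or from the Wiz report.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlsplit, parse_qs

# Hypothetical policy values, chosen only for this illustration.
MAX_LIFETIME = timedelta(days=7)
RISKY_PERMISSIONS = set("acdw")  # add, create, delete, write


def audit_sas_url(url, now=None):
    """Return a list of warnings for a SAS URL (empty list if none)."""
    now = now or datetime.now(timezone.utc)
    params = parse_qs(urlsplit(url).query)
    warnings = []

    # "sp" holds the granted permissions as single letters, e.g. "rl".
    granted = params.get("sp", [""])[0]
    risky = sorted(RISKY_PERMISSIONS & set(granted))
    if risky:
        warnings.append("grants more than read/list access: " + "".join(risky))

    # "se" holds the expiry time as an ISO 8601 UTC timestamp.
    expiry_raw = params.get("se", [None])[0]
    if expiry_raw is None:
        warnings.append("no expiry (se) parameter found")
    else:
        expiry = datetime.fromisoformat(expiry_raw.replace("Z", "+00:00"))
        if expiry.tzinfo is None:
            # Date-only values carry no zone; treat them as UTC.
            expiry = expiry.replace(tzinfo=timezone.utc)
        if expiry - now > MAX_LIFETIME:
            days = (expiry - now).days
            warnings.append(f"expires in {days} days, beyond the policy limit")

    return warnings


if __name__ == "__main__":
    # Made-up URL; the account, container, and signature are placeholders.
    example = (
        "https://exampleaccount.blob.core.windows.net/models"
        "?sp=racwdl&se=2051-01-01T00:00:00Z&sig=REDACTED"
    )
    for warning in audit_sas_url(example):
        print("WARNING:", warning)
```

A check like this only inspects the link itself. It cannot tell whether a token has already been shared or revoked, so it would complement, not replace, monitoring and revocation on the storage account.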

Wiz cautioned that these sorts of mistakes could become more common as AI is trained and used more often.

“This case is an example of the new risks organizations face when starting to leverage the power of AI more broadly, as more of their engineers now work with massive amounts of training data,” the group wrote. “As data scientists and engineers race to bring new AI solutions to production, the massive amounts of data they handle require additional security checks and safeguards.”