
The AI world was shaken - not by a new model release, but by a massive lapse in data security. Scale AI, one of the most high-profile startups in the space (fresh off a reported $14.8 billion investment from Meta), is now under the microscope for using public Google Docs to store and share extremely sensitive information related to clients like Meta, Google, and Elon Musk’s xAI.
The details are jaw-dropping. Confidential documents. Employee pay data. Training prompts. Codenamed AI projects. All openly accessible - just one click away from the wrong hands. One former Scale AI employee summed it up:
“The whole Google Docs system always seemed incredibly janky.”
They weren’t wrong. But the issue isn’t Google Docs. The issue is how companies use these tools - and what controls they fail to put in place.
What Happened at Scale AI
Let’s break it down:
- Critical customer data - including AI training docs for top clients such as Meta and Google - was stored in Google Drive.
- No access controls were enforced on sharing. In many cases, files were set to “anyone with the link can view” - the exact exposure the audit sketch below this list looks for.
- Thousands of documents were found public, including:
  - AI training materials.
  - Internal ratings of employees.
  - Pay information and contractor emails.
  - Confidential manuals from Google and Meta.
- Despite being labeled “confidential,” these documents were exposed to anyone who stumbled across a link - or was sent one.
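None of this takes sophisticated tooling to discover. As a rough, one-off illustration (and not a substitute for continuous monitoring), the Python sketch below uses the Google Drive API to list files in your own Drive that anyone with the link, or anyone on the web, can open. It assumes you have already obtained OAuth credentials with the drive.metadata.readonly scope; the function name is ours, for illustration only.

```python
# A rough audit sketch (not DoControl's implementation): list files in your
# Google Drive that are visible to anyone with the link or anyone on the web.
# Assumes OAuth credentials with the
# https://www.googleapis.com/auth/drive.metadata.readonly scope are already loaded.
from googleapiclient.discovery import build


def find_public_files(creds):
    """Return metadata for every file shared publicly or via 'anyone with the link'."""
    service = build("drive", "v3", credentials=creds)
    query = "visibility = 'anyoneWithLink' or visibility = 'anyoneCanFind'"
    public_files, page_token = [], None
    while True:
        resp = service.files().list(
            q=query,
            fields="nextPageToken, files(id, name, webViewLink, owners(emailAddress))",
            pageSize=100,
            pageToken=page_token,
        ).execute()
        public_files.extend(resp.get("files", []))
        page_token = resp.get("nextPageToken")
        if not page_token:
            break
    return public_files
```

A one-off script like this tells you what is exposed today. It says nothing about what gets exposed tomorrow, who already opened those links, or how to remediate at scale - which is exactly where process, not tooling, breaks down.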
This Isn’t a Google Problem. It’s a Process Problem.
The root cause here isn’t the use of Google Workspace. Millions of organizations rely on Google Drive daily - including the most security-conscious enterprises.
The problem is a complete lack of automated controls, guardrails, and visibility.
At DoControl, we’ve seen this story play out time and time again:
- Sensitive files are shared externally - either accidentally or through a “good enough” business process.
- Over time, those shares accumulate. Hundreds turn into thousands.
- No one knows what’s exposed, to whom, or why - until it’s too late.
When you don’t have continuous monitoring, lifecycle governance, or automated remediation for SaaS file sharing, risk compounds silently.
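What does “automated remediation” actually look like? At its simplest: find the risky permission and revoke it, without waiting for a ticket. The sketch below is illustrative only - not DoControl’s implementation - and assumes full Drive-scope credentials plus a file ID surfaced by an audit like the one above.

```python
# An illustrative remediation sketch (not DoControl's implementation): strip every
# public "anyone" permission from a file an audit has flagged as exposed.
# Assumes credentials with the https://www.googleapis.com/auth/drive scope.
from googleapiclient.discovery import build


def revoke_public_access(creds, file_id):
    """Delete all 'anyone' (public / anyone-with-link) permissions on one file."""
    service = build("drive", "v3", credentials=creds)
    perms = service.permissions().list(
        fileId=file_id,
        fields="permissions(id, type, role)",
    ).execute()
    for perm in perms.get("permissions", []):
        if perm["type"] == "anyone":
            service.permissions().delete(
                fileId=file_id, permissionId=perm["id"]
            ).execute()
```

Run continuously against newly detected public shares, even something this simple shrinks the exposure window dramatically.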
Why This Matters Now
AI companies are collecting some of the most sensitive and strategic data in the world - data that powers billion-dollar models and proprietary products. The value of this data makes it a top target for competitors, adversaries, and insiders alike.
If these files are leaking out through unmanaged collaboration tools, the consequences are existential.
How DoControl Could Have Prevented This
Let’s imagine a different version of this story - one where Scale AI had deployed DoControl.
Here’s what would have happened:
- Real-time visibility into all Google Drive file sharing activity.
- Automated policies flagging and revoking public or external shares that contain sensitive terms (e.g., “confidential,” “training data,” “payroll”).
- Time-based access controls - ensuring that temporary shares expire and access is never “forever” (see the sketch after this list).
- Granular audit logs showing who accessed what, when, and why.
- Self-serve access requests replacing risky workarounds.
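To make the time-based access point concrete, here is a small illustrative sketch (again ours, not DoControl’s implementation) that grants a single user read access and lets Google Drive revoke it automatically. The recipient address and the seven-day default are placeholder assumptions, and it requires write-scope Drive credentials:

```python
# An illustrative sketch of time-bounded sharing (not DoControl's implementation):
# grant a single user read access and let Drive revoke it automatically.
# Requires credentials with the https://www.googleapis.com/auth/drive scope.
import datetime

from googleapiclient.discovery import build


def share_with_expiry(creds, file_id, email, days=7):
    """Give `email` reader access to `file_id`, expiring automatically after `days` days."""
    service = build("drive", "v3", credentials=creds)
    expires = (
        datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(days=days)
    ).isoformat()
    return service.permissions().create(
        fileId=file_id,
        body={
            "type": "user",
            "role": "reader",
            "emailAddress": email,
            "expirationTime": expires,  # RFC 3339 timestamp; Drive enforces the expiry
        },
        sendNotificationEmail=False,
    ).execute()
```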
With DoControl, the data never goes dark - and access never goes unmanaged. Check out a recent blog we wrote on this exact scenario - how to know if sensitive files are publicly shared in your organization.
A $14B Wake-Up Call
The irony here is sharp: a company at the cutting edge of AI, freshly tied to Meta’s push to build a “superintelligence” lab, was undone by a primitive data security gap - one that could have been avoided with modern SaaS security hygiene.
If your organization is building AI products, storing customer IP, or simply operating in a cloud-first world, this isn’t someone else’s problem.
This is your problem too.