Feb 20, 2024

How DoControl and Google Workspace AI Labels Solve for Data Security

A few months ago, Google launched AI-powered data classification for Workspace customers to automatically label files across Google Drive. This blog post explains Google Drive Labels, how Google AI automatically creates Labels, and how DoControl combines these Labels with HR and IDP metadata to solve for Data Security.

Google Drive Labels

Google Workspace customers maintain millions of files stored in Google Drive. While customers traditionally catalog and organize these files in different Shared Drive folders, most files are actually stored within users’ My Drive folders. This is a result of the default file creation experience in Google Drive, especially when using shortcuts such as docs.new, sheets.new, or slides.new to kick off new documents that are saved to My Drive instantly.

As such, Google Drive data is spread across personal and shared drives by default. Therefore, it’s essential that organizations use Drive Labels to help organize, find, and apply policies to files in Drive.

There are four methods to create Drive Labels: Manual, DLP, Vault, and AI.

How Google uses AI to automatically create Labels

DoControl recommends that Google Workspace customers leverage Google AI classification labels because they are more accurate, cover more use cases, and require significantly less maintenance.

According to the official documentation, Google’s AI classification is enabled through a training process in which specific users (“designated labelers”) respond to automatically generated labels to help train the model and improve accuracy. Based on users’ examples and responses, the model begins to learn how to similarly classify sensitive files. 

After about a week of training, Google Admins are prompted to turn on automatic classification. Google provides monitoring on how many files are classified, accuracy level, etc. 

How DoControl combines Labels with HR, IDP, and End-User Business Context

Use Labels in Assets Inventory

DoControl automatically updates Google Drive file metadata based on user activity events as well as Google Labels activity. With Labels in hand, customers can filter through the DoControl Assets Inventory and correlate between Google Labels, Sharing Status, Data Ownership, External Collaborators, File Activity/Inactivity, and much more. From there, customers can take bulk actions, such as external sharing cleanup, data ownership transfer, etc.

Full Data Enrichment

Advanced Filtering

Use Labels in Workflows

DoControl Automated Workflows are triggered by user activity events, on a recurring schedule, or manually by DoControl users. Workflows are granular, scalable, and sophisticated, which lets them mitigate a wide range of threat models. Workflows combine Google Drive Labels, HRIS Employment Status, IDP Group Membership, and End-User Business Context to narrow down the scope and solve critical use cases with high confidence.

Top Customer Use Cases

1. Attack Surface Discovery 

DoControl aggregates all Google Labels (Manual, DLP, Vault, AI) across all Google Drive files (My Drive, Shared Drive, Org Units) to enrich its assets inventory with data classification information. From there, DoControl surfaces metrics displaying what percentage of data is sensitive, exposed, overshared internally, inactive, accessed by former employees/vendors, etc. Customers can export reports describing the current status of their Google Drive attack surface to assess the risk and cost of a potential data breach, as well as list concrete action items.
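To make the idea of attack surface metrics concrete, here is a minimal Python sketch. It is not DoControl's implementation or API: the DriveFile record, its field names, the label names, and the 90-day inactivity threshold are all assumptions made for illustration. It only shows how label data joined with sharing and activity data can be turned into percentages.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DriveFile:
    """Hypothetical inventory record; the field names are illustrative only."""
    name: str
    labels: set[str] = field(default_factory=set)  # e.g. {"PII"}
    shared_externally: bool = False
    days_since_last_activity: int = 0


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1) if whole else 0.0


def attack_surface_metrics(
    files: list[DriveFile],
    sensitive_labels: set[str],
    inactive_after_days: int = 90,  # assumed threshold, not a product default
) -> dict[str, float]:
    """Summarize what share of all files is sensitive, exposed, or inactive."""
    total = len(files)
    # A file counts as sensitive if it carries at least one sensitive label.
    sensitive = [f for f in files if f.labels & sensitive_labels]
    exposed = [f for f in sensitive if f.shared_externally]
    inactive = [
        f for f in sensitive
        if f.days_since_last_activity >= inactive_after_days
    ]
    return {
        "sensitive_pct": percent(len(sensitive), total),
        "sensitive_exposed_pct": percent(len(exposed), total),
        "sensitive_inactive_pct": percent(len(inactive), total),
    }


# Made-up sample inventory.
inventory = [
    DriveFile("payroll.xlsx", {"PII"}, shared_externally=True,
              days_since_last_activity=200),
    DriveFile("roadmap.docx", {"Intellectual Property"},
              days_since_last_activity=3),
    DriveFile("lunch-menu.docx"),
    DriveFile("patients.csv", {"PHI"}, shared_externally=True,
              days_since_last_activity=10),
]
metrics = attack_surface_metrics(
    inventory, {"PII", "PHI", "Intellectual Property"}
)
print(metrics)
```

With this sample data, three of the four files are sensitive, two of those are shared externally, and one has been inactive for more than 90 days.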

2. Bulk Remediation / Cleanup 

At the most basic level, customers can filter Google Drive files based on their labels, activity/inactivity, data owners, external collaborators, sharing status, and much more. From there, customers can run a bulk remediation action removing millions of permissions all at once. This is extremely helpful in cleaning up unauthorized access, inactive permissions, and sensitive overexposures both internally and externally.

3. Internal “Ethical Walls”

Users store sensitive data in both My Drive and Shared Drive. In many cases, users prefer to share internally with anyone with the link as Editor and simply send the link in emails or Slack to collaborate with multiple users. As a result, a significant amount of sensitive data is overexposed to unauthorized users. DoControl Workflows can ensure that only specific team members can access specific data, whether in My Drive or Shared Drive, that carries the relevant Google Labels. For example, a Workflow can ensure that only Finance team members can access Finance data within the Finance Shared Drive, or within any My Drive file carrying the relevant Google Labels.
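The Finance example can be sketched as a simple check that joins file labels with group membership. This is an illustrative Python sketch only: the Permission record, the "Finance" label, the "finance-team" group name, and the email addresses are hypothetical, and in practice group membership would come from an IDP rather than a hard-coded dictionary.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Permission:
    """Hypothetical record: one user's access to one labeled file."""
    file_name: str
    file_labels: frozenset[str]
    user_email: str


# Assumed policy: files with this label may only be accessed by this group.
LABEL_TO_ALLOWED_GROUP = {"Finance": "finance-team"}


def find_violations(
    permissions: list[Permission],
    group_members: dict[str, set[str]],
    policy: dict[str, str] = LABEL_TO_ALLOWED_GROUP,
) -> list[Permission]:
    """Return permissions that grant a labeled file to a non-member."""
    violations = []
    for perm in permissions:
        for label in perm.file_labels:
            allowed_group = policy.get(label)
            if allowed_group is None:
                continue  # this label is not governed by the policy
            if perm.user_email not in group_members.get(allowed_group, set()):
                violations.append(perm)
                break  # one violating label is enough to flag the permission
    return violations


# Made-up sample data.
group_members = {"finance-team": {"ana@example.com"}}
permissions = [
    Permission("budget.xlsx", frozenset({"Finance"}), "ana@example.com"),
    Permission("budget.xlsx", frozenset({"Finance"}), "bob@example.com"),
    Permission("notes.docx", frozenset(), "bob@example.com"),
]
violations = find_violations(permissions, group_members)
for v in violations:
    print(f"{v.user_email} should not have access to {v.file_name}")
```

Only the second permission is flagged: bob@example.com is not in the Finance group but can access a Finance-labeled file. The unlabeled notes.docx is outside the wall and is left alone.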

4. Granular External Sharing Auto-Expiration

Not all external collaborations are created equal. While some require longer-term collaboration, most external sharing becomes irrelevant after X days. DoControl leverages Google Labels to auto-expire external sharing of labeled data to ensure no company information is exposed forever. This is also true for public sharing.
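One way to picture label-driven expiration is a per-label value for "X days". The Python sketch below is an assumption-laden illustration, not DoControl's logic: the ExternalShare record, the label names, and the 7-day and 30-day limits are invented for the example. When a file carries several labels, the sketch applies the strictest limit.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ExternalShare:
    """Hypothetical record of a file shared with an external collaborator."""
    file_name: str
    labels: frozenset[str]
    collaborator: str
    shared_on: date


# Assumed policy: maximum age of an external share, in days, per label.
EXPIRY_DAYS_BY_LABEL = {"PII": 7, "Intellectual Property": 30}


def expired_shares(
    shares: list[ExternalShare],
    today: date,
    policy: dict[str, int] = EXPIRY_DAYS_BY_LABEL,
) -> list[ExternalShare]:
    """Return shares of labeled files that have outlived their allowed age."""
    expired = []
    for share in shares:
        limits = [policy[label] for label in share.labels if label in policy]
        if not limits:
            continue  # no governed label, so no expiration in this sketch
        # With several labels, the shortest limit wins.
        if today - share.shared_on > timedelta(days=min(limits)):
            expired.append(share)
    return expired


# Made-up sample data.
shares = [
    ExternalShare("customers.csv", frozenset({"PII"}),
                  "vendor@example.org", date(2024, 2, 1)),
    ExternalShare("design.pdf", frozenset({"Intellectual Property"}),
                  "agency@example.org", date(2024, 2, 10)),
    ExternalShare("agenda.docx", frozenset(),
                  "guest@example.org", date(2023, 1, 1)),
]
expired = expired_shares(shares, today=date(2024, 2, 20))
for share in expired:
    print(f"Expire {share.collaborator} on {share.file_name}")
```

Here only customers.csv is expired: it has been shared for 19 days against a 7-day limit, while design.pdf is 10 days into a 30-day window.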

5. Departing Employee Data Theft 

DoControl integrates with your HRIS platforms, such as Workday, HiBob, or BambooHR, which allows for monitoring of departing employees, who pose a much higher risk by definition. With Google Labels in place, DoControl can detect and respond to potential exfiltration of sensitive data by departing employees.
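Conceptually, this detection joins three signals: who is leaving (from the HRIS), what the file is (from Google Labels), and what was done with it (from activity events). The Python sketch below is a simplified illustration under stated assumptions: the ActivityEvent record, the action names, and the sample emails are hypothetical and do not reflect any HRIS or DoControl schema.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ActivityEvent:
    """Hypothetical Drive activity event; field names are illustrative."""
    actor: str
    action: str  # assumed values: "view", "download", "share_external"
    file_name: str
    file_labels: frozenset[str]


# Assumed set of actions that can move data out of the organization.
RISKY_ACTIONS = {"download", "share_external"}


def flag_exfiltration(
    events: list[ActivityEvent],
    departing_employees: set[str],
    sensitive_labels: set[str],
) -> list[ActivityEvent]:
    """Return risky actions by departing employees on sensitive files."""
    return [
        event for event in events
        if event.actor in departing_employees
        and event.action in RISKY_ACTIONS
        and event.file_labels & sensitive_labels
    ]


# Made-up sample data; the departing list would come from an HRIS feed.
departing = {"dana@example.com"}
events = [
    ActivityEvent("dana@example.com", "view", "plan.docx",
                  frozenset({"Intellectual Property"})),
    ActivityEvent("dana@example.com", "share_external", "plan.docx",
                  frozenset({"Intellectual Property"})),
    ActivityEvent("dana@example.com", "download", "menu.docx", frozenset()),
    ActivityEvent("eli@example.com", "download", "plan.docx",
                  frozenset({"Intellectual Property"})),
]
flagged = flag_exfiltration(events, departing, {"Intellectual Property", "PII"})
for event in flagged:
    print(f"ALERT: {event.actor} {event.action} {event.file_name}")
```

Only the external share of plan.docx by the departing employee is flagged. Viewing is not treated as risky here, the unlabeled download is ignored, and the same download by a non-departing employee falls outside this particular rule.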


  • Set Up Labels: Google Workspace Enterprise customers should start using a combination of DLP and AI classification labels to tag their entire Google Drive environment with relevant labels (intellectual property, PII, PCI, PHI, etc.).
  • Review Attack Surface: With a fully labeled Google Drive environment, sign up and integrate DoControl to understand your entire attack surface across Shared Drive, My Drive, Org Units, IDP groups, HRIS departments, External Collaborators, etc. 
  • Clean Up Technical Debt: Identify and execute low/no risk remediation action items, such as external sharing cleanup of inactive labeled files, cleanup of publicly shared labeled data, removal of internal anyone-with-the-link permissions for highly sensitive labeled data, etc.
  • Set Up Automated Workflows: For high-risk scenarios, such as departing employees sharing sensitive, labeled data, set up automated workflows to remediate right away.
  • Schedule Workflows: Trigger a Workflow every 90 days to search for inactive, labeled data shared with external collaborators and perform cleanups automatically.
  • Empower End-Users: In low-confidence scenarios where labeled data is being shared with no business justification, use the DoControl Slack Bot and/or Emails to get business context from end-users to determine the right course of action with high confidence.

Adam Gavish is the Co-Founder and Chief Executive Officer of DoControl. Adam brings 15 years of experience in product management, software engineering, and network security. Prior to founding DoControl, Adam was a Product Manager at Google Cloud, where he led ideation, execution, and strategy of Security & Privacy products serving Fortune 500 customers. Before Google, Adam was a Senior Technical Product Manager at Amazon, where he launched customer-obsessed products improving the payment experience for 300M customers globally. Before Amazon, Adam was a Software Engineer in two successfully acquired startups, eXelate for $200M and Skyfence for $60M.

Adam is a lifetime information geek, breaking down business and technical problems into components to generate long-term learning. He loves running outdoors, playing with LEGOs with his son, and watching a good movie with his wife.

Adam holds a B.S. in Computer Science from the Academic College of Tel-Aviv Yafo and an MBA from the Johnson Graduate School of Management at Cornell University.
