The Seattle Today

OpenAI Contractors Asked to Upload Real Work Files, Raising IP and Confidentiality Concerns

by Favour Bitrus
January 12, 2026
in International, Technology
Picture Credit: TechCrunch

OpenAI and training data company Handshake AI are asking third-party contractors to upload real work they did in past and current jobs, according to a report in Wired, as part of a broader AI industry strategy to generate high-quality training data that could eventually allow models to automate white-collar work. A company presentation reportedly asks contractors to describe tasks performed at other jobs and upload examples of “real, on-the-job work” including “concrete output” files like Word documents, PDFs, PowerPoints, Excel spreadsheets, images, and code repositories. OpenAI instructs contractors to delete proprietary and personally identifiable information before uploading and points them to a ChatGPT “Superstar Scrubbing” tool for this purpose, but intellectual property lawyer Evan Brown warns the approach puts OpenAI “at great risk” by requiring “a lot of trust in its contractors to decide what is and isn’t confidential.”

The fundamental problem is that the contractors being asked to determine what constitutes proprietary or confidential information in their past work likely lack the legal expertise to make those determinations accurately. What seems like non-sensitive material to a contractor might contain trade secrets, competitive intelligence, client information, or intellectual property that former employers consider protected. A marketing presentation might reveal strategic positioning that competitors could exploit. A code repository might contain proprietary algorithms or architectural decisions that constitute valuable IP. An Excel spreadsheet might include formulas, methodologies, or data structures that represent significant business investment. Asking contractors to judge confidentiality creates massive liability exposure for both the contractors, who might violate non-disclosure agreements or employment contracts, and OpenAI, which could face lawsuits from companies whose proprietary information ends up in training data.

The “Superstar Scrubbing” tool that OpenAI provides for removing sensitive information raises questions about its effectiveness and the standards it applies. If the tool uses AI to identify and remove proprietary information, it might miss context-specific confidentiality that requires industry knowledge or legal judgment. If it relies on contractors manually identifying sensitive content, it provides no more protection than asking contractors to delete such information themselves. Either way, the burden of determining confidentiality falls on the people least equipped to make those judgments: contractors performing data-labeling work rather than lawyers with expertise in intellectual property, employment law, and confidentiality agreements.
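
To see why pattern-based scrubbing falls short, consider a minimal, purely hypothetical redaction pass written in Python. Nothing about OpenAI’s actual tool has been reported publicly; this sketch, with invented names and values throughout, only illustrates the gap between what pattern matching can catch and what a confidentiality review requires.

    import re

    # Illustrative only: patterns for the most mechanically detectable PII,
    # email addresses and US-style phone numbers. This is not OpenAI's tool.
    PII_PATTERNS = [
        re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),          # email addresses
        re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),  # phone numbers
    ]

    def scrub(text):
        # Redact anything matching a known PII pattern.
        for pattern in PII_PATTERNS:
            text = pattern.sub("[REDACTED]", text)
        return text

    doc = ("Contact Jane Doe at jane.doe@acme.com or 206-555-0142. "
           "Q3 margin target: 42 percent via the tiered-rebate model.")
    print(scrub(doc))
    # The email and phone number are redacted, but the employee's name,
    # the margin target, and the rebate methodology, plausibly a trade
    # secret, pass through untouched: recognizing them takes business
    # context, not pattern matching.

Whatever form the real tool takes, the limitation the sketch illustrates is structural: material is confidential because of what it means to a particular business, and no generic filter has access to that context.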

For contractors being asked to provide this material, the request creates significant personal risk. Most employment contracts and non-disclosure agreements prohibit sharing work product with third parties, even after employment ends. Violating those agreements can result in lawsuits seeking damages, injunctions preventing future work in the industry, and reputational harm that affects career prospects. A contractor who uploads work files to OpenAI believing they’ve adequately scrubbed sensitive information might discover years later that a former employer considers the uploads to violate confidentiality agreements, potentially resulting in litigation the contractor can’t afford to defend against and might lose despite good-faith efforts to comply with OpenAI’s scrubbing instructions.

The broader strategy Wired describes, in which AI companies hire contractors to generate high-quality training data with the eventual aim of automating white-collar work, reveals the ultimate goal: using examples of actual professional work to train models that can perform those same tasks. If OpenAI can collect thousands of real marketing presentations, legal briefs, financial analyses, code repositories, and other professional work products, its models can learn the patterns, structures, and techniques that practitioners use in actual business contexts rather than synthetic examples created for training purposes. That real-world data presumably produces models better at generating work that meets professional standards, but it requires access to vast quantities of proprietary professional work that companies and individuals created through significant investment.

The tension between OpenAI’s need for high-quality professional training data and the intellectual property rights of the companies that created or paid for that work reflects fundamental questions about AI training that remain legally unresolved. Can contractors provide their past work to AI companies for training purposes? Does that violate employment agreements? Does it constitute theft of trade secrets if the work contains proprietary information? Courts haven’t definitively answered these questions, creating a gray area where AI companies push boundaries by asking contractors for material that might or might not be legally available for training use.

For Seattle’s tech industry, where many workers have employment contracts with IP assignment clauses stating that work created during employment belongs to the company, not the employee, this OpenAI practice creates particular concern. A software engineer who worked at Amazon, Microsoft, or local startups and uploads code repositories to OpenAI might be violating IP assignment agreements even if the code seems generic or non-confidential. A product manager who uploads strategy documents might be sharing competitive intelligence that former employers consider protected. A data scientist who uploads analysis spreadsheets might be revealing methodologies and approaches that constitute valuable IP. Whether those uploads violate contracts depends on specific agreement language and the nature of material shared, but the risk is substantial enough that employment lawyers would likely advise clients to refuse such requests rather than attempt to determine what’s safe to share.

The instruction to delete “proprietary and personally identifiable information” treats those categories as if they’re clearly defined and easily identified, but in practice determining what constitutes proprietary information requires legal analysis of specific contractual relationships and business contexts. Information that seems generic might be proprietary if it reveals processes, methodologies, or insights that give a company competitive advantage. A template might be proprietary if it embodies strategic approaches developed through significant investment. Even publicly available information assembled in specific ways might constitute protectable trade secrets if the compilation itself provides value. Contractors tasked with making these determinations will inevitably make errors, either being overly cautious and providing less useful training data, or being insufficiently cautious and sharing material that violates confidentiality.

The approach also raises questions about informed consent from the clients and employers whose work ultimately ends up in training data. When a contractor uploads a presentation created for a client, did that client consent to their proprietary strategic thinking being used to train AI models that might eventually serve their competitors? When a contractor uploads code written for an employer, did that employer agree to their IP being incorporated into commercial AI products? The contractual relationship between OpenAI and contractors doesn’t include the third parties whose work is being collected, creating potential liability for using material without authorization from actual rights holders.

Intellectual property lawyer Evan Brown’s warning that this approach puts OpenAI “at great risk” reflects the scale of potential liability. If dozens or hundreds of contractors upload thousands of files containing proprietary information from their past employers, and those employers discover their IP in OpenAI’s training data or in outputs from OpenAI’s models, the resulting litigation could involve claims for misappropriation of trade secrets, breach of contract, copyright infringement, and other IP violations. Even if OpenAI argues it relied on contractors’ representations that material was properly scrubbed, that might not provide an adequate defense if courts determine OpenAI should have implemented more robust verification processes before accepting potentially confidential material.

The requirement for “concrete output” rather than summaries reveals OpenAI’s need for actual work products that models can learn from directly. Summaries or descriptions of work provide less training value than the work itself because models learn by analyzing patterns, structures, and techniques visible in real documents. A summary of a marketing strategy doesn’t show how that strategy was presented, formatted, or argued in the actual deliverable to the client. The real PowerPoint deck shows all those elements, providing rich training data about professional communication standards, persuasive techniques, and industry conventions. But requiring actual files rather than summaries exponentially increases IP risk because files are more likely to contain proprietary information that summaries would omit.

For OpenAI’s competitors developing their own AI models, watching this approach provides strategic intelligence about how OpenAI is sourcing training data. If collecting real professional work through contractors proves effective for improving model capabilities, competitors might adopt similar approaches despite legal risks. If it triggers significant litigation or regulatory backlash, competitors might avoid similar practices. The fact that this information leaked to Wired suggests either whistleblowing by concerned contractors or sources worried about the ethical and legal implications of the practice, indicating internal discomfort with the approach even as company leadership pursues it.

The OpenAI spokesperson’s decision to decline comment is notable because the company typically defends its training data practices publicly when questioned. Silence suggests either that the practice is still being refined and leadership doesn’t want to commit to defending it publicly, or that legal concerns about discussing it outweigh the benefits of transparency. Either way, the lack of a public defense indicates this isn’t a practice the company is eager to spotlight, even as it apparently pursues it behind the scenes through contractor relationships.

For contractors considering whether to comply with these requests, the calculation involves weighing compensation from OpenAI against personal legal risk and ethical concerns about sharing work that might violate confidentiality agreements or harm former employers and clients. Some contractors might refuse, deciding the risk isn’t worth whatever OpenAI pays for training data contributions. Others might comply, either believing they can adequately identify and remove sensitive information or simply prioritizing immediate income over potential future liability. That puts contractors in an impossible position: making legal determinations they’re not qualified to make, facing consequences if they guess wrong, with minimal protection if disputes arise.

The name of the ChatGPT “Superstar Scrubbing” tool is itself interesting, suggesting either internal tool-naming conventions at OpenAI or a branded feature marketed to contractors as solving their confidentiality concerns. Whether the tool provides reliable scrubbing of proprietary information, or whether it primarily creates the appearance of due diligence while pushing liability onto contractors who must ultimately judge what to scrub, determines whether the practice offers real protection or just plausible deniability for OpenAI when confidential information inevitably ends up in training data.

For companies whose employees become contractors for OpenAI or similar firms, this practice creates new risks that employment agreements and IP protections might not adequately address. Traditional non-disclosure and IP assignment agreements assume employees might share confidential information through intentional disclosure to competitors or through careless security practices. They might not contemplate scenarios where AI training companies systematically solicit professional work products from contractors who previously worked at other companies, creating industrial-scale potential for confidential information to flow into training data that gets incorporated into commercial products serving entire industries including direct competitors.

The ultimate question is whether the AI industry’s need for high-quality training data justifies approaches that require contractors to make complex legal judgments about confidentiality and intellectual property rights, exposing them to personal liability while potentially violating rights of third parties whose work is being collected. OpenAI’s apparent willingness to pursue this approach despite obvious legal risks suggests the competitive pressure to improve model capabilities through better training data outweighs concerns about potential litigation or regulatory backlash. Whether that calculation proves correct depends on whether the practice generates lawsuits that create precedents restricting how AI companies can source training data, and whether regulators eventually impose restrictions that retroactively make current practices illegal or actionable.

For now, contractors face requests to upload real work from past jobs, armed with minimal guidance on determining confidentiality and tools of uncertain effectiveness for scrubbing sensitive information, while being asked to make legal judgments that could expose them to liability if they judge incorrectly. That arrangement serves OpenAI’s need for training data while pushing risk onto the most vulnerable participants in the AI development process: contractors with limited resources to defend against potential litigation from former employers whose proprietary information ends up training models that might eventually compete with their businesses.

Tags: AI model training methods, AI training data sources, AI training liability, code repository uploads, confidential information AI, contractor legal risk, contractors upload work files, employment contract violations, Evan Brown AI, Handshake AI, intellectual property AI training, IP lawyer AI concerns, NDA violations AI, OpenAI confidentiality concerns, OpenAI contractor requests, OpenAI data collection, OpenAI training data, professional work training data, proprietary documents AI, proprietary information AI, real work examples, Seattle tech employment, Superstar Scrubbing tool, trade secrets AI, white-collar automation, work product AI training