Working with Artificial Intelligence: Privacy Pitfalls (and Opportunities)

Blog Post

09.05.2023

Global Privacy & Security Blog

By:

John Pavolotsky

As consumer demand for new artificial intelligence (“AI”) tools continues to grow, businesses must be prepared to build tools with “privacy by design” principles in mind, and to remain educated about privacy best practices and risk mitigation strategies when working with AI. The following areas provide the greatest opportunities to manage data privacy risks and effectively leverage the potential benefits of working with AI:

1. Consent is Key: Unlike past generations of AI, most of which embraced a rules-based model, the present and the future of deep learning tools and other AI systems are built with data – lots of data. Deep learning and other neural network-based AI systems rely upon huge troves of training data to enable systems to learn and improve at performing specific tasks.

Training data is usually not created – it is acquired – and acquired data may come with copyrights, privacy rights in personal information, or other risks. Some of the most popular datasets have already been the subject of privacy claims over ill-gotten training data, resulting in serious penalties. Recent FTC enforcement actions have included orders not only to delete any improperly obtained data, but also to destroy all products created using ill-gotten data, forfeit any right to create future products using such data, or forfeit the right to commercialize the products created using the ill-gotten data.

Businesses must take heed to ensure training data does not contain personal information. While many privacy laws carve out publicly available information from the definition of personal information, some, like GDPR, still require notification of the data subject even if data is acquired from publicly available sources. Further, it may be unclear whether the subject made the data available to an unrestricted audience, which may require a case-by-case inquiry. Even harvested data shared without restriction by a user may nonetheless have been acquired in violation of a third party’s terms of use, which commonly prohibit scraping, crawling, and commercial exploitation of user content, among other things.

If the training data must contain personal information for the applicable AI product, consider whether the data aggregator (1) can demonstrate it employed a process for screening or filtering out personal data belonging to minors or sensitive categories of personal data, (2) properly notified and disclosed to the data subjects that the data would be subject to sharing or sale, (3) disclosed the purpose for collecting the data sufficiently to enable the intended use, and (4) can demonstrate any necessary consents were obtained prior to collecting the data. Maintaining records demonstrating compliance can significantly aid a business using this data in defending itself against claims and in making full disclosures for due diligence in a merger or sale.
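For teams that track vendor diligence in software, the four questions above can be kept as a simple structured record. The following Python sketch is illustrative only: the class name, field names, and helper function are hypothetical, and a "yes" in the record is no substitute for the underlying documentation or legal review.

```python
from dataclasses import dataclass, fields


@dataclass
class AggregatorDiligence:
    """Hypothetical record of answers to the four questions above."""
    screens_minors_and_sensitive_data: bool       # question (1)
    notified_subjects_of_sharing_or_sale: bool    # question (2)
    disclosed_purpose_covers_intended_use: bool   # question (3)
    consents_obtained_before_collection: bool     # question (4)


def open_items(record):
    """Return the names of the questions not yet answered 'yes'."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Example: the aggregator could not show that its stated purpose
# covers the intended use of the data.
record = AggregatorDiligence(True, True, False, True)
print(open_items(record))
```

Keeping the answers in one record per aggregator makes it easier to produce the compliance records described above when a claim or a due diligence request arrives.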

2. Imitation is Not Always Flattery: One of the greatest strengths of generative AI (systems capable of creating media using generative models) is the ability to make imitations, such as photos, voices, art, and other media, that are indistinguishable from the genuine data being sampled. These “deepfakes” have evolved from the era of generative adversarial networks to a new generation of architectures, spearheaded by the likes of Midjourney and Stable Diffusion, each of which has been targeted with class action claims. While many of the claims against generative AI companies center on copyrights, these tools present additional risks vis-à-vis state privacy laws (for training data, as discussed previously) and privacy tort claims.

Imitations of human subjects could expose companies to risks under privacy tort claims for (mis)appropriation or false light. A claim for appropriation exists where a party uses the name or likeness of another, without consent, for commercial gain. While imitating certain subjects may create entertaining marketing opportunities, commercializing the image of an unwilling subject may invite liability. Additionally, depicting a person’s likeness in any manner which places the subject in a false light that would be highly offensive to a reasonable person could create exposure to false light claims. Nevertheless, disclaimers can play an important role in mitigating the risk that a viewer could interpret generated content as reality.

3. Automated Discrimination - Garbage in, Garbage out: All data has a source, and not all sources are unbiased. Data sourced by web-scraping user contributions in forums, a common data source used, for example, in popular Stack Exchange datasets, will disproportionately represent the universe of people who participate in the space where the data is harvested. When these datasets are used to create AI systems that perform tasks like language modeling, the outputs of the AI will disproportionately reflect that same universe of people. Left unchecked, this could result in unintended consequences. For example, if the tool were applied to screening job applications, it may be prone to discarding applicants whose applications do not align with the population used to form the dataset. Accordingly, businesses should review the white papers associated with major datasets, if available, and consider what biases underlie the data and how to account for those biases to avoid discrimination.
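One rough way to start the bias review described above is to compare how groups are represented in a dataset against a reference population. The Python sketch below is a minimal illustration under stated assumptions: the group labels, counts, reference shares, and function name are all hypothetical, and it presumes the business has a lawful and reliable way to estimate both sets of figures.

```python
def representation_gap(dataset_counts, reference_shares):
    """Return, per group, the dataset share minus the reference share.

    A positive value means the group is over-represented in the dataset
    relative to the reference population; a negative value means it is
    under-represented.
    """
    total = sum(dataset_counts.values())
    if total == 0:
        raise ValueError("dataset_counts has no records")
    return {
        group: dataset_counts.get(group, 0) / total - share
        for group, share in reference_shares.items()
    }


# Hypothetical figures, for illustration only.
counts = {"group_a": 800, "group_b": 150, "group_c": 50}
reference = {"group_a": 0.5, "group_b": 0.3, "group_c": 0.2}

gaps = representation_gap(counts, reference)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A gap on its own does not establish discrimination, and a small gap does not rule it out. The figures are a prompt for closer review of how the dataset was assembled and how the resulting tool is used.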

4. Security Risks and Opportunities: Businesses providing AI tools for consumer use may acquire massive troves of data, and should be mindful of basic privacy best practices, including data minimization and maintaining adequate security. At the same time, AI tools present new opportunities to increase security and consumer privacy, including by using AI to protect data and search for vulnerabilities more effectively and at a greater scale than with human actors.

Despite the novel opportunities presented by AI, many of the privacy risks mirror patterns that have been well-handled by the law in similar contexts. By staying abreast of the rapidly changing legal landscape and maintaining adequate privacy and cybersecurity protocols, businesses can capitalize on emerging opportunities while managing risk appropriately.

Related Professionals

John Pavolotsky
Partner
