OpenAI’s ChatGPT has attracted significant attention worldwide since its launch in November 2022, and similar artificial intelligence (AI) applications, such as Google’s Bard and Microsoft’s Bing, have emerged as competitors. While these disruptive technologies have impressed users with their outputs, caution is needed when using them due to associated data privacy concerns.
These AI models have been trained on vast amounts of data collected from the internet, including books, articles, blog posts, websites and even personal information available online. ChatGPT’s general FAQs mention that the training data also includes conversations, which is why the AI sounds so natural and lifelike. However, the FAQs do not explain how such conversations were collected, whether they include chats between users online, or whether consent was obtained for the collection of this data.
News reports have highlighted employers’ concerns about confidential information being leaked by employees using ChatGPT. Such leakage may be unintentional, and employers have warned their employees against it. ChatGPT is being used across many industries for research, document review and even writing software code. If confidential information is provided as an input to ChatGPT, it may be difficult to delete it from ChatGPT’s database. The general FAQs reveal that specific prompts cannot be deleted from users’ histories, followed by a one-line disclaimer: “Please don’t share any sensitive information in your conversations”. Since OpenAI uses inputs from its users to develop and improve ChatGPT, its output may include or resemble confidential information provided by employees, which is a significant business risk. Use of data for improvement is a default setting; users who wish to opt out must fill out a form, an inconvenient process. OpenAI states that personally identifiable information is removed from data intended to improve model performance, but this does not address the concern that confidential information may be included in such data.
To address these concerns, ChatGPT and similar applications should post clear and specific disclaimers on their websites regarding the use of inputs and outputs that contain copyrighted material. They should caution users against inputting confidential information and develop a more user-friendly process for deleting inputs once submitted. Instead of an opt-out mechanism for using inputs to enhance services, explicit consent should be obtained from users for this purpose. It is also important to examine data privacy laws and collection practices to ensure that ChatGPT and similar applications comply with them. Technology is advancing at a rapid pace, and the legal system must evolve with it.
Ashima Obhan is a senior partner and Seerat Bhutani is an associate at Obhan & Associates.