Tesla recently recalled over 2 million of its Autopilot-equipped vehicles after failing safety regulation checks that required the immediate cessation of autonomous driving until better safeguards were implemented. This decision came in response to more than 1,000 crashes that occurred while the cars were operating autonomously. For anyone interested in technology, it is no surprise that innovators and leaders such as Elon Musk have been pushing autonomous driving for years; the underlying technology has been in development for decades. To implement any lifestyle-altering technology safely, regulations need to be put in place to ensure the end state after the change is “better” than “before”. This takes effort and time.
Similarly, generative AI, which differs from autonomous driving in many ways, including a far faster pace of development, will experience delays in implementation and mass distribution as new regulations are introduced and organizations work toward compliance. To prevent a monopolistic state or the concentration of power in a few AI organizations, and to protect a general audience that may not be well educated on the topic, governments must establish guidelines for the ethical use of data and the ethical training and deployment of AI models. By promoting ethical development of AI, we can progress towards a safer world in which organizations are required to disclose training data, encrypt sensitive user data while it is being processed, and limit deployment so that its negative impact on society at large is kept in check.
This is already taking place: the EU is about to pass the world's first comprehensive AI regulation. Moreover, the US has published a Blueprint for an AI Bill of Rights for the ethical development and deployment of AI. Other major economies like India and China won't be far behind.
The first thing needed before creating an AI model is data. For general-purpose AI models like OpenAI's GPT, Anthropic's Claude, or Google's Bard, the data must be a good representation of the world, which includes sensitive information about people. However, current black-box, closed-source models trained on such sensitive data provide limited control over what is generated, which makes them prone to memorizing, and sometimes revealing, this sensitive information. In response, it is essential to establish an ethical data collection pipeline that respects individuals' right to be forgotten. It is equally crucial to ensure the secure flow of user data during inference and to obtain consent before using that data to improve the model. Data security, transparency, and compliance are therefore the primary criteria for AI-focused regulation.
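As a minimal sketch of what such a pipeline could look like, the hypothetical `ConsentAwareStore` below ties every training record to its owner's consent and supports deletion on request. All names and structures here are illustrative assumptions, not any provider's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentAwareStore:
    """Toy training-data store that ties every record to its owner's consent."""
    _records: dict = field(default_factory=dict)  # subject_id -> list of texts

    def add(self, subject_id: str, text: str, consented: bool) -> None:
        # Refuse data collected without explicit consent.
        if not consented:
            raise ValueError(f"No consent on record for subject {subject_id}")
        self._records.setdefault(subject_id, []).append(text)

    def forget(self, subject_id: str) -> int:
        # Right to be forgotten: purge everything tied to this subject
        # before the next training run. Returns the number of records removed.
        return len(self._records.pop(subject_id, []))

    def training_corpus(self) -> list:
        # Only consented, non-deleted records ever reach training.
        return [t for texts in self._records.values() for t in texts]
```

The key design point is that deletion happens at the store, upstream of training, so a forgotten subject simply never appears in the next training corpus.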
Among many concerns, three directly affect the consumers and end users of these AI systems; they are discussed below.
With AI leading the way in technological innovation, governments and users share a concern about the regulation and transparency of training data. This means giving data owners a voice in whether their data can be used to train a model, as well as the ability to exercise their right to be forgotten. Both aim to prevent AI providers from gaining monopolistic control over data through unauthorized use.
In response, AI research and development organizations should be open about the data their models have been, and will be, trained on, and should make this information publicly accessible.
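One lightweight way to make such a disclosure machine-readable is to publish a datasheet per training source. The sketch below assumes a hypothetical schema, loosely in the spirit of model cards and datasheets for datasets; every field name here is an illustrative assumption.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DatasetDisclosure:
    # Hypothetical disclosure schema; field names are illustrative only.
    name: str
    source: str
    license: str
    collected_until: str           # collection cutoff date, ISO 8601
    contains_personal_data: bool
    consent_basis: str             # e.g. "opt-in", "opt-out", "public-domain"


disclosure = DatasetDisclosure(
    name="web-crawl-sample",
    source="publicly crawlable web pages",
    license="mixed / see per-document metadata",
    collected_until="2023-09-30",
    contains_personal_data=True,
    consent_basis="opt-out via robots.txt",
)

# Publish alongside the model so regulators and users can inspect it.
print(json.dumps(asdict(disclosure), indent=2))
```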
As mentioned earlier, with AI models like the GPT series already on the market, the primary concern is protecting the user information that is passed (queried) into the model at inference time, that is, the information users provide when they interact with the AI model. It is important to ensure that this data is not used to train the model further without the user's consent, as doing so could leak sensitive data and deliver financial benefits to the AI organization providing the service at the user's expense.
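A simple way to enforce that boundary in code is to make retention opt-in at the point of inference. The wrapper below is a sketch: `call_model` stands in for any provider's completion API, and the opt-in flag is an assumed attribute of the user's account, not a real API parameter.

```python
def call_model(prompt: str) -> str:
    """Placeholder for a real inference API call (assumption)."""
    return f"response to: {prompt!r}"


def handle_query(prompt: str, user_opted_in: bool, training_buffer: list) -> str:
    response = call_model(prompt)
    # The prompt is retained for future fine-tuning only with explicit
    # opt-in; otherwise it is used for this one response and discarded.
    if user_opted_in:
        training_buffer.append(prompt)
    return response


buffer: list = []
handle_query("summarize my medical report", user_opted_in=False, training_buffer=buffer)
assert buffer == []  # nothing retained without consent
```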
Organizations at the forefront of AI research and development must ensure transparency regarding their data flows and how user information is processed, stored, and accessed. By providing this visibility, these AI organizations can build trust and attract enterprise customers who are currently highly skeptical about allowing their sensitive data to flow through AI pipelines due to concerns about data security.
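That kind of visibility can be backed by an append-only audit trail that records every touch on user data, giving enterprise customers something concrete to review. The sketch below is an assumption about what such a trail might look like, not any vendor's actual logging API.

```python
import json
import time

AUDIT_LOG = "data_access_audit.jsonl"  # append-only log file (assumed path)


def record_access(user_id: str, action: str, system: str, purpose: str) -> None:
    """Append one auditable event: whose data was touched, where, and why."""
    event = {
        "ts": time.time(),
        "user_id": user_id,
        "action": action,      # "processed" | "stored" | "accessed"
        "system": system,
        "purpose": purpose,
    }
    with open(AUDIT_LOG, "a") as f:
        f.write(json.dumps(event) + "\n")


record_access("u-42", "processed", "inference-gateway", "generate completion")
record_access("u-42", "stored", "conversation-history-db", "session continuity")
```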
With organizations like OpenAI taking proactive steps to simplify the integration of services such as email scheduling and event control through their upcoming GPTs platform, both existing and upcoming products will be able to offer improved AI-powered services to their users. This also involves facilitating the exchange of information between the user and the AI provider.
However, managing data in this context raises security concerns, particularly with regard to these information brokers, or middleware layers. Companies and teams offering such services must therefore prioritize transparency and compliance with data protection regulations and best practices. Just as with AI providers, this approach will give these organizations a significant business advantage and help establish trust more quickly.
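For the middleware itself, one concrete baseline is encrypting user payloads the moment they arrive and decrypting only at the point of forwarding, so a breach of the broker exposes ciphertext rather than user data. The sketch below uses the widely available `cryptography` package; key management is deliberately simplified for illustration.

```python
from cryptography.fernet import Fernet

# In production the key would live in a KMS or secret manager,
# not in process memory; this is a simplification.
key = Fernet.generate_key()
fernet = Fernet(key)


def receive(payload: str) -> bytes:
    """Encrypt user data at the middleware boundary; store only ciphertext."""
    return fernet.encrypt(payload.encode("utf-8"))


def forward(ciphertext: bytes) -> str:
    """Decrypt only when handing off to the downstream AI provider."""
    return fernet.decrypt(ciphertext).decode("utf-8")


stored = receive("schedule a meeting with dr. smith about my test results")
assert forward(stored).startswith("schedule")
```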
When it comes to data security and regulations, regular audits and compliance are essential. Teams that need to comply with these regulations, both for current data protection and for future incorporation of AI, must allocate resources to manage their security posture, including continuous monitoring and learning from threats. As technology advances, attacks on data will become more sophisticated. Regulations are the first step towards strengthening security posture and building brand reputation, but regulations alone are not enough. Data security posture management (DSPM) is an iterative process that requires constant monitoring and re-engineering for improvement.
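As a flavor of what one step in that monitoring loop might look like, the sketch below scans stored records for apparently unencrypted PII-like patterns and flags them for review. Real DSPM tooling is far broader than this, and the detectors here are illustrative assumptions.

```python
import re

# Illustrative detectors; real DSPM tools ship far richer classifiers.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_records(records: dict) -> list:
    """Return (record_id, pii_type) pairs for plaintext PII that should
    be encrypted, masked, or deleted."""
    findings = []
    for record_id, text in records.items():
        for pii_type, pattern in PII_PATTERNS.items():
            if pattern.search(text):
                findings.append((record_id, pii_type))
    return findings


# Run on a schedule (cron, workflow engine) as one step of the
# iterative posture-management loop: scan -> remediate -> rescan.
print(scan_records({"r1": "contact: jane@example.com", "r2": "all clear"}))
```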