In a recent “Fast Facts” article published in the journal BMJ, researchers discuss recent advances in generative artificial intelligence (AI), the importance of the technology in the world today, and the potential risks that must be addressed before large language models (LLMs) such as ChatGPT can become the reliable sources of accurate information we believe them to be.
BMJ Fast Facts: Quality and security of health information generated by artificial intelligence. Image credit: Le Panda / Shutterstock
What is generative artificial intelligence?
“Generative artificial intelligence (AI)” refers to a subset of artificial intelligence models that generate context-dependent content (text, images, audio, and video) and underpin the natural language models powering AI assistants (Google Assistant, Amazon Alexa, and Siri) and productivity applications such as ChatGPT and Grammarly AI. The technology represents one of the fastest-growing areas in digital computing and has the potential to significantly advance many aspects of society, including healthcare and medical research.
Unfortunately, advances in generative AI, especially large language models (LLMs) such as ChatGPT, have far outpaced ethical and safety checks, introducing the possibility of serious consequences, both accidental and deliberate (malicious). Research estimates that more than 70% of people use the internet as their primary source of health and medical information, and more people turn to LLMs such as Gemini, ChatGPT, and Copilot with their queries every day. This article focuses on three vulnerable aspects of AI: AI errors, health disinformation, and privacy concerns. It highlights the efforts of emerging disciplines, including AI safety and AI ethics, to address these vulnerabilities.
AI errors
Errors in data processing are a common challenge in all AI technologies. As input data sets become more extensive and model outputs (text, audio, images, or video) become more sophisticated, false or misleading information becomes increasingly difficult to detect.
“The ‘AI hallucination’ phenomenon has gained prominence with the widespread use of AI chatbots (e.g., ChatGPT) powered by LLMs. In the health information context, such hallucinations are particularly concerning because incorrect or misleading content may be presented as fact.”
For lay members of society who cannot distinguish between factual and inaccurate information, these errors can become very costly very quickly, especially in cases of incorrect medical information. Even trained medical professionals can be misled by these errors, given the growing volume of research conducted using LLMs and generative AI for data analysis.
Fortunately, numerous technological strategies for mitigating AI errors are under development, the most promising of which involves building generative AI models that are “grounded” in information from reliable and authoritative sources. Another method is to incorporate “uncertainty” into the model’s output: when presenting a result, the model also reports its degree of confidence in the validity of the information, allowing the user to consult reliable information repositories in cases of high uncertainty. Some generative models already include references as part of their output, encouraging the user to investigate further before accepting the output at face value.
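As a rough illustration of the “uncertainty” approach, the Python sketch below shows how a confidence score and source citations might be surfaced alongside an answer. The GroundedAnswer structure, the present() helper, and the threshold are assumptions standing in for real model output, not an existing API.

```python
# A minimal sketch of uncertainty-aware output, assuming a hypothetical
# GroundedAnswer object returned by a grounded generation pipeline.
from dataclasses import dataclass, field

@dataclass
class GroundedAnswer:
    text: str                                     # the generated answer
    confidence: float                             # self-reported confidence, 0-1
    sources: list = field(default_factory=list)   # citations to authoritative documents

CONFIDENCE_THRESHOLD = 0.8  # assumed cutoff; below it, the user is pointed to sources

def present(answer: GroundedAnswer) -> str:
    """Format an answer so low-confidence output is clearly flagged."""
    lines = [answer.text]
    if answer.confidence < CONFIDENCE_THRESHOLD:
        lines.append(
            f"(Low confidence: {answer.confidence:.0%}. "
            "Please verify against the sources below.)"
        )
    lines.extend(f"  - {src}" for src in answer.sources)
    return "\n".join(lines)

# Example usage with a hand-written answer standing in for model output:
demo = GroundedAnswer(
    text="Paracetamol's usual adult dose is 500-1000 mg every 4-6 hours.",
    confidence=0.62,
    sources=["NHS Medicines A-Z: Paracetamol"],
)
print(present(demo))
```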
Health disinformation
Disinformation differs from AI hallucinations in that the latter are accidental and unintentional, while the former is deliberate and malicious. While the practice of disinformation is as old as human society itself, generative AI provides an unprecedented platform for creating “diverse, high-quality, targeted disinformation at scale” at almost no financial cost to the malicious actor.
“One option to prevent AI-generated health disinformation involves fine-tuning models to align with human values and preferences, including avoiding the generation of known harmful or disinformative responses. An alternative is to build a specialized model (separate from the generative AI model) to screen for inappropriate or harmful requests and responses.”
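A minimal sketch of the second idea, a separate screening model, is shown below. Here classify_harm() is a keyword-based stand-in for a trained safety classifier, and safe_generate() is an assumed wrapper, not part of any real library.

```python
# A toy screening layer: an independent classifier vets both the user's
# request and the model's draft response before anything is shown.
def classify_harm(text: str) -> float:
    """Return a harm score in [0, 1]; a real system would use a trained model."""
    red_flags = ("lethal dose", "untested cure", "stop your medication")
    return 1.0 if any(flag in text.lower() for flag in red_flags) else 0.0

def safe_generate(prompt: str, generate) -> str:
    """Screen the prompt, generate, then screen the response."""
    if classify_harm(prompt) > 0.5:
        return "This request cannot be answered safely."
    response = generate(prompt)
    if classify_harm(response) > 0.5:
        return "The generated answer was withheld by the safety filter."
    return response

# Example usage with a placeholder generator:
print(safe_generate("What is a lethal dose of paracetamol?", lambda p: "..."))
```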
While both of the above techniques are viable in the war against disinformation, they remain experimental and model-dependent. To prevent inaccurate data from ever reaching a model for processing, initiatives such as digital watermarking, designed to validate accurate data and flag AI-generated content, are currently in the pipeline. Equally important, the establishment of AI watchdog services will be required before AI can be unequivocally trusted as a robust information system.
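For flavour, here is a toy version of the statistical watermark-detection idea behind published “green list” schemes: generation is biased toward a pseudo-random subset of the vocabulary, so a detector can test whether a text over-uses that subset. Everything below is a simplified illustration, not a production watermark.

```python
# Toy statistical watermark detector: human text should score near 0.5,
# while text generated with a matching "green list" bias scores well above it.
import hashlib

def is_green(prev_word: str, word: str) -> bool:
    """Deterministically assign ~half the vocabulary to the 'green' list,
    keyed on the previous word so the split changes from position to position."""
    digest = hashlib.sha256(f"{prev_word}|{word}".encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(text: str) -> float:
    """Fraction of word transitions that land on the green list."""
    words = text.lower().split()
    if len(words) < 2:
        return 0.0
    hits = sum(is_green(a, b) for a, b in zip(words, words[1:]))
    return hits / (len(words) - 1)

print(green_fraction("the quick brown fox jumps over the lazy dog"))
```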
Privacy and bias
Data used to train AI models, especially medical data, must be screened to ensure that it contains no identifiable information, respecting the privacy of the users and patients whose data the models were trained on. For crowdsourced data, AI models typically include privacy terms and conditions; study participants must ensure that they comply with these conditions and do not provide information that can be traced back to the volunteer in question.
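As an illustration of such screening, a minimal redaction pass over training records might look like the sketch below. Real de-identification pipelines are far more thorough (named-entity models, date shifting, manual review); the regular-expression patterns here are assumptions for demonstration only.

```python
# A toy pre-training screen that replaces identifiable strings with tags.
import re

PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
    "nhs_number": re.compile(r"\b\d{3}[ -]?\d{3}[ -]?\d{4}\b"),  # assumed 10-digit format
}

def redact(record: str) -> str:
    """Replace anything matching a PII pattern with a placeholder tag."""
    for label, pattern in PII_PATTERNS.items():
        record = pattern.sub(f"[{label.upper()}]", record)
    return record

print(redact("Patient J. Doe, reachable at j.doe@example.com or +44 20 7946 0958."))
```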
Bias is the inherent risk that an AI model’s outputs are skewed by the material it was trained on. Most AI models are trained on extensive data sets, usually obtained from the internet.
“Despite developers’ efforts to mitigate biases, it remains difficult to identify and fully understand the biases of accessible LLMs due to a lack of transparency about the data and training process. Ultimately, strategies aimed at minimizing these risks include exercising greater discretion in the selection of training data, scrutinizing generative AI outputs, and taking corrective action to minimize identified biases.”
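One way to “scrutinize generative AI outputs”, sketched below, is to run the same prompt across demographic variants and compare the answers. The generate callable is a placeholder for the model under audit; only the auditing pattern itself is the point.

```python
# A toy output audit: measure how often a keyword appears in answers to
# the same prompt template across demographic groups.
from collections import Counter

def audit(generate, template: str, groups: list[str], keyword: str) -> dict:
    """Return, per group, how often `keyword` appears in the model's answers."""
    counts = Counter()
    trials = 20
    for group in groups:
        for _ in range(trials):
            answer = generate(template.format(group=group))
            counts[group] += keyword.lower() in answer.lower()
    return {g: counts[g] / trials for g in groups}

# Example with a dummy generator; a real audit would call the model under test.
rates = audit(
    generate=lambda p: "Consider a specialist referral." if "older" in p else "Rest at home.",
    template="A {group} patient reports chest pain. What should they do?",
    groups=["younger", "older"],
    keyword="referral",
)
print(rates)  # large gaps between groups warrant investigation
```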
Conclusions
Generative AI models, the most popular of which include ChatGPT, Microsoft Copilot, Gemini, and Sora, represent some of the greatest human productivity enhancements of the modern era. Unfortunately, developments in these areas have far outstripped reliability checks, resulting in errors, disinformation, and bias that could lead to serious consequences, especially in healthcare. This article summarizes some of the risks of generative AI in its current form and highlights the techniques now being developed to mitigate these risks.