Method prevents an AI model from being overconfident about wrong answers (2024)

People use large language models for a huge array of tasks, from translating an article to identifying financial fraud. However, despite the incredible capabilities and versatility of these models, they sometimes generate inaccurate responses.

On top of that problem, the models can be overconfident about wrong answers or underconfident about correct ones, making it tough for a user to know when a model can be trusted.

Researchers typically calibrate a machine-learning model to ensure its level of confidence lines up with its accuracy. A well-calibrated model should have less confidence in an incorrect prediction and more confidence in a correct one. But because large language models (LLMs) can be applied to a seemingly endless collection of diverse tasks, traditional calibration methods are ineffective.

Now, researchers from MIT and the MIT-IBM Watson AI Lab have introduced a calibration method tailored to large language models. Their method, called Thermometer, involves building a smaller, auxiliary model that runs on top of a large language model to calibrate it.

Thermometer is more efficient than other approaches — requiring less power-hungry computation — while preserving the accuracy of the model and enabling it to produce better-calibrated responses on tasks it has not seen before.

By enabling efficient calibration of an LLM for a variety of tasks, Thermometer could help users pinpoint situations where a model is overconfident about false predictions, ultimately preventing them from deploying that model in a situation where it may fail.

“With Thermometer, we want to provide the user with a clear signal to tell them whether a model’s response is accurate or inaccurate, in a way that reflects the model’s uncertainty, so they know if that model is reliable,” says Maohao Shen, an electrical engineering and computer science (EECS) graduate student and lead author of a paper on Thermometer.

Shen is joined on the paper by Gregory Wornell, the Sumitomo Professor of Engineering who leads the Signals, Information, and Algorithms Laboratory in the Research Laboratory for Electronics, and is a member of the MIT-IBM Watson AI Lab; senior author Soumya Ghosh, a research staff member in the MIT-IBM Watson AI Lab; as well as others at MIT and the MIT-IBM Watson AI Lab. The research was recently presented at the International Conference on Machine Learning.

Universal calibration

Since traditional machine-learning models are typically designed to perform a single task, calibrating them usually involves one task-specific method. On the other hand, since LLMs have the flexibility to perform many tasks, using a traditional method to calibrate that model for one task might hurt its performance on another task.

Calibrating an LLM often involves sampling from the model multiple times to obtain different predictions and then aggregating these predictions to obtain better-calibrated confidence. However, because these models have billions of parameters, the computational costs of such approaches rapidly add up.
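
To make the cost concrete, here is a minimal sketch of one common sampling-based recipe: generate several answers and take agreement with the majority as the confidence. The `generate` function below is a hypothetical stand-in for one sampled answer from the LLM; each call is a full forward pass, which is why this style of calibration gets expensive for billion-parameter models.

```python
from collections import Counter

def sampled_confidence(generate, prompt, n_samples=20):
    """Estimate an answer and its confidence by repeated sampling.

    Each call to generate() is a full forward pass through the LLM,
    so the cost grows linearly with n_samples.
    """
    answers = [generate(prompt) for _ in range(n_samples)]
    answer, count = Counter(answers).most_common(1)[0]
    # Confidence = fraction of samples that agree with the majority answer.
    return answer, count / n_samples
```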

“In a sense, large language models are universal because they can handle various tasks. So, we need a universal calibration method that can also handle many different tasks,” says Shen.

With Thermometer, the researchers developed a versatile technique that leverages a classical calibration method called temperature scaling to efficiently calibrate an LLM for a new task.

In this context, a “temperature” is a scaling parameter used to adjust a model’s confidence to be aligned with its prediction accuracy. Traditionally, one determines the right temperature using a labeled validation dataset of task-specific examples.
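
The classical procedure is simple: fit a single scalar temperature T on the labeled validation set by minimizing negative log-likelihood, then divide the model's logits by T before the softmax. A minimal sketch, assuming `val_logits` (an N-by-C tensor of validation logits) and `val_labels` are available:

```python
import torch
import torch.nn.functional as F

def fit_temperature(val_logits, val_labels, steps=200, lr=0.05):
    """Fit a scalar temperature T by minimizing cross-entropy on a
    labeled validation set. Dividing logits by T leaves the argmax
    (the prediction) unchanged; only the confidence is rescaled."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log T so T stays positive
    opt = torch.optim.Adam([log_t], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(val_logits / log_t.exp(), val_labels)
        loss.backward()
        opt.step()
    return log_t.exp().item()

# Calibrated confidence for new inputs, given a fitted T:
# probs = torch.softmax(test_logits / T, dim=-1)
```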

Since LLMs are often applied to new tasks, labeled datasets can be nearly impossible to acquire. For instance, a user who wants to deploy an LLM to answer customer questions about a new product likely does not have a dataset containing such questions and answers.

Instead of using a labeled dataset, the researchers train an auxiliary model that runs on top of an LLM to automatically predict the temperature needed to calibrate it for this new task.

They use labeled datasets of a few representative tasks to train the Thermometer model, but once it has been trained, it can generalize to new tasks in a similar category without the need for additional labeled data.

A Thermometer model trained on a collection of multiple-choice question datasets, perhaps including one with algebra questions and one with medical questions, could be used to calibrate an LLM that will answer questions about geometry or biology, for instance.

“The aspirational goal is for it to work on any task, but we are not quite there yet,” Ghosh says.

The Thermometer model only needs to access a small part of the LLM’s inner workings to predict the right temperature that will calibrate its prediction for data points of a specific task.
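
Conceptually, the auxiliary model is a small network that reads those internal features and outputs a temperature. A hypothetical sketch of the idea follows; it is not the paper's exact architecture or training objective, just an illustration of mapping LLM features to a positive temperature.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemperaturePredictor(nn.Module):
    """Illustrative stand-in for a Thermometer-style auxiliary model:
    maps an LLM's hidden-layer features to a positive temperature."""

    def __init__(self, hidden_dim, width=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_dim, width),
            nn.ReLU(),
            nn.Linear(width, 1),
        )

    def forward(self, features):
        # softplus keeps the predicted temperature strictly positive
        return F.softplus(self.net(features)) + 1e-3

# At inference the LLM's own weights are untouched; only its
# confidence is rescaled:
# probs = torch.softmax(llm_logits / predictor(llm_features), dim=-1)
```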

An efficient approach

Importantly, the technique does not require multiple training runs and only slightly slows the LLM. Plus, since temperature scaling does not alter a model’s predictions, Thermometer preserves its accuracy.

When they compared Thermometer to several baselines on multiple tasks, it consistently produced better-calibrated uncertainty measures while requiring much less computation.
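
Calibration quality is commonly summarized with expected calibration error (ECE), which bins predictions by confidence and measures the gap between average confidence and accuracy in each bin. A minimal sketch of that metric:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """confidences: array of predicted confidences in [0, 1];
    correct: boolean array, True where the prediction was right."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(confidences[in_bin].mean() - correct[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by fraction of samples in bin
    return ece
```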

“As long as we train a Thermometer model on a sufficiently large number of tasks, it should be able to generalize well across any new task. Just like a large language model, it is also a universal model,” Shen adds.

The researchers also found that if they train a Thermometer model for a smaller LLM, it can be directly applied to calibrate a larger LLM within the same family.

In the future, they want to adapt Thermometer for more complex text-generation tasks and apply the technique to even larger LLMs. The researchers also hope to quantify the diversity and number of labeled datasets one would need to train a Thermometer model so it can generalize to a new task.

This research was funded, in part, by the MIT-IBM Watson AI Lab.

FAQs

How does the method prevent an AI model from being overconfident about wrong answers?

Thermometer, a method for calibrating large language models, learns to predict a temperature that rescales a model’s confidence so it better reflects prediction accuracy, helping users pinpoint situations where the model is overconfident about false predictions.

How do we prevent AI errors or AI bias?

Here are several ways data scientists are addressing the problem.
  1. Understand the Potential for AI Bias. Supervised learning, one of the subsets of AI, operates on rote ingestion of data. ...
  2. Increase Transparency. ...
  3. Institute Standards. ...
  4. Test Models Before and After Deployment. ...
  5. Use Synthetic Data.

How does AI reduce mistakes?

By leveraging machine learning algorithms and analyzing vast amounts of data, AI systems can provide more accurate and unbiased insights to guide decision-making. This can help organizations avoid costly mistakes and make more informed choices.

How do you secure an AI model?

  1. Encryption – Encrypt your AI models and training data to prevent theft and tampering.
  2. Access Controls – Limit access to the model to only those who need it to do their job.

Which AI model helps a machine learn from its mistakes?

Reinforcement learning is a machine learning model that can be described as “learn by doing” through a series of trial and error experiments. An “agent” learns to perform a defined task through a feedback loop until its performance is within a desirable range.

How to reduce model bias?

5 Ways to Get Rid of Bias in Machine Learning Algorithms
  1. Prioritize data diversity.
  2. Proactively identify your edge cases.
  3. Obtain high-quality, accurate and consistent data annotation.
  4. Understand where and why your model is failing.
  5. Constantly check in on your model.

How to mitigate bias in generative AI?

Consider building debiasing tools into the model. Debiasing is the intentional effort to reduce bias in AI-generated content regarding how humans are represented and portrayed. It helps reduce stereotypes and misrepresentation by applying country or cultural specifics to prompts.

How do I make AI more trustworthy?

How can we know AI can be trusted?
  1. Safe, secure, and resilient.
  2. Accountable and transparent.
  3. Explainable and interpretable.
  4. Privacy-enhanced.
  5. Fair with harmful bias managed.
  6. Transparency Above All is Essential.
  7. AI features should be opt-in across systems.

How can I make my AI model more accurate?

Key Takeaways
  1. Generate and test hypotheses to improve model performance.
  2. Clean and preprocess data to handle missing and outlier values.
  3. Use feature engineering techniques to create new features from existing data.
  4. Experiment with different model selection techniques to find the best model for your data.

How to safeguard AI?

Laying The Groundwork For Core Principles
  1. Crafting An Ethical Foundation. ...
  2. Navigating Through Risk Assessment. ...
  3. Enhancing Understandability. ...
  4. Governing Data With Diligence. ...
  5. Fortifying AI With Robust Security. ...
  6. Ensuring Regular Monitoring And Refinement. ...
  7. Fostering Collaboration And Engagement.

What is robustness in AI?

What is robustness? Robustness, in the context of AI systems, refers to the ability of an algorithm or model to maintain its performance and stability under different conditions, including variations in input data, environmental changes, and attempts at adversarial interference.

How does AI handle uncertainty?

AI deals with uncertainty by using models and methods that assign probabilities to different outcomes. Managing uncertainty is important for AI applications like self-driving cars and medical diagnosis, where safety and accuracy are key.

How does bias come into AI models?

All models are made by humans and reflect human biases. Machine learning models can reflect the biases of organizational teams, of the designers in those teams, the data scientists who implement the models, and the data engineers who gather data.

How can you prevent algorithmic bias?

From a privacy perspective, the following components should comprise the toolkit used to reduce algorithmic bias:
  1. Data set criteria: Criteria for ensuring accuracy, integrity, and representation in data sets.
  2. Production outputs: Markers that can detect algorithmic bias in production environments.

How can we prevent AI from being used against us?

To prevent AI from being used against you, it is essential to implement strong security controls, including multifactor authentication (MFA), role-based access control (RBAC), biometrics, and data encryption.

How to avoid gender bias in AI?

How can AI overcome gender bias? The first step towards overcoming bias is making sure training samples are as diverse as possible - in terms of gender, but also ethnicity, age, sexuality and so on - and the people developing AI are also from different backgrounds.

How can AI prevent medical errors?

AI can reduce medical errors by enhancing the accuracy and efficiency of diagnosis, treatment, and patient care processes. By analyzing vast amounts of medical data, AI algorithms can identify patterns and anomalies that might be overlooked by human practitioners.
