The way we train AIs makes them more likely to spout bull

August 1, 2025

Certain AI training techniques may encourage models to be untruthful (Image: Cravetiger/Getty Images)

Common methods used to train artificial intelligence models seem to increase their tendency to give misleading answers, according to researchers who are aiming to produce “the first systematic analysis of machine bullshit”.

It is widely known that large language models (LLMs) have a tendency to generate false information – or “hallucinate” – but this is just one example of machine bullshit, says Jaime Fernández Fisac at Princeton University. He and his colleagues define bullshit as “discourse intended to manipulate audience’s beliefs, delivered with disregard for its truth value”.

“Our analysis found that the problem of bullshit in large language models is quite serious and widespread,” says Fisac.

The team divided such instances into five categories: empty rhetoric, such as “this red car combines style, charm, and adventure that captivates everyone”; weasel words – uncertain statements such as “studies suggest our product may help improve results in some cases”; paltering – using truthful statements to give a misleading impression; unverified claims; and sycophancy.

They studied three datasets comprising thousands of AI-generated responses to a wide range of prompts, from models including GPT-4, Gemini and Llama. One dataset contained a range of queries designed to test for bullshitting when AIs are asked to provide guidance or recommendations, while the other datasets included questions about online shopping and political issues.

Fisac and his colleagues first used an LLM to determine whether the responses involved any of the five categories, then got volunteers to check that the AI’s judgements aligned with human ones.
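
As a rough illustration of that two-stage check – not the authors’ actual pipeline – the sketch below shows how an LLM-as-judge pass over responses might be scored against human labels. The category names come from the study as described above; the llm_judge function is a toy keyword stand-in for a real model call, and the agreement measure is a simple per-category match rate chosen for brevity.

# Illustrative sketch only: an LLM-as-judge pass over model responses,
# followed by a check of how closely its labels agree with human annotators.

CATEGORIES = [
    "empty rhetoric",
    "weasel words",
    "paltering",
    "unverified claims",
    "sycophancy",
]

def llm_judge(response: str) -> set[str]:
    # Toy stand-in for the real LLM call: flags weasel words by simple
    # keyword matching so the sketch runs end to end.
    hedges = ("may help", "studies suggest", "in some cases")
    return {"weasel words"} if any(h in response.lower() for h in hedges) else set()

def agreement_rate(machine_labels, human_labels):
    # Fraction of (response, category) judgements where machine and human agree.
    total = correct = 0
    for m, h in zip(machine_labels, human_labels):
        for cat in CATEGORIES:
            total += 1
            correct += (cat in m) == (cat in h)
    return correct / total if total else 0.0

responses = ["Studies suggest our product may help improve results in some cases."]
human_labels = [{"weasel words"}]
machine_labels = [llm_judge(r) for r in responses]
print(agreement_rate(machine_labels, human_labels))  # 1.0 when the labels match

In the study, this kind of agreement check is what the volunteers supplied; the example phrase used here is the weasel-word example quoted earlier in the article.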

The team found that the most serious issues with truth seemed to arise as a result of a training method known as reinforcement learning from human feedback. The technique is intended to make a model’s responses more helpful by training it on immediate human feedback about its answers.
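
For readers unfamiliar with the technique, the sketch below illustrates its core ingredient in general terms: a reward model fitted so that the responses human raters preferred score higher than the ones they rejected. It is a generic toy example (random features, a linear reward, a Bradley-Terry-style pairwise loss), not anything from the study.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy data: a feature vector for the response a rater preferred ("chosen")
# and one for the response they rejected, over 64 comparison pairs.
chosen = rng.normal(loc=0.5, size=(64, 4))
rejected = rng.normal(loc=-0.5, size=(64, 4))

# Fit a linear reward model w so that chosen responses outscore rejected ones,
# via gradient descent on the pairwise preference loss -log sigmoid(margin).
w = np.zeros(4)
for _ in range(500):
    margin = chosen @ w - rejected @ w
    grad = -(sigmoid(-margin)[:, None] * (chosen - rejected)).mean(axis=0)
    w -= 0.5 * grad

print("fitted reward weights:", np.round(w, 2))

Nothing in this objective rewards factual accuracy directly; it only rewards whatever the rater approved of.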

But this approach is problematic, says Fisac, because it makes models prioritise immediate human approval and perceived helpfulness, which is “sometimes in conflict with telling the truth”.

“Who likes to hear bad news or entertain a long, nuanced rebuttal of something that feels obviously true?” says Fisac. “By trying to abide by the measure of good behaviour we provide to them, the models learn to demote the truth in favour of confident, eloquent responses, just so that they can secure our approval.”

The study found that reinforcement learning from human feedback significantly increased bullshit behaviours: empty rhetoric rose by nearly 40 per cent, paltering by nearly 60 per cent, weasel words by more than a quarter, and unverified claims by over half.

The increase in paltering is particularly harmful, says team member Kaiqu Liang, also at Princeton, as it leads users to make poorer decisions. When a model was uncertain whether a product had a desired feature, deceptive positive claims jumped from a fifth to over three-quarters after training on human feedback.

Another concern is that bullshit was particularly common in political discussions, with AI models “frequently resorting to vague and ambiguous language to avoid committing to concrete statements,” says Liang.

AIs are also more likely to behave this way when there is a conflict of interest – when the system serves multiple parties, such as both a company and its customers – the researchers found.

The way to overcome the problem may be to move to a “hindsight feedback” model, they suggest. Rather than asking for immediate feedback on the AI model’s output, the system would first generate a plausible simulation of what might happen if the user acts on the information received. It would then present that outcome to the human evaluator to judge.
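
As a loose sketch of that idea – with every function a hypothetical placeholder rather than the researchers’ implementation – the loop might be organised like this: the response is generated, a plausible consequence of acting on it is simulated, and it is that simulated outcome, not the response itself, that the evaluator rates.

def generate_response(prompt: str) -> str:
    # Placeholder for the model's answer to the user.
    return "This product should suit your needs."

def simulate_outcome(prompt: str, response: str) -> str:
    # Placeholder for a model-generated, plausible account of what happens
    # if the user actually acts on the response.
    return "The user buys the product and finds it lacks the feature they asked about."

def evaluator_rating(outcome: str) -> float:
    # Placeholder: the evaluator judges the simulated outcome,
    # not how pleasant or confident the response sounded in the moment.
    return 0.0 if "lacks the feature" in outcome else 1.0

prompt = "Does this product have the feature I need?"
response = generate_response(prompt)
outcome = simulate_outcome(prompt, response)
print("hindsight feedback signal:", evaluator_rating(outcome))

The feedback signal that reaches the model then depends on the downstream consequence rather than on immediate approval.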

“Ultimately, our hope is that by better understanding the subtle but systematic ways AI can aim to mislead us, we can guide future efforts toward developing genuinely truthful AI systems,” says Fisac.

Daniel Tigard at the University of San Diego, who was not involved in the study, is sceptical of discussing LLMs and their outputs in such terms. He argues that just because an LLM produces bullshit, it doesn’t mean it is deliberately doing so, given that AI systems, as they currently stand, do not set out to deceive us and do not have an interest in doing so.

“The main reason is that this framing appears to run against some very sensible suggestions for how we should and shouldn’t live with these sorts of technologies,” Tigard says. “Calling bullshit might be yet another way of anthropomorphising these systems, which, in turn, may well contribute to their deceptive potential.”

Source link : https://www.newscientist.com/article/2490861-the-way-we-train-ais-makes-them-more-likely-to-spout-bull/?utm_campaign=RSS%7CNSNS&utm_source=NSNS&utm_medium=RSS&utm_content=home

Publish date : 2025-08-01 17:00:00

Copyright for syndicated content belongs to the linked Source.
