Shared post - AI chatbots outperform humans in evaluating social situations, study finds

OwenGregorian

AI chatbots outperform humans in evaluating social situations, study finds | Eric W. Dolan, PsyPost

December 04, 2024

Recent research published in Scientific Reports has found that certain advanced AI chatbots are more adept than humans at making judgments in challenging social situations. Using a well-established psychological tool known as a Situational Judgment Test, researchers found that three chatbots (Claude, Microsoft Copilot, and You.com’s smart assistant) outperformed human participants in selecting the most effective behavioral responses.

The ability of AI to assist in social interactions is becoming increasingly relevant, with applications ranging from customer service to mental health support. Large language models, such as the chatbots tested in this study, are designed to process language, understand context, and provide helpful responses. While previous studies have demonstrated their capabilities in academic reasoning and verbal tasks, their effectiveness in navigating complex social dynamics has remained underexplored.

Large language models are advanced artificial intelligence systems designed to understand and generate human-like text. These models are trained on vast amounts of data—books, articles, websites, and other textual sources—allowing them to learn patterns in language, context, and meaning.

This training enables these models to perform a variety of tasks, from answering questions and translating languages to composing essays and engaging in detailed conversations. Unlike earlier AI models, large language models rely on their ability to process context and generate responses that often feel conversational and relevant to the user’s input.

“As researchers, we are interested in the diagnostics of social competence and interpersonal skills,” said study author Justin M. Mittelstädt of the Institute of Aerospace Medicine.

“At the German Aerospace Center, we apply methods for diagnosing these skills, for example, to find suitable pilots and astronauts. As we are exploring new technologies for future human-machine interaction, we were curious to find out how the emerging large language models perform in these areas that are considered to be profoundly human.”

To evaluate AI performance, the researchers used a Situational Judgment Test, a tool widely used in psychology and personnel assessment to measure social competence. The test presented 12 scenarios requiring participants to evaluate four potential courses of action. For each scenario, participants were tasked with identifying the best and worst responses, as rated by a panel of 109 human experts.

The study compared the performance of five AI chatbots (Claude, Microsoft Copilot, ChatGPT, Google Gemini, and You.com’s smart assistant) with a sample of 276 human participants. These human participants were pilot applicants selected for their high educational qualifications and motivation. Their performance provided a rigorous benchmark for the AI systems.

Each chatbot completed the Situational Judgment Test ten times, with randomized presentation orders to ensure consistent results. The responses were then scored based on how well they aligned with the expert-identified best and worst options. In addition to choosing responses, the chatbots were asked to rate the effectiveness of each action in the scenarios, providing further data for comparison with expert evaluations.
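To make the scoring idea concrete, here is a minimal sketch of how best/worst picks might be compared against an expert key and averaged over repeated runs. The expert key, the answer data, and the one-point-per-match rule are all illustrative assumptions; the article does not spell out the study's exact scoring formula.

```python
from typing import Dict, List, Tuple

# Hypothetical expert key: scenario id -> (best option, worst option).
# The real test has 12 scenarios with four options each; three are shown here.
EXPERT_KEY: Dict[int, Tuple[str, str]] = {
    1: ("B", "D"),
    2: ("A", "C"),
    3: ("C", "A"),
}

Answers = Dict[int, Tuple[str, str]]  # scenario id -> (picked best, picked worst)


def score_run(answers: Answers, key: Dict[int, Tuple[str, str]] = EXPERT_KEY) -> int:
    """Assumed rule: one point for each best or worst pick that matches the key."""
    total = 0
    for scenario, (best, worst) in key.items():
        picked_best, picked_worst = answers[scenario]
        total += int(picked_best == best)
        total += int(picked_worst == worst)
    return total


def mean_score(runs: List[Answers]) -> float:
    """Average the score over repeated runs of the same test."""
    return sum(score_run(run) for run in runs) / len(runs)


# Two made-up runs by the same respondent.
runs: List[Answers] = [
    {1: ("B", "D"), 2: ("A", "C"), 3: ("C", "A")},  # every pick matches the key
    {1: ("B", "A"), 2: ("A", "C"), 3: ("B", "A")},  # two picks miss the key
]

print(mean_score(runs))
```

Averaging over runs is one simple way to summarize a respondent that answers the same test several times; other summaries, such as the spread across runs, would serve equally well for this kind of comparison.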

The researchers found that all the tested AI chatbots performed at least as well as the human participants, with some outperforming them. Among the chatbots, Claude achieved the highest average score, followed by Microsoft Copilot and You.com’s smart assistant. These three systems consistently selected the most effective responses in the Situational Judgment Test scenarios, aligning closely with expert evaluations.

Interestingly, when chatbots failed to select the best response, they most often chose the second-most effective option, mirroring the decision-making patterns of human participants. This suggests that AI systems, while not perfect, are capable of nuanced judgment and probabilistic reasoning that closely resembles human thought processes.

“We have seen that these models are good at answering knowledge questions, writing code, solving logic problems, and the like,” Mittelstädt told PsyPost. “But we were surprised to find that some of the models were also, on average, better at judging the nuances of social situations than humans, even though they had not been explicitly trained for use in social settings. This showed us that social conventions and the way we interact as humans are encoded as readable patterns in the textual sources on which these models are trained.”

The study also highlighted differences in reliability among the AI systems. Claude showed the highest consistency across multiple test iterations, while Google Gemini exhibited occasional contradictions, such as rating an action as both the best and worst in different runs. Despite these inconsistencies, the overall performance of all tested AI systems surpassed expectations, demonstrating their potential to provide socially competent advice.
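The kind of contradiction described here, where the same action is picked as best in one run and worst in another, can be detected mechanically. The sketch below is only an illustration under assumed data; the function name and the run format are hypothetical and not taken from the study.

```python
from typing import Dict, List, Set, Tuple

Answers = Dict[int, Tuple[str, str]]  # scenario id -> (picked best, picked worst)


def find_contradictions(runs: List[Answers]) -> Set[Tuple[int, str]]:
    """Return (scenario, option) pairs picked as best in one run and worst in another."""
    best_seen: Set[Tuple[int, str]] = set()
    worst_seen: Set[Tuple[int, str]] = set()
    for run in runs:
        for scenario, (best, worst) in run.items():
            best_seen.add((scenario, best))
            worst_seen.add((scenario, worst))
    # An option appearing in both sets was rated best and worst across runs.
    return best_seen & worst_seen


# Made-up example: in scenario 1 the two runs swap their best and worst picks.
inconsistent_runs: List[Answers] = [
    {1: ("A", "D"), 2: ("B", "C")},
    {1: ("D", "A"), 2: ("B", "C")},
]
consistent_runs: List[Answers] = [
    {1: ("A", "D"), 2: ("B", "C")},
    {1: ("A", "D"), 2: ("B", "C")},
]

print(sorted(find_contradictions(inconsistent_runs)))
print(sorted(find_contradictions(consistent_runs)))
```

An empty result means no option was ever rated both best and worst, which is a weaker condition than giving identical answers in every run.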

“Many people already use chatbots for a variety of everyday tasks,” Mittelstädt explained. “Our results suggest that chatbots may be quite good at giving advice on how to behave in tricky social situations and that people, especially those who are insecure in social interactions, may benefit from this. However, we do not recommend blindly trusting chatbots, as we also saw evidence of hallucinations and contradictory statements, as is often reported in the context of large language models.”

It is important to note that the study focused on simulated scenarios rather than real-world interactions, leaving questions about how AI systems might perform in dynamic, high-stakes social settings.

“To facilitate a quantifiable comparison between large language models and humans, we selected a multiple-choice test that demonstrates prognostic validity in humans for real-world behavior,” Mittelstädt noted. “However, performance on such a test does not yet guarantee that large language models will respond in a socially competent manner in real and more complex scenarios.”

Nevertheless, the findings suggest that AI systems are increasingly able to emulate human social judgment. These advancements open doors to practical applications, including personalized guidance in social and professional settings, as well as potential use in mental health support.

“Given the demonstrated ability of large language models to judge social situations effectively in a psychometric test, our objective is to assess their social competence in real-world interactions with people and the conditions under which people benefit from social advice provided by a large language model,” Mittelstädt told PsyPost.

“Furthermore, the response behavior in Situational Judgment Tests is highly culture-dependent. The effectiveness of a response in a specific situation may vary considerably from one culture to another. The good performance of large language models in our study demonstrates that they align closely with the judgments prevalent in Western cultures. It would be interesting to see how large language models perform in tests from other cultural contexts and whether their evaluation would change if they were trained with more data from a different culture.”

https://www.psypost.org/ai-chatbots-outperform-humans-in-evaluating-social-situations-study-finds/
