Research news on AI alignment

AI alignment examines how artificial systems acquire, represent, and act on goals, values, and social norms, and why their behavior often diverges from human expectations. Work in this area studies systematic failures such as bias, sycophancy, hallucinations, deceptive or selfish reasoning, and cultural or linguistic inequities, as well as limitations in commonsense, emotion, and social understanding. It also develops methods for preference learning, norm-following, interpretability, and reliability guarantees to better align AI behavior with human values and societal constraints.

Computer Sciences

Can AI quantify beauty? New study suggests it can't

Attempts to define human beauty using artificial intelligence may reveal more about bias in data than universal standards, according to a new analysis from the University of Virginia's School of Data Science. Using computer ...

Internet

How AI bias can creep into online content moderation

A University of Queensland study has shown large language models (LLMs) used in AI content moderation may be prone to subtle biases that undermine their neutrality. A team led by data scientist Professor Gianluca Demartini ...

Machine learning & AI

AI works best with humans—not instead of them

A new academic study says the most effective use of artificial intelligence may be to strengthen human thinking and decision-making, rather than replace it. Published in the Journal of Knowledge Management, the paper examines ...

Machine learning & AI

Anthropic says it will put AI risks 'on the table' with Mythos model

American AI developer Anthropic plans to "lay the risks out on the table" even as it restricts deployment of a new model dubbed Mythos, whose powerful cybersecurity capabilities raise stark questions for companies and governments.

Machine learning & AI

Unpredictable AGI may resist full control, making diverse AI safer

Public concern about AI safety has grown significantly in recent years. As AI systems become more powerful, a key question is how we make sure they do what we actually want. Now, researchers suggest that rather than trying ...
