Research news on AI alignment

AI alignment examines how artificial systems acquire, represent, and act on goals, values, and social norms, and why their behavior often diverges from human expectations. Work in this area studies systematic failures such as bias, sycophancy, hallucinations, deceptive or selfish reasoning, and cultural or linguistic inequities, as well as limitations in commonsense, emotion, and social understanding. It also develops methods for preference learning, norm-following, interpretability, and reliability guarantees to better align AI behavior with human values and societal constraints.

Security

AI can seem more human than real humans in a classic Turing test

A new University of California San Diego study unveils the first empirical evidence that a modern artificial intelligence system can pass the Turing test—a major scientific benchmark that asks whether a machine can imitate ...

Consumer & Gadgets

Humans are bad at making complex decisions. AI can call them out

When a list of pros and cons won't cut it, a new decision-making tool developed by Cornell researchers can use artificial intelligence to help make difficult decisions. But there's a twist: Instead of checking AI's result, ...

Computer Sciences

Blind ambition: AI agents can turn tasks into digital disasters

Computer scientists at UC Riverside have identified troubling flaws in a new generation of artificial intelligence (AI) agents designed to take over routine computer chores while users are away—sorting emails, organizing ...

Machine learning & AI

Don't let AI give your eulogy

There was something a bit off about a speech at a colleague's recent retirement. It was beautifully written, very generously worded and the pacing was impeccable. And yet, I hate to say it, it was utterly lifeless.
