Page 4: Research news on AI alignment

AI alignment examines how artificial systems acquire, represent, and act on goals, values, and social norms, and why their behavior often diverges from human expectations. Work in this area studies systematic failures such as bias, sycophancy, hallucinations, deceptive or selfish reasoning, and cultural or linguistic inequities, as well as limitations in commonsense, emotion, and social understanding. It also develops methods for preference learning, norm-following, interpretability, and reliability guarantees to better align AI behavior with human values and societal constraints.

Machine learning & AI

Anthropic says it will put AI risks 'on the table' with Mythos model

American AI developer Anthropic plans to "lay the risks out on the table" even as it restricts deployment of a new model dubbed Mythos, whose powerful cybersecurity capabilities raise stark questions for companies and governments.

Machine learning & AI

Unpredictable AGI may resist full control, making diverse AI safer

Public concern about AI safety has grown significantly in recent years. As AI systems become more powerful, a key question is how we make sure they do what we actually want. Now, researchers suggest that rather than trying ...

Consumer & Gadgets

Dear AI, I'm autistic; should I go to this party?

When people ask ChatGPT and other AI models for advice, they often share deeply personal details in hopes of getting better answers: their age, their gender, their mental health history, even medical diagnoses like autism. ...

Machine learning & AI

Can Europe create AI that we actually understand?

Artificial intelligence is becoming increasingly important in nearly every aspect of society, but is completely dominated by the United States and China. Leaving the field to foreign powers and large companies may entail ...

Machine learning & AI

When AI seems to know you better than you know yourself

I was at my clinic the other day and asked an AI assistant about the differential diagnosis of a rash in a child. A routine question. The response came back clear and sensible. And then it added, "Are you asking about one ...
