Page 32: Research news on AI alignment

AI alignment examines how artificial systems acquire, represent, and act on goals, values, and social norms, and why their behavior often diverges from human expectations. Work in this area studies systematic failures such as bias, sycophancy, hallucinations, deceptive or selfish reasoning, and cultural or linguistic inequities, as well as limitations in commonsense, emotion, and social understanding. It also develops methods for preference learning, norm-following, interpretability, and reliability guarantees to better align AI behavior with human values and societal constraints.

Machine learning & AI

Neurosymbolic AI could be leaner and smarter than today's LLMs

Could AI that thinks more like a human be more sustainable than today's LLMs? The AI industry is dominated by large companies with deep pockets and a gargantuan appetite for energy to power their models' mammoth computing ...

Machine learning & AI

AI overconfidence mirrors a human language disorder

Agents, chatbots and other tools based on artificial intelligence (AI) are increasingly part of everyday life. So-called large language model (LLM)-based agents, such as ChatGPT and Llama, have become impressively ...

Machine learning & AI

Key units in AI models mirror human brain's language system

EPFL researchers have discovered key "units" in large AI models that seem to be important for language, mirroring the brain's language system. When these specific units were turned off, the models got much worse at language ...
