Page 7: Research news on AI alignment

AI alignment examines how artificial systems acquire, represent, and act on goals, values, and social norms, and why their behavior often diverges from human expectations. Work in this area studies systematic failures such as bias, sycophancy, hallucinations, deceptive or selfish reasoning, and cultural or linguistic inequities, as well as limitations in commonsense, emotion, and social understanding. It also develops methods for preference learning, norm-following, interpretability, and reliability guarantees to better align AI behavior with human values and societal constraints.

Computer Sciences

Visualizing the internal structure behind AI decision-making

Although deep learning–based image recognition is advancing rapidly, it remains difficult to explain clearly the criteria an AI uses internally to observe and judge images. In particular, technologies that ...

Computer Sciences

Six criteria for the reliability of AI

Language models based on artificial intelligence (AI) can answer any question, but not always correctly. It would be helpful for users to know how reliable an AI system is. A team at Ruhr University Bochum and TU Dortmund ...
