Exposing biases, moods, personalities, and abstract concepts hidden in large language models
news.mit.edu
A new method can test whether a large language model contains hidden biases, personalities, moods, or other abstract concepts. MIT researchers can zero in on connections within a model that encode for...
