http://www.amazon.com/Language-Thought-Action-Fifth-Edition/...
It talks in detail about the "informative" and "affective" connotations of words (it's a lot more entertaining than it sounds). Only slightly related, but I was just reading in the book how we got the terms "light meat" and "dark meat": ladies and gentlemen in 19th-century Britain couldn't bring themselves to say "leg", "thigh" or "breast" – even of a chicken!
Read the article. It's very clear. To quote it:
"Next, I wanted to see if my model could accurately track the state of the board. A quick overview of linear probes: We can take the internal activations of a model as it’s predicting the next token, and train a linear model to take the model’s activations as inputs and predict board state as output. Because a linear probe is very simple, we can have confidence that it reflects the model’s internal knowledge rather than the capacity of the probe itself."
If the article doesn't satisfy your curiosity, you can continue with the academic paper it links to: https://arxiv.org/abs/2403.15498v2
See also Anthropic's research: https://www.anthropic.com/research/mapping-mind-language-mod...
If that's not enough, you might explore https://www.amazon.com/Thought-Language-Lev-S-Vygotsky/dp/02...
or https://www.amazon.com/dp/0156482401 to dig deeper into how language and world models connect.