Arrows of Time for Large Language Models
(arxiv.org)
6 points by tianlong 2 years ago | 3 comments
nyoncore 2 years ago
Isn't it obvious that, since LLMs are trained to predict the next word, they do better at that than at predicting the previous one?
frotaur 2 years ago
The paper mentions that the LLMs predicting the previous token are themselves pre-trained in that direction, so the difference is not obvious.
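To make the "not obvious" point concrete, here is a minimal sketch (not taken from the paper) using a made-up joint distribution over two-token sequences. The token names and probabilities are assumptions chosen purely for illustration. It checks that factorizing the joint left-to-right or right-to-left gives the same sequence log-probability when the conditionals are exact, so any gap between forward and backward models would have to come from how well each direction can be learned, not from the factorization itself.

```python
import math

# Hypothetical toy joint distribution over two-token sequences (illustrative numbers).
joint = {
    ("a", "a"): 0.4,
    ("a", "b"): 0.1,
    ("b", "a"): 0.2,
    ("b", "b"): 0.3,
}

def marginal(pos, tok):
    """Probability that the token at position `pos` equals `tok`."""
    return sum(p for seq, p in joint.items() if seq[pos] == tok)

def forward_logprob(seq):
    """log p(x1) + log p(x2 | x1): predict the next token."""
    p1 = marginal(0, seq[0])
    return math.log(p1) + math.log(joint[seq] / p1)

def backward_logprob(seq):
    """log p(x2) + log p(x1 | x2): predict the previous token."""
    p2 = marginal(1, seq[1])
    return math.log(p2) + math.log(joint[seq] / p2)

# Largest disagreement between the two factorizations over all sequences.
max_gap = max(abs(forward_logprob(s) - backward_logprob(s)) for s in joint)
print(max_gap)
```

With exact conditionals the two directions agree up to floating-point error. A trained model only approximates those conditionals, which is where a directional difference could arise.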
tianlong 2 years ago
Is there a link with entropy creation?