How AI is unlocking ancient texts (nature.com)
"Ithaca restored artificially produced gaps in ancient texts with 62% accuracy, compared with 25% for human experts. But experts aided by Ithaca’s suggestions had the best results of all, filling gaps with an accuracy of 72%. Ithaca also identified the geographical origins of inscriptions with 71% accuracy, and dated them to within 30 years of accepted estimates."
and
"[Using] an RNN to restore missing text from a series of 1,100 Mycenaean tablets ... written in a script called Linear B in the second millennium bc. In tests with artificially produced gaps, the model’s top ten predictions included the correct answer 72% of the time, and in real-world cases it often matched the suggestions of human specialists."
Obviously 62%, 72%, 72% in ten tries, etc. is not sufficient by itself. How do scholars use these tools? Without some external source to verify the truth, you can't know if the software output is accurate. And if you have some reliable external source, you don't need the software.
Obviously, they've thought of that, and it's worth experimenting with these powerful tools. But I wonder how they've solved that problem.
Without an extant text to compare, everything would be a guess. Maybe this would be helpful if you're trying to get a quick-and-dirty translation of a bunch of papyri or inscriptions? Until we have an AI that's able to adequately explain its reasoning, I can't see this replacing philologists with domain-specific expertise who are able to walk you through the choices they made.
I hope to GOD they're holding on to the originals so they can go back and redo this in 20 or 30 years when tools have improved.
Someone deleted part of a known text.
This does require that the AI hasn't previously been trained on the test text.
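For anyone wondering what that kind of test looks like mechanically, here is a minimal sketch. It is not Ithaca or the Linear B model: the "model" is a toy character-bigram counter and the function names (`train_bigram`, `top_k_accuracy`) are made up for illustration. It only shows the shape of the evaluation: train on some texts, blank out characters in texts the model never saw, and count how often the true character is among the top k guesses.

```python
from collections import Counter, defaultdict


def train_bigram(texts):
    """Count which character follows each character in the training texts."""
    follow = defaultdict(Counter)
    for text in texts:
        for prev, nxt in zip(text, text[1:]):
            follow[prev][nxt] += 1
    return follow


def top_k_guesses(model, prev_char, k):
    """Return the k most frequent followers of prev_char (fewer if unseen)."""
    return [c for c, _ in model.get(prev_char, Counter()).most_common(k)]


def top_k_accuracy(model, held_out_texts, k):
    """Blank each position in turn and check if the truth is in the top k."""
    hits = total = 0
    for text in held_out_texts:
        for i in range(1, len(text)):
            total += 1
            if text[i] in top_k_guesses(model, text[i - 1], k):
                hits += 1
    return hits / total if total else 0.0


train = ["abab", "abac"]
held_out = ["bac"]  # must not appear in the training set

# Guard against the leakage problem: no test text may be in the training data.
assert not set(train) & set(held_out)

model = train_bigram(train)
print(top_k_accuracy(model, held_out, k=1))
print(top_k_accuracy(model, held_out, k=2))
```

The gap between the k=1 and k=2 numbers is the same effect as the "top ten predictions" figure in the article: a top-k score is always at least as high as the top-1 score, so the two are not directly comparable.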
The "known" sample would need to be handled and controlled for by an independent trusted party, obviously, and therein lies the problem: It will be hard to properly configure an experiment and believe it if any of the parties have any kind of vested interest in the success of the project.
Then accuracy might be unknown but it's not subjective.
I am aware of how advanced algorithms such as those used for flash memory today can "recover" data from imperfect probability distributions naturally created by NAND flash operation, but there seems to be a huge gap between those, which are based on well-understood information-theoretic principles, and the AI techniques described here.
Zhemao, posing as a scholar, claimed to have a Ph.D. in world history from Moscow State University and to be the daughter of a Chinese diplomat based in Russia. She used machine translation to understand Russian-language sources and filled in gaps with her own imagination. Her articles were well-crafted and included detailed references, making them appear credible. However, many of the sources cited were either fake or did not exist.
The articles were eventually investigated by Wikipedia editors who found that Zhemao had used multiple “puppet accounts” to lend credibility to her edits. Following the investigation, Zhemao was banned from Chinese Wikipedia, and her edits were deleted.
What we call AI does have superhuman powers but they are not powers of insight, they are powers of generalization. AI is more capable than a human is of homogenizing experience down to what a current snapshot of 'human thought' would be, because it's by definition PEOPLE rather than 'person'. The effort to invoke a specific perspective from it (that seems ubiquitous) sees AI at its worst. This idea that you could use it to correctly extract a specific perspective from the long dead, is wildly, wildly misguided.
It's well overdue, although from statistical profiling it's believed to be a valid linguistic script, used as the writing system of the ancient Harappan language, the likely precursor of modern Dravidian languages.
[1] Indus script:
Just think of it abstractly. The AI will be trained on the errors the previous generation made. As long as it keeps making new errors each generation, they will tend to multiply.
But. "Our current methods" include reinforcement learning. So long as there's a signal indicating better solutions, performance tends to improve.
human - "Please compile these texts"
AI - "done! Here you go"
human - "114 AD? Are you sure? We expect this to be around 100 BC"
AI - "NO! Nothing to see, here! Water turning into Wine was clearly added to this God AFTER our lord and saviour!"
human - "But I have this thing.."
AI - "NOTHING TO SEE, HERE! THEY WERE PLANTED TO TEST US! TEST US!"
...
...
"BURN IT!"
The world is about to change much faster than any of us have ever witnessed to this point. What a life.
Very few people on HN are claiming there’s no value to neural networks — CNNs have been heralded here for well over a decade.
I almost read it as "How Ali is" due to speed reading and the font in the original article. :)
And now I wonder how AI would do on that same test :)
Chat, GPT!
Sometimes a clue or nudge can trigger a cascade of discovery. Even if that clue is wrong, it causes people to look at something they maybe never would have. In any case, so long as we're reasonably skeptical this is really no different than a very human way of working... have you tried "...fill in wild idea..."
> I'd really prefer we not start filling the gaps with lies formed of regurgitated data pools
A lie requires an intent to deceive, and that is beyond the capability of modern AI. In many cases a lie can reveal an adjacent truth, and I suspect that is what is happening. Regardless, finding truth in history is really hard because, many times, the record is filled with actual lies intended to make the victor or ruler look better.
Have you ever met an archaeologist?
With the benefit of greater knowledge and context we are able to critique some of the answers provided by today's LLMs. With the benefit of hindsight we are able to see where past academics and thought leaders went wrong. This isn't the same as confirming that our own position is a zenith of understanding. It would be more reasonable to assume it is a false summit.
Could we not also say that academics have a priority to "publish or perish"? When we use the benefits of hindsight to examine debunked theories, could we not also say that they were too eager to supply answers?
I agree about models filling the gaps with whatever is most probable. That's what they are designed to do. My quibble is that humans often synthesize the least objectionable answers based on group-think, institutional norms and pure laziness.
Care needs to be taken, of course, but ancient works often followed certain patterns or linguistic choices that could be used to identify authorship. As long as this is viewed as one tool of many, there's unlikely to be much harm unless scholars lean too heavily on the opinions of AI analysis (which is the real risk, IMO).
This is what I was talking about. Knowledge and ideas develop often by violating the prior patterns. If your tool is (theoretically) built to repeat the prior patterns and it frames your work, you might not be as innovative. But this is all very speculative.