Natural Language Processing: The Age of Transformers (blog.scaleway.com)
XLNet (https://arxiv.org/abs/1906.08237) is in essence a recurrent neural network: it uses a transformer (itself built from neural networks) that recurrently carries context between different batches. That said, it is true that gated RNNs such as AWD-LSTM and GRU are losing ground to the superior transformer architectures.
That's my only complaint though; excellent theoretical introduction.
Although, if anyone wants to actually implement a transformer, beware that you will want a GPU with 8+ GB of memory available, or be prepared to use cloud computing (Google Colab is free, for now). Training neural networks is still quite hardware dependent.
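For a rough sense of where that memory goes, here is a back-of-the-envelope sketch. It assumes 32-bit floats and an Adam-style optimizer with two extra values per parameter (neither is stated above), and it counts only weights, gradients, and optimizer state. Activations, which grow with batch size and sequence length, come on top and are not estimated here. The 110 million parameter figure is the commonly quoted size of BERT-base; the function name is made up for illustration.

```python
def training_state_gib(num_params, bytes_per_value=4, optimizer_slots=2):
    """Estimate memory for weights + gradients + optimizer state, in GiB.

    Assumptions (not from the thread): fp32 values and an Adam-style
    optimizer that keeps `optimizer_slots` extra values per parameter.
    Activation memory is deliberately left out.
    """
    values_per_param = 1 + 1 + optimizer_slots  # weight, gradient, optimizer slots
    return num_params * values_per_param * bytes_per_value / 2**30


if __name__ == "__main__":
    # ~110M parameters is the commonly quoted size of BERT-base.
    print(f"{training_state_gib(110_000_000):.2f} GiB before activations")
```

The fixed cost is modest by itself; it is the activations for long sequences and reasonable batch sizes that push you toward a bigger card.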
Also, a recent piece of interesting work [1] shows that with the right control parameters, you could still use gated RNNs, like LSTMs, for pretty good language modeling.
[1] http://www.abigailsee.com/2019/08/13/what-makes-a-good-conve...
My takeaway: pretraining achieves excellent results on paper, but robust application is hard. There is still quite a way to go down this road for fault-intolerant users and applications.
Recent work by Jakob and team: https://ai.google/research/people/author37567/
Image Transformer is particularly interesting.
You can check it out for yourself. https://talktotransformer.com/ Pretty awesome.
As someone who spent a lot of time trying to hand-code solutions to anaphora resolution (pronoun coreference), BERT seemed like a small miracle to me. As a side comment: I love that getting training data for BERT is so cheap. Take any text source, randomly remove words, and the target output is the removed words.
If anyone's interested in running further experiments on their own, there is now a unified Python framework for dialogue models ;)
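To make the "cheap training data" point concrete, here is a minimal sketch of turning raw text into (input, target) pairs by hiding random words. It is a simplification, not BERT's actual pipeline: it splits on whitespace instead of using WordPiece tokens, and it always substitutes a mask token, whereas BERT masks about 15% of tokens and sometimes swaps in a random token or leaves the original in place. The function name and the `mask_prob` and `seed` parameters are made up for illustration.

```python
import random

MASK = "[MASK]"


def make_masked_example(text, mask_prob=0.15, seed=None):
    """Return (inputs, targets) for a simplified masked-word task.

    inputs:  the words of `text`, with some replaced by MASK
    targets: the original word at each masked position, None elsewhere
    """
    rng = random.Random(seed)  # local RNG so results are reproducible per seed
    inputs, targets = [], []
    for word in text.split():  # whitespace split; real BERT uses WordPiece
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets.append(word)  # the model must predict this word
        else:
            inputs.append(word)
            targets.append(None)  # no loss is computed at this position
    return inputs, targets


if __name__ == "__main__":
    sentence = "Gary saw the bridge and he walked up to it carefully"
    masked, labels = make_masked_example(sentence, mask_prob=0.3, seed=0)
    print(" ".join(masked))
    print(labels)
```

No human labeling is involved anywhere: the text supplies both the question and the answer, which is why any corpus will do.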
<human>The bridge wasn't quite right. There were small sparks coming off the incomplete sections of material hanging just to the right of the traffic lights. Gary wasn't worried, however. He walked up carefully.</human><computer> "This is it, guys!"
A huge explosion rocked the bridge, and a fireball engulfed the entire bridge. "Shit! Shit!"
It seemed as though the fire was just blowing out, and the bridges were back in place. Gary was a little concerned, but it wasn't a problem at the time. At least it was under control.
He walked forward, peering over the top. "This is terrible. I'm not leaving the village tonight! How about a break for some water and a drink?"
When he looked down, the large chunk he had just taken off of the bridge was gone. As soon as he realized it was gone, he turned back to the village to check to see what had happened. Not sure what to do, he continued down the hill, heading toward the village at all times. The entire town was completely surrounded, and everyone was either inside or had</computer>
Not bad, but it's strange that it autogenerated text with multiple spaces at the end of sentences. Also, it is far more dramatic than I would have guessed.
Telling whether a request actually comes from a government or other official institution is going to get much harder.
For those of you who don't know what GPT-2 is, here's the simplest & fastest way with a bit of humor: https://blog.floydhub.com/gpt2/
Disclaimer: I built it.