Ask HN: What's Hot in Natural Language Processing?

12 points by IWantToRelocate 4 years ago | 6 comments

Hello HN!

What's hot in NLP in 2022 and over the next 2-3 years, in both academia and industry? (I'm more interested in industry.)

How do you guys stay up to date on this (links, blogs, etc.)?

Thank you!

kingcai 4 years ago

For academia: I expect that FAANG will continue to produce larger and larger language models. There will probably be new improvements in multi-task learning to keep increasing model size - this will probably take ideas from relational / contrastive learning. Few-shot / zero-shot learning has also come a long way in the last couple of years. There will probably be a bunch of secondary papers about language models as well - hypotheses on how they work, explanations of corner cases, ways to deal with bias and fairness.

For industry: Feels like “making BERT / some other language model do things” is a common job nowadays. On the engineering side, I think we’ll see more tools to quickly and efficiently fine-tune language models, especially tools that keep a human in the loop.

Overall it feels like we’re getting to a point where there’s a pretty standardized approach to simple NLP problems like text classification - no more real feature engineering, just throw BERT at the problem. I expect this trend to continue, with more and more of a focus on dataset creation and validation and less of an emphasis on model architecture.
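
As a rough illustration of what the "dataset validation" side can look like in practice, here is a minimal sketch of a few sanity checks on a labeled text dataset before fine-tuning. The function name `validate_dataset`, the (text, label) row format, and the specific checks are assumptions for illustration, not something taken from a particular library.

```python
from collections import Counter, defaultdict


def validate_dataset(rows):
    """Run basic sanity checks on (text, label) pairs before fine-tuning."""
    label_counts = Counter(label for _, label in rows)
    labels_by_text = defaultdict(set)
    empty = 0
    for text, label in rows:
        # Normalize case and whitespace so near-identical rows collide.
        normalized = " ".join(text.lower().split())
        if not normalized:
            empty += 1
            continue
        labels_by_text[normalized].add(label)
    # Same text labeled two different ways is usually an annotation error.
    conflicts = sorted(t for t, labels in labels_by_text.items() if len(labels) > 1)
    duplicates = (len(rows) - empty) - len(labels_by_text)
    return {
        "label_counts": dict(label_counts),
        "empty_texts": empty,
        "duplicate_texts": duplicates,
        "conflicting_texts": conflicts,
    }


# Made-up rows for illustration only.
rows = [
    ("Great product, works well", "positive"),
    ("great product,  works well", "negative"),  # same text, different label
    ("Broke after a week", "negative"),
    ("   ", "positive"),  # empty after whitespace normalization
]
report = validate_dataset(rows)
print(report)
```

Checks like these are cheap to run on every new batch of labeled data, which is roughly where a human in the loop would step in to resolve the conflicting rows.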

I also think there will be a rise in multi-modal language models - combinations of language and vision models, for example. But I think the more interesting application will be combining dense language model representations with sparser tabular data. Think of trying to predict a user's likelihood to buy a product given a review of another product (dense embedding of text), but also their clicks over the last 2 hours (sparser tabular data) - this feels like a much more common problem people have.

To stay updated: read papers (arxiv-sanity.com is a lifesaver) and watch talks (usually just on YouTube; a lot of university reading groups are also public on Zoom nowadays).

nceasy 4 years ago

Thank you very much! That's the type of answer I was hoping for. Could you elaborate a little bit more on what kinds of problems the industry is trying to solve with NLP today, even in the ecommerce space? The one you mentioned, about product reviews, is really interesting.

kingcai 4 years ago

Sure. I don't work in the ecommerce space, but I think the big problem that industry is attempting to solve right now is "how can we take off-the-shelf language models and use them to do things that make us money?". This is a super broad problem and there are many answers to this question. It can be as complicated as creating an intelligent chatbot, but also as simple as adding multilingual support to an app via cross-lingual training.

The example I gave of multi-modal learning was really just highlighting a dichotomy in the techniques that we use in machine learning today. FWIW I am a couple of years removed from working heavily with tabular data, so do take this with a grain of salt. But there are essentially two different modeling approaches for two different types of datasets. On the one hand, you have deep learning (BERT, language models, CV models), which does well on raw data like text or images. These models usually work by mapping the raw data to dense embeddings, which are the output of neural models. On the other hand, you have decision trees / forests (think XGBoost) that work great on tabular data - spreadsheets or other data of that nature.

But what do you do if you have a spreadsheet of data and one of the columns is raw text data but the other columns are, say, sparse boolean features? How can you incorporate the extra information from the spreadsheet into your language model? I think this is a common problem in industry that there's not a clear solution for right now.
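
One naive baseline for the spreadsheet case above, sketched under heavy assumptions: turn the text column into a dense vector, concatenate it with the other columns, and train a single model on the combined features. This is not a settled solution to the problem, just a starting point. `embed_text` is a hypothetical stand-in (hashed bag-of-words) for a real language model encoder, and the rows, column meanings, and hyperparameters are all made up for illustration.

```python
import math
import zlib

EMBED_DIM = 8  # toy size; real sentence embeddings have hundreds of dimensions


def embed_text(text):
    """Hypothetical stand-in for a language model encoder:
    hashed bag-of-words, L2-normalized."""
    vec = [0.0] * EMBED_DIM
    for token in text.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % EMBED_DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_features(text, tabular):
    """Concatenate the dense text vector with the boolean tabular columns."""
    return embed_text(text) + [float(x) for x in tabular]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(X, y, epochs=500, lr=0.5):
    """Per-example SGD for logistic regression; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            g = predict_proba(w, b, xi) - yi  # gradient of log loss w.r.t. the logit
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b


# Made-up rows: (review text, [clicked_recently, returning_customer], bought)
data = [
    ("love it, would buy again", [1, 0], 1),
    ("terrible, returned it", [0, 1], 0),
    ("love the quality", [1, 1], 1),
    ("not great, returned it", [0, 0], 0),
]
X = [build_features(text, tab) for text, tab, _ in data]
y = [label for _, _, label in data]
w, b = train_logistic(X, y)

new_row = build_features("love it", [1, 0])
print(f"P(buy) for new row: {predict_proba(w, b, new_row):.3f}")
```

In a real setup you would presumably swap `embed_text` for embeddings from a pretrained model and could feed the concatenated features to a tree-based model instead of logistic regression. Whether that beats fine-tuning the language model end to end is exactly the open question.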

melony 4 years ago

The last 3 years were mostly dominated by transformer models. BERT, GPT-3: they are all scaled-up variations on the transformer model. The architecture has been surprisingly good and long-lived.

JHonaker 4 years ago

Not that I'm knocking the deep learning aspects of the field, but what are the interesting non-DL avenues currently being explored?

throwawaynay 4 years ago

Not an expert, but the fact that you can now fine-tune GPT-3 seems pretty cool.