Andrew Ng: Unbiggen AI (spectrum.ieee.org)

209 points by sbehere 4 years ago | 84 comments

notsag-hn 4 years ago |

I was going to interview at LandingAI. I was asked before the interview to install a spyware browser extension to monitor my traffic to detect if I was cheating during the interview. I respectfully declined and didn't have that interview.

kevsim 4 years ago | |

Wow if you can “cheat” during an interview - meaning either that they’re asking trivial, google-able stuff or that they’re so bad at interviewing that they can’t tell if you actually know your stuff - then their hiring process is pretty bad.

tablespoon 4 years ago | | |

> Wow if you can “cheat” during an interview - meaning either that they’re asking trivial, google-able stuff or that they’re so bad at interviewing that they can’t tell if you actually know your stuff - then their hiring process is pretty bad.

Not necessarily, at least on the first point. Someone could be getting coached.

A few years ago, a coworker of mine hired a contractor onto his team and was convinced the person who actually showed up was not the person who he interviewed (over the phone). He also thought the guy who did show up was getting a lot of help day-to-day from somewhere. The guy was a contractor, so it wasn't a huge problem because we could drop him quickly, but I would have never expected someone would do anything like that. However, it kind of makes sense as a scam: be a decent developer, get a stable of unhirable incompetents, and rotate them through companies while taking a cut of their salary.

strikelaserclaw 4 years ago | | |

i mean, people are good at finding clever ways to cheat.

cloogshicer 4 years ago | |

Well, Ng is also one of those people who believe that we should all work 70+ hours per week:

https://news.ycombinator.com/item?id=15251769

mdp2021 4 years ago | | |

~80hrs on topic A squeezes the available time for being acquainted with the rest. [Edited because there was little way not to make the former formulation read, unwillingly, nasty]

Some of us believe instead in the advantage of being a polymath, (also) to be able to export wisdom from other contexts into the current work.

Also in terms of the proper ground to facilitate innovation.

nomilk 4 years ago | | |

Musk recommends 80-100hr weeks, every week

Source: https://www.youtube.com/watch?v=GtaxU6DZvLs&t=1m20s

weego 4 years ago | |

It's literally our job not to just assume that the possible solution that rolls off the top of our heads is the most up to date / best practice, and to research it.

mirntyfirty 4 years ago | | |

Agreed. A decent interviewer can also determine a person’s understanding of a topic by simply talking to them about it. I.e., why did you build a model like this? What diagnostics did you use? Have you tried ____ before in your career?

mdp2021 4 years ago | |

I'd just note that if pushed by circumstances (if one was willing to be interviewed in spite of their ways), the interview environment could be (would be) on a throwaway virtual machine...

A possibility which, by the way, makes the interviewer's cautionary move generally useless.

ThalesX 4 years ago | | |

Or, in 2022, one could reach into their pocket and just use a phone, making the interviewer's cautionary move generally useless.

kevinventullo 4 years ago | | |

I suppose if you’re clever enough to set up a VM in order to evade detection, that’s a pretty positive aptitude signal in its own right (though pretty negative on the behavioral/ethics side).

tromp 4 years ago | |

Missed opportunity to say you landed an interview at LandingAI :-)

whatever1 4 years ago |

My understanding is that they are trying to automate the data preparation steps that seasoned ML practitioners are doing anyway today.

The fact that he tries this in manufacturing makes the case stronger. In most manufacturing companies you do not have access to top ML talent.

You have Greg who knows python and recently visualized some production metrics.

If we could empower Greg with automated ML libraries that guide him in the data preparation steps in combination with precooked networks like autogluon, then manufacturing could become a huge beneficiary of the ML revolution.

overkalix 4 years ago | |

Greg probably also knows SAS and AMPL, and has a good knowledge of ops research, which is within stone-tossing distance of whatever ML is pretending to be this week.

NumberCruncher 4 years ago | | |

After 15 years of experience with SAS this sounds to me like saying "knowing how to write and having a pen makes you a poet". But it depends on how far you can toss a stone...

whatever1 4 years ago | | |

OR and ML have their own space in manufacturing.

OR is perfect when you can describe explicitly what the decision space is and what the restrictions are.

ML is a great fit when you want to identify and use patterns. Quality control with machine vision is a good application for ML. NLP for PDF documents is a huge field for manufacturing as well. Companies have so much data in email attachments that they do not currently take advantage of.

andrewf 4 years ago | | |

A tangent, if you have time: where would I go for a primer on operations research and/or discrete event simulation?

My thought is that Goldratt's "The Goal" / theory of constraints is a useful way of thinking about optimizing throughput in a computer system. http://www.qdpma.com/Arch_files/RWT_Nehalem-5.gif plus an instruction latency table is something like a well modeled factory. (The Phoenix Project applies these principles to project management, which I think is a somewhat less useful analogy!)

I'm curious about applying existing tools to modeling things like: how will this multi-tiered application behave when it gets a thundering herd of requests? What if I tweak these timeouts, adjust this queue, make a particular system process requests on a last-in-first-out basis? Can I get a pretty visualization of what would happen?
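A minimal discrete event simulation sketch of the thundering-herd question, using only the Python standard library. Everything in it is an assumption made for illustration: a single server, a fixed service time, clients that silently give up after `timeout`, and a server that cannot tell a request is stale and so still spends time on it. The `simulate` function and its parameters are hypothetical names, not part of any existing tool.

```python
from collections import deque

def simulate(arrivals, service_time, timeout, lifo):
    """Count requests answered before the client gave up.

    arrivals: sorted list of arrival times.
    Assumption: the server still spends service_time on stale requests,
    because it does not know the client has already timed out.
    """
    queue = deque()
    useful = 0
    now = 0.0
    i = 0
    n = len(arrivals)
    while i < n or queue:
        # Admit every request that has arrived by the current time.
        while i < n and arrivals[i] <= now:
            queue.append(arrivals[i])
            i += 1
        if not queue:
            now = arrivals[i]  # server idles until the next arrival
            continue
        # LIFO takes the newest request, FIFO the oldest.
        arrived = queue.pop() if lifo else queue.popleft()
        now += service_time
        if now - arrived <= timeout:
            useful += 1
    return useful

# A herd of 10 requests at t=0, then one request per time unit.
arrivals = [0.0] * 10 + [float(t) for t in range(1, 21)]
fifo_useful = simulate(arrivals, service_time=1.0, timeout=3.0, lifo=False)
lifo_useful = simulate(arrivals, service_time=1.0, timeout=3.0, lifo=True)
print("FIFO answered in time:", fifo_useful)
print("LIFO answered in time:", lifo_useful)
```

Under these assumptions FIFO never recovers from the herd, because every later request waits behind the stale backlog, while LIFO keeps answering fresh requests and only the herd's leftovers time out. Changing `timeout` or `service_time` is the equivalent of tweaking the knobs described above; plotting the per-request response times would be a natural next step.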

itissid 4 years ago |

That is the problem with generalization and cop outs like these. It's no good to people in the field doing actual work where the devil is in the detail.

Big data is fairly important to a lot of things. For example, I was hearing about Tesla's use of deep net models, where they mentioned that there were literally so many variations of stop signs that they needed to learn what was really in the "tail" of the distribution of stop sign types to construct reliable AI.

vasco 4 years ago | |

Interestingly, when you learn how to drive you need to see approximately one example and you're able to identify them all.

teruakohatu 4 years ago | | |

That is called transfer learning. You might only need to see one photo of a sign to identify it in real life (although arguably learner drivers take a while to notice signs) but that is only because you have been training on identifying generic objects since you left the womb.

Your brain already knows how to select the most important features of a sign. The shape, the size and the color. You have also learned how to understand the text on the sign.

A newborn baby does not have that ability.

This is applied in ANNs as well. Transfer learning is using a pre-trained neural network, which has already learned to identify objects, and then training it to identify a new, usually smaller, set of objects using, usually, a lot less training data. That is what Andrew is talking about in the article.

corndoge 4 years ago | | |

Is there some underlying point to this statement? It comes off as a passive dismissal of something but I'm not sure what. It might be helpful to directly state what you're trying to say so that other people can engage with it.

itissid 4 years ago | | |

It does feel though that a model like the human mind will be very fundamentally different from any of the models of today. No?

Like, the state-of-the-art NN models of today are so different from the state of the art of 12 or so years ago, which was SVMs.

simulate-me 4 years ago | | |

Your brain is also the result of billions of years of evolutionary "training." Neural nets start from scratch.

a_square_peg 4 years ago |

I’ve been wondering about the limits of the data-centric approach – there seems to be this implicit notion that more data equals better-performing ML or AI. I think it would be interesting to imagine a point of diminishing returns on additional data if we consider that our ability to perceive is probably largely based on two parts - sensory input and knowledge. Note that I’m making an explicit distinction here between data and knowledge.

For instance, an English speaker and a non-English speaker may listen to someone speaking English and while the auditory signals received by both are the same, the meaning of the speech will only be perceived by the English speaker. When we’re learning a new language, it’s this ‘knowledge’ aspect that we’re enhancing in our brain, however that is encoded.

This knowledge part is what allows us to see what’s not there but should be (e.g. the curious incident of the dog in the night) and when the data is inconsistent (e.g. all the nuclear close calls). I’m really not sure how this ‘knowledge’ part will be approached by the AI community but feel like we’re already close to having squeezed out as much as we can from just the data side of things.

Somewhat related, we have a saying in Korean – ‘you see as much as you know’.

aj7 4 years ago |

“I once built a face recognition system using 350 million images.”

Did this make any of you a little queasy?

mdp2021 4 years ago | |

Well noted! Explicitly: where does such a database come from?

mkl 4 years ago | | |

Frames of video could make the number sky-high like that without involving enormous numbers of people.

a-dub 4 years ago |

data quality is important. every ai project i've worked on has started with visualizing the data and thinking about it.

it's easy to get complacent and focus on building big datasets. in practice, looking at the data often reveals issues, sometimes in data quality and sometimes in the scope of what's in there (if you're missing key examples, it's simply not going to work).

most ml is actually data engineering.

atbpaca 4 years ago |

Glad to see the term ML being used more often than AI in the comments as it looks like most "AI" models are trained for image classification. Having said that, the idea of "doing more with less" sounds interesting and I wonder what it means exactly. Does it mean taking a dataset of 50 images and creating 1000s of synthetic images from it?

spupe 4 years ago | |

Yeah I was very interested about that point in particular. I think synthetic data is one of the ideas, but I got the sense that he also means helping to identify what makes a data set good, even if small. It looks like Andrew Ng is developing a platform for automatically detecting whether a dataset is suitable and, if not, what are the steps to improve it. A sort of automated ML consultant, allowing you to sell capabilities much cheaper than if you needed to consult an actual expert.

DeathArrow 4 years ago |

Pretty interesting. Mr. Ng claims that for some applications having a small set of quality data can be as good as using a huge set of noisy data.

I wonder if, assuming the data is of highest quality, with minimal noise, having more data will matter for training or not. And if it matters, to what degree?

frozenport 4 years ago | |

This is at the heart of the ML training problem.

In general you want to add more variants of data but not so much that the network doesn't get trained by them. Typical practice is to find images whose inclusion causes high variation in final accuracy (under k-fold validation, aka removing/adding the image causes a big difference) and prefer more of those.

Now, why not simply add everything? Well in general it takes too long to train.
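A rough sketch of the "how much does including this example change accuracy" idea, under loud simplifications: a single held-out validation set instead of k-fold, 1-D features, and a nearest-centroid classifier standing in for a real network so that retraining is cheap. All function names and the toy data are hypothetical and only illustrate the brute-force version of the scoring.

```python
def fit_centroids(train):
    """train: list of (features, label). Returns {label: mean feature vector}."""
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def predict(centroids, x):
    """Label of the closest centroid (squared Euclidean distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist2(centroids[y]))

def accuracy(train, val):
    centroids = fit_centroids(train)
    return sum(predict(centroids, x) == y for x, y in val) / len(val)

def influence_scores(train, val):
    """Absolute change in validation accuracy when each example is left out."""
    base = accuracy(train, val)
    return [abs(base - accuracy(train[:i] + train[i + 1:], val))
            for i in range(len(train))]

# Toy data: index 4 is a deliberately odd "a" example far from its class.
train = [([0.0], "a"), ([1.0], "a"), ([2.0], "a"), ([1.0], "a"), ([20.0], "a"),
         ([10.0], "b"), ([11.0], "b"), ([12.0], "b"), ([11.0], "b")]
val = [([0.5], "a"), ([1.5], "a"), ([7.0], "b"), ([10.5], "b"), ([11.5], "b")]

scores = influence_scores(train, val)
most_influential = max(range(len(train)), key=scores.__getitem__)
print(most_influential, scores)
```

This retrains once per example, which is exactly the cost concern raised above; with a real network one would need a cheaper proxy, and what to do with a high-scoring example (collect more like it, or inspect its label) is left to the practitioner.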

pbowyer 4 years ago | | |

> Typical practice is to find images whose inclusion causes high variation in final accuracy (under k-fold validation, aka removing/adding the image causes a big difference)

How do you identify these images? It sounds like I'd need to build small models to see the variance but I'm hoping that there's a more scientific way?

kavalg 4 years ago | |

It is relatively easy to turn small and accurate data into bigger and less accurate data with various forms of augmentation. The opposite is harder.
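To make "various forms of augmentation" concrete, here is a small standard-library sketch. It assumes images are 2-D lists of floats in [0, 1] and that horizontal flips and small pixel noise never change the true label, which is precisely the kind of assumption that makes the bigger dataset less accurate when it fails. The helper names are hypothetical.

```python
import random

def hflip(img):
    """Mirror a 2-D image (list of rows) left to right."""
    return [row[::-1] for row in img]

def add_noise(img, rng, scale=0.05):
    """Add uniform pixel noise, clamped to the [0, 1] range."""
    return [[min(1.0, max(0.0, px + rng.uniform(-scale, scale))) for px in row]
            for row in img]

def augment(dataset, copies, seed=0):
    """Return the originals plus `copies` perturbed variants of each example."""
    rng = random.Random(seed)
    out = []
    for img, label in dataset:
        out.append((img, label))
        for _ in range(copies):
            variant = hflip(img) if rng.random() < 0.5 else img
            # Assumption: flips and small noise preserve the label.
            out.append((add_noise(variant, rng), label))
    return out

small = [
    ([[0.0, 0.9], [0.1, 1.0]], "scratch"),
    ([[0.5, 0.5], [0.5, 0.5]], "ok"),
]
bigger = augment(small, copies=4)
print(len(small), "->", len(bigger), "examples")
```

Going the other way, from the ten noisy examples back to the two clean ones, has no equally mechanical recipe, which is the asymmetry the comment points at.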

xiphias2 4 years ago |

I can imagine that customizing AI solutions in an automated way is quite important, but writing that as the next wave is probably an overstatement.

Of course few shot learning is important for models, but for example for Pathways it was already part of the evaluation.

kappi 4 years ago |

For industrial applications, there are already mature systems based on CV. For the majority of those applications, there is no need for deep learning or multilayer CNNs. Shocked to see Andrew Ng talking like a marketing guy.

leobg 4 years ago |

What are some ML data annotation tools that guide you towards those data points where the model gets confused? I hear Prodigy does this. Any others?

jstx1 4 years ago | |

What's the role of these tools? Can't a developer just write the code to get those data points?

At a first glance it seems like the hassle of integrating such a product into an existing ML codebase/pipeline is larger than solving the problem by hand.

leobg 4 years ago | | |

What I mean is an annotation tool that interacts with the model itself in such a way that it will present to the user exactly those training examples next that will have the greatest impact in helping the model learn. So an annotation tool that provides a user interface for annotating data quickly (with keyboard shortcuts etc.). And looped into inference through the model to be trained, so you always get presented with the very training example that, out of the ones available, the model currently would be most unsure about.
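The selection step described here is commonly called uncertainty sampling. A minimal sketch, assuming the model exposes class probabilities for an unlabeled example; `toy_predict_proba` is a hypothetical stand-in for the real model, and nothing here reflects how Prodigy or any other tool actually implements it.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def most_uncertain(unlabeled, predict_proba):
    """Index of the unlabeled example whose prediction has the highest entropy."""
    scores = [entropy(predict_proba(x)) for x in unlabeled]
    return max(range(len(unlabeled)), key=scores.__getitem__)

def toy_predict_proba(x):
    """Hypothetical model: a fixed logistic curve over a single feature."""
    p = 1.0 / (1.0 + math.exp(-x))
    return [1.0 - p, p]

pool = [-3.0, -0.2, 2.5, 4.0]
next_index = most_uncertain(pool, toy_predict_proba)
print("annotate next:", pool[next_index])
```

In an annotation loop this would run after each model update: show the selected example to the annotator, add the new label to the training set, retrain or fine-tune, and select again.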

TOMDM 4 years ago |

Yeah that'd be great.

I also want cars that run on salt water.

I'm not saying that small data ai is equally impossible, but simply saying "we should make this better thing" isn't enough.

Datenstrom 4 years ago | |

> simply saying "we should make this better thing" isn't enough.

Besides the references to his company, which has customers and a product that already works on these principles, the literature currently shows that this is very much possible if you dig into the correct niches. Besides the SOTA in few-shot and meta-learning, it is possible to smartly choose the correct few samples for the network that yield the same results.

It has also been my primary focus for the past 5 years and the core of the company I founded.

riku_iki 4 years ago | | |

> it is possible to smartly choose the correct few samples for the network that yield the same results.

And then, someone is using a pretrained 500B model, and fine-tuning on your few examples, and getting new SOTA.

sanxiyn 4 years ago | |

It's more of "this direction seems higher ROI than that direction", in particular quality vs quantity of data.

Already in 2018 SenseTime reported that for face recognition, a clean dataset surpasses the accuracy of a 4x larger raw dataset.

https://arxiv.org/abs/1807.11649

mdp2021 4 years ago | |

«Small data /ai/» is not "impossible", it is actually necessary: AI, as opposed to this ML, implies perfected digestion of the input data.

Only, the article seemed to show a very conservative Ng about the algorithms, a focus on data management - so it's still ML.

technocratius 4 years ago | |

I would say that Andrew Ng has some credibility in putting practice to his preaching.

atulsnj 4 years ago | |

At least someone's working on it.

tacosbane 4 years ago |

can we build an AI to detect that the AI goalposts keep getting moved?

girvo 4 years ago | |

A simple “return true;” should suffice, but to be honest that’s what makes the field fascinating to me as an outsider