You probably don't need AI/ML. You can make do with well written SQL scripts (threadreaderapp.com)

1075 points by passenger 8 years ago | 322 comments

gnicholas 8 years ago |

My startup was approached by a corporate VC that wanted to make a strategic investment. Based on the attendee list from our meeting, which included very high up folks from the company, I felt good going in. They expressed interest in our technology that makes reading on screen easier [1], but they were surprised to learn that we didn't use machine learning to accomplish this.

I indicated that it was actually quite effective without ML, and that it was easier to explain to users this way. They kept prodding around on the ML stuff, and how we might be able to use ML to accomplish roughly the same thing.

A week later they said that they were no longer interested because, although they liked what our tech was able to accomplish, it didn't fit with their investment thesis — which was all about ML.

My wife asked me why I didn't just make some stuff up and say we could do v2 using ML. Perhaps she was right.

1: http://www.beelinereader.com/individual

update: in response to feedback below, I edited the link to point to a page with relevant content instead of our generic landing page. Lesson learned!

JepZ 8 years ago | |

That is just how larger corporations work.

They probably decided they wanted to be part of that future market about AI and therefore want to invest in AI/ML startups. Someone sold your company internally as being a nice fit for the AI portfolio, but when you answered the question about ML with "No" the door was shut.

From someone who makes large trade deals as his daily business, I learned "Never say 'No'" (during a negotiation). In most cases, it is better to be diplomatic. So you might have been better off with something like:

'So far we had decent results without ML, but we are constantly evaluating options to improve our technology, which includes optimizations based on machine learning.'

That way, some might hear '[...] includes optimizations based on machine learning.', while you just said 'we do not use machine learning' ;-)

adriand 8 years ago | | |

You should never mislead a potential investor. Why would you want an investor who is misaligned with your core technology/strategic direction?

digi_owl 8 years ago | | |

How the stock market cargo cult works.

We have already seen this with things like offshoring.

A few big name companies did the numbers and found they could save some on moving their production somewhere else.

Then came a rush of companies that made a big deal of having offshored who knows what, simply because that was what the stock market expected of a would-be forward-looking company.

The basic problem is that these decisions are not made based on what is good for the company long term, but on what is good for the stock price short term. And that is in large part because CEOs are paid in stock options, and can be ousted by activist stock owners.

db1 8 years ago | |

I went to your website, and I can't figure out what your product is or does. I had to click in to individuals before I figured out what it does.

Why not put "BeeLine Reader makes reading on-screen easier, faster, and more enjoyable. We use a simple cognitive trick — an eye-guiding color gradient — to pull your eyes from one line to the next." on your front page?

Maybe you guys have already done some kind of testing and found out that the current layout is optimal?

gnicholas 8 years ago | | |

Appreciate the feedback. We recently redesigned the website to offer silos, and one of the goals was to make the landing page not offer too many specifics that would make some audiences bounce. But perhaps your suggestion is just what we need to do. Thanks for taking the time to share!

One lesson learned: I should have linked to the /individual page instead of the generic landing page in my comment. Updating link now...

tejasmanohar 8 years ago | | |

Agree with db1's feedback. I almost immediately closed the page after not finding the pitch.

Someone 8 years ago | |

If your code has even a single magical constant (I guess it likely has, if only because your code has to figure out whether a period is sentence-ending), you can and, now that it is cheap and easy, probably should use machine learning (a moniker that includes such algorithms as gradient descent) to optimize that constant.

Also, if you have ever considered two code variants where one is a bit better in situation A and the other in situation B, you can use machine learning to combine the two methods, e.g. using a random forest (https://en.wikipedia.org/wiki/Random_forest).

guy98238710 8 years ago | | |

You don't need AI to find values for free parameters. Unless you consider all of statistics to be a subfield of AI.
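To make the "free parameter" point concrete, here is a minimal sketch of tuning one magic constant with nothing more than an exhaustive search. Everything in it is an assumption for illustration: the labelled `samples`, the rule `score >= threshold`, and the helper names are made up, and none of it comes from BeeLine's code.

```python
# Hypothetical labelled data: (score, is_sentence_end).
samples = [
    (0.1, False), (0.2, False), (0.35, False), (0.4, True),
    (0.55, False), (0.6, True), (0.8, True), (0.9, True),
]

def accuracy(threshold, data):
    """Fraction of examples the rule `score >= threshold` labels correctly."""
    correct = sum((score >= threshold) == label for score, label in data)
    return correct / len(data)

def best_threshold(data, candidates):
    """Exhaustive search; with ascending candidates, ties keep the smallest."""
    return max(candidates, key=lambda t: accuracy(t, data))

candidates = [i / 20 for i in range(21)]  # 0.00, 0.05, ..., 1.00
threshold = best_threshold(samples, candidates)
print(threshold, accuracy(threshold, samples))
```

Whether you call a one-dimensional search like this "machine learning" or "statistics" is exactly the disagreement in the two comments above.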

fny 8 years ago | |

Great product idea! I'm glad people find it useful.

Hypotheses:

> corporate VC that wanted to make a strategic investment

They want to put money in your company to make sure they don't have to compete with you or compete with someone else who might buy you down the line.

> They kept prodding around on the ML stuff, and how we might be able to use ML to accomplish roughly the same thing.

They're looking for secret sauce, barriers, friction. Something patentable. Something to keep others out and keep them ahead. Perhaps something that they don't think they can do themselves or maybe rip off when they realize it's useful. Remember Flux?[0]

Their investment thesis makes sense from this perspective: they're thinking like classic, amoral businesses.

Also note that if your product is simple and popular enough, people will likely make free clones. You'll need to be a good steward to keep your value.

Best of luck!

----

A bit of feedback: you need to be more transparent and up front about what, how, and why you're using analytics, and let people opt in. I was a bit disconcerted to see you using Google Analytics in your extension without informing me.

[0]: https://justgetflux.com/news/2016/01/14/apple.html

dahart 8 years ago | | |

> you need to be more transparent and up front about what how and why you're using analytics

That's a new one to me. Is this Facebook changing the landscape right now, or have you been expecting this for a while? Do you have sites in mind that warn about GA?

Pretty much all websites use GA or something like it, and in my experience it's extremely rare to be warned about it. It always goes in the privacy policy, which you should be able to find. But I'm not sure I've ever seen an advance warning that analytics was taking place. I suppose it's assumed, but in any case appears to be normal and acceptable to not warn people that logging and searching of those logs exists.

Cookies are a different story, since the EU passed legislation requiring notice of their presence.

guiriduro 8 years ago | | |

> They're looking for secret sauce, barriers, friction. Something patentable.

While that may be true, I'd suggest the VCs are just looking for a later exit when a greater fool buys them out. They're betting on the ML investment market getting hotter. Provided you meet the criteria of being a growing startup (by some metric) nominally in the ML field, and that field attracts more investment dollars later, it doesn't necessarily matter whether you succeed as a bottom-line business, or whether your business success comes from ML or not.

gnicholas 8 years ago | | |

Appreciate the feedback! Regarding analytics, I think this is some A/B testing, which I don't think is even active in the current version (though still in the sources).

More importantly, we don't tie GA or any other usage analytics to individual users. We basically just use it to see where the extension is used and where people are blacklisting sites. We use this data to make the extension run better on more sites. We don't monetize user data in any way. You're right that we should make this point more salient. In our Privacy Policy, we do describe how to opt out by blocking GA, but we should put this somewhere more prominent.

Thanks for the feedback!

brod 8 years ago | |

I posted that link in Slack and its twitter:* meta content is really strange:

    Parade of Fans for Houston’s Funeral

    NEWARK - The guest list and parade of limousines with celebrities emerging from them seemed more suited to a red carpet event in Hollywood or New York than than a gritty stretch of Sussex Avenue near the former site of the James M. Baxter Terrace public housing project here.

I found an example [1] with the same content so I guess you just haven't updated it.

1: https://gist.github.com/blairanderson/85cc961295fd03a6c4b3

gnicholas 8 years ago | | |

Wow, how embarrassing. Thanks for pointing this out. I have no idea how we didn't catch it sooner.

matchagaucho 8 years ago | |

Often times the output of ML/AI will result in writing code for an expert system.

The system doesn't actually use ML or AI in realtime. ML just validates that the correct conditionals and decisions are used in the software.

Most ML models can be distilled down using the Pareto principle anyway, and the 80/20 rule written into the code.

A politically correct and high-integrity story to tell investors might be "We use machine learning regression analysis to validate our expert systems are operating on optimal statistical models."

azernik 8 years ago | |

Awesome product! Been using it for a bit, and does indeed help.

Just a quick bug report - it does the wrong thing on right-to-left text. E.g. on this article [1], it highlights the beginning (right side) of one line and the end (left side) of the next line in the same color, which doesn't actually reflect the flow of reading. There's also a weird effect on LTR text embedded in the RTL (on my setup, the "a" in "axios" on the third line of the body is bright red, while the rest is completely black).

EDIT: [1]http://www.maariv.co.il/news/world/Article-633899

gnicholas 8 years ago | | |

Can you share the link? I thought we fixed the right-to-left issue a long time ago. Feel free to share screenshots or more info with contact@[domain].

xevb3k 8 years ago | |

Can someone explain the dynamic that’s going on here? I somehow can't believe that VCs are so stupid as to be purely hype driven, so what is pushing them to make this their investment focus?

Do they just jump on every bandwagon in the hope that one of them might pan out?

joewee 8 years ago | | |

VCs raise money the same way startups do, only with more regulation. For that bucket of money they likely presented a thesis for how it would be used to invest in the ML/AI wave. Investors have rights if the money isn’t used the way they were told it would be.

VCs raise money from fund managers (sovereign wealth funds for example) who also present a thesis to their stakeholders for how the money will be managed.

Depends on the timing, but I think VC focus on ML is a good indicator of where smart money thinks the money is going to be made. There are a lot of people that catch a trend at the tail end, but I don’t think we’ve gotten anywhere close to that for ML investments.

azernik 8 years ago | | |

They put a lot of study into hyped categories, so they at least understand the businesses they're investing in. They can evaluate the caliber of talent, understand the depth of your moat, etc. Whereas with businesses outside of the hype (or outside of their specific specialty) they're not confident in their ability to even evaluate the potential of an investment.

(I'm working on a startup right now that's in a similar position - non-hyped B2B segment. VCs are all "hey, look, we don't understand X market very well, maybe go to Y VC?" And Y VC is all "hmm, you're only tangentially related to Z market that we actually understand, so you'll need to get to Series A-sized revenues before we'll put in a seed-size investment.")

rorykoehler 8 years ago | | |

They invest in future potential. What you can do now is nearly irrelevant besides showing vision and competency. The next big thing isn't going to come from a couple of clever sql scripts.

wellboy 8 years ago | | |

VCs are very stupid most of the time. They don't even understand how ML works, or blockchain, or any new trend.

But their manager told them ML, ML, so they need to invest in ML. This is not a joke.

Avshalom 8 years ago | | |

The thing is we're at a stage where companies hoover up every bit and byte of data they can get hold of on every customer and many non-customers because... well, it's gotta be valuable somehow, right?

But it's so much damn data, so loosely coupled to a company's actual product, that trying to get any actionable intelligence out of it is basically impossible. Enter ML. "Just" chuck your data at a NN, use A/B tests to train, and hope the company ends up with higher revenues. If it does: claim the 'data scientists' are definitely a profit center. If it doesn't: claim you need more data.

Until ExxonMobil stops storing metrics on how many goldfish I own in a given month, cranking out ML related companies seems like a good bet for VCs because it's not going to get any easier to turn progressively more esoteric data points into money.

manigandham 8 years ago | | |

Yes, due diligence is rare. This is portfolio theory in action, investing in many losing bets because the 1% will win big and create the returns.

Disqualify the easy "no", then everything else is a "maybe". Invest and move on.

robryan 8 years ago | | |

The VCs in this case were probably looking for some tech that had a higher barrier to entry.

Whether or not it is true in this case, they may have perceived that an ML-based approach would be harder for the competition to replicate if it was successful.

collyw 8 years ago | | |

It's not just every VC, it's a good percentage, maybe the majority, of developers out there.

bayesian_horse 8 years ago | |

Well, they probably want engineers with ML experience, and companies who already have a foot in that door when an opportunity arises for an actual profit maker.

You may have pitched a perfectly fine product to somebody in the market for a perfectly fine acquihire.

sixhobbits 8 years ago | |

I did your test. Interesting idea. Feedback: I got "It looks like BeeLine didn’t improve your reading speed this time through." which I think needs another pass from your copy editors.

Interesting idea overall. I've been using spritz a lot recently and I'm liking it more and more though it doesn't seem to have progressed from when I discovered it a few years ago.

Re the investors blindly wanting ML - we have this problem too (and it's definitely a problem - I can see this in spite of having a masters in ML). On the bright side, some investors - often the better ones - have said things like "I was interested the moment you didn't use ML or blockchain in your pitch", so stay true to your vision :)

zhte415 8 years ago | |

General comment on colour highlighting -

I read the updated /individual page and found it very... difficult.

The colours led me to speed up and slow down at an uncomfortable rate, to the extent I had to re-read it three times. I typically receive 200-1000 emails per day which all need to be read (yay!) plus at least 30k words per day out of email (business stuff but excluding newspapers, books, etc).

I speed-read about 6-10 words per flick of the eyes (for a short document), which is about 50 words per second. I do slow down to ponder careful phrasing, a necessarily precise document, or some graphics, and this is not a detriment.

I found the colour-coding very difficult, however.

gnicholas 8 years ago | | |

I'm curious to know if you tried it in the more subtle color schemes. OTOH, if you're a speed reader who uses other techniques, this might not fit your reading patterns.

TheOtherHobbes 8 years ago | | |

Similar experience here. I suspect it's not a good option for speed readers.

Also, there are issues with red-screen sleep promoting systems like f.lux.

qwerty456127 8 years ago | | |

Wow! Would you mind sharing some practical advice on how to actually train to read this fast?

arghwhat 8 years ago | |

You should definitely have made something up.

Up front, many investors only seek buzzwords before they shift focus to numbers. Be a sales(wo)man, and feed them the ones they want. Later on, once you end up making them money, they're not going to care much about what technologies are involved.

I wish you the best of luck with your startup in the future.

(P.S. I had a FOUC on your webpage. The icons in the "want more" section were missing, and haphazardly popping in. Might want to fix that.)

mindhash 8 years ago | |

This seems pretty cool. I found a good difference reading your education product. It was easy on eyes.

I read a lot of blogs. Thought you could add a wp plugin if it doesn't exist.

Also you must have an ML/AI roadmap for tracking user interactions, etc. for the possibility of further personalization. Leaving aside the buzz of AI/ML, I think every business has an opportunity to apply ML/AI. It doesn't happen often, but when it does, it's gold for any business.

gnicholas 8 years ago | | |

It's true, we could use AI or ML to do stuff not related to the core product. But these investors thought that we should be using ML in the coloring algorithm itself. Even if this made sense to do (that is, if it improved reading ability for people compared with our current product), it would be much harder to explain.

Right now, it's easy to say it's a color gradient that wraps from one line to the next, guiding your eyes. I don't know how I insert "machine learning" into that sentence somewhere and not sound like I'm trying to trend-surf.
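For readers who want to see what "a color gradient that wraps from one line to the next" could look like in code, here is a minimal sketch. It is not BeeLine's actual algorithm: the palette, the per-character interpolation, and the names `PALETTE`, `lerp`, and `line_colors` are all assumptions made for illustration. The only idea taken from the comment is that the color at the end of one line matches the color at the start of the next.

```python
# Hypothetical palette: line i fades from PALETTE[i] to PALETTE[i + 1],
# so the end of one line matches the start of the next.
PALETTE = [(0, 0, 0), (0, 0, 255), (0, 0, 0), (255, 0, 0)]

def lerp(a, b, t):
    """Linearly interpolate between two RGB tuples, t in [0, 1]."""
    return tuple(round(x + (y - x) * t) for x, y in zip(a, b))

def line_colors(line_index, n_chars, palette=PALETTE):
    """One RGB color per character for the given line."""
    start = palette[line_index % len(palette)]
    end = palette[(line_index + 1) % len(palette)]
    if n_chars == 1:
        return [start]
    return [lerp(start, end, i / (n_chars - 1)) for i in range(n_chars)]

lines = ["The quick brown fox", "jumps over the lazy", "dog and keeps going"]
colored = [line_colors(i, len(text)) for i, text in enumerate(lines)]
```

Note that nothing here looks at the words themselves, only at line position, which is why an approach like this would be language-independent.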

gadders 8 years ago | |

First time I've heard of it, but that app is actually pretty cool. I can feel it dragging my eyeballs RTL :-)

nefitty 8 years ago | |

Thank you for your work on Beeline. I've been using it for several months and it definitely helps when diving into large Wikipedia articles, research papers, blog posts, etc. I found that it facilitates my reaching flow quickly when investigating large problems.

gnicholas 8 years ago | | |

I'm curious to know where you heard about BeeLine a few months ago. We haven't had press in a long time, and most HNers who know about us found out from our Show HN way back in 2013.

mort96 8 years ago | |

So, I went to the page, clicked the button for the Firefox extension, and was greeted with this page: https://s.mort.coffee/d/img/scr-4494758.png

You should probably link to the add-on page (https://addons.mozilla.org/en-US/firefox/addon/beelinereader...) instead of directly to the .xpi file. I would never agree to install a plug-in from a random link even if it hadn't been for the dialog's scary language.

gnicholas 8 years ago | | |

Thanks for the feedback. We tried to make it as easy as possible, but I can see that it comes across as more legit if the install comes from the Firefox Store, which requires vetting.

parmesan 8 years ago | |

Wow this is amazing! I don't understand the VCs input about ML, why and how would it help in this use case?

gnicholas 8 years ago | | |

I think they wanted us to do something with NLP and ML, to make it syntactically-based. I happen to have studied linguistics and know how complex this would be — and of course it would have to be custom-built for every language. The current version, which just uses line position, works fine and is language-independent.

Another explanation: they didn't actually know how they wanted us to use ML; they just knew they wanted us to use it.

duckqlz 8 years ago | |

Is there a white paper or a clinical study to back up the claims on BeeLine Reader? I have read numerous studies claiming the exact opposite, especially on a light-projecting surface. I personally found the colour changing on each line to be extremely distracting and straining on my eyes. Furthermore, this seems like a bad joke for someone who is colour blind. Finally, VCs asking for ML are really asking how your technology fits into the general future. With a script that reads line length and changes text between 3 colours, let’s be honest, I could have made BeeLine Reader when I was 10. Perhaps the VCs were initially wowed by how profound you believed your product to be but then were underwhelmed by the tech (I’m guessing that’s why the individual page is buried). And if any publishers are in talks with you about using this in printed media (welcome to imaginary land), just keep nodding to whatever they say, take their money and run.

gnicholas 8 years ago | | |

Yep, there have been several studies done, ranging from media to education to optometric.

• CNET found that readers were 35% more likely to finish reading an article if they were reading with BeeLine turned on. See more on the CNET study in this article in The Atlantic [1].

• An optometric study using eye-tracking glasses found that BeeLine reduced the number of regressions and/or skipped lines for the vast majority of participants.

• Various educator-led studies (some more rigorous than others, and none as informal as the "case studies" that are unfortunately common in edtech) have found strong gains in reading fluency and/or comprehension.

I totally appreciate your skepticism, which is especially reasonable for someone who does not personally see a benefit from the tech.

The response from doctors who work in vision has been very positive, and our tools are recommended by doctors at Stanford Med School and UC Berkeley School of Optometry.

We even got featured by the American Optometric Association. The AOA committee that evaluated our technology said that this was the first time they'd ever had a unanimous vote in favor of anything. Apparently to them, it was quite obvious that this has beneficial effects (without having run any formal studies on it themselves).

To your point about colorblindness, I would note that I regularly come across people with red/green colorblindness who love/use our product. You can change the colors to be whatever you want, or use it in grayscale.

We are talking with publishers, but mostly on the digital side. Thanks for the input and questions!

1: http://www.theatlantic.com/technology/archive/2016/05/a-bett...

vivekd 8 years ago | |

Why would anyone think you need ML or AI to make letters on a page fit a color gradient? Are these investors crazy? I can't even think of an application for that technology in BeeLine Reader.

payasr 8 years ago | |

I have been playing around with the reader, and it doesn't seem to play well with mathjax-enabled pages. Is it a known issue?

gnicholas 8 years ago | | |

Not before now — thanks for flagging! A couple questions: what do you mean by not play well? Does it undercolor (not color the text you want to have colored), overcolor (accidentally color text that should not be colored), or something else? Thanks for sharing this feedback. As another HNer pointed out via email, we need to have a contact/help link in the Settings panel so that we can gather more feedback like this.

coldtea 8 years ago | |

>I indicated that it was actually quite effective without ML, and that it was easier to explain to users this way. They kept prodding around on the ML stuff, and how we might be able to use ML to accomplish roughly the same thing.

Just call what you do ML.

IBM, for one, never shied from calling any old BS "AI" and branding it Watson.

wlesieutre 8 years ago | |

Just rename your company BeeLine Reader Blockchain and you'll be fine.

qbaqbaqba 8 years ago | |

Seizure/Nausea warning!

didibus 8 years ago |

What the article describes is called an "expert system" and is what AI in the enterprise used to look like.

Basically, you try to capture the instinct of a great salesman by formalizing it into computer logic.

Often that's done with rules like in the article.

It works well, but has its limits. The finer points of human judgement are often not expressible: people don't know why they made a given decision, which makes it hard to capture. And humans also have their limits. With too many variables, too much noise, or too much data, they won't make the best prediction/decision.

That's when ML shines. Instead of trying to encode an expert's intuition, you let the machine develop its own intuition, itself becoming an expert through training.

The downside is it now similarly becomes challenging to formalize the machine's intuition. Why it made a given choice is no longer easily apparent.

I do think expert systems still have value. Especially when you lack the dataset to train a machine expert.

zitterbewegung 8 years ago |

Companies have a large problem of having their data tucked away or inaccessible to the stakeholders. When people talk about AI / ML what they actually need is their data cleaned to the point where they can communicate to their stakeholders. Also, all of the companies who sell AI / ML as consultants are really good already at cleaning data.

When companies actually hire data scientists, what they typically do is clean data for a few months to a year. Then they interpret the data, probably by performing linear regression. At that point the data is in a state where it can be easily understood by those stakeholders, and then they have created value. The linear regression or whatever model has been learned may or may not mean something. But at the end of the day you need to tell stakeholders how they can create value, and guess what: SQL and Bash will do 90% of the job.
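To show how little machinery the "linear regression on cleaned data" step needs, here is a minimal sketch of ordinary least squares for a single predictor in plain Python. The `spend` and `revenue` numbers are made up for the example, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up cleaned data: monthly ad spend vs. revenue.
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
revenue = [3.0, 5.0, 7.0, 9.0, 11.0]
slope, intercept = fit_line(spend, revenue)
print(slope, intercept)
```

The hard part, as the comment says, is getting `spend` and `revenue` into a trustworthy state in the first place.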

jrq 8 years ago |

I thoroughly enjoyed this post, so maybe I'm biased.

I think AI is extremely overhyped and underperforming. In fact, I think a major strength of AI is founded in the technical ignorance of certain project managers or decision makers. The type of person who doesn't appreciate the simplistic elegance of sql+bash/cron for simple tasks is the person who will bite on a pitch for an AI customer retention strategy. Customers are people. Business is people. You don't need a rack of GPUs to understand why sending someone an email who has a saved cart is a good idea. It's common sense. It doesn't matter if we can force machines through trillions of operations to vaguely capture a customer pattern if a guy at a console can write it by hand in five minutes.

(not always, I know, I work in finance so a lot of my business IS machines and not people, but you catch my drift)

I'm pro-AI research, and anti-AI hype train. They're computers. They're objects. They're not us yet. Consider the magnitude of the AI research market, which is tens of billions, and compare that to what they are actually capable of doing relative to human performance.

/rant

Maybe HN skews my perception of how the tech-enthusiast public sees AI...
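The saved-cart example above is the kind of thing a guy at a console really can write in five minutes. A minimal sketch, run through Python's built-in `sqlite3` so it is self-contained; the `carts` table, its columns, the sample rows, and the seven-day cutoff are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE carts (
        customer_email TEXT,
        saved_at TEXT,          -- ISO 8601 date
        checked_out INTEGER     -- 0 or 1
    );
    INSERT INTO carts VALUES
        ('a@example.com', '2018-03-01', 0),
        ('b@example.com', '2018-03-20', 0),
        ('c@example.com', '2018-03-02', 1);
""")

def abandoned_carts(today, min_age_days=7):
    """Emails of customers whose cart was saved at least min_age_days ago and never checked out."""
    rows = conn.execute(
        """
        SELECT customer_email
        FROM carts
        WHERE checked_out = 0
          AND julianday(?) - julianday(saved_at) >= ?
        ORDER BY customer_email
        """,
        (today, min_age_days),
    ).fetchall()
    return [email for (email,) in rows]

to_email = abandoned_carts("2018-03-25")
print(to_email)
```

Scheduling something like this from cron and piping the result to a mailer is the "sql+bash/cron" setup the comment has in mind.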

kevin_nisbet 8 years ago | |

I agree completely. At one of my previous employers, the CEO of the company sent out an email with a list of links, and said everyone should learn AI/ML, and that it would be important for future products. And he gave a number of examples of potential features that AI could achieve.

When I looked at it, every feature shown could be more reliably delivered and have a better customer experience through deterministic behaviour.

So I agree, I think AI has made certain technologies way better, but I see it as a tool, and like any tool, it sometimes applies to the situation and sometimes doesn't.

donatj 8 years ago |

I almost took a "big data" data scientist job about a year ago with a local company.

After talking to a number of their engineers, it became quite clear to me that instead of a data scientist, they just badly needed a DBA / someone with ownership and a complete vision of the data structure.

They had no foreign keys, poorly 'designed' indexes, and tons of redundant tables with no rhyme or reason to them.

They'd organically grown their database with hardly any review. They did not have big data, they just had a big mess. And wanted someone else to clean it up.
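As a small illustration of what foreign keys and a deliberately chosen index buy you, here is a sketch using Python's built-in `sqlite3`. The `customers` and `orders` schema is hypothetical and has nothing to do with the company described above; the point is only that the database itself refuses an order pointing at a customer that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless asked for per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        placed_at TEXT NOT NULL
    );
    -- Index chosen for the common lookup "all orders for one customer".
    CREATE INDEX idx_orders_customer ON orders(customer_id);
    INSERT INTO customers VALUES (1, 'a@example.com');
    INSERT INTO orders VALUES (1, 1, '2018-03-01');
""")

try:
    # Customer 99 does not exist, so this row would be an orphan.
    conn.execute("INSERT INTO orders VALUES (2, 99, '2018-03-02')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(orphan_rejected, order_count)
```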

adaml_623 8 years ago | |

I'll bet they found they got 10 times more applicants for a data scientist role as opposed to a dba/sql/data engineer.

vazamb 8 years ago | | |

But what's the point? You will get a lot of applicants from non-CS backgrounds. Not only will they be unprepared for the job, but they will also most likely be very unhappy with their tasks.

flor1s 8 years ago | |

Maybe they should have a data warehouse (which is typically denormalized) instead of a database (which should be normalized)?

donatj 8 years ago | | |

They had a data accessibility / source of truth issue. You need to normalize before you denormalize or else you're just spreading the mess into the warehouse.

halflings 8 years ago |

This could've been a valid criticism of people that use ML where it's not appropriate, but it ended up being a bit of an irrational rant, and a dishonest one too:

> I mean, why send a letter with breast pumps to a man that just bought a pair of sneakers? It doesn't even make sense. Typical open rate for most marketing emails is anywhere between 7 - 10%. But when we do our work well, we saw close to 25 - 30%.

How do you know what items are compatible to each other? Why only recommend sneakers to somebody with sneakers, instead of also recommending sport clothing?

Oh, I guess you could build some type of topology of all your shopping items. But what about recommending soccer balls to people that bought soccer shoes? You could also add that to your database, but now you also need a heuristic to score item similarity: `category_matches * 10 + subcategory_matches * 5 + color_matches * 2 + ...`

This is the whole point of ML. People have been building rule-based systems built on "domain expertise" for ages, only to find that they are limited and cannot compete with simple algorithms fed with enough data.
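For concreteness, the hand-weighted heuristic from the expression above can be sketched as follows. The weights 10, 5, and 2 come from that expression; the item fields and the tiny `catalog` are made up for the example. It illustrates the rule-based approach being criticised here, not a recommendation.

```python
# Weights taken from the expression in the comment; item fields are hypothetical.
WEIGHTS = {"category": 10, "subcategory": 5, "color": 2}

def similarity(a, b, weights=WEIGHTS):
    """Sum the weights of every attribute on which the two items agree."""
    return sum(w for field, w in weights.items() if a.get(field) == b.get(field))

sneakers = {"name": "sneakers", "category": "sport", "subcategory": "shoes", "color": "white"}
catalog = [
    {"name": "breast pump", "category": "baby", "subcategory": "feeding", "color": "white"},
    {"name": "soccer ball", "category": "sport", "subcategory": "balls", "color": "white"},
    {"name": "soccer shoes", "category": "sport", "subcategory": "shoes", "color": "black"},
]
ranked = sorted(catalog, key=lambda item: similarity(sneakers, item), reverse=True)
print([(item["name"], similarity(sneakers, item)) for item in ranked])
```

Every new attribute means another hand-picked weight, which is the maintenance burden the comment is pointing at.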

reacharavindh 8 years ago | |

But that might be in the realm of SQL too. Find out what items were frequently bought with the item that this customer bought, and send them as recommendations. Rule-based does not always mean that a user is sitting down writing that tennis balls and tennis shoes are related items. Don't you think?
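A minimal sketch of that "frequently bought with" query, run through Python's built-in `sqlite3` so it is self-contained. The `order_items` table and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_items (order_id INTEGER, item TEXT);
    INSERT INTO order_items VALUES
        (1, 'tennis shoes'), (1, 'tennis balls'),
        (2, 'tennis shoes'), (2, 'tennis balls'), (2, 'bread'),
        (3, 'tennis shoes'), (3, 'bread'),
        (4, 'bread'), (4, 'batteries');
""")

def bought_with(item, limit=3):
    """Items that share an order with `item`, most frequent first."""
    return conn.execute(
        """
        SELECT b.item, COUNT(*) AS together
        FROM order_items AS a
        JOIN order_items AS b
          ON a.order_id = b.order_id AND a.item <> b.item
        WHERE a.item = ?
        GROUP BY b.item
        ORDER BY together DESC, b.item
        LIMIT ?
        """,
        (item, limit),
    ).fetchall()

recommendations = bought_with("tennis shoes")
print(recommendations)
```

On this toy data, bread ties with tennis balls as a companion to tennis shoes, simply because bread shows up in many orders.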

ju-st 8 years ago | | |

Such a simple system would recommend many items that are frequently bought by everyone (like bread, toilet paper, batteries). You would have to weight the items in some fancy way to get useful recommendations... And I have just described the introductory slides of an applied machine learning university lecture.
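One common way to do that weighting is "lift" from association-rule mining, which divides the observed co-purchase rate by the rate expected if the two items were independent. The comment does not name a specific method, so treat this as one possible choice; the `orders` data is made up.

```python
from collections import Counter
from itertools import combinations

# Made-up orders; bread appears in most of them.
orders = [
    {"tennis shoes", "tennis balls"},
    {"tennis shoes", "tennis balls", "bread"},
    {"tennis shoes", "bread"},
    {"bread", "batteries"},
    {"bread"},
    {"bread", "batteries"},
]

item_counts = Counter(item for order in orders for item in order)
pair_counts = Counter(
    frozenset(pair) for order in orders for pair in combinations(sorted(order), 2)
)

def lift(a, b):
    """P(a and b) / (P(a) * P(b)); above 1 means more co-purchases than chance."""
    n = len(orders)
    p_ab = pair_counts[frozenset((a, b))] / n
    return p_ab / ((item_counts[a] / n) * (item_counts[b] / n))

print(lift("tennis shoes", "tennis balls"), lift("tennis shoes", "bread"))
```

Here tennis balls score above 1 against tennis shoes while bread scores below 1, even though both were bought with tennis shoes equally often.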

visarga 8 years ago | | |

Expert systems are brittle and don't generalise well outside the data on which they were created.

You know, counting items and dividing by total number is a kind of machine learned model, too. In technical terms it is "Computing the maximum likelihood estimate (MLE) for the PMF of a random variable taking finitely many values."

But it's a poor man's model. That's why in order to solve complex problems we use stuff like neural nets and gradient boosting, and in unsupervised learning, matrix factorisation.
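The "counting items and dividing by total number" model is short enough to write out. A minimal sketch with made-up purchase data; `mle_pmf` is a hypothetical helper name.

```python
from collections import Counter

def mle_pmf(observations):
    """Relative frequencies: the maximum likelihood estimate of a categorical PMF."""
    counts = Counter(observations)
    total = len(observations)
    return {value: count / total for value, count in counts.items()}

purchases = ["bread", "bread", "batteries", "tennis balls", "bread"]
pmf = mle_pmf(purchases)
print(pmf)
```

Anything not seen in the data gets probability zero under this estimate, which is one reason it counts as the poor man's model.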

sixdimensional 8 years ago |

Well, we are in the peak of a wave of hype about AI/ML, maybe even just past that peak. Many fundamental technological advancements in the field of AI/ML have sort of coalesced together at the current time to form a strong feature set that can be more broadly applied by a wider audience, not just those hardcore computer scientists who invented the technology.

I've been in the thick of this previously, facing a complex rules-based engine that did most of its incredible feats in the fraud detection domain using a number of really complicated SQL queries. At the same time, I've used the results of such queries combined together with machine learning and predictive analytics, giving you the best of both worlds. Both have strengths and weaknesses.

These are tools in the toolbox, and I think the adage "try to use the best tool for the job" still applies. Sometimes, you use the tool you have and you know, and all the more power to you if you can get the job done using that tool. If you are a master of that tool (i.e. SQL in this case), you can often push its capabilities very, very far.

That said, I think the best thing to do right now is try to separate the signal from the noise regarding AI/ML and find what really works and what does not. Then find how these new tools can either complement or replace previous approaches. I think they work together quite nicely - and we see that sometimes, for example, with AI/ML tools integrated close to SQL engines.

AI/ML has a place, and so does SQL. I will say, though, that I for one don't want to be caught on the side of the discussion where I don't learn enough about what is possible with AI/ML, and then get left behind. I think many of my colleagues and professionals in the field and here on YC feel similarly.

Actually, I think even non-technical people feel the same way - the fear of being replaced by AI/ML is higher than ever.

So, keep applying SQL and get that low-hanging fruit. But make sure to learn the new stuff too, and add it to your toolbox.

smsm42 8 years ago |

But if you say "I'm going to use a bunch of shell scripts to parse logs" you are boring. If you say "I am going to use groundbreaking ML/AI technologies to transform big data into customer retention solutions", you are a visionary.

brootstrap 8 years ago | |

providing insights to your log data faster than ever before with machine learning big data

euske 8 years ago |

The biggest threat of AI is not its ability of taking jobs or exterminating mankind, but the amount of distraction it creates.

When politicians say they improved the economy like 30%, nobody buys into that. It's overly exaggerated, misleading political talk. But when some tech gurus talk about how AI improved their profit 30% or something, everyone seems to hop on. It's effective marketing, for sure, but this is a worrisome trend. The root cause of this is, I think, the lack of proper understanding of fundamentals (and intellectual sloppiness). AI will continue to plague us on this front, and I'm still not sure if the net gain is going to beat all the distractions it created.

kthejoker2 8 years ago |

As someone who sells both of these services, I can only add that it depends, and if you have a good dataset, it's trivial to write either one.

But once you start having to account for noise or seasonality or autoregression or dynamic weights or non linear kernel spaces, pure SQL really starts to fall down on the job.

maltalex 8 years ago |

While I see the author’s point, I fail to understand what any of this has to do with SQL. The problem ML solves isn’t querying databases, it’s making decisions. If a human came up with the idea “let’s lookup people X and send them email Y” and it works, great. But a human made that decision, and SQL is just a tool for making it happen. If you want to take the human out of the loop, SQL won’t save you.

r3bl 8 years ago | |

I don't think you understand the author's point.

His point is highlighted in the first tweet, in which the author appears to be specifically annoyed by the potential founders and investors that can't understand that ML isn't a good solution for all of the problems.

He then goes on and gives an example of such a problem by explaining a shopping cart that doesn't actually need ML, but just some old-fashioned SQL. He doesn't claim that SQL is a solution to all ML problems, just this one.

dotmanish 8 years ago | | |

(I'm not the parent commenter who you replied to, but I think I understood what his/her point was).

Taking the shopping cart example: "In a former life, I used to write SQL to extract customer of the week. Basically, select from orders table where basket size is the biggest."

The author decided that 'customer of the week' will be selected by 'biggest basket size'. Not by 'biggest $ amount spent', 'fastest time from add-to-cart to checkout' (and numerous other attributes or combination of them). This decision (the "best attribute") was taken by a human, leaving a field open where a combination of attributes could've resulted in overall better business outcome (how much did 99% of these retained customers shop for, in $ value over lifetime?, etc)

This is possibly what the parent commenter is hinting at - this human decision leaves a lot of optimization scope, where ML could have helped.
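
To make the point about the human-chosen attribute concrete, here is a small sketch in which the winner changes with the metric. The `orders` table, its columns, and the sample customers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customer TEXT, basket_size INTEGER, amount REAL);
INSERT INTO orders VALUES
  ('ada',   12,  40.0),
  ('bola',   3, 250.0),
  ('chidi',  7,  90.0);
""")

def customer_of_the_week(metric):
    """Pick the top customer by a human-chosen column."""
    # Column names cannot be bound as parameters, so whitelist them.
    if metric not in {"basket_size", "amount"}:
        raise ValueError(f"unsupported metric: {metric}")
    row = conn.execute(
        f"SELECT customer FROM orders ORDER BY {metric} DESC LIMIT 1"
    ).fetchone()
    return row[0]

by_basket = customer_of_the_week("basket_size")
by_amount = customer_of_the_week("amount")
print(by_basket, by_amount)
```

Either query is trivial SQL. Which column, or combination of columns, best predicts retention is the part a learned model could help choose.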

ianamartin 8 years ago |

A huge part of the "data revolutions" that we've seen in the last few decades really has nothing to do with data and everything to do with process.

Data Warehouses changed the way people and companies do data. They expose all kinds of things that were never available before. It was magic!

No. It wasn't. Not that Data Warehouses are bad or ineffective. But it's a lot like the problem you face when observing something changes it. The work you have to go through to build a real data warehouse is that you have to get disparate parts of an organization to codify process. Data warehouses don't model data. They model processes.

The mere fact of forcing the company to pin the process is often more beneficial than the warehouse itself.

The same thing goes for ML and AI. The only way to extract features is for them to actually exist. And that means the data needs to exist in a certain form, and there's a human process that leads to that. Absent that, it's pretty useless.

I cut my teeth on SQL, and it's a big part of my professional career. I think it's great. It's one of my favorite languages, and it does a lot that maybe a lot of people don't know about.

But this title and the content are really pretty garbage. Anyone who thinks that good SQL can do what good AI/ML can do is really misunderstanding both.

cyberomin 8 years ago |

Hi, I'm the guy that wrote the tweets. Let me know if you have any other questions. I'm happy to answer any question.

elchief 8 years ago |

eh. I built a lead gen system at a fortune 1000. The heuristic SQL version brought in 10M a year. The random forest version brings in 100M a year. It saw things we didn't

kgdinesh 8 years ago | |

can you elaborate?

elchief 8 years ago | | |

Built a lead generation system. Looks at searches on our site and picks out the best people for our sales team to call.

Set up a bunch of rules created by the sales team. Tweaked it over months. Made money

Then used real sales data tied back to search history and built a machine learning model. It found new patterns that the sales team hadn't thought of, and performs much better

smcl 8 years ago |

"Set this as a CRON that fires at 2AM everyday, period with less activity and traffic. People wake up to emails reminding them about their abandoned carts"

Hah I wondered why I got so many notifications in the middle of the night. Now I know that it's from people who think they're helping - not realising that it actually sours my opinion on their company/product.

epilogue 8 years ago |

The writer seems to describe very basic data mining in some cases, which in itself is a form of AI/ML, but then other examples have no relevance to needing to use AI/ML at all.

If their data is already clean enough for SQL queries to work reliably and they are familiar with the SQL syntax, why not look into things such as DMX in MSSQL to make predictions on what these customers are likely to want to buy? This solves the whole marketing breast pumps to a man who bought sneakers scenario, while also providing more personalized recommendations.

If your current technique is to send an email about sneakers to recent sneaker purchasers, do you really think they are in the market for another pair?

Sure, it might not make sense to implement a deep learning neural network just to send something like a semi-personal marketing email but there are so many varying levels of AI/ML that seem to get ignored in favor of the flavor of the month Tensorflow/IBM Watson/Whatever else. Quite frankly, the whole thing just comes across as a very closed minded rant from someone who isn't interested in exploring what new technologies are capable of.

et2o 8 years ago |

Good points but really a false dichotomy. The purpose of AI and machine learning is to find patterns in data that aren't simple heuristics like this.

free652 8 years ago |

The problem with SQL is that eventually you will end up with thousands of SQL scripts. Have you ever tried to debug a 100k SQL? It’s a nightmare. Some of the scripts used to be simple, but got too complicated due to new requirements. For example, this article doesn’t mention how he would deal with multiple time zones, currencies, different types of customers, multiple promotions for repeat customers, etc.

QuantumAphid 8 years ago | |

Are you suggesting that AI/ML makes those new requirements go away? Or that managing those requirements becomes easier because AI/ML software figures it out?

danShumway 8 years ago | |

Not that I disagree with you, but does machine learning solve any of those problems?

SQL is annoying to debug, ML is impossible to debug.

40four 8 years ago |

I don't get this article at all. The author does not really back up their argument with any examples of ML. What in the world does common marketing practice & seemingly basic SQL queries have to do with AI/ML? What am I missing here? To me, this just sounds like a "Get off my lawn" type of rant. "Why do we need the newfangled AI when we still have good ole' SQL & bash!(waving fist in the air)"

On the other hand, comments are talking about hiring data scientists for months if not a year or more (yikes!) to clean data & wait for it ... perform linear regression. To me this sounds like a great application of machine learning. Couldn't someone train some models to clean the data, then do one of the things ML does best, linear regression, in a fraction of the time the human data scientists could do it in?

ashelmire 8 years ago | |

Data cleaning is reeeally messy. It requires a lot of training data to get a system that does even a bad job of it automatically. So you still end up needing a lot of clean data, and getting the system to that point probably isn’t worth it if you’re not one of the biggest tech companies (you can spend time creating a system to clean the data or just clean the data). But your other points are spot on. This article is garbage.

40four 8 years ago | | |

I see what you're saying about cleaning the data. This is something I'm very unfamiliar with, so good to know!

Yeah I think the article is garbage too, makes me wonder why it gained so much traction? The argument/ topic are not developed at all.

I guess the point they were going for is there are people who want to use ML because it's 'trendy' or something, and simpler solutions would suffice. I could see that being true, but this article is BAD. I hate seeing low quality articles get rewarded.

threeseed 8 years ago |

Who on earth are these people describing?

I've never heard of anyone hiring expensive Data Scientists, spinning up Spark/H2O clusters, building a data lake, doing a database offload to S3/HDFS all for a "select from orders table where basket size is the biggest" query.

AI/ML doesn't even work like this. It's simply not designed for giving 100% accurate answers to highly structured queries.

sercant 8 years ago |

Although the author has fair results with his given case, the author mistakes the use of AI/ML in such a scenario. In the example, they make the decision of "We should send emails to people who did 'case a'.". This is pure 'instinct' on the part of the decision maker. But in the AI/ML case, this would be learnt from the feedback of the click rates etc. Naturally, the decision maker becomes the AI, which actually can find interesting scenarios and exploit these behaviours to increase the desired outcomes.

kriro 8 years ago |

The article doesn't convince me.

It can be summarized as "don't overengineer" but quite frankly these days ML/DL is so easy to apply from a technical point of view (taking care of the data or fully grasping the things you apply is another issue) that I don't see why one wouldn't at least try to use it. I don't see why an ML algorithm couldn't grab the first name for example. I mean if your argument is "just use SQL" my counterargument is "I agree but I can just try ML as SQL on steroids". If you already have well curated data that you run the SQL on you might as well play around with it in an ML setting. "Customer with largest basket" might work fine but why not try to prod the data to check for other interesting things. Same for the POD example. Why not at least try to see if a combination of variables might yield more interesting results than the simple stuff that might work. Occam's razor should not cut out all curiosity :D

I like the overall idea of "try the simple stuff first" but quite frankly these days you can run very good ML with pretty much all it takes to do SQL queries (assuming you train your models on a separate machine).

soVeryTired 8 years ago | |

The right questions to ask are "what is the incremental benefit I can get from ML over a simple rules-based system? How much does that added complexity cost?".

Costs include technical debt, increased maintenance, general opacity, and the risk that the complex model runs amok and does something stupid (which is more common than you might think).

Sometimes those costs are justified, sometimes they aren't.

dzmien 8 years ago | |

I don't think the author was trying to suggest that people stop using/developing ML/AI. I think the point was that ML is not some kind of black magic panacea, and that a smart programmer is often more valuable than a 'smart' machine.

reilly3000 8 years ago |

This is the opposite of AI use cases in marketing. You are declaring a specific timeframe for your message delivery. That is not how a marketer should deploy AI. I haven’t been in any pitch meetings since AI assclownery took hold so I can’t comment on how the term is being abused. What I can say is that a model that used AI would take every parameter it could about each customer and determine the optimal time to send an email to get a conversion. The only inputs the marketer should provide are raw historical data with clear parameters like order value, order items, estimated revenue, buyer classification, a stream of subsequent etc and date, and the model should solve for the correct timestamp to send the follow up message. I don’t think the AI is writing the message yet, and I don’t think you need a neural net to do a decent job at solving for the right datetime to send. I do think the approach I described would get superior conversion rates than a rule, cost more to make than that rule, and definitely demand a decently huge dataset to add much value.

nostrademons 8 years ago | |

I think the main point of the article is that 99% of the value in that is in a.) sending the follow-up email at all (which just takes a cronjob) and b.) identifying which customers to send that follow-up e-mail to (which just takes a SQL query). While it's probably nice to try and predict the ideal time to send it, the gains you get from that are marginal compared to steps a & b, which many companies aren't even doing today.

zer00eyz 8 years ago | |

Though you're generally right, the article sets a tone in the first paragraph.

It is saying that if you're looking for ML/AI solutions for marketing and you AREN'T doing the basics already then you're throwing good money after bad. You should START with some sql and targeted emails before you dive into a large and potentially expensive project.

taeric 8 years ago | |

I think the big problem with your example is that most companies just don't have something I want to talk with them about. For large stretches of the year. No amount of ai is going to change that I am not in the market for a car, as an example. Even biking gear is of limited sellability to me most years. Hard to know when I'm going to buy new biking shoes, unless you know when my shoes are going to go bad.

flyingcircus3 8 years ago |

I like the notion that AI is impossible to wrangle into a neat box, because it has always described the cutting edge of technology. Image manipulation, audio synthesis, and other techniques were once considered artificial intelligence. But now that they are far better defined and understood, they essentially have fallen out of the nebulous sphere of sci-fi tech.

master_yoda_1 8 years ago |

In layman's terms, the difference between sql and ml is that ml predicts things and sql just tells you things.

Things have changed and ml nowadays can do far better things. If the competitor is using ml and making gains, then one should also catch up as soon as possible.

SQL analytics was past, predictive analytics is the future. ML can do more than predictive analytics for you :)

tejasmanohar 8 years ago | |

  SQL analytics was past, predictive analytics is the future

You're over-simplifying things. SQL is here to stay, regardless of how big ML, which I'm very bullish on, becomes. Start with the simplest approach and try alternatives when/if it doesn't work. Simply jumping to "predictive analytics" is silly.

master_yoda_1 8 years ago | | |

I agree sql is here to stay. For ml you need data and sql is the best place to store data. You can use various sql queries to get features for an ml system.

azernik 8 years ago | |

Predictive analytics is the future for people who are trying to predict things.

Use the right tool for the task.

kexx 8 years ago |

Most people forget IT is the same as any other industry with marketing plots, promotions pretending to be articles, etc. Before AI/ML and big data, we had cloud (which is basically a server), web2.0 (it does not even make any kind of sense technologically), ajax (how was that a new thing in any way?) or really long time ago NETWORK COMPUTER (this one is kinda hilarious, oracle tried to sell dumb terminals as the future - https://en.wikipedia.org/wiki/Network_Computer, and nowadays Google tried the same thing with chromebook). I feel it's the same thing as how every 5 years, healthy food is different. Do you remember those days when fat was deadly poison?

sbhn 8 years ago |

You can even make do with plain old client side JavaScript object arrays. After looking at your site, I can see your company has very good presentation skills. It very effectively appears to sell a simple algorithm that nearly anybody on earth with a little bit of experience, could do themselves. What the investor wants, is can you sell AI/ML as successfully as some text coloured blue, white and red. If this HN post is anything to go by, it certainly generated a lot of interest and maybe I could hire your company to polish my A href link algorithm with some AI/ML gloss

j45 8 years ago |

SQL scripts written by people that check for scenarios, and even potentially action / repair them, are a form of intelligence. It's not artificial, either.

Thinking back to successful ERP implementations, little was more useful during go-live or an ongoing basis than a script that ran every hour/day/week/month to look for a condition and report it.

In one case, over a 3 year period where the organization grew from 0 to 60 million per year, every data issue was logged as a ticket, investigated, where needed, a Sql script written to monitor other occurrences, and ultimately, if there was a need to action, it would be forwarded to the right destination with a link to instructions on how to resolve or investigate if a decision could not be programmatically made.

The power of this was users received direct and immediate feedback anytime they wanted if their work was good and compliant with the system and process.

How did the list of scripts to build get made? Every time the system behaved correctly or incorrectly, and needed attention, whether due to data being incomplete, mis-entered, or correct and ready for the next step, the technology was busy working for the users.

Scripts reduced concerns that issues were being missed. Once something had happened and it was important enough, a custom insight could be built. It helped build a data driven culture instead of hoping the computer picked the right thing.

Sql scripts could one day feed into or fit with AI/ML. I don't see that day here in the short term.

voltagex_ 8 years ago |

Are SQL skills disappearing from companies? Could this be a reason people are reaching for more complicated solutions because they don't know what a good SQL database can do?

collyw 8 years ago | |

I blame the NoSQL nonsense from a few years ago. "Relational databases don't scale" apparently.

notyourday 8 years ago | | |

https://www.youtube.com/watch?v=b2F-DItXtZs

dizzystar 8 years ago |

The best is when you ask someone why they want an AI/ML masterwork, they just say it's the future and we don't want to be left behind.

It's interesting because this article shows the overlap of what a non-tech thinks is AI and what is common fodder for any decent programmer. So many things get lost in buzzword to English translation, it's easy to forget that most people correlate the plastic box sitting in front of them with an intelligent Magic 8 Ball.

martin-adams 8 years ago |

Maybe I'm completely missing the point here, but I thought the use case for AI/ML was to find the cause, not the effect. For example:

>> If a person tries to checkout with 3 different cards at the same time and they all bounced, something funny is happening. Block their account temporary for a while.

That assumes you know that 3 different cards were used and they bounced. Sure, the SQL can answer the question, but you have to know the question first.

I'm happy to be corrected here.
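
For reference, the quoted card rule looks roughly like this once the question is known. The `payment_attempts` table, the 10-minute window, and the threshold are assumptions for illustration; the article gives no schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payment_attempts (
  account_id INTEGER, card_id TEXT, status TEXT, attempted_at INTEGER);
INSERT INTO payment_attempts VALUES
  (1, 'card-a', 'declined', 1000),
  (1, 'card-b', 'declined', 1030),
  (1, 'card-c', 'declined', 1060),
  (2, 'card-x', 'declined', 1000),
  (2, 'card-x', 'declined', 1020),
  (2, 'card-y', 'approved', 1040);
""")

def suspicious_accounts(now, window_seconds=600, min_cards=3):
    """Accounts with at least `min_cards` distinct declined cards
    inside the last `window_seconds` (timestamps are epoch seconds)."""
    rows = conn.execute("""
        SELECT account_id
        FROM payment_attempts
        WHERE status = 'declined'
          AND attempted_at > ? - ?
        GROUP BY account_id
        HAVING COUNT(DISTINCT card_id) >= ?
        ORDER BY account_id
    """, (now, window_seconds, min_cards)).fetchall()
    return [row[0] for row in rows]

flagged = suspicious_accounts(now=1100)
print(flagged)
```

The window and the threshold of three are exactly the kind of numbers a person has to pick up front, which is the objection being raised here.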

johnlbevan2 8 years ago |

Fully agreed that in simple use cases simple solutions make sense; I've been arguing similarly for the NoSQL movement for years (i.e. NoSQL being great for large scale systems; but for most companies day-to-day needs SQL wins out).

However, it would be good to have a bit more in the article to say what AI/ML* is in this context, and a couple of scenarios where it beats SQL; i.e. otherwise it just sounds like the rantings of an old man "in my day we only had turnips; you needed a snack: turnip; you needed a pillow: turnip". Showing a few good use cases allows you to better contrast the product / get an understanding of where the boundaries are between the technologies.

*NB: When I first read this I assumed the author was talking about AIML (artificial intelligence markup language) rather than AI/ML (artificial intelligence / machine learning)... as although the slash was included, there was no use of the full terms.

cirgue 8 years ago |

ML is best suited for situations where there is no practical solution using typical statistics techniques and where marginal improvements in accuracy lead to significant boosts in revenue or some other useful metric. It turns out there aren't that many of those problems unless you're operating on truly enormous scales.

crabasa 8 years ago |

Back in 1999 I worked at an early web consultancy that built apps for clients on top of Oracle. We used their DB + a programming language called PL/SQL.

There was a feature of Oracle called SOUNDEX which was magical. Here's an example from their docs page [1]:

    SELECT last_name, first_name
    FROM hr.employees
    WHERE SOUNDEX(last_name)= SOUNDEX('SMYTHE');

This query will return all people with a last name that sounds like 'Smythe', including 'Smith' and 'Smithe'.
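
For readers without an Oracle instance, here is a rough Python sketch of the classic American Soundex rules. It is not Oracle's implementation and may differ from it in edge cases; the function name is just illustrative.

```python
# Consonant groups of classic American Soundex; vowels and y get no code.
CODES = {}
for digit, letters in enumerate(
        ["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)

def soundex(name):
    """Return the 4-character Soundex code for `name` ('' if no letters)."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    result = first.upper()
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not break a run of equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets the run, so repeats are coded again
    return (result + "000")[:4]

for name in ("SMYTHE", "Smith", "Smithe"):
    print(name, soundex(name))
```

All three spellings collapse to the same code, which is what makes the `WHERE SOUNDEX(a) = SOUNDEX(b)` comparison work.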

[1] https://docs.oracle.com/cd/B19306_01/server.102/b14200/funct...

msumpter 8 years ago | |

I've used similar phonetic algos in Excel to deduplicate CRM data during corporate acquisitions. It always seemed like the source data was hopelessly duplicated, but we would run a few of these algos against the data and then provide the 'best guesses' to the sales team, who did the final massaging of which accounts were truly duplicates or should be left alone.

Soundex is very simple but works well, calculating a string's Jaro–Winkler distance also helped.

PeterisP 8 years ago | |

Soundex, by the way, had its 100th anniversary a few weeks ago - it was patented in 1918.

nicodds 8 years ago |

I think the writer is overgeneralizing his particular use case. Surely, the situation he represents doesn't need any AI/ML, but it is the result of a simple use case, with few variables and with an easy workflow.

Does the same pattern apply also in more complex scenarios?

ben509 8 years ago | |

Yes and no. AI/ML is real stuff that can do useful things; I worked at a government contracting firm and we made a lot of that stuff work. But as I recall, before anything landed in AI algos, we'd always have pages and pages of code handcrafted by SMEs to prep it. It's not hard to see that, for many cases, all the prep work gets you pretty close to the answer without any training.

I think the issue the article hints at is there are way too many contractors willing to burn your cash on AI/ML.

Contracting has a serious principal agent problem; there was a discussion I recall over how to implement a quick search feature in a system we maintained. I floated the idea of sampling the data to get approximate results, but that was instantly shot down in favor of buying a ton more hardware. There are serious arguments against sampling, it's very tricky to get right, but if we had been spending our own money I think it would have gotten a more careful hearing.

notyourday 8 years ago |

"Machine learning" is an excellent tool to separate extra money from "customers". Founders are separating extra money from VCs. Engineers are separating extra money from the founders. Just reading this thread keeps illustrating this.

Want to get a job done? Use a tool that gets a job done. Want to talk about getting a job done and be "listened" to - use ML to beat around the bush.

This is no different from all these companies talking about Big Data[tm] a few years ago, hiring people to build large processing clusters when their entire dataset would fit into the memory of a $700 server obtained from eBay.

Nor is it different from companies mumbling about availability challenges when the entire stack gets sub 100 hits per second.

mythrwy 8 years ago |

For the examples mentioned in the article no, you don't need statistical analysis but these are simple cases (which most cases are).

Late orders, biggest orders etc. etc. sure, those are all SQL queries.

However if you want to make statistical predictions or look for the non-obvious, these simple types of queries aren't going to do it. So it's an apples to oranges comparison.

There are a lot of cases where people don't know what they are after. And also lots of cases where orgs don't have a grasp on the simple things, but somehow think more complex things (especially buzzword things) are magically going to solve a lack of organization and insight.

AngeloAnolin 8 years ago |

AI, ML, Data Science, Algorithms - these are just the fancy buzzwords we have tended to associate with how we analyze data. We have been doing a lot of this stuff (especially if you are in the software engineering world for business and consumer products).

Going through the author's given examples, we have probably been doing:

What would be the net effect in terms of sales and profit if we reduce our price by 5 cents, but increased our sales 25x? Those are already models that encompass predictive modeling, where we provide inputs and determine the output based on general assumptions backed by data.

AzzieElbab 8 years ago |

You probably do not need SQL. You can make do with well written sed/awk scripts

ben509 8 years ago | |

For parsing and prepping large amounts of data, awk is shockingly effective. And after a while, you even get used to your eyes bleeding...

stackzero 8 years ago |

Click bait of a title... I think the more important thing this article is trying to say is: use a good heuristic to solve your problem, if it can't do so then ML may be something to look into.

jmpeax 8 years ago |

> select from orders table where basket size is the biggest. We will then email a nice thank you note to this customer and attach a small coupon/voucher.... ...Guess what? 99% of these people became repeat customers

Sounds like you definitely need some ML there, in the form of statistics. Was there a difference in probability of repeat customers between sending and not sending the voucher? Was there a difference between basket sizes and probability of repeat customers? Is there an interaction between the two?

collyw 8 years ago | |

Why couldn't you do that in SQL, sounds simple enough to me?
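
A minimal sketch of the first of those questions as a query, assuming a hypothetical `customers` table with made-up rows where `got_voucher` and `repeated` are 0/1 flags.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER, got_voucher INTEGER, repeated INTEGER);
INSERT INTO customers VALUES
  (1, 1, 1), (2, 1, 1), (3, 1, 1), (4, 1, 0),
  (5, 0, 1), (6, 0, 0), (7, 0, 0), (8, 0, 0);
""")

# AVG over a 0/1 flag is the share of customers who came back.
rates = dict(conn.execute("""
    SELECT got_voucher, AVG(repeated) AS repeat_rate
    FROM customers
    GROUP BY got_voucher
""").fetchall())

uplift = rates[1] - rates[0]
print(rates, uplift)
```

The comparison only means something if there is a group that did not get the voucher, ideally chosen at random, and judging whether the gap is more than noise takes a significance test on top of the query.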

pugworthy 8 years ago |

Gee. Thanks for the porn in the "Recommended Threads" at the bottom of the page :/

If your work scans your web browsing for certain words etc., don't click the link.

erikb 8 years ago |

Well yes, if you own your shop and are one of 1-10 people working in this shop, then you don't need these high tech things.

They are of course for companies that make so much money that they can afford to spend 6-digit pays for a Marketing Manager who doesn't know sh*t about his job who in turn is spending millions on randomizing-diagram generators so it seems like he is working hard.

i_feel_great 8 years ago |

I find very handy the Postgres aggregate and stats functions: https://www.postgresql.org/docs/current/static/functions-agg....

I have also used Sparklines in Python for quick and dirty trends

philipodonnell 8 years ago |

I think SQL can also benefit from some of the progress around making things that "feel like" ML easier. For instance, dplyr is a refreshing change to the way you write operations that manipulate data in a table/column structure, even though it uses mostly the same verbs and language constructs as SQL.

internetman55 8 years ago |

Why not both?

http://sqldatamine.blogspot.com/2013/07/single-multiple-regr...

qwerty456127 8 years ago |

> we will send a nice "we miss you, come back and here's X Naira voucher" email. The conversation rate for this one was always greater than 50%.

Wow. I could never imagine so many people actually read marketing e-mails

debarshri 8 years ago |

Isn't machine learning a concept, whereas sql or anything else is more about how you implement it? I have in the past seen a well-seasoned sql developer implementing collaborative-filtering-like algorithms.

walshemj 8 years ago |

You can do some types of ML with SQL; all the main sql databases are Turing complete.

Not sure if it's going to be efficient for clustering and entity extraction at scale tho

RandyRanderson 8 years ago |

ML is not going from 0 -> 25% it's going from 25% to 28%, say, and that 3% being much more in profit than the cost of the ML work.

Gravityloss 8 years ago |

I guess if you're investing for the long term, avoid anything with machine learning, as it's overpriced...

viach 8 years ago |

You don't need no AI/ML, no Blockchain, SQL works just fine... I see where it is going today on HN...

exabrial 8 years ago |

This is one of those HN threads where I'm going to sit back with a bag of popcorn and hit refresh...

piyush_soni 8 years ago |

Now write an SQL Query to find all photographs that have me and my wife sitting in a boat in them.

Piskvorrr 8 years ago | |

...minus the ones which are obviously a toaster, while the AI insists they're you and your wife sitting in a boat ;) Now what?

piyush_soni 8 years ago | | |

Well, yes, it's not perfect yet, but then I'm pretty sure SQL Queries won't fare any better here ;). Google photos does quite a decent job for me in many cases.

thrownaway954 8 years ago | | |

select * from photos p inner join tags t where t.tag in('you','your wife','boat') and not in('toaster')

thrownaway954 8 years ago | |

no problem:

select * from photos p inner join tags t on t.photo_id = p.id where t.tag in ('you','your wife','boat') -- assumes tags.photo_id references photos.id

point being: "crap in, crap out". if you properly tag/label your data, you can accomplish anything with sql or machine learning.
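For anyone who wants to try the tag approach, here is a minimal runnable sketch. The `photos`/`tags` schema, the `photo_id` join column, and the sample rows are assumptions made up for illustration. Unlike a plain `IN` filter, which matches a photo carrying any one of the tags, this version requires all three tags on the same photo and excludes anything tagged `toaster`.

```python
import sqlite3

# Hypothetical schema and rows; the thread does not specify any of this.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE photos (id INTEGER PRIMARY KEY, filename TEXT);
    CREATE TABLE tags (photo_id INTEGER REFERENCES photos(id), tag TEXT);
    """
)
conn.executemany(
    "INSERT INTO photos VALUES (?, ?)",
    [(1, "lake.jpg"), (2, "kitchen.jpg"), (3, "harbour.jpg")],
)
conn.executemany(
    "INSERT INTO tags VALUES (?, ?)",
    [
        (1, "you"), (1, "your wife"), (1, "boat"),
        (2, "you"), (2, "your wife"), (2, "boat"), (2, "toaster"),
        (3, "you"), (3, "boat"),
    ],
)

QUERY = """
SELECT p.filename
FROM photos p
JOIN tags t ON t.photo_id = p.id
WHERE t.tag IN ('you', 'your wife', 'boat')
  AND NOT EXISTS (                      -- drop anything also tagged as a toaster
        SELECT 1 FROM tags x
        WHERE x.photo_id = p.id AND x.tag = 'toaster')
GROUP BY p.id, p.filename
HAVING COUNT(DISTINCT t.tag) = 3        -- all three tags must be present
ORDER BY p.filename
"""

matches = [row[0] for row in conn.execute(QUERY)]
```

The query only works because the `tags` table is already populated, which is exactly the "crap in, crap out" caveat.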

piyush_soni 8 years ago | | |

Your SQL solution is that each user (or someone sitting in Google) manually tags each and every photo they take for all the tags in the world they can think of (and can't, but will think of in future). Seriously, the general case of this problem just cannot be solved using SQL in a feasible and automatic manner. What Google/Apple (or Microsoft in parts) have achieved here is not an easy thing to achieve by any SQL Query, because that works on nicely populated tables, and without ML those tables are probably not there.

newsum 8 years ago |

so many haters on this comment thread.

Just read and stop hating. https://www.thestreet.com/investing/nasdaq-all-in-on-blockch...

justonepost 8 years ago |

Doesn’t scale.

cup-of-tea 8 years ago |

Yeah, but non-technical people who don't know what they are doing but for some reason have money to spend just know they want you to use machine learning for everything.

One time at work I wrote a simple web app with a search box (just doing an SQL query, nothing fancy). One of the "higher ups" was impressed and decided to flex their knowledge, pointing to the search box and saying "and this uses NLP". It was a damn SQL query on a full-text field.

blackrock 8 years ago |

Am I misunderstanding something here?

Artificial Intelligence is about statistical analysis.

Such as: Is this picture of a man and his dog, actually a dog? Or is it a cat? Or is it a 4 legged creature? Or is it a turtle?

The AI is supposed to identify that the animal in the picture is a dog with a 99.8% probability. And since that exceeds the 98% threshold, it is accepted as a dog until proven otherwise.

Basically, it is a pattern matching mechanism, on a massive statistical scale.

And from this, then further actions can be taken.

Such as, the owner of the dog, can be mailed advertising and coupons that are related to dogs.

And then, the AI can go even further. What specific kind of dog is it? Is it a German Shepherd? Is it a beagle? Is it a poodle?

The AI can determine the specific type of dog, and conclude that it is a German Shepherd with a 99.7% probability. This exceeds the threshold, so then the computer system might mail out an advertisement to the owner, about deals related to German Shepherds.

For something like this, then this is where social media can really shine. When you upload your pictures to Facebook, or Gmail, or Instagram, then Facebook or Google, can use an AI to analyze your picture. As well as reading your caption on it. And they can determine the context of your picture, such as whether you have a dog in it. Are you holding the dog? Are you walking the dog? Are you smiling in the picture? If the scenarios check out, then the AI can select you as a candidate, and send advertisements related to your dog.

In fact, I think our brain operates the same way, by using statistical analysis.

When we see a dog, in a picture or in real life, our brain is actually using a statistical analysis to determine that it is a dog. Our brain follows a neural network pathway to match that picture of a dog, to a similar variation of a dog that we have in our memory. It is thus statistically true, until otherwise disproven.

This of course, happens in the deep recesses of our brain, so it's currently impossible to know what really is happening there, until we have a better scientific understanding of how our neurons work in our brain.

On the flip side, SQL scripts have no mechanism to view the picture and determine whether the animal in it is a dog, or a cat, or even a human.
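The accept-above-a-threshold step described in this comment can be sketched roughly as follows. The raw scores are invented numbers standing in for whatever a real image model would output, and the softmax-plus-cutoff rule is just one common way to do this, assumed here for illustration.

```python
import math


def softmax(scores):
    """Turn raw per-label scores into probabilities that sum to 1."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def classify(scores, threshold=0.98):
    """Return (label, probability), with label None if below the threshold."""
    probs = softmax(scores)
    best = max(probs, key=probs.get)
    if probs[best] >= threshold:
        return best, probs[best]
    return None, probs[best]


# Made-up scores: one clear-cut picture and one ambiguous picture.
confident = classify({"dog": 9.0, "cat": 2.0, "turtle": 0.5})
uncertain = classify({"dog": 2.0, "cat": 1.8, "turtle": 0.5})
```

The hard part is producing the scores from pixels in the first place; the thresholding itself is trivial.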

badminton1 8 years ago |

Write me a SQL query that labels images, produces a transcript from sounds, recognizes handwritten text, does facial recognition or recommends items.

See the point? Welcome to 2018.

Piskvorrr 8 years ago | |

Except wan you gat whoa ird trance scriptions, people tagged as gorillas and recommendations "I see you bought $that, do you also want to buy $that?" There's No Silver Bullet, welcome to $any_year.

cup-of-tea 8 years ago | | |

What language is this?

cup-of-tea 8 years ago | |

Most companies don't have anything like that.