Giving Away Our Recommendation Engine

Giving Away Our Recommendation Engine (blog.mortardata.com)

297 points by kky 12 years ago | 43 comments

mck- 12 years ago |

FWIW, here are two lightweight feature-based recommendation engines I built for Node.js (for situations where you have the cold-start problem and therefore can't rely on user/item-based collaborative filtering): Alike [1] and Look-Alike [2].

[1] https://github.com/axiomzen/Alike

[2] https://github.com/axiomzen/Look-Alike

yblu 12 years ago | |

Thanks for sharing. What do you mean by the "cold-start" problem? Just want to know exactly when I can use your engines.

elwell 12 years ago | | |

Just speculating: not having a recommendation when you first begin because you don't have any data.
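
To make the cold-start point concrete, here is a minimal sketch of feature-based recommendation: with no interaction history at all, items are ranked purely by how close their features sit to a new user's stated preferences. This is not the API of Alike or Look-Alike; the function names (`euclidean`, `nearest_items`), the feature names, and the sample catalog are all made up for illustration.

```python
import math

def euclidean(a, b, keys):
    """Euclidean distance between two feature dicts over the given keys."""
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

def nearest_items(profile, catalog, k=2):
    """Return the names of the k catalog items closest to the profile.

    No ratings or click history are needed, only item features,
    which is why this still works on day one (the cold-start case).
    """
    keys = sorted(profile)
    ranked = sorted(
        catalog,
        key=lambda item: euclidean(profile, item["features"], keys),
    )
    return [item["name"] for item in ranked[:k]]

# Hypothetical catalog; feature values are assumed to be pre-scaled to 0..1.
catalog = [
    {"name": "thriller", "features": {"explosions": 0.9, "romance": 0.1}},
    {"name": "romcom", "features": {"explosions": 0.1, "romance": 0.9}},
    {"name": "action-romance", "features": {"explosions": 0.7, "romance": 0.6}},
]

# A brand-new user who has only filled in a preference form.
new_user = {"explosions": 0.8, "romance": 0.2}

print(nearest_items(new_user, catalog, k=2))
```

In practice the features would need normalizing or weighting so that one attribute does not dominate the distance; the sketch assumes that has already been done.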

contingencies 12 years ago |

So hang on, what exactly is a recommendation engine?

They give examples of LinkedIn (people you may know) and Amazon (presumably other people who bought this, so-and-so's list of such-a-subject books).

That makes sense, though the segment of businesses that may actually benefit seems limited. Social stuff, sure. Most of us? What's the minimum recommendable-entity/category-or-user threshold that this makes sense for? Is success with these sorts of engines merely a reflection of poor UI design in your normal UX? (Of the above examples, the first seems very unidimensional - in that it's basically a simple graph distance - and the latter also rather rudimentary and often irrelevant).

So what exactly is this thing providing? Graph analysis? I think not. It reads more like some kind of raw timestamped user behavioural event data processing to infer relationships between users or products they interact with. Reading through the docs it seems this is a layer on top of Apache Pig (https://pig.apache.org/) - a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. I think clarity in explaining this thing could be improved, particularly selling clearly what a recommendation is and when it's useful. Using phrases like "award winning" doesn't help.

PS. Why all the downvotes? Sheesh.
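
As a rough illustration of the "process behavioural events to infer relationships between products" idea described above, here is a small co-occurrence sketch. It is not Mortar's code and does not use Pig; the event tuples, item names, and the functions `cooccurrence` and `also_liked` are assumptions made up for the example. It counts how many users interacted with each pair of items and uses those counts as an "others also viewed" signal.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical event log: (timestamp, user, item).
# The timestamp is carried along but ignored in this simple sketch.
events = [
    (1, "ann", "book_a"),
    (2, "ann", "book_b"),
    (3, "bob", "book_a"),
    (4, "bob", "book_b"),
    (5, "bob", "book_c"),
    (6, "cat", "book_a"),
    (7, "cat", "book_c"),
    (8, "dan", "book_b"),
]

def cooccurrence(events):
    """Count, for every pair of items, how many users touched both."""
    items_by_user = defaultdict(set)
    for _timestamp, user, item in events:
        items_by_user[user].add(item)

    counts = defaultdict(lambda: defaultdict(int))
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts

def also_liked(item, counts, k=2):
    """Items most often seen alongside `item`, ties broken by name."""
    related = counts.get(item, {})
    return sorted(related, key=lambda other: (-related[other], other))[:k]

counts = cooccurrence(events)
print(also_liked("book_b", counts))
```

A production system would typically weight events by type and recency and discount very popular items; none of that is attempted here.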

icambron 12 years ago | |

I suspect you're being downvoted for having a dismissive tone in the same breath as you admit to not understanding the problem space. My guess is that the marketing copy on the site isn't targeted towards you, so it shouldn't surprise you that you don't understand it, whereas their target customers know all sorts of things about, say, Pig. That's fine, but then your comment should read like, "Can someone explain to me what this is for?" Instead, your comment is dripping with condescending snark, lecturing someone or another about how this thing you don't understand probably isn't useful, and incredulous that you don't understand something on the internet.

Imagine opening an advanced textbook on a subject you don't understand, reading two paragraphs of it, and throwing up your hands in disgust because what does this even mean?

contingencies 12 years ago | | |

Apologies, condescending snark was not the intent (I can't even see where you see that, actually!). In response to your more salient points, it could be argued that a web+EC2 layer on top of existing software is hardly an advanced textbook. Likewise, their announcement's stated intent is to gain customers, so feedback on what's unclear should be well within an acceptable scope of discussion. Finally, I doubt any of us are excluded from their intended market as software people move frequently between problem domains.

alexhawdon 12 years ago | |

https://www.coursera.org/course/recsys will provide you with a good introduction to the topic

contingencies 12 years ago | | |

Thanks. For others who are interested, that course apparently uses a different piece of software called LensKit http://lenskit.grouplens.org/

Could anyone summarize the difference between Pig and LensKit when applied to recommendation systems?

stonemetal 12 years ago | |

> benefit seems limited. Social stuff, sure. Most of us?

Any business where you have a large catalog that users are going to want to filter through. This gives you the ability to offer a shortcut to things they might find interesting. Other examples would be Netflix, Spotify, app stores, or Coursera.

contingencies 12 years ago | | |

Basically those are all media discovery applications. (PS. I can't think of a single app store experience worth replicating...)

olidb2 12 years ago |

FWIW we've been using the Mortar platform to run large Pig jobs without a fuss at http://datadog.com and we've been very happy with it. Glad to see them contribute their recommender code too.

alecsmart1 12 years ago | |

Can you please suggest why you need a recommendation engine for datadog?

olidb2 12 years ago | | |

We don't use the recommendation engine but the underlying platform, which makes it really simple to write and run Pig jobs. Though the majority of our business deals with real-time data processing, the ability to crunch numbers in batch without dev or ops overhead is attractive and well worth the price to us.

pixelmade 12 years ago |

I'm curious what the business case was for open sourcing the code. Maybe to create an ecosystem?

lotsofcows 12 years ago | |

"We’re giving over a year’s worth of work on our recommendation engine away because we want to earn your business on our platform."

showerst 12 years ago | |

From the "What you'll need" section of the first tutorial -

A Mortar account. You can sign up for a free Public account with Mortar here. If you want to keep your customized recommendation engine code private, you will need a Solo-level account ($99/month). Beyond that, you'll only pay for your actual usage of AWS cloud services (we never add an upcharge).

Kudos for the open source, but it looks like to actually use this for business you'll still need to pay. Unless I'm misreading it, "Open source but you'll still have to go through our platform" is pretty disingenuous.

ethanbond 12 years ago | | |

It reads like "open source but not free to make proprietary." First, it's awesome just to see source as something to learn from. Second, it seems reasonable they don't want people forking, modifying, then profiting from their work without contributing back to it - either by also releasing source or by paying.

I think it's a nice model actually.

gmisra 12 years ago | | |

The code is all released under the Apache 2.0 license, so calling such an action "disingenuous" is itself disingenuous (imo).

dsheth 12 years ago |

Anyone know of any comparisons between this and Apache Mahout? I've used Mahout's Item-Item recommender in the past, and it's worked well, just wondering if there were advantages to this recommender.
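
For anyone unfamiliar with the item-item approach mentioned here, a minimal sketch of the general idea follows. It is plain Python, not Mahout's API or Mortar's implementation; the ratings, user names, and function names are invented for the example. Each unseen item is scored by a similarity-weighted average of the user's existing ratings, with cosine similarity computed between item rating vectors.

```python
import math

# Hypothetical explicit ratings: user -> {item: rating on a 1..5 scale}.
ratings = {
    "ann": {"a": 5, "b": 1},
    "bob": {"a": 5, "b": 1, "c": 5, "d": 1},
    "cat": {"a": 1, "b": 5, "c": 1, "d": 5},
}

def item_vector(item, ratings):
    """Ratings for one item, keyed by the users who rated it."""
    return {user: r[item] for user, r in ratings.items() if item in r}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def recommend(user, ratings, k=1):
    """Rank unseen items by a similarity-weighted average of the user's ratings."""
    seen = ratings[user]
    all_items = {item for r in ratings.values() for item in r}
    scores = {}
    for candidate in all_items - set(seen):
        cand_vec = item_vector(candidate, ratings)
        sims = {i: cosine(cand_vec, item_vector(i, ratings)) for i in seen}
        total = sum(sims.values())
        if total > 0:
            scores[candidate] = sum(sims[i] * seen[i] for i in seen) / total
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]

print(recommend("ann", ratings, k=2))
```

Here "c" ranks above "d" for "ann" because "c" is rated much like "a", which she rated highly, while "d" tracks "b", which she rated poorly.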

ASquare 12 years ago |

I'm sure plenty of good karma (even the non-HN kind) is headed your way - kudos.

X4 12 years ago |

WOW, Awesome Documentation and Product!! Kudos and Greetings from Germany 😊

Those who know what Hadoop, Pig, and the whole "Data Science Stack" are will surely find this useful.