Data Science Challenges at Instacart(tech.instacart.com) |
Listing the problems Instacart has that can be solved through applied statistics is one thing. The other, more important thing is how exactly data science works to solve those problems, in technical detail, as opposed to data science being some mystical, unexplained power. (Especially since the intended audience for this post is data scientists Instacart wants to hire, who are presumably already knowledgeable in data science.)
Apparently SF proper is the only place she works that has a guaranteed wage; in all the surrounding areas you are at the mercy of what people decide to tip. It sounds outrageous: all it takes is one cheap idiot to completely ruin your shift and make it not worth working there.
To be clear, most of the blame falls on Instacart. But your friend should find a new job. A tip-only contractor job is as bad as a 100% commission telemarketer job (maybe worse).
I was just shocked, SHOCKED how little she was actually making vs how much they told her she was making when she took the job. She thought she was making $25 an hour, but after I helped her take a hard look at the math, she was really only making $5-$7 an hour, if that.
Tips should be at the customer's discretion.
I want to believe this is something Instacart employees could easily clear up by disseminating some information on the subject rather than hoping that Instacart will handle it.
The real problem here is Instacart's model. But is it really a problem if your friend still works for them? Why does he/she do so? Maybe time to quit?
imo it seems like these gig companies really benefit from exploiting underpaid workers who don't know any better and aren't connected/smart enough/empowered to fight back. She really believed that she was going to bring in $25 an hour like they told her she would be, didn't realize she had to account for taxes, calculating all the expenses etc.
I really drilled her on this because at first I couldn't believe it, and she's not the most savvy person, so I wanted to make sure she realized that after gas expenses, wear and tear on her car and the additional 7.65% in FICA that she's going to have to pay (the half normally covered by employers), she's really only making about $5-$7 an hour, if that. Also the job really eats up all the data on her phone because she has to be on her phone for Instacart to scan and check off every grocery item she puts in the basket, then use their GPS system for the deliveries, and it's not like they reimburse her for any of that. Instacart sold her on the job promising $20-$25 an hour and she's absolutely not been making anywhere NEAR that.
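For anyone who wants to run the same math on their own shifts, here is a rough sketch of the calculation. Only the 7.65% rate comes from the comment above; the function name, the parameters and the per-mile cost model are assumptions for illustration, and applying the 7.65% to earnings after expenses is a simplification, not tax advice.

```python
# Rough sketch of the "hard look at the math" described above.
# Only the 7.65% figure comes from the thread; everything else
# (names, per-mile cost model) is a hypothetical simplification.

EMPLOYER_FICA_RATE = 0.0765  # the half normally covered by an employer


def effective_hourly(gross_pay, hours, miles, cost_per_mile, phone_data_cost=0.0):
    """Return take-home pay per hour after vehicle, phone and extra FICA costs."""
    # Gas plus wear and tear, folded into one per-mile figure (assumption).
    expenses = miles * cost_per_mile + phone_data_cost
    net_before_tax = gross_pay - expenses
    # Contractors pay the employer half of FICA themselves (simplified here
    # as a flat rate on whatever is left after expenses).
    extra_fica = max(net_before_tax, 0.0) * EMPLOYER_FICA_RATE
    return (net_before_tax - extra_fica) / hours
```

Plug in a shift's gross pay, hours worked and miles driven to compare the result against the advertised hourly rate.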
By way of comparison, she makes a guaranteed $20 an hour under the table walking dogs, so I think she is going to go back to doing that.
This article seems to verify:
> Instacart declined to confirm whether it offers base pay, and some Instacart workers told HuffPost they are not offered an hourly guaranteed wage.
> “It’s a really strange job, and there are many weeks where you’re just sitting in the car waiting for orders and hoping something comes in, not being paid to be there,” one of Instacart’s personal shoppers
http://www.huffingtonpost.com/2015/02/02/instacart-workers_n...
[1] www.gastrograph.com
[2] https://gastrograph.com/blogs/gastronexus/interviewing-data-...
Then again, I'm not looking for an internship.
Did you see the work we linked to? That's intern work here - we treat our interns as full members of the team, and they've delivered.
I have no idea what this could mean. Either you're getting algorithm suggestions from your shoppers and customers, or (more likely) "data science" means "user interface."
As far as I can tell what these people do every day is called "business" or maybe "logistics"
Which isn't a slight against them or the field in the least, it's just a debate about definitions.
They're not scientists, they're engineers perhaps, or business analysts.
In this article, I (our VP Data Science) highlight some challenges the data science team is tackling at Instacart ranging from logistics to personalization. I also go into detail on how we have organized data science to have maximal impact and what we look for when recruiting data scientists.
1 - I've been doing ad optimization / user classification / propensity scoring / product recommendations, warned them that I took a couple stochastic processes classes but haven't used them for a decade, and that I was entirely unsuited for OR type problems. They said that was ok and they were hiring for things I was suited for. Great. My in-person interview was primarily an OR problem best solved with stochastic processes.
2 - they were very responsive at first but after the interview, went radio silent for a week. After promising a response in a day. This was particularly annoying since I told them I had a written offer that I was pushing off for them. My guess is they were waiting to see if another candidate would accept. Which is fine, but the recruiter should have been honest with me. They ignored me for 5 business days after the interview -- 4 after their promised response -- before finally telling me no thanks. I'm not grumpy about being told no -- that's definitely happened before -- but about their crappy behavior. Fortunately I'd already accepted the other offer after reading between the lines, but still, the experience left me grumpy.
I debated posting this for a while, but bluntly, I kind of felt like they wasted my time and was really not happy their internal recruiter blew me off after repeated promises otherwise. I'm sure they'll be along to say your experience will be different (and it may well be!) but here's a data point for your consideration. I'm just sharing my experience.
ugh, I hate that. I was jerked around by an a-list firm like that this fall and it was totally frustrating, especially because I turned down another offer while I was waiting to hear back. Really screwed me over and left me very bitter/annoyed. There's no respect and everyone's looking out for #1.
Regarding the focus on OR, that was definitely the case for our first few years, and while it's still important, we have definitely expanded our focus beyond it.
Does this mean that as an employee, or ex-employee, I can take my owned projects with me and use them for my own purposes?
(Whether the contractor classification is reasonable is a whole other ball of wax)
That's an interesting definition.
It doesn't depend on where the person works (though sure that's a relevant sign) it matters what they're doing.
Analyzing business-related data and optimizing KPIs isn't science. At best it's applied science, which we have names for, such as engineering or statistics or financial analysis.
In the sense that the money has to come from somewhere, sure. But wages are paid by employers, and it's shitty to underpay your employees under the premise that the customer will make up the difference in tips. If a single customer failing to tip $20 pushes the worker under minimum wage for the day, then the system is broken and the worker is getting screwed, as usual. And make no mistake, this is how it was designed to work - if they actually cared about their workers, they would charge a reasonable delivery fee to the customer and give all of it to the delivery person.
However, no one should be under the delusion that the consumer is not responsible for those costs.
(Ordinarily this would be off topic, but....)
Certainly a business should follow the rules of its market, but I think that we agree that the expectation of a consumer is that the price they pay covers all expenses + profit.
Do you snapshot the computed models as RData and stream them to S3, etc.?
1. For batch processes that run daily, hourly or minutely, where the models are rebuilt on every run, and outputs (often predictions) are written to a database
2. For computation of coefficients in large sparse regularized models, where the coefficients are written to a database and scoring is done in another language in real-time
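To make the second pattern concrete, here is a minimal sketch of the scoring side, assuming a logistic model. This is not Instacart's actual setup: the table name `model_coefficients`, its columns, the `(Intercept)` row and the in-memory SQLite database are all stand-ins for whatever the batch job really writes to.

```python
import math
import sqlite3


def load_coefficients(conn, model_version):
    """Read one model version's coefficients into a {feature: weight} dict."""
    rows = conn.execute(
        "SELECT feature, coefficient FROM model_coefficients "
        "WHERE model_version = ?",
        (model_version,),
    ).fetchall()
    return dict(rows)


def score(coefs, features):
    """Logistic score for a sparse feature dict; unknown features count as 0."""
    z = coefs.get("(Intercept)", 0.0)
    for name, value in features.items():
        z += coefs.get(name, 0.0) * value
    return 1.0 / (1.0 + math.exp(-z))


def demo_connection():
    """Build a toy coefficients table (hypothetical schema and values)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE model_coefficients "
        "(model_version TEXT, feature TEXT, coefficient REAL)"
    )
    conn.executemany(
        "INSERT INTO model_coefficients VALUES (?, ?, ?)",
        [("v1", "(Intercept)", -1.0), ("v1", "x1", 2.0), ("v1", "x2", -0.5)],
    )
    return conn
```

Keying rows by a version column is one way to let the batch job publish new coefficients while the scoring service keeps reading a consistent set.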
For situations where we want real-time predictions, recommendations or optimizations, we tend to set up Python services instead. For batch processes, you can definitely store models in S3 to re-use them, and I've done that at other companies. But in general I've found it better to rebuild models frequently and cache them for short periods of time only if they are cost-prohibitive to rebuild.
Also about scoring in another language - is this really worthwhile for you? I have often debated just throwing 128GB of RAM on an R machine and calling it a day. As I figure, your "real time" requirements are probably seconds or even minutes (similar to mine).
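The "rebuild often, cache briefly" idea could look something like the sketch below. The class name, `build_fn` and `ttl_seconds` are hypothetical names for illustration, not anything described in the thread; the clock is injectable only so the behavior is easy to test.

```python
import time


class ShortLivedModelCache:
    """Hold an expensive-to-build model for a short time, then rebuild it."""

    def __init__(self, build_fn, ttl_seconds, clock=time.monotonic):
        self._build_fn = build_fn    # callable that trains/returns the model
        self._ttl = ttl_seconds      # how long a built model stays valid
        self._clock = clock          # injectable time source
        self._model = None
        self._built_at = None

    def get(self):
        """Return the cached model, rebuilding it if it is missing or stale."""
        now = self._clock()
        if self._built_at is None or now - self._built_at >= self._ttl:
            self._model = self._build_fn()
            self._built_at = now
        return self._model
```

A cheap model would skip the cache entirely and just call its build function on every run.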
In general, R is not well-suited for DB-backed websites in real-time, but you can certainly use the outputs in production.
You can do it, but I'm not sure it's worth the effort. You could probably provide a predict() interface in real-time if it was reasonably quick.
I really wish pandas had a "save workspace" feature - R does that very well. No point in saving to a DB if you're going to need the data set in memory anyway... Or use Hadoop.
Changing variables frequently can be versioned in the feature and model coefficients tables, but takes care.
I haven't used Postgres JSONB, but if you have problems with JSON in R check out the tidyjson package (I wrote it when dealing with Mongo data previously).
Scoring in another language is best avoided if you can. But supporting R "real time" services will also come with many complications. Hence, we use Python when we really want that.
SparkR was completely unreliable when I first tested it over a year ago, but may have improved. Though the Spark Python API has some limitations compared to Scala, so I would guess the latest SparkR is even further behind, but we haven't tested it. Long term I'd love for that to be the answer to these questions.