Ask HN: Things You Wish You Knew Before Getting into Machine Learning Especially for those who switched careers to become a machine learning practitioner, data scientist, data engineer vs.
1. That software engineering skills are way more important than ML skills.
2. That you'd be spending more time on making presentations than doing ML (and it makes sense, it's very important to present statistics properly).
3. That most problems don't need good ML models. Something cheap and easy is often good enough. What does need to be good is the data pipelines around them (see 1.)
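To illustrate point 3, here is a minimal sketch of what "cheap and easy" can look like: a majority-class baseline and a one-line heuristic, scored before any real model is built. The toy spam data and the names `majority_baseline`, `accuracy`, and `link_rule` are made up for illustration, not taken from any real project.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Return a predictor that always outputs the most common training label."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    return lambda _features: most_common_label


def accuracy(predict, rows, labels):
    """Fraction of rows where the prediction matches the label."""
    hits = sum(predict(row) == label for row, label in zip(rows, labels))
    return hits / len(labels)


# Hypothetical toy data: (message length, contains link) -> spam flag.
train_rows = [(120, 1), (15, 0), (300, 1), (22, 0), (18, 0), (250, 1)]
train_labels = [1, 0, 1, 0, 0, 0]
test_rows = [(200, 1), (10, 0), (30, 0), (280, 1)]
test_labels = [1, 0, 0, 1]

baseline = majority_baseline(train_labels)


def link_rule(row):
    """One-line heuristic: flag anything that contains a link."""
    return row[1]


baseline_acc = accuracy(baseline, test_rows, test_labels)
rule_acc = accuracy(link_rule, test_rows, test_labels)
print(f"majority baseline: {baseline_acc:.2f}, link rule: {rule_acc:.2f}")
```

If a rule like this already clears the bar the product needs, the remaining effort is probably better spent on the pipeline feeding it than on a fancier model.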
In my case, I learned enough ML to feel "senior" compared to other people in the company and online in less than a year. The same path to Senior SWE took me much longer (a much larger mandatory knowledge base, probably because ML is a young field). So I'd say ML is definitely easier.
Also there's a great Coursera course on ML for Kaggle: https://www.coursera.org/learn/competitive-data-science
I think once you finish it, you're better than 60% of Silicon Valley data scientists, no kidding.
When presented with a new problem, you have to build the data infrastructure before you can do any learning.
But if you are on a team maintaining a project over the long term, you amortise this cost a bit. You will still see big impacts from improving your data, but you will also see big gains from modeling changes, though often that will just mean plugging in a different box.
But I actually think this is a good thing if your goal is to build applications. These methods are hot right now because they are very good at things that are hard for us to program, so they let us build better systems.
One other thing I'll mention is that GDPR has made a lot of inane things pretty painful, largely due to overly conservative lawyers. This is probably mostly an issue for large consumer tech companies.
I'm serious about this. Ultimately the job is just software development plus statistics.
If you are a software developer, work on your statistics.
If you're a statistician, learn to program.
Most people will have gaps in both of these sub-fields.
Do not, under any circumstances, take any online courses that include the phrases "data science" or "machine learning" in the title.
In industry, you also need to balance the amount of time and effort it takes to build your model against the incremental benefit.
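A back-of-the-envelope version of that trade-off, as a rough sketch only: every number here is invented, and `net_benefit` is a hypothetical helper that assumes you can put a rough money value on one point of metric improvement.

```python
def net_benefit(baseline_score, candidate_score, value_per_point, build_cost):
    """Estimated value of the metric improvement minus the cost of building it."""
    gain = (candidate_score - baseline_score) * value_per_point
    return gain - build_cost


# Scores are accuracy in whole percentage points; all figures are made up.
quick_win = net_benefit(80, 85, value_per_point=1_000, build_cost=2_000)
long_project = net_benefit(85, 86, value_per_point=1_000, build_cost=20_000)

print(quick_win)     # positive: the cheap improvement pays for itself
print(long_project)  # negative: the last point costs more than it returns
```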