Probabilistic Machine Learning: An Introduction(probml.github.io) |
From the preface:
"By Spring 2020, my draft of the second edition had swollen to about 1600 pages, and I was still not done. At this point, 3 major events happened. First, the COVID-19 pandemic struck, so I decided to “pivot” so I could spend most of my time on COVID-19 modeling. Second, MIT Press told me they could not publish a 1600 page book, and that I would need to split it into two volumes. Third, I decided to recruit several colleagues to help me finish the last ∼ 15% of “missing content”. (See acknowledgements below.)
The result is two new books, “Probabilistic Machine Learning: An Introduction”, which you are currently reading, and “Probabilistic Machine Learning: Advanced Topics”, which is the sequel to this book [Mur22].
Together these two books attempt to present a fairly broad coverage of the field of ML c. 2020, using the same unifying lens of probabilistic modeling and Bayesian decision theory that I used in the first book. Most of the content from the first book has been reused, but it is now split fairly evenly between the two new books. In addition, each book has lots of new material, covering some topics from deep learning, but also advances in other parts of the field, such as generative models, variational inference and reinforcement learning. To make the book more self-contained and useful for students, I have also added some more background content, on topics such as optimization and linear algebra, that was omitted from the first book due to lack of space.
Another major change is that nearly all of the software now uses Python instead of Matlab."
Because no open source toolkit can do what Matlab can do.
The same is true of a lot of high end software: Photoshop, pretty much any serious parametric CAD modeling system (say, SolidWorks), DaVinci Resolve, Ableton Live, etc. When a professional costs $100K+ to employ, paying a few grand to make them vastly more productive is a no brainer. If open source truly offered a replacement, then these costly programs would die. But there just isn't anything close for most work.
Matlab is used for massive amounts of precise numerical engineering design, modeling, and running systems. So while Python is good for some tasks, for the places Matlab shines Python is nowhere near usable. And before Python catches up in this space, I'd expect Julia to get there first.
This really sets you up to realize that there is (and should be) a lot more to doing a good job in machine learning than simply minimizing an objective function. The answers you get depend on the model you create as do the questions you can hope to answer.
I don't see a clear list of differences between this new edition and the first book. Does anyone know what's new?
I topped statistics at the most prestigious university in my country at both the undergrad and postgrad level, and had no problem discussing advanced concepts with senior PhDs in quantitative fields, and I thank this book the most for beginning my journey on this. But, and this is important, make sure to do all the exercises!
https://www.amazon.com/John-Freunds-Mathematical-Statistics-...
Is it your favourite book because of how much your personal history is tied to it, and the time you devoted to it, or are you comparing it against other books based on an analytic review and comparison of several books that you did at some point?
Nothing wrong with the former case, I also have favourites that I recommend, but if it’s the latter the recommendation is more helpful; in that case it would be awesome to detail why this one over others.
This idea of compared review is useful here:
And here:
https://www.lesswrong.com/posts/xg3hXCYQPJkwHyik2/the-best-t...
The book for me was partly great because of its contents and partly because I worked through every problem and realized how much it taught me. I should do a more factual write up on why it’s a great book, I’ll try to when I get some time.
https://camdavidsonpilon.github.io/Probabilistic-Programming...
Would love to get my hands on the draft for "Probabilistic Machine Learning: Advanced Topics".
To say something is "machine learning", I think means that you should show the code not just equations and derivations.
I mean if you only show math and derivations, what's the point? To show off what you know? How is that helpful?
I’ve used this stuff, and more often, the ideas taught, to break down a problem into a tackle-able set of pieces more times than I can count.
Never underestimate the fundamentals. Too many of my colleagues use models without actually understanding any of it. I’ve debugged so many problems by looking at the technical details in original papers and textbooks.
This is really useful information because it can help you identify what information is truly relevant for the estimation of certain parameters (so sufficient statistics) or help you crystallize your understanding of the implications of the model you’ve created. In other words, it helps show you the ways in which your model says different aspects of your data should influence others.
This creates testable implications of the model. If your model says that two variables should be conditionally independent given a third, but they’re not, you have an avenue for refinement. You can also clearly identify your assumptions or the implications of your assumptions.
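To make the conditional independence check concrete, here is a minimal sketch. It assumes a toy linear-Gaussian model in which a common cause z drives both x and y, and it uses partial correlation as the test. The helper name partial_corr, the coefficients, and the sample size are all made up for illustration, and partial correlation only detects linear dependence.

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation between x and y after linearly regressing z out of both."""
    design = np.column_stack([np.ones_like(z), z])
    resid_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    resid_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(resid_x, resid_y)[0, 1])

# Hypothetical data: x and y depend on each other only through z.
rng = np.random.default_rng(0)
n = 20_000
z = rng.normal(size=n)
x = 2.0 * z + rng.normal(size=n)
y = -1.0 * z + rng.normal(size=n)

marginal = float(np.corrcoef(x, y)[0, 1])   # strongly negative
conditional = partial_corr(x, y, z)         # close to zero

print(f"corr(x, y)     = {marginal:.3f}")
print(f"corr(x, y | z) = {conditional:.3f}")
```

If the conditional value came out far from zero on real data, that would be the avenue for refinement: the model's independence assumption does not hold.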
Another great thing about them is that exact inference for certain (most) structures is known to be computationally infeasible. There are a lot of different inference schemes available that can help you with different approximations with various drawbacks/advantages, heuristics that sort of work, or even ways of drawing samples from the true distribution if you can identify the structures. See belief propagation, loopy belief propagation, sequential Monte Carlo, and Markov chain Monte Carlo methods.
On top of this it helps you see everything in a general framework. Lots of the fundamental pieces of ML models are really just slight tweaks to other things. For instance, SVMs are linear models on kernel spaces with a specific structural prior. Same with splines; it’s just a different basis function. All of this helps you see the pieces of different methods that are actually identical. This helps you make connections and learn more effectively, in my opinion.
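As a rough illustration of the "it's just a different basis function" point, the sketch below fits the same ordinary least-squares model twice and swaps only the feature map. The function names (poly_basis, linear_spline_basis, fit_predict), the knot placement, and the toy targets are assumptions chosen for the example, not anything from the book.

```python
import numpy as np

def poly_basis(x, degree=3):
    """Columns 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)

def linear_spline_basis(x, knots):
    """Columns 1, x, and one hinge max(0, x - k) per knot."""
    hinge = np.maximum(0.0, x[:, None] - knots[None, :])
    return np.column_stack([np.ones_like(x), x, hinge])

def fit_predict(basis, x_train, y_train, x_test):
    """Least-squares fit in feature space; the model is linear in the weights."""
    w, *_ = np.linalg.lstsq(basis(x_train), y_train, rcond=None)
    return basis(x_test) @ w

x = np.linspace(-1.0, 1.0, 50)
x_new = np.array([-0.5, 0.25, 0.9])
knots = np.array([0.0])

# Same fitting code, different basis: |x| is exactly a linear spline with a knot at 0,
# and x^3 is exactly a cubic polynomial.
spline_pred = fit_predict(lambda t: linear_spline_basis(t, knots), x, np.abs(x), x_new)
poly_pred = fit_predict(poly_basis, x, x**3, x_new)
```

Nothing in fit_predict knows which basis it was given, which is the sense in which the two methods share their core.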
Broadcasting in Python is much cleaner than the "bsxfun(@plus, ...)" abomination in Matlab. If you think all the "np." is too wordy then just do "from numpy import *". For matrix multiplication you can use "@". NumPy code can be dense but most people choose clarity over brevity.
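For anyone who hasn't seen it, a small sketch of what broadcasting and "@" look like in NumPy. The array names and values are arbitrary.

```python
import numpy as np

A = np.arange(6.0).reshape(2, 3)      # shape (2, 3)
row = np.array([10.0, 20.0, 30.0])    # shape (3,)
col = np.array([[1.0], [2.0]])        # shape (2, 1)

shifted_rows = A + row   # row is added to every row of A
shifted_cols = A + col   # col is added to every column of A
gram = A @ A.T           # matrix product: (2, 3) @ (3, 2) -> (2, 2)
```

The shapes are aligned from the trailing dimension, and any dimension of size 1 (or a missing one) is stretched to match, so no explicit replication call is needed.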
The book, Machine Learning: A Probabilistic Perspective by Kevin Murphy (the original book everyone in this thread is talking about) is probably the closest thing I can think of. Its goal is to frame everything around graphical models and probability. It's quite a tome. Still, despite its breadth, it can't possibly cover everything.
I'm going to have to rethink everything now, since it worked and was quite quick (I didn't even sample using MCMC, just brute-force pulled permutations), so it was clearly not a Bayesian approach, and I am very, very far from being one of the top 20 (or 200, or 2000 or 20000, maybe 200000?) researchers...
To choose just one example, the analysis of the new UK COVID variant relies on Bayesian modelling, both for the government analysis and the Imperial paper. (https://www.imperial.ac.uk/media/imperial-college/medicine/m...)
[1] - https://turing.ml/dev/ [2] - https://arxiv.org/abs/2002.02702
Basically, it was mainly inertia: older professors who liked it and rarely used anything else, and the fact that generally no one gets rewarded for actually rewriting parts of an existing, functioning course.
As an instructor you basically create more work for yourself in the first time you migrate a course's programming language. (And you also annoy some senior staff when forcing them to learn new things)
Proprietary features don't matter here like there. We get MathWorks employees here at least a couple times a year hawking their latest (paid) libraries, but at this point they're always something 5+ years too late, something that already exists in preferred languages--often for free.
Since our clients never deploy Matlab, it doesn't matter if their libraries are fractionally faster in any case besides mockup/experimentation in R&D, and for that I've never met anyone who chooses it for speed there. Plus in this day where even laptops are fast and cloud instances spun up in a few seconds, there's no point. It's also nicer for the dev to complain about not having enough ram to get a better machine than take the time to learn a new language for a specific use case. Likewise the project manager will prefer the quicker solution, buying.
The one item close to a "tie" with Python here is probably migration. Matlab always and Python most of the time get rewritten into something else, Java in my department.
If you're more productive in Matlab, that's fine. But if you're at a loss without it, that's not.
It doesn't belong in the education system or in educational books.
If your job will use tool X, learning it well has value. Those not learning it will be at a disadvantage.
Again, no open source software can do what Matlab can. Why ignore this?
Can you list (or point to a list of) some of MatLab's features that are absent from other software?
It is also not true today that not knowing Matlab harms your industry productivity in ML. This might have been true around a decade ago, but most teams outside academia also have moved to non-Matlab resources. And if anything, this has been further reinforced by Deep Learning libraries, the current crop of MLOps tools and cloud-based frameworks.
Matlab might be good for specific areas, but ML has not been a stronghold for a while. It is also important to remember that in the context of numerical accuracy or computation speed, Python is almost always the user-facing layer. You might (correctly) argue that the Python language is slower/faster than X, but this is not a useful metric for comparing libraries and frameworks, where the compute heavy code is probably in C/C++: numpy, tensorflow, pytorch are good examples of this.
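A quick sketch of the "Python is the user-facing layer" point: the same dot product written as an interpreted loop and as a single NumPy call that hands the work to compiled code. The function name and array sizes are arbitrary, and timings are left for you to measure (for example with timeit) since they depend on the machine.

```python
import numpy as np

def dot_pure_python(xs, ys):
    """Dot product computed element by element in the interpreter."""
    total = 0.0
    for a, b in zip(xs, ys):
        total += a * b
    return total

rng = np.random.default_rng(0)
x = rng.random(100_000)
y = rng.random(100_000)

slow = dot_pure_python(x.tolist(), y.tolist())  # interpreted loop
fast = float(x @ y)                             # one call into compiled code
```

Both give the same answer up to floating-point rounding; only where the loop runs differs.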
And here is by far the biggest issue with open source: the numerical accuracy of lots of it is crap. Matlab (and Mathematica, etc.) have employed professional numerical analysts to create numerically stable, robust algorithms, and have had decades of refinement to weed out bugs (Matlab started in the 1970s, under academic numerical analyst Cleve Moler). It's the difference between using BLAS and writing your own linear algebra package - one is likely far more robust.
Sure, some numerical open source packages are decent, and a few are excellent (BLAS and related). But when you need to glue some together, you end up far too often with stuff that's just flakey for production work.
If you've ever coded the quadratic formula as written in high school textbooks and not known all the mess you just made, then you are what most open source developers are. Taking almost any formula from a paper and just typing it in is surely the wrong way to do it numerically, but this is what open source does. A robust engineering platform should have every such formula analyzed for the proper form(s) for implementation to maintain numerical robustness, and it should also avoid allowing users easy ways to do stuff that is not robust. This is the biggest difference between tools like Matlab and Mathematica versus open source projects.
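For readers who haven't hit the quadratic formula problem, here is a minimal sketch of it. It assumes real roots, a != 0, and b and c not both zero; the function names and the test coefficients are chosen for illustration. The textbook form subtracts two nearly equal numbers when b*b is much larger than 4*a*c, while the rearranged form avoids that subtraction and gets the second root from the product of roots c/a.

```python
import math

def roots_naive(a, b, c):
    """Textbook formula: suffers cancellation when b*b >> 4*a*c."""
    d = math.sqrt(b * b - 4.0 * a * c)
    return (-b + d) / (2.0 * a), (-b - d) / (2.0 * a)

def roots_stable(a, b, c):
    """Add b and sqrt(disc) with matching signs, then use x1 * x2 = c / a."""
    d = math.sqrt(b * b - 4.0 * a * c)
    q = -0.5 * (b + math.copysign(d, b))
    return q / a, c / q

# x^2 + 1e8 x + 1 = 0 has roots near -1e8 and -1e-8.
a, b, c = 1.0, 1e8, 1.0
naive_small = roots_naive(a, b, c)[0]
stable_small = roots_stable(a, b, c)[1]

print("naive small root: ", naive_small)
print("stable small root:", stable_small)
```

In double precision the naive small root is off in its leading digits, while the stable one is accurate to roughly machine precision.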
And, like the time spent fiddling with getting open source to work, as soon as you have one engineering task or design fail due to numerical problems, it would have been vastly cheaper to simply use the better tool - Matlab.
Sure, most people don't use it very much, and rarely run into such problems. People using it for serious work in engineering toolchains or production systems cannot afford the instability of open source.
And those reasons are why things like Matlab still exist, have incredible revenue, and are growing in use.
For example, want to do some work in python? Well, soon you need numpy. Then you might want pytorch - but crap, it's numpy-ish, but not numpy. So you learn some more nuances on getting the two to play nicely, to get consistent error messages... Then you need some visualization - again, another package (with a host of dependencies), with different conventions, syntax, uses, and god forbid these packages get a little out of sync between releases - then you get to spend a day chasing that down. Now you want some optimization stuff - pull in scikit, but it's not quite consistent with the other libs... so you spend more time making glue functions between the stuff you want to build. Next you need some finite element analysis stuff - oops, pretty much dead compared to the massive amount of toolkits already in Matlab.
Take a moment and look through the list(s) of functions and toolkits standard in matlab [1]. For an incredible amount of engineering work, what you need is there - you spend less time trying to build enough pieces to start to work and you instead get working on the parts you want.
There's a reason Python's matplotlib borrowed a lot of ideas from Matlab - they're quite useful.
[1] https://www.mathworks.com/help/referencelist.html?type=funct...
I'm a licensed professional, and in my experience it takes 1-2 hours to set up a conda virtualenv with all the packages I need. Whereas if I want Matlab, it takes about a week to talk through the budgeting and licensing options with my employer, find the right number of seats to purchase (other departments might decide to get in on the purchase, so we need to consult broadly), choose which toolboxes we'll pay for, go back and forth on the quotes and POs, and make sure all the licensing really works.
But your mileage may vary.
Yes, there are problems where Python is an easy solution. And many where it is not. And some where it cannot solve the problem without extreme effort.
Having been in dev a long time: this is the simplest, naive, everything-works best-case path. If this were how setting up Python worked for everyone, there would not be an incredible number of forum posts, GitHub issues, and setup help requests easily found on the internet. If you've not had to change underlying code in some Python package, or even worse recompile underlying C libraries, then you have not faced the kinds of problems many (me included) have.
Ever solve a problem like the one I listed? That is not a simple conda install (and I use conda stuff vastly more than matlab/mathematica, so I'm pretty aware of its use and features). Many problems I can solve in Mathematica (my preferred tool for certain work) cannot be approached by Python at all (or any open source tools I am aware of, and I have tried pretty much all of the things listed as MMA replacements).
>find the right number of seats to purchase (other departments might decide to get in on the purchase
So you're no longer making an apples to apples comparison - you just solved a bigger problem with the Matlab side.
Octave is also unstable, and I doubt any company needing heavy use of a tool like this in production would trust Octave to not puke. It's just simply cheaper to use the polished and vastly more feature rich tool. Download Octave, go find some decently complex matlab code on github, and try to run it. Do that a bit and see how much works as it should.
Octave lists places where they see themselves as different, some of which are core pieces that don't work the same. So if you want to replace some engineering tasks with Octave, it's going to be a mess, in the same way OpenOffice is close to MS Office, until the day you send a proposal with a deadline and it pukes because the other end used MS Word instead of an almost clone.
I've used Octave - it's decent. If you cannot afford Matlab, or your school doesn't have it, or you want to learn "matlab" to get marketable skills, then one can learn on Octave. Most serious engineering will not be done on Octave though.
[1] https://wiki.octave.org/Differences_between_Octave_and_Matla...