Math Basics for Computer Science and Machine Learning [pdf](cis.upenn.edu) |
For being more than a practitioner, like an implementer of new ML libraries or a researcher, of course you'd need to know more.
For the people who are interested in ML, the thing to remember here is that he is a Serious mathematician, and he values rigor and in-depth understanding above all. A lot of his three star homework problems were basically impossible. He writes books first and foremost so he can understand things better. In math books, there's the book you first read when you don't understand something, then the book you read when you understand everything. The book in the link is the latter.
For linear algebra, this: https://www.amazon.com/Introduction-Linear-Algebra-Gilbert-S...
https://ocw.mit.edu/courses/mathematics/18-06-linear-algebra...
What he does with chalk and a blackboard is far more effective than anything done today with Powerpoint, fancy computer animations, or what have you.
"In the following four chapters, the basic algebraic structures (groups, rings, fields, vector spaces) are reviewed, with a major emphasis on vector spaces. Basic notions of linear algebra such as vector spaces, subspaces, linear combinations, linear independence, [...], dual spaces, hyperplanes, transpose of a linear maps, are reviewed."
If anyone needs to start even earlier than this, I've actually found "3D Math Basics for Graphics and Game Development" to be a good true intro for linear algebra-related stuff. I think this would probably hold even if your primary interest is something other than graphics/game dev. Some of the text in that book's intro is a little cringey with its reliance on kind of juvenile game references, but I didn't find that sort of writing continuing during the actual text. So just push past that stuff.
I got a copy of it to act as a refresher before diving into Real-Time Collision Detection since it's been quite a long time since formal math for me (as in, high school, because I'm self-taught in CS). I've managed to make up a lot of ground by working hard and finding classes to audit online (Strang's linear alg course on OCW is a good one), but I have found that depressingly few math texts which claim to be "introductory" are actually truly introductory.
This isn't a slight against the linked work, I absolutely love when profs make resources such as this freely available.
"How to Prove It" and "Book of Proof" are also great intros to formal math, if less immediately practical.
Did you mean to write "3D Math Primer for Graphics and Game Development" [1]? If you did, I agree 100%. I got a lot out of this book and was able to put it to good use for several projects.
[1] https://www.amazon.com/Math-Primer-Graphics-Game-Development...
The really important concepts for ML are least squares, eigenvalues and eigenvectors, and SVD. Those concepts are not very relevant to game programming.
Well, least squares can be solved with projection, which is relevant for converting between coordinate spaces. But game dev isn't going to give you that intuition.
So while the book in question might not be the best resource, it probably is a better starting point than the linked doc.
Keep in mind it can take an hour, and sometimes way more, to really absorb a single page of a math book like this (do the math). This is more of a reference text.
I think it's a good time to mention a couple of nice books (related)
1. Elementary intro to math of machine learning [0]. Its style is a bit less austere than that of OP's. It also has a chapter on probability. It could possibly serve as a great prequel to the book linked in the OP.
2. The book on probability related topics of general data science: high-dimensional geometry, random walks, Markov chains, random graphs, various related algorithms etc [1]
3. Support for people who'd like to read books like the one linked in the OP, but have never seen any kind of higher math before [2]. This book has a cover that screams trashy book extremely skimpy on actual info (anyone who reads a lot of tech books knows what I am talking about), but surprisingly, it contains everything it says it does and in great detail. Not even actual math textbooks (say, Springer) are usually written with this much detail. The author likes to add bullet point style elaboration to almost every definition and theorem, which is (almost) never the case with gazillions of books usually titled "Abstract Algebra", "Real Analysis", "Complex Analysis" etc. Some such books sometimes attach words like "friendly" to their title (say, "Friendly Measure Theory For Idiots") and still do not rise to the occasion. Worse yet, a ton (if not most) of these books are exact clones of each other with different author names attached. The linked book doesn't suffer from any of these problems.
[0] Mathematics For Machine Learning by Deisenroth, Faisal, Ong
https://mml-book.github.io/book/mml-book.pdf
[1] Foundations Of Data Science By Blum, Hopcroft, Kannan
http://www.cs.cornell.edu/jeh/book%20no%20so;utions%20March%...
[2] Pure Mathematics for Beginners: A Rigorous Introduction to Logic, Set Theory, Abstract Algebra, Number Theory, Real Analysis, Topology, Complex Analysis, and Linear Algebra by Steve Warner
https://www.amazon.com/Pure-Mathematics-Beginners-Rigorous-I...
So many teachers seem incapable of stepping outside their sphere of knowledge and seeing what they know and others do not. And so much work went into this.
This IS a lot of math (1,962 pages) and it’s missing a preface/introduction which would have been helpful to understand if I need to go linear or if a la carte is okay. At the moment I’d assume each major section is independent.
Awesome find! Wonder how it’s used. (One of) the author(s) seems pretty prolific too - http://www.cis.upenn.edu/~jean/
Yeah, I wish we had an online resource (other than Wikipedia) anyone could learn any sort of math from in a systematic way... Oh well.
I'd love to know about the existing resource, if it exists. (The only thing that comes to mind is Wolfram Alpha, which didn't seem 'systematic' the last time I skimmed the main page)
Oxford's course material provides some structure to the interested user.
Just looked at a few pages and it seems really illustrative. I am just a light-weight mathematician as a computer scientist, but I really would have liked such a comprehensive script for studying. I hate it when profs reduce everything to minimal definitions and expect students to make sense of it. There are countless books but it is always a gamble that they focus on the topic at hand and don't suffer from the same problems.
This even gives you "motivational examples" which are extremely helpful for comprehension in my opinion.
That being said, this is faaaaaar beyond basics. It'd be more appropriate to call this an incomplete (aiming to be comprehensive) guide to almost everything you need to know in computer science (related to math).
I remember that I took an advanced course on Bayesian inference and one course on multivariate statistics (PCA, factor analysis, these kinds of things), and my project was about Bernstein polynomials. That's it...
Based on speaking to my managers in the past, it seems like a year-long lapse is enough for you to lose an incredible amount of retained knowledge/skill. But it's not a permanent loss.
The point is, understanding integrals and derivatives doesn't require one to memorize all the mechanical rules. Using software to compute those functions can be a huge time saver. No one should go with pen and paper double checking if that polynomial integral is correct or not!
With a book almost 2000 pages long, I wonder if this book leans more heavily on the mechanical-rules side of math. In my mind, it is the difference between writing a book so that you can write your own Wolfram Alpha, or writing a book so you can just use it.
I suspect there are better resources for each topic covered (e.g., Gilbert Strang's books and OCW lectures for linear algebra), but it is definitely interesting to peruse and get a sense of relevant topics.
It's 2000 pages long....
[Reads the first paragraph of the 2nd chapter]
Me: I don't know anything about math. At all.
Any hope of that happening?
But I would be shocked if this would be of any use for someone trying to learn a little linear algebra in order to play with neural networks. For that I think you still want Strang.
I think "foundations" might have been a better word than "basics" here. "Basics" in any case is not in the printed title, only in the filename.
[0] https://drive.google.com/file/d/1sJvLQwxMyu89t2z4Zf9tD7O7efn...
I feel like it's some kind of misguided intellectual humility. Kind of feels vaguely related to how so many Haskell packages are version "0.*".
To me, "basics" means that if you study this entire book you still won’t be able to understand ML; otherwise it would say "comprehensive". Furthermore, the math presented in this book is all taught in first-year courses in most CS programs I’ve encountered.
> Keep in mind it can take an hour, and sometimes way more, to really absorb a single page of a math book like this.
Learning is a personal experience and happens at different rates for different people. While I do agree this book is rather terse and would serve as a good reference, any added explanations around the proofs would force a split into multiple publications, so I can see why the authors chose to present it the way they did.
Overall I have found this text easy to digest and well formulated and thank the authors and poster.
From what I see this book is much more complete than a first year course, or even the whole curriculum of a classical CS education.
I'm quite familiar with math, but I never encountered wavelet theory, Gauss-Seidel method, Rayleigh-Ritz theorem, and many more. My knowledge about other subjects such as Hermitian spaces, quaternions, finite elements is quite superficial.
And I've only listed elements of Part I.
A lot of this is year 2 (and 3!) even in engineering physics.
It definitely looks more like, "math basics" for x field. Kinda like "automotive basics" for Honda or Ford vehicles. Where it's presumed that you know a lot of automotive lingo to begin with and you just need to know what spark plugs go with what engine. And not, "what is a spark plug?"
If you look at any of the later chapters that are trying to teach something new, they are much more gentle and motivate the topic of that chapter: see e.g. “24.1 Affine Spaces” on page 759, or “26.1 Why Projective Spaces?” on page 823, etc.
Other chapters that are meant as a review are similarly terse and quick to the point (like Chapter 2), e.g. Chapter 37 “Topology” on page 1287.
I think it's good when books make conscious choices about what they're teaching versus assuming as a prerequisite (and communicate it to the reader, by using terms like “reviewed” — presumably the yet-to-be-written Introduction chapter will also mention this more explicitly).
https://www.amazon.com/gp/product/1466230525 - Mathematical Notation: A Guide for Engineers and Scientists
Thank you
Then he wants to define, say, addition of real numbers. So, given two real numbers, x and y, that might be equal, he wants to define x + y.
So, here he wants to regard addition, that is, +, as an operation. Then, as is usual for defining operations, he wants an operation to be just a special case of a function. So, he wants to call + a function. So, + will be a function of two variables, say, x and y. With usual function notation we will have
+(x,y) = x + y
The set of all (x,y) is the domain of the function, and the set of all x + y is the range.
So, that defines the function + except commonly in pure math we want to be explicit about the range and domain of the function.
For function +, the domain is just the set of all pairs (x,y) with x and y in R. That set is also the set theory Cartesian product of set R with itself and written R x R. So, the domain of + is R x R. The range is just R. Then to be explicit about the range and domain of function +, we can write
+: R x R --> R
which says that + is a function with domain R x R and range R.
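For readers who think in code, the same idea can be sketched with Python type hints: addition is an ordinary function that takes a pair of reals and returns a real. This is only an illustration, not the book's notation; the name `plus` is made up here, and `float` is an assumed stand-in for R (floats only approximate the reals).

```python
from typing import Callable

# Hypothetical name: `plus` plays the role of the function "+".
# `float` stands in for the real numbers R (an approximation, not the reals).
def plus(x: float, y: float) -> float:
    """Map the pair (x, y) in R x R to the single value x + y in R."""
    return x + y

# The type of `plus` mirrors the notation  +: R x R --> R:
# two real arguments (the domain R x R) and one real result (the range R).
PlusType = Callable[[float, float], float]
add: PlusType = plus

print(plus(2.0, 3.0))  # prefix form +(x, y)
print(2.0 + 3.0)       # the usual infix form x + y
```

The two arguments are allowed to be equal, so `plus(1.5, 1.5)` is as valid a call as any other.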
We learned how to add in, what, kindergarten? So, why make this so complicated?
Well, he wants to regard the real numbers as just one example of lots of different algebraic systems, e.g., groups, fields, vector spaces, and much more, with lots of operations and, possibly, more that could be defined. E.g., later in his book he will want to add vectors and matrices, take an inner product of two vectors, and multiply two matrices.
So, back to addition on the real numbers, he wants to regard that as just a special case of an operation on an algebraic system.
IMHO there's not much benefit in making the addition of two real numbers look so complicated.
Whatever he did in that chapter for defining addition on the reals, soon he is discussing matrix multiplication with no definition at all -- assuming the reader already understands it, even though it is defined and discussed only many pages later in his book.
So, in his notation
+: R x R --> R
and matrix multiplication, he is using material before he has defined it, even before he has motivated, explained, exemplified, indicated the value of, and defined it. In good math writing and in good technical writing more generally, that practice is, in non-technical language, a bummer.
But from the table of contents, it appears that the book has quite a long list of possibly interesting narrow topics. And maybe for the routine material, his proofs and presentation are good -- maybe. I thought enough of the book to keep a copy of the PDF. It's there; if someday I want a discussion of some narrow topic, maybe I'll try his book!
In mathematical writing, it used to be common for the word processing to be much more work than the mathematics! Now with TeX and LaTeX, and I'm assuming that the book used one of these two, the flood gates are open!
Once you have spent countless hours doing exercises to the extent that you understand the math, you already remember the rules. If you have not spent countless hours doing exercises, you don't understand anything at this level.
You don't hire a programmer who has read all the books and 'understands' programming but has never programmed. It's the same with math. You don't just read a math book from start to finish. You can use wolfram alpha for visualizing functions, not for learning math.
Programmers have spent countless hours practising programming to the point where they have forgotten how difficult it was in the beginning. A non programmer might think of programming as "memorizing hundreds of rules" to get anything done, but one doesn't learn programming by sitting around explicitly memorizing hundreds of rules and then begin to program.
Actually writing programs with a minimal set of 'rules' memorized and then adding more as needed is how one typically learns programming.
I'm not against memorization though. Memory is very useful when studying math or programming or any other subject. You don't want to have to "reason your way through" every time; shortcuts are very important! I think of this like brain-memoization. Without it, it would be very inefficient to make progress. A lot has been said about this relationship [1]. Also, I think this is how some breakthroughs happen, "connecting the dots", so to speak.
Maybe when you say: "...spending any time memorizing syntax...", you are thinking flashcards or something like that? Sure, you don't need flashcards, anything you do often enough is gonna be easier to remember.
My comment was more in line with the fact that, with 2000 pages, maybe the author elaborates a lot on things that are very mechanical in nature and maybe require a few pages to describe (and are very inefficient for humans to compute? Just use a computer! :-). Say, Gaussian elimination; couldn't one be told: this is a matrix, this is a determinant, this is the relationship between them, this is what it means to invert the matrix, etc. and skip the full description of Gaussian elimination? (put it in an appendix? in a second book? fewer pages!). I don't think it is super helpful to, say, spend a lot of time inverting matrices with pen and paper in order to get proficient in linear algebra.
1: https://www.google.com/search?q=intelligence+memory+relation...
Hm - maybe I'm fortunate that I studied calculus before there were (accessible) software packages that could just do this stuff for you, because back then, the only way to solve these was to do them on paper. I'm sure I would have been tempted to just "skip ahead" to letting the computer do it for me, but I definitely learned a lot more going through all of the steps myself than I would have if I had just gotten a high-level understanding of what was going on and plugged the rest into a computer. Because, honestly, integrating polynomials is really, really easy - if you know how to do it, you can do it on paper faster than you can load up wolfram-alpha, type it in, and wait for an answer.
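To make the "really easy" part concrete, here is a minimal sketch of the power rule for polynomial antiderivatives in plain Python. The function names and the coefficient-list representation are assumptions chosen for this example, not anything taken from the book or from Wolfram Alpha.

```python
from fractions import Fraction

def integrate_poly(coeffs):
    """Antiderivative of c0 + c1*x + c2*x**2 + ..., given as [c0, c1, c2, ...].

    The constant of integration is set to 0.
    """
    # Power rule: the integral of c*x**k is c/(k+1) * x**(k+1).
    return [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(coeffs)]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's rule."""
    total = Fraction(0)
    for c in reversed(coeffs):
        total = total * x + c
    return total

# Example: p(x) = 1 + 2x + 3x^2 has antiderivative x + x^2 + x^3.
p = [1, 2, 3]
P = integrate_poly(p)
print(P)

# Definite integral of p over [0, 2] is P(2) - P(0).
print(evaluate(P, 2) - evaluate(P, 0))
```

Exact fractions are used instead of floats so the coefficients match what one would write on paper.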
The actual paper’s title:
> Algebra, Topology, Differential Calculus, and Optimization Theory For Computer Science and Machine Learning
If that's your level, it might reasonably be called "basic."
For everyone else - no.
Thanks for your comment, it made me feel less alone.
https://www.mheducation.com/highered/product/college-algebra...
There are some good authors of math for, say, calculus, linear algebra, differential equations, advanced calculus, advanced calculus mostly for applications, real analysis, optimization, probability, some topics in stochastic processes, introductory statistics, various more advanced topics in statistics.
Mostly the books are short on motivation and applications, and as a result it is too easy to spend time on material likely not worth the time unless you can be sure both to live forever and remember forever.
For calculus I liked Johnson and Kiokemeister. I taught from Protter and Morrey, and it was easier than J&K. Lots of people liked Thomas.
For linear algebra, I liked E. Nering and, then, P. Halmos, Finite Dimensional Vector Spaces which really is baby Hilbert space theory. Take Nering seriously -- he was a student of E. Artin at Princeton. His treatment of linear algebra is balanced and polished. For one of his editions, he has some group representation theory in the back, good, and some linear programming, really bad.
A lot of people like the MIT Strang book.
For advanced calculus to help when studying physics, especially electricity and magnetism and engineering, I very much liked
Tom M. Apostol, 'Mathematical Analysis: A Modern Approach to Advanced Calculus', Addison-Wesley, Reading, Massachusetts, 1957.
He has more recent versions, but for physics and engineering I like the 1957 version and don't like the later versions at all.
For ordinary differential equations, I liked
Earl A. Coddington, 'An Introduction to Ordinary Differential Equations', Prentice-Hall, Englewood Cliffs, NJ, 1961.
He makes variation of parameters look really nice -- then can understand the remark in the old movie The Day the Earth Stood Still. Ordinary differential equations is a huge, old field, and there is some question about how much of that deserves study now. Do notice that for systems of ordinary differential equations, get to apply some linear algebra in cute ways.
For advanced calculus for applications, there is the old MIT Hildebrand -- he knows what he is talking about, is easy enough to read, and a good place to go if need one of his topics.
In recent decades, the pure math departments wanted to teach advanced calculus as the theorems and proofs for freshman calculus. So there is Rudin, Principles of Mathematical Analysis, third edition (not the first two, maybe a later edition if there is one). So here's what is going on: He wants to develop the Riemann integral which is the one in freshman calculus. For that he wants to integrate over a closed interval on the real line, that is, some [a,b] which for real numbers a <= b is the set of all real numbers x so that a <= x <= b. Rudin will argue that [a,b] is a compact subset of the reals and show that the Riemann integral exists on all compact sets. So, the first chapters are big on compact sets. Then he talks about functions that are continuous and then ones that are uniformly continuous. With uniform continuity, the Riemann integral follows right away.
Later he does some infinite sequences and series and then uses these for careful treatments of some important results, the exponential function, the number e, the sine and cosine, etc.
Later he does integration of functions of several variables on manifolds for Stokes theorem and the related divergence theorem, a fully careful treatment of these theorems used in E&M. He does the Cartan exterior algebra: What is going on is that he wants to integrate a function g: M --> R where M is a manifold, that is, the range of some function f from some box, triangle, etc. to the space with the M. So, for this need the formula for change of variable for integrating with several variables, and that is a determinant of a square matrix. This integration is a multidimensional version of the line integral where direction of integration is important -- the exterior algebra is the multi-dimensional version of that. Can see that again in some treatments of general relativity in physics.
I like Rudin's third edition: Once know what the heck he is driving at and how he is getting there, say, as above, then his high precision is welcome.
For statistics, I suggest using some popular elementary book as a start. Then learn probability really well and from then on study particular topics in statistics as needed. The current directions in machine learning promise to make lots of particular topics important.
For a first book on statistics, consider
George W. Snedecor and William G. Cochran, 'Statistical Methods, Sixth Edition', ISBN 0-8138-1560-6, The Iowa State University Press, Ames, Iowa, 1971.
My wife did really well with that. So, get a good start on statistics and, then, get to learn some analysis of variance (experimental design), an underrated topic.
For a second book on statistics, consider
Alexander M. Mood, Franklin A. Graybill, and Duane C. Boes, 'Introduction to the Theory of Statistics, Third Edition', McGraw-Hill, New York, 1974.
Here, go quickly and get only the high points and don't expect the math to be very good -- in places it's pretty bad.
For regression analysis and linear multivariate statistics more generally, there are several books: Maurice M. Tatsuoka, Donald F. Morrison, William W. Cooley and Paul R. Lohnes, N. R. Draper and H. Smith. So, in particular, get enough to understand that regression is a perpendicular projection and, thus, get the Pythagorean theorem again.
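The projection remark can be checked numerically on a tiny example. The sketch below fits a straight line by least squares in plain Python and then verifies that the residual is perpendicular to the columns of the design matrix, which gives the Pythagorean identity. The data points and helper names are made up for illustration; exact fractions are used so the equalities hold exactly.

```python
from fractions import Fraction as F

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x via the 2x2 normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx, sxy = dot(xs, xs), dot(xs, ys)
    det = n * sxx - sx * sx  # nonzero as long as the xs are not all equal
    b = (n * sxy - sx * sy) / det
    a = (sy - b * sx) / n
    return a, b

# Made-up data for illustration only.
xs = [F(0), F(1), F(2), F(3)]
ys = [F(1), F(3), F(4), F(8)]

a, b = fit_line(xs, ys)
fitted = [a + b * x for x in xs]            # projection of ys onto span{1, x}
resid = [y - f for y, f in zip(ys, fitted)]  # what the projection leaves over
ones = [F(1)] * len(xs)

# The residual is perpendicular to both columns of the design matrix ...
print(dot(resid, ones), dot(resid, xs))
# ... so |y|^2 = |fitted|^2 + |resid|^2 (Pythagorean theorem).
print(dot(ys, ys), dot(fitted, fitted) + dot(resid, resid))
```

Because the fitted vector is a combination of the two columns, perpendicularity to the columns implies perpendicularity to the fitted vector, which is what makes the squared lengths add up.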
For more on such statistics aimed at machine learning, get the Breiman CART -- Classification and Regression Trees, maybe much of the start of ML.
With that much in statistics, will have seen a lot of applied probability and may be ready for the real stuff. For that, need measure theory, e.g., the first half of Rudin, Real and Complex Analysis or Royden, Real Analysis. Then read Breiman, Probability. After that might read some of Chung, Loeve, Neveu, and maybe some more. Then return to applications including statistics with a really solid foundation in probability, random variables, the classic limit results, and much more. Then can read and/or write lots of advanced topics in statistics.
For optimization, a similar review is possible, but it's getting late.
So when do you get around to teaching programming, then? ; P
/ducks