Math-as-Code (2015) (github.com)
Many fields have over time developed the mathematical notation that suits them best: a notation in which irrelevant details are simply skipped. With neural networks, for example, we don't explicitly carry our implicit parameters around; having thetas everywhere would be very noisy. Statistics and probability have their own conventions, e.g. the usual symbols for logical and/or standing for min/max. One needs to learn how to read it, but that's just practice.
Regarding your comment about substituting normal human language: often the notation is just easier than prose. You can of course explain something in normal language, but what is a quick equation or formal statement might become a whole paragraph in English. If you're comfortable with mathematical notation, it's faster to read, easier to understand, and more "specific" than prose. It only looks like a crazy mess of characters if you can't connect the dots; it becomes "obvious" very quickly.
I always try to formulate mathematical ideas in mathematical notation first. It's way easier to spot weaknesses or details you've not considered than it is in prose.
However, notation for its own sake is not really something I'd suggest sitting down and learning. If you have some maths you want to learn, the notation will come along with that, so any book or article that introduces the maths you want will also introduce the notation (usually building on some foundation level of existing knowledge, e.g. high school or whatever).
0. https://en.wikipedia.org/wiki/List_of_mathematical_symbols
- the ones that wish programming would be more elegant and expressive like maths notations
- the ones that wish papers would publish a Python algo instead because maths notations are inconsistent and hard to read
I'm not a maths person, so can people with a lot of experience with it help me decide which one is the more reasonable?
Not so easy. I was trying to find out which of NN and ZZ was what; try searching for those. If you don't know what the sigma (summing) sign means, how are you going to find it?
I wonder how long until we get a somewhat mainstream language with pi types. I know Rust considered adding them. And I recently learned that Rust does allow for quantification over lifetimes^[1]. I could certainly see a language that implements dependently typed arrays. Midori for instance looked into eliding bounds checks with compiler proofs^[2].
[1]: https://doc.rust-lang.org/beta/nomicon/hrtb.html
[2]: http://joeduffyblog.com/2016/02/07/the-error-model/
https://www.zdnet.com/article/guy-steeles-new-programming-la...
The language has faded into obscurity though.
As a non-mathematician who has recently been looking at papers in journal back-issues from about 1970 to 2010, I certainly would have benefited from this.
On a related note, another issue is that maths must be implicitly ordered by the context of the prose of the paper, while programs have actual entry points and (without nitpicking) explicit ordering. (It's possible that ordering can be relaxed, but correctness is preferred over runtime cost.)
I think "maths-as-data" is more important here -- use a parsable common notation with enough meta-data that I can view it any way I want -- as math-with-greeks, math-with-friendly-names('en-us'), as APL, as plain Python, Python with numpy, etc.
a) I'd be interested in your offer
b) It's not clear what this alternate representation would look like. I'm hoping that your possible use cases would shed some light on that
See for example gradient descent in maths (LaTeX) and in PyTorch: https://colab.research.google.com/github/stared/thinking-in-...
For the smoothest math <-> programming
x.matmul(y).pow(2).sum()
This way we can write a lot of things, and we don't need to make up a new combination of punctuation marks and special characters for an operation.
For example, while one can write in Python:
torch.sum((x @ y) ** 2)
I consider it less readable. I mean, here maybe it is fine, but once it gets longer, more complicated, or we want to add new operations, it turns into a mess.
Vide "style" section in: https://github.com/stared/thinking-in-tensors-writing-in-pyt...
I did try to use Julia quite a few times (including, I don't know 8 years ago), as I loved the philosophy (brief yet fast, types). Sadly, I never considered it readable - a mix of new concepts, legacy MATLAB syntax, and in general disregard for this part (even function names didn't have consistent naming).
If it has improved a lot in that respect, I would be happy to see some nice examples.
It seems like there is a lot of stuff in math that is kind of like code libraries or functions in programming, except you are supposed to just remember exactly how it works rather than having the source code.
Currently I'm doing Data Science From Scratch except in C++ instead of Python.
It takes you from basic functions like square roots and absolute values, to summations, matrix multiplication, to measures like standard deviation and the Frobenius norm.
I was inspired to create it while taking the Fast.ai course and Jeremy Howard showing what looked like a "complicated" Frobenius norm equation that could be implemented in a single line of Python.
It's open source and should also work on mobile.
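For what it's worth, the Frobenius norm really is about one line. A minimal sketch in plain Python, assuming the matrix is just a list of rows (the name `frobenius_norm` is made up here, not taken from the course or the site):

```python
import math


def frobenius_norm(matrix):
    """Square root of the sum of the squared entries of a matrix (list of rows)."""
    return math.sqrt(sum(x * x for row in matrix for x in row))


A = [[1.0, 2.0], [3.0, 4.0]]
print(frobenius_norm(A))  # sqrt(1 + 4 + 9 + 16) = sqrt(30)
```

The double sum in the textbook formula maps onto the two `for` clauses of the generator expression, which is probably why it stops looking "complicated" once you see it as code.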
So you can write a for loop which solves an equation with different parameters at each iteration, for example. You can write logical proofs too, although that's not the main purpose. (I think doing calculations on your computer and generally being a superb Math assistant is where it's at).
I will have to doubt that for the simple fact that it is impossible to have a real number type on a computer.
So for example, '2' is recognized as an integer and is represented using arbitrary-size integers. You can also, however, write 'x = sqrt(2)' (or just 'sqrt(2)'), which has no finite digit expansion (it is irrational), so 'x' is a real number. You can then ask for finitely many digits of x with 'x.n(5)' (gives 5 digits), or write something like 'y=x^3+3x-4', which gives another real number, represented as this polynomial data structure itself.
The only problem with this is irreducibility. You can compose arbitrarily many operations on floats and still get a float of the same size. With this approach, it may not be possible to simplify a series of operations so the representation can grow unbounded.
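The growth problem is easy to see even without a symbolic system. A rough sketch, using Python's exact `Fraction` as a stand-in for the symbolic representation (that substitution is my assumption; the tool discussed above stores expressions, not fractions), with an arbitrary made-up update rule:

```python
from fractions import Fraction


def iterate(x, c, steps):
    """Apply the arbitrary update x -> x*x/3 + c a fixed number of times."""
    for _ in range(steps):
        x = x * x / 3 + c
    return x


exact = iterate(Fraction(1, 2), Fraction(1, 7), 6)   # exact rational arithmetic
approx = iterate(0.5, 1 / 7, 6)                      # ordinary 64-bit floats

# Number of bits needed to store the exact result's numerator and denominator.
exact_bits = exact.numerator.bit_length() + exact.denominator.bit_length()
print(exact_bits)             # well beyond the 64 bits a float occupies
print(approx, float(exact))   # the two values agree closely
```

The float stays the same size after every step and quietly rounds; the exact value keeps everything and pays for it in storage.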
edit: Fun fact, there are (real) numbers that indeed cannot be represented in a finite computer no matter what, but they cannot be represented on paper or uniquely represented in any finite abstract form either! This follows from the pigeonhole principle: finite expressions may represent numbers, but there are uncountably many (2^ℵ₀) real numbers, and only countably many expressions. So indeed almost every real cannot be represented. You can think of those as not being identifiable with any property, so there's no finite expression to describe them. They're more or less random.
2017 (a bit): https://news.ycombinator.com/item?id=15948326
(another bit): https://news.ycombinator.com/item?id=15947744
2015: https://news.ycombinator.com/item?id=9805071
2015 (a bit): https://news.ycombinator.com/item?id=9801620
There is barely a single operation in the last few maths papers I've read that doesn't have at least four possible interpretations at first glance, so the only way to understand a single line is to understand the entire theory at once. Needless to say, all variables are described with single letters.
The best defense for it is that it evolved from handwriting, where the difference between `matmul(a,b)` and `a x b`, repeated a few dozen times, is significant. These days, as I type, I define commands to turn the unambiguous longhand into the shorthand.
("But yodel, a and b aren't bolded or uppercase so according to convention they must be scalars!", I hear you say. Ha, ha, ha.)
I am genuinely angry about this. A great deal of practical mathematics is not hard. Mathematical notation is hard, and everyone using it should be a little bit ashamed at how needlessly difficult the field is.
This sort of juvenile, over-the-top hyperbole is getting pretty boring now. Can you present your criticisms like a normal person instead of calling technical things you don't like "terrible and evil"? Especially when you make (imo) ridiculous statements like "maths is not hard, it's the notation that's hard".
Notation is, most of the time, trivial compared to the concepts they represent. Funny you should give "matmul(a,b)" as an example, that would be a pretty atrocious way to write mathematics. Imagine writing "equals( matmul(S,matmul(A,inv(S))) , A)" rather than "SAS⁻¹ = A". Which do you think most clearly represents the concept at hand? The "overloading" that you complain so much about, it actually illuminates the fact that the underlying concept of "scalarmul(a,b)" and "matmul(A,B)" is the same. Going even one step further, Einstein notation: a development purely of notation, that makes dealing with tensor algebra a million times more clear.
I swear I don't understand this line of reasoning.
EDIT: I'll go out on a limb here: you're a programmer, not a mathematician, so you tend to look at these things from the prism of a programmer, not a mathematician. When all you have is a hammer...
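The two spellings can be put side by side in Python, where `@` can be overloaded. This is only a sketch: `Mat2`, `matmul`, and `inv` are made-up helpers for 2x2 matrices, not any library's API, and A is deliberately chosen to commute with S so that S A S⁻¹ = A actually holds.

```python
from fractions import Fraction


class Mat2:
    """A 2x2 matrix with exact rational entries (illustrative helper only)."""

    def __init__(self, a, b, c, d):
        self.e = tuple(Fraction(v) for v in (a, b, c, d))

    def __matmul__(self, other):
        a, b, c, d = self.e
        p, q, r, s = other.e
        return Mat2(a * p + b * r, a * q + b * s, c * p + d * r, c * q + d * s)

    def inv(self):
        a, b, c, d = self.e
        det = a * d - b * c
        return Mat2(d / det, -b / det, -c / det, a / det)

    def __eq__(self, other):
        return self.e == other.e


def matmul(x, y):
    return x @ y


def inv(x):
    return x.inv()


S = Mat2(1, 3, 0, 1)
A = Mat2(2, 1, 0, 2)  # commutes with S, so conjugating by S leaves it unchanged

infix_holds = (S @ A @ S.inv() == A)
prefix_holds = (matmul(S, matmul(A, inv(S))) == A)
print(infix_holds, prefix_holds)  # True True
```

Both lines compute the same thing; the disagreement in this thread is only about which one you would rather read fifty times in a row.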
-- 1. Having the same symbol for several things is the mathematician's way of having polymorphism and is a very nice feature.
Take for example the plus sign. Every time we encounter it, we will most likely be dealing with some kind of abelian group. So just from seeing the sign, we can already tell a lot about the context.
Another example would be the sign for the tensor product. If we see it, we are most likely dealing with some symmetric monoidal category. It might be vector spaces, modules over some commutative ring, coherent sheaves, objects in some derived category, etc. But still the important part is that all the stuff we know about this context can be applied.
-- 2. "so the only way to understand a single line is to understand the entire theory at once."
What do you expect out of a single line? Just think of a codebase with 800k lines. Would you expect to understand what the codebase is about by just looking at a single line?
I guess not: You'd rely on the naming of the functions/classes/etc. and how they are composed to understand what the codebase is doing. You'd also need a lot of knowledge about the area the codebase is concerned with to tell what is going on: Is it about a webserver providing a SPA? Is it a database? Is it a compiler?
You also don't expect to understand the whole codebase and its dependencies (including low-level abstractions) in its entirety. If you write a `<div>` tag as part of a frontend, do you really need to know the exact way it will end up as instructions on the chip of a computer that is running a specific version of a specific browser, using drivers for a specific monitor, graphics card, etc.?
All this is also true for mathematics.
-- 3. "A great deal of practical mathematics is not hard. Mathematical notation is hard ..."
Of course mathematics is hard. In fact: If anything related to human thought is hard, it is mathematics.
And more explicit notation will not make it easier. Do you think the twin prime conjecture is still unsolved because a layperson may misinterpret the meaning of a plus sign?
Seriously, take any 100 ksloc codebase; how many bugs do you expect to be in there? How many of the developers do you expect to really grok what their code is doing on all functional levels of abstraction?
Math papers certainly contain bugs sometimes, even critical ones, but the density of bugs is way lower overall. It's simply much easier to find a bug by staring at a terse one-liner than it is by groveling around 10 pages of "readable" code.
Arthur Whitney's code is definitely not approachable, but once you get over the hump, I find it eminently readable compared to what we hackers tend to tout as "readable code."
I think most of the debate boils down to different concerns. As a communication tool, what is the function of your code? If it's to introduce ideas to a large audience or many non-inductees, then please, use Python or similar. But if your primary communications are only between the system experts, do you really need to optimize for "this code looks grokkable at first sight"?
You have the same shit in JS, where + can mean numeric addition or concatenation of two strings, and other powerful languages let you overload + so you can use it to add matrices, vectors, fractions, time intervals, etc.
If you want, for example, to understand the fundamental theorem of statistics, you can't read only it and its proof; you need to read and understand everything in the book before it.
How much math/physics have you actually done? In deriving something, or solving a problem, I can have pages of the concise notation. Dealing with your expanded notation would be Hell that no mathematician/physicist would put up with. We struggle even with the concise notation. We'd like an equation to fit on a line, for example, and it's a pain when it doesn't. With your notation, very little of substance would fit on a line.
It's a bit like programming in that sense. If you can fit a function in one screen, it greatly improves the chances of you understanding it. With your notation, you're intentionally going verbose. It may help you understand that one line, but will likely make it harder to understand the overall picture.
And really, it's not `a x b`, it's 'ab' :-)
(And with matrices, it's more often 'AB').
> Mathematical notation is hard, and everyone using it should be a little bit ashamed at how needlessly difficult the field is.
Everyone has his preference of comfort. I don't like much of the notation of logic, and I'm thankful most other math disciplines don't replace "and" and "or" and "negation" with the symbols used in logic. However, the set of practicing mathematicians who share your comfort preference is virtually nil. I don't like the concise, implicit notation in general relativity, either (where lots of stuff isn't even written down but kept in your head) - but I'll trust what everyone who has studied it says: You'd go crazy if it were as explicit as I want it - let alone your level of verbosity.
Also, as an aside, I've had to read code that represented mathematical operations (numerical integration, matrix operations, etc) written by others, and without comments or access to the author it was virtually impossible to follow. Most of the standard programming language's syntax is poor for mathematics.
Edit: I'll add that if your primary exposure is through math journal papers, then I do understand your frustration. They're usually much, much harder to understand than the level in a typical textbook. Unfortunately, this is intentional.
show us
- It doesn't change very much, so you can just pick it up and understand it using what you learned at school. Cf e.g. looking at modern Javascript and wondering wtf all those JQuery selectors are.
- Relatedly, there's not a lot you need to know. I'm not a mathematician, but I can understand most mathematical theorems I've needed to look up with just basic high school algebra, derivatives, integrals, sums, products, trig, and matrix / suffix notation. Find me a piece of code that does anything interesting which has less than six calls to library functions.
- It's extremely concise, so you don't have to scroll through pages of code to get to the point - each line communicates a lot. This is really important, because it stops you getting lost or distracted in the middle of figuring things out.
- It's declarative, so it tells you what you're actually doing, rather than just how to do it. This is like the difference between saying "get me an orange", and giving full instructions for finding a shop that has oranges, navigating to it, etc. I know declarative programming exists (and it has these advantages) but I've never seen it used outside CS courses.
I honestly prefer mathematical notation; it has more upfront work to get good at it, but is much easier to use once you clear that hurdle. I haven't found ambiguity to be a big issue (especially compared to code; how often have you used a function which does something subtly different to what you expected?) although this may be due to the fact that I'm not looking at super advanced stuff (a couple of papers on deep learning, skew normal distributions, the rocket equation, that sort of thing). Especially with the deep learning papers, I found it infuriating how long they spent using confusing diagrams and wordy explanations where a couple of lines of algebra would have made their point instantly.
The true answer is almost certainly YMMV.
I find this to be a problem quite often. Maybe my brain is wired differently, but I can't keep declarative definitions in my head and apply them unless I know of an imperative strategy to make them work.
I've first noticed this when, in high school, I was struggling to understand the epsilon–delta definition of a limit. The one that goes like this: https://wikimedia.org/api/rest_v1/media/math/render/svg/619d....
Only after many hours of classes, followed by many hours of staring at the formula, all over the course of weeks, I finally understood what it means. The one missing puzzle piece that no one explained to me was that, when you see "\forall _{sth > 0}", you should read it as "in particular, for a sth arbitrarily close to 0".
Declarative notation is fine if you comprehend all consequences of that declaration. In cases where you don't, it would be nice to have it spelled out which consequences are relevant.
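For anyone else who needs the imperative version: the definition can be read as a game where someone hands you an epsilon and you must answer with a delta that works. A small sketch for the specific case f(x) = x² near a = 2 (the function names and the choice delta = min(1, epsilon/5) are mine; it works because |x² - 4| = |x - 2|·|x + 2| and |x + 2| < 5 whenever |x - 2| < 1):

```python
def delta_for(epsilon):
    """Answer to the challenge: a delta that works for f(x) = x*x at a = 2."""
    return min(1.0, epsilon / 5.0)


def check(epsilon, samples=1000):
    """Spot-check that |x - 2| < delta implies |x*x - 4| < epsilon on sample points."""
    delta = delta_for(epsilon)
    for k in range(1, samples):
        offset = delta * k / samples      # 0 < offset < delta
        for x in (2.0 - offset, 2.0 + offset):
            if abs(x * x - 4.0) >= epsilon:
                return False
    return True


for eps in (1.0, 0.1, 1e-3, 1e-6):
    print(eps, delta_for(eps), check(eps))
```

Sampling proves nothing, of course; the point is only that "for all epsilon > 0 there exists delta > 0" becomes "write a function from epsilon to delta", and the interesting epsilons are the tiny ones.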
Math notation is unlike a declarative programming language in that it's possible to describe something while telling the reader absolutely nothing about how to compute the object being communicated.
Humans are really good at pattern matching. Giving your concepts very terse names maximizes the ability for this tool to find actually interesting patterns.
Of course, the terse notation sacrifices approachability. If your intent is to communicate to non-mathematicians then it's a bad language. That said, mathematical notation is full of historical quirks and irregularities akin to natural languages. Ideally, we want a one-to-one map between concepts and symbols, but in practice there are cases where the map ends up being more like many-to-many. There are competing notations, overloaded symbols, and often one needs significant context to actually resolve the symbol to its concept.
In a way, programming languages are an inversion of the above good and bad points. Programming languages are very precise with their syntax, meaning that you don't need to consider the author's intent to figure out what a piece of syntax does.
Also, programming languages, by virtue of being executable, are vastly more amenable to empower tinkering and exploration. There is something deeply satisfying about typing in some code and getting an expected result, and when the result differs from expectations, then you found a bug in your mental model. This is much harder to do with pen and paper.
That said, most programming languages and their standard conventions fall short by being relatively wordy and verbose. What is a good one-liner in math symbols often takes up a page of code or more, not including definitions. This severely hamstrings the ability to grok the big picture at a glance, and opting for "words" over "symbols" introduces noise that kills our natural visual pattern matching abilities.
So if you take the above thesis to its conclusion, one would want executable math notation. I'm not sure such a thing exists, but the APL family certainly is a solid experiment in this respect.
Also a lot of math notation is very sloppy, the opposite of unambiguous. Math people will say it's necessary for brevity, context is important, you must take the relevant courses beforehand, be properly initiated, etc., and then you will understand. But for me it's still annoying even after a CS master's with many math courses. In code, I can jump to the definition of anything and see precisely what is meant. But a math person would say ambiguity is a feature because, say, the integral sign may stand for different definitions, but in this context we don't care which one it is. They'd often say, well, here you could substitute anything, you could use anything here, it doesn't matter what we use for this and that. But really anything? Can I use a pound of dog fur? No. In working code there is at least one working example to suggest what you mean. I will have a much easier time understanding your abstraction and thought process.
I think people often think in concrete terms but intentionally keep this secret because abstract ideals are more prestigious, so it becomes the reader's job to figure out what the actual motivation and reasoning of the author was.
Thankfully AI papers now mostly release their code, so it's possible to see what they actually do behind the smoke and mirrors at ground level. I'm all for abstraction and general patterns and proofs, but there is such a thing as too much.
It is no wonder that a machine learning or computer vision idea might be more accessible as an algorithm or program, but this is not universally true for every kind of math.
A well written paper would describe things in clear terms, choosing the most comprehensible means to convey their idea. So, they might provide all three: 1) concise math formulas, 2) corresponding algorithms, and 3) an explanation in English. Not always all three presentation forms are achievable, but all of them can be useful and enlightening for the reader.
Unfortunately, in reality, there might be problems with presenting things in a clear way:
1) A scientist might think that it is too verbose to repeat themselves, so they present the idea in only one form, at a single place in their paper, and think it's enough ("for the intelligent reader"). You want to find certain balance between conciseness and clarity, imo.
2) A scientist might be so immersed into the specific area of study they are writing about, so they don't see the reason for explanations for the more general audience. Here, it depends what audience the scientist is trying to reach. Also, not all papers present a good exposition of the field of study anyway.
So try finding good papers that explain things in clear terms at your level of comprehension. Read those, then move on to find new sources.
Also there's nothing wrong with doing both?
[0]: https://annals.math.princeton.edu/wp-content/uploads/annals-...
They seem like a crazy mix of noise at first, but that's just because you're not used to it. It's like a javascript developer reading their first lines of rust, or the C-programmer reading their first lines of Haskell.
Often, physics is presented as a set of interconnected equations (some are general, some are based on assumptions, representing special cases). Unfortunately, most of the time these equations are not built "bottom-up", so to speak. They do describe various aspects of physical reality, but often don't allow a nice computational/operational interpretation of it right away. So converting this knowledge into a computational model can be non-trivial.
For example, if I want to regularise a layer:
PyTorch:
l2 = layer.pow(2).sum()
loss += lam * l2  # `lambda` is a reserved word in Python, so the coefficient needs another name
Julia:
l2 = sum(layer .^ 2)
loss += λ * l2
I can just use regular Julia functions, whereas in PyTorch I need to use special PyTorch functions so as not to detach the tape. This only gets worse and less readable the more complex the things I want to do.
I love when the "data-flow" is consistent, from left to right.
The Julia code (and traditional mathematical notation) reads: middle, right, left.
Plus, instead of inventing infix notation, I find it cleaner to write custom methods/functions rather than rely on a (fixed, limited, and non-apparent) set of built-in infix operators.
Made-up .^
integration.integral.apply(omega, eff)
on another as omega.getNaturalIntegrator().applyTo(eff)
and yet on another as eff.integrateOver("planarDomain", omega)
and on each of these constructions the visually evident formal properties of the integral are lost and difficult to reason about.
I have a lot of frustration with this notion of the "visually evident formal properties of the integral". I think the existing visual-symbolic paradigm of mathematical notation is intuitive to some subset of people, but that subset does not encompass everyone who could do mathematics at a high level, excluding those for whom some alternate, in this case more text-based, representation is more tractable.
https://en.wikipedia.org/wiki/List_of_mathematical_symbols
for "Sigma" gets you that capital sigma is the summation symbol. ℤ etc are there too though they are admittedly harder to Ctrl-f for.
Good luck googling for operators in a programming language as well though.
Like with a programming language, for which you might want to skim a tutorial, if you are doing real analysis or whatever is talking about the sets of integers or natural numbers, you might want to skim the opening pages of a textbook or other course. The most common things are usually defined early on, and the less common things are often defined in-text when they are used, e.g "consider the bijective function f: ℤ→ℤ", now you know what f is and just need to google "bijective function". And if you do you'll see the notation about the domain and codomain (which you would also see in the opening chapters of an analysis textbook or lecture notes).
I mean, learning how to use a programming language involves having to do some reading too to get to grips with hard-to-google things. Maths is no different.
> about the domain and codomain
I was taught these as domain and range. Another notational inconsistency.
I basically agree with you but maths is a lot harder for some reason than programming. I don't think it's just me either. Perhaps if we could animate maths... I don't know. It's simply not an easy thing.
But there are occasional renamings of things that usually improve the overall lingo. Some things I learned one way but they have now been renamed. E.g. I learned about 'one-to-one and onto' functions instead of 'bijective' functions. I think you'll agree 'bijective' is better than 'one-to-one and onto'.
The same happens in programming languages and libraries when they rename functions from time to time. For a while there are inconsistencies whilst both are in use, sometimes permanently if the new name comes about through the creation of a new programming language or dialect.
It's true that searching for a strange symbol you don't know the name of (or even have the unicode character - if it exists - on hand) is hard. If you knew they were called "blackboard-bold Z" you'd be able to find out via google what it meant pretty easily.
(I am totally assuming you mean blackboard bold by writing ZZ and NN, otherwise I also don't know what those mean)
I am not sure what you mean with that. You could have an infinite tape that holds the number 1 and an "infinitely sized" irrational number at the same time.
"I learned about 'one-to-one and onto' functions instead of 'bijective' functions" - For a beginner? The former. Past beginner? The latter. It's context.
The "strange symbols" I only came across when searching are described as double-struck. I only found that out recently. I only found out yesterday that they are also called blackboard bold, while replying to others here. Difficulty in finding symbol meanings is in no way the hard part of maths, though.
I have a strong preference for notations that can be read consistently from left to right (vide pipe operators, chaining, etc). A litmus test is if when I read something I use the word 'of'.
If you do anything more complicated than only vector-matrix operations (i.e. a lot of quantum information, all deep learning), with all multiplications you need to specify dimensions somehow. Having only two operators for multiplication is not enough.
sum(A * B) ^ 2
If you define sum as Σ then you can do
Σ(A * B) ^ 2
torch.sum((x @ y)**2)
I don't think you can use * if you want to backprop, for example in layer regularisation. You'd need to use only PyTorch operations such as torch.mul(A, B).pow(2).sum().
The point here isn't that PyTorch can't look similar to Julia in this small example, rather that I can just use regular, concise Julia syntax - unlike in Python + PyTorch where I need to use PyTorch constructs that are outside of Python.
In maths notation, Σ(x * y)^2 would mean Σ((x * y)^2), but in most programming notation, treating Σ as a function, it would be as you say. I'm going with the original formula in https://news.ycombinator.com/item?id=23508661.
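The two readings give different numbers, which is worth seeing once. A tiny sketch in plain Python with made-up vectors, taking `x * y` to be the elementwise product:

```python
x = [1, 2, 3]
y = [4, 5, 6]

products = [a * b for a, b in zip(x, y)]          # [4, 10, 18]

# Maths reading: the exponent binds to the summand, i.e. Σ((x * y)^2).
sum_of_squares = sum(p ** 2 for p in products)    # 16 + 100 + 324 = 440

# Function-call reading: Σ(...) is evaluated first, then squared.
square_of_sum = sum(products) ** 2                # 32 ** 2 = 1024

print(sum_of_squares, square_of_sum)  # 440 1024
```

So a literal transcription like `sum(A * B) ^ 2` silently changes the meaning unless the parentheses are moved.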
I don't know PyTorch, but regular Python, Numpy, Sympy, etc. seem very similar to Julia in this instance.
> I don't know PyTorch, but regular Python, Numpy, Sympy, etc. seem very similar to Julia in this instance.
If you need PyTorch to record the operations for the backward pass later on I believe you need to use PyTorch versions of *, +, etc.: torch.mul, torch.sub, torch.add, etc. In Julia you can just use built in functions and let Flux handle the backward pass.
If I encounter math notation that I'm not already familiar with, my only real options are to start asking people, "hey do you know what this is saying?" or start shotgunning references/papers/books and hoping for the best. Even if I know the names of some symbols, searching something like "integral omega f" doesn't generally yield useful results.
I don't see it as all that different in mathematics. A given piece of math will make assumptions about the notation and expect the audience to share them; the example of integrals is a good one. Usually within a particular subdiscipline (or from the context) you'll know if this is a Lebesgue or a Riemann integral, so they don't bother having separate symbols for them. If you're new to the discipline, you may not know the convention, so you have to ask or search.
The thing with software and programming is that it is usually "complete", and that's why you can use your tools to access the docs/definition. It is complete because the universe of options for a given program is relatively small. In mathematics, though, it isn't that small, so the challenge of making all the definitions, conventions available to you for a given piece of math you're reading is much greater. Textbooks typically are good about this, but the more advanced you go, the more you are expected to know as "these are the conventions in this subdiscipline".
A lot of this is probably historical, and no one today wants to bother with making a consistent set of tooling that will get you what you want.
It does not pose major problems, either. You can easily distinguish f^p(x) and f(x)^p. The first one means "apply f p times to x, iteratively" and the second one means "compute f(x) and raise it to p". It works just as well when p=-1.
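The distinction is mechanical enough to write down. A minimal sketch, with a made-up example function f(x) = 2x + 1 and a hypothetical helper `iterate` for repeated application:

```python
def iterate(f, p):
    """Return the function that applies f to its argument p times (p >= 0)."""
    def composed(x):
        for _ in range(p):
            x = f(x)
        return x
    return composed


def f(x):
    return 2 * x + 1


def f_inverse(y):
    """The inverse function of f, i.e. the p = -1 case of iteration."""
    return (y - 1) / 2


applied_three_times = iterate(f, 3)(1)   # f(f(f(1))) = f(f(3)) = f(7) = 15
cubed = f(1) ** 3                        # 3 ** 3 = 27

undone = f_inverse(f(1))                 # inverse function: back to 1.0
reciprocal = f(1) ** -1                  # reciprocal of the value: 1/3

print(applied_three_times, cubed, undone, reciprocal)
```

In code the two ideas get different names or different syntax, whereas on paper only the position of the superscript tells them apart.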
E.g., = as assignment instead of equality.
Also, in more declarative languages, particularly ones that don't allow reassignment, = can be regarded as an assertion of equality rather than an assignment.
The whole point is not having to understand everything at once. And I would expect to understand what that particular line does to whatever level of detail I want by reading that line and recursively working backwards through the definitions of each operation invoked. Like clicking on each identifier and then doing ‘jump to definition’ in an IDE, instead of reading a book from cover to cover.
With code, I do this on a regular basis. With mathematics, it is plain impossible.
(And yes, this is why literate programming is a silly idea best forgotten.)
> Do you think the twin prime conjecture is still unsolved because a layperson may misinterpret the meaning of a plus sign?
No. But it very much may be because too few people understand the relevant portions of number theory in sufficient depth — from the highest levels of abstraction down to the bits and bolts of set theory — to notice the key insight necessary to solve it.
It's easily possible with systems like Coq [1] plus a nice IDE [2]. There are mathematical libraries for the foundations [3][4].
[1] https://github.com/coq-community/awesome-coq
[2] https://github.com/coq-community/vscoq
I don't expect to solve the twin prime conjecture. I do expect to understand, say, a Kalman filter, or a multilayer perceptron. Those are not difficult concepts. And yet... I could not name a single paper about which I could say to someone who doesn't already understand them "go read <x>".
How is that anything but damning? What is the point of a means of communication if I cannot impart a novel concept with it, because the reader must already understand all possible and imagined meanings of the plus sign to understand what I'm doing with it?
(side note: who's going to be solving the twin prime conjecture and not providing the proof as code? Why? A computer as an automated theorem checker is revolutionary.)
This! This is exactly my biggest pet peeve in Python. That the plus operator is used, in the base language, for such a notoriously non-abelian operation as string concatenation. As Carlin would say, this is not a pet peeve, this is a major psychiatric hatred of mine. In the (fortunately few) occasions that I have to do string processing in Python, I go to great lengths to avoid any usage of the concatenation operator. I cannot bear myself to use it.
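For anyone who wants to see the non-commutativity directly, a short Python sketch follows. The strings are arbitrary examples, and `str.join` and f-strings are shown only as two common ways to avoid `+`, not as a recommendation.

```python
a, b = "ab", "cd"

# Integer addition is commutative...
print(2 + 3 == 3 + 2)   # True

# ...but string "addition" is not: the order of the operands matters.
print(a + b)            # abcd
print(b + a)            # cdab
print(a + b == b + a)   # False

# It is still associative, which is why chaining works without parentheses.
print((a + b) + "ef" == a + (b + "ef"))   # True

# Two ways to build the same string without the + operator.
print("".join([a, b]))  # abcd
print(f"{a}{b}")        # abcd
```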
That wasn't what was said. What was said was "A great deal of practical mathematics is not hard.".
The notation "e^(i x pi)" is a crime in 2 languages against countless students. Why E? No good reason. Why pi? Also no good reason. i? When they named it they didn't even know if it was a real thing. They have many faces, and the notation manages to avoid representing a single one of them. Even "self_diff(rotate*half_turn)" would be an improvement; at least some of the meanings are captured.
Naming things is hard, but this is literally 1-letter-naming applied to the 3 most common and important functions humanity has ever defined. The only excuse for it is that, up until ~2000 AD, mathematicians were working by hand.
No, the real issue is that the circle constant is 6.28319, not 3.14159. That's why a "full turn" is 2π instead of 1.
That's why every time I read an equation involving circles, or spheres, or any similar notion I have to take the constant that is in front of π and mentally divide by two.
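As a small illustration of the "divide by two" step, Python's standard library exposes the full-turn constant as `math.tau` (available since Python 3.6). The radius below is an arbitrary example value.

```python
import math

# math.tau is the full-turn constant, equal to 2 * pi.
print(math.tau)              # 6.283185307179586
print(math.tau == 2 * math.pi)

r = 2.0  # arbitrary example radius

# With tau, the coefficient reads directly as a fraction of a full turn.
quarter_turn = math.tau / 4       # same angle as pi / 2
circumference = math.tau * r      # same as 2 * pi * r
area = 0.5 * math.tau * r ** 2    # same as pi * r ** 2

print(quarter_turn, circumference, area)
```

The area formula picks up a factor of 1/2 in this convention, so the choice of constant moves the "2" around rather than removing it everywhere.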
And nobody should dare pipe up and claim that these are isomorphic and it "doesn't matter", because it does.
There's a tendency in modern theoretical physics to just ignore random small constant factors in equations, even when using natural units. In that case there ought not to be any unexplained constants! Any numbers in those equations ought to have an essential physical meaning, a real interpretation. Numbers ought to count things. But they don't, because of stupid extra constants like the ones needed next to π. This "2" gets blended into everything else, so you get stupid shit like the "8" in the Einstein field equations. Eight of what is being counted here exactly? Oops.. no wait.. 4 circle constants are actually 8 half-circle-constants.
Skip learning that i := imaginary unit. Write "imaginary_unit" every time you want to refer to it, for maximum clarity.
Choose the latter and now you have to write "pow(euler_constant, mul(imaginary_unit, circle_half_turn))". Does this sound "maintainable"?
I don't even know what "e^i x pi" means. How does that translate into C notation? Is it pow(e, i) * pi, or is it pow(e, i) * x * pi, or is it something else entirely?
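The two parses really do give different numbers. A quick Python sketch, assuming the "x" in the notation is a multiplication sign and using Python's built-in complex type (`1j` for the imaginary unit):

```python
import math

e, i, pi = math.e, 1j, math.pi

# Reading A: the whole product i * pi is the exponent, i.e. pow(e, i * pi).
reading_a = e ** (i * pi)

# Reading B: only i is the exponent, then multiply by pi, i.e. pow(e, i) * pi.
reading_b = (e ** i) * pi

print(reading_a)  # very close to -1 (tiny imaginary part from floating point)
print(reading_b)  # roughly 1.70 + 2.64j, a completely different number
```

Reading A is the one that matches Euler's identity; the code makes the grouping explicit where the ASCII rendering of the formula leaves it ambiguous.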
I'd be surprised, on a tech forum, if people didn't see at least the smallest hint of a problem at single-letter naming. The tech industry moved away from that approach with a certain implicit urgency.
Except that no one will let you get away with (sin^-1)^-1 x to represent sin x. But you can do that with inverse function notation in general. Some of this is just cultural.
One basic thing is that PI (and even more so E) are pure abstractions. They correspond to certain physical or intuitive concepts in certain contexts, but their identity across contexts is more important. Even in programming, we sometimes use one-letter variable for very abstract things (the loop variables i, j, k; T, U, V as type arguments in C++ and co, or a, b, c in Haskell; and probably others).
Another major difference is the way this is actually done. Programs are written once and then mostly read, tweaked, debugged. Mathematics is all about writing and writing and writing, a lot of times from scratch. You're not as likely to be editing a file from 2 years ago, you're more likely to be writing a new attempt from scratch. And, people will likely only read your final version, not your hundreds of attempts before. So efficiency in writing your ideas is all the more important than in coding.
Another aspect is that, a lot of the time, the mathematician will spell out a lot of intermediate steps between a problem and a solution that, in a computer program, actually fall in the execution, since we're not in the habit of proving our code reaches the final state we envision (even informally); and certainly not in-line with the rest of the program. So again, in mathematics you end up writing the same few symbols a lot more often, making long words all the more harmful.
Imagine
X + 6 = 5X
X - 5X = -6
-4X = -6
X = 6 / 4
Spelled out as
Unknown plus 6 is equal to 5 times Unknown
Unknown minus 5 times Unknown is equal to negative 6
Negative 4 times Unknown is equal to negative 6
Unknown is equal to 6 divided by 4
Which of these is clearer and easier to follow?
Edit: I'll also add that function notation works great for unary and maybe binary functions (though operator notation is generally nicer for binary functions), but it becomes horrible for anything past that. For example, the sigma notation for sums is much easier to remember than if I wrote `sum(1, 789, some_value squared)`.
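For comparison, here is roughly how that sum looks in actual code. This is a sketch that assumes "some_value squared" means k^2 with k running from 1 to 789, i.e. the sigma expression \sum_{k=1}^{789} k^2; the variable names are illustrative.

```python
# Sum of k^2 for k = 1..789. range() excludes its end point, hence 790.
total = sum(k ** 2 for k in range(1, 790))

# Cross-check against the standard closed form n(n+1)(2n+1)/6.
n = 789
closed_form = n * (n + 1) * (2 * n + 1) // 6

print(total == closed_form)  # True
```

The generator expression keeps the bound variable, the limits, and the summand in one place, much like sigma notation does, though the off-by-one in `range` is the kind of detail the mathematical form does not have.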
> Mathematics is all about writing and writing and writing, a lot of times from scratch.
Mathematics is all about clever people communicating precise ideas.
> Even in programming, we sometimes use one-letter variable for very abstract things (the loop variables i, j, k; T, U, V as type arguments in C++ and co, or a, b, c in Haskell; and probably others).
They're defined locally though. A loop variable in one program is totally logically unconnected from a loop variable in another program to the point where programmers might give them mutually contradictory meanings. If the meaning was shared across all programs then yes the 1-letter-names would be troubling.
> or a, b, c in Haskell
And part of the reason Haskell doesn't catch on is because very few people can read Haskell programs; I know my interest in it ended when I saw =<< and then decided to learn a Lisp instead. I still live in cheerful ignorance of whatever that symbol means. I can't even search for it; I get results pages in Chinese (!!). It is a bad community to look to on the subject of how to communicate ideas.
In C notation, this would be pow(e, i * π). Or maybe pow(i*π, e), I don't remember and the notation is pretty bad.
And the point about exponentiation is very simple, and I didn't need to do hundreds of exercises to remember: the natural logarithm of e raised to some X is X, whatever that X may be.
The other commenter's note was that raising some constant to an imaginary number is a very obscure way of expressing the geometric operation that this represents (multiplying by e^iπ is equivalent to rotating a complex number by a half circle). I was pointing out that, while that is true, it loses the algebraic interpretation, which also has value.
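Both interpretations can be checked numerically. A minimal Python sketch using the standard `cmath` module; the point `z` is an arbitrary example, and the comparisons use a tolerance because of floating-point rounding.

```python
import cmath
import math

z = 3 + 4j  # arbitrary example point in the complex plane

# Geometric reading: multiplying by e^(i*pi) rotates z by a half turn,
# which sends z to -z and leaves its length unchanged.
half_turn = cmath.exp(1j * math.pi)
rotated = z * half_turn
print(rotated)             # approximately -3 - 4j
print(abs(rotated), abs(z))

# A quarter turn, e^(i*pi/2), behaves like multiplication by i.
quarter = z * cmath.exp(1j * math.pi / 2)
print(quarter)             # approximately -4 + 3j

# Algebraic reading: exp is an ordinary exponential, so the usual
# law exp(a) * exp(b) == exp(a + b) holds; two quarter turns make a half turn.
two_quarters = cmath.exp(1j * math.pi / 2) * cmath.exp(1j * math.pi / 2)
print(cmath.isclose(two_quarters, half_turn, abs_tol=1e-12))
```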
To those who haven't memorized Euler's Identity (I'm aware of it due to your comment, but will probably forget it in a day or two), this "self-evident" truth is invisible no matter which notation is used.
C notation is bad, but has the advantage of being precisely representable in ASCII. To post math notation you'd need to upload a PNG somewhere.