Gonum – Numerical Computing for Go (gonum.org)
My question to the developer: is that issue something you've encountered with this library? If not, did you design the library to periodically yield in tight loops, or am I just completely wrong about the Go scheduler?
But one thing to remember is that Go inserts GC and preemption points at function call sites. So as long as a function is called occasionally, you're good.
Cgo threading does complicate the matter. My understanding is that cgo calls are done in a threadpool with a larger stack size. I don't know the details about how that threadpool is managed. Not sure if this would help or hurt your concern.
Also, don't forget GOMAXPROCS. There's nothing stopping you from letting the Go runtime spin up an arbitrarily large number of OS threads.
So it's not an ideal situation, but if you're careful I don't think tight loops are likely to torpedo an otherwise sound Go project.
I suppose my hypothetical would be an issue if you used a non-Go BLAS implementation, as calling out to C will hog the OS thread. But this is a known issue (e.g. https://www.cockroachlabs.com/blog/the-cost-and-complexity-o...).
- https://github.com/golang/go/issues/10958
- https://go-review.googlesource.com/c/go/+/33910
- https://go-review.googlesource.com/c/go/+/36206
Of course, it just provides convenience, but it's what makes writing stuff in numpy, Tensorflow, Eigen, etc elegant.
Lately I do a lot of numpy/tensorflow, and have begun to really dislike the slowness of python. It would be great to do that work in Go specifically.
When the Julia AOT compilation story is complete, and it's well along now, Julia should dominate a whole lot of Go use cases...
Performance comparison? Algorithmic equivalence? How close are the results numerically (e.g. how do they compare on badly conditioned matrices)?
The performance story is complex. Typically we're the same speed on small matrices (and using Go is faster if you include the cgo overhead). We currently have significant speed penalties on large matrices (300x300 or so), but Kunde21 is working on assembly kernels for the BLAS functions to close that gap.
Last summer, I tried an experiment: have a student migrate a little Python-based analysis to a Go-based one. The analysis was fitting some cosmological constants out of the so-called Hubble diagram.
I was pleased to see that, in the span of 2-3 months, the student, who had limited programming knowledge (a bit of Python), managed to pull off the minimization of a 740-supernova dataset with a 2220x2220 nuisance-parameter matrix.
And the run time was 2x faster than the Python one (with scipy/minuit for the minimization, so everything in C/C++, really).
success. :)
(and this motivated us to completely switch to Go as a teaching language for our master in particle physics / cosmology.)
C++, for all its flaws, seems to just generally be a more well-conceived language and more generalist than Go. The only place I really feel like Go works is specifically in the context of moving bytes from one socket to another.
It's not too hard to write Go to minimize allocations (and most short lived allocations end up on the stack anyways unlike other languages [1]). If you really need a lot of allocations you can always use https://golang.org/pkg/sync/#Pool to avoid GC overhead.
[1] https://groups.google.com/d/msg/golang-nuts/KJiyv2mV2pU/wdBU...
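For anyone who hasn't used sync.Pool, a minimal sketch of reusing a scratch buffer across calls might look like the following. `bufPool` and `ScaledSum` are hypothetical names for illustration, and the initial capacity of 1024 is an arbitrary assumption. The pool stores a pointer to the slice so that putting it back doesn't itself allocate.

```go
package main

import (
	"fmt"
	"sync"
)

// bufPool hands out reusable scratch slices. It stores *[]float64 rather
// than []float64 so that Put does not allocate when boxing the slice
// header into an interface value.
var bufPool = sync.Pool{
	New: func() interface{} {
		b := make([]float64, 0, 1024) // capacity is an arbitrary assumption
		return &b
	},
}

// ScaledSum computes the sum of a*x over xs, staging the scaled values in
// a pooled scratch buffer instead of allocating a fresh slice per call.
func ScaledSum(a float64, xs []float64) float64 {
	bp := bufPool.Get().(*[]float64)
	buf := (*bp)[:0] // reuse the backing array, discard old contents

	for _, x := range xs {
		buf = append(buf, a*x)
	}
	var s float64
	for _, v := range buf {
		s += v
	}

	*bp = buf // keep any growth so the next caller benefits from it
	bufPool.Put(bp)
	return s
}

func main() {
	fmt.Println(ScaledSum(2, []float64{1, 2, 3}))
}
```

Keep in mind the pool may drop its contents at any garbage collection, so it reduces allocation pressure but is not a cache you can rely on.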
Our library was just open sourced (and still in my personal account, until we add more documentation): https://github.com/jbochi/facts
(also it's undergoing major reconstruction/refactoring right now)
One interesting thing that NumPy demonstrates is that such things are capable of becoming popular enough that they essentially become their own sub-language. One option in that case, if Gonum collected enough of a community, is to fork Go and add generics. There are some complicated generics options that would be difficult to use, but there are some simpler options that would work, and arguably "generics via templated code generation" is pretty much what you'd want for this use case anyhow since it gives the optimizers the most to work with. Said fork might also add some custom optimizations for this use case. I wouldn't want to deviate too far from core Go because I'd like to be able to keep pulling from that code base if at all possible, but some judicious work here might be a net positive.
There's a couple of cases with float64 vs. complex128 matrices, but I have been annoyed with those silent changes in Matlab where the answer is wrong but the code continues anyway.
Would love your thoughts on it
In Python (despite there being an excellent `multipledispatch` module) this is mostly just handled by aggressive duck typing ("if it has a .foo method, it's good enough"). In R it's handled with S4 classes, which are cool and kind of CLOS-like but are even slower than single dispatch.
So I guess my question is: why do you need generics when you have interfaces? These other (admittedly dynamically typed) languages make do without.
Going backwards, as you allude to, dynamic languages fulfill the use cases for generics, as long as you don't care about type safety, which is a thing that is true for the whole language anyhow so it's not much to give up.
For Go, the main problem is that when you're trying to be mathematical, with interfaces you get the worst of both the static and the dynamic worlds. You might like to define an interface that lets you add two vectors, right?
    type Vector interface {
        Components() []float64
    }
    type Add interface {
        Add(Vector) Vector
    }
which might let you implement an Add method on something that is a Vector as well, but you don't get a satisfactory result from either perspective. From the static perspective, you cannot use interfaces to guarantee that someone doesn't add a Vector3 to a Vector2, so you must either panic at run time or have Add return an error (one that never needs checking when the code is used correctly, which is not a pleasant kind of error to work with). From the dynamic perspective, you have to remember that what comes out the other end of that operation is always an interface value, not a concrete type: if you have a Vector2 and .Add(Vector2) to it, you don't get a concrete Vector2 back, you get an interface value, which you have to manually type-assert back to a Vector2 if you want to do anything more with it than the interface allows.
You can give Vector2 a distinct .Add(Vector2) method that does return a Vector2, but then if you also have a "func (v Vector3) Add(Vector3) Vector3" method, there is no way to declare an interface that both of those methods satisfy, so you cannot write any dimensionally-oblivious code that uses generic vector adding.
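Here is a minimal sketch of both halves of the problem, using the two interfaces from above. `Vector2`, `Vector3` and the panic message are illustrative choices of mine, not anything from Gonum.

```go
package main

import "fmt"

type Vector interface {
	Components() []float64
}

type Add interface {
	Add(Vector) Vector
}

// Vector2 and Vector3 are illustrative concrete types.
type Vector2 struct{ X, Y float64 }
type Vector3 struct{ X, Y, Z float64 }

func (v Vector2) Components() []float64 { return []float64{v.X, v.Y} }
func (v Vector3) Components() []float64 { return []float64{v.X, v.Y, v.Z} }

// Add satisfies the Add interface, so it must accept any Vector. The
// dimension check can therefore only happen at run time.
func (v Vector2) Add(o Vector) Vector {
	c := o.Components()
	if len(c) != 2 {
		panic("dimension mismatch")
	}
	return Vector2{v.X + c[0], v.Y + c[1]}
}

// Compile-time check that Vector2 implements the Add interface.
var _ Add = Vector2{}

func main() {
	a, b := Vector2{1, 2}, Vector2{3, 4}

	sum := a.Add(b)     // static type is Vector, not Vector2
	v2 := sum.(Vector2) // manual type assertion to get the fields back
	fmt.Println(v2.X, v2.Y)

	// This compiles without complaint and only fails when executed:
	// a.Add(Vector3{1, 2, 3})
}
```

The commented-out last line is the static half of the complaint: the compiler accepts it, and the mismatch only surfaces as a panic at run time.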
In "normal software engineering", Go's interface limitations are often not so bad, certainly not as bad as is often portrayed on HN. However, when you try to create a strongly typed numeric system (and you want it to be strongly typed because that's also how you get good performance), Go's interface mechanism is basically worthless.
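For comparison, here is a sketch of what the dimensionally-oblivious version could look like on a toolchain that has type parameters (Go 1.18 or later), which this thread treats as unavailable. `Adder` and `Sum` are hypothetical names of mine. The constraint says "a type T whose Add takes a T and returns a T", which is exactly what plain interfaces cannot express.

```go
package main

import "fmt"

// Adder is satisfied by any type T whose Add method takes and returns T.
// It requires type parameters (Go 1.18+).
type Adder[T any] interface {
	Add(T) T
}

type Vector2 struct{ X, Y float64 }
type Vector3 struct{ X, Y, Z float64 }

func (v Vector2) Add(o Vector2) Vector2 { return Vector2{v.X + o.X, v.Y + o.Y} }
func (v Vector3) Add(o Vector3) Vector3 {
	return Vector3{v.X + o.X, v.Y + o.Y, v.Z + o.Z}
}

// Sum works for any dimension, returns the concrete type, and rejects
// mixed dimensions at compile time because all arguments share one T.
func Sum[T Adder[T]](first T, rest ...T) T {
	acc := first
	for _, v := range rest {
		acc = acc.Add(v)
	}
	return acc
}

func main() {
	fmt.Println(Sum(Vector2{1, 2}, Vector2{3, 4}))
	fmt.Println(Sum(Vector3{1, 1, 1}, Vector3{2, 2, 2}, Vector3{3, 3, 3}))
	// Sum(Vector2{1, 2}, Vector3{1, 2, 3}) does not compile.
}
```

No type assertions are needed on the result, and since the concrete type is known at each call site, the compiler is free to specialize the code.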
What you get performance from is the absence of dynamic checks, not the presence of static ones. Of course, in the absence of dynamic checks, you want static ones for your sanity's sake - but not for performance's sake!
That's true. That is... until a (real) Go interpreter shows up, something that's bound to happen once Go is used for exploratory (data) work.
But as per my other comment in this thread, if the scientific community becomes big enough I wouldn't be surprised if they forked Go entirely, at which point that opens up a lot more options.