Come to think of it, this is also a problem with the Unix philosophy in general, in that it requires trading off performance (user productivity) for flexibility (developer productivity), and that trade-off isn't always worth it. I would love to see a low-overhead version of this that can keep data as packed arrays on a GPU during intermediate steps, but I'm not sure it's possible with the Unix interfaces available today.
Maybe there's a use case with very small networks and CPU evaluation, but so much of the power of modern neural networks comes from scale and performance that I'm skeptical it is very large.
Notice that the bulk of the data does not necessarily go through the pipeline (and thus through the CPU). You may send only a "token" that the downstream program uses to connect to and work with the actual data, which never leaves the GPU.
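A minimal sketch of that token idea, using ordinary CPU shared memory from Python's standard library as a stand-in for GPU memory (an assumption made so the example runs anywhere; a real version would need the GPU driver's own handle-sharing mechanism). The function names `produce` and `consume` and the `name:count` token format are hypothetical. Only the short token would travel through the pipe; the numbers stay in the shared block.

```python
from array import array
from multiprocessing import shared_memory


def produce(values):
    """Upstream stage: park the data out of band and return a small token."""
    data = array("d", values)
    nbytes = len(data) * data.itemsize
    shm = shared_memory.SharedMemory(create=True, size=nbytes)
    shm.buf[:nbytes] = data.tobytes()
    # The block may be rounded up to a page, so the token carries the length.
    token = f"{shm.name}:{len(data)}"
    shm.close()  # drop our mapping; on POSIX the block lives until unlink()
    return token


def consume(token):
    """Downstream stage: attach to the block named by the token and use it."""
    name, count = token.rsplit(":", 1)
    shm = shared_memory.SharedMemory(name=name)
    try:
        nbytes = int(count) * array("d").itemsize
        values = array("d")
        values.frombytes(bytes(shm.buf[:nbytes]))
        return sum(values)
    finally:
        shm.close()
        shm.unlink()  # last stage frees the block


if __name__ == "__main__":
    # In a real pipeline the token would be written to stdout by one
    # process and read from stdin by the next.
    print(consume(produce([1.0, 2.0, 3.5])))
```

The pipe then carries a few dozen bytes per batch regardless of how large the tensor is, which is the point the comment is making.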
“Sure, Unix is a user-friendly operating system. It's just picky with whom it chooses to be friends.” ~ Ken Thompson on Unix
But seriously, I would argue that Unix is "superuser" friendly - very friendly to advanced users who like their power tools, and is only unfriendly to those who want to have a more casual relationship with their computer (which admittedly is probably 98% of users).
I am not really a developer anymore, but any system that expects me to use a mouse over a keyboard makes me feel less productive.
Flexibility can often be "user productivity" as well.
Is this not something that can be helped with more advanced compilers?
Another poster commented that performance might not be that great, but I don't care about performance, I care about the essence of the idea, and the essence of this idea is brilliant, absolutely brilliant!
Now, that being said, there is one minor question I have, and that is, how would backpropagation apply to this apparently one-way model?
But, that also being said... I'm sure there's a way to do it... maybe there should be a higher-level command which can run each layer in turn, and then backpropagate to the previous layer, if/when there is a need to do so...
But, all in all, a brilliant, brilliant idea!!!
The author mentioned that this is only for inference of neural networks (not training), so this does not support backpropagation.
John Carmack working on Scheme as a VR scripting language
https://bair.berkeley.edu/blog/2017/06/20/learning-to-reason...
I wrote some Racket Scheme code that reads Keras-trained models and does inference, but this is much better: I used Racket’s native array/linear algebra support, whereas this implementation uses BLAS, which should be a lot faster.
https://addons.mozilla.org/en-US/firefox/addon/smart-referer...
It seems like today - and maybe I am wrong - that data science and deep learning in general has pretty much "blessed" Python and C++ as the languages for such tasks. Had this been implemented in either, it might have received a wider audience.
But maybe the concept itself is more important than the implementation? I can see that as possibly being the case...
Great job in creating it; the end-tool by itself looks fun and promising!
The one true gripe they outlined is that the user interface requires some explanation before one can bootstrap their own knowledge.
I'll defend the unix philosophy till I die, probably. Why _wouldn't_ you apply it here?
I wouldn't and won't
When I came up with the idea of chaining and piping neural network layers on the command line, I also came across CHICKEN Scheme which promised to be portable and well-suited for translating the Clojure-based implementation I had previously done. As you can probably imagine, the porting process was a lot more involved than I expected, but nevertheless I had a BLASt (pun intended) hacking on it.
Cool, you got to discover Scheme today! It's one of the classical languages that defines the programming world we live in.
> data science and deep learning in general has pretty much "blessed" Python and C++ as the languages for such tasks.
It's reasonable to expect that the languages a community uses for its programs bear some proportionality to the broader programming community unless you have a very severe historical isolation of that community (e.g., MUMPS in medical informatics). Python and C++ are extremely common languages. You should expect the usual long tail of other languages as well. And under the hood, it's really all about CUDA anyway.
There's nothing wrong with implementing the tool in Scheme, but the problem is that typical ML frameworks implemented in Python use Python as their "glue" language (which already can be somewhat problematic performance-wise). This approach uses a text serialization and sh as the glue language.
Sure, it's conceptually neat, but for exploration, it's not even competitive with regular Python, let alone e.g. Python in a Jupyter notebook.
I could see an approach using Scheme itself as the exploratory glue language being quite competitive. Dropping down into shell pipelines is decidedly worse.
It is a functional-first, small and clean variety of Lisp. A much more beefed-up cousin is Racket, but some implementations of Scheme (in particular Guile and CHICKEN) have excellent ecosystems.
Scheme is a language that deserves to be used more. Its simplicity is deceptive. It is extremely powerful because although the basic components are simple, there is virtually no limitation on how they may be composed.
There is literally only a single drawback to using Scheme (it might appear that being dynamically typed is a weakness, but both CHICKEN and Racket offer typed variants): it is sadly extremely unportable. The specification for Scheme is very small, and a large number of implementations exist which go beyond this standard, so basically none are compatible with each other.
Also, I'm skeptical that TensorFlow will be the deep learning library we'll use 5-10 years down the line, so it's always nice to see smaller projects that try to do something different.
Scheme is a classic Lisp dialect.
I was careful to say "functional-first". Racket has a fully-fledged object system, too.
If it were true that humans thought "imperatively", then we'd all still be using languages with GOTO.
I wonder if, at that point, we'll wax nostalgic about the way software used to grow organically. Ahhh, to lose myself once more in the meandering spaghetti of yesteryear...
cat tweets.txt | layer language-embed | layer sentiment > out.txt
In your example, the sentiment layer will work without re-training or fine-tuning only if preceded by the exact same language-embed layer as the one it was trained on. You can't swap in another layer there - even if you get a different layer that has the exact same dimensions, the exact same structure, the exact same training algorithm and hyperparameters, and the exact same training data, but a different random seed value for initialization, it can't be a plug-in replacement. It will generate different language embeddings than the previous one - i.e. the meaning of output neuron #42 being 1.0 will be completely unrelated to what your sentiment layer expects in that position, and your sentiment layer will output total nonsense. There often (but not always!) exists a linear transformation to align them, but you'd have to explicitly calculate it somehow, e.g. by training a transformation layer. In the absence of that, if you want to invoke that particular version of the sentiment layer, you have no choice about the preceding layers: you have to invoke the exact same versions that were used during training.
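A toy illustration of that mismatch, with every number and name made up for the example. It assumes the simplest possible case: the second embedding layer learned the same features as the first but stored them in different output positions (a permutation, which is one special kind of linear transformation). Real embeddings trained from different seeds generally differ by much more than a permutation.

```python
# Embedding layer A: the one the sentiment layer was trained on (hypothetical values).
EMBED_A = {"great": [1.0, 0.0, 0.5], "awful": [0.0, 1.0, 0.5]}

# Embedding layer B: same features, but output position i holds A's position PERM[i].
PERM = [2, 0, 1]
EMBED_B = {word: [vec[j] for j in PERM] for word, vec in EMBED_A.items()}

# Sentiment layer weights, fitted against A's layout.
SENTIMENT_W = [2.0, -2.0, 0.0]


def sentiment(vec):
    """Linear sentiment score: positive means positive sentiment."""
    return sum(w * x for w, x in zip(SENTIMENT_W, vec))


def align_b_to_a(vec_b):
    """Undo the permutation so a B embedding is laid out like an A embedding."""
    vec_a = [0.0] * len(vec_b)
    for i, j in enumerate(PERM):
        vec_a[j] = vec_b[i]
    return vec_a


if __name__ == "__main__":
    for word in ("great", "awful"):
        print(
            word,
            sentiment(EMBED_A[word]),                # what the layer was trained for
            sentiment(EMBED_B[word]),                # swapped-in layer: wrong sign
            sentiment(align_b_to_a(EMBED_B[word])),  # after explicit alignment
        )
```

Here the alignment is known by construction. In practice it would have to be estimated from data, and as the comment notes, a good linear alignment does not always exist.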
Solving that dependency problem requires strong API contracts about the structure and meaning of the data being passed between the layers. It might be done, but that's not how we commonly do it nowadays, and that would be a much larger task than this project. Alternatively, it could be useful if, when you want to pipe the tweets to sentiment_model_v123, a system automatically looked up in that model's metadata that the text first needs to go through transformation_A followed by fasttext_embeddings_french_v32 - as there's no reasonable choice anyway.
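The metadata lookup could be sketched roughly like this. The `MODEL_METADATA` table, its `requires` field, and the `build_pipeline` helper are all hypothetical; the `layer` command is borrowed from the example pipeline earlier in the thread, and no such registry is known to exist in the project.

```python
# Hypothetical registry: each model records the exact upstream stages it was trained with.
MODEL_METADATA = {
    "sentiment_model_v123": {
        "requires": ["transformation_A", "fasttext_embeddings_french_v32"],
    },
}


def build_pipeline(model, source="tweets.txt"):
    """Expand a model name into the full shell pipeline it depends on."""
    stages = MODEL_METADATA[model]["requires"] + [model]
    return f"cat {source} | " + " | ".join(f"layer {stage}" for stage in stages)


if __name__ == "__main__":
    print(build_pipeline("sentiment_model_v123"))
```

The user names only the final model, and the upstream stages are filled in from what was recorded at training time, so an incompatible embedding cannot be substituted by accident.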