Parallel Programming with Python (chryswoods.com)
This is just not true, because C extension modules (i.e. libraries written to be used from Python but whose implementations are written in C) can release the global interpreter lock while inside a function call. Examples of these include numpy, scipy, pandas and tensorflow, and there are many others. Most Python processes that are doing CPU-intensive computation spend relatively little time actually executing Python, and are really just coordinating the C libraries (e.g. "multiply these two matrices together").
The GIL is also released during IO operations like writing to a file or waiting for a subprocess to finish or send data down its pipe. So in most practical situations where you have a performance-critical application written in Python (or more precisely, the top layer is written in Python), multithreading works fine.
If you are doing CPU intensive work in pure Python and you find things are unacceptably slow, then the simplest way to boost performance (and probably simplify your code) is to rewrite chunks of your code in terms of these C extension modules. If you can't do this for some reason then you will have to throw in the Python towel and re-write some or all of your code in a natively compiled language (if it's just a small fraction of your code then Cython is a good option). But this is the best course of action regardless of the threads situation, because pure Python code runs orders of magnitude slower than native code.
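A minimal sketch of the point above, using only the standard library. hashlib stands in here for a C-implemented library (the comment names numpy, scipy, pandas and tensorflow; hashlib is a substitution so the example has no dependencies). CPython's hashlib documentation states that the GIL is released while hashing data larger than 2047 bytes, so several threads can hash large buffers at the same time. The helper name and buffer sizes are illustrative assumptions.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(buf: bytes) -> str:
    # sha256 is implemented in C; for inputs this large CPython releases
    # the GIL during the hash, so other threads can run meanwhile.
    return hashlib.sha256(buf).hexdigest()


# Four 4 MiB buffers with different contents (sizes chosen arbitrarily).
buffers = [bytes([i]) * (4 * 1024 * 1024) for i in range(4)]

# Hash all buffers from a pool of threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(digest, buffers))

# Same work on a single thread, for comparison of the results.
sequential = [digest(b) for b in buffers]
```

Whether the threaded version is actually faster depends on the machine and the build, so no timing is claimed here; the sketch only shows that ordinary Python threads can drive GIL-releasing C code without any extra machinery.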
I think some people's opinion is that if you're writing in C then you're not really writing a Python program, so they think it is impossible in Python. That seems a reasonable point to me.
Your argument is that Python is fine for multithreading... as long as you actually write C instead of Python.
def add_and_mult(a, b, c):
    return a + b @ c
If a, b and c are numpy arrays then this function releases the GIL and so will run in multiple threads with no further work and with little overhead (if a, b and c are large). I would describe this as a function "written in Python", even though numpy uses C under the hood. It seems you describe this snippet as being "written in C instead of Python"; I find that odd, but OK.
But, if I understand you right, you are also suggesting that the other commenters here who talk about the GIL would also describe this as "written in C". On that reading, they realise that this releases the GIL and will run on multiple threads, and the point of their comments is that a proper pure Python function wouldn't. I disagree. I think that most others would describe this function as "written in Python", and when they say that functions written in Python can't be parallelised they do so because they don't realise that functions like this can be.
The GIL means that a single Python interpreter process can execute at most one Python thread at a time, regardless of the number of CPUs or CPU cores available on the host machine. The GIL also introduces overhead which affects the performance of code using Python threads; how much you're affected by it will vary depending on what your code is doing. I/O-bound code tends to be much less affected, while CPU-bound code is much more affected.
All of this dates back to design decisions made in the 1990s which presumably seemed reasonable for the time: most people using Python were running it on machines with one CPU which had one core, so being able to take advantage of multiple CPUs/cores to schedule multiple threads to execute simultaneously was not necessarily a high priority. And most people who wanted threading wanted it to use in things like network daemons, which are primarily I/O-bound. Hence, the GIL and the set of tradeoffs it makes. Now, of course, we carry multi-core computers in our pockets and people routinely use Python for CPU-bound data science tasks. Hindsight is great at spotting that, but hindsight doesn't give us a time machine to go back and change the decisions.
Anyway. This is not the same thing as "multithreading is impossible". This is the same thing as "multithreading has some limitations, and for some cases the easiest way to work around them will be to use Python's C extension API". Which is what the parent comment seemed to be saying.
I've mainly been looking at these resources:
https://github.com/rochacbruno/rust-python-example
Though I have not done rust <-> python in real practice
If you care about speed, Rust is supposedly as fast as C. The Rust ecosystem also has a lot of supposedly safe(!) tools for parallelism.
The type system and expressive macros seem like a big win over C to me.
That was interesting, thanks!
I really wish he had shown his numpy code. He said at 13:46 "Numpy actually doesn't help you at all because the calculation is still getting done at the Python level". But his function could be vectorised with numpy using functions like numpy.maximum or numpy.where, in which case the main loop will be in C not Python. I can't figure out from what he said whether his numpy code did that or not.
But either way, it's interesting that in this case the numpy version is arguably harder to write than the Cython version: rather than just adding a few bits of metadata (the types), you have to permute the whole control flow. If there's only a small amount of code you want to convert, I would still say it's better to use numpy though (if it actually is fast enough), because getting the build tools onto your computer for Cython can be a pain. And for some matrix computations there are speed improvements beyond the fact that it's implemented in C, e.g. matrix multiplication is faster than the naive O(n^3) version.
Why use legacy Python for this?
I get that EOL/deprecation is here, but let's not jump the gun and call it legacy just yet. I just see more 2 than 3 @ Day Job.
asyncio is a good library for asynchronous I/O but concurrent.futures gives us some pretty nifty tooling which makes concurrent programming (with ThreadPoolExecutor) and parallel programming (with ProcessPoolExecutor) pretty easy to get right. The Future class is a pretty elegant solution for continuing execution while a background task is being executed.
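A short sketch of the Future-based style described above, using only concurrent.futures from the standard library. The worker function slow_square and the inputs are made up for illustration; it stands in for any blocking call.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_square(n: int) -> int:
    # Hypothetical stand-in for a blocking call (network request, disk read, ...).
    return n * n


results = {}
with ThreadPoolExecutor(max_workers=4) as executor:
    # submit() returns immediately with a Future; the work runs in the background.
    futures = {executor.submit(slow_square, n): n for n in range(5)}

    # The main thread is free to do other things here before collecting results.

    # as_completed() yields each Future as soon as its result is ready.
    for fut in as_completed(futures):
        results[futures[fut]] = fut.result()
```

ProcessPoolExecutor exposes the same submit/map interface, so swapping the executor class moves the work into separate processes. In that case the function and its arguments must be picklable, and on platforms that spawn rather than fork the pool should be created under an `if __name__ == "__main__":` guard.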
[0] https://docs.python.org/3/library/concurrent.futures.html
Step 1: stop using Python.
"You can have a second core when you know how to use one"
Now don't get me wrong, Python is a perfectly fine language for lots of things, but not for taking optimal advantage of the CPU.
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
Relative performance compared to C is somewhere between one and two orders of magnitude slower. Considering how much harder and more error-prone multi-core is, maybe first try a fast sequential solution.
The ratio between the most-performant parallel framework and the least on Python will be a factor of (guessing) 1.5.
The ratio between a CPU-bound algorithm written in C and one in Python will be of the order of 10000 (again guessing as it's application-dependent).
Where is your time most profitably spent?
Just curious...
Most of the Python programs referenced on that benchmarks game webpage are in fact using multi-core?
Also includes best currently available hyperparameter tuning framework!
However, where I think having this stuff available inside of python is useful is that it's cross platform and consumable from "higher levels" of python. A library can do some mucky stuff internally to speed computation but still present a simple sync interface, all without external dependencies.
Nowadays you can also use serverless to parallelize coarse-grained workloads in the cloud.
I could not agree more
It's definitely cheating to use C code, with the exception of most Python libraries that already are to a large extent nothing more than thin wrappers over existing C libraries, or the tiny fact that the most popular by far implementation of Python, CPython, is almost 50% implemented in the C language, including the standard library. The author even dared include "C" in the name of the implementation.
Those cheaters, becoming bolder and bolder every day.
Damn them !!!
Hold on, the GIL doesn't make Python automatically thread-safe!
You can still have classic data races as the VM can pause and resume two threads writing to the same variable.
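A small sketch of that point, assuming a shared counter as the contended variable (the Counter class and the thread counts are illustrative). `value += 1` is a read-modify-write made of several bytecode steps, and the interpreter may switch threads between them, so the GIL alone does not make it atomic. An explicit lock does.

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, two threads could both read the same old value
        # and one update would be lost, GIL or no GIL.
        with self._lock:
            self.value += 1


def hammer(counter: Counter, n: int) -> None:
    for _ in range(n):
        counter.increment()


counter = Counter()
threads = [threading.Thread(target=hammer, args=(counter, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock in place the final count is always 4 x 10,000. Removing the `with self._lock:` line may or may not lose updates on a given run and interpreter version, which is exactly what makes such races hard to spot.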
It also simplifies a lot of CPython code, making it a lot easier to maintain.
What about no?
Don't get me wrong, I don't like Python as a language, but it's a fine tool and many useful programs have been written with it
But parallel programming? No, thanks.
For parallel execution, there's the GIL, but in practice it rarely matters, because once you want to do parallel execution, you have most likely a computationally intensive task to do, at which point you call down to C or something, and then GIL doesn't matter.
Eh, let me stop you there. Everything isn't about performance.
Hardware and UI based things really benefit from parallelism.
trio:
https://github.com/python-trio/trio
trio compared to asyncio, goroutines, etc.:
https://stackoverflow.com/a/49485603/1612318
"Notes on structured concurrency, or: Go statement considered harmful":
https://vorpus.org/blog/notes-on-structured-concurrency-or-g...
Was a damn good read, Thanks!
There are hundreds of libraries to deal with concurrency and/or parallelism in Python, asyncio, Celery and PySpark being the common ones.
All of them provide different approaches to concurrency because the language itself is not tied to one in particular.
And all of that is really just I/O parallelization; there's also CPU parallelization, and I don't believe Python has anything that's quite as easy as "Do these two things in parallel". Pretty much everything requires a lot of marshalling and process management which can easily slow a program down instead of improving it.
Python is great for a lot of things, and the community has found many creative workarounds for its shortcomings, but Go beats Python in I/O and CPU parallelism handily.
My library lets you do parallelism in a unique way, where you do message passing parallelism without being explicit about it.
Also, by having the introductory chapter be about "functional programming" (which incidentally Python does not do well), he completely bypasses the serious issue of shared state.
Which goes to show that parallelism in Python is more like a gimmick than a real-world solution since it doesn't let you do in-process shared-memory processing via threads in parallel which is so important for many applications. In my case, the vast majority of the time I do not want to farm workers out to different operating system processes and deal with serialization and communication, but this is the only way for Python code to take advantage of multiple cores [1].
[1] Another way is to write a module in C and have Python code call into it on a new thread and release the GIL while doing so, but of course this is even worse pain-wise than doing it with multiprocessing and you end up writing/compiling C.
I thought a lot about this problem, for over 2 years, and came up with zproc
https://github.com/pycampers/zproc
Basically,
> It lets you do message passing parallelism without the effort of tedious wiring.
You'll be doing message passing without ever dealing with sockets!
Also, shared memory parallelism is hard to get right regardless of which language you use. I would recommend strongly against it, unless you're writing some really really really niche thing where message passing is a bottleneck (it isn't most of the time)
It means thread-based parallelism of pure-Python code is unavailable; concurrency is just fine on Python.
While it's a very big hammer, consider experimenting with Celery for your parallelism needs on Windows. I've had good results using per-script Celery "clusters" with either a filesystem (on a ramdisk for extra speed) or an embedded Redis backend to accomplish pretty nice bidirectional RPC-ish parallelism. The initial setup is much more complicated than something like goroutines, but once you get it working you can boilerplate it onto other tasks without much trouble.
It still won't save you from memory constraints imposed by the lack of good fork() emulation, though. Hopefully the WSL stuff will either bring better fork() emulation, or allow support for shared memory objects (e.g. multiprocessing.Value) in order to ease some of that pain.
Sadly I don't think this is _quite_ true. I believe GILs are used in a number of interpreters and fall prey to the common problem where either coarsening locks or making them finer ruins somebody's day. I believe Guido van Rossum hung the GILectomy on two main issues: The interpreter must remain relatively simple, and C extensions cannot be slowed down.
I'm not disagreeing with the decision (necessarily) but it isn't simply a layover from a bygone era. It was a decision that has been reaffirmed and upheld numerous times.
[0]: https://lwn.net/Articles/754577/ [1]: just google Gilectomy, it's been covered in a few places that I don't have handy.
The thing is, in the 90s the choices that produced the GIL as it exists were not bad ones; that's why I went to the trouble of explaining how it affects threaded code and why those effects can be considered reasonable tradeoffs for what was known at the time, in implementing threading (without completely breaking the ecosystem of Python + Python extensions, which was already significant even back then).
Of course, knowing what's known today about the directions computing and the use of Python went in, different decisions might end up being made, but at this point it's very difficult (more difficult than people typically expect) to undo them or make different choices.
The build system is https://github.com/PyO3/setuptools-rust (which is linked at the bottom of the above readme).
But I use the right tool for the job. Python is a great tool, but not for performance (Applies to all dynamic, interpreted languages TBH).
Got it!
If instead of operating on a numerical matrix, you were instead operating on something like a graph of Python objects, something like a graph traversal would be hard to parallelise as you could not stay out of the GIL long enough to get anything done.
I did also concede that if you do have to write your algorithm completely from scratch, with no scope for using existing C extensions (be they general purpose like numpy or more specialist ones that implement the whole algorithm) then yes you'll be caught by the GIL, so I agree with you on that. But I also made the point that you'll be caught even more (orders of magnitude more!) by the slowness of Python, so any discussion about parallelism or the GIL is a red herring. It's like worrying that your car's windscreen will start to melt if you travel at 500mph; even if that's technically true, it's not the problem you should be focusing on.
It's interesting you mention graphs because the most popular liberally licensed graph library is NetworkX, which is indeed pure Python and so presumably isn't particularly fast. There are graph libraries written as C extension modules but I believe they are less popular and less liberally licensed (GPL-style rather than BSD-style). So I definitely agree that this is a big weakness of the Python ecosystem.
Has been in Python since version 3.5
There are idiot-proof thread-safe datastructures and producer/consumer APIs that map extremely well to most problems that come up in practice in the domain, that one should confidently use. Refusing to do shared memory parallelism because of the _abstract potential for havoc_ rather than any practical justifications based on the problem-at-hand is throwing out the baby with the bathwater and is not the mark of competent engineering.
The problem is that it's _hard_ to get right.
For example - It's not trivial to use locks when you're working at an abstraction level higher than operating systems. Most people don't even realise there is a race in their application, because locks are inherently non-enforcing. Code written with locks is also really hard to read and reason about.
Message passing just makes it a little more trivial to avoid the pitfalls associated with parallel programming.
I also found that it lets you avoid busy waiting in certain places, which is always a performance advantage :)
Can you shed some light on those "idiot-proof thread-safe datastructures"?
Futures in particular make it easy to write concurrent code close to the way you would write single-threaded code, because all of the threading is handled behind the scenes.
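One standard-library example of the kind of thread-safe structure and producer/consumer API being discussed is queue.Queue. The sketch below is an illustration rather than anything a commenter posted; the sentinel convention, the doubling step and the queue size are assumptions.

```python
import queue
import threading

SENTINEL = None  # pushed by the producer to tell the consumer to stop


def producer(q: queue.Queue, items) -> None:
    for item in items:
        q.put(item)  # blocks when the bounded queue is full
    q.put(SENTINEL)


def consumer(q: queue.Queue, out: list) -> None:
    while True:
        item = q.get()  # blocks until something is available (no busy waiting)
        if item is SENTINEL:
            break
        out.append(item * 2)


q = queue.Queue(maxsize=8)  # bounded, so a fast producer cannot run away
out = []
workers = [
    threading.Thread(target=producer, args=(q, range(100))),
    threading.Thread(target=consumer, args=(q, out)),
]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

All locking lives inside the queue, so neither function takes a lock explicitly. With one producer and one consumer the FIFO order of the queue also preserves the order of the items.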
Please research your topic.
you've rediscovered message-passing... please take an elementary CS course on parallel systems.
That claim is naive in the extreme.
http://zguide.zeromq.org/page:all#Multithreading-with-ZeroMQ
Maybe I should've just linked it there, sorry!
Okay, I will take that course and get back, thanks for the suggestion.
P.S. You just implied Pieter Hintjens is naive. You have to live with that now :(
"By "perfect MT programs", I mean code that's easy to write and understand, that works with the same design approach in any programming language, and on any operating system, and that scales across any number of CPUs with zero wait states and no point of diminishing returns."
That doesn't mean to say it's "perfect" or "solves" multithreading, just that it's easy to write and understand and portable across architectures. That says nothing of how optimal it is for concurrency or parallelism ease-of-use wise or performance-wise, just that it's 'easy'.
(Speaking purely from experience. Don't have a fancy CS degree)
Given those restrictions and use cases you get a very efficient low latency locking mechanism.
Try saying that out loud?
easy to write and understand is something completely different to correctness, robustness, scalability, etc. All those must be considered if you think you have 'solved' parallelism, but they are orthogonal to 'easy to understand'.
You could easily interpret that as -
Perfect _implies_ that it's easy to write and understand, but it's not the whole picture. It's just a feature that _he_ thinks is _crucial_ to it being perfect.
You get my point right?
Like sure, you could implement a _perfect_, I don't know, GNOME desktop in assembly language, but it wouldn't be easy to write and understand.
He thinks it's essential that it should be easy to read and write for it to be perfect.
Unfortunately, He's not with us now so can't even confirm :(
TBH, your claims sound like you've just "discovered" message-passing, which many, many languages, runtimes and operating systems have been using for many years/decades. (https://en.wikipedia.org/wiki/Message_passing)
In other words... it's not a revolution.
ZProc seems to simply be a simple library to pickle data structures thru a central (pubsub?) server.
This is not the way to get remotely close to "high performance". What you've created here is pretty much what multiprocessing gives you already in a more performant solution (i.e. no zeromq involved).
Minor point of pedantry which I'll state because it's an often-overlooked timesaver for folks developing on multiprocessing: not only is MP potentially faster for transferring data between processes compared to this solution, but it can also be way, way faster in situations where you have all your data before creating your processes/pool and just want to farm it out to your MP processes without waiting for it all to be chunked/pickled/unpickled.
Because of copy-on-write fork magic, many multiprocessing configurations (including the default) can "send" that data to child processes in constant* time, if the data's already present in e.g. a global when children are created.
This pattern can be used to totally bypass all considerations of performance/CPU/etc. for pickling/unpickling data and lends a massive speed boost in certain situations--e.g. a massive dataset is read into memory at startup, and then ranges of that dataset are processed in parallel by a pool of MP processes, each of which will return a relatively small result-set back to the parent, or each of which will write its processed (think: data scrubbing) range to a separate file which could be `cat`ed together, or written in parallel with careful `seek` bookkeeping.
Unix-ish OSes only, though (unless the fork() emulation in WSL works for this--I have not tested that).
* Technically it's O(N) for the size of data you have in memory at process pool start, because fork() can take time, but the multiplier is small enough in practice compared to sending data to/from MP processes via queues or whatever that it might as well be constant.
Note that this works for big objects, but not for small objects. E.g. if you fork-share a large list of integers or dicts or something like that, then you don't get any memory usage benefits, because every access will cause a refcount-write and that will copy the whole page containing the object.
> * Technically it's O(N) for the size of data you have in memory at process pool start
It's not quite that simple; sharing n pages can take very little time or a bit more time; it depends on how the pages are mapped; sharing a large mapping doesn't take longer than a small mapping.
Have you tried this or got it working ? The fly in the ointment is the reference count. Add a reference and BOOM you suddenly have a huge copy. It can be made to work efficiently in certain cases but takes a lot of care.
In fact, I think Performance centric development is a lesser known evil.
> have all your data before creating your processes/pool
Zproc exposes the required API for this (Nothing new, just the python API) :)
https://zproc.readthedocs.io/en/latest/api.html#zproc.Proces... (args and kwargs)
> a massive dataset
Wouldn't you be better off using a Database for that kind of work?
> Because of copy-on-write fork magic, many multiprocessing configurations (including the default) can "send" that data to child processes in constant time
Any resources on how to implement that?
I never claimed it to be performant!
"Above all, ZProc is written for safety and the ease of use."
(Read here - https://github.com/pycampers/zproc?files=1#faq)
> It's not a revolution
I totally agree. It's just a better way of doing things zmq already perfected. Like, tell me if you've ever seen a python object that has a `dict` API, but does message passing in the background.
> central (pubsub?) server.
Central server, yes. It uses PUB-SUB for state watching and REQ-REP for everything else.
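To make the idea of "a `dict` API that does message passing in the background" concrete, here is a toy sketch. It is not ZProc's implementation and uses none of its API: it replaces ZeroMQ sockets and processes with a thread and queue.Queue, and the names StateServer and StateProxy are invented for this example. Only the request/reply shape is the same: one owner holds the state, and clients talk to it through messages hidden behind `__getitem__` and `__setitem__`.

```python
import queue
import threading


class StateServer(threading.Thread):
    """Owns the dict; the only thread that ever touches it."""

    def __init__(self):
        super().__init__(daemon=True)  # daemon: exits with the main program
        self.requests = queue.Queue()
        self._state = {}

    def run(self):
        while True:
            op, key, value, reply = self.requests.get()
            if op == "set":
                self._state[key] = value
                reply.put(None)
            elif op == "get":
                # Reply with (found, value) so a missing key can be reported.
                reply.put((key in self._state, self._state.get(key)))


class StateProxy:
    """Looks like a dict, but every access is a request/reply message."""

    def __init__(self, server: StateServer):
        self._requests = server.requests

    def _call(self, op, key, value=None):
        reply = queue.Queue(maxsize=1)  # private reply channel for this request
        self._requests.put((op, key, value, reply))
        return reply.get()

    def __setitem__(self, key, value):
        self._call("set", key, value)

    def __getitem__(self, key):
        found, value = self._call("get", key)
        if not found:
            raise KeyError(key)
        return value


server = StateServer()
server.start()
state = StateProxy(server)

state["apples"] = 3
state["apples"] = state["apples"] + 1  # two messages: a get, then a set
apples = state["apples"]
```

Each single get or set is serialised by the server, but the read-modify-write on the last lines is two separate messages, so two clients doing it at once could still lose an update. A real design needs some atomic-update operation on the server side for that case.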
> you've just "discovered" message-passing
Guess you're right? 2 years is a peanut on the time scale...
P.S. Thanks for all the feedback, I've been dying to hear something for a while now.
Don't get me wrong, message-passing has some advantages, but they certainly aren't that it 'solves' parallelism. If you wish to know more, investigate:
- Smalltalk and Erlang (for message passing languages).
- QNX (for a message-passing OS)
- mpi4py (for a message-passing Python library; MPI is the grandfather of message passing libraries and runs everywhere).
- Occam & the transputer for an example of a hardware-mp implementation (actually it's Communicating Sequential Processes, but for your purposes it would be enlightening).
- golang for a modern-day implementation of CSP.
- Python implementation of CSP (https://github.com/futurecore/python-csp)
- Discussion about MP (http://wiki.c2.com/?MessagePassingConcurrency, for more just google it)
Basically, it's great that you want to learn about concurrency & parallelism, but you've come to a gun fight with a blunt butter knife.
That's a big claim which you don't really back up as much as you need to. Unique is an extremely high bar in this very busy field.
There are several other similar red flags on the linked GitHub; I think your enthusiasm is running away from you a little. You might want to dial the ten-dollar language back a bit – it made me immediately suspicious ("utterly perfect", for example is another danger phrase).
It's the combination of grandiose language + solution-in-search-of-a-problem which leads to that.
If you're going to sell hard, what I would want to see is a large, complex, high-traffic system which makes extensive use of this; if you compare and contrast with Ray, which I've also only just encountered in this thread, there's a real problem (distributed hyperparameter optimization) which they've built a solution for with the library, and that immediately lends it credibility; I know the system can be used for something because it has been.
http://zguide.zeromq.org/page:all#Multithreading-with-ZeroMQ
Thought linking it there would make it better, but I'll just remove it...
And you do make a good point. It doesn't really solve anything technically. But would you agree that it exposes a better API for doing much of the same stuff?
So you've just invented a new name for a coordinator process and called it a new fashion in computation?
Just without the 'niceties'.
He is fully aware that he has not solved parallelism.
Very true; I went into some more detail about my typical use case above. Using MP for lots of small objects that you've already extracted from raw data/IO/whatever is a game of diminishing returns. It's in situations like that where traditional shared-memory starts looking more and more attractive. When I get to that point, while multiprocessing and some other packages provide a few nice abstractions over shmem, I start looking for other platforms than Python.
> It's not quite that simple; sharing n pages can take very little time or a bit more time
Definitely; I was simplifying in order to compare the overhead of fork with the overhead of pickling/shipping/unpickling data. Sharing large pieces of data with even very slow fork()ing is, in my experience, so much faster than the [de]serialize approach that it is effectively constant in comparison, but I didn't mean to discount the complexities of what make certain forking situations faster/slower than others.
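import multiprocessing

# Loaded before the pool exists, so forked children inherit it without pickling.
big_data = read_huge_binary_or_string()

def process_range(rng):
    start, end = rng
    do_something(big_data[start:end])

pool = multiprocessing.Pool(2)
pool.map(process_range, [
    (0, 10000),
    (10000, len(big_data)),  # slices are end-exclusive, so start where the first range ends
])
The `multiprocessing.Pool` uses a `multiprocessing.Queue` in the background to retrieve the results after completion.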
The `multiprocessing.Queue` in turn uses `multiprocessing.connection.Pipe` and sends the pickled objects over to the wire.
So I don't see how this is any better than ZMQ.
Just because stuff has an API that doesn't look like message passing doesn't mean it can't be doing that in the background. Which is funny, because that's the whole point of ZProc.
I realize the subtle difference that Cpython uses pipes, not sockets, unlike ZMQ. But that doesn't really make a difference now, does it?
Proof:
Process Pool worker, returning the result by using `outqueue.put()`
https://github.com/python/cpython/blob/86b89916d1b0a26c1e77f...
multiprocessing Queue, initializing a Pipe
https://github.com/python/cpython/blob/86b89916d1b0a26c1e77f...
multiprocessing Queue serializing data to send it using that Pipe
https://github.com/python/cpython/blob/86b89916d1b0a26c1e77f...
The point of the original post is that MP lets you do more than just serialize/ship data around after pool start time; there are substantial optimizations you can do if you know lots of the data you need to process early on.
I thought you were talking about sending data to child processes in constant* time, while it was running.
Most of the situations where I care enough about memory and/or pickling overhead fall into the "take a giant block of binary/string data and process ranges of it in parallel" family, in which case there aren't too many references until the subprocesses get to work. If I had more complex structures of data I'd probably get a little less performance bang for my buck, but even then I suspect it would be much faster than multiprocessing's strategy: pickling and sending data between processes via pipes is many times slower than moving the equivalent amount of data by dirty-writing pages into a forked child.
That's not meant to discount anything y'all are saying, though: refcounts are definitely a very important thing to be mindful of in this situation. A child comment suggests gc.freeze, which can help, but can't entirely save you from thinking about this stuff.
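A minimal sketch of the gc.freeze pattern mentioned above, following the sequence the gc module documentation recommends for fork-heavy programs: gc.disable() early in the parent, gc.freeze() right before fork(), gc.enable() early in the child. The dataset is a made-up stand-in, and the fork is skipped on platforms without os.fork. As the comment says, this only stops the garbage collector from touching inherited objects; ordinary reference-count writes in the child still dirty pages.

```python
import gc
import os

gc.disable()  # early in the parent: avoid leaving freed "holes" in memory pages

# Hypothetical stand-in for a large dataset loaded before forking.
big_data = [str(i) for i in range(100_000)]

gc.freeze()  # move everything tracked so far into the permanent generation
frozen = gc.get_freeze_count()

child_ok = None  # stays None where fork() is unavailable (e.g. Windows)
if hasattr(os, "fork"):
    pid = os.fork()
    if pid == 0:
        # Child: collections from here on ignore the frozen, inherited objects.
        gc.enable()
        total = sum(len(s) for s in big_data)
        os._exit(0 if total > 0 else 1)  # exit without running parent cleanup
    _, status = os.waitpid(pid, 0)
    child_ok = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0

# Restore normal collector behaviour in the parent.
gc.unfreeze()
gc.enable()
```

gc.freeze() and gc.get_freeze_count() require Python 3.7 or later.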
It's also very important to be mindful of what happens with your program at shutdown: if you have a big set of references shared via fork(), and all your children shut down around the same time, your memory usage can shoot up as each child tries to de-refcount all objects in scope. This applies even if each child was only operating on a subset of the references shared to it. If you're processing, say, 1GB of data from the parent in 8 children on a 4 core system (doing M>N(cpu) because e.g. children spend some time writing results out to the FS/network), a near-simultaneous shutdown could allocate 9GB of memory in the very worst case, which can cause OOM or unexpected swapping behavior. Throttled shutdowns using a semaphore or equivalent are the way to go in that case.
In my workload that's exactly when it hits.
We ran into this when sharing different parts of a huge matrix with different workers. We had to be extra careful that we did not create new references in the subprocesses. We were operating at a scale where, if we got it wrong, the OOM killer would kill us.
Working with memory mapped arrays is more forgiving.
If you could point out some stuff from ZProc's page, that would be nice!
> mpi is the grandfather of message passing libraries
Never heard of it before, but just a simple Google search reveals that it _might_ be more performant than zmq, but not as fault-tolerant and flexible. It really looks like a niche thing, from this comment by Pieter Hintjens
> Why smart cloud builders are betting everything on 0MQ. In detail, compare to the alternatives. Hand-rolling your own TCP stack is insane. Using any broker-based product won't scale. Buying licenses from IBM or TIBCO would eat up your capital. Supercomputing products like MPI aren't designed for this scale. There is literally no alternative.
(http://zeromq.org/docs:the-ten-minute-talk)
> Don't get me wrong, message-passing has some advantages, but they certainly aren't that it 'solves' parallelism.
Doesn't it? (For most people)
---
I can't believe I'm hearing words against zmq on HN, it's weird.
Even the guys over at Dask settled on ZMQ over anything - https://github.com/dask/distributed/issues/776
P.S. Seems like you know quite a lot about this topic. Do you have any projects of your own that I can see?
Bottom line, I think most people would be happy doing message passing parallelism in the real world. Sure, it doesn't look that good in theory, but it works damn well in practice.
...also, nanomsg is the 'improved' successor.
Also, MPI isn't a 'niche' thing, it's the way that a large proportion of high-performance applications have been implemented for a few decades (think Crays & weather prediction). Zeromq has a few simple web-apps using it (I exaggerate slightly).
Anyone who has done any scientific or technical computing is highly likely to be familiar with it – it's been around in some form for over 25 years.
I use it everyday for some of my home baked apps, and would love if this really made a difference.
The problem here is the claims you're making. You've written some utility classes around 0MQ for some applications, which is a real thing, so I'd rewrite your GitHub readme to just demonstrate what problems you've solved with it (and at what kind of scale). Making big, sweeping claims gets you into these kinds of threads, because extraordinary claims require extraordinary evidence :-)
But since you seem to like arguments from authority, I've got around 25 years' experience in software ranging from hard-real-time embedded defence software to safety-critical train braking systems. I've been software architect on systems selling tens of millions of products, currently working in the IoT space. I've architected and implemented software on servers, desktops, embedded and mobile platforms.
But no, you aren't likely to find my stuff on GitHub.
It's the internet, so have to make sure.
Most of my stuff is closed source as well.
Iot you say?
I've made a couple of things myself, mostly using the micropython stack or the raspberry pi.
What do you generally work on?
If you're interested in IoT (or embedded s/w in general), get away from MicroPython.
The primary characteristic of most embedded products is to be low-cost. When you're selling millions of products, cost counts. You can't waste cycles or resources on Python.
MicroPython is a toy for the 'makers'. Similarly the JS equivalents. No real high volume product would use those technologies.
https://github.com/pycampers/zproc
I use it in a couple of my own projects which I haven't open sourced yet.
I do plan to release them, and I hope I can prove the usefulness of the library using those...
professional (corporate) stuff would be a far fetch for me, obviously.
Won't argue about pyboard, because that's more of a gimmick. (Way too expensive for what it does)
I think the cost and time saved by developing on micropython rather than a lower level language like C would outweigh the cost associated with wasted CPU cycles.
However, I agree with the fact that no real product would use mpy right now in production because of the infancy of the project.
It certainly looks promising.
It's definitely NOT a toy.
OTOH JS is just a bad choice for this kind of work IMO. ( I'm a firm believer that JS is just a bad choice for anything in general, but IOT is just madness)
In embedded s/w, the cost of the h/w outweighs everything else. OK, so your 'easily developed' Python app will cost (say) 10K less to develop... but the resource requirements mean you need to go up $0.5 on the processor....
Oops, on your 500,000 devices you've suddenly wasted $250,000. All because you couldn't be bothered to save a few KB of RAM.
Python is a toy for embedded s/w.... don't get me going on catching bugs at runtime in an embedded system because you're using a dynamically typed language! Now you have to upgrade 500,000 devices over the air (cellular data costs) which will take a week, meanwhile your customers are fuming because their data is being lost!
seriously!
How is catching bugs at runtime and dynamically typed language related?
In fact I've had (a lot) better experience debugging mpy apps than Arduino ones (that's the only low level embedded experience I have)
I mostly just let stuff fail, and have them restart gracefully (like the erlang guys)
Do yourself a favour and learn your craft instead of trotting out obviously wrong statements to people who know better. That's not a way to make a good impression.
Oh yes, regarding Erlang, trust me, I'm aware of 'let-it-fail' and have implemented it in production on large distributed systems and trust me... that does not justify writing embedded s/w in a dynamically-typed language.
The "let it fail" strategy is quite useful