Hot code reloading with Erlang

Hot code reloading with Erlang (medium.com)

73 points by kansi 10 years ago | 29 comments

skrebbel 10 years ago |

I wonder what HN's devops people think about this wrt the current trend of containers and immutable infrastructure. Hot code reloading seems to be directly at odds with the idea of immutable architecture, because essentially the application code becomes state. So your container becomes stateful, instead of you swapping out your old appserver container for a new one.

What's your opinion? Ditch Docker and put the Erlang VM on the host OS? Ditch hot code loading and swap containers the usual way? Some middle ground?

greenleafjacob 10 years ago | |

Hot code reloading is always more work than just blue-green and should be avoided if possible. For example, the author of Learn You Some Erlang writes [1]:

> if you can avoid the whole procedure (which will be called relup from now on) and do simple rolling upgrades by restarting VMs and booting new applications, I would recommend you do so.

Erlang grew out of challenges faced by the telecom industry, such as: what do you do when blue-green isn't an option? Think of an in-use packet switch that is the only point of contact between two networks. There is no way to take the switch down for maintenance without some interruption in service, which gets messy when dealing with timeouts. In his thesis, Armstrong gives another example [2]:

> Usually in a sequential system, if we wish to change the code, we stop the system, change the code and re-start the program. In certain real-time control systems, we might never be able to turn off the system in order to change the code and so these systems have to be designed so that the code can be changed without stopping the system. An example of such a system is the X2000 satellite control system developed by NASA.

This power comes at a cost, though. LYSE again:

> It is said that divisions of Ericsson that do use relups spend as much time testing them as they do testing their applications themselves. They are a tool to be used when working with products that can imperatively never be shut down.

The point being, hot code reloading is an additional feature that can come in handy, but for most of HN's audience it probably won't be relevant; its cost outweighs its benefit compared with a plain blue-green deploy.

[1] http://learnyousomeerlang.com/relups#the-hiccups-of-appups-a...

[2] http://www.erlang.org/download/armstrong_thesis_2003.pdf

stingraycharles 10 years ago | |

On the contrary, the implementation of Erlang's hot code reloading forces state to be separated from the code. If you look at Erlang's gen_server, every call requires you to return a new State object, which is passed to the next function call.

In other words, you can compare the Erlang virtual machine with a container itself, and everything old is new again!
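
To illustrate the separation of state from code described above, here is a minimal sketch in Ruby rather than Erlang. It only mimics the shape of a gen_server callback (take a request and the current state, return a reply and the new state); the names `CounterV1`, `CounterV2`, `Server` and `upgrade` are made up for this example and are not part of OTP.

```ruby
# Hypothetical sketch: the server owns the state, the handler module owns
# the code. Because every call returns the next state explicitly, the
# handler can be replaced between calls without touching the state.
module CounterV1
  # Returns [reply, new_state], loosely like a gen_server handle_call result.
  def self.handle_call(request, state)
    case request
    when :increment then [:ok, state + 1]
    when :get       then [state, state]
    else                 [:unknown_request, state]
    end
  end
end

module CounterV2
  # "Upgraded" code: increments by 10. The state format is unchanged.
  def self.handle_call(request, state)
    case request
    when :increment then [:ok, state + 10]
    when :get       then [state, state]
    else                 [:unknown_request, state]
    end
  end
end

class Server
  attr_reader :state

  def initialize(handler, state)
    @handler = handler
    @state = state
  end

  # Swap the code; the state is left alone.
  def upgrade(new_handler)
    @handler = new_handler
  end

  def call(request)
    reply, @state = @handler.handle_call(request, @state)
    reply
  end
end

server = Server.new(CounterV1, 0)
server.call(:increment)     # state is now 1 (old code)
server.upgrade(CounterV2)
server.call(:increment)     # state is now 11 (new code, same state)
puts server.call(:get)      # prints 11
```

A real gen_server also gets a chance to convert the state when the code changes; this sketch assumes the state format stays the same across versions.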

jlouis 10 years ago | |

Hot code reloading is best used for a scenario where you cannot afford to restart your system, usually because it drags a lot of internal state around and reconstructing that state is expensive.

Typical use cases include several gigabytes of in memory state which takes a long time to read in and get hot when redeploying or a large amount of long-running TCP connections.

For most other uses, we just do rolling upgrades in Erlang as everyone else is doing. It is somewhat simpler to get to work, and immutable architecture is to a certain extent easier to manipulate.

dm3 10 years ago | |

You have to start with the problem. If your problem is solved by highly available and stateful services, then the Erlang VM on the host seems like a good idea. If the availability doesn't matter that much or the services can be made stateless without much pain - go for the containers.

Ixiaus 10 years ago | |

As always: depends on the use case.

We're using Erlang as the primary language environment for our IoT product for a lot of reasons but one big one is: Hot code loading and a very robust release upgrade environment with a lot of control over the process (including restarting everything inside the VM if that's what we wish to do).

For our product, a digital light switch / dimmer, high uptime guarantees are a very important requirement, and Erlang has it all plus many other wonderful features.

Fuddh 10 years ago |

Attended a talk by one of the creators of Erlang a couple of weeks ago. Very passionate about achieving maximum uptime for applications written in his language. This is one of the features that makes that possible... Fascinating stuff.

stevegh 10 years ago |

Interesting. The hot code reloading functionality in Erlang led me to investigate Ruby (my preferred dev language) a bit more.

You can do a hot code load in Ruby using the Kernel#load() call. It won't alter functionality currently on the call stack, but it will change the functionality of everything not on the call stack. With some sympathetic design, you can achieve hot code loading for high availability in Ruby.
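
A small sketch of the call-stack point, assuming a throwaway file written with Tempfile; the method name `greet` and the `$log` global are invented for the demo. The method redefines itself while it is still running: the in-flight call finishes the old body, and only the next call sees the new one.

```ruby
require "tempfile"

$log = []

def greet
  $log << "old: start"
  # Redefine greet while greet itself is still on the call stack.
  Tempfile.create(["greet_v2", ".rb"]) do |f|
    f.write("def greet\n  $log << 'new'\nend\n")
    f.flush
    load(f.path)
  end
  $log << "old: end" # the in-flight call still finishes the old body
end

greet # runs the old body to completion
greet # runs the new body
puts $log.inspect # ["old: start", "old: end", "new"]
```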

pmontra 10 years ago | |

You can use that to replace code by monkey patching:

    $ cat hi.rb 
    def method
      puts "hi"
    end
    method
    load("hello.rb")
    method

    $ cat hello.rb 
    def method
      puts "hello"
    end

    $ ruby hi.rb 
    hi
    hello

You must engineer your application to execute the load method and that's it. However I wonder if this is really equivalent to what Erlang does. I remember http://rvirding.blogspot.it/2008/01/virdings-first-rule-of-p...

pmontra 10 years ago | | |

Interesting post at http://blog.rkh.im/code-reloading

amelius 10 years ago |

> Hot code loading is the art of replacing an engine from a running car without having to stop it.

Except you can clone the car into a controlled environment, and test the whole procedure, before doing the actual replacing.

guiomie 10 years ago |

This is cool. What type of scenarios could you not afford a few seconds of downtime on a server? For example, why not simply remove a machine from the cluster/nlb and upgrade it, then add it back ...?

yetihehe 10 years ago | |

When your server has several gigs of state. It's VERY useful on a dev server. Instead of waiting several minutes for a reload, I just load in new code manually (typically I change only 1-2 files per reload). If something breaks - hey, it's only a dev server. Erlang's other feature - almost everything works in isolation - helps with this. If something breaks, it breaks only in one place, so most of the time I only need to make small changes and reload once more. The rest of the system does what it needs to without any degradation.

simoncion 10 years ago | | |

> Instead of waiting several minutes for reload, I just load in new code manually (typically I change only 1-2 files per reload). If something breaks - hey, it's only dev server.

Someone wrote a module for Elixir that uses inotify (and similar) to -I think- watch .beam files for modification and perform the required hot-reloads automatically.

I would be reluctant to run this in production, and I can see situations (even in development) where this could trigger unwanted code purging and would be disastrous, but it's a pretty neat thing to have and -it seems- a must for Web Dev people.
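
The watch-and-reload idea can be sketched roughly like this. It is not that Elixir module: it is a Ruby approximation that polls a file's mtime instead of using inotify and reloads with Kernel#load. The `Reloader` class and the `greeting` method are hypothetical, and the sketch does nothing about the purging problem mentioned above.

```ruby
require "tmpdir"

# Hypothetical helper: reload a file with Kernel#load whenever its
# modification time changes. A real watcher would subscribe to file system
# events and call check from there; here check is called by hand.
class Reloader
  def initialize(path)
    @path = path
    @last_mtime = nil
  end

  # Returns true if the file was (re)loaded on this check.
  def check
    mtime = File.mtime(@path)
    return false if mtime == @last_mtime

    load(@path)
    @last_mtime = mtime
    true
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "greeting.rb")
  File.write(path, "def greeting; 'hi'; end\n")

  reloader = Reloader.new(path)
  reloader.check # first check always loads the file
  puts greeting  # hi

  File.write(path, "def greeting; 'hello'; end\n")
  # Force a visibly newer mtime so the demo does not depend on timer resolution.
  File.utime(Time.now, Time.now + 1, path)
  reloader.check # mtime changed, so the file is loaded again
  puts greeting  # hello
end
```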

jgalt212 10 years ago |

You know what would be really amazing is if you could restart the Erlang VM, or load a new VM, without interrupting any of the running code modules.

dozzie 10 years ago | |

That's what distributed Erlang is for.

simoncion 10 years ago | |

What -exactly- do you want to do when you say you want to restart the Erlang VM?

I'm asking because I don't have enough context to know why you want to do what you're asking to do.

toast0 10 years ago | | |

Not the original asker, but it would be nice to be able to upgrade between OTP releases without having to restart my application (but I have no expectation of that ever being possible when there are VM changes, and it's unlikely that the effort will be spent to do it for the user-space Erlang bits either). I have to use DNS for load balancing [1] and big Mnesia tables, so I have to wait a long time for traffic to drain, and then another long time for the application to start back up.

Working for 4 years in an Erlang environment where hotloading is the norm makes me wish for it everywhere! Why do I have to reboot to fix kernel bugs in TCP? :(

[1] the load balancers I have access to where we host had more downtime than our hosts, so not actually helpful

jgalt212 10 years ago | | |

I would assume the longer any VM is running the higher the chances of a service degrading. I guess this is mostly due to memory leaks or bit rot. I have no Erlang VM experience, so my comment was geared towards VMs in general.

Grue3 10 years ago |

Seems like a lot of work. In Common Lisp I can just press C-c C-c in SLIME over the changed function and it goes live.

    Eshell V7.0 (abort with ^G)
    1> l(inet).
    {error,sticky_directory}

    =ERROR REPORT==== 6-Dec-2015::03:37:00 ===
    Can't load module 'inet' that resides in sticky dir
    2> code:which(inet).
    "/usr/lib/erlang/lib/kernel-4.0/ebin/inet.beam"
    3> code:unstick_dir("/usr/lib/erlang/lib/kernel-4.0/ebin/").
    ok
    4> l(inet).
    {module,inet}
    5>