Use Haskell for shell scripting

Use Haskell for shell scripting (haskellforall.com)

217 points by stefans 11 years ago | 103 comments

cies 11 years ago |

Thanks Gabriel Gonzalez! There is a comment on the blog post (by Chris Done) asking how it deals with piping. I really wonder about that too.

Some related projects:

- Joey Hess recently released a nice Haskell-to-sh compiler. I like this approach as the resulting sh scripts are runnable on pretty much every *nix. https://joeyh.name/blog/entry/shell_monad/

- Chris Done also released a lib to do shell stuff from Haskell, which builds on the conduit library http://chrisdone.com/posts/shell-conduit

- Chris also wrote a shell in Haskell https://github.com/chrisdone/hell

- Then there is Shelly by Greg Weber https://github.com/yesodweb/Shelly.hs

There are probably more...

Gabriel439 11 years ago | |

You use `inproc` and `inshell` for piping. For example, here's the type of `inshell`:

    inshell
        :: Text        -- Shell command
        -> Shell Text  -- Standard input to feed command
        -> Shell Text  -- Standard output produced by command

I made one intentional simplification in the API, which was to not provide a way to capture standard error. It's definitely possible to provide such a utility, but I wanted to simplify things as much as possible in the first release before the slow onslaught of feature cruft begins. If there were such a utility, it would have this type:

    both
        :: Text        -- Shell command
        -> Shell Text  -- Standard input to feed command
        -> Shell (Either Text Text)

... and you could selectively listen to just stderr or stdout by taking advantage of the fact that pattern match failures short-circuit downstream commands:

    Left txt <- both -- only read stderr
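
The short-circuiting on a failed pattern match can be tried without turtle. Below is a minimal sketch that uses plain lists as a stand-in for `Shell` (an assumption for illustration, since `both` is hypothetical); `taggedLines`, `stderrOnly`, and `stdoutOnly` are made-up names:

```haskell
-- Plain lists stand in for turtle's Shell here (an assumption for illustration).
-- Left = a line from stderr, Right = a line from stdout.
taggedLines :: [Either String String]
taggedLines =
    [ Right "compiling"
    , Left "warning: unused import"
    , Right "done"
    ]

-- A failed pattern match in the list monad yields no result,
-- so only the Left (stderr) lines reach the rest of the block.
stderrOnly :: [String]
stderrOnly = do
    Left txt <- taggedLines
    return txt

-- The same idea, keeping only the Right (stdout) lines.
stdoutOnly :: [String]
stdoutOnly = do
    Right txt <- taggedLines
    return txt

main :: IO ()
main = do
    mapM_ putStrLn stderrOnly
    mapM_ putStrLn stdoutOnly
```

Binding the value without a pattern (`line <- taggedLines`) keeps both kinds of lines, each still tagged with its origin.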

There is one more shell library that I know of: `process-streaming`. I actually didn't know about `shell_monad` (that's the one most similar in spirit to what I wrote).

The main reason I rolled my own library is that this was written with the specific audience of people who didn't know any Haskell, but were comfortable with Python or Bash. My actual goal is to convince people internally at Twitter to use Haskell instead of Python for large scripts. I reviewed all those libraries (with the exception of shell_monad) to see if I felt comfortable marketing them to non-Haskell programmers and none of them felt like the right level of abstraction to me. I almost ended up going with Shelly, but in the process of polishing shelly for internal usage I found myself continually wrapping things with better names, different types, and providing missing features to get a single import umbrella, so I just stopped and asked: "why not just do this as a cohesive single library instead?". Also, `shelly` does not provide any `IO`-only commands: everything has to be wrapped in the `Sh` monad.

As for the other libraries, `shell-conduit` was too complex for new users in my opinion and `hell` is not embedded within Haskell (it's a separate language), and I wanted to keep the features of Haskell. I still need some more time to review `shell_monad` to see if I made a mistake by ignoring it.

danidiaz 11 years ago | | |

"process-streaming" is more like a set of helper functions for "process"; it doesn't provide formatting, regexps, or OS-independent implementations of typical shell commands. It does support piping of processes, though.

kenko 11 years ago | | |

Why `Either Text Text`? What if you're interested in both stdout and stderr?

cies 11 years ago | | |

Thanks for this elaborate response.

Doji 11 years ago |

The tutorial does a great job of explaining why this is interesting: http://hackage.haskell.org/package/turtle-1.0.0/docs/Turtle-...

For example, the pwd function returns a FilePath type rather than a String:

  Prelude Turtle> :type pwd
  pwd :: IO Turtle.FilePath

The datefile function is also typed:

  Prelude Turtle> :type datefile
  datefile :: Turtle.FilePath -> IO UTCTime

So this really does seem to structure the data passed between commands, instead of the "stringly typing" unix shells have historically been known for.

TazeTSchnitzel 11 years ago | |

Are those types just aliases of String?

throwaway283719 11 years ago | | |

No. For example, a FilePath is (after resolving a few other type aliases)

  data Root
      = RootPosix
      | RootWindowsVolume Char
      | RootWindowsCurrentVolume

  data FilePath = FilePath
      { pathRoot        :: Maybe Root
      , pathDirectories :: [String]
      , pathBasename    :: Maybe String
      , pathExtensions  :: [String]
      }

marcosdumay 11 years ago | | |

What do you mean by "alias"?

Path carries String-like information; it can even be easily converted to and from strings. Yet it's a strong type that won't let you write something like 'path </> file_contents' (although, with overloaded strings, you can do 'path </> "file_name"').

UTCTime is not String-like.
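
A rough sketch of that distinction, using only `base`. The `Path` newtype and the `</>` operator here are simplified stand-ins invented for illustration, not the real definitions used by turtle:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.String (IsString (..))

-- Hypothetical, simplified path type: a wrapper the type checker
-- treats as distinct from String.
newtype Path = Path String
    deriving (Show, Eq)

-- Lets string literals be used where a Path is expected.
instance IsString Path where
    fromString = Path

-- Hypothetical join operator; accepts only Path on both sides.
(</>) :: Path -> Path -> Path
Path a </> Path b = Path (a ++ "/" ++ b)

fileContents :: String
fileContents = "Hello, world!"

main :: IO ()
main = do
    let dir = "/tmp" :: Path
    print (dir </> "test")          -- the literal is read as a Path
    -- print (dir </> fileContents) -- rejected: String is not Path
    putStrLn fileContents
```

Uncommenting the marked line makes compilation fail with a type error, which is the point being made above.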

barrkel 11 years ago |

OK. How do you easily fork to run a command in the background? How does setting up pipes work? What's the idiom for chdir'ing to a subdirectory such that you pop back out again when you're done (I'd use a subshell with (cd xxx; ...) in bash)?

Getting into more tricky stuff, what's the equivalent of <() in bash?

This doesn't really demonstrate anything that shell scripts are actually written for: orchestrating and composing other processes, and job control.

If you wanted to leverage type checking for safety, it would be more interesting to typecheck the streams input and output by pipes.

S4M 11 years ago |

I don't know much about Haskell, but I thought it had some properties to isolate side effects, but the code he gives:

    main = do
        cd "/tmp"
        mkdir "test"
        output "test/foo" "Hello, world!"  -- Write "Hello, world!" to "test/foo"
        stdout (input "test/foo")          -- Stream "test/foo" to stdout
        rm "test/foo"
        rmdir "test"
        sleep 1
        die "Urk!"

Clearly doesn't (it creates a directory, writes to a file, removes that file and that directory all in one go without anything indicated by the function main). Is it because it's the main function of the program, or am I missing something?

dkarapetyan 11 years ago |

Who's the target audience of this exactly? I already see a language pragma, do notation, liftIO, parser combinators.

Hamming has this great set of lectures on how he became a world-renowned scientist, and in one of the lectures he explains why Ada failed and other languages succeeded. The difference was that Ada was designed logically and most successful languages were designed psychologically. Even when government contracts mandated Ada, people still wrote in Fortran and hand-translated to Ada. You can watch the videos and take from it what you will.

A minimal bash file is `#!/bin/bash`. A minimal turtle file is already way too long and logical.

The set of videos: https://www.youtube.com/playlist?list=PL2FF649D0C4407B30.

boothead 11 years ago |

For all the considerable awesomeness that Gabriel produces, I always think the best part is the *.Tutorial module he includes. I always learn a lot and it's always a great overview that puts the work in context.

Everyone should do this!

chrisBob 11 years ago |

After learning Perl I started using it where some more educated people might recommend a proper shell script. My thinking is that using what you know is a whole lot more efficient than learning a new tool for a small job, even if some people think it is the right tool. I am sure it is no different for people familiar with Haskell.

loudmax 11 years ago | |

I do a lot of shell scripting, and I'm not sure there is such a thing as a "proper" shell script. The shell just isn't a great programming language. Just about any modern scripting language is better, starting with Perl. But the shell has been the lingua franca of the Unix world for decades now. It's the one language that you can pretty much guarantee is on any Unix or Linux server, even pretty ancient ones.

I don't doubt that Haskell is a better scripting language than the shell, but you can't assume /usr/bin/env runhaskell is going to return anything on random Linux servers. Perl and Python, maybe, but Haskell isn't there yet.

klibertp 11 years ago | | |

> you can pretty much guarantee is on any Unix or Linux server, even pretty ancient ones

Well, yes and no. You can get reasonable compatibility with different Unix flavours if you stick to sh. Your script is not going to work on BSDs once you start using bash specific features, though.

Fun fact: on FreeBSD bash does not live in /bin/bash, it's in /usr/local/bin/bash. Every time you write a shebang with /bin/bash hardcoded you're making your script harder to use there.

Perl is everywhere almost by default and it's more compatible as it has just one implementation, without sh/bash/csh/ksh/tcsh/zsh madness. I'd say it's a good idea to use Perl instead of shell script for anything more complicated than a few lines of code if it's meant to be portable. (And I'm not a Perl programmer at all).

Gabriel439 11 years ago | | |

Note that you only need `/usr/bin/env runhaskell` if you want to interpret the script. You can also compile the script as a native binary, which is the recommended approach on Windows.

falcolas 11 years ago | |

Agreed. I look upon people who use PHP for shell scripting with a sigh and a shaken head, but I can't fault them for it. PHP works, PHP is quick to write, and for many tasks, PHP is sufficient.

I hope Haskell can gain traction in this area, if only because options are always nice to have, and competition forces everyone to bring their best game.

mercurial 11 years ago |

I like the Pattern thing. However, it seems to me that you're going to quickly run into trouble if you need to even vaguely emulate shell scripting. Shell utilities live and die by their options. It's unfortunate Haskell supports neither named arguments nor default values. Which means that in order to emulate options, you would need to pass records to your "shell" utility, which, on top of being cumbersome, forces you to prefix every option in a way unique to your utility, since you cannot have two records with the same fields in the same namespace...

falcolas 11 years ago |

Please forgive my lack of familiarity with the concurrent workings of Haskell, but since the Shell streams are based off []/IO, and not Concurrent.Chan, does this mean one turtle function has to complete (and write its results to memory) before the next turtle function can run?

To me, the magic bits of shell scripts which turtle would need to improve upon, were it to replace said scripts, are not the loop constructs, conditionals, or even the type system (even though it's completely lacking in bash); it is the ability to use pipes to link processes concurrently.

joeyh 11 years ago | |

The streaming section shows some examples of combining turtle functions; this will be the same as shell pipes.

There's also nothing stopping you from using forkIO to spark off a separate thread, and doing IO in multiple threads concurrently.

Haskell's IO manager allows multiple threads to do concurrent IO in what looks like an imperative, one-instruction-after-the-other manner, instead of the async callbacks you might expect from other languages.
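
As a small illustration of the `forkIO` suggestion, here is a sketch that uses only `base`. The helper name `sumInBackground` and the work it does are made up for the example; an `MVar` is just one way to collect the result:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical helper: do some work on a separate thread and
-- hand the result back through an MVar.
sumInBackground :: IO Int
sumInBackground = do
    done <- newEmptyMVar
    _ <- forkIO $ do
        let total = sum [1 .. 100 :: Int]
        putMVar done total      -- signal completion with the result
    putStrLn "main thread keeps going"
    takeMVar done               -- blocks until the worker has finished

main :: IO ()
main = do
    result <- sumInBackground
    print result
```

The code reads top to bottom like a sequential script even though the sum is computed on another thread.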

Klasiaster 11 years ago |

For me, combining the best parts of bash and ipython is the way to go. Up to now this seems more comfortable to me than using subprocess in python or this haskell approach, which needs to be aware of every program's output to deliver what it promises. You can easily copy big parts of existing bash scripts and, e.g., add error handling the python way :) I even think for loops/list comprehensions are better than the strange bash syntax.

And here is a short example:

  #!/usr/bin/env ipython3
  #
  # 1. echo "#!/usr/bin/env ipython3" > scriptname.ipy    # creates new ipy-file
  #
  # 2. chmod +x scriptname.ipy                            # make it executable
  #
  # 3. starting with line 2, write normal python or do some of
  #    the ! magic of ipython, so that you can use shell commands
  #    within python and even assign their output to a variable via
  #    var = !cmd1 | cmd2 | cmd3                          # enjoy ;)
  #
  # 4. run via ./scriptname.ipy - if it fails with recognizing % and !
  #    but parses raw python fine, please check again for the .ipy suffix which must be there!
  #
  # ugly example, please go and find more in the wild
  files = !ls *.* | grep "y"
  for file in files:
    !echo $file | grep "p"
  # sorry for this nonsense example ;)
  # it's even possible to access the output of a command by outputvariable.s, .p or .n
  # see file:///usr/share/doc/ipython-doc/html/interactive/reference.html#system-shell-access

Better take a look here, it's more complete: https://blog.safaribooksonline.com/2014/02/12/using-shell-co...

codygman 11 years ago | |

Oh, I'm going to have to see if I can use Turtle with IHaskell tomorrow!

0: http://gibiansky.github.io/IHaskell/

1: https://registry.hub.docker.com/u/gregweber/ihaskell/

tel 11 years ago |

But why "turtle"?

strager 11 years ago | |

Turtle shell.

tel 11 years ago | | |

And in retrospect that's quite obvious, hah!

I spent the whole time trying to think how this was connected to LOGO.

npsimons 11 years ago |

Nice! Now I can add Haskell to my list of languages I can script with.

I'm always on the lookout for new languages I can script with (or at least get closer to rapid prototyping) for easier learning, testing, problem solving, etc. I've got templates that I run against linters, style checkers, etc for many languages and it will be helpful to have even more options.

agumonkey 11 years ago |

It's not closely related but still, it reminded me of the wonderful https://pypi.python.org/pypi/sh to write 'shell' script in python with very low boilerplate.

akurilin 11 years ago |

This is awesome, I was actually looking for something like that out of sheer curiosity, but perhaps it'll make it into production at some point.

fallat 11 years ago |

I've been pushing for alternative shell scripts for a while now. I mostly stick to Python and Haskell now. It is great. Highly recommended.

amelius 11 years ago |

I wonder why it uses the convention:

    stdout (input "test/foo")

instead of:

    output stdout (input "test/foo")

which would be expected considering the previous line.

qznc 11 years ago |

Haskell is low on boilerplate? Yes, in general I would agree. Those scripts however, all have to be prefixed with "{-# LANGUAGE OverloadedStrings #-} import Turtle main = do". This is tedious boilerplate.

sukilot 11 years ago | |

You could trivially have a wrapper program that added that to every script file before calling runhaskell.
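
A minimal sketch of the text-rewriting half of such a wrapper, using only `base`. The header lines are the ones quoted in the parent comment; `header` and `wrapScript` are hypothetical names, and indenting the body so it sits inside the `do` block is an assumption about how the script files would be written:

```haskell
-- Boilerplate quoted in the parent comment.
header :: [String]
header =
    [ "{-# LANGUAGE OverloadedStrings #-}"
    , "import Turtle"
    , "main = do"
    ]

-- Prepend the header and indent every body line so that it
-- becomes part of the do block.
wrapScript :: String -> String
wrapScript body = unlines (header ++ map ("    " ++) (lines body))

main :: IO ()
main = putStr (wrapScript "echo \"Hello, world!\"\n")
```

A real wrapper would presumably write the result to a temporary file and pass that to runhaskell; that part is left out here.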

joelthelion 11 years ago |

I want to see how you implement the pipe :)

meekins 11 years ago |

q3k 11 years ago |

I don't really see the point of this, apart from academic research values.

POSIX shell is everywhere - your current Linux and OS X machines, old UNIX workstations, home routers, servers... Just drop in a file and it will probably run just fine, unless the author screwed something up completely. POSIX shell scripts are the perfect bootstrap mechanisms that will run almost anywhere regardless of architecture.

Haskell, on the other hand, is rarely present in an operating system - if you absolutely, positively need a higher-level language for "shell scripting", then you have a much higher chance of finding a Perl interpreter, or even Python. Heck, even getting ghc and its basic ecosystem running has always proved to be a huge burden to me. Try sticking a `cabal install` in your CI flow, you'll see your job times increase by hours.

Third, there's just the KISS aspect of it - if you're writing something that has logic so simple it can be stuck in a shell file, why not just write it in a shell file? You don't need category theory to get a few files installed...

    import Control.Exception
    import System.Directory

    withDirectory :: FilePath -> IO a -> IO a
    withDirectory path action =
        bracket (getCurrentDirectory <* setCurrentDirectory path)
                setCurrentDirectory
                (const action)

    import Prelude hiding ((-))

    data Grep = Grep {isRecursive :: Bool, maxCount :: Maybe Int} --etc
        deriving (Show)

    grep = Grep False Nothing

    --short pseudonym
    r :: Grep -> Grep
    r command = command{isRecursive = True}

    m :: Int -> Grep -> Grep
    m num command = command{maxCount = Just num}

    (-) :: a -> (a -> a) -> a
    (-) command flag = flag command

    ourGrep = grep -m 50 -r

    main = print ourGrep
    -- > Grep {isRecursive = True, maxCount = Just 50}
    --then we should write a monad which executes that data
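
One possible next step for the `Grep` record, sketched with only `base`: turning it into an argument list before anything gets executed. The function name `toArgs` is hypothetical, the record is repeated so the block stands alone, and the mapping to `-r` and `-m` assumes the usual grep flags:

```haskell
-- Same record as in the snippet above, repeated here so the
-- example compiles on its own.
data Grep = Grep
    { isRecursive :: Bool
    , maxCount    :: Maybe Int
    }
    deriving (Show)

-- Hypothetical rendering of the options as command-line arguments.
toArgs :: Grep -> [String]
toArgs g =
    [ "-r" | isRecursive g ]
        ++ maybe [] (\n -> ["-m", show n]) (maxCount g)

main :: IO ()
main = print (toArgs (Grep True (Just 50)))
-- prints ["-r","-m","50"]
```

The resulting list could then be handed to whatever actually spawns the process.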