Technically it has not: even the core tools keep getting extended, usually incompatibly, in both the GNU and BSD lineages. Though it's pretty funny how much of the Rust community has been taken up with providing alternatives and replacements for "classic" (POSIX) utilities.
1. Saves typing -f, because it doesn’t support cut’s -b and -c modes (edit: actually -c is supported, I just didn’t see it);
2. Uses -f instead of -d, making it rather confusing for cut users;
3. Uses : instead of - for range specifications;
4. Offers an exclusive indexing mode;
5. Misses a bunch of other cut features (assuming coreutils cut).
Not sure I see much appeal...
Edit: Another thing I missed: regex separator instead of just character list.
The appeal is the same as replacing grep with a fancier searcher:
1. It has good and sensible defaults: field mode, and, unlike cut, it doesn't print the entire line when it's unhappy with the selection you asked for (that's worse error handling than ed). Edit: confirmed. If you give `choose` a nonsensical selection it doesn't echo the input back: ask `cut` for columns 10-15 of data with 3 columns and it prints the source as-is, while `choose` properly prints a bunch of empty lines. That alone makes it better than cut.
2. It works better on actual data, which is generally whitespace-separated rather than tab-separated, meaning cut requires preprocessing before it'll do anything of use
Can you massage cut or the data to fit? Yes, in the same way you can massage grep or your data to fit. That you don't have to and the utility behaves sensibly by default is appealing. This exact thing is one I've been thinking about for some time now, I'm glad somebody else agreed and did the legwork.
echo -e "foo bar baz" | choose -1 -2
Also I don't think that it's so much easier to use than cut. On the other hand every *nix system has cut so if you make scripts with it they are portable.
Because 99% of awk IRL use is just as a fancier cut.
It's very rare someone even sets a variable using awk. If you do it, you are a statistical rarity.
> Also I don't think that it's so much easier to use than cut. On the other hand every *nix system has cut so if you make scripts with it they are portable.
I, for one, never remember the syntax for cut. If "choose" gets a deb, I'll use it: Python slicing is something familiar to me.
I don't care if cut is on every unix system: if I have the possibility to install things on the machine, then I'll just install what I need. I have a script for that. If I don't, I'll google/man/--help GNU commands as usual.
And as for writing shell scripts, I use Python anyway.
You say "fancier", I say "working": since cut can't work on general whitespace without a pre-processing phase (e.g. tr), it simply doesn't work for the vast majority of the things I try to shove into it, and I pretty much always end up using awk instead.
Choose means my awk use will drop by 99% or so.
This actually makes choose a cut-killer for me. It can be frustrating having to figure out which delimiters to use - tabs or spaces? If spaces, you'll have to chain it with tr, or resort to awk.
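To make the delimiter complaint concrete, here is a minimal Python sketch. It is not how cut or choose are implemented, and both helper names are made up for illustration. It contrasts splitting on every occurrence of one delimiter character (roughly what a single-character delimiter gives you) with splitting on runs of whitespace:

```python
def split_single_delim(line, delim=" "):
    """Split on every occurrence of one delimiter character (cut-like, hypothetical helper)."""
    return line.rstrip("\n").split(delim)


def split_whitespace(line):
    """Split on runs of any whitespace, ignoring leading and trailing blanks."""
    return line.split()


line = "foo   bar\tbaz\n"
print(split_single_delim(line))  # ['foo', '', '', 'bar\tbaz']
print(split_whitespace(line))    # ['foo', 'bar', 'baz']
```

With the single-character split, repeated spaces produce empty fields and the tab is not a separator at all, which is why people end up piping through tr first.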
>However, the awk command is not ideal for rapid shell use
And
>cut is far from ideal for rapid shell use, because of its confusing syntax
Anything new is confusing until you learn enough to be comfortable.
>ranges are just plain difficult to get right on the first try
And how does choose become easier to use with a ':' character instead of '-'?
Is this a typo or does inclusive/exclusive depend on whether first number is specified?
>choose 2:5 # print everything from the 2nd to 5th item on the line, _inclusive_ of the 5th
>choose :3 # print the beginning of the line to the 3rd item _exclusive_
$ echo "1,2,3,4,5" | cut -d , -f 1,3
1,3
$ echo "1,2,3,4,5" | cut -d , -f 3,1
1,3
$ echo "1,2,3,4,5" | choose -f , 2 0
3 1
$ echo "1,2,3,4,5" | choose -f , 2:0
3 2 1
Note that the indexing starts with 0, "-d" is "-f", and a range is denoted by ":" instead of "-", which is used for indexing from the end.
awk -F, '{print $3","$1}'
awk uses $0 as the whole line, and $1 as the first field.
cut uses -f1 as the first field.
$1 is the first argument to a POSIX shell script.
\1 is the first matched reference in sed.
$1 is the first regex match in Perl.
A command-line tool being 0-indexed breaks from expectation of what everybody is used to using on the command line.
$ echo " a b c" | choose 1 2
b c
$ echo " a b c" | python3 -c 'import sys; [print(f[1], f[2]) for line in sys.stdin if (f := line.split()) or True]'
b c
This pain is real: https://xkcd.com/1168/
If it is more comfortable to use for some people then it’s a great invention.
echo -e "foo bar baz" | choose -1 -2
With cut:
echo -e "foo bar baz" | xargs | cut -d' ' -f1,2
But now, I install their successors, ripgrep and fdfind, on all my machines. Including the Windows ones.
I would even suggest that the command add the ability to invert ranges, byte selection (if -c is character rather than byte selection), examples for character splitting in the README, etc.
Just off the top of my head, fex and miller are alternative cut-likes with field extraction.
Unless a new unix-tool-alike is significantly better and backwards-compatible, HN and most greybeard nixers tend toward conservatism. The old tools are usually good for 95% of use-cases anyway. There's just way too much to keep track of if you're eager to switch to any old shiny new thing.
I hope it's a typo given:
> choose -3:-1 # print the last three items from a line
is clearly inclusive (otherwise it'd print all but the last one), and there's a very explicit flag for inclusive ranges. Might be a good idea to open an issue just in case.
I'm not sure how to feel about using Python's range syntax with different inclusivity (by default) though.
edit: after installing and testing, it does seem like an error in the README: the end is inclusive whether or not a start is provided. That is, `:3`, `0:3` and `1:3` all include the 4th field.
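For anyone who wants to see the difference from Python slicing spelled out, here is a small sketch of an inclusive-end range next to a plain slice. `select_inclusive` is a hypothetical helper written for this comment, not choose's code, and it only handles forward ranges (nothing like `2:0`):

```python
def select_inclusive(fields, start=None, end=None):
    """Return fields[start..end] with the end index included (hypothetical helper)."""
    # An inclusive end of -1 (or no end) means "through the last field".
    stop = None if end is None or end == -1 else end + 1
    return fields[start:stop]


fields = "a b c d e".split()
print(fields[:3])                          # ['a', 'b', 'c'], Python slice excludes index 3
print(select_inclusive(fields, None, 3))   # ['a', 'b', 'c', 'd'], like `:3` as tested above
print(select_inclusive(fields, 1, 3))      # ['b', 'c', 'd']
print(select_inclusive(fields, -3, -1))    # ['c', 'd', 'e'], the last three items
```

So the syntax looks like Python's, but under inclusive semantics `:3` returns one more field than `fields[:3]` would.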
I'd more "formally" define 0/1 indexing as:
Zero indexing: arr[0] is a valid way to address the first element of an array, and len(arr) - 1 is the index to the final element.
One indexing: arr[0] results in an error or an out of bounds access, and len(arr) is the index to the final element.
These statements are true in Matlab, but not most command line tools.
- the default separator is "\s", like python's split(). Just for that I will adopt it: not having to care about tabs/spaces/mixes is a much better experience.
- it has negative indexes, again like python. Getting the last field, or the nth-from-last field, is common enough. I don't want to rewrite the thing with a twisted double "rev" and the proper index. And I don't want to have to google it.
- plus the syntax is just much easier for me to remember. When I use cut, I always try "echo 'foo bar baz' | cut 2", just to realize that I need to pass '-f'. Then I do "cut -f 2", get stumped, and google it, only to remember I need to pass the delimiter explicitly even if it's a space.
- it works the same on windows. I dual boot.
Compare:
echo -e "foo bar baz" | choose -1
To:
echo -e "foo bar baz" | rev | cut -d ' ' -f 1 | rev
cut is, to me, the opposite of a friendly API.
Something so basic in the Unix world should have sane defaults.
Defaults are not sane if I have to google them every other time.
If you give cut columns which don't exist, it's going to output the entire source as-is.
In fact, anybody promoting cut, please give me the cut version of:
echo -e "foo bar baz" | choose -1 -2
It should work on an arbitrary number of spaces, and fields.
The oneliner is going to be... interesting.
Now you can do it with awk using:
echo -e "foo bar baz" | awk '{ print $NF " " $(NF-1)}'
But it's neither easy to type, nor to remember.
Choose is what cut should have been.
`echo -e "foo bar baz" | tr -s ' ' | rev | cut -d ' ' -f 1-2 | rev | awk '{print $2 " " $1}'`
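For comparison, the same "last field, then second-to-last" request is short in Python because of negative indexes and whitespace splitting. This is only a sketch of the behaviour being asked for: `pick` is a made-up helper, and silently skipping out-of-range indexes is an assumption of this example, not a statement about what choose or awk do.

```python
def pick(line, *indexes):
    """Return the whitespace-separated fields at the given indexes, in the order asked.

    Negative indexes count from the end. Out-of-range indexes are skipped
    (an assumption made for this sketch).
    """
    fields = line.split()
    n = len(fields)
    return " ".join(fields[i] for i in indexes if -n <= i < n)


print(pick("foo   bar    baz", -1, -2))  # baz bar
```

It works on any number of spaces and any number of fields, which is the part the cut version struggles with.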
Everything except the awk part is something that I use all the time and is easy to type & remember.
To be honest I'd use `choose` if it was available everywhere, but for string manipulation I can't justify using nonstandard tools since they aren't always available.
Every now and then there are some new ones I actually start to use. For example `ripgrep` mostly replaced `grep -R` for me some time ago. A lot of that has to do with the fact that if `rg` is not found I can fall back to normal grep and get the same result, just a bit slower.
I guess my point is that while I do appreciate innovation & making better tooling, the hard part always is getting the tool where it's most needed.
echo ... |while read a b x ;do ... ;done