and mark it as finished in the README.
This is another example showing how broken the GNU coreutils are:
$ echo -e "äää\tüüü\tööö" | gnu-expand -t "5,10,20"
äää    üüü    ööö
$ echo -e "äää\tüüü\tööö" | sbase-expand -t "5,10,20"
äää  üüü  ööö
This happens because they are still not UTF-8-aware and actually
treat "ä" as two separate characters (it is two bytes in UTF-8), so
"äää" counts as 6 columns and is padded with 4 spaces up to the tab
stop at column 10.
The correct behavior, however, is to pad "äää" with 2 spaces up to
the tab stop at column 5.
One can only imagine how this silently breaks a lot of code around
the world.
WHAT WERE THEY THINKING?
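The column arithmetic described above can be sketched in C. This is only an
illustration, not the actual sbase code: expand_utf8() and is_utf8_start()
are hypothetical names, every character is assumed to be one column wide,
and a tab past the last stop is assumed to become a single space.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if b starts a UTF-8 character, i.e. it is
 * not a continuation byte of the form 10xxxxxx. */
static int
is_utf8_start(unsigned char b)
{
	return (b & 0xC0) != 0x80;
}

/*
 * Expand tabs in the NUL-terminated string src into dst, using the
 * ascending tab stops in stops[0..nstops-1]. Columns are counted per
 * UTF-8 character, not per byte (assumption: one column per character).
 * Returns the number of bytes written, excluding the terminating NUL,
 * or (size_t)-1 if dst is too small.
 */
static size_t
expand_utf8(const char *src, char *dst, size_t dstlen,
            const size_t *stops, size_t nstops)
{
	size_t col = 0, out = 0, i, target;

	assert(src != NULL && dst != NULL && dstlen > 0);

	for (; *src; src++) {
		if (*src == '\t') {
			/* default: a single space past the last stop */
			target = col + 1;
			/* otherwise pad to the first stop beyond col */
			for (i = 0; i < nstops; i++) {
				if (stops[i] > col) {
					target = stops[i];
					break;
				}
			}
			while (col < target) {
				if (out + 1 >= dstlen)
					return (size_t)-1;
				dst[out++] = ' ';
				col++;
			}
		} else {
			if (out + 1 >= dstlen)
				return (size_t)-1;
			dst[out++] = *src;
			if (*src == '\n')
				col = 0;
			else if (is_utf8_start((unsigned char)*src))
				col++; /* continuation bytes add no column */
		}
	}
	dst[out] = '\0';
	return out;
}
```

With the stops 5, 10 and 20, "äää" advances the column counter to 3 rather
than 6, so the first tab is padded with 2 spaces up to column 5. Counting
bytes instead would skip past that stop and pad up to column 10.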
Be stricter while resolving escapes in the delimiter-string and
error out when it has length 0 or contains an invalid escape.
Thanks to Hiltjo Posthuma's sharp eagle eyes this bug was spotted.
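A minimal sketch of what such strict escape resolution could look like
follows. It is not the sbase implementation: resolve_delim() is a
hypothetical name and the set of accepted escapes (\\ \a \b \f \n \r \t \v)
is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: resolve backslash escapes in s in place.
 * Returns the length of the resolved string, or -1 if the result is
 * empty, the string ends in a lone backslash, or it contains an
 * escape that is not in the accepted set. On error, s may be left
 * partially rewritten.
 */
static long
resolve_delim(char *s)
{
	static const char names[]  = "\\abfnrtv";
	static const char values[] = "\\\a\b\f\n\r\t\v";
	const char *p;
	char *r, *w;

	assert(s != NULL);

	for (r = w = s; *r; r++) {
		if (*r != '\\') {
			*w++ = *r;
			continue;
		}
		r++;
		if (*r == '\0')
			return -1; /* trailing backslash */
		p = strchr(names, *r);
		if (p == NULL)
			return -1; /* unknown escape */
		*w++ = values[p - names];
	}
	*w = '\0';

	return (w == s) ? -1 : (long)(w - s);
}
```

A caller would treat a return value of -1 as a fatal usage error instead of
silently continuing with a broken delimiter.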
which we are not planning to include in sbase.
What's left to discuss is how we're going to handle them in the
tools (dump usage() or silently ignore them).
Having multibyte delimiters is not enough. For full flexibility,
the possibility of cutting input lines with arbitrary-length delimiters
is the real deal.
Given this functionality, it seems only reasonable to also add support
for resolving escapes.
Thanks to Truls Becken for making the suggestion and designing such a
flexible cut(1)-implementation!
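Field selection with an arbitrary-length delimiter can be sketched as
follows. This is a simplified illustration and not the cut(1) code from
sbase: find_field() is a hypothetical helper that searches for the
delimiter as a byte string with strstr(3).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: locate the 1-based field number want in line,
 * where fields are separated by the non-empty delimiter string delim
 * of arbitrary byte length. On success, stores the start and length
 * of the field and returns 1; returns 0 if the line has fewer fields.
 */
static int
find_field(const char *line, const char *delim, size_t want,
           const char **start, size_t *len)
{
	size_t dlen = strlen(delim), n;
	const char *next;

	assert(dlen > 0 && want > 0);

	for (n = 1; ; n++) {
		next = strstr(line, delim);
		if (n == want) {
			*start = line;
			*len = next ? (size_t)(next - line) : strlen(line);
			return 1;
		}
		if (next == NULL)
			return 0; /* ran out of fields */
		line = next + dlen; /* skip the whole delimiter */
	}
}
```

Because UTF-8 is self-synchronizing, a byte-wise search for a valid UTF-8
delimiter cannot match in the middle of another character, so a multibyte
delimiter needs no special casing in this sketch.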
Now you can specify a multibyte-delimiter to cut, which should
definitely be possible for the end-user (Fuck POSIX).
Looking at GNU coreutils' cut(1)[0], which basically ignores both the
difference between characters and bytes and the -n option, and which is
bloated as hell, one has to wonder why it is still the default. This is
insane!
Things like this personally keep me motivated to make sbase better
every day.
[0]: http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/cut.c;hb=HEAD
NSFW! You have been warned.
One major milestone is to have the sbase-tools supporting UTF-8.
Tools like cut(1) with the -n flag don't make sense otherwise.
And while GNU coreutils' cut(1) blatantly ignores such an
important aspect, we will not tolerate this madness and mark it
as a TODO in the main README.
Since most tools inherently support UTF-8 anyway, this only concerns
tools which manipulate text or search in it in special ways.
and mark it as finished in README.
A short rationale for the way the manpage is set up: the coreutils
manpage does not lend itself to being a quick reference guide, whereas
I wrote this manpage to be short and concise, limited to the
information the advanced user needs.
No one needs to explain what an octal number is. That's not part of
the scope of this manpage.
Also, nobody wants to read a block of text just to find out how
to build an octal mode string.