In the description of 3111908b03, it says
that the functions must be able to handle st being NULL, but recurse
always passes a valid pointer. The only function that was ever passed
NULL was rm(), but this was changed to go through recurse in
2f4ab52739, so now the checks are
pointless.
In commit 30fd43d7f3, unescape was
simplified significantly, but the new version failed to NULL terminate
the resulting string.
This causes bad behavior in various utilities, for example tr:
Broken:
$ echo b2 | tr '\142' '\143'
c3
Fixed:
$ echo b2 | tr '\142' '\143'
c2
This bug breaks libtool's usage of tr, causing gcc to fail to build with
sbase.
Previously, if a file failed to read in a checksum list, it would be
reported as not matched rather than a read failure.
Also, if reading from stdin failed, previously a bogus checksum would be
printed anyway.
Also, since parsemode exits on failure, don't bother checking return
value in xinstall (this would never trigger anyway because mode_t can be
unsigned).
For sort(1) we need memmem(), which I imported from OpenBSD.
Inside sort(1), the changes involved working with the explicit lengths
given by getlines() earlier and rewriting some of the functions.
Now we can handle NUL-characters in the input just fine.
strmem() was not very well thought out. The thing is the following:
If the string contains a zero character, we want to match it, and not
stop right there in place.
The "real" solution is to use memmem() where needed and replace all
functions that assume zero-terminated-strings from standard input, which
could lead to early string-breakoffs.
This requires a strict tracking of string lengths.
We want our delimiters to also contain 0 characters and have them
handled gracefully.
To accomplish this, I wrote a function strmem(), which looks for a
certain, arbitrarily long memory subset in a given string.
memmem() is a GNU extension and forces you to call strlen every time.
Yeah well, the old topic. POSIX allows \0123 and \123 octals in
different tools, in printf, depending on %b or other things.
We'll just keep it simple and just allow 4 digits. the 0 does not make
a difference anyway.
When we move the exit() out of venprintf(), we can reuse it for
weprintf(), which basically had duplicate code.
I also renamed venprintf() to xvprintf (extended vprintf) so it's
more obvious what it actually does.
This reverts commit a564a67c4ea70e90a4dc543814458e4903869d3e.
Not as trivial as I thought. This breaks cp when used as:
cp -r /foo/bar /baz
The old code expands this to:
cp -r /foo/bar /baz/bar
This is a utility function to allow easy parsing of file or other
offsets, automatically taking in regard suffixes, proper bases and
so on, for instance used in split(1) -b or od -j, -N(1).
Of course, POSIX is very arbitrary when it comes to defining the
parsing rules for different tools.
The main focus here lies on being as flexible and consistent as
possible. One central utility-function handling the parsing makes
this stuff a lot more trivial.
Otherwise, we run into problems in a typical autoconf-based build
system:
- config.status is created at some point between two seconds.
- config.status is run, generating Makefile by first writing to a file
in /tmp, and then mv-ing it to Makefile.
- If this mv happens before the beginning of the next second, Makefile
will be created with the same tv_sec as config.status, but with
tv_nsec = 0.
- When make runs, it sees that Makefile is older than config.status,
and re-runs config.status to generate Makefile.
In general, POSIX does not define /dev/std{in, out, err} because it
does not want to depend on the dev-filesystem.
For utilities, it thus introduced the '-'-keyword to denote standard
input (and output in some cases) and the programs have to deal with
it accordingly.
Sadly, the design of many tools doesn't allow strict shell-redirections
and many scripts don't even use this feature when possible.
Thus, we made the decision to implement it consistently across all
tools where it makes sense (namely those which read files).
Along the way, I spotted some behavioural bugs in libutil/crypt.c and
others where it was forgotten to fshut the files after use.
General convention is to use size_t to store sizes of all kinds.
Internally, the function uses double anyway, but at least this
doesn't clobber up the API any more and there's a chance in the
future to make this function a bit cleaner and not use this dirty
static buffer hack any more.