POSIX only specifies the -H, -L, and -P options for use with -R, and
the default is left to the implementation. Without -R, symlinks must
be followed.
Most implementations use -P as the default with -R, which makes sense
to me as default behavior (the source and destination trees are the same).
Since we use the same code for -R and without it, and we allow -H, -L,
and -P without -R, set the default based on the presence of -R. Without
it, use -L as before, as required by POSIX, and with it, use -P to match
the behavior of other implementations.
Previous behaviour was to call opendir regardless of if we are actually going
to be recursing into the directory. Additionally, some utilities that use
DIRFIRST benefit from running the function pointed to by fn before the call
to opendir. One such example is `chmod [-R] 777 dir` on a directory with mode
000, where it will be expected for chmod to first give itself rwx before
optionally listing the directory to traverse it.
Previously, when the destination file was created with fopen, we needed
to use fchmod to set its permissions.
Now that we pass in the mode to creat, we already get the desired
behavior of creating the file with the same mode as the source file
modified by the user's file creation mask.
This fixes the issue where a directory or special file created with
mkdir/mknod does not end up with the appropriate mode with -p or -a
(since it may have been narrowed by the umask).
This also allows us to clear the SUID and SGID bits from the mode if the
chown fails, as specified by POSIX.
If we are just copying data from one file to another, we don't need to
fill a complete buffer, just read a chunk at a time, and write it to the
output.
fread reads the entire requested size (BUFSIZ), which causes tools to
block if only small amounts of data are available at a time. At best,
this causes unnecessary copies and inefficiency, at worst, tools like
tee and cat are almost unusable in some cases since they only display
large chunks of data at a time.
writeall makes successive write calls to write an entire buffer to the
output file descriptor. It returns the number of bytes written, or -1 on
the first error.
Previously, with -p, the specified directory and all of its parents
would be 0777&~filemask (regardless of the -m flag). POSIX says parent
directories must created as (0300|~filemask)&0777, and of course if -m
is set, the specified directory should be created with those
permissions.
Additionally, POSIX says that for symbolic_mode strings, + and - should
be interpretted relative to a default mode of 0777 (not 0).
Without -p, previously the directory would be created first with
0777&~filemask (before a chmod), but POSIX says that the directory shall
at no point in time have permissions less restrictive than the -m mode
argument.
Rather than dealing with mkdir removing the filemask bits by calling
chmod afterward, just clear the umask and remove the bits manually.
In the description of 3111908b03, it says
that the functions must be able to handle st being NULL, but recurse
always passes a valid pointer. The only function that was ever passed
NULL was rm(), but this was changed to go through recurse in
2f4ab52739, so now the checks are
pointless.
In commit 30fd43d7f3, unescape was
simplified significantly, but the new version failed to NULL terminate
the resulting string.
This causes bad behavior in various utilities, for example tr:
Broken:
$ echo b2 | tr '\142' '\143'
c3
Fixed:
$ echo b2 | tr '\142' '\143'
c2
This bug breaks libtool's usage of tr, causing gcc to fail to build with
sbase.
Previously, if a file failed to read in a checksum list, it would be
reported as not matched rather than a read failure.
Also, if reading from stdin failed, previously a bogus checksum would be
printed anyway.
Also, since parsemode exits on failure, don't bother checking return
value in xinstall (this would never trigger anyway because mode_t can be
unsigned).
For sort(1) we need memmem(), which I imported from OpenBSD.
Inside sort(1), the changes involved working with the explicit lengths
given by getlines() earlier and rewriting some of the functions.
Now we can handle NUL-characters in the input just fine.
strmem() was not very well thought out. The thing is the following:
If the string contains a zero character, we want to match it, and not
stop right there in place.
The "real" solution is to use memmem() where needed and replace all
functions that assume zero-terminated-strings from standard input, which
could lead to early string-breakoffs.
This requires a strict tracking of string lengths.
We want our delimiters to also contain 0 characters and have them
handled gracefully.
To accomplish this, I wrote a function strmem(), which looks for a
certain, arbitrarily long memory subset in a given string.
memmem() is a GNU extension and forces you to call strlen every time.
Yeah well, the old topic. POSIX allows \0123 and \123 octals in
different tools, in printf, depending on %b or other things.
We'll just keep it simple and just allow 4 digits. the 0 does not make
a difference anyway.
When we move the exit() out of venprintf(), we can reuse it for
weprintf(), which basically had duplicate code.
I also renamed venprintf() to xvprintf (extended vprintf) so it's
more obvious what it actually does.
This reverts commit a564a67c4ea70e90a4dc543814458e4903869d3e.
Not as trivial as I thought. This breaks cp when used as:
cp -r /foo/bar /baz
The old code expands this to:
cp -r /foo/bar /baz/bar
This is a utility function to allow easy parsing of file or other
offsets, automatically taking in regard suffixes, proper bases and
so on, for instance used in split(1) -b or od -j, -N(1).
Of course, POSIX is very arbitrary when it comes to defining the
parsing rules for different tools.
The main focus here lies on being as flexible and consistent as
possible. One central utility-function handling the parsing makes
this stuff a lot more trivial.
Otherwise, we run into problems in a typical autoconf-based build
system:
- config.status is created at some point between two seconds.
- config.status is run, generating Makefile by first writing to a file
in /tmp, and then mv-ing it to Makefile.
- If this mv happens before the beginning of the next second, Makefile
will be created with the same tv_sec as config.status, but with
tv_nsec = 0.
- When make runs, it sees that Makefile is older than config.status,
and re-runs config.status to generate Makefile.
In general, POSIX does not define /dev/std{in, out, err} because it
does not want to depend on the dev-filesystem.
For utilities, it thus introduced the '-'-keyword to denote standard
input (and output in some cases) and the programs have to deal with
it accordingly.
Sadly, the design of many tools doesn't allow strict shell-redirections
and many scripts don't even use this feature when possible.
Thus, we made the decision to implement it consistently across all
tools where it makes sense (namely those which read files).
Along the way, I spotted some behavioural bugs in libutil/crypt.c and
others where it was forgotten to fshut the files after use.