The assumption of NUL-terminated strings is actually quite a good one in
most cases. You don't have to worry about paths, because they may not
contain NUL.
The same applies to arguments passed to you. Unless you have to
unescape, there is no way for you to receive a NUL.
There are two exceptions though, and we have to address them or else
we get unexpected behaviour:
1) All tools using unescape() have to be strict about delimlen.
Otherwise they end up, for instance, unescaping
'\\0abc'
to
'\0abc',
which C's string functions see as an empty string.
2) All tools that pull lines out of their input and write them out
again as lines.
puts() will cut each line containing NULs off at the first
occurrence.
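The second point can be sketched as follows. This is only an illustration,
not code from the tree: put_line() is a hypothetical helper that writes a
line by its tracked length with fwrite(), so embedded NULs survive where
puts() would stop at the first one.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of the original code): write exactly
 * len bytes of buf, then a newline. Unlike puts(), this does not stop
 * at the first NUL, because the length is passed in explicitly.
 * Returns 0 on success and -1 on a write error.
 */
static int
put_line(FILE *fp, const char *buf, size_t len)
{
	if (fwrite(buf, 1, len, fp) != len)
		return -1;
	return fputc('\n', fp) == EOF ? -1 : 0;
}
```

Calling put_line(stdout, "ab\0cd", 5) writes all five bytes plus the
newline, whereas puts("ab\0cd") writes only "ab" and the newline.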
strmem() was not very well thought out. The problem is the following:
If the string contains a zero character, we want to match it rather
than stop right there.
The "real" solution is to use memmem() where needed and to replace all
functions that assume zero-terminated strings when handling standard
input, since those can truncate strings early.
This requires strict tracking of string lengths.
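A minimal sketch of what length-tracked searching looks like, assuming
both buffers carry explicit lengths. mem_find() is a hypothetical name
and a plain-loop stand-in for memmem(), written out so that it does not
depend on memmem() being available; it is not the code used in the tree.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: find the first occurrence of needle (needlelen
 * bytes) in hay (haylen bytes). Both lengths are explicit, so NUL
 * bytes in either buffer are compared like any other byte.
 * Returns a pointer into hay, or NULL if there is no match.
 */
static const void *
mem_find(const void *hay, size_t haylen, const void *needle, size_t needlelen)
{
	const unsigned char *h = hay;
	size_t i;

	if (needlelen == 0)
		return hay;
	if (needlelen > haylen)
		return NULL;
	for (i = 0; i + needlelen <= haylen; i++) {
		if (memcmp(h + i, needle, needlelen) == 0)
			return h + i;
	}
	return NULL;
}
```

Searching the 8 bytes "ab\0cd\0ef" for the 3 bytes "\0ef" finds the
match at offset 5, which a strstr()-style search could never reach.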
We want our delimiters to be able to contain 0 characters and have
them handled gracefully.
To accomplish this, I wrote a function strmem(), which looks for an
arbitrarily long byte sequence in a given string.
memmem() is a GNU extension and forces you to call strlen() every time.
Previously, we used the System V output format:
"%7d%7d%7d %s\n"
The problem here is that if any number has more than six digits, the
result looks like one big number, as we don't mandate spaces.
POSIX says the output format should rather be
"%d %d %d %s\n"
but in this case the columns would no longer line up consistently.
To serve both camps, I changed it to the following:
"%6d %6d %6d %s\n"
This keeps the columns aligned for normal values, and also
prevents the output for large files from being ambiguous.
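To make the difference concrete, here is a small sketch that formats
the same counts with the old and the new format string. The function
names format_sysv() and format_spaced() are made up for this example.

```c
#include <stdio.h>

/* Hypothetical helper: the old System V style format, no mandatory spaces. */
static int
format_sysv(char *buf, size_t size, int lines, int words, int bytes,
            const char *name)
{
	return snprintf(buf, size, "%7d%7d%7d %s\n", lines, words, bytes, name);
}

/* Hypothetical helper: the new format, fields always separated by a space. */
static int
format_spaced(char *buf, size_t size, int lines, int words, int bytes,
              const char *name)
{
	return snprintf(buf, size, "%6d %6d %6d %s\n", lines, words, bytes, name);
}
```

With the counts 1234567, 2345678 and 3456789, the first format runs the
fields together as "123456723456783456789", while the second prints
"1234567 2345678 3456789".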
The old topic again. POSIX specifies both \0123 and \123 octals,
depending on the tool, and in printf depending on whether the escape
is in the format string or in a %b argument.
We'll keep it simple and allow up to 4 digits. The leading 0 does not
make a difference anyway.
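A rough sketch of such a parser, reading up to four octal digits after
the backslash. parse_octal() is a hypothetical name, and clamping values
above 255 to 255 is an assumption made for this example; the text above
does not say how overflow is handled.

```c
#include <stddef.h>

/*
 * Hypothetical helper: parse up to 4 octal digits starting at s (the
 * character after the backslash). Stores the resulting byte in *out
 * and returns the number of digits consumed, or 0 if s does not start
 * with an octal digit (in which case *out is left untouched).
 * Assumption: values that do not fit in a byte are clamped to 255.
 */
static size_t
parse_octal(const char *s, unsigned char *out)
{
	unsigned int value = 0;
	size_t n = 0;

	while (n < 4 && s[n] >= '0' && s[n] <= '7') {
		value = value * 8 + (unsigned int)(s[n] - '0');
		n++;
	}
	if (n > 0)
		*out = value > 255 ? 255 : (unsigned char)value;
	return n;
}
```

Both "0123" and "123" yield the byte 83 ('S'), consuming four and three
digits respectively, so the leading 0 indeed makes no difference.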
Here's a better version of the patch.
When the -R flag is used with a single directory, the given directory
name is omitted. With multiple directories, each directory name is listed.
Directories that start with './' and '../' are now also printed.
Given the following commands:
touch 1.txt; install -D 1.txt d/2.txt
find d
The result without this fix:
d
d/2.txt
d/2.txt/1.txt
The result with this patch applied:
d
d/2.txt
-s strip binary
-d create directory
-D create missing directories
-t DIR target directory
-m MODE permission bits
-o USER set owner
-g GROUP set group
Installed files are copied, and the default mode is 755.
Signed-off-by: Mattias Andrée <maandree@kth.se>
It is impossible to rematch a pattern which has one (or both)
of these operators, so the simplest solution is to detect them
while we are compiling the regular expression and break the
match loop after the first iteration.
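The idea can be sketched as below. The text above does not name the
operators, so this sketch ASSUMES they are the anchors ^ and $. The
helpers is_anchored() and count_matches() are hypothetical, the anchor
detection is deliberately simplified, and POSIX regcomp()/regexec() are
used with basic regular expressions.

```c
#include <regex.h>
#include <string.h>

/*
 * Hypothetical helper. Assumption: "these operators" are ^ and $.
 * Simplified check: a leading '^' or a trailing unescaped '$'.
 */
static int
is_anchored(const char *pat)
{
	size_t n = strlen(pat), i, bs = 0;

	if (n == 0)
		return 0;
	if (pat[0] == '^')
		return 1;
	if (pat[n - 1] != '$')
		return 0;
	/* an odd number of backslashes before '$' makes it a literal */
	for (i = n - 1; i > 0 && pat[i - 1] == '\\'; i--)
		bs++;
	return bs % 2 == 0;
}

/* Count matches of pat in line; returns -1 if pat does not compile. */
static int
count_matches(const char *pat, const char *line)
{
	regex_t re;
	regmatch_t m;
	const char *p = line;
	int count = 0, flags = 0;
	int anchored = is_anchored(pat);	/* detected at compile time */

	if (regcomp(&re, pat, 0) != 0)
		return -1;
	while (regexec(&re, p, 1, &m, flags) == 0) {
		count++;
		if (anchored)
			break;		/* an anchored pattern cannot rematch */
		if (m.rm_eo == 0) {	/* empty match: step one byte forward */
			if (*p == '\0')
				break;
			p++;
		} else {
			p += m.rm_eo;
		}
		flags = REG_NOTBOL;	/* p no longer starts the line */
	}
	regfree(&re);
	return count;
}
```

Under these assumptions, the pattern "a" is matched repeatedly along
the line, while "^a" and "a$" leave the loop after the first iteration.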