Commit Graph

67 Commits

Author SHA1 Message Date
FRIGN
a88906b423 Rever the strmem() addition and add a TODO element
strmem() was not very well thought out. The thing is the following:
If the string contains a zero character, we want to match it, and not
stop right there in place.

The "real" solution is to use memmem() where needed and replace all
functions that assume zero-terminated-strings from standard input, which
could lead to early string-breakoffs.
This requires a strict tracking of string lengths.
2016-02-26 09:54:46 +00:00
FRIGN
3396088666 Implement strmem() and use it in join(1)
We want our delimiters to also contain 0 characters and have them
handled gracefully.
To accomplish this, I wrote a function strmem(), which looks for a
certain, arbitrarily long memory subset in a given string.
memmem() is a GNU extension and forces you to call strlen every time.
2016-02-26 09:54:46 +00:00
FRIGN
007df69fc5 Add parseoffset()
This is a utility function to allow easy parsing of file or other
offsets, automatically taking in regard suffixes, proper bases and
so on, for instance used in split(1) -b or od -j, -N(1).
Of course, POSIX is very arbitrary when it comes to defining the
parsing rules for different tools.
The main focus here lies on being as flexible and consistent as
possible. One central utility-function handling the parsing makes
this stuff a lot more trivial.
2015-09-30 19:44:10 +01:00
sin
2deb40290e Use off_t in humansize() as it is more descriptive and applicable 2015-04-29 16:42:49 +01:00
sin
42326f7684 Include stdint.h in util.h for uintmax_t 2015-04-28 11:36:58 +01:00
Dionysis Grigoropoulos
2d6cde1862 humansize: Use uintmax_t for size
du(1) breaks on 32-bit size_t for files greater than 4G.
2015-04-28 11:36:58 +01:00
FRIGN
5595af5742 Convert humansize() to accept a size_t instead of a double
General convention is to use size_t to store sizes of all kinds.
Internally, the function uses double anyway, but at least this
doesn't clobber up the API any more and there's a chance in the
future to make this function a bit cleaner and not use this dirty
static buffer hack any more.
2015-04-25 11:43:14 +01:00
sin
c914a2feca Update putword() to accept a FILE * 2015-04-21 18:00:47 +01:00
sin
b9d60bee87 Move mkdirp() to libutil 2015-04-20 18:04:08 +01:00
FRIGN
11e2d472bf Add *fshut() functions to properly flush file streams
This has been a known issue for a long time. Example:

printf "word" > /dev/full

wouldn't report there's not enough space on the device.
This is due to the fact that every libc has internal buffers
for stdout which store fragments of written data until they reach
a certain size or on some callback to flush them all at once to the
kernel.
You can force the libc to flush them with fflush(). In case flushing
fails, you can check the return value of fflush() and report an error.

However, previously, sbase didn't have such checks and without fflush(),
the libc silently flushes the buffers on exit without checking the errors.
No offense, but there's no way for the libc to report errors in the exit-
condition.

GNU coreutils solve this by having onexit-callbacks to handle the flushing
and report issues, but they have obvious deficiencies.
After long discussions on IRC, we came to the conclusion that checking the
return value of every io-function would be a bit too much, and having a
general-purpose fclose-wrapper would be the best way to go.

It turned out that fclose() alone is not enough to detect errors. The right
way to do it is to fflush() + check ferror on the fp and then to a fclose().
This is what fshut does and that's how it's done before each return.
The return value is obviously affected, reporting an error in case a flush
or close failed, but also when reading failed for some reason, the error-
state is caught.

the !!( ... + ...) construction is used to call all functions inside the
brackets and not "terminating" on the first.
We want errors to be reported, but there's no reason to stop flushing buffers
when one other file buffer has issues.
Obviously, functionales come before the flush and ret-logic comes after to
prevent early exits as well without reporting warnings if there are any.

One more advantage of fshut() is that it is even able to report errors
on obscure NFS-setups which the other coreutils are unable to detect,
because they only check the return-value of fflush() and fclose(),
not ferror() as well.
2015-04-05 09:13:56 +01:00
FRIGN
a68c2a9e6e Remove apathmax() and implicitly agetcwd()
pathconf() is just an insane interface to use. All sane operating-
systems set sane values for PATH_MAX. Due to the by-runtime-nature of
pathconf(), it actually weakens the programs depending on its values.

Given over 3 years it has still not been possible to implement a sane
and easy to use apathmax()-utility-function, and after discussing this
on IRC, we'll dump this garbage.

We are careful enough not to overflow PATH_MAX and even if, any user
is able to set another limit in config.mk if he so desires.
2015-03-18 15:20:35 +01:00
FRIGN
93fd817536 Add estrlcat() and estrlcpy()
It has become a common idiom in sbase to check strlcat() and strlcpy()
using

if (strl{cat, cpy}(dst, src, siz) >= siz)
        eprintf("path too long\n");

However, this was not carried out consistently and to this very day,
some tools employed unchecked calls to these functions, effectively
allowing silent truncations to happen, which in turn may lead to
security issues.
To finally put an end to this, the e*-functions detect truncation
automatically and the caller can lean back and enjoy coding without
trouble. :)
2015-03-17 11:24:49 +01:00
FRIGN
9fd4a745f8 Add history and config-struct to recurse
For loop detection, a history is mandatory. In the process of also
adding a flexible struct to recurse, the recurse-definition was moved
to fs.h.
The motivation behind the struct is to allow easy extensions to the
recurse-function without having to change the prototypes of all
functions in the process.
Adding flags is really simple as well now.

Using the recursor-struct, it's also easier to see which defaults
apply to a program (for instance, which type of follow, ...).

Another change was to add proper stat-lstat-usage in recurse. It
was wrong before.
2015-03-13 00:29:48 +01:00
FRIGN
01de5df8e6 Audit du(1) and refactor recurse()
While auditing du(1) I realized that there's no way the over 100 lines
of procedures in du() would pass the audit.
Instead, I decided to rewrite this section using recurse() from libutil.
However, the issue was that you'd need some kind of payload to count
the number of bytes in the subdirectories and use them in the higher
hierarchies.
The solution is to add a "void *data" data pointer to each recurse-
function-prototype, which we might also be able to use in other
recurse-applications.
recurse() itself had to be augmented with a recurse_samedev-flag, which
basically prevents recurse from leaving the current device.

Now, let's take a closer look at the audit:
1) Removing the now unnecessary util-functions push, pop, xrealpath,
   rename print() to printpath(), localize some global variables.
2) Only pass the block count to nblks instead of the entire stat-
   pointer.
3) Fix estrtonum to use the minimum of LLONG_MAX and SIZE_MAX.
4) Use idiomatic argv+argc-loop
5) Report proper exit-status.
2015-03-11 23:21:52 +01:00
FRIGN
011c81b21b Undef reallocarray in util.h before declaration
In case we link against the OpenBSD-libc, we want to avoid collisions.
2015-03-11 17:06:52 +01:00
FRIGN
833c2aebb4 Remove mallocarray(...) and use reallocarray(NULL, ...)
After a short correspondence with Otto Moerbeek it turned out
mallocarray() is only in the OpenBSD-Kernel, because the kernel-
malloc doesn't have realloc.
Userspace applications should rather use reallocarray with an
explicit NULL-pointer.

Assuming reallocarray() will become available in c-stdlibs in the
next few years, we nip mallocarray() in the bud to allow an easy
transition to a system-provided version when the day comes.
2015-03-11 10:50:18 +01:00
FRIGN
3c33abc520 Implement mallocarray()
A function used only in the OpenBSD-Kernel as of now, but it surely
provides a helpful interface when you just don't want to make sure
the incoming pointer to erealloc() is really NULL so it behaves
like malloc, making it a bit more safer.

Talking about *allocarray(): It's definitely a major step in code-
hardening. Especially as a system administrator, you should be
able to trust your core tools without having to worry about segfaults
like this, which can easily lead to privilege escalation.

How do the GNU coreutils handle this?
$ strings -n 4611686018427387903
strings: invalid minimum string length -1
$ strings -n 4611686018427387904
strings: invalid minimum string length 0

They silently overflow...

In comparison, sbase:

$ strings -n 4611686018427387903
mallocarray: out of memory
$ strings -n 4611686018427387904
mallocarray: out of memory

The first out of memory is actually a true OOM returned by malloc,
whereas the second one is a detected overflow, which is not marked
in a special way.
Now tell me which diagnostic error-messages are easier to understand.
2015-03-10 22:19:19 +01:00
FRIGN
3b825735d8 Implement reallocarray()
Stateless and I stumbled upon this issue while discussing the
semantics of read, accepting a size_t but only being able to return
ssize_t, effectively lacking the ability to report successful
reads > SSIZE_MAX.
The discussion went along and we came to the topic of input-based
memory allocations. Basically, it was possible for the argument
to a memory-allocation-function to overflow, leading to a segfault
later.
The OpenBSD-guys came up with the ingenious reallocarray-function,
and I implemented it as ereallocarray, which automatically returns
on error.
Read more about it here[0].

A simple testcase is this (courtesy to stateless):
$ sbase-strings -n (2^(32|64) / 4)

This will segfault before this patch and properly return an OOM-
situation afterwards (thanks to the overflow-check in reallocarray).

[0]: http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man3/calloc.3
2015-03-10 21:23:36 +01:00
sin
7d36a35649 Fix off-by-one in apathmax() as the path is relative to "/"
1) Use size_t * instead of long *
2) Fallback to PATH_MAX instead of BUFSIZ
3) Header cleanup
2015-03-06 23:50:39 +00:00
FRIGN
8dc92fbd6c Refactor enmasse() and recurse() to reflect depth
The HLP-changes to sbase have been a great addition of functionality,
but they kind of "polluted" the enmasse() and recurse() prototypes.
As this will come in handy in the future, knowing at which "depth"
you are inside a recursing function is an important functionality.

Instead of having a special HLP-flag passed to enmasse, each sub-
function needs to provide it on its own and can calculate results
based on the current depth (for instance, 'H' implies 'P' at
depth > 0).
A special case is recurse(), because it actually depends on the
follow-type. A new flag "recurse_follow" brings consistency into
what used to be spread across different naming conventions (fflag,
HLP_flag, ...).

This also fixes numerous bugs with the behaviour of HLP in the
tools using it.
2015-03-02 22:50:38 +01:00
FRIGN
c01641c897 Audit nice(1)
1) val is sufficient as "int" (read the standard)
2) BUGFIX: If getpriority fails, it returns -1 and sets errno.
   Previously, it would correctly catch the errno but not take
   care of the fact that by then val has been decremented by 1.
   Only change val if the getpriority-call has been successful.
3) Add LIMIT()-macro from st to increase readability.
4) setpriority returns < 0 on failure
5) Remove bikeshedding-comment. Read the standard if you wonder.
6) return-value trick from env(1)
2015-03-02 16:53:13 +01:00
sin
8f068589fb Fix recurse() prototype and convert char to int flags 2015-02-16 16:23:12 +00:00
Tai Chi Minh Ralph Eastwood
0cf6a18f6f recurse: change char follow to int follow 2015-02-16 15:53:58 +00:00
Tai Chi Minh Ralph Eastwood
82bc92da51 recurse: add symlink derefencing flags -H and -L 2015-02-16 15:53:55 +00:00
Jakob Kramer
c0a3c66a84 add estrndup 2015-02-11 01:17:21 +00:00
Jakob Kramer
08e93dd4f5 add en*alloc functions 2015-02-11 01:17:21 +00:00
Tai Chi Minh Ralph Eastwood
af8be7f92c cp: add symlink deref flags -H and -L for cp and mv 2015-02-09 22:54:52 +00:00
FRIGN
360a63769c Use strtonum and libutf in test(1), refactor code and manpage
and mark it as finished in README.
2015-02-09 22:21:23 +01:00
FRIGN
fd562481f3 Convert estrto{l, ul} to estrtonum
Enough with this insanity!
2015-01-30 16:52:44 +01:00
sin
e5c1f0f372 Add estrtonum() as well 2015-01-30 13:56:45 +00:00
sin
28d9b18e4c Remember to undef strtonum in case it is provided also as a macro 2015-01-30 13:52:24 +00:00
sin
add25a464f Add strtonum() in preparation to nuking estrtol() and friends 2015-01-30 13:48:33 +00:00
sin
b90ca482a0 Add estrtoul() 2015-01-30 13:24:41 +00:00
FRIGN
b8b9d983c8 Add unescape() to libutil
formerly known as resolveescapes(), it is of central use to numerous
programs.
This drops a lot of LOC.
2015-01-29 21:52:44 +01:00
sin
bc9c752df5 Import strsep() from musl libc 2015-01-25 17:48:11 +00:00
sin
ce86a05f36 Import strcasestr() from musl and remove -D_GNU_SOURCE 2014-11-20 23:46:06 +00:00
sin
cb7cbde722 Add compat.h 2014-11-17 15:46:28 +00:00
sin
bd3cf55b54 Define HOST_NAME_MAX if necessary
Some systems do not provide this, namely FreeBSD and NetBSD.
2014-11-17 14:50:40 +00:00
Hiltjo Posthuma
ce90cc57d4 util: add eregcomp: show descriptive error message on regcomp error 2014-11-16 14:36:41 +00:00
sin
045fc62028 Group related decls together in util.h 2014-11-14 18:13:26 +00:00
sin
2982d88533 Import ealloc.c from ubase 2014-11-14 18:10:05 +00:00
sin
49c91462b3 Undef MIN/MAX in case they are defined somewhere else 2014-11-13 16:01:34 +00:00
Hiltjo Posthuma
b6b8fe9591 separate humansize into a util function
also show 1 decimal of human size string like: 4M -> 4.4M
2014-10-18 23:56:51 +01:00
Hiltjo Posthuma
696cbdbb68 util.h, mode_t: sys/types.h defines mode_t
see: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_stat.h.html

this removes the warning with gcc (but musl didn't have this warning).
2014-06-16 23:04:43 +01:00
Hiltjo Posthuma
d12e953f18 add agetline, separate estrtod to util
Signed-off-by: Hiltjo Posthuma <hiltjo@codemadness.org>
2014-06-01 18:01:45 +01:00
Hiltjo Posthuma
bd99b92e91 parsemode: rework
- for octal input: reset mode to 0.
- take umask into account.
- make '=rwx' etc work.
- we wont support crazy but valid modes like "a+rw,g=x,o=g"
- uudecode: use parsemode, mask is 0.

Signed-off-by: Hiltjo Posthuma <hiltjo@codemadness.org>
2014-04-24 11:51:33 +01:00
Hiltjo Posthuma
560340341f make parsemode() generic
use for uudecode and chmod

Signed-off-by: Hiltjo Posthuma <hiltjo@codemadness.org>
2014-04-09 15:40:32 +01:00
sin
4ba6c37839 Ensure we #undef strlcat and strlcpy
These may be implemented as macros so #undef them and use our own
implementation.
2014-01-30 21:04:01 +00:00
sin
fb12183c52 Add strlcpy()/strlcat()
Refactor recurse() routine in preparation to moving tar(1) over
to use it instead of the ftw() interface.
2014-01-30 14:55:05 +00:00
sin
b8edf3b4ee Add weprintf() and replace fprintf(stderr, ...) calls
There is still some programs left to be updated for this.

Many of these programs would stop on the first file that they
could not open.
2013-11-13 11:41:43 +00:00