The logic is simple, it's just a pain in the ass to fill the
data-structures.
Some lines had to be commented out, as glibc/musl apparently
have not fully implemented the mandatory variables for the
2013 corrigendum of POSIX 2008.
Also added a manpage and the necessary entries in README.
I also removed it from the TODO.
Yeah, if the skipping is longer than the file itself, we need
to take the skip value, not the address.
Also, only print the last newline when we've actually printed
at least 1 address.
If this flag is not given, od(1) automatically replaces duplicate
adjacent lines with an '*' for each reoccurence.
If this flag is set, thus, no such filtering occurs.
In this case this would mean having to somehow keep the last printed
line in some backbuffer, building the next line and then doing the
necessary comparisons. This basically means that we duplicate the
functionality provided with uniq(1).
So instead of
$ od -t a > dump
you'd rather do
$ od -t a | uniq -f 1 -c > dump
Skipping the first field is necessary, as the addresses obviously differ.
Now, I was thinking hard why this flag even exists. If POSIX mandated
to add the address before the asterisk, so we know the offset of duplicate
occurrences, this would make sense. However, this is not the case.
Using uniq(1) also gives nicer output:
~ $ echo "111111111111111111111111111111111111111111111111" | od -t a -v | uniq -f 1 -c
3 0000000 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 0000060 nl
1 0000061
in comparison to
$ echo "111111111111111111111111111111111111111111111111" | od -t a
0000000 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
*
0000060 nl
0000061
Before working on od(1), I didn't even know it would filter out
duplicate adjacent lines like that. This is also a matter of
predictability.
Concluding, the v-flag is implicitly set and users urged to just
use the existing tools provided by the system.
I don't think we would break scripts either. Firstly, it's rather
unlikely to have duplicate lines exactly matching the line-length of
od(1). Secondly, even if a script did that specifically, in the worst
case there would be a counting error or something.
Given od(1) is mostly used interactively, we can safely assume this
feature is for the benefit of the users.
Ditch this legacy POSIX crap!
Please enter the commit message for your changes. Lines starting
This is a utility function to allow easy parsing of file or other
offsets, automatically taking in regard suffixes, proper bases and
so on, for instance used in split(1) -b or od -j, -N(1).
Of course, POSIX is very arbitrary when it comes to defining the
parsing rules for different tools.
The main focus here lies on being as flexible and consistent as
possible. One central utility-function handling the parsing makes
this stuff a lot more trivial.
It was possible to make some sections of the code shorter.
Also fix a bug where the last printed address was always in hex
rather than depending on the radix chosen.
getline(3) expects newline-terminated input. While glibc's
implementation seems to catch unterminated input and zero the
buffer, other versions (notably musl's) do not.
This is a workaround. Garbage will still be read, but
not printed.
This client does not support the netascii mode. The default mode
is octet/binary and should be sufficient.
One thing left to do is to check the source port of the server
to make sure it doesn't change. If it does, we should ignore the
packet and send an error back without disturbing an existing
transfer.
Previously, a line read from file 1 before a strcmp was
performed would be overwritten and lost. Something like
this:
comm one_line_file empty_file
produced no output.
This patch is a bit inelegant, but quite simple.
1) Remove the function prototypes. No need for them, as the
functions are ordered.
2) Add fieldseplen, so the length of the field-separator is not
calculated nearly each time skipcolumn() is called.
3) rename next_col to skip_to_next_col so the purpose is clear,
also reorder the conditional accordingly.
4) Put parentheses around certain ternary expressions.
5) BUGFIX: Don't just exit() in check(), but make it return something,
so we can cleanly fshut() everything.
6) OFF-POSIX: Posix for no apparent reason does not allow more than
one file when the -c or -C flags are given.
This can be problematic when you want to check multiple files.
With the change 5), rewriting check() to return a value, I went
off-posix after discussing this with Dimitris to just allow
arbitrary numbers of files. Obviously, this does not break scripts
and is convenient for everybody who wants to quickly check a big
amount of files.
As soon as 1 file is "unsorted", the return value is 1, as expected.
For convenience reasons, check()'s warning now includes the filename.
7) BUGFIX: Set ret to 2 instead of 1 when the fshut(fp, *argv) fails.
8) BUGFIX: Don't forget to fshut stderr at the end. This would improperly
return 1 in the following case:
$ sort -c unsorted_file 2> /dev/full
9) Other style changes, line length, empty line before return.
Make it clear that <blank> characters just are spaces or tabs and
not a special group which needs special treatment for wide characters.
Also, and that was the only problem here, correctly calculate the
offset given by the key definitions for the start- and end-characters
using libutf-utility-functions.
Mark the progress in the README and put parentheses around the missing
flags which are insane to implement for no real gain.
I kind of missed that the sorting was still not properly done.
parse_flags() and addkeydef() are independent of everything else,
so they can be put at the bottom.
Sorting the other functions reveals the true hierarchy much better.
This is much easier to read than having yet another handrolled
list implementation.
Tested and more or less clearly equivalent.
Now that I have uni-vac, I'll have enough time to refactor more.
Using $(LD) directly for linking can cause issues with cross-compilers
and various other toolchains, as various libraries such as libc may not
be implicitly linked in, causing symbol resolution errors.
Linking through the C compiler frontend solves this issue.