Commit Graph

1167 Commits

Author SHA1 Message Date
FRIGN
ce11e1f195 Add section for laces in lowerrune and upperrune and more ranges
Laces are a special third kind of structure found in Unicode, besides
singletons and ranges.
This dramatically reduces the number of explicit singletons in the
lookup tables.
Also, I changed the awk-script so that it can sort trivial
translations as well, breaking down the LOC even more.

The binary size of tr dropped from 67K to 51K.
2015-02-12 16:18:02 +01:00
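
The commit above does not define a lace. The following is a minimal sketch, assuming a lace is a run of code points in which uppercase and lowercase letters alternate, as they do in U+0100..U+012F. The table name `laces` and the function `lace_tolower()` are hypothetical and are not taken from libutf; the real generated tables cover far more of Unicode.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Rune;

/*
 * Hypothetical lace table: each entry is {first, last} of a run in
 * which uppercase letters sit at even offsets from first and each is
 * directly followed by its lowercase partner.
 */
static const Rune laces[][2] = {
	{ 0x0100, 0x012F }, /* Latin Extended-A, A with macron .. i with ogonek */
	{ 0x0132, 0x0137 }, /* IJ ligature .. k with cedilla */
};

/* Lowercase r if it is the uppercase half of a known lace, else return r. */
Rune
lace_tolower(Rune r)
{
	size_t i;

	for (i = 0; i < sizeof(laces) / sizeof(laces[0]); i++) {
		if (r >= laces[i][0] && r <= laces[i][1])
			return ((r - laces[i][0]) % 2 == 0) ? r + 1 : r;
	}
	return r;
}
```

Under this assumption, the first entry alone stands in for 24 explicit uppercase-to-lowercase pairs, which is the kind of saving in singletons the commit message describes.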
sin
113caaf677 Make getlines() less verbose
Thanks Roberto for the suggestion.
2015-02-12 14:34:07 +00:00
FRIGN
9565eef895 Refactor uppercase-inclusion in libutf
Previously, the to*rune functions had to juggle two
arrays, and it somehow evaded me that it is actually way simpler
to just add another entry to the arrays if needed.
Binary size goes slightly down, e.g. tr statically linked against
musl: 68072 -> 67688

Behind the scenes though the conversion should be a bit faster and,
more importantly, the scary case-conversion function is simplified
and easier to understand.

It also drops nearly half the LOC in upperrune.c and lowerrune.c.
2015-02-12 12:28:45 +01:00
FRIGN
73577f10a0 Scrap chartorunearr(), introducing utftorunestr()
Interface and function as proposed by cls.

The reasoning behind this function is that cls expressed his
interest to keep memory allocation out of libutf, which is a
very good motive.
This simplifies the function a lot and should also increase the
speed a bit, but the most important factor here is that there's
no malloc anywhere in libutf, making it a lot smaller and more
robust with a smaller attack-surface.

Look at the paste(1) and tr(1) changes for an idiomatic way to
allocate the right amount of space for the Rune-array.
2015-02-11 21:32:09 +01:00
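
The caller-side allocation mentioned above can be sketched as follows. This is not the code from paste(1) or tr(1): `decode()`, `sketch_utftorunestr()` and `alloc_runes()` are hypothetical stand-ins, and the decoder is deliberately simplified (no overlong or surrogate checks). The sketch relies on one fact: a UTF-8 string of n bytes decodes to at most n code points, so `strlen(s) + 1` Runes is always enough room, and the conversion routine itself never has to allocate.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint32_t Rune;

/*
 * Simplified stand-in for a UTF-8 decoder: stores one code point in *r
 * and returns the number of bytes consumed. Malformed input yields
 * U+FFFD and consumes a single byte.
 */
static size_t
decode(const char *s, Rune *r)
{
	const unsigned char *u = (const unsigned char *)s;
	size_t i, n;

	if (u[0] < 0x80) {
		*r = u[0];
		return 1;
	} else if ((u[0] & 0xE0) == 0xC0) {
		n = 2; *r = u[0] & 0x1F;
	} else if ((u[0] & 0xF0) == 0xE0) {
		n = 3; *r = u[0] & 0x0F;
	} else if ((u[0] & 0xF8) == 0xF0) {
		n = 4; *r = u[0] & 0x07;
	} else {
		*r = 0xFFFD;
		return 1;
	}
	for (i = 1; i < n; i++) {
		/* The terminating NUL is not a continuation byte, so this
		 * check also prevents reading past the end of the string. */
		if ((u[i] & 0xC0) != 0x80) {
			*r = 0xFFFD;
			return 1;
		}
		*r = (*r << 6) | (u[i] & 0x3F);
	}
	return n;
}

/* Fill caller-provided storage; return the rune count without the terminator. */
size_t
sketch_utftorunestr(const char *s, Rune *out)
{
	size_t n = 0;

	while (*s)
		s += decode(s, &out[n++]);
	out[n] = 0;
	return n;
}

/* Caller side: allocate the worst case, then convert. */
Rune *
alloc_runes(const char *s, size_t *len)
{
	Rune *r = calloc(strlen(s) + 1, sizeof(*r));

	if (r)
		*len = sketch_utftorunestr(s, r);
	return r;
}
```

The buffer is sized for the worst case rather than the exact rune count, which trades a little memory for a single pass over the input.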
FRIGN
1c462012e4 Rename variable Rune * p -> r in fgetrune() 2015-02-11 21:14:28 +01:00
FRIGN
7c578bf5b0 Scrap writerune(), introducing fputrune()
Interface and function as proposed by cls.
Code is also shorter, everything else analogous to fgetrune().
2015-02-11 20:58:00 +01:00
FRIGN
a5ae899a48 Scrap readrune(), introducing fgetrune()
Interface as proposed by cls, but internally rewritten after a few
considerations.
The code is much shorter and to the point, aligning itself with other
standard functions. It should also be much faster, which is not bad.
2015-02-11 20:16:49 +01:00
sin
4888bae455 uniq: Add standards section to manpage and update README 2015-02-11 15:55:58 +00:00
sin
2e5a02dd26 uniq is now complete, update README 2015-02-11 15:27:19 +00:00
Tai Chi Minh Ralph Eastwood
5c811577a2 uniq.1: add [input [output]] information 2015-02-11 15:26:59 +00:00
Tai Chi Minh Ralph Eastwood
70694a318c uniq: add support for writing output files 2015-02-11 15:26:57 +00:00
FRIGN
f9846a9a6b Split up is*rune() and to*rune() functions into individual source files
This optimizes the binary size for each tool that uses these functions.
Previously, if a program just used one single function, maybe even a
one-liner, it would statically compile in all lookup-tables, bloating
the binary by up to 20K.
All these changes are derived from a local libutf where I do the
primary changes. So I hope that I can merge these things into libutf
sooner or later, as discussed on the ml.
2015-02-11 15:48:18 +01:00
FRIGN
471cf8f5bc Use runetypebody.h-functions in wc(1) 2015-02-11 15:48:18 +01:00
sin
17dad35015 uniq: Fix typo in usage 2015-02-11 12:50:39 +00:00
sin
b2370171e6 uniq: Match usage with manpage 2015-02-11 12:21:31 +00:00
sin
5f06185b1b uniq: Fixup program usage and manpage
Remove -i as it is not required by POSIX.  We'll add it if we
hit scripts that require it.
2015-02-11 12:19:38 +00:00
FRIGN
5836ef72e3 Use runetypebody.h-functions in tr(1)
That's one small step for a man, one giant leap for mankind.
2015-02-11 13:12:27 +01:00
FRIGN
02ec321419 Add missing is*rune() functions and tolowerrune() and toupperrune()
This basically means that we now have an autogenerating typecheck
and case-conversion tool.
Don't freak out when you see the added LOC. Given we now have
an additional mapping to the uppercase-characters, some ranges got
"lost" and have to be written literally by the generating awk-script.

The runetypebody.h was generated by myself using my modified version
of mkrunetype.awk and I'll push the changed version as soon as this
has been discussed on the ml.
If you worry about speed, consider that bsearch is just the right
tool for this job and can even handle a long array like this.
2015-02-11 13:12:27 +01:00
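
The bsearch remark above can be illustrated with a small range lookup. This is a sketch only: the table below is a hand-picked excerpt, not the generated runetypebody.h, and the names `ranges`, `rangecmp()` and `sketch_isalpharune()` are hypothetical. It assumes the table is sorted and its ranges do not overlap, which is what makes binary search applicable.

```c
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t Rune;

/* Sorted, non-overlapping {first, last} ranges of alphabetic code points. */
static const Rune ranges[][2] = {
	{ 0x0041, 0x005A }, /* A-Z */
	{ 0x0061, 0x007A }, /* a-z */
	{ 0x00C0, 0x00D6 }, /* Latin-1 letters before the multiplication sign */
	{ 0x00D8, 0x00F6 }, /* Latin-1 letters before the division sign */
	{ 0x0391, 0x03A1 }, /* Greek capital Alpha .. Rho */
};

/* Compare a key rune against one range: below, above, or inside it. */
static int
rangecmp(const void *key, const void *elem)
{
	Rune r = *(const Rune *)key;
	const Rune *range = elem;

	if (r < range[0])
		return -1;
	if (r > range[1])
		return 1;
	return 0;
}

/* Return nonzero if r falls inside any range of the table. */
int
sketch_isalpharune(Rune r)
{
	return bsearch(&r, ranges, sizeof(ranges) / sizeof(ranges[0]),
	               sizeof(ranges[0]), rangecmp) != NULL;
}
```

Because the number of comparisons grows with the logarithm of the table length, a few hundred extra entries add only a handful of comparisons per lookup.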
sin
26bc079ecc uniq: Style fix 2015-02-11 12:02:33 +00:00
sin
6d4a7989cd readlink: Use eprintf() to report errors 2015-02-11 11:58:13 +00:00
sin
a29d31e94b Update readlink in README 2015-02-11 11:54:58 +00:00
sin
3f3e15b314 readlink: Use strlcat() instead of strncat() 2015-02-11 11:51:57 +00:00
sin
aed987a9af Update README 2015-02-11 10:57:00 +00:00
sin
b63fe99941 Add Ralph to LICENSE 2015-02-11 10:56:59 +00:00
Tai Chi Minh Ralph Eastwood
bc2310376f uniq: add ascii implementation of -f and -s flags 2015-02-11 10:56:58 +00:00
Tai Chi Minh Ralph Eastwood
28e26bc688 readlink: add -m and -f flags 2015-02-11 10:56:58 +00:00
Jakob Kramer
0fcad66c75 make use of en*alloc functions 2015-02-11 01:17:21 +00:00
Jakob Kramer
c0a3c66a84 add estrndup 2015-02-11 01:17:21 +00:00
Jakob Kramer
08e93dd4f5 add en*alloc functions 2015-02-11 01:17:21 +00:00
sin
51680535ce getlines: Style fix 2015-02-11 00:27:30 +00:00
Jakob Kramer
66a5ea722d getlines: last line of file should always have a newline
This is useful behavior if you want to reorder the lines,
because otherwise two originally separate lines might end up
on one, e.g.

	$ echo -ne "foo\nbar" | sort
	barfoo
2015-02-11 00:25:48 +00:00
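
The fix described above amounts to normalizing every stored line so that it ends in a newline. A minimal sketch of that idea follows; `with_newline()` is a hypothetical helper and not the actual getlines() code, which operates on its own line buffer.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a malloc'd copy of line that is guaranteed to end in '\n'.
 * The caller frees the result. Returns NULL if allocation fails.
 */
char *
with_newline(const char *line)
{
	size_t len = strlen(line);
	int missing = (len == 0 || line[len - 1] != '\n');
	char *out = malloc(len + missing + 1);

	if (!out)
		return NULL;
	memcpy(out, line, len);
	if (missing)
		out[len++] = '\n';
	out[len] = '\0';
	return out;
}
```

With every line terminated this way, sorting "foo\n" and "bar" yields "bar\n" followed by "foo\n" instead of the fused "barfoo" shown in the commit message.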
Hiltjo Posthuma
c1e6ecb41b fix some mandoc warnings 2015-02-10 17:37:57 +01:00
sin
0779d69df7 paste: No need to make an exception for stdin, just close it at the end 2015-02-10 12:08:06 +00:00
FRIGN
1c6298103e Fix alphabetical order in README 2015-02-10 12:11:21 +01:00
Evan Gates
bc07f1b9b5 Add initial implementation of sed(1)
No manpage yet.
2015-02-10 10:35:22 +00:00
FRIGN
7143737b50 Update README reflecting recent changes to the codebase 2015-02-10 00:53:48 +01:00
FRIGN
237e8cdfa7 Add periods in expr.1 2015-02-10 00:34:47 +01:00
FRIGN
4d32205c2d Rework test.1
The previous version was not well-searchable and a bit too harsh
on emphasized text segments.
This version should improve that.
2015-02-10 00:34:47 +01:00
Tai Chi Minh Ralph Eastwood
22f868cf0b du.1: add symlink dereferencing flags to manpage 2015-02-09 22:54:53 +00:00
Tai Chi Minh Ralph Eastwood
1d2d28a8e4 du.c: add symlink dereferencing flags -H and -L 2015-02-09 22:54:53 +00:00
Tai Chi Minh Ralph Eastwood
bd89474b8a tar.1: add symbolic link dereferencing to manpage 2015-02-09 22:54:53 +00:00
Tai Chi Minh Ralph Eastwood
6e99f9e1c7 chgrp.1: note exception of -h flag unsupported 2015-02-09 22:54:53 +00:00
Tai Chi Minh Ralph Eastwood
c05bbe2eee chgrp.1: add symlink dereferencing flags to manpage 2015-02-09 22:54:53 +00:00
Tai Chi Minh Ralph Eastwood
f581761c53 chown.1: add symlink dereferencing flags to manpage 2015-02-09 22:54:52 +00:00
Tai Chi Minh Ralph Eastwood
c21a664f7c cp.1: symlink dereferencing flags 2015-02-09 22:54:52 +00:00
Tai Chi Minh Ralph Eastwood
af8be7f92c cp: add symlink deref flags -H and -L for cp and mv 2015-02-09 22:54:52 +00:00
FRIGN
360a63769c Use strtonum and libutf in test(1), refactor code and manpage
and mark it as finished in README.
2015-02-09 22:21:23 +01:00
FRIGN
856c79e242 Make wording more consistent in head.1 and tail.1 2015-02-09 20:04:54 +01:00
FRIGN
e31980915b Remove trailing newline 2015-02-09 19:03:25 +01:00
FRIGN
77fe242ded Amend STANDARDS section in tail.1 2015-02-09 19:02:39 +01:00