fbt/sbase

Author	SHA1	Message	Date
FRIGN	b8b9d983c8	Add unescape() to libutil formerly known as resolveescapes(), it is of central use to numerous programs. This drops a lot of LOC.	2015-01-29 21:52:44 +01:00
FRIGN	ee6f7d3fc0	Add trivial equivalence class support in tr(1) and update manpage Equivalence classes are a hard matter and there's still no "standard" way to solve the issue. Previously, tr would just skip those classes, but it's much better when it resolves a [=c=] to a normal c instead of treating it as a literal. Also, reflect recent changes in the manpage (octal escapes) and fix the markup in some areas.	2015-01-28 19:44:05 +01:00
FRIGN	ee843a2e09	Fix segmentation fault in tr(1) and make the parser stricter.	2015-01-24 23:00:34 +01:00
FRIGN	eb57becb38	Add octal sequence support to tr(1)	2015-01-24 22:43:46 +01:00
sin	98d759a274	Add license remark to tr.c	2015-01-20 15:26:08 +00:00
FRIGN	7d3e9c6e88	Resolve escape characters in tr(1) This is one aspect which I think has blown up the complexity of many tr-implementations around today. Instead of complicating the set-theory-based parser itself (he should still be relying on one rune per char, not multirunes), I added a preprocessor, which basically scans the code for upcoming '\'s, reads what he finds, substitutes the real character onto '\'s index and shifts the entire following array so there are no "holes". What is left to reflect on is what to do with octal sequences. I have a local implementation here, which works fine, but imho, given tr is already so focused on UTF-8, we might as well ignore POSIX at this point and rather implement the unicode UTF-8 code points, which are way more contemporary and future-proof. Reading in \uC3A4 as an array of 0xC3 and 0xA4 is not the issue, but I'm still struggling to find a way to turn it into a well-formed byte sequence. Hit me with a mail if you have a simple solution for that.	2015-01-15 11:01:52 +00:00
FRIGN	7a644aea7d	Fix mapping a class to a simple set and improve error-reporting It's standard behaviour to map a whole class of matched objects to the last element of a given simple set2 instead of just passing it through. Also, error out more strictly when the user gives us bogus sets.	2015-01-12 11:19:43 +00:00
FRIGN	0f90528df7	Add proper casts and fix a small error	2015-01-11 22:35:15 +00:00
FRIGN	09704afc24	Add Unicode character class support Thinking about it long enough, the solution seems almost trivial.	2015-01-11 22:35:15 +00:00
FRIGN	369bb01eb1	Prevail order	2015-01-10 19:56:34 +00:00
Hiltjo Posthuma	14c5ab48d5	tr: set2 must be set in some cases echo abc | tr 'a' '' would crash because of: m--; r = set2[m].start + (off1 - off2) / set2[m].quant; if set2ranges > 0 it's fine.	2015-01-10 18:16:43 +00:00
Hiltjo Posthuma	cf714e6edb	tr: fix signed/unsigned warnings	2015-01-10 17:00:01 +00:00
sin	1f3345b9e6	Staticise some symbols in tr(1)	2015-01-10 14:26:32 +00:00
FRIGN	a582cb8a2f	Rewrite tr(1) in a sane way tr(1) always used to be a saddening part of sbase, which was inherently broken and crufted. But to be fair, the POSIX-standard doesn't make it very simple. Given the current version was unfixable and broken by design, I sat down and rewrote tr(1) very close to the concept of set theory and the POSIX-standard with a few exceptions: - UTF-8: not allowed in POSIX, but in my opinion a must. This finally allows you to work with UTF-8 streams without problems or unexpected behaviour. - Equivalence classes: Left out, even GNU coreutils ignore them and depending on LC_COLLATE, which sucks. - Character classes: No experiments or environment-variable-trickery. Just plain definitions derived from the POSIX-standard, working as expected. I tested this thoroughly, but expect problems to show up in some way given the wide range of input this program has to handle. The only thing left on the TODO is to add support for literal expressions ('\n', '\t', '\001', ...) and probably rethinking the way [_*n] is unnecessarily restricted to string2.	2015-01-10 14:26:30 +00:00
Evan Gates	84b08427a1	remove agetline	2014-11-18 21:05:28 +00:00
FRIGN	ec8246bbc6	Un-boolify sbase It actually makes the binaries smaller, the code easier to read (gems like "val == true", "val == false" are gone) and actually predictable in the sense of that we actually know what we're working with (one bitwise operator was quite adventurous and should now be fixed). This is also more consistent with the other suckless projects around which don't use boolean types.	2014-11-14 10:54:20 +00:00
FRIGN	7d2683ddf2	Sort includes and more cleanup and fixes in util/	2014-11-14 10:54:10 +00:00
FRIGN	eee98ed3a4	Fix coding style It was about damn time. Consistency is very important in such a big codebase.	2014-11-13 18:08:43 +00:00
sin	0c5b7b9155	Stop using EXIT_{SUCCESS,FAILURE}	2014-10-02 23:46:59 +01:00
sin	ac402965d5	Fix comment style and nuke stray whitespace	2014-07-16 20:43:29 +01:00
Adria Garriga	b3a63a60e4	Improved tr - Added support for character ranges ( a-z ) - Added support for complementary charset ( -c ), only in delete mode - Added support for octal escape sequences - Unicode now only works when there are no octal escape sequences, otherwise behavior is not predictable at first sight. - tr now supports null characters in the input - Does not yet have support for character classes ( [:upper:] )	2014-07-16 20:40:54 +01:00
Hiltjo Posthuma	fab4b384e7	use agetline instead of agets also use agetline where fgets with a static buffer was used previously. Signed-off-by: Hiltjo Posthuma <hiltjo@codemadness.org>	2014-06-01 18:03:10 +01:00
Silvan Jegen	4e13ff39c3	Wrap mbtowc to check for errors	2014-04-12 21:29:16 +01:00
sin	bc13aa5960	No need to cast return value of mmap() in tr	2014-04-12 20:33:59 +01:00
Hiltjo Posthuma	a8f45b4568	tr: change delete behaviour when one argument is specified use delete behaviour again Signed-off-by: Hiltjo Posthuma <hiltjo@codemadness.org>	2014-04-12 20:33:10 +01:00
Hiltjo Posthuma	ff474a8cbc	tr: add dflag, error with usage() on invalid flag combination Signed-off-by: Hiltjo Posthuma <hiltjo@codemadness.org>	2014-04-09 15:40:21 +01:00
Hiltjo Posthuma	3e49e946b7	tr: fix escape code handling in set2 Signed-off-by: Hiltjo Posthuma <hiltjo@codemadness.org>	2014-04-09 15:40:04 +01:00
sin	e9a4af87bd	Staticise functions in tr(1)	2014-01-25 22:07:40 +00:00
sin	fe6144793f	Check mmap() return value and unmap at the end	2014-01-20 11:28:21 +00:00
Silvan Jegen	38f429a3d2	Add the tr program including man page	2014-01-20 11:22:28 +00:00

30 Commits
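
Commit 7d3e9c6e88 describes an escape preprocessor that finds each '\', writes the resolved character at the backslash's index, and shifts the rest of the array left so no "holes" remain. The following is a minimal sketch of that idea only, not the code from tr.c or libutil. The function name unescape_sketch and the small set of handled escapes (\n, \t, \\) are assumptions made for illustration; octal sequences are left out.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch, not the sbase implementation: resolve a few
 * simple escapes in place and return the new string length.
 */
size_t
unescape_sketch(char *s)
{
	size_t i, len = strlen(s);
	char c;

	for (i = 0; i < len; i++) {
		/* skip ordinary bytes and a lone trailing backslash */
		if (s[i] != '\\' || i + 1 >= len)
			continue;
		switch (s[i + 1]) {
		case 'n':  c = '\n'; break;
		case 't':  c = '\t'; break;
		case '\\': c = '\\'; break;
		default:   continue; /* unknown escape: left untouched */
		}
		/* put the real character at the backslash's index */
		s[i] = c;
		/* shift the tail, including the NUL, left by one byte */
		memmove(s + i + 1, s + i + 2, len - i - 1);
		len--;
	}
	return len;
}
```

Because the string only ever shrinks, the rewrite can happen in the caller's buffer without allocation. The set parser then sees one character per escape and needs no escape handling of its own.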