Commit Graph

4 Commits

Author SHA1 Message Date
FRIGN
ab9b240dc6 Fix warnings and update isalpharune() 2015-02-12 17:08:02 +01:00
FRIGN
ce11e1f195 Add section for laces in lowerrune and upperrune and more ranges
This is a special third kind of structure found in Unicode, besides
singletons and ranges.
This dramatically reduces the number of explicit singletons in the
lookup tables.
Also, I changed the awk-script so that it can sort trivial
translations as well, breaking down the LOC even more.

The binary size of tr dropped from 67K to 51K.
2015-02-12 16:18:02 +01:00
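
The three table kinds named in the commit above (ranges, laces, singletons) can be illustrated with a minimal sketch. This is not the libutf code: the struct names, the table contents and the function sketch_lowerrune() are hypothetical, the tables hold only a handful of entries, and a linear scan is used for brevity. A lace is taken here to mean a block in which uppercase and lowercase code points alternate, so a single (lo, hi) pair stands in for what would otherwise be one singleton per letter.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table layouts, for illustration only. */
struct range  { uint32_t lo, hi; int32_t delta; }; /* lo..hi all shift by delta */
struct lace   { uint32_t lo, hi; };                /* upper, lower, upper, lower, ... */
struct single { uint32_t from, to; };              /* one explicit mapping */

static const struct range lower_ranges[] = {
	{ 0x0041, 0x005A, 32 }, /* A-Z -> a-z */
	{ 0x0391, 0x03A1, 32 }, /* Greek Alpha-Rho -> alpha-rho */
};

static const struct lace lower_laces[] = {
	{ 0x0100, 0x012F },     /* Latin Extended-A: even = upper, odd = lower */
};

static const struct single lower_singles[] = {
	{ 0x0178, 0x00FF },     /* Y with diaeresis */
};

#define LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Return the lowercase form of r, or r itself if no table matches. */
uint32_t
sketch_lowerrune(uint32_t r)
{
	size_t i;

	for (i = 0; i < LEN(lower_ranges); i++)
		if (r >= lower_ranges[i].lo && r <= lower_ranges[i].hi)
			return r + (uint32_t)lower_ranges[i].delta;

	for (i = 0; i < LEN(lower_laces); i++)
		if (r >= lower_laces[i].lo && r <= lower_laces[i].hi)
			/* Even offsets are uppercase; their lowercase form follows directly. */
			return ((r - lower_laces[i].lo) % 2 == 0) ? r + 1 : r;

	for (i = 0; i < LEN(lower_singles); i++)
		if (r == lower_singles[i].from)
			return lower_singles[i].to;

	return r;
}
```

In this sketch the one lace entry covers 24 case pairs (U+0100 to U+012F), which shows how laces can shrink the singleton table.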
FRIGN
9565eef895 Refactor uppercase-inclusion in libutf
Previously, the to*rune functions had to juggle two
arrays, and it somehow escaped me that it is actually much simpler
to just add another entry to the arrays if needed.
Binary size goes slightly down, e.g. tr statically linked against
musl: 68072 -> 67688

Behind the scenes, though, the conversion should be a bit faster and,
more importantly, the scary case-conversion function is simplified
and easier to understand.

It also drops nearly half the LOC in upperrune.c and lowerrune.c.
2015-02-12 12:28:45 +01:00
FRIGN
f9846a9a6b Split up is*rune() and to*rune() functions into individual source files
This reduces the binary size of each tool that uses these functions.
Previously, if a program used just a single function, maybe even a
one-liner, it would statically link in all lookup tables, bloating
the binary by up to 20K.
All these changes are derived from a local libutf where I make the
primary changes. I hope that I can merge them into libutf
sooner or later, as discussed on the mailing list.
2015-02-11 15:48:18 +01:00