author | Haoran S. Diao (刁浩然) <0@hairydiode.xyz> | 2023-11-23 05:15:53 -0800 |
---|---|---|
committer | Haoran S. Diao (刁浩然) <0@hairydiode.xyz> | 2023-11-23 05:15:53 -0800 |
commit | 75094d65209fd457f5222809ea5570473b1d411f (patch) | |
tree | a30432be2ed5ec1a8819c4855312644363ba778d | |
parent | f53115e196c7fec7f4ab5cc04dc8f90d125aa480 (diff) |
more presentable README
-rw-r--r-- | README | 61 |
1 files changed, 42 insertions, 19 deletions
@@ -1,30 +1,53 @@
+-----------------------------------=[Sources]=----------------------------------
 Unihan Database from
- https://www.unicode.org/Public/UNIDATA/
-Unihan_DictionaryLikeData.txt
- has all the four corners info
+ https://www.unicode.org/Public/UNIDATA/Unihan.zip
+ Unihan_DictionaryLikeData.txt
+ has four corners info for about 16k characters
-Logo genrrated here:
+Logo generated here:
 https://www.zhuanshuti.cn/3
+---------------------------------=[How to Use]=---------------------------------
+Installation]=---
+$ make
+$ sudo make install
+$ ibus restart
+
+Use]=---
+$ ibus engine table:fc
+
+Keybinds]=---
+F1 to F9 are the character selectors
+` is the wildcard character
+
+Note that you can essentially type english and chinese at the same time without
+changing input methods. If you need to type a number or use english punctuation,
+simply press left shift and it will change into english mode.
+
+
+-----------------------------=[Method of Creation]=-----------------------------
+These are just notes on how the file was created, it's largely manual and very
+janky but it works.
+If the UniHan database updates we can do all this again
+
+
+$ cat Unihan_DictionaryLikeData.txt | grep kFourCornerCode
-grep kFourCornerCode
 delete comment line
-:%s/\(.*\s.*\s\)\(.*\)\s\(.*\)/\1\2\r\1\3/
- removes duplicate four corners
-clean it up so its
-12345 U+212121
-then turn to echo $'123445 \u12341'
-using two seperate subsitutes for 4 and 5 char
-change to 8 character length
+In vim:
+ :%s/\(.*\s.*\s\)\(.*\)\s\(.*\)/\1\2\r\1\3/
+ changes duplicate four corner codes to be seperate entries
-then use the following to convert to actual unicode
-:%s/^\(.*\)\t\(.*\)$/echo -e "\1\\t$(echo \2 |xxd -r -ps -u | iconv -f UTF-32BE -t UTF-8)"/
+In vim edit to format :
+ 12345 000ABC12
+ where the hex code is exactly 8 digits long
-xxd -r -ps -u | iconv -f UTF-32BE -t UTF-8
- coverts fro U+code to normal, need to pad to 32bits
+then use the following to convert to actual unicode
+ :%s/^\(.*\)\t\(.*\)$/echo -e "\1\\t$(echo \2 |xxd -r -ps -u | iconv -f UTF-32BE -t UTF-8)"/
-create tempalte according to usr/share/ibus-input/tables/template.txt
+The following command converts hex code to actual unicode char
+$ xxd -r -ps -u | iconv -f UTF-32BE -t UTF-8
+Create template according to usr/share/ibus-input/tables/template.txt
-cat four | awk '{print $1}' | sort | uniq -c | sort -n
- counts 71 conflicts at most
+Counting conflicts
+$ cat four | awk '{print $1}' | sort | uniq -c | sort -n
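The manual vim steps in the "Method of Creation" notes (split entries that carry several four corner codes, pad the code point to 8 hex digits) and the conflict count could probably be scripted. The following is only a rough sketch, not what the author ran: the function names fc_rows and fc_max_conflicts are made up for illustration, and it assumes the Unihan file is tab-separated with multiple codes separated by spaces inside the third field. Codes are left exactly as they appear in the input, and the xxd/iconv conversion to actual characters is not covered.

```shell
#!/bin/sh
# Sketch only: fc_rows and fc_max_conflicts are hypothetical helper names,
# they are not part of the repository.

# Read Unihan_DictionaryLikeData.txt-style lines on stdin and print one
# "code<TAB>8-digit-hex" row per four corner code.
# Assumption: fields are tab-separated and several codes for one character
# are space-separated inside the third field.
fc_rows() {
    awk -F '\t' '
        /^#/ { next }                       # skip comment lines
        $2 == "kFourCornerCode" {
            hex = substr($1, 3)             # strip the leading "U+"
            while (length(hex) < 8)         # pad to 32 bits for UTF-32BE
                hex = "0" hex
            n = split($3, codes, " ")       # one output row per code
            for (i = 1; i <= n; i++)
                print codes[i] "\t" hex
        }'
}

# Read rows on stdin and print the largest number of rows that share
# the same first column (the worst-case conflict count).
fc_max_conflicts() {
    awk '{ print $1 }' | sort | uniq -c | sort -n | tail -n 1 | awk '{ print $1 }'
}

# Possible usage (the file name "four" is the one the notes use):
#   fc_rows < Unihan_DictionaryLikeData.txt > four
#   fc_max_conflicts < four
```

Doing the padding in awk avoids the separate manual edit to reach the 8-digit form that the UTF-32BE conversion expects, and fc_max_conflicts is the same sort | uniq -c pipeline as in the notes with only the top count kept.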