From 75094d65209fd457f5222809ea5570473b1d411f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Haoran=20S=2E=20Diao=20=28=E5=88=81=E6=B5=A9=E7=84=B6=29?= <0@hairydiode.xyz>
Date: Thu, 23 Nov 2023 05:15:53 -0800
Subject: more presentable README

---
 README | 61 ++++++++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 42 insertions(+), 19 deletions(-)

(limited to 'README')

diff --git a/README b/README
index 103738f..7be31d4 100644
--- a/README
+++ b/README
@@ -1,30 +1,53 @@
+-----------------------------------=[Sources]=----------------------------------
 Unihan Database from
-	https://www.unicode.org/Public/UNIDATA/
-Unihan_DictionaryLikeData.txt
-	has all the four corners info
+	https://www.unicode.org/Public/UNIDATA/Unihan.zip
+	Unihan_DictionaryLikeData.txt
+		has four corners info for about 16k characters
 
-Logo genrrated here:
+Logo generated here:
 	https://www.zhuanshuti.cn/3
 
+---------------------------------=[How to Use]=---------------------------------
+Installation]=---
+$ make
+$ sudo make install
+$ ibus restart
+
+Use]=---
+$ ibus engine table:fc
+
+Keybinds]=---
+F1 to F9 are the character selectors
+` is the wildcard character
+
+Note that you can essentially type english and chinese at the same time without
+changing input methods. If you need to type a number or use english punctuation,
+simply press left shift and it will change into english mode.
+
+
+-----------------------------=[Method of Creation]=-----------------------------
+These are just notes on how the file was created, it's largely manual and very
+janky but it works. If the UniHan database updates we can do all this again
+
+
+$ cat Unihan_DictionaryLikeData.txt | grep kFourCornerCode
 
-grep kFourCornerCode
 delete comment line
 
-:%s/\(.*\s.*\s\)\(.*\)\s\(.*\)/\1\2\r\1\3/
-	removes duplicate four corners
-clean it up so its
-12345 U+212121
-then turn to echo $'123445 \u12341'
-using two seperate subsitutes for 4 and 5 char
-change to 8 character length
+In vim:
+	:%s/\(.*\s.*\s\)\(.*\)\s\(.*\)/\1\2\r\1\3/
+		changes duplicate four corner codes to be seperate entries
 
-then use the following to convert to actual unicode
-:%s/^\(.*\)\t\(.*\)$/echo -e "\1\\t$(echo \2 |xxd -r -ps -u | iconv -f UTF-32BE -t UTF-8)"/
+In vim edit to format :
+	12345	000ABC12
+	where the hex code is exactly 8 digits long
 
-xxd -r -ps -u | iconv -f UTF-32BE -t UTF-8
-	coverts fro U+code to normal, need to pad to 32bits
+then use the following to convert to actual unicode
+	:%s/^\(.*\)\t\(.*\)$/echo -e "\1\\t$(echo \2 |xxd -r -ps -u | iconv -f UTF-32BE -t UTF-8)"/
 
-create tempalte according to usr/share/ibus-input/tables/template.txt
+The following command converts hex code to actual unicode char
+$ xxd -r -ps -u | iconv -f UTF-32BE -t UTF-8
 
+Create template according to usr/share/ibus-input/tables/template.txt
-cat four | awk '{print $1}' | sort | uniq -c | sort -n
-	counts 71 conflicts at most
+Counting conflicts
+$ cat four | awk '{print $1}' | sort | uniq -c | sort -n
-- 
cgit v1.1
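
The manual vim steps in the Method of Creation notes (one entry per four corner code, `U+` stripped, hex code padded to 8 digits) could probably be scripted if the Unihan database updates. The following is only a sketch under stated assumptions, not what the patch itself does: the function name `split_codes` is hypothetical, the input is assumed to be tab-separated as in Unihan_DictionaryLikeData.txt, and dropping the `.` from each code is an assumption inferred from the `12345` example format above.

```shell
# Hypothetical helper (not part of the patch): reads
# Unihan_DictionaryLikeData.txt-style lines on stdin and prints
# "<code><TAB><8-digit hex>", one line per four corner code.
split_codes() {
    awk -F '\t' '
        $2 == "kFourCornerCode" {
            hex = substr($1, 3)                      # drop the leading "U+"
            while (length(hex) < 8) hex = "0" hex    # pad to 8 hex digits (32 bits)
            n = split($3, codes, " ")                # one character may list several codes
            for (i = 1; i <= n; i++) {
                code = codes[i]
                gsub(/\./, "", code)                 # assumption: drop the dot from the code
                print code "\t" hex
            }
        }'
}
```

Used as `split_codes < Unihan_DictionaryLikeData.txt > four`, this would yield a file in the `12345	000ABC12` shape that the later conversion and conflict-counting commands operate on. Comment lines and other properties are skipped because their second field is not `kFourCornerCode`.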