Diffstat (limited to 'cont/unihome.html')
-rw-r--r-- cont/unihome.html 187
1 files changed, 187 insertions, 0 deletions
diff --git a/cont/unihome.html b/cont/unihome.html
new file mode 100644
index 0000000..30b8dd7
--- /dev/null
+++ b/cont/unihome.html
@@ -0,0 +1,187 @@
+<!--
+123456789-223456789-323456789-423456789-523456789-623456789-723456789-8234567890
+一二三四-->[TITLE] [DATE]
+--------------------------------------------------------------------------------
+[SETTITLE]We Have Unicode at Home
+[SETDATE]6-30-2023
+So as we all know, the Linux console is limited to 512 characters, and lives in
+kernel space. So I wrote a workaround that displays Unicode characters as
+braille graphics (assuming your Linux console font has braille characters)
+using only userland busybox.
+
+
+--------------------------=[Part 1. Braille Graphics]=--------------------------
+Braille graphics are actually really easy. The braille block goes from U+2800
+to U+28FF, with the lower 8 bits corresponding to the dots in each braille
+character in the following order:
+
+#0 3
+#1 4
+#2 5
+#6 7
+
+with 0 being the lowest bit and 7 being the highest bit.
+
+UTF-8 encodes this codepoint with three bytes:
+
+1110xxxx 10xxxxxx 10xxxxxx
+
+where x represents the bits of the codepoint, therefore U+2800 converted to
+UTF-8 is 0xE2A080 (big endian) or 14852224 in decimal (I'll explain why decimal
+is relevant later).
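
The arithmetic above can be checked with a minimal sketch. The function name
utf8_3byte is made up for this illustration (it is not from the bbrll code),
and it only uses division and modulo, so it makes no assumption about bitwise
operators being available:

```shell
#!/bin/sh
# Hypothetical helper: pack a codepoint in the range U+0800..U+FFFF into its
# 3-byte UTF-8 form and print the three bytes as one decimal integer.
utf8_3byte() {
    cp=$1
    b1=$((224 + cp / 4096))         # 1110xxxx: top 4 bits of the codepoint
    b2=$((128 + (cp / 64) % 64))    # 10xxxxxx: middle 6 bits
    b3=$((128 + cp % 64))           # 10xxxxxx: low 6 bits
    echo $((b1 * 65536 + b2 * 256 + b3))
}

utf8_3byte 10240    # 10240 is U+2800 in decimal; prints 14852224
```

Note that the low 6 bits of the codepoint sit at the bottom of the last byte,
while bits 6 and 7 end up at the bottom of the second byte.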
+
+If you take each pixel in the buffer, shift it according to the above chart
+(adjusting for where UTF-8 puts the bits: dots 6 and 7 land in the second
+byte), and OR it with the encoded base codepoint, you get your desired braille
+character.
+
+The problem is that bash cannot do bitwise operations, and that it calls a
+separate process for conversion from hex to decimal. So our code ends up looking
+like this:
+
+ # Pixel (1,1) of the current block is dot 4, worth 1 shifted left by 4 = 16.
+ if [ "${rawbuff[1+4*$2]:1+2*$1:1}" = "1" ]; then
+     num=$((num + 16))
+ fi
+
+ where $num starts off as 14852224, the raw pixel buffer stores each row as
+ a string in which '1' represents a filled-in pixel, and the x and y
+ position of the braille block being rendered are $1 and $2.
+
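+The above code takes the value of the raw pixel buffer at position (1,1)
+relative to the current braille block and, if it is set, adds 16 (1 shifted
+left by 4) to the rendered braille character. Since each bit is only ever
+added once, the addition has the same effect as an OR.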
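
Extending the same idea to all eight dots could look like the sketch below.
This is an illustration, not the bbrll code: braille_cell and emit_utf8 are
hypothetical names, and the cell is passed as four 2-character rows of '0'/'1'
instead of being read out of a rawbuff array. The weights follow the dot chart
above, with dots 6 and 7 worth 256 and 512 because they sit in the second
UTF-8 byte.

```shell
#!/bin/sh
# Hypothetical helper: build one braille cell from four rows of two pixels.
# Starts at 14852224 (U+2800 as UTF-8) and adds one weight per set pixel.
braille_cell() {
    num=14852224
    lw=1 rw=8                        # row 0: dot 0 (left) and dot 3 (right)
    row=0
    for px in "$1" "$2" "$3" "$4"; do
        if [ "$row" -eq 3 ]; then    # bottom row: dots 6 and 7
            lw=256 rw=512
        fi
        case $px in 1?) num=$((num + lw)) ;; esac    # left pixel is set
        case $px in ?1) num=$((num + rw)) ;; esac    # right pixel is set
        lw=$((lw * 2)) rw=$((rw * 2))
        row=$((row + 1))
    done
    echo "$num"
}

# Hypothetical helper: print a 3-byte UTF-8 sequence given as a decimal
# integer, one byte at a time, using printf's octal escapes.
emit_utf8() {
    printf "\\$(printf '%03o' $(($1 / 65536)))"
    printf "\\$(printf '%03o' $(($1 / 256 % 256)))"
    printf "\\$(printf '%03o' $(($1 % 256)))"
}

emit_utf8 "$(braille_cell 11 10 10 11)"    # draws a small "L" shape
echo
```

Calling braille_cell 00 01 00 00 sets only the pixel at (1,1), which gives
14852224 + 16, the same case as the snippet above.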
+
+
+I also wrote some code to take commands that draw in the raw pixel buffer as
+well.
+
+code <a href="https://hairydiode.xyz/cgit/bbrll.git/tree/bbrll">here</a>
+
+----------------=[Part 2. Rendering BDF fonts with only busybox]=---------------
+BDF is a human legible bitmap font format where each character entry looks like:
+
+STARTCHAR uni6D69
+ENCODING 28009
+SWIDTH 1000 0
+DWIDTH 8 0
+BBX 7 7 0 -1
+BITMAP
+98
+1C
+A8
+3E
+80
+9C
+9C
+ENDCHAR
+
+ Source: Misaki Mincho. Side note: the entire font is only 746K despite
+ the insanely inefficient format and the large number of characters
+ supported, meanwhile TeX Live is installing multiple 50 megabyte fonts
+ that only support Latin.
+
+The first line is the character's name (here derived from the Unicode
+codepoint), followed by some info I don't care about, and the bitmap data of
+the character, where each row is stored as a line of hex. If we convert the hex
+to binary, it will be the "raw pixel format" from before. So all we really need
+to do is write a small awk script to find the relevant bitmap lines, then
+convert them to binary and display them with the previous braille display
+script.
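
A rough sketch of those two steps follows. It is not the fontd script:
glyph_rows and hex_to_bits are hypothetical names, the awk program assumes the
ENCODING line comes before BITMAP inside each entry (as in the sample above),
and the hex conversion is a plain lookup table so it needs no bitwise
operations.

```shell
#!/bin/sh
# Hypothetical helper: print the bitmap rows of one glyph.
# $1 = decimal codepoint as found on the ENCODING line, $2 = BDF file.
glyph_rows() {
    awk -v enc="$1" '
        $1 == "ENCODING" { want = ($2 == enc) }   # is this the glyph we want?
        $1 == "ENDCHAR"  { grab = 0 }             # bitmap rows end here
        grab             { print }                # inside the wanted bitmap
        $1 == "BITMAP"   { grab = want }          # rows start on the next line
    ' "$2"
}

# Hypothetical helper: convert one row of hex digits to '0'/'1' pixels.
hex_to_bits() {
    s=$1 out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        d=${s%"$rest"}       # the first character itself
        case $d in
            0) out=${out}0000 ;; 1) out=${out}0001 ;;
            2) out=${out}0010 ;; 3) out=${out}0011 ;;
            4) out=${out}0100 ;; 5) out=${out}0101 ;;
            6) out=${out}0110 ;; 7) out=${out}0111 ;;
            8) out=${out}1000 ;; 9) out=${out}1001 ;;
            [Aa]) out=${out}1010 ;; [Bb]) out=${out}1011 ;;
            [Cc]) out=${out}1100 ;; [Dd]) out=${out}1101 ;;
            [Ee]) out=${out}1110 ;; [Ff]) out=${out}1111 ;;
        esac
        s=$rest
    done
    echo "$out"
}

hex_to_bits 98    # first row of the sample glyph; prints 10011000
```

With a font file at hand, something like
glyph_rows 28009 font.bdf | while read -r r; do hex_to_bits "$r"; done
would print the glyph as rows of pixels (font.bdf is a placeholder name).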
+
+Complete Character Display code <a href="https://hairydiode.xyz/cgit/bbrll.git/tree/fontd">here</a>
+
+-------------------------=[Part 3. UTF-8 Shenanigans.]=-------------------------
+One annoying thing about UTF-8 is that if you want to get the codepoint of a
+particular character in a UTF-8 string, you have to do some iconv trickery where
+you first convert it to UTF-32, then convert it to hex.
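
One way that trickery might look is sketched below. codepoint_hex is a
hypothetical name, and the sketch assumes an iconv binary that knows the
UTF-32BE encoding name (the big-endian variant is used so that no byte order
mark ends up in the output) plus od and tr for the hex dump.

```shell
#!/bin/sh
# Hypothetical helper: print the codepoint of a single UTF-8 character as
# 8 lowercase hex digits.
codepoint_hex() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-32BE | od -An -tx1 | tr -d ' \n'
    echo
}

# The octal escapes are the UTF-8 bytes E6 B5 A9, which encode U+6D69.
codepoint_hex "$(printf '\346\265\251')"    # prints 00006d69
```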
+
+Another problem is that BDF stores the codepoint as DECIMAL!!!!! You see that
+line "STARTCHAR uni6D69"? That's just the name of the character, it could
+theoretically be anything. The actual line storing the codepoint is
+"ENCODING 28009". So we have to convert from hex to decimal, which is a
+surprisingly convoluted procedure in bash.
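
For reference, one possible shortcut is sketched below, assuming the shell's
printf accepts 0x-prefixed numbers for %d (POSIX printf is specified to, but
treat that as an assumption for any particular busybox build). hex_to_dec is
a hypothetical name, not something from the wrapper script.

```shell
#!/bin/sh
# Hypothetical helper: convert a hex codepoint (leading zeros are fine) to
# the decimal form used on BDF ENCODING lines.
hex_to_dec() {
    printf '%d\n' "0x$1"
}

echo "ENCODING $(hex_to_dec 6D69)"    # prints ENCODING 28009
```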
+
+All this is done in a wrapper script that reads all the input from stdin and
+displays it using all the fonts in a directory given as its argument.
+
+wrapper script code <a href="https://hairydiode.xyz/cgit/bbrll.git/tree/fontd">here</a>
+
+----------------------------=[Part 4. Practical Use]=---------------------------
+So remember the janky bash-based IM from last time? I modified it to use the
+braille display from before. I also wrote a little script that displays all the
+non-ASCII characters in the previously focused tmux pane, so together we can
+both display and input UTF-8 characters in the Linux console using tmux.
+
+see the <a href="https://hairydiode.xyz/cgit/bim.git">code</a> and <a href="https://hairydiode.xyz/jankime">writeup</a>
+
+
+"Screenshots" below:
+
+Bash running in tmux
+[usernm@cm│[usernm@cmphostname ~]$ mkdir 帖 │乔
+phostname │[usernm@cmphostname ~]$ cd 帖 │pdr
+~]$ ud │[usernm@cmphostname 帖]$ vim 天干 │⢠⠋⣏⡁⡆⡇⠀⠀⠁
+⡤⡧⡄⠀⡧⠄⠀⠀ │ │⢹⠔⢅⠇⡇⡇⠀⠀⠀
+⡇⡇⡇⡖⠓⡆⠀⠀ │ │⠸⠠⠊⠀⠥⠇⠀⠀⠂
+⠁⠏⠁⠧⠤⠇⠀⠀ │ │⣲⡪⢰⣓⣲⠀⠀⠀
+⡤⡧⡄⠀⡧⠄⠀⠀ │ │⠒⣱⠘⡖⡞⠀⠀⠀
+⡇⡇⡇⡖⠓⡆⠀⠀ │ │⠩⠜⠠⠃⠧⠇⠀⠀
+⠁⠏⠁⠧⠤⠇⠀⠀ │ │⢠⠴⠥⠤⡄⠀⠀⠀
+⡤⡧⡄⠀⡧⠄⠀⠀ │ │⠸⢭⠭⡭⠇⠀⠀⠀
+⡇⡇⡇⡖⠓⡆⠀⠀ │ │⠤⠊⠀⠣⠤⠇⠀⠀
+⠁⠏⠁⠧⠤⠇⠀⠀ │ │
+⠉⠉⢹⠉⠉⠁⠀⠀ │ │
+⠉⠉⡝⡍⠉⠁⠀⠀ │ │
+⠤⠊⠀⠈⠢⠄⠀⠀ │ │
+⠈⠉⢹⠉⠉⠀⠀⠀ │ │
+⠒⠒⢺⠒⠒⠂⠀⠀ │ │
+⠀⠀⠸⠀⠀⠀⠀⠀ │ │
+[usernm@cm│ │
+phostname │ │
+~]$ │ │
+ │ │
+The left pane is displaying all the Unicode characters in the primary terminal
+(remember, on the Linux console they would all just be squares), and the right
+pane is the input method, which displays candidate characters in bash.
+
+Vim running in tmux
+⡇⡇⡇⡖⠓⡆⠀⠀ │甲乙丙丁 │之 鐻
+⠁⠏⠁⠧⠤⠇⠀⠀ │ 最常用 │azn
+⡤⡧⡄⠀⡧⠄⠀⠀ │~ │⠤⠤⠼⠤⢤⠀⠀⠀
+⡇⡇⡇⡖⠓⡆⠀⠀ │~ │⠀⠀⣀⠔⠁⠀⠀⠀
+⠁⠏⠁⠧⠤⠇⠀⠀ │~ │⠔⠉⠒⠤⠤⠄⠀⠀
+⡤⡧⡄⠀⡧⠄⠀⠀ │~ │⣊⡂⣀⣗⣒⠀⠀⠀
+⡇⡇⡇⡖⠓⡆⠀⠀ │~ │⢺⡂⣗⢗⡖⡃⠀⠀
+⠁⠏⠁⠧⠤⠇⠀⠀ │~ │⠽⠴⠑⠝⠘⠄⠀⠀
+⠉⠉⢹⠉⠉⠁⠀⠀ │~ │
+⠉⠉⡝⡍⠉⠁⠀⠀ │~ │
+⠤⠊⠀⠈⠢⠄⠀⠀ │~ │
+⠈⠉⢹⠉⠉⠀⠀⠀ │~ │
+⠒⠒⢺⠒⠒⠂⠀⠀ │~ │
+⠀⠀⠸⠀⠀⠀⠀⠀ │~ │
+[usernm@cm│~ │
+phostname │~ │
+~]$ ud │~ │
+⣏⣉⣹⣉⣉⡇⠀⠀ │~ │
+⠧⠤⢼⠤⠤⠇⠀⠀ │~ │
+⠀⠀⠸⠀⠀⠀⠀⠀ │~ │
+⠉⠉⢉⠝⠋⠀⠀⠀ │~ │
+⢀⠔⠁⠀⠀⡀⠀⠀ │~ │
+⠣⠤⠤⠤⠤⠃⠀⠀ │~ │
+⣉⣉⣹⣉⣉⡁⠀⠀ │~ │
+⡇⢀⠜⢄⠀⡇⠀⠀ │~ │
+⠇⠁⠀⠀⠥⠇⠀⠀ │~ │
+⠉⠉⢹⠉⠉⠁⠀⠀ │~ │
+⠀⠀⢸⠀⠀⠀⠀⠀ │~ │
+⠀⠠⠼⠀⠀⠀⠀⠀ │~ │
+⢸⠭⠭⠭⢽⠀⠀⠀ │~ │
+⢹⠭⡏⡭⠭⡅⠀⠀ │~ │
+⠚⠉⠇⠬⠪⠄⠀⠀ │~ │
+⡖⣓⣚⣒⡓⡆⠀⠀ │~ │
+⢀⣓⣲⣒⣃⠀⠀⠀ │~ │
+⠘⠀⠸⠀⠚⠀⠀⠀ │~ │
+⢸⣉⣹⣉⣹⠀⠀⠀ │~ │
+⢸⠤⢼⠤⢼⠀⠀⠀ │~ │
+⠎⠀⠸⠀⠼⠀⠀⠀ │~ │
+[usernm@cm│~ │
+phostname │~ │
+~]$ │-- INSERT -- 2,11-15 All │