<!--
123456789-223456789-323456789-423456789-523456789-623456789-723456789-8234567890
一二三四-->[TITLE]                                                      [DATE]
--------------------------------------------------------------------------------
[SETTITLE]We Have Unicode at Home
[SETDATE]6-30-2023
So as we all know, the Linux console is limited to 512 glyphs, and lives in
kernel space. So I wrote a workaround that displays unicode characters using
braille characters (assuming your Linux console font has them), using only
userland busybox.


--------------------------=[Part 1. Braille Graphics]=--------------------------
Braille graphics are actually really easy. The braille block goes from U+2800
to U+28FF, with the lower 8 bits corresponding to the dots in each braille
character in the following order:

#0 3
#1 4
#2 5
#6 7

with 0 being the lowest bit and 7 being the highest bit. 

UTF-8 encodes this codepoint with three bytes:

1110xxxx 	10xxxxxx 	10xxxxxx

where x represents the bits of the codepoint. U+2800 converted to UTF-8 is
therefore 0xE2A080 (big endian), or 14852224 in decimal (I'll explain why
decimal is relevant later).
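To double check that number, here is a minimal sketch of the three-byte packing in plain shell arithmetic. The helper name utf8_pack3 is made up for this example (it is not part of bbrll), and it only handles codepoints from U+0800 to U+FFFF, which is the range that uses the three-byte form.

```shell
#!/bin/sh
# Sketch: pack a codepoint into the decimal value of its 3-byte UTF-8 form.
# utf8_pack3 is a hypothetical helper name, not taken from bbrll.
# Usage: utf8_pack3 CODEPOINT   (decimal, U+0800..U+FFFF only)
utf8_pack3() {
	cp=$1
	b1=$((224 + cp / 4096))        # 1110xxxx: top 4 bits of the codepoint
	b2=$((128 + cp / 64 % 64))     # 10xxxxxx: middle 6 bits
	b3=$((128 + cp % 64))          # 10xxxxxx: low 6 bits
	# Treat the three bytes as one big-endian number
	echo $((b1 * 65536 + b2 * 256 + b3))
}
```

Calling `utf8_pack3 10240` (10240 is 0x2800) prints 14852224, and `utf8_pack3 10495` (0x28FF, all eight dots set) prints 14853055.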

If you take each pixel in the buffer, shift it according to the above chart
(adjusted for where the bits land in the UTF-8 encoding), and OR it into the
encoded base codepoint, you get your desired braille character.

The problem is that bash cannot do bitwise operations, and that it calls a
separate process for conversion from hex to decimal. So our code ends up looking
like this:

	if [ "${rawbuff[((1+4*$2))]:((1+2*$1)):1}" == "1" ]; then
		num=$(($num + 16))
	fi

	where $num starts off as 14852224, the raw pixel buffer stores each row
	as a string in which '1' represents a filled-in pixel, and the x and y
	position of the braille block being rendered are in $1 and $2.

The above code takes the value of the raw pixel buffer at position (1,1)
relative to the current braille block, shifts it by 4 (giving 16), then ORs it
into the rendered braille character. Adding works as an OR here because each
bit is only ever set once.
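Putting the whole chart together, a full braille cell could be sketched roughly like this. This is a simplified POSIX sh variant, not the actual bbrll code: it takes the four pixel rows of one cell as arguments instead of indexing a rawbuff array, and the names pixel_on and braille_cell are made up for the example. Dots 6 and 7 get the weights 256 and 512 because those two bits land in the second UTF-8 byte.

```shell
#!/bin/sh
# Sketch: build one braille cell from a 2x4 block of pixels.
# Each of the four arguments is one pixel row, a 2-character string where
# '1' is a filled pixel (anything else counts as empty).
# pixel_on and braille_cell are hypothetical names, not taken from bbrll.

# pixel_on ROW X: succeed when character X (0-based) of ROW is '1'
pixel_on() {
	[ "$(printf '%s' "$1" | cut -c"$(($2 + 1))")" = "1" ]
}

braille_cell() {
	num=14852224                            # U+2800 as UTF-8, 0xE2A080
	# Dots 0-5 land in the low six bits of the third byte.
	pixel_on "$1" 0 && num=$((num + 1))     # dot 0: column 0, row 0
	pixel_on "$2" 0 && num=$((num + 2))     # dot 1: column 0, row 1
	pixel_on "$3" 0 && num=$((num + 4))     # dot 2: column 0, row 2
	pixel_on "$1" 1 && num=$((num + 8))     # dot 3: column 1, row 0
	pixel_on "$2" 1 && num=$((num + 16))    # dot 4: column 1, row 1
	pixel_on "$3" 1 && num=$((num + 32))    # dot 5: column 1, row 2
	# Dots 6-7 spill into the low two bits of the second byte.
	pixel_on "$4" 0 && num=$((num + 256))   # dot 6: column 0, row 3
	pixel_on "$4" 1 && num=$((num + 512))   # dot 7: column 1, row 3
	# Emit the three bytes; octal escapes keep printf portable.
	printf "\\$(printf '%03o' $((num / 65536)))"
	printf "\\$(printf '%03o' $((num / 256 % 256)))"
	printf "\\$(printf '%03o' $((num % 256)))"
}
```

For example, `braille_cell 11 11 11 11` emits the bytes E2 A3 BF (the full block, U+28FF), and `braille_cell 00 01 00 00` emits E2 A0 90, which is the single dot handled by the snippet above.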


I also wrote some code that takes commands for drawing into the raw pixel
buffer.

code <a href="https://hairydiode.xyz/cgit/bbrll.git/tree/bbrll">here</a>

----------------=[Part 2. Rendering BDF fonts with only busybox]=---------------
BDF is a human-readable bitmap font format where each character entry looks like:

STARTCHAR uni6D69
ENCODING 28009
SWIDTH 1000 0
DWIDTH 8 0
BBX 7 7 0 -1
BITMAP
98
1C
A8
3E
80
9C
9C
ENDCHAR

	Source: Misaki Mincho. Side note: the entire font is only 746K despite
	the insanely inefficient format and the large number of characters
	supported, meanwhile TeX Live is installing multiple 50 megabyte fonts
	that only support Latin.

The first line names the character (here after its unicode codepoint), followed
by some info I don't care about, and then the bitmap data of the character,
where each row of pixels is stored as a line of hex. If we convert the hex to
binary, we get the "raw pixel format" from before. So all we really need to do
is write a small awk script to find the relevant bitmap lines, then convert them
to binary and display them with the previous braille display script.
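As a rough idea of what such an awk script could look like (a sketch only, not the actual fontd code; the function name bdf_glyph_bits is made up here), the following reads a BDF font on stdin and prints the bitmap of one glyph as rows of '0' and '1'. It sticks to awk features that busybox awk should also have, so the hex digits are converted with a lookup table rather than gawk's strtonum.

```shell
#!/bin/sh
# Sketch: print the bitmap of one BDF glyph as rows of '0'/'1' characters.
# bdf_glyph_bits is a hypothetical name, not taken from fontd.
# Usage: bdf_glyph_bits DECIMAL_CODEPOINT < font.bdf
bdf_glyph_bits() {
	awk -v want="$1" '
	BEGIN {
		# nib[n] is the 4-bit pattern of the n-th character of "digits"
		split("0000 0001 0010 0011 0100 0101 0110 0111 " \
		      "1000 1001 1010 1011 1100 1101 1110 1111", nib, " ")
		digits = "0123456789ABCDEF"
	}
	# ENCODING holds the decimal codepoint of the glyph
	$1 == "ENCODING" { found = ($2 == want) }
	# The hex rows start on the line after BITMAP
	$1 == "BITMAP"   { inmap = found; next }
	# Stop once the wanted glyph has been printed
	$1 == "ENDCHAR"  { if (inmap) exit; found = 0 }
	inmap {
		row = ""
		hex = toupper($1)
		for (i = 1; i <= length(hex); i++)
			row = row nib[index(digits, substr(hex, i, 1))]
		print row
	}'
}
```

With the entry above, `bdf_glyph_bits 28009 < font.bdf` would start with the rows 10011000 and 00011100. Each BDF row is padded to whole bytes, so a 7 pixel wide glyph still comes out as 8 columns.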

Complete Character Display code <a href="https://hairydiode.xyz/cgit/bbrll.git/tree/fontd">here</a>

-------------------------=[Part 3. UTF-8 Shenanigans.]=-------------------------
One annoying thing about UTF-8 is that if you want to get the codepoint of a
particular character in a UTF-8 string, you have to do some iconv trickery where
you first convert it to UTF-32, then convert it to hex.

Another problem is that BDF stores the codepoint as DECIMAL!!!!! You see that
line "STARTCHAR uni6D69"? That's just the name of the character, it could
theoretically be anything. The actual line storing the codepoint is
"ENCODING 28009", so we have to convert from hex to decimal, which is a
surprisingly convoluted procedure in bash.
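One way the two conversions could be chained is sketched below. This is not necessarily how fontd does it: the names hex_to_dec and codepoint_of are made up for the example, it assumes iconv and od are installed, and it assumes the argument is a single UTF-8 character. It leans on printf accepting C-style 0x constants for %d.

```shell
#!/bin/sh
# Sketch: get the decimal codepoint (the BDF ENCODING value) of a character.
# hex_to_dec and codepoint_of are hypothetical names, not taken from fontd.

# hex_to_dec HEX: printf parses a 0x-prefixed argument as a hex constant
hex_to_dec() {
	printf '%d\n' "0x$1"
}

# codepoint_of CHAR: assumes iconv and od exist and CHAR is one UTF-8
# character; UTF-32BE gives the codepoint as four big-endian bytes
codepoint_of() {
	hex=$(printf '%s' "$1" | iconv -f UTF-8 -t UTF-32BE | od -An -tx1 | tr -d ' \n')
	hex_to_dec "$hex"
}
```

For instance, `hex_to_dec 6D69` prints 28009, which matches the ENCODING line of the entry shown earlier.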

All this is done in a wrapper script that reads all the input from stdin and
displays it using all the fonts in a directory given as its argument.

wrapper script code <a href="https://hairydiode.xyz/cgit/bbrll.git/tree/fontd">here</a>

----------------------------=[Part 4. Practical Use]=---------------------------
So remember the janky bash-based IM from last time? I modified it to use the
braille display from before. I also wrote a little script that displays all the
non-ASCII characters in the previously focused tmux pane, so together we can
both display and input UTF-8 characters in the Linux console using tmux.

see the <a href="https://hairydiode.xyz/cgit/bim.git">code</a> and <a href="https://hairydiode.xyz/jankime">writeup</a>


"Screenshots" below:

Bash running in tmux
[usernm@cm│[usernm@cmphostname ~]$ mkdir 帖                          │乔
phostname │[usernm@cmphostname ~]$ cd 帖                             │pdr
~]$ ud    │[usernm@cmphostname 帖]$ vim 天干                         │⢠⠋⣏⡁⡆⡇⠀⠀⠁
⡤⡧⡄⠀⡧⠄⠀⠀  │                                                          │⢹⠔⢅⠇⡇⡇⠀⠀⠀
⡇⡇⡇⡖⠓⡆⠀⠀  │                                                          │⠸⠠⠊⠀⠥⠇⠀⠀⠂
⠁⠏⠁⠧⠤⠇⠀⠀  │                                                          │⣲⡪⢰⣓⣲⠀⠀⠀
⡤⡧⡄⠀⡧⠄⠀⠀  │                                                          │⠒⣱⠘⡖⡞⠀⠀⠀
⡇⡇⡇⡖⠓⡆⠀⠀  │                                                          │⠩⠜⠠⠃⠧⠇⠀⠀
⠁⠏⠁⠧⠤⠇⠀⠀  │                                                          │⢠⠴⠥⠤⡄⠀⠀⠀
⡤⡧⡄⠀⡧⠄⠀⠀  │                                                          │⠸⢭⠭⡭⠇⠀⠀⠀
⡇⡇⡇⡖⠓⡆⠀⠀  │                                                          │⠤⠊⠀⠣⠤⠇⠀⠀
⠁⠏⠁⠧⠤⠇⠀⠀  │                                                          │
⠉⠉⢹⠉⠉⠁⠀⠀  │                                                          │
⠉⠉⡝⡍⠉⠁⠀⠀  │                                                          │
⠤⠊⠀⠈⠢⠄⠀⠀  │                                                          │
⠈⠉⢹⠉⠉⠀⠀⠀  │                                                          │
⠒⠒⢺⠒⠒⠂⠀⠀  │                                                          │
⠀⠀⠸⠀⠀⠀⠀⠀  │                                                          │
[usernm@cm│                                                          │
phostname │                                                          │
~]$       │                                                          │
          │                                                          │
The left pane is displaying all the unicode characters in the primary terminal
(remember, on the Linux console they would all just be squares), and the right
pane is the input method, which displays candidate characters in bash.

Vim running in tmux
⡇⡇⡇⡖⠓⡆⠀⠀  │甲乙丙丁                                                  │之 鐻
⠁⠏⠁⠧⠤⠇⠀⠀  │        最常用                                            │azn
⡤⡧⡄⠀⡧⠄⠀⠀  │~                                                         │⠤⠤⠼⠤⢤⠀⠀⠀
⡇⡇⡇⡖⠓⡆⠀⠀  │~                                                         │⠀⠀⣀⠔⠁⠀⠀⠀
⠁⠏⠁⠧⠤⠇⠀⠀  │~                                                         │⠔⠉⠒⠤⠤⠄⠀⠀
⡤⡧⡄⠀⡧⠄⠀⠀  │~                                                         │⣊⡂⣀⣗⣒⠀⠀⠀
⡇⡇⡇⡖⠓⡆⠀⠀  │~                                                         │⢺⡂⣗⢗⡖⡃⠀⠀
⠁⠏⠁⠧⠤⠇⠀⠀  │~                                                         │⠽⠴⠑⠝⠘⠄⠀⠀
⠉⠉⢹⠉⠉⠁⠀⠀  │~                                                         │
⠉⠉⡝⡍⠉⠁⠀⠀  │~                                                         │
⠤⠊⠀⠈⠢⠄⠀⠀  │~                                                         │
⠈⠉⢹⠉⠉⠀⠀⠀  │~                                                         │
⠒⠒⢺⠒⠒⠂⠀⠀  │~                                                         │
⠀⠀⠸⠀⠀⠀⠀⠀  │~                                                         │
[usernm@cm│~                                                         │
phostname │~                                                         │
~]$ ud    │~                                                         │
⣏⣉⣹⣉⣉⡇⠀⠀  │~                                                         │
⠧⠤⢼⠤⠤⠇⠀⠀  │~                                                         │
⠀⠀⠸⠀⠀⠀⠀⠀  │~                                                         │
⠉⠉⢉⠝⠋⠀⠀⠀  │~                                                         │
⢀⠔⠁⠀⠀⡀⠀⠀  │~                                                         │
⠣⠤⠤⠤⠤⠃⠀⠀  │~                                                         │
⣉⣉⣹⣉⣉⡁⠀⠀  │~                                                         │
⡇⢀⠜⢄⠀⡇⠀⠀  │~                                                         │
⠇⠁⠀⠀⠥⠇⠀⠀  │~                                                         │
⠉⠉⢹⠉⠉⠁⠀⠀  │~                                                         │
⠀⠀⢸⠀⠀⠀⠀⠀  │~                                                         │
⠀⠠⠼⠀⠀⠀⠀⠀  │~                                                         │
⢸⠭⠭⠭⢽⠀⠀⠀  │~                                                         │
⢹⠭⡏⡭⠭⡅⠀⠀  │~                                                         │
⠚⠉⠇⠬⠪⠄⠀⠀  │~                                                         │
⡖⣓⣚⣒⡓⡆⠀⠀  │~                                                         │
⢀⣓⣲⣒⣃⠀⠀⠀  │~                                                         │
⠘⠀⠸⠀⠚⠀⠀⠀  │~                                                         │
⢸⣉⣹⣉⣹⠀⠀⠀  │~                                                         │
⢸⠤⢼⠤⢼⠀⠀⠀  │~                                                         │
⠎⠀⠸⠀⠼⠀⠀⠀  │~                                                         │
[usernm@cm│~                                                         │
phostname │~                                                         │
~]$       │-- INSERT --                            2,11-15       All │