--------------------------------------------------------------------------------
>HairyDiode
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------

So as we all know, the Linux console font is limited to 512 glyphs, and the
console lives in kernel space. So I wrote a workaround that renders Unicode
characters as braille graphics (assuming your Linux console font has the braille
characters), using only userland busybox.

--------------------------=[Part 1. Braille Graphics]=--------------------------

Braille graphics are actually really easy. The braille block goes from U+2800 to
U+28FF, with the lower 8 bits of the codepoint corresponding to the dots in each
braille character in the following order:

    #0 3
    #1 4
    #2 5
    #6 7

with 0 being the lowest bit and 7 being the highest bit. UTF-8 encodes these
codepoints with three bytes

    1110xxxx 10xxxxxx 10xxxxxx

where x represents the bits of the codepoint. Therefore U+2800 converted to
UTF-8 is 0xE2A080 (big endian), or 14852224 in decimal (decimal matters because
the script below works with decimal constants). If you take the pixel buffer,
shift each pixel according to the above chart (adjusted for where the bit lands
in the UTF-8 encoding: bits 0-5 stay in the last byte, bits 6 and 7 become the
two lowest bits of the middle byte), and OR the result with the base value, you
get your desired braille character.

My script does not use bitwise operators or hex literals. It works entirely with
decimal constants and addition, which gives the same result as OR here because
each dot's bit is added at most once. So the code ends up looking like this:

    if [ "${rawbuff[((1+4*$2))]:((1+2*$1)):1}" == "1" ];then
        num=$(($num + 16))
    fi

where $num starts off as 14852224. The raw pixel buffer stores each row as a
string in which '1' represents a filled-in pixel, and $1 and $2 are the x and y
positions of the braille block currently being rendered. The above code takes
the value of the raw pixel buffer at position (1,1) relative to the current
braille block (dot 4 in the chart), shifts it left by 4 (giving 16), and ORs it
into the braille character being rendered.
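
To make the bit layout concrete, here is a minimal sketch that renders a whole 2x4 block in plain POSIX shell. It is not the original script: the function name braille_cell, the four-row argument format, and the octal printf trick for emitting the bytes are all assumptions made for illustration. It follows the approach described above: start from 14852224 and add a decimal value per filled pixel.

```shell
#!/bin/sh
# Hypothetical helper (not from the original script): render one 2x4 block
# of pixels as a braille character. Each argument is one row of the block,
# given as a 2-character string of '0'/'1' (left pixel, right pixel).
# Variables are global, as plain POSIX sh has no 'local'.
braille_cell() {
    num=14852224            # 0xE2A080: the UTF-8 bytes of U+2800 as an integer
    lw="1 2 4 256"          # left column, rows 0..3  (dots 0, 1, 2, 6)
    rw="8 16 32 512"        # right column, rows 0..3 (dots 3, 4, 5, 7)
    for row in "$1" "$2" "$3" "$4"; do
        l=${lw%% *}; lw=${lw#* }    # take the next left-column value
        r=${rw%% *}; rw=${rw#* }    # take the next right-column value
        case $row in 1?) num=$((num + l)) ;; esac
        case $row in ?1) num=$((num + r)) ;; esac
    done
    # Split the integer back into three bytes and emit them via octal escapes.
    b1=$((num / 65536)); b2=$((num / 256 % 256)); b3=$((num % 256))
    printf "$(printf '\\%03o\\%03o\\%03o' "$b1" "$b2" "$b3")"
}

braille_cell 10 01 00 00; echo    # dots 0 and 4 -> U+2811
braille_cell 11 11 11 11; echo    # all dots     -> U+28FF
```

Dots 6 and 7 get the values 256 and 512 rather than 64 and 128 because those two bits sit in the middle byte of the UTF-8 sequence, not the last one.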
I also wrote some code that takes commands to draw into the raw pixel buffer.

code here

----------------=[Part 2. Rendering BDF fonts with only busybox]=---------------

BDF is a human-readable bitmap font format where each character entry looks
like:

    STARTCHAR uni6D69
    ENCODING 28009
    SWIDTH 1000 0
    DWIDTH 8 0
    BBX 7 7 0 -1
    BITMAP
    98
    1C
    A8
    3E
    80
    9C
    9C
    ENDCHAR

The first line is the character name (here built from the Unicode codepoint),
followed by the ENCODING line and some metrics I don't care about, and then the
bitmap data of the character, where each row of pixels is stored as a line of
hex. If we convert the hex to binary, we get the "raw pixel format" from before.
So all we really need to do is write a small awk script that finds the relevant
bitmap lines, converts them to binary, and displays them with the previous
braille display script.

Complete Character Display code here

-------------------------=[Part 3. UTF-8 Shenanigans.]=-------------------------

One annoying thing about UTF-8 is that if you want to get the codepoint of a
particular character in a UTF-8 string, you have to do some iconv trickery where
you first convert it to UTF-32, then convert that to hex. Another problem is
that BDF stores the codepoint as DECIMAL!!!!! You see that line
"STARTCHAR uni6D69"? That's just the name of the character; it could
theoretically be anything. The actual line storing the codepoint is
"ENCODING 28009" (0x6D69 is 28009 in decimal). So we have to convert from hex to
decimal, which ended up being a surprisingly convoluted procedure in my script.

All this is done in a wrapper script that reads its input from stdin and
displays it using all the fonts in a directory given as its argument.

wrapper script code here

----------------------------=[Part 4. Practical Use]=---------------------------

So remember the janky bash-based IM from last time? I modified it to use the
braille display from before.
I also wrote a little script that displays all the non-ASCII characters in the
previously focused tmux pane, so together we can both display and input UTF-8
characters in the Linux console using tmux.

See the code and writeup.

"Screenshots" below:

Bash running in tmux

[usernm@cm│[usernm@cmphostname ~]$ mkdir 帖  │乔
phostname │[usernm@cmphostname ~]$ cd 帖     │pdr
~]$ ud    │[usernm@cmphostname 帖]$ vim 天干 │⢠⠋⣏⡁⡆⡇⠀⠀⠁
⡤⡧⡄⠀⡧⠄⠀⠀  │                                  │⢹⠔⢅⠇⡇⡇⠀⠀⠀
⡇⡇⡇⡖⠓⡆⠀⠀  │                                  │⠸⠠⠊⠀⠥⠇⠀⠀⠂
⠁⠏⠁⠧⠤⠇⠀⠀  │                                  │⣲⡪⢰⣓⣲⠀⠀⠀
⡤⡧⡄⠀⡧⠄⠀⠀  │                                  │⠒⣱⠘⡖⡞⠀⠀⠀
⡇⡇⡇⡖⠓⡆⠀⠀  │                                  │⠩⠜⠠⠃⠧⠇⠀⠀
⠁⠏⠁⠧⠤⠇⠀⠀  │                                  │⢠⠴⠥⠤⡄⠀⠀⠀
⡤⡧⡄⠀⡧⠄⠀⠀  │                                  │⠸⢭⠭⡭⠇⠀⠀⠀
⡇⡇⡇⡖⠓⡆⠀⠀  │                                  │⠤⠊⠀⠣⠤⠇⠀⠀
⠁⠏⠁⠧⠤⠇⠀⠀  │                                  │
⠉⠉⢹⠉⠉⠁⠀⠀  │                                  │
⠉⠉⡝⡍⠉⠁⠀⠀  │                                  │
⠤⠊⠀⠈⠢⠄⠀⠀  │                                  │
⠈⠉⢹⠉⠉⠀⠀⠀  │                                  │
⠒⠒⢺⠒⠒⠂⠀⠀  │                                  │
⠀⠀⠸⠀⠀⠀⠀⠀  │                                  │
[usernm@cm│                                  │
phostname │                                  │
~]$       │                                  │
          │                                  │

The left pane is displaying all the Unicode characters in the primary terminal
(remember, on the Linux console they would all just be squares), and the right
pane is the input method, which displays candidate characters in bash.

Vim running in tmux

⡇⡇⡇⡖⠓⡆⠀⠀  │甲乙丙丁                          │之 鐻
⠁⠏⠁⠧⠤⠇⠀⠀  │ 最常用                           │azn
⡤⡧⡄⠀⡧⠄⠀⠀  │~                                 │⠤⠤⠼⠤⢤⠀⠀⠀
⡇⡇⡇⡖⠓⡆⠀⠀  │~                                 │⠀⠀⣀⠔⠁⠀⠀⠀
⠁⠏⠁⠧⠤⠇⠀⠀  │~                                 │⠔⠉⠒⠤⠤⠄⠀⠀
⡤⡧⡄⠀⡧⠄⠀⠀  │~                                 │⣊⡂⣀⣗⣒⠀⠀⠀
⡇⡇⡇⡖⠓⡆⠀⠀  │~                                 │⢺⡂⣗⢗⡖⡃⠀⠀
⠁⠏⠁⠧⠤⠇⠀⠀  │~                                 │⠽⠴⠑⠝⠘⠄⠀⠀
⠉⠉⢹⠉⠉⠁⠀⠀  │~                                 │
⠉⠉⡝⡍⠉⠁⠀⠀  │~                                 │
⠤⠊⠀⠈⠢⠄⠀⠀  │~                                 │
⠈⠉⢹⠉⠉⠀⠀⠀  │~                                 │
⠒⠒⢺⠒⠒⠂⠀⠀  │~                                 │
⠀⠀⠸⠀⠀⠀⠀⠀  │~                                 │
[usernm@cm│~                                 │
phostname │~                                 │
~]$ ud    │~                                 │
⣏⣉⣹⣉⣉⡇⠀⠀  │~                                 │
⠧⠤⢼⠤⠤⠇⠀⠀  │~                                 │
⠀⠀⠸⠀⠀⠀⠀⠀  │~                                 │
⠉⠉⢉⠝⠋⠀⠀⠀  │~                                 │
⢀⠔⠁⠀⠀⡀⠀⠀  │~                                 │
⠣⠤⠤⠤⠤⠃⠀⠀  │~                                 │
⣉⣉⣹⣉⣉⡁⠀⠀  │~                                 │
⡇⢀⠜⢄⠀⡇⠀⠀  │~                                 │
⠇⠁⠀⠀⠥⠇⠀⠀  │~                                 │
⠉⠉⢹⠉⠉⠁⠀⠀  │~                                 │
⠀⠀⢸⠀⠀⠀⠀⠀  │~                                 │
⠀⠠⠼⠀⠀⠀⠀⠀  │~                                 │
⢸⠭⠭⠭⢽⠀⠀⠀  │~                                 │
⢹⠭⡏⡭⠭⡅⠀⠀  │~                                 │
⠚⠉⠇⠬⠪⠄⠀⠀  │~                                 │
⡖⣓⣚⣒⡓⡆⠀⠀  │~                                 │
⢀⣓⣲⣒⣃⠀⠀⠀  │~                                 │
⠘⠀⠸⠀⠚⠀⠀⠀  │~                                 │
⢸⣉⣹⣉⣹⠀⠀⠀  │~                                 │
⢸⠤⢼⠤⢼⠀⠀⠀  │~                                 │
⠎⠀⠸⠀⠼⠀⠀⠀  │~                                 │
[usernm@cm│~                                 │
phostname │~                                 │
~]$       │-- INSERT -- 2,11-15 All          │
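
For anyone who wants to experiment with the Part 2 and Part 3 steps without the original scripts, here is a rough sketch of both in POSIX shell and awk. The function names char_to_codepoint and bdf_glyph_rows are made up for this example, and it assumes iconv, od, tr, and awk are available (iconv in particular may not be part of a given busybox build). It is not the author's code, just one way to get from a character to its '0'/'1' pixel rows.

```shell
#!/bin/sh
# Hypothetical helpers, not the original scripts.

# Print the decimal codepoint of a single UTF-8 character.
# Assumes iconv and od are installed and $1 holds exactly one character.
char_to_codepoint() {
    hex=$(printf '%s' "$1" | iconv -f UTF-8 -t UTF-32BE | od -An -tx1 | tr -d ' \n')
    echo $((0x$hex))    # shell arithmetic accepts 0x-prefixed hex constants
}

# Print the bitmap rows of one glyph as strings of '0'/'1'.
# Usage: bdf_glyph_rows FONT.bdf DECIMAL_CODEPOINT
bdf_glyph_rows() {
    awk -v want="$2" '
        BEGIN {
            # Map each hex digit to its 4-bit binary string.
            split("0000 0001 0010 0011 0100 0101 0110 0111 " \
                  "1000 1001 1010 1011 1100 1101 1110 1111", bits, " ")
            for (i = 0; i < 16; i++)
                nib[substr("0123456789ABCDEF", i + 1, 1)] = bits[i + 1]
        }
        $1 == "ENCODING" { hit = ($2 == want) }     # decimal codepoint match
        $1 == "ENDCHAR"  { if (hit) exit; inmap = 0; next }
        $1 == "BITMAP"   { inmap = 1; next }
        hit && inmap {
            line = toupper($1); out = ""
            for (i = 1; i <= length(line); i++)
                out = out nib[substr(line, i, 1)]
            print out
        }
    ' "$1"
}
```

With a font file at hand, something like `bdf_glyph_rows font.bdf "$(char_to_codepoint 浩)"` would print the rows for the uni6D69 entry shown in Part 2, starting with 10011000 for the hex line 98. BDF pads each row to a whole number of bytes, so the rows come out 8 pixels wide even though the BBX width is 7.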