From 7a701578b0dbd700dc0330eafbd3a8faf3990beb Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Haoran=20S=2E=20Diao=20=28=E5=88=81=E6=B5=A9=E7=84=B6=29?= <0@hairydiode.xyz>
Date: Tue, 19 Aug 2025 14:08:54 -0700
Subject: New write-up on xkb abuse

---
 cont/xkbabuse.html | 107 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 107 insertions(+)
 create mode 100644 cont/xkbabuse.html

diff --git a/cont/xkbabuse.html b/cont/xkbabuse.html
new file mode 100644
index 0000000..4e5fc07
--- /dev/null
+++ b/cont/xkbabuse.html
@@ -0,0 +1,107 @@
+[TITLE] [DATE]
+--------------------------------------------------------------------------------
+[SETTITLE]Abusing X11's xkb for fun and profit
+[SETDATE]08-19-2025
+
+Yesterday I was playing around with xkb keyboard layouts when I discovered
+that compose functionality for European keyboard layouts (and others for
+languages that use accents) is implemented as a 5227-line table that maps
+sequences of key symbols to Unicode strings.
+
+ex: /usr/share/X11/locale/en_US.UTF-8/Compose on my computer
+# UTF-8 (Unicode) Compose sequences
+#
+# Spacing versions of accents (mostly)
+ : "~" asciitilde # TILDE
+ : "~" asciitilde # TILDE
+ : "~" asciitilde # TILDE
+ : "~" asciitilde # TILDE
+ : "'" apostrophe # APOSTROPHE
+...
+ : "ô" ocircumflex # LATIN SMALL LETTER O WITH CIRCUMFLEX
+...
+ : "⍠" U2360 # : ⎕ APL FUNCTIONAL SYMBOL QUAD COLON
+
+For those of you who are unaware, if you use, for example, a German keyboard
+layout, pressing "^" followed by "o" will produce "ô". The circumflex "^" key is
+called a "dead key" in xkb terminology because it does not produce any
+characters by itself. In addition, if you were to bind any key on your keyboard
+to "compose", it would allow you to type a wide range of Unicode
+characters via various sequences of key presses.
+
+This got me thinking: this functionality is identical to how ibus-table IMs work
+and would allow me to implement Chinese IMs in a way that requires no extra
+software and which would presumably be compatible with a far greater range of
+software since the functionality is built into X11. The fact that the default
+file is 5000 lines long tells me that X11 is more than capable of handling long
+tables.
+
+My first step was to take the ibus boshiamy implementation I already have on my
+computer and mutilate it into the above format using convoluted regex commands
+and a lot of whack-a-mole to turn 46000 lines of:
+
+aaa 100 鑫
+aaa 99 龘
+aaa 98 鑆
+
+ into
+
+<a> <a> <a> : "鑫"
+<a> <a> <a> <1> : "龘"
+<a> <a> <a> <2> : "鑆"
+
+
+To my surprise, after moving this file to ~/.XCompose it worked exactly as I
+expected with no lag. The only issue, then, is that there's no way to switch
+between compose sets in xkb. This explains why the en_US.UTF-8 compose set was
+so long: it had to essentially handle every possible dead-key or compose
+sequence for every keyboard layout.
+
+There's an easy solution to this though, which is to create a custom keyboard
+layout where the keys are mapped to custom key symbols (xkb's layer of
+abstraction above a physical keycode and below a text string) and have my
+compose table use those as the inputs instead of QWERTY keys.
+
+Since I started this whole thing by messing with xkb layouts, it didn't take
+long for me to edit the us layout into something like this:
+
+    key <AD01> {[ U9AD8, Q ]}; # 高
+    key <AD02> {[ U4E94, W ]}; # 五
+    key <AD03> {[ U4E00, E ]}; # 一
+    key <AD04> {[ U4E8C, R ]}; # 二
+    key <AD05> {[ U901A, T ]}; # 通
+    key <AD06> {[ U76CA, Y ]}; # 益
+    key <AD07> {[ U4EE5, U ]}; # 以
+    key <AD08> {[ U5F8C, I ]}; # 後
+    key <AD09> {[ U3007, O ]}; # 〇
+    key <AD10> {[ U5099, P ]}; # 備
+
+And my compose table to look something like this:
+
+ : "鑆"
+ : "鑫"
+ : "龘"
+
+Now if I set my keyboard layout to "boshiamy", it will be sending these custom
+key symbols, which will be interpreted by my custom compose rules, and if I
+switch it back to the us layout, the compose rules don't apply.
+
+The only issue with this method is that the functionality for user-specific
+keyboard layouts is incredibly broken in xkb, so I had to add my
+custom layout to the system xkb data directory. Otherwise this whole
+implementation would consist entirely of two config files in the home directory.
+
+Also, if you have ibus installed, make sure to check "use system keyboard layout"
+in settings or else it'll keep switching your keyboard layout around.
+
+Also, for those unfamiliar with component-based input methods, unlike phonetic
+input methods like Pinyin (derogatory) or Zhuyin (derogatory), the mapping from
+key presses to characters has very few if any conflicts, and therefore the
+system works in an open-loop way. You can easily use something like CangJie or
+Boshiamy without the preview window or without any sort of predictive text.
+
+
+The files I've created and further reading are in a git repo here
+
-- 
cgit v1.1
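
As a rough illustration of the table conversion described in the write-up
(rows like "aaa 100 鑫" becoming Compose entries), here is a minimal Python
sketch. It is not the regex pipeline the author actually used. The function
name ibus_to_compose is hypothetical, and it assumes every row is
"code frequency character" separated by whitespace, that codes contain only
ASCII letters (whose keysym names are the letters themselves), and that
candidates sharing a code are ranked by descending frequency, with the top one
getting no suffix and the rest getting <1>, <2>, and so on up to <9>.

```python
from collections import defaultdict


def ibus_to_compose(lines):
    """Turn 'code freq char' rows into Compose-style entries (sketch only)."""
    by_code = defaultdict(list)
    for line in lines:
        parts = line.split()
        # Skip headers, comments, and anything that is not a 3-field row.
        if len(parts) != 3 or not parts[1].isdigit():
            continue
        code, freq, char = parts
        by_code[code].append((int(freq), char))

    entries = []
    for code, candidates in by_code.items():
        # Most frequent candidate first; the sort is stable for ties.
        candidates.sort(key=lambda fc: -fc[0])
        # Assumption: each letter of the code is its own keysym name.
        keys = " ".join(f"<{c}>" for c in code)
        # Keep at most ten candidates so the suffix stays a single digit key.
        for rank, (_, char) in enumerate(candidates[:10]):
            suffix = f" <{rank}>" if rank else ""
            entries.append(f'{keys}{suffix} : "{char}"')
    return entries
```

Feeding it the three sample rows from the write-up yields the three Compose
lines shown there. Codes that use punctuation keys would need a lookup table
from characters to keysym names (comma, period, and so on), which this sketch
leaves out.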