<!-- 123456789-223456789-323456789-423456789-523456789-623456789-723456789-8234567890 一二三四-->[TITLE] [DATE]
--------------------------------------------------------------------------------
[SETTITLE]Abusing X11's xkb for fun and profit
[SETDATE]08-19-2025

Yesterday I was playing around with xkb keyboard layouts when I discovered that compose functionality for European keyboard layouts (and those of other languages that use accents) is implemented as a 5227-line table that maps sequences of key symbols to Unicode strings.

ex: /usr/share/X11/locale/en_US.UTF-8/Compose on my computer

    # UTF-8 (Unicode) Compose sequences
    #
    # Spacing versions of accents (mostly)
    <dead_tilde> <space> : "~" asciitilde # TILDE
    <dead_tilde> <dead_tilde> : "~" asciitilde # TILDE
    <Multi_key> <minus> <space> : "~" asciitilde # TILDE
    <Multi_key> <space> <minus> : "~" asciitilde # TILDE
    <dead_acute> <space> : "'" apostrophe # APOSTROPHE
    ...
    <dead_circumflex> <o> : "ô" ocircumflex # LATIN SMALL LETTER O WITH CIRCUMFLEX
    ...
    <Multi_key> <colon> <U2395> : "⍠" U2360 # : ⎕ APL FUNCTIONAL SYMBOL QUAD COLON

For those of you who are unaware: if you use, for example, a German keyboard layout, pressing "^" followed by "o" will produce "ô". The circumflex "^" key is called a "dead key" in xkb terminology because it does not produce any character by itself. In addition, if you bind any key on your keyboard to "compose" (Multi_key in the table above), it lets you type a wide range of Unicode characters via various sequences of key presses.

This got me thinking: this works in essentially the same way as ibus-table IMs, so it would allow me to implement Chinese IMs in a way that requires no extra software and would presumably be compatible with a far greater range of software, since the functionality is built into X11. The fact that the default file is over 5000 lines long tells me that X11 is more than capable of handling long tables.
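
To make the "table of key symbol sequences" idea concrete, here is a minimal Python sketch that reads rules in the format shown above into a lookup table. It only illustrates the data model and is not how libX11 handles Compose files; the names `parse_compose`, `RULE` and `KEYSYM` are made up for this example, and the parser deliberately ignores escapes, include lines and modifiers.

```python
import re

# One Compose rule: a run of <keysym> tokens, a colon, then a quoted result.
# Simplified on purpose: no escaped quotes, no "include" lines, no modifiers.
RULE = re.compile(r'^((?:<\w+>\s*)+):\s*"([^"]*)"')
KEYSYM = re.compile(r'<(\w+)>')


def parse_compose(text):
    """Return a dict mapping tuples of keysym names to the string they produce."""
    table = {}
    for line in text.splitlines():
        match = RULE.match(line.strip())
        if match:  # comment lines and blank lines simply do not match
            sequence = tuple(KEYSYM.findall(match.group(1)))
            table[sequence] = match.group(2)
    return table


SAMPLE = '''
# Spacing versions of accents (mostly)
<dead_tilde> <space> : "~" asciitilde # TILDE
<Multi_key> <minus> <space> : "~" asciitilde # TILDE
<dead_circumflex> <o> : "ô" ocircumflex # LATIN SMALL LETTER O WITH CIRCUMFLEX
'''

table = parse_compose(SAMPLE)
print(table[("dead_circumflex", "o")])
```

Looking up the sequence dead_circumflex, o in the resulting table gives back the composed character. A real implementation also has to track partial sequences as keys arrive, which this sketch leaves out.
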

My first step was to take the ibus boshiamy implementation I already have on my computer and mutilate it into the above format, using convoluted regex commands and a lot of whack-a-mole, to turn 46000 lines of:

    aaa 100 鑫
    aaa 99 龘
    aaa 98 鑆

into

    <a> <a> <a> <space> : "鑫"
    <a> <a> <a> <1> : "龘"
    <a> <a> <a> <2> : "鑆"

To my surprise, after moving this file to ~/.XCompose it worked exactly as I expected, with no lag.

The only issue, then, is that there's no way to switch between compose sets in xkb. This explains why the en_US.UTF-8 compose set is so long: it essentially has to handle every possible dead-key or compose sequence for every keyboard layout. There's an easy solution to this though, which is to create a custom keyboard layout where the keys are mapped to custom key symbols (xkb's layer of abstraction above a physical keycode and below a text string) and have my compose table use those as the inputs instead of the qwerty keys. Since I started this whole thing by messing with xkb layouts, it didn't take long for me to edit the us layout into something like this:

    key <AD01> {[ U9AD8, Q ]}; # 高
    key <AD02> {[ U4E94, W ]}; # 五
    key <AD03> {[ U4E00, E ]}; # 一
    key <AD04> {[ U4E8C, R ]}; # 二
    key <AD05> {[ U901A, T ]}; # 通
    key <AD06> {[ U76CA, Y ]}; # 益
    key <AD07> {[ U4EE5, U ]}; # 以
    key <AD08> {[ U5F8C, I ]}; # 後
    key <AD09> {[ U3007, O ]}; # 〇
    key <AD10> {[ U5099, P ]}; # 備

And my compose table to look something like this:

    <U5C0D> <U5C0D> <U5C0D> <U4E8C> <space> : "鑆"
    <U5C0D> <U5C0D> <U5C0D> <space> : "鑫"
    <U5C0D> <U5C0D> <U5C0D> <U8981> <space> : "龘"

Now if I set my keyboard layout to "boshiamy", it sends these custom key symbols, which are interpreted by my custom compose rules, and if I switch back to the us layout the compose rules don't apply.

The only issue with this method is that support for user-specific keyboard layouts is incredibly broken in xkb, so I had to add my custom layout to the system xkb data directory.
Otherwise this whole implementation would consist entirely of two config files in the home directory. Also, if you have ibus installed, make sure to check "use system keyboard layout" in its settings, or else it'll keep switching your keyboard layout around.

For those unfamiliar with component-based input methods: unlike phonetic input methods like Pinyin (derogatory) or Zhuyin (derogatory), the mapping from key presses to characters has very few conflicts, if any, and therefore the system works in an open-loop way. You can easily use something like CangJie or Boshiamy without the preview window and without any sort of predictive text.

The files I've created and further reading are in a git repo here
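
The table conversion described earlier (ibus table lines into Compose rules) could also be scripted instead of done with regexes. Below is a minimal Python sketch of that idea, not the commands actually used for the post. It assumes every input line has the shape `code weight character` as in the excerpt above, that the highest-weight candidate is selected with space and the following ones with 1, 2, and so on, and that code letters map directly to keysym names. The functions `keysym_for` and `convert` and the `keymap` parameter are hypothetical names for this illustration.

```python
from collections import defaultdict


def keysym_for(ch):
    """Name a character as an X11 Unicode keysym, e.g. 高 -> U9AD8."""
    return f"U{ord(ch):04X}"


def convert(lines, keymap=None):
    """Turn 'code weight char' lines into Compose rules.

    keymap (hypothetical) optionally renames a code letter to a custom
    keysym; letters missing from it are used as keysym names unchanged.
    """
    keymap = keymap or {}
    by_code = defaultdict(list)
    for line in lines:
        parts = line.split()
        # Skip anything that is not in the assumed three-field shape.
        if len(parts) != 3 or not parts[1].isdigit():
            continue
        code, weight, char = parts
        by_code[code].append((int(weight), char))

    rules = []
    for code, candidates in by_code.items():
        keys = " ".join(f"<{keymap.get(c, c)}>" for c in code)
        # Highest weight first; sorted() is stable, so ties keep file order.
        ranked = sorted(candidates, key=lambda pair: -pair[0])
        # First candidate ends in <space>, the next nine in <1> .. <9>.
        for rank, (_, char) in enumerate(ranked[:10]):
            select = "space" if rank == 0 else str(rank)
            # Note: a '"' or '\' in char would need escaping; not handled.
            rules.append(f'{keys} <{select}> : "{char}"')
    return rules


table = ["aaa 100 鑫", "aaa 99 龘", "aaa 98 鑆"]
for rule in convert(table):
    print(rule)
```

Passing a mapping such as `{"a": keysym_for("對")}` as `keymap` produces rules keyed on custom keysyms instead of qwerty letters. That the a key carries 對 (U5C0D) is only inferred from the two tables in the post, so treat it as an assumption. Punctuation keys used in codes would need their keysym names (comma, period, ...) added to the mapping as well.
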