author Haoran S. Diao (刁浩然) <0@hairydiode.xyz> 2025-08-19 14:10:12 -0700
committer Haoran S. Diao (刁浩然) <0@hairydiode.xyz> 2025-08-19 14:10:12 -0700
commit df5df87632439a47d28214d3b155535259eec2ec
tree 8117d97f00cfd7f84ba391e0499f7b5c201e621e
parent 7a701578b0dbd700dc0330eafbd3a8faf3990beb
xbabuse.html
-rw-r--r-- index.html 1
-rw-r--r-- xkbabuse.html 132
2 files changed, 133 insertions, 0 deletions
diff --git a/index.html b/index.html
index 5cb5df9..6a07b63 100644
--- a/index.html
+++ b/index.html
@@ -43,6 +43,7 @@ Where's all the other stuff you host from this domain?
<a href="https://social.hairydiode.xyz">My Mastodon Instance</a>
Where's all the content?
Scroll Down
+<a href="https://hairydiode.xyz/xkbabuse">[Abusing X11's xkb for fun and profit] 08-19-2025</a>
<a href="https://hairydiode.xyz/fourcorners">[Four Corners Input Method for ibus-table] 11-25-2023</a>
<a href="https://hairydiode.xyz/unihome">[We Have Unicode at Home] 6-30-2023</a>
<a href="https://hairydiode.xyz/jankime">[Janky IME] 6-29-2023</a>
diff --git a/xkbabuse.html b/xkbabuse.html
new file mode 100644
index 0000000..ae8ff5c
--- /dev/null
+++ b/xkbabuse.html
@@ -0,0 +1,132 @@
+<!DOCTYPE html>
+<head>
+<title>Abusing X11's xkb for fun and profit</title>
+<meta charset="utf-8"/>
+<link rel="stylesheet" href="https://hairydiode.xyz/style.css"/>
+<link rel="icon" type="image/png" href="https://hairydiode.xyz/img/fav/logo.png"/>
+</head>
+<body>
+<div class="content">
+<pre>
+<!--
+123456789-223456789-323456789-423456789-523456789-623456789-723456789-8234567890
+一二三四
+-->--------------------------------------------------------------------------------
+
+<a href="https://hairydiode.xyz">>HairyDiode</a>
+
+--------------------------------------------------------------------------------
+<!--
+123456789-223456789-323456789-423456789-523456789-623456789-723456789-8234567890
+一二三四-->Abusing X11's xkb for fun and profit 08-19-2025
+--------------------------------------------------------------------------------
+
+Yesterday I was playing around with xkb keyboard layouts when I discovered
+that compose functionality for European keyboard layouts (and those of other
+languages that use accents) is implemented as a 5227-line table that maps
+sequences of key symbols to Unicode strings.
+
+ex: /usr/share/X11/locale/en_US.UTF-8/Compose on my computer
+# UTF-8 (Unicode) Compose sequences
+#
+# Spacing versions of accents (mostly)
+<dead_tilde> <space> : "~" asciitilde # TILDE
+<dead_tilde> <dead_tilde> : "~" asciitilde # TILDE
+<Multi_key> <minus> <space> : "~" asciitilde # TILDE
+<Multi_key> <space> <minus> : "~" asciitilde # TILDE
+<dead_acute> <space> : "'" apostrophe # APOSTROPHE
+...
+<dead_circumflex> <o> : "ô" ocircumflex # LATIN SMALL LETTER O WITH CIRCUMFLEX
+...
+<Multi_key> <colon> <U2395> : "⍠" U2360 # : ⎕ APL FUNCTIONAL SYMBOL QUAD COLON
+
+For those of you who are unaware, if you use, for example, a German keyboard
+layout, pressing "^" followed by "o" will produce "ô". The circumflex "^" key is
+called a "dead key" in xkb terminology because it does not produce any
+characters by itself. In addition, if you were to bind any key on your keyboard
+to "compose", it would allow you to type a wide range of Unicode characters via
+various sequences of key presses.
+
+This got me thinking: this functionality is identical to how ibus-table IMs work
+and would allow me to implement Chinese IMs in a way that requires no extra
+software and which would presumably be compatible with a far greater range of
+software, since the functionality is built into X11. The fact that the default
+file is over 5000 lines long tells me that X11 is more than capable of handling
+long tables.
+
+My first step was to take the <a href="https://github.com/jdh8/ibus-boshiamy">ibus boshiamy implementation</a> I already have on my
+computer and mutilate it into the above format using convoluted regex commands
+and a lot of whack-a-mole to turn 46000 lines of:
+
+aaa 100 鑫
+aaa 99 龘
+aaa 98 鑆
+
+ into
+
+<a> <a> <a> <space> : "鑫"
+<a> <a> <a> <1> : "龘"
+<a> <a> <a> <2> : "鑆"
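
The conversion above can be sketched in Python. This is a rough illustration, not the regex commands actually used. It assumes each table line is "code weight character" separated by whitespace, as in the sample, and that candidates sharing a code are ranked by weight with the best one selected by space and the rest by 1, 2, and so on. The function name, the punctuation map, and the ten-candidate limit are hypothetical.

```python
from collections import defaultdict

# Keysym names for a few punctuation keys. Letters and digits already
# serve as their own keysym names in a Compose file.
PUNCT_KEYSYMS = {",": "comma", ".": "period", "'": "apostrophe",
                 "[": "bracketleft", "]": "bracketright"}

# Assumed selection keys: best candidate on space, then 1 through 9.
SELECT_KEYS = ["space"] + [str(d) for d in range(1, 10)]


def table_to_compose(lines):
    """Turn 'code weight char' lines into Compose rules."""
    candidates = defaultdict(list)
    for line in lines:
        parts = line.split()
        # Skip headers, blank lines, and anything not shaped like an entry.
        if len(parts) != 3 or not parts[1].isdigit():
            continue
        code, weight, char = parts
        candidates[code].append((int(weight), char))

    rules = []
    for code, cands in candidates.items():
        # Highest weight first. Candidates beyond the available
        # selection keys are silently dropped by zip().
        cands.sort(key=lambda c: -c[0])
        for select, (_, char) in zip(SELECT_KEYS, cands):
            keys = [PUNCT_KEYSYMS.get(k, k) for k in code] + [select]
            lhs = " ".join(f"<{k}>" for k in keys)
            rules.append(f'{lhs} : "{char}"')
    return rules


if __name__ == "__main__":
    sample = ["aaa 100 鑫", "aaa 99 龘", "aaa 98 鑆"]
    print("\n".join(table_to_compose(sample)))
```

Ending every sequence with a selection key keeps any rule from being a prefix of another, so a short code and a longer code that starts the same way can coexist in one table.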
+
+
+To my surprise, after moving this file to ~/.XCompose it worked exactly as I
+expected with no lag. The only issue, then, is that there's no way to switch
+between compose sets in xkb. This explains why the en_US.UTF-8 compose set is
+so long: it has to handle essentially every possible dead-key or compose
+sequence for every keyboard layout.
+
+There's an easy solution to this though, which is to create a custom keyboard
+layout where the keys are mapped to custom key symbols (xkb's layer of
+abstraction above a physical keycode and below a text string) and have my
+compose table use those as the inputs instead of qwerty keys.
+
+Since I started this whole thing by messing with xkb layouts, it didn't take
+long for me to edit the us layout into something like this:
+
+ key <AD01> {[ U9AD8, Q ]}; # 高
+ key <AD02> {[ U4E94, W ]}; # 五
+ key <AD03> {[ U4E00, E ]}; # 一
+ key <AD04> {[ U4E8C, R ]}; # 二
+ key <AD05> {[ U901A, T ]}; # 通
+ key <AD06> {[ U76CA, Y ]}; # 益
+ key <AD07> {[ U4EE5, U ]}; # 以
+ key <AD08> {[ U5F8C, I ]}; # 後
+ key <AD09> {[ U3007, O ]}; # 〇
+ key <AD10> {[ U5099, P ]}; # 備
+
+And my compose table to look something like this:
+
+<U5C0D> <U5C0D> <U5C0D> <U4E8C> <space> : "鑆"
+<U5C0D> <U5C0D> <U5C0D> <space> : "鑫"
+<U5C0D> <U5C0D> <U5C0D> <U8981> <space> : "龘"
+
+Now if I set my keyboard layout to "boshiamy", it will be sending these custom
+key symbols which will be interpreted by my custom compose rules, and if I
+switch it back to the us layout the compose rules don't apply.
+
+The only issue with this method is that the functionality for user-specific
+keyboard layouts is incredibly broken in xkb, so I had to add my custom layout
+to the system xkb data directory. Otherwise this whole implementation would
+consist entirely of two config files in the home directory.
+
+Also, if you have ibus installed, make sure to check "use system keyboard layout"
+in settings or else it'll keep switching your keyboard layout around.
+
+Also, for those unfamiliar with component-based input methods: unlike phonetic
+input methods like Pinyin (derogatory) or Zhuyin (derogatory), the mapping from
+key presses to characters has very few if any conflicts, and therefore the
+system works in an open-loop way. You can easily use something like CangJie or
+Boshiamy without the preview window or without any sort of predictive text.
+
+
+The files I've created and further reading are in a git repo <a href="https://hairydiode.xyz/cgit/xkb-boshiamy">here</a>
+
+</pre>
+</div>
+<br>
+<br>
+</body>
+<!--
+if you're digging in the src you might be interested in how this site works
+here: https://hairydiode.xyz/meta2
+-->