Arno’s Engram keyboard layout

Engram v2.0 (“Engram”) is a key layout optimized for comfortable and efficient touch typing in English, created by Arno Klein, with open source code to create other optimized key layouts. You can install the Engram v2.0 layout on Windows, macOS, and Linux or try it out online. An article is under review; see the preprint for a description of an earlier, superseded version (v1.3).

[See here for the Spanish version, engram-es]

Letters are optimally arranged according to ergonomics factors that promote reduction of lateral finger movements and more efficient typing of high-frequency letter pairs. The most common punctuation marks are logically grouped together in the middle columns and numbers are paired with mathematical and logic symbols (shown as pairs of default and Shift-key-accessed characters).

See below for a full description and comparisons with other key layouts.

Standard diagonal keyboard (default and Shift-key layers)


“Ergonomic” orthonormal keyboard (default and Shift-key layers)


     [{ 1| 2= 3~ 4+  5<  6>  7^ 8& 9% 0* ]} /\
        bB yY oO uU  '(  ")  lL dD wW vV zZ #$ @`
        cC iI eE aA  ,;  .:  hH tT sS nN qQ 
        gG xX jJ kK  -_  ?!  rR mM fF pP

Letter frequencies (Norvig, 2012), showing that the Engram layout emphasizes keys in the home row:

      B   Y   O   U           L   D   W   V    Z
      C   I   E   A           H   T   S   N    Q
      G   X   J   K           R   M   F   P

     53  59 272  97          145 136  60  38   3
    119 270 445 287          180 331 232 258   4
     67   8   6  19          224  90  86  76
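As a rough check of the home-row emphasis, the counts in the table above can be summed per row. This is only a sketch: the `rows` dictionary is typed in by hand from the table, and Z and Q (which sit outside the 24-key block) are left out.

```python
# Letter counts copied from the table above, row by row (Z and Q omitted).
rows = {
    "top":    [53, 59, 272, 97, 145, 136, 60, 38],
    "home":   [119, 270, 445, 287, 180, 331, 232, 258],
    "bottom": [67, 8, 6, 19, 224, 90, 86, 76],
}

total = sum(sum(counts) for counts in rows.values())
shares = {row: sum(counts) / total for row, counts in rows.items()}

for row, share in shares.items():
    print(f"{row:>6}: {share:.1%}")
```

With these figures the home row carries roughly 60% of letter presses, the top row about 24%, and the bottom row about 16%.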

Why a new key layout?
How does Engram compare with other key layouts?
Guiding criteria
Summary of steps and results

Why a new key layout?

Personal history
In the future, I hope to include an engaging rationale for why I took on this challenge. Suffice it to say I love solving problems, and I have battled repetitive strain injury ever since I worked on an old DEC workstation at the MIT Media Lab while composing my thesis back in the 1990s. I have experimented with a wide variety of human interface technologies over the years: voice dictation, one-handed keyboard, keyless keyboard, foot mouse, and ergonomic keyboards like the Kinesis Advantage and Ergodox keyboards with different key switches. While these technologies can significantly improve comfort and reduce strain, an optimized key layout can further help when typing on either ergonomic or standard keyboards.

I have used different key layouts (Qwerty, Dvorak, Colemak, etc.) for communications and for writing and programming projects, and have primarily relied on Colemak for the last 10 years. I find that most, if not all, of these key layouts:

- Demand too much strain on tendons
    - strenuous lateral extension of the index and little fingers
- Ignore the ergonomics of the human hand
    - different finger strengths
    - different finger lengths
    - natural roundedness of the hand
    - home row easier than upper row for shorter fingers
    - home row easier than lower row for longer fingers
    - ease of little-to-index finger rolls vs. reverse
- Over-emphasize alternation between hands and under-emphasize same-hand, different-finger transitions
    - same-row, adjacent finger transitions are easy and comfortable
    - little-to-index finger rolls are easy and comfortable

While I used ergonomic principles outlined below and the accompanying code to help generate the Engram layout, I also relied on massive bigram frequency data for the English language. If one were to follow the procedure below and use a different set of bigram frequencies for another language or text corpus, one could create a variant of the Engram layout, say “Engram-French”, better suited to the French language.

Why “Engram”?
The name is a pun, referring both to “n-gram”, letter permutations and their frequencies that are used to compute the Engram layout, and “engram”, or memory trace, the postulated change in neural tissue to account for the persistence of memory, as a nod to my attempt to make this layout easy to remember.

How does Engram compare with other key layouts?

Below we compare the Engram layout with different prominent key layouts (Colemak, Dvorak, QWERTY, etc.) for some large, representative, publicly available data (all text sources are listed below and available on GitHub).

Engram Scoring Model scores (x100) for layouts, based on publicly available text data

Engram scores higher for all text and software sources than all other layouts according to its own scoring model (higher scores are better):

Layout	Google bigrams	Alice	Memento	Tweets_100K	Tweets_20K	Tweets_MASC	Spoken_MASC	COCA_blogs	iweb	Monkey	Coder	Rosetta
Engram	62.48	61.67	62.30	63.03	60.28	62.49	61.56	62.19	62.38	62.23	62.51	62.48
Halmak	62.40	61.60	62.23	62.93	60.26	62.43	61.51	62.13	62.31	62.16	62.46	62.40
Hieamtsrn	62.39	61.64	62.27	62.99	60.27	62.47	61.53	62.16	62.35	62.20	62.49	62.39
Norman	62.35	61.57	62.20	62.86	60.21	62.39	61.47	62.08	62.27	62.12	62.40	62.35
Workman	62.37	61.59	62.22	62.91	60.23	62.41	61.49	62.10	62.29	62.14	62.43	62.37
MTGap 2.0	62.32	61.59	62.21	62.88	60.22	62.39	61.49	62.09	62.28	62.13	62.42	62.32
QGMLWB	62.31	61.58	62.21	62.90	60.25	62.40	61.49	62.10	62.29	62.14	62.43	62.31
Colemak Mod-DH	62.36	61.60	62.22	62.90	60.26	62.41	61.49	62.12	62.30	62.16	62.44	62.36
Colemak	62.36	61.58	62.20	62.89	60.25	62.40	61.48	62.10	62.29	62.14	62.43	62.36
Asset	62.34	61.56	62.18	62.86	60.25	62.37	61.46	62.07	62.25	62.10	62.39	62.34
Capewell-Dvorak	62.29	61.56	62.17	62.86	60.20	62.36	61.47	62.06	62.24	62.10	62.37	62.29
Klausler	62.34	61.58	62.20	62.89	60.25	62.39	61.48	62.09	62.27	62.12	62.41	62.34
Dvorak	62.31	61.56	62.17	62.85	60.23	62.35	61.46	62.06	62.24	62.09	62.35	62.31
QWERTY	62.19	61.49	62.08	62.72	60.17	62.25	61.39	61.96	62.13	61.99	62.25	62.19

Keyboard Layout Analyzer (KLA) scores for the same text sources

The optimal layout score is based on a weighted calculation that factors in the distance your fingers moved (33%), how often you use particular fingers (33%), and how often you switch fingers and hands while typing (34%).

In the table below, Engram scores highest for 3 of the 9 text sources, second highest for 5, and third highest for 1; it scores fourth highest for both software sources, “Coder” and “Rosetta” (higher scores are better):

Layout	Alice in Wonderland	Memento screenplay	100K tweets	20K tweets	MASC tweets	MASC spoken	COCA blogs	iweb	Monkey	Coder	Rosetta
Engram	70.13	57.16	64.64	58.58	60.24	64.39	69.66	68.25	67.66	46.81	47.69
Halmak	66.25	55.03	60.86	55.53	57.13	62.32	67.29	65.50	64.75	45.68	47.60
Hieamtsrn	69.43	56.75	64.40	58.95	60.47	64.33	69.93	69.15	68.30	46.01	46.48
Colemak Mod-DH	65.74	54.91	60.75	54.94	57.15	61.29	67.12	65.98	64.85	47.35	48.50
Norman	62.76	52.33	57.43	53.24	53.90	59.97	62.80	60.90	59.82	43.76	46.01
Workman	64.78	54.29	59.98	55.81	56.25	61.34	65.27	63.76	62.90	45.33	47.76
MTGAP 2.0	66.13	53.78	59.87	55.30	55.81	60.32	65.68	63.81	62.74	45.38	44.34
QGMLWB	65.45	54.07	60.51	56.05	56.90	62.23	66.26	64.76	63.91	46.38	45.72
Colemak	65.83	54.94	60.67	54.97	57.04	61.36	67.14	66.01	64.91	47.30	48.65
Asset	64.60	53.84	58.66	54.72	55.35	60.81	64.71	63.17	62.44	45.54	47.52
Capewell-Dvorak	66.94	55.66	62.14	56.85	57.99	62.83	66.95	65.23	64.70	45.30	45.62
Klausler	68.24	59.91	62.57	56.45	58.34	64.04	68.34	66.89	66.31	46.83	45.66
Dvorak	65.86	58.18	60.93	55.56	56.59	62.75	66.64	64.87	64.26	45.46	45.55
QWERTY	53.06	43.74	48.28	44.99	44.59	51.79	52.31	50.19	49.18	38.46	39.89

Keyboard Layout Analyzer consecutive same-finger key presses

KLA (and other) distance measures may not accurately reflect natural typing, so below is a more reliable measure of one source of effort and strain: the tally of consecutive key presses with the same finger for different keys. In the table below, Engram scores lowest for 5 of the 9 text sources, second lowest for 3, and third lowest for 1; it scores third and fifth lowest for the two software sources, “Coder” and “Rosetta” (lower scores are better):

Layout	Alice	Memento	Tweets_100K	Tweets_20K	Tweets_MASC	Spoken_MASC	COCA_blogs	iweb	Monkey	Coder	Rosetta
Engram	216	11476	320406	120286	7728	3514	137290	1064640	37534	125798	5822
Halmak	498	13640	484702	170064	11456	5742	268246	2029634	68858	144790	5392
Hieamtsrn	244	12096	311000	119490	8316	3192	155674	1100116	40882	158698	7324
Norman	938	20012	721602	213890	16014	9022	595168	3885282	135844	179752	7402
Workman	550	13086	451280	136692	10698	6156	287622	1975564	71150	132526	5550
MTGap 2.0	226	14550	397690	139130	10386	6252	176724	1532844	58144	138484	7272
QGMLWB	812	17820	637788	189700	14364	7838	456442	3027530	100750	149366	8062
Colemak Mod-DH	362	10960	352578	151736	9298	4644	153984	1233770	47438	117842	5328
Colemak	362	10960	352578	151736	9298	4644	153984	1233770	47438	117842	5328
Asset	520	12519	519018	155246	11802	5664	332860	2269342	77406	140886	6020
Capewell-Dvorak	556	14226	501178	163878	12214	6816	335056	2391416	78152	151194	9008
Klausler	408	14734	455658	174998	11410	5212	257878	1794604	59566	135782	7444
Dvorak	516	13970	492604	171488	12208	5912	263018	1993346	64994	142084	6484
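The tally above can be illustrated with a small counter. This is a minimal sketch, not KLA's implementation: the finger assignment assumes one finger per Engram letter column, with the two middle punctuation columns folded into the index fingers and Z and Q into the right little finger. Shifted characters and spaces are not mapped and simply break a sequence. The names `FINGER_KEYS` and `same_finger_presses` are hypothetical.

```python
# Engram keys grouped by finger (an assumption of this sketch, see above).
FINGER_KEYS = [
    "bcg",       # left little
    "yix",       # left ring
    "oej",       # left middle
    "uak',-",    # left index, plus the left punctuation column
    "lhr\".?",   # right index, plus the right punctuation column
    "dtm",       # right middle
    "wsf",       # right ring
    "vnpzq",     # right little, plus Z and Q
]
FINGER_OF = {ch: finger for finger, keys in enumerate(FINGER_KEYS) for ch in keys}


def same_finger_presses(text: str) -> int:
    """Count consecutive presses of two different keys by the same finger."""
    chars = text.lower()
    count = 0
    for a, b in zip(chars, chars[1:]):
        if a == b or a not in FINGER_OF or b not in FINGER_OF:
            continue  # repeated key, or a character this sketch does not map
        if FINGER_OF[a] == FINGER_OF[b]:
            count += 1
    return count


print(same_finger_presses("eject"))  # 2: "ej" and "je" both use the left middle finger
print(same_finger_presses("the"))    # 0: three different fingers
```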

Inward roll frequencies

Here we tally the number of bigrams (in billions of instances from Norvig’s analysis of Google data) that engage inward rolls (little-to-index sequences), within the four columns of one hand, or any column across two hands. Engram scores second highest for 32 keys and highest for 24 keys, where the latter ensures that we are comparing Engram’s letters with letters in other layouts (higher scores are better):

Total inward roll frequency in billions

Layout             32 / 24 keys
Engram:          4.64 / 4.51
Halmak:          4.59 / 4.25
Hieamtsrn:       4.69 / 4.16
Norman:          3.99 / 3.61
Workman:         4.16 / 3.63
MTGap 2.0:       3.96 / 3.58
QGMLWB:          4.36 / 2.81
Colemak Mod-DH:  4.15 / 3.51
Colemak:         4.17 / 3.16
Asset:           4.03 / 3.05
Capewell-Dvorak: 4.39 / 3.66
Klausler:        4.42 / 3.52
Dvorak:          4.40 / 3.20
QWERTY:          3.62 / 2.13
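A same-hand inward roll can be detected by comparing finger positions, as in the sketch below. It covers only the same-hand case for the 24 Engram letter keys; the cross-hand part of the tally described above is not modelled, and the helper names are hypothetical. The sample counts are the bigram counts quoted in Step 2 of this document.

```python
# Engram letter block: four columns per hand, three rows.
LEFT_ROWS = ["byou", "ciea", "gxjk"]    # columns: little, ring, middle, index
RIGHT_ROWS = ["ldwv", "htsn", "rmfp"]   # columns: index, middle, ring, little

# POSITION[letter] = (hand, steps from the little finger: 0 = little, 3 = index)
POSITION = {}
for row in LEFT_ROWS:
    for col, ch in enumerate(row):
        POSITION[ch] = ("L", col)
for row in RIGHT_ROWS:
    for col, ch in enumerate(row):
        POSITION[ch] = ("R", 3 - col)


def is_inward_roll(bigram: str) -> bool:
    """True if both letters are on one hand and the second is nearer the index finger."""
    hand1, finger1 = POSITION[bigram[0].lower()]
    hand2, finger2 = POSITION[bigram[1].lower()]
    return hand1 == hand2 and finger2 > finger1


# A few bigram counts quoted elsewhere in this document (Norvig's Google data).
SAMPLE = {
    "TH": 100_272_945_963, "ND": 38_129_777_631, "ST": 29_704_461_829,
    "OU": 24_531_132_241, "IO": 23_542_263_265, "EA": 19_403_941_063,
    "RT": 10_198_055_461,
}

inward_total = sum(n for bigram, n in SAMPLE.items() if is_inward_roll(bigram))
print(f"{inward_total / 1e9:.2f} billion of {sum(SAMPLE.values()) / 1e9:.2f} billion")
```

In this small sample every bigram except RT is an inward roll on Engram.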

Layout	Year	Website
Engram	2021	https://engram.dev
Halmak 2.2	2016	https://github.com/MadRabbit/halmak
Hieamtsrn	2014	https://mathematicalmulticore.wordpress.com/the-keyboard-layout-project/#comment-4976
Colemak Mod-DH	2014	https://colemakmods.github.io/mod-dh/
Norman	2013	https://normanlayout.info/
Workman	2010	https://workmanlayout.org/
MTGAP 2.0	2010	https://mathematicalmulticore.wordpress.com/2010/06/21/mtgaps-keyboard-layout-2-0/
QGMLWB	2009	http://mkweb.bcgsc.ca/carpalx/?full_optimization
Colemak	2006	https://colemak.com/
Asset	2006	http://millikeys.sourceforge.net/asset/
Capewell-Dvorak	2004	http://michaelcapewell.com/projects/keyboard/layout_capewell-dvorak.htm
Klausler	2002	https://web.archive.org/web/20031001163722/http://klausler.com/evolved.html
Dvorak	1936	https://en.wikipedia.org/wiki/Dvorak_keyboard_layout
QWERTY	1873	https://en.wikipedia.org/wiki/QWERTY

Text source	Information
“Alice in Wonderland”	Alice in Wonderland (Ch.1)
“Memento screenplay”	Memento screenplay
“100K tweets”	100,000 tweets from: Sentiment140 dataset training data
“20K tweets”	20,000 tweets from Gender Classifier Data
“MASC tweets”	MASC tweets (cleaned of html markup)
“MASC spoken”	MASC spoken transcripts (phone and face-to-face: 25,783 words)
“COCA blogs”	Corpus of Contemporary American English blog samples
“Rosetta”	“Tower of Hanoi” (programming languages A-Z from Rosetta Code)
“Monkey text”	Ian Douglas’s English-generated monkey0-7.txt corpus
“Coder text”	Ian Douglas’s software-generated coder0-7.txt corpus
“iweb cleaned corpus”	First 150,000 lines of Shai Coleman’s iweb-corpus-samples-cleaned.txt

Reference for Monkey and Coder texts: Douglas, Ian. (2021, March 28). Keyboard Layout Analysis: Creating the Corpus, Bigram Chains, and Shakespeare’s Monkeys (Version 1.0.0). Zenodo. http://doi.org/10.5281/zenodo.4642460

Guiding criteria

1. Assign letters to keys that don't require lateral finger movements.
2. Promote alternating between hands over uncomfortable same-hand transitions.
3. Assign the most common letters to the most comfortable keys.
4. Arrange letters so that more frequent bigrams are easier to type.
5. Promote little-to-index-finger roll-ins over index-to-little-finger roll-outs.
6. Balance finger loads according to their relative strength.
7. Avoid stretching shorter fingers up and longer fingers down.
8. Avoid using the same finger.
9. Avoid skipping over the home row.
10. Assign the most common punctuation to keys in the middle of the keyboard.
11. Assign easy-to-remember symbols to the Shift-number keys.

Factors used to compute the Engram layout

N-gram letter frequencies

Peter Norvig’s analysis of data from Google’s book scanning project
Flow factors (transitions between ordered key pairs)
These factors are influenced by Dvorak’s 11 criteria (1936).

Summary of steps and results

We will assign letters to keys by choosing the arrangement with the highest score according to our scoring model. However, there are over four hundred septillion, or four hundred trillion trillion (26! = 403,291,461,126,605,635,584,000,000, or 4.032914611 E+26) possible arrangements of 26 letters (24! = 6.204484017 E+23), so we will arrange the letters in four steps, based on ergonomics principles. These consist of (Step 1) assigning the eight most frequent letters to different keys, optimizing assignment of (Step 2) the eight most frequent and (Step 3) the eight least frequent of the remaining letters (besides Z and Q), and (Step 4) exchanging letters.
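The two factorials quoted above are easy to reproduce with Python's standard library; the variable names are only illustrative.

```python
import math

all_letters = math.factorial(26)    # arrangements of all 26 letters
core_letters = math.factorial(24)   # arrangements of the 24 letters besides Z and Q

print(f"26! = {all_letters:,}")      # 403,291,461,126,605,635,584,000,000
print(f"24! = {core_letters:.9e}")   # 6.204484017e+23
```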

Step 1: Define the shape of the key layout to minimize lateral finger movements

We will assign 24 letters to 8 columns of keys separated by two middle columns reserved for punctuation. These 8 columns require no lateral finger movements when touch typing, since there is one column per finger. The most comfortable keys include the left and right home rows (keys 5-8 and 17-20), the top-center keys (2,3 and 14,15) that allow the longer middle and ring fingers to uncurl upwards, as well as the bottom corner keys (9,12 and 21,24) that allow the shorter fingers to curl downwards. We will assign the two least frequent letters, Z and Q (or J), to the two hardest-to-reach keys lying outside the 24-key columns in the upper right (25 and 26):

    Left:            Right:
 1  2  3  4       13 14 15 16 25
 5  6  7  8       17 18 19 20 26
 9 10 11 12       21 22 23 24

We will consider the most comfortable keys to be those typed by either hand on the home row, by the ring and middle finger above the home row, and by the index and little finger below the home row, with a preference for the strongest (index and middle) fingers:

 -  2  3  -        - 14 15  -  
 5  6  7  8       17 18 19 20  
 9  -  - 12       21  -  - 24

Step 2: Arrange the most frequent letters based on comfort and bigram frequencies

In prior experiments using the methods below, all vowels consistently clustered together automatically. Below, we will arrange vowels on one side and the most frequent consonants on the other side to encourage balance and alternation across hands. Since, aside from the letters Z and Q, there is symmetry across left and right sides, we will decide later on which side the vowels and on which side the most frequent consonants should go.

Vowels

E, T, A, O, I, N, S, R, H, L, D, C, U, M, F, P, G, W, Y, B, V, K, X, J, Q, Z

The highest-frequency bigrams that contain two vowels are listed below; the first four (OU, IO, EA, IE) have more than 10 billion instances in Peter Norvig’s analysis of Google data:

OU, IO, EA, IE, AI, IA, EI, UE, UA, AU, UI, OI, EO, OA, OE

 OU  24,531,132,241
 IO  23,542,263,265
 EA  19,403,941,063
 IE  10,845,731,320
 AI   8,922,759,715
 IA   8,072,199,471   
 EI   5,169,898,489
 UE   4,158,448,570       
 UA   3,844,138,094   
 AU   3,356,322,923
 UI   2,852,182,384
 OI   2,474,275,212
 EO   2,044,268,477
 OA   1,620,913,259
 OE   1,089,254,517 
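The 10-billion cutoff used throughout this step can be applied to the list above with a simple filter. The dictionary is copied from the counts above; `THRESHOLD` and the variable names are illustrative.

```python
# Vowel-vowel bigram counts copied from the list above.
VOWEL_BIGRAMS = {
    "OU": 24_531_132_241, "IO": 23_542_263_265, "EA": 19_403_941_063,
    "IE": 10_845_731_320, "AI": 8_922_759_715, "IA": 8_072_199_471,
    "EI": 5_169_898_489, "UE": 4_158_448_570, "UA": 3_844_138_094,
    "AU": 3_356_322_923, "UI": 2_852_182_384, "OI": 2_474_275_212,
    "EO": 2_044_268_477, "OA": 1_620_913_259, "OE": 1_089_254_517,
}
THRESHOLD = 10_000_000_000  # "top-frequency" cutoff: more than 10 billion instances

top_bigrams = [bigram for bigram, n in VOWEL_BIGRAMS.items() if n > THRESHOLD]
print(top_bigrams)  # ['OU', 'IO', 'EA', 'IE']

# Only bigrams made of the four vowels placed in this step constrain the arrangement.
placed = set("EAOI")
constraining = [bigram for bigram in top_bigrams if set(bigram) <= placed]
print(constraining)  # ['IO', 'EA', 'IE']
```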

We will assign the most frequent vowels with over 100 billion instances in Norvig’s analysis (E=445, A=287, O=272, I=270) to four of the six most comfortable keys on the left side of the keyboard (keys 2, 3, 5, 6, 7, 8). We will assign the letter E, the most frequent in the English language, to either of the strongest (index and middle) fingers on the home row, and assign the other three vowels such that (1) the home row keys typed by the index and middle fingers are not left vacant, and any top-frequency bigram (more than 10 billion instances in Norvig’s analysis) (2) does not use the same finger and (3) reads from left to right (e.g., EA, not AE) for ease of typing (inward roll from little to index finger vs. outward roll from index to little finger). These constraints lead to three arrangements of the four vowels:

- - O -    - - O -    - - - -    
- I E A    I - E A    I O E A
- - - -    - - - -    - - - -

Consonants

On the right side of the keyboard, we will assign four of the five most frequent consonants (with over 5% or 150 billion instances in Norvig’s analysis: T=331, N=258, S=232, R=224, and H=180) to the four home row keys. We will assign the letter T, the most frequent consonant in the English language, to either of the strongest (index and middle) fingers on the home row. As with the left side, letters are placed so that top-frequency bigrams read from right to left (e.g., HT, not TH) for ease of typing. The top-frequency bigrams (more than 10 billion instances in Norvig’s analysis) include: TH, ND, ST, NT, CH, NS, CT, TR, RS, NC, and RT (below 10 billion instances these bigrams start to occur in reverse, such as RT and TS):

 TH 100,272,945,963  3.56% 
 ND  38,129,777,631  1.35%
 ST  29,704,461,829  1.05%
 NT  29,359,771,944  1.04%
 CH  16,854,985,236  0.60%
 NS  14,350,320,288   
 CT  12,997,849,406
 TR  12,006,693,396       
 RS  11,180,732,354   
 NC  11,722,631,112
 RT  10,198,055,461   

The above constraints lead to five arrangements of the consonants:

- - - -    - - - -    - - - -    - - - -    - - - -
R T S N    H T S N    H T S R    H T N R    T S N R
- - - -    - - - -    - - - -    - - - -    - - - -

We will assign the fifth consonant to a vacant key on the left home row if there is a vacancy, otherwise to the key below the right index finger (any other assignment requires the same finger to type a high-frequency bigram). The resulting 20 initial layouts, each with 15 unassigned keys, are represented below with the three rows on the left and right side of the keyboard as a linear string of letters, with unassigned keys denoted by “-”.

--O- HIEA ----    ---- RTSN ----
--O- RIEA ----    ---- HTSN ----
--O- NIEA ----    ---- HTSR ----
--O- SIEA ----    ---- HTNR ----
--O- IHEA ----    ---- RTSN ----
--O- IREA ----    ---- HTSN ----
--O- INEA ----    ---- HTSR ----
--O- ISEA ----    ---- HTNR ----
--O- -IEA ----    ---- RTSN H---
--O- -IEA ----    ---- HTSN R---
--O- -IEA ----    ---- HTSR N---
--O- I-EA ----    ---- RTSN H---
--O- I-EA ----    ---- HTSN R---
--O- I-EA ----    ---- HTSR N---
---- IOEA ----    ---- RTSN H---
---- IOEA ----    ---- HTSN R---
---- IOEA ----    ---- HTSR N---
--O- HIEA ----    ---- TSNR ----
--O- IHEA ----    ---- TSNR ----

Step 3: Optimize assignment of the remaining letters

We want to assign letters to the 17 unassigned keys in each of the above 20 layouts based on our scoring model. That would mean scoring all possible arrangements for each layout and choosing the arrangement with the highest score, but since there are over 355 trillion (17!) possible ways of arranging 17 letters, we will break up the assignment into two stages for the most frequent and least frequent remaining letters.

Most frequent letters

We will compute scores for every possible arrangement of the seven most frequent of the remaining letters (L, D, C, U, M, F, P) assigned to vacancies among the most comfortable sixteen keys.

E, T, A, O, I, N, S, R, H, L, D, C, U, M, F, P, G, W, Y, B, V, K, X, J, Q, Z

    Left:            Right:
 -  2  3  -        - 14 15  -
 5  6  7  8       17 18 19 20
 9  -  - 12       21  -  - 24

Since there are 5,040 (7!) possible arrangements of seven letters for each of the 21 layouts, we need to score and evaluate 105,840 layouts. To score each arrangement of letters, we construct a frequency matrix where we multiply a matrix containing the frequency of each ordered pair of letters (bigram) by our flow and strength matrices to compute a score.

Least frequent letters

Next we will compute scores for every possible (40,320 = 8!) arrangement of the least frequent eight letters besides Z and Q (G, W, Y, B, V, K, X, J) in the remaining keys, after substituting in the 21 results of the above for an additional 846,720 layouts:

E, T, A, O, I, N, S, R, H, L, D, C, U, M, F, P, G, W, Y, B, V, K, X, J, Q, Z

    Left:            Right:
 1  -  -  4       13  -  - 16
 -  -  -  -        -  -  -  -
 - 10 11  -        - 22 23  -

Further optimize layouts by exchanging more letters

If we relax the above fixed initializations and permit further exchange of letters, then we can search for even higher-scoring layouts. As a final optimization step we exchange letters, eight keys at a time (8! = 40,320) selected twice in 14 different ways, in each of the above 21 layouts, to score a total of 23,708,160 more combinations. We allow the following keys to exchange letters:

1. Top rows
2. Bottom rows
3. Top and bottom rows on the right side
4. Top and bottom rows on the left side
5. Top right and bottom left rows
6. Top left and bottom right rows
7. Center of the top and bottom rows on both sides
8. The eight corners
9. Left half of the top and bottom rows on both sides
10. Right half of the top and bottom rows on both sides
11. Left half of non-home rows on the left and right half of the same rows on the right
12. Right half of non-home rows on the left and left half of the same rows on the right
13. Top center and lower sides
14. Top sides and lower center
15. Repeat 1-14
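The layout counts quoted in Step 3 follow from a few factorials. The sketch below reproduces them, assuming 21 initialized layouts (the figure used in the totals above); the variable names are illustrative.

```python
import math

LAYOUTS = 21  # initialized layouts, as used in the totals quoted above

brute_force = math.factorial(17)                        # all 17 remaining letters at once
stage_most = math.factorial(7) * LAYOUTS                # seven most frequent remaining letters
stage_least = math.factorial(8) * LAYOUTS               # eight least frequent letters
stage_exchange = math.factorial(8) * 2 * 14 * LAYOUTS   # 14 key sets, each selected twice

print(f"{brute_force:,}")     # 355,687,428,096,000
print(f"{stage_most:,}")      # 105,840
print(f"{stage_least:,}")     # 846,720
print(f"{stage_exchange:,}")  # 23,708,160
```

Splitting the search into stages reduces the work from hundreds of trillions of arrangements per layout to about 25 million scored layouts in total.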

Engram Scoring Model

Our optimization algorithm finds every permutation of a given set of letters, maps these letter permutations to a set of keys, and ranks these letter-key mappings according to a score reflecting ease of typing key pairs and frequency of letter pairs (bigrams). The score is the average of the scores for all possible bigrams in this arrangement. The score for each bigram is a product of the frequency of occurrence of that bigram, the frequency of each of the bigram’s characters, and flow, strength (and optional speed) factors for the key pair.
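The description above can be sketched on a toy problem. This is not the project's code: it places three vowels on three left-hand home keys, folds the flow and strength factors into a single hypothetical `pair_factor` table (1.0 for an inward pair, 0.9 for an outward pair), skips doubled letters, and treats the bigram AE, which is not listed in this document, as zero. Letter and bigram frequencies are rounded figures in billions quoted elsewhere in this document.

```python
from itertools import permutations


def score_layout(letters, keys, bigram_freq, letter_freq, pair_factor):
    """Average over ordered letter pairs of bigram freq x letter freqs x key-pair factor."""
    key_of = dict(zip(letters, keys))
    total, pairs = 0.0, 0
    for a in letters:
        for b in letters:
            if a == b:
                continue  # doubled letters are skipped in this sketch
            total += (bigram_freq.get(a + b, 0.0)
                      * letter_freq[a] * letter_freq[b]
                      * pair_factor[(key_of[a], key_of[b])])
            pairs += 1
    return total / pairs


def best_layout(letters, keys, bigram_freq, letter_freq, pair_factor):
    """Score every permutation of the letters on the given keys and keep the best."""
    return max(permutations(letters),
               key=lambda perm: score_layout(perm, keys, bigram_freq, letter_freq, pair_factor))


# Toy setup: keys 0, 1, 2 are the left ring, middle, and index home keys.
keys = (0, 1, 2)
pair_factor = {(i, j): (1.0 if i < j else 0.9) for i in keys for j in keys if i != j}
letter_freq = {"e": 445, "a": 287, "i": 270}
bigram_freq = {"ie": 10.85, "ei": 5.17, "ea": 19.40, "ia": 8.07, "ai": 8.92}

winner = best_layout("iea", keys, bigram_freq, letter_freq, pair_factor)
print("".join(winner))  # iea
```

Even this toy version recovers the I, E, A order of Engram's left home row, because the most frequent bigrams (EA, IE) then roll inward.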

Factors to penalize strenuous key transitions

Direction:

- outward = 0.9: outward roll of fingers from the index to little finger (same hand)

Dexterity:

- side_above_3away = 0.9
    - index and little finger type two keys, one or more rows apart (same hand)
- side_above_2away = 0.9^2 = 0.81
    - index finger types key a row or two above ring finger key, or
    - little finger types key a row or two above middle finger key (same hand)
- side_above_1away = 0.9^3 = 0.729
    - index finger types key a row or two above middle finger key, or
    - little finger types key a row or two above ring finger key (same hand)
- middle_above_ring = 0.9
    - middle finger types key a row or two above ring finger key (same hand)
- ring_above_middle = 0.9^3 = 0.729
    - ring finger types key a row or two above middle finger key (same hand)
- lateral = 0.9
    - lateral movement of (index or little) finger outside of 8 vertical columns

Distance:

- skip_row_3away = 0.9       
    - index and little fingers type two keys that skip over home row (same hand)
    - (e.g., one on bottom row, the other on top row)
- skip_row_2away = 0.9^3 = 0.729
    - little and middle or index and ring fingers type two keys that skip over home row (same hand)
- skip_row_1away = 0.9^5 = 0.59049
    - little and ring or middle and index fingers type two keys that skip over home row (same hand)

Repetition:

- skip_row_0away = 0.9^4 = 0.6561
    - same finger types two keys that skip over home row
- same_finger = 0.9^5 = 0.59049
    - use same finger again for a different key
    - cannot accompany outward, side_above, or adjacent_shorter_above 

Strength: Accounted for by the strength matrix (minimum value for the little finger = 0.9)
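To show how such penalties might combine for one key pair, here is a partial sketch for the left-hand keys 1-12 from Step 1. It implements only the direction, skip-row, and same-finger factors, using the values listed above. Multiplying the factors that apply is an assumption of this sketch, as is ignoring skip_row_0away and the dexterity factors; `flow_factor` is a hypothetical name.

```python
# Penalty values taken from the list above (subset only).
PENALTY = {
    "outward": 0.9,
    "skip_row_3away": 0.9,
    "skip_row_2away": 0.9 ** 3,
    "skip_row_1away": 0.9 ** 5,
    "same_finger": 0.9 ** 5,
}


def row_col(key: int) -> tuple[int, int]:
    """Left-hand key 1-12 -> (row 0-2 from the top, column 0 = little ... 3 = index)."""
    return (key - 1) // 4, (key - 1) % 4


def flow_factor(key1: int, key2: int) -> float:
    """Penalty (at most 1.0) for typing key1 then key2 with the left hand."""
    (row1, col1), (row2, col2) = row_col(key1), row_col(key2)
    if key1 == key2:
        return 1.0                       # same key repeated: no penalty here
    if col1 == col2:
        return PENALTY["same_finger"]    # same finger, different key
    factor = 1.0
    if col2 < col1:                      # moving from the index toward the little finger
        factor *= PENALTY["outward"]
    if abs(row1 - row2) == 2:            # top row to bottom row or back, skipping home
        factor *= PENALTY[f"skip_row_{abs(col1 - col2)}away"]
    return factor


print(flow_factor(7, 8))             # 1.0: middle to index on the home row, inward
print(round(flow_factor(8, 7), 5))   # 0.9: the same pair typed outward
print(round(flow_factor(12, 1), 5))  # 0.81: outward and skipping the home row, 3 columns apart
```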

Step 4: Evaluate winner against other optimized layouts

After assigning letters Z and Q to upper right keys outside of the home blocks and testing left/right side swap of all letters, the winning layout is:

B Y O U  L D W V Z
C I E A  H T S N Q
G X J K  R M F P    

We ran tests on the winning layout:

1. Evaluate optimized layouts using interkey speed estimates
2. Evaluate variants of the candidate winner using interkey speed estimates
3. Test sensitivity of the candidate winner to the scoring parameters

For test 1, we rescored all of the 20 top-scoring layouts optimized from the 20 initialized layouts, and replaced the factor matrix with the inter-key speed matrix. The same two layouts that tied for first place do so again.

For test 2, we rescored all of the 5,040 variants of the candidate winner that were tied for first place, replacing the factor matrix with the interkey speed matrix. The candidate winner scored highest.

For test 3, we ran a test on the variants of the candidate winner layout to see how robust they are to removal of scoring parameters. We removed each of the 11 scoring parameters one by one and ranked the new scores for the variants. The candidate winner scored highest for 8 of the 11 cases, and second highest for two other cases, demonstrating that this layout is not sensitive to individual parameters.

Step 5: Arrange non-letter characters in easy-to-remember places

Now that we have all 26 letters accounted for, we turn our attention to non-letter characters, taking into account frequency of punctuation and ease of recall.

Frequency of punctuation marks

Statistical values of punctuation frequency in 20 English-speaking countries (Table 1):
Sun, Kun & Wang, Rong. (2018). Frequency Distributions of Punctuation Marks in English: Evidence from Large-scale Corpora. English Today. 10.1017/S0266078418000512.
https://www.researchgate.net/publication/328512136_Frequency_Distributions_of_Punctuation_Marks_in_English_Evidence_from_Large-scale_Corpora
“frequency of punctuation marks attested for twenty English-speaking countries and regions… The data were acquired through GloWbE.” “The corpus of GloWbE (2013) is a large English corpus collecting international English from the internet, containing about 1.9 billion words of text from twenty different countries. For further information on the corpora used, see https://corpus.byu.edu/.”

Google N-grams and Twitter analysis:
“Punctuation Input on Touchscreen Keyboards: Analyzing Frequency of Use and Costs”
S Malik, L Findlater - College Park: The Human-Computer Interaction Lab. 2013
https://www.cs.umd.edu/sites/default/files/scholarly_papers/Malik.pdf
“the Twitter corpora included substantially higher punctuation use than the Google corpus, comprising 7.5% of characters in the mobile tweets and 7.6% in desktop versus only 4.4%… With the Google corpus, only 6 punctuation symbols (. -’ ( ) “) appeared more frequently than [q]”

“Frequencies for English Punctuation Marks” by Vivian Cook
http://www.viviancook.uk/Punctuation/PunctFigs.htm
“Based on a writing system corpus some 459 thousand words long. This includes three novels of different types (276 thousand words), selections of articles from two newspapers (55 thousand), one bureaucratic report (94 thousand), and assorted academic papers on language topics (34 thousand). More information is in Cook, V.J. (2013) ‘Standard punctuation and the punctuation of the street’ in M. Pawlak and L. Aronin (eds.), Essential Topics in Applied Linguistics and Multilingualism, Springer International Publishing Switzerland (2013), 267-290”

“A Statistical Study of Current Usage in Punctuation”:
Ruhlen, H., & Pressey, S. (1924). A Statistical Study of Current Usage in Punctuation. The English Journal, 13(5), 325-331. doi:10.2307/802253

“Computer Languages Character Frequency” by Xah Lee.
Date: 2013-05-23. Last updated: 2020-06-29.
http://xahlee.info/comp/computer_language_char_distribution.html
NOTE: biased toward C (19.8%) and Py (18.5%), which have high use of “_”.

Frequency:

         Sun:     Malik:   Ruhlen:    Cook:            Xah:
          /1M   N-gram %   /10,000   /1,000       All%  JS%   Py%

.    42840.02      1.151       535     65.3       6.6   9.4  10.3
,    44189.96                  556     61.6       5.8   8.9   7.5
"                  2.284        44     26.7       3.9   1.6   6.2
'     2980.35      0.200        40     24.3       4.4   4.0   8.6
-     9529.78      0.217        21     15.3       4.1   1.9   3.0
()    4500.81      0.140         7                7.4   9.8   8.1
;     1355.22      0.096        22      3.2       3.8   8.6
z                  0.09                   -         -
:     3221.82      0.087        11      3.4       3.5   2.8   4.7
?     4154.78      0.032        14      5.6       0.3
/                  0.019                          4.0   4.9   1.1
!     2057.22      0.013         3      3.3       0.4
_                  0.001                         11.0   2.9  10.5

Add punctuation keys and number keys

We will assign the most frequent punctuation according to Sun and Wang (2018) to the six keys in the middle two columns: . , “ ‘ - ? ; : () ! _

        B Y O U   '  "   L D W V Z
        C I E A   ,  .   H T S N Q
        G X J K   -  ?   R M F P

We will use the Shift key to group similar punctuation marks (separating and joining marks in the left middle column and closing marks in the right middle column):

        B Y O U  '(  ")  L D W V Z #$ @`
        C I E A  ,;  .:  H T S N Q
        G X J K  -_  ?!  R M F P

Separating marks (left): The comma separates text in lists; the semicolon can be used in place of the comma to separate items in a list (especially if these items contain commas); open parenthesis sets off an explanatory word, phrase, or sentence.

Joining marks (left): The apostrophe joins words as contractions; the hyphen joins words as compounds; the underscore joins words in cases where whitespace characters are not permitted (such as in variables or file names).

Closing marks (right): A sentence usually ends with a period, question mark, or exclamation mark. The colon ends one statement but precedes the following: an explanation, quotation, list, etc. Double quotes and the close parenthesis close a word, clause, or sentence separated by an open parenthesis.

Number keys: The numbers are flanked to the left and right by [square brackets], and {curly brackets} accessed by the Shift key. Each of the numbers is paired with a mathematical or logic symbol accessed by the Shift key:

{ | = ~ +   <  >   ^ & % * } \
[ 1 2 3 4   5  6   7 8 9 0 ] /

| (vertical bar or "pipe" represents the logical OR operator: 1 stroke, looks like the number one)
= (equal: 2 strokes, like the Chinese character for "2")
~ (tilde: "almost equal", often written with 3 strokes, like the Chinese character for "3")
+ (plus: has four quadrants; resembles "4")
5 & 6: < > ("less/greater than"; these angle brackets are directly above the other bracket keys)
^ (caret for logical XOR operator as well as exponentiation; resembles "7")
& (ampersand: logical AND operator; resembles "8")
% (percent: related to division; resembles "9")
* (asterisk: for multiplication; resembles "0") 

The three remaining keys in many common keyboards (flanking the upper right hand corner Backspace key) are displaced in special keyboards, such as the Kinesis Advantage and Ergodox. For the top right key, we will assign the forward slash and backslash: / \. For the remaining two keys, we will assign two symbols that in modern usage have significance in social media: the hash/pound sign and the “at sign”. The hash or hashtag identifies digital content on a specific topic (the Shift key accesses the dollar sign). The “at sign” identifies a location or affiliation (such as in email addresses) and acts as a “handle” to identify users in popular social media platforms and online forums.

The resulting Engram layout:

     [{ 1| 2= 3~ 4+  5<  6>  7^ 8& 9% 0* ]} /\
        bB yY oO uU  '(  ")  lL dD wW vV zZ #$ @`
        cC iI eE aA  ,;  .:  hH tT sS nN qQ
        gG xX jJ kK  -_  ?!  rR mM fF pP
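One way to sanity-check the layout above is to parse it into default/Shift pairs and confirm that every printable ASCII character (other than the space) appears exactly once. This is a minimal sketch; `ROWS` is typed in from the diagram above and `shift_of` is a hypothetical name.

```python
import string

# Each token is a key: the default character followed by its Shift character.
ROWS = [
    "[{ 1| 2= 3~ 4+ 5< 6> 7^ 8& 9% 0* ]} /\\",
    "bB yY oO uU '( \") lL dD wW vV zZ #$ @`",
    "cC iI eE aA ,; .: hH tT sS nN qQ",
    "gG xX jJ kK -_ ?! rR mM fF pP",
]

shift_of = {}
for row in ROWS:
    for default, shifted in row.split():  # each two-character token unpacks into a pair
        shift_of[default] = shifted

typed = list(shift_of) + list(shift_of.values())
expected = set(string.ascii_letters + string.digits + string.punctuation)

print(len(shift_of))           # 47 keys
print(set(typed) == expected)  # True: every character is covered
print(len(typed) == len(set(typed)))  # True: no character appears twice
```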
