\section{Development and Implementation of Necessary Tools}

For this thesis, we programmed our own typing test platform to gain better control over the performance-related measurements and over the text that has to be transcribed. The participants had to fill out up to two questionnaires after each typing test, and each questionnaire had to be linked to the corresponding typing test or keyboard. With 24 subjects and five keyboards, and therefore 10 typing tests per subject or 240 typing tests in total, we decided to incorporate a questionnaire feature into our platform to reduce the risk of incorrect mappings between typing tests, surveys and participants. Additionally, because we wanted to control the understandability of the text without introducing observer bias into the text selection process, and also to save time, we implemented a crowdsourcing feature where individuals could provide text snippets that were automatically checked for an adequate \gls{FRE} score. Finally, we wanted to release this platform as open source so that other researchers in the field of text entry performance can use it without additional cost.

Another challenge was to measure the maximum force each individual finger is able to apply to any of the keyswitches on a keyboard. We therefore decided to prototype a device that simulates the position of different keyswitches and measures the force applied by the finger that is usually responsible for actuating a specific key.
Both implementations are explained in more detail in the following two sections, as shown in Figure \ref{fig:s3_flow}.

\begin{figure}[H]
  \centering
  \includegraphics[width=1.0\textwidth]{images/section_3_flow}
  \caption{Overview of the topics covered in the following sections}
  \label{fig:s3_flow}
\end{figure}

\subsection{Typing Test Platform}
\label{sec:gott}

The platform we created is called \gls{GoTT} because the backend, i.e., the server-side code, is programmed in Go, a programming language developed by a team at Google \cite{golang}. We chose Go because its standard library offers convenient packages to quickly set up a web server with simple routing and templating functionality \cite{golang_std}. The backend and frontend communicate through a \gls{REST} \gls{API} and exchange data in \gls{JSON} format. \gls{GoTT} uses a document-based database to persistently store login credentials, results of typing tests and all completed questionnaires. We decided to use \gls{MongoDB} because it can directly store nested, \gls{JSON}-like data without prior transformation \cite{mongodb}. The general functionality of \gls{GoTT} is shown in Figure \ref{fig:gott_arch}.

\begin{figure}[H]
  \centering
  \includegraphics[width=0.9\textwidth]{images/gott_arch.png}
  \caption{Overview of the general functionality of \gls{GoTT}}
  \label{fig:gott_arch}
\end{figure}

The platform offers three major functionalities that are important for this thesis:

\begin{enumerate}
\item \textbf{The typing test} itself was designed after evaluating various free typing test tools online. One major issue almost all of them had in common was the lack of an option to provide custom texts for transcription. Further, only a few provided insight into how performance metrics were calculated or offered the ability to export results automatically.
Since the time between typing tests was limited by the design of the experiment, as described in Section \ref{sec:methodology}, recording the results for multiple metrics by hand would have been error-prone and was therefore not a valid option.

The typing test provided by \gls{GoTT} features a non-intrusive interface. The font size can be adjusted via the zoom functionality of the browser. The colors used to indicate correctly or incorrectly entered characters have been adjusted to enhance accessibility for people with vision-related disabilities. How people with different color vision impairments perceive the colors used in \gls{GoTT} is shown in Figure \ref{fig:gott_colorblind}, which was simulated with a tool called \textit{Color Oracle}\footnote{\url{https://colororacle.org/index.html}} \cite{colororacle}.

\begin{figure}[H]
  \centering
  \includegraphics[width=1.0\textwidth]{images/gott_colorblind.png}
  \caption{\gls{GoTT}'s text area as perceived with different kinds of color blindness. The examples are ordered from the impairment most commonly found in the population (top) to the least common one (bottom) and were simulated with the tool \textit{Color Oracle} \cite{colororacle}}
  \label{fig:gott_colorblind}
\end{figure}

The typing test features an area that displays the text to be transcribed. As soon as the typist has transcribed half of the displayed text, the content of this area starts to scroll up one line after each finished line of text. Further, two drop-down menus are used to select the text and the keyboard for the next typing test. Lastly, two buttons are provided: \textit{Start} reveals the text, and \textit{Reset} lets the participant or researcher interrupt the active typing test in case of malfunctioning hardware (e.g., keyboard, \gls{EMG} device or computer) or if the subject experiences discomfort and wants to stop. The timer for the typing test starts when the participant enters the first character after the start button has been pressed.
The \gls{UI} for the typing test is shown in Figure \ref{fig:gott_text_area}.

\begin{figure}[H]
  \centering
  \includegraphics[width=0.80\textwidth]{images/gott_text_area.jpg}
  \caption{\gls{GoTT}'s typing test. The \textit{START} button reveals the text selected with the drop-down menu labeled \textit{Text to transcribe}. The \textit{RESET} button interrupts the currently active typing test. The content scrolls up one line at a time once half of the text has been transcribed (marked by \textit{Scrolling begins here}) so that the relevant line always stays centered.}
  \label{fig:gott_text_area}
\end{figure}

\gls{GoTT} captures the metrics presented in Listing \ref{lst:meas_perf} according to the formulas given in Section \ref{sec:meas_perf}.

\begin{listing}[H]
\caption{Implementation of performance-related metrics in \gls{GoTT}. The function \textit{roundToPrecision} takes the number of decimal places to round to as its second argument.}
\label{lst:meas_perf}
\begin{minted}[linenos,fontsize=\small]{js}
// TEST_TIME is retrieved from backend and
// set in the config file in seconds
mins = TEST_TIME / 60;
// T is the transcribed text
TL = T.length;
// Input Stream Length = TL + Fixes (Backspace)
//                     + Incorrect Fixed (Fixed Errors)
ISL = TL + F + IF;
// Correct input = TL - Incorrect Not Fixed (Left errors)
C = TL - INF;
// Error metrics
CER = roundToPrecision(IF / (TL + IF), 5);
UER = roundToPrecision(INF / (TL + IF), 5);
TER = roundToPrecision((INF + IF)/(TL + IF), 5);
// Speed metrics
// TL - 1 because the first char is entered at 0 seconds
WPM = roundToPrecision((TL - 1) / (5 * mins), 2);
AdjWPM = roundToPrecision(WPM * Math.pow((1 - UER), a), 2);
KSPS = roundToPrecision((ISL - 1) / TEST_TIME, 5);
\end{minted}
\end{listing}
% // Correct / Any input char
% accuracy = roundToPrecision(C / (TL + IF) * 100, 2);
% KSPC = roundToPrecision(ISL / TL, 5);

For further implementation details on how input was captured or sent to the backend, refer to the code in the online
repository\footnote{\url{https://github.com/qhga/GoTT}}.

To test the usability of the typing test, we asked five individuals to complete multiple typing tests on their own computers. Based on their feedback, we switched to another font to further improve readability and fixed a bug related to the scrolling. All five volunteers reported that the typing test was very intuitive and fun to use.

\item \textbf{The questionnaires} had to be linked to a specific participant, typing test and keyboard. In total, three different types of questionnaires had to be filled out by each participant at different times (more information in Section \ref{sec:methodology}). The demographics questionnaire was completed once at the start of the experiment; this could have been done with existing survey tools and then linked to the participant by hand. The \gls{PTTQ} and the \gls{PKQ}, on the other hand, were required after each individual typing test and after every keyboard, respectively. Because manually matching all completed questionnaires to the corresponding typing tests and keyboards could have led to errors, we decided to implement a survey tool in \gls{GoTT} that performs this task automatically. The \gls{PTTQ} resembled the \gls{KCQ} \cite[56]{iso9241-411}, and the questions for the \gls{PKQ} were taken from the \gls{UEQ-S} \cite{schrepp_ueq_handbook}. All questionnaires can be found in Appendix \ref{app:gott}.

\item \textbf{The text crowdsourcing platform} was required because of the potential introduction of observer bias as described in Section \ref{sec:bias}. Further, this part of \gls{GoTT} helped us gather 44639 characters instead of the estimated 40000 required to provide enough text for ten non-overlapping texts. The goal was reached after only two days, which showed crowdsourcing to be an efficient method of gathering larger amounts of text for our experiment. The estimate of 40000 characters was made according to Eq.
\ref{eq:chars}.
\begin{equation}
  \label{eq:chars}
  n_{kb} \cdot m_{ttkb} \cdot \frac{s}{60} \cdot |w| \cdot wpm_{max} = 5 \cdot 2 \cdot \frac{300}{60} \cdot 5 \cdot 160 = 40000
\end{equation}
with $n_{kb}$ the number of tested keyboards, $m_{ttkb}$ the number of typing tests conducted with each keyboard, $\frac{s}{60}$ the duration of each typing test in minutes (5 min), $|w|$ the number of characters defining a word (Section \ref{sec:meas_perf}) and $wpm_{max}$ the average wpm of the top 100 typists, retrieved from a database released by the website Typeracer\footnote{\url{https://docs.google.com/spreadsheets/d/18ZokmvjdzDypIr-Ayl1VWsRPOBa91qvgX3FgcsZtSAU/edit#gid=636312661}} that included the top 25000 competitors in terms of average \gls{WPM} \cite{typeracer}.

The text snippets provided by volunteers through our platform had to fulfill three requirements:
\begin{enumerate}
\item German language
\item Fairly easy to understand (\gls{FRE} $>$ 70 \cite{flesch_fre})
\item Length between 200 and 300 characters
\end{enumerate}

To communicate what kind of text is appropriate, the platform provided an example that contrasted fairly easy and difficult text. Further, the backend implemented a set of functions that calculated the \gls{FRE} of a submitted text, counted its characters and accepted or rejected the text depending on whether the requirements were met. The implementation of the algorithm that calculates the \gls{FRE} is given in Listing \ref{lst:gott_fre}. The function \textit{countSyllables} uses regex\footnote{\url{https://github.com/google/re2/wiki/Syntax}} matching to count the syllables in a given German string. The hyphenation rules defined by \textit{Duden Online}\footnote{\url{https://www.duden.de/sprachwissen/rechtschreibregeln/worttrennung}} were used to derive the regex patterns that identify syllables \cite{duden_hyphen}.
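
As an illustration of the acceptance check, consider a hypothetical snippet with 40 words, 4 sentences and 60 syllables (these counts are assumed for the example and are not taken from the collected data). Using the formula implemented in \textit{calculateFRE} (Listing \ref{lst:gott_fre}), with $\mathit{ASL}$ the average sentence length in words and $\mathit{ASW}$ the average number of syllables per word, this gives
\[
  \mathit{ASL} = \frac{40}{4} = 10, \qquad \mathit{ASW} = \frac{60}{40} = 1.5
\]
\[
  \mathit{FRE} = 180 - \mathit{ASL} - 58.5 \cdot \mathit{ASW} = 180 - 10 - 87.75 = 82.25
\]
Such a snippet would satisfy the second requirement, since $82.25 > 70$, and would be accepted if it also met the language and length requirements.
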
The \gls{FRE} scores yielded by our function were verified with multiple unit tests and also compared to scores obtained from another website\footnote{\url{https://fleschindex.de/berechnen/}} that offers the calculation for German texts. The \gls{UI} for the crowdsourcing page is shown in Appendix \ref{app:gott}. The gathered text snippets were first checked for typos and grammar using \textit{Duden Mentor}\footnote{\url{https://mentor.duden.de/}}, then randomized and finally aggregated into equally long texts with nearly identical \gls{FRE} scores (mean = 80.10, SD = 0.48).

\begin{listing}[H]
\caption{Algorithm that calculates the \gls{FRE} score for a given string in German language, utilizing regex pattern matching to count syllables, words and sentences.}
\label{lst:gott_fre}
\begin{minted}[linenos,fontsize=\small]{go}
// Standard library packages used by the functions below
import (
    "math"
    "regexp"
    "strings"
)

// countSyllables approximates the number of syllables in a German text.
func countSyllables(txt string) int {
    // The pattern is split into several literals for readability only
    rx := regexp.MustCompile(`(?i)[^aeiouäöüßy\W][aeiouäöüßy]|` +
        `\b[aeiouäöüßy][^aeiouäöüßy\W]|\b[aeiouäöüy]{2,}|` +
        `u[aeuo]|(on|er)\b|\B(a|o|u|e)\B`)
    // Collapse consonant and vowel groups into a single character
    extraConsonants := []string{"ck", "x", "ch", "x", "sch", "x",
        "st", "x", "gn", "x"}
    extraVowels := []string{"äu", "i", "ie", "i"}
    r := strings.NewReplacer(extraConsonants...)
    txt = r.Replace(txt)
    r = strings.NewReplacer(extraVowels...)
    txt = r.Replace(txt)
    syllableCount := len(rx.FindAllStringIndex(txt, -1))
    return syllableCount
}

// countWords counts sequences of at least two word characters.
func countWords(txt string) int {
    rx := regexp.MustCompile(`[\wäöüß]{2,}`)
    return len(rx.FindAllStringIndex(txt, -1))
}

// countSentences counts words directly followed by ?, ., ! or ;
func countSentences(txt string) int {
    rx := regexp.MustCompile(`[\wäöüß]{2,}[\?\.!;]`)
    return len(rx.FindAllStringIndex(txt, -1))
}

func calculateFRE(txt string) float64 {
    syc := countSyllables(txt)
    wc := countWords(txt)
    sec := countSentences(txt)
    // Average Sentence Length = Words / Sentence
    asl := float64(wc) / float64(sec)
    // Average Number of Syllables per Word = Syllables / Words
    asw := float64(syc) / float64(wc)
    fre := math.Round((180.-asl-(58.5*asw))*100) / 100
    // <0 and >100 is allowed, but not relevant in our case
    if fre > 100. {
        fre = 100.
    }
    if fre < 0. {
        fre = 0.
    }
    return fre
}
\end{minted}
\end{listing}
\end{enumerate}

\pagebreak
\subsection{Finger Strength Measurement Device}
\label{sec:force_meas_dev}

\begin{figure}[ht]
  \centering
  \includegraphics[width=0.8\textwidth]{images/force_master_1}
  \caption{Prototype of a measuring device that simulates the distance and finger position required to press different keys on a keyboard. The display shows the currently applied force in grams and the peak force applied throughout the current measurement in grams and \gls{N}}
  \label{fig:force_master}
\end{figure}

Because we required very specific data about the force each digit is able to apply to keyswitches in different locations, we decided to prototype our own measuring device. Following previous research in the field of finger strength and force applied to keyboards, we wanted to use the same type of sensor, a load cell, that was commonly used in those studies \cite{gerard_keyswitch, rempel_ergo, bufton_typingforces}.
A load cell capable of measuring up to 5 kg $\approx$ 49.0 \gls{N}, in combination with the HX711 load cell amplifier shown in Figure \ref{fig:hx711} and the library HX711\_ADC\footnote{\url{https://github.com/olkal/HX711_ADC}}, was used to build the prototype shown in Figure \ref{fig:force_master}. Initial testing revealed that the standard 10 Hz sample rate of the HX711 was too low to capture the peak force in some measurements. Therefore, we resoldered the 0 $\Omega$ surface-mount resistor to raise the sample rate to 80 Hz, which yielded better results for fast keystrokes without deteriorating overall precision compared to the measurements conducted at 10 Hz. The apparatus used an \gls{OLED} display to present the currently applied force in grams and the peak force in grams and \gls{N}. The device was mainly controlled via two terminal commands: one initiated a re-calibration, which was used after each participant or in between measurements, and the other reset all peak values shown on the display. The base of the device featured a scale along which a wrist rest was moved and aligned with the marking corresponding to the currently measured key. Each marking represents the distance and position of a finger relative to the associated key, which is indicated by the label underneath the marking. The measurement process is explained in more detail in Section \ref{sec:meth_force}.

\begin{figure}[ht]
  \centering
  \includegraphics[width=0.5\textwidth]{images/hx711}
  \caption{HX711 amplifier module. The 0 $\Omega$ resistor had to be resoldered to achieve the 80 Hz sample rate.
  This module is used in combination with the HX711\_ADC library to read the changes in resistance of the load cell and convert them into grams.}
  \label{fig:hx711}
\end{figure}

\subsection{Summary}

By implementing our own typing test platform (\gls{GoTT}), we maximized control over one of the main measurement tools required for our experiment. We were able to define exactly all functions responsible for collecting the metrics according to our research in Section \ref{sec:meas_perf}. The crowdsourcing tool allowed us to gather a large amount of text in very little time without introducing observer bias, and the integration of questionnaires into \gls{GoTT} removed the need to match questionnaires to typing tests by hand. Both potentially improved the reliability of the results acquired in our experiment. Further, the device we built to measure the peak force each finger can produce while pressing certain keys on a keyboard allowed us to base the design of our keyboard with non-uniform actuation forces on more than anecdotal evidence. The exact procedure of our preliminary experiment on peak force is described in the following section.