|
|
|
@ -1,7 +1,7 @@
|
|
|
|
|
% A rapid method that creates many corrected errors, has efficient error correction, and leaves
|
|
|
|
|
% few uncorrected errors can still be considered a successful method, since it produces
|
|
|
|
|
% accurate text in relatively little time. pp. 56 MacKenzie
|
|
|
|
|
\section{Results of the Main User Study}
|
|
|
|
|
\label{sec:results}
|
|
|
|
|
This section addresses the statistical analysis of the data obtained throughout
|
|
|
|
|
the main, within-subject, user study (n = 24) that consisted of five repeated
|
|
|
|
@ -141,7 +141,17 @@ significant differences between \textit{Aphrodite} (M = 51.5, sd = 14.0) and
|
|
|
|
|
3.53, p = 0.01). Further, the \gls{rmANOVA} for \gls{AdjWPM} yielded (F(3, 69) =
|
|
|
|
|
6.197, p = 0.0009) and for \gls{KSPS} (F(3, 69) = 3.566, p = 0.018). All
|
|
|
|
|
relevant results of the post-hoc tests and the summary of the performance data
|
|
|
|
|
can be observed in Tables \ref{tbl:sum_tkbs_speed} and
|
|
|
|
|
\ref{tbl:res_tkbs_speed}. We further examined which of the four test keyboards
|
|
|
|
|
was the fastest for each participant and found that \textit{Hera} was the
|
|
|
|
|
fastest keyboard in terms of \gls{WPM} for 46\% (11) of the twenty-four
|
|
|
|
|
subjects. Additionally, we analyzed the \gls{WPM} percentage of \textit{Own}
|
|
|
|
|
(\gls{OPC}) for all test keyboards to determine which keyboards exceeded the
|
|
|
|
|
performance of the participant's own keyboard. We found that three subjects
|
|
|
|
|
reached \gls{OPC}\_\gls{WPM} values greater than 100\% with all four test
|
|
|
|
|
keyboards. Also, \textit{Athena}, \textit{Aphrodite} and \textit{Hera} exceeded 100\% of
|
|
|
|
|
\gls{OPC}\_\gls{WPM} eight, seven and six times, respectively. Detailed results
|
|
|
|
|
are presented in Figure \ref{fig:max_opc_wpm}.
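
As a reading aid, and assuming that \gls{OPC} denotes the percentage of the value
reached with \textit{Own} (the symbols $\mathrm{WPM}_{k}$ and
$\mathrm{WPM}_{\mathrm{Own}}$ are introduced here for illustration only), the
metric for a test keyboard $k$ can be written as
\[
  \mathrm{OPC}_{\mathrm{WPM}}(k) = 100\,\% \cdot
  \frac{\mathrm{WPM}_{k}}{\mathrm{WPM}_{\mathrm{Own}}} .
\]
Under this assumption, a hypothetical participant who types 55 \gls{WPM} with a
test keyboard and 50 \gls{WPM} with \textit{Own} would reach
$100\,\% \cdot 55 / 50 = 110\,\%$, that is, the test keyboard would exceed the
participant's own keyboard.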
|
|
|
|
|
|
|
|
|
|
\begin{table}[H]
|
|
|
|
|
\centering
|
|
|
|
@ -215,6 +225,15 @@ can be observed in Tables \ref{tbl:sum_tkbs_speed} and \ref{tbl:res_tkbs_speed}.
|
|
|
|
|
\label{tbl:res_tkbs_speed}
|
|
|
|
|
\end{table}
|
|
|
|
|
|
|
|
|
|
\begin{figure}[H]
|
|
|
|
|
\centering
|
|
|
|
|
\includegraphics[width=1.0\textwidth]{images/max_opc_wpm}
|
|
|
|
|
\caption{The left graph shows the fastest keyboard in terms of \gls{WPM} for
|
|
|
|
|
  each participant. The right graph shows which keyboards were even faster
|
|
|
|
|
  than the participant's own keyboard (\gls{OPC}\_\gls{WPM} $>$ 100\%)}
|
|
|
|
|
\label{fig:max_opc_wpm}
|
|
|
|
|
\end{figure}
|
|
|
|
|
|
|
|
|
|
\subsubsection{Error Rate}
|
|
|
|
|
\label{sec:res_error_rate}
|
|
|
|
|
\gls{GoTT} also automatically tracked various error-related metrics from which
|
|
|
|
@ -230,7 +249,15 @@ Test for \gls{UER} ($\chi^2$(3) = 2.59, p = 0.46) yielded no statistical
|
|
|
|
|
significant difference. It should be noted that the 90th percentile of
|
|
|
|
|
\gls{UER} for all keyboards was still below 1\%. Summaries for the individual
|
|
|
|
|
metrics and results for all post-hoc tests can be seen in Tables
|
|
|
|
|
\ref{tbl:sum_tkbs_err} and \ref{tbl:res_tkbs_err}. Furthermore, we compared the
|
|
|
|
|
\gls{TER} of all test keyboards for each participant and found that
|
|
|
|
|
\textit{Athena} was the keyboard which participants typed most accurately
|
|
|
|
|
with. Two participants scored identical \gls{TER}s with two test keyboards;
|
|
|
|
|
therefore the total number of ``1st-placed'' keyboards increased to twenty-six.
|
|
|
|
|
Lastly, we compared the test keyboards to the subjects' own keyboards and found
|
|
|
|
|
that eleven participants scored lower \gls{TER}s with \textit{Athena} compared
|
|
|
|
|
to \textit{Own} (\gls{OPC}). All data can be observed in Figure
|
|
|
|
|
\ref{fig:max_opc_ter}.
|
|
|
|
|
|
|
|
|
|
\begin{table}[H]
|
|
|
|
|
\centering
|
|
|
|
@ -309,7 +336,16 @@ metrics and results for all post-hoc tests can be seen in Table
|
|
|
|
|
\label{tbl:res_tkbs_err}
|
|
|
|
|
\end{table}
|
|
|
|
|
|
|
|
|
|
\begin{figure}[H]
|
|
|
|
|
\centering
|
|
|
|
|
\includegraphics[width=1.0\textwidth]{images/max_opc_ter}
|
|
|
|
|
\caption{The left graph shows the keyboard with the lowest \gls{TER} for each
|
|
|
|
|
  participant. The right graph shows which keyboards were more accurate than
|
|
|
|
|
  the participant's own keyboard (\gls{OPC}\_\gls{TER} $<$ 100\%)}
|
|
|
|
|
\label{fig:max_opc_ter}
|
|
|
|
|
\end{figure}
|
|
|
|
|
|
|
|
|
|
\subsection{Muscle Activity Measurements}
|
|
|
|
|
\label{sec:res_muscle_activity}
|
|
|
|
|
We utilized the \gls{EMG} device described in Section \ref{sec:main_design} to
|
|
|
|
|
gather data about the muscle activities (\% of \glsfirst{MVC}) during typing
|
|
|
|
@ -327,7 +363,23 @@ using either dependent T-tests or Wilcoxon Signed Rank Tests. There were no
|
|
|
|
|
statistically significant differences in \%\gls{MVC} between the first and the
|
|
|
|
|
second typing test for any keyboard/muscle combination. The summaries for all
|
|
|
|
|
test keyboards of the mean values for both typing tests combined can be observed
|
|
|
|
|
in Table \ref{tbl:sum_tkbs_emg}. Lastly, we created histograms (Figure
|
|
|
|
|
\ref{fig:max_emg_tkbs}) for each of the observed muscle groups that show the
|
|
|
|
|
number of times a keyboard yielded the highest \%\gls{MVC} out of all keyboards
|
|
|
|
|
for each participant. We found that \textit{Athena} most frequently ($\sim$45\%)
|
|
|
|
|
produced the highest extensor muscle activity for both arms. The highest muscle
|
|
|
|
|
activity for both flexor muscle groups was evenly distributed among all test
|
|
|
|
|
keyboards with a slight exception of \textit{Nyx}, which produced the highest
|
|
|
|
|
\%\gls{MVC} only in $\sim$14\% of participants.
|
|
|
|
|
|
|
|
|
|
\begin{figure}[H]
|
|
|
|
|
\centering
|
|
|
|
|
\includegraphics[width=1.0\textwidth]{images/max_emg_tkbs}
|
|
|
|
|
\caption{Histograms for all \gls{EMG} measurements that show the keyboard with
|
|
|
|
|
the highest mean \% of \glsfirst{MVC} out of all four keyboards for each
|
|
|
|
|
participant}
|
|
|
|
|
\label{fig:max_emg_tkbs}
|
|
|
|
|
\end{figure}
|
|
|
|
|
|
|
|
|
|
\begin{table}[H]
|
|
|
|
|
\centering
|
|
|
|
@ -461,3 +513,173 @@ Table \ref{tbl:res_kcq}.
|
|
|
|
|
\end{table}
|
|
|
|
|
\subsubsection{User Experience Questionnaire (Short)}
|
|
|
|
|
\label{sec:res_ueqs}
|
|
|
|
|
In addition to the \gls{KCQ}, we utilized the \glsfirst{UEQ-S}. It featured
|
|
|
|
|
eight questions on a 7-point Likert scale, which formed two scales (pragmatic,
|
|
|
|
|
hedonic). Additionally, we added one extra question that could be answered on a
|
|
|
|
|
\glsfirst{VAS} from 0 to 100. The survey was filled out after both tests with a
|
|
|
|
|
keyboard had been completed. The questions of our modified \gls{UEQ-S} were
|
|
|
|
|
labelled as follows:
|
|
|
|
|
|
|
|
|
|
\begin{table}[H]
|
|
|
|
|
\centering
|
|
|
|
|
\ra{0.8}
|
|
|
|
|
\small
|
|
|
|
|
\begin{tabular}{llll}
|
|
|
|
|
\multicolumn{2}{c}{Pragmatic Scale} & \multicolumn{2}{c}{Hedonic Scale} \\
|
|
|
|
|
\\
|
|
|
|
|
\textbf{PRA1:} & \textit{``Obstructive or Supportive?''} & \textbf{HED1:} & \textit{``Boring or Exciting?''} \\
|
|
|
|
|
\textbf{PRA2:} & \textit{``Complicated or Easy?''} & \textbf{HED2:} & \textit{``Not interesting or Interesting?''} \\
|
|
|
|
|
\textbf{PRA3:} & \textit{``Inefficient or Efficient?''} & \textbf{HED3:} & \textit{``Conventional or Inventive?''} \\
|
|
|
|
|
\textbf{PRA4:} & \textit{``Confusing or Clear?''} & \textbf{HED4:} & \textit{``Usual or Leading Edge?''} \\
|
|
|
|
|
\\
|
|
|
|
|
\multicolumn{4}{c}{Additional Question (\gls{VAS})} \\
|
|
|
|
|
\\
|
|
|
|
|
\textbf{SATI:} & \multicolumn{3}{l}{\textit{``How satisfied have you been with this keyboard?''}}
|
|
|
|
|
\end{tabular}
|
|
|
|
|
\end{table}
|
|
|
|
|
|
|
|
|
|
The 7-point Likert scale items (PRA1-4, HED1-4) were then transformed to
|
|
|
|
|
represent a scale from -3 to +3, where -3 represented the left term and +3 the
|
|
|
|
|
right term of the ``or'' questions. Both sub-scales, pragmatic ($\alpha$ =
|
|
|
|
|
0.90)\footnote{PRA: Athena ($\alpha$ = 0.83), Aphrodite ($\alpha$ = 0.95), Nyx
|
|
|
|
|
($\alpha$ = 0.90), Hera ($\alpha$ = 0.85)} and hedonic ($\alpha$ =
|
|
|
|
|
0.88)\footnote{HED: Athena ($\alpha$ = 0.89), Aphrodite ($\alpha$ = 0.89), Nyx
|
|
|
|
|
($\alpha$ = 0.91), Hera ($\alpha$ = 0.90)}, exceeded the recommended threshold
|
|
|
|
|
for Cronbach's alpha of $\alpha > 0.7$ \cite{schrepp_ueq_handbook}. The mean
|
|
|
|
|
values for all responses of the \gls{UEQ-S} can be seen in Figure
|
|
|
|
|
\ref{fig:ueq_tkbs_res} and the individual responses to the additional question
|
|
|
|
|
(SATI) are presented in Figure \ref{fig:sati_tkbs_res}. We conducted
|
|
|
|
|
\gls{rmANOVA}s for both sub-scales but found no statistically significant
|
|
|
|
|
variations for the pragmatic scale (F(3, 69) = 3.254, p = 0.06, post-hoc did not
|
|
|
|
|
reveal any tendencies) nor the hedonic scale (F(3, 69) = 0.425, p =
|
|
|
|
|
0.74). In contrast, the \gls{rmANOVA} for the additional question \textit{SATI}
|
|
|
|
|
indicated statistically significant differences (F(3, 69) = 3.254, p =
|
|
|
|
|
0.027). In this case, we decided to use Wilcoxon Signed Rank Tests for our
|
|
|
|
|
post-hoc analysis because of our interest in the difference of medians and the
|
|
|
|
|
relatively high power of this test in analyzing \gls{VAS} data
|
|
|
|
|
\cite{heller_vas}. The results and summaries for the test keyboards can be
|
|
|
|
|
observed in Tables \ref{tbl:res_tkbs_sati} and \ref{tbl:sum_tkbs_sati}.
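
The transformation and the reliability measure used above can be sketched as
follows, assuming that the raw Likert responses were coded from 1 to 7 (the
coding is not stated here, and the symbols $x$, $x'$, $k$, $\sigma^2_{i}$ and
$\sigma^2_{T}$ are introduced for illustration only). A raw response $x$ is
mapped to the range from -3 to +3 by $x' = x - 4$, and Cronbach's alpha for a
sub-scale with $k$ items is commonly computed as
\[
  \alpha = \frac{k}{k - 1}
  \left( 1 - \frac{\sum_{i=1}^{k} \sigma^2_{i}}{\sigma^2_{T}} \right),
\]
where $\sigma^2_{i}$ is the variance of item $i$ and $\sigma^2_{T}$ is the
variance of the summed sub-scale score. For both sub-scales of the
\gls{UEQ-S}, $k = 4$ (PRA1-4 and HED1-4).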
|
|
|
|
|
|
|
|
|
|
\begin{figure}[H]
|
|
|
|
|
\centering
|
|
|
|
|
\includegraphics[width=1.0\textwidth]{images/ueq_tkbs_res}
|
|
|
|
|
\caption{Means of the responses for all questions of the \glsfirst{UEQ-S}}
|
|
|
|
|
\label{fig:ueq_tkbs_res}
|
|
|
|
|
\end{figure}
|
|
|
|
|
|
|
|
|
|
\begin{table}[H]
|
|
|
|
|
\centering
|
|
|
|
|
\small
|
|
|
|
|
\ra{1.3}
|
|
|
|
|
\begin{tabular}{?l^l^l^l^l^l^l^l}
|
|
|
|
|
\toprule
|
|
|
|
|
\rowstyle{\itshape}
|
|
|
|
|
Y & Comparison & Statistic & p & Estimate & CI & Hypothesis \\
|
|
|
|
|
\midrule
|
|
|
|
|
      \multicolumn{6}{l}{\textbf{Non-parametric (Wilcoxon Signed Rank Test)}} \\
|
|
|
|
|
      SATI & Aphrodite - Nyx & 217.0 & 0.046$^*$ & 14.0 & [5, Inf] & greater \\
|
|
|
|
|
      SATI & Aphrodite - Athena & 201.5 & 0.046$^*$ & 12.5 & [4.5, Inf] & greater \\
|
|
|
|
|
SATI & Nyx - Athena & 125.5 & 1.0 & -3.0 & [-11.5, Inf] & greater \\
|
|
|
|
|
SATI & Hera - Athena & 205.5 & 0.174 & 8.5 & [0, Inf] & greater \\
|
|
|
|
|
SATI & Hera - Aphrodite & 118.5 & 1.0 & -2.5 & [-12.5, Inf] & greater \\
|
|
|
|
|
      SATI & Hera - Nyx & 223.5 & 0.074$^\dagger$ & 12.5 & [2.5, Inf] & greater \\
|
|
|
|
|
\bottomrule
|
|
|
|
|
\end{tabular}
|
|
|
|
|
\caption{Post-hoc tests for the additional question \textit{``How satisfied
|
|
|
|
|
have you been with this keyboard?''}. Statistically significant
|
|
|
|
|
    differences ($p < 0.05$) are marked with an * and p values indicating a trend
|
|
|
|
|
towards significance are denoted with $\dagger$. Confidence intervals are
|
|
|
|
|
given for the difference of the location parameter. We only tested keyboards
|
|
|
|
|
with lower actuation force against keyboards with higher actuation
|
|
|
|
|
    force. The first comparison of Aphrodite (50 g) and Nyx (35 g) was added
|
|
|
|
|
because of the noticeable differences in the visual assessment of Figure
|
|
|
|
|
\ref{fig:sati_tkbs_res}}
|
|
|
|
|
\label{tbl:res_tkbs_sati}
|
|
|
|
|
\end{table}
|
|
|
|
|
|
|
|
|
|
\begin{table}[H]
|
|
|
|
|
\centering
|
|
|
|
|
\small
|
|
|
|
|
\ra{1.3}
|
|
|
|
|
\begin{tabular}{?r^l^l^l^l^l^l^l}
|
|
|
|
|
\toprule
|
|
|
|
|
\rowstyle{\itshape}
|
|
|
|
|
Pseud. & Mean & Median & Min & Max & SD & SE \\
|
|
|
|
|
\midrule
|
|
|
|
|
Athena & 54.12 & 50.00 & 1.00 & 95.00 & 25.43 & 5.19 \\
|
|
|
|
|
Aphrodite & 65.08 & 71.50 & 10.00 & 94.00 & 22.56 & 4.61 \\
|
|
|
|
|
Nyx & 51.42 & 55.00 & 0.00 & 90.00 & 23.40 & 4.78 \\
|
|
|
|
|
Hera & 63.29 & 70.00 & 12.00 & 92.00 & 19.95 & 4.07 \\
|
|
|
|
|
\bottomrule
|
|
|
|
|
\end{tabular}
|
|
|
|
|
\caption{Summaries for the additional question \textit{``How satisfied have
|
|
|
|
|
you been with this keyboard?''} for all four test keyboards}
|
|
|
|
|
\label{tbl:sum_tkbs_sati}
|
|
|
|
|
\end{table}
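
The \textit{SE} column of Table \ref{tbl:sum_tkbs_sati} appears to follow the
usual standard error of the mean for the $n = 24$ participants,
$\mathit{SE} = \mathit{SD} / \sqrt{n}$. For \textit{Athena}, for example,
\[
  \mathit{SE} = \frac{25.43}{\sqrt{24}} \approx 5.19 ,
\]
which matches the tabulated value.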
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
\begin{figure}[H]
|
|
|
|
|
\centering
|
|
|
|
|
\includegraphics[width=1.0\textwidth]{images/sati_tkbs_res}
|
|
|
|
|
\caption{Responses for the additional question \textit{``How satisfied have
|
|
|
|
|
    you been with this keyboard?''} with the means for all participants
|
|
|
|
|
represented as horizontal lines}
|
|
|
|
|
\label{fig:sati_tkbs_res}
|
|
|
|
|
\end{figure}
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
\subsection{UX Curves and Semi-Structured Interviews}
|
|
|
|
|
\label{sec:res_uxc}
|
|
|
|
|
In order to give all participants the chance to recapitulate the whole
|
|
|
|
|
experiment and give retrospective feedback about each individual keyboard, we
|
|
|
|
|
conducted a semi-structured interview which included drawing UX-curves for
|
|
|
|
|
perceived fatigue and perceived typing speed. We evaluated the curves by
|
|
|
|
|
measuring the y position of the \gls{SP} for a curve and the y position of the
|
|
|
|
|
respective \gls{EP} to determine the slope of that curve. Slopes are defined as
|
|
|
|
|
improving if \gls{SP} $<$ \gls{EP}, deteriorating if \gls{SP} $>$ \gls{EP} and
|
|
|
|
|
stable if \gls{SP} = \gls{EP} (margin of $\pm$ 1 mm). One curve can either
|
|
|
|
|
represent one typing test (C1 or C2) or the whole experience with one keyboard
|
|
|
|
|
over the course of both typing tests (C12). All curves can be observed in
|
|
|
|
|
Appendix \ref{app:uxc} and the resulting slopes for all curve types are shown in
|
|
|
|
|
Figure \ref{fig:res_uxc}. During the semi-structured interview, we asked the
|
|
|
|
|
participants to rank the keyboards from 1 (favorite) to 5 (least favorite). If
|
|
|
|
|
in doubt, participants were allowed to place two keyboards on the same
|
|
|
|
|
rank. Further, we asked some participants (n = 19) to also rank the keyboards
|
|
|
|
|
from lowest actuation force (1) to highest actuation force (5). The
|
|
|
|
|
participants' own keyboard was placed first four times more often than any other
|
|
|
|
|
keyboard. \textit{Hera} was the only keyboard that was never placed fifth and,
|
|
|
|
|
except for \textit{Own}, was the most represented keyboard in the top three. The
|
|
|
|
|
ranking of the perceived actuation force revealed that participants were able
|
|
|
|
|
to identify \textit{Nyx} (35 g) and \textit{Athena} (80 g) as the keyboards with
|
|
|
|
|
the lowest and highest actuation force, respectively. All results for both
|
|
|
|
|
rankings are visualized in Figure \ref{fig:res_interview}. Lastly, we analyzed
|
|
|
|
|
the recordings of all interviews and found several similar statements about
|
|
|
|
|
specific keyboards. Twelve participants noted that because of the new form
|
|
|
|
|
factor of the test keyboards, additional familiarization was required to feel
|
|
|
|
|
comfortable. Nine of those specifically mentioned the height of the keyboard as
|
|
|
|
|
the main difference. Fourteen subjects made statements such as \textit{``Because Nyx had such a
|
|
|
|
|
low resistance, I kept making mistakes!''}. Four participants explicitly
|
|
|
|
|
noted that \textit{Hera} felt very pleasant and two subjects mentioned
|
|
|
|
|
\textit{``I had really good flow.''} and \textit{``It somehow just felt
|
|
|
|
|
right''}. Ten participants reported that typing on \textit{Athena} was
|
|
|
|
|
exhausting. \textit{Aphrodite} was not mentioned as often as the other keyboards,
|
|
|
|
|
which could be related to a comment of two subjects: \textit{``It felt very
|
|
|
|
|
similar to my own keyboard''}.
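
The slope classification of the UX-curves described above can be summarized as
follows. This is a sketch under the assumption that the margin of $\pm$ 1 mm is
applied to the difference $d = y_{\mathrm{EP}} - y_{\mathrm{SP}}$ of the
measured y positions (in mm); the symbols $d$, $y_{\mathrm{SP}}$ and
$y_{\mathrm{EP}}$ are introduced for illustration only:
\[
  \mathrm{slope} = \left\{
  \begin{array}{ll}
    \mbox{improving}     & \mbox{if } d > 1 \\
    \mbox{stable}        & \mbox{if } |d| \leq 1 \\
    \mbox{deteriorating} & \mbox{if } d < -1
  \end{array}
  \right.
\]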
|
|
|
|
|
|
|
|
|
|
\begin{figure}[H]
|
|
|
|
|
\centering
|
|
|
|
|
\includegraphics[width=1.0\textwidth]{images/res_uxc}
|
|
|
|
|
\caption{\centering Evaluation of UX-curve slopes for perceived fatigue and perceived
|
|
|
|
|
speed. \\
|
|
|
|
|
\textit{DE:} deteriorating, \textit{IM:} improving, \textit{ST:} stable}
|
|
|
|
|
\label{fig:res_uxc}
|
|
|
|
|
\end{figure}
|
|
|
|
|
|
|
|
|
|
\begin{figure}[H]
|
|
|
|
|
\centering
|
|
|
|
|
\includegraphics[width=1.0\textwidth]{images/res_interview}
|
|
|
|
|
\caption{Rankings for favorite keyboard and perceived required actuation force
|
|
|
|
|
for all keyboards including \textit{Own}. The graphs show the number of
|
|
|
|
|
times a keyboard was placed at a certain rank}
|
|
|
|
|
\label{fig:res_interview}
|
|
|
|
|
\end{figure}
|