Time of Initiating Correct Categorization
For Mouse-Tracking Data
See the full paper on this work: March, D. S., & Gaertner, L. (2021). A Method for Estimating the Time of Initiating Correct Categorization in Mouse-Tracking. Behavior Research Methods, 53, 2439-2449.
Please see this link for SAS and R code, a tutorial document, and a sample dataset.
Mouse-tracking is a process-tracing method that indexes cognitive processes online and in real time. The temporal precision of mouse-tracking facilitates exploration of the mental processes underlying decision making (e.g., categorization). Imagine a computer displaying a stimulus (e.g., an image of an emotionally neutral man) at the bottom-center of the screen and two response labels at the screen’s top-left and top-right corners: one label, the target, describes the stimulus (e.g., Calm) and the other label, the distractor, does not (e.g., Dangerous). Upon stimulus presentation, the cognitive system begins to gradually accrue evidence in favor of one category versus another (i.e., response competition). Mouse-tracking records the x- and y-coordinates of the mouse cursor over time as the participant moves the mouse from the bottom-center of the screen to the target. As the cognitive system works to settle on a decision, response competition manifests in the motor movements of the hand, bringing the mouse relatively closer to one alternative versus the other. For example, effects of stimulus typicality on judgment yield mouse movement toward “Fish” before arriving at “Mammal” when categorizing a whale (Dale et al., 2007), and toward “Male” before arriving at “Female” when categorizing a masculine woman (Freeman et al., 2008).
As a process-tracing method, mouse-tracking (like other process-tracing methods, e.g., eye-tracking and electroencephalography) allows the observation of cognitive dynamics in action and yields temporal information about the development of a response. In mouse-tracking studies, the process being traced is often stimulus categorization (i.e., decision-making, judgment), and information about when categorization begins is highly relevant. Many existing mouse-tracking metrics provide insight into decision-making processes by indexing the shape or complexity of the mouse trajectory, or how long it took to complete a trial. Lacking, however, is a metric that estimates the point in time when a participant begins to correctly categorize a stimulus with regard to the target label, that is, the time when the cognitive system begins to settle on the correct decision to a degree sufficient to result in motor movement relatively closer to the target. This page describes a method that fills this gap by introducing a metric referred to as the time of initiating correct categorization (TICC).
Estimating the TICC
Estimating the TICC begins by averaging all trajectories for a given stimulus for a given participant. Estimation can also be done at the trial level, but an average trajectory is typically more stable. We recommend using raw-time trajectories to maintain the fidelity of the x-, y-coordinates in time (but normed-time trajectories could be used). For a given trajectory (average or single trial), the Euclidean distance (or proximity) to the target and to the distractor is calculated at each time point.
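As a rough illustration of the averaging step, the sketch below resamples each raw-time trial onto a shared time grid and averages the coordinates. The function name `average_trajectory`, the 10 ms grid step, and the choice to hold each trial at its final (click) position once it has ended are all assumptions made for this example, not part of the published procedure or the linked SAS and R code.

```python
import numpy as np

def average_trajectory(trials, step_ms=10.0):
    """Average raw-time trials on a shared time grid (illustrative sketch).

    trials: list of (t, x, y) tuples of 1-D arrays, with t in ms from
    stimulus onset and strictly increasing within each trial.
    Assumption: a trial that ends before the longest one is held at its
    final position, which is how np.interp treats points past the last sample.
    """
    end = max(t[-1] for t, _, _ in trials)
    grid = np.arange(0.0, end + step_ms / 2, step_ms)
    x_mean = np.mean([np.interp(grid, t, x) for t, x, _ in trials], axis=0)
    y_mean = np.mean([np.interp(grid, t, y) for t, _, y in trials], axis=0)
    return grid, x_mean, y_mean
```

Other conventions (for example, truncating at the shortest trial) are equally possible; the point is only that raw-time trials of different durations need a common time axis before they can be averaged.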
Euclidean distance (or proximity) is preferable to vertical or horizontal distance alone. Vertical movement brings the trajectory closer to both the target and the distractor, and a given horizontal movement lower on the screen leaves the cursor further from the target (and distractor) than does the same horizontal movement higher on the screen. Euclidean distance simultaneously accounts for vertical and horizontal distance:
$$d_i = \sqrt{(x_i - x_f)^2 + (y_i - y_f)^2}$$
In the formula, $d_i$ is the distance at a given time, $x_i$ and $y_i$ are the horizontal and vertical locations at a given time, and $x_f$ and $y_f$ are the horizontal and vertical locations at the final time (i.e., the location when the participant clicked the target or, for distance from the distractor, the corresponding location in the distractor).
Next, a difference is calculated at each time point by subtracting the Euclidean distance to the target (taken here as the last x-, y-coordinate) from the Euclidean distance to the distractor (these differences are what Spivey et al. assess with t-tests). Plotting the difference over time reveals a sigmoid curve, as displayed in the figure just below. The flat part of the curve early in time is vertical movement bringing the mouse equally close to the target and distractor. The steep (exponential) part of the curve is movement relatively closer to the target and further from the distractor, which flattens later in time as the mouse reaches the target.
A sigmoid over time occurs for many phenomena, such as bacterial growth. The phases at the top of the above figure are what bacteriologists refer to as the lag phase, in which growth is dormant; the exponential phase, in which growth multiplies; and the stationary phase, in which growth has maximized. Lambda (λ) on the time axis is, for bacteriologists, the time when bacteria transition from dormancy to exponential growth. For the plotted difference in Euclidean distances, λ is the time when the mouse trajectory begins moving increasingly closer to the target than to the distractor; that is, λ is the TICC.
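The distance-difference curve described above can be sketched in a few lines. This is a minimal example, assuming the target location is the final (click) coordinate and the distractor location is its mirror image about a vertical midline at x = 0 (as in a layout where the start position is horizontally centered); `distance_difference` is a hypothetical helper name.

```python
import numpy as np

def distance_difference(x, y, target_xy, distractor_xy):
    """Distance to the distractor minus distance to the target, per time point."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    d_target = np.hypot(x - target_xy[0], y - target_xy[1])
    d_distractor = np.hypot(x - distractor_xy[0], y - distractor_xy[1])
    return d_distractor - d_target

# Toy trajectory: straight up, then right to the target at (1, 1).
x = [0.0, 0.0, 1.0]
y = [0.0, 1.0, 1.0]
target = (x[-1], y[-1])            # last recorded coordinate
distractor = (-x[-1], y[-1])       # assumed mirror image about x = 0
diff = distance_difference(x, y, target, distractor)
```

In this toy case the difference stays at zero during the purely vertical movement and rises only once the cursor moves horizontally toward the target, which is the flat-then-rising pattern the sigmoid captures.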
A number of nonlinear models estimate λ and the other parameters of the sigmoid. Two of those models, Gompertz (Borglin et al., 2012) and Baranyi (Baty & Delignette-Muller, 2004), are highly reliable and accurate (Baty & Delignette-Muller, 2004):
In terms of the mouse trajectory, $y_t$ is the difference in Euclidean distances at a given time ($t$), $y_{\min}$ is the lower asymptote of the difference, $y_{\max}$ is the upper asymptote of the difference, $\mu_m$ is the maximum growth rate, $e$ is a mathematical constant ≈ 2.718 (i.e., Euler’s number), and λ is the parameter of interest, that is, the TICC: the time when the trajectory begins moving increasingly closer to the target than to the distractor.
Software capable of nonlinear regression can estimate the parameters of the Gompertz and Baranyi models from each participant’s Euclidean-distance difference at each time point. I typically use Proc NLIN in SAS (see the code at the link at the beginning of this entry), which requires starting values to facilitate the iterative estimation of model parameters. Starting values can be roughly guessed by eyeballing a plot of the average curve over time (i.e., averaging across all curves). Proc NLIN also allows boundaries on parameter estimates. For example, λ (i.e., TICC) can be restricted to be no lower than 0 (i.e., categorization cannot begin before the trial does) and no greater than the maximum trial duration, and $y_{\max}$ can be restricted to lie between the lowest and highest possible difference in Euclidean distances (e.g., 0 and 2 if using MouseTracker; Freeman & Ambady, 2010). Although boundaries are not necessary, in our experience (detailed in the next section) they yield higher rates of model convergence (exceeding 97%). Maximizing convergence is desirable because nonconvergence yields a missing TICC estimate. Again in our experience, the Gompertz and Baranyi models fit the data very well, with average pseudo-R² exceeding .94 (we discuss how well TICC estimation works for different trajectory shapes in the “Applicability of the TICC to Differently Shaped Trajectories” section). With no reason to prefer the TICC from the Gompertz versus the Baranyi model, we average them. Hence, this procedure provides a person-specific (or trial-specific) estimate of the time at which correct categorization was initiated for a given stimulus. That estimate can be used as a predictor or an outcome. Voilà!
Below you can see the process from raw x-, y-coordinates in the leftmost figure, to the Euclidean-distance difference sigmoid in the middle figure, and finally to the predicted sigmoid, with a focus on the TICC estimates, in the rightmost figure. Note that the different line styles represent different stimulus conditions in this example:
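For readers without SAS, the bounded estimation described above can be approximated with SciPy's `curve_fit`. The sketch below is not the linked SAS or R code. It assumes the commonly used lag-time reparameterization of the Gompertz curve, takes pseudo-R² to be one minus the ratio of residual to total sum of squares, and uses hypothetical names (`gompertz`, `fit_ticc`) and illustrative bounds. Only the Gompertz model is shown; a Baranyi fit would follow the same pattern, with the two λ estimates then averaged.

```python
import numpy as np
from scipy.optimize import curve_fit

def gompertz(t, y_min, y_max, mu_m, lam):
    """Reparameterized Gompertz curve (assumed form); lam is the lag time, i.e., the TICC."""
    span = y_max - y_min
    return y_min + span * np.exp(-np.exp(mu_m * np.e * (lam - t) / span + 1.0))

def fit_ticc(t, diff, start, max_time, max_diff=2.0):
    """Fit the curve with bounded parameters.

    Returns (params, pseudo_r2), or None when estimation fails, in which
    case the TICC for that trajectory is treated as missing.
    """
    t = np.asarray(t, dtype=float)
    diff = np.asarray(diff, dtype=float)
    # Bounds in the order y_min, y_max, mu_m, lam (illustrative values):
    # lam is kept between 0 and the maximum trial duration.
    lower = [-max_diff, 0.0, 0.0, 0.0]
    upper = [max_diff, max_diff, np.inf, max_time]
    try:
        params, _ = curve_fit(gompertz, t, diff, p0=start, bounds=(lower, upper))
    except (RuntimeError, ValueError):
        return None
    residual = diff - gompertz(t, *params)
    # Pseudo-R^2 taken here as 1 - SS_residual / SS_total (an assumed definition).
    pseudo_r2 = 1.0 - np.sum(residual**2) / np.sum((diff - diff.mean()) ** 2)
    return params, pseudo_r2

# Synthetic check: a noise-free curve generated with a lag of 400 ms.
t = np.arange(0.0, 1500.0, 10.0)
diff = gompertz(t, 0.0, 1.8, 0.005, 400.0)
result = fit_ticc(t, diff, start=[0.0, 1.7, 0.004, 350.0], max_time=1500.0)
```

The starting values play the same role as those supplied to Proc NLIN: rough guesses read off a plot of the average curve. Returning `None` on failure mirrors the point that nonconvergence leaves the TICC missing for that trajectory.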
If you have any questions about this procedure, feel free to email me at: firstname.lastname@example.org.
Applicability of the TICC to Differently Shaped Trajectories
You may be wondering how well the method works on differently shaped trajectories. This section addresses that issue.
The Gompertz and Baranyi models estimate bacterial growth in a sigmoidal curve where growth (y) increases with time (x). We adopted those models to estimate TICC in a sigmoid that occurs in the distribution of the difference in Euclidean distance to the target versus distractor (y) over time (x). It might be assumed that application of those formulas requires an idealized mouse trajectory in which initial movement is vertical and equidistant from target and distractor and then monotonically shifts towards the target. However, such an idealized trajectory is not necessary, and the models fit many trajectories.
To gauge how the models fit less-than-ideal trajectories (in terms of generating a sigmoid), we explored three types of trajectories that commonly occur in the mouse-tracking literature (Kieslich et al., 2020; Schoeman, 2020; Wulff et al., 2019). One possibility (below figure, top panel) is for the participant to move relatively closer to the target than the distractor from movement onset onwards (i.e., resulting in a relatively straight line, or what Wulff et al. label as trajectory type 2). In this instance, the difference in Euclidean distance over time is a flatter sigmoid with a very early TICC estimate (275 ms) and a pseudo-R² of .99. (Keep in mind that the time axis in the figure is stimulus-locked, not response-locked.) Another possibility (below figure, middle panel) is that the participant might first move closer to the distractor, change their mind, and move to the target (i.e., resulting in a 7-shape, or what Wulff et al. label as trajectory type 5). In this instance, there is only one initial turn toward the correct target, and TICC is estimated to occur later in time (494 ms) than in the previous pattern, with a pseudo-R² of .90 and a noticeably lower estimate of $y_{\min}$. A third possibility (below figure, bottom panel) is that the participant might move toward the target, turn to the distractor, and turn back to the target (i.e., a multi-turn trial, resulting in a messy 7-shape, not labeled by Wulff et al.). In this instance, TICC is estimated to occur even later in time (740 ms) than in the two previous patterns, with a pseudo-R² of .75.
It is also evident in the bottom panel that the estimated TICC reflects the initial turn toward the target and not the subsequent turn back to the target (after movement to the distractor). That subsequent turn back to the target is reflected in the actual difference in Euclidean distance of approximately zero at 1420 ms. The decreasing pseudo-R² across the three patterns likely reflects the increased indecision underlying the trajectory. Importantly, in these examples, the increasing estimate of TICC similarly reflects the increasing indecision, with greater indecision yielding later estimates of when correct categorization initiates. However, an early TICC does not necessarily rule out indecision later in the trial, because the TICC captures the persistent initial turn toward the target.
Of course, the nonlinear models will not converge to yield a TICC estimate for every trajectory, nor will they always provide exceptional fit to trajectories. The below figure provides an example of a trajectory to which the models provided a poor fit (pseudo-R² = .04). Visual inspection of trajectories is sometimes recommended as a means of data exclusion (Freeman et al., 2013). Some readers might consider exclusion on the basis of poor model fit to the trajectory. We do not necessarily advocate this approach, because data exclusion is a tricky issue. If data exclusion procedures are used, we recommend basing them on multiple criteria that are established a priori (to avoid hypothesis-confirmation pressures). For example, a trajectory with a low pseudo-R² might be retained if visual inspection of the trajectory suggests the TICC estimate captures the initial turn toward the target.
As a reminder, we have examined the nonlinear modeling method for estimating TICC in a two-choice mouse-tracking paradigm. Whether it is appropriate for other mouse-tracking paradigms, such as three- or four-choice paradigms, remains to be examined.
Again, if you have any questions about this procedure, feel free to email me at: email@example.com.