Stimulus Coding Example¶
In this tutorial we illustrate how to use the regression approach to model the effect of stimulus coding on the drift rate parameter of the DDM.
Import Modules¶
from copy import deepcopy
import arviz as az
import pandas as pd
import hssm
What is Stimulus Coding?¶
There are two core approaches to coding the stimuli when fitting paramters of 2-choice SSMs (the discussion here is simplified, to bring across the core ideas, de facto ideas from both approaches can be mixed):
- Accuracy coding: Responses are treated as correct or incorrect
- Stimulus coding: Responses are treated as stimulus_1 or stimulus_2
Take as a running example a simple random dot motion task with two conditions, left
and right
. Both conditions are equally difficult, but for half of the experiments the correct motion direction is left, and for the other half it is right.
So it will be reasonable to assume that, ceteris paribus, nothing should really change in terms of participant behavior, apart from symmetrically preferring right to left when it is correct and vice versa.
Now, when applying Accuracy coding, we would expect the drift rate to be the same for both conditions, any condition effect to vanish by the time we code responses as correct or incorrect.
When we apply Stimulus coding on the other hand, we actively need to account for the direction change (since we now attach our response values, e.g. -1
, 1
, permanently to specific choice-options, regardless correctness).
To formulate a model that is equivalent to the one described above in terms of accuracy coding, we again want to estimate only a single v
parameter, but we have to respect the direction change in response when respectively completing experiment conditions left
and right
.
Note that an important aspect of what we describe above is that we want to estimate a single v
parameter in each of the two coding approaches.
For Accuracy coding we simply estimate a single v
parameter, and no extra work is necessary.
For Stimulus coding we need to account for symmetric shift in direction from the two experiment conditions. One way to do this, is the following:
We can simply assign a covariate, direction
, which codes -1
for left
and 1
for right
.
Then we use the following regression formula for the v
parameter: v ~ 0 + direction
.
Note that we are not using an intercept here.
Let's how this works in practice.
Simulate Data¶
# Condition 1
stim_1 = hssm.simulate_data(
model="ddm", theta=dict(v=-0.5, a=1.5, z=0.5, t=0.1), size=500
)
stim_1["stim"] = "C-left"
stim_1["direction"] = -1
stim_1["response_acc"] = (-1) * stim_1["response"]
# Condition 2
stim_2 = hssm.simulate_data(
model="ddm", theta=dict(v=0.5, a=1.5, z=0.5, t=0.1), size=500
)
stim_2["stim"] = "C-right"
stim_2["direction"] = 1
stim_2["response_acc"] = stim_2["response"]
data_stim = pd.concat([stim_1, stim_2]).reset_index(drop=True)
data_acc = deepcopy(data_stim)
data_acc["response"] = data_acc["response_acc"]
print(data_acc.head())
print(data_stim.head())
rt response stim direction response_acc 0 2.493836 -1.0 C-left -1 -1.0 1 0.829591 1.0 C-left -1 1.0 2 0.386000 -1.0 C-left -1 -1.0 3 1.521230 1.0 C-left -1 1.0 4 1.928993 1.0 C-left -1 1.0 rt response stim direction response_acc 0 2.493836 1.0 C-left -1 -1.0 1 0.829591 -1.0 C-left -1 1.0 2 0.386000 1.0 C-left -1 -1.0 3 1.521230 -1.0 C-left -1 1.0 4 1.928993 -1.0 C-left -1 1.0
Set up Models¶
Accuracy Coding¶
m_acc_stim_dummy = hssm.HSSM(
data=data_acc,
model="ddm",
include=[{"name": "v", "formula": "v ~ 1 + stim"}],
z=0.5,
)
m_acc_stim_dummy.sample(sampler="mcmc", tune=500, draws=500)
m_acc_stim_dummy.summary()
Model initialized successfully. Using default initvals.
Initializing NUTS using adapt_diag... Multiprocess sampling (4 chains in 4 jobs) NUTS: [t, a, v_Intercept, v_stim]
Output()
Sampling 4 chains for 500 tune and 500 draw iterations (2_000 + 2_000 draws total) took 6 seconds. 100%|██████████| 2000/2000 [00:00<00:00, 2607.97it/s]
mean | sd | hdi_3% | hdi_97% | mcse_mean | mcse_sd | ess_bulk | ess_tail | r_hat | |
---|---|---|---|---|---|---|---|---|---|
t | 0.119 | 0.019 | 0.082 | 0.151 | 0.000 | 0.000 | 1557.0 | 1293.0 | 1.0 |
v_stim[C-right] | 0.017 | 0.048 | -0.066 | 0.113 | 0.001 | 0.001 | 1973.0 | 1523.0 | 1.0 |
v_Intercept | 0.516 | 0.037 | 0.453 | 0.588 | 0.001 | 0.001 | 1799.0 | 1511.0 | 1.0 |
a | 1.465 | 0.026 | 1.416 | 1.515 | 0.001 | 0.000 | 1549.0 | 1453.0 | 1.0 |
m_acc_simple = hssm.HSSM(
data=data_acc,
model="ddm",
include=[
{
"name": "v",
"formula": "v ~ 1",
"prior": {"Intercept": {"name": "Normal", "mu": 0.0, "sigma": 3.0}},
}
],
z=0.5,
)
m_acc_simple.sample(sampler="mcmc", tune=500, draws=500)
m_acc_simple.summary()
Model initialized successfully. Using default initvals.
Initializing NUTS using adapt_diag... Multiprocess sampling (4 chains in 4 jobs) NUTS: [t, a, v_Intercept]
Output()
Sampling 4 chains for 500 tune and 500 draw iterations (2_000 + 2_000 draws total) took 9 seconds. 100%|██████████| 2000/2000 [00:00<00:00, 2231.23it/s]
mean | sd | hdi_3% | hdi_97% | mcse_mean | mcse_sd | ess_bulk | ess_tail | r_hat | |
---|---|---|---|---|---|---|---|---|---|
t | 0.118 | 0.019 | 0.082 | 0.151 | 0.001 | 0.000 | 969.0 | 850.0 | 1.0 |
v_Intercept | 0.524 | 0.025 | 0.478 | 0.571 | 0.001 | 0.001 | 1252.0 | 1277.0 | 1.0 |
a | 1.467 | 0.027 | 1.416 | 1.518 | 0.001 | 0.001 | 917.0 | 1342.0 | 1.0 |
az.compare({"m_acc_simple": m_acc_simple.traces, "m_acc_stim_dummy": m_acc_stim_dummy})
rank | elpd_loo | p_loo | elpd_diff | weight | se | dse | warning | scale | |
---|---|---|---|---|---|---|---|---|---|
m_acc_simple | 0 | -1988.669447 | 3.053195 | 0.000000 | 1.000000e+00 | 33.191774 | 0.000000 | False | log |
m_acc_stim_dummy | 1 | -1989.490479 | 3.906351 | 0.821032 | 1.110223e-16 | 33.245992 | 0.334027 | False | log |
Stim Coding¶
m_stim = hssm.HSSM(
data=data_stim,
model="ddm",
include=[
{
"name": "v",
"formula": "v ~ 0 + direction",
"prior": {"direction": {"name": "Normal", "mu": 0.0, "sigma": 3.0}},
}
],
z=0.5,
)
m_stim.sample(sampler="mcmc", tune=500, draws=500)
m_stim.summary()
az.compare(
{
"m_acc_simple": m_acc_simple.traces,
"m_acc_stim_dummy": m_acc_stim_dummy.traces,
"m_stim": m_stim.traces,
}
)
rank | elpd_loo | p_loo | elpd_diff | weight | se | dse | warning | scale | |
---|---|---|---|---|---|---|---|---|---|
m_stim | 0 | -1988.603906 | 2.991864 | 0.000000 | 1.000000e+00 | 33.219300 | 0.000000 | False | log |
m_acc_simple | 1 | -1988.669447 | 3.053195 | 0.065541 | 0.000000e+00 | 33.191774 | 0.052019 | False | log |
m_acc_stim_dummy | 2 | -1989.490479 | 3.906351 | 0.886573 | 3.663736e-15 | 33.245992 | 0.329172 | False | log |
Stim coding advanced¶
So far we focused on the v
parameter. The are two relevant concepts concerning bias
that we need to account for in the stimulus coding approach:
1. Bias in v
:¶
What is drift bias? Imagine our experimental design is such that the correct motion direction is left for half of the experiments and right for the other half. However, the sensory stimuli are such that the participant will nevertheless be accumulating excess evidence toward the left direction, even when the correct motion direction is right for a given trial.
To account for drift bias, we simply include an Intercept
term, which will capture the drift bias, so that the direction
term will capture the direction effect, a symmetric shift around the Intercept
(previously this Intercept
was set to 0, or appeared in the model that operated on a dummy stim
variable, which remember, creates a models that is too complex / has unnecessary extra parameters).
2. Bias in z
:¶
Bias in the z
parameter gets a bit more tricky. What's the idea here? The z
parameter represents the starting point bias. This notion is to some extend more intuitive when using stimulus coding than accuracy coding. A starting point bias under the stimulus coding approach is a bias toward a specific choice option (direction). A starting point bias under the accuracy coding approach is a ... bias toward a correct or incorrect response ... (?)
By itself not a problem, but to create the often desired symmetry in the z
parameter across stim
conditions, keeping in mind that bias takes values in the interval [0, 1]
, we need to account for the direction effect in the z
parameter. So if in the left
condition $z_i = z$, then in the right
condition $z_i = 1 - z$.
How might we incoporate this into our regression framework?
Consider the following varible $\mathbb{1}_{C_i = c}, \text{for} \ c \in \{left, right\}$ which is 1 if the condition is left
and 0 otherwise for a given trial. Now we can write the following function for $z_i$,
$$ z_i = \mathbb{1}_{(C_i = left)} \cdot z + (1 - \mathbb{1}_{(C_i = left)}) \cdot (1 - z) $$
which after a bit of algebra can be rewritten as,
$$ z_i = \left((2 \cdot \mathbb{1}_{(C_i = left)}) - 1\right) \cdot z + (1 - \mathbb{1}_{(C_i = left)}) $$
or,
$$ z_i = \left((2 \cdot \mathbb{1}_{(C_i = left)}) - 1\right) \cdot z + \mathbb{1}_{(C_i = right)} $$
This is a linear function of the z
parameter, so we will be able to include it in our model, with a little bit of care.
You will see the use of the offset
function, to account for the right
condition, and we will a priori massage our data a little to define the left.stimcoding
and right.stimcoding
covariates (dummy variables that identify the left
and right
conditions).
Defining the new covariates¶
# Folling the math above, we can define the new covariates as follows:
data_stim["left.stimcoding"] = (2 * (data_stim["stim"] == "C-left").astype(int)) - 1
data_stim["right.stimcoding"] = (data_stim["stim"] == "C-right").astype(int)
Defining the model¶
Below an example of a model that take into account both the bias in v
and in z
.
m_stim_inc_z = hssm.HSSM(
data=data_stim,
model="ddm",
include=[
{
"name": "v",
"formula": "v ~ 0 + direction",
"prior": {"direction": {"name": "Normal", "mu": 0.0, "sigma": 3.0}},
},
{
"name": "z",
"formula": "z ~ 0 + left.stimcoding + offset(right.stimcoding)",
"prior": {
"left.stimcoding": {"name": "Uniform", "lower": 0.0, "upper": 1.0},
},
},
],
)
m_stim_inc_z.sample(sampler="mcmc", tune=500, draws=500)
m_stim_inc_z.summary()
Model initialized successfully. Using default initvals.
Initializing NUTS using adapt_diag... Multiprocess sampling (4 chains in 4 jobs) NUTS: [t, a, v_direction, z_left.stimcoding]
Output()
Sampling 4 chains for 500 tune and 500 draw iterations (2_000 + 2_000 draws total) took 16 seconds. 100%|██████████| 2000/2000 [00:02<00:00, 908.02it/s]
mean | sd | hdi_3% | hdi_97% | mcse_mean | mcse_sd | ess_bulk | ess_tail | r_hat | |
---|---|---|---|---|---|---|---|---|---|
z_left.stimcoding | 0.519 | 0.013 | 0.495 | 0.543 | 0.000 | 0.000 | 1245.0 | 1299.0 | 1.0 |
a | 1.469 | 0.027 | 1.420 | 1.518 | 0.001 | 0.001 | 1196.0 | 1195.0 | 1.0 |
v_direction | 0.555 | 0.034 | 0.490 | 0.615 | 0.001 | 0.001 | 1238.0 | 1169.0 | 1.0 |
t | 0.106 | 0.021 | 0.066 | 0.141 | 0.001 | 0.000 | 1141.0 | 1018.0 | 1.0 |
az.compare(
{
"m_acc_simple": m_acc_simple.traces,
"m_acc_stim_dummy": m_acc_stim_dummy.traces,
"m_stim": m_stim.traces,
"m_stim_inc_z": m_stim_inc_z.traces,
}
)
rank | elpd_loo | p_loo | elpd_diff | weight | se | dse | warning | scale | |
---|---|---|---|---|---|---|---|---|---|
m_stim_inc_z | 0 | -1988.466490 | 3.808830 | 0.000000 | 5.620994e-01 | 33.341068 | 0.000000 | False | log |
m_stim | 1 | -1988.603906 | 2.991864 | 0.137416 | 4.379006e-01 | 33.219300 | 1.404438 | False | log |
m_acc_simple | 2 | -1988.669447 | 3.053195 | 0.202956 | 1.776111e-16 | 33.191774 | 1.404106 | False | log |
m_acc_stim_dummy | 3 | -1989.490479 | 3.906351 | 1.023989 | 0.000000e+00 | 33.245992 | 1.442721 | False | log |