Colab Instructions¶
If you would like to run this tutorial on Google colab, please click this link.
Once you are in the colab, follow the installation instructions below and then restart your runtime.
Just uncomment the code in the next code cell and run it!
NOTE:
You may want to switch your runtime to have a GPU or TPU. To do so, go to Runtime > Change runtime type and select the desired hardware accelerator.
Note that if you switch your runtime you have to follow the installation instructions again.
# If running this on Colab, please uncomment the next line
# !pip install hssm
Using lapse probabilities and distributions in HSSM¶
Since v0.1.2, HSSM has added the ability to model outliers in the distribution with lapse probabilities and distributions. Let's see how it works.
import bambi as bmb
import matplotlib.pyplot as plt
import hssm
hssm.set_floatX("float32")
import warnings
warnings.filterwarnings("ignore")
WARNING (pytensor.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
Setting PyTensor floatX type to float32.
Setting "jax_enable_x64" to False. If this is not intended, please set `jax` to False.
Lapse probabilities and distributions are enabled by default.¶
From v0.1.2 on, lapse probabilities and distributions are enabled by default. If left unspecified, the probability for outliers will be a fixed value of 0.05 and the distribution will be specified as Uniform(0, 10)
.
# Simulate some data
ddm_data = hssm.simulate_data(
model="ddm", theta=dict(v=0.5, a=1.5, z=0.5, t=0.1), size=1000
)
ddm_data.head()
rt | response | |
---|---|---|
0 | 2.935709 | 1.0 |
1 | 0.954312 | 1.0 |
2 | 2.164892 | 1.0 |
3 | 1.686578 | 1.0 |
4 | 3.178887 | 1.0 |
# Build the simplest model specifying only data
# Note the model output
ddm_model_default = hssm.HSSM(data=ddm_data)
ddm_model_default
Model initialized successfully.
Hierarchical Sequential Sampling Model Model: ddm Response variable: rt,response Likelihood: analytical Observations: 1000 Parameters: v: Prior: Normal(mu: 0.0, sigma: 2.0) Explicit bounds: (-inf, inf) a: Prior: HalfNormal(sigma: 2.0) Explicit bounds: (0.0, inf) z: Prior: Uniform(lower: 0.0, upper: 1.0) Explicit bounds: (0.0, 1.0) t: Prior: HalfNormal(sigma: 2.0) Explicit bounds: (0.0, inf) Lapse probability: 0.05 Lapse distribution: Uniform(lower: 0.0, upper: 20.0)
Note
Note in the above output that lapse probability and lapse distributions are at default values.
Specifying lapse probability and distribution¶
It is easy to change the lapse probability and distribution. HSSM has added two arguments, p_outlier
and lapse
to allow the lapse probability and distribution to be specified.
The optional p_outlier
accepts a float
, a dict
, or a bmb.Prior
object. When p_outlier
is specified as a single float
value, it will be considered "fixed". You can also specify a prior distribution for p_outlier
through a dict
or a bmb.Prior
object, the same way as you would when specifying priors for any other parameter. That way, the lapse probability will be considered another parameter and will be estimated during MCMC sampling.
Likewise, the lapse
argument accepts a dict
or a bmb.Prior
object to specify a fixed lapse distribution. This distribution will be considered as the one that outliers are generated from.
ddm_model_lapse = hssm.HSSM(
data=ddm_data,
p_outlier={"name": "Uniform", "lower": 0.0001, "upper": 0.5},
lapse=bmb.Prior("Uniform", lower=0.0, upper=20.0),
)
ddm_model_lapse
Model initialized successfully.
Hierarchical Sequential Sampling Model Model: ddm Response variable: rt,response Likelihood: analytical Observations: 1000 Parameters: v: Prior: Normal(mu: 0.0, sigma: 2.0) Explicit bounds: (-inf, inf) a: Prior: HalfNormal(sigma: 2.0) Explicit bounds: (0.0, inf) z: Prior: Uniform(lower: 0.0, upper: 1.0) Explicit bounds: (0.0, 1.0) t: Prior: HalfNormal(sigma: 2.0) Explicit bounds: (0.0, inf) Lapse probability: Uniform(lower: 9.999999747378752e-05, upper: 0.5) Lapse distribution: Uniform(lower: 0.0, upper: 20.0)
lapse_trace = ddm_model_lapse.sample()
lapse_trace
Using default initvals.
Initializing NUTS using adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [t, a, p_outlier, z, v]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 25 seconds.
We recommend running at least 4 chains for robust computation of convergence diagnostics
0%| | 0/2000 [00:00<?, ?it/s]
7%|▋ | 149/2000 [00:00<00:01, 1489.12it/s]
15%|█▍ | 298/2000 [00:00<00:01, 1482.61it/s]
22%|██▏ | 447/2000 [00:00<00:01, 1483.06it/s]
30%|██▉ | 596/2000 [00:00<00:00, 1484.48it/s]
37%|███▋ | 745/2000 [00:00<00:00, 1485.02it/s]
45%|████▍ | 894/2000 [00:00<00:00, 1485.40it/s]
52%|█████▏ | 1043/2000 [00:00<00:00, 1480.00it/s]
60%|█████▉ | 1192/2000 [00:00<00:00, 1481.90it/s]
67%|██████▋ | 1342/2000 [00:00<00:00, 1485.58it/s]
75%|███████▍ | 1491/2000 [00:01<00:00, 1464.61it/s]
82%|████████▏ | 1638/2000 [00:01<00:00, 1433.86it/s]
89%|████████▉ | 1784/2000 [00:01<00:00, 1441.20it/s]
97%|█████████▋| 1934/2000 [00:01<00:00, 1456.74it/s]
100%|██████████| 2000/2000 [00:01<00:00, 1468.25it/s]
-
<xarray.Dataset> Size: 48kB Dimensions: (chain: 2, draw: 1000) Coordinates: * chain (chain) int64 16B 0 1 * draw (draw) int64 8kB 0 1 2 3 4 5 6 7 ... 993 994 995 996 997 998 999 Data variables: t (chain, draw) float32 8kB 0.1074 0.09952 0.09923 ... 0.114 0.1265 p_outlier (chain, draw) float32 8kB 0.04303 0.04826 ... 0.04697 0.04665 z (chain, draw) float32 8kB 0.4945 0.4749 0.4783 ... 0.4988 0.4932 a (chain, draw) float32 8kB 1.458 1.473 1.485 ... 1.496 1.461 1.455 v (chain, draw) float32 8kB 0.5517 0.6129 0.6006 ... 0.545 0.5263 Attributes: created_at: 2025-01-01T23:59:24.069921+00:00 arviz_version: 0.19.0 inference_library: pymc inference_library_version: 5.19.1 sampling_time: 25.262215852737427 tuning_steps: 1000 modeling_interface: bambi modeling_interface_version: 0.15.0
-
<xarray.Dataset> Size: 16MB Dimensions: (chain: 2, draw: 1000, __obs__: 1000) Coordinates: * chain (chain) int64 16B 0 1 * draw (draw) int64 8kB 0 1 2 3 4 5 6 ... 993 994 995 996 997 998 999 * __obs__ (__obs__) int64 8kB 0 1 2 3 4 5 6 ... 994 995 996 997 998 999 Data variables: rt,response (chain, draw, __obs__) float64 16MB -2.278 -0.9188 ... -2.366 Attributes: modeling_interface: bambi modeling_interface_version: 0.15.0
-
<xarray.Dataset> Size: 252kB Dimensions: (chain: 2, draw: 1000) Coordinates: * chain (chain) int64 16B 0 1 * draw (draw) int64 8kB 0 1 2 3 4 5 ... 995 996 997 998 999 Data variables: (12/17) acceptance_rate (chain, draw) float64 16kB 0.9807 0.9467 ... 0.8408 diverging (chain, draw) bool 2kB False False ... False False energy (chain, draw) float64 16kB 2.005e+03 ... 2.004e+03 energy_error (chain, draw) float64 16kB -0.2655 -0.1448 ... 0.0188 index_in_trajectory (chain, draw) int64 16kB -1 -4 1 6 4 ... -2 -3 -3 2 2 largest_eigval (chain, draw) float64 16kB nan nan nan ... nan nan ... ... process_time_diff (chain, draw) float64 16kB 0.009008 ... 0.004366 reached_max_treedepth (chain, draw) bool 2kB False False ... False False smallest_eigval (chain, draw) float64 16kB nan nan nan ... nan nan step_size (chain, draw) float64 16kB 0.5432 0.5432 ... 0.4679 step_size_bar (chain, draw) float64 16kB 0.5131 0.5131 ... 0.5953 tree_depth (chain, draw) int64 16kB 3 3 3 3 4 3 ... 3 3 3 3 2 2 Attributes: created_at: 2025-01-01T23:59:24.085322+00:00 arviz_version: 0.19.0 inference_library: pymc inference_library_version: 5.19.1 sampling_time: 25.262215852737427 tuning_steps: 1000 modeling_interface: bambi modeling_interface_version: 0.15.0
-
<xarray.Dataset> Size: 16kB Dimensions: (__obs__: 1000, rt,response_extra_dim_0: 2) Coordinates: * __obs__ (__obs__) int64 8kB 0 1 2 3 4 ... 996 997 998 999 * rt,response_extra_dim_0 (rt,response_extra_dim_0) int64 16B 0 1 Data variables: rt,response (__obs__, rt,response_extra_dim_0) float32 8kB 2... Attributes: created_at: 2025-01-01T23:59:24.090134+00:00 arviz_version: 0.19.0 inference_library: pymc inference_library_version: 5.19.1 modeling_interface: bambi modeling_interface_version: 0.15.0
ddm_model_lapse.summary(lapse_trace)
mean | sd | hdi_3% | hdi_97% | mcse_mean | mcse_sd | ess_bulk | ess_tail | r_hat | |
---|---|---|---|---|---|---|---|---|---|
p_outlier | 0.035 | 0.013 | 0.010 | 0.058 | 0.000 | 0.000 | 1274.0 | 932.0 | 1.0 |
z | 0.491 | 0.013 | 0.465 | 0.513 | 0.000 | 0.000 | 1335.0 | 1336.0 | 1.0 |
t | 0.110 | 0.021 | 0.070 | 0.149 | 0.001 | 0.000 | 1336.0 | 1117.0 | 1.0 |
a | 1.472 | 0.028 | 1.424 | 1.527 | 0.001 | 0.001 | 1201.0 | 1398.0 | 1.0 |
v | 0.543 | 0.036 | 0.475 | 0.609 | 0.001 | 0.001 | 1318.0 | 1175.0 | 1.0 |
ddm_model_lapse.plot_trace(lapse_trace)
plt.tight_layout()
Disable lapse probabilities and distributions¶
When p_outliers
is set to None
or 0, lapse probability and distributions will be ignored. They will not be included in the model output.
Note
If `p_outlier` is set to `None` or 0 but `lapse` is not set to None, you will receive a warning. Please remember to also set `lapse` to `None`.
ddm_model_no_lapse = hssm.HSSM(data=ddm_data, p_outlier=None, lapse=None)
ddm_model_no_lapse
Model initialized successfully.
Hierarchical Sequential Sampling Model Model: ddm Response variable: rt,response Likelihood: analytical Observations: 1000 Parameters: v: Prior: Normal(mu: 0.0, sigma: 2.0) Explicit bounds: (-inf, inf) a: Prior: HalfNormal(sigma: 2.0) Explicit bounds: (0.0, inf) z: Prior: Uniform(lower: 0.0, upper: 1.0) Explicit bounds: (0.0, 1.0) t: Prior: HalfNormal(sigma: 2.0) Explicit bounds: (0.0, inf)
ddm_model_no_lapse_warn = hssm.HSSM(data=ddm_data, p_outlier=0)
ddm_model_no_lapse_warn
You have specified the `lapse` argument to include a lapse distribution, but `p_outlier` is set to either 0 or None. Your lapse distribution will be ignored.
Model initialized successfully.
Hierarchical Sequential Sampling Model Model: ddm Response variable: rt,response Likelihood: analytical Observations: 1000 Parameters: v: Prior: Normal(mu: 0.0, sigma: 2.0) Explicit bounds: (-inf, inf) a: Prior: HalfNormal(sigma: 2.0) Explicit bounds: (0.0, inf) z: Prior: Uniform(lower: 0.0, upper: 1.0) Explicit bounds: (0.0, 1.0) t: Prior: HalfNormal(sigma: 2.0) Explicit bounds: (0.0, inf)