Routing-SNN Dashboards

SSM

State space memory dashboards

Routed SSM is the parent experiment family. MQAR and WikiText are the task-specific drill-downs beneath it, so users can move from the overview to recall and language-model evidence without scanning unrelated SNN work.

Routed SSM →

Stateful, path-dependent routing on a continuous diagonal-SSM substrate. Covers the 2×2 ablation (router state × block state), MQAR parity, and decode speedup, comparing routing-on-state against the stateless MoE-Mamba/BlackMamba baseline.

stateful routing ~0.99 @ 12.5% active vs stateless ~0.48
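The "active" percentages on these cards can be read as the share of blocks a hard top-k router keeps switched on per step. The sketch below is only an illustration of that arithmetic: `topk_mask` and `active_fraction` are hypothetical helper names, and where the scores come from (recurrent state for a stateful router, the current input for a stateless one) is an assumption that is left outside the code.

```python
def topk_mask(scores, k):
    """Return a boolean mask that keeps the k highest-scoring blocks.

    `scores` is one router score per block. Hypothetical helper: the
    experiments' actual router is not shown on this page.
    """
    # Indices sorted by descending score; ties keep their original order.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(order[:k])
    return [i in keep for i in range(len(scores))]


def active_fraction(mask):
    """Share of blocks that are active under the mask."""
    return sum(mask) / len(mask)
```

Under this reading, keeping 1 of 8 blocks gives an active fraction of 0.125 (12.5%), and keeping 4 of 8 gives 0.5. Interpreting a label such as k4of8 as "4 of 8 blocks active" is an assumption, not something this page states.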

MQAR →

Generalized, auto-discovering MQAR training-curve view: every routed and dense run on one page, a single shared filter (router × difficulty × seed × variant) driving all charts, and mean ± band aggregation across seeds.

hard top-k routed ~0.91 vs dense ~0.77 (11 seeds)
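As a rough illustration of mean ± band aggregation across seeds, the sketch below averages per-seed curves step by step and draws the band at one sample standard deviation. The band definition is an assumption (the dashboard may use min/max, standard error, or a confidence interval instead), and `aggregate_seeds` is a hypothetical name.

```python
from statistics import mean, stdev


def aggregate_seeds(curves):
    """Aggregate per-seed training curves into (mean, lower, upper) per step.

    `curves` maps a seed to a list of metric values, one per logged step.
    All lists are assumed to have the same length.
    Assumption: the band is mean +/- one sample standard deviation.
    """
    aggregated = []
    # zip(*...) walks the curves step by step across all seeds.
    for values in zip(*curves.values()):
        m = mean(values)
        # stdev needs at least two points; a single seed has no band.
        s = stdev(values) if len(values) > 1 else 0.0
        aggregated.append((m, m - s, m + s))
    return aggregated
```

A call such as `aggregate_seeds({0: [0.5, 0.9], 1: [0.7, 0.9]})` returns one `(mean, lower, upper)` tuple per step, which is the shape a band chart would typically consume.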

WikiText-103 LM →

Language-model pilot: dense vs stateful-routed vs random-k on an 8k-sequence WikiText-103 subset. Headline metric is perplexity (lower = better). Sortable table + PPL curves + active fraction over training.

dense best_val_ppl ~245 · routed k4of8 ~253 (3 seeds)
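For readers comparing the PPL figures, perplexity is conventionally the exponential of the mean per-token negative log-likelihood. The sketch below uses that standard definition; whether the pilot computes `best_val_ppl` in exactly this way, and with which tokenization, is an assumption, and both function names are hypothetical.

```python
import math


def perplexity(total_nll_nats, num_tokens):
    """Perplexity from summed negative log-likelihood (in nats) over tokens."""
    return math.exp(total_nll_nats / num_tokens)


def nll_gap(ppl_a, ppl_b):
    """Per-token loss difference (nats) implied by two perplexities."""
    return math.log(ppl_a) - math.log(ppl_b)
```

On this definition, a gap between perplexities of about 253 and about 245 corresponds to roughly 0.03 nats per token, which helps put the routed vs dense difference on the same scale as a training-loss curve.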

SNN

Spiking neural network dashboards

The SNN branch holds the systems work: low-level kernel benchmarking and the spatial pruning/training ablations. That keeps implementation performance and spatial architecture results together instead of scattering them across a flat global list.

Kernels →

V2–V8 CUDA kernel speed: bit-packed inputs, wavefront parallelism, Tensor-Core block-sparse decode. Firing-rate crossover scans and width-scaling sweeps.

V7.1 non-routed decode ~1.73–1.84× dense

Spatial SNN Training Ablation →

Pre-training spatial sparsity vs dense on SHD (Spiking Heidelberg Digits). Exponential 2D locality (exp2dloc) beats dense at 91.8% sparsity. Live training curves, seed grouping, per-config topology cards.

exp2dloc +3 pp vs dense @ 8% density
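One way to picture an exponential 2D locality mask is sketched below: neurons sit on a square grid and each connection is kept with probability exp(-d / length_scale), where d is the Euclidean grid distance. This is a guess at the general idea, not the experiment's actual exp2dloc rule; the grid layout, the distance metric, `length_scale`, and both function names are assumptions made for illustration.

```python
import math
import random


def exp2d_locality_mask(side, length_scale, seed=0):
    """Sample a boolean connectivity mask over a side x side neuron grid.

    Assumption: connection (i, j) is kept with probability
    exp(-distance / length_scale), so nearby neurons connect more often.
    """
    rng = random.Random(seed)
    coords = [(i // side, i % side) for i in range(side * side)]
    mask = []
    for r1, c1 in coords:
        row = []
        for r2, c2 in coords:
            d = math.hypot(r1 - r2, c1 - c2)
            # rng.random() lies in [0, 1), so distance 0 is always kept.
            row.append(rng.random() < math.exp(-d / length_scale))
        mask.append(row)
    return mask


def density(mask):
    """Fraction of kept connections; sparsity is 1 - density."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total
```

Density and sparsity are complementary, so 91.8% sparsity is 8.2% density, which matches the "8% density" headline after rounding. In this sketch a smaller `length_scale` yields a sparser mask.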