041 — E rate is affine in gamma frequency (trains)
Abstract
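nb037 varied the inhibitory decay time constant at inference, and the per-cell E rate tracked the gamma cycle. Does that survive re-training at each value of the time constant? Yes. Across six values × 3 seeds, the post-training E rate is affine in the gamma peak frequency, with intercept 0.76 Hz and slope 0.216 Hz/Hz after 100 epochs. The slope is per-cycle E-cell participation; the intercept is a non-rhythmic baseline; accuracy stays at 81–89% across the sweep. The cycle clock constrains what the network can become.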
Methods
Six values of the inhibitory decay time constant × three seeds = 18 networks. The recipe matches the nb025 PING baseline (100 epochs medium tier on MNIST, Adam, batch 256, mem-mean readout, other settings as in nb025); only the inhibitory decay time constant varies. For each network, the gamma frequency is the parabolic-interpolated peak of the Welch PSD on the per-trial population E trace. Figure 5 reports per-trial peak medians, which avoid the centroid bias of trial-mean PSD peaks (see Figure 2). The affine fit is taken across the 18 networks.
Results
Training converges
Accuracy plateaus by epoch 15; rate keeps climbing until epoch 70–100. The 100-epoch numbers are converged.
| training | intercept (Hz) | slope (Hz/Hz) | $R^2$ |
|---|---|---|---|
| 30 epochs | 1.14 | 0.166 | 0.990 |
| 100 epochs | 0.76 | 0.216 | 0.997 |
The affine shape is robust; the fit tightens with training. The 100-epoch row is canonical.
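The fit parameters in the table are those of an affine regression of E rate on gamma peak frequency. A minimal sketch of such a fit follows, assuming plain ordinary least squares (the notebook's actual fitting routine is not shown here). The function name `affine_fit` and the six frequencies are hypothetical, and the rates are synthetic points placed exactly on the canonical 100-epoch line, not measured data.

```python
def affine_fit(freqs_hz, rates_hz):
    """Least-squares fit of rate = intercept + slope * freq.

    Returns (intercept, slope, r2).
    """
    n = len(freqs_hz)
    if n < 2 or n != len(rates_hz):
        raise ValueError("need two or more paired samples")
    mean_f = sum(freqs_hz) / n
    mean_r = sum(rates_hz) / n
    sxx = sum((f - mean_f) ** 2 for f in freqs_hz)
    if sxx == 0:
        raise ValueError("all frequencies are identical")
    sxy = sum((f - mean_f) * (r - mean_r) for f, r in zip(freqs_hz, rates_hz))
    slope = sxy / sxx
    intercept = mean_r - slope * mean_f
    # Coefficient of determination from residual and total sums of squares.
    ss_res = sum((r - (intercept + slope * f)) ** 2
                 for f, r in zip(freqs_hz, rates_hz))
    ss_tot = sum((r - mean_r) ** 2 for r in rates_hz)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return intercept, slope, r2


# Synthetic illustration only: hypothetical frequencies, rates on the fitted line.
demo_freqs = [14.0, 22.0, 30.0, 38.0, 46.0, 54.0]
demo_rates = [0.76 + 0.216 * f for f in demo_freqs]
intercept, slope, r2 = affine_fit(demo_freqs, demo_rates)
print(f"intercept={intercept:.2f} Hz  slope={slope:.3f} Hz/Hz  R^2={r2:.3f}")
```

With real data, each of the 18 networks would contribute one (frequency, rate) pair, so the fit sees both the spread across conditions and the spread across seeds.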
Trial-to-trial
Per-trial peak distribution for each value of the inhibitory decay time constant, pooled across seeds (≈ 1200 trials per panel). Dashed verticals mark per-trial medians; solid red marks the trial-mean PSD peak.
Two read-offs: (a) the trial-mean PSD peak sits 1–3 Hz above the per-trial median (the centroid bias of averaging PSDs whose peaks fluctuate from trial to trial), so the per-trial-median refit is the more honest fit; (b) there is real trial-to-trial spread in peak frequency, and it grows with the peak frequency itself.
The comb-like fine structure is a Welch artefact: the 5 Hz bin width quantises each trial's peak to one of six gamma-band bins, and parabolic interpolation only shifts each estimate by a fraction of a bin, so per-trial peaks pile up at the bin centres. The comb spacing equals the bin width exactly, so it is methodological, not physical. The envelope is single-peaked at every value of the time constant, ruling out digit-class-specific clusters. A longer Welch segment would dissolve the comb.
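The comb spacing follows from the DFT frequency grid: without zero-padding, a Welch segment of `nperseg` samples at sampling rate `fs` has bins spaced `fs / nperseg` apart. The sketch below assumes a 1 kHz sampling rate and 200-sample segments purely because they reproduce a 5 Hz bin width; the notebook's real values are not stated here, and both helper names are hypothetical.

```python
def welch_bin_width(fs_hz: float, nperseg: int) -> float:
    """Frequency-bin spacing of a Welch PSD with no zero-padding."""
    return fs_hz / nperseg


def nearest_bin(freq_hz: float, bin_width_hz: float) -> float:
    """Centre of the PSD bin closest to freq_hz."""
    return round(freq_hz / bin_width_hz) * bin_width_hz


fs_hz = 1000.0  # assumed sampling rate, not taken from the notebook
true_peaks_hz = (33.1, 36.2, 37.4, 38.9, 41.3)  # illustrative per-trial peaks

for nperseg in (200, 400):  # doubling the segment halves the bin width
    df = welch_bin_width(fs_hz, nperseg)
    teeth = sorted({nearest_bin(f, df) for f in true_peaks_hz})
    print(f"nperseg={nperseg}: bin width {df} Hz, occupied bins {teeth}")
```

With these illustrative peaks, the 5 Hz grid collapses five distinct frequencies onto two comb teeth, while the 2.5 Hz grid keeps all five apart. The trade-off is that longer segments leave fewer segments to average per trial.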
Population PSDs
Trial-mean Welch PSDs for each value of the inhibitory decay time constant. Dots mark the parabolic-interpolated peak. The peak shifts cleanly from ≈ 14 Hz at the longest time constant to ≈ 54 Hz at the shortest; no overlap between adjacent conditions.
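Parabolic interpolation with peak-bin values $(y_{-1}, y_0, y_{+1})$ at the three bins around the discrete maximum $f_k$, with bin width $\Delta f$:

$$\hat f = f_k + \frac{\Delta f}{2}\,\frac{y_{-1} - y_{+1}}{y_{-1} - 2 y_0 + y_{+1}}$$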
This is necessary because the bare 5 Hz quantisation would coarsen the frequency estimate across the six conditions; on a well-isolated peak the interpolation error is well below the bin width.
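A minimal sketch of the standard three-point parabolic peak estimator, to make the interpolation step concrete. The function name `parabolic_peak` is hypothetical, and whether the notebook interpolates linear or log power is not stated here, so the code simply works on whatever values it is given. The demo spectrum is an exact parabola with its vertex at 37 Hz sampled on a 5 Hz grid, so the estimator should recover the vertex even though no sample sits on it.

```python
def parabolic_peak(freqs_hz, psd):
    """Sub-bin peak frequency from the maximum bin and its two neighbours."""
    k = max(range(len(psd)), key=lambda i: psd[i])
    if k == 0 or k == len(psd) - 1:
        return freqs_hz[k]  # no neighbour on one side: fall back to the bin
    y_m, y_0, y_p = psd[k - 1], psd[k], psd[k + 1]
    curvature = y_m - 2.0 * y_0 + y_p
    if curvature == 0:
        return freqs_hz[k]  # flat top: no unique vertex
    # Vertex offset in bins; lies in [-0.5, 0.5] when y_0 is the maximum.
    delta = 0.5 * (y_m - y_p) / curvature
    bin_width = freqs_hz[k + 1] - freqs_hz[k]
    return freqs_hz[k] + delta * bin_width


# Illustrative spectrum: parabola peaking at 37 Hz, sampled every 5 Hz.
freqs = [5.0 * i for i in range(21)]
psd = [100.0 - 0.01 * (f - 37.0) ** 2 for f in freqs]
print(f"bin maximum at {freqs[psd.index(max(psd))]} Hz, "
      f"interpolated peak at {parabolic_peak(freqs, psd):.2f} Hz")
```

On a real PSD the peak is only approximately parabolic, so the estimate carries a residual bias that depends on the window and on how isolated the peak is.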
Single-trial rasters
One MNIST trial through each network. The cycle period stretches from ≈ 17 ms (a reciprocal of ≈ 59 Hz) to ≈ 63 ms (≈ 16 Hz). The eye and the spectrum agree.
The affine law
The shape is predicted by cycle dynamics, not curve-fitted. Within one cycle of duration $T = 1/f_\gamma$, a fraction $\alpha$ of E cells emits exactly one spike (those nearest threshold when the I shunt drops); the rest are still recovering. The cyclic per-cell rate is therefore $\alpha / T = \alpha f_\gamma$. At long inhibitory decay time constants the I conductance never fully decays and the cycle dissolves into a tonic bath, leaving a feedforward baseline $r_0$ independent of $f_\gamma$:

$$r_E = r_0 + \alpha f_\gamma$$
Top: mean post-training E rate vs gamma peak frequency, six clusters × three seeds, error bars from seed variance. The affine fit passes through every error bar. Bottom: per-cluster accuracy, which is flat. The rate change is not paid for in classification accuracy.
Two interpretations come for free:
- The slope is per-cycle participation: the fraction of E cells that spike in a given gamma cycle, fitted at 0.216. nb024's per-cell distribution gives 0.21 from a different angle.
- The intercept is the non-rhythmic baseline. As the gamma frequency goes to zero the fit extrapolates to ≈ 0.8 Hz, the no-rhythm regime that nb042 probes directly. The intercept is a prediction, not a nuisance term.
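The two readings can be checked numerically. The sketch below plugs the canonical 100-epoch fit (intercept 0.76 Hz, slope 0.216 Hz/Hz) into the affine law at the two extreme trial-mean PSD peaks reported above (≈ 14 Hz and ≈ 54 Hz). The outputs are predictions of the fitted line, not measured rates, and the function names are hypothetical.

```python
INTERCEPT_HZ = 0.76   # non-rhythmic baseline, 100-epoch fit
SLOPE = 0.216         # per-cycle participation, 100-epoch fit


def predicted_rate_hz(gamma_hz: float) -> float:
    """Affine law: baseline plus one spike per participating cell per cycle."""
    return INTERCEPT_HZ + SLOPE * gamma_hz


def spikes_per_cycle(gamma_hz: float) -> float:
    """Naive rate / frequency ratio; exceeds the slope by baseline / frequency."""
    return predicted_rate_hz(gamma_hz) / gamma_hz


for gamma_hz in (14.0, 54.0):
    print(f"f = {gamma_hz:4.1f} Hz: rate = {predicted_rate_hz(gamma_hz):5.2f} Hz, "
          f"rate / f = {spikes_per_cycle(gamma_hz):.3f}")
```

The ratio rate / f overshoots the slope by the baseline divided by the frequency, so it approaches 0.216 only at the fast end of the sweep. This is one reason to read participation off the fitted slope rather than off a single condition.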
Discussion
nb037 showed rate followed the rhythm under inference-time mutation. nb041 shows the optimiser can’t escape: every retrained network sits on the same affine line.
Cortical prediction
The derivation depends only on the cycle structure produced by the recurrent E↔I loop, not on the task or training substrate. To the extent that cortical pyramidal–interneuron circuits implement the same kind of self-clocking cycle, the law should hold in vivo:
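Pyramidal-cell mean firing rate should be linearly predictable from local gamma peak frequency: $r_{\mathrm{pyr}} = r_0 + \alpha f_\gamma$, with slope $\alpha$ inside the 0.05–0.3 window tested below and a small non-rhythmic baseline $r_0$.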
Testable in any awake-cortex recording with simultaneous gamma + pyramidal-rate measurement. Falsifiable three ways:
- Slope: if the fitted slope is consistently outside 0.05–0.3, per-cycle participation is the wrong interpretation.
- Linearity: if pyramidal rates correlate with gamma power but are not affine in gamma frequency, the cycle-counting picture is wrong.
- Direction: shortening the inhibitory decay time constant (some benzodiazepine washouts, specific GABA-A manipulations) should raise pyramidal rates, which is counter-intuitive but a direct consequence of the law.
This assumes adult-cortex-like biophysics ( ms, – ms, E:I ≈ 4:1). If cortical gamma is generated by a different mechanism (ING, pacemaker-driven), the law need not transfer — so the prediction also tests whether cortical gamma is PING-like.
Next steps
Quantitative-law leg of ar010’s pre-shipping chain. Combined with nb042’s rhythm-locking and dimensional-collapse results, the structural-bound argument is complete.