041 — E rate is affine in gamma frequency (trains)

Abstract

nb037 varied $\tau_\text{GABA}$ at inference and the per-cell E rate tracked the gamma cycle. Does that survive retraining at each $\tau_\text{GABA}$? Yes. Across $\tau_\text{GABA} \in \{4.5, 6, 9, 12, 18, 27\}$ ms × 3 seeds, $r_E = 0.76 + 0.216 \cdot f_\gamma$ with $R^2 = 0.997$. The slope is per-cycle E-cell participation; the intercept is a non-rhythmic baseline; accuracy stays at 81–89% across the sweep. The cycle clock constrains what the network can become.

Methods

Six $\tau_\text{GABA}$ values × three seeds = 18 networks. Recipe matches the nb025 PING baseline (100 epochs medium tier on MNIST, Adam at $4 \times 10^{-4}$, batch 256, mem-mean readout, $\theta_u$ off, $\Delta t = 0.1$ ms, $T = 200$ ms); only $\tau_\text{GABA}$ varies. For each cell of the sweep (one $\tau_\text{GABA}$ × seed network), $f_\gamma$ is the parabolic-interpolated peak of the Welch PSD on the per-trial population E trace. Figure 5 reports per-trial peak medians (avoids the centroid bias of trial-mean PSD peaks; see Figure 2). Fit $r_E = a + p \cdot f_\gamma$ across the 18 cells.
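
A minimal sketch of the per-trial peak median, assuming plain Python and a single Hann-windowed full-trial segment in place of a Welch estimator (consistent with the 5 Hz bin spacing quoted later in the text). The function names, the 10–100 Hz search band, and the synthetic sinusoidal trials are illustrative assumptions, not the notebook's code or data. The parabolic refinement is left out here, so peaks land on bin centres.

```python
import cmath
import math
import random
from statistics import median

DT = 1e-4      # 0.1 ms time step, as in Methods
T = 0.2        # 200 ms trial, as in Methods
DF = 1.0 / T   # frequency resolution of a full-trial segment: 5 Hz


def band_power(trace, f_lo=10.0, f_hi=100.0):
    """Hann-windowed periodogram of one trial, evaluated on bins in [f_lo, f_hi]."""
    n = len(trace)
    mean = sum(trace) / n
    windowed = [(x - mean) * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / n))
                for i, x in enumerate(trace)]
    freqs, power = [], []
    for k in range(round(f_lo / DF), round(f_hi / DF) + 1):
        coeff = sum(w * cmath.exp(-2j * math.pi * k * i / n)
                    for i, w in enumerate(windowed))
        freqs.append(k * DF)
        power.append(abs(coeff) ** 2)
    return freqs, power


def peak_frequency(trace):
    """Coarse per-trial peak: the band bin with the most power (no interpolation)."""
    freqs, power = band_power(trace)
    return freqs[max(range(len(power)), key=power.__getitem__)]


def per_trial_peak_median(trials):
    """Median over trials of the per-trial peak frequency."""
    return median(peak_frequency(trace) for trace in trials)


# Synthetic stand-in trials (assumption): noisy sinusoids jittered around 37 Hz.
rng = random.Random(0)
n_samples = round(T / DT)
trials = []
for _ in range(9):
    f_trial = rng.gauss(37.0, 1.5)
    trials.append([math.sin(2.0 * math.pi * f_trial * i * DT) + 0.3 * rng.gauss(0.0, 1.0)
                   for i in range(n_samples)])

f_gamma = per_trial_peak_median(trials)
print(f_gamma)
```

Taking the median of per-trial peaks, rather than the peak of the averaged spectrum, keeps one outlying trial from shifting the estimate.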

Results

Training converges

Figure 1. Per-cell training curves
Two stacked panels. Top: test accuracy vs epoch, one line per cell, coloured by τ_GABA. Accuracy plateaus at 80–89% by epoch 15 across all cells. Bottom: test E rate vs epoch — every curve climbs steadily across the first 30 epochs then flattens by epoch 70–100.

Accuracy plateaus by epoch 15; rate keeps climbing until epoch 70–100. The 100-epoch numbers are converged.

| training | $a$ (Hz) | $p$ (Hz/Hz) | $R^2$ |
|---|---|---|---|
| 30 epochs | 1.14 | 0.166 | 0.990 |
| 100 epochs | 0.76 | 0.216 | 0.997 |

The affine shape is robust; the fit tightens with training ($R^2$ from 0.990 to 0.997) as $p$ rises and $a$ falls. The 100-epoch row is canonical.
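
For reference, the three fit quantities ($a$, $p$, $R^2$) can be obtained from an ordinary least-squares line. The sketch below assumes unweighted OLS, which the text does not confirm, and the helper name `affine_fit` is hypothetical. The data points are synthetic placeholders generated from the reported 100-epoch fit plus small offsets; they are not the notebook's measurements.

```python
def affine_fit(x, y):
    """Unweighted least squares for y = a + p * x; returns (a, p, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    p = sxy / sxx
    a = my - p * mx
    ss_res = sum((yi - (a + p * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return a, p, 1.0 - ss_res / ss_tot


# Synthetic placeholders (assumption): points on r_E = 0.76 + 0.216 * f_gamma
# with alternating +/- 0.1 Hz offsets standing in for measurement scatter.
f_gamma = [15.0, 20.0, 27.0, 35.0, 45.0, 55.0]
offsets = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
r_e = [0.76 + 0.216 * f + d for f, d in zip(f_gamma, offsets)]

a, p, r2 = affine_fit(f_gamma, r_e)
print(f"a = {a:.2f} Hz, p = {p:.3f}, R^2 = {r2:.4f}")
```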

Trial-to-trial $f_\gamma$

Figure 2. Per-trial PSD peak distribution vs trial-mean-PSD peak
Six stacked histograms, one per τ_GABA value (viridis colormap), of per-trial PSD peak frequencies pooled across the three seeds (1200 trials per panel). Dashed verticals: per-trial medians. Solid red: trial-mean PSD peaks. Distributions have a single-peaked envelope with fine comb-like structure from the 5 Hz Welch bin spacing.

Per-trial peak distribution per $\tau_\text{GABA}$, pooled across seeds (≈ 1200 trials per panel). Dashed verticals mark per-trial medians; solid red marks the trial-mean PSD peak.

Two read-offs: (a) the trial-mean PSD peak sits 1–3 Hz above the per-trial median (centroid bias of trial-averaging PSDs whose peaks fluctuate), so the per-trial-median refit is the more honest fit; (b) there’s real trial-to-trial spread in $f_\gamma$ that grows with $f_\gamma$ itself.

The comb-like fine structure is a Welch artefact: $\Delta f = 1/T = 5$ Hz quantises each trial’s peak to one of six gamma-band bins, and parabolic interpolation only smears each by $\pm 2.5$ Hz, so per-trial peaks pile up at the bin centres. The comb spacing equals $\Delta f$ exactly, so it is methodological, not physical. The envelope is single-peaked at every $\tau_\text{GABA}$, ruling out digit-class-specific clusters. A longer $T$ would dissolve the comb.

Population PSDs

Figure 3. Trained-network population PSDs — peak marks f_γ
Six PSD curves on a linear y axis, one per τ_GABA value (4.5, 6, 9, 12, 18, 27 ms; viridis colormap), x-axis frequency 5–150 Hz. Each curve has a clear peak in the gamma band, marked with a dot at the parabolic-interpolated peak frequency.

Trial-mean Welch PSDs by $\tau_\text{GABA}$. Dots mark the parabolic-interpolated peak. Peak shifts cleanly from ≈ 14 Hz at $\tau_\text{GABA} = 27$ ms to ≈ 54 Hz at $\tau_\text{GABA} = 4.5$ ms; no overlap between adjacent conditions.

Parabolic interpolation with peak-bin values $(y_0, y_1, y_2)$, the PSD at the bin below the peak, at the peak bin, and at the bin above:

$$f_\gamma = \text{freq}[\text{peak}] + \tfrac{1}{2}\,\frac{y_0 - y_2}{y_0 - 2y_1 + y_2} \cdot \Delta f, \qquad \Delta f = 5 \text{ Hz}.$$

Necessary because the bare 5 Hz quantisation would coarsen $f_\gamma$ across the six conditions; on a well-isolated peak the interpolation error is $O((\Delta f)^3)$.
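
The interpolation formula above can be sketched as a small function. The name `parabolic_peak` and the edge-case fallbacks are assumptions for illustration; the test spectrum is an exact parabola, chosen because the three-point formula recovers its vertex exactly.

```python
def parabolic_peak(freqs, power):
    """Refine the argmax of a sampled spectrum with a three-point parabola."""
    k = max(range(len(power)), key=power.__getitem__)
    if k == 0 or k == len(power) - 1:
        return freqs[k]  # peak on the edge: no neighbour on one side, keep the bin centre
    y0, y1, y2 = power[k - 1], power[k], power[k + 1]
    denom = y0 - 2.0 * y1 + y2
    if denom == 0.0:
        return freqs[k]  # degenerate (flat) parabola, keep the bin centre
    delta_f = freqs[k + 1] - freqs[k]
    return freqs[k] + 0.5 * (y0 - y2) / denom * delta_f


# Synthetic check (assumption): bins 5 Hz apart, parabola with its vertex at 37 Hz.
freqs = [30.0, 35.0, 40.0, 45.0]
power = [100.0 - (f - 37.0) ** 2 for f in freqs]
f_peak = parabolic_peak(freqs, power)
print(f_peak)
```

On real spectra the peak is only approximately parabolic, so the recovered frequency is an estimate rather than an exact vertex.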

Single-trial rasters

Figure 4. Single-trial rasters across the τ_GABA sweep
Six stacked raster panels for τ_GABA values 4.5, 6, 9, 12, 18, 27 ms. Each panel shows E spikes (black) and I spikes (red) across the first 100 ms of one MNIST trial. The gamma cycle period grows monotonically with τ_GABA: bursts every ~17 ms at 4.5 ms; every ~63 ms at 27 ms.

One MNIST trial through each network. Cycle period stretches from ≈ 17 ms ($r_E = 14.5$ Hz) to ≈ 63 ms ($r_E = 4.6$ Hz). The eye and the spectrum agree.

The affine law

The shape $r_E = a + p \cdot f_\gamma$ is predicted by cycle dynamics, not curve-fitted. Within one cycle of duration $1/f_\gamma$, a fraction $p$ of E cells emits exactly one spike (those nearest threshold when the I shunt drops); the rest are still recovering. The cyclic per-cell rate is $p \cdot f_\gamma$. At long $\tau_\text{GABA}$ the I conductance never fully decays and the cycle dissolves into a tonic bath, leaving a feedforward baseline $a$ independent of $f_\gamma$:

$$r_E = \underbrace{a}_{\text{feedforward baseline}} + \underbrace{p \cdot f_\gamma}_{\text{cyclic contribution}}.$$

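
The cycle-counting argument can be checked with a toy simulation. This is a sketch under stated assumptions, not the trained network: each cell fires once per cycle independently with probability $p$, and the baseline is modelled as an independent Poisson process of rate $a$. The function name, the cell count, and the long simulated duration (chosen only to reduce sampling noise) are hypothetical; $a$ and $p$ are taken from the reported fit.

```python
import random


def simulate_rate(f_gamma, p, a, duration=100.0, n_cells=100, seed=0):
    """Mean per-cell rate (Hz) from per-cycle Bernoulli(p) spikes plus a Poisson baseline."""
    rng = random.Random(seed)
    n_cycles = int(f_gamma * duration)  # number of gamma cycles in `duration` seconds
    total_spikes = 0
    for _ in range(n_cells):
        # Cyclic contribution: at most one spike per cycle, emitted with probability p.
        total_spikes += sum(rng.random() < p for _ in range(n_cycles))
        # Baseline contribution: Poisson process of rate a, built from exponential gaps.
        t = rng.expovariate(a)
        while t < duration:
            total_spikes += 1
            t += rng.expovariate(a)
    return total_spikes / (n_cells * duration)


# Compare the simulated rate with a + p * f_gamma at three illustrative frequencies.
rates = {f: simulate_rate(f, p=0.216, a=0.76) for f in (15.0, 35.0, 55.0)}
for f, r in rates.items():
    print(f, round(r, 2), round(0.76 + 0.216 * f, 2))
```

Under these assumptions the simulated rates should fall on the affine line up to sampling noise, which is the sense in which the slope reads as per-cycle participation and the intercept as a rhythm-independent baseline.
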
Figure 5. Retrained τ_GABA sweep — post-training E rate tracks measured f_γ
Top panel: scatter of mean post-training E rate (Hz) vs measured gamma frequency f_γ (Hz). Six clusters, one per τ_GABA value (annotated 4.5–27 ms), three seeds each. Error bars in both dimensions. An affine fit line r_E = 0.76 + 0.216 · f_γ passes through every error bar with R² = 0.997. Bottom panel: test accuracy vs same x-axis — flat at 81–89% across the entire f_γ range. f_γ on the x-axis is the per-trial PSD peak median (see Figure 2).

Top: mean post-training E rate vs $f_\gamma$, six clusters × three seeds, error bars from seed variance. The affine fit passes through every error bar. Bottom: per-cluster accuracy, which is flat. The rate change is not paid in classification.

Two interpretations come for free:

  • $p$ is per-cycle participation. At $\tau_\text{GABA} = 9$ ms: $(r_E - a)/f_\gamma = 7.8/35 = 0.22$, matching the fit. nb024’s per-cell distribution gives 0.21 from a different angle.
  • $a$ is the non-rhythmic baseline. At $f_\gamma \to 0$ the fit extrapolates to ≈ 0.8 Hz, the no-rhythm regime that nb042 probes directly. The intercept is a prediction, not a nuisance term.

Discussion

nb037 showed rate followed the rhythm under inference-time mutation. nb041 shows the optimiser can’t escape: every retrained network sits on the same affine line.

Cortical prediction

The derivation depends only on the cycle structure produced by the recurrent E↔I loop, not on the task or training substrate. To the extent that cortical pyramidal–interneuron circuits implement the same kind of self-clocking cycle, the law should hold in vivo:

Pyramidal-cell mean firing rate should be linearly predictable from local gamma peak frequency: $r_E \approx a + p \cdot f_\gamma$, with $p \approx 0.1$–$0.2$ and a small non-rhythmic baseline $a$.

Testable in any awake-cortex recording with simultaneous gamma + pyramidal-rate measurement. Falsifiable three ways:

  • Slope: if $p$ is consistently outside 0.05–0.3, per-cycle participation is the wrong interpretation.
  • Linearity: if pyramidal rates track gamma power but are not affine in gamma frequency, the cycle-counting picture is wrong.
  • Direction: shortening $\tau_\text{GABA}$ (some benzodiazepine washouts, specific GABA-A manipulations) should raise pyramidal rates, because a shorter $\tau_\text{GABA}$ raises $f_\gamma$. This is counter-intuitive but a direct consequence of the law.

This assumes adult-cortex-like biophysics ($\tau_\text{AMPA} = 2$ ms, $\tau_\text{GABA} \in 5$–$30$ ms, E:I ≈ 4:1). If cortical gamma is generated by a different mechanism (ING, pacemaker-driven), the law need not transfer, so the prediction also tests whether cortical gamma is PING-like.

Next steps

This notebook is the quantitative-law leg of ar010’s pre-shipping chain. Combined with nb042’s rhythm-locking and dimensional-collapse results, it completes the structural-bound argument.