TL;DR

Dynamic 3D Gaussian Splatting overfits by 6.18 dB on average on D-NeRF. A systematic ablation traces >80% of this gap to the split operation of Adaptive Density Control. Across 9 ablation conditions we see a log-linear count–gap correlation (r = 0.995). Then EER—a k-NN elastic-strain penalty on per-Gaussian deformation—breaks this correlation: it reduces the gap by 40.8% while increasing the Gaussian count by 85%. Our full combination closes 57.4% of the baseline gap.

  • 6.18 dB: baseline gap (8 D-NeRF scenes)
  • 99.72%: EER strain reduction (8-scene mean)
  • 40.8%: EER gap reduction (D-NeRF)
  • 15.9%: EER gap reduction (HyperNeRF, real)
  • 57.4%: gap closed by the full combination
Count vs gap: EER breaks the log-linear correlation.
The count–gap paradigm shift. Ablations (gray) follow a log-linear trend (r = 0.995, bootstrap 95% CI [0.993, 1.000]). EER (green) uses more Gaussians yet overfits less. The correlation holds within 41 non-EER configurations (r = 0.987) — EER is the only lever we found that breaks it.

Abstract

Dynamic 3D Gaussian Splatting achieves impressive novel-view synthesis on monocular video by coupling a deformable point cloud with Adaptive Density Control (ADC), but exhibits a severe train–test generalization gap. On the D-NeRF benchmark (8 synthetic scenes) we measure an average gap of 6.18 dB (up to 11 dB per scene) and, through a systematic ablation of every ADC sub-operation (split, clone, prune, frequency, threshold, schedule), identify splitting as the dominant pathway.

Our central finding is that Elastic Energy Regularization (EER)—an isotropic k-NN penalty on the relative deformation of neighboring Gaussians—breaks the log-linear count–gap correlation observed across ablations. This reframes overfitting from a capacity problem to an incoherent deformation problem. We evaluate 48 configurations spanning four axes of control (capacity, deformation complexity, view-dependent encoding, stochastic regularization); capacity control and coherence regularization compound, and GAD+LogiGrow+PTDrop+EER closes 57.4% of the baseline gap.

All findings are on synthetic D-NeRF scenes; real-world validation (HyperNeRF, Deformable-3DGS cross-architecture) is partial and still in progress — see the cross-architecture section.

Key Findings

1. Split drives >80% of overfitting

Disabling split collapses both the cloud (2K vs 44K Gaussians) and the gap (1.15 dB vs 6.18 dB). Disabling pruning changes nothing.

2. Count–gap correlation is real but incomplete

r = 0.995 on 9 ablation conditions, holding within both sub-clusters (r = 0.998 on high-count, 0.95 on low-count) and across 41 non-EER configurations (r = 0.987).

3. EER breaks the correlation

+85% Gaussians, −40.8% gap. At the per-Gaussian level, EER reduces deformation strain by 99.6% on Lego, 99.8% on T-Rex, 99.6% on Hellwarrior.

4. Orthogonal axes compound

GAD+EER = 48.2% reduction. Adding LogiGrow + PTDrop = 57.4%, the only configuration in our sweep to more than halve the gap.

Method ranking across 48 configurations

Method ranking by gap reduction.
Every EER-containing method dominates. View-dependent regularization (ChromReg, OEM) has no meaningful effect.

Pareto frontier: quality vs overfitting

Pareto frontier.
GAD+EER and the full combination define the Pareto front. No non-EER combination exceeds 25% gap reduction.

Ablation summary

Ablation summary bar chart.
Left: test PSNR (quality). Right: train–test gap (overfitting). A1/A2 kill the gap but destroy quality; A3 (no clone) is the best ablation trade-off; A4 (no prune) is irrelevant.

Gap grows with training, not with iterations alone

Overfitting gap over training iterations.
Train–test PSNR gap over training (mean ± std across 8 scenes). Baseline grows to ~6 dB; disabling split holds it at ~1 dB. The divergence tracks the densification window (iters 500–15,000).

Why early stopping fails: densification is front-loaded

Front-loaded densification bar chart.
84–89% of cloud growth happens before iter 7,500. Stopping densification at iter 7,500 (A6) only trims the count by 10% and has essentially no effect on the gap — confirming that mitigation must modulate densification from the start, not truncate it at the end.

Dose–response across all methods

Dose-response curves for all methods.
Each panel sweeps a method's strength parameter. EER shows the steepest dose–response; ChromReg and OEM are essentially flat, confirming that view-dependent regularization is not the right axis.

Method Taxonomy

We organize the 8 mitigation methods along four axes of control:

Capacity (how many Gaussians)

  • GAD — BIC-motivated adaptive threshold
  • LogiGrow — Verhulst logistic carrying capacity
  • SGD — spectral gating on loss FFT

Deformation coherence (how deformations behave)

  • EER ★ — elastic strain energy on k-NN graph
  • STSR — H¹ Sobolev on the deformation in time

View-dependent encoding

  • ChromReg — penalize high-degree SH coefficients
  • OEM — opacity entropy maximization

Stochastic regularization

  • PTDrop — temporal-consistency-weighted dropout

GAD: a BIC-motivated threshold schedule

We adapt the per-iteration gradient threshold as

τ_GAD(t) = τ_base · (1 + λ · K(t) / (N · Δℓ_ema(t)))

where K(t) is the current Gaussian count, N is the number of training pixels, and Δℓ_ema is an EMA of the per-iteration loss improvement. λ is the single tunable knob. The mapping from BIC to this formula is a heuristic (see paper, §6.2); the empirical diminishing-returns exponent we measure (α ≈ 0.04) is too mild to justify the often-quoted O((N/λ)^(1/4)) growth bound, so we present the bound qualitatively as "sublinear in N".
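As a concrete reading of the schedule, here is a minimal sketch. Only the formula itself comes from the text above; the helper names, the EMA update rule, and the floor on Δℓ_ema are our assumptions:

```python
def ema_update(prev, value, beta=0.99):
    """Exponential moving average of the per-iteration loss improvement."""
    return beta * prev + (1.0 - beta) * value

def gad_threshold(tau_base, K, N, dloss_ema, lam=1.0, floor=1e-12):
    """tau_GAD(t) = tau_base * (1 + lam * K(t) / (N * dloss_ema(t))).

    The threshold rises as the cloud grows (K up) or as the loss stops
    improving (dloss_ema down), so densification self-limits.
    """
    return tau_base * (1.0 + lam * K / (N * max(dloss_ema, floor)))
```

With λ = 1 and a stalling loss (Δℓ_ema → 0) the threshold grows without bound, which is exactly the mechanism that shuts densification off.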

EER: k-NN elastic strain energy

For a subset of Gaussians i and their k=8 canonical neighbors j, we penalize

EER = mean_{i,j} ‖u(x_i, t) − u(x_j, t)‖² / (‖x_i − x_j‖² + ε)

where u(x, t) is the deformation offset at time t. This is the discrete elastic strain — physically the correct choice for linear elasticity, where the energy depends on the strain ∂u/∂x, not the displacement u. In canonical space the k-NN graph is stable; we rebuild it every 500 iterations and apply a cosine ramp from iteration 3K to 10K.
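A minimal NumPy sketch of the penalty. The k=8 default and the 500-iteration rebuild mirror the text; the brute-force neighbor search and function names are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def knn_graph(x, k=8):
    """Indices of the k nearest canonical neighbors of each Gaussian.

    Brute-force O(N^2) search; in practice the graph is rebuilt every
    500 iterations on the canonical positions, which do not deform.
    """
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)  # (N, N) squared dists
    np.fill_diagonal(d2, np.inf)                          # exclude self-matches
    return np.argsort(d2, axis=1)[:, :k]                  # (N, k) neighbor ids

def eer_penalty(x, u, nbrs, eps=1e-8):
    """mean_{i,j} ||u_i - u_j||^2 / (||x_i - x_j||^2 + eps): discrete strain."""
    du2 = ((u[:, None, :] - u[nbrs]) ** 2).sum(-1)  # (N, k) relative deformation
    dx2 = ((x[:, None, :] - x[nbrs]) ** 2).sum(-1)  # (N, k) canonical distances
    return float((du2 / (dx2 + eps)).mean())
```

A rigid translation (identical u for every Gaussian) incurs zero penalty, which is the point: EER charges only for relative motion between neighbors, never for motion itself.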

Interactive 3D Deformation Viewer

Explore the deformation field in 3D. Left panel: baseline (incoherent per-Gaussian deformation). Right panel: EER (coherent elastic deformation). Use the time slider to animate — watch how baseline Gaussians scatter chaotically at novel timesteps while EER maintains spatial coherence. Drag to orbit; scroll to zoom. Cameras are linked between panels.

12,000 highest-opacity Gaussians per scene, 11 timesteps (t=0.0 to 1.0). Color by displacement magnitude (viridis) or strain (inferno). Requires serving via HTTP (python -m http.server 8000).

What EER Actually Does to the Deformation Field

For every D-NeRF scene, we load the trained 4DGS model, query the per-Gaussian deformation at 4 timesteps, and plot the distribution of per-Gaussian strain ε_i = mean_j ‖u_i − u_j‖² / ‖x_i − x_j‖² over its 8 canonical neighbors.

Lego deformation field.
Lego: strain ↓ 99.62%
T-Rex deformation field.
T-Rex: strain ↓ 99.80%
Hellwarrior deformation field.
Hellwarrior: strain ↓ 99.58%
Bouncing-balls deformation field.
Bouncing-balls: strain ↓ 99.90%
Jumping-jacks deformation field.
Jumping-jacks: strain ↓ 99.84%
Stand-up deformation field.
Stand-up: strain ↓ 99.82%
Mutant deformation field.
Mutant: strain ↓ 99.64%
Hook deformation field.
Hook: strain ↓ 99.59%

Each panel shows (left) canonical cloud colored by displacement magnitude, (middle) a subsampled quiver of u(x, t=0.5), (right) the per-Gaussian strain histogram. Baseline is bimodal with heavy tails; EER collapses the distribution by two orders of magnitude. This is the direct mechanism behind EER's overfitting reduction.

Strain reduction on every scene

Scene            Baseline ε   EER ε     Reduction
bouncingballs    2.835        0.00296   99.90%
hellwarrior      5.785        0.02408   99.58%
hook             2.627        0.01090   99.59%
jumpingjacks     6.772        0.01106   99.84%
lego             1.573        0.00594   99.62%
mutant           1.323        0.00481   99.64%
standup          3.686        0.00667   99.82%
trex             3.715        0.00738   99.80%
mean (n=8)       3.539        0.00922   99.72%

Measured at iter 20,000 on trained 4DGS checkpoints. Strain ε is the mean over k=8 canonical neighbors of ‖u_i − u_j‖² / ‖x_i − x_j‖², averaged over 4 timesteps (t = 0, 0.25, 0.5, 0.75).

EER: The Paradigm Shift

EER three-panel analysis.
(a) EER λ sweep: consistent gap reduction across scenes. (b) EER increases final Gaussian count — the reverse of capacity control. (c) Per-scene gap reduction: consistent across all 8 scenes, including the pathological Lego and Hellwarrior.
Combination additivity plot.
Combinations are super-additive: GAD+EER exceeds the sum of individual reductions, confirming capacity and coherence target orthogonal failure modes.

Real-World Validation (HyperNeRF)

EER transfers to real monocular video. On HyperNeRF chickchicken, with 4DGS and the same λ=0.05 tuned on D-NeRF, EER reduces every generalization-gap metric:

Metric            Baseline   EER λ=0.05   Reduction
PSNR gap (dB)     5.48       4.61         15.9%
SSIM gap          0.067      0.051        23.7%
LPIPS gap         0.030      0.020        33.4%
Test PSNR (dB)    26.42      26.22        −0.20 dB

4DGS on HyperNeRF chickchicken, 14K iterations (stock HyperNeRF config), single run on an RTX 3070. The same λ=0.05 used on D-NeRF transfers directly — no per-dataset tuning required. Multi-scene HyperNeRF + iPhone + Nerfies are left as future work.

Cross-Architecture Validation (Deformable-3DGS)

Main experiments are on 4DGS (HexPlane deformation). We ported EER and GAD to Deformable-3DGS (MLP deformation) and ran baseline + EER on three D-NeRF scenes for 20K iterations.

Phase 1: direct-transfer test at D-NeRF-tuned λ=0.05

Scene         Baseline gap   EER λ=0.05 gap   Reduction   ΔPSNR
lego          13.15 dB       13.56 dB         −3.1%       −0.02 dB
trex          1.50 dB        1.81 dB          −20.8%      −0.38 dB
hellwarrior   4.08 dB        3.87 dB          +5.2%       −0.22 dB

Direct transfer at λ=0.05 is poor (mean −6% reduction). Why? Deformable-3DGS trains with L1 + 0.2·(1−SSIM) vs. 4DGS's pure L1 — the loss magnitude is roughly 3× larger, so λ=0.05 is under-regularized. Our dimensional-analysis note (paper §6.2) predicts the correct λ for Deformable-3DGS is ≈ 0.15–0.30. Testing this directly:

Phase 2: λ sweep on Deformable-3DGS Lego (dimensional-analysis test)

λ              Gap (dB)   Train PSNR   Test PSNR   ΔTest    Reduction
0 (baseline)   13.15      38.38        25.23       —        —
0.05           13.56      38.77        25.21       −0.02    −3.1%
0.15           10.23      35.55        25.33       +0.10    +22.3%
0.30           8.26       33.60        25.34       +0.11    +37.2%
0.60           7.82       33.21        25.39       +0.16    +40.6%

Clean monotonic dose-response. At every λ ≥ 0.15, EER simultaneously reduces the gap by 22–41% AND slightly improves test PSNR — a rare regularizer that gives you both. The coherence mechanism transfers across deformation architectures; the hyperparameter requires per-architecture calibration, exactly as the dimensional-analysis note predicted.
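The per-architecture recalibration can be sketched as a simple loss-magnitude rescaling (a heuristic reading of the dimensional-analysis note; the function name and the rough 3× loss ratio are assumptions taken from the discussion above):

```python
def rescale_eer_lambda(lam_src, loss_scale_src, loss_scale_tgt):
    """Heuristic: keep EER's relative strength constant by scaling lambda
    with the typical magnitude of the photometric loss it competes with."""
    return lam_src * (loss_scale_tgt / loss_scale_src)
```

With 4DGS's pure L1 normalized to 1 and Deformable-3DGS's L1 + 0.2·(1−SSIM) roughly 3× larger, λ = 0.05 maps to 0.15, the low end of the predicted 0.15–0.30 range.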

BibTeX

@article{droby2026monodygs,
  author  = {Ahmad Droby},
  title   = {Monocular Dynamic Gaussian Splatting Overfits:
             A Diagnostic Study of Densification in 4D Gaussian Fields},
  journal = {arXiv preprint},
  year    = {2026}
}