Generative Modeling of Weights:
Generalization or Memorization?

Princeton University, University of Pennsylvania

Abstract

Generative models, with their success in image and video generation, have recently been explored for synthesizing effective neural network weights. These approaches take trained neural network checkpoints as training data, and aim to generate high-performing neural network weights during inference. In this work, we examine four representative methods on their ability to generate novel model weights, i.e., weights that are different from the checkpoints seen during training. Surprisingly, we find that these methods synthesize weights largely by memorization: they produce either replicas, or at best simple interpolations, of the training checkpoints. Current methods fail to outperform simple baselines, such as adding noise to the weights or taking a simple weight ensemble, in obtaining different and simultaneously high-performing models. We further show that this memorization cannot be effectively mitigated by modifying modeling factors commonly associated with memorization in image diffusion models, or applying data augmentations. Our findings provide a realistic assessment of what types of data current generative models can model, and highlight the need for more careful evaluation of generative models in new domains.

Background: Generative Modeling of Weights

Building on the success of generative models in image and video synthesis, recent studies have applied them to synthesize weights for neural networks. These methods collect network checkpoints trained with standard gradient-based optimization, apply generative models to learn the weight distributions, and produce new checkpoints that often perform comparably to conventionally trained weights.


To understand the fundamental mechanisms and practicality of these methods, we ask:
have the generative models learned to produce distinct weights that generalize beyond the training checkpoints,
or do they merely memorize and reproduce the training data?

We analyze four representative methods, covering different types of generative models and downstream tasks:


Hyper-Representations

Hyper-Representations trains an autoencoder on classification model weights collected from different training runs of an identical architecture, fits a kernel density estimate (KDE) to their latents, and decodes samples from the fitted distribution into new weights.
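Below is a minimal sketch of this pipeline in PyTorch and scikit-learn. The flattened checkpoints, layer sizes, and plain MLP autoencoder are illustrative assumptions; the actual method uses a more elaborate autoencoder over tokenized weights.

# Sketch of the Hyper-Representations pipeline: autoencode flattened weights,
# fit a KDE over the latents, then decode KDE samples into new weights.
# Shapes and the MLP autoencoder are assumptions for illustration.
import torch
import torch.nn as nn
from sklearn.neighbors import KernelDensity

num_ckpts, weight_dim, latent_dim = 1000, 2464, 64
checkpoints = torch.randn(num_ckpts, weight_dim)  # stand-in for real trained weights

encoder = nn.Sequential(nn.Linear(weight_dim, 512), nn.ReLU(), nn.Linear(512, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 512), nn.ReLU(), nn.Linear(512, weight_dim))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

# 1) Train the autoencoder to reconstruct flattened checkpoints.
for step in range(1000):
    batch = checkpoints[torch.randint(num_ckpts, (64,))]
    loss = nn.functional.mse_loss(decoder(encoder(batch)), batch)
    opt.zero_grad(); loss.backward(); opt.step()

# 2) Fit a kernel density estimate over the learned latents.
with torch.no_grad():
    latents = encoder(checkpoints).numpy()
kde = KernelDensity(kernel="gaussian", bandwidth=0.1).fit(latents)

# 3) Sample latents from the KDE and decode them into candidate weight vectors.
sampled = torch.from_numpy(kde.sample(16)).float()
with torch.no_grad():
    generated_weights = decoder(sampled)  # (16, weight_dim)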

G.pt

G.pt is a conditional diffusion model trained on checkpoints from tens of thousands of runs. It generates weights for a small predefined model, given initial weights and a target loss.

HyperDiffusion

HyperDiffusion is an unconditional diffusion model trained on neural field MLPs representing 3D shapes. It generates new weights from which meshes can be reconstructed.

P-diff

P-diff trains an unconditional latent diffusion model on 300 checkpoints saved at consecutive steps during an extra training epoch of a base classification model, after it has converged.
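As a sketch of how such a checkpoint dataset could be collected (the model, data loader, loss, and optimizer below are hypothetical placeholders; the 300-step count follows the description above):

# Sketch: save flattened weights at consecutive optimization steps of an
# already-converged classifier, yielding p-diff-style training checkpoints.
import torch

def flatten_weights(model):
    # Concatenate all parameters into a single 1-D weight vector.
    return torch.cat([p.detach().flatten() for p in model.parameters()])

def collect_checkpoints(model, train_loader, loss_fn, optimizer, num_ckpts=300):
    checkpoints, data_iter = [], iter(train_loader)
    while len(checkpoints) < num_ckpts:
        try:
            x, y = next(data_iter)
        except StopIteration:
            data_iter = iter(train_loader)
            x, y = next(data_iter)
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
        checkpoints.append(flatten_weights(model))  # one checkpoint per step
    return torch.stack(checkpoints)                 # (num_ckpts, weight_dim)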


Memorization in Weight Space

A natural first step in evaluating the novelty of generated weights is to find the nearest training weights to each generated checkpoint, and check for replications in weight values.

Weight heatmap

We use heatmaps to visualize the model weights at randomly selected parameter indices. In each heatmap, the top row (outlined in red) is a random generated checkpoint, and the three rows below (separated by white lines) are the three nearest training checkpoints. We observe that for every generated checkpoint, at least one training checkpoint is nearly identical to it.
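A sketch of this check, assuming the flattened training and generated weights are stacked into hypothetical numpy arrays train_w of shape (N_train, D) and gen_w of shape (N_gen, D):

# Sketch: retrieve the 3 nearest training checkpoints (by L2 distance) for one
# generated checkpoint and plot a random subset of parameter indices as a heatmap.
import numpy as np
import matplotlib.pyplot as plt

def nearest_training(query, train_w, k=3):
    dists = np.linalg.norm(train_w - query[None, :], axis=1)
    order = np.argsort(dists)
    return order[:k], dists[order[:k]]

rng = np.random.default_rng(0)
idx = rng.choice(gen_w.shape[1], size=64, replace=False)  # random parameter indices

query = gen_w[0]                             # one generated checkpoint
nn_ids, _ = nearest_training(query, train_w)
rows = np.stack([query[idx]] + [train_w[i][idx] for i in nn_ids])

plt.imshow(rows, aspect="auto", cmap="coolwarm")
plt.yticks(range(4), ["generated"] + [f"train NN {i + 1}" for i in range(3)])
plt.xlabel("parameter index (random subset)")
plt.show()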

Distance to training weights

We visualize the distribution of the distance from each training and generated checkpoint to its nearest training checkpoint. For all methods except p-diff, the generated checkpoints are significantly closer to the training checkpoints than training checkpoints are to one another. This indicates that these methods produce models with lower novelty than training a new model from scratch.
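The distance computation itself is straightforward; a sketch with the same hypothetical train_w and gen_w arrays, where a training checkpoint's nearest neighbor excludes itself:

# Sketch: nearest-training-checkpoint distance for generated vs. training weights.
import numpy as np

def min_dist_to_train(queries, train_w, leave_one_out=False):
    # Pairwise L2 distances between every query and every training checkpoint.
    d = np.linalg.norm(queries[:, None, :] - train_w[None, :, :], axis=-1)
    if leave_one_out:
        np.fill_diagonal(d, np.inf)  # a training checkpoint is not its own neighbor
    return d.min(axis=1)

gen_to_train = min_dist_to_train(gen_w, train_w)
train_to_train = min_dist_to_train(train_w, train_w, leave_one_out=True)

print("median nearest-train distance, generated:", np.median(gen_to_train))
print("median nearest-train distance, training :", np.median(train_to_train))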


Memorization in Model Behaviors

Beyond similarity in weight space, we also compare the behaviors of generated models to the behaviors of their nearest training models, and assess whether generative modeling methods differ from a simple noise-addition baseline for creating new weights.

Model outputs

We show the decision boundaries or reconstructed 3D shapes of randomly selected generated checkpoints and their nearest training checkpoints. The generated and nearest training models produce highly similar predictions in image classification, or reconstruct to nearly identical 3D shapes. This suggests that generated weights also closely resemble training weights in model behaviors.
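One simple behavioral comparison for classifiers is the fraction of test examples on which a generated model and its nearest training model predict the same class. A sketch, with gen_model, train_model, and test_loader as hypothetical placeholders:

# Sketch: prediction agreement between a generated classifier and its nearest
# training classifier over a test set.
import torch

@torch.no_grad()
def prediction_agreement(gen_model, train_model, test_loader):
    gen_model.eval(); train_model.eval()
    agree, total = 0, 0
    for x, _ in test_loader:
        agree += (gen_model(x).argmax(dim=1) == train_model(x).argmax(dim=1)).sum().item()
        total += x.shape[0]
    return agree / total  # 1.0 means identical predictions on every test example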

Accuracy-novelty trade-off

We evaluate generated checkpoints by test accuracy (higher is better) for classification models and by point cloud distance to test shapes (lower is better) for neural fields. Novelty is measured by the maximum prediction-error similarity to any training checkpoint (lower is better) or by the point cloud distance to training shapes (higher is better). We compare against a simple baseline that adds noise to training weights. All methods except p-diff fail to outperform this baseline in obtaining novel and simultaneously high-performing models.
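The noise-addition baseline is simple to state: a "new" checkpoint is a training checkpoint plus Gaussian noise, and sweeping the noise scale traces out an accuracy-novelty curve. A sketch over the hypothetical train_w array:

# Sketch of the noise-addition baseline: perturb randomly chosen training
# checkpoints with Gaussian noise of a given scale.
import numpy as np

def noise_baseline(train_w, sigma, num_samples, seed=0):
    rng = np.random.default_rng(seed)
    base = train_w[rng.integers(train_w.shape[0], size=num_samples)]
    return base + sigma * rng.standard_normal(base.shape)

# Larger sigma gives more novelty (distance from training weights) but, past a
# point, lower accuracy; each sigma is one point on the trade-off curve.
for sigma in [0.0, 0.01, 0.05, 0.1]:
    perturbed = noise_baseline(train_w, sigma, num_samples=16)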


Understanding P-diff's Accuracy-Novelty Trade-off

Different from the other methods, p-diff's training checkpoints are saved at consecutive training steps rather than from different training runs. We seek to understand why p-diff can outperform the noise-addition baseline in the accuracy-novelty trade-off.

P-diff's generated weight values concentrate around the average of the training values. Since averaging the weights of models fine-tuned from the same base model is known to improve accuracy, the generated models may achieve higher accuracy by interpolating training weights.

Accuracy-novelty trade-off and t-SNE of weight values

We generate new models using two baselines ("averaged" and "gaussian") that approximate interpolations of training weights. We find that both the weight values and behaviors of the generated models closely match those of models from the interpolation baselines.
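One plausible instantiation of these two baselines, over the hypothetical train_w array of flattened p-diff checkpoints ("averaged" takes means of random subsets of training weights; "gaussian" samples each coordinate from a Gaussian fit to the training values):

# Sketch of two interpolation-style baselines over p-diff's training checkpoints.
import numpy as np

rng = np.random.default_rng(0)

def averaged_baseline(train_w, subset_size, num_samples):
    # Average a random subset of training checkpoints for each sample.
    out = []
    for _ in range(num_samples):
        idx = rng.choice(train_w.shape[0], size=subset_size, replace=False)
        out.append(train_w[idx].mean(axis=0))
    return np.stack(out)

def gaussian_baseline(train_w, num_samples):
    # Sample each weight coordinate from a Gaussian fit to the training values.
    mu, sigma = train_w.mean(axis=0), train_w.std(axis=0)
    return mu + sigma * rng.standard_normal((num_samples, train_w.shape[1]))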

Analysis

Impact of modeling factors on memorization

Using L2 distance as a proxy for the novelty of generated weights, and classification accuracy or minimum matching distance (MMD) to training shapes as measures of performance, we find that adjustments to modeling factors known to reduce memorization in image diffusion models do not alleviate the memorization issue in weight generation: none substantially improves novelty without degrading performance.

Weight space symmetries

Neural networks have symmetries: certain transformations (e.g., permutation and scaling) can be applied to the weights without changing the model's behavior. Among the four methods, G.pt and Hyper-Representations leverage permutation symmetry, but only as a form of data augmentation. We evaluate whether such augmentations provide meaningful benefits for generative modeling.
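As a concrete example of such a symmetry: permuting the hidden units of a two-layer MLP, together with the matching columns of the following layer, changes the weights but not the function the network computes.

# Example: a hidden-unit permutation of a two-layer MLP is function-preserving.
import torch
import torch.nn as nn

torch.manual_seed(0)
mlp = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
x = torch.randn(5, 8)
y_before = mlp(x)

perm = torch.randperm(16)
with torch.no_grad():
    mlp[0].weight.copy_(mlp[0].weight[perm])     # permute hidden-unit rows
    mlp[0].bias.copy_(mlp[0].bias[perm])
    mlp[2].weight.copy_(mlp[2].weight[:, perm])  # permute matching input columns

y_after = mlp(x)
print(torch.allclose(y_before, y_after, atol=1e-6))  # True: same function, new weights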

We apply function-preserving transformations to the training weights, and reconstruct both the original and transformed weights using the Hyper-Representations autoencoder. The resulting reconstructions differ substantially in accuracy and predictions. For reference, we report the average accuracy difference and prediction similarity between different untransformed training models ("original"). These results suggest that symmetry-based data augmentation alone is insufficient for the autoencoder to fully capture weight space symmetries.

We add 1, 3, and 7 random weight permutations as data augmentation for training HyperDiffusion, effectively enlarging the dataset by factors of ×2, ×4, and ×8, respectively. Even when we add only a single permutation, HyperDiffusion fails to produce meaningful shapes.

BibTeX

@article{zeng2025generative,
  title={Generative Modeling of Weights: Generalization or Memorization?},
  author={Boya Zeng and Yida Yin and Zhiqiu Xu and Zhuang Liu},
  journal={arXiv preprint arXiv:2506.07998},
  year={2025},
}