The deep image prior (DIP) shows that a randomly initialized network with a suitable architecture can solve inverse imaging problems simply by optimizing its parameters to reconstruct a single degraded image. However, the prior exploited by vanilla DIP rests on basic local convolutions, which bounds the performance of inverse imaging tasks by the generative capacity of the model. Image content depends not only on neighboring pixels but also on global color features and spatial distribution, and purely local convolutions cannot capture fine-grained detail at that scale. Moreover, although DIP is unsupervised, it requires many iterations to learn the inverse mapping, which consumes computational power and makes naive global attention impractical. To address these problems, this article explores an efficient global prior module, a tri-directional multi-head self-attention mechanism, which learns pixel-wise correlations along three directions: horizontal, vertical, and channel-wise. We observe that global learning effectively enhances the detail of edge pixels, yielding more vivid images and clearer textures, and that tri-directional multi-head self-attention can efficiently substitute for the global perception ability of pixel-level self-attention. We demonstrate that global learning improves reconstruction quality on inverse imaging problems and enhances texture edge information, while tri-directional multi-head self-attention alleviates the computational redundancy of pixel-level self-attention, enabling efficient, high-quality inverse imaging. The principle of the method lies in global feature capture and efficient attention modeling, striking a balance between detail fidelity and computational practicality.
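The abstract does not specify the exact implementation, but the tri-directional idea can be sketched as follows: instead of full pixel-level attention over all H×W positions (quadratic in the pixel count), apply scaled dot-product self-attention separately along rows, columns, and channels, then fuse the results. The function names, the identity Q/K/V projections, the single head, and the averaging fusion below are illustrative assumptions, not the paper's stated design.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax.
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def seq_attention(tokens):
    """Scaled dot-product self-attention over the second-to-last axis.

    tokens: (..., L, d) -> (..., L, d).
    Identity Q/K/V projections and a single head, kept minimal;
    a real module would use learned projections and multiple heads.
    """
    d = tokens.shape[-1]
    scores = tokens @ np.swapaxes(tokens, -1, -2) / np.sqrt(d)  # (..., L, L)
    return softmax(scores, axis=-1) @ tokens

def tri_directional_attention(x):
    """x: feature map of shape (H, W, C).

    Attends along width (horizontal), height (vertical), and channels,
    then averages. Cost is O(H*W^2 + W*H^2 + C^2*H*W) rather than the
    O((H*W)^2) of full pixel-level self-attention.
    """
    H, W, C = x.shape
    h_att = seq_attention(x)                                        # rows
    v_att = np.swapaxes(seq_attention(np.swapaxes(x, 0, 1)), 0, 1)  # columns
    c_tokens = x.reshape(H * W, C).T                                # (C, H*W)
    c_att = seq_attention(c_tokens).T.reshape(H, W, C)              # channels
    return (h_att + v_att + c_att) / 3.0                            # simple fusion
```

A constant feature map is a fixed point of this operator (attention weights become uniform and every token equals the mean), which is a quick sanity check that the fusion preserves scale.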