Abstract: An example apparatus for super resolution imaging includes a convolutional neural network to receive a low resolution frame and generate a high resolution luminance component frame. The apparatus also includes a hardware scaler to receive the low resolution frame and generate a high resolution chrominance component frame. The apparatus further includes a combiner to combine the high resolution luminance component frame and the high resolution chrominance component frame to generate a high resolution frame.
Description
Related Application
[0001] This application is a divisional of India Patent Application No. 202247041134, filed on 19 July 2022, entitled “SUPER RESOLUTION USING CONVOLUTIONAL NEURAL NETWORK”.
Background
[0002] Super-resolution imaging (SR) is a class of techniques that increase the resolution of images processed by an imaging system. For example, low resolution images may be converted into high resolution images with improved details using various SR techniques.
Brief Description of the Drawings
[0003] Fig. 1 is a block diagram illustrating an example system for super resolution using a scalable neural network;
[0004] Fig. 2 is a block diagram illustrating an example scalable convolutional neural network for super resolution;
[0005] Fig. 3 is a flow chart illustrating an example system for training a scalable convolutional neural network for super resolution;
[0006] Fig. 4 is a pair of graphs showing a replacement of a ReLU activation function with a PReLU activation function;
[0007] Fig. 5 is a block diagram illustrating an example system for training a scalable convolutional neural network for super resolution with a self-similarity loss;
[0008] Fig. 6 is a process flow diagram illustrating a method for training a scalable convolutional neural network for super resolution;
[0009] Fig. 7 is a process flow diagram illustrating a method for training a scalable convolutional neural network for super resolution with a self-similarity loss;
[0010] Fig. 8 is a process flow diagram illustrating a method for super resolution using a scalable neural network;
[0011] Fig. 9 is a process flow diagram illustrating a method for super resolution using a scalable neural network with PReLU activation;
[0012] Fig. 10 is block diagram illustrating an example computing device that can execute super resolution using a scalable neural network; and
[0013] Fig. 11 is a block diagram showing computer readable media that store code for performing super resolution using a scalable neural network.
[0014] The same numbers are used throughout the disclosure and the figures to reference like components and features. Numbers in the 100 series refer to features originally found in Fig. 1; numbers in the 200 series refer to features originally found in Fig. 2; and so on.
Description of the Embodiments
[0015] Deep learning based super resolution may be used to restore low resolution images and video frames to high resolution images and video frames. Currently, deep learning based methods may conduct training based on pairs of low and high resolution images obtained by certain downsampling techniques. For example, a conventional super resolution technique may use low resolution images downscaled with a bicubic filter. Some blind super resolution systems may further improve this downscaling process by combining the bicubic filter with Gaussian smoothing using multiple kernels. This kind of training process may work for nature content. However, in screen or gaming content, severe overshoot and undershoot artifacts may be observed after the upscaling of sharp edges. As used herein, overshoot artifacts are artifacts that appear as spurious bands or “ghosts” near edges. Overshoot artifacts may also be referred to as ringing artifacts. Nature content is video containing camera-captured video scenes. For example, nature content may contain fewer sharp edges. Screen content is video containing a significant portion of rendered graphics (excluding games), text, or animation rather than camera-captured video scenes. Gaming content is video containing a significant portion of rendered game graphics.
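The bicubic-plus-Gaussian degradation used to build low/high resolution training pairs can be sketched as follows. This is a minimal illustration assuming Pillow is available; the function name `make_training_pair` and the specific blur sigma are illustrative assumptions, not taken from the source.

```python
import numpy as np
from PIL import Image, ImageFilter

def make_training_pair(hr: Image.Image, scale: int = 2, blur_sigma: float = 0.8):
    """Degrade a high resolution frame into its low resolution counterpart:
    optional Gaussian smoothing (as in blind SR pipelines) followed by
    bicubic downscaling. Returns the (low, high) pair used for training."""
    blurred = hr.filter(ImageFilter.GaussianBlur(radius=blur_sigma))
    lr = blurred.resize((hr.width // scale, hr.height // scale), Image.BICUBIC)
    return lr, hr

# Synthetic 64x64 "screen content" frame with one sharp vertical edge,
# the kind of content where overshoot artifacts tend to appear.
arr = np.zeros((64, 64, 3), dtype=np.uint8)
arr[:, 32:] = 255
lr, hr = make_training_pair(Image.fromarray(arr), scale=2)
print(lr.size)  # (32, 32)
```

In a blind SR setup, `blur_sigma` would be sampled from multiple kernels rather than fixed, so the network sees a range of degradations.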
[0016] For deep learning based super resolution, two approaches are sometimes used to achieve higher quality output. In the first approach, deep convolutional networks may be used as a post-processing model after a traditional scaler to enhance details of images and video resized by conventional methods such as bilinear, bicubic, or Lanczos filters. However, this may introduce a large computation workload on an inference device, especially when the input resolution of the images or videos is high. The second approach is to directly take a low resolution image or video frame as input, and then use a convolutional network to restore the details of the high resolution image. For example, the convolutional network can first apply a series of neural network layers to the low resolution video frames to extract important feature maps used to restore high resolution details. After that, a dedicated neural network layer may upscale the low resolution feature maps to a high resolution output. In this way, part of the workload can be shifted to low resolution features. Shifting the workload in this manner may reduce the computation and bandwidth overhead compared with the first approach, as most of the compute is conducted on low resolution frames instead of high resolution frames.
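The dedicated upscaling layer in the second approach is often realized as a sub-pixel (depth-to-space) rearrangement of channels; the source does not name a specific layer, so the numpy sketch below shows one common possibility under that assumption.

```python
import numpy as np

def depth_to_space(feat: np.ndarray, scale: int) -> np.ndarray:
    """Rearrange a (C*scale^2, H, W) low resolution feature map into a
    (C, H*scale, W*scale) high resolution output (sub-pixel upscaling)."""
    c2, h, w = feat.shape
    c = c2 // (scale * scale)
    x = feat.reshape(c, scale, scale, h, w)  # split channels into (C, s, s)
    x = x.transpose(0, 3, 1, 4, 2)           # interleave: (C, H, s, W, s)
    return x.reshape(c, h * scale, w * scale)

# All convolutions run on the 32x32 grid; only this final rearrangement
# produces the 64x64 output, keeping most compute at low resolution.
lr_features = np.random.rand(4, 32, 32)      # 1 output channel, 2x upscale
hr = depth_to_space(lr_features, scale=2)
print(hr.shape)  # (1, 64, 64)
```

Because the rearrangement is a pure memory reshuffle, the per-pixel cost of the network scales with the low resolution size rather than the output size.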
Claims:
1. A computing platform comprising:
interface circuitry;
instructions; and
at least one programmable circuit to be programmed based on the instructions to:
generate, with a first convolutional layer, first data based on a luminance component of an input video frame, the input video frame having a first resolution;
generate, with a second convolutional layer, second data based on the first data from the first convolutional layer;
combine the first data from the first convolutional layer and the second data from the second convolutional layer to determine third data; and
generate, with a third convolutional layer, a luminance component of an output video frame based on the third data, the output video frame having a second resolution different from the first resolution.
2. The computing platform of claim 1, wherein the second resolution is higher than the first resolution.
3. The computing platform of claim 1, wherein one or more of the at least one programmable circuit is to:
generate chrominance components of the output video frame based on chrominance components of the input video frame; and
combine the luminance component of the output video frame and the chrominance components of the output video frame to generate a combined output video frame.
4. The computing platform of claim 3, wherein one or more of the at least one programmable circuit is to perform a scaling operation to generate the chrominance components of the output video frame based on the chrominance components of the input video frame.
5. The computing platform of claim 1, wherein one or more of the at least one programmable circuit is to train at least one of the first convolutional layer, the second convolutional layer or the third convolutional layer based on a self-similarity loss.
6. The computing platform of claim 5, wherein one or more of the at least one programmable circuit is to compute the self-similarity loss based on a reconstructed high-resolution frame and a corresponding high-resolution ground truth frame.