Input & Output of Image Processing Pipeline Core
This week, I worked on the schematic of the BRAM buffer and the demosaic algorithm core. As I went deeper into the design, I found that I was missing some necessary information, such as how the output of the demosaic core should be handled downstream and how the demosaic core would recognize its input signal. After searching through more references online, I found what cleared up my doubts: the data valid timing signal.
The Data Valid Timing Signal
Since all the timing signals are generated and controlled by the FPGA, the data valid signal acts as the data-ready indication: it marks the real start of the pixel data, skipping the optical blank signals, dummy pixels, and anything else that may exist in the CCD sensor output. For example, a CCD sensor may have 1640 (H) × 1214 (V) total pixels but only 1600 (H) × 1200 (V) active pixels, in which case the data valid signal is asserted only during those 1600 (H) × 1200 (V) pixels. This signal is crucial for the design because it tells the core exactly when the pixel data starts and ends.
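To make the idea concrete, here is a minimal Verilog sketch of how the data valid signal could be generated from a pair of pixel counters. The module name, port names, and especially the H_START and V_START offsets are assumptions made for illustration; the real position of the active window depends on the sensor datasheet.

```verilog
// Hypothetical data valid generator: counts every pixel clock over the
// full 1640 x 1214 frame and asserts data_valid only inside the
// 1600 x 1200 active window.
module data_valid_gen #(
    parameter H_TOTAL  = 1640,
    parameter V_TOTAL  = 1214,
    parameter H_ACTIVE = 1600,
    parameter V_ACTIVE = 1200,
    parameter H_START  = 20,   // assumed offset, check the sensor datasheet
    parameter V_START  = 7     // assumed offset, check the sensor datasheet
) (
    input  wire pclk,          // pixel clock
    input  wire rst,           // synchronous reset, active high
    output reg  data_valid
);
    reg [10:0] h_cnt;          // 11 bits covers 0..1639
    reg [10:0] v_cnt;          // 11 bits covers 0..1213

    // Free-running horizontal and vertical position counters
    always @(posedge pclk) begin
        if (rst) begin
            h_cnt <= 0;
            v_cnt <= 0;
        end else if (h_cnt == H_TOTAL - 1) begin
            h_cnt <= 0;
            v_cnt <= (v_cnt == V_TOTAL - 1) ? 0 : v_cnt + 1;
        end else begin
            h_cnt <= h_cnt + 1;
        end
    end

    // Registered output, so data_valid lags the counters by one cycle
    always @(posedge pclk) begin
        if (rst)
            data_valid <= 1'b0;
        else
            data_valid <= (h_cnt >= H_START) && (h_cnt < H_START + H_ACTIVE) &&
                          (v_cnt >= V_START) && (v_cnt < V_START + V_ACTIVE);
    end
endmodule
```

With these placeholder offsets, data_valid is high for 1600 consecutive pixel clocks on each of 1200 consecutive lines and low everywhere else.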
The Input and Output of Image Processing Core
As I read through the example, I found that the core design can be made flexible enough that any image processing core can be inserted at any stage of the processing pipeline. This is achieved by standardizing the necessary inputs and outputs of each image processing core. The necessary signals are the horizontal sync, frame sync, data valid, and pixel clock signals.
The output data valid, horizontal sync, and frame sync signals can be completely out of phase with the corresponding input signals, because the image processing core itself may add several cycles of latency. The advantage of this convention is that a core can be moved to a different position in the processing pipeline, or removed from it, since the control signals are the same everywhere. The output signals must keep the same timing pattern as the input signals, so that the pixel timing stays aligned as well.
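One simple way to keep the control signals aligned with the pixels is to delay all of them by the core's latency. The sketch below assumes a fixed latency of LATENCY pixel clocks (at least 1); the module and port names are hypothetical and not taken from the example I read.

```verilog
// Hypothetical control-signal delay line: shifts hsync, frame sync and
// data valid through LATENCY registers so that they leave the core in
// step with the processed pixels.
module sync_delay #(
    parameter LATENCY = 3      // assumed pipeline depth of the core, >= 1
) (
    input  wire pclk,
    input  wire rst,           // synchronous reset, active high
    input  wire hsync_in,
    input  wire fsync_in,
    input  wire dvalid_in,
    output wire hsync_out,
    output wire fsync_out,
    output wire dvalid_out
);
    reg [LATENCY-1:0] hsync_sr;
    reg [LATENCY-1:0] fsync_sr;
    reg [LATENCY-1:0] dvalid_sr;

    always @(posedge pclk) begin
        if (rst) begin
            hsync_sr  <= {LATENCY{1'b0}};
            fsync_sr  <= {LATENCY{1'b0}};
            dvalid_sr <= {LATENCY{1'b0}};
        end else begin
            // Shift left by one and insert the newest sample at bit 0
            hsync_sr  <= (hsync_sr  << 1) | hsync_in;
            fsync_sr  <= (fsync_sr  << 1) | fsync_in;
            dvalid_sr <= (dvalid_sr << 1) | dvalid_in;
        end
    end

    // The oldest sample has been delayed by exactly LATENCY clocks
    assign hsync_out  = hsync_sr[LATENCY-1];
    assign fsync_out  = fsync_sr[LATENCY-1];
    assign dvalid_out = dvalid_sr[LATENCY-1];
endmodule
```

Because all three signals pass through the same number of registers, their relative timing is preserved, so the next core in the chain sees the same pattern, only shifted in time.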
Initially, I tried to use a dual-port BRAM as the data buffer of the demosaic core. After I finished the design, I looked over the schematic again and found that the dual-port BRAM is not actually necessary, because the bottleneck of the image processing pipeline is the input pixel rate itself. Thus, to make the pixel data available as early as possible, I chose a write-first single-port BRAM as the data buffer, where data is read out of the BRAM on the very next cycle and fed directly into the demosaic pipeline.
To test the schematic, I wrote some example Verilog to infer the single-port BRAM, and I found that Xilinx has done a very good job of documenting how to infer these resources, which can be found here.
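For reference, a write-first single-port BRAM can be described along the lines of the sketch below, which follows the commonly used inference style for Xilinx tools. The module name and the default widths are assumptions: 8-bit pixels and an 11-bit address, which is enough for one 1600-pixel line.

```verilog
// Single-port RAM in write-first mode: when we is high, the data being
// written also appears on dout after the same clock edge.
module bram_sp_write_first #(
    parameter DATA_WIDTH = 8,    // assumed pixel width
    parameter ADDR_WIDTH = 11    // 2048 entries, enough for a 1600-pixel line
) (
    input  wire                  clk,
    input  wire                  en,    // port enable
    input  wire                  we,    // write enable
    input  wire [ADDR_WIDTH-1:0] addr,
    input  wire [DATA_WIDTH-1:0] din,
    output reg  [DATA_WIDTH-1:0] dout
);
    reg [DATA_WIDTH-1:0] mem [0:(1 << ADDR_WIDTH) - 1];

    always @(posedge clk) begin
        if (en) begin
            if (we) begin
                mem[addr] <= din;
                dout      <= din;        // write-first: new data goes to the output
            end else begin
                dout      <= mem[addr];  // plain synchronous read
            end
        end
    end
endmodule
```

The synchronous read is what lets the synthesizer map the array onto block RAM rather than distributed RAM, and the write-first branch is what makes a freshly written pixel visible on dout one clock later.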