Volume 18, Issue 2 (June 2022)                   IJEEE 2022, 18(2): 46-54




Abstract:
This paper optimizes the CORDIC-based modified Gram-Schmidt (MGS) algorithm for QR decomposition (QRD) and presents a scalable algorithm that targets maximum throughput with the least possible latency and hardware resources. The optimized algorithm is implemented on a Xilinx Virtex-6 FPGA using the ISE software, in fixed-point arithmetic with the accuracy selected based on the results of MATLAB simulation. Loop unrolling with different unrolling factors is applied to reduce the latency and increase the throughput. However, increasing the unrolling factor leads to a decrease in the frequency of the CORDIC unit as well as a decrease in the number of resources, so there is a trade-off between the unrolling factor and the frequency of the CORDIC unit. An investigation of different unrolling factors shows that an unrolling factor of 4 gives the highest throughput, 5.777 MQRD/s, and the lowest latency, 173 ns. Compared with the non-optimized case, throughput and latency improve by 42.52% and 73.74%, respectively. The proposed method is also scalable to m×m complex channel matrices of different sizes, where log2 m ∈ N.
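For readers unfamiliar with the underlying decomposition, the following is a minimal floating-point sketch of modified Gram-Schmidt QR for a square complex matrix. It is not the paper's CORDIC-based fixed-point design: the function name `mgs_qr` and the list-of-rows matrix layout are assumptions made for illustration, and the norm and division steps that a hardware design would map onto CORDIC units are done here with ordinary floating-point arithmetic.

```python
import math


def mgs_qr(a):
    """Modified Gram-Schmidt QR of a full-rank square complex matrix.

    `a` is a list of rows. Returns (q, r), both as lists of rows, with
    q having orthonormal columns and r upper triangular, so a = q * r.
    This is an illustrative floating-point sketch, not a fixed-point design.
    """
    m = len(a)
    # v[j] holds column j of the input, progressively orthogonalized.
    v = [[complex(a[i][j]) for i in range(m)] for j in range(m)]
    q_cols = [[0j] * m for _ in range(m)]
    r = [[0j] * m for _ in range(m)]

    for i in range(m):
        # Diagonal entry: Euclidean norm of the current column.
        norm = math.sqrt(sum(abs(x) ** 2 for x in v[i]))
        r[i][i] = complex(norm)
        q_cols[i] = [x / norm for x in v[i]]
        # Remove the q_i component from every remaining column right away;
        # this immediate update is what distinguishes MGS from classical GS.
        for j in range(i + 1, m):
            r[i][j] = sum(qc.conjugate() * x for qc, x in zip(q_cols[i], v[j]))
            v[j] = [x - r[i][j] * qc for x, qc in zip(v[j], q_cols[i])]

    # Convert the list of columns back to a list of rows.
    q = [[q_cols[j][i] for j in range(m)] for i in range(m)]
    return q, r
```

Calling `mgs_qr([[1 + 1j, 2], [3, 4 - 1j]])` returns a pair whose product reproduces the input to within floating-point rounding. The inner loop over `j` is the kind of loop that an unrolling factor would replicate in hardware, although how the paper partitions the work is not stated in the abstract.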
  • The numbers of integer and fractional bits are determined by MATLAB simulation;
  • High throughput and low latency are achieved for the CORDIC-based modified Gram-Schmidt algorithm for QR decomposition;
  • The number of CORDIC iterations and the unrolling factor are optimized for high throughput, low resource usage, and low latency.
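The highlights above mention the number of CORDIC iterations as a tuning parameter. As a rough illustration of why that number matters, here is a minimal sketch of CORDIC in vectoring mode, which rotates a vector onto the x-axis and yields its magnitude and angle. The function name `cordic_vectoring` is hypothetical, the sketch uses floating point where hardware would use fixed-point shifts and adds, and it assumes the input lies in the right half-plane (x > 0).

```python
import math


def cordic_vectoring(x, y, iterations):
    """Vectoring-mode CORDIC sketch: returns (magnitude, angle) of (x, y).

    Assumes x > 0. Each step multiplies by 2**-k, which in hardware is a
    right shift by k bits; floats are used here only for readability.
    """
    angle = 0.0
    gain = 1.0
    for k in range(iterations):
        # Choose the rotation direction that drives y toward zero.
        d = -1.0 if y > 0 else 1.0
        shift = 2.0 ** -k
        x, y = x - d * y * shift, y + d * x * shift
        angle -= d * math.atan(shift)
        # Each micro-rotation stretches the vector by sqrt(1 + 2^(-2k)).
        gain *= math.sqrt(1.0 + shift * shift)
    # Divide out the accumulated CORDIC gain to recover the true magnitude.
    return x / gain, angle
```

With more iterations the residual angle error shrinks, since after n steps it is bounded by atan(2^-(n-1)); fewer iterations mean a shorter pipeline but a coarser result. That is the general accuracy versus cost trade-off behind choosing an iteration count, though the specific count selected in the paper is not given in the abstract.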

Type of Study: Research Paper | Subject: VLSI
Received: 2021/06/06 | Revised: 2024/05/13 | Accepted: 2022/02/10

Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.