
Low Latency Floating-Point Division and Square Root Unit

Publication Type : Journal Article

Publisher : Institute of Electrical and Electronics Engineers (IEEE)

Source : IEEE Transactions on Computers

Url : https://doi.org/10.1109/tc.2019.2947899

Campus : Amritapuri

School : School of Engineering

Center : Humanitarian Technology (HuT) Labs

Department : Electronics and Communication

Year : 2020

Abstract : Digit-recurrence algorithms are widely used in actual microprocessors to compute floating-point division and square root. These iterative algorithms present a good trade-off in terms of performance, area and power. We present a floating-point division and square root unit, which implements a radix-64 floating-point division and a radix-16 floating-point square root. To have an affordable implementation, each radix-64 division iteration and radix-16 square root iteration are made of simpler radix-4 iterations: 3 radix-4 iterations in division and 2 in square root. Speculation is used between consecutive radix-4 iterations to get a reduced timing. There are three different parts in digit-recurrence implementations: initialization, digit iterations, and rounding. The digit iteration is the iterative part and it uses the same logic for several cycles. Division and square root share partially the initialization and rounding stages, whereas each one has different logic for the digit iterations. The result is a low-latency floating-point divider and square root, requiring 11, 6, and 4 cycles for double, single and half-precision division with normalized operands and result, and 15, 8 and 5 cycles for square root. One or two additional cycles are needed in case of subnormal operand(s) or result.
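
Illustration : The abstract's statement that one radix-64 division iteration is built from three radix-4 iterations can be sketched in software. The Python below is a minimal model under stated assumptions, not the unit described in the paper: it uses exact rational arithmetic, picks each radix-4 digit from {-2, ..., 2} by exact rounding (the paper's speculation between consecutive radix-4 iterations and any hardware selection logic are not modeled), and skips the initialization and rounding stages. The function names `radix4_digits`, `group_radix64` and `quotient_from_radix64` are hypothetical and chosen only for this example.

```python
from fractions import Fraction


def radix4_digits(x, d, n_digits):
    """Run n_digits radix-4 digit-recurrence steps for x / d.

    Assumes significands with 1 <= x < 2 and 1 <= d < 2. The dividend is
    pre-scaled by 1/4 so the residual w starts with |w| <= (2/3) * d.
    Digit selection is exact rounding, a simplification of real hardware.
    """
    d = Fraction(d)
    w = Fraction(x) / 4                      # initial residual
    digits = []
    for _ in range(n_digits):
        shifted = 4 * w                      # shift residual by one radix-4 digit
        q = max(-2, min(2, round(shifted / d)))   # digit in {-2, ..., 2}
        w = shifted - q * d                  # recurrence: w <- 4*w - q*d
        digits.append(q)
    return digits, w


def group_radix64(digits):
    """Merge each run of three radix-4 digits into one radix-64 digit."""
    assert len(digits) % 3 == 0
    return [16 * a + 4 * b + c
            for a, b, c in zip(digits[0::3], digits[1::3], digits[2::3])]


def quotient_from_radix64(digits64):
    """Rebuild the quotient from radix-64 digits, undoing the 1/4 pre-scaling."""
    q = Fraction(0)
    for j, digit in enumerate(digits64, start=1):
        q += Fraction(digit, 64 ** j)
    return 4 * q
```

For example, `radix4_digits(Fraction(3, 2), Fraction(5, 4), 27)` followed by `group_radix64` gives nine radix-64 digits, each worth 6 quotient bits. In this model the rebuilt quotient differs from x / d by at most (8/3) * 4^(-n) after n radix-4 steps. The cycle counts quoted in the abstract also include the initialization and rounding stages, which this sketch does not cover.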

Cite this Research Publication : Javier D. Bruguera, Low Latency Floating-Point Division and Square Root Unit, IEEE Transactions on Computers, Institute of Electrical and Electronics Engineers (IEEE), 2020, https://doi.org/10.1109/tc.2019.2947899
