Prabhu E. currently serves as Assistant Professor in the Department of Electronics and Communication Engineering, Amrita School of Engineering, Coimbatore Campus. He received his B.E. degree in Electronics and Communication Engineering from Muthayammal Engineering College, Rasipuram, India, in 2006 and his M.E. degree in VLSI Design from Kongu Engineering College, Perundurai, India, in 2008. He is currently pursuing a Ph.D. degree in low-power arithmetic circuit design at the Department of Electronics and Communication Engineering, Anna University, Chennai, India.

His research interests include low-power and high-performance arithmetic circuit design and signal processing architectures. He has served as a Technical Program Committee member for many international conferences. He is an Associate Member of IETE.


Year Affiliation
2010 – Present Assistant Professor, Department of Electronics and Communication Engineering, Amrita School of Engineering, Coimbatore, India
2009 – 2010 Assistant Professor, Department of Electronics and Communication Engineering, Periyar Maniammai University, Thanjavur, India
2008 – 2009 Lecturer, Department of Electronics and Communication Engineering, Hindustan University, Chennai, India


  • Digital system design
  • VLSI design techniques
  • VLSI Technology
  • Low power VLSI circuits
  • Testing of VLSI circuits
  • Design for test


Publication Type: Journal Article
Year of Publication Publication Type Title
2016 Journal Article E. Prabhu, Mangalam, H., and Karthick, S., “Design of area and power efficient Radix-4 DIT FFT butterfly unit using floating point fused arithmetic”, Journal of Central South University, vol. 23, pp. 1669-1681, 2016.

In this work, a power-efficient butterfly-unit-based FFT architecture is presented. The butterfly unit is designed using floating-point fused arithmetic units. The fused arithmetic units include a two-term dot product unit and an add-subtract unit. In these arithmetic units, operations are performed over complex data values. A modified fused floating-point two-term dot product and an enhanced model for the Radix-4 FFT butterfly unit are proposed. The modified fused two-term dot product is designed using a Radix-16 Booth multiplier, which reduces switching activity and the required area compared to the Radix-8 Booth multiplier in the existing system. The proposed architecture is implemented efficiently for a Radix-4 decimation-in-time (DIT) FFT butterfly with the two floating-point fused arithmetic units. The proposed enhanced architecture is synthesized, implemented, placed and routed on an FPGA device using the Xilinx ISE tool. It is observed that the Radix-4 DIT fused floating-point FFT butterfly requires 50.17% less space and 12.16% less power than the existing methods, and the proposed enhanced model requires 49.82% less space on the FPGA device than the proposed design. Reduced power consumption is also addressed by utilizing the reusability technique, which yields an 11.42% power reduction for the enhanced model compared to the proposed design.
2015 Journal Article S. Karthick, Valarmathy, S., and E. Prabhu, “Low power systolic array based digital filter for DSP applications”, Scientific World Journal, vol. 2015, 2015.

Main concepts in DSP include filtering, averaging, modulating, and correlating signals in digital form to estimate the characteristic parameters of a signal in a desirable form. This paper presents a brief concept of the impact of a low-power datapath for Digital Signal Processing (DSP) based biomedical applications. A systolic-array-based digital filter used in signal processing for electrocardiogram analysis is presented, with datapath architectural innovations from a low-power perspective. Implementation was done with an ASIC design methodology using the TSMC 65 nm technology library node. The proposed systolic array filter reduces leakage power by up to 8.5% compared to existing filter architectures.
2015 Journal Article E. Prabhu, I, V. R., R, D. Udhayan, S, R., P, W. Thadeus, and S, A., “Design and implementation of digital household energy meter with a flexible billing unit using FPGA”, International Journal of Applied Engineering Research, vol. 10, pp. 28331-28340, 2015.
2015 Journal Article E. Prabhu and Reddy, B. Madhukar, “An Efficient 16-Bit Carry Select Adder With Optimized Power and Delay”, International Journal of Applied Engineering Research, vol. 10, 2015.

In the design of electronic devices, it is very important to reduce the power, delay and area of components because these factors affect the quality and performance of devices. The adder is the most fundamental and important device in microprocessors and DSP chips. The Carry Select Adder (CSA) is the adder of choice for high-speed arithmetic designs. Previous architectures, i.e., the conventional CSA and the Binary to Excess-1 Converter (BEC) based Square Root Carry Select Adder (SQRT-CSA), are clearly explained. Redundancies in logic operations and data dependencies affect the power, area and delay of any adder design. The analysis of the BEC-based SQRT-CSA (BEC SQRT-CSA) clearly shows that power and area can be reduced by proper modifications to the gate-level architecture, because this design is affected by the above factors. This work reduces the redundancy in logic operations by introducing a new adder in place of the conventional BEC. The proposed design generates the sum and carry signals for both carry input values (Cin = 0 and Cin = 1) for n bits using n instances of the new adder. The proposed new adder is implemented with fewer logic gates than a Half Adder (HA). The proposed CSA reduces power consumption by 1.5%, 44.21% and 54.77%, and area by 6.2%, 21.2% and 31.4%, for 4-, 8- and 16-bit designs respectively, compared to the BEC SQRT-CSA.
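The carry-select principle described in this abstract (precompute each block's sum for Cin = 0 and Cin = 1, then pick one once the real carry is known) can be sketched in software. This is a minimal behavioural model of a conventional carry select adder with uniform blocks, not the paper's BEC-free adder or its gate-level design; the function names and the 4-bit block size are assumptions made for illustration.

```python
def ripple_add(a: int, b: int, cin: int, width: int) -> tuple[int, int]:
    """Add two width-bit values plus a carry-in; return (sum, carry_out)."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a: int, b: int, width: int = 16, block: int = 4) -> tuple[int, int]:
    """Behavioural model of a conventional carry select adder.

    Assumes 0 <= a, b < 2**width and that width is a multiple of block
    (uniform blocks; the SQRT variant would use growing block sizes).
    """
    if width % block != 0:
        raise ValueError("width must be a multiple of block")
    mask = (1 << block) - 1
    carry = 0
    result = 0
    for shift in range(0, width, block):
        a_blk = (a >> shift) & mask
        b_blk = (b >> shift) & mask
        # Both candidates are computed without waiting for the incoming carry.
        s0, c0 = ripple_add(a_blk, b_blk, 0, block)
        s1, c1 = ripple_add(a_blk, b_blk, 1, block)
        # The incoming carry acts as the multiplexer select line.
        s, carry = (s1, c1) if carry else (s0, c0)
        result |= s << shift
    return result, carry
```

In hardware the two candidate sums per block are produced in parallel, so only the multiplexer delay depends on the carry chain; the Python loop models the selection logic only, not the timing.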
2015 Journal Article E. Prabhu and S, J., “Parallel multiplier-accumulator unit based on Vedic mathematics”, ARPN Journal of Engineering and Applied Sciences, vol. 10, 2015.

In this paper, an efficient parallel multiplier and accumulator (MAC) unit based on Vedic mathematics is presented. Vedic mathematics utilizes the Urdhva-tiryagbhyam sutra for the multiplier design. The proposed MAC architecture enhances the speed of operation while reducing the gate area and power dissipation. We also achieve improved delay with the help of a Vedic encoder, followed by the removal of the accumulator stage by parallelizing the intermediate results feeding the input. Such pipelining of the intermediate results, prior to the final adder, has the effect of combining the accumulator stage with the partial product stage of the multiplier. Further, the overall computation speed of the MAC unit is increased by the efficient use of higher-order compressors in the merged partial product compression and accumulator (PPCA) architecture. The area, timing and power reports show that the critical path delay of the proposed design is significantly reduced and that it outperforms the existing designs. We report an absolute improvement of 20-30% and 7-18% respectively for the 4-bit and 8-bit Vedic MAC units, in terms of total circuit power, critical path delay and cell area. The architecture was synthesized using a standard 90 nm CMOS library and implemented on Altera's Cyclone II series FPGA.
2014 Journal Article S. Karthick, Valarmathy, S., and E. Prabhu, “Low Power Heterogeneous Adder”, International Journal of Applied Engineering and Research, vol. 9, no. 22, pp. 13449-13464, 2014.

Flexibility and portability have increased the requirement for low-power components in fields like multimedia, signal processing and other computing applications. Adders are the essential computing elements in such applications. However, present adder architectures with hybrid/heterogeneous features provide performance variations but are limited in how far they reduce power consumption. In this paper, a low-power heterogeneous adder architecture is proposed to give computing applications flexibility while consuming less power. A 128-bit heterogeneous adder architecture is built using three low-power sub-adders (ripple carry, carry look-ahead and carry bypass adders). The adder variants in the sub-adder block of the heterogeneous adder architecture make it possible to select the required quality metrics, viz. area, timing and power, for the design. Application requirements such as low power with the same performance, low power with low area, and variable performance can be selected. Designs are demonstrated in Verilog HDL, synthesized with Cadence's RTL Compiler and mapped to the TSMC 65 nm technology library node.
2013 Journal Article S. Karthick, Valarmathy, S., and E. Prabhu, “Reconfigurable FIR filter with radix-4 array multiplier”, Journal of Theoretical and Applied Information Technology, vol. 57, pp. 326-336, 2013.

FIR filters are commonly used digital filters that find their major application in digital signal processing. In a conventional FIR filter, the input vector is delayed by one sample at each tap and then multiplied by the filter coefficients, and the products are subsequently accumulated by the adders. The drawbacks of this approach are high device utilization and high power consumption. To overcome these drawbacks, we propose a reconfigurable FIR filter using a radix-4 multiplier. The major changes in the proposed system are the use of a radix-4 multiplier for multiplication and a change in the basic architecture of the FIR filter. In this method, we combine all the input tap values that share the same coefficient value and then multiply each combined value by the respective coefficient. The proposed design is simulated and synthesized using Xilinx tools and compared with the existing FIR filter. The results show that the proposed method occupies fewer slices and consumes less power. According to the power analysis report, an 8-tap FIR filter using the proposed approach consumes 60 μW at 25 MHz, 110 μW at 50 MHz, 170 μW at 75 MHz and 220 μW at 100 MHz, compared with the existing approach, which was implemented on Spartan-3E. Additionally, the proposed design was tested for an n-tap FIR filter implemented on a Virtex-4 FPGA and compared with the existing technique, which shows that our approach minimizes the number of slices occupied by the design and reduces the power consumption.
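The tap-combining idea in this abstract (sum the delayed inputs that share a coefficient, then multiply once per distinct coefficient) can be illustrated with a short software sketch. It is only a behavioural illustration under stated assumptions: the function names are hypothetical, zero initial conditions are assumed, and the paper's radix-4 multiplier and reconfigurable hardware are not modelled.

```python
from collections import defaultdict


def fir_direct(x: list[int], h: list[int]) -> list[int]:
    """Conventional FIR filter: one multiplication per tap per output sample."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def fir_grouped(x: list[int], h: list[int]) -> list[int]:
    """FIR filter that adds taps sharing a coefficient before multiplying.

    Needs one multiplication per distinct coefficient value per output
    sample instead of one per tap.
    """
    groups: dict[int, list[int]] = defaultdict(list)
    for k, coeff in enumerate(h):
        groups[coeff].append(k)  # tap indices that share this coefficient

    y = []
    for n in range(len(x)):
        acc = 0
        for coeff, taps in groups.items():
            # Pre-add the delayed inputs, then multiply once.
            tap_sum = sum(x[n - k] for k in taps if n - k >= 0)
            acc += coeff * tap_sum
        y.append(acc)
    return y
```

For a symmetric 4-tap filter such as `h = [1, 2, 2, 1]`, the grouped form uses two multiplications per output sample instead of four, which is the kind of saving the abstract attributes to combining taps.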
2013 Journal Article E. Prabhu, Mangalam, H., and Karthick, S., “A low power multiplier using encoding and bypassing technique”, Journal of Theoretical and Applied Information Technology, vol. 57, pp. 251-260, 2013.

In this paper, a low-power shift-add multiplier based on an encoding and bypassing technique is presented. The proposed architecture is derived from a simple way to reduce the power consumption and area of the multiplier at the architecture level of VLSI design. It reduces power consumption and area compared to other conventional multipliers. The modifications to the multiplier include a proposed encoder design for the modified Radix-4 recoding rules and the removal of zero partial products using a bypassing technique (decoder). A decoder, instead of bypass and feeder registers, is used to remove zeros (bypassing) and to select the current partial product value to be stored in the register. An encoder and a decoder selector circuit are used in the proposed model. Simulation results for the encoding process and the decoder-based bypassing technique, generated using XPower Analyzer in Xilinx ISE (Integrated Software Environment) 10.1, indicate that the dynamic power consumption is reduced to almost 50%. The power consumed by the proposed multiplier is 6.28 mW on a Spartan-2 device and 6.89 mW on a Virtex-4 device. The proposed multiplier is mainly applicable to the design of low-power VLSI circuits and high-speed switching techniques.
Publication Type: Conference Paper
Year of Publication Publication Type Title
2014 Conference Paper P. R. Gokul, E. Prabhu, and Mangalam, H., “Performance comparison of multipliers based on Square and Multiply and Montgomery algorithms”, in Green Computing Communication and Electrical Engineering (ICGCCEE), 2014 International Conference on, 2014.

Modular multiplication is the core arithmetic operation for most cryptographic applications. Montgomery multiplication is one of the fastest methods available for performing modular multiplication. A k-partition method for Montgomery multiplication is thoroughly studied and analysed. This method reduces the time complexity of multiplication from O(n) to O(n/k). Another method for modular exponentiation, the Square and Multiply method, is implemented. As the name suggests, squaring is the main principle behind this method. The implementation results are compared with those of an ordinary Montgomery multiplier and the k-partition method in terms of power and area constraints. Results for 128-, 256-, 512- and 1024-bit input operands show that the Square and Multiply method is more power efficient than the other two.
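As a point of reference for the Square and Multiply method named in this abstract, the following is a minimal software sketch of left-to-right binary modular exponentiation. It is a textbook formulation, not the hardware design evaluated in the paper; the function name and argument checks are assumptions made for illustration.

```python
def square_and_multiply(base: int, exponent: int, modulus: int) -> int:
    """Compute (base ** exponent) % modulus by left-to-right square and multiply."""
    if modulus <= 0:
        raise ValueError("modulus must be positive")
    if exponent < 0:
        raise ValueError("exponent must be non-negative")

    result = 1 % modulus  # handles modulus == 1
    base %= modulus
    # Scan the exponent bits from most significant to least significant.
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus    # always square
        if bit == "1":
            result = (result * base) % modulus  # multiply only on a 1 bit
    return result
```

For example, `square_and_multiply(4, 13, 497)` returns 445, matching Python's built-in `pow(4, 13, 497)`. Each exponent bit costs one squaring, plus one extra multiplication when the bit is 1.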
Publication Type: Conference Proceedings
Year of Publication Publication Type Title
2014 Conference Proceedings R. M, E. Prabhu, and H, M., “A versatile low power design of bit-serial multiplier in finite fields GF(2^m)”, Communications and Signal Processing (ICCSP), 2014 International Conference on. pp. 748-752, 2014.

A novel and efficient architecture for a versatile polynomial basis multiplier over GF(2^m) is presented. The value m, the degree of the irreducible polynomial, can be changed, so the multiplier can be configured and programmed. The versatility of the multiplier thus refers to its reconfigurable property. The architecture provides an efficient execution of Most Significant Bit (MSB)-first, bit-serial multiplication for different operand lengths. The attractive features of the proposed architecture are (a) its flexibility for arbitrary Galois field sizes, (b) its hardware simplicity, which results in a small-area implementation, (c) low power consumption through the gated clock, power gating and multi-Vth optimization techniques, and (d) improved maximum clock frequency due to the reduced critical path delay.