A 460 GOPS/W Improved Mnemonic Descent Method-Based Hardwired Accelerator for Face Alignment

TMM Volume 23 | 2021

By:

Huiyu Mo; Leibo Liu; Wenping Zhu; Qiang Li; Shouyi Yin; Shaojun Wei

The mnemonic descent method (MDM) algorithm is the first end-to-end recurrent convolutional system for high-accuracy face alignment. However, its heavy computational complexity and high memory access demands make it difficult to satisfy the requirements of real-time applications. To address this problem, an improved MDM (I-MDM) algorithm is proposed for efficient hardware implementation based on several hardware-oriented optimizations. First, a patch merging mechanism is introduced to dynamically cluster and eliminate redundant landmarks, which significantly reduces computational complexity with minimal accuracy loss. Second, a dedicated convolutional layer is inserted to halve the number of computations and memory accesses of the subsequent fully connected layer, yielding a 4.42% decrease in the failure rate. Third, a lightweight preprocessing method named dual regressors (DR) is proposed to reinitialize face images, which can greatly improve the overall accuracy. Moreover, compared with a similar method, the DR method can reduce computations and memory storage by nearly 99.9%. Overall, compared with the MDM algorithm, I-MDM not only reduces the number of computations by 23.5% but also decreases the failure rate by 17.9% on the 300-W test set. Based on the proposed I-MDM algorithm, an I-MDM-based hardwired accelerator is presented using the TSMC 65 nm CMOS process. First, compared with similar solutions, the gradient calculation operation is rearranged and loaded pixels are reused in the HoG feature extraction to eliminate all division operations and 25% of off-chip memory accesses. Second, patch-independent central activations are used to enable patch-level pipelined operations, yielding a 2× acceleration in the overall process. This accelerator achieves 460 GOPS/W energy efficiency at 330 MHz, which is 38× higher than the most recent face alignment accelerator in the same process.
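To make the patch merging idea more concrete, here is a minimal Python sketch of one way nearby landmarks could be clustered so that a single shared patch stands in for several. The abstract does not give the actual merging rule, so everything here is an assumption for illustration: the function name merge_landmarks, the distance threshold radius, and the greedy centroid-based grouping are hypothetical and are not taken from the paper.

```python
import math


def merge_landmarks(points, radius):
    """Greedily group 2-D landmarks that fall within `radius` of a cluster centroid.

    Hypothetical illustration only: the paper's real merging rule is not
    described in the abstract.
    Returns (centers, members): one centroid per cluster and, for each
    cluster, the indices of the landmarks it absorbed.
    """
    clusters = []  # each entry: [sum_x, sum_y, member_indices]
    for idx, (x, y) in enumerate(points):
        for c in clusters:
            n = len(c[2])
            cx, cy = c[0] / n, c[1] / n  # current centroid of this cluster
            if math.hypot(x - cx, y - cy) <= radius:
                c[0] += x
                c[1] += y
                c[2].append(idx)
                break
        else:
            # No existing cluster is close enough: start a new one.
            clusters.append([x, y, [idx]])
    centers = [(c[0] / len(c[2]), c[1] / len(c[2])) for c in clusters]
    members = [c[2] for c in clusters]
    return centers, members


# Toy landmark set (assumed coordinates, not real face data).
landmarks = [(0, 0), (1, 0), (10, 10), (10, 11), (30, 30)]
centers, members = merge_landmarks(landmarks, radius=2)
saved = 1 - len(centers) / len(landmarks)  # fraction of patches no longer processed
```

In this toy case five landmarks collapse into three clusters, so only three patches would need feature extraction. A larger radius removes more patches but moves each shared patch further from the landmarks it represents, which is the kind of accuracy trade-off the abstract alludes to.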

Tags:

IEEE TMM Article
