Vocal Visage: Crafting Lifelike 3D Talking Faces from Static Images and Sound

Authors

  • Y Prudhvi, Student, Department of Information Technology, Vasireddy Venkatadri Institute of Technology, Guntur, Nambur, India
  • T Adinarayana, Student, Department of Information Technology, Vasireddy Venkatadri Institute of Technology, Guntur, Nambur, India
  • T Chandu, Student, Department of Information Technology, Vasireddy Venkatadri Institute of Technology, Guntur, Nambur, India
  • S Musthak, Student, Department of Information Technology, Vasireddy Venkatadri Institute of Technology, Guntur, Nambur, India
  • G Sireesha, Assistant Professor, Department of Information Technology, Vasireddy Venkatadri Institute of Technology, Guntur, Nambur, India

Keywords:

Eye Blinking, Generative Models, Natural Lip Synchronization, Talking Face Animations

Abstract

In computer graphics and animation, generating lifelike, expressive talking face animations has historically required extensive 3D data and complex facial motion capture systems. This project presents an approach that addresses this challenge: its primary goal is to produce realistic 3D motion coefficients for stylized talking face animations driven by a single reference image synchronized with audio input. Leveraging deep learning techniques, including generative models, image-to-image translation networks, and audio processing methods, the methodology bridges the gap between static images and dynamic, emotionally rich facial animations. The ultimate aim is to synthesize talking face animations that exhibit seamless lip synchronization and natural eye blinking, thereby improving the realism and expressiveness of computer-generated character interactions.

Published

2023-11-30

How to Cite

Vocal Visage: Crafting Lifelike 3D Talking Faces from Static Images and Sound. (2023). International Journal of Innovative Research in Computer Science & Technology, 11(6), 13–17. Retrieved from https://www.acspublisher.com/journals/index.php/ijircst/article/view/11603