Post
1509
I'm working on talking head generation that takes audio and video as input. Can someone suggest a good existing architecture that can generate video with low latency? Or is it possible to make it run in real time?