Recently I was asked to look into creating a virtual-human-based interactive digital assistant from scratch. I have a good bit of experience with various aspects of interactive virtual humans from my time at the Institute for Creative Technologies, but most of those efforts are built upon complex animation control systems and specialized rendering pipelines. I wanted to see if I could create a virtual human system from commodity APIs and content, without having to hire an animator or modeler. My biggest concern was lip sync and facial animation. It's not too difficult to come up with a human model rigged for some basic animations, but it's much harder to create the facial phoneme animations needed for proper lip sync.
After searching around a bit, I decided to check out Adobe Fuse and Mixamo. Fuse offers a very nice human character creation tool, complete with hair, clothing, body shape customization, and many other options. For a developer, or someone who wants a rapid prototype, it's a very powerful tool.
Once you have created your character, you can click on "Save to Mixamo" and add animations. I was mostly interested in idle animations, as this project is not a game or action-based experience. There were plenty of ready-made animations, and of course you can upload your own. I created a male and a female character and added a few animations to each.
From Mixamo, you can export the character to FBX for Unity, along with all the animations you added. Bringing the FBX into Unity presented a few issues, many of which are addressed here. You will have to play with the shaders in order to get the character to look remotely correct. Apparently Mixamo/Fuse used to provide a nice shader pack for Unity, but it has been discontinued. For the purposes of my testing, I didn't want to bother too much with shaders until I could test animation and lip sync.
I loaded the animation editor and played a few of the idle animations. They looked pretty good. The Mixamo export provided an option to include facial rigging, and I could see that the character did in fact import with a full set of facial blend shapes, including blinks and phonemes.
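If you want to confirm exactly which shapes came across in the FBX, a short editor script will dump them for you. This is just a sketch, and the class name is mine; it assumes the skinned mesh with the blendshapes is a child of the character root, which is how Mixamo exports typically arrive.

```csharp
using UnityEngine;

// Attach to the character root: logs every blendshape on the imported mesh,
// which is an easy way to spot the phoneme and blink shapes by name.
public class BlendShapeLister : MonoBehaviour
{
    void Start()
    {
        var smr = GetComponentInChildren<SkinnedMeshRenderer>();
        Mesh mesh = smr.sharedMesh;
        for (int i = 0; i < mesh.blendShapeCount; i++)
        {
            Debug.Log(i + ": " + mesh.GetBlendShapeName(i));
        }
    }
}
```

The names it prints are what you'll map to phonemes in whatever lip sync tool you pick.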
Next, I wanted to try a free or low-cost lip sync plugin. There are several available, but LipSync Pro seems to be a great middle ground between free and high end. I've used FaceFX in the past, but I think it's overkill and costly. Face Plus is designed to work directly with Mixamo characters, but it's more of a puppeteering tool.
I ended up grabbing LipSync Lite just to see how it would drive the prepacked phoneme blendshapes. It worked very well, except that the timing between the audio playback and the animation schedule seemed to vary quite a bit. The full version comes with source code, so I may be able to tweak things a bit should this project get funded. LipSync Lite comes with a phoneme editor, but it won't analyze your existing audio and create a schedule for you. The full version does this, and it comes with a gesture markup tool and an eye controller. I will probably buy the full version, as that's all well worth $35! For the time being I wrote a simple blink trigger to give the eyes some life. The Mixamo blendshapes include separate left and right blinks.
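For reference, the blink trigger was along these lines. Treat it as a sketch: the blendshape names, class name, and timing constants are my assumptions, so check them against your own Mixamo export. It looks up the left and right blink shapes, then ramps both weights up and back down on a randomized timer.

```csharp
using UnityEngine;

// Sketch of a simple blink trigger: every few seconds, ramp the left and
// right blink blendshapes up to full weight and back down.
public class SimpleBlink : MonoBehaviour
{
    public string leftBlinkName = "Blink_Left";   // assumed shape name
    public string rightBlinkName = "Blink_Right"; // assumed shape name
    public float minInterval = 2f;
    public float maxInterval = 6f;
    public float blinkDuration = 0.15f;

    SkinnedMeshRenderer smr;
    int left, right;
    float nextBlink;

    void Start()
    {
        smr = GetComponentInChildren<SkinnedMeshRenderer>();
        left = smr.sharedMesh.GetBlendShapeIndex(leftBlinkName);
        right = smr.sharedMesh.GetBlendShapeIndex(rightBlinkName);
        ScheduleNext();
    }

    void Update()
    {
        if (Time.time >= nextBlink)
        {
            StartCoroutine(Blink());
            ScheduleNext();
        }
    }

    void ScheduleNext()
    {
        nextBlink = Time.time + Random.Range(minInterval, maxInterval);
    }

    System.Collections.IEnumerator Blink()
    {
        float t = 0f;
        while (t < blinkDuration)
        {
            t += Time.deltaTime;
            // Weight rises to 100 at the midpoint of the blink, then falls.
            float w = Mathf.PingPong(t / blinkDuration * 2f, 1f) * 100f;
            smr.SetBlendShapeWeight(left, w);
            smr.SetBlendShapeWeight(right, w);
            yield return null;
        }
        smr.SetBlendShapeWeight(left, 0f);
        smr.SetBlendShapeWeight(right, 0f);
    }
}
```

Because the blinks are separate shapes, you could also offset the two weights slightly for a less mechanical look.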
Conclusion: In less than a day I was able to put together a very simple human character that could speak and animate, using just a few basic tools and some trialware. I know there are other character creation tools out there, but this was pretty painless. The biggest concern is that the character needs cleanup and some shader work to look the way it does in Fuse. Also, I'm not a lighting expert and would need an artist to help me set up a better-looking scene, but overall I'm very pleased to have found the tools that let me pull this proof of concept together so fast!