I am aware of https://github.com/gilescoope/rhubarb-timeline and would like to recreate its features, but in a slightly different way. Existing infrastructure in the project aside, we want to achieve the following:

- The position of, for example, an MBP mouth shape should be serialized relative to the start of the audio, not to the whole timeline, so that when we move a clip representing audio + mouth shapes within a timeline, the attached timed mouth shapes automatically move with it.
- The data should be reusable: mouth shapes that are generated or manually edited are associated with an AudioClip and can be reused in multiple timelines.

Please let me know if the approach I describe below is suboptimal and could be solved in a better way, for example with nested timelines.

Given:

- A custom controller class AnimalController : MonoBehaviour which, among other things, controls an AudioSource and an Animator component
- A ScriptableObject AnimalVoiceBehaviour which, alongside some metadata, holds a reference to an AudioClip and a sequence of mouth cues (the data from Rhubarb, containing start, end, and a mouth shape)

I would like to create a track with clips such that:

- The track references an AnimalController (i.e. has that TrackBindingType)
- The PlayableBehaviour references an AnimalVoiceBehaviour
- When played back, a track with clips both plays the audio and the animations that correspond to the mouth shapes at the proper timestamps
- All of this works in both play and edit mode
- It works with timeline 'scrubbing' the way the native audio and animation tracks do; that is, audio doesn't have to be heard while scrubbing, but playback should start at the correct position after scrubbing, in both play and edit mode
- Animations should be visible in both play and edit mode, and during scrubbing as well

I have already managed to draw a preview of the audio waveform via reflection, using the internal Unity classes that do it the same way the built-in audio tracks do. Eventually there should be a UI for controlling the mouth cues within the AnimalVoiceBehaviour clip, but for starters I would consider that beyond the scope of this post.

For now my questions are:

- How can I achieve the playback of animation and audio? Should I define additional PlayableBinding outputs on the track? Is that the right way forward? If so, I cannot find any tutorials or helpful documentation on the outputs.
- How do I prevent the error, and how do I feed the outputs with audio/animation data?
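To make the data model concrete, here is roughly what I have in mind for the asset (field names are simplified; the exact layout is not fixed yet). The key point is that MouthCue.start and MouthCue.end are seconds from the beginning of the AudioClip, never timeline time, which is what makes the asset reusable across timelines:

```csharp
using System;
using System.Collections.Generic;
using UnityEngine;

// Reusable voice asset: one AudioClip plus the Rhubarb mouth cues for it.
// Cue times are stored relative to the start of the clip, not the timeline.
[CreateAssetMenu(menuName = "Animals/Animal Voice")]
public class AnimalVoiceBehaviour : ScriptableObject
{
    public AudioClip audioClip;
    public List<MouthCue> mouthCues = new List<MouthCue>();
}

[Serializable]
public struct MouthCue
{
    // Seconds from the start of the audio clip, as produced by Rhubarb.
    public float start;
    public float end;
    public MouthShape shape;
}

// Rhubarb's mouth shape set: A-F basic, G/H/X extended.
public enum MouthShape { A, B, C, D, E, F, G, H, X }
```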
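For completeness, the track/clip skeleton I currently have looks roughly like the sketch below. SetMouthShape is a stand-in for whatever ends up driving the Animator on my controller, and the audio side is exactly the part that is missing, i.e. what my questions are about:

```csharp
using UnityEngine;
using UnityEngine.Playables;
using UnityEngine.Timeline;

// The track binds to the AnimalController; each clip references a voice asset.
[TrackBindingType(typeof(AnimalController))]
[TrackClipType(typeof(AnimalVoiceClip))]
public class AnimalVoiceTrack : TrackAsset { }

public class AnimalVoiceClip : PlayableAsset, ITimelineClipAsset
{
    public AnimalVoiceBehaviour voice;

    public ClipCaps clipCaps => ClipCaps.None;

    public override Playable CreatePlayable(PlayableGraph graph, GameObject owner)
    {
        var playable = ScriptPlayable<AnimalVoicePlayable>.Create(graph);
        playable.GetBehaviour().voice = voice;
        return playable;
    }
}

public class AnimalVoicePlayable : PlayableBehaviour
{
    public AnimalVoiceBehaviour voice;

    public override void ProcessFrame(Playable playable, FrameData info, object playerData)
    {
        // For a ScriptPlayable, playerData is the track binding.
        var controller = playerData as AnimalController;
        if (controller == null || voice == null) return;

        // GetTime() is local to the clip, i.e. relative to the start of the
        // audio -- the same time space the mouth cues live in.
        double t = playable.GetTime();
        foreach (var cue in voice.mouthCues)
        {
            if (t >= cue.start && t < cue.end)
            {
                // Hypothetical method on my controller that puts the
                // Animator into the state for the given mouth shape.
                controller.SetMouthShape(cue.shape);
                break;
            }
        }
    }
}
```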