Emotion Transfer

reference audios

general - reference
happy - reference
sad - reference
angry - reference

synthesized audios

Text1: no no no, you see, the key to being healthy is being happy! and cookies make you happy.

GST

general - text1
happy - text1
sad - text1
angry - text1

TTS-GAN

general - text1
happy - text1
sad - text1
angry - text1

Text2: This is why I want you gone ninety percent of the time.

GST

general - text2
happy - text2
sad - text2
angry - text2

TTS-GAN

general - text2
happy - text2
sad - text2
angry - text2

Text3: Dude, you never texted me!!

GST

general - text3
happy - text3
sad - text3
angry - text3

TTS-GAN

general - text3
happy - text3
sad - text3
angry - text3

Text4: There are several listings for gas station.

GST

general - text4
happy - text4
sad - text4
angry - text4

TTS-GAN

general - text4
happy - text4
sad - text4
angry - text4

Text5: For the first time in her life she had been danced tired.

GST

general - text5
happy - text5
sad - text5
angry - text5

TTS-GAN

general - text5
happy - text5
sad - text5
angry - text5

Identity Transfer

reference audios (seen speaker)

speaker 1 (female): 

speaker1 (female) - reference

speaker 2 (male): 

speaker2 (male) - reference

synthesized audios

Text1: We don't know what is happening to them.

Text2: There are several listings for gas station.

speaker1 (female) - text1
speaker2 (male) - text1
speaker1 (female) - text2
speaker2 (male) - text2

reference and synthesized audios (unseen speaker)

speaker 1

speaker 1 - reference
speaker 1 - synthesized

speaker 2

speaker 2 - reference
speaker 2 - synthesized

speaker 3

speaker 3 - reference
speaker 3 - synthesized

speaker 4

speaker 4 - reference
speaker 4 - synthesized