Nikita Torgashov’s Post

PhD Student @ KTH | Conversational AI & Speech Generation

VoXtream is now open-sourced! VoXtream is a full-stream zero-shot TTS model for real-time use that begins speaking from the first word.

Key features:
- Streaming: Supports a full-stream scenario, where the full sentence is not known in advance. The model takes a text stream arriving word by word as input and outputs an audio stream in 80 ms chunks.
- Speed: Runs 5x faster than real time and achieves 102 ms first-packet latency on GPU.
- Quality and efficiency: With only 9k hours of training data, it matches or surpasses the quality and intelligibility of larger models or models trained on large datasets.

The work was done under the guidance and supervision of Gustav Eje Henter and Gabriel Skantze.

🔗 Paper: https://lnkd.in/d5u_hfcr
🔗 Demo: https://lnkd.in/d7zj2mgc
🔗 Code: https://lnkd.in/dYv5ceib
🔗 Model: https://lnkd.in/dmkq--8K

#tts #texttospeech #streaming
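To make the streaming numbers above concrete, here is a minimal Python sketch of a word-in, chunk-out loop. It is not the VoXtream API: `fake_synthesize` and `stream_tts` are hypothetical names, the "model" only returns silence, and the 24 kHz sample rate is an assumption that the post does not state. The 80 ms chunk size and the 5x speed figure come from the post.

```python
from typing import Iterable, Iterator, List

CHUNK_MS = 80          # from the post: audio is emitted in 80 ms chunks
SPEEDUP = 5.0          # from the post: about 5x faster than real time
SAMPLE_RATE = 24_000   # assumption: the post does not give a sample rate


def samples_per_chunk(sample_rate: int = SAMPLE_RATE, chunk_ms: int = CHUNK_MS) -> int:
    """Number of audio samples in one chunk."""
    return sample_rate * chunk_ms // 1000


def compute_budget_ms(chunk_ms: float = CHUNK_MS, speedup: float = SPEEDUP) -> float:
    """Average compute time per chunk implied by a given real-time speedup."""
    return chunk_ms / speedup


def fake_synthesize(word: str) -> List[float]:
    """Hypothetical stand-in for a TTS model: one silent chunk per character."""
    return [0.0] * (samples_per_chunk() * len(word))


def stream_tts(words: Iterable[str]) -> Iterator[List[float]]:
    """Consume words one at a time and yield fixed-size audio chunks.

    Because this is a generator, audio for the first word is yielded
    before the second word has been read from the input stream.
    """
    n = samples_per_chunk()
    for word in words:
        audio = fake_synthesize(word)
        for start in range(0, len(audio), n):
            yield audio[start:start + n]
```

Under these assumptions, one 80 ms chunk holds `samples_per_chunk()` samples, and a 5x speedup leaves `compute_budget_ms()` milliseconds of compute per chunk on average. The generator form is what lets output start before the full sentence is known.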

Hasan Shoaib

Co-Founder & CTO @Q9labs | Voice AI | AI Agents

2mo

Amazing Nikita Torgashov! Is there a hosted version of this somewhere?

Anton Pimenov

Principal Data Scientist: einsum(‘domain,task->solution’, [Voice, Face], [Biometric, AntiSpoof, Generate])

2mo

Awesome, could it be adapted for real-time voice conversion?

Christopher Shulby

Machine Learning Engineering Leader

2mo

Really cool. Roughly what is the VRAM footprint? Quality looks good 🙌. Great work

Anton Okhotnikov

AI Researcher | SotA Speaker Recognition | UK Global Talent

2mo

Amazing work, Nikita! Will certainly give it a shot soon!

Muhammad Adil Abid

PhD Candidate @Malmö University | Deep Learning, Data Analyst & Optimization | Pre-hospital Stroke Care | Ambulance Travel Time Estimation

2mo

Nicely done! What tool did you use to create this figure? Looks really good.

Congratulations! Very good job 👏🏼👏🏼👏🏼

Juan Pablo Montoya

AI @ Google | Prev @ Microsoft, Cisco

1mo

This is so cool!

Shivam Mehta

Research Scientist @ Netflix. PhD, Ex-Intern @Meta and @Microsoft Research. Working with generative probabilistic models. WASP PhD @ KTH Royal Institute of Technology

2mo

Amazing work, Nikita Torgashov!!

Kaylo Littlejohn

Senior Machine Learning Engineer Roblox | PhD Berkeley AI Research

1mo

nice, looks super useful :) !! thanks for making this
