Let's say I have recordings of two people (A & B) reading the same block of text. Of course their voices sound different, they read at different speeds, etc. But let's assume:
- they read exactly the same text
- they have the same accent
- there's no coughing, hiccups, major background noise, etc.
- no major reading errors
Basically, we can assume both the reading and the recording are pretty 'clean'. What's a good algorithm to 'sync' up the recordings?
By 'sync' I mean you somehow break up the recordings into a sequence of 'sound bites', the two recordings share the same sequence, and the algorithm outputs the time range each speaker spent on each sound bite. For example:
- sound-bite-0: t=[0.1,0.3] for A; t=[0.11, 0.28] for B
- sound-bite-1: t=[0.32,0.39] for A; t=[0.29, 0.35] for B
Each 'sound bite' can be a word, a syllable, a group of words, etc., but let's say sound bites should be between 0.1 and 2 seconds (or any reasonable range) in length.
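To make the desired output concrete, here's a minimal sketch of the data structure I have in mind (the `SoundBite` type and the `sync` signature are hypothetical, just to illustrate what I'm asking for):

```python
from dataclasses import dataclass

@dataclass
class SoundBite:
    """One matched segment: the time range each speaker spent on it."""
    start_a: float  # seconds into recording A
    end_a: float
    start_b: float  # seconds into recording B
    end_b: float

# Hypothetical signature for the algorithm I'm after:
#   def sync(audio_a, audio_b, sample_rate) -> list[SoundBite]
# For the example above, the output would be:
bites = [
    SoundBite(0.10, 0.30, 0.11, 0.28),  # sound-bite-0
    SoundBite(0.32, 0.39, 0.29, 0.35),  # sound-bite-1
]

# Consecutive bites should tile each recording in order
# (bite k ends roughly where bite k+1 starts).
for i, b in enumerate(bites):
    print(f"sound-bite-{i}: A spent {b.end_a - b.start_a:.2f}s, "
          f"B spent {b.end_b - b.start_b:.2f}s")
```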
There's no need to do speech recognition on the recordings; in fact, if they read a list of unintelligible random syllables, the algorithm should still work. Bonus points if the algorithm works on songs too.