Thank you. Do these datasets overlap?

#2
by thusinh1969 - opened

Thanks for accepting me.
I have a few questions please:

  1. Are speaker IDs checked for uniqueness, or can the same person appear under different IDs across channels? If they can, the data cannot be used for SV (speaker verification).
  2. One YouTube video may have more than one person speaking (podcasts, etc.). Are the extra speakers actually filtered out?

In the end, uniqueness is what I am after, as it can be hit or miss for training a speaker verification model.

Thanks much for your answer.
Steve

voice-is-cool org

Hi Steve!

  1. Speaker IDs don't overlap, i.e. duplicate speakers with high similarity were filtered out of the dataset

  2. We applied per-video and per-channel clustering and selected the majority speaker cluster only
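As a rough illustration of the two steps above (not the actual pipeline, which is described in the paper), here is a minimal sketch. It assumes speaker embeddings and cluster labels have already been computed elsewhere; the function names, the greedy deduplication strategy, and the 0.8 similarity threshold are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def majority_cluster_indices(cluster_labels):
    """Return indices of the segments that fall in the most frequent cluster."""
    majority_label, _ = Counter(cluster_labels).most_common(1)[0]
    return [i for i, label in enumerate(cluster_labels) if label == majority_label]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def drop_duplicate_speakers(centroids, threshold=0.8):
    """Greedily keep a speaker only if its centroid is not too similar
    to any speaker kept so far. The threshold is a hypothetical value."""
    kept = []
    for speaker_id, vector in centroids.items():
        if all(cosine_similarity(vector, centroids[other]) < threshold for other in kept):
            kept.append(speaker_id)
    return kept


# Toy data: five segments from one video, clustered into three speakers.
segment_labels = [0, 0, 1, 0, 2]
majority_segments = majority_cluster_indices(segment_labels)

# Toy per-speaker centroids: "a" and "b" are near-duplicates.
speaker_centroids = {"a": [1.0, 0.0], "b": [0.99, 0.1], "c": [0.0, 1.0]}
unique_speakers = drop_duplicate_speakers(speaker_centroids)
```

With this toy input, only the segments from the dominant cluster are kept for the video, and speaker "b" is dropped because its centroid is nearly identical to that of "a".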

Please find all the relevant filtering pipeline details in this paper: https://www.isca-archive.org/interspeech_2023/yakovlev23_interspeech.pdf

Best regards,
Anton

antonioperfecto changed discussion status to closed
