Google Summer of Code 2015 CCExtractor development

Constraint Based Speaker Diarization Module for Heterogeneous News data

by Karan Singla for CCExtractor development

The proposed framework will implement a speaker diarization module for a massive, heterogeneous television news corpus, along with pre-processing steps that let users either supply their own segmentation information or rely on automatic segmentation. A second contribution is a speaker identification module that works within a network, using a technique called “Constrained Global Clustering” across video files. The framework will then be combined with the open-source toolkit “voice-id” to enable a multi-modal approach.
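
To illustrate the general idea of clustering speakers across video files under constraints, here is a minimal sketch. It is not the project's actual algorithm: it assumes each per-file speaker is already summarized as a small feature vector, and it uses greedy agglomerative merging with "cannot-link" pairs (for example, two speakers that diarization has already separated within the same file). The function names, the threshold, and the toy vectors are all hypothetical.

```python
from itertools import combinations
from math import dist


def centroid(points):
    """Mean of a list of equal-length vectors."""
    return [sum(axis) / len(points) for axis in zip(*points)]


def constrained_cluster(vectors, cannot_link, threshold):
    """Greedy agglomerative clustering with cannot-link constraints.

    vectors: list of per-speaker feature vectors (hypothetical embeddings).
    cannot_link: set of frozenset({i, j}) index pairs that must stay apart,
        e.g. two speakers already separated within the same video file.
    threshold: maximum centroid distance at which two clusters may merge.
    Returns a sorted list of clusters, each a sorted list of indices.
    """
    clusters = [[i] for i in range(len(vectors))]

    def allowed(a, b):
        # A merge is allowed only if no member pair is forbidden.
        return all(frozenset((i, j)) not in cannot_link for i in a for j in b)

    while True:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            if not allowed(clusters[a], clusters[b]):
                continue
            d = dist(centroid([vectors[i] for i in clusters[a]]),
                     centroid([vectors[i] for i in clusters[b]]))
            if d <= threshold and (best is None or d < best[0]):
                best = (d, a, b)
        if best is None:
            # No permissible merge remains below the threshold.
            return sorted(sorted(c) for c in clusters)
        _, a, b = best  # a < b, so deleting b leaves index a valid
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]


# Toy data: speakers 0 and 1 come from the same file and were already
# separated there, so they may not be merged into one global identity.
vectors = [[0.0, 0.0], [0.1, 0.0], [0.02, 0.0], [5.0, 5.0]]
constrained = constrained_cluster(vectors, {frozenset((0, 1))}, threshold=1.0)
unconstrained = constrained_cluster(vectors, set(), threshold=1.0)
print(constrained)
print(unconstrained)
```

In this toy run the constraint keeps speakers 0 and 1 in separate clusters even though their vectors are close, whereas the unconstrained run merges them. A real system would likely use learned speaker models and a more careful distance measure.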