I'm using Kaldi for ASR, and now I want to do speaker segmentation using Kaldi's x-vector approach. They provide some example segmentation scripts at https://github.com/kaldi-asr/kaldi/tree/master/egs/sre16/v2 . They also provide a basic model pretrained on an LDC corpus at https://david-ryan-snyder.github.io/2017/10/04/model_sre16_v2.html
This pretrained model has the following structure when unarchived:
I don't have access to the LDC corpus. How can I train a model on my own data, and then how do I use that model to do the actual segmentation?
There is a VoxCeleb demo that uses public data, so you can run it yourself.
You can also put your own data into the proper data directory structure (create the data/utt2spk and data/wav.scp files) and run the same scripts on it.
https://github.com/kaldi-asr/kaldi/tree/master/egs/voxceleb/v2
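As a rough illustration of the data preparation step, here is a minimal Python sketch that writes the two files. It assumes the usual Kaldi conventions: each line of wav.scp is `<utterance-id> <path>`, each line of utt2spk is `<utterance-id> <speaker-id>`, and both files are sorted by utterance ID. The function name `write_kaldi_data_dir`, the `<speaker>-<file stem>` naming scheme, and the example paths are made up for this illustration and are not part of Kaldi.

```python
from pathlib import Path


def write_kaldi_data_dir(recordings, data_dir):
    """Write wav.scp and utt2spk for a list of (speaker_id, wav_path) pairs.

    Utterance IDs are built as "<speaker>-<file stem>" (an assumed naming
    scheme) so that every utterance ID starts with its speaker ID.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)

    entries = []
    for speaker_id, wav_path in recordings:
        utt_id = f"{speaker_id}-{Path(wav_path).stem}"
        entries.append((utt_id, speaker_id, str(wav_path)))
    entries.sort()  # both files must be sorted by utterance ID

    with open(data_dir / "wav.scp", "w") as wav_scp, \
            open(data_dir / "utt2spk", "w") as utt2spk:
        for utt_id, speaker_id, wav_path in entries:
            wav_scp.write(f"{utt_id} {wav_path}\n")
            utt2spk.write(f"{utt_id} {speaker_id}\n")

    # Return the utterance IDs in the order they were written.
    return [utt_id for utt_id, _, _ in entries]


if __name__ == "__main__":
    # Hypothetical recordings; replace with your own speakers and files.
    write_kaldi_data_dir(
        [("alice", "/audio/alice_01.wav"), ("bob", "/audio/bob_01.wav")],
        "data/my_audio",
    )
```

The recipes also expect a spk2utt file in the same directory. Kaldi ships utils/utt2spk_to_spk2utt.pl to generate it from utt2spk.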
Start with the scripts from the demo and remove the unused parts. That will give you a basic segmentation demo. You can call this reduced demo from your application with a system(3) call or in a similar way.
Then, if you need to, you can turn the scripts into the corresponding C++ API calls and run the same procedure from C++ or from any scripting language.
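Calling the reduced demo from an application could look roughly like the following Python sketch, which is the scripting-language counterpart of a system() call. The script name `run_segmentation.sh`, its two positional arguments, and the directory names are hypothetical. They stand for whatever your trimmed-down copy of the demo ends up accepting.

```python
import subprocess


def build_segmentation_command(script, data_dir, output_dir):
    """Return the argument list for the (hypothetical) reduced demo script."""
    return ["bash", str(script), str(data_dir), str(output_dir)]


def run_segmentation(script, data_dir, output_dir, recipe_dir):
    """Run the reduced demo from the recipe directory and raise on failure.

    Kaldi recipe scripts typically rely on relative paths (path.sh, utils/,
    steps/), so the command is started with recipe_dir as working directory.
    """
    cmd = build_segmentation_command(script, data_dir, output_dir)
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(cmd, cwd=recipe_dir, check=True)


if __name__ == "__main__":
    # All paths below are placeholders for your own setup.
    run_segmentation(
        "run_segmentation.sh",
        "data/my_audio",
        "exp/segments",
        recipe_dir="/path/to/kaldi/egs/voxceleb/v2",
    )
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.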