Voice SDK API
Enrollment
To use the enrollment endpoint, the developer must provide a vector of audios containing either one audio or three to five audios:
- 1 audio for text-independent enrollment.
- 3 to 5 audios for text-dependent enrollment.
The developer must also specify whether a proof of life is wanted during enrollment and, since this SDK works with probabilities, a liveness probability threshold above which audios are considered valid.
Audios are given as a byte buffer containing the file, without any decompression or data extraction, encoded in base64. The audios can be encrypted or not; the system detects this automatically.
As a result of a correct execution, a biometric template is returned, which can be used in subsequent authentications. The returned template is always encrypted and encoded in base64.
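Before calling the service, it can be useful to validate the inputs against the rules above. The following is a minimal self-contained sketch; the `ValidateEnrollmentInput` helper and the definition of `ByteBuffer` as a byte vector are illustrative assumptions, not part of the SDK:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative stand-in for the SDK's ByteBuffer type.
using ByteBuffer = std::vector<std::uint8_t>;

// Checks the enrollment rules described above: exactly 1 audio
// (text-independent) or 3 to 5 audios (text-dependent), and a
// liveness threshold expressed as a probability in [0, 1].
bool ValidateEnrollmentInput(const std::vector<ByteBuffer> &audios,
                             bool checkLiveness, float livenessThreshold) {
    const std::size_t n = audios.size();
    const bool validCount = (n == 1) || (n >= 3 && n <= 5);
    const bool validThreshold =
        !checkLiveness ||
        (livenessThreshold >= 0.0f && livenessThreshold <= 1.0f);
    return validCount && validThreshold;
}
```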
#include <iostream>
#include <facephi/voice-sdk/services/EnrollmentService.hpp>
void EnrollmentExample(const std::vector<ByteBuffer> &audios,
                       bool checkLiveness, float livenessThreshold) {
    Facephi::VoiceSdk::EnrollResponse enrollResponse;
    enrollResponse.EnrollmentService(audios, checkLiveness, livenessThreshold);
    std::cout << "Generated template " << enrollResponse.Template() << std::endl;
}
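As noted above, each audio must be the raw file bytes encoded in base64; the SDK examples use its own helpers for this (`Utils::ReadBinaryFile`, `Utils::ReadFileAsBase64`). As a self-contained illustration of that preparation step, reading a file and base64-encoding it could look like the sketch below, where `ByteBuffer` is again assumed to be a byte vector:

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

using ByteBuffer = std::vector<std::uint8_t>;

// Reads a whole file as raw bytes, without any decompression or
// parsing, as the enrollment endpoint expects.
ByteBuffer ReadBinaryFile(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    return ByteBuffer(std::istreambuf_iterator<char>(in), {});
}

// Minimal standard base64 encoder (RFC 4648 alphabet, '=' padding).
std::string EncodeBase64(const ByteBuffer &data) {
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(((data.size() + 2) / 3) * 4);
    for (std::size_t i = 0; i < data.size(); i += 3) {
        std::uint32_t chunk = static_cast<std::uint32_t>(data[i]) << 16;
        if (i + 1 < data.size()) chunk |= static_cast<std::uint32_t>(data[i + 1]) << 8;
        if (i + 2 < data.size()) chunk |= data[i + 2];
        out.push_back(alphabet[(chunk >> 18) & 0x3F]);
        out.push_back(alphabet[(chunk >> 12) & 0x3F]);
        out.push_back(i + 1 < data.size() ? alphabet[(chunk >> 6) & 0x3F] : '=');
        out.push_back(i + 2 < data.size() ? alphabet[chunk & 0x3F] : '=');
    }
    return out;
}
```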
It is also possible to use encryption on audios and templates through the tokenizer library. For this, there are two methods: one passing JSON objects and the other passing a byte buffer.
- JSON object:
JsonCpp jsonRequest;
jsonRequest["check_liveness"] = true;
jsonRequest["liveness_threshold"] = 0.3;
jsonRequest["audios"] = JsonCpp();
jsonRequest["minimum_snr_db"] = 15.0f;
jsonRequest["minimum_speech_duration_ms"] = 1500;
const ByteBuffer &audioBuffer
{Encryption::EncryptDataB64("audio", Utils::ReadBinaryFile("my_audio.wav"))};
jsonRequest["audios"].append(String(audioBuffer.begin(), audioBuffer.end()));
EnrollResponse enrollResponse;
enrollResponse.EnrollmentService(std::make_shared<JsonCpp>(jsonRequest));
const JsonCpp &jsonResponse = enrollResponse.GetJson();
- Byte buffer:
Tokenizer::TokenizerData requestData;
requestData.AddDocumentData("check_liveness", "true");
requestData.AddDocumentData("liveness_threshold", "0.3");
requestData.AddImageData("audio_0", Utils::ReadFileAsBase64("my_audio.wav"));
ByteBuffer encryptedRequest = Tokenizer::Encrypt(requestData.Write());
VoiceSdk::EnrollResponse enrollResponse;
enrollResponse.EnrollmentService(encryptedRequest);
ByteBuffer encryptedResponse = enrollResponse.GetByteBuffer();
Tokenizer::TokenizerData responseData;
String segregationCode;
responseData.Load(Tokenizer::Decrypt_V1(encryptedResponse, segregationCode));
Authentication
To use the authentication endpoint, the developer must provide an audio and a template to match against. This template must have been previously generated by an enrollment process. Unlike enrollment, proof of life is always performed, but a probability threshold for considering the audio valid must still be specified.
As a result of a correct execution, the function returns whether the audio matches the template and the probability of the match.
#include <iostream>
#include <facephi/voice-sdk/services/AuthenticationService.hpp>
void AuthenticationExample(const ByteBuffer &audio,
                           const ByteBuffer &templ, float livenessThreshold) {
    Facephi::VoiceSdk::AuthResponse authResponse;
    authResponse.AuthenticationService(audio, templ, livenessThreshold);
    if (authResponse.Match()) {
        std::cout << "Audio matches with template!" << std::endl;
    } else {
        std::cout << "Audio does not match with template..." << std::endl;
    }
}
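Since proof of life is always checked during authentication, an application will typically gate acceptance on both the match result and the liveness threshold. A self-contained sketch of that decision logic follows; the `AcceptAuthentication` helper and the idea that the caller has a liveness probability alongside the match flag are assumptions for illustration, not SDK API:

```cpp
// Combines the authentication outputs into a final accept/reject
// decision: the audio must both match the template and clear the
// liveness probability threshold supplied by the caller.
bool AcceptAuthentication(bool match, float livenessProbability,
                          float livenessThreshold) {
    return match && livenessProbability >= livenessThreshold;
}
```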
It is also possible to use encryption on audios and templates through the tokenizer library. For this, there are two methods: one passing JSON objects and the other passing a byte buffer.
- JSON object:
JsonCpp jsonRequest;
jsonRequest["template"] = enrollmentTemplate;
jsonRequest["liveness_threshold"] = 0.3f;
jsonRequest["minimum_snr_db"] = 10.0f;
jsonRequest["minimum_speech_duration_ms"] = 2000;
const ByteBuffer &audioBuffer
{Encryption::EncryptDataB64("audio", Utils::ReadBinaryFile("my_audio.wav"))};
jsonRequest["audio"] = String(audioBuffer.begin(), audioBuffer.end());
AuthResponse authResponse;
authResponse.AuthenticationService(std::make_shared<JsonCpp>(jsonRequest));
const JsonCpp &jsonResponse = authResponse.GetJson();
- Byte buffer:
Tokenizer::TokenizerData requestData;
requestData.AddDocumentData("template", "BgsALLvvv...");
requestData.AddDocumentData("liveness_threshold", "0.3");
requestData.AddImageData("audio", Utils::ReadFileAsBase64("my_audio.wav"));
ByteBuffer encryptedRequest = Tokenizer::Encrypt(requestData.Write());
VoiceSdk::AuthResponse authResponse {};
authResponse.AuthenticationService(encryptedRequest);
ByteBuffer encryptedResponse = authResponse.GetByteBuffer();
Tokenizer::TokenizerData responseData;
String segregationCode;
responseData.Load(Tokenizer::Decrypt_V1(encryptedResponse, segregationCode));
enrollmentTemplate = responseData.GetDocumentData().at("template");