VoiceRecognition

Metreos.MediaControl.VoiceRecognition

Asynchronous Callbacks

Summary

Applies one to three speech grammars defined on a Nuance OSR 3.01 server to the specified connection in order to detect spoken phrases in real time.

Usage

The VoiceRecognition action allows one to detect pre-defined phrases spoken on a connection. One can also specify WAV or VOX files to be played to the connection in conjunction with the VoiceRecognition command.

Once VoiceRecognition has finished successfully with the VoiceRecognition_Complete event, one can extract the top-matched meaning and its confidence score from the Meaning and Score event parameters.
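
As a minimal sketch of that step, the Python below reads Meaning and Score from the event parameters of a completion event and accepts the match only when the score reaches a caller-chosen threshold. The dictionary shape, the helper name pick_meaning, and the idea that Score parses as a number are assumptions made for illustration; the source does not state the score scale, so the threshold is left to the caller.

```python
from typing import Mapping, Optional


def pick_meaning(event_params: Mapping[str, str], min_score: float) -> Optional[str]:
    """Return the top-matched meaning if its score reaches min_score, else None.

    event_params is a hypothetical name-to-value mapping of the
    VoiceRecognition_Complete event parameters.
    """
    meaning = event_params.get("Meaning")
    raw_score = event_params.get("Score")
    if not meaning or raw_score is None:
        return None  # nothing was recognized, or the parameters are missing
    try:
        score = float(raw_score)
    except (TypeError, ValueError):
        return None  # Score was not numeric (assumption: it normally is)
    return meaning if score >= min_score else None
```

Keeping the threshold as an argument lets an application tune how strict it is without changing the handler.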

Two action parameters are unique to VoiceRecognition audio streaming: VoiceBargeIn and CancelOnDigit. If VoiceBargeIn is set to true, any specified prompts stop playing when voice is detected on the connection, although voice recognition continues until completion. If CancelOnDigit is set to true, a digit press causes the voice recognition command to process the audio received up to that point and exit with VoiceRecognition_Complete. The Meaning and Score event parameters are still valid in this case.

The termination condition parameters on the action (TermCondMaxTime, TermCondSilence, and TermCondNonSilence) can be combined to define the set of conditions under which the action stops successfully.
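
One way to picture this is a small helper that collects only the termination conditions an application wants to arm. The sketch below is illustrative: the function name termination_params and the plain-dictionary representation are assumptions, and only the parameter names and their UInt32 millisecond values come from this page.

```python
from typing import Dict, Optional

UINT32_MAX = 2**32 - 1  # the termination parameters are System.UInt32


def termination_params(max_time_ms: Optional[int] = None,
                       silence_ms: Optional[int] = None,
                       non_silence_ms: Optional[int] = None) -> Dict[str, int]:
    """Build a dict holding only the termination conditions that were supplied."""
    candidates = {
        "TermCondMaxTime": max_time_ms,
        "TermCondSilence": silence_ms,
        "TermCondNonSilence": non_silence_ms,
    }
    params: Dict[str, int] = {}
    for name, value in candidates.items():
        if value is None:
            continue  # condition not armed
        if not 0 <= value <= UINT32_MAX:
            raise ValueError(f"{name} must fit in an unsigned 32-bit integer")
        params[name] = value
    return params
```

For example, termination_params(max_time_ms=10000, silence_ms=2000) arms a ten-second limit and a two-second silence limit and leaves TermCondNonSilence unset.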

Remarks

Text-to-speech is not currently supported when playing audio with the VoiceRecognition command. To play a text-to-speech prompt in parallel with a VoiceRecognition command, use the VoiceRecognition action (with no prompts specified) and the Play action (with TTS prompts specified) concurrently.

The grammar files specified must already have been provisioned on the Nuance OSR server before the action commences.

The following properties cover most of the audio files that the media engine can play: sample rates of 6, 8, and 11 kHz; sample sizes of 4, 8, and 16 bits; and encoding types of ulaw, alaw, pcm, and adpcm. Only mono VOX and WAV files are allowed.
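
These constraints can be checked before a prompt is submitted. The sketch below is a hypothetical pre-flight check, not part of the media engine: the function name audio_file_problems and its arguments are assumptions, and the caller is expected to already know the file's properties.

```python
import os
from typing import List

ALLOWED_SAMPLE_RATES_KHZ = {6, 8, 11}
ALLOWED_SAMPLE_SIZES_BITS = {4, 8, 16}
ALLOWED_ENCODINGS = {"ulaw", "alaw", "pcm", "adpcm"}
ALLOWED_EXTENSIONS = {".wav", ".vox"}


def audio_file_problems(filename: str, sample_rate_khz: int,
                        sample_size_bits: int, encoding: str,
                        channels: int = 1) -> List[str]:
    """Return a list of reasons the file falls outside the documented properties."""
    problems: List[str] = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if sample_rate_khz not in ALLOWED_SAMPLE_RATES_KHZ:
        problems.append(f"unsupported sample rate: {sample_rate_khz} kHz")
    if sample_size_bits not in ALLOWED_SAMPLE_SIZES_BITS:
        problems.append(f"unsupported sample size: {sample_size_bits} bits")
    if encoding.lower() not in ALLOWED_ENCODINGS:
        problems.append(f"unsupported encoding: {encoding}")
    if channels != 1:
        problems.append("only mono files are allowed")
    return problems
```

An empty list means the file matches the properties listed above; it does not guarantee that the media engine will accept the file.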

A VoiceRecognition on a connection or a conference uses a speech resource until the action results in the VoiceRecognition_Complete event. Any prompts use this same speech resource rather than an additional voice resource.

Action Parameters

| Parameter Name | .NET Type | Default | Description |
| --- | --- | --- | --- |
| Prompt3 | System.String | | The name of an audio file to play to the connection. Prompt fields can in general hold either an audio file name or a free-form string to be converted to text-to-speech, but text-to-speech is not supported with VoiceRecognition (see Remarks). |
| CancelOnDigit | System.Boolean | | Indicates whether to stop the action successfully when a digit is entered, returning the VoiceRecognition_Complete event. |
| CommandTimeout | System.UInt32 | | A command timeout value (in milliseconds). |
| Volume | System.Int32 | | The amount by which to modify the volume (in decibels) of audio playback. Valid values range from -10 to 10. |
| Speed | System.Int32 | | The amount by which to modify the speed of audio playback. Valid values range from -10 to 10. |
| State | System.String | | Optional user state information, which is guaranteed to be present as the State event parameter in VoiceRecognition_Complete or VoiceRecognition_Failed. |
| VoiceBargeIn | System.Boolean | | Indicates whether the occurrence of voice on the connection should abort any specified prompts. |
| ConnectionId * | System.String | | The connection on which to perform the VoiceRecognition. |
| TermCondMaxTime | System.UInt32 | | The amount of time (in milliseconds) that can elapse before terminating the voice recognition operation. If this condition is met, the VoiceRecognition command results in the VoiceRecognition_Complete event. |
| TermCondSilence | System.UInt32 | | The amount of silence (in milliseconds) to observe before terminating the voice recognition operation. If this condition is met, the VoiceRecognition command results in the VoiceRecognition_Complete event with a TerminationCondition of silence. |
| TermCondNonSilence | System.UInt32 | | The amount of non-silence (in milliseconds) to observe before terminating the voice recognition operation. If this condition is met, the VoiceRecognition command results in the VoiceRecognition_Complete event. |
| AudioFileSampleRate | System.UInt32 | | The sample rate of the audio file (in kHz). Valid values are 6, 8, or 11. Avoid 11, as it has a higher impact on the media engine. If not specified, the media engine configuration file defines the sample rate to use, which by default is 8. |
| AudioFileSampleSize | System.UInt32 | | The sample size used in the audio file (in bits). Valid values are 4, 8, or 16. Avoid 4 and 16, as each has a higher impact on the media engine. |
| AudioFileEncoding | System.String | | The encoding of the audio file: ulaw, alaw, pcm, or adpcm. Avoid pcm and adpcm, as each has a higher impact on the media engine. If not specified, the media engine configuration file defines the file encoding to use, which by default is ulaw. |
| Prompt1 | System.String | | The name of an audio file to play to the connection. Prompt fields can in general hold either an audio file name or a free-form string to be converted to text-to-speech, but text-to-speech is not supported with VoiceRecognition (see Remarks). |
| Prompt2 | System.String | | The name of an audio file to play to the connection. Prompt fields can in general hold either an audio file name or a free-form string to be converted to text-to-speech, but text-to-speech is not supported with VoiceRecognition (see Remarks). |
| Timeout | System.Int32 | | Specifies how long the Application Runtime Environment waits for a response from the provider for the current action. If no response arrives within that time, the ReturnValue is Timeout. The value must be a literal, in milliseconds. |
| Grammar1 * | System.String | | The name of a grammar file (with extension) that defines the grammar rules to use when interpreting the voice input on the connection. |
| Grammar2 | System.String | | The name of a grammar file (with extension) that defines the grammar rules to use when interpreting the voice input on the connection. |
| Grammar3 | System.String | | The name of a grammar file (with extension) that defines the grammar rules to use when interpreting the voice input on the connection. |

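
To show how these parameters fit together, the sketch below assembles a parameter set and enforces the documented limits: one to three grammars, and Volume and Speed between -10 and 10. The function name voice_recognition_params, the dictionary representation, and the example file names are hypothetical. Treating ConnectionId and Grammar1 (the parameters marked with *) as required, and capping prompts at the three Prompt fields, are also assumptions.

```python
from typing import Dict, Optional, Sequence, Union

ParamValue = Union[str, int]


def voice_recognition_params(connection_id: str,
                             grammars: Sequence[str],
                             prompts: Sequence[str] = (),
                             volume: Optional[int] = None,
                             speed: Optional[int] = None) -> Dict[str, ParamValue]:
    """Assemble action parameters, rejecting values outside the documented limits."""
    if not connection_id:
        raise ValueError("ConnectionId is required (assumed from the * marker)")
    if not 1 <= len(grammars) <= 3:
        raise ValueError("between one and three grammars must be specified")
    if len(prompts) > 3:
        raise ValueError("at most three prompts can be specified")

    params: Dict[str, ParamValue] = {"ConnectionId": connection_id}
    for index, grammar in enumerate(grammars, start=1):
        params[f"Grammar{index}"] = grammar  # Grammar1..Grammar3
    for index, prompt in enumerate(prompts, start=1):
        params[f"Prompt{index}"] = prompt    # Prompt1..Prompt3
    for name, value in (("Volume", volume), ("Speed", speed)):
        if value is None:
            continue  # leave the parameter unset
        if not -10 <= value <= 10:
            raise ValueError(f"{name} must be between -10 and 10")
        params[name] = value
    return params
```

A call such as voice_recognition_params("conn-1", ["menu.grammar"], prompts=["welcome.wav"], volume=3) yields ConnectionId, Grammar1, Prompt1, and Volume entries; the identifiers in that call are placeholders.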

Result Data

| Parameter Name | .NET Type | Description |
| --- | --- | --- |
| OperationId | System.String | A unique identifier for this VoiceRecognition operation. This identifier can later be used by the StopMediaOperation action to stop just this particular operation on a connection, even if multiple media operations are executing concurrently on that connection. |
| ResultCode | System.String | A numeric code indicating the result status of the operation. A '0' indicates success; a positive number indicates an error. See the Media Control Error Codes table for descriptions of specific error codes. |
| ConnectionId | System.String | The same value that was specified as the ConnectionId action parameter. This is the ConnectionId one would later specify in StopMediaOperation to abort the command programmatically. |
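
Because ResultCode arrives as a string that holds a number, a small helper can make the success check explicit. This is a sketch only: the function name classify_result_code is hypothetical, and negative codes are rejected because this page does not describe them.

```python
def classify_result_code(result_code: str) -> str:
    """Map a ResultCode string to 'success' or 'error'.

    '0' means success and a positive number means an error.
    Raises ValueError for non-numeric or negative input.
    """
    code = int(result_code.strip())  # int() raises ValueError on non-numeric text
    if code == 0:
        return "success"
    if code > 0:
        return "error"
    raise ValueError(f"unexpected negative ResultCode: {code}")
```

The meaning of a specific positive code still has to be looked up in the Media Control Error Codes table.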

Branch Conditions 

Success

No description.

Failure

No description.

Timeout

No description.