Provider
Asynchronous
No Custom Parameters
Applies one to three speech grammars defined on a Nuance OSR 3.01 server to the specified connection in order to detect spoken phrases in real time.
The VoiceRecognition action allows one to detect pre-defined phrases spoken on a connection. One can also specify WAV or VOX files to be played to the connection in conjunction with the VoiceRecognition command.
Once VoiceRecognition has finished successfully with the VoiceRecognition_Complete event, one can extract the top-matched meaning and its confidence score from the Meaning and Score event parameters.
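As a rough illustration of reading those two event parameters, the sketch below treats the VoiceRecognition_Complete event parameters as a plain dictionary. The function name `top_match`, the dictionary representation, and the `min_score` threshold are assumptions made for this example; the scale of Score is not stated here, so the threshold is left to the caller.

```python
def top_match(event_params, min_score):
    """Return (meaning, score) from VoiceRecognition_Complete parameters, or None.

    event_params is a hypothetical dict of event parameter names to values.
    min_score is a caller-chosen threshold; the Score scale is an assumption.
    """
    meaning = event_params.get("Meaning")
    raw_score = event_params.get("Score")
    if not meaning or raw_score is None:
        return None  # nothing was matched
    score = float(raw_score)  # assumes Score parses as a number
    if score < min_score:
        return None  # matched, but below the caller's confidence threshold
    return meaning, score
```

A caller would typically branch on the returned meaning and re-prompt when the result is `None`.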
Two action parameters are unique to VoiceRecognition audio streaming: VoiceBargeIn and CancelOnDigit. VoiceBargeIn, if set to true, causes any specified prompts to stop playing as soon as voice is detected on the connection, while voice recognition continues until completion. CancelOnDigit, if set to true, causes the voice recognition command to process the audio received up to the digit press and exit with VoiceRecognition_Complete. The Meaning and Score event parameters are valid to use in this case.
The termination condition parameters on the action (TermCondMaxTime, TermCondSilence, and TermCondNonSilence) together define the set of conditions under which the action should stop successfully.
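One way to picture how the termination conditions combine is the sketch below. It assumes that the conditions are independent, that the action stops as soon as any configured condition is met, and that conditions left unset are ignored. The function name and the dictionaries of observed durations are hypothetical; the media engine performs this evaluation itself.

```python
TERMINATION_CONDITIONS = ("TermCondMaxTime", "TermCondSilence", "TermCondNonSilence")


def first_met_condition(thresholds_ms, observed_ms):
    """Return the name of the first configured condition that is met, else None.

    thresholds_ms: configured limits in milliseconds (unset conditions omitted).
    observed_ms:   hypothetical measured durations in milliseconds, same keys.
    """
    for name in TERMINATION_CONDITIONS:
        limit = thresholds_ms.get(name)
        if limit is not None and observed_ms.get(name, 0) >= limit:
            return name
    return None
```

For example, with a 10000 ms maximum time and a 2000 ms silence limit, 2500 ms of observed silence after 4000 ms elapsed yields `"TermCondSilence"`.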
Text-to-speech is not currently supported when playing audio with the VoiceRecognition command. If one wishes to play a text-to-speech prompt in parallel with performing a VoiceRecognition command, one must instead use the VoiceRecognition (with no prompts specified) and Play (with TTS prompts specified) actions concurrently.
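The workaround above can be sketched as two parameter sets that target the same connection. This is only an illustration: the helper name is hypothetical, and the use of a `Prompt1` parameter on the Play action is an assumption not confirmed by this page.

```python
def split_tts_recognition(connection_id, grammar, tts_text):
    """Build parameter dicts for concurrent VoiceRecognition and Play actions.

    VoiceRecognition gets no prompts; the text-to-speech prompt goes to Play.
    The Play parameter name "Prompt1" is an assumption for this sketch.
    """
    recognition = {"ConnectionId": connection_id, "Grammar1": grammar}
    play = {"ConnectionId": connection_id, "Prompt1": tts_text}
    return {"VoiceRecognition": recognition, "Play": play}
```

Both actions would then be started on the connection without waiting for either to complete.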
The grammar files specified must already have been provisioned on the Nuance OSR server before the action commences.
The following properties cover most allowable audio files that can be played by the media engine: sample rates of 6, 8, and 11 kHz; sample sizes of 4, 8, and 16 bits; and encoding types of ulaw, alaw, pcm, and adpcm. Only mono VOX and WAV files are allowed.
A VoiceRecognition action on a connection or a conference occupies a speech resource until the action results in the VoiceRecognition_Complete event. Any prompts use this same speech resource rather than an additional voice resource.
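The audio constraints listed above can be checked before a prompt is submitted. The following is a minimal sketch: the function name and its arguments are hypothetical, and because the listed properties cover most but not necessarily all allowable files, a clean result does not guarantee that the media engine will accept a file.

```python
import os

ALLOWED_RATES_KHZ = {6, 8, 11}
ALLOWED_SIZES_BITS = {4, 8, 16}
ALLOWED_ENCODINGS = {"ulaw", "alaw", "pcm", "adpcm"}
ALLOWED_EXTENSIONS = {".wav", ".vox"}


def audio_file_problems(filename, rate_khz, size_bits, encoding, channels):
    """Return a list of reasons the file falls outside the documented properties."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append("file must be WAV or VOX")
    if channels != 1:
        problems.append("file must be mono")
    if rate_khz not in ALLOWED_RATES_KHZ:
        problems.append("sample rate must be 6, 8, or 11 kHz")
    if size_bits not in ALLOWED_SIZES_BITS:
        problems.append("sample size must be 4, 8, or 16 bits")
    if encoding.lower() not in ALLOWED_ENCODINGS:
        problems.append("encoding must be ulaw, alaw, pcm, or adpcm")
    return problems
```

An empty list means the file matches every documented property.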
| Parameter Name | .NET Type | Default | Description |
|---|---|---|---|
| Prompt3 | System.String | | A prompt field can be one of two types of values. It can be either an audio file name or a free-formed string which will be converted to text-to-speech. |
| CancelOnDigit | System.Boolean | | Indicates to stop the action successfully when a digit is entered, returning the VoiceRecognition_Complete event. |
| CommandTimeout | System.UInt32 | | Indicates a command timeout value (in milliseconds). |
| Volume | System.Int32 | | The amount by which to modify the volume (in decibels) of audio playback. Valid values range from -10 to 10. |
| Speed | System.Int32 | | The amount by which to modify the speed of audio playback. Valid values range from -10 to 10. |
| State | System.String | | Optional user state information which is guaranteed to be present as the State event parameter in VoiceRecognition_Complete or VoiceRecognition_Failed. |
| VoiceBargeIn | System.Boolean | | Indicates whether the occurrence of voice on the connection should abort any specified prompts. |
| ConnectionId * | System.String | | The connection to perform the VoiceRecognition on. |
| TermCondMaxTime | System.UInt32 | | The amount of time (in milliseconds) that can elapse before terminating the voice recognition operation. If this condition is met, the VoiceRecognition command will result in the VoiceRecognition_Complete event. |
| TermCondSilence | System.UInt32 | | The amount of silence (in milliseconds) to observe before terminating the voice recognition operation. If this condition is met, the VoiceRecognition command will result in the VoiceRecognition_Complete event with a TerminationCondition of silence. |
| TermCondNonSilence | System.UInt32 | | The amount of non-silence (in milliseconds) to observe before terminating the voice recognition operation. If this condition is met, the VoiceRecognition command will result in the VoiceRecognition_Complete event. |
| AudioFileSampleRate | System.UInt32 | | The sample rate of the audio file (in kHz). Valid values are 6, 8, or 11. 11 should be avoided as it has a higher impact on the media engine. If not specified, the media engine configuration file defines the sample rate to use, which by default is 8. |
| AudioFileSampleSize | System.UInt32 | | The sample size used in the audio file (in bits). Valid values are 4, 8, or 16. 4 and 16 should be avoided as each has a higher impact on the media engine. |
| AudioFileEncoding | System.String | | The encoding of the audio file: ulaw, alaw, pcm, or adpcm. Pcm and adpcm should be avoided as each has a higher impact on the media engine. If not specified, the media engine configuration file defines the file encoding to use, which by default is ulaw. |
| Prompt1 | System.String | | A prompt field can be one of two types of values. It can be either an audio file name or a free-formed string which will be converted to text-to-speech. |
| Prompt2 | System.String | | A prompt field can be one of two types of values. It can be either an audio file name or a free-formed string which will be converted to text-to-speech. |
| Timeout | System.Int32 | | Specifies to the Application Runtime Environment how long to wait for a response from the provider for the current action. If no response arrives within this time, the ReturnValue returned is Timeout. The value must be a literal value in milliseconds. |
| Grammar1 * | System.String | | The name of a grammar file (with extension) which defines the grammar rules to use when interpreting the voice input on the connection. |
| Grammar2 | System.String | | The name of a grammar file (with extension) which defines the grammar rules to use when interpreting the voice input on the connection. |
| Grammar3 | System.String | | The name of a grammar file (with extension) which defines the grammar rules to use when interpreting the voice input on the connection. |
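
A few of the constraints in the table above lend themselves to a check before the action is issued. The sketch below assumes that the asterisk marks required parameters and that the parameters are held in a plain dictionary. The function name and the grammar file name used in the usage note are hypothetical.

```python
REQUIRED = ("ConnectionId", "Grammar1")  # assumed: parameters marked with *
RANGED = {"Volume": (-10, 10), "Speed": (-10, 10)}


def check_action_params(params):
    """Return a list of problems found in a dict of VoiceRecognition parameters."""
    problems = []
    for name in REQUIRED:
        if not params.get(name):
            problems.append(f"{name} is required")
    for name, (low, high) in RANGED.items():
        value = params.get(name)
        if value is not None and not low <= value <= high:
            problems.append(f"{name} must be between {low} and {high}")
    return problems
```

For instance, `check_action_params({"ConnectionId": "42", "Grammar1": "menu.grxml", "Volume": 3})` returns an empty list.
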
| Parameter Name | .NET Type | Description |
|---|---|---|
| OperationId | System.String | A unique identifier for this VoiceRecognition operation. This identifier can later be used by the StopMediaOperation action to stop just this particular operation on a connection, even if multiple media operations are concurrently executing on that connection. |
| ResultCode | System.String | A numeric code indicating the result status of the operation. A '0' indicates success; a positive number indicates an error. See the Media Control Error Codes table for descriptions of specific error codes. |
| ConnectionId | System.String | The value of the ConnectionId result data is the same as that specified as an action parameter. This ConnectionId is what one would later specify in StopMediaOperation if one were to abort the command programmatically. |
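
The result data above might be consumed as in the following sketch, which assumes the result data arrives as a dictionary of strings. The helper names are hypothetical, and the StopMediaOperation parameter names are inferred from the descriptions in the table rather than confirmed.

```python
def result_succeeded(result_code):
    """Return True when ResultCode is '0'; positive values indicate an error."""
    return int(result_code) == 0


def stop_params(result_data):
    """Build parameters for a later StopMediaOperation on this one operation.

    The parameter names used here are assumptions based on the table above.
    """
    return {
        "ConnectionId": result_data["ConnectionId"],
        "OperationId": result_data["OperationId"],
    }
```

Keeping the OperationId allows one to stop only this recognition even when other media operations are running on the same connection.
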
Branch Conditions
No description.
No description.
No description.