Speech and NLP

In this section you can find all of the API calls related to triggering Misty's speech functions and utilizing her Natural Language Processing capabilities.

Speak

Starts Misty speaking text using her onboard text-to-speech engine.

By default, Misty speaks in US English. You can find the full list of languages and their reference codes in Languages.

Example Code

misty.speak("Buongiorno, mi chiamo Misty", 1, 1, "it-it-x-itb-local")

To stop Misty speaking before she reaches the end of a text-to-speech utterance, use the misty.StopSpeaking command.

Misty raises a TextToSpeechComplete event when she finishes speaking a text-to-speech utterance. To receive a TextToSpeechComplete event message for a given utterance in your skills, you must set an utteranceId when you issue the Speak command, and you must register a listener for TextToSpeechComplete events.

The Speak command uses the text-to-speech (TTS) engine on Misty's 820 processor. At this time Misty's TTS engine supports a limited subset of Speech Synthesis Markup Language (SSML) Version 1.0.
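
The utteranceId bookkeeping described above can be sketched as follows. This is a minimal sketch, assuming `misty` is an already-connected robot object from the Python SDK. `UtteranceTracker` and its method names are hypothetical helpers, not part of Misty's SDK. How you register for TextToSpeechComplete events and read the identifier out of the event message depends on your SDK version.

```python
import uuid


class UtteranceTracker:
    """Tracks which Speak utterances are still waiting for TextToSpeechComplete.

    Hypothetical helper for illustration; not part of Misty's SDK.
    """

    def __init__(self, misty):
        self.misty = misty
        self.pending = {}  # utteranceId -> text still being spoken

    def speak(self, text, flush=False):
        # An utteranceId must be set to receive TextToSpeechComplete.
        utterance_id = f"utt-{uuid.uuid4().hex[:8]}"
        self.pending[utterance_id] = text
        self.misty.speak(text=text, flush=flush, utteranceId=utterance_id)
        return utterance_id

    def on_complete(self, utterance_id):
        # Call this from your TextToSpeechComplete listener with the id
        # taken from the event message. Returns the finished text, or
        # None if this tracker did not issue the id.
        return self.pending.pop(utterance_id, None)
```

Generating the identifier inside the helper keeps each utterance unique, so a completion event can always be matched to exactly one Speak command.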

Parameters

misty.speak(self, text : str = None, pitch : float = None, speechRate : float = None, voice : str = None, flush : bool = None, utteranceId : str = None, language : str = None)
  • Text (string) - The text to speak, along with any relevant SSML tags to customize speech synthesis.

  • Flush (bool) - Optional. Whether to flush all previously enqueued Speak commands. Default is false.

  • UtteranceId (string) - Optional. An identifier of your choosing for this instance of the Speak command. You must set a value for UtteranceId in order to receive a TextToSpeechComplete event when Misty stops speaking this utterance.

SpeakAndListen

Triggers Misty to listen for speech after speaking.

Example Code

misty.speak_and_listen("Hi, are you human?", True, None, "yes-no-questions")

Parameters

misty.speak_and_listen(self, text : str = None, flush : bool = None, utteranceId : str = None, context : str = None)
  • Text (string) - The text to speak, along with any relevant SSML tags to customize speech synthesis.

  • Flush (bool) - Optional. Whether to flush all previously enqueued Speak commands. Default is false.

  • UtteranceId (string) - Optional. An identifier of your choosing for this instance of the SpeakAndListen command. You must set a value for UtteranceId in order to receive a TextToSpeechComplete event when Misty stops speaking this utterance.

  • Context (string) - The name of the context file that Misty should reference when listening for speech.

StopSpeaking

Stops Misty speaking the currently playing text-to-speech utterance.

misty.stop_speaking()

StartKeyPhraseRecognition

Starts Misty listening for the "Hey, Misty!" key phrase. Additionally, configures Misty to record speech she detects after recognizing the key phrase. Misty's chest LED blinks blue when she is recording audio or listening for the key phrase.

Example Code

misty.start_key_phrase_recognition(captureSpeech=True, maxSpeechLength=3000, overwriteExisting=False, silenceTimeout=600)

Misty waits to start recording until she detects speech. She then records until she detects the end of the utterance. By default, Misty records an utterance up to 7.5 seconds in length. You can adjust the maximum duration of a speech recording with the MaxSpeechLength parameter.

There are two event types associated with key phrase recognition:

  • Misty triggers a KeyPhraseRecognized event each time she recognizes the "Hey, Misty" key phrase.

  • Misty triggers a VoiceRecord event when she captures a speech recording.

Note: Misty cannot use her microphones for wake word detection or recording speech while actively streaming audio and video.

Additional Notes
  • When you issue a StartKeyPhraseRecognition command, Misty listens for the key phrase by continuously sampling audio from the environment and comparing that audio to her trained key phrase model ("Hey, Misty!"). Misty does not create or save audio recordings until after she recognizes the key phrase.

  • Because Misty cannot record audio and listen for the "Hey, Misty!" key phrase at the same time, she stops listening for the key phrase when issued a separate command to start recording audio. To have Misty start listening for the key phrase after capturing speech, you must issue another StartKeyPhraseRecognition command.

  • When Misty recognizes the key phrase, she automatically stops listening for key phrase events. In order to start Misty listening for the key phrase again, you need to issue another StartKeyPhraseRecognition command.

Follow these steps to code Misty to respond to the "Hey, Misty!" key phrase:

  1. Invoke the StartKeyPhraseRecognition command. If needed, use the optional parameters to configure Misty's speech capture settings.

  2. Register an event listener for KeyPhraseRecognized event messages to trigger a callback function when Misty recognizes the key phrase.

  3. Register an event listener for VoiceRecord event messages to trigger a callback function when Misty captures a speech recording.

  4. Write the code to handle what Misty should do when she recognizes the key phrase and captures a speech recording. For example, you might have Misty send the captured speech off to a third-party service for additional processing.
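
The steps above can be sketched as follows. This is a sketch under stated assumptions, not a definitive implementation: `misty` is a connected robot object, and `register_listener` is a hypothetical callable that you supply to wrap whatever event registration your SDK version provides. `start_hey_misty_loop` and `on_speech` are also illustrative names.

```python
def start_hey_misty_loop(misty, register_listener, on_speech):
    """Wire up the four steps for responding to "Hey, Misty!".

    misty: connected robot object from the Python SDK.
    register_listener: hypothetical callable (event_type, callback) that
        wraps your SDK's event registration.
    on_speech: your handler, called with each VoiceRecord event message.
    """

    def arm():
        # Step 1: start listening and capture speech after the key phrase.
        misty.start_key_phrase_recognition(
            overwriteExisting=True, captureSpeech=True
        )

    def on_key_phrase(event):
        # Step 2 callback: runs when Misty recognizes the key phrase.
        print("Key phrase recognized")

    def on_voice_record(event):
        # Steps 3 and 4: handle the recording, then re-arm, because Misty
        # stops listening for the key phrase once she has recognized it.
        on_speech(event)
        arm()

    # Listeners are registered before arming so no event is missed.
    register_listener("KeyPhraseRecognized", on_key_phrase)
    register_listener("VoiceRecord", on_voice_record)
    arm()
```

Re-arming inside the VoiceRecord callback reflects the notes above: Misty stops listening after each recognition, so a continuous loop needs a fresh StartKeyPhraseRecognition command every time.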

Parameters

misty.start_key_phrase_recognition(self, overwriteExisting : bool = None, silenceTimeout : int = None, maxSpeechLength : int = None, captureSpeech : bool = None, speechRecognitionGrammar : str = None)
  • CaptureSpeech (bool) - Optional. If true, Misty starts recording speech after recognizing the "Hey, Misty" key phrase. By default, Misty saves speech recordings under the filename capture_HeyMisty.wav. Defaults to true.

  • MaxSpeechLength (int) - Optional. The maximum duration (in milliseconds) of the speech recording. If the length of an utterance exceeds this duration, Misty stops recording after the duration has elapsed, and the system triggers a VoiceRecord event with a message that Misty did not detect the end of the recorded speech. Range: 500 to 20000. Defaults to 7500 (7.5 seconds).

  • OverwriteExisting (bool) - Optional. If true, the captured speech recording overwrites any existing recording saved under the filename capture_HeyMisty.wav. If false, Misty saves the speech recording under a unique, timestamped filename: capture_HeyMisty_{Day}-{Month}-{Year}-{Hour}-{Minute}.wav. Defaults to true. Note: If you program Misty to save each unique speech recording, you should occasionally delete unused recordings to prevent them from filling the memory on the robot's 820 processor.

  • SilenceTimeout (int) - Optional. The maximum duration (in milliseconds) of silence that can precede speech before the speech capture mechanism times out. If Misty does not detect speech before the SilenceTimeout duration elapses, she stops listening for speech and triggers a VoiceRecord event with a message that she did not detect the beginning of speech. Range: 500 to 10000. Defaults to 5000 (5 seconds).
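
Because both durations have documented ranges, it can help to clamp values before sending the command. The helper below is a minimal sketch; `key_phrase_settings` and `clamp` are hypothetical names, and only the ranges and defaults come from the parameter descriptions above.

```python
SILENCE_TIMEOUT_RANGE = (500, 10000)    # milliseconds
MAX_SPEECH_LENGTH_RANGE = (500, 20000)  # milliseconds


def clamp(value, bounds):
    """Limit value to the inclusive (low, high) bounds."""
    low, high = bounds
    return max(low, min(high, value))


def key_phrase_settings(silence_timeout=5000, max_speech_length=7500,
                        capture_speech=True, overwrite_existing=True):
    """Return keyword arguments for start_key_phrase_recognition,
    with both durations clamped to their documented ranges."""
    return {
        "overwriteExisting": overwrite_existing,
        "silenceTimeout": clamp(silence_timeout, SILENCE_TIMEOUT_RANGE),
        "maxSpeechLength": clamp(max_speech_length, MAX_SPEECH_LENGTH_RANGE),
        "captureSpeech": capture_speech,
    }
```

With a connected robot object, the result can be unpacked directly: `misty.start_key_phrase_recognition(**key_phrase_settings(silence_timeout=3000))`. Passing keyword arguments also avoids depending on the positional order of the parameters.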

StopKeyPhraseRecognition

Stops Misty listening for the "Hey, Misty!" key phrase.

misty.stop_key_phrase_recognition()

CaptureSpeech

Starts capturing speech in a new audio recording. By default, Misty's chest LED pulses blue when she is recording audio or listening for the key phrase. Misty's head tally light also turns on when she is recording audio or video.

misty.capture_speech(True, 500, 2000, True)

Misty waits to start recording until she detects speech. She then records until she detects the end of the utterance. By default, Misty records an utterance up to 7.5 seconds in length. You can adjust the maximum duration of a speech recording by using the MaxSpeechLength parameter.

Misty triggers a VoiceRecord event when she captures a speech recording.

Parameters

misty.capture_speech(self, overwriteExisting : bool = None, silenceTimeout : int = None, maxSpeechLength : int = None, requireKeyPhrase : bool = None, speechRecognitionGrammar : str = None)
  • OverwriteExisting (bool) - Optional. If true, the captured speech recording overwrites any existing recording saved under the default speech capture filename. (Note: Misty saves speech recordings she captures with this command under one of two default filenames: capture_HeyMisty.wav when RequireKeyPhrase is true, or capture_Dialogue.wav when RequireKeyPhrase is false.) If OverwriteExisting is false, Misty saves the speech recording under a unique, timestamped filename: capture_{HeyMisty or Dialogue}_{Day}-{Month}-{Year}-{Hour}-{Minute}.wav. Defaults to true.

Note: If you program Misty to save each unique speech recording, you should occasionally delete unused recordings to prevent them from filling the memory on her 820 processor.

  • SilenceTimeout (int) - Optional. The maximum duration (in milliseconds) of silence that can precede speech before the speech capture mechanism times out. If Misty does not detect speech before the SilenceTimeout duration elapses, she stops listening for speech and triggers a VoiceRecord event with a message that she did not detect the beginning of speech. Range: 500 to 10000. Defaults to 5000 (5 seconds).

  • MaxSpeechLength (int) - Optional. The maximum duration (in milliseconds) of the speech recording. If the length of an utterance exceeds this duration, Misty stops recording after the duration has elapsed, and the system triggers a VoiceRecord event with a message that Misty did not detect the end of the recorded speech. Range: 500 to 20000. Defaults to 7500 (7.5 seconds).

  • RequireKeyPhrase (bool) - Optional. If true, Misty waits to start recording speech until she recognizes the key phrase. If false, Misty immediately starts recording speech. Defaults to true.
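
When OverwriteExisting is false, timestamped recordings accumulate and need occasional cleanup. The sketch below picks those recordings out of a list of filenames. The helper names are hypothetical, and the number of digits in each timestamp field is an assumption, since the filename pattern above does not specify padding.

```python
import re

# Matches the default and timestamped names described for OverwriteExisting.
# The digit counts in each timestamp field are an assumption.
CAPTURE_NAME = re.compile(
    r"^capture_(HeyMisty|Dialogue)"
    r"(_\d{1,2}-\d{1,2}-\d{2,4}-\d{1,2}-\d{1,2})?\.wav$"
)


def default_capture_name(require_key_phrase=True):
    """Filename Misty overwrites when OverwriteExisting is true."""
    if require_key_phrase:
        return "capture_HeyMisty.wav"
    return "capture_Dialogue.wav"


def timestamped_captures(filenames):
    """Return only the unique, timestamped recordings from a list of names."""
    result = []
    for name in filenames:
        match = CAPTURE_NAME.match(name)
        if match and match.group(2):  # group 2 is the timestamp suffix
            result.append(name)
    return result
```

Feed `timestamped_captures` the audio filenames from the robot, however you list them in your skill, to find the recordings that are candidates for deletion.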

StartConversation

Starts the conversation saved under the given name.

Example Code

misty.start_conversation("FoodConversation")

Parameters

misty.start_conversation(self, name : str = None)
  • name (string): The unique name of the conversation.

StopConversation

Stops the ongoing conversation.

misty.stop_conversation()

StartDialog

Initiates a dialogue session with Misty, enabling her to engage in interactive speech-based communication using pre-defined states and contexts.

misty.start_dialog("session_12345")

Parameters

misty.start_dialog(self, sessionId : str = None)
  • sessionId (string): An optional identifier for the dialogue session. This can be used to manage or reference specific dialogue interactions, particularly useful in scenarios where multiple dialogue sessions might be occurring or tracked.

StopDialog

Stops the ongoing dialog.

misty.stop_dialog()

ConfigureDialog

Configures Misty's dialogue services, including natural language processing (NLP), automatic speech recognition (ASR), and text-to-speech (TTS), by setting up the necessary service providers and their respective access credentials and endpoints.

Example code

misty.configure_dialog(nlpService="YourNLPService", nlpServiceKey="YourNLPKey", nlpServiceRegion="YourNLPRegion",
                       asrService="YourASRService", asrServiceKey="YourASRKey", asrServiceRegion="YourASRRegion",
                       ttsService="YourTTSService", ttsServiceKey="YourTTSKey", ttsServiceRegion="YourTTSRegion")

Parameters

misty.configure_dialog(self, nlpService : str = None, nlpServiceKey : str = None, nlpServiceRegion : str = None, nlpServiceEndpoint : str = None, asrService : str = None, asrServiceKey : str = None, asrServiceRegion : str = None, asrServiceEndpoint : str = None, ttsService : str = None, ttsServiceKey : str = None, ttsServiceRegion : str = None, ttsServiceEndpoint : str = None)
  • nlpService (string): The name of the natural language processing service provider.

  • nlpServiceKey (string): The access key for the NLP service.

  • nlpServiceRegion (string): The region or location of the NLP service.

  • nlpServiceEndpoint (string): The endpoint URL for the NLP service.

  • asrService (string): The name of the automatic speech recognition service provider.

  • asrServiceKey (string): The access key for the ASR service.

  • asrServiceRegion (string): The region or location of the ASR service.

  • asrServiceEndpoint (string): The endpoint URL for the ASR service.

  • ttsService (string): The name of the text-to-speech service provider.

  • ttsServiceKey (string): The access key for the TTS service.

  • ttsServiceRegion (string): The region or location of the TTS service.

  • ttsServiceEndpoint (string): The endpoint URL for the TTS service.
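
Since these parameters carry access keys, one option is to read them from environment variables instead of hardcoding them in a skill. The sketch below builds the keyword arguments that way. The function name and the `MISTY_...` variable names are hypothetical; only the parameter names come from the signature above.

```python
import os

# Maps each documented parameter suffix to a hypothetical env var suffix.
_FIELDS = {
    "Service": "SERVICE",
    "ServiceKey": "SERVICE_KEY",
    "ServiceRegion": "SERVICE_REGION",
    "ServiceEndpoint": "SERVICE_ENDPOINT",
}


def dialog_config_from_env(env=None):
    """Build configure_dialog keyword arguments from environment variables.

    Variable names such as MISTY_NLP_SERVICE_KEY are hypothetical. Unset
    or empty variables are left out, so those parameters keep their
    default of None.
    """
    env = os.environ if env is None else env
    kwargs = {}
    for prefix in ("nlp", "asr", "tts"):
        for suffix, env_suffix in _FIELDS.items():
            value = env.get(f"MISTY_{prefix.upper()}_{env_suffix}")
            if value:
                kwargs[prefix + suffix] = value
    return kwargs
```

With a connected robot object, the call becomes `misty.configure_dialog(**dialog_config_from_env())`, and the keys stay out of source control.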

CreateConversation

Defines a new conversation flow for Misty, setting up a structured sequence of dialog states and interactions.

Example code

misty.create_conversation(name="FoodConversation", startingState="Foodstart", description="A simple greeting and chat conversation", useVisionData=True, overwrite=False)

Parameters

misty.create_conversation(self, name : str = None, startingState : str = None, description : str = None, useVisionData : bool = None, overwrite : bool = None)
  • name (string): The unique identifier for the conversation. It's used to reference and manage the conversation within Misty's system.

  • startingState (string): The name of the initial state from which the conversation begins. This state sets the stage for the conversation's flow.

  • description (string): An optional description of the conversation's purpose and flow. This is useful for documentation and for understanding the conversation's design.

  • useVisionData (bool): Determines whether the conversation should utilize data from Misty's vision capabilities, such as facial recognition or object detection. This allows for more interactive and responsive conversations based on visual cues.

  • overwrite (bool): If set to True, any existing conversation with the same name will be overwritten. This is useful for updating or modifying existing conversations.

DeleteConversation

Deletes a saved conversation.

Example code

misty.delete_conversation("myconversation")

Parameters

misty.delete_conversation(self, name : str = None)
  • name (string): The unique name of the conversation.

UpdateConversation

Updates a saved conversation's name, starting state, description, and vision data setting.

Example code

misty.update_conversation(currentName="FoodConversation", newName="TacoPizzaConversation", startingState="Foodstart", useVisionData=True, description="A simple conversation about Tacos and Pizza")

Parameters

misty.update_conversation(self, currentName : str = None, newName : str = None, startingState : str = None, useVisionData : bool = None, description : str = None)
  • currentName (string): The current name of the conversation to be updated. This is the identifier that Misty uses to locate the existing conversation.

  • newName (string): The new name for the conversation. This allows you to rename the conversation for clarity or organizational purposes.

  • startingState (string): The new starting state for the conversation. Changing this alters the initial interaction or response when the conversation begins.

  • useVisionData (bool): Indicates whether the updated conversation should utilize data from Misty's vision capabilities (like facial recognition or object detection). This can make the conversation more dynamic and responsive to visual inputs.

  • description (string): A new description for the conversation. This is useful for detailing the purpose, changes, or flow of the updated conversation.

SetContext

Configures Misty's speech recognition capabilities to understand and respond to specific phrases or words based on a given context.

Example Code

misty.set_context("yes-no-questions", "YesIntent, NoIntent", True, True)

Parameters

misty.set_context(self, context : str = None, filteredIntents : str = None, overlapContexts : bool = None, retrain : bool = None) 
  • context (string): The name of the context file Misty should use when processing speech. This file defines the phrases or words Misty will recognize and respond to.

  • filteredIntents (string): A comma-separated list of specific intents to filter from the context. Misty will only listen for and respond to these intents.

  • overlapContexts (bool): Optional. Specifies whether the new context should overlap with any previously set contexts. If True, Misty considers both the new and existing contexts when recognizing speech. If False, Misty uses only the new context. Default is False.

  • retrain (bool): Optional. Indicates whether Misty should retrain her speech recognition model with the new context. Setting this to True can improve accuracy but might require additional processing time. Default is False.
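
The sketch below pairs SetContext with SpeakAndListen so that Misty asks a question and listens for only a few intents. `ask_in_context` is a hypothetical helper, `misty` is assumed to be a connected robot object, and whether a preceding SetContext call is needed before SpeakAndListen in your setup is an assumption.

```python
def ask_in_context(misty, question, context, intents):
    """Restrict recognition to a few intents, then speak and listen.

    intents is a list of intent names; set_context expects them as one
    comma-separated string.
    """
    misty.set_context(
        context=context,
        filteredIntents=", ".join(intents),
        overlapContexts=False,  # use only this context
        retrain=False,          # skip retraining to avoid extra processing time
    )
    misty.speak_and_listen(text=question, flush=True, context=context)
```

For example, `ask_in_context(misty, "Hi, are you human?", "yes-no-questions", ["YesIntent", "NoIntent"])` issues the same two commands as the example code in this section, with the intent list joined for you.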

CreateState

Sets up a customized behavioral state for Misty, defining how she should act, speak, and respond during conversations.

Example Code

misty.create_state(name="TacoState", speak="Yeah, tacos are yummy, but pizza is way better", speakingAction="Wave", noMatchSpeech="I'm sorry, I didn't get that", repeatMaxCount=3)

Parameters

misty.create_state(self, name : str = None, speak : str = None, followUp : str = None, audio : str = None, listen : bool = None, contexts : str = None, preSpeech : str = None, startAction : str = None, speakingAction : str = None, listeningAction : str = None, processingAction : str = None, transitionAction : str = None, noMatchAction : str = None, noMatchSpeech : str = None, noMatchAudio : str = None, repeatMaxCount : int = None, failoverState : str = None, retrain : bool = None, overwrite : bool = None, reEntrySpeech : str = None, filters : str = None, requiredContext : str = None)
  • name (string): The unique name for the state being created.

  • speak (string): Text or speech synthesis markup language (SSML) for Misty to speak when in this state.

  • followUp (string): The next state for Misty to transition to after completing this state.

  • audio (string): File name of an audio clip for Misty to play in this state.

  • listen (bool): If True, Misty listens for speech input while in this state.

  • contexts (string): Comma-separated list of speech recognition contexts to be active in this state.

  • preSpeech (string): Text or SSML for Misty to speak before executing the primary speech command.

  • startAction (string): The action for Misty to perform upon entering this state.

  • speakingAction (string): The action for Misty to perform while speaking.

  • listeningAction (string): The action for Misty to perform while listening.

  • processingAction (string): The action for Misty to perform while processing input.

  • transitionAction (string): The action for Misty to perform during state transitions.

  • noMatchAction (string): The action for Misty to perform if no matching speech input is recognized.

  • noMatchSpeech (string): Text or SSML for Misty to speak if no matching speech input is recognized.

  • noMatchAudio (string): Audio file for Misty to play if no matching speech input is recognized.

  • repeatMaxCount (int): Maximum number of times to repeat this state if no match is found.

  • failoverState (string): The state to transition to if this state fails or is not matched.

  • retrain (bool): If True, retrains Misty's speech model for this state.

  • overwrite (bool): If True, overwrites any existing state with the same name.

  • reEntrySpeech (string): Text or SSML for Misty to speak if re-entering this state.

  • filters (string): Comma-separated list of filters to apply in this state.

  • requiredContext (string): Context required for this state to be active.

StartState

Initiates a specific state in Misty's behavior, optionally utilizing vision data for enhanced interaction.

Example code

misty.start_state(name="Foodstart", useVisionData=False)

Parameters

misty.start_state(self, name : str = None, useVisionData : bool = None)
  • name (string): The name of the state to be started. This state should be predefined in Misty's system.

  • useVisionData (bool): Determines whether the started state should make use of Misty's vision capabilities, such as facial recognition or object detection. This allows for more dynamic and context-aware interactions.

MapState

Defines the navigation and flow between different states within a specified conversation for Misty, allowing for detailed control over how Misty transitions from one state to another based on triggers and conditions.

Example code

misty.map_state(conversation="FoodConversation", state="Foodstart", trigger="taco", nextState="TacoState")

Parameters

misty.map_state(self, conversation : str = None, state : str = None, trigger : str = None, triggerFilter : str = None, nextState : str = None, detail : str = None, nextConversation : str = None, reEntry : bool = None, includeFollowUp : bool = None, overwrite : bool = None)
  • conversation (string): The name of the conversation to which the state mapping belongs.

  • state (string): The current state from which Misty will transition based on the defined trigger.

  • trigger (string): The trigger that initiates the state transition. This could be a specific command, user response, or other input.

  • triggerFilter (string): Additional filter criteria to refine how the trigger is evaluated. For example, categorizing user responses as "Positive" or "Negative".

  • nextState (string): The state Misty transitions to upon the trigger being activated.

  • detail (string): A description or detail about the state transition, useful for documentation or debugging.

  • nextConversation (string): Optionally specify a different conversation to transition to, instead of just a new state within the current conversation.

  • reEntry (bool): Indicates whether Misty can re-enter this state if the conditions are met again.

  • includeFollowUp (bool): Specifies whether to include any follow-up action or response after transitioning to the new state.

  • overwrite (bool): If set to True, this allows overwriting any existing state mapping with the same parameters.
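
The state, conversation, and mapping commands are meant to be used together. The sketch below strings them into one flow, reusing the names from the examples in this section. `build_food_conversation` is a hypothetical helper, `misty` is assumed to be a connected robot object, the prompt text is invented for illustration, and the order of the calls is an assumption rather than a documented requirement.

```python
def build_food_conversation(misty):
    """Create two states, a conversation, and one mapping between them."""
    # The opening state speaks a prompt and listens for a reply.
    misty.create_state(
        name="Foodstart",
        speak="Do you prefer tacos or pizza?",
        listen=True,
        contexts="samplefood",
        overwrite=True,
    )
    misty.create_state(
        name="TacoState",
        speak="Yeah, tacos are yummy, but pizza is way better",
        overwrite=True,
    )
    misty.create_conversation(
        name="FoodConversation",
        startingState="Foodstart",
        description="A simple conversation about tacos and pizza",
        overwrite=True,
    )
    # When the "taco" trigger fires in Foodstart, move to TacoState.
    misty.map_state(
        conversation="FoodConversation",
        state="Foodstart",
        trigger="taco",
        nextState="TacoState",
        overwrite=True,
    )
    misty.start_conversation(name="FoodConversation")
```

State names are spelled identically in every call here. A mapping that refers to a state under a different spelling or capitalization may not resolve to the state you created.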

RemoveMapState

Removes a state mapping from a conversation.

misty.remove_map_state(self, conversation : str = None, state : str = None, trigger : str = None, triggerFilter : str = None, detail : str = None)

DeleteState

Deletes a state from your context.

Example code

misty.delete_state("pizzastate")

Parameters

misty.delete_state(self, name : str = None)
  • name (string): The unique name of your state.

CreateAction

Define and store custom actions for Misty.

Example code

misty.create_action(name="question_action", script="LED-PATTERN:0,0,255,40,0,112,1200,breathe;IMAGE:e_ApprehensionConcerned.jpg;ARMS:29,29,1000;HEAD:10,0,0,1000;", overwrite=True)

Parameters

misty.create_action(self, name : str = None, script : str = None, overwrite : bool = None)
  • name (string): The unique name assigned to the action. This name is used to identify and execute the action in your code.

  • script (string): A string that describes the sequence of commands that make up the action. The commands are separated by semicolons (;) and can include various actions such as LED patterns, image displays, arm movements, and head movements.

  • overwrite (bool): An optional parameter that determines whether an existing action with the same name should be overwritten. If set to True, the new action replaces any existing action with the same name.
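
Long script strings are easy to mistype, so it can help to assemble them from parts. The sketch below joins commands using the NAME:arg1,arg2,...; pattern visible in the example code for this command. `build_action_script` is a hypothetical helper, and it does not check whether a command name or its arguments are valid for Misty.

```python
def build_action_script(commands):
    """Join (command, arguments) pairs into a semicolon-terminated script.

    Each command becomes NAME:arg1,arg2,...; and the pieces are
    concatenated in order. Command names and arguments are not validated.
    """
    parts = []
    for name, args in commands:
        arg_text = ",".join(str(arg) for arg in args)
        parts.append(f"{name}:{arg_text};")
    return "".join(parts)


# Rebuilds the script string used in the example code for this command.
question_script = build_action_script([
    ("LED-PATTERN", [0, 0, 255, 40, 0, 112, 1200, "breathe"]),
    ("IMAGE", ["e_ApprehensionConcerned.jpg"]),
    ("ARMS", [29, 29, 1000]),
    ("HEAD", [10, 0, 0, 1000]),
])
```

The resulting string can then be passed as the script argument: `misty.create_action(name="question_action", script=question_script, overwrite=True)`.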

DeleteAction

Deletes a specific action from Misty's memory.

Example code

misty.delete_action("question_action")

Parameters

misty.delete_action(self, name : str = None)
  • name (string): The unique name of your action.

TrainNLPEngine

Trains Misty's natural language processing (NLP) engine with specified contexts and intents.

Example code

misty.train_nlp_engine(context="samplefood", intents=blob, save=True, overwrite=True)

Parameters

misty.train_nlp_engine(self, context : str = None, intents : object = None, save : bool = None, overwrite : bool = None)
  • context (string): The name of the context within which the NLP training will occur. This context groups together various intents and phrases that Misty should understand.

  • intents (object): A dictionary or similar object containing the intents to be trained. Each intent is mapped to a list of phrases or utterances that exemplify that intent.

  • save (bool): If set to True, the trained data is saved in Misty's system for future use. This is essential for persisting the training across different sessions or interactions.

  • overwrite (bool): If True, existing training data for the specified context will be overwritten. This is useful for updating or refining Misty's NLP capabilities.
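
The example code for this command passes a variable named blob without showing its contents. The sketch below builds one possible value. The mapping of intent name to a list of phrases follows the intents description above, but the exact shape the engine accepts is an assumption, and `build_intents` and the sample phrases are hypothetical.

```python
def build_intents(phrases_by_intent):
    """Clean a mapping of intent name -> example phrases.

    Strips whitespace, drops empty and duplicate phrases, and rejects
    intents left with no phrases. The mapping shape is an assumption
    about what the NLP engine accepts.
    """
    intents = {}
    for intent, phrases in phrases_by_intent.items():
        cleaned = []
        for phrase in phrases:
            phrase = phrase.strip()
            if phrase and phrase not in cleaned:
                cleaned.append(phrase)
        if not cleaned:
            raise ValueError(f"intent {intent!r} has no example phrases")
        intents[intent] = cleaned
    return intents


# Hypothetical training data for the "samplefood" context.
blob = build_intents({
    "taco": ["I like tacos", "tacos please ", "I like tacos"],
    "pizza": ["pizza", "I want pizza"],
})
```

With a connected robot object, the cleaned mapping is then passed as the intents argument: `misty.train_nlp_engine(context="samplefood", intents=blob, save=True, overwrite=True)`.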

DeleteNLPContext

Deletes a saved context.

Example code

misty.delete_nlp_context(context="samplefood")

Parameters

misty.delete_nlp_context(self, context : str = None)
  • context (string): The name of the saved NLP context to delete.

PlayAndListen

Directs Misty to play an audio file and then listen for a response or input, typically within a specified context.

Example code

misty.play_and_listen("s_Awe.wav", "samplefood")

Parameters

misty.play_and_listen(self, audioFile : str = None, context : str = None)
  • audioFile (string): The name of the audio file that Misty will play. This file should be preloaded or accessible to Misty. It can be a spoken phrase, question, or any other audio prompt.

  • context (string): Optionally specifies the context within which Misty should listen and interpret the response after playing the audio. The context helps Misty understand the expected types of responses or commands and process them appropriately.

RestoreNLPModel

Restores Misty's natural language processing (NLP) model to its default state or a previously saved state. This function is used to revert any customizations or training that have been applied to Misty's NLP capabilities.

misty.restore_nlp_model()

TriggerConversationEvent

Triggers a specific event within Misty's conversation flow, allowing for manual control over the progression of a conversational interaction.

Example code

misty.trigger_conversation_event(name="GreetUserEvent")

Parameters

misty.trigger_conversation_event(self, name : str = None)
  • name (string): The name of the conversation event to be triggered. This name corresponds to a predefined event within Misty's conversational capabilities.

