In this skill, Misty uses two onboard AI capabilities:
Object Detection
Human Pose Estimation
Object detection is used to make Misty look at the closest person. For this specific skill, I only wanted Misty to find humans, so I specifically look for the first human object and ignore the rest.
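If the detector also reports other object classes, you can filter in the callback. A minimal sketch, assuming the ObjectDetection message carries the detected class in a "description" field with values like "person" (worth verifying against the messages your robot actually sends):

def only_people(data):
    # Ignore every detection that is not a human
    if data["message"]["description"] != "person":
        return
    print("person at confidence", data["message"]["confidence"])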
Human pose estimation is used to detect a waving-arm gesture. Each event message provides 16 keypoints, such as the nose, eyes, ears, shoulders, elbows, wrists, hips, and ankles. Using these keypoints, you can build logic to detect specific gestures.
In this case, the logic will be:
Elbow is lower than Shoulder && Shoulder is lower than Wrist (that is, the hand is raised above the shoulder)
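Keep in mind that image coordinates grow downward, so "lower" means a larger imageY value. A minimal sketch of that test for one arm, using the same keypoint fields as the full code below (the sample coordinates are invented for illustration):

def arm_is_waving(shoulder, elbow, wrist):
    # "Lower" in the image means a larger imageY value
    elbow_below_shoulder = elbow["imageY"] > shoulder["imageY"]
    wrist_above_shoulder = shoulder["imageY"] > wrist["imageY"]
    return elbow_below_shoulder and wrist_above_shoulder

# Hypothetical keypoints: the wrist is raised above the shoulder
shoulder = {"imageX": 160, "imageY": 120}
elbow = {"imageX": 185, "imageY": 150}
wrist = {"imageX": 190, "imageY": 90}
print(arm_is_waving(shoulder, elbow, wrist))  # True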
This project will only work in the Misty Desktop Environment because we will modify the events file, exactly like we did for the QR code detector.
Open the folder containing the Python-SDK that you use for the desktop environment.
The folder should look like this one.
Open the MistyPy folder, open the file Events.py in Visual Studio Code, and modify the Events class to add the PoseEstimation event.
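The Events class is essentially a list of event-type names, so the change is a single line. A sketch of what it should look like (the neighboring attributes are only examples; your file will list many more event types):

# MistyPy/Events.py (excerpt)
class Events:
    # ... existing event types ...
    ObjectDetection = "ObjectDetection"
    ActuatorPosition = "ActuatorPosition"
    # Add this line:
    PoseEstimation = "PoseEstimation"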
Now you're ready to use Misty's Human Pose Estimation capabilities in Python.
Constants
Since Misty will have to track your face, you will need to include some constants describing the maximum range of movement of Misty's head.
Python code
from mistyPy.Robot import Robot
from mistyPy.Events import Events
import time
import math
import random

# Initialize Misty
misty = Robot("YOUR_ROBOT_IP_ADDRESS")
misty.change_led(0, 255, 0)
misty.move_head(0, 0, 0)
misty.display_image("e_DefaultContent.jpg")

# Constants
yaw_left = 81.36
yaw_right = -85.37
pitch_up = -40.10
pitch_down = 26.92

# Variables
curr_head_pitch = 0
curr_head_yaw = 0
waving_now = False
person_width_history = [0, 0, 0, 0]

# Event handler for getting the current head position
def curr_position(data):
    global curr_head_pitch, curr_head_yaw
    if data["message"]["sensorId"] == "ahp":
        curr_head_pitch = data["message"]["value"]
    if data["message"]["sensorId"] == "ahy":
        curr_head_yaw = data["message"]["value"]

def get_pos():
    misty.register_event(event_name="get_curr_position", event_type=Events.ActuatorPosition,
                         keep_alive=True, callback_function=curr_position)
    time.sleep(0.25)
    misty.unregister_event(event_name="get_curr_position")

# Event handler for analyzing the human pose
def human_pose(data):
    global waving_now
    print("starting pose estimation")
    keypoints = data["message"]["keypoints"]
    # Keypoint indices: 5, 6 - shoulders; 7, 8 - elbows; 9, 10 - wrists
    if not waving_now:
        # Left hand
        if confident(keypoints[7]) and confident(keypoints[5]) and confident(keypoints[9]):
            if pair_correlation(keypoints[7], keypoints[5]) and pair_correlation(keypoints[5], keypoints[9]):
                if scale_valid(keypoints[7], keypoints[5]):
                    waving_now = True
                    wave_back("left")
        # Right hand
        elif confident(keypoints[8]) and confident(keypoints[6]) and confident(keypoints[10]):
            if pair_correlation(keypoints[8], keypoints[6]) and pair_correlation(keypoints[6], keypoints[10]):
                if scale_valid(keypoints[8], keypoints[6]):
                    waving_now = True
                    wave_back("right")

# Helper functions for the human pose
def scale_valid(keypoint_one, keypoint_two):
    x_offset = keypoint_one["imageX"] - keypoint_two["imageX"]
    y_offset = keypoint_one["imageY"] - keypoint_two["imageY"]
    return math.sqrt(x_offset ** 2 + y_offset ** 2) > 60

def confident(data):
    return data["confidence"] >= 0.6

def pair_correlation(keypoint_one, keypoint_two):
    return keypoint_one["imageY"] > keypoint_two["imageY"]

def wave_back(arm):
    global waving_now
    if arm == "left":
        print("Waving back left")
        misty.play_audio("s_Acceptance.wav")
        misty.display_image("e_Joy2.jpg")
        misty.transition_led(0, 90, 0, 0, 255, 0, "Breathe", 800)
        misty.move_arms(80, -89)
        time.sleep(1)
        misty.move_arms(80, 0)
        time.sleep(0.75)
        misty.move_arms(80, -89)
        time.sleep(0.75)
    else:
        print("Waving back right")
        misty.play_audio("s_Awe.wav")
        misty.display_image("e_Love.jpg")
        misty.transition_led(90, 0, 0, 255, 0, 0, "Breathe", 800)
        misty.move_arms(-89, 80)
        time.sleep(1)
        misty.move_arms(0, 80)
        time.sleep(0.75)
        misty.move_arms(-89, 80)
        time.sleep(0.75)
    time.sleep(1.5)
    misty.display_image("e_DefaultContent.jpg")
    misty.transition_led(0, 40, 90, 0, 130, 255, "Breathe", 1200)
    misty.move_arms(random.randint(70, 89), random.randint(70, 89))
    time.sleep(1.5)
    waving_now = False

# Human pose estimation event
def start_human_pose_estimation():
    misty.start_pose_estimation(0.2, 0, 1)
    misty.register_event(event_name="pose_estimation", event_type=Events.PoseEstimation,
                         keep_alive=True, callback_function=human_pose)

# Event handler for analyzing person detection
def person_detection(data):
    if data["message"]["confidence"] >= 0.6:
        print("person detected")
        width_of_human = data["message"]["imageLocationRight"] - data["message"]["imageLocationLeft"]
        person_width_history.pop(0)
        person_width_history.append(width_of_human)
        # The first condition checks whether this measurement belongs to the closest person;
        # the second checks whether there is only one person that Misty can see
        if abs(width_of_human - min(person_width_history)) > abs(width_of_human - max(person_width_history)) or std_deviation(person_width_history) <= 40:
            x_error = (160.0 - (data["message"]["imageLocationLeft"] + data["message"]["imageLocationRight"]) / 2.0) / 160.0
            y_error = (160.0 - 1.4 * data["message"]["imageLocationTop"] + 0.2 * data["message"]["imageLocationBottom"]) / 160.0
            threshold = max(0.1, (341.0 - width_of_human) / 1000.0)
            get_pos()
            actuate_to_yaw = curr_head_yaw + x_error * ((yaw_left - yaw_right) / 6.0) if abs(x_error) > threshold else None
            actuate_to_pitch = curr_head_pitch - y_error * ((pitch_down - pitch_up) / 3.0) if abs(y_error) > threshold else None
            # Only move the head when the correction is large enough; the None checks
            # avoid calling round() on an axis that does not need to move
            if (actuate_to_pitch is not None and abs(curr_head_pitch - round(actuate_to_pitch)) > 11) or \
               (actuate_to_yaw is not None and abs(curr_head_yaw - round(actuate_to_yaw)) > 11):
                misty.move_head(actuate_to_pitch, None, actuate_to_yaw)

# Helper function for person detection
def std_deviation(array):
    mean_value = sum(array) / len(array)
    return math.sqrt(sum([(x - mean_value) ** 2 for x in array]) / len(array))

# Person tracking event
def start_person_tracking():
    misty.start_object_detector(0.5, 0, 15)
    misty.register_event(event_name="person_detection", event_type=Events.ObjectDetection,
                         callback_function=person_detection, keep_alive=True)

# Start program
start_person_tracking()
start_human_pose_estimation()
misty.keep_alive()
In this code, each group of functions and variables is introduced by a comment explaining its use.
As always, the first steps are importing the libraries, initializing the robot, and defining the constants.
Right after that come the same two functions used in the Misty follow human project to get the current position of Misty's head.
Then you can find the logic behind the Human Pose Estimation event.
These are the 16 keypoints that Misty can recognize:
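For quick reference, these are the indices this skill relies on, taken from the comments and branches in the code above (the left-side keypoint comes before the right-side one):

SHOULDERS = (5, 6)  # left, right
ELBOWS = (7, 8)
WRISTS = (9, 10)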
Once the gesture is recognized, it's time to animate Misty; you can modify the animation in the wave_back function.
In the person detection function, the first part attempts to look only at the closest person when multiple people are in front of Misty, while the second part adjusts Misty's head position.
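To see why that check works, here is the same heuristic run on made-up width values (only the sample numbers are invented; the thresholds and formulas come from the code above, with statistics.pstdev standing in for the std_deviation helper):

import statistics

history = [120, 118, 250, 122]  # hypothetical bounding-box widths, in pixels
width = 250                     # width reported in the current message

# Closest person: the new width is nearer to the widest (closest) box seen recently
is_closest = abs(width - min(history)) > abs(width - max(history))
# Single person: recent widths barely vary, so they likely belong to one person
only_one_person = statistics.pstdev(history) <= 40

print(is_closest or only_one_person)  # True: Misty should track this person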
The last lines of code start the whole program and keep it alive.
There are lots of magic numbers in the code that processes the data received from these events!
Play with it!
To get those constants, you can use the same code as in the corresponding section of the Misty follow human project.
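If you skipped that project, here is a minimal sketch of one way to do it: command the head toward a limit, wait for it to settle, and read back the ActuatorPosition values (the event type and the "ahp"/"ahy" sensor IDs match the ones used above; the rest is an assumption about the approach):

from mistyPy.Robot import Robot
from mistyPy.Events import Events
import time

misty = Robot("YOUR_ROBOT_IP_ADDRESS")

def print_position(data):
    # "ahp" is the head pitch actuator, "ahy" the head yaw actuator
    print(data["message"]["sensorId"], data["message"]["value"])

misty.move_head(None, None, 81)  # drive the yaw toward its mechanical limit
time.sleep(2)
misty.register_event(event_name="read_position", event_type=Events.ActuatorPosition,
                     keep_alive=True, callback_function=print_position)
time.sleep(0.5)
misty.unregister_event(event_name="read_position")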