The exoskeletons are instrumented to match the kinematics and sensor suite of the actual robot gripper. You can trivially train a model on human-collected gripper data and replay it on the robot.
You mentioned UMI, which to my knowledge runs VSLAM on camera+IMU data to estimate the gripper pose and no exoskeletons are involved. See here: https://umi-gripper.github.io/
Calling UMI an "exoskeleton" might be a stretch, but the principle is the same: humans use a kinematically matched, instrumented end effector to collect data that can be trivially replayed on the robot.
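To make "trivially replayed" a bit more concrete, here is a rough sketch of what I mean. It is not UMI's actual pipeline. It assumes the demo is a list of (4x4 gripper pose, gripper width) samples in whatever frame the tracker reports, and all names (`retarget`, `translate`, `robot_start`) are made up for illustration. Because the hand-held gripper matches the robot's, the only retargeting step is re-expressing each pose relative to the first one and applying that to the robot's starting pose:

```python
from typing import List, Tuple

Matrix = List[List[float]]          # 4x4 homogeneous transform, row-major
Sample = Tuple[Matrix, float]       # (gripper pose, gripper opening width)


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(t: Matrix) -> Matrix:
    """Invert a rigid transform [R p; 0 1] as [R^T, -R^T p; 0 1]."""
    rt = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    new_p = [-sum(rt[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [rt[0] + [new_p[0]],
            rt[1] + [new_p[1]],
            rt[2] + [new_p[2]],
            [0.0, 0.0, 0.0, 1.0]]


def translate(x: float, y: float, z: float) -> Matrix:
    """Hypothetical helper: pure-translation transform for the example."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


def retarget(demo: List[Sample], robot_start: Matrix) -> List[Sample]:
    """Re-express a human demo relative to the robot's starting pose.

    Each demo pose is converted to a motion relative to the first demo
    pose, then applied on top of robot_start. Gripper widths pass through
    unchanged, which is only valid if the two grippers really match.
    """
    first_inv = rigid_inverse(demo[0][0])
    return [(matmul(robot_start, matmul(first_inv, pose)), width)
            for pose, width in demo]


# Human lifts the gripper 1 m (exaggerated) while closing it.
demo = [(translate(5.0, 5.0, 5.0), 0.08), (translate(5.0, 5.0, 6.0), 0.02)]
targets = retarget(demo, robot_start=translate(1.0, 0.0, 0.0))
```

The resulting `targets` would still have to go through the robot's own IK and controller, and a learned policy sits between the data and the robot in practice, so "trivial" here only refers to the absence of any human-to-robot embodiment mapping.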