Willow 5: Skeletal Topology Map
Version: 5.4
Target: Integration Engineers (Quest 3, Azure Kinect, OptiTrack, Custom CV)
SDK Reference: C++ Edge SDK (willow.hpp) | Python SDK (Gold Standard)
To evaluate motion against a Willow 5 model (.int8, .json, .onnx, .h), your input data must be mapped to our 76-Point Hybrid Topology.
The engine expects a flattened array or tensor of shape [Frames, 75, 4] or [Frames, 76, 4], representing [X, Y, Z, Presence].
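As a minimal sketch of that layout, the following builds one frame as a list of `[X, Y, Z, Presence]` rows and checks the overall shape before anything is handed to the SDK. The helper names (`empty_frame`, `set_joint`, `clip_shape`) and the coordinate values are illustrative assumptions, not part of the Willow SDK.

```python
NUM_POINTS = 76  # indices 0..75; the spec also accepts 75 points
CHANNELS = 4     # X, Y, Z, Presence


def empty_frame(num_points=NUM_POINTS):
    """Return one frame in which every point is muted: [0.0, 0.0, 0.0, 0.0]."""
    return [[0.0] * CHANNELS for _ in range(num_points)]


def set_joint(frame, index, x, y, z, presence=1.0):
    """Write one tracked joint into its topology slot (hypothetical helper)."""
    frame[index] = [float(x), float(y), float(z), float(presence)]


def clip_shape(clip):
    """Return (frames, points, 4), or raise ValueError if the layout is wrong."""
    if not clip:
        raise ValueError("clip contains no frames")
    points = len(clip[0])
    if points not in (75, 76):
        raise ValueError(f"expected 75 or 76 points per frame, got {points}")
    for frame in clip:
        if len(frame) != points or any(len(p) != CHANNELS for p in frame):
            raise ValueError("every frame must be [points, 4]")
    return (len(clip), points, CHANNELS)


frame = empty_frame()
set_joint(frame, 11, -0.18, -0.45, 0.02)  # Left Shoulder (made-up values)
set_joint(frame, 23, -0.12, 0.05, 0.01)   # Left Hip (made-up values)
clip = [frame]                            # shape [Frames, 76, 4]
print(clip_shape(clip))                   # (1, 76, 4)
```

Starting from an all-zero frame means any slot your hardware does not fill stays muted by default.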
Core Engine Principles
1. The RDM & Dual Spatial Modes (Universal vs. Physics)
Willow evaluates dynamic distances between joints rather than absolute world coordinates. The mathematical routing depends on the spatial mode baked into your model:
- Universal Mode (Scale Invariance): To ensure models work seamlessly across users of any height or camera distance, the engine normalizes the Relational Distance Matrix (RDM) against a skeletal scale. The engine calculates this scale using the 3D distance between the Left Shoulder (Index 11) and Left Hip (Index 23).
- Universal Fallback Logic: If the Torso zone is disabled via the Bitmask, the engine automatically falls back to scaling via the Left Leg (Index 25 to 27) or the Left Arm (Index 13 to 15) to maintain continuous scale invariance.
- Physics Mode (Metric Space): Scale normalization is bypassed entirely. The engine calculates absolute physical distances in meters. This enables exact spatial metrology and the tracking of external physical items via the 76th node (Index 75). Your input array must be provided in absolute meters.
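The Universal scaling rule can be illustrated with a short sketch. It is an approximation built only from the description above, not the engine's code: the `enabled_zones` set stands in for the Bitmask (whose real layout is not shown here), the fallback order (leg, then arm) is assumed from the order in which the segments are listed, and all function names are hypothetical.

```python
import math

# (zone, index_a, index_b) in assumed priority order.
SCALE_SEGMENTS = [
    ("TORSO", 11, 23),  # Left Shoulder to Left Hip
    ("LEGS", 25, 27),   # Left Leg fallback
    ("ARMS", 13, 15),   # Left Arm fallback
]


def distance(a, b):
    """Euclidean distance between two [X, Y, Z, Presence] points."""
    return math.dist(a[:3], b[:3])


def skeletal_scale(frame, enabled_zones):
    """Length of the first scale segment whose zone is enabled."""
    for zone, i, j in SCALE_SEGMENTS:
        if zone in enabled_zones:
            return distance(frame[i], frame[j])
    raise ValueError("no zone available for scale normalization")


def normalized_distance(frame, i, j, enabled_zones):
    """Joint-to-joint distance divided by the skeletal scale."""
    scale = skeletal_scale(frame, enabled_zones)
    if scale == 0.0:
        raise ValueError("scale segment has zero length")
    return distance(frame[i], frame[j]) / scale


frame = [[0.0, 0.0, 0.0, 0.0] for _ in range(76)]
frame[11] = [0.0, 0.0, 0.0, 1.0]    # Left Shoulder
frame[23] = [0.0, 0.5, 0.0, 1.0]    # Left Hip (torso length 0.5)
frame[13] = [0.3, 0.0, 0.0, 1.0]    # Left Elbow
frame[15] = [0.3, 0.25, 0.0, 1.0]   # Left Wrist

# The same pose at twice the size yields the same normalized distance.
doubled = [[2 * x, 2 * y, 2 * z, v] for x, y, z, v in frame]
zones = {"TORSO", "ARMS"}
print(normalized_distance(frame, 13, 15, zones))    # 0.5
print(normalized_distance(doubled, 13, 15, zones))  # 0.5
```

In Physics Mode the division is simply skipped, so the raw `distance` value in meters is what matters.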
2. The Presence Gate (The 4th Dimension)
The Point3D struct in both willow.hpp and willow-python-sdk carries a fourth value: visibility (also called Presence).
- How to map it: If a joint is tracked by your hardware, set this to 1.0. If the joint is missing or occluded (e.g., feet are out of frame), set this to 0.0.
- The Presence Gate (Pair-Wise Confidence): The 4th dimension of the input array [X, Y, Z, Visibility] acts as a dynamic confidence gate. The Willow Engine evaluates visibility on a pair-wise basis rather than dropping individual points.
- When calculating the distance between two joints, the engine multiplies their visibility scores. If the combined product (v1 * v2) falls below 0.25 (the mathematical equivalent of two 0.5 confidence points), the pair is muted and zero-weighted in the RDM.
- Engineering Note: This product-based logic means a highly occluded joint (e.g., 0.26) paired with a perfectly visible joint (1.0) will still be processed (0.26 * 1.0 > 0.25). This preserves partial kinetic relationships and prevents catastrophic tracking loss during complex physical rotations.
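The gating rule reduces to a single comparison. The sketch below restates it using the 0.25 threshold given above; the function name is an assumption, and how the engine weights pairs that pass the gate is not specified here, so only the keep/mute decision is shown.

```python
PAIR_THRESHOLD = 0.25  # threshold stated in the text above


def pair_is_active(v1, v2, threshold=PAIR_THRESHOLD):
    """True if the joint pair stays in the RDM, False if it is muted."""
    return v1 * v2 >= threshold


print(pair_is_active(0.26, 1.0))  # True: 0.26 is above the threshold
print(pair_is_active(0.5, 0.5))   # True: 0.25 is not below the threshold
print(pair_is_active(0.4, 0.5))   # False: 0.20 falls below the threshold
print(pair_is_active(0.0, 1.0))   # False: a missing joint mutes every pair it is in
```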
3. The "Legacy Hand" Deadzone (Indices 17 - 22)
You must strictly pass [0.0, 0.0, 0.0, 0.0] for indices 17 through 22.
- Reason: These indices are reserved for crude hand estimates found in legacy pose models.
- The Upgrade: Willow 5 uses a dedicated high-fidelity hand system (Indices 33-74) for all grip and finger-tracking logic. To prevent data conflict, the legacy slots (17-22) are permanently disabled.
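If your source skeleton populates those legacy slots (many 33-point pose models do), one way to honor the rule is to overwrite them just before submission. This is a sketch only; `mute_deadzone` is a hypothetical helper name.

```python
DEADZONE = range(17, 23)  # indices 17 through 22 inclusive


def mute_deadzone(frame):
    """Overwrite the legacy hand slots with [0.0, 0.0, 0.0, 0.0] in place."""
    for i in DEADZONE:
        frame[i] = [0.0, 0.0, 0.0, 0.0]
    return frame


# Example: a frame where every slot was filled by the upstream tracker.
frame = [[1.0, 1.0, 1.0, 1.0] for _ in range(76)]
mute_deadzone(frame)
```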
Section 1: Body Topology (Indices 0 - 32)
Crucial for core mechanics and Torso Normalization.
| Index | Joint Name | Zone | Engineering Requirement |
|---|---|---|---|
| 0 | Nose | HEAD | |
| 1 - 10 | Eyes, Ears, Mouth | HEAD | |
| 11 | Left Shoulder | TORSO | Normalization Anchor |
| 12 | Right Shoulder | TORSO | Normalization Anchor |
| 13 | Left Elbow | ARMS | |
| 14 | Right Elbow | ARMS | |
| 15 | Left Wrist | ARMS | |
| 16 | Right Wrist | ARMS | |
| 17 - 22 | DEADZONE | NONE | Strictly Mute: [0,0,0,0] |
| 23 | Left Hip | TORSO | Normalization Anchor |
| 24 | Right Hip | TORSO | Normalization Anchor |
| 25 - 28 | Knees & Ankles | LEGS | |
| 29 - 32 | Heels & Toes | FEET | |
Section 2: High-Fidelity Hands (Indices 33 - 74)
Provides 21-point dense articulation per hand. Essential for high-intent actions.
Left Hand (33 - 53) | Zone: HANDS
- 33: Left Wrist (Anchor)
- 34 - 37: Thumb (CMC, MCP, IP, Tip)
- 38 - 53: Fingers (Index, Middle, Ring, Pinky)
Right Hand (54 - 74) | Zone: HANDS
- 54: Right Wrist (Anchor)
- 55 - 58: Thumb (CMC, MCP, IP, Tip)
- 59 - 74: Fingers (Index, Middle, Ring, Pinky)
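The hand ranges above can be turned into a small index helper. Two assumptions are baked into this sketch and should be checked against your model: each of the four fingers occupies four consecutive slots (16 slots split evenly), and joints within a digit run from base to tip, as the thumb listing (CMC, MCP, IP, Tip) suggests. The function and table names are hypothetical.

```python
HAND_BASE = {"left": 33, "right": 54}                  # wrist anchors
DIGITS = ["thumb", "index", "middle", "ring", "pinky"]  # order listed above
JOINTS_PER_DIGIT = 4                                    # assumed for all digits


def wrist_index(side):
    """Topology index of the hand's wrist anchor."""
    return HAND_BASE[side]


def hand_index(side, digit, joint):
    """Topology index of one digit joint; joint 0 is the base, 3 is the tip."""
    if not 0 <= joint < JOINTS_PER_DIGIT:
        raise ValueError("joint must be in 0..3")
    return HAND_BASE[side] + 1 + DIGITS.index(digit) * JOINTS_PER_DIGIT + joint


print(hand_index("left", "thumb", 3))   # 37
print(hand_index("right", "pinky", 3))  # 74
```

A 21-point hand from your tracker can then be written into the frame by looping over digits and joints, with the wrist going to `wrist_index(side)`.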
Section 3: Tracked Object (Index 75)
Exclusively used for Physics Models (Metric Space) to evaluate human-object interactions.
Object / Implement (75) | Zone: OBJECT
- 75: Tracked Object Centroid (e.g., Ball, Tool, Chassis). Must be mapped into the exact same 3D metric universe as the humanoid skeleton. If evaluating a Universal Model, or if the object is untracked, strictly pass [0.0, 0.0, 0.0, 0.0] to prevent matrix corruption.
Hardware Mapping Guides
Meta Quest 3 (XR)
- Torso: Use an IK solver (like VRIK) to estimate the positions of Shoulders (11, 12) and Hips (23, 24). These are required for the Universal Torso-Length normalization to function.
- Hands: Quest native TrackedHand data maps 1:1 to indices 33-74.
- Coordinate Bridge: Quest uses Y-Up. You must use the Transforms::to_unity() method in the SDK to flip the data to the engine's Y-Down standard.
OptiTrack / Vicon (MoCap)
- Map your skeleton markers to the corresponding indices.
- If your marker set does not include fingers or objects, you must pass 0.0 for the Presence value on indices 33-75.
Azure Kinect / LiDAR
- Standard Body Tracking (k4abt) maps smoothly to indices 0-32. Ensure indices 1-10, 17-22, and 75 are zero-padded, along with the hand indices 33-74 if no separate hand tracker supplies them.
Best Practices
- Anchor Check: Ensure the 4 anchors (11, 12, 23, 24) are always tracked. If these collapse to zero, Universal scale-invariance breaks.
- Array Sizing: Ensure each frame in your final array contains exactly 75 or 76 points before passing it to the Detector. An incorrect array shape trips the segmentation-fault safeguard, which throws a `std::invalid_argument` exception.
- Filtering: If your hardware data is noisy, apply a One-Euro Filter to the X, Y, Z coordinates before mapping to the topology.
- Frame Rate: The SDK Detector expects a consistent frame rate. Align your camera frequency to the model (usually 30Hz or 60Hz).
- SDK Alignment: Always use the latest C++ or Python SDK modules to ensure your coordinate transforms and NMS logic match the Oracle ground truth.
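For the Filtering recommendation above, the following is a compact sketch of the standard One-Euro Filter (an adaptive low-pass filter whose cutoff rises with signal speed), applied per joint and per axis. The parameter defaults and the `FrameFilter` wrapper are illustrative assumptions and would need tuning for your hardware. Passing muted points through unfiltered is a design choice made here, not documented engine behavior.

```python
import math


class OneEuroFilter:
    """One-Euro Filter for a single scalar signal sampled at a fixed rate."""

    def __init__(self, rate_hz, min_cutoff=1.0, beta=0.0, d_cutoff=1.0):
        self.rate_hz = rate_hz
        self.min_cutoff = min_cutoff  # lower values smooth more at low speed
        self.beta = beta              # higher values reduce lag at high speed
        self.d_cutoff = d_cutoff      # cutoff for the derivative estimate
        self.x_prev = None
        self.dx_prev = 0.0

    def _alpha(self, cutoff):
        tau = 1.0 / (2.0 * math.pi * cutoff)
        return 1.0 / (1.0 + tau * self.rate_hz)

    def __call__(self, x):
        if self.x_prev is None:       # first sample passes through unchanged
            self.x_prev = x
            return x
        dx = (x - self.x_prev) * self.rate_hz
        a_d = self._alpha(self.d_cutoff)
        dx_hat = a_d * dx + (1.0 - a_d) * self.dx_prev
        a = self._alpha(self.min_cutoff + self.beta * abs(dx_hat))
        x_hat = a * x + (1.0 - a) * self.x_prev
        self.x_prev, self.dx_prev = x_hat, dx_hat
        return x_hat


class FrameFilter:
    """Filters X, Y, Z of every point; Presence is passed through untouched."""

    def __init__(self, num_points, rate_hz, **params):
        self.filters = [[OneEuroFilter(rate_hz, **params) for _ in range(3)]
                        for _ in range(num_points)]

    def __call__(self, frame):
        out = []
        for i, (x, y, z, presence) in enumerate(frame):
            if presence == 0.0:       # muted point: leave as-is, skip filtering
                out.append([x, y, z, presence])
                continue
            fx, fy, fz = self.filters[i]
            out.append([fx(x), fy(y), fz(z), presence])
        return out
```

Create one `FrameFilter` per tracked subject and call it once per incoming frame, with `rate_hz` matching the capture rate.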