At its core, ARKit's hand tracking API in visionOS gives you data such as the position and rotation of each joint of the hand.
Here is sample code for ARKit's hand tracking API in visionOS that works in only 73 lines of code.
Basic Knowledge
First, watch this session video. It gives a good overview of ARKit and hand tracking in visionOS.
Meet ARKit for spatial computing - WWDC23 - Videos - Apple Developer
Hand tracking is explained from 15:05.
Two APIs
There are two APIs to get joint data. Both can be accessed from the HandTrackingProvider instance.
- anchorUpdates: receives the latest values via AsyncSequence.
- latestAnchors: contains the latest values.
In this case, I implemented it with latestAnchors.
A version that uses anchorUpdates is also available at this link: AsyncSequence ver
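For comparison, here is a rough sketch of how anchorUpdates could be consumed. It is not the code behind the link above. The function name observeHandUpdates and the choice to print the wrist position are assumptions made for illustration only.

```swift
import ARKit

// Hypothetical helper (not part of the sample): logs the wrist position
// every time ARKit delivers a hand anchor update.
func observeHandUpdates(session: ARKitSession, provider: HandTrackingProvider) async {
    do {
        try await session.run([provider])
    } catch {
        print("Failed to start ARKitSession: \(error)")
        return
    }
    // anchorUpdates is an AsyncSequence, so the loop suspends until the next update arrives.
    for await update in provider.anchorUpdates {
        let handAnchor = update.anchor
        guard handAnchor.isTracked,
              let wrist = handAnchor.handSkeleton?.joint(.wrist) else { continue }
        // Compose the hand anchor pose with the joint pose to get the joint in world space.
        let originFromWrist = handAnchor.originFromAnchorTransform * wrist.anchorFromJointTransform
        let position = originFromWrist.columns.3
        print(handAnchor.chirality, update.event, position.x, position.y, position.z)
    }
}
```

With this style you react to each update as it arrives, whereas latestAnchors is read whenever you choose, for example once per frame.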
Overview of the Sample Code
- Starts hand tracking when the app launches.
- Places simple sphere objects at all joints of both hands.
- Updates each object’s position to match the latest joint position.
- Ensures the objects are not hidden by the hand.
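The position update in the list above comes down to multiplying two transforms: the hand anchor's pose in world space and the joint's pose relative to that anchor. The following is a minimal sketch using only simd. The translation helper and all numeric values are made up for illustration and do not come from ARKit.

```swift
import simd

// Hypothetical helper: builds a pure translation matrix.
func translation(_ x: Float, _ y: Float, _ z: Float) -> simd_float4x4 {
    var matrix = matrix_identity_float4x4
    matrix.columns.3 = SIMD4<Float>(x, y, z, 1)
    return matrix
}

// Made-up stand-ins for handAnchor.originFromAnchorTransform
// and joint.anchorFromJointTransform.
let originFromAnchor = translation(0.2, 1.0, -0.3)
let anchorFromJoint = translation(0.0, 0.05, -0.08)

// Same composition the sample passes to setTransformMatrix(_:relativeTo:).
let originFromJoint = originFromAnchor * anchorFromJoint
let jointPosition = originFromJoint.columns.3
print(jointPosition.x, jointPosition.y, jointPosition.z)
```

Because both matrices here are pure translations, the resulting position is simply the sum of the two offsets. With real tracking data the anchor transform also carries rotation, which is why the full matrix product is used.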
Full Source Code
import SwiftUI
import RealityKit
import ARKit

@main
struct MyApp: App {
    var body: some SwiftUI.Scene {
        ImmersiveSpace {
            RealityView { content in
                for chirality in [HandAnchor.Chirality.left, .right] {
                    for jointName in HandSkeleton.JointName.allCases {
                        let jointEntity = ModelEntity(mesh: .generateSphere(radius: 0.006),
                                                      materials: [SimpleMaterial()])
                        jointEntity.components.set(JointComponent(chirality: chirality,
                                                                  jointName: jointName))
                        content.add(jointEntity)
                    }
                }
            }
        }
        .upperLimbVisibility(.hidden)
    }

    init() {
        JointComponent.registerComponent()
        JointSystem.registerSystem()
    }
}

struct JointComponent: Component {
    let chirality: HandAnchor.Chirality
    let jointName: HandSkeleton.JointName
}

struct JointSystem: System {
    private let session = ARKitSession()
    private let provider = HandTrackingProvider()

    init(scene: RealityKit.Scene) {
        setUpSession()
    }

    private func setUpSession() {
        Task {
            try! await session.run([provider])
        }
    }

    func update(context: SceneUpdateContext) {
        let entities = context.entities(matching: .init(where: .has(JointComponent.self)),
                                        updatingSystemWhen: .rendering)
        let (leftHandAnchor, rightHandAnchor) = provider.latestAnchors
        for handAnchor in [leftHandAnchor, rightHandAnchor] {
            guard let handAnchor else { continue }
            for entity in entities {
                let component = entity.components[JointComponent.self]!
                guard component.chirality == handAnchor.chirality,
                      let joint = handAnchor.handSkeleton?.joint(component.jointName) else {
                    continue
                }
                entity.setTransformMatrix(handAnchor.originFromAnchorTransform * joint.anchorFromJointTransform,
                                          relativeTo: nil)
            }
        }
    }
}
Copy and paste this code to use it.
Additional Steps
- Set any text for “NSHandsTrackingUsageDescription” in Info.plist.
- Set “Preferred Default Scene Session Role” to “Immersive Space” in Info.plist.
Comments
Topics already covered in the session video, as well as the basics of SwiftUI and RealityKit, are omitted here.
latestAnchors?
https://developer.apple.com/documentation/arkit/handtrackingprovider/4189752-latestanchors
The most recent hand anchors for each hand.
Each element of the tuple is optional, which is why the sample unwraps each hand anchor with guard let before using it.
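As a small sketch of reading latestAnchors outside of a System, the function below measures the distance between the two index fingertips. The function name indexTipDistance is hypothetical, and the sketch assumes the provider is already running in an ARKitSession.

```swift
import ARKit
import simd

// Hypothetical helper: distance in meters between both index fingertips,
// or nil when either hand or fingertip is currently unavailable.
func indexTipDistance(provider: HandTrackingProvider) -> Float? {
    let (leftHand, rightHand) = provider.latestAnchors
    guard let leftHand, let rightHand,
          let leftTip = leftHand.handSkeleton?.joint(.indexFingerTip),
          let rightTip = rightHand.handSkeleton?.joint(.indexFingerTip),
          leftTip.isTracked, rightTip.isTracked else {
        return nil
    }
    // Convert each joint from anchor space to world space, then take the translation column.
    let left = (leftHand.originFromAnchorTransform * leftTip.anchorFromJointTransform).columns.3
    let right = (rightHand.originFromAnchorTransform * rightTip.anchorFromJointTransform).columns.3
    return simd_distance(SIMD3<Float>(left.x, left.y, left.z),
                         SIMD3<Float>(right.x, right.y, right.z))
}
```

Returning nil instead of a stale value keeps the caller honest about moments when a hand is out of view.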
ECS
struct JointComponent: Component {
    let chirality: HandAnchor.Chirality
    let jointName: HandSkeleton.JointName
}

struct JointSystem: System {
    ...
    func update(context: SceneUpdateContext) {
        let entities = context.entities(matching: .init(where: .has(JointComponent.self)),
                                        updatingSystemWhen: .rendering)
        let (leftHandAnchor, rightHandAnchor) = provider.latestAnchors
        ...
    }
}
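I implemented polling with the ECS (Entity Component System) pattern, which is common in RealityKit: the update(context:) method of JointSystem is called repeatedly while rendering, and each call reads provider.latestAnchors.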
Setting Access Permissions
To request access permissions, you need to set any text for “NSHandsTrackingUsageDescription” in Info.plist.
This key does not appear in the pull-down menu, so enter it directly.
Hands occlusion
Sets the preferred visibility of the user’s upper limbs, while an ImmersiveSpace scene is presented.
The system can show the user’s upper limbs during fully immersive experiences, but you can also hide them, for example, in order to display virtual hands instead.
https://developer.apple.com/documentation/swiftui/scene/upperlimbvisibility(_:)
.upperLimbVisibility(.hidden)
Launch the App in Full Space
@main
struct MyApp: App {
    ...
    var body: some SwiftUI.Scene {
        ImmersiveSpace {
            ...
        }
        ...
    }
}
When you create a new visionOS app project in Xcode, the generated code launches the app in a window. For simplicity, I made this sample launch directly in a Full Space.
Set “Preferred Default Scene Session Role” to “Immersive Space” in Info.plist. If you do not make this setting, the app will crash right after launch.
Note: A Physical Device is Required
You need a physical device to test the ARKit Hand Tracking API. It does not work at all in the simulator.
Next Steps
- Check the current authorization status: session.queryAuthorization(for:)
- Explicitly request authorization: session.requestAuthorization(for:)
- Check the state of anchors: AnchorUpdate.Event
- Check if each anchor or joint is being tracked: TrackableAnchor.isTracked
- Observe the session state: ARKitSession.Events
- Check if the current runtime environment supports it: HandTrackingProvider.isSupported
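Several of the items above can be combined into a short sketch. The function names ensureHandTrackingAuthorized and monitorSessionEvents are hypothetical, and the print statements are placeholders for whatever handling an app actually needs.

```swift
import ARKit

// Hypothetical helper: returns true only when hand tracking is supported and allowed.
func ensureHandTrackingAuthorized(session: ARKitSession) async -> Bool {
    // Not supported in this runtime environment (for example, the simulator).
    guard HandTrackingProvider.isSupported else { return false }

    // Check the current status first, without prompting the user.
    let current = await session.queryAuthorization(for: [.handTracking])
    if current[.handTracking] == .allowed { return true }

    // Explicitly request authorization; this may show the system prompt.
    let requested = await session.requestAuthorization(for: [.handTracking])
    return requested[.handTracking] == .allowed
}

// Hypothetical helper: observes session events until the sequence ends.
func monitorSessionEvents(session: ARKitSession) async {
    for await event in session.events {
        switch event {
        case .authorizationChanged(let type, let status):
            print("Authorization for \(type) changed to \(status)")
        case .dataProviderStateChanged(_, let newState, let error):
            print("Data provider state: \(newState), error: \(String(describing: error))")
        @unknown default:
            break
        }
    }
}
```

Calling ensureHandTrackingAuthorized before session.run(_:) would let the app show its own explanation instead of relying on the force-unwrapped try! in the sample.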
Links
Meet ARKit for spatial computing - WWDC23 - Videos - Apple Developer
ARKit in visionOS | Apple Developer Documentation
upperLimbVisibility(_:) | Apple Developer Documentation
FlipByBlink/HandsRuler: Measure app by hand tracking for Apple Vision Pro