How to Visualize ARKit's 3D Indoor Scenes in Rerun


This post is a guide for visualizing a 3D indoor scene captured using Apple's ARKit technology with the open-source visualization tool Rerun.
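
All of the logging calls in this guide assume that a Rerun recording has already been started. A minimal sketch of that setup (the application id here is just an example) looks like this:

import rerun as rr  # pip install rerun-sdk

# Start a recording and spawn the Rerun Viewer alongside the script
rr.init("rerun_example_arkit_scenes", spawn=True)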

ARKitScenes Dataset

The ARKitScenes dataset, captured using Apple's ARKit technology, encompasses a diverse array of indoor scenes. 
Every 3D indoor scene contains:

  • Color and Depth Images
  • Reconstructed 3D Meshes
  • Labeled Bounding Boxes Around Objects

If you want to learn more, the data organization and structure of the scenes are explained here.

Logging and Visualizing with Rerun

Entities and Components

Rerun uses an Entity Component System architecture pattern in which entities represent generic objects while components describe data associated with those entities.

In our example, we have these entities:

  • world entity: includes the 3D mesh (world/mesh), the pinhole camera (world/camera_lowres), and the annotations (world/annotations)
  • video entity: includes the RGB images (video/rgb) and the depth images (video/depth)

You can learn more on the Entities and Components page.
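
As a rough sketch with placeholder data, each log call pairs one of these entity paths with an archetype; paths that share a prefix (world/…, video/…) show up as a hierarchy in the viewer:

import numpy as np
import rerun as rr

rr.init("rerun_example_entities", spawn=True)

# An entity is addressed by its path; the archetype supplies its components.
# The box and image below are placeholder data, not values from the dataset.
rr.log("world/annotations/box-example", rr.Boxes3D(half_sizes=[[0.5, 0.5, 0.5]]))
rr.log("video/rgb", rr.Image(np.zeros((48, 64, 3), dtype=np.uint8)))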

Camera with Color and Depth Image

Log a moving RGB-D camera

To log a moving RGB-D camera we need four pieces of data: the camera's intrinsics (logged via a pinhole camera model), its pose (extrinsics), and the color and depth images. The RGB and depth images are logged as child entities of the video entity, capturing the visual and depth aspects of the scene, respectively.

# Log Pinhole Camera and its transforms
rr.log("world/camera_lowres", rr.Transform3D(transform=camera_from_world))
rr.log("world/camera_lowres", rr.Pinhole(image_from_camera=intrinsic, resolution=[w, h]))

# Log RGB Image
rr.log("video/rgb", rr.Image(rgb).compress(jpeg_quality=95))

# Log Depth Image
rr.log("video/depth", rr.DepthImage(depth, meter=1000))

Here's a breakdown of the steps:

  1. The camera's 3D view and perspective are set up with the Pinhole and Transform3D archetypes.
  2. The RGB images are logged with the Image archetype.
  3. The depth images are logged with the DepthImage archetype.
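
Because the camera moves, these calls happen once per frame. In the example, the frame's timestamp is set on a timeline before each round of logging; here is a hedged sketch of that pattern (the timeline name and the placeholder values are assumptions, not the example's exact code):

import numpy as np
import rerun as rr

rr.init("rerun_example_rgbd_frames", spawn=True)

# Placeholder per-frame data; in the real example these come from the dataset.
frame_timestamp = 0.033  # seconds
rgb = np.zeros((192, 256, 3), dtype=np.uint8)

# Associate everything logged afterwards with this point on the "time" timeline,
# so the camera pose and images animate as the timeline is scrubbed.
rr.set_time_seconds("time", frame_timestamp)
rr.log("video/rgb", rr.Image(rgb).compress(jpeg_quality=95))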

3D Mesh

Log 3D Mesh

The mesh is composed of vertex positions, face indices (i.e., which vertices belong to the same face), and vertex colors.

# ... load mesh data from dataset ... 

rr.log(
    "world/mesh",
    rr.Mesh3D(
        vertex_positions=mesh.vertices,
        vertex_colors=mesh.visual.vertex_colors,
        indices=mesh.faces,
    ),
    timeless=True,
)

Here, the mesh is logged to the world/mesh entity using the Mesh3D archetype and is marked as timeless, since it does not change over the course of this visualization.

3D Bounding Boxes


Here we loop through the annotation data and log a bounding box for each labeled object.

# .. load annotation data from dataset ...

for i, label_info in enumerate(annotation["data"]):
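    # uid, label, half_size, centroid, and rot are extracted from label_info
    # (see the example's full source for the exact parsing of the annotation)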
    rr.log(
        f"world/annotations/box-{uid}-{label}",
        rr.Boxes3D(
            half_sizes=half_size,
            centers=centroid,
            rotations=rr.Quaternion(xyzw=rot.as_quat()),
            labels=label,
            colors=colors[i],
        ),
        timeless=True,
    )

The bounding boxes are logged with the Boxes3D archetype.
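
The colors list indexed inside the loop is built before it runs: one color per annotated object. A hedged sketch of one way to produce it (not necessarily the example's exact code) is to sample a colormap:

import matplotlib.pyplot as plt

# One RGBA color per annotated object, sampled evenly from a colormap.
# `annotation` is the annotation dictionary loaded above.
num_objects = len(annotation["data"])
cmap = plt.get_cmap("viridis")
colors = [cmap(i / max(num_objects - 1, 1)) for i in range(num_objects)]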

Join us on GitHub

rerun-io / rerun

Visualize streams of multimodal data. Fast, easy to use, and simple to integrate. Built in Rust using egui.

Build time aware visualizations of multimodal data

Use the Rerun SDK (available for C++, Python and Rust) to log data like images, tensors, point clouds, and text. Logs are streamed to the Rerun Viewer for live visualization or to file for later use.

A short taste

import numpy as np  # for the placeholder point data below
import rerun as rr  # pip install rerun-sdk

rr.init("rerun_example_app")

rr.connect()  # Connect to a remote viewer
# rr.spawn()  # Spawn a child process with a viewer and connect
# rr.save("recording.rrd")  # Stream all logs to disk

# Associate subsequent data with 42 on the “frame” timeline
rr.set_time_sequence("frame", 42)

# Log colored 3D points to the entity at `path/to/points`
positions = np.random.rand(10, 3)  # placeholder positions
colors = np.random.randint(0, 255, size=(10, 3), dtype=np.uint8)  # placeholder colors
rr.log("path/to/points", rr.Points3D(positions, colors=colors))
