Navigating the Kinematic Maze:
Analyzing, Standardizing and Unifying XR Motion Datasets

An IEEE VR 2024 workshop paper from
A user playing the VR game Beat Saber.
The same recording, but the coordinate system has been imported with wrong specifications into our motion visualization tool.

The above example shows why working with user motion data is hard: kinematic researchers need to know a lot of details about XR motion data before they can actually work with them. In our paper and with this page we explain this issue and why it is dangerous for research, introduce a catalogue of already prepared XR motion datasets and define guidelines and best practices for future work.


For all details, download our paper. If you want to work or play around with any of these motion datasets, check out our catalogue. For live demos and guidelines for dataset creators, researchers or reviewers, scroll down!

Critical Motion Specifications

Researchers working with XR motion data require a minimum set of critical information to correctly load and interpret motion recordings. We identify the following mandatory specifications, which are required for any XR motion dataset.

Here you can see a correctly loaded example recording from a user throwing a virtual bowling ball from the LiebersLabStudy21 dataset; the animations below show the same recording, but loaded with wrong specifications to showcase the different effects.

Correctly Imported Recording
A correctly imported motion recording.

Specification


Examples


Coordinate System

Understanding the coordinate system used in a dataset is essential for accurately interpreting spatial data. If it is unclear how the X, Y, and Z coordinates correspond to axes like up, forward, and left/right, spatial relationships cannot be properly reconstructed. For example, the same motion will suddenly look very unrealistic if ‘X’ gets interpreted as ‘up’ instead of ‘Y’ – not only because positions are flipped, but also because rotations will be misinterpreted. The datasets analyzed for this work all use two similar coordinate systems, which only slightly differ: one is left-handed, so Z points ‘right’, the other right-handed, so Z points ‘left’. Even though this difference is relatively subtle, assuming the wrong coordinate system will result in highly corrupted motions and will lead to incorrect conclusions.

Wrong Coordinate System
This recording has been loaded with an inverted Z-axis; this not only results in mirrored positions of HMD and controllers, but also in broken rotations.
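
The effect of an inverted Z-axis can be sketched in a few lines. The following is a minimal illustration, not the conversion code used for the catalogue; it assumes positions are `(x, y, z)` tuples and rotations are unit quaternions stored as `(x, y, z, w)`, and all function names are hypothetical. It shows why negating only the positions is not enough: the rotation has to be converted as well.

```python
def flip_z_position(position):
    """Mirror a position (x, y, z) across the XY plane."""
    x, y, z = position
    return (x, y, -z)


def flip_z_quaternion(quaternion):
    """Convert a unit quaternion (x, y, z, w) into the mirrored system.

    Mirroring the Z axis reverses rotations about X and Y but leaves
    rotations about Z unchanged, so only x and y change sign.
    """
    x, y, z, w = quaternion
    return (-x, -y, z, w)


def rotate(quaternion, vector):
    """Rotate a 3D vector by a unit quaternion (x, y, z, w)."""
    x, y, z, w = quaternion
    vx, vy, vz = vector
    # t = 2 * cross(u, v), where u is the vector part of the quaternion
    tx = 2.0 * (y * vz - z * vy)
    ty = 2.0 * (z * vx - x * vz)
    tz = 2.0 * (x * vy - y * vx)
    # v' = v + w * t + cross(u, t)
    return (
        vx + w * tx + (y * tz - z * ty),
        vy + w * ty + (z * tx - x * tz),
        vz + w * tz + (x * ty - y * tx),
    )


# Consistency check: rotating and then mirroring must give the same
# result as mirroring both inputs and then rotating.
q = (0.5, 0.5, 0.5, 0.5)  # 120 degrees about the (1, 1, 1) axis
v = (0.0, 0.0, 1.0)
mirrored_after = flip_z_position(rotate(q, v))
mirrored_before = rotate(flip_z_quaternion(q), flip_z_position(v))
assert all(abs(a - b) < 1e-9 for a, b in zip(mirrored_after, mirrored_before))
```

If only the positions were mirrored and the quaternion left untouched, the check at the end would fail for most rotations, which corresponds to the broken rotations seen in the example recording.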

Units of Measurement

The units of measurement used in a dataset, whether meters, centimeters, custom units, etc., are fundamental for accurately assessing and comparing spatial data. For example, assuming centimeters instead of meters would lead to peripherals appearing 100 times closer and motions 100 times slower.

Wrong Units
This recording has been loaded with a wrong scaling of units (e.g., meters instead of centimeters).
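
As a small sketch of what unit normalization could look like, the snippet below converts positions to meters. The unit names and the `to_meters` helper are assumptions made for illustration; the scale factors themselves are just the standard metric conversions.

```python
# Scale factors from a dataset's unit to meters (standard metric conversions).
UNIT_TO_METER = {
    "meter": 1.0,
    "decimeter": 0.1,
    "centimeter": 0.01,
}


def to_meters(position, unit):
    """Convert an (x, y, z) position given in `unit` to meters."""
    if unit not in UNIT_TO_METER:
        # Failing loudly is safer than silently assuming a default unit.
        raise ValueError(f"unknown unit: {unit!r}")
    scale = UNIT_TO_METER[unit]
    return tuple(component * scale for component in position)


# A headset position recorded in centimeters, roughly 1.7 m above the floor.
head_cm = (12.0, 170.0, -35.0)
head_m = to_meters(head_cm, "centimeter")
```

Raising an error for unknown units is a deliberate choice here: a silent default is exactly the kind of wrong assumption that is easy to miss.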

Representation of Rotations

Misinterpreting how rotations are represented (e.g., as Euler angles, quaternions, or transformation matrices) leads to incorrect reconstructions of motions. For instance, Euler notation seems straightforward at first glance, as it defines rotations around the X, Y, and Z axes. Yet, to apply it correctly, one needs to know whether the rotations are intrinsic (rotating about the axes of the moving coordinate system) or extrinsic (rotating about the axes of the fixed coordinate system), as well as the order in which the rotations about each axis are applied. Like before, wrong assumptions regarding this are easy to miss, but will lead to incorrectly reconstructed motions.

Intrinsic vs. Extrinsic Euler Notation & Wrong Euler Sequence
Euler angles can be intrinsic or extrinsic; if the wrong type gets assumed during import, rotations are inherently broken. Furthermore, three values are needed to represent a rotation with the Euler notation, but it has to be defined which axis each of these values is mapped to; X→Y→Z may seem straightforward, but it can also be ZXY, ZYX, or even something like XZX!
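
The difference between the two conventions can be made concrete with a small, dependency-free sketch. It assumes angles in degrees, column vectors, and right-handed rotation matrices; the function names are hypothetical and not taken from any of the datasets' tooling.

```python
import math


def axis_rotation(axis, degrees):
    """3x3 rotation matrix for a rotation about a single axis."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    if axis == "X":
        return [[1, 0, 0], [0, c, -s], [0, s, c]]
    if axis == "Y":
        return [[c, 0, s], [0, 1, 0], [-s, 0, c]]
    if axis == "Z":
        return [[c, -s, 0], [s, c, 0], [0, 0, 1]]
    raise ValueError(f"unknown axis: {axis!r}")


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(matrix, vector):
    return tuple(sum(matrix[i][k] * vector[k] for k in range(3))
                 for i in range(3))


def euler_to_matrix(angles, sequence="XYZ", extrinsic=True):
    """Compose a rotation matrix; angles[i] belongs to sequence[i]."""
    result = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    for axis, angle in zip(sequence, angles):
        step = axis_rotation(axis, angle)
        if extrinsic:
            # Rotation about a fixed axis: pre-multiply.
            result = matmul(step, result)
        else:
            # Rotation about the already rotated axis: post-multiply.
            result = matmul(result, step)
    return result


# Same three numbers, same sequence, different convention.
angles = (90.0, 90.0, 0.0)
up = (0.0, 1.0, 0.0)
as_extrinsic = apply(euler_to_matrix(angles, "XYZ", extrinsic=True), up)
as_intrinsic = apply(euler_to_matrix(angles, "XYZ", extrinsic=False), up)
# as_extrinsic is approximately (1, 0, 0), as_intrinsic approximately (0, 0, 1)
```

The same three values send the 'up' vector in two different directions depending on the assumed convention. An intrinsic sequence is equivalent to the extrinsic sequence in reverse order, which is why mixing up the convention and mixing up the order produce similar-looking errors.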

Time Encoding

Accurate timing information is crucial for understanding the sequence of frames and duration of movements in motion data. This can be represented through timestamps or a fixed framerate. Without clear timing data, the dynamics of motion cannot be accurately analyzed. Assuming the wrong frame timing will make reconstructed motions appear too fast or too slow.

Wrong Time Encoding
This example shows what happens if the wrong time encoding gets applied to a motion sequence: the reconstructed animation is either too fast or too slow!
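
The different time encodings can all be mapped to one common representation. The sketch below converts a fixed framerate, absolute ISO 8601 timestamps, and absolute Unix timestamps into milliseconds relative to the first frame. The helper names are hypothetical, and the ISO example assumes timestamps without a trailing `Z`, since older Python versions of `datetime.fromisoformat` do not accept that suffix.

```python
from datetime import datetime


def relative_ms_from_fps(n_frames, fps):
    """Timestamps for a recording with a fixed framerate."""
    return [index * 1000.0 / fps for index in range(n_frames)]


def relative_ms_from_iso(stamps):
    """Timestamps from absolute ISO 8601 strings."""
    times = [datetime.fromisoformat(stamp) for stamp in stamps]
    start = times[0]
    return [(t - start).total_seconds() * 1000.0 for t in times]


def relative_ms_from_unix(stamps_seconds):
    """Timestamps from absolute Unix times given in seconds."""
    start = stamps_seconds[0]
    return [(t - start) * 1000.0 for t in stamps_seconds]


# Three frames at 90 Hz are roughly 11.1 ms apart.
fixed = relative_ms_from_fps(3, 90)

# Two frames half a second apart, given as absolute ISO 8601 strings.
absolute = relative_ms_from_iso([
    "2024-03-16T10:00:00.000",
    "2024-03-16T10:00:00.500",
])
```

Assuming 45 fps for a recording that was actually captured at 90 Hz would double every timestamp produced by `relative_ms_from_fps`, so the reconstructed motion would play back at half speed.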

Structure

The structure of a dataset, particularly how recordings are organized, significantly impacts data accessibility. A poorly structured dataset can lead to confusion about which files correspond to specific sessions or participants. For example, if a dataset combines multiple recordings into a single file without clear demarcation, it becomes challenging to isolate and analyze individual recordings. Conversely, if every motion sequence is saved as a separate file without a systematic naming convention or indexing, researchers might struggle to locate and aggregate relevant data for their studies. This issue is even more relevant in large datasets, where the sheer volume of recordings necessitates a well-defined organizational scheme to facilitate easy access and selection. Efficient data retrieval and analysis depend on a logical, well-documented structure that aligns with the research objectives.

File Format

The file format is vital in determining how recordings can be loaded and attributes correctly labeled. An unsuitable or poorly documented file format can lead to misunderstandings. This affects the integrity of the research, as conclusions drawn from improperly interpreted data are likely to be erroneous.


Catalogue of XR Motion Datasets

We have compiled a catalogue of XR motion datasets. Originally, each of these datasets comes in a different format and requires very different approaches to download, read and convert. We streamlined this process by not only aligning each dataset into a unified format, but also by hosting each aligned dataset on Hugging Face. Now, you can use each dataset right away with just one line of code.

Below you will find aligned example recordings from each dataset in the catalogue, visualized using our Motion Visualization Tool.

LiebersLabStudy21

This dataset focuses on users performing specific bowling and archery motions. The latter is visualized here.


Specifications

Coordinate System: X: Right, Y: Up, Z: Backward
Rotations: Euler (extrinsic, degrees, sequence is XYZ)
Units: Meter
Time: relative (ms)
Format: CSV

LiebersHand22

This dataset focuses on users performing interactions with various interface elements, such as buttons and sliders, in AR and VR environments for motion-based user identification.


Specifications

Coordinate System: X: Right, Y: Up, Z: Backward
Rotations: Euler (extrinsic, degrees, sequence is XYZ) and Quaternion
Units: Meter
Time: relative (s and ms)
Format: TSV

RMillerBall22

This dataset focuses on capturing the motions of users performing ball throwing actions in VR.


Specifications

Coordinate System: X: Right, Y: Up, Z: Backward
Rotations: Quaternion
Units: Meter
Time: fixed to 45 fps or 75 fps
Format: Custom

Who-Is-Alyx

This dataset focuses on capturing the motions of users playing the game 'Half-Life: Alyx' in VR.


Specifications

Coordinate System: X: Right, Y: Up, Z: Forward
Rotations: Quaternion
Units: Centimeter
Time: absolute (ISO 8601)
Format: CSV

BOXRR Beat Saber

This dataset focuses on users playing the game Beat Saber in VR and aims to provide a large-scale human motion dataset for researchers and studies.


Specifications

Coordinate System: X: Right, Y: Up, Z: Backward
Rotations: Quaternion
Units: Meter
Time: relative (s)
Format: XROR

BOXRR Tilt Brush

This dataset focuses on users playing the game Tilt Brush in VR and aims to provide a large-scale human motion dataset for researchers and studies.


Specifications

Coordinate System: X: Right, Y: Up, Z: Backward
Rotations: Quaternion
Units: Decimeter
Time: relative (ms)
Format: XROR

LiebersBeatSaber23

This dataset focuses on users playing the game Beat Saber and is aimed at user identification through motion.


Specifications

Coordinate System: X: Right, Y: Up, Z: Backward
Rotations: Euler (extrinsic, degrees, sequence is XYZ) and Quaternion
Units: Meter
Time: 90 Hz (presumably fixed)
Format: CSV

MooreCrossDomain23

This dataset focuses on users performing assembly tasks in VR and is also intended for user identification research.


Specifications

Coordinate System: X: Right, Y: Up, Z: Backward
Rotations: Euler (extrinsic, degrees, sequence is XYZ), Rotation Matrix and Quaternion
Units: Meter
Time: relative (s)
Format: CSV

VR.net

This dataset focuses on users playing various VR games and was designed for cybersickness research.


Specifications

Coordinate System: X: Right, Y: Up, Z: Forward
Rotations: Transformation Matrix
Units: Meter
Time: absolute (unix)
Format: CSV


Guidelines

The following guidelines are for creators and users of future XR motion datasets who want to make their datasets and research accessible. These guidelines are the result of our analyses described in our paper and best practices we have established over time within our team for creating, utilizing, and evaluating XR user motion datasets. Addressing dataset creators, as well as authors and reviewers, the guidelines aim to foster transparency, consistency, and accessibility, which is essential for the integrity and advancement of research in our field.

For Motion Dataset Creators


GC1
Use Accessible Standards and Report Critical Requirements

First and foremost, it is imperative that future datasets comprehensively report all relevant specifications as described above. For file formats, we recommend adopting common formats like CSV, or binary equivalents such as HDF5 or Parquet for tabular data, or formats like JSON, YAML, or BSON (binary variant of JSON). Custom formats, like the XROR format used by the BOXRR-23 dataset, can also be a sensible solution for datasets with very specific characteristics and unique requirements. Regardless of the format chosen, the dataset should be accompanied by thorough documentation of each data attribute and how it has been labeled. We advocate using quaternions to represent rotations, as Euler angles require additional specifications and are easily misinterpreted. Providing a timestamp column with the elapsed time since the start of the recording in milliseconds offers a clear way to specify the timing of each frame. Clear documentation of all of these aspects is crucial for accurate data interpretation and replication of research.
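
To illustrate these recommendations, here is a minimal sketch of a CSV writer with one millisecond timestamp column and quaternion rotations per tracked device. The device names, column names, and the `write_recording` helper are assumptions made for this example and do not describe the layout of any dataset in the catalogue.

```python
import csv
import io

# Hypothetical tracked devices and per-device attributes.
DEVICES = ("head", "left_hand", "right_hand")
FIELDS = ("pos_x", "pos_y", "pos_z", "rot_x", "rot_y", "rot_z", "rot_w")


def column_names():
    """Self-describing column names: one timestamp plus 7 values per device."""
    return ["timestamp_ms"] + [f"{d}_{f}" for d in DEVICES for f in FIELDS]


def write_recording(frames, stream):
    """Write frames (dicts keyed by column name) as CSV with a header row."""
    writer = csv.DictWriter(stream, fieldnames=column_names())
    writer.writeheader()
    for frame in frames:
        writer.writerow(frame)


# Usage: one frame with all devices at the origin and identity rotation.
frame = {name: 0.0 for name in column_names()}
for device in DEVICES:
    frame[f"{device}_rot_w"] = 1.0  # identity quaternion (0, 0, 0, 1)
frame["timestamp_ms"] = 0

buffer = io.StringIO()
write_recording([frame], buffer)
header = buffer.getvalue().splitlines()[0]
```

Encoding the unit in the column name (`timestamp_ms`) and keeping the quaternion component order explicit (`rot_x` to `rot_w`) removes two of the ambiguities discussed above without requiring readers to consult separate documentation.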

GC2
Account for Sensible File Structure

Motion recordings should be organized in an accessible and transparent file structure. Ideally, there is one file per recording and data type. Combined with a clear naming scheme, this should make it straightforward to select individual recordings without having to inconveniently extract them from the rest of the dataset. Not only does this save time, resources and frustration, it also makes the routines for importing recordings less susceptible to bugs.
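
One possible naming scheme is sketched below. The directory layout (`participant_XXX/session_XX/<data type>.<extension>`) and both helper functions are purely hypothetical; the point is only that a systematic scheme can be generated and parsed mechanically.

```python
import re
from pathlib import PurePosixPath


def recording_path(participant, session, data_type, extension="csv"):
    """Build the relative path of one recording file."""
    return (
        PurePosixPath(f"participant_{participant:03d}")
        / f"session_{session:02d}"
        / f"{data_type}.{extension}"
    )


_PATTERN = re.compile(r"^participant_(\d+)/session_(\d+)/(\w+)\.(\w+)$")


def parse_recording_path(path):
    """Recover (participant, session, data_type, extension) from a path."""
    match = _PATTERN.match(str(path))
    if match is None:
        raise ValueError(f"path does not follow the naming scheme: {path}")
    participant, session, data_type, extension = match.groups()
    return int(participant), int(session), data_type, extension


path = recording_path(7, 2, "motion")
# str(path) == "participant_007/session_02/motion.csv"
```

Because every path can be parsed back into its parts, selecting all recordings of one participant or one data type becomes a simple filter over file names rather than a manual search.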

GC3
Allow Easy and Permanent Dataset Access

Other researchers should be able to easily and permanently access and download published datasets. Several of the analyzed datasets require downloading the full dataset, which is especially cumbersome if the total size is prohibitively large. Often, only certain data types or recordings of a few users are needed, so dataset creators should look for ways to provide not only bulk but also partial downloads. Tools like Git paired with Git LFS or DVC allow straightforward ways to easily manage even large datasets, offering a convenient alternative to single zip archives. For hosting, there are options like Hugging Face, GitHub, Kaggle, Zenodo, etc. that not only provide free hosting, but often also tooling for up- and downloading datasets, which improves accessibility for both creators and users of datasets.

GC4
Provide Demo Code

Additionally, datasets should include example scripts that demonstrate how to properly load the data and correctly identify each attribute. These scripts not only serve as a practical tool for other researchers but also act as a form of documentation, offering insights into the intended use and interpretation of the data.

GC5
Offer Visualizations

An ideal enhancement for future datasets is the provision of options for data visualization. This significantly eases the process of understanding and analyzing motion data, making datasets more accessible and user-friendly. To aid in this, we publish the code for our motion visualization tool and provide instructions for how to set it up.

GC6
Add Contextual Information

Beyond the discussed fundamental specifications, there is additional contextual information that can greatly benefit researchers. For example, providing background information about the used data source, such as whether it was collected from Unity, Steam OpenVR, or other platforms, can give important context information about the dataset’s characteristics. Moreover, awareness of application-specific traits is crucial. For instance, scenarios where users are teleported within a scene can result in abrupt and seemingly inexplicable ‘jumps’ in the data. Similarly, it should be clarified if data for certain peripherals, like hand tracking, are available only under specific conditions (like when hands are visible to the camera). This helps to distinguish between intentional data absences and potential errors. Additionally, understanding whether users might place their controllers or HMDs down during a session, or whether they are seated or standing, can offer valuable insights into the dataset’s dynamics. These nuances, although seemingly minor, can have substantial implications for the accuracy and reliability of research outcomes, underscoring the importance of comprehensive dataset documentation. Creators of datasets should also be aware that their datasets might be used for different purposes, so they should not just focus on the specific requirements of their own research.
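
When such context is missing, users of a dataset have to look for these artifacts themselves. As a rough sketch, teleport-like 'jumps' could be flagged with a simple speed threshold; the threshold of 10 m/s and the function name are assumptions for illustration, and positions are assumed to be in meters with timestamps in seconds.

```python
import math


def find_jumps(positions, timestamps_s, max_speed=10.0):
    """Return indices of frames reached at an implausibly high speed.

    positions: list of (x, y, z) tuples in meters
    timestamps_s: list of timestamps in seconds, same length
    max_speed: assumed plausibility limit in meters per second
    """
    jumps = []
    for i in range(1, len(positions)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        if dt <= 0:
            continue  # duplicate or unordered timestamps are skipped here
        speed = math.dist(positions[i], positions[i - 1]) / dt
        if speed > max_speed:
            jumps.append(i)
    return jumps


# Four frames at 90 Hz; the user is teleported between frame 1 and frame 2.
head = [(0.0, 1.7, 0.0), (0.01, 1.7, 0.0), (5.0, 1.7, 3.0), (5.01, 1.7, 3.0)]
times = [i / 90 for i in range(4)]
teleports = find_jumps(head, times)  # [2]
```

A heuristic like this cannot tell a teleport from a tracking glitch, which is why explicit documentation by the dataset creators remains preferable.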

GC7
Disclose Recording Methods

We highly recommend that dataset creators disclose and discuss the software used in generating their datasets. This transparency not only allows for the reproduction, verification, and extension of the datasets but also fosters an environment of open collaboration and innovation in the field. By sharing the tools and methods used for data collection, researchers can contribute to a more robust and dynamic understanding of kinematic data in XR environments.

GC8
Make Information Easily Accessible

In line with general best practices for dataset creation, it is highly advisable to utilize established dataset labeling frameworks, such as Dataset Nutrition Labels, Data Cards, or Datasheets for Datasets. These frameworks provide structured and standardized ways to present critical information about datasets, promoting transparency and ease of use. By incorporating these labeling frameworks, dataset creators can ensure that users are well-informed about the nature and characteristics of the data. These frameworks can easily be augmented with the aforementioned specific information relevant to XR user motion studies. This practice not only allows researchers to quickly understand the datasets without spending time and resources on tedious and error-prone analyses, but also fosters a culture of clarity and accountability in data sharing within the kinematic research community. Implementing such comprehensive labeling approaches will significantly contribute to the rigor and reproducibility of future research in this field.
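
The XR-specific part of such a label could also be shipped in machine-readable form. The sketch below is one hypothetical way to do so; the `MotionSpec` class and its field names are assumptions, while the example values are the LiebersLabStudy21 specifications listed in the catalogue above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MotionSpec:
    """Critical specifications of one XR motion dataset."""
    dataset: str
    x_axis: str
    y_axis: str
    z_axis: str
    rotation: str
    length_unit: str
    time_encoding: str
    file_format: str

    def to_json(self):
        return json.dumps(asdict(self), indent=2)


spec = MotionSpec(
    dataset="LiebersLabStudy21",
    x_axis="right",
    y_axis="up",
    z_axis="backward",
    rotation="euler (extrinsic, degrees, sequence XYZ)",
    length_unit="meter",
    time_encoding="relative (ms)",
    file_format="CSV",
)

label = spec.to_json()
```

A file like this can sit next to a datasheet or data card, so that import scripts read the specifications directly instead of relying on assumptions hard-coded by each user.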

GC9
Consider Ethical Implications

Collecting and sharing XR motion datasets entails significant ethical considerations, particularly regarding participant privacy. As research has shown, motion data can inadvertently reveal personal information, making it crucial to implement protective measures. Even if users are fine with being openly recognized within the dataset collection study, they should be aware that they could be re-identified in different scenarios where they want to stay anonymous, just based on their motion data. Hence, researchers must account for informed consent, pseudonymize data as soon as possible, and comply with relevant data protection laws to safeguard participant privacy. Additionally, Data Use Agreements (DUAs) can regulate access, outlining specific conditions for data use and ensuring ethical handling.
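
As one possible building block for pseudonymization, participant identifiers can be replaced by keyed hashes before any data leaves the recording machine. This is only a sketch under stated assumptions: the `pseudonymize` helper and the `p_` prefix are hypothetical, and the secret key must be stored separately from the dataset. It protects identifiers only and does not prevent re-identification from the motion data itself.

```python
import hashlib
import hmac
import secrets


def pseudonymize(participant_id, secret_key):
    """Map a participant identifier to a stable pseudonym.

    A keyed hash (HMAC-SHA256) is used so that pseudonyms cannot be
    recomputed from guessed identifiers without the secret key.
    """
    digest = hmac.new(
        secret_key, participant_id.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return f"p_{digest[:12]}"


# The key is generated once per study and kept out of the published data.
study_key = secrets.token_bytes(32)

alias = pseudonymize("jane.doe@example.org", study_key)
same_alias = pseudonymize("jane.doe@example.org", study_key)
other_alias = pseudonymize("john.doe@example.org", study_key)
```

The same participant always maps to the same pseudonym, so recordings from multiple sessions can still be linked, while the original identifier does not appear anywhere in the published files.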

For Authors and Reviewers


GAR1
Review and Exploration

For researchers engaging with XR user motion studies, a thorough understanding and exploration of key dataset specifications are imperative. These specifications, as outlined in previous sections, are crucial for accurate data interpretation and experimental reproducibility. If a dataset’s documentation lacks these details, researchers should make efforts to acquire this information directly from the dataset creators or conduct their own analysis to determine these specifics. Furthermore, any such efforts and findings must be transparently disclosed in their publications.

GAR2
Conversion

Researchers must document any conversions applied to the motion data. Detailed documentation of these conversions is essential for clarity and integrity of the research. It ensures that each step of data handling is accurately conveyed and that the data is not inadvertently distorted during the process. This transparency is critical for other researchers who may wish to replicate or build upon the work.

GAR3
Alignment

In studies utilizing multiple datasets, it is essential to disclose the key differences between them and the measures taken to align each dataset. This includes aligning coordinate systems, normalizing units of measurement, standardizing rotation representations and frame timing. Researchers must clearly outline how they harmonized disparate datasets to ensure consistent and accurate analysis.
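
A small sketch of how such an alignment step could document itself is shown below. It handles positions only and assumes that source and target specifications are plain dictionaries with a `unit` and a `z_axis` entry; these keys and the `align_position` helper are hypothetical. Rotations and frame timing would need matching conversions that are omitted here.

```python
UNIT_TO_METER = {"meter": 1.0, "decimeter": 0.1, "centimeter": 0.01}


def align_position(position, source, target):
    """Convert an (x, y, z) position from the source to the target spec.

    Returns the converted position and a list of the steps that were
    applied, so the conversions can be reported alongside the results.
    """
    x, y, z = position
    steps = []

    if source["unit"] != target["unit"]:
        scale = UNIT_TO_METER[source["unit"]] / UNIT_TO_METER[target["unit"]]
        x, y, z = x * scale, y * scale, z * scale
        steps.append(f"scaled {source['unit']} to {target['unit']}")

    if source["z_axis"] != target["z_axis"]:
        z = -z
        steps.append(f"flipped Z from {source['z_axis']} to {target['z_axis']}")

    return (x, y, z), steps


# Source values follow the Who-Is-Alyx entry in the catalogue;
# the target is the convention shared by most other listed datasets.
source_spec = {"unit": "centimeter", "z_axis": "forward"}
target_spec = {"unit": "meter", "z_axis": "backward"}

aligned, applied = align_position((100.0, 170.0, 50.0), source_spec, target_spec)
```

Returning the list of applied steps next to the converted data makes it easy to log exactly which conversions each dataset went through, which is the information a publication should disclose.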

GAR4
Critical Analysis of Results

Authors must critically analyze and potentially discuss how the specifications of the datasets could have influenced their results. This examination should consider whether the results might be skewed — either overly optimistic, such as when machine learning models overfit to dataset-specific signals, or overly pessimistic, due to erroneous preprocessing or misinterpretation of the data. Such a critical evaluation helps in contextualizing the findings and provides a more nuanced understanding of the study’s implications and limitations.

GAR5
Publication of Codebase

Publishing the codebase is a fundamental requirement. This includes code for data import, alignment, preprocessing, and analysis. Making the codebase publicly available allows for independent verification of the methodology and ensures that data handling has been executed correctly. It fosters transparency, reproducibility, and collaborative advancement in the field. Without access to code, it is impossible for reviewers and other researchers to validate, replicate, or extend the findings, thereby impeding scientific progress in kinematic research.