The above example shows why working with user motion data is hard: kinematic researchers need to know a lot of details about XR motion data before they can actually work with them. In our paper and with this page we explain this issue and why it is dangerous for research, introduce a catalogue of already prepared XR motion datasets and define guidelines and best practices for future work.
For all details, download our paper. If you want to work or play around with any of these motion datasets, check out our catalogue. For live demos and guidelines for dataset creators, researchers or reviewers, scroll down!
Researchers working with XR motion data require a minimum set of critical information to correctly load and interpret motion recordings. We identify the following mandatory specifications, which are required for any XR motion dataset.
Here you can see a correctly loaded example recording from a user throwing a virtual bowling ball from the LiebersLabStudy21 dataset; the animations below show the same recording, but loaded with wrong specifications to showcase the different effects.
Understanding the coordinate system used in a dataset is essential for accurately interpreting spatial data. If it is unclear how the X, Y, and Z coordinates correspond to axes like up, forward, and left/right, spatial relationships cannot be properly reconstructed. For example, the same motion will suddenly look very unrealistic if ‘X’ gets interpreted as ‘up’ instead of ‘Y’ – not only because positions are flipped, but also because rotations will be misinterpreted. The datasets analyzed for this work all use one of two similar coordinate systems, in which X points ‘right’ and Y points ‘up’: one is left-handed, so Z points ‘forward’, the other right-handed, so Z points ‘backward’. Even though this difference is relatively subtle, assuming the wrong coordinate system will result in highly corrupted motions and will lead to incorrect conclusions.
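As a minimal sketch of what switching handedness involves, the following Python snippet mirrors the Z axis, which converts between a system where Z points forward and one where Z points backward (X right and Y up in both). The function names and the (x, y, z, w) quaternion component order are assumptions made for illustration and are not taken from any dataset's tooling.

```python
def flip_z_position(position):
    """Mirror a position (x, y, z) across the XY plane."""
    x, y, z = position
    return (x, y, -z)


def flip_z_quaternion(quaternion):
    """Express a unit quaternion (x, y, z, w) in the Z-mirrored system.

    A rotation axis behaves like a pseudovector: mirroring Z negates the
    X and Y components of the axis while the rotation angle is unchanged.
    """
    x, y, z, w = quaternion
    return (-x, -y, z, w)


# Example: a point one meter along +Z and a 90 degree rotation about X.
half = 0.5 ** 0.5
mirrored_position = flip_z_position((0.2, 1.6, 1.0))
mirrored_rotation = flip_z_quaternion((half, 0.0, 0.0, half))
```

Note that flipping only the positions and leaving the rotations untouched is exactly the kind of partial conversion that produces plausible-looking but wrong motions.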
The units of measurement used in a dataset, whether meters, centimeters, custom units, etc., are fundamental for accurately assessing and comparing spatial data. For example, assuming centimeters when the data is actually recorded in meters would lead to peripherals appearing 100 times closer and motions appearing 100 times slower.
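A unit conversion can be as small as the following sketch, which scales positions to meters. The lookup table and the function name `to_meters` are hypothetical. The three units listed are the ones that appear in the catalogue on this page.

```python
# Scale factors from each unit to meters.
UNIT_TO_METER = {"meter": 1.0, "decimeter": 0.1, "centimeter": 0.01}


def to_meters(position, unit):
    """Scale a position (x, y, z) given in `unit` to meters."""
    try:
        scale = UNIT_TO_METER[unit.lower()]
    except KeyError:
        # Fail loudly instead of silently assuming a unit.
        raise ValueError(f"unknown unit: {unit!r}") from None
    return tuple(component * scale for component in position)


head_position_m = to_meters((20.0, 165.0, -30.0), "centimeter")
```

Raising an error for an unknown unit is a deliberate choice here: a silent default is how a wrong assumption goes unnoticed.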
Misinterpreting rotations (e.g., Euler angles, quaternions, or transformation matrices) leads to incorrect reconstructions of motions. For instance, Euler notation seems straightforward at first glance, as it defines rotations around the X, Y, and Z axes. Yet, to apply it correctly, one needs to know whether the rotations are intrinsic (rotating about the axes of the moving coordinate system) or extrinsic (rotating about the axes of the fixed coordinate system), as well as the order of applying rotations along each axis. Like before, wrong assumptions regarding this are easy to miss, but will lead to incorrectly reconstructed motions.
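To make the intrinsic/extrinsic distinction concrete, here is a small sketch that converts Euler angles to a quaternion under either convention. The helper names and the (x, y, z, w) component order are assumptions chosen for this example. The same three angles yield different orientations depending on the convention.

```python
import math


def quat_mul(a, b):
    """Hamilton product of two quaternions given as (x, y, z, w)."""
    ax, ay, az, aw = a
    bx, by, bz, bw = b
    return (
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
        aw * bw - ax * bx - ay * by - az * bz,
    )


def axis_quat(axis, degrees):
    """Quaternion for a rotation of `degrees` about one coordinate axis."""
    half = math.radians(degrees) / 2.0
    s, c = math.sin(half), math.cos(half)
    return {"x": (s, 0.0, 0.0, c), "y": (0.0, s, 0.0, c), "z": (0.0, 0.0, s, c)}[axis]


def euler_to_quat(angles_deg, order="xyz", extrinsic=True):
    """Compose per-axis rotations; angles_deg[i] belongs to axis order[i]."""
    q = (0.0, 0.0, 0.0, 1.0)
    for axis, angle in zip(order, angles_deg):
        step = axis_quat(axis, angle)
        # Extrinsic: each step acts in the fixed frame, so it multiplies from the left.
        # Intrinsic: each step acts in the already rotated frame, so it multiplies from the right.
        q = quat_mul(step, q) if extrinsic else quat_mul(q, step)
    return q


angles = (30.0, 40.0, 50.0)
extrinsic_xyz = euler_to_quat(angles, "xyz", extrinsic=True)
intrinsic_xyz = euler_to_quat(angles, "xyz", extrinsic=False)
# Same numbers, different convention: the two quaternions are not equal.
print(extrinsic_xyz)
print(intrinsic_xyz)
```

An extrinsic XYZ sequence is equivalent to an intrinsic ZYX sequence with the angles listed in reverse, which is a useful sanity check when a dataset's documentation is ambiguous.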
Accurate timing information is crucial for understanding the sequence of frames and the duration of movements in motion data. This can be represented through timestamps or a fixed framerate. Without clear timing data, the dynamics of motion cannot be accurately analyzed. Assuming the wrong frame timing will cause reconstructed motions to play back too fast or too slow.
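The two timing representations can be brought to a common form, relative milliseconds since the first frame, roughly as sketched below. Both function names are hypothetical, and the ISO 8601 strings are assumed to be in a form that Python's `datetime.fromisoformat` accepts.

```python
from datetime import datetime


def relative_ms_from_fps(n_frames, fps):
    """Timestamps in ms for `n_frames` frames recorded at a fixed framerate."""
    return [index * 1000.0 / fps for index in range(n_frames)]


def relative_ms_from_iso(timestamps):
    """Convert absolute ISO 8601 timestamps to ms since the first frame."""
    parsed = [datetime.fromisoformat(stamp) for stamp in timestamps]
    start = parsed[0]
    return [(stamp - start).total_seconds() * 1000.0 for stamp in parsed]


# The same 90 frames last twice as long if 45 fps is assumed instead of 90 fps.
assumed_90 = relative_ms_from_fps(90, 90)
assumed_45 = relative_ms_from_fps(90, 45)
```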
The structure of a dataset, particularly how recordings are organized, significantly impacts data accessibility. A poorly structured dataset can lead to confusion about which files correspond to specific sessions or participants. For example, if a dataset combines multiple recordings into a single file without clear demarcation, it becomes challenging to isolate and analyze individual recordings. Conversely, if every motion sequence is saved as a separate file without a systematic naming convention or indexing, researchers might struggle to locate and aggregate relevant data for their studies. This issue is even more relevant in large datasets, where the sheer volume of recordings necessitates a well-defined organizational scheme to facilitate easy access and selection. Efficient data retrieval and analysis depend on a logical, well-documented structure that aligns with the research objectives.
The file format is vital in determining how recordings can be loaded and how attributes can be correctly labeled. An unsuitable or poorly documented file format can lead to misunderstandings. This affects the integrity of the research, as conclusions drawn from improperly interpreted data are likely to be erroneous.
We have compiled a catalogue of XR motion datasets. Originally, each of these datasets comes in a different format and requires very different approaches to download, read and convert. We streamlined this process by not only aligning each dataset into a unified format, but also by hosting each aligned dataset on Hugging Face. Now, you can use each dataset right away with just one line of code.
Below you will find aligned example recordings from each dataset in the catalogue, visualized using our Motion Visualization Tool.
This dataset focuses on users performing specific bowling and archery motions. The latter is visualized here.
Coordinate System: X: Right, Y: Up, Z: Backward
Rotations: Euler (extrinsic, degrees, sequence is XYZ)
Units: Meter
Time: relative (ms)
Format: CSV
This dataset focuses on users performing interactions with various interface elements, such as buttons and sliders, in AR and VR environments for motion-based user identification.
Coordinate System: X: Right, Y: Up, Z: Backward
Rotations: Euler (extrinsic, degrees, sequence is XYZ) and Quaternion
Units: Meter
Time: relative (s and ms)
Format: TSV
This dataset focuses on capturing the motions of users performing ball throwing actions in VR.
Coordinate System: X: Right, Y: Up, Z: Backward
Rotations: Quaternion
Units: Meter
Time: fixed to 45 fps or 75 fps
Format: Custom
This dataset focuses on capturing the motions of users playing the game 'Half-Life: Alyx' in VR.
Coordinate System: X: Right, Y: Up, Z: Forward
Rotations: Quaternion
Units: Centimeter
Time: absolute (ISO 8601)
Format: CSV
This dataset focuses on users playing the game Beat Saber in VR and aims to provide a large-scale human motion dataset for researchers and studies.
Coordinate System: X: Right, Y: Up, Z: Backward
Rotations: Quaternion
Units: Meter
Time: relative (s)
Format: XROR
This dataset focuses on users playing the game Tilt Brush in VR and aims to provide a large-scale human motion dataset for researchers and studies.
Coordinate System: X: Right, Y: Up, Z: Backward
Rotations: Quaternion
Units: Decimeter
Time: relative (ms)
Format: XROR
This dataset focuses on users playing the game Beat Saber and is aimed at user identification through motion.
Coordinate System: X: Right, Y: Up, Z: Backward
Rotations: Euler (extrinsic, degrees, sequence is XYZ) and Quaternion
Units: Meter
Time: 90 Hz (presumably fixed)
Format: CSV
This dataset focuses on users performing assembly tasks in VR and is also intended for user identification research.
Coordinate System: X: Right, Y: Up, Z: Backward
Rotations: Euler (extrinsic, degrees, sequence is XYZ), Rotation Matrix and Quaternion
Units: Meter
Time: relative (s)
Format: CSV
This dataset focuses on users playing various VR games and was designed for cybersickness research.
Coordinate System: X: Right, Y: Up, Z: Forward
Rotations: Transformation Matrix
Units: Meter
Time: absolute (unix)
Format: CSV
The following guidelines are for creators and users of future XR motion datasets who want to make their datasets and research accessible. These guidelines are the result of our analyses described in our paper and best practices we have established over time within our team for creating, utilizing, and evaluating XR user motion datasets. Addressing dataset creators, as well as authors and reviewers, the guidelines aim to foster transparency, consistency, and accessibility, which is essential for the integrity and advancement of research in our field.
First and foremost, it is imperative that future datasets comprehensively report all relevant specifications as described above. For file formats, we recommend adopting common formats like CSV, or binary equivalents such as HDF5 or Parquet for tabular data, or formats like JSON, YAML, or BSON (binary variant of JSON). Custom formats, like the XROR format used by the BOXRR-23 dataset, can also be a sensible solution for datasets with very specific characteristics and unique requirements. Regardless of the format chosen, the dataset should be accompanied by thorough documentation of each data attribute and how it has been labeled. We advocate using quaternions to represent rotations, as Euler angles require additional specifications and are easily misinterpreted. Providing a timestamp column with the elapsed time since the start of the recording in milliseconds offers a clear way to specify the timing of each frame. Clear documentation of all of these aspects is crucial for accurate data interpretation and replication of research.
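As an illustration of these recommendations, the sketch below writes a recording to CSV with a relative timestamp in milliseconds and a quaternion per tracked device. The column names and the function `write_recording` are hypothetical and only show one possible labeling scheme, reduced to the head for brevity.

```python
import csv
import io

# One column for time, three for position (meters), four for rotation (quaternion).
COLUMNS = [
    "timestamp_ms",
    "head_pos_x", "head_pos_y", "head_pos_z",
    "head_rot_x", "head_rot_y", "head_rot_z", "head_rot_w",
]


def write_recording(frames, stream):
    """Write frames (a list of dicts keyed by COLUMNS) as CSV to `stream`."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(frames)


frames = [
    {"timestamp_ms": 0.0, "head_pos_x": 0.0, "head_pos_y": 1.65, "head_pos_z": 0.0,
     "head_rot_x": 0.0, "head_rot_y": 0.0, "head_rot_z": 0.0, "head_rot_w": 1.0},
]
buffer = io.StringIO()
write_recording(frames, buffer)
```

Self-describing column names like these do not replace written documentation, but they reduce the room for misinterpretation when a file is opened in isolation.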
Motion recordings should be organized in an accessible and transparent file structure. Ideally, there is one file per recording and data type. Combined with a clear naming scheme, this should make it straightforward to select individual recordings without having to inconveniently extract them from the rest of the dataset. Not only does this save time and resources and avoid frustration, it also makes the routines for importing recordings less susceptible to bugs.
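With a systematic naming scheme, selecting recordings reduces to parsing file names. The sketch below assumes a purely hypothetical scheme of the form `user-<id>_session-<id>_<datatype>.csv`. The pattern and function names are illustrative and not taken from any of the catalogued datasets.

```python
import re

# Hypothetical naming scheme: user-<id>_session-<id>_<datatype>.csv
RECORDING_NAME = re.compile(
    r"^user-(?P<user>\d+)_session-(?P<session>\d+)_(?P<data_type>[a-z]+)\.csv$"
)


def parse_recording_name(filename):
    """Return the metadata encoded in a file name, or None if it does not match."""
    match = RECORDING_NAME.match(filename)
    if match is None:
        return None
    return {
        "user": int(match["user"]),
        "session": int(match["session"]),
        "data_type": match["data_type"],
    }


def select_recordings(filenames, user=None, data_type=None):
    """Keep only file names that match the scheme and the given filters."""
    selected = []
    for name in filenames:
        info = parse_recording_name(name)
        if info is None:
            continue
        if user is not None and info["user"] != user:
            continue
        if data_type is not None and info["data_type"] != data_type:
            continue
        selected.append(name)
    return selected


files = ["user-001_session-01_motion.csv", "user-001_session-01_events.csv",
         "user-002_session-01_motion.csv", "README.md"]
motion_of_user_1 = select_recordings(files, user=1, data_type="motion")
```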
Other researchers should be able to access and download published datasets easily and permanently. Several of the analyzed datasets can only be downloaded in full, which is especially cumbersome if the total size is prohibitively large. Often, only certain data types or recordings of a few users are needed, so dataset creators should look for ways to provide not only bulk downloads but also partial downloads. Tools like Git paired with Git LFS or DVC offer straightforward ways to manage even large datasets and are a convenient alternative to single zip archives. For hosting, there are options like Hugging Face, GitHub, Kaggle, and Zenodo, which not only provide free hosting but often also tooling for uploading and downloading datasets, improving accessibility for both creators and users of datasets.
Additionally, datasets should include example scripts that demonstrate how to properly load the data and correctly identify each attribute. These scripts not only serve as a practical tool for other researchers but also act as a form of documentation, offering insights into the intended use and interpretation of the data.
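A loading script of this kind can be short. The following sketch reads a CSV recording and labels each attribute explicitly, including its unit and component order. The column names are hypothetical and would need to be replaced by those of the actual dataset.

```python
import csv
import io


def load_recording(stream):
    """Read a CSV recording and return a list of explicitly labeled frames."""
    frames = []
    for row in csv.DictReader(stream):
        frames.append({
            # Time since the start of the recording, converted from ms to seconds.
            "time_s": float(row["timestamp_ms"]) / 1000.0,
            # Position in meters as (x, y, z).
            "head_position": tuple(float(row[f"head_pos_{axis}"]) for axis in "xyz"),
            # Rotation as a quaternion in (x, y, z, w) order.
            "head_rotation": tuple(float(row[f"head_rot_{axis}"]) for axis in "xyzw"),
        })
    return frames


example = io.StringIO(
    "timestamp_ms,head_pos_x,head_pos_y,head_pos_z,"
    "head_rot_x,head_rot_y,head_rot_z,head_rot_w\n"
    "0,0.0,1.65,0.0,0.0,0.0,0.0,1.0\n"
    "500,0.1,1.64,0.0,0.0,0.0,0.0,1.0\n"
)
recording = load_recording(example)
```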
An ideal enhancement for future datasets is the provision of options for data visualization. This significantly eases the process of understanding and analyzing motion data, making datasets more accessible and user-friendly. To aid in this, we publish the code for our motion visualization tool and provide instructions for how to set it up.
Beyond the fundamental specifications discussed above, there is additional contextual information that can greatly benefit researchers. For example, background information about the data source, such as whether the data was collected from Unity, Steam OpenVR, or other platforms, can provide important context about the dataset’s characteristics. Moreover, awareness of application-specific traits is crucial. For instance, scenarios where users are teleported within a scene can result in abrupt and seemingly inexplicable ‘jumps’ in the data. Similarly, it should be clarified if data for certain peripherals, like hand tracking, are available only under specific conditions (like when hands are visible to the camera). This helps to distinguish between intentional data absences and potential errors. Additionally, understanding whether users might place their controllers or HMDs down during a session, or whether they are seated or standing, can offer valuable insights into the dataset’s dynamics. These nuances, although seemingly minor, can have substantial implications for the accuracy and reliability of research outcomes, underscoring the importance of comprehensive dataset documentation. Creators of datasets should also be aware that their datasets might be used for different purposes, so they should not just focus on the specific requirements of their own research.
We highly recommend that dataset creators disclose and discuss the software used in generating their datasets. This transparency not only allows for the reproduction, verification, and extension of the datasets but also fosters an environment of open collaboration and innovation in the field. By sharing the tools and methods used for data collection, researchers can contribute to a more robust and dynamic understanding of kinematic data in XR environments.
In line with general best practices for dataset creation, it is highly advisable to utilize established dataset labeling frameworks, such as Dataset Nutrition Labels, Data Cards, or Datasheets for Datasets. These frameworks provide structured and standardized ways to present critical information about datasets, promoting transparency and ease of use. By incorporating these labeling frameworks, dataset creators can ensure that users are well-informed about the nature and characteristics of the data. These frameworks can easily be augmented with the aforementioned specific information relevant to XR user motion studies. This practice not only allows researchers to quickly understand the datasets without spending time and resources on tedious and error-prone analyses, but also fosters a culture of clarity and accountability in data sharing within the kinematic research community. Implementing such comprehensive labeling approaches will significantly contribute to the rigor and reproducibility of future research in this field.
Collecting and sharing XR motion datasets entails significant ethical considerations, particularly regarding participant privacy. As research has shown, motion data can inadvertently reveal personal information, making it crucial to implement protective measures. Even if users are fine with being openly recognized within the dataset collection study, they should be aware that they could be re-identified in different scenarios where they want to stay anonymous, just based on their motion data. Hence, researchers must account for informed consent, pseudonymize data as soon as possible, and comply with relevant data protection laws to safeguard participant privacy. Additionally, Data Use Agreements (DUAs) can regulate access, outlining specific conditions for data use and ensuring ethical handling.
For researchers engaging with XR user motion studies, a thorough understanding and exploration of key dataset specifications are imperative. These specifications, as outlined in previous sections, are crucial for accurate data interpretation and experimental reproducibility. If a dataset’s documentation lacks these details, researchers should make efforts to acquire this information directly from the dataset creators or conduct their own analysis to determine these specifics. Furthermore, any such efforts and findings must be transparently disclosed in their publications.
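When a specification is undocumented, a simple plausibility check can narrow down the options. The sketch below guesses the unit of length from recorded head heights. It is a heuristic under stated assumptions (heights measured from the floor, users seated or standing, a plausible range of 0.8 m to 2.2 m), not a substitute for asking the dataset creators, and all names in it are hypothetical.

```python
import statistics

# Assumed plausible range for the height of an HMD above the floor, in meters.
PLAUSIBLE_HEAD_HEIGHT_M = (0.8, 2.2)

CANDIDATE_UNITS = (("meter", 1.0), ("decimeter", 0.1), ("centimeter", 0.01))


def guess_unit(head_heights):
    """Return the candidate units under which the median head height is plausible."""
    low, high = PLAUSIBLE_HEAD_HEIGHT_M
    median = statistics.median(head_heights)
    return [unit for unit, scale in CANDIDATE_UNITS if low <= median * scale <= high]


print(guess_unit([1.62, 1.65, 1.70]))     # values around 1.6
print(guess_unit([162.0, 165.0, 170.0]))  # values around 165
```

An empty result means that none of the candidate units fits, which in itself is worth reporting, for example because the origin is not at floor level.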
Researchers must document any conversions applied to the motion data. Detailed documentation of these conversions is essential for clarity and integrity of the research. It ensures that each step of data handling is accurately conveyed and that the data is not inadvertently distorted during the process. This transparency is critical for other researchers who may wish to replicate or build upon the work.
In studies utilizing multiple datasets, it is essential to disclose the key differences between them and the measures taken to align each dataset. This includes aligning coordinate systems, normalizing units of measurement, standardizing rotation representations and frame timing. Researchers must clearly outline how they harmonized disparate datasets to ensure consistent and accurate analysis.
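One way to keep such alignment steps explicit and reportable is to encode each dataset's specification as data and derive the conversion from it. The sketch below does this for positions and timestamps only. Rotations would need an analogous step. The `DatasetSpec` fields, the two example datasets, and the target convention (meters, seconds, X right, Y up, Z backward) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetSpec:
    """Minimal, hypothetical description of how a dataset stores its data."""
    unit_to_meter: float     # factor that converts stored lengths to meters
    z_points_forward: bool   # True if Z points forward, False if backward
    time_to_seconds: float   # factor that converts stored relative time to seconds


def align_frame(spec, time, position):
    """Convert one frame to meters, seconds, and a Z-backward coordinate system."""
    x, y, z = (component * spec.unit_to_meter for component in position)
    if spec.z_points_forward:
        z = -z  # mirror Z to reach the Z-backward target convention
    return time * spec.time_to_seconds, (x, y, z)


# Two hypothetical datasets with different conventions.
dataset_a = DatasetSpec(unit_to_meter=0.01, z_points_forward=True, time_to_seconds=0.001)
dataset_b = DatasetSpec(unit_to_meter=1.0, z_points_forward=False, time_to_seconds=1.0)

frame_a = align_frame(dataset_a, 500.0, (100.0, 150.0, 200.0))
frame_b = align_frame(dataset_b, 0.5, (1.0, 1.5, -2.0))
```

Because the specifications are plain data, they can be published alongside the paper as a record of exactly which conversions were applied to which dataset.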
Authors must critically analyze and potentially discuss how the specifications of the datasets could have influenced their results. This examination should consider whether the results might be skewed — either overly optimistic, such as when machine learning models overfit to dataset-specific signals, or overly pessimistic, due to erroneous preprocessing or misinterpretation of the data. Such a critical evaluation helps in contextualizing the findings and provides a more nuanced understanding of the study’s implications and limitations.
Publishing the codebase is a fundamental requirement. This includes code for data import, alignment, preprocessing, and analysis. Making the codebase publicly available allows for independent verification of the methodology and ensures that data handling has been executed correctly. It fosters transparency, reproducibility, and collaborative advancement in the field. Without access to code, it is impossible for reviewers and other researchers to validate, replicate, or extend the findings, thereby impeding scientific progress in kinematic research.