Power your computer vision AI models

Omnihuman.
Realistic Synthetic Human Data for Vision AI

Generate millions of photorealistic, functional, meticulously designed synthetic individuals for robust computer vision training, ensuring privacy and overcoming data scarcity.

Scroll down to discover how

Contact us

Example Uses of Omnihuman

Train Robust Models for Human Safety

Our Synthetic Data Cloud easily simulates complex environments, e.g. airports, construction sites, and manufacturing areas. The Omnihuman module enables precise control over human pose, motion, personal protective equipment, occlusions, gaze estimation, and gesture recognition data for worker tracking, hazard detection, and behavior analysis without exposing sensitive real-world data.

Build In-cabin Monitoring Systems

Omnihuman enables generation of highly realistic synthetic in-cabin monitoring data, simulating diverse occupants, behaviors, and scenarios across different vehicle types. This allows training of robust computer vision models for driver fatigue detection, passenger behavior analysis, and multi-camera systems—while ensuring privacy and minimizing the need for real-world data collection. Compliant with NCAP provisions.

Improve Security and Defense

Images simulated with Omnihuman deliver unique realism for security & defense AI training. The Platform generates custom human detection, crowd monitoring, and threat identification scenarios in diverse environments. Users can modify terrain, lighting, and behavior patterns to create comprehensive datasets for surveillance, rescue operations, and tracking systems while maintaining complete privacy protection.

Streamline Processes in Manufacturing

Omnihuman enables creation of functional, realistic datasets for human detection in manufacturing environments, supporting applications like safety monitoring, worker behavior analysis and access control. With advanced simulation features and control over sensor parameters, our Platform accelerates AI model development while reducing reliance on real-world data.

Design and Build new Solutions for Smart Homes

Accelerate smart home AI with training data tailored to any environment. Omnihuman simulates realistic humans in diverse surroundings, enabling unlimited, privacy-safe computer vision datasets. Train IoT devices for any lighting, layout, or user scenario, boosting performance, accuracy, and adaptability across real-world conditions.

Full Spectrum of Thoughtful Features

Head and face

The Omnihuman feature of our Synthetic Data Cloud enables comprehensive control over head poses (roll, pitch, yaw), emotions (Facial Action Coding System, FACS), eye movements, and facial expressions. Moreover, the synthetic characters have fully randomizable skin texture with pores, imperfections, and makeup, allowing for successful domain adaptation.
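
To give a sense of how such controls can be driven from code, here is a minimal sketch of a randomized head-and-face configuration. The parameter names and the `sample_head_and_face` helper are hypothetical illustrations, not the Platform's actual API; only the FACS action unit numbering (e.g. AU12, lip corner puller) is standard.

```python
import random

# Hypothetical parameter sketch, not the actual SKY ENGINE AI API.
# It illustrates the kinds of head/face controls described above:
# head pose (roll, pitch, yaw), FACS action units, gaze, and skin texture.

def sample_head_and_face(rng: random.Random) -> dict:
    """Draw one randomized head/face configuration for a single render."""
    return {
        "head_pose_deg": {                       # roll / pitch / yaw in degrees
            "roll": rng.uniform(-20, 20),
            "pitch": rng.uniform(-30, 30),
            "yaw": rng.uniform(-45, 45),
        },
        "action_units": {                        # FACS intensities in [0, 1]
            "AU01_inner_brow_raiser": rng.random(),
            "AU12_lip_corner_puller": rng.random(),
            "AU45_blink": rng.choice([0.0, 1.0]),
        },
        "gaze_direction": [rng.uniform(-1, 1), rng.uniform(-0.5, 0.5), 1.0],
        "skin": {
            "pores": True,
            "imperfections": rng.random(),
            "makeup": rng.choice([None, "light", "heavy"]),
        },
    }

print(sample_head_and_face(random.Random(42)))
```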

Body

Full-body, limb, and trunk modeling enables realistic gestures, actions, and poses, including walking, sitting, slouching, head rotation, eye opening/closing, and even falls. Additionally, our synthetic humans can interact with accessory objects (96 interactions) and display over 50 facial expressions.

Modalities

Each scene designed and generated with Omnihuman supports a selection of modalities, from visible (VIS) and near-infrared (NIR) to radar and lidar. Additionally, our Platform uses proprietary algorithms for light-matter interactions, resulting in realistic simulation of light on skin (subsurface scattering, multilayer rendering, reflections from surfaces).

Diversity of looks

With Omnihuman, you can generate millions of synthetic individuals on our Synthetic Data Cloud by mixing and matching 216 unique adult identities, 6 ethnicities, 5 eye colors, 5 age ranges, and 60 variations of hair. You can also randomize appearance aspects like aging, BMI, and height. Add accessories such as headwear or jewellery, and you get a true horn of plenty of identities.

Multiple behaviors

Omnihuman allows you to adjust human behavior to your ML training needs. You can choose from 30 emotions based on FACS (e.g., neutral, tired, goofy) and interactions with 96 unique accessory elements across 24 sets. You can further fine-tune your selected behaviors with 52 action units for adults. All of that to generate humans that are as realistic and information-rich as possible.

Realistic Activities

Synthetic human activities offered in Omnihuman are deeply rooted in domain knowledge. This approach ensures that every scene and every render accurately represents real-life scenarios. For example, we know that drivers often smoke or use mobile phones while driving, which is why we included these activities in Omnihuman. Many more are available to choose from.

Data Scientist-Friendly
Environment

Data Science-Centric

Our Platform seamlessly integrates with data science and MLOps tools and is offered as Software-as-a-Service (SaaS). It supports procedural image and animation synthesis with full control over dataset design and generation parameters (scene parameters, size, resolution, frame order, etc.).
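
As a rough illustration of what full control over dataset design and generation parameters can look like from a data scientist's seat, the sketch below assembles a generation request as a plain Python dictionary. All keys and values are assumptions made for this example, not the Platform's actual request schema.

```python
# Hypothetical dataset-generation request; the keys are illustrative,
# not the Platform's actual schema. It captures the controls mentioned
# above: scene parameters, dataset size, resolution, and frame order.
dataset_request = {
    "scene": {
        "environment": "in_cabin",            # e.g. airport, construction_site, in_cabin
        "lighting": "night_nir",
        "camera": {"resolution": [1280, 800], "fov_deg": 90},
    },
    "generation": {
        "num_frames": 100_000,
        "frame_order": "sequential",          # or "shuffled"
        "random_seed": 1234,
    },
    "ground_truth": ["bounding_boxes_2d", "keypoints_3d", "semantic_masks", "gaze_vectors"],
}
print(dataset_request)
```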

Purpose-built
3D assets

Omnihuman's 3D assets were custom built from scratch, utilizing motion capture technology. Our in-house CG team meticulously adjusted each model for facial landmarks, skeleton, and range of motion, in order to maximize the teaching output of synthetic data generated with those 3D assets.

Rich
Ground Truth

Ground truth includes a vast array of pixel-perfect metadata: bounding boxes, 2D/3D keypoints, semantic masks for the whole body, body parts, and face elements, depth maps, heat maps, and gaze vectors. It also covers gender, age, BMI, and skin tone/ethnicity, along with sensor parameters and camera distortions.
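
For a concrete picture of how such ground truth might be consumed downstream, here is a hypothetical per-person record modeled on the metadata listed above. The field names and types are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Illustrative container for one person's per-frame ground truth; field names
# are assumptions based on the metadata listed above, not the exact schema.
@dataclass
class HumanGroundTruth:
    bbox_2d: Tuple[float, float, float, float]        # x_min, y_min, x_max, y_max (pixels)
    keypoints_2d: List[Tuple[float, float]]           # image-plane keypoints
    keypoints_3d: List[Tuple[float, float, float]]    # camera-frame keypoints (metres)
    semantic_mask_path: str                           # whole-body / body-part / face masks
    depth_map_path: Optional[str] = None
    gaze_vector: Optional[Tuple[float, float, float]] = None
    attributes: dict = field(default_factory=dict)    # gender, age, BMI, skin tone, ...
    sensor: dict = field(default_factory=dict)        # camera intrinsics, distortion, ...
```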

More features

Omnihuman in Synthetic Home

  • High-fidelity, realistic synthetic humans in various indoor environments.
  • Asset textures informed by domain knowledge.
  • Automated, procedural mesh and texture generator.
  • Full-body posture and movement simulation.
  • Facial expressions simulated in accordance with the Facial Action Coding System (FACS).
  • Domain-specific, physically simulated clothes.

Synthetic Eye Technology

  • Pixel-perfect metadata for gaze vectors (see the conversion sketch after this list)
  • Gaze vectors for automotive safety, human-computer interaction (HCI), marketing research, manufacturing, healthcare, home robotics, and more
  • High-fidelity simulations of light interactions with the eye surface
  • Physics-based internal reflections, diffusions, and other effects in the eye
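
As a worked example of using the gaze-vector metadata above, the sketch below converts a 3D gaze direction into yaw and pitch angles, a common preprocessing step for gaze-estimation training. The axis convention (x right, y down, z forward) is an assumption; the Platform's actual convention may differ.

```python
import math

def gaze_angles(gx: float, gy: float, gz: float) -> tuple:
    """Convert a 3D gaze direction into (yaw, pitch) in degrees.

    Assumes x right, y down, z forward in the camera frame; the Platform's
    actual axis convention may differ.
    """
    norm = math.sqrt(gx * gx + gy * gy + gz * gz)
    gx, gy, gz = gx / norm, gy / norm, gz / norm
    yaw = math.degrees(math.atan2(gx, gz))    # left/right rotation
    pitch = math.degrees(math.asin(-gy))      # up/down rotation (y points down)
    return yaw, pitch

print(gaze_angles(0.1, -0.05, 0.99))  # roughly (5.8, 2.9) for a near-frontal gaze
```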

Trusted by

Efrat Swissa

Director Core ML, Google. Ex-Nvidia

My team and I take great pleasure in supporting SKY ENGINE AI; our collaboration is a win-win for both Nvidia and SKY ENGINE AI. SKY ENGINE AI is showing what is possible with Nvidia tech and is leading the way with its synthetic data and ML platform, which will ultimately dominate how companies train DL models. I look forward to more collaboration opportunities.

Some answers to your most-asked questions

What details are available in labels generated on the SKY ENGINE AI Platform?

We provide a wide array of detailed labels required for training computer vision models. They include:
  • Classification (whole-image content labels)
  • Image-aligned 2D and 3D bounding boxes (horizontal and vertical box edges)
  • Object-aligned 2D and 3D bounding boxes (bounding box fits snugly about the object)
  • Segmentation (pixel-level shadings of training object shapes)
  • Instance segmentation
  • 3D keypoints
  • Tracks of training object movement trajectories across sequences of generated imagery
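
To make the distinction between the two box types concrete: an image-aligned box is simply the axis-aligned envelope of an object-aligned (rotated) box. The corner coordinates below are made up for illustration.

```python
# An image-aligned box is the axis-aligned envelope of an object-aligned
# (rotated) box. The corner coordinates below are made up for illustration.
object_aligned_corners = [(120.0, 80.0), (190.0, 60.0), (210.0, 130.0), (140.0, 150.0)]

xs = [x for x, _ in object_aligned_corners]
ys = [y for _, y in object_aligned_corners]
image_aligned_box = (min(xs), min(ys), max(xs), max(ys))  # (x_min, y_min, x_max, y_max)

print(image_aligned_box)  # (120.0, 60.0, 210.0, 150.0)
```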

How do you achieve the variety of human appearance within datasets?

We ensure the variety of human appearances and behaviors required for excellent CV algorithm training outcomes by providing a large number of modifiable character properties. These include:
  • 216 unique adult identities
  • 6 ethnicities
  • 5 eye colors
  • 5 age ranges
  • 60 variations of hair
  • 30 emotions based on FACS (e.g. neutral, tired, goofy)
  • Interactions with 96 unique accessory elements in 24 sets
  • 52 action units describing facial expressions for adults
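
As an illustration of how these property counts translate into combinatorial variety, the sketch below samples one character configuration at random. The attribute names are placeholders, not the Platform's actual option names.

```python
import random

# Illustrative sampler over the property counts listed above; attribute names
# are placeholders, not the Platform's actual option names.
rng = random.Random(0)

character = {
    "identity_id": rng.randrange(216),                  # 216 unique adult identities
    "ethnicity": rng.randrange(6),                      # 6 ethnicities
    "eye_color": rng.randrange(5),                      # 5 eye colors
    "age_range": rng.randrange(5),                      # 5 age ranges
    "hair": rng.randrange(60),                          # 60 hair variations
    "emotion": rng.randrange(30),                       # 30 FACS-based emotions
    "accessory_set": rng.randrange(24),                 # 24 accessory sets
    "action_units": [rng.random() for _ in range(52)],  # 52 adult action units
}
print(character)
```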

What kinds of sensor outputs can be modelled and simulated on the Platform?

You can choose from a variety of modalities, among others:
  • Multispectral imagery (including RGB images)
  • Panchromatic imagery
  • Near-infrared imagery
  • Hyperspectral imagery
  • Lidar data
  • Motion data – electro-optical
  • Motion data – infrared
  • Ultra-wideband (UWB)

Are your renders physically based?

Yes, all our renders are based on physical models of light interactions with surfaces and sensors, such as microfacet models for refraction through rough surfaces [1] or Fresnel term approximations for metals [2].
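
For readers curious what a Fresnel term approximation looks like in practice, a widely used formulation is Schlick's approximation, sketched below. This is a generic illustration of that class of technique, not necessarily the exact model used on the Platform.

```python
def schlick_fresnel(cos_theta: float, f0: float) -> float:
    """Schlick's approximation of Fresnel reflectance.

    f0 is the reflectance at normal incidence (roughly 0.04 for common
    dielectrics, higher and wavelength-dependent for metals); cos_theta is
    the cosine of the angle between the view direction and the surface normal.
    """
    return f0 + (1.0 - f0) * (1.0 - cos_theta) ** 5

# Reflectance rises sharply toward grazing angles:
for cos_t in (1.0, 0.5, 0.1):
    print(cos_t, round(schlick_fresnel(cos_t, 0.04), 3))  # 0.04, 0.07, 0.607
```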

Do you allow for mesh penetration or overlapping in your 3D models?

No, we do not. Our 3D modeling pipelines ensure that all assets remain free from penetration, obstruction, and overlapping issues, guaranteeing high-quality, realistic, and usable 3D models.

How are your images annotated?

Our images are annotated automatically on the Platform. Having complete control over the scene means you possess all the information regarding the 3D dependencies present. By eliminating manual labeling, you also remove the biases and inconsistencies it introduces.
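
The reason automatic annotation is possible is that, with the 3D scene and camera parameters fully known, 2D labels follow directly from geometry. The sketch below projects hypothetical 3D keypoints through a pinhole camera model to obtain their pixel coordinates; the intrinsics and point values are made up for illustration.

```python
import numpy as np

def project_points(points_3d: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Project Nx3 camera-frame points to Nx2 pixel coordinates (pinhole model)."""
    uvw = (K @ points_3d.T).T          # homogeneous image coordinates
    return uvw[:, :2] / uvw[:, 2:3]    # divide by depth

# Hypothetical intrinsics and 3D keypoints (metres, camera frame).
K = np.array([[800.0, 0.0, 640.0],
              [0.0, 800.0, 360.0],
              [0.0,   0.0,   1.0]])
keypoints_3d = np.array([[0.0, -0.2, 2.0],   # e.g. top of the head
                         [0.1,  0.3, 2.1]])  # e.g. a hand
print(project_points(keypoints_3d, K))       # 2D keypoint labels follow for free
```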

What kinds of animation can be modelled and simulated on the Platform?       

Virtually any animation can be modelled and simulated in our Synthetic Data Cloud. It depends only on the resources and time you have to spare. Our example animations include:
  • Dynamic shadowing
  • Dynamic illumination
  • Vehicle activity and scenarios
  • Human activity and scenarios
  • Weather
  • Sensor platform movement (platform motion, fly-through behavior)
  • Sensor movement (camera jitter, sway, or tilting)
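
As a small example of the last item (sensor movement), camera jitter can be modeled as small per-frame perturbations of the nominal camera orientation. The sketch below is a simplified illustration; the jitter magnitude and the angle representation are assumptions.

```python
import random

# Simplified illustration of sensor-movement animation (camera jitter):
# each frame perturbs the nominal camera angles by a small random amount.
# Magnitudes and the angle representation are assumptions.
rng = random.Random(7)
nominal = {"roll": 0.0, "pitch": -5.0, "yaw": 0.0}  # degrees

frames = [
    {axis: angle + rng.gauss(0.0, 0.3) for axis, angle in nominal.items()}  # ~0.3 deg jitter
    for _ in range(5)
]
print(frames)
```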

How do you simulate aging?

Aging is simulated procedurally, for increased realism and randomization.