百工·本象 Ego Lens Echo (E.L.E.)

First-Person View · Real-World Video Dataset

30,000+ hours
Total Video Duration
As of January 2026
500+ types
Collection Tasks
As of January 2026
100+ jobs
Covered Occupations
As of January 2026

Three Key Features

01

Professional Long-Range

This dataset is captured by professional practitioners performing their actual jobs; for example, coffee scenes are recorded from real baristas working at their stations. First-person operation footage follows complex real-world tasks from start to finish, so the data carries long-term temporal correlation and task closure. First-person real-world roaming footage likewise captures the full process of spatial exploration and path planning.

02

Rich & Diverse

This dataset's first-person hand-operation footage covers thousands of professional operation types, including farm operations, factory operations, kitchen operations, auto repair, hairdressing, craftsmanship, pet grooming, and more. Its first-person roaming footage covers diverse scenarios such as urban streets, natural landscapes, and warehouse spaces. This breadth is intended to help models move beyond scene limitations and generalize more broadly.

03

Real & Dexterous

This dataset is based on the real world, focusing on first-person hand operations and roaming video data from professionals in specific job positions.
Every frame is precisely captured from real scenes, enhancing the model's understanding of physical laws and operation logic.

Core Products

Two flagship products covering complete data solutions for hand operations and spatial roaming

Providing core data assets for world model and embodied intelligence pretraining

🖐️

Operative Stream (OS) · Flagship

First-Person Long-Range Complex Hand Operation

A first-person dataset of long-range, complex-task hand operations. It focuses on capturing human hand movements and object interactions during complex task completion, providing high-quality training data for robotics and embodied intelligence.

Complex Task Scenarios · Multi-object Interaction · Fine Hand Movements · Sequential Continuous Data
🌍

Real Roam (RR) · Flagship

First-Person Real-World Roaming

A first-person real-world roaming dataset. It collects spatial perception data from human activities in real environments, covering a variety of indoor and outdoor scenes, and provides authentic data for spatial intelligence and world models.

Real Scene Collection · Spatial Location Data · Rich Scene Coverage · Multiple Roaming Modes
OS includes 536 Skills
Statistics as of January 2026
OS includes 35,823 Interactive Objects
Statistics as of January 2026
OS includes 2,968 Human Operation Tasks
Statistics as of January 2026
RR Roaming Mode Duration Distribution
Statistics as of January 2026

Data Processing Pipeline

From raw data collection to finished data product delivery

The pipeline relies on close cooperation between algorithms and human expertise, drawing on the strengths of each for efficiency and accuracy


Data Examples & Description

Three data examples each for Operative Stream (OS) and Real Roam (RR)

OS Example 1: Florist Operation

Text annotation (optional): This video shows a person carefully arranging and wrapping a fresh flower bouquet primarily consisting of orange-yellow roses inside a flower shop.

Ego Lens Echo Dataset basic storage format: MP4 file. Data parameters: 1080p, 30 fps.

OS Example 2: Food Stall Owner Operation

Text annotation (optional): This video shows a chef preparing fish soup, including slicing fish, blanching ingredients and placing them into a bowl with side dishes.

Ego Lens Echo Dataset basic storage format: MP4 file. Data parameters: 1080p, 30 fps.

OS Example 3: Auto Mechanic Operation

Text annotation (optional): This video shows an auto mechanic performing an oil change on a vehicle in the workshop, including draining the oil by loosening chassis bolts and removing and cleaning the oil filter.

Ego Lens Echo Dataset basic storage format: MP4 file. Data parameters: 1080p, 30 fps.

RR Example 1: Walking Mode Roaming

Text annotation (optional): This video shows a first-person view walking through a narrow alley in an old residential area, with weathered brick walls, low-rise flat buildings and parked electric bikes visible around.

Ego Lens Echo Dataset basic storage format: MP4 file. Data parameters: 1080p, 30 fps.

RR Example 2: Cycling Mode Roaming

Text annotation (optional): This video shows a first-person view cycling along a suburban road on a sunny day, with residential buildings, parked vehicles and distant mountains visible along the way.

Ego Lens Echo Dataset basic storage format: MP4 file. Data parameters: 1080p, 30 fps.

RR Example 3: Sports Mode Roaming

Text annotation (optional): This video shows a first-person view of a ski resort environment on a sunny day, including riding a magic carpet up the slope and skiing down a wide, smooth slope, with mountains as the background.

Ego Lens Echo Dataset basic storage format: MP4 file. Data parameters: 1080p, 30 fps.
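
The six examples above share the same shape: a product line (OS or RR), an MP4 clip at 1080p and 30 fps, and an optional text annotation. The Python sketch below shows one way a consumer of the data might model such a clip and estimate frame counts. It is illustrative only: the class name ClipRecord, its field names, the file name, and the clip duration are hypothetical and are not taken from the dataset's actual delivery format, which this page does not specify.

```python
from dataclasses import dataclass
from typing import Optional

FPS = 30                # frame rate stated on this page
VERTICAL_PIXELS = 1080  # "1080P" as stated on this page


@dataclass(frozen=True)
class ClipRecord:
    """Hypothetical per-clip record; field names are assumptions."""
    product: str                      # "OS" (Operative Stream) or "RR" (Real Roam)
    title: str                        # e.g. "Florist Operation"
    path: str                         # hypothetical file name of the MP4 clip
    duration_s: float                 # clip length in seconds (assumed known)
    annotation: Optional[str] = None  # text annotation is optional

    def __post_init__(self) -> None:
        # Reject records that contradict the stated product lines or format.
        if self.product not in ("OS", "RR"):
            raise ValueError(f"unknown product: {self.product!r}")
        if not self.path.lower().endswith(".mp4"):
            raise ValueError("the stated storage format is MP4")
        if self.duration_s <= 0:
            raise ValueError("duration must be positive")

    @property
    def frame_count(self) -> int:
        """Approximate number of frames at the stated 30 fps."""
        return round(self.duration_s * FPS)


def total_frames(hours: float, fps: int = FPS) -> int:
    """Frames contained in `hours` of video at a constant frame rate."""
    return int(hours * 3600 * fps)


if __name__ == "__main__":
    clip = ClipRecord(
        product="OS",
        title="Florist Operation",
        path="os_florist_example.mp4",  # hypothetical file name
        duration_s=90.0,                # hypothetical duration
        annotation="A person arranges and wraps a bouquet of roses.",
    )
    print(clip.frame_count)
    print(total_frames(30_000))
```

As a rough sense of scale, 30,000 hours at a constant 30 fps works out to 30,000 × 3,600 × 30 = 3.24 billion frames; since the page states "30,000+ hours", this is a lower bound under that assumption.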