Manufacturing Manipulation Dataset
Technical specification for buyers and researchers.
What this dataset is
The TalosHub Manufacturing Manipulation Dataset is a collection of structured, real-world episodes captured from human operators performing industrial manipulation tasks in active manufacturing facilities. Each episode is multi-view video synchronized with machine-state events from real CNC controllers, phase-segmented at millisecond precision, with labeled exception classes and recovery actions. Episodes are delivered as scoped task packs targeting specific workflows — CNC machine tending, manual deburring, mixed-fastener assembly, welding, inspection.
Canonical episode schema
Every episode is a directory containing synchronized streams plus a manifest.
{
  "episode_id": "ep_042",
  "schema_version": "1.0",
  "task_family": "cnc_machine_tending",
  "machine": {
    "type": "haas_vf2",
    "controller": "fanuc_0i_mf",
    "cell_id": "cell_A3",
    "facility": "facility_001"
  },
  "streams": {
    "video": [
      "cam_overhead.mp4",
      "cam_front.mp4",
      "cam_side.mp4",
      "cam_wrist_r.mp4",
      "cam_wrist_l.mp4"
    ],
    "machine_state": "machine_events.jsonl",
    "operator_actions": "actions.jsonl"
  },
  "phases": [
    {"id": 1, "label": "approach", "start_ms": 0, "end_ms": 3200},
    {"id": 2, "label": "door_open", "start_ms": 3200, "end_ms": 4800}
  ],
  "exceptions": [
    {
      "type": "misload",
      "phase_id": 4,
      "severity": "moderate",
      "recovery_action": "reposition_and_retry",
      "recovery_success": true,
      "recovery_duration_ms": 1850
    }
  ],
  "outcome": "success",
  "quality_score": 0.94,
  "labels": {"split": "train"},
  "capture": {
    "session_id": "sess_012",
    "operator_pseudonym": "OP-A",
    "operator_skill_level": "expert",
    "timestamp": "2026-04-10T06:14:22.847Z"
  }
}
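As a rough illustration of how the manifest might be consumed, the sketch below parses an excerpt of the example above and derives per-phase durations. It uses only the Python standard library; the `Phase` dataclass and the `MANIFEST_EXCERPT` name are hypothetical stand-ins and are not the canonical Pydantic schema or the shipped sample loader.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """Hypothetical stand-in for one entry of the manifest's "phases" list."""
    id: int
    label: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


# Excerpt of the example manifest, inlined so the sketch runs without any files.
MANIFEST_EXCERPT = """
{
  "episode_id": "ep_042",
  "task_family": "cnc_machine_tending",
  "phases": [
    {"id": 1, "label": "approach", "start_ms": 0, "end_ms": 3200},
    {"id": 2, "label": "door_open", "start_ms": 3200, "end_ms": 4800}
  ]
}
"""

manifest = json.loads(MANIFEST_EXCERPT)
phases = [Phase(**entry) for entry in manifest["phases"]]

for phase in phases:
    print(f"{manifest['episode_id']} {phase.label}: {phase.duration_ms} ms")
```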
How we capture
Each episode includes 3-5 synchronized cameras: overhead (4K/30fps), front operator-facing (4K/30fps), side profile (4K/30fps), and, on the highest-value packs, two wrist-mounted close-ups (1080p/30fps). All cameras are frame-synchronized via an LED flash at session start; multi-camera extrinsics are calibrated with a ChArUco target before each session.
Where supported, machine events are captured live from the CNC controller via OPC-UA (Siemens, Heidenhain, modern Fanuc), MTConnect (Haas, Mazak, DMG MORI, Okuma), or FOCAS (older Fanuc). Where live capture is not available, machine state is reconstructed from a fixed camera trained on the HMI display, augmented with external sensors. All events are written as JSONL with millisecond-precision timestamps.
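Because events arrive as JSONL, a reader can process them one line at a time. The sketch below is a minimal example of that pattern; the field names `t_ms` and `event` and the sample event values are assumptions made for illustration, since the per-event fields are not specified here.

```python
import io
import json


def read_events(lines):
    """Parse JSONL lines into dicts, ordered by timestamp.

    Assumes each record carries a millisecond timestamp under the
    hypothetical key "t_ms"; the real key name may differ.
    """
    events = [json.loads(line) for line in lines if line.strip()]
    return sorted(events, key=lambda e: e["t_ms"])


# In-memory stand-in for machine_events.jsonl (contents are invented).
sample = io.StringIO(
    '{"t_ms": 3200, "event": "door_open"}\n'
    '\n'
    '{"t_ms": 0, "event": "cycle_complete"}\n'
)

events = read_events(sample)
for e in events:
    print(e["t_ms"], e["event"])
```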
Each pack includes data from a minimum of 3 operators of mixed skill level (junior, mid, expert) to capture behavioral variance. All operators sign consent forms before recording. Real names never appear in deliverables; operators are identified by pseudonyms (OP-A, OP-B, OP-C).
All streams within an episode are aligned to a common reference clock (machine state where available, otherwise overhead camera). Maximum permitted drift between any two streams is 50ms; episodes exceeding this are flagged for manual realignment or rejection.
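The drift rule can be expressed as a small check. The sketch below assumes each stream's offset from the reference clock has already been measured in milliseconds (how those offsets are obtained is outside this example); the function names and sample offsets are hypothetical.

```python
MAX_DRIFT_MS = 50  # maximum permitted drift between any two streams


def max_pairwise_drift_ms(offsets_ms):
    """Largest offset difference between any two streams.

    For scalar offsets, the largest pairwise difference equals max - min.
    """
    values = list(offsets_ms.values())
    return max(values) - min(values)


def sync_status(offsets_ms):
    """Return "ok" if within tolerance, otherwise "flagged" for review."""
    return "ok" if max_pairwise_drift_ms(offsets_ms) <= MAX_DRIFT_MS else "flagged"


# Invented offsets relative to the reference clock, in milliseconds.
good = {"machine_state": 0, "cam_overhead": 12, "cam_front": -20}
bad = {"machine_state": 0, "cam_overhead": 12, "cam_side": -41}

print(sync_status(good), max_pairwise_drift_ms(good))
print(sync_status(bad), max_pairwise_drift_ms(bad))
```

Note that two streams can each sit within 50 ms of the reference clock and still drift more than 50 ms from each other, which is why the check compares the extremes.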
How we label
Each episode is decomposed into discrete phases drawn from a fixed per-task-family vocabulary. Phase boundaries are placed at millisecond precision; phases must cover 100% of episode duration with no gaps or overlaps.
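The coverage rule (no gaps, no overlaps, full duration) lends itself to a simple sweep over the sorted phases. The sketch below is one possible check, not the production validator; it assumes the episode duration is supplied separately, since the example manifest does not show a duration field.

```python
def phase_coverage_errors(phases, episode_duration_ms):
    """Return a list of human-readable problems; an empty list means full coverage."""
    errors = []
    cursor = 0  # end of the span covered so far, in ms
    for phase in sorted(phases, key=lambda p: p["start_ms"]):
        start, end = phase["start_ms"], phase["end_ms"]
        if end <= start:
            errors.append(f"phase {phase['id']}: non-positive duration")
        if start > cursor:
            errors.append(f"gap from {cursor} to {start} ms")
        elif start < cursor:
            errors.append(f"phase {phase['id']}: overlaps previous by {cursor - start} ms")
        cursor = max(cursor, end)
    if cursor != episode_duration_ms:
        errors.append(f"phases end at {cursor} ms, episode ends at {episode_duration_ms} ms")
    return errors


contiguous = [
    {"id": 1, "label": "approach", "start_ms": 0, "end_ms": 3200},
    {"id": 2, "label": "door_open", "start_ms": 3200, "end_ms": 4800},
]
with_gap = [
    {"id": 1, "label": "approach", "start_ms": 0, "end_ms": 3200},
    {"id": 2, "label": "door_open", "start_ms": 3300, "end_ms": 4800},
]

print(phase_coverage_errors(contiguous, 4800))
print(phase_coverage_errors(with_gap, 4800))
```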
Episodes contain natural and induced exceptions drawn from a per-task-family taxonomy (for CNC tending: misload, jam, incomplete_seat, chip_buildup, alarm_response, gripper_slip). Each exception is tagged with severity (minor / moderate / critical), recovery action, recovery success (boolean), and recovery duration (ms).
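An annotation check along these lines could look like the sketch below. The taxonomy and severity values are taken from the paragraph above; the function name and the exact error messages are hypothetical, and the canonical validation is done by the Pydantic schema rather than this code.

```python
CNC_TENDING_TYPES = {
    "misload", "jam", "incomplete_seat",
    "chip_buildup", "alarm_response", "gripper_slip",
}
SEVERITIES = {"minor", "moderate", "critical"}


def exception_problems(exc):
    """Return a list of problems with one exception annotation (empty if valid)."""
    problems = []
    if exc.get("type") not in CNC_TENDING_TYPES:
        problems.append(f"unknown type: {exc.get('type')!r}")
    if exc.get("severity") not in SEVERITIES:
        problems.append(f"unknown severity: {exc.get('severity')!r}")
    if not exc.get("recovery_action"):
        problems.append("missing recovery_action")
    if not isinstance(exc.get("recovery_success"), bool):
        problems.append("recovery_success must be a boolean")
    duration = exc.get("recovery_duration_ms")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(duration, bool) or not isinstance(duration, int) or duration < 0:
        problems.append("recovery_duration_ms must be a non-negative integer")
    return problems


valid = {
    "type": "misload", "phase_id": 4, "severity": "moderate",
    "recovery_action": "reposition_and_retry",
    "recovery_success": True, "recovery_duration_ms": 1850,
}
invalid = {
    "type": "spill", "severity": "moderate",
    "recovery_action": "wipe_down", "recovery_success": "yes",
}

print(exception_problems(valid))
print(exception_problems(invalid))
```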
Phase boundary placement is validated by a second reviewer for 10% of episodes. Mean inter-rater drift on phase boundaries is reported per pack (target under 200ms).
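One plausible way to compute the inter-rater figure is the mean absolute difference between matching boundaries. The sketch below assumes both reviewers label the same phases in the same order, so boundaries pair up one to one; the exact definition used for the reported per-pack number may differ.

```python
TARGET_MEAN_DRIFT_MS = 200  # per-pack target stated above


def mean_boundary_drift_ms(primary_ms, reviewer_ms):
    """Mean absolute difference between paired boundary timestamps, in ms."""
    if len(primary_ms) != len(reviewer_ms):
        raise ValueError("reviewers must mark the same number of boundaries")
    if not primary_ms:
        raise ValueError("at least one boundary is required")
    diffs = [abs(a - b) for a, b in zip(primary_ms, reviewer_ms)]
    return sum(diffs) / len(diffs)


# Invented boundary placements from two reviewers for one episode.
primary = [3200, 4800]
second = [3350, 4750]

drift = mean_boundary_drift_ms(primary, second)
print(drift, drift < TARGET_MEAN_DRIFT_MS)
```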
Quality threshold per episode
- Stream sync drift ≤ 50 ms across all cameras and the machine-state stream
- Phase coverage = 100% of episode duration
- Exception annotations include all required fields (type, severity, recovery_action, recovery_success)
- Schema validates against the canonical Pydantic schema
- Quality score ≥ 0.85, computed as: 0.3 × video_quality + 0.3 × label_accuracy + 0.2 × sync_precision + 0.2 × exception_completeness
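The quality-score formula is a plain weighted sum, sketched below. The weights and the 0.85 threshold come from the formula above; the assumption that each component is already normalized to the range 0 to 1 is ours, and the sample component values are invented.

```python
WEIGHTS = {
    "video_quality": 0.3,
    "label_accuracy": 0.3,
    "sync_precision": 0.2,
    "exception_completeness": 0.2,
}
QUALITY_THRESHOLD = 0.85


def quality_score(components):
    """Weighted sum of the four components (each assumed to lie in [0, 1])."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(weight * components[name] for name, weight in WEIGHTS.items())


# Invented component values for a single episode.
components = {
    "video_quality": 0.90,
    "label_accuracy": 1.00,
    "sync_precision": 0.95,
    "exception_completeness": 0.90,
}

score = quality_score(components)
print(round(score, 2), score >= QUALITY_THRESHOLD)
```

With these inputs the terms are 0.27 + 0.30 + 0.19 + 0.18, so the score is 0.94 up to floating-point rounding.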
Available task families (Q2 2026)
| Task Family | Episodes | Status |
|---|---|---|
| CNC Machine Tending | 50 | Ready (Apr 2026) |
| Manual Deburring & Finishing | 50 | In capture (Apr-May 2026) |
| Precision Quality Inspection | 50 | In capture (Apr-May 2026) |
| Mixed-Fastener Sub-Assembly | 50 | Scheduled (May-Jun 2026) |
| CNC Setup & Changeover | 50 | Scheduled (May-Jun 2026) |
| Manual Welding (MIG/TIG) | 50 | Scheduled (Jun 2026) |
Delivery format
Each pack is split 70/15/15 (train/val/test) by default, stratified by exception type so each split contains proportional exception coverage. Custom split ratios available on request. Episodes ship as a zipped archive (~150-200MB per episode) with a manifest, sample loader (Python, RLDS-compatible), schema documentation, and capture notes.
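A stratified split of this kind can be sketched as follows. The example assumes each episode is assigned a single stratification key (for instance its primary exception type, or "none"); how episodes with several exceptions are keyed is not specified here, and the function name, seed, and sample data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_split(episode_keys, ratios=(0.70, 0.15, 0.15), seed=0):
    """Assign each episode to train/val/test, splitting each stratum separately."""
    groups = defaultdict(list)
    for episode_id, key in sorted(episode_keys.items()):
        groups[key].append(episode_id)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    assignment = {}
    for key in sorted(groups):
        ids = groups[key]
        rng.shuffle(ids)
        n_train = round(len(ids) * ratios[0])
        n_val = round(len(ids) * ratios[1])
        for i, episode_id in enumerate(ids):
            if i < n_train:
                assignment[episode_id] = "train"
            elif i < n_train + n_val:
                assignment[episode_id] = "val"
            else:
                assignment[episode_id] = "test"  # remainder goes to test
    return assignment


# Invented pack: 20 episodes keyed "misload", 20 keyed "jam".
episode_keys = {f"ep_{i:03d}": ("misload" if i < 20 else "jam") for i in range(40)}
assignment = stratified_split(episode_keys)
counts = Counter((episode_keys[ep], split) for ep, split in assignment.items())
print(sorted(counts.items()))
```

For very small strata the rounding can leave a split empty, so rare exception types may need manual placement.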
What this dataset is NOT
This dataset is human demonstration data captured for training manipulation policies. It is NOT teleoperation data — robot joint angles, end-effector trajectories, and force/torque sensor data from a robot platform are not included. Buyers integrating with their own robot platform must pair this data with their robot-specific demonstrations or use it for vision/action grounding only.
Episodes are captured from a limited set of facilities (currently 1, expanding to 3 by Q3 2026). Cell-to-cell variance is limited until additional facilities come online. Lighting variance is limited to standard factory conditions; dramatically different lighting environments (outdoor, low-light, high-glare) are not currently represented.
Licensing terms
Standard
Perpetual license to use the data for model training, evaluation, and deployment. Resale and redistribution restricted.
Exclusive
TalosHub commits not to sell the scoped pack to competitors for an agreed exclusivity window. Premium pricing (typically 2-3x base).
Research
Reduced pricing for academic labs with citation requirements. Open-friendly terms negotiable per project.
How to cite
Get the technical bundle
For methodology questions, schema details, or custom scoping — request a 20-minute call. We share the full sample loader code and 2-3 demonstration episodes after the scoping conversation.