Hard Intersection Multimodal Samples is a curated multimodal dataset of an accident-prone urban intersection in Japan for autonomous driving research and development.
It provides multi-camera images, trajectories, HD maps, semantic annotations, point cloud data, and 3D Gaussian Splatting (3DGS) assets. This comprehensive, high-precision dataset of public road environments was captured with an industrial-grade mobile mapping system (MMS) equipped with a high-performance IMU, GNSS, LiDAR, and cameras. It includes raw sensor data (images, point clouds, trajectories), processed HD maps, metric 3DGS, and HD-map-derived semantic images and point clouds.

Dataset Description

Overview

This dataset focuses on challenging real-world urban intersections and is designed to support research in perception, mapping, and scene understanding. The current release includes the following intersection:

Takanawadai, Tokyo, Japan
The Takanawadai intersection concentrates multiple adverse driving conditions into a single spot: sensor blind spots at the crest of a hill, an irregular six-way intersection with a sharp curve, and other vehicles crossing centerlines on narrow roads.
A 3D model of this real-world, accident-prone location, which easily triggers errors even in human drivers, serves as an ideal "safety benchmark" for testing whether autonomous systems can successfully navigate extreme edge cases.
Historically, this intersection ranked as the second worst in Tokyo for traffic accidents.

The dataset combines:
- Six synchronized camera views
- Two synchronized camera views
- Trajectory data
- HD maps (OpenDRIVE/RoadRunner/Lanelet2/Vissim)
- Semantic annotations
- Point cloud data
- 3DGS reconstruction assets

Key Features

Focused on a high-risk real-world intersection
Multi-domain support:
- sensor domain
- map domain
- semantic projection domain
- reconstruction domain
360-degree coverage with six cameras
Developer-friendly formats: OpenDRIVE/RoadRunner/Lanelet2/Vissim
LAS point clouds

Modalities

Raw / Primary Data

Six synchronized camera images (Camera0–Camera5)
Two synchronized camera images (Camera_1,_+15deg/Camera_2,_+15deg)
Point cloud data
Trajectory data

Derived assets

HD maps (OpenDRIVE/RoadRunner/Lanelet2/Vissim)
Semantic image labels
Semantic point clouds with labels
3DGS reconstruction assets

Camera Setup

Six synchronized cameras (Camera0–Camera5)
Full 360-degree coverage
Two synchronized cameras (Camera_1,_+15deg/Camera_2,_+15deg)
The file naming convention is {folder}_{record_name}_{date}_{hhmmssmmm}_Camera_{number}
Camera0 is the front view, Camera1 is the front-right view, Camera2 is the rear-right view, Camera3 is the rear-left view, Camera4 is the front-left view, and Camera5 is the top view.
Additionally, for the two synchronized cameras: Camera_1,_+15deg represents the front view, and Camera_2,_+15deg represents the rear-right view.
Although the filename includes “+15deg,” it does not mean that the camera is installed in that direction.
calibration/images.txt: extrinsic parameters (camera poses for each image)
calibration/cameras.txt: intrinsic parameters (camera model and intrinsics)
In cameras.txt, for the PINHOLE model, PARAMS[] corresponds to: fx, fy, cx, cy.
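
The intrinsics described above can be read with a few lines of Python. This is a minimal sketch that assumes each non-comment line of cameras.txt follows the layout `CAMERA_ID MODEL WIDTH HEIGHT PARAMS[]`; that layout, the `PinholeCamera` class, and the `parse_pinhole_cameras` helper are illustrative assumptions, and the sample values are made up rather than taken from the dataset.

```python
from dataclasses import dataclass


@dataclass
class PinholeCamera:
    camera_id: int
    width: int
    height: int
    fx: float
    fy: float
    cx: float
    cy: float

    def intrinsic_matrix(self):
        """Return the 3x3 intrinsic matrix K as nested lists."""
        return [
            [self.fx, 0.0, self.cx],
            [0.0, self.fy, self.cy],
            [0.0, 0.0, 1.0],
        ]


def parse_pinhole_cameras(text):
    """Parse PINHOLE entries from the text of a cameras.txt file.

    Assumed line layout (not confirmed by this card):
    CAMERA_ID MODEL WIDTH HEIGHT PARAMS[]
    """
    cameras = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 8 or fields[1] != "PINHOLE":
            continue  # this sketch handles only the PINHOLE model
        camera_id, _, width, height = fields[:4]
        # For PINHOLE, PARAMS[] is fx, fy, cx, cy (as stated above).
        fx, fy, cx, cy = (float(v) for v in fields[4:8])
        cam = PinholeCamera(int(camera_id), int(width), int(height), fx, fy, cx, cy)
        cameras[cam.camera_id] = cam
    return cameras


# Made-up example values, not taken from the dataset.
sample = """# CAMERA_ID MODEL WIDTH HEIGHT PARAMS[]
1 PINHOLE 1920 1080 1000.0 1000.0 960.0 540.0
"""
cameras = parse_pinhole_cameras(sample)
print(cameras[1].intrinsic_matrix())
```

For real data, the text would come from reading calibration/cameras.txt. Entries with other camera models are skipped rather than guessed at.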

Trajectory Data

The file naming convention is {folder}_{record_name}_{date}
The dataset includes four recordings, in the following order: 26047_Record004, 26047_Record050, 26047a_Record004, and 26047a_Record084.
The trajectory data is defined in EPSG:6677.
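
The naming conventions for trajectory files and camera images can be split into their fields with a regular expression. The sketch below assumes that the folder, record name, and date fields contain no underscores and that file extensions have already been removed. The date string `20250101` in the examples is a placeholder, because the card does not state the date format. `parse_stem` and `CAMERA_VIEWS` are hypothetical names introduced here.

```python
import re

# View names as listed in the Camera Setup section.
CAMERA_VIEWS = {
    "0": "front",
    "1": "front-right",
    "2": "rear-right",
    "3": "rear-left",
    "4": "front-left",
    "5": "top",
    "1,_+15deg": "front",
    "2,_+15deg": "rear-right",
}

# {folder}_{record_name}_{date}, optionally followed by
# _{hhmmssmmm}_Camera_{number} for camera images.
STEM = re.compile(
    r"^(?P<folder>[^_]+)_(?P<record_name>[^_]+)_(?P<date>[^_]+)"
    r"(?:_(?P<time>\d{9})_Camera_(?P<camera>.+))?$"
)


def parse_stem(stem):
    """Split a file name (without extension) into its named fields."""
    match = STEM.match(stem)
    if match is None:
        raise ValueError(f"unexpected file name: {stem!r}")
    info = match.groupdict()
    if info["time"] is not None:
        t = info["time"]  # hhmmssmmm
        hours, minutes, seconds, millis = int(t[:2]), int(t[2:4]), int(t[4:6]), int(t[6:])
        info["ms_of_day"] = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
        info["view"] = CAMERA_VIEWS.get(info["camera"])
    return info


# Placeholder names built from the conventions; the date value is hypothetical.
trajectory = parse_stem("26047_Record004_20250101")
image = parse_stem("26047_Record004_20250101_123456789_Camera_0")
print(trajectory["record_name"], image["view"], image["ms_of_day"])
```

Converting the `hhmmssmmm` field to milliseconds makes it straightforward to sort frames or pair images from different cameras by time.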

HD Map Formats

The HD map is provided in the formats listed below.

OpenDRIVE (versions 1.4, 1.6, and 1.8)
RoadRunner HD Map
Lanelet2
Vissim

Point Cloud Data

Provided in LAS format
Multi-run aggregated (not single-frame LiDAR)
The point cloud data (.las) is defined in EPSG:6677, with elevations given as orthometric heights (above sea level).
Use cases:
- Map-aligned geometry
- Semantic projection base
- Reconstruction workflows

Semantic Annotations

Semantic Image: COCO JSON format
The COCO JSON files define both the semantic labels and the images to which those labels are assigned.
Semantic Point Cloud: point-wise (LiDAR)
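
For the semantic images, a COCO JSON file can be indexed with the standard library alone. The sketch below assumes the usual COCO top-level keys (`images`, `annotations`, `categories`). The helper name `labels_by_image`, the file name, and the category names in the sample are placeholders, not values taken from the dataset.

```python
import json
from collections import defaultdict


def labels_by_image(coco):
    """Map each image file name to the set of category names annotated on it."""
    categories = {c["id"]: c["name"] for c in coco["categories"]}
    file_names = {img["id"]: img["file_name"] for img in coco["images"]}
    result = defaultdict(set)
    for ann in coco["annotations"]:
        result[file_names[ann["image_id"]]].add(categories[ann["category_id"]])
    return dict(result)


# Tiny made-up document in the COCO layout. A real file would be loaded
# with json.load() from the annotation/semantic_images folder.
sample = json.loads("""
{
  "images": [{"id": 1, "file_name": "example.jpg"}],
  "categories": [
    {"id": 1, "name": "pedestrian_crossing"},
    {"id": 2, "name": "stop_bar"}
  ],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 1},
    {"id": 11, "image_id": 1, "category_id": 2}
  ]
}
""")
labels = labels_by_image(sample)
print(labels)
```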

The semantic information of the Semantic Point Cloud is stored in the UserData field of the LAS format, as specified in the table below.

Item	Code
road_surface	11
traffic_island	12
solid_white_line	21
dashed_white_line	22
solid_yellow_line	23
dashed_yellow_line	24
double_line	25
straight_arrow	31
left_arrow	32
right_arrow	33
left_and_straight_arrow	34
right_and_straight_arrow	35
pedestrian_crossing	41
stop_bar	42
traffic_calming_strip	43
bus	44
vertical_two_traffic_light	51
horizontal_three_traffic_light	52
horizontal_four_traffic_light	53
wrong_way	61
interstate_route	62
other_blue	63
other	71

These annotations are assigned based on HD Map attributes, and objects not defined in the HD Map, such as vehicles, are not annotated.
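
The code table above translates directly into a lookup. The sketch below counts points per label from a sequence of UserData values. How those values are read is left open: with the third-party laspy package they would typically be available as the `user_data` attribute of a loaded file, but that tooling choice is an assumption and is not prescribed by this card. The input list in the example is made up.

```python
from collections import Counter

# UserData code -> label, copied from the table above.
SEMANTIC_CODES = {
    11: "road_surface",
    12: "traffic_island",
    21: "solid_white_line",
    22: "dashed_white_line",
    23: "solid_yellow_line",
    24: "dashed_yellow_line",
    25: "double_line",
    31: "straight_arrow",
    32: "left_arrow",
    33: "right_arrow",
    34: "left_and_straight_arrow",
    35: "right_and_straight_arrow",
    41: "pedestrian_crossing",
    42: "stop_bar",
    43: "traffic_calming_strip",
    44: "bus",
    51: "vertical_two_traffic_light",
    52: "horizontal_three_traffic_light",
    53: "horizontal_four_traffic_light",
    61: "wrong_way",
    62: "interstate_route",
    63: "other_blue",
    71: "other",
}


def label_histogram(user_data):
    """Count points per semantic label; codes not in the table are kept as unknown_<code>."""
    counts = Counter(int(code) for code in user_data)
    return {SEMANTIC_CODES.get(code, f"unknown_{code}"): n for code, n in counts.items()}


# Made-up UserData values standing in for a real point cloud.
histogram = label_histogram([11, 11, 41, 0])
print(histogram)
```

Codes outside the table are reported rather than dropped, so points without an HD Map attribute remain visible in the counts.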

3DGS and Reconstruction Assets

3DGS assets are included as reconstruction-oriented scene representations.
These assets are intended for scene reconstruction, scene visualization, synthetic data preparation, simulation-oriented environment understanding, and map-aligned scene asset generation.
3DGS assets are defined in the same coordinate system as the point cloud data.
Data Collection Constraints: Clean, static data was constructed by completely filtering out dynamic objects from highly congested public roads with constant vehicle and pedestrian traffic.
Environmental Complexity: The geometry of complex, dynamic environments, where standard 3DGS generation typically fails, was reproduced accurately by integrating high-precision LiDAR point clouds captured for HD map creation.
System Scale: The reconstruction scales beyond single objects to cover large-scale road networks through an integrated image and point cloud pipeline. Furthermore, when paired with our HD map data, it functions as a "Metric 3DGS" that preserves precise real-world physical dimensions and positioning.

Please refer to the following viewer for 3DGS assets.

Dataset Structure

Dataset
├─annotation
│  ├─semantic_images
│  └─semantic_pointcloud
├─calibration
├─images
├─maps
│  ├─lanelet2
│  ├─opendrive
│  ├─roadrunnerhdmap
│  └─vissim
│      └─images
├─pointcloud
├─reconstruction
└─trajectory
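
After downloading, a local copy can be checked against the tree above. This is a minimal sketch: `EXPECTED_DIRS` is transcribed from the tree, `missing_dirs` is a hypothetical helper, and the demo builds a throwaway directory instead of touching real data.

```python
import tempfile
from pathlib import Path

# Folder layout transcribed from the tree above, relative to the dataset root.
EXPECTED_DIRS = [
    "annotation/semantic_images",
    "annotation/semantic_pointcloud",
    "calibration",
    "images",
    "maps/lanelet2",
    "maps/opendrive",
    "maps/roadrunnerhdmap",
    "maps/vissim/images",
    "pointcloud",
    "reconstruction",
    "trajectory",
]


def missing_dirs(root):
    """Return the expected sub-folders that are absent under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


# Demo on a temporary folder that deliberately lacks the last entry.
with tempfile.TemporaryDirectory() as tmp:
    for d in EXPECTED_DIRS[:-1]:
        (Path(tmp) / d).mkdir(parents=True)
    demo_missing = missing_dirs(tmp)
print(demo_missing)
```

For a real check, pass the path of the downloaded dataset folder to `missing_dirs`.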

Intended Uses

multi-camera perception research
map-aware perception
semantic segmentation
map projection workflows
difficult-scene localization
trajectory analysis
point cloud semantic research
panorama-based perception research
reconstruction and 3D scene asset research
synthetic data or simulation preparation workflows

Social Impact of Dataset

It is expected that an increase in the volume of this dataset will contribute to the following:
- Improvement of autonomous driving technology
- Reduction in required storage capacity through the elimination of duplicate datasets

Limitations

Limited to one intersection (not large-scale coverage)
LAS is aggregated multi-run data
Representation may differ across the HD map formats
Images have been processed to protect personal information: faces have been mosaicked, and license plates have been masked as much as possible, although the masking may not be complete. Users may contact us if additional anonymization is required.
Semantic images are not provided for every image. Users may contact us if additional annotation is required.

Difficulty Tags

occlusion_heavy
dense_traffic
complex_lane_topology
multi_phase_signal
unprotected_turn

Citation

@dataset{hard_intersection_multimodal_samples_2026,
  title={Hard Intersection Multimodal Samples},
  author={Dynamic Map Platform Co., Ltd.},
  year={2026},
  publisher={Hugging Face}
}

Acknowledgements / Attribution

This dataset was generated using several open-source models and libraries. We gratefully acknowledge the contributions of the original authors.

Models

Grounding DINO
Liu et al., IDEA-Research, ECCV 2024
License: Apache License 2.0
https://github.com/IDEA-Research/GroundingDINO
OneFormer
Jain et al., SHI-Labs, CVPR 2023
License: MIT License
https://github.com/SHI-Labs/OneFormer
ViTMatte
Yao et al., HUST Vision Lab, Information Fusion 2024
License: Apache License 2.0
https://github.com/hustvl/ViTMatte

Pretrained Models

Grounding DINO Base
IDEA-Research
License: Apache License 2.0
https://huggingface.co/IDEA-Research/grounding-dino-base
OneFormer Cityscapes Swin-L
SHI-Labs
License: MIT License
https://huggingface.co/shi-labs/oneformer_cityscapes_swin_large
ViTMatte Base (Distinctions-646)
HUST Vision Lab
License: Apache License 2.0
https://huggingface.co/hustvl/vitmatte-base-distinctions-646

3D Gaussian Splatting Libraries

gsplat
Ye, Li et al., UC Berkeley / Nerfstudio, JMLR 2025
License: Apache License 2.0
https://github.com/nerfstudio-project/gsplat
Splatfacto-W
Xu et al., UC Berkeley / ShanghaiTech, arXiv 2024
License: Apache License 2.0
https://github.com/KevinXu02/splatfacto-wyy

This dataset contains only derived data outputs generated using the above tools.
No original model weights or source code are redistributed.

All rights and licenses of the original works remain with their respective authors.

Redistribution Notice

This repository distributes dataset artifacts only. It does NOT include:

source code of the above models
pretrained model weights
third-party libraries

Users must obtain those components separately from their original sources and comply with their respective licenses.

Feedback/Contact

Feedback is optional, but very welcome.
Contact: opensource@dynamic-maps.co.jp

Dynamic Map Platform Co., Ltd. (DMP) is a Japan-based provider of high-precision 3D geospatial data and HD maps for automotive and infrastructure applications. Established with support from the Japanese government and major automakers, DMP has built strong relationships with both industry and public sectors.

In the automotive domain, DMP works with global OEMs including Toyota, Honda, General Motors, Nissan, and SUBARU, supporting production vehicles and advanced driving systems.
