Sharper AI Enhances Maritime Safety

Journal of Remote Sensing

A new artificial intelligence system improves maritime ship detection in optical remote sensing imagery by combining broad target screening with fine-scale boundary analysis. The method is designed to handle multiscale vessels, cluttered sea backgrounds, cloud interference, wakes, and partial occlusion. By linking saliency prediction with instance segmentation, the framework reduces both missed detections and false alarms, offering a more reliable path for maritime surveillance and ocean monitoring.

Ship detection from remote sensing images is essential for maritime transportation, naval awareness, emergency rescue, and port logistics. Optical imagery provides rich texture and structural information, but it also creates major challenges for automated detection. Ships may appear extremely small, densely packed, partly hidden, or visually confused with wakes, coastlines, clouds, and sea clutter. Conventional single-stage detectors often struggle to balance sensitivity and precision under these conditions, especially when object scales vary sharply across the same image. These challenges motivate in-depth research into more robust ship detection frameworks for optical remote sensing imagery.

Researchers from Yan'an University and Northwestern Polytechnical University reported this advance in the Journal of Remote Sensing (DOI: 10.34133/remotesensing.1038), published on March 25, 2026. Their study introduces C2FSMSDet, a coarse-to-fine saliency-driven maritime ship detection network developed to improve detection accuracy in complex optical remote sensing scenes. The system addresses a practical problem faced by current methods: how to reliably identify ships when image backgrounds are noisy and object boundaries are weak or overlapping.

The core innovation is a two-stage design. In the coarse detection stage (CoarseDet), the model first generates a pixel-level saliency map to highlight likely ship regions while suppressing wakes, sea clutter, and coastal interference. In the fine detection stage (FineDet), those saliency cues guide instance-level segmentation so that ship boundaries can be separated more precisely. This design differs from standard single-stage detectors because it decouples rough localization from detailed delineation. On a mixed test set, the framework achieved an F1 score of 0.912 and a mean average precision at an intersection over union threshold of 0.5 (mAP0.5) of 0.953, outperforming strong baseline detectors such as YOLOv7, which reached an mAP0.5 of 0.893.
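The two-stage flow described above can be illustrated with a minimal, hypothetical sketch. The functions `coarse_saliency` and `fine_segment` below are illustrative stand-ins for CoarseDet and FineDet, not the paper's actual networks; the thresholds and the brightness-based placeholder logic are assumptions made purely to show the control flow:

```python
import numpy as np

def coarse_saliency(image: np.ndarray) -> np.ndarray:
    """Stand-in for CoarseDet: produce a pixel-level saliency map in [0, 1].
    Here a simple brightness normalization; the real stage is a transformer network."""
    gray = image.mean(axis=-1)
    return (gray - gray.min()) / (np.ptp(gray) + 1e-8)

def fine_segment(image: np.ndarray, saliency: np.ndarray, thresh: float = 0.5) -> np.ndarray:
    """Stand-in for FineDet: refine only within the salient candidate regions.
    The coarse mask prioritizes recall; the refinement restores precision."""
    candidates = saliency > thresh
    refined = candidates & (image.mean(axis=-1) > image.mean())
    return refined

# Toy 64x64 RGB "scene" standing in for an optical remote sensing tile.
image = np.random.rand(64, 64, 3)
saliency = coarse_saliency(image)   # stage 1: where might ships be?
mask = fine_segment(image, saliency)  # stage 2: delineate within those regions
print(saliency.shape, mask.shape)
```

The key structural point, decoupling rough localization from detailed delineation, survives even in this toy form: the second stage never evaluates pixels the first stage did not flag.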

C2FSMSDet combines transformer-based global reasoning with multiscale feature extraction. In CoarseDet, a Fully Convolutional Transformer (FCT) captures long-range contextual dependencies, while a Wide Focus Block (WFB) uses parallel dilated convolutions to analyze targets at different scales. A Criss-Cross Attention Module (CCAM) further strengthens pixel-level contextual modeling across rows and columns, helping the network distinguish ships from confusing background structures.

The resulting saliency map is then passed to FineDet. This second stage is built on an optimized Mask Region-Based Convolutional Neural Network (Mask R-CNN) with a Swin Transformer backbone, mosaic data augmentation, and a Context Enhancement Module (CEM). Together, these components improve boundary recovery, object separation, and robustness in dense or partially occluded scenes.

The model was evaluated on three public datasets: Airbus Ship Detection, HRSC2016, and DOTA. In saliency evaluation, the CoarseDet configuration with CCAM reached the best values among FCN-based variants, including a root mean square error (RMSE) of 8.40 and a mean absolute error (MAE) of 0.072.
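The parallel dilated convolutions described above for the WFB can be sketched in a few lines of PyTorch. This is a minimal illustration only: the branch count, dilation rates (1, 2, 3), and 1×1 fusion are assumptions, not the paper's exact block design:

```python
import torch
import torch.nn as nn

class WideFocusSketch(nn.Module):
    """Illustrative multiscale block: parallel dilated convolutions whose
    outputs are concatenated and fused, loosely following the WFB idea."""
    def __init__(self, channels: int):
        super().__init__()
        # Setting padding equal to dilation keeps the spatial size unchanged,
        # so branches with different receptive fields can be concatenated.
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in (1, 2, 3)
        ])
        self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        multi = torch.cat([branch(x) for branch in self.branches], dim=1)
        return self.fuse(multi)

block = WideFocusSketch(channels=16)
out = block(torch.randn(1, 16, 32, 32))
print(out.shape)  # torch.Size([1, 16, 32, 32])
```

Each branch sees the same feature map at a different effective receptive field, which is how dilated convolutions let a single block respond to both small and large vessels.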

According to the authors, the strength of the framework lies in its coarse-to-fine strategy: the first stage prioritizes recall by finding possible ship regions, while the second stage improves precision through detailed instance segmentation. This coordinated design allows the model to better separate closely packed ships and reduce confusion from surrounding maritime backgrounds.

The researchers trained and tested the system using three publicly available ship-detection datasets covering open sea, ports, nearshore scenes, and aerial views. Training included standard augmentation such as rotation, scaling, flipping, and color jittering, while the FineDet stage also used mosaic augmentation. CoarseDet was optimized with Adam, an initial learning rate of 1 × 10⁻⁴, and cosine annealing. Experiments were run on an NVIDIA GeForce RTX 3090 GPU and an Intel Xeon E5-2699C v4 CPU with PyTorch 1.12.1 and CUDA 11.7.
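The CoarseDet optimizer setup reported above (Adam, initial learning rate 1 × 10⁻⁴, cosine annealing) takes only a few lines of PyTorch. The tiny placeholder model, the schedule length `T_max=100`, and the step count are assumptions for illustration; the article does not state the paper's epoch budget:

```python
import torch
import torch.nn as nn

# Placeholder model; the actual CoarseDet network is far larger.
model = nn.Conv2d(3, 1, kernel_size=3, padding=1)

# Adam with the initial learning rate reported in the study.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Cosine annealing decays the learning rate toward zero over T_max steps
# (T_max=100 here is an assumption, not a value from the paper).
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)

for step in range(50):
    optimizer.zero_grad()
    loss = model(torch.randn(1, 3, 8, 8)).mean()  # dummy forward pass and loss
    loss.backward()
    optimizer.step()
    scheduler.step()

print(optimizer.param_groups[0]["lr"])  # below the initial 1e-4 after 50 steps
```

Cosine annealing is a common pairing with Adam for detection training because the smooth decay avoids the abrupt loss spikes that step schedules can cause late in training.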

This study points to a promising future for high-accuracy maritime monitoring based on optical remote sensing. A more reliable ship detector could support maritime traffic management, anomaly warning, port operations, and rescue response, while also contributing to sustainable ocean economic development. The coarse-to-fine design may also be adapted to other remote sensing tasks involving small, dense, or weak-boundary targets, such as vehicles, infrastructure, or environmental hazards in complex scenes.
