Instance Segmentation and Contour Extraction Method for Open Curves in Engineering Drawings

Chenjia Niu¹

Lu Wu¹ ✉ Email: wulv@whut.edu.cn

Ning Li¹

Ning Xu¹

¹ Wuhan University of Technology, No. 122 Luoshi Road, 430070 Wuhan, Hubei Province, China

Abstract

Engineering drawings are a universal language that enables engineers to understand design intent accurately. To recover that intent automatically, a drawing must be segmented into distinct instances. However, segmenting drawings with dimension details is challenging because of irregular components, shapes, and annotations. These drawings often contain several views with small details such as dimension lines and annotations, especially in package drawings. To obtain single views together with their dimension lines, which form open curves, from drawings that contain multiple viewing angles, this paper presents a method integrating the Segment Anything Model (SAM), edge detection, and path-finding algorithms. First, initial masks are obtained by segmenting the drawings with SAM. Then, morphological processing is used to enhance mask quality. Lastly, a pixel path search algorithm is used to refine the outer contours and region details. Experiments show that, relative to a retrained SAM baseline, this approach raises object-level accuracy (Obj@0.75) from 0.87 to 0.93 and the small-annotation preservation rate (SAPR) from 0.85 to 0.94, providing cleanly separated views of package drawings.

Keywords

Instance Segmentation

Contour Extraction

Engineering Drawings

Open Curves

SAM

Introduction

In engineering drawings, instance segmentation is crucial for enabling precise recognition and separation of individual components within complex technical diagrams, which traditional object detection or semantic segmentation cannot fully achieve. It allows overlapping or closely spaced identical symbols, such as bolts, connectors, or pads, to be isolated as unique objects for accurate counting, labeling, and analysis. By providing pixel-level boundaries, instance segmentation supports exact dimensional measurements essential for PCB layouts, mechanical blueprints, and CAD exports, where small geometric variations can have manufacturing implications. Furthermore, because engineering drawings often exhibit hierarchical structures with main components, sub-components, and annotations, instance segmentation provides a structured decomposition that supports relation extraction, such as linking annotations to their corresponding components, and enhances semantic understanding for downstream engineering tasks.

The Segment Anything Model \cite{kirillov2023segment} serves as a general-purpose segmentation framework capable of performing instance segmentation through its prompt-driven design. However, it does not inherently assign instance identifiers and may require post-processing to form a complete instance segmentation pipeline. Our research reveals that its effectiveness on engineering drawings is suboptimal due to the following characteristics: (1) the irregular typesetting of views provided by different manufacturers; (2) varying levels of compactness in the distribution of views, where overly dense layouts significantly increase segmentation difficulty; and (3) the inherent complexity of packaging views, including intricate boundaries and small objects carrying dimensional information, which are non-closed curves and exhibit complex connectivity relationships with other views.

To improve segmentation quality for packaging component drawings, we propose an optimization method. Morphological processing and pixel-level path search refine SAM's initial masks, precisely extracting contours and regional details. The research in this paper can be summarized as follows:

(1) Morphological processing is applied to connect small objects, such as dimension information, to their component views.

(2) To address the adhesion caused by excessive dilation of engineering components, we propose an adaptive morphological dilation algorithm that dynamically adjusts the kernel size and identifies the number of dilation iterations that maximizes shape congruence between the dilated objects and the masks generated by SAM.

(3) We propose a pixel-level path search algorithm to address complex contour problems and compare it with existing edge detection algorithms to highlight the advantages of our approach.

(4) We compare the raw SAM masks with the refined results of the post-processing module, demonstrating the improved segmentation performance after optimization.

Related Works

In recent years, general semantic segmentation networks \cite{yurtkulu2019semantic} have demonstrated strong performance in natural image understanding, performing pixel-wise classification and semantic segmentation of complex scenes \cite{minaee2021image}\cite{yu2023techniques}\cite{chen2017rethinking}.

A growing body of research focuses on package-view segmentation, which aims to isolate complex component views containing dimensional annotations and contours. For example, Zhang et al. \cite{zhang2023component} employed graph-based representations of geometric primitives to distinguish outlines, dimension lines, and text regions effectively. Carrara et al. \cite{carrara2025vectorgraphnet} introduced a graph attention transformer architecture for accurate segmentation of technical drawings with complex topological relationships. These methods highlight the potential of integrating geometric priors into deep segmentation frameworks. In addition, for low-quality or noisy engineering drawings, restoration-and-recognition frameworks have been explored. Yang et al. \cite{yang2024comprehensive} proposed an end-to-end pipeline for the restoration and recognition of low-quality engineering drawings, jointly enhancing visual clarity and symbol segmentation. In the context of sheet-metal engineering drawings, Song et al. \cite{song2025segmentation} integrated the Convolutional Block Attention Module (CBAM) into a U-Net backbone to enhance feature representation and segmentation precision. However, these methods often face challenges when applied to engineering drawings due to the unique visual characteristics of technical diagrams, such as the presence of fine-scale structures, overlapping annotations, and sparse line-based representations.

Another emerging line of research lies in image vectorization and structural abstraction, which aims to convert rasterized line drawings into layered vector formats. For example, Zhou et al. \cite{zhou2024segmentation} introduced a segmentation-driven vectorization framework that preserves geometric and topological consistency. Similarly, Wang et al. \cite{wang2025layered} focus on multi-layer decomposition to facilitate the hierarchical reconstruction of technical diagrams.

Moreover, large-scale CAD and architectural drawing datasets are increasingly being released to support symbol detection and segmentation. The ArchCAD-400K dataset \cite{luo2025archcad} provides a large-scale benchmark for panoptic symbol spotting, while CADSpotting \cite{yang2024cadspotting} offers a robust method for panoptic symbol detection across extensive CAD drawings. These datasets and benchmarks promote cross-domain learning and fine-grained segmentation of technical symbols and graphical entities.

Finally, although foundation models such as the Segment Anything Model (SAM) \cite{kirillov2023segment} have achieved significant success in natural image segmentation, their direct application to engineering drawings often yields suboptimal results due to the high density of thin lines, overlapping annotations, and geometric complexity. Consequently, an increasing number of studies have proposed SAM adapters or task-specific extensions to improve boundary precision and detail retention. For example, SAM variants have been adapted for medical\cite{cheng2025interactive}\cite{wu2025medical}, remote sensing\cite{osco2023segment}\cite{chen2024rsprompter} and defect detection applications, demonstrating the importance of scenario-specific optimization for domain transfer. Motivated by this, our work combines SAM with a tailored framework designed specifically for engineering drawings, aiming to achieve fully automated and accurate segmentation of component views, dimension lines, and annotations.

Method

Method Framework

Fig. 1

Flow chart of post-optimization based on morphological processing and pixel path search algorithm

In Figure 1, the preprocessing stage achieves coarse localization of the object region through a multi-step morphological processing pipeline \cite{chen2017rethinking}. First, the input image undergoes sequential operations including binarization and dilation to generate a valid mask. This valid mask is then fused with the SAM masks through IoU matching and area sorting strategies, ultimately producing a segmentation object that effectively encompasses the object region.

The fusion stage employs a contour tracing algorithm to achieve precise segmentation. The process begins by selecting an arbitrary feature point located on the outermost contour of the segmentation object, where the point exhibits a pixel value of 1 (black) and is surrounded by a white background. A scanning matrix is then constructed around the feature point to systematically examine its neighborhood. When a nearby point with a pixel value of 1 is detected, it is marked as visited and designated as the new feature point for subsequent iterations. This search process continues until a complete closed loop is formed, resulting in a well-defined edge path. Finally, post-processing operations such as contour filling are applied to generate the refined precise mask.

Input Mask Processing

In automatic segmentation tasks using the SAM model, a large number of candidate masks covering different semantic regions are generated, and their spatial hierarchical relationships (such as inclusion and overlap) lead to significant redundancy in the results. To retain the required mask blocks and eliminate the interference of internal sub-masks, this paper proposes a mask hierarchical relationship analysis and spatial screening strategy.

As illustrated in Figure 2, our method first processes the candidate masks. We calculate the pixel area of every mask and sort the masks from largest to smallest, which ensures that larger mask blocks are processed first. Meanwhile, we obtain the minimum bounding rectangle of each mask for quick pre-screening. Then, we determine the hierarchical relationships among the masks and screen them. For each mask, we conduct a spatial inclusion test against the other masks: if the bounding box of mask A lies entirely within that of mask B, mask A is removed; otherwise, it is kept in the set.

In the specific implementation, we use the Intersection over Union (IoU) threshold method. We traverse all pairs of masks and calculate the IoU. If mask A's IoU exceeds a certain threshold (here, we set the threshold at 0.95) and it has a smaller area, it is determined to be a sub-mask that is included.

By using the above method, we can eliminate the interference of sub-masks and only retain the mask information we need.
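The screening step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: masks are assumed to be non-empty boolean NumPy arrays of equal shape, the function names (`bounding_box`, `box_inside`, `filter_sub_masks`) are hypothetical, and only the bounding-box inclusion test is shown (the IoU-threshold variant with the 0.95 threshold is omitted).

```python
import numpy as np

def bounding_box(mask):
    """Return (y0, x0, y1, x1) of the nonzero region of a non-empty boolean mask."""
    ys, xs = np.nonzero(mask)
    return ys.min(), xs.min(), ys.max(), xs.max()

def box_inside(inner, outer):
    """True if box `inner` lies entirely within box `outer`."""
    return (inner[0] >= outer[0] and inner[1] >= outer[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])

def filter_sub_masks(masks):
    """Return indices of masks to keep, processing larger masks first.

    A mask is dropped when its bounding box is nested in the bounding box
    of an already kept (larger) mask.
    """
    order = sorted(range(len(masks)), key=lambda i: int(masks[i].sum()), reverse=True)
    kept = []
    for i in order:
        box = bounding_box(masks[i])
        if not any(box_inside(box, bounding_box(masks[k])) for k in kept):
            kept.append(i)
    return kept
```

Sorting by area first means that a mask only ever needs to be compared against masks that are at least as large, which matches the "larger mask blocks are processed first" rule in the text.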

Connected Dimensions & Components

In engineering component views, there exists a substantial amount of dimensional information and dimension lines as small surrounding objects, which are not physically connected to the component view itself. If we directly extract the contour of the component view, these peripheral dimension lines and annotations may be inadvertently omitted—an outcome we particularly wish to avoid.

First, we preprocess the image, focusing on grayscale and binarization. Image feature extraction relies on brightness, not color information. Using a weighted average method for grayscale conversion transforms the original three-channel color image into a single channel. This not only reduces the data volume but also minimizes color interference. The formula is as follows:

\begin{equation} \label{eq:placeholder_label} I_{\text{binary}}(x, y) = \begin{cases} 255, & \text{if } I_{\text{gray}}(x, y) \geq T \\ 0, & \text{otherwise} \end{cases} \end{equation}

where $I_{\text{gray}}$ is the grayscale image, $T$ is the threshold, and $I_{\text{binary}}$ is the output binary image. Since the choice of threshold $T$ directly impacts the segmentation effect, this study uses the Otsu algorithm to adaptively calculate the optimal threshold based on the pixel intensity distribution of the object.

\begin{equation} T_{\text{opt}} = \arg \max_{T} \left[ \sigma_{b}^{2}(T) \right] \end{equation}

In the formula, $\sigma_{b}^{2}$ represents the between-class variance. By maximizing the separation between foreground and background, robust segmentation is achieved. This method effectively adapts to local contrast changes, laying the foundation for subsequent morphological operations and feature extraction tasks.
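The preprocessing described above (weighted-average grayscale conversion, Otsu threshold selection, binarization) can be sketched with NumPy alone. This is an illustrative sketch: the grayscale weights (0.299, 0.587, 0.114) are an assumption, since the text does not state which weights were used, and the function names are hypothetical.

```python
import numpy as np

def to_gray(rgb):
    """Weighted-average grayscale conversion (weights are an assumption)."""
    weights = np.array([0.299, 0.587, 0.114])
    return np.rint(rgb[..., :3] @ weights).astype(np.uint8)

def otsu_threshold(gray):
    """Return the threshold T that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)            # probability of the class with values <= t
    mu = np.cumsum(prob * levels)   # cumulative first moment up to t
    w1 = 1.0 - w0                   # probability of the class with values > t
    valid = (w0 > 0) & (w1 > 0)
    sigma_b = np.zeros(256)
    sigma_b[valid] = (mu[-1] * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    # Classes are split as {<= t} and {> t}; with the ">= T" convention, T = t + 1.
    return int(np.argmax(sigma_b)) + 1

def binarize(gray, T):
    """I_binary = 255 where I_gray >= T, otherwise 0."""
    return np.where(gray >= T, 255, 0).astype(np.uint8)
```

In practice an image library would usually provide both steps; the explicit version is shown only to make the relation between $\sigma_{b}^{2}(T)$ and the chosen threshold visible.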

In image processing, dilation and erosion are the most commonly used morphological operations. The primary purpose of dilation is to expand the image boundaries, typically used to fill small holes in the image and connect broken regions. In a binary image, the dilation operation can be defined as:

\begin{equation} (A \oplus B)(x, y) = \max_{(i, j) \in B} A(x+i, y+j) \end{equation}

In the equation, A is the input binary image, B is the structuring element, (x, y) denotes the pixel coordinates in the image, and (i, j) represents the offset of the structuring element. By applying dilation to the binary image of the object component, we can easily connect the small object size information to the main body. However, since components in engineering drawings are often densely and irregularly arranged, using too large a structuring element could potentially connect other components to the current object component.
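As a concrete reading of the dilation definition, the following sketch computes the maximum of the shifted copies of A selected by the structuring element B. It assumes B has odd side lengths and is centered on its middle element, and that pixels outside the image count as 0; the function name `dilate` is hypothetical.

```python
import numpy as np

def dilate(A, B):
    """Binary dilation: out(x, y) = max over offsets (i, j) in B of A(x+i, y+j)."""
    kh, kw = B.shape
    ph, pw = kh // 2, kw // 2
    # Zero padding so that offsets reaching outside the image contribute 0.
    padded = np.pad(A, ((ph, ph), (pw, pw)), mode="constant")
    h, w = A.shape
    out = np.zeros_like(A)
    for di in range(kh):
        for dj in range(kw):
            if B[di, dj]:
                # This slice is A shifted by the offset (di - ph, dj - pw).
                out = np.maximum(out, padded[di:di + h, dj:dj + w])
    return out
```

For example, dilating a single foreground pixel with a 3×3 all-ones structuring element yields a 3×3 block, which is how a nearby dimension line becomes attached to the component body.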

To address over-dilation and adhesion among densely packed engineering parts, this paper proposes a progressive, collision-aware dilation strategy. The method dilates the object in hierarchical iterations and dynamically adjusts the size of the dilation kernel, ensuring the connectivity of the object while avoiding over-dilation. The core idea is to set an initial kernel size and a maximum number of iterations. After each iteration, the dilated object is compared with the mask obtained in Section 3.2 using a composite metric that combines the area difference and the IoU (Intersection over Union). A dual dynamic stopping mechanism is employed: the iteration halts immediately when the area difference rate falls below 5%, and it also stops when the score has not improved for three consecutive iterations.
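A possible shape of this iterative loop is sketched below. Several details are assumptions because the text does not specify them: the composite score is taken as IoU minus the relative area difference, the kernel side grows by 2 per iteration, and the binary input is assumed to be already restricted to the region of the current component. All names (`dilate_square`, `progressive_dilation`, `area_tol`, `patience`) are hypothetical.

```python
import numpy as np

def dilate_square(mask, k):
    """Dilate a boolean mask with a k x k square structuring element (k odd)."""
    r = k // 2
    padded = np.pad(mask, r, mode="constant")
    h, w = mask.shape
    out = np.zeros_like(mask)
    for di in range(k):
        for dj in range(k):
            out |= padded[di:di + h, dj:dj + w]
    return out

def progressive_dilation(binary, sam_mask, k0=3, max_iter=10, area_tol=0.05, patience=3):
    """Dilate step by step and return the result that best matches the SAM mask."""
    best, best_score, stale = binary, -np.inf, 0
    current, k = binary, k0
    sam_area = max(int(sam_mask.sum()), 1)
    for _ in range(max_iter):
        current = dilate_square(current, k)
        inter = np.logical_and(current, sam_mask).sum()
        union = np.logical_or(current, sam_mask).sum()
        iou = inter / union if union else 0.0
        area_diff = abs(int(current.sum()) - sam_area) / sam_area
        score = iou - area_diff          # assumed composite score
        if score > best_score:
            best, best_score, stale = current, score, 0
        else:
            stale += 1
        # Dual stopping rule: close enough in area, or no gain for `patience` rounds.
        if area_diff < area_tol or stale >= patience:
            break
        k += 2                           # assumed kernel growth schedule
    return best
```

Returning the best-scoring intermediate result, rather than the last one, keeps an over-dilated final iteration from being used when the score has already started to decline.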

Component View Contour Search

Fig. 2

Results of Various Edge Detection Operators

Unlike conventional image contour searches, our aim is to obtain the outermost contour of the object component. In image processing, common edge detection algorithms such as the Canny edge detector, Sobel operator, and Laplacian operator, though capable of extracting object contours, are not effective in fully capturing the complex internal and boundary structures of components. Additionally, they often suffer from interference caused by internal contours, as shown in Figure 2.

Inspired by the A* path-finding algorithm, we only need to find a single point on the outermost contour of the object component to serve as the starting point and then trace the coordinates along the outermost contour. This point corresponds to the starting feature point of the Fusion stage in Figure 1. To determine the starting point, we first apply the dilation operation in Section 3.3, which connects the components and produces the overall object image.

For efficient acquisition of the starting feature point in binary image processing, this study adopts a matrix-based pixel traversal strategy. Consider a binary 2D matrix in which the 0-valued elements represent the background and the 1-valued elements represent the object features:

\begin{equation} I \in \{0,1\}^{m \times n} \end{equation}

We construct a Boolean condition matrix and use a built-in function of the NumPy scientific computing library for efficient coordinate retrieval. This function returns the index tuple (coords_y, coords_x) indicating the positions where I(i, j) = 1. The coordinates are scanned in both row-major and column-major order, following raster scan order. To obtain the coordinates of the first occurrence of the feature, we extract the first valid element from the index sequence (x_min, y_min), which corresponds to the smallest topological order position in the image space domain that satisfies the feature condition.
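The starting-point lookup can be illustrated as follows. The text does not name the NumPy function it uses, so `np.nonzero` here is an assumption; it returns indices in row-major order, so the first entry is the first feature pixel met in a top-to-bottom, left-to-right raster scan. The function name `first_feature_point` is hypothetical.

```python
import numpy as np

def first_feature_point(I):
    """Return (x, y) of the first pixel with value 1 in raster order, or None."""
    coords_y, coords_x = np.nonzero(I == 1)   # Boolean condition matrix I == 1
    if coords_y.size == 0:
        return None                           # no feature pixel in the image
    return int(coords_x[0]), int(coords_y[0])
```

Because the scan starts at the top row, the returned pixel lies on the topmost row of the object, which places it on the outermost contour as long as the image contains a single connected object.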

After the starting point is determined, a precise search for the outermost contour is conducted from it. To achieve this, a path coordinate list is established to store the final contour coordinates, while a temporary coordinate stack is created to hold the points that are still to be examined. The starting point coordinates are pushed onto the stack and marked as visited to prevent redundant processing.

Next, contour searching is performed with the condition that the stack is non-empty. In each iteration, the top element of the stack is popped as the current contour point, and a 5×5 scanning window centered on this point is constructed to search for adjacent contour points. Within this scanning window, if a pixel's coordinates are within bounds and its intensity value matches that of the starting point, it is considered a contour extension point. All qualifying adjacent contour points are sequentially pushed onto the stack, marked as visited, and added to the contour coordinate list. This process continues until the stack becomes empty, indicating that the complete outermost contour has been successfully extracted.
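The stack-based search above can be sketched as follows. This is a minimal sketch under stated assumptions: the input is taken to be a binary image in which contour pixels share the starting point's value, points are (x, y) tuples, the starting point itself is included in the returned path, and the function name `trace_connected_path` is hypothetical.

```python
import numpy as np

def trace_connected_path(I, start, window=5):
    """Collect all pixels reachable from `start` through window-sized neighborhoods."""
    h, w = I.shape
    r = window // 2
    target = I[start[1], start[0]]           # intensity of the starting point
    visited = np.zeros((h, w), dtype=bool)
    visited[start[1], start[0]] = True
    stack = [start]                           # temporary coordinate stack
    path = [start]                            # final contour coordinate list
    while stack:
        x, y = stack.pop()
        # Scan the window centered on the current contour point.
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                nx, ny = x + dx, y + dy
                if (0 <= nx < w and 0 <= ny < h
                        and not visited[ny, nx] and I[ny, nx] == target):
                    visited[ny, nx] = True
                    stack.append((nx, ny))
                    path.append((nx, ny))
    return path
```

A 5×5 window reaches pixels up to two steps away, so the search can cross a one-pixel break in a line, which is useful for open curves such as dimension lines. Pixels farther than two steps from every visited point are never reached.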

Upon completing the contour search, a completely black mask image and a completely white background image of the same size as the original image are generated. The extracted outermost contour is then drawn on the mask, and the interior of the contour is filled to form a closed region representing the object. Subsequently, the mask is applied to the original image, where the pixel values within the mask region are copied from the original image to the background image, while the remaining areas remain white, thereby achieving precise segmentation of the object.
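The fill-and-copy step can be sketched without an image library by flood-filling the outside of the contour from the image border and treating everything not reached as the object region. This assumes the contour is closed and does not by itself cover the whole image border; the names `fill_contour` and `apply_mask` are hypothetical.

```python
import numpy as np

def fill_contour(shape, contour):
    """Return a boolean mask of the contour plus its interior.

    `contour` is an iterable of (x, y) points forming a closed curve.
    """
    h, w = shape
    edge = np.zeros((h, w), dtype=bool)
    for x, y in contour:
        edge[y, x] = True
    outside = np.zeros((h, w), dtype=bool)
    # Seed the flood fill with every border pixel.
    stack = [(x, y) for x in range(w) for y in (0, h - 1)]
    stack += [(x, y) for y in range(h) for x in (0, w - 1)]
    while stack:
        x, y = stack.pop()
        if 0 <= x < w and 0 <= y < h and not outside[y, x] and not edge[y, x]:
            outside[y, x] = True
            stack.extend([(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)])
    return ~outside

def apply_mask(image, mask):
    """Copy masked pixels from `image` onto a white background of the same size."""
    result = np.full_like(image, 255)
    result[mask] = image[mask]
    return result
```

The outside is filled with 4-connectivity, so a contour that is closed only through diagonal steps still separates the interior from the background.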

As shown in Figure 3, through the above method, the outermost contour of the object can be completely extracted while maintaining clear boundaries and ensuring segmentation accuracy.

Fig. 3

Contour results acquisition via pixel path search

Experiments and Analysis

Experimental Setup

To validate the effectiveness of our proposed method, we retrained the SAM (ViT-B) model on a domain-specific engineering diagram dataset. The dataset contains 4752 engineering drawings, including chip packaging layouts, component views, dimensional annotations, and auxiliary notes. Accurate segmentation masks were manually annotated, and the dataset was split into 80% training and 20% testing subsets. The base model employed for fine-tuning was SAM (ViT-B). Model optimization was performed using the AdamW optimizer with an initial learning rate of $1 \times 10^{-4}$, and a batch size of 16 was adopted to ensure stable convergence. The model was trained for a total of 50 epochs, balancing computational efficiency and segmentation accuracy. All training procedures were executed on an NVIDIA GeForce RTX 4090 GPU equipped with 24GB GDDR6X memory, leveraging CUDA acceleration to maximize hardware performance.

Evaluation Metrics

To comprehensively evaluate the performance of the proposed method on engineering diagram segmentation, we employed a set of complementary metrics that jointly measure pixel-level accuracy, object-level consistency, and fine-grained annotation preservation. The details are as follows:

Intersection over Union (IoU) and Dice Coefficient

IoU and Dice are widely adopted pixel-level metrics in segmentation tasks, quantifying the overlap between predicted masks and ground truth.

IoU is defined as the ratio between the intersection and union of predicted and ground truth pixels. A higher IoU indicates a larger proportion of correctly segmented pixels relative to the total area covered by both masks. The Dice coefficient places more emphasis on overlap by computing twice the intersection over the sum of pixels in both sets. Dice is particularly useful in imbalanced cases where foreground regions are small compared to the background.
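Both metrics follow directly from these definitions, IoU = |P ∩ G| / |P ∪ G| and Dice = 2|P ∩ G| / (|P| + |G|). The sketch below assumes masks are NumPy arrays that can be cast to boolean; returning 1.0 when both masks are empty is a convention chosen here, not something stated in the paper.

```python
import numpy as np

def iou(pred, gt):
    """Intersection over Union of two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0                      # both masks empty (assumed convention)
    return float(np.logical_and(pred, gt).sum() / union)

def dice(pred, gt):
    """Dice coefficient of two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    total = pred.sum() + gt.sum()
    if total == 0:
        return 1.0                      # both masks empty (assumed convention)
    return float(2 * np.logical_and(pred, gt).sum() / total)
```

For two 4-pixel masks that share 2 pixels, IoU is 2/6 while Dice is 4/8, which illustrates that Dice is never lower than IoU for the same pair of masks.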

Object Accuracy at Threshold t (Obj@t)

In addition to pixel-level scores, it is important to evaluate whether segmented instances align with the ground truth objects as a whole. We therefore employ the object accuracy metric, Obj@t, which measures the proportion of predicted instances whose IoU with a ground truth object exceeds a threshold t.

In our experiments, we report Obj@0.75, meaning that only predictions achieving IoU $\geq 0.75$ with ground truth are considered correct.
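One way to compute Obj@t is sketched below. The matching rule is an assumption: each predicted instance is scored by its best IoU over all ground truth objects, and no one-to-one assignment is enforced. The function names are hypothetical.

```python
import numpy as np

def mask_iou(a, b):
    """IoU of two boolean masks (0.0 if both are empty)."""
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union else 0.0

def object_accuracy(pred_masks, gt_masks, t=0.75):
    """Fraction of predicted instances whose best IoU with any ground truth is >= t."""
    if len(pred_masks) == 0:
        return 0.0
    hits = sum(
        1 for p in pred_masks
        if max((mask_iou(p, g) for g in gt_masks), default=0.0) >= t
    )
    return hits / len(pred_masks)
```

With this rule, a prediction that covers only half of its ground truth object (IoU 0.5) counts as a miss at t = 0.75 even though it contributes positively to the pixel-level scores.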

Small-Annotation Preservation Rate (SAPR)

Engineering diagrams contain fine-grained elements such as dimensional annotations, tolerance marks, and auxiliary symbols, which are often small in scale and easily missed during segmentation. To specifically measure the ability of a method to retain such information, we propose the Small-Annotation Preservation Rate (SAPR).

SAPR is defined as the ratio of correctly segmented small-scale annotation regions to the total number of ground truth annotations. This metric highlights whether the segmentation algorithm is capable of preserving details critical to engineering applications, beyond large-scale component structures.
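A sketch of how SAPR might be computed is given below. The text does not state when an annotation counts as "correctly segmented", so the criterion used here is an assumption: an annotation is preserved if at least a fraction `min_coverage` of its ground truth pixels falls inside the predicted mask. The function name and the parameter are hypothetical.

```python
import numpy as np

def sapr(pred_mask, annotation_masks, min_coverage=0.5):
    """Small-Annotation Preservation Rate under an assumed coverage criterion."""
    if len(annotation_masks) == 0:
        return 0.0
    preserved = 0
    for ann in annotation_masks:
        area = ann.sum()
        covered = np.logical_and(pred_mask, ann).sum()
        # Count the annotation as preserved if enough of it is covered.
        if area > 0 and covered / area >= min_coverage:
            preserved += 1
    return preserved / len(annotation_masks)
```

Because every annotation counts equally regardless of its pixel area, the metric is not dominated by the large component body in the way IoU and Dice are.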

By combining these metrics, we obtain a holistic evaluation framework: IoU and Dice capture pixel-level fidelity, Obj@t ensures instance-level correctness, and SAPR emphasizes the preservation of fine-grained details unique to engineering diagrams.

Comparison Results

Table 1 compares the following three methods: (1) SAM-Pretrained, which directly uses the original SAM weights trained on natural images; (2) SAM-Retrained, the SAM model retrained on the engineering diagram dataset; and (3) Ours, the SAM-Retrained model with our proposed post-processing modules.

\begin{table}
\centering
\caption{Quantitative comparison results on the engineering diagram test set}
\label{tab:placeholder}
\begin{tabular}{lcccc}
\toprule
Method & IoU & Dice & Obj@0.75 & SAPR\\
\midrule
SAM-Pretrained & 0.65 & 0.68 & 0.54 & 0.47\\
SAM-Retrained & 0.85 & 0.86 & 0.87 & 0.85\\
Ours & 0.84 & 0.87 & 0.93 & 0.94\\
\bottomrule
\end{tabular}
\end{table}

The results show that SAM-Retrained outperforms SAM-Pretrained on every metric, indicating the significant domain gap between natural images and engineering diagrams. Furthermore, applying our post-processing strategy on top of SAM-Retrained raises Obj@0.75 from 0.87 to 0.93 and SAPR from 0.85 to 0.94, a gain of 9 percentage points, while IoU and Dice stay within 0.01 of the retrained baseline. This demonstrates the effectiveness of our method in preserving fine-grained dimensional annotations while maintaining precise segmentation of engineering components.

To further validate that the proposed post-processing framework is not specifically tailored to SAM but can generalize to other segmentation architectures, we additionally evaluated our method on the U-Net++ model \cite{zhou2018unet++}.

We trained the U-Net++ model on the same engineering-diagram dataset described in Section 3.1 using identical preprocessing, augmentation, and optimization settings to ensure a fair comparison.

After obtaining the raw segmentation masks from U-Net++, we applied our morphological refinement + pixel-path contour search post-processing pipeline, identical to that used for SAM outputs. Quantitative results are summarized in Table 2.

\begin{table}
\centering
\captionsetup{justification=centering, singlelinecheck=false}
\caption{Quantitative comparison of U-Net++ and SAM models with and without the proposed post-processing}
\label{tab:placeholder}
\begin{tabular}{lcccc}
\toprule
Method & IoU & Dice & Obj@0.75 & SAPR\\
\midrule
U-Net++ \cite{zhou2018unet++} & 0.82 & 0.84 & 0.86 & 0.81\\
U-Net++ + Ours & 0.83 & 0.86 & 0.90 & 0.91\\
SAM-Retrained & 0.85 & 0.86 & 0.87 & 0.85\\
SAM-Retrained + Ours & 0.84 & 0.87 & 0.93 & 0.94\\
\bottomrule
\end{tabular}
\end{table}

Fig. 7

Comparison of experimental results

The results show that our post-processing module improves small-annotation preservation (SAPR from 0.81 to 0.91) and object-level consistency (Obj@0.75 from 0.86 to 0.90) compared with the raw U-Net++ predictions.

The improvements mirror those observed for SAM, confirming that the proposed morphological–contour refinement pipeline is model-agnostic and can effectively correct over-segmentation, refine fine-scale contours, and recover dimension lines across different segmentation backbones.

These consistent gains across two distinct segmentation architectures demonstrate the robust generalization of our post-processing strategy beyond SAM, underscoring its potential as a universal enhancement module for engineering-drawing segmentation tasks.

Visualized results

We compared our method against SAM, using multi-object integrity segmentation accuracy (including small-scale dimensional annotations) as the evaluation criterion. Figure 4(a) shows the segmentation results obtained by SAM, while Figure 4(b) presents the segmentation results of engineering drawings using our method.

We present qualitative results to highlight the capability of our method in preserving fine-grained annotations within engineering diagrams. Figure 5 illustrates several representative cases, where the top row shows segmentation results obtained by SAM, and the bottom row shows results generated by our method.

As can be observed, SAM often fails to accurately capture small-scale annotation regions such as dimension lines, tolerance indicators, and fine boundary symbols. These elements are either partially segmented or completely missed, which reduces the readability of the engineering diagram. In contrast, our method demonstrates a superior ability to retain such small objects, producing accurate and continuous masks even for thin dimension lines and densely packed annotation marks.

Our method applies targeted modifications to SAM's output, yielding substantial advantages in engineering component segmentation tasks. It effectively mitigates over-segmentation errors while resolving the critical issue

Conclusion

To address the suboptimal performance of the Segment Anything Model (SAM) in engineering drawing instance segmentation, this paper proposes an enhanced framework that integrates SAM with domain-specific morphological processing and a novel contour path search algorithm. The morphological operations reconstruct the local connectivity of small-scale structures, such as dimension annotations and fine contours, while the contour-based refinement guarantees geometric and topological integrity of segmented objects.

Fig. 8

Comparison of small-scale dimensional annotations

Extensive experiments on engineering drawings from multiple companies demonstrate that the proposed framework significantly improves segmentation accuracy, particularly in preserving small annotations and fine geometric details. More importantly, beyond the numerical improvements, our approach possesses strong engineering applicability. It can be directly embedded into EDA (Electronic Design Automation) or CAD-to-manufacturing pipelines to assist in tasks such as component view separation, dimensional data extraction, and intelligent schematic-to-layout conversion. By producing structurally consistent and annotation-preserving masks, the framework enhances the downstream automation level in PCB packaging design, mechanical drafting interpretation, and digital twin modeling.

In future work, we plan to extend the framework in two directions. First, we will explore a multi-modal fusion architecture that jointly incorporates OCR-based text understanding and semantic segmentation to achieve unified parsing of geometric and textual entities. Second, we will investigate vector-level representation learning, aiming to convert raster segmentation outputs into structured CAD primitives for seamless integration with existing industrial design systems.

Ultimately, this research contributes not only a novel technical solution but also a scalable foundation for the intelligent digitization of engineering documentation, paving the way for automated, data-driven design and manufacturing workflows.

\bibliography{sn-bibliography}

Statements and Declarations

Author Contributions

C.N. wrote the main manuscript text and conducted the experiments. L.W., N.X., and N.L. provided guidance and assistance for the experimental design and analysis. L.W. is the corresponding author and supervised the overall research. All authors reviewed and approved the final manuscript.

Ethics, Consent to Participate, and Consent to Publish

Not applicable.

Competing Interests

The authors declare that they have no competing interests.

Data Availability

The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgement

The authors would like to express their sincere appreciation to L.W., N.X., and N.L. for their valuable guidance, constructive suggestions, and technical support throughout the study. The authors also wish to thank the colleagues from the Wuhan University of Technology for their helpful discussions and assistance during the research. We are also grateful to the anonymous reviewers for their insightful comments and suggestions, which have greatly improved the quality of this work. This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

References:

Kirillov, Alexander and Mintun, Eric and Ravi, Nikhila and Mao, Hanzi and Rolland, Chloe and Gustafson, Laura and Xiao, Tete and Whitehead, Spencer and Berg, Alexander C and Lo, Wan-Yen and others (2023) Segment anything. 4015--4026, Proceedings of the IEEE/CVF international conference on computer vision

Dori, Dov and Tombre, Karl (1995) From engineering drawings to 3D CAD models: are we ready now?. Computer-Aided Design 27(4): 243--254 Elsevier

Minaee, Shervin and Boykov, Yuri and Porikli, Fatih and Plaza, Antonio and Kehtarnavaz, Nasser and Terzopoulos, Demetri (2021) Image segmentation using deep learning: A survey. IEEE transactions on pattern analysis and machine intelligence 44(7): 3523--3542 IEEE

Yu, Ying and Wang, Chunping and Fu, Qiang and Kou, Renke and Huang, Fuyu and Yang, Boxiong and Yang, Tingting and Gao, Mingliang (2023) Techniques and challenges of image segmentation: A review. Electronics 12(5): 1199 MDPI

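Huang, Peng and Zheng, Qi and Liang, Chao (2020) A survey of image segmentation methods (in Chinese). Journal of Wuhan University (Natural Science Edition) 66(6): 519--531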

Elyan, Eyad and Jamieson, Laura and Ali-Gombe, Adamu (2020) Deep learning for symbols detection and classification in engineering drawings. Neural networks 129: 91--102 Elsevier

Chen, Liang-Chieh and Papandreou, George and Schroff, Florian and Adam, Hartwig (2017) Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587

Zhang, Wentai and Joseph, Joe and Yin, Yue and Xie, Liuyue and Furuhata, Tomotake and Yamakawa, Soji and Shimada, Kenji and Kara, Levent Burak (2023) Component segmentation of engineering drawings using Graph Convolutional Networks. Computers in Industry 147: 103885 Elsevier

Carrara, Andrea and Nousias, Stavros and Borrmann, Andr{\'e} (2025) Vectorgraphnet: Graph attention networks for accurate segmentation of complex technical drawings. Journal of Computing in Civil Engineering 39(6): 04025085 American Society of Civil Engineers

Moreno-Garc{\'\i}a, Carlos Francisco and Johnston, Pam and Garkuwa, Bello (2020) Pixel-based layer segmentation of complex engineering drawings using convolutional neural networks. IEEE, 1--7, 2020 International joint conference on neural networks (IJCNN)

Cheng, Junlong and Fu, Bin and Ye, Jin and Wang, Guoan and Li, Tianbin and Wang, Haoyu and Li, Ruoyu and Yao, He and Cheng, Junren and Li, JingWen and others (2025) Interactive medical image segmentation: A benchmark dataset and baseline. 20841--20851, Proceedings of the Computer Vision and Pattern Recognition Conference

Osco, Lucas Prado and Wu, Qiusheng and De Lemos, Eduardo Lopes and Gonçalves, Wesley Nunes and Ramos, Ana Paula Marques and Li, Jonathan and Junior, José Marcato (2023) The segment anything model (sam) for remote sensing applications: From zero to one shot. International Journal of Applied Earth Observation and Geoinformation 124: 103540 Elsevier

Chen, Keyan and Liu, Chenyang and Chen, Hao and Zhang, Haotian and Li, Wenyuan and Zou, Zhengxia and Shi, Zhenwei (2024) RSPrompter: Learning to prompt for remote sensing instance segmentation based on visual foundation model. IEEE Transactions on Geoscience and Remote Sensing 62: 1--17 IEEE

Li, Yaqin and Wang, Dandan and Yuan, Cao and Li, Hao and Hu, Jing (2023) Enhancing agricultural image segmentation with an agricultural segment anything model adapter. Sensors 23(18): 7884 MDPI

Wu, Junde and Wang, Ziyue and Hong, Mingxuan and Ji, Wei and Fu, Huazhu and Xu, Yanwu and Xu, Min and Jin, Yueming (2025) Medical sam adapter: Adapting segment anything model for medical image segmentation. Medical image analysis 102: 103547 Elsevier

Jamieson, Laura and Elyan, Eyad and Moreno-García, Carlos Francisco (2024) Few-Shot Symbol Detection in Engineering Drawings. Applied Artificial Intelligence 38(1): 2406712 Taylor & Francis

Yang, Lvyang and Zhang, Jiankang and Li, Huaiqiang and Ren, Longfei and Yang, Chen and Wang, Jingyu and Shi, Dongyuan (2024) A comprehensive end-to-end computer vision framework for restoration and recognition of low-quality engineering drawings. Engineering applications of artificial intelligence 133: 108524 Elsevier

Song, Zhiwei and Yao, Hui and Tian, Dan and Zhan, Gaohui and Gu, Yajing (2025) Segmentation method of U-net sheet metal engineering drawing based on CBAM attention mechanism. AI EDAM 39: e14 Cambridge University Press

Zhou, Hengyu and Zhang, Hui and Wang, Bin (2024) Segmentation-Guided Layer-Wise Image Vectorization with Gradient Fills. Springer, 165--180, European Conference on Computer Vision

Nguyen, Van Nguyen and Groueix, Thibault and Ponimatkin, Georgy and Lepetit, Vincent and Hodan, Tomas (2023) Cnos: A strong baseline for cad-based novel object segmentation. 2134--2140, Proceedings of the IEEE/CVF International Conference on Computer Vision

Luo, Ruifeng and Liu, Zhengjie and Cheng, Tianxiao and Wang, Jie and Wang, Tongjie and Wei, Xingguang and Wang, Haomin and Li, YanPeng and Chai, Fu and Cheng, Fei and others (2025) ArchCAD-400K: An Open Large-Scale Architectural CAD Dataset and New Baseline for Panoptic Symbol Spotting. arXiv preprint arXiv:2503.22346

Tang, Ziqiang and Han, Chao and Li, Hongwu and Fan, Zhou and Sun, Ke and Huang, Yuntian and Chen, Yuhang and Wang, Chenxing (2025) Vector Extraction from Design Drawings for Intelligent 3D Modeling of Transmission Towers. Computers, Materials & Continua 82(2)

Jamieson, Laura and Moreno-Garcia, Carlos Francisco and Elyan, Eyad (2025) Towards fully automated processing and analysis of construction diagrams: AI-powered symbol detection. International Journal on Document Analysis and Recognition (IJDAR) 28(1): 71--84 Springer

Yang, Fuyi and Mu, Jiazuo and Zhang, Yanshun and Zhang, Mingqian and Zhang, Junxiong and Luo, Yongjian and Xu, Lan and Yu, Jingyi and Shi, Yujiao and Zhang, Yingliang (2024) Cadspotting: Robust panoptic symbol spotting on large-scale cad drawings. arXiv preprint arXiv:2412.07377

Kim, Byungsoo and Wang, Oliver and Öztireli, A Cengiz and Gross, Markus (2018) Semantic segmentation for line drawing vectorization using neural networks. Computer Graphics Forum 37(2): 329--338 Wiley Online Library

Wang, Zhenyu and Huang, Jianxi and Sun, Zhida and Gong, Yuanhao and Cohen-Or, Daniel and Lu, Min (2025) Layered image vectorization via semantic simplification. 7728--7738, Proceedings of the Computer Vision and Pattern Recognition Conference

Lin, Yi-Hsin and Ting, Yu-Hung and Huang, Yi-Cyun and Cheng, Kai-Lun and Jong, Wen-Ren (2023) Integration of deep learning for automatic recognition of 2D engineering drawings. Machines 11(8): 802 MDPI

Tian, Dan and Yao, Hui and Song, Zhiwei and Zhan, Gaohui and Wang, Zhijie and others (2022) Improved Parts Drawing Segmentation Method Based on U-net. J. Image Process. Theory Appl 5(1): 52--58

Zhou, Zongwei and Rahman Siddiquee, Md Mahfuzur and Tajbakhsh, Nima and Liang, Jianming (2018) Unet++: A nested u-net architecture for medical image segmentation. Springer, 3--11, International workshop on deep learning in medical image analysis

Yurtkulu, Salih Can and Şahin, Yusuf Hüseyin and Unal, Gozde (2019) Semantic segmentation with extended DeepLabv3 architecture. IEEE, 1--4, 2019 27th signal processing and communications applications conference (SIU)
