A Review of AI Classification Models for Lithological Characterization in Laqlouq, Lebanon.
INTRODUCTION
Lithological discrimination is crucial for optimal resource extraction in fields such as civil engineering, hydrogeology, and the oil industry. The first phase of lithological discrimination began with Lebanon's oldest geological map by Austrian J. Russegger in 1843, at a scale of 1:290,000. This was followed by an enhanced version produced by C. Diener in 1886 at a scale of 1:500,000. Godefroy Zumoffen created a more detailed geological map at a scale of 1:200,000 in 1926 (Zumoffen, 1926). The second phase of lithological discrimination and geological exploration commenced in January 1928 with field surveys led by Louis Dubertret, known as Lebanon's modern geology pioneer (Hakim, 1986). Dubertret, along with colleagues and students, conducted surveys and published results across articles, maps, and reports from 1934 to 1975. The comprehensive geological map of Lebanon at a scale of 1:200,000 and its accompanying report were published in 1955, during which Dubertret developed a geological database at a scale of up to 1:50,000 (Hakim, 1986). Currently, geological mapping across Lebanon continues at larger scales. Youssef et al. (2023) created a 1:5,000 geological map series for the Laqlouq area, representing the third phase, which utilized Artificial Intelligence (AI) and satellite imagery. This era has benefited from numerous tools and missions, which have dramatically advanced geological mapping using remote sensing data (Bishop, 1995). Van der Meer et al. (2014) were the first to assess Sentinel images for geological remote sensing, which historically began with low-resolution Landsat 1 images. They compared Sentinel-2A bands with those of ASTER and found that the Sentinel bands are suitable for band ratios previously designed for ASTER and Landsat.
Machine learning provides new opportunities for effective and efficient remote sensing algorithms and methods. Its strength lies in its ability to handle high-dimensional data and classify complex map features (Maxwell et al., 2018). Remote sensing techniques are increasingly used to distinguish lithology, locate mineral deposits, and create geological maps at various scales (Rowan, 2006; Gad & Kusky, 2007). The most common remote sensing analysis methods for geological research and lithological discrimination are band ratios (Pour & Hashim, 2012a; 2012b), principal component analysis (PCA) (Rajesh, 2008), and image classification (Gibson & Power, 2000; Ngcofe & Van Niekerk, 2016).
Several authors have utilized the short-wave infrared (SWIR), near-infrared (NIR), and visible portions of the electromagnetic spectrum for lithological discrimination (Rowan et al., 2006; Gad & Kusky, 2007). Melhorn and Sinnock (1973) applied a classification scheme to identify lithology (Hewson et al., 2002). To better understand the ability of ASTER data in delineating regolith and alteration areas in Australia, an improved geological map was created using ASTER data. Using the band ratio technique, Cudahy and Hewson (2002) detected minerals from the skarn, epithermal, and porphyry groups. Rowan and Mars (2003) demonstrated early on that lithological mapping was feasible in well-exposed areas. Mineralogy was mapped using ASTER radiance imagery by Hewson et al. (2005). Qari et al. (2008) generated a 1:100,000 geological map of the basement rocks exposed over the Arafat area, Western Arabian Shield, Saudi Arabia, using ASTER data (Arivazhagan & Anbazhagan, 2017; Hewson et al., 2015).
The primary goal of this study is to test and assess the application of modern techniques in lithologic mapping using satellite images for the Laqlouq region, based on field observations and remote sensing data. Artificial intelligence lithofacies classification methods were tested by applying and processing different spatial resolutions of ASTER, Landsat OLI, and Sentinel multispectral imagery of the Laqlouq study area. To enhance lithological discrimination capabilities, digital processing techniques, including band ratio, principal component analysis (PCA), Minimum Noise Fraction (MNF), and spectral unmixing (SU), were employed as inputs to the Neural Network (NN) and Support Vector Machine (SVM) methods. The capability of surface lithological mapping is improved by the growing number of bands in the SWIR region (Gomez et al., 2005). To assess accuracy, the NN and SVM classifications were compared with the updated 1:5,000 geological map of the Laqlouq study area, which was constructed by Youssef et al. (2023). Findings were validated using the 1:5,000 geological map and localized ground-truth ROIs with GPS coordinates of outcrops (field-collected samples).
STUDY AREA
Laqlouq, a Lebanese region, spans the Ibrahim River basin in the South and the Jaouz River basin in the North. It is situated on a high plateau, with an average altitude of 1750 m, bounded to the east by the mountainous massif of Jord El Aaqoura at 2187 m above sea level and to the west by the bastion of Jaj mountain at 1959 m. The region receives abundant rainfall, with an average annual rainfall of 1700 mm from December to March (Hakim, 1985). The annual average temperature is 11.2°C, with January being the coldest month (3°C) and August the hottest (19.1°C).
The Laqlouq region encompasses the entire Jurassic and Cretaceous stratigraphic scale of Lebanon (Secondary Era) and facilitates lithological discrimination using remote sensing and artificial intelligence methods over a small area of 55 km2. The geological structure in the study area is limited by two flexures on the western and eastern ends (Fig. 1). The geological map of the study area depicts the boundaries of the rock outcrops based on a precise ground survey, highlights the main tectonic accidents and faults, records the dips and strikes, and proposes an East-West geological section displaying the stratigraphy and the structure of the region. The geological fieldwork in the Laqlouq region yielded a geological map at a scale of 1:5,000 (Fig. 1), synthesized based on the research and investigations of Youssef et al. (2023) and Doumit and Ghanem (2021). The rocks at the outcrop belong to the Jurassic and Cretaceous of Lebanon. They begin with the dolomites of the lower part of the Middle Jurassic (J4a) and extend to the Cenomanian (C4), east of Jord El Aaqoura - Tannourine. The Middle Jurassic or Callovian (J4) at the western end (Fig. 1) is subdivided into two parts. The lower part (J4a) is dolomitic (thickness > 600 m). The upper part (J4b) is formed of limestones and dolomitic limestones with a thickness of more than 250 m, and includes a layer of basalt at the Baatara sinkhole. The Oxfordian (J5) is exposed in a small area to the west of the study area (Fig. 1), forming an impermeable base on which the entire Laqlouq region rests.
The Oxfordian is composed of various volcanic materials, including blackish basalts, purple cinerites, tuffs, and volcanic breccias, with a thickness ranging from 50 to 200 m. The Kimmeridgian (J6) is a limestone bar, approximately 50 m thick, caught by the flexure of the Jabal Jaj (Fig. 1), with dips approaching vertical.
The Neocomian (C1) and (BC1) consist of ferruginous sandstone and yellow, ocher, or white quartz sands, non-fossiliferous, and of granitic origin. Its thickness is variable and ranges from 100 m. The lower Aptian (C2a) occupies a large area in the center-west and south of Fig. 1. It begins with "fossiliferous" sandstones that alternate with sandstone marls, sandstone limestones, and yellowish limestones arranged in small banks. Its thickness is approximately 150 m. The upper Aptian (C2b) is formed of gray limestones, about 50 m thick, forming a cliff (the Falaise de Blanche). It occupies a large area in the center of Fig. 1. It outcrops everywhere in Laqlouq (geological section), west of Aaqoura and Dahr El Mghara, where it clearly outlines the lower hinge of the flexure of Jabal Jaj - Ehmej. The basaltic summit of the Upper Aptian (BC2b) is everywhere surmounted by a layer of various volcanic materials, 150 m thick, strongly altered into blackish, impermeable clay formations that host many hill lakes. The Albian (C3) consists of marl-limestone and impermeable friable green marls, 100 m thick, which outcrop around Jabal El Laqlouq and to the east of Fig. 1, at the foot of Jord El Aaqoura and Tim Rtiba.
The Cenomanian (C4), which forms the bastion of Jabal Aaqoura and Chellale, dominates the entire Laqlouq region and plunges eastwards to form the flexure of Aaqoura - Laqlouq (Fig. 1). It is a massif formed of carbonate rocks exceeding 600 m thick. The Cenomanian (C4) is subdivided into three lithological groups:
The Lower Cenomanian (C4a), with a dolomitic tendency (100 m thick);
The Middle Cenomanian (C4b), marly limestone (150 m thick); and
The Upper Cenomanian (C4c), limestone (250 m thick).
The Quaternary Subactual (qo) period experienced landslides and mudslides in the Marj Rima-Aaqoura region, which received heavy rainfall, causing the entire slope to slide toward the neighboring Rouaiss valley. The most prominent tectonic features in the study area are two east-facing flexures: the flexure of Aaqoura - Tim El Obour and the flexure of Jabal Jaj-Ehmej. These flexures run in a meridional (north-south) direction (as shown on the geological map), with displacements of 400 m for the first flexure and 700 m for the second.
The Laqlouq region receives an average annual rainfall of 1600 mm. It is subject to extensive erosion, which has left its correlative deposits almost everywhere on the plateau, particularly around Jabal Saidet El Qarn (Fig. 1). Here, the dolomitic cliff of the Lower Cenomanian (C4a) has almost disappeared due to mass movements and detachment of large packages that are visible on the ground.
MATERIALS AND METHODS
The study utilized satellite images, including ASTER, Landsat, Sentinel, and high-resolution Pleiades images, acquired from the US Geological Survey's EROS Center (USGS, 2018), to refine lithological boundaries. The following were used:
ASTER level 1T scenes dated September 2001, selected due to the significance of SWIR bands in lithological discrimination. The old ASTER archive covering the study area was chosen, featuring Visible and Near Infrared (VNIR) data with 15 m spatial resolution, comprising three bands (Green, Red, and Near Infrared), and six SWIR bands at 30 m spatial resolution. The ASTER SWIR detectors have suffered from anomalously high temperatures since April 2008, so SWIR bands acquired after that date should not be used.
Landsat 8 OLI level 1T multispectral imagery dated 19 November 2014, which gathers data in the visible and SWIR regions and is of interest for geological applications (Mwaniki et al., 2015).
Sentinel-2A level-1C image dated 11 February 2017 with thirteen spectral bands from the VNIR to the SWIR at three spatial resolutions: 10, 20, and 60 m (Drusch et al., 2012).
High-resolution Pleiades-1A orthorectified image at 0.5 m resolution with four spectral bands (blue, green, red, and IR).
a- Preprocessing
All the VNIR ASTER bands were adjusted for atmospheric effects and resampled to a 30 m resolution. The anomalously high radiance levels in bands 5 and 9 can be attributed to atmospheric effects and to instrument crosstalk, in which energy leaks from the band 4 optical elements to the detectors of neighboring bands (Sheikhrahimi et al., 2019). This correction was completed using the crosstalk correction program available at www.gds.aster.ersdac. The nearest neighbor resampling method was applied to resample the 15 m resolution VNIR ASTER data to match the 30 m spatial dimensions of the SWIR. The resulting nine-band ASTER image was created by combining the three VNIR bands with the six SWIR bands. Finally, the crosstalk-corrected nine-band ASTER image and the Landsat 8 OLI and Sentinel images were calibrated to surface reflectance.
The level L1T OLI image was collected on November 6, 2013. Like the ASTER image, the OLI image was georeferenced to the World Geodetic System.
Sentinel bands from the ESA Sentinel Hub have already undergone radiometric and geometric corrections. The "SEN2COR L2A" processor, included in the Sentinel-2 SNAP toolkit and available at http://step.esa.int/main/, was used to perform the atmospheric correction. Each granule produced a corrected image at "level 2A" with the reflectance at the bottom of the atmosphere (B.O.A.). Using the nearest neighbor method, the 20 m resolution bands (5, 6, 7, 8a, 11, and 12) were resampled to 10 m resolution. Then, all bands for each granule (2, 3, 4, 5, 6, 7, 8, 8a, 11, and 12) were analyzed. To acquire spatially and radiometrically corrected images for spectral data analysis and comparison, image pre-processing is crucial.
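The nearest neighbor resampling of the 20 m bands to 10 m can be illustrated with a minimal sketch. It assumes an exact factor of two and perfectly aligned grids, so each coarse pixel is simply replicated into a 2 x 2 block; the function name `upsample_nearest` and the sample reflectance values are hypothetical and are not taken from the SNAP toolkit.

```python
def upsample_nearest(band, factor=2):
    """Replicate every pixel into a factor x factor block (nearest neighbor)."""
    out = []
    for row in band:
        # Repeat each value along the column direction.
        expanded = [value for value in row for _ in range(factor)]
        # Repeat the expanded row along the row direction.
        for _ in range(factor):
            out.append(list(expanded))
    return out


# Hypothetical 2 x 2 tile of a 20 m band (reflectance values are made up).
band_20m = [[0.10, 0.20],
            [0.30, 0.40]]
band_10m = upsample_nearest(band_20m, factor=2)  # 4 x 4 tile on the 10 m grid
```

Nearest neighbor resampling does not create new reflectance values, which is usually the reason it is preferred before spectral classification.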
The methodology involved the use of the PCA method to analyze images obtained from ASTER, Landsat, and Sentinel. A composite image was created from the PCA output to enhance the ability to distinguish different lithological types within the area of interest. The generated spectra of the image were then used as endmembers to categorize the image using the Spectral Angle Mapping (SAM) approach (Arivazhagan & Anbazhagan, 2017).
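As an illustration of the SAM approach mentioned above, the following sketch computes the angle between a pixel spectrum and each endmember spectrum and assigns the pixel to the endmember with the smallest angle. The endmember names and reflectance values are hypothetical placeholders, not spectra from this study, and the helper names are assumptions.

```python
import math


def spectral_angle(pixel, reference):
    """Angle (radians) between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_angle)


def classify_sam(pixel, endmembers):
    """Return the endmember name with the smallest spectral angle."""
    return min(endmembers, key=lambda name: spectral_angle(pixel, endmembers[name]))


# Hypothetical three-band endmember spectra.
endmembers = {
    "limestone": [0.30, 0.35, 0.40],
    "basalt": [0.10, 0.08, 0.06],
}
# A darker pixel with the same spectral shape as the limestone endmember.
label = classify_sam([0.15, 0.175, 0.20], endmembers)
```

Because the angle ignores vector length, a shaded pixel with the same spectral shape as an endmember is still matched to it.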
b- Image processing
PCA is a multivariate statistical method that transforms correlated bands into uncorrelated linear combinations (principal components), each successive component capturing a smaller share of the variance. The first PCA band contained the most significant percentage variance in the data, followed by the second PCA band, while the last PCA bands contained noise or slight variance (Chang et al., 2006). PCA is well established for its effectiveness in lithological and alteration mapping, as well as in the analysis of correlated multidimensional data (Massironi et al., 2008; Moore et al., 2008; Amer et al., 2010). It is used to extract the relevant information from the various bands (Gasmi et al., 2022).
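A minimal sketch of this transform, assuming NumPy is available and the image is held as a (rows, cols, bands) array, is given below. It uses the covariance matrix of the bands; the function name `pca_bands` and the synthetic three-band cube are illustrative assumptions, not the software workflow used in the study.

```python
import numpy as np


def pca_bands(cube):
    """Return the principal-component cube and the percent variance per component."""
    rows, cols, bands = cube.shape
    pixels = cube.reshape(-1, bands).astype(float)
    centered = pixels - pixels.mean(axis=0)
    cov = np.cov(centered, rowvar=False)          # band-by-band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)        # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]             # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs                   # project pixels onto the PCs
    percent = 100.0 * eigvals / eigvals.sum()
    return scores.reshape(rows, cols, bands), percent


# Synthetic cube: two strongly correlated bands plus one low-variance band.
rng = np.random.default_rng(0)
base = rng.normal(size=(20, 20))
cube = np.stack(
    [base,
     0.9 * base + 0.1 * rng.normal(size=(20, 20)),
     rng.normal(scale=0.05, size=(20, 20))],
    axis=-1,
)
pc_cube, percent = pca_bands(cube)
```

The `percent` vector corresponds to the kind of eigenvalue percentages reported per component, and the first few bands of `pc_cube` are the ones typically combined into a color composite.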
A variation of PCA known as the minimum noise fraction (MNF) orders the data based on the signal-to-noise ratio rather than variance (Chen, 2000; Bertels et al., 2005; Ngcofe & Van Niekerk, 2016). Green et al. (1988) devised this transform because PCA does not always produce images whose quality decreases steadily as the component number increases; the MNF transform instead orders the components by signal-to-noise ratio. According to Lee et al. (1990), MNF uses a two-stage procedure: the noise is first whitened to unit variance, and PCA is then applied to the noise-whitened data. MNF is also known as the noise-adjusted principal component (NAPC) transform, as noted by Gasmi et al. (2022).
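The two-stage procedure can be sketched as follows, assuming NumPy and a (rows, cols, bands) array. The noise covariance is estimated here from differences between horizontally adjacent pixels, which is one common choice and an assumption of this sketch; the function name `mnf_transform` and the synthetic data are hypothetical, and commercial implementations may estimate the noise differently.

```python
import numpy as np


def mnf_transform(cube):
    """Noise-whiten the data, then apply PCA (noise-adjusted principal components)."""
    data = cube.astype(float)
    rows, cols, bands = data.shape
    centered = data.reshape(-1, bands) - data.reshape(-1, bands).mean(axis=0)
    # Assumption: noise is estimated from differences of horizontal neighbors;
    # the difference of two independent noise samples has twice the noise variance.
    diff = (data[:, 1:, :] - data[:, :-1, :]).reshape(-1, bands)
    noise_cov = np.cov(diff, rowvar=False) / 2.0
    # Stage 1: whiten the noise to unit variance.
    n_vals, n_vecs = np.linalg.eigh(noise_cov)
    whitener = n_vecs / np.sqrt(n_vals)           # scales each eigenvector column
    whitened = centered @ whitener
    # Stage 2: PCA on the noise-whitened data, ordered by decreasing eigenvalue.
    w_vals, w_vecs = np.linalg.eigh(np.cov(whitened, rowvar=False))
    order = np.argsort(w_vals)[::-1]
    transform = whitener @ w_vecs[:, order]
    components = (centered @ transform).reshape(rows, cols, bands)
    return components, w_vals[order], transform, noise_cov


# Synthetic cube: a smooth gradient observed with three different noise levels.
rng = np.random.default_rng(1)
signal = np.outer(np.ones(30), np.linspace(0.0, 1.0, 30))
cube = np.stack(
    [signal + rng.normal(scale=s, size=(30, 30)) for s in (0.01, 0.05, 0.2)],
    axis=-1,
)
components, ratios, transform, noise_cov = mnf_transform(cube)
```

In this formulation each value in `ratios` is the total variance of a component expressed in units of noise variance, so values close to one indicate components dominated by noise.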
For further analysis of multispectral and hyperspectral images, such as identifying endmember spectra for spectral mixture analysis, several MNFs (or NAPCs) are chosen to maximize the signal-to-noise ratio (Richards, 1999; Chuvieco, 2016). The primary goal of spectral unmixing is to identify the materials (endmembers) present in each mixed pixel and their proportions, referred to as abundances (Keshava). The unmixing procedure consists of two key stages: endmember extraction and abundance inversion (Bioucas et al., 2012).
In this study, the ENVI implementation of a standard linear deconvolution method was used for the unmixing procedures. ASTER, Landsat, and Sentinel satellite images were tested for their suitability for spectral unmixing, and a linear algorithm was used to produce fraction images for important geological endmembers in the study area. By characterizing surface materials using fraction images, geological classes and cover materials, such as plants that would otherwise have mixed spectral contributions, can be separated at sub-pixel resolutions (Leverington, 2005).
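The linear mixing model behind fraction images can be sketched with an unconstrained least-squares inversion, assuming NumPy. This is an illustrative stand-in and not the ENVI implementation, which may apply additional constraints; the function name `unmix_linear` and the endmember spectra are hypothetical.

```python
import numpy as np


def unmix_linear(pixel, endmembers):
    """Estimate abundances a such that pixel ~ sum_k a[k] * endmembers[k]."""
    # Columns of the design matrix are the endmember spectra.
    design = np.asarray(endmembers, dtype=float).T      # shape (bands, endmembers)
    abundances, *_ = np.linalg.lstsq(design, np.asarray(pixel, dtype=float), rcond=None)
    return abundances


# Hypothetical four-band endmember spectra.
endmembers = [
    [0.30, 0.35, 0.40, 0.45],   # e.g. a carbonate surface
    [0.10, 0.08, 0.06, 0.05],   # e.g. a basaltic surface
]
# A noise-free pixel mixed 70% / 30% from the two endmembers.
mixed = [0.7 * a + 0.3 * b for a, b in zip(*endmembers)]
abundances = unmix_linear(mixed, endmembers)
```

Applying the inversion to every pixel yields one fraction image per endmember. With real, noisy data the unconstrained solution can produce negative or non-unit-sum abundances, which is why constrained variants are often preferred.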
According to the geological maps of the research area, the spectral fingerprints of one homogeneous and representative surface were chosen within each lithological unit (Fig. 2). The training data (or endmember), which are image data from ASTER, Landsat, and Sentinel of each surface, correspond to certain classes of substratum attributes (Gasmi et al., 2022).
Image analysis and familiarity with the surfaces in the research region play a significant role in the supervised classification process. Supervised classification was employed, with the training data collected using the AOI tools. Training data were acquired from the output maps or from actual fieldwork, where different surface classes were detected, and their geographic locations were determined for lithological mapping purposes. The AOI technique is used in regions to collect training samples with greater accuracy. More than seventeen training sample units were gathered from various locales.
c- Image classification techniques
Classification involves assigning pixels with similar spectral characteristics to specific classes, each represented by a distinct color. In a supervised classification, for example, different rocks can be identified by color: granite gneiss appears pink, serpentinite green, talc carbonate light green, metagabbro purple, metavolcanics yellow, melange blue, and late tectonic gabbro sky blue. Two types of granite can be distinguished, with syn-tectonic granites appearing red and late tectonic granites orange. The classification process involves multiple steps (Kamel and Abu El Ella, 2016).
One critical factor that affects the accuracy of the classification is the extraction of endmembers, which are used as training data. However, the geographical heterogeneity of the signatures and pixel-scale impurities limited the applicability of the Pixel Purity Index (PPI) approach in this study. As a result, the conventional technique was employed, based on the visual interpretation of images (DS and MNF) and the extreme classes from the scatter plot. The two classification approaches with the highest degree of accuracy, neural networks and SVM, were selected for lithofacies mapping, based on the highest statistical likelihood between the selected ROIs.
Remote sensing is increasingly focusing on machine learning classification, as it can simulate complex class signatures, accept a variety of predictor data as input, and make no assumptions about the data distribution. Various studies have shown that these techniques yield higher accuracy than conventional parametric classifiers (Pal and Mather, 2005; Ghimire et al., 2012; Maxwell et al., 2018).
It is well-known that feedforward backpropagation neural networks are efficient algorithms for image categorization (Lu & Weng, 2007; Wang et al., 2008). Neural networks are particularly suitable for mapping geological materials, as specific geological groups are often characterized by significant variations in reflectance qualities due to spatial inconsistencies in mineralogy, chemical alteration, and surface exposure (Leverington, 2010).
Unlike the maximum likelihood approach, neural network classifiers do not involve distribution models for training data parameterization. This allows for the categorization of irregular, multimodal distributions in training databases. The ability of neural networks to generalize information and tolerate noise is their main characteristic. Neural networks are often referred to as "black boxes" because their predictions are difficult to understand. The architecture used to arrange computing units affects how well neural networks perform (Haykin, 2005).
Based on statistical learning theory, SVM is one of the most popular and robust supervised learning techniques, which can be employed for classification and regression-related use cases. It constructs a hyperplane that maximizes the distance between the two classes while optimally separating them. This approach is built on two key concepts:
Maximum margin (Fig. 3): The margin is the separation between the decision boundary and the nearest samples. The samples closest to the decision boundary are called support vectors (Oommen et al., 2008), and they constrain the width of the margin.
Kernel: A kernel function transforms the representation space of the input data into a higher-dimensional space. Four kernels are commonly used with SVM: linear, polynomial, radial basis function (RBF), and sigmoid. The RBF kernel is the most widely used. RBF kernels with standard parameters (regularization parameter C and kernel width g) were used in this study for SVM categorization (Gasmi, 2016).
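To make the role of the kernel width concrete, the sketch below evaluates the usual RBF form k(x, y) = exp(-g * ||x - y||^2) for pairs of pixel spectra. The function name `rbf_kernel`, the value of g, and the spectra are hypothetical; the regularization parameter C acts during training and does not appear in the kernel itself.

```python
import math


def rbf_kernel(x, y, g):
    """Radial basis function kernel: exp(-g * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-g * sq_dist)


# Hypothetical three-band spectra.
limestone_a = [0.30, 0.35, 0.40]
limestone_b = [0.31, 0.36, 0.41]
basalt = [0.10, 0.08, 0.06]

g = 5.0  # assumed kernel width, not the value used in the study
similar = rbf_kernel(limestone_a, limestone_b, g)
dissimilar = rbf_kernel(limestone_a, basalt, g)
```

Similar spectra give kernel values close to one and dissimilar spectra give smaller values; a larger g makes the similarity fall off faster with spectral distance.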
In unvegetated areas, many lithologic units differ based on their topographic form and spectral properties. Identification is much more difficult in vegetated areas and bodies of water, such as lakes, because the rock surface is hidden, and some of the more subtle aspects of changes in vegetative cover must be taken into consideration. Green areas (8.5 km2) and lakes (a total of 190) occupy 3.68% and 0.18% of the study area, respectively. To ensure accuracy, the classified images were masked to exclude vegetated areas and water bodies.
RESULTS AND DISCUSSIONS
Among all the principal components of the three images — ASTER, Landsat, and Sentinel — the important components were selected. For ASTER, PC1 and PC2 accounted for approximately 98% of the information, and these two components were the most significant (Table 1). The neglected information with zero values was in PC4-PC9. For Landsat, PC1 accounted for 59% and PC2 for 33%, while the neglected bands, PC5 and PC6, accounted for 0%. The first two components, PC1 and PC2, also had a higher percentage of information in Sentinel. The selected band combination for ASTER was the first three bands, PC1, PC2, and PC3, while for Landsat and Sentinel, the first four bands, PC1, PC2, PC3, and PC4, were selected.
Table 1 presents the eigenvalue percentages for the PCA and MNF bands.

| PCA eigenvalue percentage | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 |
|---|---|---|---|---|---|---|---|---|---|
| ASTER | 93 | 5 | 2 | 0 | 0 | 0 | 0 | 0 | 0 |
| Landsat | 59 | 33 | 6 | 2 | 0 | 0 | - | - | - |
| Sentinel | 66 | 26 | 4 | 3 | 0 | 0 | - | - | - |

| MNF eigenvalue percentage | MNF1 | MNF2 | MNF3 | MNF4 | MNF5 | MNF6 | MNF7 | MNF8 | MNF9 |
|---|---|---|---|---|---|---|---|---|---|
| ASTER | 60 | 16 | 7 | 5 | 3 | 3 | 3 | 2 | 2 |
| Landsat | 38 | 21 | 18 | 11 | 7 | 5 | - | - | - |
| Sentinel | 66 | 12 | 9 | 7 | 3 | 3 | - | - | - |
Based on the covariance matrix, the results obtained from the PCA estimation are shown in Fig. 3a, d, and g. The composite image of PCA using the bands with eigenvalue percentages (Table 1) matched well with the 1:5,000 geological map of the Laqlouq area (Fig. 1). The first three principal components of the PCA differentiated the geological groups, Cretaceous, Paleogene, Miocene-Lower Quaternary, and Middle Quaternary-Actual. The response of each principal component was compared to a geological map containing seventeen lithological units (Fig. 1). All these components (PCs) could provide the best discrimination between lithological units. This dataset contains a large amount of information, including geologic, topographic, and roughness data. Geological information can then be extracted more easily from the principal components than from the initial ASTER data (Gomez et al., 2005).
The examination of the first six components (PCs) enables the investigation of some spectral or textural homogeneous surfaces specific to lithologic properties, which can be compared to previously mapped geological formations. Thus, the boundaries of lithological layers can be corrected, created, or eliminated.
The ASTER MNF analysis revealed that MNF1 and MNF2 had the highest eigenvalue percentages, while the remaining components showed a decreasing trend. Typically, the first few MNF bands contain the most information, while subsequent bands contain more noise. To enhance subsequent spectral processing results, MNF components with eigenvalues less than one are commonly excluded from the data as noise (Jensen, 2005). Here, no MNF bands were neglected: all nine ASTER components and all six components each of Landsat and Sentinel were retained. The interpreted MNF components were assigned to the RGB band combinations of ASTER, Landsat, and Sentinel (Fig. 3b, e, and h). MNF components 1, 2, and 3 contained the least noise and were assigned to RGB band combinations to produce color composites for the best visualization output.
The unmixing (UNMIX), PCA, and MNF images generated from ASTER, Landsat, and Sentinel bands were classified into fourteen lithological classes using the NN and SVM methods. The classification generated 18 lithological raster layers:
Six generated from ASTER imagery: MNF-based neural network (AST_MNF_NN) and SVM (AST_MNF_SVM), PCA-based neural network (AST_PCA_NN) and SVM (AST_PCA_SVM), and unmixing-based neural network (AST_UN_NN) and SVM (AST_UN_SVM);
Six generated from Landsat imagery: MNF-based neural network (LAN_MNF_NN) and SVM (LAN_MNF_SVM), PCA-based neural network (LAN_PCA_NN) and SVM (LAN_PCA_SVM), and unmixing-based neural network (LAN_UN_NN) and SVM (LAN_UN_SVM); and
Six generated from Sentinel imagery: MNF-based neural network (SEN_MNF_NN) and SVM (SEN_MNF_SVM), PCA-based neural network (SEN_PCA_NN) and SVM (SEN_PCA_SVM), and unmixing-based neural network (SEN_UN_NN) and SVM (SEN_UN_SVM).
ROIs covering various lithological units were selected using Pleiades orthoimages, GPS field surveys, and a pre-existing geological map. The accuracy of the final product derived from remote sensing must be assessed to ensure classification quality and user confidence in the product (Foody, 2002). After classification, the image was superimposed on the geological map to reveal the correspondence between the classes and the geological formations. To better understand the accuracy of the generated rasters and their similarity to the geological map, three evaluation methods were applied: the confusion matrix, Pearson correlation analysis, and spatial association between zones.
Evaluating the accuracy was accomplished by calculating the confusion matrix and comparing the classification result with the GPS coordinates of the samples collected on site and the recently published geological map.
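The two summary statistics derived from a confusion matrix, overall accuracy and the Kappa coefficient, can be computed as in the sketch below. The 2 x 2 matrix is a made-up example rather than data from this study, and the function name `accuracy_and_kappa` is an assumption; rows are taken as reference classes and columns as classified classes.

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's Kappa from a square confusion matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical confusion matrix for two classes (100 validation samples).
confusion = [[40, 10],
             [20, 30]]
overall_accuracy, kappa = accuracy_and_kappa(confusion)
```

For this example the overall accuracy is 0.70 and Kappa is 0.40, which shows how Kappa discounts the agreement that would be expected by chance.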
Table 2 shows the confusion matrix accuracy percentages and Kappa coefficients.

| Raster | Overall Accuracy | Kappa Coefficient |
|---|---|---|
| AST_MNF_NN | 42% | 0.31 |
| AST_MNF_SVM | 37% | 0.31 |
| AST_PCA_NN | 29% | 0.22 |
| AST_PCA_SVM | 37% | 0.32 |
| AST_UN_NN | 43% | 0.30 |
| AST_UN_SVM | 41% | 0.33 |
| LAN_MNF_NN | 20% | 0.15 |
| LAN_MNF_SVM | 26% | 0.19 |
| LAN_PCA_NN | 22% | 0.15 |
| LAN_PCA_SVM | 17% | 0.12 |
| LAN_UN_NN | 28% | 0.22 |
| LAN_UN_SVM | 25% | 0.20 |
| SEN_MNF_NN | 20% | 0.17 |
| SEN_MNF_SVM | 27% | 0.23 |
| SEN_PCA_NN | 17% | 0.12 |
| SEN_PCA_SVM | 17% | 0.12 |
| SEN_UN_NN | 15% | 0.08 |
| SEN_UN_SVM | 21% | 0.14 |
According to the findings, AST_UN_NN had the highest overall accuracy, at 43%, with a Kappa coefficient of 0.30 (Table 2). For Landsat imagery, the confusion matrix showed a maximum accuracy of 28% with a Kappa coefficient of 0.22 for the LAN_UN_NN raster. Sentinel imagery gave a slightly lower maximum accuracy of 27% with a Kappa coefficient of 0.23 for the SEN_MNF_SVM raster. As Table 2 shows, the ASTER sensor provided the best results (above 40% for three of its six rasters), whereas all Landsat and Sentinel rasters stayed below 30%. Within the same dataset, neither classifier was consistently better: SVM was more accurate for the AST_PCA, LAN_MNF, SEN_MNF, and SEN_UN inputs, the neural network was more accurate for the AST_MNF, AST_UN, LAN_PCA, and LAN_UN inputs, and the two tied at 17% for SEN_PCA.
The most significant advantages of SVM applications are the ability to classify data without dimensional reduction, as well as their high performance in terms of algorithm convergence, training speed, and classification accuracy. The generalization of geological map production and the synthetic nature of lithological unit boundaries also influenced the overall accuracy obtained.
To better understand the spatial similarity between the classified lithological maps and the high-resolution geological databases derived from the Pleiades image (Youssef et al., 2023), a correlation matrix of the area percentages of the lithological units was computed to examine the similarity of the common lithological areas between the classified rasters (Table 3). In Table 3, high positive correlations are shown in red and negative correlations in dark blue, both among the eighteen classified lithological rasters and between them and the geological map. In the first column of Table 3, the similarity with the geological map highlights the higher correlation values for the three satellite images used: ASTER, Landsat, and Sentinel. The most similar lithological raster to the geological map is AST_UN_NN, followed by LAN_PCA_SVM and SEN_UN_SVM, respectively. The classified lithological raster based on ASTER imagery (AST_UN_NN) yielded higher values in both validation methods, as indicated by the confusion and correlation matrices. Besides the correlation between the area percentages of lithological units in the geological map and the classified rasters, Table 3 also highlights the similarity in area percentage among the classified lithological rasters. Among the ASTER rasters, the highest correlation was recorded between AST_MNF_SVM and AST_UN_SVM. For Landsat imagery, the highest correlation was between LAN_MNF_SVM and LAN_UN_SVM. For Sentinel imagery, the highest correlation was observed between SEN_MNF_SVM and SEN_PCA_SVM. This is due to the similarity of the imagery and the classification methods.
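Each entry of such a correlation matrix is a Pearson coefficient between two vectors of per-unit area percentages. The sketch below shows the calculation for one pair; the percentages are invented for illustration and do not come from Table 3, and the function name `pearson` is an assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical area percentages of five lithological units.
map_percent = [30.0, 25.0, 20.0, 15.0, 10.0]      # geological map
raster_percent = [28.0, 27.0, 18.0, 17.0, 10.0]   # one classified raster
r = pearson(map_percent, raster_percent)
```

Repeating the calculation for every pair of rasters, and for each raster against the geological map, fills the full matrix. Note that a high coefficient only means the area shares are similar; it does not show that the units are mapped in the same places, which is why a cell-by-cell spatial comparison is also needed.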
The bivariate category of correspondence was used as the third validation technique to quantify associations between numerical and categorical spatial variables by overlaying two maps and performing a cell-by-cell comparison. To evaluate the similarity between the two maps, the bivariate categories of the correspondence method compare two regionalizations, calculate a degree of association between the response map (geological map) and maps of factor variables (classified raster) (Nowosad & Stepinski, 2018).
Table 3 shows the correlation matrix of the lithological units' area percentages of the classified rasters.
The results of the bivariate category of the correspondence method produced 18 rasters of the similarity degree between the geological map and the classified raster. The output of these 18 raster layers was classified into nine categories of correspondence: high-high (HH), high-medium (HM), high-low (HL), medium-high (MH), medium-medium (MM), medium-low (ML), low-high (LH), low-medium (LM), and low-low (LL), as shown in Table 4.
The degree of spatial correspondence between the geological map and the classified lithological raster is also shown in this table. The highest percentage of correspondence between the ASTER imagery classified lithological raster and the geological map was for AST_UN_NN. For Landsat Imagery, it was LAN_UN_NN, and for Sentinel, it was SEN_UN_SVM. The Sentinel imagery classified lithological raster had the highest correspondence percentage due to its high spatial resolution.
Table 4
displays the bivariate category of correspondence percentages of similar areas between classified rasters and the geological map.
| Classified raster | HH | HM | HL | MH | MM | ML | LH | LM | LL |
|---|---|---|---|---|---|---|---|---|---|
| AST_MNF_NN | 17.7 | 17.7 | 13.2 | 7.7 | 7 | 11.6 | 4.4 | 8.7 | 12 |
| AST_MNF_SVM | 25 | 4.3 | 6.6 | 6.1 | 13.3 | 12.5 | 12.4 | 11.7 | 8.1 |
| AST_PCA_NN | 8.5 | 17.7 | 10.8 | 16.3 | 10.2 | 9.8 | 8.5 | 9.3 | 9 |
| AST_PCA_SVM | 21.6 | 10.7 | 5.9 | 10 | 11.6 | 13 | 9.2 | 9.3 | 8.7 |
| AST_UN_NN | 23.9 | 5.3 | 5 | 7.2 | 17.4 | 9 | 7.1 | 11.9 | 13.2 |
| AST_UN_SVM | 23 | 6.7 | 4.6 | 8.3 | 13.9 | 12.4 | 9.8 | 13.6 | 7.8 |
| LAN_MNF_NN | 12 | 14.7 | 18.7 | 15.4 | 6.2 | 5.3 | 11.1 | 11.5 | 5 |
| LAN_MNF_SVM | 18.1 | 13.3 | 8.9 | 16.3 | 8.8 | 5.2 | 5.4 | 12.9 | 11 |
| LAN_PCA_NN | 19.9 | 18.6 | 11.2 | 20.4 | 13.2 | 9.8 | 2 | 2.1 | 2.8 |
| LAN_PCA_SVM | 11.5 | 15.4 | 7.6 | 23 | 9.9 | 7.3 | 7.6 | 9.4 | 8.4 |
| LAN_UN_NN | 22.8 | 8.7 | 2.7 | 7.4 | 21.2 | 6.9 | 7.8 | 15.4 | 7.2 |
| LAN_UN_SVM | 20 | 11.3 | 5.9 | 8.8 | 14.7 | 9.8 | 8.7 | 11.6 | 9.1 |
| SEN_MNF_NN | 12.9 | 13.3 | 8.1 | 14.8 | 15.4 | 11.1 | 9.8 | 8.5 | 6.1 |
| SEN_MNF_SVM | 21 | 16.9 | 8 | 10.2 | 12.2 | 10.5 | 8.8 | 7.1 | 5.4 |
| SEN_PCA_NN | 23.2 | 6.2 | 4.6 | 6.4 | 16.6 | 10.6 | 9.2 | 13.3 | 9.8 |
| SEN_PCA_SVM | 25.6 | 12.4 | 7.1 | 2.8 | 14.7 | 10.4 | 13.8 | 6.4 | 6.8 |
| SEN_UN_NN | 20.2 | 20.3 | 14.1 | 8.5 | 6.3 | 5.3 | 7.5 | 9.6 | 8.3 |
| SEN_UN_SVM | 26.6 | 4.5 | 7 | 11.1 | 12.9 | 9.3 | 11.2 | 9.1 | 8.4 |
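The selection of one raster per sensor from Table 4 can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the text does not state which quantity was used for ranking, so the helper `best_per_sensor` (a hypothetical name) accepts any scoring function, and two plausible readings are shown, the HH column alone and the sum of the matching categories HH + MM + LL.

```python
# Table 4 rows: raster -> (HH, HM, HL, MH, MM, ML, LH, LM, LL), in percent.
TABLE_4 = {
    "AST_MNF_NN": (17.7, 17.7, 13.2, 7.7, 7, 11.6, 4.4, 8.7, 12),
    "AST_MNF_SVM": (25, 4.3, 6.6, 6.1, 13.3, 12.5, 12.4, 11.7, 8.1),
    "AST_PCA_NN": (8.5, 17.7, 10.8, 16.3, 10.2, 9.8, 8.5, 9.3, 9),
    "AST_PCA_SVM": (21.6, 10.7, 5.9, 10, 11.6, 13, 9.2, 9.3, 8.7),
    "AST_UN_NN": (23.9, 5.3, 5, 7.2, 17.4, 9, 7.1, 11.9, 13.2),
    "AST_UN_SVM": (23, 6.7, 4.6, 8.3, 13.9, 12.4, 9.8, 13.6, 7.8),
    "LAN_MNF_NN": (12, 14.7, 18.7, 15.4, 6.2, 5.3, 11.1, 11.5, 5),
    "LAN_MNF_SVM": (18.1, 13.3, 8.9, 16.3, 8.8, 5.2, 5.4, 12.9, 11),
    "LAN_PCA_NN": (19.9, 18.6, 11.2, 20.4, 13.2, 9.8, 2, 2.1, 2.8),
    "LAN_PCA_SVM": (11.5, 15.4, 7.6, 23, 9.9, 7.3, 7.6, 9.4, 8.4),
    "LAN_UN_NN": (22.8, 8.7, 2.7, 7.4, 21.2, 6.9, 7.8, 15.4, 7.2),
    "LAN_UN_SVM": (20, 11.3, 5.9, 8.8, 14.7, 9.8, 8.7, 11.6, 9.1),
    "SEN_MNF_NN": (12.9, 13.3, 8.1, 14.8, 15.4, 11.1, 9.8, 8.5, 6.1),
    "SEN_MNF_SVM": (21, 16.9, 8, 10.2, 12.2, 10.5, 8.8, 7.1, 5.4),
    "SEN_PCA_NN": (23.2, 6.2, 4.6, 6.4, 16.6, 10.6, 9.2, 13.3, 9.8),
    "SEN_PCA_SVM": (25.6, 12.4, 7.1, 2.8, 14.7, 10.4, 13.8, 6.4, 6.8),
    "SEN_UN_NN": (20.2, 20.3, 14.1, 8.5, 6.3, 5.3, 7.5, 9.6, 8.3),
    "SEN_UN_SVM": (26.6, 4.5, 7, 11.1, 12.9, 9.3, 11.2, 9.1, 8.4),
}

def best_per_sensor(score):
    """Return, for each sensor prefix (AST, LAN, SEN), the top-scoring raster."""
    best = {}
    for name, row in TABLE_4.items():
        sensor = name.split("_")[0]
        if sensor not in best or score(row) > score(TABLE_4[best[sensor]]):
            best[sensor] = name
    return best

# Reading 1: rank by the high-high (HH) column only.
by_hh = best_per_sensor(lambda row: row[0])
# Reading 2: rank by the matching categories HH + MM + LL.
by_diagonal = best_per_sensor(lambda row: row[0] + row[4] + row[8])

print(by_hh)
print(by_diagonal)
```

Under the HH reading, the selected rasters are AST_MNF_SVM (25), LAN_UN_NN (22.8), and SEN_UN_SVM (26.6), which are the values listed for the bivariate category of correspondence in Table 5. The HH + MM + LL sum is included only as an alternative way of reading the table.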
Table 5 summarizes the results of the three test methods: area percentages used in the confusion matrix, the correlation matrix, and the bivariate category of correspondence.
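| Validation method | Classified raster | Result |
|---|---|---|
| Confusion matrix | AST_UN_NN | 42.55% |
| Confusion matrix | LAN_UN_NN | 28.07% |
| Confusion matrix | SEN_MNF_SVM | 27.27% |
| Correlation matrix | LAN_PCA_SVM | 39% |
| Correlation matrix | SEN_UN_SVM | 41% |
| Correlation matrix | AST_UN_NN | 33% |
| Bivariate category of correspondence | SEN_UN_SVM | 26.60% |
| Bivariate category of correspondence | AST_MNF_SVM | 25% |
| Bivariate category of correspondence | LAN_UN_NN | 22.80% |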
Table 5 lists the best classified lithological raster for each of the three imagery types tested: ASTER, Landsat, and Sentinel. The selection results in Table 5 show that the unmixing method gave good classification results for all three imagery types, followed by the MNF method. Based on our experiment, the best lithological raster was AST_UN_NN for ASTER imagery, LAN_UN_NN for Landsat, and SEN_UN_SVM for Sentinel.
The classification of satellite images generated from ASTER, Landsat, and Sentinel bands was carried out to determine the value of these data for supporting neural network (NN) and support vector machine (SVM) algorithms in the per-pixel separation of lithological formations and in distinguishing superficial formations that field geology cannot differentiate. The category correspondence results, for example, show a higher similarity for the lithological formations BC2b and J4b (Fig. 4) in the three classified rasters. Alluvium, scree, landslides, and basalts (q) also showed high similarity in the three classified rasters, with a notable dominance in the Landsat imagery.
Limestones, dolomitic limestones, and beds of dolomites (C4a and C4c) were detected more accurately in the ASTER imagery (Fig. 4a) than in the Landsat and Sentinel imagery. The Jurassic limestones and dolomitic limestones (J4b) exhibited high similarity with the geological map, as they constitute 13.5% of the study area.
In this work, the VNIR and SWIR ASTER bands, together with the unmixing (UN) and neural network (NN) algorithms, were employed to discriminate geological units. The best results were obtained for Cretaceous, Quaternary, and superficial formations. Field surveys confirmed the results of the image processing techniques applied to the ASTER data. Additionally, the database of lithological formations generated from Pleiades imagery provided more information than the Landsat and Sentinel imagery.
First, crosstalk correction, resampling, orthorectification, atmospheric correction, and radiometric normalization were applied to nine ASTER bands. The Landsat and Sentinel sensors have two SWIR bands that can predict mineral association changes (Sabins, 1997). The ASTER instrument has six SWIR bands and five thermal bands, which improved the extraction of lithological and mineral information and produced the best lithological discrimination results (Fig. 5).
Overall, the experiment yielded promising results in the three validation methods, especially given that 21% of the area is covered by water bodies and vegetation, which make lithology difficult to detect with optical remote sensing. The resulting AST_UN_NN lithological map of the Laqlouq region was more than 50% consistent with the geological map. While this result can support geological mapping, it is not sufficient on its own for reliable lithological discrimination.
CONCLUSION
Combining high spatial and spectral resolution remote sensing techniques with local field knowledge increased the amount of information available about the distribution of the various lithologies in the Laqlouq region of Lebanon. The potential of Support Vector Machine (SVM), Unmixing (UN), and Neural Network (NN) methods for geological mapping using ASTER, Landsat, and Sentinel satellite data was investigated and assessed. ASTER multispectral data in the VNIR and SWIR bands, as well as the large-scale geological map produced by Youssef et al. (2023), were used. ASTER, Landsat, and Sentinel bands were subjected to crosstalk correction, resampling, orthorectification, atmospheric correction, and radiometric normalization. The enhancement techniques used (MNF, PCA, and pixel unmixing) enabled the discrimination of nearly all outcrops in the study area in high detail, while also reducing the dimensionality of the data and allowing the boundaries of the lithological layers originally mapped on the geological map to be superposed and evaluated. All nine generated raster images were classified using supervised machine learning methods, namely neural networks and support vector machines, resulting in eighteen lithologically classified raster images.
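The count of nine enhanced and eighteen classified rasters follows from combining three sensors, three enhancement techniques, and two classifiers. As a small illustration, the sketch below enumerates the combinations using the naming pattern seen in Table 4 (for example AST_UN_NN); the variable names are assumptions made for this example.

```python
from itertools import product

SENSORS = ("AST", "LAN", "SEN")      # ASTER, Landsat, Sentinel
ENHANCEMENTS = ("MNF", "PCA", "UN")  # MNF, PCA, pixel unmixing
CLASSIFIERS = ("NN", "SVM")          # neural network, support vector machine

# Nine enhanced rasters: one per sensor and enhancement technique.
enhanced = [f"{sensor}_{method}" for sensor, method in product(SENSORS, ENHANCEMENTS)]

# Eighteen classified rasters: each enhanced raster run through both classifiers.
classified = [f"{raster}_{clf}" for raster, clf in product(enhanced, CLASSIFIERS)]

print(len(enhanced), len(classified))
print(classified)
```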
The validation analysis of these eighteen lithological rasters, in conjunction with the geological database map generated from Pleiades high-resolution imagery, was conducted using three methods: a confusion matrix, a correlation matrix, and a bivariate category of correspondence. The supervised classification of the pixel unmixing results using a neural network confirmed the effectiveness of ASTER imagery in discriminating lithological units in the Laqlouq region. The overall accuracy of the ASTER classification, as determined from the confusion matrix, was 42%. The correlation matrix for the lithological unit areas reached 33%, and the spatial similarity with the geological map was 25%. The accuracy assessment in the three testing methods showed good correlation between the generated classes and the geological map. In bare, unvegetated areas, ASTER multispectral satellite imagery can be a powerful tool for geological mapping, specifically in the case of J4b and C4c.
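Overall accuracy is conventionally computed from a confusion matrix as the sum of the diagonal (correctly classified samples) divided by the total number of samples. The following is a minimal sketch of that calculation; the 3 x 3 matrix is hypothetical and does not reproduce the study's confusion matrices, which were built from area percentages.

```python
def overall_accuracy(confusion):
    """Overall accuracy: diagonal sum divided by the sum of all entries.

    `confusion` is a square list of lists with reference classes as rows
    and predicted classes as columns.
    """
    if any(len(row) != len(confusion) for row in confusion):
        raise ValueError("confusion matrix must be square")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

# Hypothetical counts for three lithological classes (not study data).
toy_confusion = [
    [30, 10, 10],
    [5, 25, 20],
    [15, 10, 25],
]

accuracy = overall_accuracy(toy_confusion)
print(f"overall accuracy: {accuracy:.1%}")
```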
Landsat and Sentinel images helped to correct and specify the boundaries of lithological formations and to distinguish superficial formations that field geology cannot identify. An image classification procedure was performed, followed by an accuracy assessment. The success of image classification in remote sensing depends on the availability of high-quality remotely sensed imagery and ancillary data, the design of a proper classification procedure, and the skill and experience of the analyst.
This study demonstrated the importance of ASTER, Landsat, and Sentinel multispectral images in lithological mapping. The high spectral resolution of the VNIR and SWIR bands creates synergy with the high spatial resolution, enabling optimal lithological mapping. This combination can provide geologists with an opportunity to improve their investigations in difficult-to-access zones. In the future, hyperspectral remote sensing will be tested to identify and map specific chemical and geometric patterns of land that can be used to identify areas with economically valuable mineral/oil deposits.