Spatiotemporal analysis and nonlinear statistics of hourly wind speed variability over tropical Cuba

Humberto Millán 1✉ Emailhmillanv@udg.co.cu

Ramiro Cumbrera 1

Behzad Ghanbarian 2,3,4

Rene Arias 5

Riccardo Biondi 6

Abdel Acosta 7

Aziz Benhamrouche 8

Department of Mathematics, Physics and Applied Informatics University of Granma Apdo. 21 85100 Bayamo, Granma Cuba

2 Department of Earth and Environmental Sciences University of Texas at Arlington 76019 Arlington TX USA

3 Department of Civil Engineering University of Texas at Arlington 76019 Arlington TX USA

4 Division of Data Science, College of Science University of Texas at Arlington 76019 Arlington TX USA

5 Department of Mechanical Engineering University of Granma Apdo. 21 85100 Bayamo, Granma Cuba

6 CIMA Research Foundation 217100 Savona Italy

7 Mechanical Enterprise-Bayamo (EMBA) Apdo. 21 85100 Bayamo, Granma Cuba

Department of Territorial Planning University Constantine 1, Frères Mentouri Algeria

Humberto Millán^a,*, Ramiro Cumbrera^a, Behzad Ghanbarian^b,c,d, Rene Arias^e, Riccardo Biondi^f, Abdel Acosta^g, Aziz Benhamrouche^h

^aDepartment of Mathematics, Physics and Applied Informatics, University of Granma, Apdo. 21, Bayamo 85100, Granma, Cuba

^bDepartment of Earth and Environmental Sciences, University of Texas at Arlington, Arlington TX 76019, USA

^cDepartment of Civil Engineering, University of Texas at Arlington, Arlington TX 76019, USA

^dDivision of Data Science, College of Science, University of Texas at Arlington, Arlington TX 76019, USA

^eDepartment of Mechanical Engineering, University of Granma, Apdo. 21, Bayamo 85100, Granma, Cuba

^fCIMA Research Foundation, 217100 Savona, Italy

^gMechanical Enterprise-Bayamo (EMBA), Apdo. 21, Bayamo 85100, Granma, Cuba

^hDepartment of Territorial Planning, University Constantine 1, Frères Mentouri, Algeria

^*Corresponding author

Email address: hmillanv@udg.co.cu, ORCID: http://orcid.org/0000-0002-1718-7508

Abstract

$\:(\stackrel{-}{{\Lambda\:}}$

), global Lyapunov exponent (λ_m), Hurst exponent (H) and multiscale entropy (MSE) were also computed for each series series using, respectively, the time reversibility, delay, detrended fluctuation analysis and multiscale entropy methods. These parameters were compared with those computed from 40 surrogate time series generated at each station. We found that the original wind speed time series could be classified as Gaussian noises with Lévy index α = 2.00, lower deterministic component

$\:\stackrel{-}{\wedge\:}$

=0.318±0.023, time reversibility Z-score < 2.021 and approximately constant MSE. The decorrelated data fitted sub-Gaussian distributions with the α exponent in the range 1.227 ≤ α ≤ 1.704. A multiple regression analysis found a significant empirical link between H, λ_m, the exponent of the MSE function (η) and

$\:\:\stackrel{-}{{\Lambda\:}}$

with the correlation coefficient of r = 0.972. These findings could be useful for the siting, operation and optimization of wind energy-based technologies.

Keywords:

Wind speed

Nonlinear system

Lyapunov exponent

Multiscale entropy

1 Introduction

Wind is a fundamental component of the atmospheric system, governing energy exchange, moisture transport and large-scale circulation patterns. It has been recognized that wind and its direction and intensity play a negative role in the transport of contaminant microplastics and viruses (Abbasi et al. 2023: Fluck and Raveh-Rubin 2023) and influence soil erosion (Lackoóvá et al. 2023). However, it is a renewable energy source playing a role in the global transition from fossil fuels to sustainable and low-carbon energy systems. According to the International Energy Agency (IEA) Wind Technology Collaboration Programme, key research priorities for advancing wind energy include the characterization of wind resources, short-term forecasting, resource assessment, and optimal site selection (IEA Wind 2024). For example, wind energy production in the state of North Dakota in the United States has approximately a potential of 250 GW (Swisher et al. 2001). However wind, its speed and associated kinetic energy exhibit complex, intermittent and nonlinear variability across time and space (Calif and Schmitt 2014). Those complex dynamics can generate turbulent fields of wind speed fluctuations, which affect the efficiency of wind turbines and windmills (Jemal et al. 2023). Thus, understanding the underlying dynamics and variability of wind speed requires appropriate statistical approaches to enhance wind resource assessment. One of the characteristics of wind speed time series is their histograms; the broader the histogram, the greater the wind speed variation. In the literature, wind speed variability has been generally analyzed using the two-( Paul et al. 2012; Jund and Schindler 2019; Sadiah et al. 2023) and three- (Baranitharan et al. 2024) parameter Weibull probability distribution functions (PDFs). Other statistical PDFs, such as Rayleigh, have been also used to describe wind speed data (Girma et al. 2015; Ogunjo 2025). Some recent studies compared various PDFs including two- and three-parameter Weibull, Rayleigh, lognormal, Gamma, Generalized Extreme Value (GEV) and Champernowne distributions (Zhan et al. 2025; Sumaili Ndeba et al. 2025. Those studies found that the Champernowne wss a robust distribution in regions with intermittent wind speed fluctuations. Shi et al. (2021) reviewed advantages and disadvantages of parametric, multiparametric and nonparametric PDFs. According to those authors, no single PDF adequately represented wind speed data across all sites. Several studies have emphasized that most PDFs fail to capture the scaling properties of wind speed time series (Ogunjo 2025). This issue deserves further attention, as certain scaling parameters, such as the Hurst exponent, depend on the underlying statistical distribution (Sanchez et al. 2015). Although histograms provide general information, they fail to preserve temporal correlations and nonlinear features of the underlying process.

Long/short term persistence versus Brownian-like behavior of wind speed data has been also characterized via the Hurst exponent (Calif and Schmitt 2014) and fractal dimensions (Liang et al. 2015; Shu and Chan 2025). Those studies suggest the introduction of hybrid models which could include fractal and multifractals analyses and machine learning to improve the predictive potential for wind engineering. Wind speed time series have been also analyzed using invariants from the theory of nonlinear dynamical systems. For instance, Drisya and Thumba (2017) found low-dimensional chaos of wind speed fluctuations at nine locations over the Indian subcontinent. In another study, the Lyapunov exponent was used to investigate the chaotic characteristics of wind speed on a tall tower in Missouri (Balkissoon 2021). Lin et al. (2021) examined the chaotic dynamics of high and low frequency wind speed through the largest Lyapunov exponent. Duan et al. (2020) proposed a wind speed prediction method based on the largest Lyapunov exponent and the theory of Lévy-stable distributions. All those studies and findings disagree, in part, with the traditional hypothesis of wind speed fluctuation as a pure stochastic variable. The complex structure of wind speed fluctuations can be investigated using the multiscale entropy (MSE) method implemented by Costa et al. (2005). Nogueira (2017) applied that method and found a link between MSE and fractal scaling in hourly near-surface wind speed data over southeast Kansas in the United States. That study allowed the authors to find different scaling regimes in the power spectrum of wind speed fluctuations. Wind speed data from the Caribbean region have been investigated in several studies. For instance, Chadee and Clarke (2014) analyzed wind data measured at 10 m level from the NCEP/DOE database for the period of 1979–2010 and reported that dominant wind direction was east-north-east over the Caribbean islands. They also found a bimodal distribution for the wind direction. Perez-Albornoz et al. (2023) collacted wind speed and direction data using an ultrasonic wind sensor at 40 m level and a mobile tower in the north coast of Yucatan Peninsula. Those authors analyzed the wind speed distribution and found slightly posively right skewed PDF (see their Fig. 4). They also fitted both Normal and Weibull distributions to their data and found a slightly better fit using the latter. In another study, Alonso Diaz et al. (2019) studied the trend of wind speed using a regional climate model (called PRECIS) and analyzed variations in wind speed distribution for three periods (i.e., 2011–2040, 2041–2070 and 2071–2099). They found that variations in wind speed would be greater in the northern and eastern coasts and statistically significant in the second half of the 21st century. They also reported that thirteen wind farms built up in Cuba were located in those areas with increased wind power.

Fig. 4

Illustrative example of the original and decorrelated wind speed time series corresponding to Bayamo site

The application of nonlinear methods to wind data still needs to address some issues such as, for example, the test of nonlinearity based on surrogates (Theiler et al. 1992; Lucio 2012) derived from the investigated data. Schreiber and Schmitz (2000) stated that,"...the application of nonlinear time series methods has to be justified by establishing nonlinearity in the time series...". The main objectives of this work, therefore, are to: i) test empirically the hypothesis of nonlinearity, determinism and chaotic behavior of wind speed time series and ii) investigate statistical relationships between different nonlinear parameters. We show that the theory of nonlinear dynamical systems provides insightful results to describe and model the complex behavior of wind speed fluctuations. The predictive potential of such statistics is still a challenging task which needs to be investigated in details.

2 Material and Methods

2.1 Study region and data information

The hourly wind speed data used in this study were collected from eleven stations in the Granma Province, Cuba southwest, as shown in Fig. 1. Table 1 summarizes the location (longitude and latitude) and the elevation above sea level at each station. The study area covers 8.4×10³ km², 59.5% of which are agricultural lands, the main economical activity in the region. The main crops include sugarcane (Saccharum oficinarum L.) and coffee (Cofea Arabica L.). The climate of the region is tropical according to the Köppen-Geiger classification system (Peel et al. 2007). The study area can be considered as vulnerable in environmental and hydrometeorological terms. Agricultural to hydrological droughts are common extreme events affecting this region. From the hydrological point of view, there exist five dams located within the largest hydrographic basin of the country (Rio Cauto basin). However, severe droughts, which result in reduced dam water levels and deforestation have been the main drivers of water scarcity for irrigation and livestock consumption, which in turn has become one of the major agricultural challenges in recent years Millán et al. 2025). One potential solution is to harness the wind speed dynamics of the region to install water-pumping systems driven by 18-bladed windmills. Wind rose analyses indicate that the prevailing wind direction ranges from northeast to west.

Fig. 1

Geographical location of the studied sites

Table 1

Physical location and elevation above sea level (e.a.s.l) of the study sites
Station	Longitude (⁰)	Latitude (⁰)	e.a.s.l. (m)
Bayamo	20.39 N	-76.62 W	109.95
Cabo Cruz	19.85 N	-77.73 W	122.90
Campechuela	20.23 N	-77.27 W	42.60
Cauto	20.55 N	-76.70 W	109.95
Guisa	20.18 N	-76.47 W	153.17
Jiguani	20.33 N	-76.41 W	155.60
Media Luna	20.14 N	-77.38 W	42.60
Mota	19.92 N	-77.15 W	122.90
Rio Cauto	20.58 N	-77.06 W	109.95
Veguita	20.31 N	-76.90 W	109.95
UoG	20.28 N	-76.70 W	101.00

We used hourly wind speed time series measured at 10 m above the surface from January 1st 2014 to December 31st 2023 (N = 87,648 data points). The datasets were obtained from the National Aeronautics and Space Administration (NASA) Langley Research Center (LaRC) Prediction of Worldwide Energy Resources (POWER) Project, funded through the NASA Earth Science/Applied Science Program (NASA 2023). Each hourly value corresponds to MERRA-2 10 m wind speeds corrected for elevation and surface type (Chandler et al. 2005).

2.2 Pre-whitening and data distribution

Our preliminary analysis revealed that the wind speed time series exhibited strong serial correlation, which is expected given the hourly measurement resolution. Such autocorrelation can lead to spurious results, for example, by artificially magnifying long-term correlations. Therefore, as a first step, we removed the lag-one autoregressive component (AR(1)) and part of the trend from each wind speed time series x_ν(t) (ν = 1,2,...N) using the von Storch method (1995).

$\:{x}_{t}^{d}={x}_{t}-{\rho\:}_{1}{x}_{t-1}$

where

$\:{x}_{t}^{d}$

is the decorrelated value and ρ₁(r₁ > 0) is the lag-one correlation coefficient.

We fitted several PDFs i.e., Normal, Weibull, Rayleigh and Gamma, to both the original and decorrelated time series. In addition, we estimated the Lévy-stable parameter for each case. The Lévy exponent provides information on the probability of extreme wind speed events under the assumption that the variable x_t follows a sub-Gaussian distribution. The PDF of wind speed data can be analyzed using a Lévy-stable distribution, characterized by the stability index, α. The special case α = 2 corresponds to the Gaussian (or Normal) distribution, while α = 1 corresponds to the Cauchy distribution. More generally, 0 < α < 2 defines a family of Lévy-stable (sub-Gaussian) distributions (Samorodnitsky and Taqqu 1994).

The PDF of a Lévy-stable process, f(x), is expressed through its Fourier transform,

$\:\widehat{\:f}\left(k\right),\:$

since no closed-form analytical expression exists:

$\:\widehat{f}\left(k\right)=\underset{-\infty\:}{\overset{\infty\:}{\int\:}}{e}^{-ikx}f\left(x\right)dx=\text{e}\text{x}\text{p}[-{\sigma\:}^{\alpha\:}{\left|k\right|}^{\alpha\:}\left(1+i\beta\:sign\left(k\right)tan\left(\frac{\pi\:\alpha\:}{2}\right)\right)-i\mu\:k]$

where

$\:\widehat{f}\left(k\right)$

is the Fourier transform of f(x), k is the Fourier variable (sign(k) = − 1 for k < 0 and sign(k) = 1 otherwise), 0 < α ≤ 2 is the Lévy-stable index, σ > 0 is the spread of the distribution, -1 ≤ β ≤ 1 is the skewness and

$\:\mu\:\:$

is the location parameter, respectively (Benson et al. 2001).

Lévy-stable distributions deviate from both Normal and Lognormal distributions, with their PDF tails decaying according to a Pareto-like function. In such cases, the probability, P, that the variable X exceeds a given magnitude scales as:

$\:P[\left|X\right|>x]\propto\:{x}^{-\alpha\:}$

Equations (2) and (3) describe a Gaussian process when α = 2 and an infinite-variance process when α < 2. For a sub-Cauchy process (α < 1), the mean value is undefined (Samorodnitsky and Taqqu 1994). In this study, the parameters α, β, σ and µ were optimized using a maximum likelihood estimator (MLE) implemented in the open-access code, stable 4.0, developed by Nolan (1997).

2.3 Nonlinear parameter estimation

2.3.1 Phase space reconstruction, local/global Lyapunov exponent, determinism and nonlinearity tests

The Takens theorem (Takens 1981) or delay method is particularly suitable for embedding a long empirical time series into an E-dimensional phase space as suggested by Li and Yuan (2008).

$\:{X}_{t}^{E}=({x}_{t},\:{x}_{t-\tau\:},\dots\:..,{x}_{t-\left(E-1\right)\tau\:})$

where E is the embedding dimension and τ is the time delay. Here the term

$\:{X}_{i}^{E}$

refers to the decorrelated time series defined by Eq. (1). The delay method reconstructs the attractor as:

$\:{X}_{t}^{E}={\left[{x}_{t},{x}_{t-\tau\:},{\dots\:,x}_{\xi\:}\right]}^{T},\:\:\xi\:=N-(E-1)$

In this study, we used a minimum embedding dimension of E = 1 and a maximum of E = 10. The appropriate time lag τ was determined as the first minimum of the autocorrelation function. For example, for the decorrelated series

$\:{X}_{i}^{d}$

corresponding to Bayamo, τ = 6, while for the original series, τ = 18. Given the selected embedding dimension E and time delay τ, the spectrum of Lyapunov exponents (λ) can be determined. Additionally, the relevant scales should be defined for computing local Lyapunov exponents.

Empirically, the error Θ(t) grows exponentially, and the local λ value converges to a global Lyapunov exponent λ_m as the scale increases (Abarbanel et al. 1993).

$\:{\Theta\:}\left(t\right)={\Theta\:}\left(0\right)\text{e}\text{x}\text{p}\left({\lambda\:}_{m}t\right)$

We selected a dyadic scale defined as r = 2ⁿ with n = 0,1,...,9. Phase space reconstruction and Lyapunov spectrum estimation were performed using the R functions timelag, embedseries and lyapunov available within the R package fractal (Constantine et al. 2017).

The estimated time delay, τ, and embedding dimension, E, allow to perform a determinism test on the investigated time series. This test was conducted using the method described by Kaplan and Glass (1992). In what follows, we provide a brief overview of the method, following the description of those authors and its application to meteorological variables by Millán et al. (2010) and Millán et al. (2011). First, the phase space is coarse-grained into an n

$\:\times\:$

n grid of small boxes. Each pass δ of the trajectory through the j-th box generates a unit vector v_{δ, j}, referred to by Kaplan and Glass (1992) as a trajectory vector. The direction of v_{δ, j} is determined by the angle between the trajectory of the entering point into the box and its exit point. The resultant vector V_j of all visits to the j-th box is then normalized by the total number of visits n_j to that box as follows:

$\:{\varvec{V}}_{\varvec{j}}=\frac{1}{{n}_{j}}\sum\:_{=1}^{{n}_{j}}{\varvec{v}}_{,j}$

Here, we define the pass δ as the number of times that a trajectory intersects a given box (δ = 1,2,...,n_j). Intuitively, in regions where the vectors are parallel, the resulting vector satisfies V_j≈ 1, while in regions of the phase space where trajectories intersect, V_j< 1. To establish a statistical criterion, Kaplan and Glass (1992) related V_j to the average displacement after π steps of a random walk in D dimensions.

$\:{\stackrel{-}{R}}_{\pi\:}^{D}={\left(\frac{2}{\pi\:D}\right)}^{\raisebox{1ex}{$1$}\!\left/\:\!\raisebox{-1ex}{$2$}\right.}\frac{{\Gamma\:}\left(\right(D+1/2)}{{\Gamma\:}\left(\raisebox{1ex}{$\text{D}$}\!\left/\:\!\raisebox{-1ex}{$2$}\right.\right)}$

where

$\:{\stackrel{-}{R}}_{\pi\:}^{D}$

is the average displacement of the random walk after π steps and Γ is the Gamma function.

The determinism parameter

$\:\stackrel{-}{\bigwedge\:}\:$

is then defined as a weighted average of V_j over all visited boxes in the phase space.

$\:\stackrel{-}{\wedge\:}=〈\frac{{\left({V}_{j}\right)}^{2}-{\left({R}_{\pi\:j}^{D}\right)}^{2}}{1-{\left({R}_{\pi\:j}^{D}\right)}^{2}}〉$

The parameter

$\:\:\stackrel{-}{\wedge\:}\:$

quantifies the degree to which a system exhibits deterministic or stochastic behavior. For purely deterministic systems (e.g. the Lorenz attractor),

$\:\stackrel{-}{\:\wedge\:}=1$

, while for a random walk,

$\:\stackrel{-}{\:\wedge\:}=0$

. Perc (2005) developed a freely available open-access code (i.e., determinism.exe) to determine

$\:\:\:\stackrel{-}{\wedge\:}\:\:$

based on Kaplan and Glass (1992) method. As an example, we present the results for the wind speed time series from the Bayamo station. For both original and decorrelated series, the phase space was coarse-grained into a 40

$\:\times\:$

40 grid containing 1600 boxes. For the original data (τ = 18), we obtained

$\:\stackrel{-}{\wedge\:}$

=0.3257, while for the decorrelated series (τ = 6), we found

$\:\stackrel{-}{\wedge\:}$

=0.6479.

Once the appropriate time delay,τ, is defined, it is useful to test the nonlinearity of the investigated time series. The time reversibility proposed by Schreiber and Schmitz (2000) is a third-order statistic that discriminate between linear (symmetric) and nonlinear (asymmetric) behavior under time reversal Schreiber (1999). This statistic can be determined as follows:

$\:{\text{{\rm\:T}}}^{rev}\left(\tau\:\right)=\frac{1}{N-\tau\:}\sum\:_{n=\tau\:+1}^{N}{\left({x}_{n}-{x}_{n+\tau\:}\right)}^{3}$

where

$\:{\text{{\rm\:T}}}^{rev}\left(\tau\:\right)$

is the time reversal parameter as a function of the time delay τ.

Equation (10) was applied to both original and decorrelated time series, as well as to a set of surrogated time series. The surrogates were generated under the null hypothesis that the original or decorrelated series results from an underlying linear Gaussian process. Therefore, the hypotheses to be tested are:

$\:\left\{\begin{array}{c}{\text{{\rm\:T}}}^{rev}\left(\tau\:\right)\left(original\right)-{\text{{\rm\:T}}}^{rev}\left(\tau\:\right)\left(surrogates\right)=0\:\left(null\right)\\\:{\text{{\rm\:T}}}^{rev}\left(\tau\:\right)\left(original\right)-{\text{{\rm\:T}}}^{rev}\left(\tau\:\right)\left(surrogates\right)\ne\:0\:\left(alternative\right)\end{array}\right.$

For a two-tailed test and the 5% confidence level, it is necessary to generate 40 = 2/0.05 (where 2 denotes the two tails and 0.05 is the confidence level) time series using the method developed by Theiler et al. (1992). The TISEAN Project (Hegger et al. 1999) provides the routines surrogates and timerev, which are used to generate surrogates series and to calculate

$\:{\text{{\rm\:T}}}^{rev}\left(\tau\:\right)$

, respectively. In particular, the option -i1 in the surrogates code implements the iterative amplitude-adjusted Fourier transform (IAAFT) algorithm developed by Schreiber and Schmitz (1996). Those authors also defined a Z-score statistics as:

$\:Z=\frac{{T}_{observed\:}^{rev}-\:{\stackrel{-}{T}}_{surr}^{rev}}{SD\left({\stackrel{-}{T}}_{surr}^{rev}\right)}$

where

$\:{T}_{observed}^{rev}$

is the observed time reversal parameter,

$\:{\stackrel{-}{T}}_{surr}^{rev}$

is the mean time reversal parameter for the surrogate data and

$\:SD\left({\stackrel{-}{T}}_{surr}^{rev}\right)\:$

is the corresponding standard deviation. We set the critical value as

$\:{Z}_{crit}$

= 2.021. Accordingly, if

$\:Z$

≤

$\:{Z}_{crit}$

, the null hypothesis is accepted, while

$\:Z$

$\:{Z}_{crit}\:$

leads to rejection of the null hypothesis.

2.3.2 Detrended Fluctuation Analysis

The detrended fluctuation analysis (DFA) was first introduced by Peng et al. (1994) to investigate long-range correlations in DNA sequences and has since become a widely used tool across scientific fields. The method proceeds through several steps. The first step is to integrate the time series x_t as:

$\:{Y}_{}=\sum\:_{\varvec{t}=1}^{}\left[{x}_{}\left(t\right)-<x>\right]$

where < x > is the mean of the x_ν(t) series.

The series profile Y_ν is divided into N_s=int(N/s) non-overlapping segments of equal length, s. If the series length N is not an exact multiple of s, the procedure is repeated starting from the end of the series. This ensures that all the data points are included, resulting in 2N_s segments of length s (Kantelhardt et al. 2002).

Within each interval, the data points are fitted with a q-order polynomial. For each segment of length s, the local trend is denoted as y_s(ν). The detrending step is performed by subtracting this local trend from the profile Y_ν. The variance of the detrended profile is then computed, and its root mean square defines the local fluctuation function, F(s), for scale s:

$\:F\left(s\right)=\sqrt{\frac{1}{N}\sum\:_{\nu\:=1}^{N}{\left[{Y}_{\nu\:}-{y}_{s}\left(\nu\:\right)\right]}^{2}}$

Equation (14) is applied to each of the 2N_s intervals for every resolution s.

Under the scaling hypothesis, the fluctuation function F(s) and the scale s are expected to follow a Pareto-type (power-law) relationship as follows:

$\:F\left(s\right)=C{s}^{H}$

where C is a scaling constant representing the root mean square fluctuation at the unit time scale, which in this study corresponds to the sampling resolution (s = 1 hour), and H is the Hurst exponent (0 < H < 1). Applying a log-log transformation to Eq. (15) enables the estimation of both H and C.

$\:\text{log}\left[F\left(s\right)\right]=logC+Hlog\left(s\right)$

In this study, we considered time scales of s = 6, 12, 24, 48, 96, 192, 384, 768, 1536, 3072, 6144, 12288 and 24576 hours, following a dyadic scale ratio of 2. A second order polynomial model (q = 2) was applied for local detrending. The Hurst exponent was calculated using the R package fractal (Constantine et al. 2017). A value of H > 0.5 indicates the presence of long-range correlations, while H < 0.5 reflects short-range dependence. The special case of H = 0.5 corresponds to regular Brownian-like behavior.

2.3.3 Multiscale entropy

The multiscale entropy (MSE) method, developed by Costa et al. (2005), provides a powerful framework to investigate the fluctuation dynamics of meteorological time series in greater detail. The first step consists of coarse-graining the original time series x_ν(t). For a given scale factor m, the corresponding coarse-grained sub-series

$\:\:{z}_{\mathcal{l}}^{m}$

is calculated as:

$\:{z}_{\mathcal{l}}^{m}=\frac{\sum\:_{\nu\:=\left(\mathcal{l}-1\right)(m+1)}^{m\mathcal{l}}{x}_{\nu\:}}{m}$

In this case

$\:\mathcal{l}$

varies as 1≤

$\:\mathcal{l}$

≤ N/m. Note that m = 1 corresponds to the length of the whole time series while m=m_max is the shortest coarse-grained sub-series which comprises N/m_max data points.

For each coarse-grained sub-series,

$\:\:{z}_{\mathcal{l}}^{m}$

, the conditional probability

$\:{\:{\Psi\:}}^{m+1}\left(\epsilon\:\right)\:$

is calculated as the probability that two sub-series coincide for m + 1 points,

$\:\:{{\Omega\:}}^{m}\left(\epsilon\:\right)\:$

denotes the probability that two sub-series match for m points. The tolerance parameter ε is defined as a fraction of the standard deviation of the entire time series. In this context, each sub-series can be treated as a time-lag vector, and two vectors are considered a match if the maximum norm, Δ, between their components satisfies Δ ≤ ε. The MSE(m,ε,N) is then determined as:

$\:MSE\left(m,\epsilon\:,N\right)=-log\left[\frac{{{\Psi\:}}^{m+1}\left(\epsilon\:\right)}{{{\Omega\:}}^{m}\left(\epsilon\:\right)}\right]$

It should be noted that Eq. (18) is valid only when

$\:{{\Psi\:}}^{m+1}\left(s\right)\le\:{{\Omega\:}}^{m}\left(s\right).\:\:$

In this study, we set the embedding dimension parameters to m_min = 1, m_max = 100, and m_step = 1, with the tolerance defined as ε = 0.15

$\:\times\:$

SD where SD is the standard deviation of the original/decorrelated time series. The MSE analysis was performed using the function MSEbyC.fn from the CGManalyzer R package (Zhang et al. 2023).

3 Results

Table 2 summarize the statistical parameters including mean, median, minimum, maximum, SD, skewness and kurtosis calculated for the eleven wind speed time series at the studied stations. The difference between mean and median values is negligible at all stations except Cabo Cruz, Jiguani, Media Luna and Veguita.This statistical evidence suggests that the wind speed distributions can be regarded as approximately symmetrical in most stations. After fitting the Normal, Weibull, Rayleigh and Gamma PDFs to the data at all stations, we found that the Normal distribution characterized the original data reasonably well. As an example, in Fig. 2(a) we illustrate the results of fitting the Normal, Weibull, Rayleigh and Gamma PDFs to the original wind speed data at the Bayamo station while Fig. 2(b) presents the comparison between the cumulative distribution function (CDF) of the Normal distribution and that of the observed data at the same Bayamo station. The smaller values of skewness and kurtosis reported in Table 2 are typical of the Normal or near-Normal distribution. Although we found that unimodal PDFs were appropriate for the original wind speed data from southern Cuba, some other studies reported that wind speed distribution may conform to bimodality. For instance, Lopez-Manrique et al. (2018) analyzed wind speed data from Cuauhtemotzin, Mexico, measured at three elevations (i.e., 26, 33 and 54 m) and found that the behavior of the wind speed followed a bimodal distribution (see their Fig. 5) with dominant northeast wind direction. The mean and standard deviation values reported in Table 2 are also less than those reported in Perez-Albornoz et al. (2023) who analyzed wind speed data measured with time interval of 10 minutes from January to December 2011 and reported the values of 6.29 m/s and 2.82 repectively as the mean and standard deviation (see their Table 3).

Table 2

Statistical parameters for the original wind speed time series.
	Mean(m/s)	Median(m/s)	Min.(m/s)	Max.(m/s)	SD (m/s)	Skewness	Kurtosis
Bayamo Cabo Cruz Campechuela Cauto Guisa	3.85 4.20 4.06 4.17 3.04	3.88 4.02 3.99 4.20 3.00	0.01 0.01 0.04 0.01 0.02	16.70 15.71 15.20 18.10 9.83	1.63 2.00 1.88 1.76 1.25	0.21 0.42 0.49 0.21 0.29	0.19 -0.11 0.14 0.19 0.10
Jiguani	3.20	2.81	0.02	14.39	1.59	0.95	0.83
Media Luna	3.19	3.06	0.01	11.95	1.52	0.42	-0.10
Mota	4.01	3.95	0.00	14.26	1.71	0.30	0.03
Rio Cauto	3.17	3.17	0.01	13.77	1.34	0.21	0.20
UoG	3.08	3.05	0.01	12.23	1.22	0.32	0.50
Veguita	3.42	3.29	0.00	16.54	1.65	0.73	0.73

Fig. 2

Illustration of the fitting of four traditional distribution functions to wind speed data collected from Bayamo site: a) Density functions and b) Cumulative Distribution Function (CDF) corresponding to the normal distribution

Fig. 5

a) Density function corresponding to the original (Gaussian with α = 2.00) and decorrelated (Lévy-stable with α = 1.58) data and b) the right-hand tail of a) in log-log coordinates. Data were previously transformed to zero mean and unit variance wind speed

Table 3

The skewness, kurtosis, Kolmogorov-Smirnov distance (KSd) (p < 0.01), Z-score, Levy index (α), maximal Lyapunov exponent (λ_m) and determinism parameter (
$\:\stackrel{-}{\wedge\:})\:$
calculated for the decorrelated time series. The Z-score for the original times series is also reported. Recall that Z_crit=2.021.
Station	Skewness	Kurtosis	KSd	Z-score (original)	Z-score (decorrelated)	α	λ_m(h^− 1)	$\:\stackrel{-}{\wedge\:}$
Bayamo	0.947	3.342	0.066	0.396	2.509	1.584	1.286	0.647
Cabo Cruz	0.849	2.806	0.056	0.933	3.770	1.656	0.802	0.490
Campechuela	0.581	2.472	0.043	2.542	4.802	1.675	1.353	0.518
Cauto	0.947	3.347	0.067	0.397	2.467	1.585	1.273	0.512
Guisa	0.612	2.596	0.048	6.273	6.527	1.674	1.193	0.599
Jiguani	1.163	4.779	0.133	1.828	2.673	1.227	1.444	0.669
Media Luna	0.848	2.805	0.056	0.934	5.668	1.654	0.789	0.489
Mota	0.619	2.745	0.044	5.154	9.114	1.704	1.037	0.552
Rio Cauto	0.947	3.344	0.067	0.399	2.543	1.585	1.276	0.643
Veguita	1.251	4.975	0.076	0.534	2.761	1.584	1.363	0.694
UoG	0.950	3.266	0.101	0.045	3.118	1.381	1.35	0.647

In Table 3 we present the skewness, kurtosis and results of the Kolmogorov-Smirnov test of normality (KSd) for the decorrelated time series at each station. The skewness and kurtosis values indicate that the distributions are heavy tailed to the right and asymmetrical. In all cases, the Kolmogorov-Smirnov distance was greater than the critical value (KSd_crit=0.0055, at p < 0.01) indicating that the difference between the observed and the hypothesized Normal distribution is statistically significant. However previous studies showed that the Weibull PDF provided the best goodness-of-fit of original, untransformed, data (see e.g., Wais (2017) and Perez-Albornoz et al. (2023). Figure 3 (a-b) depicts the autocorrelation function (ACF) for the Bayamo station (Fig. 3a) and the corresponding partial autocorrelation function (PACF) for the same location (Fig. 3b). The ACF decays approximately according to a Pareto-type function of the lag L, while a pronounced periodic cycle with a 24-hour period is also evident. On the other hand the PACF shows alternating negative, positive and zero partial correlation coefficients indicating that the serial correlation has been effectively removed from the original signals. Figure 4 illustrates the original and the decorrelated time series at the Bayamo station after applying Eq. (1), while Table 3 summarizes the results of the time reversal nonlinearity test for both cases. For the original series, eight out of eleven Z-scores were smaller than the critical value of 2.021, while for the decorrelated data all Z-scores exceeded this threshold. According to the selected confidence level, these results indicate that the original wind speed records can generally be regarded as realizations of a linear Gaussian process, with the exception of the Campechuela, Guisa and Mota stations. In contrast, the decorrelation procedure reveals an underlying nonlinear structure in the wind speed data. These findings provide an essential basis for the subsequent application of nonlinear analysis methods.

Fig. 3

Auto-correlogram of wind speed time series corresponding to Bayamo site: a) original data and b) decorrelated data after removing the lag-one AR(1) component (von Storch 1995 method)

Table 3 also presents the Lévy index (α) (Eq. 2), maximal Lyapunov exponent (λ_m) (Eq. 6) and determinism parameter (

$\:\stackrel{-}{\wedge\:})\:\:\left(\text{E}\text{q}.9\right)\:$

calculated for the decorrelated time series. The Lévy-stable index ranged from α = 1.227 at the Jiguani station to α = 1.704 at the Mota station. This suggests that the Lévy-stable distribution provides a better description of the decorrelated wind speed data than the Gaussian distribution. Figure 5 illustrates this result for Bayamo station, where the density functions of the original (α = 2.00) and decorrelated (α = 1.584) series appear approximately similar without transformation (Fig. 5a), but clear differences emerge after applying a log–log transformation (Fig. 5b). For consistency, all wind speed data were standardized to zero mean and unit variance prior to this analysis. Figure 6 presents the Lyapunov spectrum, where a clear separation is observed between the maximal Lyapunov exponent for the decorrelated data (λ_m = 1.286) and the average of 40 surrogate series (λ_m = 1.371) at the Bayamo station. The attractor reconstruction in three dimensions (x(t), x(t-6), x(t-12)) shown in Fig. 7 further highlights that the decorrelation procedure reveals the deterministic component embedded in the data. From Table 3, the average determinism parameter

$\:\:\stackrel{-}{{\Lambda\:}}=$

0.587 for the decorrelated series, compared to

$\:\:\stackrel{-}{{\Lambda\:}}=$

0.318 for the original data, reinforces the presence of deterministic structure.

Fig. 6

Spectra of Lyapunov exponents for the original data (Bayamo site) and the average of 40 time series of surrogates

Fig. 7

The attractor reconstruction for Bayamo site using time delays τ = 6 and τ = 12 hours

Figure 8 shows an example of the goodness-of-fit of Eq. (16) for Bayamo station, while Table 4 lists the resulting parameters. The determination coefficients were consistently high (r² > 0.950), indicating that the DFA model captures the data dynamics well across the full range of time scales, s. In this model, the coefficient C represents the (R/S) value at the data sampling resolution (s = 1 hour). Finally, Fig. 9 illustrates the multiscale entropy (MSE) profiles. For the original series (Fig. 9a), MSE(m) increased from m = 1 (whole time series) to m = 8 and remained nearly constant thereafter. In contrast, for the decorrelated series (Fig. 9b), MSE(m) reached a maximum at m = 4 and decreased with larger scale factors. According to Nogueira [25] this means high irregularities at all the selected time scales in the first case (Fig. 9a) and more repetitive patterns as the time scale increases in the second case (Fig. 9b).

Fig. 8

Illustration of the goodness-of-fit of Eq. (16) to Bayamo site time series

Table 4

Hurst exponent statistics for the decorrelated time series
Location		H	logC	r²
	Bayamo Cabo Cruz Campechuela Cauto Guisa	0.816(0.02) 0.761(0.03) 0.7870.03) 0.816(0.03) 0.797(0.03)	-1.188(0.07) -1.106(0.09) -1.131(0.08) -1.153(0.07) -1.357(0.08)	0.990 0.982 0.985 0.990 0.987
	Jiguani	0.842(0.02)	-1.099(0.07)	0.991
	Media Luna	0.761(0.03)	-1.225(0.09)	0.982
	Mota	0.787(0.03)	-1.209(0.08)	0.986
	Rio Cauto	0.816(0.04)	-1.281(0.10)	0.980
	UoG^*	0.828(0.02)	-1.314(0.06)	0.991
	Veguita	0.839(0.02)	-1.125(0.07)	0.991
^* UoG is the University of Granma location

Fig. 9

a) MSE(m) of the original time series and the average of 40 surrogates for Bayamo site and b) MSE(m) of the decorrelated data for the eleven study sites

4 Discussion

4. 1 Autocorrelation, Lévy noise and nonlinear structure of observed wind speed data

Although the ACF (Fig. 3a) and PACF (Fig. 3b) are standard tools to construct ARMA(p,q) models, our primary interest was to examine the correlations between different lags in greater detail. The ACF for the first 200 lags decays with the time lag L approximately as L^−γ with an average exponent γ = 0.464 ± 0.012. From this, a mean Hurst exponent of H = 0.768 ± 0.006 (H = 1-γ/2) can be derived. However, as noted by Couillard and Davison (2005), such estimates should be interpreted cautiosly. For example a preliminary DFA analysis yielded a mean Hurst exponent of H = 0.989 ± 0.002, which probably reflects an artificial magnification of long-range correlations caused by strong serial autocorrelation within the first 200 lags. On the contrary, the Lévy-stable index calculated from Eq. (2) for the eleven original series was α = 2.00, which corresponds to H = 0.5 (H = 1/α), consistent with a Gaussian process. The PACF highlights two key features: (i) alternating positive and negative correlations between periodic peaks, independent of intermediate lags, and (ii) near-zero correlation at more distant lags. Similar behavior was previously reported by Cadenas et al. (2019) in hourly wind speed data collected from La Venta, Oaxaca, Mexico. The decorrelation process transformed the original wind speed time series from approximately symmetrical and Gaussian (Table 2),with Lévy index α = 2.00 (Eqs. 2–3), into highly skewed, non-Gaussian and leptokurtic distributions (Table 3).

Figure 4 illustrates both distributions for the Bayamo station. Although not conclusive, this result indicates a potentially more complex structure underlying wind speed variations. It is worth recalling that the original and decorrelated time series differ only in their shapes, as shown in Fig. 5(a). The contrasting behavior between the original and decorrelated data is more evident in Fig. 5(b) (Bayamo station), presented on log-log axes. Interestingly, Fig. 5(b) closely resembles the probability of exceedance of wind speed increments reported by Jose et al. (2024). For instance, the sub-Gaussian long-tail distribution observed in Fig. 5(a-b) with α = 1.58 suggests that extreme wind speed values occur with higher probability than predicted by a Gaussian model, commonly interpreted as intermittency Peinke et al. (1993). Nonetheless, it is important to note that the detected non-Gaussianity in the decorrelated series may also reflect underlying linear or nonlinear processes, which warrents further investigation and data analysis. The nonlinearity test based on the time reversibility criterion

$\:\:{\text{{\rm\:T}}}^{rev}\left(\tau\:\right)$

was computed from the observed data with time delay τ = 18 together with 40 surrogate time series. At the 95% confidence level, eight out of eleven original wind speed time series could be considered as realizations of a linear Gaussian process (Table 3). By contrast, the transformed (decorrelated) data with τ = 6 indicate an underlying non-Gaussian (Lévy-stable) and nonlinear structure of hourly wind speed variations, as all Z-score exceeded the critical threshold (Table 3). Furthermore, Table 3 also shows that all Lévy-stable exponents fall within 1 < α < 2 with an average of α = 1.574 ± 0.14, which is close to the Holtsmark distribution (α = 1.5). It is important to emphasize that non-Gaussianity does not necessarily imply nonlinearity, since many linear natural variables exhibit non-Gaussian distributions (Hinich et al. 2005). In summary, the combined non-Gaussianity and nonlinearity tests provide an important foundation for subsequent nonlinear analyses of wind speed fluctuations.

4.2 Nonlinear invariants

The spectra of local Lyapunov exponents converge to a global exponent (λ_m > 1) for nine out of eleven sites (Table 3), while λ_m < 1 for the wind speed series collected in coastal areas (Cabo Cruz and Media Luna sites). This is illustrated in Fig. 6 for the Bayamo station. It should be noted that the average spectrum of Lyapunov exponents computed from 40 surrogate series did not reproduce the observed spectrum, further supporting the presence of nonlinear dynamics. For stations with λ_m > 1, the traditional Lyapunov time (1/λ_m) is shorter than the one-hour sampling interval, while stations with λ_m < 1, the Lyapunov time exceed one hour.

It is typically assumed that the dynamics of chaotic systems is unpredictable beyond the Lyapunov time (1/λ_m). However, as pointed out by Boffetta et al. (1998), this interpretation is not always valid. Instead, 1/λ_m may be better understood as the characteristic time for a perturbation to amplify by a fixed factor. Similarly, Crisanti et al. (1993) argued that this predictive time scale holds only in the absence of intermittency. This interpretation is consistent with our results, since we identified α-stable distributions and intermittent behavior in the wind speed series. Thus, using the Lyapunov exponent solely as a predictive tool through 1/λ_m may yield unreliable results in the context of wind speed dynamics.

Despite this limitation, trajectories around the center of the attractor preserve the chaotic nature of wind speed fluctuations (Fig. 7), indicating that wind speed varies chaotically from hour to hour at high frequencies, thus constraining predictability even at short timescales. A similar phenomenon has been reported in daily rainfall time series Millán et al. (2011). Furthermore, the deterministic component of the wind speed time series, quantified by

$\:\stackrel{-}{{\Lambda\:}}$

, increased from an average of 0.318 for the original series to 0.587 after removing serial correlation (Table 3). A linear regression analysis revealed a significant positive correlation between λ_m and

$\:\stackrel{-}{{\Lambda\:}}\:$

(r = 0.718, p < 0.05), suggesting that higher levels of chaotic determinism correspond to larger Lyapunov exponents and, consequently, reduced predictability of future system states. Although our analysis is based on only eleven time series, this significant correlation aligns with the general behavior described by Aurell et al. (1997) in the context of turbulence.

4.3 The Hurst exponent and multiscale entropy of wind speed fluctuations

Equation (16) provided a reasonable fit to the data, as shown in Fig. (8) for the Bayamo station. No evidence of crossovers was detected, indicating that the determined Hurst exponent (Table 4) consistently characterizes the scaling behavior across all selected window sizes (s). All calculated values of H were greater than 0.76 (Table 4) suggesting, at a first sight, a strong long-range correlation or persistence in the wind speed dynamics since H > 0.5 is generally interpreted as persistence between present and future states. However, the H exponent contains deeper information about chaotic dynamics. For example, the high determination coefficient (r² > 0.980, Table 4) confirms that the scaling law (Eq. 15) holds over nearly four orders of magnitude in the independent variable (s_max=24,576 hours), which is sufficient to establish fractality (Malcai et al. 1997). Theoretically, this implies that H can be used to determine the fractal dimension of he attractor (D_f = 2 -H). Moreover, Rangarajan and Sant (1997) highlighted the predictive aspect of H, noting that the difference H − 0.5 relates to predictability. Those authors proposed a predictive index (PI = 2

$\:\left|{D}_{f}-1.5\right|=2\left|H-0.5\right|$

), which was successfully applied to time series of rainfall, temperature and atmospheric pressure. For wind speed, however, where chaotic laminar and turbulent regimes exist, predictability is more complex. In our analysis, we also found a significant positive linear relationship between λ_m (Table 3) and H (Table 4). Tarnopolski (2018) reported a similar relationship between λ_m and H for conservative maps, suggesting that for certain real-world chaotic systems, H may be better interpreted as a measure of system complexity rather than persistence alone. From this perspective, in nonlinear and chaotic fluids, such as air, predictability should be regarded as a joint function of both H and λ_m. This insight could have practical implications for the management of wind-energy technologies (e.g., wind turbines and windmills), particularly since λ_m follows a power-law dependence on the Reynolds number i.e., λ_m∼Re^ϖ where the exponent ϖ depends on the Hölder exponent of the multifractal distribution (Crisanti et al. 1993).

The MSE has been suggested as a potential indicator of predictability in many scientific fields (see e.g., Courtiol et al. 2016). For the original wind speed time series, the results do not convey relevant information, as shown in Fig. 9a for the Bayamo station. Note that MSE(m) for the surrogates matched almost exactly that of the original data, supports, to some extent, the assumption of a Gaussian structure in the original series. Moreover, MSE(m) increased from 0.833 a.u (m = 1) to 1.974 a.u (m = 8) and then remained nearly constant for m > 8, resembling the behavior of a constant-variance process. Based on Eq. (18), this implies that the conditional probability of correlation between different time windows (m) decreases from

$\:\frac{{{\Psi\:}}^{m+1}\left(\epsilon\:\right)}{{{\Omega\:}}^{m}\left(\epsilon\:\right)}=0.146\:$

for m=1 to

$\:\frac{{{\Psi\:}}^{m+1}\left(\epsilon\:\right)}{{{\Omega\:}}^{m}\left(\epsilon\:\right)}=0.01\:$

for m=8. By contrast, the transformed wind speed time series exhibited markedly different behavior (Fig. 9b). In this case, the conditional probability term decreased from

$\:0.018\:$

(m = 1) to

$\:0.008\:$

(m=4), but subsequently increased, reaching a maximum value of

$\:\:0.520\:$

(m=100). This decreasing trend of MSE(m) was well described by a power law i.e., MSE(m) ∝ m^−η with an average η = 0.578 ± 0.02 and r²>0.980 in all cases. For the original wind speed data (Fig. 9a), however, we found an average η = -0.018 ± 0.003 with a much smaller determination coefficient (r²<0.305). This near-zero η value is consistent with the Gaussian white-noise behavior (Malamud and Turcotte 1999) of the original series. Two observations are worth noting here. First, the power law MSE(m) ∝ m^−η is analogous to that discussed by Han et al. (2020) in their analysis of power spectra of physiological time series. Second, the theoretical approach based on the variance of each coarse-grained interval for computing MSE(m) closely parallels that used to determine the Hurst-Kolmogorov exponent, H_K. In fact, our multiple regression analysis revealed a significant empirical relationship between H, λ_m, η and

$\:\stackrel{-}{{\Lambda\:}}$

as follows:

$\:H=0.407\left(0.07\right)+0.409\left(0.13\right)\eta\:+0.176\left(0.04\right)\stackrel{-}{{\Lambda\:}}+0.047\left(0.01\right){\lambda\:}_{m}\:(r=0.972,\:p<0.01)$

All three independent variables contributed significantly to Eq. (19) with partial correlation coefficients of r = 0.751 (p-value = 0.019) for η, r = 0.822 (p-value = 0.006) for

$\:\stackrel{-}{{\Lambda\:}}\:$

and r = 0.716 (p-value = 0.029) for λ_m. Given the relatively narrow range of variation in the estimated parameters, Eq. (19) requires further confirmation through simulations and additional real-world time series, a study that is currently in progress.

Overall, MSE(m) not only reflects the scaling of the information content in wind speed variations but also captures an underlying Hurst-Kolmogorov process.

4.4 Implications for wind engineering

All the information derived from nonlinear parameters of wind speed fluctuations can be of potential relevance for the siting, installation and operation of wind-based technologies. These include wind turbines for electricity generation as well as windmills for water pumping, among others. This aspect is particularly important in the context of the present study, which is part of a sustainable agricultural project. Many windmill models are specially designed to operate at low wind speeds (laminar regime) and high torque, enabled by a multi-blade rotor arrangement (Odesola et al. 2017). Since kinetic energy scales with the square of speed, v², and generated power scales with v³ (Betz law), the fluctuation dynamics of wind speed may strongly influence the efficiency of wind-based technologies. For a thin flat plate, the transition from laminar to turbulent flow occurs at a critical Reynolds number of Re_cr=320000, which corresponds to a sharp increase in boundary layer thickness (Çengel et al. 2018). Assuming a characteristic chord length of 0.320 m for the blade, along with typical values of air density (= 1.293 kg/m³) and kinematic viscosity (= 1.46x10^− 5 m²/s), the critical wind speed is estimated as v_cr=13.4 m/s (= 48 km/h). In the tropical study sites, wind speed usually exceed this threshold during severe local storms, gusts, tornadoes and hurricanes, particularly from June to November. For example, on September 8th, 2017, all investigated sites experienced wind speeds above 50 km/h for 8–12 consecutive hours due to the passage of Category 5 Hurricane Irma through the northeastern region of the country. Such extreme values are not well captured by traditional PDFs (see Fig. 2a), while the Lévy distribution provides a more plausible description (see Fig. 5b).

Crisanti et al. (1993) discussed the relationship between λ_m, Re and the Hölder exponent (h) compared to the Hurst exponent H (Sankaran et al. 2021). Within this framework, the interaction between free wind and turbine or widmill blades introduces an additional layer of complexity when employing λ_m as a predictive parameter. For example, at v = 18.1 m/s (Cauto site, september 8th, 2017), the Reynolds number was Re = 512949 > Re_crit (characteristic length = 0.320 m). This implies λ_m = 3.788/T, far from the traditional hypothesis of λ_m = 1/T under no intermittent conditions. The nonlinear and intermittent effects on windmill performance were partly explored by Durán Medina et al. (2015) who reported that mechanical power output did not follow the cubic scaling expected from the Betz law. Instead, they found P∼v^3.31 for wind speeds between 5 and 10 m/s.

From this standpoint, wind speed should not be considered solely as an atmospheric variable, but also as what might be termed a technological variable. In the atmospheric case, a global Lyapunov exponent can be estimated from free-air time series without explicitly accounting for the Reynolds number. In the technological case, however, one should also consider the theoretical framework discussed in Crisanti et al. (1993) and related studies. A promising application would be the use of nonlinear parameters to develop wind energy potential maps at different spatial resolutions.

5 Conclusions

We analyzed long hourly wind speed time series from eleven tropical sites as part of a sustainable agriculture project aimed at siting, installing, and operating multi-blade windmills for water pumping. Our framework relied on the statistical characterization of nonlinear and potentially deterministic chaotic systems. The main findings are summarized as follows:

At first glance, the original wind speed series appeared to exhibit long-range correlations based on their autocorrelation functions. However, additional tests indicated that they behave as Gaussian noise processes, implying that the computed Hurst exponents may be artifacts. The underlying dynamics emerged only after removing serial autocorrelation from the data.

All series followed heavy-tailed distributions with Lévy-stable index 1 < α < 2. This provides initial evidence of intermittency and indicates a higher probability of extreme wind speeds compared to the Gaussian PDF.

The time reversibility test revealed, at the 95% confidence level, that wind speed fluctuations were nonlinear, an essential property justifying the use of nonlinear analysis.

The pre-whitened time series contained a deterministic component greater than 48% and yielded global Lyapunov exponents λ_m > 0. These results suggest a mix of deterministic and stochastic chaos in the wind speed dynamics.

A multiple regression analysis indicated a significant empirical relationship between H, λ_m, η and

$\:\stackrel{-}{{\Lambda\:}}$

. This requires further validation through numerical simulations and additional observational data.

Overall, nonlinear small- and large-scale fluctuations, intermittency and probability of extreme wind speed events introduce uncertainties that should be better understood. A key challenge is to integrate all information provided by the nonlinear invariants into practical models for wind technology applications including the development of wind power potential maps at different spatial and temporal resolutions.

Acknowledgments

B.G. acknowledges financial support from the University of Texas at Arlington through a faculty start-up fund and the STARs award.

Statements and Declarations

Funding Information

HM, RC, RA and AA were supported by the Mechanical Enterprise-Bayamo (EMBA), Granma (Cuba) [Project Number PNAP: NA131 GR]. B.G. acknowledges financial support from the University of Texas at Arlington through a faculty start-up fund and the STARs award.

Competing Interests

The authors declare that they have no conflict of interest to disc lose

Author Contribution

Conceptualization: [Humberto Millán], [Ramiro Cumbrera], [Behzad Ghanbarian], [Rene Arias], Methodology: [Humberto Millán], [Ramiro Cumbrera], [Behzad Ghanbarian], Formal analysis and investigation: [Humberto Millán], [Ramiro Cumbrera], [Rene Arias], [Riccardo Biondi], [Abdel Acosta], [Aziz Benhamrouche], Writing-original draft preparation: [Humberto Millán], Writing-review and editing: [Behzad Ghanbarian], [Riccardo Biondi], [Aziz Benhamrouche], [Ramiro Cumbrera], Resources: [Abdel Acosta], [Rene Arias], Supervision: [Aziz Benhamrouche], [Riccardo Biondi], [Humberto Millán]

Data Availability

The original data sets used in the present study are available from the National Aeronautics and Space Administration NASA POWER Project. NASA Earth Science/Applied Science Program. https://power.larc.nasa.gov/data-access-viewer. All the data generated during the investigation which support the results are freely available under request to the corresponding author.

References

Abarbanel HD, Brown R, Sidorowich JJ, Tsimring LSH (1993) The analysis of observed chaotic data in physical systems. Rev Mod Phys 65(4):1331–1392

Abbasi S, Rezaei M, Mina M, Sameni A, Oleszczuk P, Turner A, Ritsema C (2023) Entrainment and horizontal atmospheric transport of microplastics from soil. Chemosphere 322:138150

Alonso Díaz Y, Bezanilla A, Roque A, Centella A, Borrajero I, Martinez Y (2019) Wind resource assessment of Cuba in future climate scenarios. Wind Eng 43(3):311–326

Aurell E, Boffetta G, Crisanti A, Paladin G, Vulpiani A (1997) Predictability in the large: an extension of the concept of Lyapunov exponent. J Phys A: Math Gen 30:1–26

Balkissoon S, Fox N, Lupo A, Ellen Haupt S, Charles Li Y, Market P, Walsh S (2021) Determining chaotic characteristics and forecasting tall tower wind speeds in missouri using empirical dynamical modeling (edm). Renew Energy 170:1292–1307

Baranitharan B, Sivakumar D, Perarul Selvan S (2024) Wind speed and direction on predicting wind energy availability for a sustainable ecosystem. Global J Environ Sci Manage 10(4):1–20

Benson DA, Schumer R, Meerschaert MM, Wheatcraft SW (2001) Fractional dispersion, Lévy motion, and the MADE tracer tests. Transp Porous Med 42:211–240

Boffetta G, Giuliana P, Paladin G, Vulpiani A (1998) An Extension of the Lyapunov Analysis for the Predictability Problem. J Atmos Sci 55:3409–3416

Cadenas E, Campos-Amezcua R, Rivera W, Espinosa-Medina MA, Méndez-Gordillo AR, Rangel E, Tena J (2019) Wind speed variability study based on the Hurst coefficient and fractal dimensional analysis. Energy Sci Eng 7:361–378

Calif R, Schmitt FG (2014) Multiscaling and joint multiscaling description of the atmospheric wind speed and the aggregate power output from a wind farm. Nonlin Processes Geophys 21:379–392

Çengel YA, Cimbala JM (2018) Fluid Mechanics: Fundamentals and Applications. McGraw-Hill Education, New York

Chadee XT, Clarke RM (2014) Large-scale wind energy potential of the Caribbean region using near-surface reanalysis data. Renew Sustain Energy Rev 30:45–58

Chandler WS, Whitlock CH, Stackhouse PW (2005) Determining wind resources as a function of surface roughness and height from NASA Global Assimilation Analysis. Proc Int Solar Energy Soc. Solar World Congress, August 6–12, Orlando, FL

Constantine W, Percival D (2017) fractal: A Fractal Time Series Modeling and Analysis Package. R package version 2.0–4, December 21. https://CRAN.R-project.org/package=fractal

Costa M, Goldberger AL, Peng C-K (2005) Multiscale entropy analysis of biological signals. Phys Rev E 71:021906

Couillard M, Davison M (2005) A comment on measuring the Hurst exponent of financial time series. Phys A 348:404–418

Courtiol J, Perdikis D, Petkoski S, Müller V, Huys R, Sleimen-Malkoun R, Jirsa VK (2016) The multiscale entropy: Guidelines for use and interpretation in brain signal analysis. J Neurosci Methods 273:175–190

Crisanti A, Jensen MH, Vulpiani A, Paladin G (1993) Intermittency and predictability in turbulence. Phys Rev Lett 70(2):166–169

Drisya GV, Thumba A (2017) Variation of Chaotic Behaviour of Wind Speed Oscillations Across Indian Subcontinent. Int J Eng Sci Technol 9(5):426–440

Duan S, Song W, Cattani C, Yasen Y, Liu H (2020) Fractional Lévy stable and maximum Lyapunov exponent for wind speed prediction. Symmetry 12:605

Durán Medina O, Schmitt FG, Calif R (2015) Multiscale analysis of wind velocity, power output and rotation of a windmill. Energy Procedia 76:193–199

Fluck E, Raveh-Rubin S (2023) Dry air intrusions link rossby wave breaking to large-scale dust storms in northwest africa: Four extreme cases. Atmos Res 286:106663

Girma M, Molina M, Assefa A (2015) Feasibility study of a wind powered water pumping system for rural Ethiopia. AIMS Energy 3(4):851–868

Han W, Zhang Z, Tang C, Yan Y, Luo E, Xie K (2020) Power-Law Exponent Modulated Multiscale Entropy: A Complexity Measure Applied to Physiologic Time Series. IEEE Access 8:112725–112734

Hegger R, Kantz H, Schreiber T (1999) Practical implementation of nonlinear time series methods: the TISEAN package. Chaos 9:413–440

Hinich MJ, Mendes EM, Stone L (2005) Detecting Nonlinearity in Time Series: Surrogate and Bootstrap Approaches. Stud Nonlinear Dyn Econom 9(4):3

IEA Wind (2013) Long-Term Research and Development Needs for Wind Energy for the Time Frame 2012 to 2030. International Energy Agency, Paris. www.ieawind.org/annual_reports_PDF. Accessed 18 May 2024

Jemal T, Shimels S, Ali Y, Olawale Fatoba S (2023) Impact of Turbulent Flow on H-Type Vertical Axis Wind Turbine Efficiency: An Experimental and Numerical Study. Int J Heat Technol 41(6):1513–1520

Jose J, Gires A, Roustan Y, Schnorenberger E, Tchiguirinskaia I, Schertzer D (2024) Multifractal analysis of wind turbine power and rainfall from an operational wind farm – Part 1: Wind turbine power and the associated biases. Nonlin Processes Geophys 31:587–602

Jund C, Schindler D (2019) Wind speed distribution selection-A review of recent development and progress. Renew Sustain Energy Rev 114:109290

Kantelhardt JW, Zschiegner SA, Koscielny-Bunde E, Bunde A, Havlin S, Stanley HE (2002) Multifractal detrended fluctuation analysis of nonstationary time series. Phys A 316(1–4):87–114

Kaplan DT, Glass L (1992) Direct test for determinism in a time series. Phys Rev Lett 68(4):427–430

Lackoóvá L, Lieskovskỳ J, Nikseresht F et al (2023) Unlocking the potential of remote sensing in wind erosion studies: a review and outlook for future directions. Remote Sens 15(13):3316

Li BB, Yuan ZF (2008) Non-linear and chaos characteristics of heart sound time series. Proc. IMechE. Part H: J Eng Med 222:265–272

Liang Z, Liang J, Zhang L, Wang C, Yun Z, Zhang X (2015) Analysis of multi-scale chaotic characteristics of wind power based on Hilbert–Huang transform and Hurst analysis. Appl Energy 159:51–61

Lin L, Xia D, Dai L, Zheng Q, Qin Z (2021) Chaotic analysis and prediction of wind speed based on wavelet decomposition. Processes 9:1793

López-Manrique LM, Macias-Melo EV, May Tzuc O, Bassam A, Aguilar-Castro KM, Hernández-Pérez I (2018) Assessment of resource and forecast modeling of wind speed through an evolutionary programming approach for the north of tehuantepec isthmus (Cuauhtemotzin, Mexico). Energies 11(11):3197

Lucio HJ, Valdés R, Rodríguez LR (2012) Improvements to surrogate data methods for nonstationary time series. Phys Rev E 85:056202

Malamud BD, Turcotte DL (1999) Self-affine time series: I. Generation and Analyses. Adv Geophys 40:1–90

Malcai O, Lidar DA, Biham O (1997) Scaling range and cutoffs in empirical fractals. Phys Rev E 56(3):2817–2828

Millán H, Kalausi A, Cukic M, Biondi R (2010) Nonlinear dynamics of meteorological variables: multifractality and chaotic invariants in daily records from Pastaza, Ecuador. Theor Appl Climatol 102:75–85

Millán H, Rodríguez J, Ghanbarian-Alavijeh B, Biondi R, Llerena G (2011) Temporal complexity of daily precipitation records from different atmospheric environments: Chaotic and Lévy stable parameters. Atmos Res 101:879–892

Millán H, Tarquís AM, Cumbrera R, Ghanbarian B, Arias R, Rodríguez Y, Acosta A (2025) Multiscale multifractal assessment of sub-monthly hydrometeorological flash events in a tropical climate. Theor Appl Climatol 156:187

NASA (2023) National Aeronautics and Space Administration. Power Data Access Viewer: Prediction of Worldwide Energy Resource. https://power.larc.nasa.gov/data-access-viewer/

Nogueira M (2017) Exploring the link between multiscale entropy and fractal scaling behavior in near-surface wind. PLoS ONE 12(3):e0173994

Nolan JP (1997) Numerical computation of stable densities and distribution functions. Stoch Model 13:759–774

Odesola IF, Adinoyi LG (2017) Development of wind powered water pump. Int J Eng Sci Comp 7(4):10341–10345

Ogunjo S (2025) Wind energy characterization using multifractal formalism at two diferent altitudes in a tropical country. Model Earth Syst Env 11:63

Paul SS, Oyedepo SO, Adaramola MS (2012) Economic assessment of water pumping systems using wind energy conversion systems in the southern part of Nigeria. Energy Explor Exploit 30(1):1–18

Peel MC, Finlayson BH, McMahon TA (2007) Updated world map of the Köppen-Geiger climate classification. Hydrol Earth Sys Sci 11:1633–1644

Peinke J, Klein M, Kittel K, Okninsky A, Parisi J, Roessler OE (1993) On chaos, fractals and turbulence. Phys Scr 49:672–676

Peng CK, Buldyrev SV, Havlin S, Simons M, Stanley HE, Goldberger AL (1994) Mosaic organization of DNA nucleotides. Phys Rev E 49(2):1685–1689

Perc M (2005) User-friendly Programs for Nonlinear Time Series Analysis. version 5. http://fizika.uni-mb.si/~matjaz/ejp/time.html

Pérez-Albornoz C, Hernández-Gómez A, Ramirez V, Guilbert D (2023) Forecast optimization of wind speed in the North Coast of the Yucatan Peninsula, using the single and double exponential method. Clean Technol 5(2):744–765

Rangarajan G, Sant DA (1997) A climate predictability index and its applications. Geophys Res Lett 24:1239–1242

Sadiah MA, Aljeddani SMA, Mohammed MA (2023) A novel approach to Weibull distribution for the assessment of wind energy speed. Alex Eng J 78:56–64

Samorodnitsky G, Taqqu MS (1994) Stable non-Gaussian random processes, Stochastic Models with Infinite Variance. Chapman and Hall, London

Sanchez MA, Trinidad JE, García J, Fernández M (2015) The Effect of the Underlying Distribution in Hurst Exponent Estimation. PLoS ONE 10(5):0127824

Sankaran A, Chavan SR, Ali M, Sindhu AD, Dharan DS, Khan MI (2021) Spatiotemporal variability of multifractal properties of fine resolution daily gridded rainfall fields over India. Nat Hazards 106:1951–1979

Schreiber T (1999) Interdisciplinary application of nonlinear time series methods. Phys Rep 308:1–64

Schreiber Th, Schmitz A (1996) Improved surrogate data for nonlinearity tests. Phys Rev Lett 77:635–640

Schreiber Th, Schmitz A (2000) Surrogate time series. Phys D 142:346–382

Shi H, Dong Z, Xiao N, Hang Q (2021) Wind speed distributions used in wind energy assessment: a review. Front Energy Res 9:769920

Shu P, Chan PW (2025) Application of fractal analysis on wind speed time series: A review. Adv Wind Eng 2:100028

Sumaili Ndeba B, El Alani O, Ghennioui A, Benzaazoua M (2025) Comparative analysis of seasonal wind power using Weibull, Rayleigh and Champernowne distributions. Sci Rep 15:2533

Swisher R, De Azua CR, Clendenin J (2001) Strong winds on the horizon: wind power comes of age. Proc IEEE 89(12):1757–1764

Takens F (1981) Detecting Strange Attractors in Turbulence. Lecture Notes in Mathematics 898. Springer, New York

Tarnopolski M (2018) Correlation between the Hurst exponent and the maximal Lyapunov exponent: Examining some low-dimensional conservative maps. Phys A 490:834–844

Theiler J, Eubank S, Longtin A, Galdrikian B, Farmer JD (1992) Testing for nonlinearity in time series: The method of surrogate data. Phys D 58:77–94

von Storch H (1995) Misuses of statistical analysis in climate research. In: von Storch H, Navarra A (eds) Analysis of Climate Variability: Applications of Statistical Techniques. Springer-, Berlin, pp 11–26

Wais P (2017) Two and three-parameter Weibull distribution in available wind power analysis. Renew Energy 103:15–29. https:/doi.org/10.1016/j.renene.2016.10.041

Zhan C, Wei R, Zhao L, Chen S, Shen C (2025) Optimal distribution modeling and multifractal analysis of wind speed in the complex terrain of Sichuan Province, China. Sci Rep 15:4648

Zhang XD, Wang D, Zhang Z, Dong X (2023) CGManalyzer: Continuous Glucose Monitoring Data Analyzer. R package version 1.3.1, July 11. https://CRAN.R-project.org/package=CGManalyzer

Figure 1

Figure 2

Figure 3

Figure 4

Figure 5

Figure 6

Figure 7

Figure 8

Figure 9

Yes

Abstract

Wind energy is a rapidly expanding renewable resource that plays a pivotal role in the global transition from fossil fuels to sustainable and low-carbon energy systems. However, its potential application requires understanding its intricate dynamics.. The objectives of the present work are to i) evaluate empirically the hypothesis of nonlinearity, determinism and chaotic behavior of wind speed time series and ii) investigate statistical relationships between different nonlinear parameters. Hourly wind speed time series consisting of 87648 data points were collected from eleven meteorological stations in Granma Province, Cuba, from January/2014 to December/2023. We calculated the Lévy-stable index (α) for the original and pre-whitened time series. The nonlinearity parameter, determinism test (Λ), global Lyapunov exponent (m), Hurst exponent (H) and multiscale entropy (MSE) were also computed for each series series using, respectively, the time reversibility, delay, detrended fluctuation analysis and multiscale entropy methods. These parameters were compared with those computed from 40 surrogate time series generated at each station. We found that the original wind speed time series could be classified as Gaussian noises with Lévy index α=2.00, lower deterministic component ∧=0.318±0.023, time reversibility Z-score 2.021 and approximately constant MSE. The decorrelated data fitted sub-Gaussian distributions with the α exponent in the range 1.227 α 1.704. A multiple regression analysis found a significant empirical link between H, m, the exponent of the MSE function () and Λ with the correlation coefficient of r=0.972. These findings could be useful for the siting, operation and optimization of wind energy-based technologies.