Incremental Nonlinear Stability Analysis of Stochastic Systems Perturbed by L\'{e}vy Noise

We present a theoretical framework for characterizing incremental stability of nonlinear stochastic systems perturbed by compound Poisson shot noise and finite-measure L\'{e}vy noise. For each noise type, we compare trajectories of the perturbed system with distinct noise sample paths against trajectories of the nominal, unperturbed system. We show that for a finite number of jumps arising from the noise process, the mean-squared error between the trajectories converges exponentially towards a bounded error ball across a finite interval of time under practical boundedness assumptions. The convergence rate for shot noise systems is the same as that of the exponentially-stable nominal system, but with a tradeoff between the parameters of the shot noise process and the size of the error ball. The convergence rate and the error ball for the L\'{e}vy noise system are shown to be nearly direct sums of the respective quantities for the shot and white noise systems separately, a result which is analogous to the L\'{e}vy-Khintchine theorem. We demonstrate our results using several numerical case studies.


Motivation and Related Work
arXiv:2103.13338v3 [eess.SY] 10 Jun 2022

Model-based controllers and observers for robotics applications are typically designed for robustness against additive Gaussian white noise. Gaussian white noise systems are appealing to study because they exhibit properties (e.g., independent increments, the Central Limit Theorem) which make them easier and more convenient to analyze than non-Gaussian stochastic systems. The choice to study Gaussian white noise processes is also justifiable in practice; vision-based localization and mapping 1 , spacecraft navigation 2 , and motion-planning 3 are notable examples of robotics applications which employ the Gaussian white noise model. Consequently, there has been a wealth of literature devoted to the study of Gaussian white noise systems, particularly in stability analysis, controller design, and observer design. For example, a model-based controller can be developed via the well-known Linear Quadratic Gaussian (LQG) approach 4 , and a model-based observer can be designed via Kalman filtering and its extensions 5,6 . More recent methods of model-based controller synthesis for Gaussian white noise systems include the path integral approach 7 , convex optimization-based approaches 8,9 , as well as a number of reinforcement learning-based approaches 10,11 . However, Gaussian white noise lacks the generality needed for non-Gaussian problems, e.g., fault diagnosis and fault-tolerant control 12 , filtering when observations are received unreliably 13 , and target-tracking with a nonlinear measurement equation 14 . In many cases, machine-learning-based methods 15,16,17 are the go-to approaches used to perform control or estimation in stochastic systems with non-Gaussian noise. But a downside of model-free methods is the large amount of time and training data needed to learn information that could be obtained using structured models.
On the other hand, training time and data can be reduced by expanding the set of model assumptions and considering a broader class of noise distributions prior to applying model-free techniques. For example, Cacace 2019 18 explores the possibility of developing more accurate estimators for linear non-Gaussian systems beyond the Kalman filter, which is optimal only for linear Gaussian systems or when the estimator is constrained to be affine.
However, because the class of non-Gaussian noise processes is diverse, verifying stability of non-Gaussian stochastic systems is an important challenge to address prior to considering observer or controller design; Battilotti 2019 19 highlights this even for the case of linear dynamics. For systems modeled as right-continuous, strong Markov processes, the seminal work of Kushner 1967 20 laid out the foundations of stochastic stability theory by considering the following question: can trajectories of the system, arising from different sample paths of noise and different initial conditions, be bounded in some region after a sufficiently elapsed time? The common Lyapunov approach addresses this question when the bounded region is an equilibrium point or a limit cycle. Alternative approaches to characterize stability for non-Gaussian stochastic systems have also been studied in the literature: asymptotic stability of systems driven by Lévy noise is developed in Applebaum 2009 21 , while exponential stability for systems perturbed by semimartingale processes is studied in Mao 1990 22 .
Incremental stability 23,24 considers the stronger condition in which multiple distinct solution trajectories converge globally exponentially towards each other. Applications of incremental stability arise in numerous settings such as cooperative control over multi-agent swarm systems 25 and phase synchronization in directed networks 26,27 . A recent tutorial paper on incremental stability and its connections to machine learning is Tsukamoto et al. 2021 28 . For trajectories of incrementally stable stochastic systems with distinct initial conditions and distinct sample paths of noise, global exponential convergence towards a reference or nominal trajectory is guaranteed to within a bounded error ball. Pham 2009 29 derived incremental stability conditions for stochastic systems perturbed by additive Gaussian white noise. Dani 2015 30 extends Pham 2009 to more general state-dependent contraction metrics, then considers the problem of observer design for Gaussian white noise systems. However, to the authors' knowledge, incremental stability has not been studied for stochastic systems driven by non-Gaussian noise of any distribution.
One especially prevalent distribution of non-Gaussian noise is impulsive shot noise 31 , which arises in real-world applications almost as frequently as Gaussian white noise does. Examples include neuronal spikes arising from brain activity in neuroscience 32 , large fluctuations in stock prices in economics 33 , and impulsive sensor measurement noise in spacecraft control 34 . In the field of robotics, impulsive noise can arise as large proprioceptive measurement errors 35 or disturbances due to obstacle collisions or wind turbulence 36 . Despite this prevalence, there are few, if any, model-based synthesis procedures for controllers or observers dedicated to robustness against impulsive shot noise. Because Gaussian white noise is usually small in magnitude and continuous in the sense that changes occur over a measurable duration of time, it cannot be used to model sudden impulsive perturbations like shot noise.
In the field of applied mathematics, there is an abundance of literature which provides useful tools and theoretical results for modeling the shot noise phenomenon. Just as the standard Brownian motion process is used to model various forms of Gaussian white noise, shot noise is modeled using Poisson processes, especially the compound Poisson process. Furthermore, like Brownian motion processes, Poisson processes have stationary and independent increments. In fact, both Brownian motion and Poisson processes are special cases of the more general Lévy processes, which share these properties. A particularly useful result is the Lévy-Khintchine Theorem, which describes Lévy processes as affine combinations of Brownian motion processes and compound Poisson processes 37 .

Our Contributions
This paper uses the theory of Poisson random measures and Lévy processes to develop incremental stability criteria for stochastic nonlinear systems perturbed by two types of non-Gaussian noise. The first type consists of shot noise processes represented as compound Poisson processes. The second consists of finite-measure Lévy processes constructed by taking the affine combination of Gaussian white and compound Poisson shot noise processes. To address stochastic incremental stability for each type of system, we use contraction theory to compare trajectories of the noise-perturbed system with distinct noise sample paths against trajectories of the nominal, unperturbed system, starting from different initial conditions. We show that when a finite number of jumps arise from the noise process over a fixed, finite interval of time, the mean-squared error between the trajectories converges exponentially to a bounded error ball under the assumption that certain parameters of the noise process and contraction metric are bounded. The convergence rate and the error ball for the Lévy noise system are shown to be nearly a direct summation of the respective quantities for the shot and white noise systems separately, in likeness to the statement of the Lévy-Khintchine Theorem. For concreteness, we specialize our results to linear time-varying (LTV) dynamics, which allows us to use an explicit form of contraction metric and provide additional insights into our general results. We qualitatively discuss how both 1) fixed, inherent parameters of the system (e.g., intensity and maximum norm of jumps in the noise process) and 2) design parameters (e.g., contraction metric) affect the tightness of the stability bounds. To illustrate our results, several empirical studies are performed for various systems: a 1D linear reference-tracking system, 2D linear time-varying systems, and a 2D nonlinear system.

Paper Organization
We begin in Section 2 by outlining notation and briefly reviewing background. In Section 3, we establish the form of shot and Lévy noise system dynamics used throughout the paper, and set up the incremental stability background needed for our main results. Sections 4 and 5 present our main results. In Section 4.1, we state and prove our two theorems, the Shot Contraction Theorem and the Lévy Contraction Theorem. Section 4.2 specializes the Shot Contraction Theorem to linear time-varying (LTV) nominal systems. In Section 5, we use numerical case studies to supplement the theory developed in Section 4. Section 5.1 demonstrates the Shot Contraction Theorem for a simple 2D nonautonomous, nonlinear system. Section 5.2 considers reference-tracking for a simple 1D linear shot noise system, and Section 5.3 extends this to two different 2D LTV systems. Finally, we conclude the paper in Section 6.

Terminology and Mathematical Notation
For the sake of simplifying terminology, we refer to additive Gaussian white, additive compound Poisson shot, and additive bounded-measure Lévy noise systems as simply white, shot, and Lévy noise systems, respectively. The understanding is that the shortened terminology throughout this paper does not refer to the general cases (e.g., non-Gaussian white noise, or Lévy noise whose measures have infinite total mass). We use the norm ‖⋅‖ ≜ ‖⋅‖₂ for vectors and the corresponding induced norm ‖A‖ ≜ sup_{x≠0} (‖Ax‖ ∕ ‖x‖) for matrices. We abuse the notation ‖⋅‖ to apply to both matrices and vectors where the context is clear. For any f which is a function of time, we denote the left-limit of f(t) at time t as f(t−) ≜ lim_{s↑t} f(s), where s < t. For any V ∶ ℝ × ℝⁿ → ℝ, a function of both a scalar parameter t ∈ ℝ and a vector x ∈ ℝⁿ, we write the partial derivatives in the following shorthand notation. First, ∂ₜV ≡ ∂ₜV(t, x) ≜ ∂V(t, x)∕∂t, and ∇ₓV ≡ ∇ₓV(t, x) denotes the gradient with respect to x. Moreover, we denote ∂²ᵢⱼV ≜ ∂²V(t, x)∕(∂xᵢ ∂xⱼ) as the double partial derivative of V with respect to two distinct components xᵢ and xⱼ of x, i ≠ j, and likewise ∂²ᵢᵢV ≜ ∂²V(t, x)∕(∂xᵢ²) for the derivative taken twice with respect to the same component xᵢ. When describing a dynamical system of the form ẋ(t) = f(t, x(t)), where x ∶ ℝ₊ → ℝⁿ and f ∶ ℝ₊ × ℝⁿ → ℝⁿ, we simplify the notation by writing x(t) without the argument t, as in ẋ = f(t, x), with the understanding that f is not taking the function x as input, but rather the vector x(t). We define the dot notation as the time-derivative, meaning ẋ(t) ≜ (d∕dt) x(t). Any function V(t, x(t)) with arguments time t ∈ ℝ₊ and state x(t) ∈ ℝⁿ, which evolves over time according to some dynamics of the form ẋ = f(t, x), has the time-derivative V̇ ≜ ∂ₜV + ∇ₓV ⋅ ẋ(t) = ∂ₜV + ∇ₓV ⋅ f(t, x).

Poisson and Lévy Processes
The general Poisson random measure is characterized by an intensity measure Leb × ν, where Leb denotes the standard Lebesgue measure over time and ν(⋅) is the probability measure over a space 𝒵 describing the distribution of jumps. Such a Poisson random measure is typically denoted N(dt, dz) over the space [0, T] × 𝒵. While most results in the theory of Poisson processes are written with respect to general Poisson random measures, our scope in this paper is specifically on the standard and compound Poisson processes, defined below.

Definition 1 (Standard and Compound Poisson Processes). Let 𝒵 ⊆ ℕ and T > 0. The standard Poisson process N(t) counts the number of jumps in 𝒵 that have occurred in the time interval [0, t] for t ≤ T. It is characterized by an intensity parameter λ > 0, which describes the average rate at which jumps occur in the process. The compound Poisson process ∑_{i=1}^{N(t)} W(τᵢ) is a simple generalization of the standard Poisson process to include weighted jumps, where τᵢ < t is the arrival time of the ith jump and W ∶ [0, T] → 𝒵 is a function describing the jump distribution over the space 𝒵. The intensity λ > 0 of a compound Poisson process is the same as that of its corresponding standard Poisson process N(t).
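As a concrete illustration of Definition 1, a compound Poisson sample path on [0, T] can be simulated by first drawing the jump count N(T), then the arrival times, then the jump weights. The following is a minimal sketch; the function name, the Gaussian jump distribution, and the uniform-arrival construction are our own illustrative choices, not notation from the paper.

```python
import numpy as np

def compound_poisson_path(lam, T, jump_sampler, rng):
    """Sample one path of a compound Poisson process on [0, T].

    Uses the fact that N(T) ~ Poisson(lam * T) and that, given N(T) = n,
    the n arrival times are i.i.d. uniform on [0, T].
    Returns (sorted arrival times, running sum of jumps at each arrival).
    """
    n = rng.poisson(lam * T)
    arrivals = np.sort(rng.uniform(0.0, T, n))
    jumps = np.array([jump_sampler(rng) for _ in range(n)])
    return arrivals, np.cumsum(jumps)

rng = np.random.default_rng(0)
arrivals, path = compound_poisson_path(5.0, 10.0, lambda r: r.normal(0.0, 1.0), rng)
```

With intensity λ = 5 on a horizon of length 10, the expected number of arrivals is λT = 50.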
Definition 2 (Lévy Processes). A process {L(t) ∶ t ≥ 0} defined on a probability space (Ω, ℱ, ℙ) is said to be a Lévy process if the following hold:
• L(0) = 0 ℙ-almost-surely;
• Independent increments: for any 0 ≤ t₁ < t₂ < ⋯ < tₙ, the increments L(t₂) − L(t₁), …, L(tₙ) − L(tₙ₋₁) are independent random variables;
• Stationary increments: the distribution of L(t + s) − L(s) depends only on t;
• Stochastic continuity: for every ε > 0, lim_{s→t} ℙ(|L(s) − L(t)| > ε) = 0;
• Càdlàg paths: the paths of L are ℙ-almost-surely right-continuous with left-limits, i.e., every sample path of L must be right-continuous with left-limits.

Remark 1.
Under Definition 2, we can observe that both Gaussian white noise processes and compound Poisson shot noise processes are Lévy processes. This implies that the affine combination of the two is also a Lévy process. In fact, a well-known result called the Lévy-Khintchine Theorem (see, e.g., Theorem 1.2.14 of Applebaum 2009 21 ) says that Lévy processes can be represented as affine combinations of Brownian motion processes and Poisson processes. The Lévy-Khintchine Theorem is also applicable to Lévy processes with measures that have infinite total mass. One example of this is a Gamma process, which has intensity measure on ℝ₊ given by ν(dx) = a x⁻¹ e^{−bx} dx for constants a, b > 0; i.e., on any finite interval of time, the number of jumps whose sizes lie in the interval of space (0, 1) is infinite. However, we emphasize that we only consider finite-measure Lévy processes that are formed by taking the affine combination of Gaussian white noise and compound Poisson shot noise processes.
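For the finite-measure case considered in this paper, a Lévy path is just the sum of a drifted Brownian motion and a compound Poisson process, which is easy to simulate on a time grid. Below is a minimal sketch of this affine combination; the symbols mu, sigma, lam, the exponential jump law, and the grid discretization are illustrative assumptions rather than the paper's notation.

```python
import numpy as np

def levy_path(mu, sigma, lam, jump_sampler, T, n_steps, rng):
    """Simulate L(t) = mu*t + sigma*W(t) + (compound Poisson) on a uniform grid.

    Each step accumulates a Gaussian Brownian increment plus the (usually zero)
    jumps that arrive during the step: the affine combination of white and
    shot noise described in Remark 1.
    """
    dt = T / n_steps
    dW = rng.normal(0.0, np.sqrt(dt), n_steps)
    dN = rng.poisson(lam * dt, n_steps)                        # jumps per step
    dJ = np.array([sum(jump_sampler(rng) for _ in range(k)) for k in dN])
    increments = mu * dt + sigma * dW + dJ
    return np.concatenate(([0.0], np.cumsum(increments)))

rng = np.random.default_rng(1)
L = levy_path(0.1, 0.3, 2.0, lambda r: r.exponential(0.5), 5.0, 5000, rng)
```

By independence of the two components, E[L(T)] = μT + λT·E[J], which a Monte Carlo average over paths will approximate.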
The proofs of both Lemma 1 and Lemma 2 are straightforward, and have thus been omitted from the paper. Another useful formula that we employ throughout the paper is Itô's Formula. The version of the formula for functions of stochastic processes driven by more general Poisson random measures can be found in various standard references in the stochastic processes literature, but for the purposes of this paper, we use the following version of the formula.
Lemma 3 (Itô's Formula). For functions V ∈ C^{(1,2)}, i.e., functions which are once continuously-differentiable in time and twice continuously-differentiable in state:

V(t, x(t)) = V(0, x(0)) + ∫₀ᵗ ∂ₛV(s, x(s)) ds + ∑ᵢ ∫₀₊ᵗ ∂ᵢV(s, x(s−)) dxᵢᶜ(s) + (1∕2) ∑ᵢ,ⱼ ∫₀₊ᵗ ∂²ᵢⱼV(s, x(s−)) d[xᵢ, xⱼ]ᶜ(s) + ∑_{k ∶ τₖ ∈ (0, t]} [V(τₖ, x(τₖ)) − V(τₖ, x(τₖ−))].

Here, integrals from 0+ to t indicate an integral over the interval (0, t], x ∈ ℝⁿ, τₖ ∈ (0, t] is the time of the kth arrival of N(t), and we use the left-limit notation of Section 2.1. Furthermore, xᵢᶜ represents the continuous part of the SDE for component xᵢ, and [xᵢ, xⱼ]ᶜ represents the continuous part of the quadratic variation between the two SDEs corresponding to components xᵢ and xⱼ.
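As a quick sanity check of the jump term in Itô's Formula: for a pure-jump path (no drift or diffusion), the continuous integrals and the continuous quadratic variation vanish, so V(t, x(t)) − V(0, x(0)) telescopes into the sum of jump differences. The sketch below verifies this special case numerically; the cubic test function and the Gaussian jump law are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
V = lambda x: x**3 - 2.0 * x                 # smooth test function, V in C^2

# Pure-jump path: piecewise constant between arrivals, jump sizes ~ N(0, 1).
n_jumps = rng.poisson(4.0 * 2.0)             # N(2) with intensity lam = 4
x = np.concatenate(([0.5], 0.5 + np.cumsum(rng.normal(0.0, 1.0, n_jumps))))

lhs = V(x[-1]) - V(x[0])                                    # V(t, x(t)) - V(0, x(0))
rhs = sum(V(x[k]) - V(x[k - 1]) for k in range(1, len(x)))  # jump-difference sum
```

The two sides agree exactly here because, with all continuous terms zero, the formula reduces to a telescoping sum over the arrival times.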

System Dynamics and Assumptions
We consider Lévy noise systems which can be expressed as SDEs of the following form:

ẋ(t) = f(t, x(t)) + B(t, x(t)) Ẇ(t) + g(t, x(t−)) Ṅ(t),    (5)

where f and B are once continuously-differentiable in time and twice continuously-differentiable in state.
• g(t, x(t−)) Ṅ(t) is a compound Poisson process which enters into the system as an additive disturbance, where g ∶ ℝ₊ × ℝⁿ → ℝⁿ, g ∈ C^{(1,2)}, describes the jumps that occur, and N(t) is the scalar standard Poisson process with intensity λ > 0. The "derivative" of the standard Poisson process, written as Ṅ(t), is understood as a function which takes value 1 if a jump occurs at time t, and value 0 otherwise. When B(t, x) ≡ 0, we have the following shot noise system:

ẋ(t) = f(t, x(t)) + g(t, x(t−)) Ṅ(t),    (6)

and when g(t, x) ≡ 0, we have the white noise system:

ẋ(t) = f(t, x(t)) + B(t, x(t)) Ẇ(t).    (7)

We choose to use Lévy noise systems in the form of (5) for their representational simplicity and relevance to real-world system dynamics. Shot or Lévy noise systems which can be modeled as (6) or (5) specifically include reinforcement learning-based robust trajectory optimization schemes for robot arm manipulators 35 , stock price fluctuations and impulse control 38 , and wireless mobile communication networks 31 .
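A scalar instance of a shot noise system of the form (6), with a contracting linear drift and state-independent jumps, can be simulated with a forward-Euler scheme; this mirrors the kind of case study used in Section 5. The particular drift −a·x and the constant jump size g are illustrative assumptions of this sketch, not the paper's example.

```python
import numpy as np

def simulate_shot(a, g, lam, x0, T, dt, rng):
    """Forward-Euler path of dx = -a*x dt + g dN(t).

    Between arrivals the state follows the contracting nominal flow
    dx = -a*x dt; each Poisson arrival in a step adds a jump of size g.
    """
    n = int(T / dt)
    x = np.empty(n + 1)
    x[0] = x0
    dN = rng.poisson(lam * dt, n)        # jump counts per time step
    for k in range(n):
        x[k + 1] = x[k] - a * x[k] * dt + g * dN[k]
    return x

rng = np.random.default_rng(3)
perturbed = simulate_shot(2.0, 0.5, 3.0, 1.0, 4.0, 1e-3, rng)
nominal = simulate_shot(2.0, 0.0, 3.0, 1.0, 4.0, 1e-3, rng)
```

Setting g = 0 recovers the nominal deterministic system, which decays toward the origin, while the perturbed path fluctuates around a bounded level set by the jump rate and size.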

Remark 3.
By the decomposition of Lévy noise processes into white and shot noise processes via the Lévy-Khintchine Theorem (see Remark 1), the conditions for existence and uniqueness of (5) described in Assumption 2 are obtained by combining Lemma 4 with the Lipschitz and boundedness conditions for white noise systems (7). Conditions for more general Lévy noise systems are similar to Assumption 2, and have been shown in previous literature: see, e.g., Theorem 6.2.3 of Applebaum 2009 37 . There has also been previous work describing conditions for shot noise systems (6) while imposing different, non-Lipschitz conditions on the drift and jump functions. For instance, Li 2001 40 relaxes the Lipschitz conditions by instead assuming that the drift and jump functions are bounded above by a concave function of the norm difference in trajectories. Alternatively, Kasahara 1991 41 presents a result for conditions where the drift is upper-bounded in norm by a constant, and the bound on the jump function depends on the maximum norm bound of the jump. We choose to work with the simple Lipschitz and boundedness conditions of Assumption 2 because they are easier to relate to the well-known standard conditions for white noise systems (7).
The approach we take in the derivation of our main results in Section 4 is stochastic contraction theory. These results are referred to as the Shot Noise Stochastic Contraction Theorem and the Lévy Noise Stochastic Contraction Theorem; they derive incremental stability conditions for the shot noise system (6) and the Lévy noise system (5), respectively. For simplicity, we henceforth refer to them as the Shot Contraction Theorem and the Lévy Contraction Theorem. In the following discussion, we formalize the notion of incremental stability and contraction in both the deterministic and stochastic sense.
Following the notation from Definition 4, we denote δx ∈ ℝⁿ to be the infinitesimal displacement between x₁(t) and x₂(t) over a fixed infinitesimal interval of time. Formally, the infinitesimal displacement length is represented as a path integral:

ℓ(t) ≜ ∫_{x₁(t)}^{x₂(t)} ‖δx(t)‖.    (10)

The evolution of the infinitesimal displacement over time can be approximated by the dynamics

δẋ = J(t, x) δx,    (11)

where J(t, x) ≜ ∇ₓf(t, x) is the Jacobian of the system. These dynamics, associated with the state δx, are commonly referred to as the virtual dynamics. Similar to the indirect and direct Lyapunov methods of testing Lyapunov stability, there is a test to determine incremental stability of a system without needing the literal Definition 4. Oftentimes, performing a differential coordinate transform from δx to δz ≜ Θ(t, x) δx, where Θ(t, x) ∈ ℝⁿˣⁿ is a smooth invertible square matrix, makes it easier to verify the conditions of this test. The new virtual dynamics under this coordinate transform become

δż = (Θ̇(t, x) + Θ(t, x) J(t, x)) Θ(t, x)⁻¹ δz,    (12)

where (Θ̇ + ΘJ)Θ⁻¹ is the generalized Jacobian of the system, and the dot notation is defined in Section 2.1. An equivalent way to say that a system ẋ(t) = f(t, x) is incrementally stable in the sense of Definition 4 is to say that it is contracting with some rate α > 0. Similar to (12), we can extend the notion of contraction to more general metrics: a system is incrementally stable if it is contracting with respect to a uniformly positive definite metric M(t, x) ≜ Θ(t, x)ᵀ Θ(t, x) and convergence rate α. For most practical applications, we are able to make the following assumption on M(t, x).
Assumption 3 (Bounded Metric). The metric M(t, x) described in the setup above is bounded in both arguments t and x from above and below, and its first and second derivatives with respect to the argument x are also bounded from above. We thus define the following constants m, m̄, m′, m″ > 0:

m I ⪯ M(t, x) ⪯ m̄ I,  ‖∂M(t, x)∕∂xᵢ‖ ≤ m′,  ‖∂²M(t, x)∕(∂xᵢ ∂xⱼ)‖ ≤ m″,  for all t, x and components i, j.    (14)

The inequality (9) is obtained for the special case where M(t, x) = I, the n-dimensional identity matrix. For general deterministic system dynamics ẋ(t) = f(t, x), the criterion for testing incremental stability is stated in the theorem below.

Theorem 1 (Basic Contraction). Consider the deterministic dynamics ẋ = f(t, x). If there exists a uniformly positive definite metric M(t, x) and α > 0 such that the following condition is satisfied:

Ṁ(t, x) + ∇ₓf(t, x)ᵀ M(t, x) + M(t, x) ∇ₓf(t, x) ⪯ −2α M(t, x),    (15)

then the system is contracting. Moreover, in relation to (13), ‖δz(t)‖ ≤ ‖δz(0)‖ e^{−αt}.

For deterministic systems, incremental stability has been established as a concept of convergence between different solution trajectories with different initial conditions 42,43 . However, in the stochastic setting, the difference between trajectories also arises from using different noise processes. For this reason, we require a change in notation from deterministic incremental stability analysis. The infinitesimal displacement length now considers the difference between a solution trajectory of a stochastic system with one noise sample path and a solution trajectory of a stochastic system with another noise sample path. This can be viewed as a comparison between solution trajectories coming from distinct systems; this is different from deterministic incremental stability, which compares two solution trajectories from the same system. To make this distinction clear, we use the notation ∂q∕∂μ in place of δx, and the path integral (10) is now written instead with a parametrization μ ∈ [0, 1]:

ℓ(t) ≜ ∫₀¹ ‖∂q(μ, t)∕∂μ‖ dμ.    (16)

The work of Pham 2009 29 considered stochastic incremental stability for the specific case of additive Gaussian white noise perturbations, and Dani 2015 30 extended this theory to more general state-dependent metrics. Both works compared two noise-perturbed trajectories: one a solution to (7) with white noise B₁(t, x)Ẇ₁(t), and the other a solution to (7) with white noise B₂(t, x)Ẇ₂(t). However, in this paper, we compare one noise-perturbed trajectory against a trajectory of the nominal, deterministic system ẋ = f(t, x). This allows for a direct combination of the white noise result with the Shot Contraction Theorem to establish the Lévy Contraction Theorem.
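To make the deterministic baseline concrete: for the scalar system ẋ = −x + sin t, the contraction condition of Theorem 1 holds with the identity metric and rate α = 1, so any two solutions approach each other like e^{−t}. A small numerical sketch (the specific system and the Euler integration are our own illustrative choices):

```python
import numpy as np

# xdot = -x + sin(t) is contracting with rate alpha = 1 under M = I: the
# difference of any two solutions satisfies d/dt |x1 - x2| = -|x1 - x2|.
def flow(x0, T=5.0, dt=1e-3):
    """Forward-Euler solution of xdot = -x + sin(t) from x(0) = x0."""
    t, x = 0.0, x0
    for _ in range(int(T / dt)):
        x += (-x + np.sin(t)) * dt
        t += dt
    return x

d0 = abs(3.0 - (-1.0))            # initial separation of the two trajectories
dT = abs(flow(3.0) - flow(-1.0))  # separation at T = 5
```

Up to Euler discretization error, dT ≈ e^{−5}·d0, so the separation shrinks by roughly two orders of magnitude regardless of the forcing term sin t.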
To that end, we consider a parametrization of a new state q(μ, t) ∈ ℝⁿ, with μ ∈ [0, 1], such that the boundary values q(0, t) and q(1, t) coincide with the two trajectories (17), where the trajectories are solutions of, respectively, the perturbed system (18a) and the nominal system (18b). This parametrization allows us to construct a virtual system with state q(μ, t), written as (19). The virtual dynamics become (20), where J is the Jacobian defined in (11) and B_{⋅,i} is the ith column of B. For the white noise system (7), the perturbed system and its nominal dynamics are parametrized such that the virtual system and virtual dynamics are established as in (19) and (20) without the shot noise terms g(t, x), λ, and N(t). Stochastic contraction is defined in Definition 2 of Pham 2009 29 , but is only applicable to white noise systems (7). For the virtual system (19), we create a more general definition of stochastic contraction.
Remark 4. The equation (22) is a version of (8) with a nonzero steady-state error bound. This is because for stochastic systems, convergence to an equilibrium often does not occur with perfectly zero error due to trajectories arising from different noise sample paths. Moreover, for the impulsive shot noise in (5) and (6), almost-sure convergence is difficult to demonstrate. Instead, in Section 4 the bound is derived as a probabilistic guarantee, conditioned on a finite number of jumps within a fixed interval of time.
Demonstrating stochastic incremental stability for stochastic systems perturbed by some class of noise processes involves rewriting (22) and deriving specific forms of the rate and error ball based on the parameters of the stochastic system. One common choice of Lyapunov function is the metric-weighted norm-squared difference between solution trajectories with distinct initial conditions and noise sample paths:

V(t, q, ∂q∕∂μ) ≜ ∫₀¹ (∂q∕∂μ)ᵀ M(t, q(μ, t)) (∂q∕∂μ) dμ.    (23)

Here, M(t, x) ≜ Θ(t, x)ᵀ Θ(t, x) is the contraction metric described before; the parametrization over μ is such that (19) and (20) hold. Similar to the direct method of Lyapunov, we analyze the behavior of the system by analyzing the derivative of the Lyapunov-like function V(t, q, ∂q∕∂μ) along trajectories of the virtual system (19).
Remark 5. Another significant distinction between our stochastic incremental stability setup and previous versions is that we derive an error bound over a fixed interval of time [t₀, t] for any t₀ < t instead of necessarily fixing t₀ = 0 and including the initial state. This allows us to interpret our stability theorems as a measure of how far the perturbed trajectory will deviate from the nominal within a local horizon of time, enabling the design of controllers and observers which are online and adaptive.
Theorem 2 (White Noise Stochastic Contraction Theorem). Suppose that the perturbed system (7) is stochastically contracting in the sense of Definition 5 under a differential coordinate transform Θ(t, x) which satisfies Assumption 3. Then, for a fixed interval of time [t₀, t] with t₀ < t, (22) can be written explicitly as (24), with rate and error ball (25), where the noise bound is defined in Assumption 1, the constants m, m̄, m′, m″ are defined in Assumption 3, and α is the deterministic contraction rate from Theorem 1.
Dani 2015 30 demonstrates an application of white noise incremental stability to the problem of model-based nonlinear observer design. Thus, extending Theorem 2 to account for non-Gaussian noise gives us a potential way to design model-based observers and controllers for systems perturbed by non-Gaussian noise. With this motive, the next section presents our main results, the Shot Contraction Theorem and the Lévy Contraction Theorem, which are incremental stability theorems for the shot noise system (6) and Lévy noise system (5), respectively.

STOCHASTIC CONTRACTION THEOREMS
Given our setup established in Section 3.2, we present the Shot Contraction Theorem for shot noise systems (6) and the Lévy Contraction Theorem for Lévy noise systems (5), and use them to conclude incremental stability properties for each respective system. The proofs of both theorems follow similar approaches: we analyze the derivative of the Lyapunov-like function (23) along trajectories of the virtual system (19). This requires us to use the infinitesimal generator from Definition 3, which can be thought of as the stochastic analogue of the differentiation operator used in the deterministic case. For the shot and Lévy noise systems, we invoke Itô's formula (Lemma 3) instead of using the infinitesimal generator, by virtue of the relationship described in Remark 2. Throughout this section, we shorten the notation so that the spatial arguments of the Lyapunov-like function (23) are understood to be evaluated at the time argument t.

Main Results
We begin with the Shot Contraction Theorem for shot noise systems (6). We compare a trajectory of the shot noise system (6) against a trajectory of the nominal system (18b). We define the parameter μ ∈ [0, 1] such that the virtual system and virtual dynamics are established as in (19) and (20) without the white noise terms B(t, x) and Ẇ(t).
Assumption 4 (Bounded Differences of Lyapunov-Like Function). Consider the shot noise system (6) and corresponding Lyapunov-like function (23). For any fixed t > 0, there exists a deterministic, locally-bounded, continuously-differentiable function h(t) which upper-bounds, in expectation, the jump differences of the Lyapunov-like function as in (26). For the remainder of this section, the expectation operator 𝔼[⋅] is understood to be taken over all sources of randomness in the argument. For instance, in Assumption 4, 𝔼[⋅] is taken with respect to the distribution of the initial condition, the random function describing the jump distribution, and the standard Poisson process N(t).

Remark 6.
Because the Lyapunov-like function (23) takes in arguments which depend on the shot noise process of (6), more information is needed about the dynamics and jumps of (6) in order to design a metric M(t, x) for the Lyapunov-like function. In Section 4.2, we make the abstract function h(t) in Assumption 4 more concrete by specializing (6) to linear time-varying (LTV) systems, where f(t, x) ≜ A(t)x and the jumps g(t, x) ≡ g(t) are independent of the state x. We will see that the expression of h(t) in terms of system parameters depends on the form of the solution trajectory of (6), which is easy to obtain for LTV systems.

Remark 7. The existence of an h(t) in (26) is roughly justified using the following argument. For each time τ ≥ 0, note that we can simplify the difference in (26) as follows:

where the last inequality follows from Assumption 3. By continuity of the nominal system, the nominal trajectory satisfies x̄(τ) = x̄(τ−). Hence, the difference (27) is nonzero only when there exists a jump of the shot noise process at time τ. By Assumption 1, we know that the jumps are bounded in norm by a constant, and so the difference (27) is also bounded at each fixed time τ.

Theorem 3 (Shot Noise Stochastic Contraction Theorem). Suppose that the shot noise system (6) is perturbed by noise processes which satisfy Assumption 1, and is stochastically contracting in the sense of Definition 5 under a differential coordinate transform Θ(t, x). Further suppose the metric M(t, x) constructed from Θ(t, x) satisfies (14), and is such that the Lyapunov-like function (23) satisfies Assumption 4. If, for a fixed interval of time [t₀, t] with 0 ≤ t₀ < t, k ∈ ℕ jumps occur with probability pₖ(t − t₀) given by (28), then (22) can be written explicitly as (29), with rate and error ball (30), where α > 0 is the deterministic contraction rate from (15), the metric bounds are defined in (14), and the function h is defined in Assumption 4.
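A Monte Carlo experiment in the flavor of Theorem 3, for the scalar shot-noise example dx = −a·x dt + g dN against its nominal system dx = −a·x dt: the mean-squared deviation between the two trajectories settles to a bounded value rather than growing. The parameter values and the closed-form Euler propagator below are illustrative assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(4)
a, g, lam, T, dt, trials = 2.0, 0.5, 3.0, 4.0, 1e-3, 500
n = int(T / dt)

# Error dynamics between perturbed and nominal trajectories: e' = -a*e + g*dN.
# Under forward Euler, e(T) = g * sum_k dN_k * (1 - a*dt)^(n-1-k), a discrete
# convolution we evaluate directly for speed.
decay = (1.0 - a * dt) ** np.arange(n)[::-1]
sq_err = np.empty(trials)
for i in range(trials):
    dN = rng.poisson(lam * dt, n)
    sq_err[i] = (g * dN @ decay) ** 2

mse = sq_err.mean()   # approx (g*lam/a)^2 + g^2*lam/(2*a) = 0.75 analytically
```

The stationary value combines a mean offset g·λ/a (jumps accumulate at rate λ and decay at rate a) with a variance term g²λ/(2a); the estimate stays bounded however long the horizon is extended, which is the error-ball behavior the theorem formalizes.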
Proof of Theorem 3. Apply Lemma 3 to (23), and take expectations across the resulting equation, conditioning on the number of jumps being N(t) − N(t₀) = k. We get (31), where J is the Jacobian from (11), τᵢ is the time of the ith arrival in the Poisson process N(t) driving the shot noise system (6), and we use the left-limit notation of Section 2.1. Similar to the argument of Remark 7, note that each term of the sum (31c) is nonzero only if there is a jump at time τᵢ, where τᵢ ≤ t. Furthermore, the terms of (4) which correspond to the continuous part of the quadratic variation are zero for the dynamics (6).
Recall that the Lyapunov-like function (23) is twice continuously-differentiable with respect to its spatial arguments. This means there is a jump-discontinuity in the Lyapunov-like function only if there is a jump-discontinuity in one of those arguments. But by the relationship between (19) and (20), the virtual state and its displacement experience jumps at the same times. Hence, the number of jumps experienced by the Lyapunov-like function in a fixed interval of time [t₀, t] is equal to the number of jumps experienced by the trajectory in [t₀, t].
A bound on (31a) and (31b) is derived from Theorem 1. We bound (31c) in the following way:

where h(t) is defined in Assumption 4. The inequality (32c) comes from (26) and the fact that τᵢ ∈ [t₀, t] for all i = 1, …, k. In (32a), we abuse the notation for the subscript i in τᵢ for both sums which range over i = N(t₀) + 1 to N(t) and sums which range over i = 1 to k. This is done for the sake of simplicity, with the understanding that (32a) arises because we conditioned on N(t) − N(t₀) = k.
In combination, we get an inequality of the form (34), where the contraction rate and the error term are defined in (30a) and (30b), respectively. Note that by Assumption 3, (16), and Cauchy-Schwarz, we obtain (35). We use (35) to write (34) as an inequality on the norm mean-squared error between the two trajectories. Because the condition that N(t) − N(t₀) = k occurs with probability pₖ(t − t₀) given by (28), we obtain our desired bound (29).
Both the shot noise system and the nominal system have the same contraction rate. This is because the shot noise system behaves exactly as the nominal system in between consecutive jumps. The displacements of the shot noise trajectory incurred by the jumps of the shot noise process can each be thought of as a reset to a different initial condition from which the system evolves nominally.
Remark 8. In conjunction with Remark 5, we note two important differences between white noise incremental stability Theorem 2 and shot noise incremental stability Theorem 3. The first difference is that taking t → ∞ in the inequality of Theorem 2 yields a bound which can be interpreted as the steady-state error ball that solution trajectories are guaranteed to converge towards. In contrast, Theorem 3 is more comparable to finite-time stability theory, described in Chapter III of Kushner 1967 20 . The second difference is that, due to the impulsive, large-norm jumps of the noise process, the mean-squared error bound for shot noise systems is provided with a specific probability of satisfaction; this probability depends upon the number of jumps incurred by the noise process. Depending on the application, such criteria may be viewed as weaker than the traditional steady-state, mean-squared sense of convergence. However, for other applications which deal with online implementations of controller/observer synthesis, this probabilistic guarantee potentially allows for better time-varying, adaptive design. For instance, the probability of (28) can be used as a measure of expectation that jumps will arise in a fixed, future horizon of time; given this event, the bound of Theorem 3 provides a guarantee on the mean-squared deviation of the perturbed trajectory away from the nominal.
Remark 9. The error ball (30b) depends on two types of parameters: 1) fixed, inherent parameters which come from the system dynamics and noise process, and 2) design parameters which can be tuned to vary the stability bounds. Among the inherent parameters, we can make the following observation about the intensity λ of the shot noise process. An increasing λ indicates a more rapid accumulation of jumps, which implies larger deviations of the perturbed trajectory away from the nominal over shorter horizons of time. This can be seen from the error ball in (30b) being directly proportional to the number of jumps k, and we demonstrate this relationship numerically using the specific 2D nonlinear system in Section 5.1. The effects of other parameters vary based on the function which describes the unperturbed dynamics. Hence, further discussion is deferred to Section 4.2, which provides further insights specifically for linear time-varying (LTV) forms of (6).

Now we are ready to present the Lévy Contraction Theorem for the Lévy noise system (5). We show that the resulting condition is a combination of the conditions for white noise (Theorem 2) and shot noise (Theorem 3). Consider two trajectories of the system: one a solution of (18a), and the other a solution of the nominal system (18b). We define a parameter in [0, 1] which yields the parametrization (17), virtual system (19), and virtual dynamics (20). Analogous to the white noise parameters from (25) and the shot noise parameters from (30), denote the corresponding contraction rate and steady-state error bound for the Lévy noise system.

Theorem 4 (Lévy Noise Stochastic Contraction Theorem). Suppose that the Lévy noise system (5) is perturbed by noise processes which satisfy Assumption 1, and is stochastically contracting in the sense of Definition 5 under a differential coordinate transform Θ(x, t). Further suppose the metric M(x, t) constructed from Θ(x, t) satisfies (14), and is such that the Lyapunov-like function (23) satisfies Assumption 4. If, for a fixed interval of time [s, t] with 0 ≤ s < t, k ∈ ℕ jumps occur with probability p_k(t − s) given by (28), then (22) can be written explicitly as:
where the deterministic contraction rate is from (15), the norm bound on the variation of the white noise process is from Assumption 1, and the remaining metric constants are defined in (14). The function h is defined in Assumption 4.
Proof of Theorem 4. Applying Lemma 3 to (23): where the Jacobian from (11) appears and τ_i denotes the time of the i-th jump. As in the proof of Theorem 3, a bound on (38a) and (38b) is derived from Theorem 1. Simplifying the quadratic variation terms (38d) to (38f) requires computing the partial derivatives of the Lyapunov-like function, whose detailed calculations we omit for the sake of space. We obtain the following inequalities: where the white-noise bound is from Definition 5, and the metric bounds are defined in Assumption 3. Note that we can use (35) to further simplify equations (38d) to (38f). Moreover, applying Assumption 4 and following logic similar to that of the proof of Theorem 3 gives us a bound on (38g). Taking expectations across the entire inequality, note that the white noise term (38c) disappears because it is a martingale with zero mean. Combining the bounds of each remaining term from (38) yields the following: We obtain a bound on the solution of the resulting differential inequality using Lemma 1, taking the rate to be the contraction rate reduced by the metric-weighted white noise variance terms and the forcing term to combine the white noise variance with h(t). Then we use (35) to write the resulting inequality in terms of the mean-squared error between the two trajectories. This gives us the desired bound (36).

Remark 10. The parameters (37) for the Lévy noise SDE (5) can be expressed as a combination of (25) and (30) in the following way. The Lévy contraction rate can be interpreted as the direct sum of the white noise rate and the shot noise rate, with one extra copy of the nominal contraction rate removed to prevent double-counting. Furthermore, the Lévy error ball is a sum of the white noise error ball and the shot noise error ball, with the Lévy contraction rate used in place of the white noise or shot noise rates; this is written in the last two equations of (37b). We emphasize the importance of this remark because of its likeness to the Lévy–Khintchine theorem, which represents Lévy processes as an additive combination of Brownian motion processes and compound Poisson processes.
Remark 11. The stochastic contraction theorems we presented in this section are comparable to the theories of hybrid systems or jump-Markov systems. In shot or Lévy noise systems, large deviations away from nominal behaviors arise solely from the jump-discontinuous noise process, which is independent of the open-loop dynamics. In contrast, hybrid systems have switches (i.e., jumps) which arise as an inherent property of the open-loop dynamics. Despite this important distinction, the two settings can still be closely related to one another in two ways. First, stability analysis techniques are primarily focused on handling the jump discontinuities more than any other property of the system. For hybrid systems, the literature in this direction of research includes Lyapunov-sense conditions for asymptotic stability 44,45 and characterizations of incremental stability 46 . Second, dwell time can be related to the interarrival time by viewing it as a form of stability criterion which ensures that the system has sufficient time to converge towards a desired state in between consecutive switching phases. Likewise, the stability results of Section 4 can be alternatively interpreted as conditions imposed on the shot or Lévy noise system such that the mean time between consecutive jumps (which depends on the intensity parameter of the Poisson process) is long enough for the system to be reasonably close to the nominal trajectory. One notable example which utilizes dwell-time criteria for nonlinear systems is Hespanha 1999 47 , where it is shown that input-to-state induced norms should be bounded uniformly between switches. In terms of applications, dwell-time criteria for attaining exponential stability have been shown to be effective for robotic systems, in particular walking locomotion and flapping flight 48 , as well as autonomous vehicle steering 49 .

For Linear Time-Varying Systems
To demonstrate a concrete example of the function h(t) from (26), we consider shot noise systems with LTV nominal dynamics, where N(t) is the standard Poisson process with intensity λ > 0. Here, A: ℝ+ → ℝ^{n×n} is continuous for all t ≥ 0, and the jump process is a random function which maps time to a random vector in ℝ^n such that the bound in Assumption 1 is still satisfied. By virtue of Remark 6, we can leverage the additional knowledge that the shot noise system is LTV in order to further simplify the bound (27). Note that a solution trajectory of (41) with value x(s) ∈ ℝ^n at time s ≤ t can be explicitly written as: where the second equality follows from the definition of the Poisson integral from Section 2.3.2 of Applebaum 2009 37 , N(t) is the number of jumps observed by time t, and τ_i ≤ t denotes the time of the i-th jump. Instead of using a parameter in [0, 1], we construct the virtual system by stacking the SDEs (41) and the nominal system ẋ(t) = A(t)x(t).
The decomposition of the dynamics into A(t)x(t), a product of a function of time and a function of state, allows us to consider metrics M(x, t) ≡ M(t) which are independent of the state of the virtual system (43). Moreover, we choose a metric M(t) which satisfies Assumption 5; in particular, for all t ≥ 0 and a fixed positive constant, the metric satisfies (45), and this condition can be viewed as a simplification of (15) for LTV systems. Because the metric is independent of state, we can construct a Lyapunov-like function which is simplified compared to (23): We further assume that the nominal system admits a solution with a state-transition matrix Φ(t, s) satisfying the following.
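The explicit solution form above — a state-transition term plus a sum of propagated jumps — can be checked numerically. The sketch below is scalar and uses a hypothetical a(t) together with a fixed set of jump times and heights standing in for one compound Poisson sample path; none of these specifics come from the paper.

```python
import numpy as np

# Hypothetical scalar LTV dynamics: a(t) = -(1 + 0.5*sin(t)).
a = lambda t: -(1.0 + 0.5 * np.sin(t))

def phi(t, s, m=2000):
    """Scalar state-transition factor exp(int_s^t a(u) du), via the
    trapezoid rule."""
    u = np.linspace(s, t, m)
    du = (t - s) / (m - 1)
    vals = a(u)
    return np.exp(du * (vals.sum() - 0.5 * (vals[0] + vals[-1])))

# Fixed jump times/heights standing in for one compound Poisson path.
taus = np.array([0.7, 1.9, 3.2])
jumps = np.array([0.5, -0.3, 0.8])

def x_closed_form(t, x0):
    # x(t) = Phi(t,0) x0 + sum_{tau_i <= t} Phi(t, tau_i) J_i
    out = phi(t, 0.0) * x0
    for tau, j in zip(taus, jumps):
        if tau <= t:
            out += phi(t, tau) * j
    return out

def x_euler(t, x0, dt=1e-4):
    """Direct Euler integration of the same path for comparison."""
    x = x0
    k = 0
    for i in range(int(t / dt)):
        s = i * dt
        x += a(s) * x * dt
        # Insert any jump falling inside (s, s + dt].
        while k < len(taus) and s < taus[k] <= s + dt:
            x += jumps[k]
            k += 1
    return x

t_end = 4.0
diff = abs(x_closed_form(t_end, 1.0) - x_euler(t_end, 1.0))
```

The two evaluations agree up to discretization error, which is the content of the variation-of-constants formula (42) in this scalar setting.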
Assumption 6 (Bounded State-Transition Matrix). The state-transition matrix Φ(t, s) generated by any nominal dynamics ẋ(t) = A(t)x(t) satisfies the following condition for some c, β > 0.

Theorem 5 (Shot Noise Stochastic Contraction Theorem: LTV Systems). Suppose the LTV shot noise system (41) is perturbed by noise processes which satisfy Assumption 1, and is stochastically contracting in the sense of Definition 5 under the metric M(t) from Assumption 5. Further suppose that the nominal LTV system is such that Assumptions 5 and 6 hold. If, for a fixed interval of time [s, t] with 0 ≤ s < t, k ∈ ℕ jumps occur with probability p_k(t − s) given by (28), then (22) can be written explicitly as:
As in the proof of Theorem 3, we again abuse the subscript notation τ_i both for sums which range over i = N(s) + 1 to N(t) and for sums which range over i = 1 to k, for the sake of simplicity. We use the left-limit notation of Section 2.1. Let x(t) be the solution trajectory described in (42) with value x(s) ∈ ℝ^n at time s, and let Φ(t, s)x(s) denote the solution trajectory of the nominal system with the same value at time s. We can simplify each term in the sum (51b) as follows: where we use shorthand for the metric-weighted quadratic form and likewise for other similar notation. The first equality comes from the fact that the nominal trajectory is continuous, so its value and left limit coincide for all times up to t. The second equality is obtained by virtue of N(τ_i−) = N(τ_i) − 1 and the jump relation at τ_i. Substituting (53) into (51) yields: Here, (54b) follows from Assumption 5, the bound on the metric, and submultiplicativity. The last inequality (54c) comes from the triangle inequality, (47), and Assumption 1. Taking expectations then yields (55). There are several ways to simplify the expectation term in (50). One can explicitly write out the integral form of the expectation with the knowledge that the jump times τ_i are Gamma-distributed with parameters i and λ for all i = 1, …, k. While this direct computation of (50) yields the tightest bound, it requires computing multiple integrals and is thus increasingly difficult with increasing k. For concreteness, we look at two specific ways to derive a looser bound.
The first term of (50) simplifies because the random variables τ_i, i = 1, …, k, are ordered as τ_1 ≤ τ_2 ≤ ⋯, and so t − τ_i takes its largest value at the smallest index i.
For the second term of (50), we can invoke Lemma 2. More specifically, (57) holds because the jump times τ_i, i = 1, …, k, are ordered as τ_1 ≤ τ_2 ≤ ⋯. This means the value of i ∈ {1, …, k} which maximizes t − τ_i is i = 1, and the value of i ∈ {1, …, k} which maximizes τ_i is i = k. We then use the facts that τ_1 ∼ Exp(λ) and τ_k ∼ Gamma(k, λ) to further simplify the resulting inequality. The derivative of the function from (50) then simplifies accordingly. Alternatively, we can bound the second term of (50) in the following way: As mentioned in Remark 9, we see that the function in (50) (and thus the error ball in (49b)) is directly proportional to the jump norm bound.
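The distributional facts used above — τ_1 ∼ Exp(λ) and τ_k ∼ Gamma(k, λ), since Poisson arrival times are cumulative sums of i.i.d. exponential interarrival times — can be sanity-checked by simulation. The values of λ and k below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)

lam, k, n = 2.0, 5, 200_000

# Arrival times of a Poisson process: cumulative sums of Exp(lam)
# interarrival times, so tau_1 ~ Exp(lam) and tau_k ~ Gamma(k, lam).
inter = rng.exponential(1.0 / lam, size=(n, k))
taus = np.cumsum(inter, axis=1)

emp_first = taus[:, 0].mean()   # should approach E[tau_1] = 1/lam
emp_last = taus[:, -1].mean()   # should approach E[tau_k] = k/lam
```

The empirical means match the closed-form means 1/λ and k/λ, which is exactly what the looser bounds above exploit.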

Remark 12. A comparison of the results of Theorem 5 and Theorem 3 shows that the stability bounds take the same form. First, note that the constants from (49a) in Theorem 5 play the same role as those in (30a) from Theorem 3. Second, and more importantly, having more knowledge about the system dynamics allows us to derive a more concrete bound compared to the bounds of Section 4.1, which depend upon the abstract function h(t). In particular, for this LTV case, the difference (27) can be computed exactly using the precise solution form (42), and the metric M(x, t) ≜ M(t) does not depend on the state. From (55), we can read off the explicit form of h(t).
Remark 13. Similar to Remark 9, the strength of the stability bound in Theorem 5 is contingent on both inherent parameters and design parameters, but the explicit form of (50) allows us to derive additional insights. First, an additional inherent parameter we can consider is the maximum jump norm bound: (50) is directly proportional to it, which implies that larger jump norms result in larger error bounds. Second, among the design parameters, the contraction metric M(x, t) ≡ M(t) enters (48) through its extremal bounds: a small lower bound and a large upper bound loosen (48), so (48) is tighter and more meaningful when we choose a metric whose condition number is close to 1. For exponentially-stable unperturbed LTV systems, (50) also demonstrates that a larger deterministic contraction rate allows for faster convergence to a smaller error ball. Correspondingly, this effect can also be achieved for controllable, open-loop unstable, unperturbed LTV systems by designing a control law such that, in Assumption 6, the overshoot constant is small and the exponential decay rate is large. In Section 5, we use numerical simulations to investigate how the stability bounds vary with the different parameters discussed here and in Remark 9.
We empirically generate trajectories with initial condition sampled uniformly from [1, 6], up until the maximum time T at which the number of jumps is N(T) = 3, such that (15) is satisfied. To construct the virtual system, we choose a specific affine parametrization; the resulting trajectories are shown in Figure 1. Because we cannot compute the relevant quantities in explicit form from the dynamics, the bound h(t) from (26) is difficult to compute analytically. However, we can observe empirically from Figure 1 that as the intensity λ decreases, the support of the jump process spreads out over time, i.e., the spikes occur over longer intervals of time with smaller values. Additionally, the mean height of the spikes decreases, which experimentally verifies Remark 9.

1D Linear Reference-Tracking
We now use the Shot Noise Stochastic Contraction Theorem to derive a stability bound for a simple linear system perturbed by shot noise, as a further specialization of the LTV system from Section 4.2. Suppose we have the following scalar system, which can be viewed as the Ornstein–Uhlenbeck process augmented with shot noise instead of white noise: where a > 0 so that the system is unstable in open loop, N(t) is a standard Poisson process with rate λ > 0, and the jump height distribution is Bernoulli, taking a value h > 0 with probability p and −h with probability q ≜ 1 − p. We are interested in the problem of tracking some given reference trajectory r(t). To achieve this, we design the following control law: with control gain K > a. As described in Section 2.1, the dot notation ṙ(t) refers to the time-derivative of r(t). The system (67) can be solved directly: where x_0 ∈ ℝ is the initial condition. The nominal closed-loop system has the same closed-loop dynamics as (67) without the shot noise term, driven by the same control law (68). A trajectory of this nominal system with its own initial condition in ℝ is thus given by: Like (43), we design a virtual system by stacking the nominal closed-loop system on top of the noise-perturbed closed-loop system (67), and take the stacked pair in ℝ² to be the virtual system state. The contraction metric M(t) from Assumption 5 is chosen to be the identity I_2, meaning both the upper and lower metric bounds equal 1. The Lyapunov-like function is chosen to be the squared difference between the two states. Following an argument similar to the proof of Theorem 5 with initial time 0, we get: conditioned on the number of jumps being N(t) = k by time t. Here, the last term can be simplified by using (69) and (70): Substituting (72) into (71) and using Theorem 5 yields the following bound with probability p_k(t) given by (28): where the contraction rate is 2(K − a) > 0 and the error bound with initial time 0 comes from (30b). We simulate (67) with the lower-level tracking controller (68) implemented to track r(t) ≜ sin(t).
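A minimal simulation sketch of the scalar tracking setup above. Since the paper's control law (68) is not reproduced here, we assume the illustrative law u = ṙ − K(x − r) − a r, which makes the tracking error contract at rate K − a between jumps; all parameter values below are arbitrary choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(3)

a, K, lam, h, p = 1.0, 3.0, 2.0, 0.5, 0.5   # assumed parameters, K > a
r = np.sin
r_dot = np.cos

def simulate(T, with_noise, dt=1e-3, x0=0.0):
    """Euler simulation of dx = (a*x + u) dt + dJ with the assumed
    tracking law u = r_dot - K*(x - r) - a*r, so the tracking error
    obeys de = (a - K) e dt between jumps."""
    n = int(T / dt)
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):
        t = i * dt
        u = r_dot(t) - K * (x[i] - r(t)) - a * r(t)
        x[i + 1] = x[i] + (a * x[i] + u) * dt
        if with_noise and rng.random() < lam * dt:
            # Bernoulli jump heights: +h w.p. p, -h w.p. 1 - p.
            x[i + 1] += h if rng.random() < p else -h
    return x

T = 10.0
x_nom = simulate(T, with_noise=False)   # tracks r(t) = sin(t)
x_shot = simulate(T, with_noise=True)   # tracks r(t) up to jump resets
```

Between jumps, `x_shot` converges back towards the reference at the closed-loop rate K − a, mirroring the contraction rate 2(K − a) of the squared error.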
We use the theoretical bound computed for each of the two different versions of the simplification, (58) and (60). The results are organized in Figure 2. The intensity λ of the shot noise process varies across λ ∈ {1, 2, 4}, corresponding to each row of subfigures. All three rows share the following common experiment setup: the number of jumps is fixed to be k = 5, and we simulate the evolution of trajectories starting from x_0 = 0 until just before the (k+1)-th jump occurs. In the left column of subfigures, we plot a sample nominal closed-loop reference-tracking trajectory (grey solid line) together with a sample closed-loop noise-perturbed trajectory (black solid line). In the middle column of subfigures, the empirical average squared-difference between the two trajectories is computed by timewise-averaging over 200 Monte-Carlo trial trajectories (black solid line). The two types of the theoretical mean-squared error bound (73) are also plotted, using (60) (dark grey dashed line) and (58) (light grey dashed line). In the right column of subfigures, the empirical probability that a jump occurs at a certain time is plotted as a histogram over time. The histogram is constructed by discretizing the maximum length of time into 30 subintervals and computing the proportion of jumps (over all 200 Monte-Carlo trials) which fall into each subinterval. We note that it is possible for the empirical squared-difference to exceed the theoretical bound. This is because the theoretical error bound is on expected behavior, and should not be treated as an almost-sure guarantee for all sample paths. Moreover, we observe that in both figures, the trajectories converge towards each other in between consecutive jumps, which aligns with the incrementally stable nature of the nominal system.
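The Monte-Carlo procedure described above can be sketched directly on the tracking error, which (under the assumed closed loop with gain K > a) contracts at rate K − a between Bernoulli jumps. The trial count and histogram bins mirror the text (200 trials, 30 bins); all other values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)

a, K, lam, h, dt, T = 1.0, 3.0, 2.0, 0.5, 1e-3, 5.0
n_steps, n_trials = int(T / dt), 200

def error_path():
    """One sample path of the tracking error, which obeys
    de = (a - K) e dt between jumps under the assumed control law."""
    e = np.empty(n_steps + 1)
    e[0] = 0.0
    jump_times = []
    for i in range(n_steps):
        e[i + 1] = e[i] + (a - K) * e[i] * dt
        if rng.random() < lam * dt:
            e[i + 1] += h if rng.random() < 0.5 else -h
            jump_times.append(i * dt)
    return e, jump_times

errs = np.empty((n_trials, n_steps + 1))
all_jumps = []
for m in range(n_trials):
    errs[m], jt = error_path()
    all_jumps.extend(jt)

# Timewise empirical mean-squared error, averaged over trials.
mse = (errs ** 2).mean(axis=0)

# Empirical jump-time histogram over 30 subintervals of [0, T].
hist, _ = np.histogram(all_jumps, bins=30, range=(0.0, T))
frac = hist / max(1, len(all_jumps))
```

Plotting `mse` against the theoretical bound (73), and `frac` as the jump-time histogram, reproduces the middle and right columns of the figure layout described above.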
Compared to the previous simulation of Section 5.1, we can make a few insightful observations based on the two figures. The theoretical bound derived in Section 4.2 yields an expression for the error bound which scales with both the intensity and an exponentially decaying factor. As seen in the middle column of subfigures of Figure 2, this effect is demonstrated for the theoretical bound computed using both (60) and (58). First, note that an increasing intensity corresponds to a larger accumulation of jumps. This corresponds to a larger constant initial value in both grey dashed lines. Second, an increasing intensity also corresponds to a faster accumulation of jumps, i.e., all k = 5 jumps of the system occur earlier in time for larger intensities. This corresponds to a faster decay of the first bumps of both grey dashed lines. Moreover, for the light grey dashed line, the effect of the intensity is illustrated through the proximity between the line t = 0 and the second bump; as the intensity grows larger, more jumps occur earlier in time, and the second bump occurs closer to t = 0. These empirical results demonstrate what was qualitatively observed in Remark 13. Another observation is that the second bump which occurs when computing the theoretical bound using (58) (light grey dashed line) arises from the distribution of the jump times. The second bump essentially accounts for the possibility of seeing jumps which occur closer to the end of the interval [0, t]. In contrast, the theoretical error bound computed using (60) only has the initial bump; all the weight is assigned to the initial value, from which it exponentially decays over time.

2D LTV Systems
We extend the experiment of Section 5.2 by considering more complex 2D LTV shot noise systems of the form: with A(t) uniformly bounded in norm for all t > 0. Note that constructing the virtual system (43) for 2D dynamics yields a virtual system state vector in ℝ⁴. For a more practical setup, we could follow the design of the previous 1D example from Section 5.2 and consider an open-loop unstable A(t) with a linear state-feedback control law such that the closed-loop system is exponentially stable. However, for the simplicity of the example, we do not consider the controller design problem, and demonstrate the contraction theorems on systems which are already open-loop exponentially stable. We apply Section 4.2 to derive the theoretical mean-squared error bounds for 2D LTV systems with one of two types of matrices A(t): diagonal and (upper) triangular. We note that the relationship between increasing intensity and the variation in the error bound for both types of systems is similar to what was observed in Section 5.2, regardless of the two different approaches ((60) versus (58)) to simplifying the error bound. Hence, in contrast to the experiment of Section 5.2, we illustrate the results for only one choice of the intensity. The primary purpose of the simulations in this section is to demonstrate the analytical computation of the bounds for more complex LTV systems, especially in choosing the metric M(t) and all the appropriate parameter values to satisfy the assumptions of Theorem 5.

Diagonal A(t) Matrix
First, consider the case where A(t) is a diagonal matrix, i.e., a_{12}(t) = a_{21}(t) ≡ 0 for all t ≥ 0. Specifically, choose: Because A(t) is diagonal, the corresponding state-transition matrix is easily computed: One choice of parameters such that Assumption 6 is satisfied with the induced 2-norm is when both the overshoot constant and the decay rate equal 1.
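Assumption 6 can be checked numerically for a diagonal A(t). The entries below are hypothetical (the paper's specific choice is not shown here); they are picked so that both diagonal entries of Φ(t, s) are bounded by e^{−(t−s)}, matching the choice of both constants equal to 1.

```python
import numpy as np

# Hypothetical diagonal A(t): a11(t) = -1, a22(t) = -2 + sin(t) <= -1.
def Phi(t, s, m=2001):
    """State-transition matrix of a diagonal A(t): each diagonal entry is
    exp of the integral of the corresponding a_ii over [s, t] (trapezoid
    rule)."""
    u = np.linspace(s, t, m)
    du = (t - s) / (m - 1)
    def integ(vals):
        return du * (vals.sum() - 0.5 * (vals[0] + vals[-1]))
    d1 = np.exp(integ(-np.ones_like(u)))
    d2 = np.exp(integ(-2.0 + np.sin(u)))
    return np.diag([d1, d2])

# For diagonal Phi, the induced 2-norm is the largest diagonal entry, and
# since a11, a22 <= -1 pointwise, ||Phi(t,s)||_2 <= e^{-(t-s)}: Assumption 6
# holds with overshoot constant 1 and decay rate 1.
pairs = [(0.0, 1.0), (0.5, 3.0), (2.0, 2.0 + np.pi)]
bounds_hold = all(
    np.linalg.norm(Phi(t, s), 2) <= np.exp(-(t - s)) + 1e-9 for s, t in pairs
)
```

The same diagonal structure is what makes the state-transition matrix "easily computed" in the text: no Peano–Baker series is needed, only scalar integrals.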
Triangular A(t) Matrix
In contrast to the previous diagonal A(t) case, we now consider an A(t) with nonzero off-diagonal elements, together with the symmetric metric M(t) ≜ [sin(t) + 3, 1; 1, cos(t) + 3], which satisfies (45). This choice of M(t) is uniformly positive definite for each t > 0, and is bounded as in Assumption 5 with upper bound 4.7071 and lower bound 1.2929. We can now construct the Lyapunov-like function (46) for this particular system. Compared to the previous diagonal-matrix case, the inclusion of cross-terms makes the computation a little trickier.
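Reading the flattened matrix in the text as M(t) = [sin(t) + 3, 1; 1, cos(t) + 3], a quick numerical sweep recovers the stated bounds 1.2929 and 4.7071 (that is, 2 − √2/2 and 4 + √2/2, attained at t = 5π/4 and t = π/4, respectively):

```python
import numpy as np

# Sweep one full period and collect the eigenvalues of M(t).
ts = np.linspace(0.0, 2.0 * np.pi, 20_001)
eigs = np.array([
    np.linalg.eigvalsh(
        np.array([[np.sin(t) + 3.0, 1.0], [1.0, np.cos(t) + 3.0]])
    )
    for t in ts
])

m1_emp = eigs.min()   # smallest eigenvalue over t: 2 - sqrt(2)/2 = 1.2929
m2_emp = eigs.max()   # largest eigenvalue over t: 4 + sqrt(2)/2 = 4.7071
```

Since both extremes are strictly positive, M(t) is uniformly positive definite, and the ratio m2/m1 ≈ 3.64 quantifies the condition-number effect discussed in Remark 13.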

CONCLUSION
In this paper, we designed incremental stability criteria for nonlinear stochastic systems perturbed by two types of non-Gaussian noise, both characterized by impulsive jumps. The Shot Contraction Theorem (Theorem 3) was designed for compound Poisson shot noise systems of the form (6), while the Lévy Contraction Theorem (Theorem 4) was designed for finite-measure Lévy noise systems of the form (5). In Theorem 5, a specialization of the Shot Contraction Theorem was presented for linear time-varying nominal dynamics of the form (41). All three theorems show that, under the condition that a finite number of jumps arise from the noise process over a finite interval of time, solution trajectories corresponding to different initial conditions and different realizations of the noise process converge exponentially to within a bounded error ball of each other in the mean-squared sense, under certain practical boundedness conditions on the parameters of the noise process and contraction metric. We have shown that the convergence rate for (6) is equal to that of the nominal system ẋ = f(x, t) because the shot noise system behaves exactly as the deterministic system in between consecutive jumps. Remark 9 discusses properties of the error bound defined in (30b), and makes the claim that 1) larger jump norm bounds correspond to larger error bounds, and 2) shorter interarrival times between jumps correspond to error bounds which grow larger over a shorter horizon of time. Furthermore, the convergence rate from (37a) and error bound from (37b) of the Lévy noise system (5) are shown to be nearly direct sums of the parameters for the white noise system (25) and the shot noise system (30), which is similar to the implications of the Lévy–Khintchine theorem. The numerical simulations of Section 5 demonstrate our results.
We first showed empirical mean-squared error bounds for the specific 2D nonlinear system of Section 5.1 over three different intensities of the shot noise process. Next, the 1D simple linear reference-tracking shot noise system in Section 5.2 illustrates the tradeoff of Remark 9 by considering varying intensities. We also demonstrate how to derive analytical expressions for the theoretical error bound in Sections 5.2 and 5.3. In particular, the two 2D LTV systems of Section 5.3 demonstrate the computation of the theoretical bounds for more complex systems than studied in Section 5.2; we show the process of choosing the metric M(t) and all the appropriate parameter values to satisfy the assumptions of Theorem 5.
We emphasize that the benefits of our work are two-fold. First, the phenomenon of impulsive jumps in noise processes is understudied for nonlinear stochastic systems in the controls community compared to Gaussian white noise, despite being equally prevalent and important for many applications. Second, by addressing the prerequisite problem of stability characterization for shot and Lévy noise systems, we establish the foundations to enable model-based design of stochastic controllers and observers that are robust to shot and Lévy noise. By considering a class of noise models broader than the Gaussian assumption, we can expand the capabilities of model-based synthesis procedures. Thus, instead of using an entirely model-free approach to handle non-Gaussian noise perturbations, we can use model-free approaches to merely supplement the enhanced model-based synthesis baseline. This allows for a design procedure which consumes less training time and data. As mentioned in Remark 5 and Remark 8, by deriving stability theorems which are independent of the initial conditions, we enable a measure of how far the perturbed trajectory will deviate from the nominal within any local horizon of time; this allows for the potential development of controllers and observers which are online and adaptive. The probabilistic guarantee of the theoretical bounds also enables a smarter design for controllers and observers through a predictive component, which uses the probability as a measure of expectation on the number of jumps that may arise in fixed, future horizons of time. In future work, we will leverage the results of Section 4 to propose a specific controller synthesis procedure for nonlinear stochastic systems perturbed by shot and Lévy noise.