Layered Control for Cooperative Locomotion of Two Quadrupedal Robots: Centralized and Distributed Approaches

This article presents a layered control approach for real-time trajectory planning and control of robust cooperative locomotion by two holonomically constrained quadrupedal robots. A novel interconnected network of reduced-order models, based on the single rigid body (SRB) dynamics, is developed for trajectory planning purposes. At the higher level of the control architecture, two different model predictive control (MPC) algorithms are proposed to address the optimal control problem of the interconnected SRB dynamics: centralized and distributed MPCs. The distributed MPC assumes two local quadratic programs that share their optimal solutions according to a one-step communication delay and an agreement protocol. At the lower level of the control scheme, distributed nonlinear controllers are developed to impose the full-order dynamics to track the prescribed reduced-order trajectories generated by MPCs. The effectiveness of the control approach is verified with extensive numerical simulations and experiments for the robust and cooperative locomotion of two holonomically constrained A1 robots with different payloads on variable terrains and in the presence of disturbances. It is shown that the distributed MPC has a performance similar to that of the centralized MPC, while the computation time is reduced significantly.


A. Motivation and Goal
Human-centered communities, including factories, offices, and homes, are typically developed for humans, who are bipedal walkers capable of stepping over gaps and walking up/down stairs. This motivates the development of collaborative legged robots that can cooperatively work with each other to assist humans in different aspects of their life, such as labor-intensive tasks, construction, manufacturing, and assembly. One of the most challenging and essential problems in deploying collaborative legged robots is cooperative locomotion in complex environments, wherein the collaboration between robots is described by holonomic constraints. Cooperative locomotion with holonomic constraints arises in different applications of legged robots, such as cooperative transportation of payloads like social insects [1] (see Fig. 1), human-robot locomotion via prosthetic legs and exoskeletons [2]-[5], and human-robot locomotion via guide dog robots [6].
In recent years, important theoretical and technological advances have allowed for the successful control of multi-robot systems (MRSs) [7], [8], including collaborative robotic arms with or without mobility [9]-[14], aerial vehicles [15]-[26], and ground vehicles [27]-[31]. In addition, distributed control algorithms, including distributed receding horizon control approaches, have been developed to address the motion planning of MRSs, see, e.g., [32]-[35]. Some recent works also address the control and planning of heterogeneous robot teams, including legged robots [36]-[38], but without holonomic constraints amongst the agents. However, the capabilities of cooperative legged locomotion have not been fully explored. In particular, collaborating legged robots are inherently unstable dynamical systems with high dimensionality (i.e., many degrees of freedom (DOFs)), nonlinearity, and a hybrid nature, subject to underactuation and unilateral constraints, as opposed to most of the MRSs where the state-of-the-art algorithms have been deployed [39]. This complicates the design of real-time trajectory planning and control approaches, both in centralized and distributed fashions, that guarantee each agent's dynamic and robust stability while addressing the curse of dimensionality and respecting the holonomic and unilateral constraints.
Reduced-order (i.e., template) models provide low-dimensional realizations of full-order dynamical models of legged robots [40]. They can be integrated with convex optimization techniques and model predictive control (MPC) approaches to enable gait planning for the existing legged robots. Some popular reduced-order models include the linear inverted pendulum (LIP) model [41], centroidal dynamics [42], and single rigid body (SRB) dynamics [43]-[45]. These template models have been used for real-time planning of different single-agent bipedal [46]-[49] and quadrupedal robots [43]-[45], [50]-[55]. In this paper, we aim to answer three fundamental questions in the context of cooperative locomotion of legged robots. 1) How do we develop effective and interconnected reduced-order models that describe the cooperative locomotion of dynamic legged robots with holonomic constraints? 2) How do we develop computationally tractable predictive control algorithms in centralized and distributed manners for real-time planning of interconnected reduced-order models? In particular, we aim to examine the implementation of centralized and distributed predictive control algorithms for real-time planning to overcome the limitations caused by the curse of dimensionality in cooperative locomotion. 3) How do we map optimal reduced-order trajectories to full-order and complex dynamical models of cooperative locomotion?

Fig. 2. Overview of the proposed layered control approach with the centralized MPC algorithm at the high level and distributed nonlinear controllers at the low level for cooperative locomotion.
In order to address the above questions, this paper aims to develop mathematical foundations for, experimentally implement, and comprehensively study the cooperative locomotion of two holonomically constrained dynamic legged robots. In particular, the overarching goal of this paper is to develop a layered control algorithm for the real-time trajectory planning and control of dynamic cooperative locomotion for two holonomically constrained legged-robotic systems. The higher layer of the proposed algorithm considers an innovative reduced-order model, composed of two interconnected SRB dynamics subject to holonomic constraints, for the planning problem. The paper develops novel centralized and distributed MPC algorithms for the planning of the interconnected SRB dynamics (see Figs. 2 and 3). These MPC algorithms address the real-time planning at the higher layer of the control hierarchy subject to the interaction terms and feasibility of the ground reaction forces (GRFs). The optimal reduced-order trajectories and GRFs, generated by the high-level MPCs, are then mapped to the full-order and complex dynamics via distributed nonlinear controllers at the low level for the whole-body motion control. The low-level nonlinear controllers are developed based on quadratic programming (QP) and input-output (I-O) linearization. The efficacy of the proposed layered control approach is validated via extensive experiments for robustly stable locomotion of two holonomically constrained A1 quadrupedal robots that cooperatively transport unknown payloads on different terrains and in the presence of disturbances (see Fig. 1). A comprehensive numerical analysis of the performance of the proposed centralized and distributed MPC algorithms is finally presented.

B. Related Work
Holonomically constrained MRSs, including fixed-base collaborative robotic arms [10], [11], aerial vehicles with payloads [20]-[23], and ground vehicles [28]-[31], have gained significant attention in recent years. Moreover, MRSs augmented with robotic arms have been studied for more complex cooperative tasks [12]-[14], [24]-[26]. In contrast to the above-mentioned robotic systems, collaborative legged robots are dynamical systems with high dimensionality, unilateral constraints, and a hybrid nature that add further complexity to synthesizing planning and control algorithms. In addition, the strong interaction wrenches (forces/torques) between the agents, arising from the holonomic constraints, must be carefully addressed to obtain a robustly stable planner for cooperative legged locomotion. As a result, collaborative legged locomotion has not been studied to the same degree as other robotic systems. This paper, therefore, marks the first experimental implementation in this context.
In the context of legged robots, trajectory planning and control approaches can be divided into two categories: the ones using full-order models and the others using reduced-order models.

Fig. 3. Overview of the proposed layered control approach with the distributed MPC algorithms at the high level and distributed nonlinear controllers at the low level for cooperative locomotion.

Hybrid systems theory plays an important role in understanding and analyzing full-order dynamical models of legged locomotion [56]-[64]. Advanced nonlinear control algorithms such as hybrid reduction [65], controlled symmetries [66], transverse linearization [67], and hybrid zero dynamics (HZD) [68], [69] address the hybrid nature of full-order locomotion models. The HZD approach regulates some output functions, referred to as virtual constraints, with I-O linearization techniques [70] to coordinate the robot's links within a stride. This method can systematically address underactuation, and its effectiveness has been validated for stable locomotion of different bipedal [71]-[75] and quadrupedal robots [76], [77] as well as powered prosthetic legs [4], [5]. The full-order gait planning is typically formulated as a nonlinear programming (NLP) problem that can be addressed with existing NLP tools and direct collocation techniques [74], [78]-[81]. Although the direct-collocation-based approaches effectively generate optimal trajectories for full-order models of legged robots, they cannot address real-time trajectory optimization of cooperative legged robots in complex environments.
In contrast to full-order models of legged locomotion, template models present reduced-order representations of legged robots that significantly reduce the computational burden and complexity associated with trajectory optimization. Various template models, including the LIP [41], SRB [43]-[45], and centroidal dynamics [42], have been successfully integrated with the MPC framework for the real-time planning of bipedal and quadrupedal robots [43]-[53], [55]. The main challenge with using template models is bridging the gap between reduced- and full-order models of locomotion arising from abstraction (e.g., ignoring the legs' dynamics in template models). In particular, one needs to translate the optimal reduced-order trajectories to the full-order joint positions and torques. Different hierarchical control algorithms have been proposed in the literature to close this gap, in which a whole-body motion controller is utilized at the low level to map the optimal trajectories, generated by the higher-level MPC, to the full-order dynamics. For instance, [44], [45] have used Jacobian mapping, [1], [52] have used HZD-based controllers, [55] has used robust MPC integrated with reinforcement learning, [82] has used data-driven template models, and [54], [83] have used joint space whole-body controllers.
Despite the success of the above methods on individual robots, it is unknown what reduced-order models can effectively represent the dynamic and cooperative transportation of multi-agent legged robots. In addition, it is unclear if the existing MPC techniques can address the real-time trajectory planning for the reduced-order models of cooperative locomotion with increased dimensionality. Moreover, it is unclear how the centralized MPC algorithms for such complex models can be decomposed into lower-dimensional distributed MPC algorithms considering the strong interaction terms. Our previous work in [1] employed an interconnected network of LIP models with event-based MPC [52] as a trajectory planner for cooperative locomotion. The simple nature of the LIP model and event-based MPC reduced the computational burden by running the MPC only at the beginning of the continuous-time domains rather than at every time sample. However, using the LIP model prohibits us from capturing the interaction torques due to the assumption of a concentrated point mass at the center of mass (COM). This model also restricts the generation of dynamic cooperative gaits because the center of pressure (COP) must always remain within the support polygon, limiting the system's full potential. Moreover, the proposed event-based MPC was formulated only in a centralized manner and validated in numerical simulations without experimental validation. In the current work, we aim to develop a new framework that allows more dynamic cooperative gaits while solving the MPC problems faster, in both centralized and distributed manners, and experimentally validating the approach on two dynamic quadrupedal robots.

C. Objectives and Contributions
The objectives and key contributions of this paper are as follows: 1) The paper presents an innovative network of two holonomically constrained SRB dynamics as an effective reduced-order model to capture the interaction wrenches between agents while dynamically stabilizing the motion during cooperative locomotion. It is numerically shown that MPC algorithms utilizing a nominal SRB model cannot stabilize cooperative locomotion. 2) A layered control approach is proposed to robustly stabilize cooperative locomotion of holonomically constrained quadrupedal robots. At the high level of the control hierarchy, two different MPC algorithms, based on QP, are proposed: centralized MPC and distributed MPC (see Figs. 2 and 3). The centralized MPC algorithm solves for the optimal state trajectory, GRFs, and interaction wrenches for the interconnected SRB dynamics. The distributed MPC algorithm assumes two local QPs that share their optimal solutions with a one-step communication delay. The distributed MPCs solve for the local states, local GRFs, and estimated local interaction wrenches according to an agreement protocol in the cost function. 3) At the low level of the proposed control architecture, distributed and full-order nonlinear controllers are presented for the whole-body motion control of the agents. The distributed nonlinear controllers are developed based on QP and virtual constraints to impose the full-order dynamics to track the prescribed and optimal reduced-order trajectories and GRFs generated by the high-level MPC (centralized or distributed). 4) Extensive numerical simulations are presented to evaluate the performance of the cooperative locomotion of two holonomically constrained A1 robots with different payloads on different rough terrains and in the presence of external force disturbances. A comparative analysis of the closed-loop systems with centralized and distributed MPC algorithms with more than 1000 randomly generated rough terrain profiles and external forces is
presented. It is shown that the proposed distributed MPC algorithm has a performance similar to that of the centralized one, while the solve time is reduced by 70%.
In addition, it is shown that the proposed centralized and distributed MPCs can drastically improve the robust stability of cooperative locomotion subject to a wide range of uncertainties, while the nominal MPCs cannot stabilize it.
5) The effectiveness of the proposed layered control algorithms (centralized and distributed) is verified with an extensive set of experiments for the blind and cooperative locomotion of two holonomically constrained A1 quadrupedal robots, each with 18 DOFs. The experiments include cooperative locomotion with different and unknown payloads on different terrains (covered with blocks, gravel, mulch, and slippery surfaces) and in the presence of external pushes and tethered pulling. A detailed robustness analysis is presented to experimentally evaluate the performance of the closed-loop system against violations of the assumptions made for the synthesis of the controller.

D. Organization
The paper is organized as follows. Section II develops interconnected SRB models as a reduced-order model of cooperative locomotion. Section III formulates centralized and distributed MPC-based trajectory planning algorithms with the proposed reduced-order model. Section IV presents distributed nonlinear controllers for the whole-body motion control. Section V provides a detailed and extensive set of numerical and experimental validations of the proposed layered control algorithm. In Section VI, we discuss the results and compare the performance of the centralized and distributed MPC algorithms. Section VII finally presents some concluding remarks and future research directions.

II. REDUCED-ORDER MODEL OF COOPERATIVE LEGGED LOCOMOTION

This section aims to address the reduced-order models that describe the cooperative locomotion of two holonomically constrained quadrupedal robots. The section assumes a rigid bar connected via ball joints to two points on the robots for carrying objects (see Fig. 1). These two points will be referred to as the interaction points. This assumption simplifies the analysis and results in a holonomic constraint, stating that the Euclidean distance between the interaction points is constant. However, the analysis of this section can be extended to more sophisticated connections, such as restricting the pitch or roll angles of the bar/load. In Section VI-D, we will experimentally show the robustness of the developed algorithms subject to these additional constraints.
In our notation, the subscript i ∈ I := {1, 2} represents the ith robot. We assume that {B_i} is the local frame rigidly attached to the body of agent i with its origin at the COM. The orientation of the frame {B_i} with respect to the inertial world frame {O} is denoted by R_i ∈ SO(3), where SO(3) := {R ∈ R^{3×3} | R R⊤ = I, det(R) = 1} is the special orthogonal group of order 3 and I represents the identity matrix. The Cartesian coordinates of the COM of agent i with respect to {O} are represented by r_ci := col(x_ci, y_ci, z_ci) ∈ R^3, where "col" denotes the column operator. Moreover, ω_i^{B_i} ∈ R^3 represents the angular velocity of agent i expressed in the body frame {B_i}. We assume that p_i ∈ R^3 for i ∈ I represents the Cartesian coordinates of the interaction points with respect to the inertial frame {O}, that is,

p_i = r_ci + R_i d_i^{B_i},   (1)

where d_i^{B_i} ∈ R^3 is a constant vector denoting the coordinates of the interaction point in the body frame {B_i}. For future purposes, we define η_i := R_i d_i^{B_i} (see Fig. 4). We remark that the holonomic constraint between the two agents can be described as a constraint on the Euclidean distance between the interaction points as follows:

‖p_1 − p_2‖ = ψ_0,   (2)

in which ‖·‖ denotes the 2-norm, and ψ_0 is a constant number, determined based on the length of the bar.
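As a concrete sanity check on the kinematics above, the interaction points and the distance constraint can be evaluated directly. The following minimal Python sketch (all numerical values are hypothetical, chosen only for illustration) computes p_i = r_ci + R_i d_i^{B_i} from (1) and the residual of the distance constraint (2):

```python
import numpy as np

def interaction_point(r_c, R, d_body):
    """Interaction point p_i = r_ci + R_i d_i^{Bi} in the world frame, per (1)."""
    return r_c + R @ d_body

def holonomic_residual(p1, p2, psi0):
    """Residual ||p1 - p2|| - psi0 of the distance constraint (2); zero when satisfied."""
    return np.linalg.norm(p1 - p2) - psi0

# Hypothetical setup: two agents 1 m apart carrying a rigid bar of length psi0 = 1.0 m.
r1, r2 = np.array([0.0, 0.0, 0.3]), np.array([1.0, 0.0, 0.3])
R1 = R2 = np.eye(3)                     # level trunks (identity orientation)
d = np.array([0.0, 0.0, 0.1])           # interaction point 10 cm above each COM
p1 = interaction_point(r1, R1, d)
p2 = interaction_point(r2, R2, d)
print(holonomic_residual(p1, p2, 1.0))  # 0.0: the configuration satisfies (2)
```

A trajectory planner that respects (2) must keep this residual (and its time derivative) at zero along the whole predicted motion.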
According to the principle of virtual work, one can consider (p_1 − p_2) λ ∈ R^3 as the interaction force applied to agent 1 for some Lagrange multiplier λ ∈ R to be determined later (see again Fig. 4). Consequently, the net external wrench applied to agent i ∈ I can be expressed as follows:

w_i = col( Σ_{ℓ∈C_i} u_i^ℓ + (p_i − p_j) λ,  Σ_{ℓ∈C_i} (r_i^ℓ)∧ u_i^ℓ + (η_i)∧ (p_i − p_j) λ ),   (3)

where j ≠ i ∈ I denotes the index of the other agent and the hat map (·)∧ : R^3 → so(3) represents the skew-symmetric operator with the property x∧ y = x × y for all x, y ∈ R^3. In (3), the superscript ℓ ∈ C_i denotes the index of the legs contacting the ground, C_i represents the set of contacting legs for the agent i, and u_i^ℓ ∈ R^3 denotes the GRF at the contacting leg ℓ for the agent i. In addition, r_i^ℓ ∈ R^3 represents the position of each contacting leg with respect to the COM of agent i, that is, r_i^ℓ = r_foot,i^ℓ − r_ci, where r_foot,i^ℓ is the position of the contacting foot ℓ of the agent i with respect to the world frame {O}.
By taking the local state variables for the agent i ∈ I as x_i := col(r_ci, ṙ_ci, vec(R_i), ω_i^{B_i}), the global state variables can be defined as

x := col(x_1, x_2),   (4)

where "vec" represents the vectorization operator. Similarly, the global control inputs can be defined as u := col(u_1, u_2), where u_i denotes the local control inputs (i.e., GRFs) for the agent i, that is,

u_i := col(u_i^ℓ | ℓ ∈ C_i).   (5)

By differentiating the holonomic constraint (2), one can get

(p_1 − p_2)⊤ (ṗ_1 − ṗ_2) = 0,   (6)

and hence, the state manifold for the interconnected SRB dynamics can be expressed as

X := { x | ‖p_1 − p_2‖ = ψ_0, (p_1 − p_2)⊤ (ṗ_1 − ṗ_2) = 0 }.   (7)

Finally, the interconnected SRB dynamics can be expressed as

ẋ = f(x, u, λ),   (8)

where m and I ∈ R^{3×3} denote the total mass and the fixed moment of inertia in the body frame for each agent, respectively, and g represents the constant gravitational vector. We remark that the kinematics relations in (8) are expressed as Ṙ_i = R_i (ω_i^{B_i})∧ for i ∈ I, and the rotational dynamics can be further expressed as Euler's equation driven by the net external torque in (3). Here, u ∈ U, with U being the admissible set of control inputs, where m_u denotes the total number of legs contacting the ground; U represents the linearized friction cone for some friction coefficient µ (e.g., |u_{x,i}^ℓ| ≤ µ u_{z,i}^ℓ, |u_{y,i}^ℓ| ≤ µ u_{z,i}^ℓ, and u_{z,i}^ℓ ≥ 0 for all ℓ ∈ C_i); and TX is the tangent bundle of the state manifold X.
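The admissible input set U above stacks one linearized friction cone per stance leg. A minimal sketch, assuming the common four-sided pyramid approximation |u_x| ≤ µ u_z, |u_y| ≤ µ u_z, u_z ≥ 0 (the specific row ordering and normalization are illustrative choices), builds the inequality rows for one leg and block-stacks them for m_u legs:

```python
import numpy as np

def friction_pyramid(mu):
    """Rows A such that A @ u <= 0 encodes the linearized friction cone
    |u_x| <= mu*u_z, |u_y| <= mu*u_z, u_z >= 0 for one ground reaction force u."""
    return np.array([
        [ 1.0,  0.0, -mu],   #  u_x - mu*u_z <= 0
        [-1.0,  0.0, -mu],   # -u_x - mu*u_z <= 0
        [ 0.0,  1.0, -mu],   #  u_y - mu*u_z <= 0
        [ 0.0, -1.0, -mu],   # -u_y - mu*u_z <= 0
        [ 0.0,  0.0, -1.0],  # -u_z <= 0 (unilateral contact)
    ])

def stacked_cone(mu, m_u):
    """Block-diagonal inequality matrix covering all m_u contacting legs."""
    return np.kron(np.eye(m_u), friction_pyramid(mu))

u = np.array([0.2, -0.1, 1.0])                 # a candidate GRF (normalized units)
print(np.all(friction_pyramid(0.5) @ u <= 0))  # True: inside the mu = 0.5 cone
```

In the MPC, these rows appear as linear inequality constraints on the GRF decision variables, which is what keeps the planning problem a QP.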
In order to make the manifold X invariant under the flow of (8), one would need to choose the Lagrange multiplier λ to satisfy the holonomic constraint. In particular, differentiating (7) according to (1) and (8) results in an algebraic equation of the form ψ(x, u, λ) = 0. This latter equation, together with the equations of motion (8) and (3), results in λ being a function of (x, u). However, replacing this nonlinear expression for λ in (8) can make the original dynamics (8) more nonlinear and complex. Furthermore, this can numerically complicate the Jacobian linearization of ẋ = f(x, u, λ(x, u)) when formulating the trajectory planning problem as a convex MPC in Section III. Alternatively, we pursue a computationally effective approach by considering ẋ = f(x, u, λ) subject to the equality constraint ψ(x, u, λ) = 0 within the optimal control problem formulation. More specifically, the decision variables for the MPC include the trajectories of (x, u, λ) over the control horizon, and the MPC will satisfy the equality constraint. The other advantage of this technique is that the interconnected SRB dynamics can be integrated with the variational-based approach of [44], [45] to linearize and then discretize the dynamics such that the rotation matrices R_i, i ∈ I evolve on SO(3).
To clarify this latter point, following [45], we introduce a new set of local state variables for the agent i ∈ I with an abuse of notation as

x_i := col(r_ci, ṙ_ci, ξ_i, ω_i^{B_i}) ∈ R^{12}.   (11)

Here, ξ_i ∈ R^3 is a vector used to approximate the rotation matrix R_i around an operating point R_{i,op} as follows:

R_i ≈ R_{i,op} (I + (ξ_i)∧).   (12)

The approach of [45] has linearized the SRB dynamics subject to GRFs without interaction forces. Hence, one must extend the technique to write down the Taylor series expansion for the additional wrench terms in (3) arising from the interaction. This results in a discrete and linear time-varying (LTV) system to predict the future states as follows:

x_{k+1+t|t} = A_op x_{k+t|t} + B_op u_{k+t|t} + C_op λ_{k+t|t} + d_op   (13)

for all k = 0, 1, ..., N − 1 and with the initial condition x_{t|t} = x_t. Here, x ∈ R^{24} denotes the global state variables, N represents the control horizon, and (x_{k+t|t}, u_{k+t|t}, λ_{k+t|t}) denotes the tuple of the predicted global states, global inputs (i.e., GRFs), and Lagrange multiplier at time k + t computed at time t. Furthermore, A_op ∈ R^{24×24}, B_op ∈ R^{24×3m_u}, C_op ∈ R^{24}, and d_op ∈ R^{24} are the Jacobian matrices and offset term evaluated around the current operating point (x_t, u_{t−1}, λ_{t−1}).
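The recursion (13) is what the MPC uses to predict the coupled motion of both agents. The sketch below rolls out an LTV model of this form, with hypothetical random matrices standing in for the Jacobians (A_op, B_op, C_op, d_op) evaluated at an operating point; only the shapes (24 global states, a scalar multiplier, 3 inputs per stance leg) follow the text:

```python
import numpy as np

def rollout_ltv(A, B, C, d, x0, u_seq, lam_seq):
    """Roll out x_{k+1} = A x_k + B u_k + C lam_k + d over the horizon (Eq. (13) form)."""
    xs = [x0]
    for u, lam in zip(u_seq, lam_seq):
        xs.append(A @ xs[-1] + B @ u + C * lam + d)
    return np.stack(xs)

n, m, N = 24, 12, 6                # 24 global states; e.g., 4 stance legs -> 12 inputs
rng = np.random.default_rng(0)
A = np.eye(n) + 0.01 * rng.standard_normal((n, n))   # stand-in Jacobians
B = 0.01 * rng.standard_normal((n, m))
C = 0.01 * rng.standard_normal(n)  # column multiplying the scalar multiplier lambda
d = np.zeros(n)
xs = rollout_ltv(A, B, C, d, np.zeros(n), [np.zeros(m)] * N, [0.0] * N)
print(xs.shape)  # (7, 24): initial state plus N predicted states
```

Because the prediction is affine in (x, u, λ), substituting it into a quadratic cost keeps the planning problem a QP.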
The approximation in (12) only ensures that the rotation matrices evolve on SO(3). To guarantee that the state predictions in (13) belong to the tangent space of the state manifold at the operating point (i.e., T_op X), we first define the following equality constraint:

Ψ(x, u, λ) = 0.   (14)

Then, analogous to the technique used for the linearization of the interconnected dynamics, the equality constraint (14) can be approximated around the operating point as follows:

E_op x_{k+t|t} + F_op u_{k+t|t} + G_op λ_{k+t|t} + h_op = 0   (15)

to ensure that Ψ(x_{k+t|t}, u_{k+t|t}, λ_{k+t|t}) ≡ 0. Here, E_op, F_op, G_op, and h_op ∈ R^3 are proper matrices and vectors that can be either computed via the approach of [45] or symbolic calculus.

Remark 1: As the nature of the holonomic constraints between the agents becomes more complex, the procedure for obtaining the corresponding prediction model and equality constraints becomes computationally expensive. However, our experimental results in Section VI-D will indicate that the proposed layered control approach, developed based on the assumption of holonomic constraints in (2), can robustly stabilize cooperative locomotion subject to uncertainties in the constraints (e.g., limiting the pitch angles of the ball joints). In addition, Section V-B will show that ignoring the holonomic constraints (2) for the reduced-order model and trajectory planner can destabilize cooperative locomotion.
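Numerically, the linearized constraint (15) is just an affine residual that the planner drives to zero at every predicted step. The sketch below evaluates such a residual with hypothetical random coefficient matrices (the dimensions, 24 global states and 12 input components, are illustrative), constructing the offset so that the sampled tuple is feasible:

```python
import numpy as np

def constraint_residual(E, F, G, h, x, u, lam):
    """Residual of an affine constraint E x + F u + G lam + h (Eq. (15) form);
    zero means the predicted tuple stays on the linearized constraint surface."""
    return E @ x + F @ u + G * lam + h

rng = np.random.default_rng(1)
E = rng.standard_normal((3, 24))
F = rng.standard_normal((3, 12))
G = rng.standard_normal(3)
x, u, lam = rng.standard_normal(24), rng.standard_normal(12), 0.7
h = -(E @ x + F @ u + G * lam)      # offset chosen so this sample is feasible
print(np.linalg.norm(constraint_residual(E, F, G, h, x, u, lam)))  # 0.0
```

In the MPC, one copy of this residual per prediction step is appended to the equality constraints alongside the dynamics (13).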

III. MPC-BASED TRAJECTORY PLANNING
This section aims to formulate the real-time trajectory planning problem for cooperative locomotion as centralized and distributed MPC algorithms.

A. Centralized MPC
We will consider a locomotion pattern for the agents, described by the directed cycle G(V, E), where V and E ⊂ V ×V represent the sets of vertices and edges, respectively.The vertices denote the continuous-time domains of locomotion, and the edges represent the discrete-time transitions between the continuous-time domains.
Assumption 1: At every time sample t, the higher-level MPC is aware of the current stance legs, assuming that the stance leg configuration does not change throughout the prediction horizon.
Remark 2: Assumption 1 is not restrictive and simplifies the optimal control problem of (13) subject to (15) over the control horizon. Otherwise, one would need to consider the optimal control problem for a piecewise affine (PWA) system [84, Chap. 16] subject to different switching times.
We are now in a position to present the following real-time centralized MPC algorithm for cooperative locomotion:

min_{x(·), u(·), λ(·)}   Σ_{k=0}^{N−1} ( ‖x_{k+1+t|t} − x_des(k+1+t)‖²_Q + ‖u_{k+t|t}‖²_{R_u} + R_λ λ²_{k+t|t} )
s.t.   prediction model (13)
       equality constraints (15)
       u_{k+t|t} ∈ U,   k = 0, 1, ..., N − 1,   (16)

where the equality constraints for the MPC arise from a) the prediction model (13) to address the interconnected SRB dynamics with the initial condition of x_{t|t} = x_t, and b) the holonomic constraints (15) (see Fig. 2). Here, the centralized MPC solves for the optimal trajectories of the global states, global inputs, and the Lagrange multiplier, encoded in (x(·), u(·), λ(·)), to retain the sparsity structure of [85]. In the cost function, x_des(·) is a desired trajectory, Q and R_u are positive definite matrices, and R_λ is a positive scalar. Finally, the inequality constraints of (16) represent the feasibility of the GRFs for the two agents.
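Structurally, (16) is a QP whose equality constraints come from the stacked dynamics and holonomic constraints. A minimal sketch of its equality-constrained core, solved through the KKT linear system on a toy two-variable instance (the GRF inequality constraints, omitted here for brevity, would require a full QP solver such as an active-set or interior-point method):

```python
import numpy as np

def solve_eq_qp(H, g, Aeq, beq):
    """Minimize 0.5 z'Hz + g'z subject to Aeq z = beq via the KKT system
    [H A'; A 0] [z; nu] = [-g; b]. A sketch of the equality-constrained core
    of an MPC like (16); inequality constraints would turn this into a full QP."""
    n, m = H.shape[0], Aeq.shape[0]
    K = np.block([[H, Aeq.T], [Aeq, np.zeros((m, m))]])
    sol = np.linalg.solve(K, np.concatenate([-g, beq]))
    return sol[:n]          # optimal decision variables z*

# Toy instance: minimize ||z||^2 subject to z0 + z1 = 1.
z = solve_eq_qp(np.eye(2), np.zeros(2), np.array([[1.0, 1.0]]), np.array([1.0]))
print(z)  # [0.5 0.5]
```

In the real planner, z stacks (x, u, λ) over the horizon, H collects the Q, R_u, and R_λ weights, and the rows of Aeq are the per-step copies of (13) and (15); keeping this block-sparse structure is what makes the solve fast.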
Remark 3: The MPC in (16) addresses the trajectory planning problem over the current continuous-time domain. In particular, we do not include the following domain for prediction purposes. This is mainly due to the fact that the actual footholds for the following domain are not known a priori. More specifically, we employ Raibert's heuristic [86, Eq. (2.4), p. 46] to plan the upcoming footholds of each agent. Assuming pre-planned footholds, one can extend the MPC to include other domains. However, our experimental results in Section V suggest that planning over the current domain is sufficient for robustly stable cooperative locomotion. This is in agreement with most of the existing MPC approaches for single SRB dynamics. We also remark that the centralized MPC has (25 + 3 m_u) N decision variables, where m_u represents the total number of legs contacting the ground. Finally, the MPC problem (16) solves for the optimal trajectories of the state variables x*(·), control inputs u*(·), and Lagrange multiplier λ*(·). However, the high-level MPC only applies the first element of the optimal state and control sequence, i.e., (x*_{t+1|t}, u*_{t|t}), to the low-level nonlinear controller for tracking, while discarding λ*_{t|t} (see Fig. 2).
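A common form of the Raibert-style foothold heuristic referenced in Remark 3 places the foot under the hip projection, advanced by half the expected stance travel, plus a velocity-error correction; the sketch below uses this form with a hypothetical feedback gain and illustrative numbers (the paper's exact variant follows [86, Eq. (2.4)]):

```python
import numpy as np

def raibert_foothold(p_hip, v_com, v_des, t_stance, k=0.03):
    """Raibert-style foothold target in the ground plane: hip projection plus
    half the stance travel plus a velocity-error term (gain k is a hypothetical
    tuning value, not taken from the paper)."""
    return p_hip + 0.5 * t_stance * v_com + k * (v_com - v_des)

p_hip = np.array([0.2, 0.1])   # hip position projected onto the ground (m)
v_com = np.array([0.5, 0.0])   # current COM velocity (m/s)
v_des = np.array([0.4, 0.0])   # commanded velocity (m/s)
print(raibert_foothold(p_hip, v_com, v_des, t_stance=0.3))
```

Because the heuristic is evaluated per leg and per agent, each robot can choose its footholds locally, which is what allows the MPC to plan only over the current contact domain.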

B. Distributed MPC
This section aims to develop a network of distributed MPCs with a smaller number of decision variables that plan for the cooperative locomotion of two holonomically constrained quadrupedal robots. From (13), the local dynamics of the agent i ∈ I can be expressed as follows:

x_{i,k+1+t|t} = A_op,i x_{i,k+t|t} + B_op,i u_{i,k+t|t} + C_op,i λ_{k+t|t} + d_op,i + ∆_i(x_{j,k+t|t}, u_{j,k+t|t})   (17)

for k = 0, 1, ..., N − 1, where A_op,i, B_op,i, C_op,i, and d_op,i ∈ R^{12} denote the corresponding partitioning of (A_op, B_op, C_op, d_op). In addition, ∆_i(x_{j,k+t|t}, u_{j,k+t|t}) represents the interaction term on the agent i. Similarly, the equality constraints (15) can be rewritten as follows:

E_op,i x_{i,k+t|t} + F_op,i u_{i,k+t|t} + G_op λ_{k+t|t} + h_op + Ω_i(x_{j,k+t|t}, u_{j,k+t|t}) = 0,   (19)

in which E_op,i ∈ R^{3×12} and F_op,i ∈ R^{3×(3/2)m_u} are the corresponding columns of (E_op, F_op), and Ω_i(x_{j,k+t|t}, u_{j,k+t|t}) := E_op,j x_{j,k+t|t} + F_op,j u_{j,k+t|t} for j ≠ i. Motivated by the inherent limitation of the distributed QP problems, one would need to estimate the interaction terms ∆_i and Ω_i, i ∈ I, to solve the local QPs. For this purpose, we make the following assumption.
Assumption 2 (One-Step Communication Protocol): At every time sample t, each local MPC has access to the optimal solution of the other local MPC at time t−1.More specifically, the local MPCs share their previous optimal solutions over the network.
From Assumption 2, we can estimate the interaction terms ∆_i and Ω_i in (17) and (19) using the previous optimal solutions, that is,

∆_i ≈ ∆_i(x*_{j,k+t|t−1}, u*_{j,k+t|t−1}),   Ω_i ≈ Ω_i(x*_{j,k+t|t−1}, u*_{j,k+t|t−1}),   (20)

in which x*_{j,k+t|t−1} and u*_{j,k+t|t−1} denote the optimal solution from the local QP j for time k + t computed at time t − 1 for k = 0, 1, ..., N − 1. We remark that as the QP j does not plan for u*_{N−1+t|t−1}, we let u*_{N−1+t|t−1} = 0. The assumption in (20) estimates the interaction terms in the local dynamics and equality constraints based on the optimal values from the local QP j at the previous time sample. With this assumption, the local MPC i needs to optimally solve for its own local state trajectory x_i(·), local control trajectory u_i(·), and the Lagrange multiplier trajectory λ(·). However, as the Lagrange multiplier λ is common between the decision variables of the two local MPCs, they need to reach a consensus over time on the optimal λ value.
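One way to realize (20) in code is to note that the neighbor's plan from time t − 1, indexed by its own horizon, lines up with the current horizon after a one-sample shift, with the final unplanned entry zero-padded as stated above. A minimal sketch (horizon length and input dimension are illustrative):

```python
import numpy as np

def estimate_neighbor_plan(prev_plan):
    """One-step-delay estimate in the spirit of (20): shift the neighbor's
    optimal plan from time t-1 by one sample so it lines up with the current
    horizon, and pad the final, unplanned entry with zeros."""
    shifted = np.roll(prev_plan, -1, axis=0)  # row k now holds prev_plan[k+1]
    shifted[-1] = 0.0                         # u*_{N-1+t|t-1} is set to zero
    return shifted

N, m = 5, 3
prev_u = np.arange(N * m, dtype=float).reshape(N, m)  # neighbor's plan at t-1
est_u = estimate_neighbor_plan(prev_u)
print(est_u[0], est_u[-1])  # first row equals prev_u[1]; last row is zeros
```

Each local QP evaluates its interaction terms ∆_i and Ω_i on these shifted trajectories, so no variable of agent j appears among agent i's decision variables.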
To address the consensus problem, we develop an agreement protocol as follows. The cost function of the centralized MPC (16) can be written as the sum of individual terms, i.e.,

J(x, u, λ) = Σ_{i∈I} J_i(x_i, u_i, λ).   (21)

We assume that each local QP estimates its own trajectory of the Lagrange multiplier, denoted by λ_i(·). We then propose the following real-time distributed MPC for agent i ∈ I:

min_{x_i(·), u_i(·), λ_i(·)}   J_i(x_i, u_i, λ_i) + w Σ_{k=0}^{N−1} ( λ_{i,k+t|t} − a_ii λ*_{i,k+t|t−1} − a_ij λ*_{j,k+t|t−1} )² + β⊤_{j,t−1} K_{j,i} col(x_i(·), u_i(·)) + β⊤_{j,t−1} K_{j,λ} λ_i(·)
s.t.   local prediction model (17) with (20)
       local equality constraints (19) with (20)
       u_{i,k+t|t} ∈ U_i,   k = 0, 1, ..., N − 1,   (22)

where w is a positive weighting factor added to introduce a new term in the local cost function to address the agreement protocol. In particular, the agreement term penalizes the difference between the local predicted values of λ_{i,k+t|t} and the average of the previously computed optimal values λ*_{i,k+t|t−1} and λ*_{j,k+t|t−1} by the local MPCs i and j at time t − 1. Here, a_ii and a_ij are the weighting factors for averaging with the property a_ii, a_ij ∈ [0, 1] and a_ii + a_ij = 1, where i ≠ j ∈ I.
The last two terms in the cost function will be described shortly. The distributed MPC (22) has two sets of equality constraints arising from a) the local dynamics (17), and b) the holonomic constraint (19) with the assumption (20). The proposed local MPC for the agent i does not consider the local dynamics of the other agent (i.e., agent j). Instead, motivated by our previous work [87], it uses the Karush-Kuhn-Tucker (KKT) Lagrange multipliers that correspond to the equality constraint arising from the local dynamics of the agent j in the QP j at time t − 1. This set of KKT Lagrange multipliers is denoted by β_{j,t−1}. In addition, K_{j,i} and K_{j,λ} represent the sensitivity (i.e., gradient) of the local dynamics j with respect to the local variables (x_i(·), u_i(·)) and λ(·), respectively. In particular, K_{j,i} can be computed in a straightforward manner by taking the gradient of the local interaction terms ∆_j with respect to (x_{i,k+t|t}, u_{i,k+t|t}) over the entire control horizon and stacking the results together.
An analogous approach can be used to compute the sensitivity matrix K_{j,λ}. We then add the last two linear terms to the cost function of the local MPC (22). Our previous work [87, Theorem 1] has shown that the inclusion of the KKT Lagrange multipliers β_{j,t−1} together with the sensitivity matrices (K_{j,i}, K_{j,λ}) in the cost function can result in a set of local KKT conditions that have a similar structure to that of the KKT equations for the centralized problem. Finally, U_i in (22) represents the local feasibility set for the GRFs (i.e., inputs).

Remark 4: We remark that the local MPCs in the proposed structure (22) share their optimal local state trajectory x_i(·), local control trajectory u_i(·), local estimate of the Lagrange multiplier trajectory λ_i(·), and the KKT Lagrange multipliers β_i corresponding to the local dynamics in the QP structure with the other agent, according to the one-step communication delay protocol (see Fig. 3). Finally, the number of decision variables for each local MPC becomes (13 + (3/2) m_u) N. Section VI-B will numerically study the consensus of the estimated Lagrange multipliers.
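The averaging structure of the agreement term can be illustrated in isolation. The sketch below is an idealized caricature, assuming each local MPC could track its agreement target exactly (in reality the target is only one penalty among several cost terms); it shows how the convex-combination targets pull two initially different multiplier estimates to a common value:

```python
import numpy as np

def agreement_target(lam_i_prev, lam_j_prev, a_ii=0.5, a_ij=0.5):
    """Target of the agreement penalty in (22): a convex combination of both
    agents' previous Lagrange-multiplier estimates (a_ii + a_ij = 1)."""
    return a_ii * lam_i_prev + a_ij * lam_j_prev

# Idealized iteration: each agent's new estimate equals its agreement target.
lam = np.array([10.0, 2.0])      # initial multiplier estimates of agents 1 and 2
for _ in range(20):
    lam = np.array([agreement_target(lam[0], lam[1]),
                    agreement_target(lam[1], lam[0])])
print(lam)  # both entries equal 6.0: the estimates have reached consensus
```

With a_ii = a_ij = 1/2 both estimates jump to the mutual average in one step and remain there; the consensus behavior of the actual closed-loop planner is studied numerically in Section VI-B.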

IV. DISTRIBUTED NONLINEAR CONTROLLERS FOR FULL-ORDER MODELS

The objective of this section is to present the low-level and distributed nonlinear controllers for the whole-body motion control of each agent. The full-order and floating-base model of agent i can be described by the Euler-Lagrange equations and the principle of virtual work as follows:

D(q i ) q̈ i + H(q i , q̇ i ) = Υ τ i + Σ ℓ∈C i J ℓ (q i )⊤ f i,ℓ + J int (q i )⊤ (p i − p j ) λ,    (23)

where q i ∈ Q ⊂ R^nq represents the generalized coordinates of robot i, Q and n q denote the configuration space and the number of DOFs, respectively, τ i ∈ T ⊂ R^nτ represents the joint-level torques at the actuated joints, T is a closed and convex set of admissible torques, and C i represents the set of legs contacting the environment. In addition, f i,ℓ denotes the GRF at the contacting leg ℓ ∈ C i of the full-order model for agent i. We remark that the GRF at the contacting leg ℓ ∈ C i of the reduced-order model for agent i was denoted by u i in Section II. This distinction is due to the possible gap between the reduced- and full-order models arising from abstraction (i.e., ignoring the legs' dynamics). Moreover, D(q i ) ∈ R^{nq×nq} denotes the positive definite mass-inertia matrix, H(q i , q̇ i ) ∈ R^nq represents the Coriolis, centrifugal, and gravitational terms, Υ ∈ R^{nq×nτ} is the input distribution matrix, J ℓ (q i ) denotes the contact Jacobian matrix at leg ℓ, J int (q i ) represents the Jacobian of the interaction point p i , and (p i − p j ) λ denotes the interaction force between the two agents as described in the reduced-order model of Section II. The local full-order state variables for agent i are defined as z i := col(q i , q̇ i ). For future purposes, the vector of GRFs for agent i is denoted by f i := col{f i,ℓ | ℓ ∈ C i }.

For the whole-body motion control of each agent, we develop a QP-based nonlinear controller that maps the desired optimal trajectories and GRFs, generated by the high-level MPC, to the full-order model. For this purpose, we consider the following time-varying and holonomic output functions, referred to as virtual constraints [76], to be regulated:

y i (t, q i ) := y a (q i ) − y des (t),    (24)

where y a (q i ) represents a set of controlled variables and y des (t) denotes the desired evolution of the controlled variables in terms of a time-based phasing variable. In this paper, the controlled variables include the Cartesian coordinates of the robot's COM, the absolute orientation of the robot's body (i.e., Euler angles), and the Cartesian coordinates of the swing leg ends. The desired evolutions of the COM position and Euler angles are generated by the high-level MPC. In addition, the desired evolution of the swing leg ends' coordinates is taken as a Bézier polynomial connecting the current footholds to the upcoming ones, computed based on Raibert's heuristic [86, Eq. (2.4), pp. 46].

We next implement the standard I-O linearization technique [70] to differentiate the local outputs (24) twice along the full-order dynamics (23) while ignoring the interaction forces between the agents. This results in the following output dynamics:

ÿ i = Φ τ (z i ) τ i + Φ f (z i ) f i + φ(z i ) = −K P y i − K D ẏ i + δ i ,    (25)

where Φ τ (z i ), Φ f (z i ), and φ(z i ) are proper matrices and vectors computed based on I-O linearization and Lie derivatives similar to [1, Appendix A]. Moreover, K P and K D are positive definite matrices, and δ i is a slack variable to be used later for the feasibility of the QP-based nonlinear controller. Unlike the high-level trajectory planner of Section III that takes the interaction terms into account, the low-level nonlinear controller ignores the interaction forces. In particular, our numerical results in Section V suggest that considering holonomic constraints for trajectory planning is crucial for stabilizing cooperative locomotion. However, the optimal trajectories generated by the high-level MPC can be robustly mapped to the full-order model without considering the interaction terms at the low level. This reduces the complexity of the distributed full-order model controllers.
By stacking together the Cartesian coordinates of the stance leg ends and then differentiating them twice, one can obtain an additional constraint expressing zero acceleration for the stance leg ends. In particular, we have

r̈ foot,i = Θ τ (z i ) τ i + Θ f (z i ) f i + θ(z i ) = 0,    (26)

where r foot,i := col{r foot,i,ℓ | ℓ ∈ C i } is a vector containing the Cartesian coordinates of the stance feet for agent i. Moreover, Θ τ (z i ), Θ f (z i ), and θ(z i ) are proper matrices and vectors computed based on I-O linearization. We then employ the following real-time and strictly convex QP [76], solved for feasible (τ i , f i , δ i ) at 1 kHz, to meet the output dynamics (25) and the contact equation (26):

min (τ i , f i , δ i ):  γ 1 ||τ i ||² + γ 2 ||f i − f des,i ||² + γ 3 ||δ i ||²  s.t. (25), (26), τ i ∈ T, and friction cone conditions,    (27)

where γ 1 , γ 2 , and γ 3 are positive weighting factors, and f des,i represents the desired evolution of the GRFs generated by the high-level MPC. The cost function of (27) minimizes the effect of the slack variable δ i in the output dynamics (25) via a high weighting factor γ 3 while 1) imposing the actual GRFs of the full-order model f i to follow the prescribed force profile f des,i with the weighting factor γ 2 , and 2) selecting minimum-power torques with the weighting factor γ 1 .
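To illustrate the structure of (27), the following Python sketch solves the equality-constrained core of the QP, i.e., the output dynamics (25) and the contact constraint (26), in closed form via its KKT system. Torque limits and friction-cone inequalities are omitted; in practice the full problem would be handed to a solver such as qpSWIFT. All names and dimensions are illustrative:

```python
import numpy as np

def wbc_qp(Phi_tau, Phi_f, phi, Theta_tau, Theta_f, theta,
           y, ydot, f_des, KP, KD, g1=1e2, g2=1e4, g3=1e6):
    """Equality-constrained core of the low-level QP (27).

    Decision variables w = (tau_i, f_i, delta_i).  Minimizing
    (1/2) w' H w + g' w with H = diag(g1 I, g2 I, g3 I) is equivalent
    (up to a factor of 1/2 and a constant) to the cost
    g1||tau||^2 + g2||f - f_des||^2 + g3||delta||^2 of (27).
    """
    n_tau, n_f, n_y = Phi_tau.shape[1], Phi_f.shape[1], Phi_tau.shape[0]
    n = n_tau + n_f + n_y
    H = np.diag(np.r_[g1 * np.ones(n_tau), g2 * np.ones(n_f), g3 * np.ones(n_y)])
    g = np.r_[np.zeros(n_tau), -g2 * f_des, np.zeros(n_y)]
    # Output dynamics (25): Phi_tau tau + Phi_f f - delta = -KP y - KD ydot - phi
    A1 = np.hstack([Phi_tau, Phi_f, -np.eye(n_y)])
    b1 = -KP @ y - KD @ ydot - phi
    # Contact constraint (26): Theta_tau tau + Theta_f f = -theta
    A2 = np.hstack([Theta_tau, Theta_f, np.zeros((Theta_tau.shape[0], n_y))])
    A, b = np.vstack([A1, A2]), np.r_[b1, -theta]
    # Solve the KKT system of the equality-constrained QP
    m = A.shape[0]
    K = np.block([[H, A.T], [A, np.zeros((m, m))]])
    w = np.linalg.solve(K, np.r_[-g, b])[:n]
    return w[:n_tau], w[n_tau:n_tau + n_f], w[n_tau + n_f:]
```

Because H is positive definite and the stacked constraint matrix has full row rank (the −I block in A1 guarantees its rows are independent), the KKT matrix is invertible and the solve is unique.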

V. NUMERICAL AND EXPERIMENTAL VALIDATIONS
This section aims to validate the proposed layered control architecture, composed of the high-level centralized and distributed MPC algorithms and the low-level distributed nonlinear controllers, via extensive numerical simulations and experiments. We study both the reduced- and full-order models of cooperative locomotion in numerical simulations to show the robust stability of the collaborative gaits. We further experimentally investigate the robustness of the optimal trajectories.

The interaction point of each agent is mounted on the torso at col(0, 0, 0.15) (m) relative to the COM for all i ∈ I (see (1)). Different mechanisms are designed to holonomically constrain the motion of the two robots with ball joints and an adjustable bar length between the agents (see Fig. 5). Furthermore, the mechanisms can limit the ball joints to add further constraints on their Euler angles. For the numerical and experimental studies in Sections V-B and V-C, the nominal length of the bar is 1 (m) (see Fig. 1). However, we alter it to 0.75 (m) and 1.5 (m) for the robustness analysis.
In the following sections, we study a cooperative trot gait with a swing time of 0.2 (s) at different speeds up to 0.5 (m/s), subject to external disturbances, uncertainties in the holonomic constraints, and unknown payloads (up to 55% uncertainty in one robot's mass), and on different terrains (e.g., slippery surfaces, wooden blocks, gravel, mulch, and grass).
2) Computation, control loop, and network: We use RaiSim [88] to simulate both the interconnected reduced- and full-order models numerically. The proposed high-level centralized and distributed MPC algorithms for trajectory planning and the low-level distributed nonlinear controllers for whole-body motion control are solved using qpSWIFT [89] at 200 Hz and 1 kHz, respectively. A joystick is used to command the desired velocity trajectories to the high-level trajectory planner. The joystick includes two 2-DOF gimbals, six auxiliary switches, and two knobs for control purposes (see Fig. 6). The gimbals are used to command the desired speed, whereas the switches allow us to simultaneously control both agents or command them individually. This control scheme allows us to effectively coordinate the agents during cooperative locomotion and unexpected scenarios. This will be discussed further in Section VI-E. Moreover, we remark that the joystick commands the desired trajectories for both the numerical simulations and the experimental validations. The joystick connects with the computer through a 2.4 GHz wireless channel, as described in Fig. 6.

Fig. 7. Snapshots demonstrating the performance of the proposed control approach in numerical simulations. The left plot shows the snapshot of the numerical simulation with the interconnected reduced-order model (torso dynamics) subject to a 5 (kg) payload (40% uncertainty in one robot's mass) between the agents. The right plot shows the snapshot of the numerical simulation with the full-order model subject to a 5 (kg) payload between the agents and unknown time-varying external disturbances applied to the robots. Arrows at the leg ends describe the GRFs, and the ones on the torso represent the external disturbance forces. The payload is illustrated with a box.
The proposed layered controller, including the MPC-based trajectory planners and the distributed nonlinear controllers, is solved on an off-board laptop computer with an i7-10750H CPU running at 2.60 GHz and 16 GB RAM. For the experiments, we use a network switch to connect the robotic team and the computer. The connection diagram is illustrated in Fig. 6. The switch supports 1000 Mbps gigabit Ethernet with five ports. The robot IP addresses are redefined to avoid IP collisions during communication. We also define the IP routing table and proper IP addresses on the computer to communicate with both agents without data packet confusion. Internally, a UDP protocol through Ethernet cables is used to communicate between the computer and the robots. In the numerical simulations, C++ data structures are used to communicate between the layered control architecture and the simulation environment.
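As a minimal sketch of the computer-to-robot link, the following Python snippet exchanges a fixed-layout packet over UDP. The packet layout (12 reduced-order states followed by 12 desired GRF components per message) and the addressing are hypothetical illustrations, not the A1 SDK's actual format:

```python
import socket
import struct

# Hypothetical fixed packet layout: 12 reduced-order states followed by
# 12 desired GRF components (little-endian doubles).
PACKET = struct.Struct("<12d12d")

def send_command(sock, addr, states, grfs):
    """Send one state/GRF command packet to an agent over UDP."""
    sock.sendto(PACKET.pack(*states, *grfs), addr)

def recv_command(sock):
    """Receive one packet and split it back into states and GRFs."""
    data, _ = sock.recvfrom(PACKET.size)
    values = PACKET.unpack(data)
    return values[:12], values[12:]
```

A fixed binary layout keeps packets small (192 bytes here), which matters at the 1 kHz low-level control rate described above.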
3) Tuning controllers: The control horizon for both the centralized and distributed MPCs is taken as N = 5 discrete-time samples, where the time discretization at the high level is 5 (ms). The centralized and distributed MPC algorithms in (16) and (22) have 245 and 125 decision variables, respectively. The stage cost gain of the centralized MPC is tuned as Q = diag{Q rc1 , Q ṙc1 , Q ξ1 , Q ω1 , Q rc2 , Q ṙc2 , Q ξ2 , Q ω2 } ∈ R 24×24 , where Q rci = 10^5 × diag{3, 300, 30}, Q ṙci = 10^4 I 3×3 , Q ξi = 10^8 I 3×3 , and Q ωi = 5 × 10^3 I 3×3 for i ∈ I. The terminal cost gain of the centralized MPC is tuned as P = 10^−1 Q ∈ R 24×24 . The input gains of the centralized MPC are chosen as R u = 10^−2 I 24×24 and R λ = 10^4 . In a similar manner, the stage cost gain and terminal cost gain of the distributed MPC on the i-th agent are tuned as Q i = diag{Q rci , Q ṙci , Q ξi , Q ωi } ∈ R 12×12 and P i = 10^−1 Q i ∈ R 12×12 . The input gains of the distributed MPC are chosen as R u = 10^−2 I 12×12 and R λ = 10^4 . Additionally, we choose the weighting factor for the agreement protocol in (22) as w = 10, and the averaging factors in (22) are chosen as a ii = a ij = 0.5 for all i ≠ j ∈ I. The friction coefficient for both the centralized and distributed MPC algorithms is assumed to be µ = 0.6. However, the experiments on slippery surfaces assume a lower friction coefficient of µ = 0.3. For the low-level and distributed nonlinear controllers in (27), the weighting factors for the joint-level torques, force tracking error, and slack variables are chosen as γ 1 = 10^2 , γ 2 = 10^4 , and γ 3 = 10^6 , respectively. We finally remark that the low-level controller uses the same friction coefficient values as the high-level MPC.
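For reference, the weight matrices above can be assembled as block-diagonal matrices. The following Python sketch reproduces the per-agent (12 × 12) and centralized (24 × 24) stage and terminal cost gains from the values listed in this section; the zero off-diagonal coupling blocks follow from the diagonal structure stated above:

```python
import numpy as np

def mpc_cost_weights():
    """Assemble the stage and terminal cost gains listed in Section V."""
    Q_rc  = 1e5 * np.diag([3.0, 300.0, 30.0])   # COM position
    Q_drc = 1e4 * np.eye(3)                     # COM velocity
    Q_xi  = 1e8 * np.eye(3)                     # orientation (Euler angles)
    Q_om  = 5e3 * np.eye(3)                     # angular velocity
    # Per-agent (distributed) stage cost: 12 x 12 block-diagonal matrix
    Qi = np.zeros((12, 12))
    for k, block in enumerate([Q_rc, Q_drc, Q_xi, Q_om]):
        Qi[3 * k:3 * k + 3, 3 * k:3 * k + 3] = block
    # Centralized stage cost stacks both agents: 24 x 24
    Q = np.zeros((24, 24))
    Q[:12, :12], Q[12:, 12:] = Qi, Qi
    # Terminal cost gains: P_i = 1e-1 * Q_i and P = 1e-1 * Q
    return Qi, Q, 0.1 * Qi, 0.1 * Q
```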
The computation time of the centralized and distributed MPC algorithms under nominal conditions is approximately 1.38 (ms) and 0.41 (ms), respectively. This shows that the solve time with the proposed distributed MPC is reduced by 70%. Furthermore, the computation time of the low-level distributed nonlinear controllers is about 0.12 (ms).

B. Numerical Validation
1) Simulation with the reduced-order model: We model the interconnected SRB dynamics in the RaiSim environment for numerical validation and apply the optimal GRFs generated by the proposed centralized (16) and distributed (22) MPC algorithms. In addition, for comparison purposes, we apply to this interconnected model the GRFs generated by the nominal MPC, which considers a standard SRB model without the holonomic constraints. An overview of the numerical simulation environment for the interconnected reduced-order model is illustrated in Fig. 7(a). The evolution of the desired and actual COM velocities using the nominal MPC is depicted in Fig. 8. It is evident that the nominal MPC, which does not consider the holonomic constraint between the agents, cannot stabilize the interconnected reduced-order system. On the other hand, the interconnected SRB model performs stable cooperative locomotion when integrated with the GRFs generated by the proposed centralized and distributed MPCs, as shown in Figs. 9 and 10, respectively. In these simulations, an unknown payload of 5 (kg) (40% uncertainty in one robot's mass) is considered between the agents (i.e., in the middle of the bar), and the joystick provides the desired trajectories. Figures 9 and 10 illustrate that the closed-loop interconnected reduced-order model robustly tracks the time-varying desired trajectories subject to unknown payloads. Animations of all simulations can be found online [90].
2) Simulation with the full-order model: We next numerically study the performance of the closed-loop system with the interconnected full-order dynamical model in RaiSim.
Here, the proposed layered control approach is employed, including the centralized and distributed MPC algorithms for trajectory planning and the nonlinear controllers for whole-body motion control. The desired time-varying trajectories are generated using the joystick. The high-level MPC then generates optimal GRFs and reduced-order trajectories from the centralized and distributed algorithms. The distributed low-level controller computes the corresponding joint-level torques to impose the full-order model to track the optimal trajectories. An overview of the numerical simulation environment for the full-order model is illustrated in Fig. 7(b). The desired trajectories provided by the joystick together with the optimal trajectories computed by the centralized and distributed MPCs are depicted in Figs. 11(a) and 11(b) for agent 1. Here, we consider the trot gait over a randomly generated rough terrain with a maximum height of 5 (cm) (19% uncertainty in the robot's nominal height). The gait is also subject to an unknown payload of 5 (kg) and an unknown sinusoidal external disturbance force with a magnitude of 20 (N) and periods of 1.0 (s), 0.7 (s), and 0.4 (s) along the x-, y-, and z-directions, respectively. It is evident that the closed-loop system robustly tracks the desired trajectories.

C. Experimental Validation and Robustness Analysis
This section experimentally validates the proposed layered control approach with the high-level centralized and distributed MPC algorithms and the low-level distributed nonlinear controllers. The robustness of the cooperative gaits on different indoor and outdoor terrains, subject to unknown payloads and external disturbances, is evaluated.

Fig. 14. Plots of the optimal GRFs generated from the centralized MPC for agents 1 and 2 during the nominal trot experiment on flat ground. The figure depicts the z components of the optimal GRFs for the left front leg of each agent. We remark that the GRFs in the z-direction are close to 60 (N) since the trot gait is adopted and the total mass of the robot is 12.45 (kg).

1) Indoor experiments with the centralized MPC: In the indoor experiments, we employ the proposed layered control algorithm on two A1 robots subject to holonomic constraints, where ball joints are applied at the interaction points (see Fig. 12). We first investigate the nominal cooperative trot gait with the centralized MPC algorithm on flat ground and without unknown disturbances. The desired and optimal COM trajectories generated by the high-level MPC, together with the generated optimal GRFs, are illustrated in Fig. 13 and Fig. 14, respectively. It is evident that the team of two A1 robots performs stable cooperative locomotion while the trajectory planner effectively tracks the time-varying desired trajectories. Furthermore, the optimal GRFs generated by the centralized MPC are feasible, with the vertical component being close to 60 (N), which is approximately the force required by each stance leg to support the total mass of each robot during trotting.
We further investigate the robustness of the proposed layered control approach by studying the tracking performance of the closed-loop system in different experiments, including locomotion on rough terrain (see Fig. 12(a)), locomotion on a slippery surface (see Fig. 12(c)), and locomotion subject to unknown external disturbances (see Fig. 12(d)), as shown in Figs. 15(a), 15(b), and 15(c), respectively. In these experiments, the rough terrain is made of randomly displaced wooden blocks with a maximum height of 5 (cm) (19% of the robot's height). Moreover, the slippery surface is a whiteboard covered with cooking spray. The unknown external disturbances are applied by a human user, including pushes and tethered pulls on both agents. The robots cooperatively transport an unknown payload of 4.53 (kg) (36% uncertainty in one robot's mass) in all these experiments. The optimal GRFs computed by the MPC on rough terrain, on the slippery surface, and subject to external disturbances are depicted in Figs. 16(a), 16(b), and 16(c), respectively. We remark that despite the uncertainties, the GRFs remain in the feasible range, and the MPC's outputs robustly track the desired time-varying trajectories. Furthermore, the phase portraits of the body's roll and pitch motions (i.e., unactuated DOFs) during these cooperative trot gaits are shown in Figs. 17(a) and 17(b). Figure 17 indicates that the A1 robots can perform robustly stable cooperative locomotion in the presence of various unknown terrains and disturbances. Videos of all experiments are available online [90].
2) Indoor experiments with the distributed MPC: In this part, we evaluate the performance of the closed-loop system with the proposed distributed MPC algorithm in similar indoor experiments (see Fig. 12). The evolution of the optimal trajectories generated from the distributed MPC and the time-varying desired trajectories during the cooperative transportation of the same payload over rough terrain, on the slippery surface, and subject to unknown disturbances are illustrated in Fig. 18.

The evolution of the virtual constraints (24) for trotting over gravel and transitioning from concrete to grass with the centralized MPC, and for trotting over mulch and grass with the distributed MPC, is shown in Fig. 21. As the virtual constraint plots stay close to zero, we can conclude that the full-order system effectively tracks the optimal reduced-order trajectories generated by the high-level MPCs. Furthermore, it is evident that the proposed layered control approach with both the centralized and distributed MPCs can robustly stabilize cooperative gaits in the presence of payloads on unknown outdoor terrains.

VI. DISCUSSION AND COMPARISON
Numerical simulations and experimental validations in Section V show the effectiveness of the proposed centralized and distributed MPC algorithms for cooperative locomotion. This section aims to analyze and compare the performance of the proposed MPCs while discussing their limitations.

Here, (a) and (c) depict the evolution of the virtual constraints with the distributed MPC at the high level. In addition, (b) and (d) illustrate the evolution of the virtual constraints with the centralized MPC at the high level. We plot the components of the virtual constraints in (24) that correspond to the COM position along the x and y axes (m) (i.e., COM position tracking) and the body's roll and pitch angles (rad) (i.e., angle tracking). It is clear that the full-order system tracks the prescribed optimal reduced-order trajectories generated by the MPCs.

A. Comparison of the Centralized and Distributed MPCs
The robustness of cooperative locomotion with the proposed centralized and distributed MPC algorithms in the presence of various uncertainties and disturbances is studied numerically and experimentally in Section V. To compare the performance and robustness of the proposed trajectory planners, we apply the nominal, centralized, and distributed MPCs over 1500 randomly generated rough terrains in the RaiSim simulation environment, as shown in Fig. 22(a). Here, the maximum height of the randomly generated landscapes is 12 (cm) (46% uncertainty in the robot's height). Furthermore, the total length of the terrain is assumed to be 10 (m). In these simulations, we count a trial as a success if the agents reach 10 (m) without losing stability, and as a failure if at least one agent's body touches the ground before reaching 10 (m). The success rate versus the length of the terrain is depicted in Fig. 22(b). The overall success rates of the nominal, centralized, and distributed MPCs are 0%, 54.2%, and 53.8%, respectively.
Similarly, we compare the performance and robustness of the nominal, centralized, and distributed MPCs subject to 1200 randomly generated external forces and payloads, as shown in Fig. 22(c). The external force is taken as sinusoidal with a maximum amplitude of 80 (N) (65% of one robot's weight) and a maximum period of 4 (s) along the x-, y-, and z-directions. The maximum mass of the payload is assumed to be 5 (kg). We count a trial as a success if the agents maintain stability for 60 (s), and as a failure if at least one agent's body touches the ground before 60 (s). The success rate versus time is depicted in Fig. 22(d). The overall success rates of the nominal, centralized, and distributed MPCs are 0%, 55.6%, and 53.9%, respectively.
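The randomized sinusoidal disturbances used in this comparison can be sketched as follows. The lower bound on the sampled period is an assumption, since only the maximum amplitude and period are reported:

```python
import numpy as np

def random_disturbance(rng, t, f_max=80.0, T_max=4.0, T_min=0.1):
    """Sample one sinusoidal disturbance force profile for the comparison.

    Amplitudes up to f_max = 80 (N) and periods up to T_max = 4 (s) are
    drawn independently for the x-, y-, and z-directions; the lower period
    bound T_min is an assumption (only maxima are reported).
    Returns an array of shape (len(t), 3) with the force at each time.
    """
    amp = rng.uniform(0.0, f_max, size=3)
    period = rng.uniform(T_min, T_max, size=3)
    return amp * np.sin(2.0 * np.pi * t[:, None] / period)
```

Drawing amplitude and period independently per axis gives a disturbance that is bounded yet uncorrelated across directions, matching the randomized setup described above.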
Our experimental studies in Figs. 13-19 and Fig. 21 suggest that the proposed centralized and distributed trajectory planners show similar robustness in indoor and outdoor experiments. Slightly better robustness is observed in the numerical simulations of Fig. 22 when employing the centralized MPC at the high level. Still, the success rates of the centralized and distributed MPCs do not differ significantly. These comparisons suggest that both proposed MPCs can robustly stabilize dynamic cooperative locomotion, while the distributed MPC requires substantially less computation time.

B. Evolution of the Lagrange Multiplier in Distributed MPC
Section VI-A demonstrated a similar success rate for the centralized and distributed MPC algorithms on the randomly generated terrains and disturbances. To further study this similar robust stability behavior, Fig. 23 illustrates the evolution of the estimated Lagrange multiplier λ for each agent when the agents walk cooperatively with the distributed MPC. In formulating the distributed MPC, each agent locally estimates the Lagrange multiplier according to the one-step communication delay and the agreement protocol. Therefore, λ on each distributed MPC evolves differently. We introduced the consensus protocol in the cost function of (22) to mitigate the divergence of the local estimates and to impose agreement. The magnified portion of the plot in Fig. 23 shows that the initial λ values of the agents are different while converging after a short amount of time according to the consensus protocol. The plot also shows that the agents' λ values are not precisely the same during cooperative locomotion; however, both values stay close to each other.
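A minimal sketch of the agreement (averaging) target used by each local MPC, with the factors a_ii = a_ij = 0.5 from Section V; this toy update ignores the rest of the QP and the one-step delay, but it illustrates why two differing estimates coincide quickly, mirroring the fast convergence seen in Fig. 23:

```python
def agreement_target(lam_own_prev, lam_neighbor_prev, a_ii=0.5, a_ij=0.5):
    """Averaging target lam_bar for the agreement penalty in (22).

    Each local MPC penalizes w * ||lam - lam_bar||^2, where lam_bar mixes
    the agent's own previous estimate with the neighbor's delayed estimate.
    """
    return a_ii * lam_own_prev + a_ij * lam_neighbor_prev

# With equal averaging factors, two differing estimates meet at their mean
# after a single simultaneous update:
lam_1, lam_2 = 10.0, 2.0
lam_1, lam_2 = agreement_target(lam_1, lam_2), agreement_target(lam_2, lam_1)
# both are now 6.0
```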

C. Synchronization and Asynchronization
We aim to study the robustness of the layered control approach against possible phase differences between the agents, which can easily occur on rough terrain, where the discrete-time transitions (i.e., impacts) happen earlier or later than the anticipated times of nominal gaits. To investigate this point, we study the estimated height of the agents' front right legs over rough terrain in Fig. 24. Both agents are synchronized at the beginning of the locomotion. After encountering the rough terrain, asynchrony is observed in Fig. 24. However, the proposed centralized and distributed MPCs maintain robust cooperative gaits over unknown rough terrains, as shown in Figs. 12(a), 17, 20, and 21. Moreover, the robustness subject to more than 1000 randomly generated rough terrains is also validated in Fig. 22.

D. Robustness Against Unknown Holonomic Constraints
The holonomic constraint of Section II assumes a distance constraint between the interaction points of the agents. In particular, we impose no additional rotational constraints at the interaction points. This assumption simplifies the interconnected SRB model and, thereby, the centralized and distributed MPC algorithms. However, more sophisticated connections could restrict some of the rotational DOFs at the interaction points (see Fig. 5). We observe that cooperative locomotion over rough terrain with different and unknown holonomic constraints has robust stability similar to that illustrated in Fig. 17. However, the phase portraits in Fig. 25 show that the unknown additional interactions from the limited DOFs on both ends of the holonomic constraint can induce aggressive changes in the angular positions and velocities.

E. Limitations and Future Study
1) Optimal control with switching: The proposed MPC approaches for cooperative locomotion were shown to be very robust to various unknown terrains and unknown disturbances. However, the gaits presented here do not exhibit extremely dynamic or highly agile maneuvers. One reason for this is the relatively short planning horizon (25 (ms)). While the distributed approach provides an interesting avenue to explore longer horizons in future work, thanks to the considerable decrease in computation time, long horizons suffer when only the current domain is considered. For this reason, future work should not only explore increased planning lengths but should also consider a piecewise-affine (PWA) optimal control formulation [84, Chap. 16] such that the change in stance leg configurations can be considered directly by the planner.
2) Sophisticated constraints between agents: We assumed the holonomic constraint (2) with ball joints on the agents to simplify the development of the interconnected reduced-order model and the synthesis of the centralized and distributed MPCs. We further studied the robustness of the proposed layered control approach subject to unknown restrictions on the ball joints in Section VI-D. However, more sophisticated cooperative tasks may require dexterous manipulation during cooperative locomotion. For instance, quadrupedal robots can be equipped with robotic arms for loco-manipulation. Our future work will investigate the development of robust control algorithms that systematically address the gap between simplified reduced-order models and complex dynamical models of cooperative loco-manipulation.
3) Extension to multi-agents: Our previous work [1] presented quasi-statically stable cooperative gaits for M ≥ 2 agents. In particular, a closed-form expression for the interconnected LIP models was developed to address real-time trajectory planning based on a centralized MPC algorithm. The interconnected LIP model cannot address interaction torques between the agents, and the resulting gaits are not dynamic. The current paper presents an interconnected reduced-order model, based on the SRB dynamics, that addresses interaction torques between the agents while allowing dynamic cooperative gaits. In addition, centralized and distributed MPC algorithms are developed for the cooperative locomotion of two agents. However, a closed-form expression for the Jacobian matrices in (13) and (15) may not be easily computed for M ≥ 3 interconnected SRB dynamics with sophisticated holonomic constraints. Our future work will investigate the extension of the approach to the dynamic cooperative locomotion of M ≥ 3 agents with complex holonomic constraints. One possible way is to develop robust distributed MPC algorithms integrated with reinforcement learning and data-driven techniques [55], [82] to bridge the gap between interconnected reduced-order models and full-order models.
4) Coordination between agents: In numerical simulations, each agent's global coordinates can be used directly without sensor limitations or unexpected noise. However, the experimental evaluations estimate the agents' global coordinates via kinematic estimators. The estimation errors may result in unexpected coordination changes. This paper addresses this issue via the human operator, who coordinates the agents with the corresponding speed commands from the joystick. For example, the user commands a higher or lower desired speed to the lagging or leading agent, respectively. Our future work will investigate the design of algorithms that robustly estimate the global coordinates of the agents in the presence of noisy measurements.

VII. CONCLUSION
This paper presented a layered control algorithm for real-time trajectory planning and robust control of cooperative locomotion by two holonomically constrained quadrupedal robots. An innovative reduced-order model of cooperative locomotion was developed and studied based on interconnected SRB dynamics. At the high level of the layered control algorithm, the real-time trajectory planning problem is formulated as an optimal control problem of the interconnected reduced-order model with two different schemes: centralized and distributed MPCs. The centralized MPC plans for the global reduced-order states, global GRFs, and the interaction wrenches between the agents. The distributed MPC is developed based on a one-step communication delay and an agreement protocol to solve for the local reduced-order states, local GRFs, and the local estimated wrenches. At the low level of the control scheme, distributed nonlinear controllers, based on QP and virtual constraints, are developed to impose the full-order model of each agent to track the optimal reduced-order trajectories and GRFs prescribed by the high-level MPCs.
The effectiveness of the proposed layered control approach was verified with extensive numerical simulations and experiments for the blind and robust cooperative locomotion of two holonomically constrained A1 robots with different payloads on different terrains and subject to external disturbances. A detailed study was presented to compare the performance of the proposed centralized and distributed MPCs over more than 1000 randomly generated landscapes and external pushes. It was shown that the distributed MPC has a robust stability performance similar to that of the centralized MPC, while the computation time is reduced significantly. The results also show that both the centralized and distributed MPCs integrated with the interconnected SRB dynamics can drastically improve the robust stability of cooperative locomotion compared to the individual nominal MPCs. The experimental results suggest that the proposed control algorithm can result in robustly stable cooperative locomotion on different terrains (e.g., wooden blocks, slippery surfaces, grass, mulch, and concrete) subject to unknown payloads and external disturbances at different speeds. The robustness of the control approach was also studied against uncertainties in the holonomic constraints and assumptions.
For future work, we will investigate the extension of the approach for more sophisticated constraints between agents.We will also study the extension to multi-agents while systematically developing robust optimal control algorithms to address switching in hybrid models.

Fig. 4. Illustration of the interconnected SRB models for the cooperative locomotion of two quadrupedal robots.

Fig. 5. Illustration of the mechanisms designed to be mounted on the torso of each robot to holonomically constrain the motion of the agents. The mechanism in (a) implements the holonomic constraint in (2) with free ball joints. The mechanisms in (b), (c), and (d) implement the constraint (2) while also restricting the roll, yaw, and pitch motions, respectively. The mechanism mounted on top of the robots is illustrated in (e).

Fig. 6. Numerical and experimental validation setup. The joystick commands the desired velocity trajectories to the trajectory planner of each agent; both agents are controlled by one joystick, which sends the desired trajectories in both the numerical simulations and the experimental validations. A network switch connects the computer and the two agents without IP address collisions. The UDP communication protocol over Ethernet cables is used in the experimental validations.

Fig. 8. Plots of the desired and actual velocities of the closed-loop interconnected reduced-order model for the two agents in the numerical simulation. Here, two nominal MPCs are applied to the reduced-order models of the agents to generate optimal GRFs without considering the holonomic constraints between them. The joystick provides the desired trajectories. The instability of the cooperative gait is evident.

Fig. 9. Plots of the desired and actual velocities of the closed-loop interconnected reduced-order model for the two agents in the numerical simulation. Here, the optimal GRFs are generated by the centralized MPC (16) and are applied to the reduced-order models subject to an unknown payload of 5 (kg) between the agents. The joystick provides the desired trajectories. The robust tracking is evident.

Fig. 10. Plots of the desired and actual velocities of the closed-loop interconnected reduced-order model for the two agents in the numerical simulation. Here, the optimal GRFs are generated by the distributed MPCs (22) and are applied to the reduced-order models subject to an unknown payload of 5 (kg) between the agents. The joystick provides the desired trajectories. The robust tracking is evident.

Fig. 11. Comparison between the desired velocities and the optimal velocities, generated by the high-level centralized and distributed MPCs, for the closed-loop interconnected full-order model in RaiSim. The joystick provides time-varying reference trajectories. The figure depicts the optimal trajectories generated by (a) the centralized MPC (16) and (b) the distributed MPC (22) for agent 1. Here, we consider a trot gait over rough terrain with an unknown payload of 5 (kg) between the agents and subject to unknown, time-varying, external disturbance forces applied to the robots.

Fig. 12. Snapshots demonstrating the performance of the proposed layered control algorithm for a series of cooperative locomotion experiments. Indoor experiments: (a) rough terrain with the agents traversing arbitrarily displaced wooden blocks, (b) asymmetrical terrain with one agent on a compliant surface elevated by 10 (cm), (c) a slippery surface covered with cooking spray, and (d) tethered pulling. The robots are loaded with a payload of 4.53 (kg) (36% uncertainty in one robot's mass) in (a), (c), and (d). The friction coefficient is µ = 0.3 in (c) and µ = 0.6 in (a), (b), and (d). Here, (a) and (b) show snapshots where the centralized MPC is applied, while (c) and (d) show snapshots where the distributed MPC is employed. Videos of all experiments are available online [90].

Fig. 13. Comparison between the desired and optimal velocities of the robots for the nominal trot experiment on flat ground. The optimal velocities are provided by the high-level centralized MPC. Time-varying desired trajectories are provided by the joystick to coordinate the robots' motions. It is observed that the centralized MPC's outputs successfully track the desired trajectories.

Fig. 15. Plots of the desired and optimal velocities for cooperative locomotion experiments on (a) rough terrain, (b) a slippery surface, and (c) subject to external disturbances with the centralized MPC. Time-varying desired trajectories are provided by the joystick. It is clear that the centralized MPC's outputs robustly track the desired trajectories in the presence of uncertainties and disturbances.

Fig. 17. Phase portraits for (a) the body roll and (b) the body pitch of agent 1 with the centralized MPC, and (c) the body roll and (d) the body pitch of agent 1 with the distributed MPCs, during different experiments. The plots show the robustness of the cooperative locomotion over rough terrain covered with randomly dispersed wooden blocks, on the slippery surface, and subject to unknown external disturbances.

Fig. 18. Plots of the desired and optimal velocities for cooperative locomotion experiments on (a) rough terrain, (b) a slippery surface, and (c) subject to external disturbances with the distributed MPC. Time-varying desired trajectories are provided by the joystick. It is clear that the distributed MPC's outputs robustly track the desired trajectories in the presence of uncertainties and disturbances.

Fig. 20. Snapshots demonstrating the proposed layered controller's performance for a series of cooperative locomotion experiments. Outdoor experiments: (a) cooperative locomotion on gravel, (b) transitioning from a concrete surface to grass, (c) cooperative locomotion on mulch, and (d) cooperative locomotion on grass. The robots cooperatively transport a payload of 4.53 (kg) (36% uncertainty) in (b) and (c) and 6.80 (kg) (55% uncertainty) in (a) and (d). Here, (a) and (c) show snapshots where the distributed MPC is adopted, while (b) and (d) show snapshots where the centralized MPC is employed.

Fig. 21. Plots of the virtual constraints of agent 1 during cooperative locomotion with unknown payloads on various outdoor terrains, including (a) locomotion on gravel, (b) transitioning from concrete to grass, (c) locomotion on mulch, and (d) locomotion on grass. The payload is 4.53 (kg) in (b) and (c) and 6.80 (kg) in (a) and (d). Here, (a) and (c) depict the evolution of the virtual constraints with the distributed MPC at the high level, while (b) and (d) illustrate their evolution with the centralized MPC at the high level. We plot the components of the virtual constraints in (24) that correspond to the COM position along the x and y axes (m) (i.e., COM position tracking) and the body's roll and pitch angles (rad) (i.e., angle tracking). It is clear that the full-order system tracks the prescribed optimal reduced-order trajectories generated by the MPCs.

Fig. 22. Illustration of the comparison results between the nominal, centralized, and distributed MPCs. (a) A snapshot of the RaiSim simulation environment with one of the randomly generated rough terrains. The maximum height of the generated terrains is 12 (cm) (46% uncertainty in the robots' height). (b) The success rate of the proposed trajectory planners over 1500 randomly generated rough terrains in numerical simulations. The overall success rates of the nominal, centralized, and distributed MPCs over randomly generated rough terrain are 0%, 54.2%, and 53.8%, respectively. (c) A snapshot of the RaiSim simulation environment with one of the randomly generated external forces and a randomly generated payload. The arrows illustrate the external forces applied to each agent. The maximum external force is 80 (N) (65% of one robot's weight) in the x-, y-, and z-directions. The evolution of the forces in each direction is sinusoidal with a maximum random period of 4 (s). External forces are applied from 1 (s) to 60 (s). The maximum payload mass is 5 (kg). (d) The success rate of the proposed trajectory planners with 1200 randomly applied external forces and payloads in numerical simulations. The overall success rates of the nominal, centralized, and distributed MPCs subject to unknown external forces and payloads are 0%, 55.6%, and 53.9%, respectively.

Fig. 23. Plots of the Lagrange multiplier λ locally estimated by the two agents using the distributed MPC algorithm. It is clear that the local values reach an agreement and stay close to each other.
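The agreement behavior in Fig. 23 can be illustrated with a minimal numerical sketch: each agent updates its local multiplier estimate as a convex combination of its own value and the neighbor's value received with a one-step communication delay. The specific update rule, weight, and initial values below are illustrative assumptions and not the paper's exact protocol.

```python
def agreement_update(lam_local, lam_neighbor_prev, weight=0.5):
    # Convex combination of the agent's own estimate and the neighbor's
    # estimate received with a one-step delay (hypothetical update rule;
    # the paper's exact agreement protocol may differ).
    return weight * lam_local + (1.0 - weight) * lam_neighbor_prev

# Two agents start from different local estimates of the multiplier.
lam = [1.0, 0.0]   # current local estimates
prev = lam[:]      # estimates received at the previous step (delayed)

for _ in range(60):
    new = [agreement_update(lam[0], prev[1]),
           agreement_update(lam[1], prev[0])]
    prev, lam = lam, new

# The local estimates converge to a common value (here, the average of
# the initial estimates), mirroring the agreement observed in Fig. 23.
```

Despite the one-step delay, the disagreement between the two estimates decays geometrically, which is consistent with the local λ values in Fig. 23 staying close to each other.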

Fig. 24. Plots of the estimated height of the agents' front right legs. The initial locomotion is fully synchronous before the agents encounter the rough terrain, indicated by the gray area in the plot. After engaging the rough terrain, asynchrony between the agents is evident. However, the layered control approach robustly stabilizes the cooperative gait.

Fig. 25. Phase portraits for (a) the body roll and (b) the body pitch of agent 1 with the centralized MPC, and (c) the body roll and (d) the body pitch of agent 1 with the distributed MPC, during different experiments. The plots show the robustness of the cooperative locomotion over rough terrain with fixed DOFs in the holonomic constraints in the roll, pitch, and yaw directions.