Abstract

Safety is a critical consideration in control and reinforcement learning systems, and an increasingly important topic in modern Artificial Intelligence (AI). Safety in these systems can be defined in several ways, including Value at Risk (VaR), Conditional Value at Risk (CVaR), and Positive Invariance. This thesis focuses on Positive Invariance, a widely used notion of safety that concerns the ability of a system to remain within a safe set under given dynamics. Traditional methods provide deterministic guarantees but often fail under stochastic disturbances or modeling errors. Probabilistic Positive Invariance (PPI) is designed to handle such situations.

Many methods in both Control and Reinforcement Learning treat safety as the property of a system remaining within a safe set of states, and many of them try to achieve PPI in different ways. In this thesis, we study and analyze some of the main methods that pursue this goal. We examine their relationships, differences, and properties in deterministic and stochastic settings, with and without goal-based systems, across different horizons and other parameters. We study five main baselines: control-based methods, namely the Probabilistic Control Barrier Function (PCBF) and the Lyapunov-based Region of Attraction (ROA), and planning-based methods, namely the Primal-Dual Safety-Only formulation, Avoid-Only dynamic programming, and Minimax robust control. Each method is reformulated as a planning-based Markov Decision Process (MDP). The methods are evaluated under similar experimental conditions and then analyzed using the PPI Alignment Score and the Safe-Set Jaccard Similarity Score to observe their behavior and the relationships between them.
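For reference, one common way to formalize the two notions of invariance is sketched below. The symbols (state x_t, dynamics f, safe set S, policy π, horizon T, risk tolerance ε) are assumptions introduced here for illustration, and may differ from the notation and the exact definition used in the thesis.

```latex
% Deterministic positive invariance of a set S under dynamics x_{t+1} = f(x_t):
x_0 \in \mathcal{S} \;\Longrightarrow\; x_t \in \mathcal{S} \quad \forall\, t \ge 0 .

% A probabilistic relaxation over a horizon T with risk tolerance \varepsilon \in [0, 1):
\Pr\!\left( x_t \in \mathcal{S} \;\; \forall\, t \in \{0, \dots, T\} \;\middle|\; x_0 \in \mathcal{S},\ \pi \right) \;\ge\; 1 - \varepsilon .
```

Under this reading, setting ε = 0 and letting T grow without bound recovers the deterministic condition, up to events of probability zero.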

Experimental results show that deterministic control formulations (CBF, Lyapunov) provide strict invariance guarantees but shrink feasible regions, while probabilistic relaxations of control-based and planning methods (PCBF, Probabilistic Lyapunov, Primal-Dual, Avoid-Only) offer broader applicability with controlled risk. Minimax acts as an upper-limit reference for worst-case safety. We found several relations between these methods, all of which inherently attempt to achieve PPI. The study gives insight into the relations and trade-offs between methods that pursue a very similar goal, in terms of determinism, conservativeness, and probabilistic assurance. It serves as a guide and provides a comprehensive framework for selecting suitable safety methods based on system and computational requirements.
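As an illustration of how two methods' safe sets might be compared, the sketch below computes the standard Jaccard similarity (size of the intersection divided by size of the union) between two sets of safe states. It assumes discrete, hashable states; the function name `safe_set_jaccard`, the grid-cell states, and the convention for two empty sets are hypothetical choices made here, not details taken from the thesis.

```python
def safe_set_jaccard(safe_a, safe_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two safe sets of states."""
    a, b = set(safe_a), set(safe_b)
    union = a | b
    if not union:
        # Assumption made for this sketch: two empty safe sets count as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical safe sets on a small grid world, with states as (row, col) cells.
# The first set stands in for a conservative method, the second for a relaxed one.
conservative = {(0, 0), (0, 1), (1, 0)}
relaxed = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 1)}

score = safe_set_jaccard(conservative, relaxed)
print(f"Jaccard similarity: {score:.2f}")
```

A score of 1 means the two methods certify exactly the same states as safe, while a score near 0 means their safe sets barely overlap. In this toy example the conservative set is a strict subset of the relaxed one, which matches the qualitative pattern described above where stricter guarantees shrink the feasible region.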

Download / Access

You can read the full master’s thesis on ProQuest.

Award

Received the Outstanding Master’s Thesis Award (2025-2026).