[Thesis Defense] Pouya Hamadanian: Causal Simulation and Decision Making in Engineered Networks

Data-driven decision-making has become increasingly prevalent in networks and systems, offering the promise of adaptive, high-performance control in complex environments. In practice, however, such methods often fail to deliver in production deployments. This thesis identifies and addresses two fundamental challenges that limit their practical success. First, data-driven techniques are typically developed in simulation environments that make unrealistic assumptions about real-world dynamics; in particular, they assume that replaying recorded system traces under alternate trajectories faithfully reproduces how the real system would have behaved. We find that this assumption often yields learned policies that are ineffective, or even counterproductive, in real deployments. Second, standard decision-making algorithms, especially reinforcement learning methods, assume a stationary environment. This premise rarely holds in real-world systems, where user workloads, system dynamics, and operational regimes evolve over time. Non-stationarity undermines credit assignment and leads to rapid policy deterioration.
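To make the first challenge concrete, consider a toy queueing example. The sketch below is a hypothetical illustration, not code from the thesis: the queue model and the functions `replay_simulator` and `causal_simulator` are invented for exposition. Replaying a queue trace recorded under one policy while evaluating another ignores how the new policy's actions would have reshaped the queue; re-deriving the queue from the exogenous workload captures that feedback.

```python
"""Toy illustration (hypothetical, not the thesis's code): why naive trace
replay misestimates a new policy's outcomes when observations depend on
past actions."""

def service_rate(action: int) -> int:
    # Toy model: the action selects how much capacity to provision this step.
    return 2 if action == 1 else 1

def policy_greedy(queue: int) -> int:
    # Provision extra capacity whenever the queue backs up.
    return 1 if queue > 3 else 0

def replay_simulator(recorded_queues, policy):
    """Naive replay: reuse queue lengths recorded under the *old* policy,
    ignoring that the new policy's actions would have changed the queue."""
    return [q + 1.0 / service_rate(policy(q)) for q in recorded_queues]

def causal_simulator(arrivals, policy):
    """Counterfactual rollout: regenerate the queue from exogenous arrivals
    and the new policy's own actions, so feedback effects are captured."""
    queue, delays = 0, []
    for a in arrivals:
        action = policy(queue)
        queue = max(0, queue + a - service_rate(action))
        delays.append(queue + 1.0 / service_rate(action))
    return delays

if __name__ == "__main__":
    arrivals = [2, 2, 0, 3, 1, 0, 2, 2]  # exogenous workload
    # Trace recorded under a policy that never provisions extra capacity.
    recorded_queues, q = [], 0
    for a in arrivals:
        q = max(0, q + a - service_rate(0))
        recorded_queues.append(q)
    print("replay :", replay_simulator(recorded_queues, policy_greedy))
    print("causal :", causal_simulator(arrivals, policy_greedy))
```

In this toy, the replayed trace overstates the delays the greedy policy would actually incur, because it never lets the extra capacity drain the queue; the causal rollout does.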

To address these challenges, this thesis develops techniques along two key axes. First, it introduces causal simulation frameworks that treat system modeling as a counterfactual inference problem, enabling accurate simulation of decision outcomes by learning causal dynamics from observational traces. Second, it proposes new in-situ decision-making algorithms that explicitly accommodate non-stationarity. These include a reinforcement learning algorithm that preserves past knowledge by constraining policy updates with a similarity metric, and an exploration/exploitation architecture that leverages shared structure across heterogeneous units while statistically switching among top-performing configurations to adapt to time-varying objectives.
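As one concrete, hypothetical reading of the similarity-constrained update: the sketch below augments a standard policy-gradient loss with a penalty that keeps the policy's behavior on previously seen states close to a stored snapshot. The network shape, the `update` function, and the use of a KL divergence as the similarity metric are assumptions made for illustration, not the thesis's actual algorithm.

```python
"""Hedged sketch: policy-gradient step with a similarity penalty that anchors
the policy to its past behavior on replayed old states, so learning under new
conditions does not erase knowledge of earlier regimes. The KL-based metric
and hyperparameters are assumptions for illustration."""

import torch
import torch.nn as nn
import torch.nn.functional as F

policy = nn.Sequential(nn.Linear(8, 64), nn.Tanh(), nn.Linear(64, 4))
optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)

def update(states, actions, advantages, old_states, old_action_probs, beta=1.0):
    # Standard policy-gradient term on freshly collected data.
    logits = policy(states)
    logp = torch.distributions.Categorical(logits=logits).log_prob(actions)
    pg_loss = -(logp * advantages).mean()

    # Similarity penalty: keep the action distribution on *old* states close
    # to the snapshot of what the policy used to output there.
    new_logp_old = F.log_softmax(policy(old_states), dim=-1)
    similarity_penalty = F.kl_div(new_logp_old, old_action_probs,
                                  reduction="batchmean")

    loss = pg_loss + beta * similarity_penalty
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The penalty weight (here `beta`) trades off plasticity on the current regime against retention of behavior learned in earlier ones.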

We validate our techniques across diverse domains, including video streaming optimization, load balancing with production workloads from Microsoft, urban road safety interventions, and classical reinforcement learning benchmarks. Our contributions not only improve performance in these settings but also offer a principled foundation for deploying data-driven methods in dynamic, partially observed real-world systems. Together, they provide a blueprint for robust and adaptable decision-making in modern networks and systems.