Improving Robustness of Autonomous Earth-Observing Spacecraft Using Training Environment Enhancements (Journal Article)

Overview

abstract

  • Recent research is developing autonomous policies for agile Earth-observing satellite (AEOS) tasking using deep reinforcement learning (DRL). Prior work uses a fixed training environment with enhanced safety challenges so that DRL can train a policy over a small number of orbits. This paper investigates how training environment enhancements can improve these policies’ robustness by varying the training environment and extending simulated mission times. DRL enables real-time, onboard decision-making while accounting for complex dynamics and constraints. However, policies often struggle to generalize to longer missions or to conditions different from training, partly due to the short simulated mission times. Multi-spacecraft systems amplify these challenges, as battery capacity and solar panel efficiency diverge across agents over time due to external factors. Although the satellites may all run copies of a single policy, that policy’s robustness is essential to accommodate agent variability and ensure mission success. This paper investigates training with two curricula, with constantly degraded environments, and with a domain randomization approach to improve AEOS policy robustness and performance. Policies are tested on episodes 125 times longer than the training episodes, under both nominal and degraded conditions. The applicability of the proposed training environment enhancements is demonstrated on a heterogeneous spacecraft system, achieving a 63% reduction in failures and a 2.8% increase in median cumulative reward.

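This record contains no source code. Purely as a self-contained illustration of the domain randomization approach named in the abstract, the Python sketch below resamples battery capacity and solar panel efficiency at the start of every training episode, so a policy never trains against one fixed spacecraft. All names and numeric ranges here (ToySatelliteEnv, DomainRandomizationWrapper, the 40–80 Wh capacity and 0.15–0.29 efficiency bounds) are hypothetical assumptions, not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class SatelliteParams:
    """Per-episode physical parameters that diverge across agents over time."""
    battery_capacity_wh: float = 80.0  # hypothetical nominal capacity
    panel_efficiency: float = 0.29     # hypothetical nominal efficiency


class ToySatelliteEnv:
    """Minimal stand-in for an AEOS tasking environment (illustration only)."""

    def __init__(self, params: SatelliteParams) -> None:
        self.params = params
        self.charge_wh = params.battery_capacity_wh

    def reset(self) -> dict:
        self.charge_wh = self.params.battery_capacity_wh
        return {"charge_frac": 1.0}

    def step(self, action: str):
        # Imaging drains the battery; charging recovers it through the panels.
        if action == "image":
            self.charge_wh -= 5.0
        else:  # "charge"
            self.charge_wh = min(
                self.charge_wh + 10.0 * self.params.panel_efficiency,
                self.params.battery_capacity_wh,
            )
        failed = self.charge_wh <= 0.0  # battery depletion is a mission failure
        reward = 1.0 if action == "image" and not failed else 0.0
        obs = {"charge_frac": max(self.charge_wh, 0.0) / self.params.battery_capacity_wh}
        return obs, reward, failed


class DomainRandomizationWrapper:
    """Builds a freshly randomized environment for each training episode."""

    def __init__(self, rng: random.Random) -> None:
        self.rng = rng

    def make_env(self) -> ToySatelliteEnv:
        # Sample degraded parameters: capacity down to 50% of nominal,
        # panel efficiency down to roughly half of nominal (assumed ranges).
        params = SatelliteParams(
            battery_capacity_wh=self.rng.uniform(40.0, 80.0),
            panel_efficiency=self.rng.uniform(0.15, 0.29),
        )
        return ToySatelliteEnv(params)


if __name__ == "__main__":
    wrapper = DomainRandomizationWrapper(random.Random(0))
    for episode in range(3):
        env = wrapper.make_env()
        obs = env.reset()
        print(f"episode {episode}: params={env.params}")
```

Because the parameters are redrawn at every reset, the training distribution spans both nominal and degraded agents, which is the mechanism by which domain randomization is expected to harden a single shared policy against the agent variability the abstract describes.
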
publication date

  • January 4, 2026

Full Author List

  • Mantovani LQ; Schaub H

author count

  • 2

International Standard Serial Number (ISSN)

  • 1940-3151

Electronic International Standard Serial Number (EISSN)

  • 2327-3097

Additional Document Info

start page

  • 1

end page

  • 12