Solving the Lunar Lander Problem using Reinforcement Learning

Keywords: Reinforcement Learning, Lunar Lander Problem, Deep Learning

In this lab we will solve a classical problem in optimal control theory: the lunar lander. Reinforcement Learning (RL) is an area of machine learning concerned with enabling an agent to navigate an environment with uncertainty in order to maximize some notion of cumulative long-term reward. An early lunar landing design strategy was provided by Cheatham and Bennett (Cheatham et al., 1966). More recently, Lu et al. revisited the problem with multiple uncertainties (Solving Lunar Lander Problem with Multiple Uncertainties, ICCPR 2023, October 27-29, 2023, Qingdao, China), and Gou and Liu [3] proposed a novel approach. A related benchmark problem of lunar lander landing site selection was presented at the Evolutionary Computing Symposium held in 2018 in Fukuoka, Japan.

In this paper, we implement and analyze two different RL techniques, Sarsa and Deep Q-Learning, on OpenAI Gym's LunarLander-v2 environment. We also implemented solutions using the Advantage Actor-Critic (A2C) and Proximal Policy Optimization (PPO) algorithms within the framework of Reinforcement Learning, focusing on optimizing the policy through policy iteration approaches and value functions for efficient and robust lander control.
The Lunar Lander problem aims to successfully land a rocket-propelled spacecraft in moon-like conditions as quickly and safely as possible. Reinforcement Learning is concerned with enabling an agent to solve a problem from feedback, with the end goal of maximizing some form of cumulative long-term reward. First, we need to understand the basic principles of the environment: when choosing the algorithm we want to use, or when creating an environment of our own, we need to start thinking about the observation and the action at each step. The state components in the 2-D Lunar Lander problem are presented below. Here, the agent can take 4 different actions. If the lander moves away from the landing pad, it loses reward; firing the main engine costs 0.3 points each frame.

Here, the path that the lander follows to land safely can be arbitrary. In this project, using the techniques of PBRL, you will solve the lunar lander problem with the additional requirement that the lander follow a specially curated path (for example, a straight-line path). We show that a detailed analysis in the related 3D phase space uncovers the existence of infinitely many safe landing curves, contrary to several former 2D descriptions that implicitly claim the existence of just one such curve.

With our best models, we are able to achieve average rewards of 170+ with the Sarsa agent and 200+ with the Deep Q-Learning agent on the original problem. This project was completed in the KTH EL2805 course (Reinforcement Learning).

Reference: Rohit Sachin Sadavarte, Rishab Raj, and B Sathish Babu. 2021. Solving the Lunar Lander Problem using Reinforcement Learning. In 2021 IEEE International Conference on Computation System and Information Technology for Sustainable Solutions (CSITSS). IEEE, 1-6.
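The interface just described, a fixed set of four actions and a per-step observation, can be made concrete. The action meanings and the 8-component state layout below follow the Gym documentation for LunarLander-v2; the helper function and its names are ours:

```python
# The four discrete actions and the 8-component observation of
# LunarLander-v2, as documented by OpenAI Gym.
ACTIONS = {
    0: "do nothing",
    1: "fire left orientation engine",
    2: "fire main engine",
    3: "fire right orientation engine",
}

def describe_observation(obs):
    """Label the 8 state components: position, velocity, attitude, leg contacts."""
    names = ["x", "y", "vx", "vy", "angle", "angular_velocity",
             "left_leg_contact", "right_leg_contact"]
    return dict(zip(names, obs))

# An illustrative observation: lander above the pad, descending slowly.
state = describe_observation([0.0, 1.4, 0.0, -0.6, 0.02, 0.0, 0.0, 0.0])
```

Labelling the raw observation this way is purely for readability when debugging an agent; the environment itself returns the plain 8-float vector.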
[5] solved the Lunar Lander problem using a model-based approach where, instead of learning the system's dynamics, a model directly learns the optimal parameters for controlling the spacecraft.

The goal is to develop an intelligent agent capable of landing a lunar module safely on the Moon. The landing area is static and is always located at the (0, 0) coordinates. In the environment's source code, the lander's position and velocity are read directly from the underlying Box2D body:

    pos = self.lander.position
    vel = self.lander.linearVelocity

While this was beginning to work, it seemed like maybe even more training would help. Revisiting the lunar lander problem from scratch has been a rewarding and enlightening experience. The purpose of the following reinforcement learning experiment is to investigate optimal parameter values for deep Q-learning (DQN) on the Lunar Lander problem provided by OpenAI Gym.
1 Introduction

The Lunar Lander problem is a simulated task that involves training an intelligent system to control a lander and achieve a safe landing on the Moon [1,2]. It presents a formidable challenge in the realm of reinforcement learning, necessitating the creation of autonomous spacecraft capable of safe landings on the lunar surface. The environment is provided by OpenAI Gym. In this paper, two different Reinforcement Learning techniques, one value-based and one policy-gradient-based, are implemented and analyzed. We then introduce additional uncertainty to the original problem (see Solving The Lunar Lander Problem under Uncertainty using Reinforcement Learning, an implementation of reinforcement learning algorithms for the OpenAI Gym environment LunarLander-v2).

An episode finishes if the lander crashes or comes to rest, receiving an additional -100 or +100 points respectively. Each leg in ground contact is worth +10 points.
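The reward terms listed above can be collected into one bookkeeping function. This is a simplified sketch: the real environment also adds a continuous shaping term for distance, speed, and tilt, and the -0.03 side-engine cost comes from the environment's documentation rather than the text above; the function and argument names are ours.

```python
def step_reward(new_leg_contacts=0, fired_main=False, fired_side=False, outcome=None):
    """Tally the per-step reward terms described above (simplified sketch)."""
    reward = 10.0 * new_leg_contacts          # each leg touching ground: +10
    if fired_main:
        reward -= 0.3                         # main engine burn: -0.3 per frame
    if fired_side:
        reward -= 0.03                        # side engine burn: -0.03 per frame
    if outcome == "crash":
        reward -= 100.0                       # crashing ends the episode: -100
    elif outcome == "rest":
        reward += 100.0                       # coming to rest ends it: +100
    return reward
```

For example, a final step where both legs touch down and the lander comes to rest scores 10 + 10 + 100 = 120 under this simplified accounting.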
We first solve the lunar lander problem using traditional Q-learning techniques, then analyze different techniques for solving the problem, and verify the robustness of these techniques as additional uncertainty is added. Q-learning can be used to solve a wide range of tasks such as playing video games or stock-trading. See the file instructions.pdf for a full introduction to the problem and the details of the implemented algorithms.

Table 3-1: Heuristic State Components (output state components).

Reference: Soham Gadgil et al. 2020. Solving The Lunar Lander Problem under Uncertainty using Reinforcement Learning.
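Traditional Q-learning on this problem means discretizing the continuous state and maintaining a table of Q-values; the update rule itself is the standard one-step bootstrap. A minimal sketch (function and variable names are ours):

```python
from collections import defaultdict

def q_learning_update(Q, s, a, r, s_next, n_actions=4, alpha=0.1, gamma=0.99):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(Q[(s_next, a2)] for a2 in range(n_actions))
    td_target = r + gamma * best_next
    Q[(s, a)] += alpha * (td_target - Q[(s, a)])
    return Q[(s, a)]

Q = defaultdict(float)            # tabular Q-values over discretized states
# A single update after taking action 2 in discretized state (0, 0),
# receiving reward 1.0, and landing in discretized state (0, 1).
q_learning_update(Q, s=(0, 0), a=2, r=1.0, s_next=(0, 1))
```

The `defaultdict` stands in for the Q-table: unseen state-action pairs start at zero, which is why the single update above moves Q((0,0), 2) from 0 to alpha * r = 0.1.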
In this study, three prominent reinforcement learning algorithms, namely Deep Q-Network (DQN), Double Deep Q-Network (DDQN), and Policy Gradient, are implemented and analyzed. To solve the Lunar Lander problem, two similar deep RL methods were used: a Deep Reinforcement Learning solution for the Lunar Lander problem in OpenAI Gym using a dueling network architecture and the double DQN algorithm, and a DDQN agent whose repository explores strategies around various reinforcement learning techniques, specifically Q-learning. Landing outside the landing pad is possible, but is penalized.

IV. MODEL
A. Framework
The framework used for the lunar lander problem is gym, a toolkit made by OpenAI [12] for developing and comparing reinforcement learning algorithms. The design of the reinforcement system is in RL_system.py, and training is done in RL_system_training.ipynb. We left off with training a few models in the lunar lander environment.
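The difference between DQN and DDQN shows up in how the bootstrap target for the update is computed: DQN takes a max over the target network's values at the next state, while DDQN lets the online network choose the action and the target network score it, which reduces overestimation bias. A minimal sketch with illustrative numbers (function names and values are ours):

```python
def dqn_target(r, q_next_target, gamma=0.99, done=False):
    """Plain DQN target: max over the target network's Q-values at s'."""
    return r if done else r + gamma * max(q_next_target)

def ddqn_target(r, q_next_online, q_next_target, gamma=0.99, done=False):
    """Double DQN target: the online net picks the action at s',
    the target net evaluates that action."""
    if done:
        return r
    a_star = max(range(len(q_next_online)), key=q_next_online.__getitem__)
    return r + gamma * q_next_target[a_star]

# Illustrative Q-values for the 4 actions at the next state:
q_online = [1.0, 3.0, 2.0, 0.0]   # online network's estimates
q_target = [0.5, 1.0, 4.0, 0.2]   # target network's estimates
```

With these numbers, plain DQN bootstraps from max(q_target) = 4.0, while DDQN picks action 1 (the online argmax) and bootstraps from q_target[1] = 1.0, a markedly smaller, less optimistic target.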
Previous attempts to solve the Lunar Lander problem without additional uncertainties have been successful with heuristics and Reinforcement Learning (RL) techniques such as Q-learning and Deep Q-learning (DQL). The lander can also be framed as the optimal control problem for lunar soft landing: we revisit the control problem for a spacecraft to land on the moon's surface at rest with minimal fuel consumption. Here we introduce the considered benchmark problem by presenting its background and formulation, and providing the competition conditions and results.

Reinforcement Learning with the Lunar Lander: OpenAI Gym provides a number of environments for experimenting with and testing reinforcement learning algorithms. Welcome to part 2 of the reinforcement learning with Stable Baselines 3 tutorials.
This article continues from the previous one: we start by working with the lunar lander environment, using gym version 0.21 (part 1: getting to know the Lunar Lander environment). In this environment, we need to train a lander to safely land on the moon. Lunar Lander is an environment provided by OpenAI, based on the open-source physics engine Box2D. The problem is considered solved at 200 points. The lander has thrust control, and the goal is to adjust the thrust to achieve a smooth and accurate landing. We then introduce additional uncertainty to the original problem to test the robustness of the mentioned techniques.

Training progress of one DQN run (average score over the preceding episodes):

Episode 100  Average Score: -203.18
Episode 200  Average Score: -121.23
Episode 300  Average Score: -55.393
Episode 400  Average Score: -12.05
Episode 500  Average Score: 41.702
Episode 600  Average Score: 84.14
Episode 700  Average Score: 142.01
Episode 800  Average Score: 173.82
Episode 900  Average Score: 164.89
Episode 1000 Average Score: 153.46
Episode 1100 Average Score: 184.06
Episode 1200 …

This is a capstone project for the reinforcement learning specialization by the University of Alberta, which provides some of the utility code. By building the RL environment and a full-stack numerical simulation, I gained a deeper understanding of the problem. The repository is laid out as follows:

SCS-RL-3547-Final-Project
│  assets (Git README images store directory)
│  gym (Open AI Gym environment)
│  modelweights (model history)
│  │  LunarLander.h5 (Keras model file)
│  presentation
│  │  Safe_Landings_In_Deep_Space_Presentation.ppsx (Presentation show file)
│  │  Safe_Landings_In_Deep_Space_Presentation.pptx (PowerPoint file)
│  Lunar_Lander_Keyboard_Play.ipynb (Human …)

Deep Q-learning (DQN) is essentially a Q-learning algorithm with an approximation of the Q-value function: instead of a tabular Q-value function that simply maps a state-action pair to a Q-value, the approximator receives a state as input and returns every action together with its Q-value.
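The function-approximation idea behind DQN can be illustrated with the smallest possible stand-in: a single linear layer mapping the 8-dimensional state to 4 action-values, queried by an epsilon-greedy policy. A practical DQN uses a multi-layer network, experience replay, and a target network; all names here are ours.

```python
import random

class LinearQNet:
    """Minimal stand-in for a DQN approximator: state (8 floats) -> 4 Q-values."""
    def __init__(self, n_in=8, n_actions=4, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(n_actions)]

    def q_values(self, state):
        # One dot product per action: a single linear layer, no biases.
        return [sum(w * s for w, s in zip(row, state)) for row in self.w]

def epsilon_greedy(qnet, state, eps=0.1, rng=None):
    """Explore with probability eps, otherwise act greedily on the Q-values."""
    rng = rng or random.Random()
    if rng.random() < eps:
        return rng.randrange(4)
    q = qnet.q_values(state)
    return max(range(4), key=q.__getitem__)
```

With eps = 0 the policy is fully greedy, so the chosen action is simply the argmax of the network's four outputs; annealing eps from 1.0 toward a small floor is the usual exploration schedule.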
The Lunar Lander is a classic reinforcement learning environment provided by OpenAI's Gym library. The goal is to successfully land a spacecraft on the ground.