jm + spacex   2

Autonomous Precision Landing of Space Rockets - Lars Blackmore
From 'Frontiers of Engineering: Reports on Leading-Edge Engineering from the 2016 Symposium', published by the National Academies Press, on the algorithms SpaceX uses for its autonomous landings:
The computation must be done autonomously, in a fraction of a second. Failure to find a feasible solution in time will crash the spacecraft into the ground. Failure to find the optimal solution may use up the available propellant, with the same result. Finally, a hardware failure may require replanning the trajectory multiple times.

Suggested Citation: "Autonomous Precision Landing of Space Rockets - Lars Blackmore." National Academy of Engineering. 2017. Frontiers of Engineering: Reports on Leading-Edge Engineering from the 2016 Symposium. Washington, DC: The National Academies Press. doi: 10.17226/23659.

A general solution to such problems has existed in one dimension since the 1960s (Meditch 1964), but not in three dimensions. Over the past decade, research has shown how to use modern mathematical optimization techniques to solve this problem for a Mars landing, with guarantees that the best solution can be found in time (Açikmeşe and Ploen 2007; Blackmore et al. 2010).
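
To get a feel for the one-dimensional case, here is a minimal sketch of a coast-then-burn plan. It is not the derivation from Meditch (1964): it assumes constant mass, no atmosphere, and a constant net deceleration at full thrust. The function names and the numbers in the usage note are made up for illustration.

```python
def ignition_altitude(h0, v0, g, a_net):
    """Altitude (m) at which to switch from free fall to full thrust.

    h0    -- starting altitude above the pad, m
    v0    -- starting downward speed, m/s (>= 0)
    g     -- gravitational acceleration, m/s^2
    a_net -- net deceleration at full thrust (thrust/mass - g), m/s^2
    """
    if a_net <= 0:
        raise ValueError("full thrust must exceed weight")
    # Free fall from h0 down to h gives v^2 = v0^2 + 2*g*(h0 - h).
    # Braking from speed v to rest takes a distance of v^2 / (2*a_net).
    # Setting that distance equal to h and solving for h:
    h_ign = (v0 ** 2 + 2.0 * g * h0) / (2.0 * (a_net + g))
    if h_ign > h0:
        raise ValueError("already too low and too fast to stop in time")
    return h_ign


def touchdown_speed(h0, v0, g, a_net):
    """Speed at h = 0 when following the coast-then-burn plan (should be ~0)."""
    h_ign = ignition_altitude(h0, v0, g, a_net)
    v_ign_sq = v0 ** 2 + 2.0 * g * (h0 - h_ign)   # speed^2 at ignition
    v_end_sq = v_ign_sq - 2.0 * a_net * h_ign     # speed^2 after the burn
    return max(v_end_sq, 0.0) ** 0.5
```

With Mars-like gravity (3.71 m/s^2), a drop from rest at 1000 m, and a net deceleration of twice gravity, `ignition_altitude(1000.0, 0.0, 3.71, 7.42)` puts ignition at one third of the starting altitude. Igniting later than that cannot stop the vehicle in time, and igniting earlier spends extra propellant, which is the trade-off the excerpt describes.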

Because Earth’s atmosphere is 100 times as dense as that of Mars, aerodynamic forces become the primary concern rather than a disturbance so small that it can be neglected in the trajectory planning phase. As a result, Earth landing is a very different problem, but SpaceX and Blue Origin have shown that this too can be solved. SpaceX uses CVXGEN (Mattingley and Boyd 2012) to generate customized flight code, which enables very high-speed onboard convex optimization.
spacex  blue-origin  convex-optimization  space  landing  autonomous-vehicles  flight  algorithms 
8 days ago by jm
SpaceX software dev practices
Metrics rule the roost -- I guess there's been a long history of telemetry in space applications.

To make software more visible, you need to know what it is doing, he said, which means creating "metrics on everything you can think of".... Those metrics should cover areas like performance, network utilization, CPU load, and so on.

The metrics gathered, whether from testing or real-world use, should be stored, as it is "incredibly valuable" to be able to go back through them, he said. For his systems, telemetry data is stored with the program metrics, as is the version of all of the code running so that everything can be reproduced if needed.
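
A minimal sketch of what storing metrics alongside the code version could look like, assuming one JSON record per line in an append-only file. The field names and helper functions are hypothetical, not anything the talk describes.

```python
import json
import time


def make_record(metrics, code_version, telemetry=None, timestamp=None):
    """Bundle program metrics, telemetry, and the code version into one record."""
    return {
        "timestamp": time.time() if timestamp is None else timestamp,
        "code_version": code_version,  # e.g. a version-control revision id
        "metrics": dict(metrics),
        "telemetry": dict(telemetry or {}),
    }


def append_record(path, record):
    """Append one record as a single JSON line so earlier runs stay intact."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")


def load_records(path):
    """Read every stored record back, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Keeping the version inside each record means an old run can be matched to the exact code that produced it without a separate lookup.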

SpaceX has programs to parse the metrics data and raise an alarm when "something goes bad". It is important to automate that, Rose said, because forcing a human to do it "would suck". The same programs run on the data whether it is generated from a developer's test, from a run on the spacecraft, or from a mission. Any failures should be seen as an opportunity to add new metrics. It takes a while to "get into the rhythm" of doing so, but it is "very useful". He likes to "geek out on error reporting", using tools like libSegFault and ftrace.
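
The idea of one checker that runs on data from any source can be sketched as below. This is an illustration only: the record layout, the `source` labels, and the limits are assumptions, not SpaceX's tooling.

```python
def check_metrics(records, limits):
    """Return an alarm string for every metric outside its allowed range.

    records -- iterable of dicts like {"source": "dev-test", "metrics": {...}}
    limits  -- dict mapping metric name to (low, high); None means unbounded
    """
    alarms = []
    for i, record in enumerate(records):
        source = record.get("source", "unknown")
        for name, (low, high) in limits.items():
            value = record.get("metrics", {}).get(name)
            if value is None:
                # A metric that stopped reporting is itself worth an alarm.
                alarms.append(f"{source}[{i}]: {name} missing")
            elif (low is not None and value < low) or (
                high is not None and value > high
            ):
                alarms.append(f"{source}[{i}]: {name}={value} outside [{low}, {high}]")
    return alarms
```

The same call works whether `records` came from a developer's test or from a mission log. When a new kind of failure shows up, adding an entry to `limits` covers every source at once.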

Automation is important, and continuous integration is "very valuable", Rose said. He suggested building for every platform all of the time, even for "things you don't use any more". SpaceX does that and has found interesting problems when building unused code. Unit tests are run from the continuous integration system any time the code changes. "Everyone here has 100% unit test coverage", he joked, but running whatever tests are available, and creating new ones is useful. When he worked on video games, they had a test to just "warp" the character to random locations in a level and had it look in the four directions, which regularly found problems.
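
The "warp" test is a form of randomized testing. A toy version, assuming a level is a grid of `#` walls and `.` open cells and that "looking" means counting visible open cells, might be written as follows. The level format and `view_distance` are stand-ins for a real game engine.

```python
import random

DIRECTIONS = {"north": (0, -1), "east": (1, 0), "south": (0, 1), "west": (-1, 0)}


def view_distance(level, x, y, direction):
    """Count open cells visible from (x, y) before a wall or the level edge."""
    dx, dy = DIRECTIONS[direction]
    steps = 0
    x, y = x + dx, y + dy
    while 0 <= y < len(level) and 0 <= x < len(level[y]) and level[y][x] == ".":
        steps += 1
        x, y = x + dx, y + dy
    return steps


def warp_test(level, trials=1000, seed=0):
    """Warp to random open cells, look in four directions, collect failures."""
    rng = random.Random(seed)  # fixed seed so any failure can be replayed
    open_cells = [
        (x, y)
        for y, row in enumerate(level)
        for x, cell in enumerate(row)
        if cell == "."
    ]
    limit = max(len(level), len(level[0]))
    failures = []
    for _ in range(trials):
        x, y = rng.choice(open_cells)
        for direction in DIRECTIONS:
            try:
                d = view_distance(level, x, y, direction)
                if d < 0 or d >= limit:
                    failures.append((x, y, direction, f"bad distance {d}"))
            except Exception as exc:  # a crash is a finding, not a test abort
                failures.append((x, y, direction, repr(exc)))
    return failures
```

Recording the position and direction with each failure is what makes this kind of test useful: the problem can be reproduced by warping straight back to the same spot.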

"Automate process processes", he said. Things like coding standards, static analysis, spaces vs. tabs, or detecting the use of Emacs should be done automatically. SpaceX has a complicated process where changes cannot be made without tickets, code review, signoffs, and so forth, but all of that is checked automatically. If static analysis is part of the workflow, make it such that the code will not build unless it passes that analysis step.
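
One way to make the build depend on the analysis step is a small gate script that runs each check and reports which ones failed. This is a sketch under assumptions: the check names and commands are placeholders for a project's real linter and static analyzer.

```python
import subprocess
import sys


def run_gate(checks):
    """Run each check command and return the names of the ones that failed.

    checks -- list of (name, argv) pairs; a non-zero exit status means failure
    """
    failed = []
    for name, argv in checks:
        result = subprocess.run(argv, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(name)
            # Surface the tool's own output so the cause is visible in the log.
            print(f"FAILED {name}:\n{result.stdout}{result.stderr}", file=sys.stderr)
    return failed
```

A build script could call `sys.exit(1)` whenever `run_gate(...)` returns a non-empty list, so that later build steps never run on code that failed a check.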

When the build fails, it should "fail loudly" with a "monitor that starts flashing red" and email to everyone on the team. When that happens, you should "respond immediately" to fix the problem. In his team, they have a full-size Justin Bieber cutout that gets placed facing the team member who broke the build. They found that "100% of software engineers don't like Justin Bieber", and will work quickly to fix the build problem.
spacex  dev  coding  metrics  deplyment  production  space  justin-bieber 
march 2013 by jm
