Found in 12 comments on Hacker News
eggy · 2022-12-22 · Original thread
I still play with Rust (and Zig), but I have decided to put my work efforts into SPARK[1], the subset of Ada, for high-integrity software and formal verification. I know AdaCore and Ferrous Systems are collaborating in trying to bring a lot of Ada/Spark's capabilities to Rust, but this is still going to be some time. Ada has a longer legacy in this game.

I am working on safety critical control systems and there is a lot to prune from existing work in this area in SPARK and Ada.

This book[2], "Building High Integrity Applications with SPARK" really sold me on the benefits. It is a great intro of how SPARK was used for the CubeSat program at Vermont Technical College (VTC). A CubeSat was successfully launched in 2013. I could not find any critical mission software written in Rust yet especially that far back.

I found Julia's selection for a flight collision avoidance system perplexing[3], but it proves a PLs syntax and evangelism (maybe not in this case), can bias a selection of a PL for such tasks. People find Ada/SPARK's syntax and methods verbose and similar to Pascal or other PLs, however, I was up and running quickly in it as opposed to Rust.

You can program to bare metal in SPARK, and the tools are already there to provide high-integrity software. The tools for SPARK automate most of the verification. Most of Rust's wishlist is already in SPARK (and Ada). I hope the Ferrous Systems and AdaCore collab bears fruit for Rust, but I have been programming since 1978, and I play with many PLs, but I found Rust's initial learning curve very off putting and I don't have the time to wait for it to mature to SPARK's level. And this is from someone who loves J/APL/BQN and Lisp, so I have no issue with different programming paradigms. I found Zig much easier to get things done from the start, although it does not strive to be a SPARK or Rust.

[1] https://www.adacore.com/sparkpro

[2] https://www.amazon.com/Building-High-Integrity-Applications-...

[3] https://juliahub.com/case-studies/lincoln-labs/

eggy · 2022-09-09 · Original thread
Hah, Chevy Spark, and it's a gasoline car!

I usually put "SPARK2014 programming" or "SPARK2014 Ada" in my searches. I used to get annoyed by all of the Apache Spark stuff that came up, but that seems to have waned a bit this past year or two.

Vermont Technical College (VTC) used SPARK2014 to program their CubeSat and the book, "Building High Integrity Applications with SPARK"[1] is a great intro to it in a nice context of practical use.

There was also an article on the AdaCore site about someone who rewrote some drone firmware in SPARK2014, which is another nice real-world application. [2]

[1] https://www.amazon.com/Building-High-Integrity-Applications-...

[2] https://blog.adacore.com/how-to-prevent-drone-crashes-using-...

If you want to learn it, here's the main book people are using:

https://www.amazon.com/Building-High-Integrity-Applications-...

There's tooling like Cryptol and SPARK to make this a lot easier than it was in the past. Cryptol can generate software or hardware that implements the high-level spec of the algorithm. Far as practical use, certainly not just one even though still extremely rare. Here's some more:

Altran/Praxis Correct-by-Construction for a cert authority

http://www.sis.pitt.edu/jjoshi/Devsec/CorrectnessByConstruct...

Same company ports Skein from C to SPARK, finds trouble

http://www.skein-hash.info/sites/default/files/SPARKSkein.pd...

SPARK Ada is available under GPL and commercial licenses. The book below is quite easy to read compared to most stuff with formal methods. The companies building stuff like this usually also have tools to generate tests (property-based testing) from contracts/specs and can automatically turn anything you can't prove into runtime checks. Shocked it's not used more often in areas like smart contracts or cryptocurrency protocols.

https://www.amazon.com/Building-High-Integrity-Applications-...

Galois built Cryptol for NSA but open sourced it.

https://www.cryptol.net/

Rockwell Collins built SHADE and some other tools for high-assurance crypto. They build stuff for the defense sector mainly funded by NSA. They even have a separation-preserving, verified CPU.

http://www.csl.sri.com/users/shankar/VGC05/Hardin.pdf

Cleanroom just uses functional decomposition, the simplest of the control flow primitives, and math you learned in middle or high school. Got results on lots of projects.

http://infohost.nmt.edu/~al/cseet-paper.html

https://pdfs.semanticscholar.org/12d5/23e586ffde5cbe020cb3fa...

(Second link has a table and description showing the results they got. Stavely's book distills it into lightweight, less-processy form.)

Design by Contract itself is like assertions on steroids. If your language lacks support, you can build it into your functions at beginning, middle and end. If OOP language, you might use constructors and destructors. If trying to understand it, I have a link that you can give even to project managers.

https://www.win.tue.nl/~wstomv/edu/2ip30/references/design-b...

One benefit of making it a formal spec is you can generate tests directly from your specs. This is an old, old technique being rediscovered recently. Various names include specification-based, model-based, contract-based, and property-based test generation. The last thing you can do is manually or automatically convert your specs into runtime checks for the above and/or fuzz testing. The failure takes you exactly to the property that failed if it was one of them.

Finally, for just the most critical things, you can try formal proof on them. SPARK Ada is a great example used in industry with the book being pretty easy to follow.

https://www.amazon.com/Building-High-Integrity-Applications-...

The nice thing about that toolset, esp the proprietary version, is that you can use automated provers to avoid having to do mathematical proof by hand. If something doesn't pass, you have several options: do some manual work on hints to the automated provers or actual proofs in a proof assistant; monkey around with the code to see if different structure or algorithm gets it through; put in runtime checks for just the properties you couldn't prove. If you do the last one and keep programming while solvers run, then the productivity is similar to Design-by-Contract with the extra benefit some properties might hold in all cases. You might also get performance boost by reducing unnecessary, safety checks via proofs that were successful.

They probably should've started with a Wirth-style language plus Design-by-Contract and static analysis. That would knock out tons of stuff at compile and runtime. If looking for mature language, the best one for predictability is SPARK Ada hands down. It was designed for easy analysis in automated provers. It's automation level is a lot higher than it used to be. Folks wanting provable programs should either be considering using it or prototyping their programs in it with semantics-preserving translation to the blockchain language/VM.

https://en.wikipedia.org/wiki/SPARK_(programming_language)

https://www.amazon.com/Building-High-Integrity-Applications-...

Note: The book is designed for people without formal methods background. I can't say how easy or not it will be since I just got it. :)

Are we there yet? 20 Years of Industrial Theorem Proving with SPARK

https://proteancode.com/keynote.pdf

IRONSIDES DNS illustrates it can handle bigger problems

http://ironsides.martincarlisle.com

Frama-C and SPARK both build on Why3 platform with its intermediate language WhyML. Work built on any of these three can usually be ported to the others. So, I'll drop a recent example in Frama-C that's really math heavy with a style amenable to high automation.

https://hal.archives-ouvertes.fr/hal-01681134/document

If you're learn Ada, be sure to check out Building High Integrity Applications in SPARK to get the most benefit from it. I also find Barnes book does a good job chapter by chapter showing how it's systematically designed for safety. Finally, if anyone doubts improvements that can happen, I have the most apples to apples study you'll see comparing Ada and C.

https://www.amazon.com/Building-High-Integrity-Applications-...

https://www.adacore.com/books/safe-and-secure-software

http://archive.adaic.com/intro/ada-vs-c/cada_art.pdf

What makes this a good write-up is you could give it to a project manager as quickly as an engineer. An updated version of this with links to current tech, esp low cost or easy to integrate, might be worthwhile. Other improvements since then include generating tests directly from the contracts, using contracts with automated provers, leaving contracts in as runtime checks while throwing fuzzers at that code, and using contracts for bug repair. They can't do everything but they're a very high-ROI technique with some immediate benefits plus maybe some down the road if your software/company gets bigger.

Write-ups or examples of some of the above:

https://hillelwayne.com/post/pbt-contracts/

http://www.skein-hash.info/sites/default/files/SPARKSkein.pd...

https://www.amazon.com/Building-High-Integrity-Applications-...

https://blog.adacore.com/running-american-fuzzy-lop-on-your-...

https://www.microsoft.com/en-us/research/wp-content/uploads/...

“but after studying many other ICO, I am convinced now that this is not just a short-lived fad, but a real new model to fund projects, and especially open-source projects related to blockchains.”

There’s been quite a few of us suggesting people jump into blockchain just to get funding for more important stuff (e.g. language-level tooling or libraries) they’ll produce as part of the coin offering they sell to investors. Plus, with the focus on correctness, it’s easier than ever to get the private sector to invest in high-assurance tooling. Problem is they keep trying to clean-slate everything when there's piles of work to build on with better benefits in the long term. And this work is hard even when doing simple things if you want end result to be widely usable. So, you better have a good justification for trying to redo decades worth of hard work on your own with some VC funding.

Anyone wanting a shortcut should build your language on top of capabilities of the SPARK Ada language [1] with this book [2]. Your smart contracts get all the benefits it currently offers plus whatever extra you build plus whatever they build with the influx of money for Pro edition. Recent projects already add some pointer [3] and floating-point [4] safety to what they already mostly automate with that tooling. This is also a mature tool whose development and commercial use goes back decades [5].

[1] https://en.wikipedia.org/wiki/SPARK_(programming_language)

[2] https://www.amazon.com/Building-High-Integrity-Applications-...

[3] https://arxiv.org/abs/1710.07047

[4] http://lists.forge.open-do.org/pipermail/spark2014-discuss/2...

[5] http://www.spark-2014.org/uploads/itp_2014_r610.pdf

[Copying/pasting my reply to author in Disqus since at work. ]

Good write-up on the difficulties. The first I saw was Guttman’s paper that set back the hype quite a bit:

http://www.cypherpunks.to/~peter/04_verif_techniques.pdf

His post and yours both oversell the problems while underselling the successes. It’s best to keep big picture in mind. Yes, it’s extremely hard. Yes, we shouldn’t rely exclusively on it or have too much expectations. On Guttman’s end, the Orange Book of TCSEC (later, EAL7 of Common Criteria) required many activities like I included in my own security framework:

http://pastebin.com/y3PufJ0V

Each was supposed to check the others. Each time, you want experienced people doing it since it’s the people that make these processes work. What’s mostly missing in this post and esp in Guttman’s is all the evidence of defect reduction. Most of the work he cites were tracking problems they found throughout their lifecycle. They all said the formal specifications caught bugs due to ambiguity. Some found formal proofs beneficial with often mentioning “deep” errors they likely wouldn’t find with testing or code review. TLA+ use at Amazon was great example where errors showed up 30+ steps into protocols. Most of old verifications found they caught errors just simplifying their specs for the prover even if they wouldn’t run proofs. And for high-assurance security, the A1-class systems would go through 2-5 years of analysis and pentesting where evalutators said they did way better than systems using mainstream practices. Most of those can’t even tell you how many covert channels they have.

If you’re interested in methods that helped in practice, I dropped a start on a survey paper on Lobste.rs trying to shorten several decades worth of material into an outline post:

https://pastebin.com/uyNfvqcp

Now, all that said, full proof is hard with a lot of problems. You hit nail on the head about the infrastructure problem. I’ve always strongly encouraged people to concentrate on building up one or two projects to have infrastructure for most common ways of solving problems in programming with as much automation as possible. They have at least been building most things on Coq, Isabelle/HOL, HOL4, and ACL2. Having prior work to build on has been helping them. As you’ve seen, though, they’ve not made this a priority: the tooling still is horrible. So, horribly-difficult work with horribly-difficult tooling. That leads to the next proposal I did that might help you or some of your readers who want 80/20 rule of most benefits without most problems:

https://news.ycombinator.com/item?id=15781508

https://lobste.rs/s/n7v658/your_thoughts_on_this_advice_thos...

Here’s some of those methods on top of the TLA+ and Alloy links:

https://www.eiffel.com/values/design-by-contract/introductio...

https://www.win.tue.nl/~wstomv/edu/2ip30/references/design-b...

http://www.anthonyhall.org/c_by_c_secure_system.pdf

https://www.amazon.com/Building-High-Integrity-Applications-...

Here’s my recommendation in general. Learn lightweight methods such as Meyer’s Design-by-Contract for most stuff, TLA+ for concurrency/order, Alloy for structure, SPARK for low-level code, and so on. TLA+ on distributed stuff seems to sell people the most. With a formal method in hand, then apply this methodology: memory-safe language + lightweight method to model-check important stuff + Design-by-Contract + tooling to generate tests from specs/contracts + fuzzer running with runtime checks enabled. This combination will bring the most quality improvement for least effort. In parallel, we have the experts formally verifying key components such as kernels, filesystems, network stacks, type systems, compilers, security protocols, and so on. They do so modelling real-world usage with each one coming with contracts and guidance on exactly how they should be used. The baseline goes up dramatically with specific components getting nearly bullet-proof.

Also, you said you wanted entire system to be reliable. There was one that set the gold standard on that by noting how everything failed with systematic mitigations for it in prevention and/or recovery:

http://www.hpl.hp.com/techreports/tandem/TR-85.7.pdf

https://en.wikipedia.org/wiki/NonStop_(server_computers)

The OpenVMS clustering software alone let their systems run for years at a time. One ran 17 years at a rail yard or something supposedly. The NonStop systems took that to hardware itself with them just running and running and running. Applying a combination of balanced assurance like I describe, NonStop techniques for reliability, and Cambridge’s CHERI for security should make one hell of a system. You can baseline it all with medium-assurance using tools that already exist with formal verification applied to a piece of the design at a time. The work by Rockwell-Collins on their commercially-deployed AAMP7G’s shows that’s already reachable.

http://www.ccs.neu.edu/home/pete/acl206/slides/hardin.pdf

So, there’s you some weekend reading. Merry Christmas! :)

For people wanting to learn, this article trying to use it for audio applications will give you a nice taste of the language:

http://www.electronicdesign.com/embedded-revolution/assessin...

This Barnes book shows how it’s systematically designed for safety at every level:

https://www.adacore.com/books/safe-and-secure-software

Note: The AdaCore website has a section called Gems that gives tips on a lot of useful ways to apply Ada.

Finally, if you do Ada, you get the option of using Design-by-Contract (built-in to 2012) and/or SPARK language. One gives you clear specifications of program behavior that take you right to source of errors when fuzzing or something. The other is a smaller variant of Ada that integrates into automated, theorem provers to try to prove your code free of common errors in all cases versus just ones you think of like with testing. Those errors include things like integer overflow or divide by zero. Here’s some resources on those:

http://www.eiffel.com/developers/design_by_contract_in_detai...

https://en.wikipedia.org/wiki/SPARK_(programming_language)

https://www.amazon.com/Building-High-Integrity-Applications-...

The book and even language was designed for people without a background in formal methods. I’ve gotten positive feedback from a few people on it. Also, I encouraged some people to try SPARK for safer, native methods in languages such as Go. It’s kludgier than things like Rust designed for that in mind but still works.

GPL download for AdaCore GNAT:

https://www.adacore.com/community

For small modules, SPARK with this book from the limited feedback I have:

https://www.amazon.com/Building-High-Integrity-Applications-...

Verification of larger programs is hard for a lot of reasons. Easiest method is to do Design-by-Contract in a memory-safe language. Anything easy to specify in contracts goes there where stuff easier to test for is done that way. If your language doesn't support it, you can do it with asserts, conditionals in functions, or constructors/destructors in OOP.

http://www.eiffel.com/developers/design_by_contract_in_detai...

Then, you hit the app with tests generated from the contracts and fuzz testing that both run while the contract conditions have runtime checks in the code. That means a failure will take you right to the precondition, invariant, or post condition that is broken.

For analyzing distributed stuff, TLA+ is your best bet with an accessible tutorial for a subset available here:

https://learntla.com/introduction/

Fresh book recommendations delivered straight to your inbox every Thursday.