(Second link has a table and description showing the results they got. Stavely's book distills it into lightweight, less-processy form.)
Design by Contract itself is like assertions on steroids. If your language lacks support, you can build it into your functions at beginning, middle and end. If OOP language, you might use constructors and destructors. If trying to understand it, I have a link that you can give even to project managers.
One benefit of making it a formal spec is you can generate tests directly from your specs. This is an old, old technique being rediscovered recently. Various names include specification-based, model-based, contract-based, and property-based test generation. The last thing you can do is manually or automatically convert your specs into runtime checks for the above and/or fuzz testing. The failure takes you exactly to the property that failed if it was one of them.
Finally, for just the most critical things, you can try formal proof on them. SPARK Ada is a great example used in industry with the book being pretty easy to follow.
The nice thing about that toolset, esp the proprietary version, is that you can use automated provers to avoid having to do mathematical proof by hand. If something doesn't pass, you have several options: do some manual work on hints to the automated provers or actual proofs in a proof assistant; monkey around with the code to see if different structure or algorithm gets it through; put in runtime checks for just the properties you couldn't prove. If you do the last one and keep programming while solvers run, then the productivity is similar to Design-by-Contract with the extra benefit some properties might hold in all cases. You might also get performance boost by reducing unnecessary, safety checks via proofs that were successful.
Note: The book is designed for people without formal methods background. I can't say how easy or not it will be since I just got it. :)
Are we there yet? 20 Years of Industrial Theorem Proving with SPARK
IRONSIDES DNS illustrates it can handle bigger problems
Frama-C and SPARK both build on Why3 platform with its intermediate language WhyML. Work built on any of these three can usually be ported to the others. So, I'll drop a recent example in Frama-C that's really math heavy with a style amenable to high automation.
Write-ups or examples of some of the above:
There’s been quite a few of us suggesting people jump into blockchain just to get funding for more important stuff (e.g. language-level tooling or libraries) they’ll produce as part of the coin offering they sell to investors. Plus, with the focus on correctness, it’s easier than ever to get the private sector to invest in high-assurance tooling. Problem is they keep trying to clean-slate everything when there's piles of work to build on with better benefits in the long term. And this work is hard even when doing simple things if you want end result to be widely usable. So, you better have a good justification for trying to redo decades worth of hard work on your own with some VC funding.
Anyone wanting a shortcut should build your language on top of capabilities of the SPARK Ada language  with this book . Your smart contracts get all the benefits it currently offers plus whatever extra you build plus whatever they build with the influx of money for Pro edition. Recent projects already add some pointer  and floating-point  safety to what they already mostly automate with that tooling. This is also a mature tool whose development and commercial use goes back decades .
Good write-up on the difficulties. The first I saw was Guttman’s paper that set back the hype quite a bit:
His post and yours both oversell the problems while underselling the successes. It’s best to keep big picture in mind. Yes, it’s extremely hard. Yes, we shouldn’t rely exclusively on it or have too much expectations. On Guttman’s end, the Orange Book of TCSEC (later, EAL7 of Common Criteria) required many activities like I included in my own security framework:
Each was supposed to check the others. Each time, you want experienced people doing it since it’s the people that make these processes work. What’s mostly missing in this post and esp in Guttman’s is all the evidence of defect reduction. Most of the work he cites were tracking problems they found throughout their lifecycle. They all said the formal specifications caught bugs due to ambiguity. Some found formal proofs beneficial with often mentioning “deep” errors they likely wouldn’t find with testing or code review. TLA+ use at Amazon was great example where errors showed up 30+ steps into protocols. Most of old verifications found they caught errors just simplifying their specs for the prover even if they wouldn’t run proofs. And for high-assurance security, the A1-class systems would go through 2-5 years of analysis and pentesting where evalutators said they did way better than systems using mainstream practices. Most of those can’t even tell you how many covert channels they have.
If you’re interested in methods that helped in practice, I dropped a start on a survey paper on Lobste.rs trying to shorten several decades worth of material into an outline post:
Now, all that said, full proof is hard with a lot of problems. You hit nail on the head about the infrastructure problem. I’ve always strongly encouraged people to concentrate on building up one or two projects to have infrastructure for most common ways of solving problems in programming with as much automation as possible. They have at least been building most things on Coq, Isabelle/HOL, HOL4, and ACL2. Having prior work to build on has been helping them. As you’ve seen, though, they’ve not made this a priority: the tooling still is horrible. So, horribly-difficult work with horribly-difficult tooling. That leads to the next proposal I did that might help you or some of your readers who want 80/20 rule of most benefits without most problems:
Here’s some of those methods on top of the TLA+ and Alloy links:
Here’s my recommendation in general. Learn lightweight methods such as Meyer’s Design-by-Contract for most stuff, TLA+ for concurrency/order, Alloy for structure, SPARK for low-level code, and so on. TLA+ on distributed stuff seems to sell people the most. With a formal method in hand, then apply this methodology: memory-safe language + lightweight method to model-check important stuff + Design-by-Contract + tooling to generate tests from specs/contracts + fuzzer running with runtime checks enabled. This combination will bring the most quality improvement for least effort. In parallel, we have the experts formally verifying key components such as kernels, filesystems, network stacks, type systems, compilers, security protocols, and so on. They do so modelling real-world usage with each one coming with contracts and guidance on exactly how they should be used. The baseline goes up dramatically with specific components getting nearly bullet-proof.
Also, you said you wanted entire system to be reliable. There was one that set the gold standard on that by noting how everything failed with systematic mitigations for it in prevention and/or recovery:
The OpenVMS clustering software alone let their systems run for years at a time. One ran 17 years at a rail yard or something supposedly. The NonStop systems took that to hardware itself with them just running and running and running. Applying a combination of balanced assurance like I describe, NonStop techniques for reliability, and Cambridge’s CHERI for security should make one hell of a system. You can baseline it all with medium-assurance using tools that already exist with formal verification applied to a piece of the design at a time. The work by Rockwell-Collins on their commercially-deployed AAMP7G’s shows that’s already reachable.
So, there’s you some weekend reading. Merry Christmas! :)
This Barnes book shows how it’s systematically designed for safety at every level:
Note: The AdaCore website has a section called Gems that gives tips on a lot of useful ways to apply Ada.
Finally, if you do Ada, you get the option of using Design-by-Contract (built-in to 2012) and/or SPARK language. One gives you clear specifications of program behavior that take you right to source of errors when fuzzing or something. The other is a smaller variant of Ada that integrates into automated, theorem provers to try to prove your code free of common errors in all cases versus just ones you think of like with testing. Those errors include things like integer overflow or divide by zero. Here’s some resources on those:
The book and even language was designed for people without a background in formal methods. I’ve gotten positive feedback from a few people on it. Also, I encouraged some people to try SPARK for safer, native methods in languages such as Go. It’s kludgier than things like Rust designed for that in mind but still works.
GPL download for AdaCore GNAT:
Verification of larger programs is hard for a lot of reasons. Easiest method is to do Design-by-Contract in a memory-safe language. Anything easy to specify in contracts goes there where stuff easier to test for is done that way. If your language doesn't support it, you can do it with asserts, conditionals in functions, or constructors/destructors in OOP.
Then, you hit the app with tests generated from the contracts and fuzz testing that both run while the contract conditions have runtime checks in the code. That means a failure will take you right to the precondition, invariant, or post condition that is broken.
For analyzing distributed stuff, TLA+ is your best bet with an accessible tutorial for a subset available here: