Making Software: What Really Works, and Why We Believe It

kennethh · 2022-02-08 · Original thread

Surprised nobody has mentioned Making Software, https://www.amazon.com/Making-Software-Really-Works-Believe/...

It takes an empirical view of the software process. Does writing tests first help you develop better code faster? Can code metrics predict the number of bugs in a piece of software? Do design patterns actually make better software?

itsdrewmiller · 2021-04-07 · Original thread

Code review approaches are discussed in this book:

https://www.amazon.com/Making-Software-Really-Works-Believe/...

IIRC the evidence was that formal code reviews were no more effective than async peer review (but both were effective at uncovering bugs).

arichard123 · 2019-09-25 · Original thread

My understanding is that it is harder for engineers to estimate how long a job takes than to do the job. That is to say, the complexity of the estimation task is higher than the complexity of the task you are estimating. I read this in the book "Making Software: What Really Works, and Why We Believe It" [0]. The assumption in the article is that an engineer can produce reliable estimates for complex work and appropriately guide their managers, and I don't think that's true.

[0] https://www.amazon.co.uk/Making-Software-Really-Works-Believ...

js8 · 2019-04-17 · Original thread

> Unfortunately, the phrasing would be more like, It doesn't matter what's true, it matters what I can sell to management.

Yes, that's my main beef with it. SCRUM could be based on science, but sadly, it isn't. Here's an interesting book: https://www.amazon.com/Making-Software-Really-Works-Believe/...

moneymakersucks · 2017-05-17 · Original thread

Linked FTA, book about software dev based on empirical studies

https://www.amazon.com/Making-Software-Really-Works-Believe/...

pnathan · 2015-05-02 · Original thread

Interesting work.

It has some basic, pervasive flaws that the reader should be aware of. First off is the "productivity" measure: the last time I looked around, this was nearly impossible to quantify for software development. The author chose SLOC and development time as stand-ins for productivity. SLOC has a connection with code quality and time of development, but development time is well known to vary[1].

In particular, development time in this thesis is linked to results in a pretty classic "Psychologist's fallacy"[2]; the author generalizes their experiences as conclusive. This implies in particular that "development time" in this thesis should be thrown out for meaningful conclusions, as the sample size is 1. It is, however, an interesting experience report.

Another basic flaw is the connecting of 'modern' with 'good'. The author remarks that the main disadvantage of C and Fortran is their age; this crops up here and there. Workflow tooling in Go and Rust is a major focus of the developers; C workflow tooling is usually locally brewed. This does not make C-the-language worse.

More on a meta level, my advisor in my Master's work drummed into me that I should NOT insert my opinion into the thesis until the conclusion. So I found the editorializing along the way very annoying.

And finally, and very unfortunately, the experience level of the author appears to be low in all three languages; this is significant when it comes to implementing high performance code.

---

Now, for the interesting / good parts of this experience report.

Standout interesting for me was the Rust speedup on the 48-core machine. I did not expect that, nor did I expect Go to make such a good showing here as well.

In general Go performance/memory made a surprisingly good showing (to me) for this work. I shall have to revise my opinion of it upward in the performance axis.

I am both surprised and vaguely annoyed by the Rust runtime eating so much memory (Would love to hear from a Rust contributor why that is and what's being scheduled to be done about it).

Of course the stronger type system of Rust catching an error that C and Go didn't pick up is both (a) humorous and (b) justifies the type community's work in these areas. I look forward to Rust 1.0!

One note by the author that is worth calling out more strongly is the deployment story: Go has a great one with static linking, whereas C gets sketchy and Rust is... ??. I believe Rust has a static linker option, but I haven't perused the manual in that area for some time. For serious cloud-level deployments over time, static linking is very nice, and I'm not surprised Google went that route. It's something that would be very nice to have as a Rust emission option "--crate-type staticbin".

Anyway. I look forward to larger sample sizes and, one day, a better productiv

[1] http://www.amazon.com/Making-Software-Really-Works-Believe/d... The situation is actually far worse than just varying developer time.

[2] https://en.wikipedia.org/wiki/Psychologist%27s_fallacy

bhntr3 · 2015-04-28 · Original thread

The lack of citations is concerning. I could read his book but it worries me when numbers like these are thrown around without some way to verify the analysis.

The "28x productivity" (or 10x or Xx ...) claim always triggers warning bells for me. http://morendil.github.io/folklore.html does a pretty good job explaining how the research backing that widely accepted "fact" may be questionable.

I also really like Dan Luu's review of the research behind static typing at http://danluu.com/empirical-pl/ as an example of how a "well studied" claim can still be questionable due to the massive difficulty in evaluating software engineering empirically.

Software engineering is a very human activity and that makes it very hard to measure and quantify.

A book that does a better job is Making Software (http://www.amazon.com/Making-Software-ebook/dp/B004D4YI6G/re...) but as my first link points out, it still has some issues. At least its goal is to get more rigorous in our scientific analysis of software engineering.

e1g · 2014-05-16 · Original thread

"Making Software" [1] has a chapter called "How Effective is Test-Driven Development?" where they do a meta review of 32 clinical studies on TDD pulled from 325 reports. They filtered out reports based on scientific rigour, completeness, overlap, subjectivity etc, and ended up with 22 reports based on 32 unique trials. "TDD effectiveness" was defined as improvement across four dimensions - internal code quality, external system quality, team productivity, and test quality - and each dimension is strictly defined in sufficient detail.

They report that high-rigour studies show TDD has no clear effect on internal code quality, external system quality, or team productivity. While some studies report TDD has a positive impact, just as many report it has a negative impact or makes no difference at all. The only dimension that seems to be improved is "test quality" - that is, test density and test coverage - and the researchers note that the difference was not as great as they had expected.

The chapter concludes that despite mixed results, they still recommend trialling TDD for your team as it may solve some problems. However it is important to be mindful that there is yet no conclusive evidence that it works as consistently or as effectively as anecdotes from happy practitioners would suggest.

The meta-study is relatively short and can be read online [2].

[1] http://www.amazon.com/Making-Software-Really-Works-Believe/d...

[2] http://hakanerdogmus.net/weblog/wp-content/uploads/tdd-sr-bo...

ggreer · 2014-02-19 · Original thread

While this list is certainly educational, remember that most of these "laws" are heuristics based on personal experience.

The discipline of software development is about as rigorous as parapsychology. To learn what we really know about software, I recommend Making Software: What Really Works & Why We Believe It[1], The Leprechauns of Software Engineering[2], and browsing It Will Never Work in Theory[3].

And no matter what languages, testing frameworks, or methodologies you use, always remember:

Every "bug" or defect in software is the result of a mismatch between a person's assumptions, beliefs or mental model of something (a.k.a. "the map"), and the reality of the corresponding situation (a.k.a. "the territory").[4]

If you want your map to reflect the territory accurately, you'll need more than a few "laws" handed down as gospel.

1. https://www.amazon.com/Making-Software-Really-Works-Believe-...

2. https://leanpub.com/leprechauns

3. http://neverworkintheory.org/

4. http://lesswrong.com/lw/2rb/why_learning_programming_is_a_gr...

gruseom · 2014-01-13 · Original thread

> it makes a huge number of claims but does not back up these claims by any empirical evidence

Name a book that does! The corpus of evidence in our field is quite weak and supports almost no interesting claim you might make about it. For example, there's no empirical data supporting version control [1].

There are books that purport to derive software development lessons from the research literature, but they're either not very good (because they rely on crappy studies and/or go beyond what the studies support) or not very interesting (because they merely transcribe the results of very narrow studies). An anthology of the literature was published recently [2], and it falls in the latter category. I bought it to get the state of the art in empirical research and found it contained almost nothing of use to me as a practitioner. Many of the studies were shoddy (e.g. qualitative case histories--a euphemism for "anecdote") and the better ones are so careful to limit themselves to the tiny experiments they did that you can't derive anything from them. I could mention a couple of exceptions, though, if anyone's interested.

I think the root problem is that software development is complicated, so to do rigorous formal studies on it is expensive. The amount of funding available for this work is a drop in the bucket of what it would need to be. In short, the market value of empirical evidence on software development isn't high enough for anyone to pay for it.

[1] http://neverworkintheory.org/2012/12/12/empirical-evidence-f.... and http://neverworkintheory.org/2012/12/30/why-we-need-evidence.... Notice that the HN discussion of the first post completely failed to answer its question: https://news.ycombinator.com/item?id=4931251.

[2] http://www.amazon.com/Making-Software-Really-Works-Believe/d...

artumi-richard · 2013-08-29 · Original thread

The book "Making Software: What Really Works, and Why We Believe It" (http://www.amazon.co.uk/Making-Software-Really-Works-Believe...) has a section on this.

Chapter 8, "Beyond Lines of Code: Do We Need More Complexity Metrics?", by Israel Herraiz and Ahmed E. Hassan.

Their short answer is that, in the case they looked at, all the suggested metrics correlated with LOC, so you may as well use LOC as it's so easy to measure.
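To make the correlation claim concrete, here is a minimal sketch of how one might check it on a handful of snippets. It is not the chapter's method or data: the `SAMPLES` snippets, the `branch_count` proxy for cyclomatic complexity, and the helper names are all assumptions made up for illustration.

```python
import ast
import math

# Hypothetical sample snippets, invented for this illustration only.
SAMPLES = [
    "x = 1\n",
    "def f(a):\n    if a:\n        return 1\n    return 0\n",
    "def g(items):\n    total = 0\n    for i in items:\n"
    "        if i > 0:\n            total += i\n    return total\n",
    "def h(a, b):\n    while a:\n        if a > b:\n            a -= b\n"
    "        elif b > a:\n            b -= a\n        else:\n            break\n"
    "    return a\n",
]

# Node types treated as decision points (an assumption, not a standard).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp, ast.IfExp)


def loc(source: str) -> int:
    """Count non-blank lines that are not pure comments."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


def branch_count(source: str) -> int:
    """Crude cyclomatic-style proxy: 1 + number of branching AST nodes."""
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(tree))


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    locs = [loc(s) for s in SAMPLES]
    branches = [branch_count(s) for s in SAMPLES]
    print("LOC:", locs)
    print("branch proxy:", branches)
    print("Pearson r:", round(pearson(locs, branches), 3))
```

Four toy snippets prove nothing, of course; the point is only to show what "the metric correlates with LOC" means operationally. On a real codebase you would run both measures per file or per function and look at the correlation across thousands of units.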

IIRC they believe it's only valid to compare LOC between employees who are doing pretty much the exact same task; however, since LOC is correlated with code complexity, it does provide some measure.

I recommend the book, as it really focuses on the science of computer science.

gruseom · 2012-09-05 · Original thread

The thing about "many good ideas and rules of thumb" is, I've got a few dozen of those of my own! Most of us do. It would be interesting if there were decisive evidence against any of them, but even when I read studies whose conclusions contradict my beliefs, the studies are so flimsy that I find it easy to keep my beliefs.

There does seem to be a recent wave of software engineering literature, exemplified by http://www.amazon.com/Making-Software-Really-Works-Believe/d... (which I haven't read). Are you familiar with this more recent stuff? Does it represent new research or merely new reporting on old research? If the former, are the standards higher?

tokenadult · 2012-07-12 · Original thread

I highly recommend the book "Making Software: What Really Works, and Why We Believe It".

John Graham-Cumming, the author of the article submitted here, has a review of this book on Amazon.com:

http://www.amazon.com/Making-Software-Really-Works-Believe/d...

"This isn't a book about evangelizing the latest development fad, it's about hard data on what does and does not work in software engineering."

nswanberg · 2012-02-21 · Original thread

In the past six years or so since this was written, Greg Wilson has advocated gathering data on this and other programming related topics (watch http://vimeo.com/9270320 if you have an hour, or get a flavor for the arguments here http://blog.stackoverflow.com/2011/06/se-podcast-09/), and has compiled a book of essays (with research) on the topic: http://www.amazon.com/Making-Software-Really-Believe-ebook/d...

I haven't yet read the book, and so don't know its conclusions, if any, on the various methodologies, but if you're curious about research in the area it would be a great start.

The other thing worth pointing out is that, while the Google Yegge describes in the essay might be different in 2012 than it was in 2006, it's still a different level of organization than even the average software development company, let alone a non-software company that happens to employ internal or contract developers. And Steve is writing about developing within Google.

ISBN: 0596808321