IIRC the evidence was that formal code reviews were no more effective than async peer review (but both were effective at uncovering bugs).
Yes, that's my main beef with it. SCRUM could be based on science, but sadly, it isn't. Here's an interesting book: https://www.amazon.com/Making-Software-Really-Works-Believe/...
It has some pretty basic flaws that are pervasive that the reader
should be aware of. First off is the "productivity" measure: the last
time I looked around, this was nearly impossible to quantify for
software development. The author chose SLOC and development time as
stand-ins for productivity. SLOC has a connection with code quality
and time of development, but development time is well known to vary.
In particular, development time in this thesis is linked to results in
a pretty classic "Psychologist's fallacy"; the author generalizes
their experiences as conclusive. This implies in particular that
"development time" in this thesis should be thrown out for meaningful
conclusions, as the sample size is 1. It is, however, an interesting
Another basic flaw is the equating of 'modern' with 'good'. The
author remarks that the main disadvantage of C and Fortran is their age;
this crops up here and there. Workflow tooling in Go and Rust is a
major focus of the developers; C workflow tooling is usually locally
brewed. This does not make C-the-language worse.
More on a meta level, my advisor in my Master's work drummed into me
that I should NOT insert my opinion into the thesis until the
conclusion. So I found the editorializing along the way very annoying.
And finally, and very unfortunately, the experience level of the
author appears to be low in all three languages; this is significant
when it comes to implementing high performance code.
Now, for the interesting / good parts of this experience report.
Standout interesting for me was the Rust speedup on the 48-core
machine. I did not expect that, nor did I expect Go to make such a
good showing here as well.
In general Go performance/memory made a surprisingly good showing
(to me) for this work. I shall have to revise my opinion of it upward
on the performance axis.
I am both surprised and vaguely annoyed by the Rust runtime eating so
much memory (Would love to hear from a Rust contributor why that is
and what's being scheduled to be done about it).
Of course the stronger type system of Rust catching an error that C
and Go didn't pick up is both (a) humorous and (b) justifies the type
community's work in these areas. I look forward to Rust 1.0!
One note by the author that is worth calling out more strongly is the
deployment story: Go has a great one with static linking, whereas C
gets sketchy and Rust is... ??. I believe Rust has a static linker
option, but I haven't perused the manual in that area for some
time. For serious cloud-level deployments over time, static linking is
very nice, and I'm not surprised Google went that route. It's
something that would be very nice to put as a Rust emission option.
Anyway. I look forward to larger sample sizes and, one day, a better productiv
The situation is actually far worse than just varying developer time.
The "28x productivity" (or 10x or Xx ...) claim always triggers warning bells for me. http://morendil.github.io/folklore.html does a pretty good job explaining how the research backing that widely accepted "fact" may be questionable.
I also really like Dan Luu's review of the research behind static typing at http://danluu.com/empirical-pl/ as an example of how a "well studied" claim can still be questionable due to the massive difficulty in evaluating software engineering empirically.
Software engineering is a very human activity and that makes it very hard to measure and quantify.
A book that does a better job is Making Software (http://www.amazon.com/Making-Software-ebook/dp/B004D4YI6G/re...) but as my first link points out, it still has some issues. At least its goal is to get more rigorous in our scientific analysis of software engineering.
They report that high-rigour studies show TDD has no clear effect on internal code quality, external system quality, or team productivity. While some studies report TDD has a positive impact, just as many report it has a negative impact or makes no difference at all. The only dimension that seems to be improved is "test quality" - that is, test density and test coverage - and the researchers note that the difference was not as great as they had expected.
The chapter concludes that despite mixed results, they still recommend trialling TDD for your team as it may solve some problems. However, it is important to be mindful that there is as yet no conclusive evidence that it works as consistently or as effectively as anecdotes from happy practitioners would suggest.
The meta-study is relatively short and can be read online.
The discipline of software development is about as rigorous as parapsychology. To learn what we really know about software, I recommend Making Software: What Really Works & Why We Believe It, The Leprechauns of Software Engineering, and browsing It Will Never Work in Theory.
And no matter what languages, testing frameworks, or methodologies you use, always remember:
Every "bug" or defect in software is the result of a mismatch between a person's assumptions, beliefs or mental model of something (a.k.a. "the map"), and the reality of the corresponding situation (a.k.a. "the territory").
If you want your map to reflect the territory accurately, you'll need more than a few "laws" handed down as gospel.
Name a book that does! The corpus of evidence in our field is quite weak and supports almost no interesting claim you might make about it. For example, there's no empirical data supporting version control.
There are books that purport to derive software development lessons from the research literature, but they're either not very good (because they rely on crappy studies and/or go beyond what the studies support) or not very interesting (because they merely transcribe the results of very narrow studies). An anthology of the literature was published recently, and it falls in the latter category. I bought it to get the state of the art in empirical research and found it contained almost nothing of use to me as a practitioner. Many of the studies were shoddy (e.g. qualitative case histories--a euphemism for "anecdote") and the better ones are so careful to limit themselves to the tiny experiments they did that you can't derive anything from them. I could mention a couple of exceptions, though, if anyone's interested.
I think the root problem is that software development is complicated, so to do rigorous formal studies on it is expensive. The amount of funding available for this work is a drop in the bucket of what it would need to be. In short, the market value of empirical evidence on software development isn't high enough for anyone to pay for it.
 http://neverworkintheory.org/2012/12/12/empirical-evidence-f.... and http://neverworkintheory.org/2012/12/30/why-we-need-evidence.... Notice that the HN discussion of the first post completely failed to answer its question: https://news.ycombinator.com/item?id=4931251.
Chapter 8 "Beyond lines of Code: Do we need more complexity metrics?" by Israel Herraiz and Ahmed E Hassan.
Their short answer is that, in the case they looked at, all the suggested metrics correlated with LOC, so you may as well use LOC as it's so easy to measure.
IIRC they believe it's only valid to compare LOC between different employees if they are doing pretty much the exact same task. Still, since LOC is correlated with code complexity, it does give you some measure.
I recommend the book, as it really focuses on the science of computer science.
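If you want to poke at the "metrics correlate with LOC" idea on your own code, here is a minimal sketch in Python. To be clear about what is assumed: this is not the method from the chapter. The `branch_count` proxy (a rough stand-in for cyclomatic complexity), the helper names, and the three toy snippets in `SAMPLES` are all made up for illustration, and three data points prove nothing. Swap in your own source files to get a number that means something.

```python
import ast
import math


def sloc(source: str) -> int:
    """Count non-blank lines that are not pure '#' comment lines."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


def branch_count(source: str) -> int:
    """Crude complexity proxy (an assumption, not a standard metric):
    1 + the number of branching nodes found in the parsed AST."""
    branching = (ast.If, ast.For, ast.While, ast.BoolOp, ast.ExceptHandler)
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, branching) for node in ast.walk(tree))


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


# Hypothetical toy inputs; replace with the contents of real files.
SAMPLES = [
    "x = 1\n",
    "def f(a):\n    if a:\n        return 1\n    return 0\n",
    "def g(items):\n    total = 0\n    for i in items:\n"
    "        if i > 0:\n            total += i\n    return total\n",
]

if __name__ == "__main__":
    sizes = [sloc(s) for s in SAMPLES]
    branches = [branch_count(s) for s in SAMPLES]
    print("SLOC:", sizes)
    print("branch counts:", branches)
    print("Pearson r:", round(pearson(sizes, branches), 3))
```

A high r on your own codebase would be in line with what the chapter reports for the case they studied, but it would not tell you that either number measures productivity.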
There does seem to be a recent wave of software engineering literature, exemplified by http://www.amazon.com/Making-Software-Really-Works-Believe/d... (which I haven't read). Are you familiar with this more recent stuff? Does it represent new research or merely new reporting on old research? If the former, are the standards higher?
John Graham-Cumming, the author of the article submitted here, has a review of this book on Amazon.com:
"This isn't a book about evangelizing the latest development fad, it's about hard data on what does and does not work in software engineering."
I haven't yet read the book, and so don't know its conclusions, if any, on the various methodologies, but if you're curious about research in the area it would be a great start.
The other thing worth pointing out is that, while the Google Yegge describes in the essay might be different in 2012 than it was in 2006, it's still a different level of organization than even the average software development company, let alone a non-software company that happens to employ internal or contract developers. And Steve is writing about developing within Google.