Another long-ass post on software estimates

I had intended to clarify my position on this excellent parable on software estimation months ago, but forgot about it. Then my co-worker Chris Nelson wrote a lovely piece and asked for comment, which I provided.

It, uh, proved to be too long for WordPress’ comments system. So I emailed it. To a DoNotReply@ address. Cascading failures. So I reprint my comment here.

Software is different from other forms of engineering in some fundamental ways.
For one, we rarely begin from a fully formed specification, a shared knowledge base and a known set of tools. Instead, we’re frequently called upon to use brand-new tools in a brand-new domain to meet a spec drawn on the back of a greasy paper plate from the pizza shop. In a road-building analogy, you might begin without a clear understanding of whether you’re building a footpath, an elevated highway, or a runway, and you may have one expert in each of these on the team. And it’s entirely likely you’ll eventually realize that what you’re building is a subterranean parking garage.
It’s also different in that you can build partial systems and change them rapidly while they’re in use. Nobody would build a bridge that goes one way and takes traffic before all its pylons are secured, but in software we commonly do just that — show the facade of a fully constructed bridge and open traffic to a single lane, just to see if our bridge provides value to travelers. Maybe they wanted a tunnel. We can change that in the middle of a commute.
Finally, the velocity (work performance) of a given engineer can vary wildly from day to day and from task to task. Part of this is different expectations of what is to be built, part of it is how much of the problem is “in cache” (awareness of code, awareness of the construction process, awareness of the domain) and part of it is simply how people work. There are many software enterprises that heap accidental complexity on developers in the form of silly processes, bad tools or unwise technical requirements delivered by nontechnical employees.
There’s an equation I like to use with respect to estimates:
delivery date = start date + (work / velocity)
In an industry with “blue book” values capping how long each job will take, and a consistent minimum velocity across employees, you can have very reliable delivery dates. In software, we’re actually very good at identifying blue-book-style work — so good that the blue book value is 0: we turn the generation of trivial work into a scripted task.
This means that any remaining work is necessarily highly uncertain, and that uncertainty can blow up any of the three variables. You can treat it as a coefficient on the work term, but not one you can solve by simply doubling estimates — in the extreme, the coefficient of uncertainty approaches infinity.
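Here’s that equation as a minimal Python sketch. The numbers are invented for illustration; the point is only that the uncertainty coefficient multiplies the work term, and there is no safe constant to pick.

```python
# delivery_date = start_date + (work / velocity), with an uncertainty
# coefficient applied to the work term. All numbers are invented.
from datetime import date, timedelta

def delivery_date(start, work_days, velocity, uncertainty=1.0):
    return start + timedelta(days=(work_days * uncertainty) / velocity)

start = date(2016, 5, 1)
print(delivery_date(start, work_days=30, velocity=1.0))                 # 2016-05-31
print(delivery_date(start, work_days=30, velocity=1.0, uncertainty=2))  # 2016-06-30: doubling helps...
print(delivery_date(start, work_days=30, velocity=1.0, uncertainty=10)) # 2017-02-25: ...until it doesn't
```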
And it’s very difficult to scale velocity. Typically we solve this by adding new people to the team, but the incoherence penalty of bringing their velocity up to par degrades the total velocity of the team such that it is now further behind than when it started. You can ask people to work longer hours, but the increased defect rate and decreased performance of overtime introduce another incoherence penalty after about 9-10 contiguous hours. You can improve velocity by organizing to optimize engineer performance — this can be a huge deal; read “The O-Ring Theory of DevOps” — but you will eventually hit a limit, and velocity improvements take a long time to materialize.
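To make the incoherence penalty concrete, here’s a toy model. The coefficients are invented and not a real staffing formula; they just show the shape of the problem.

```python
# Toy model of the incoherence penalty. All coefficients are invented.
def team_velocity(veterans, new_hires, vet_output=1.0,
                  hire_output=0.25, mentoring_tax=0.5):
    # New hires produce a fraction of a veteran's output at first,
    # and each one consumes veteran time in onboarding and review.
    return (veterans * vet_output
            + new_hires * hire_output
            - new_hires * mentoring_tax)

print(team_velocity(veterans=5, new_hires=0))  # 5.0
print(team_velocity(veterans=5, new_hires=3))  # 4.25: a bigger team, now slower
```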
So velocity should be considered more or less invariant. The only thing you can do is pick one of the other two variables and make it invariant as well, letting the remaining one blow up. If you don’t do this, you should expect both the work (including its quality) and the delivery date to inflate.
Scrum and Kanban, in the abstract, attempt to solve this problem by setting a concrete deadline but allowing the actual work to vary according to available velocity. This is where your single-lane bridge is developed, a lane at a time.
“Estimate-free” development is common when the work absolutely, positively MUST be done, regardless of schedule. In place of estimation, a bare-minimum work scope is defined and progress is regularly reviewed — but a release date is neither known nor announced until it becomes obvious to the team. (Generally there IS a deadline for this class of work, but it’s more a checkpoint date for verifying the assumptions that drive the work, not an expectation of a working product.)
In the above parable, engineers were handed a due date, which became an estimate, and they began a work plan. As the work continued, the estimate was revised — but the work plan was not, even though it became obvious they could not meet the date. Human effort cannot make up the difference.
In a Scrum version of the parable, engineers would start walking, walk for one day uninterrupted, and then report their progress back. A new plan would be developed — maybe, instead of walking, you take a bike. Or maybe, instead of walking the coastline, you follow the highway. After one more day, work would again be compared to the plan, and so forth. Eventually, everybody would realize that what’s really important is that the team be in LA on the end date, and somebody would suggest they take a bus. The rest of the walk would be logged as technical debt and eventually abandoned, and life would move on.
In an estimate-free version of the parable, the date doesn’t matter but the walking of the coastline does. After about a week, there would be a meeting to discuss progress and remind everybody exactly what matters about the walk — maybe seaside views matter but climbing every peak doesn’t, maybe we bypass part of the coast by hiking along the 101 from San Luis Obispo to Santa Barbara — and the team would go back to walking. Management would dispatch outfitters to keep the team motivated and productive every day, and eventually they’d complete the route — but it’s going to take 20-30 days, dependent on uncontrollable externalities like the weather, injuries and trail quality.
By the way, it’s not possible to eliminate the unknown externalities that make estimation impossible, but a good prototype, built under a tight time constraint, can help identify them earlier. If you plan before you prototype, the plan is pretty much guaranteed to be wrong.

The iPad Pro 9.7″ Smart Cover is a hot mess


As a software architect, I do a lot of modeling. Whiteboarding and pencil-on-paper are my preferred media, but there are times when the visceral, ephemeral nature of these forms doesn’t cut it. Like when working with remote teams. Or giving a talk in a room with no whiteboard, which is sadly the majority of rooms.

I’ve experimented with iPad-first modeling in the past with poor results. The software isn’t terrible (I especially like the apps Jot and Grafio), but the hardware interface doesn’t lend itself to intricate models, or even legible text, when applied with the speed of a marker. An Adonit Jot (no relation to the software) multi-touch stylus improves the effective resolution but only by about 2x.

For this reason I am demoing (bought, but might wind up returning or reselling) an iPad Pro 9.7″ and Apple Pencil. The 32GB model, plus Pencil, runs $698, roughly $200 more than the 64GB Air 2 I bought two months ago. This would be worth the money if it allows me to produce expressive software diagrams quickly.

I enjoy the iPad and Pencil — they are good enough that the barrier is now my skill with the software. The quad speakers are loud and accurate. And the screen is the best I’ve ever seen, even at very low brightness levels.

However: the new Smart Cover is a ridiculous mess.

Earlier Smart Covers dramatically enhance the iPad experience. In one thoughtfully designed sheet of premium material you get an origami-like stand, some drop protection for the screen, an unobtrusive profile when closed and, best of all, the microfiber interior. iPads suck up fingerprints like crazy, and having a cleaning cloth attractively attached to the unit helps improve effective screen quality. You could even argue that it’s a security measure (given that lifting a fingerprint could give an attacker access to the device). Besides, they look cool, and flipping the magnets on and off is a satisfying tic.

The Smart Cover for the Air was even better than the original: much closer fitting, with a rubber hinge that was less attractive but more durable and seemed to pick up less grime. It also avoided the weird hanging edge beneath the earlier iPad’s curved right rear bezel.

So what’s wrong with the new one? Well, Apple’s engineers apparently expect every Pro buyer to invest not just in a Smart Cover — which is all a responsible user should need to protect their device, and which leaves the exquisite thinness and stability of the aluminum design exposed — but also in a $69 silicone case. And to accommodate this expectation, the Smart Cover for the Pro has just a little bit of extra tolerance all over it: meaning that if you don’t buy the case, it fits like complete shit.

What’s more, it’s now priced $10 higher than the standard Air cover. And for some bizarre reason, Air covers are incompatible with the Pro, to the point that Apple reversed the magnets on the iPad Pro’s surface, so an Air cover actively repels from the edge rather than clinging to and protecting it.

The effect here is something like an ill-fitting suit, with the added advantage of an annoying edge right where the fingers of your left hand go, blocking access to the volume controls. In an upright position, the iPad reclines much further than the iPad Air. And there’s a new, far more unsatisfying iPad tic I’ve adopted: trying to smooth out the lumpy back surface.

Here’s a quick gallery comparing the original leather Smart Cover on an iPad 3, a $39 silicone Smart Cover on an Air 2 and a $49 Pro Smart Cover (in dark blue):

I mean, it’s not a pure cash grab. I suppose if I wanted a big, nasty bit of rubber on my iPad, I would want it to work well with my cover. I guess workers in the BYOD/pop-up shop economy want something durable. And of course there’s the iPhone 6-style jutting rear camera lens, which means this excellent drawing tool doesn’t sit perfectly level on a table without a case.

But come on. Lots of us buy thin, attractive, lightweight devices precisely because they’re — can you believe it — thin, attractive and lightweight. We would like a well-fitting cover so we can carry them without getting our disgusting mammalian oils all over the world’s best display. And we’d prefer to not pay a $10 premium for an essential accessory that makes a great device look clownish.

Now there are other cases, I know — but because of the patents on this technology, the only magnetic cover you can buy for the iPad Pro is this one.

Pencil + iPad + Smart Cover + silicone case: an $816 ecosystem to add accurate drawing to my $538 ecosystem.


But for crying out loud — EXPLAIN THIS SHIT. As if charging the Pencil this way wasn’t dumb enough, how is the charging interface TOO LONG for the Lightning port it plugs into?

Test first, optimize last

Here’s the workflow I prefer these days when implementing new work, be it a change to code, a change to process or a design for a system:

  1. Specify your intent – You need to explain what you are building, to yourself and others. Doing so with an executable specification framework will also allow you to test that your code meets your expectations, given your assumptions, even as you make changes (see the sketch after this list). A basic layout (in comments) or a set of tasks may also help you (and others) record assumptions and be sure the specs were met.
  2. Make it work – well enough to meet your existing functional requirements. Do not invent requirements or obsess over clarity, details or performance. Just meet the spec.
  3. Make it good – Refactor your code until it’s clear enough that you can pass off maintenance or come back to repair and extend the functionality in a year’s time. If it’s for a customer, make it handsome enough that they will want to use it. If it’s for a service, make it succeed (and fail) in ways that are auditable and consistent.
  4. Make it fast – fast enough to meet your expected capacity needs. Test it! With modern hardware prices, serial performance is rarely a concern, and a spot check is good enough.

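Here’s a minimal sketch of phases 1 and 2 in Python, assuming pytest as the executable specification framework. The slug() helper is invented for illustration; the point is that the specs exist, and run, before the implementation is touched.

```python
# Phase 1: specify intent. These tests are the executable spec for a
# hypothetical slug() helper, written before the implementation.
import re

def slug(title):
    # Phase 2: make it work -- just enough to satisfy the spec below.
    # Phases 3 and 4 (make it good, make it fast) come later, as
    # separate passes, once these expectations hold.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def test_spaces_become_hyphens():
    assert slug("Test first, optimize last") == "test-first-optimize-last"

def test_punctuation_is_dropped():
    assert slug("Hello, World!") == "hello-world"
```
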
I don’t know about you, but the ordering of these phases is almost precisely the opposite of how I worked when I was junior — to the extent that they were discrete phases at all. I’d almost always start with the view that the implementation of the feature (still hazy in my mind; it’ll materialize as I build it) must be optimally and provably fast, complete with short-circuit logic and extensive variable re-use. Then I’d focus on building as much flexibility into the functionality as I could — more is better, more features for re-use! Finally, as the deadline whizzed past, I’d wire up the feature before dropping it on QA’s doorstep like a cat with a dead bird.

The danger of producing software in this manner is that the activity that provides the highest value — testing the feature — arrives late in the process with little direction and receives the least attention. The activities that matter less — implementation choices — consume the majority of the software budget. Code designed to be clever and efficient is a more complex system to debug, and fixes are likely to strip out either or both of those qualities. The end result is frequently a fast, extensible piece of shit.

Inverting the pattern is sort of like eating your vegetables first and your dessert last: you ensure health (a proof of correctness) without sacrificing enjoyment (the satisfying feeling of legible, optimized code). But the real key to this workflow is keeping the phases separate and actually investing time in each one. Knowing what you are doing, and knowing that you can always clean up and optimize later, is what lets you focus on an accurate implementation. If you always make it good, and if necessary make it fast, before calling a cycle complete, you should carry less technical debt from your implementation choices.

Now, there are times when the entire reason for a change is to optimize some quality, such as API clarity or system performance. The same workflow applies there, but under these circumstances the specification should describe the improvement you’re hoping to make (such as “improve request throughput by 10x from baseline”), and the out-of-phase optimizations to avoid are those that aren’t directly related to the spec or that contribute only lightly to it.
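For example, such a spec can itself be executable. This is a hypothetical sketch: the handler is a stand-in and the baseline number is invented, but it shows the improvement expressed as a test rather than a wish.

```python
# A hypothetical throughput spec, expressed as a test. The baseline
# figure and the handler are stand-ins for illustration.
import time

BASELINE_RPS = 120  # invented: previously recorded requests/second

def handle_request():
    pass  # stand-in for the real handler under optimization

def test_throughput_improved_10x_from_baseline():
    n = 10000
    start = time.perf_counter()
    for _ in range(n):
        handle_request()
    rps = n / (time.perf_counter() - start)
    assert rps >= 10 * BASELINE_RPS
```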

Finally: when working with VERY old code, I tend to add an additional phase before specification: clean it up. A modern IDE will offer a number of safe, non-functional refactoring operations (such as identifying and removing dead branches, removing stub documentation, renaming poorly named methods or adjusting scope) that take little time to apply and make functional rework more successful.
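A tiny example of the kind of safe, non-functional cleanup I mean (the names and the dead branch are invented):

```python
# Before: a poorly named function with a stub docstring and a dead branch.
def proc(d):
    """TODO: document."""
    if False:  # dead branch left behind by an old feature flag
        return None
    return d.strip()

# After: identical behavior; clearer name, stub and dead code removed.
def normalized_title(text):
    return text.strip()
```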