Inventor's Paradox

Last week in Intertwingler.

Sep 20, 2023

Last week I wrote a configuration vocabulary for Intertwingler, my nascent engine for creating dense hypermedia networks. This was a reluctant decision, but now having done it, I believe it was not only the right move, but one I should have made weeks ago. What it means for the system is that the configuration now moves independently from the executable code, and is usable in future Intertwingler ports to other programming languages. It also provides a much more expressive and less restrictive repertoire for telling the engine how to behave, and as a byproduct, has made it reconfigurable while in use, something I was planning but hadn’t considered a priority. In a previous issue I likened this project to writing a poem, in the sense that the ultimate product won’t consist of very many words, but each and every word will be thick with meaning. I have remarked numerous times about Intertwingler that there’s almost nothing to it, but the part that isn’t nothing is a precision instrument.

Intertwingler Configuration Vocabulary: Properties — The Intertwingler Configuration Vocabulary from the point of view of its *properties* (rounded boxes).

Indeed, most of Intertwingler—especially now—is just a receptacle for declarative configuration data (leaving aside, for the moment, the content it is meant to manage). This makes the procedural aspect of it extremely abstract, almost like an interpreter for a domain-specific language. In a way that’s kind of what it is. What this means is that to get the code right, you have to write out a whole schwack of configuration. What that means is you have to decide how you’re going to represent said configuration, and hash out all the things you’re going to need the configuration to say.

The Paradox in Question

This brings me to why I decided to write about this matter. In my career—which has admittedly been skewed, if not squarely in the zone of bespoke, one-off information infrastructure, then definitely in that general direction—I have found a certain phenomenon to hold way more often than one would imagine it has any business doing. This is a thing called the Inventor’s Paradox. Coined by the mathematician George Pólya in the 1940s, it refers to the situation where given two plans, one conservative and one ambitious, it’s the ambitious one that actually turns out to be more likely to be successful.

In mathematics as in programming, the goal is to simplify the representation of a system so it’s easier to get your head around. Rather than working to solve a specific case, solve a general case that implicitly nets the specific one. This often entails introducing new concepts and going abstract, which has the unfortunate optics—at least in programming—of “overengineering”, or worse, going “off-task”.

I bring this up because while I’ve been witness to as many boondoggles as anybody would over a two-plus-decade-long career, I actually suspect more time and resources have been lost due to not being aggressive enough.

The original plan for my Summer of Protocols project was to do a minimal set of interventions on an existing piece of software, to get it into a serviceable state in time for uptake by the affiliate researchers when they started halfway through the summer, at which point I would switch into support mode, and work on the essay—the other half of the deliverables. This turned out to be incompatible with the expectations of the program, as it was the essay component that was expected at the halfway mark, a fact that wasn’t clear at the outset. I also estimate that about a quarter of the notional available hours were consumed by group activities of one species or another, that with the exception of the week-long retreat, were concentrated in the front half of the program.

Since this was the inaugural Summer of Protocols, there was a fair bit of pivoting and improvisation. I imagine the process will get more dialed in over subsequent years.

I had anticipated some of the available time to come right off the top, although not as much as ended up being taken. I hadn’t budgeted to have a circulating draft of the essay done by the halfway mark (I had something started, but ultimately submitted 5,500 brand-new words disgorged three days before the end of the program), and I was definitely wrong about the original development strategy.

The problem was, it was just too damn parochial. The strategy was couched in “how do we fix this piece of software?” versus “what is the desired behaviour, and what conceptual structures are necessary to describe it?” The original plan was preoccupied with adapting an existing implementation. After all, the received wisdom is that you should never attempt a full rewrite. Ironically, when I finally ship this thing, very little of the code I started with in May will remain. That is, I probably spent more effort preserving code I will ultimately end up deleting than I would have spent starting a new project from scratch.

That said, there is the other matter of the fact that it took me eleven entire weeks to finally warm up to the idea of an effective rewrite (and rename, which had an unexpected rejuvenating effect on the project). And once I did, I couldn’t even touch it for a while: week 12 I was in Vancouver visiting family, week 13 was the retreat in Seattle, and I spent weeks 14, 15, and 16 responding to my experience at the retreat, which was that nobody understood what the hell I was working on.

It was not obvious how not obvious this project was. One thing I think I did succeed in doing over the summer was make it more legible. I have, for instance, convinced myself that it’s not too misleading to say that Intertwingler is an engine for making websites in the sense that WordPress is an engine for making websites. Even though Intertwingler is nothing at all like WordPress, it fills a similarly-shaped hole—at least one that somebody who has worked in an office in the last ten or twenty years would recognize.

The Need for Passive Reporting

A perennial irony of trying to create a novel piece of software is that until you have a complete-enough instance of it in your possession, trying to describe the problem it solves is a futile exercise in hand-waving. Why? Because if you could describe the project satisfactorily in terms of something that already exists, an entirely sensible question is why are you writing a novel piece of software instead of using the already-existing one? There’s a canned answer of course: that I’m trying to accomplish something that no✱ extant piece of software is capable of, and I’m actually rather annoyed that it doesn’t exist already, and that the task of dragging it into being has apparently fallen to me.

✱ The kind of argument that elicits this response also skates over the fact that virtually all software uses other software in some capacity or other, whether as a direct dependency or some other relationship, like a build tool or operating system. I write novel software when there is a hole in the universe of a recognizable shape, and I write my own takes on existing capabilities when I assess that there is some qualitative idea missing from the current crop of offerings. Given that, there is unambiguously a horizon across which I don’t have enough of a beef with what’s already on offer to go and intervene. In contrast, the genuine juche, unabomberesque solipsist would insist on writing their own material all the way down the stack. In other words, when I say “no other software does this”, I mean the topmost layer. It’s also a basic courtesy to give people the benefit of the doubt when they’re trying something they consider to be new, especially since if they’re wrong, it’s their time, not yours, that has been wasted.

The dilemma, then, is that every hour you spend trying to explain what you’re working on is an hour you could be spending working on it. The ideal position, of course, is to just finish whatever it is before anybody has an opportunity to ask about it. That constraint, however, puts a sharp upper bound on what can be accomplished. In my experience, it’s challenging to invest a significant fraction of one’s time on an endeavour without drawing attention to it. If you’re getting paid for said endeavour, then you’re absolutely going to have to account for how you spend your time. Even a self-funded effort, though, quickly bumps into other people, and the less legible it is, the more effort you have to spend explaining. The frustrating thing about novel software in particular is that what you’re trying to create is a specific pattern of behaviour, and there’s no substitute for just showing it in action. (Again, if there was, you could just use that, and not have to bother making your own.)

Looking back on the summer, I spent a big chunk of time trying to explain and account for what I was working on, and that naturally ate into the time available to actually work on it. I should note that 75%—25% of the hours having been absorbed, as mentioned, by group activities—is a relatively high baseline to begin with. You wouldn’t get that high a ratio in an ordinary corporate setting. I didn’t track fine-grained hours so I don’t know what proportion exactly, but let’s say conservatively that I spent half of what remained, or 37% of the total hours, just on reporting. The video I did near the end of the program took over two solid weeks by itself, or 11% of the total project hours, to prepare.
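The percentages in this paragraph can be laid out as plain arithmetic. This is only a back-of-envelope sketch using the figures stated above; the implied program length at the end is an inference from "two weeks, or 11%" and not a number the text gives.

```python
# Shares of total program hours, as stated in the paragraph above.
group_share = 0.25                       # absorbed by group activities
baseline_share = 1 - group_share         # left for the project itself
reporting_share = baseline_share / 2     # "half of what remained"
building_share = baseline_share - reporting_share
video_share = 0.11                       # the video, about two weeks

# Two weeks at 11% implies a program length. This is an inference,
# not a figure stated in the article.
implied_weeks = 2 / video_share

print(f"baseline for project work: {baseline_share:.1%}")
print(f"reporting:                 {reporting_share:.1%}")
print(f"left for building:         {building_share:.1%}")
print(f"video share of reporting:  {video_share / reporting_share:.1%}")
print(f"implied program length:    {implied_weeks:.1f} weeks")
```

Half of 75% is 37.5%, which the text rounds to 37%. On these figures the video alone accounts for a bit under a third of the reporting time.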

I don’t regret doing it, but it took a whole week to write the script, and several days to record all the B-roll. In between there somewhere was shooting and editing. Outside of a casual TikTok or YouTube spot, there is nothing quick about video. Now, I have experienced some value from streaming on Twitch and YouTube, but the more successful streams consist mostly of chatting and don’t make much progress. Actual substantive work is boring as hell to watch.

Any time you have to explicitly prepare a report of any kind is time subtracted from advancing the very thing you are reporting on. It follows then that you can gain a lot of leverage if you can do at least some of the reporting passively. The infamous Green Grid of GitHub is such an example, although it only counts that you’re working—a figure that, I add, can be gamed—not what you’re working on.

The green grid also only counts commits to main branches, so if you spend weeks on a side branch—a not uncommon pattern of activity—none of that work shows up until you merge it back into main, at which point it shows up retroactively. (That’s got to make for some weird incentives.) On the flip side, I have personally interacted with semi-automated accounts piloted by random people pushing trivial changes to my projects so they can clock both a green square and an authorship credit. This, in turn, is because (naïve) software companies hire based on how green your grid is.

This need for passive reporting is actually one of my main motivations for making Intertwingler. One of the first things I plan to do with it is a remake of my IBIS tool. The prototype, which I’ve had for a decade, is missing essential functionality on top of being too damn hard to use, and too damn sclerotic to fix. What it does afford, ideally, is the logging of the planning process, the decisions that go into the plans, and the rationale for those decisions. It manifests as a website you can go to—and crucially, send stakeholders to—and click through the issues that the strategy is designed to address, the arguments for one plan over another, and most importantly, the new information that arises in carrying out the plan. What’s missing from the prototype is a means of engineering time.

While it is a truism that things take longer than you expect, and that you don’t have as much time as you think you do, it’s difficult to gauge how much. In fact, it costs time to figure out how much time you don’t have, or how much time you think a task is going to take. And then there’s the small matter of processes that generate questions that have to be answered to move forward, which have an even chance of generating questions of their own. I have actually given up on time estimates for software development, because it’s clear as day that a process that generates questions that generate more questions is not going to be statistically well-behaved enough for a conventional estimate (which depends on the central limit theorem holding) to be meaningful.
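To make the "questions that generate questions" point concrete, here is a minimal simulation sketch of a branching process. The modelling choices are assumptions and not anything from the text beyond the "even chance": each answered question spawns two follow-up questions with probability 0.5, and the count is capped so the simulation always terminates.

```python
import random
import statistics

def total_questions(rng, p_spawn=0.5, children=2, cap=10_000):
    """Count every question raised, starting from a single question.

    Assumption: each answered question spawns `children` new questions
    with probability `p_spawn`. The count stops at `cap`.
    """
    pending = 1   # questions waiting to be answered
    total = 0     # questions answered so far
    while pending and total < cap:
        pending -= 1
        total += 1
        if rng.random() < p_spawn:
            pending += children
    return total

rng = random.Random(2023)
samples = [total_questions(rng) for _ in range(5_000)]

print("median:", statistics.median(samples))
print("mean:  ", statistics.fmean(samples))
print("max:   ", max(samples))
```

Under these assumptions roughly half of the runs end after a single question, while a small fraction run all the way to the cap. The mean is dominated by those rare runaway cases, which is the kind of distribution for which an average-based estimate tells you very little.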

Upending Project Dynamics

Or perhaps more accurately, as I argue in an older video, we flip the bias of the estimate. It’s actually pretty cheap to create a mental tally of at least how much time a particular task will take, but an accurate estimate of the time it will take at most rapidly approaches the time it takes to just do the damn thing. I established this years ago with a technique I called behaviour sheets, which produced spectacularly accurate time estimates, with one unfortunate caveat: if you started a behaviour sheet first thing Monday morning, you could tell by lunchtime on Wednesday that the software module in question would be finished by end of day Friday. In other words, despite an uncanny accuracy, it took as long to estimate the job as it did to do it.

Because of the way software is structured, I anticipate this relationship is slightly sublinear. I doubt that the prescriptions and proscriptions that make up a behaviour sheet grow in number as quickly as the lines of code they correspond to. Some of them, moreover, will be generic enough to be reusable, and one could conceivably build up a library of them over time.

The thing to recognize here is that an accurate “at-most” estimate depends on eliminating all but the normal (as in distribution) amount of surprise. These recursive, question-generating questions are antithetical to that. Yet, they stand in the way of success. So what to do?

The first thing is to recognize that exhausting the source of questions isn’t just so-called “requirements analysis” or even user research. It’s impossible to decouple from the job because it is the job, arguably most of it. This is where the cheap, “at-least” estimates come in. Every piece of software, down to a fine-grained slice, can be ascribed a value per run. The code that makes direct contact with paying customers is obviously the easiest to count, but software depends on software so it should always be possible to compute the contribution of an individual module or function—that is, if you had the infrastructure to do it.

But say you did have that infrastructure, you could look at the problem of resource control for software development completely differently. Rather than try to figure out what a particular software intervention is going to cost—time of course being a linear-ish function of money—you try to figure out what it’s worth. You allocate a fixed amount of resources and compute the best bang-for-buck interventions, the ones that lie in the sweet spot of both valuable and expedient. You use a risk-weighted discounting method to figure out how many runs the software will need to earn its keep, how long that is likely to take, and the associated probabilities—that is, a differential equation, but we have computers for that.

rNPV, or risk-adjusted net present value, is where I’d start for a model. If you had the software today and could start selling✱ it, what does the probability curve look like for earning at least X dollars by Y date? Now how would that change if we shifted the whole thing Z days into the future?

(✱ “Selling” here is very broadly conceived, and many software interventions will have to have their value per run determined some other way. Not impossible in principle, but I acknowledge that there will be the need for some creative proxy metrics from time to time. Generative AI is actually interesting in this regard, because it has relatively well-defined unit economics. This is because training the models and subsequently running them has a palpable cost that dwarfs any development work, so in that sense it’s much more like a conventional construction or manufacturing project. It also doesn’t hurt them that they literally charge a fee per execution, so the value per run on the software is simply the net of that.)
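As a rough illustration of the rNPV idea, the sketch below discounts probability-weighted monthly cash flows and then shifts the whole schedule Z days into the future. The function name, the monthly granularity, and every number in the example are assumptions for illustration. Note that this yields a single expected value, not the full probability curve the paragraph asks about.

```python
def rnpv(cash_flows, success_probs, annual_rate, delay_days=0, upfront_cost=0.0):
    """Risk-adjusted net present value of a monthly cash-flow schedule.

    cash_flows[i] arrives at the end of month i + 1 and is weighted by
    success_probs[i], the chance that it materializes at all.
    delay_days pushes every cash flow later by that many days.
    """
    total = -upfront_cost
    flows = zip(cash_flows, success_probs)
    for month, (cash, prob) in enumerate(flows, start=1):
        years = month / 12 + delay_days / 365
        total += prob * cash / (1 + annual_rate) ** years
    return total

# Hypothetical figures: value per run times expected runs per month.
value_per_run = 0.05
runs_per_month = 20_000
flows = [value_per_run * runs_per_month] * 12
probs = [0.6] * 12

ship_now = rnpv(flows, probs, annual_rate=0.10)
ship_later = rnpv(flows, probs, annual_rate=0.10, delay_days=90)
print(f"ship today:      {ship_now:,.2f}")
print(f"ship in 90 days: {ship_later:,.2f}")
```

The gap between the two figures is one way to put a price on a delay. In practice the probabilities would presumably also change with the delay, which this sketch ignores.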

Focusing on value per run, and treating it probabilistically, enables us to look at time—or rather, discretized, contiguous chunks of time—as casino chips. Every unit of time—I have in the past argued for a four-hour block I call a cell—is like a bet, or even more accurately, like a move in a board game, because each move changes the state of the whole system. As you make successive moves, some possibilities get foreclosed while new ones open up. Moreover, since each software intervention has a valuation, and a great deal of the cost of software development is answering questions about how the software should behave, then every question, or rather its answer, has a value as well.

What remains with this strategy is to allocate enough time so you can make the best of your available moves. The playbook here is analogous to the tried-and-true method of venture capital: half the bets remain underwater, the median bet breaks even, and five to ten percent of the bets go to the moon. I don’t have enough empirical data working under this regime yet to know what the exact distribution is, but a safe starting assumption is that the value curve of this kind of activity is about as steep as those of the VCs that fund it. You need to put up at least enough money to buy enough raw practitioner time, such that there are enough “moves” that one intervention out of a pool of candidate interventions (which you also generate) pays off bigtime. That figure will likely vary with different categories of software development, but in principle the minimum viable spend—time as well as money—should be calculable.
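The "minimum viable spend" can be approximated with a one-line probability calculation. The sketch below assumes each bet pays off big independently with the same probability, which is a simplification: the board-game framing above says each move changes the state of the whole system. The function name and the 90% confidence target are illustrative choices.

```python
import math

def min_bets(p_win, confidence):
    """Smallest n such that 1 - (1 - p_win)**n >= confidence,
    assuming independent bets with equal odds."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_win))

for p_win in (0.05, 0.10):
    n = min_bets(p_win, confidence=0.90)
    print(f"p(big win) = {p_win:.0%}: at least {n} bets for a 90% chance of one")
```

With the five to ten percent range quoted above, that works out to 45 bets at the low end and 22 at the high end. Multiplying by the cost of a bet in cells would turn this into a budget floor.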

So we cabin the risk of cost overruns and benefit shortfalls by fixing the spend at a value that should yield a return, but also by stipulating that only the value matters, rather than a specific predefined set of deliverables. Another way to say this is you can have it fast, cheap, and good, as long as you aren’t too picky about what “it” is.

This may be counterintuitive to some, who practice software development by gathering requirements and then working to satisfy them. The problem is that a “requirements gathering phase” is under persistent downward pressure because everybody wants to “get building”. As such, these phases are very rarely adequately resourced, because if they were, they would literally be the job, as the requirements don’t stop accruing just because the requirements phase does. Furthermore, requirements analysis is almost always articulated in terms of features (has X or must be able to do Y), rather than behaviour (must itself do X, must not do Y). So you get a scope and write the code, and then the bug reports start rolling in, saying it does X but shouldn’t, or doesn’t do Y but should. The Agile people assert that this is what iterating is for, and they’re right some of the time, but many of the questions about what the software should or should not do can be settled without writing even a single line of code.

Computing Time

The remaining wrinkle of this scheme has to do with the fact that time on the meter is embedded in dates on the calendar, and the shape of the time matters. Yet, the interface to other people for time on the meter is dates on the calendar. So, given an interval of dates on the calendar,

how much time (on the meter) do we actually have?
to do precisely what?

I’m not talking about billing hours here (per se), but rather the configuration of time as a resource. If a unit of substantive work is four contiguous hours (we can quibble over the exact number but I find it’s about what I personally need), you may only get one of those a day. Heck, you may not even get one of those in a day, with meetings and other responsibilities and such. The point is to be able to plot that—to be realistic about how much time—how many moves—are actually available in the larger envelope of dates on the calendar. Time, moreover, is a pernicious conflation of input and outcome. What ends up happening is events in the environment end up torpedoing the number of available moves, giving you less available time as an input, while the deadline stays the same.
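A small sketch of what "plotting" available moves could look like: count how many whole four-hour cells fit into the free gaps of a working day. The function name, the 09:00 to 18:00 working day, and the meeting times are all assumptions for illustration.

```python
from datetime import datetime, timedelta

CELL = timedelta(hours=4)  # one unit of substantive work

def count_cells(day_start, day_end, busy):
    """Count whole cells that fit in the free gaps of one working day.

    busy is a list of (start, end) datetime pairs, in any order.
    """
    cells = 0
    cursor = day_start
    for start, end in sorted(busy):
        # Clip each busy interval to the working day.
        start = min(max(start, day_start), day_end)
        end = min(max(end, day_start), day_end)
        if start > cursor:
            cells += (start - cursor) // CELL
        cursor = max(cursor, end)
    if day_end > cursor:
        cells += (day_end - cursor) // CELL
    return cells

def at(hour, minute=0):
    return datetime(2023, 9, 20, hour, minute)

one_meeting = [(at(11), at(11, 30))]
two_meetings = [(at(11), at(11, 30)), (at(15), at(15, 30))]

print(count_cells(at(9), at(18), []))            # empty calendar
print(count_cells(at(9), at(18), one_meeting))
print(count_cells(at(9), at(18), two_meetings))
```

In this example a nine-hour day holds two cells when empty, one with a single half-hour meeting at 11:00, and none once a second meeting lands at 15:00. One hour of meetings costs eight hours of usable moves.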

This may be my own experience bias, and not to claim that such a situation can’t happen, but I have never seen a software project where the exact completion deadline genuinely mattered. For one, because “complete” is meaningless in software, there is only the question of whether or not the current state of your work is deployable. For another, the real risk of software development only begins once you deploy it, because only then can the real damage happen. And then there’s the matter of the marketing department. At the very least, in my experience, executives like to noodle with a shippable product for a while before they pull the trigger, so there is quite a bit of fudge between when the thing is ready to ship and when it actually does.

A development process like the one I’m describing would need to be instrumented out the yin-yang to be viable. Every issue, problem, or question has a value, and every response to these has an “at-least” estimate for cost. This justifies spending some time investigating each, whether you actually get around to it or not. That is, the highest net-valued issues guide the process. The respective valuations and costs of every issue raised and response proposed are subject to change, potentially several times a day, per person working on the project. Managing this is not something you could pull off with a spreadsheet, or even conventional project management software, because issues, bugs, test cases, documentation, the calendar, and whatever the heck you’re working on at the moment are so heavily…intertwingled. In particular, you’re going to need to know:

how many moves you have left in the allocation,
how many moves you’ve spent so far and the results you’ve gotten out of them, i.e., what is your current position,
the aggregate (discounted) value of the outstanding issues, and
what interventions can be attempted with the remaining available moves—i.e., how much value you can still create.
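The four items in this list map onto a fairly small data structure. The following is a hedged sketch only: the class names, fields, ranking rule (value per move), and sample issues are all hypothetical and not part of Intertwingler.

```python
from dataclasses import dataclass, field

@dataclass
class Issue:
    name: str
    value: float          # discounted value of resolving it
    min_moves: int        # cheap "at-least" cost estimate, in moves (> 0)
    resolved: bool = False

@dataclass
class Position:
    allocation: int                       # moves bought up front
    spent: int = 0                        # moves used so far
    issues: list[Issue] = field(default_factory=list)

    @property
    def moves_left(self) -> int:
        return self.allocation - self.spent

    def outstanding_value(self) -> float:
        """Aggregate value of every issue not yet resolved."""
        return sum(i.value for i in self.issues if not i.resolved)

    def candidates(self) -> list[Issue]:
        """Open issues that each fit in the remaining moves,
        best value per move first."""
        fits = [i for i in self.issues
                if not i.resolved and i.min_moves <= self.moves_left]
        return sorted(fits, key=lambda i: i.value / i.min_moves, reverse=True)

# Made-up issues and figures, purely for illustration.
pos = Position(allocation=20, spent=14, issues=[
    Issue("config loader", value=900.0, min_moves=3),
    Issue("resolver cache", value=400.0, min_moves=1),
    Issue("parser rewrite", value=2500.0, min_moves=10),
    Issue("docs", value=100.0, min_moves=2, resolved=True),
])

print("moves left:", pos.moves_left)
print("outstanding value:", pos.outstanding_value())
print("candidates:", [i.name for i in pos.candidates()])
```

Note that `candidates` only checks that each issue fits on its own. Choosing the best combination within the remaining moves is a knapsack-style problem, and the values themselves would need to be recomputed whenever an estimate changes.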

With enough people working in parallel, this display could change significantly from one hour to the next. It certainly will change from one day to the next. If this information ends up being an extra chore you have to update by hand, it’s never going to fly—it’s hard enough to get people to write coherent commit messages. The only way to pull something like this off is the same way GitHub does its green grid: the passive monitoring of everyday data offgassing. The question then is how do we make a substrate for this work that’s the path of least resistance to use, that also generates data that we can compute?

That’s One Smooth Yak

Not gonna lie, instrumenting the software development process is one of the major ambitions of Intertwingler. We choose timid, conservative solutions because the overhead of calculating an assertive solution with conventional tools is too damn high. We likewise can’t even begin to contemplate creative new legal arrangements because we don’t have the infrastructure to make good on our promises. Intertwingler is meant to help with that. The way it helps is by making it very cheap to make a very large number of very small objects that are durably addressable over the Web, and made out of data that can be consumed by other systems without a lot of special consideration. Of course the tool I could have used the most over the course of this program was the very tool I’m trying to make: something that helps me model the ambitious and ultimately more viable solution, so I don’t have to hedge with the conservative one—which turned out to be doomed anyway.

If I already had the thing I’m trying to make, I could just point you to the line of reasoning that justifies it. I wouldn’t have to spend two solid weeks preparing a video that does a mediocre job of explaining it. I wouldn’t have to do PowerPoint decks. I could generate test cases. There’s a bunch of gruntwork that I just wouldn’t have to do anymore because a simple automated procedure could take care of it.

For example, I tried keeping a project journal but eventually lost the thread, because it was one of those things that if you get behind on, it’s impossible to catch up. Moreover, the information that would end up in a journal is largely a duplicate of what ends up in things like commit logs, from which journal entries can largely be reconstructed. I would say between my git commits and my iCal, I could generate a reasonably accurate project journal with daily entries. The problem is, connecting these systems is ordinarily a huge pain in the ass. git and iCal both have their own ways of representing data, so I’d need an adapter for each. Fine, but adapt the data to what? That’s where Intertwingler comes in. It provides an addressable space where data can mingle, as well as a receptacle I can bolt an adapter to. There are all sorts of other data sources I can’t wait to explore once I get Intertwingler up and running.
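Once both sources are adapted into a common shape, reconstructing the journal is mostly a grouping exercise. The sketch below assumes two hypothetical adapters have already turned git commits and calendar events into (date, text) pairs; the sample records are made up, and nothing here reflects how Intertwingler itself represents data.

```python
from collections import defaultdict
from datetime import date

def build_journal(commits, events):
    """Merge (date, text) pairs from two sources into dated entries."""
    days = defaultdict(lambda: {"calendar": [], "commits": []})
    for day, summary in events:
        days[day]["calendar"].append(summary)
    for day, message in commits:
        days[day]["commits"].append(message)

    lines = []
    for day in sorted(days):
        lines.append(day.isoformat())
        for source in ("calendar", "commits"):
            for text in days[day][source]:
                lines.append(f"  [{source}] {text}")
    return "\n".join(lines)

# Made-up records, standing in for what the two adapters would emit.
commits = [
    (date(2023, 9, 12), "add configuration vocabulary draft"),
    (date(2023, 9, 13), "load configuration from graph"),
]
events = [
    (date(2023, 9, 12), "project check-in call"),
]

journal = build_journal(commits, events)
print(journal)
```

The hard part, as the paragraph says, is not this merge but having a shared, addressable place for the adapted data to land.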

This article took two and a half days to write—two and a half days that could have been used to write the “spine” of Intertwingler, that takes the configuration and acts on it—which is what I’m going to start on immediately after I send this. Hopefully my next status update can be done from inside Intertwingler itself. I have one other scope-increasing rabbit hole I am considering, which is to port over some parameter-handling code I wrote a decade ago. The motivation is that it’s a solved problem and I intended to port it at some point anyway. The alternative is to try to carry on without it, which equates to copying its functionality, only badly. The Inventor’s Paradox strikes again.

The Making of Making Sense
