Computers, Amirite?

Like I said, a rapid-fire strafing of ideas and occurrences. Really.

Oct 27, 2023

I’ve been head-down trying to get the spine of Intertwingler put together, which I was anticipating doing last week, but last week turned into a firefighting week, and I want to get back to it. So this is just going to be a rapid-fire (where “rapid” is over 8,000 words and thus two major excisions; one linked and the other forthcoming) strafing of ideas and occurrences I encountered since I last wrote.

I’m trying to get back into the morning warmup routine, at least once a week. This was last Monday (now that it’s four days from when I started, I mean last-last Monday), when things were looking much more optimistic that they were going to go to plan.

My Annual Podcast Appearance

I went on my first podcast in over a year, Green Pill with Kevin Owocki, who is making his way through the Summer of Protocols cohort, to talk about SoP and Intertwingler. When I listened back, I was immediately motivated to go and write a one-pager on intertwingler.net. If this software is gonna be An Actual Product™, I really am going to have to discipline myself around the fact that there are five distinct constituencies that this thing serves, and they’re not all going to be interested in the same aspects of it. All in all it was a good conversation, especially since it spurred me into organizing my thoughts for the next one.

Since I reference a tonne of stuff, I made a bullet list. It didn’t make it into their show notes, but there it is.

What If Message Integrity?

Something that has been beaning around in my head for weeks is the report from Citizen Lab about Egyptian politician Ahmed Eltantawy getting hacked by the Predator rootkit, by way of a man-in-the-middle injection attack on an unencrypted website he happened to visit. Rather, it was a remark by Eva Galperin that stuck with me, about “this is why we must finish encrypting the goddamn web”. And like, sure? I mean, I agree to a first approximation (though I can imagine the occasional scenario where you would actually want cleartext HTTP), but it got me thinking. The fact that the exploit was delivered by a man-in-the-middle is kind of a side show: TLS (SSL) encrypts and authenticates network connections; it doesn’t authenticate content. The exploit at the tip of the spear was an arbitrary code execution vulnerability in WebKit (the rendering engine in Safari), meaning it would have worked the same whether the connection to the website was encrypted or not.

Just so we’re clear, I am certain Ms. Galperin understands the distinction between transport layer security and the cryptographic integrity of the actual content being transported. Moreover, if the target had been visiting encrypted websites instead of unencrypted ones, it wouldn’t have been nearly as easy for the attackers to get him. All I’m saying is she’s the one who got me thinking about it.
Edited to add: Apparently it was JavaScript, because of course it was. Thanks to Will for tracking that down.

Why I’m thinking about this, though, is because why don’t we have an agreed-upon convention for message integrity over HTTP? I mean, there was the Content-MD5 header for a while (dead for obvious reasons), and there is the multipart/signed content type which could be workable—though I am imagining a header might be nicer. You could piggyback ETag, though something might peel it off or change it in transit. And you’d need some metadata, like who the signing identity is, and whether the signature was taken inside or outside the Content-Encoding. It could probably also hook into X.509’s public key infrastructure that’s already there for TLS. I dunno, I’m spitballing here.
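To make the spitballing a bit more concrete, here is a minimal sketch of what a header-based integrity check could look like. Everything in it is an assumption for illustration: the `X-Example-*` header names are made up, the body on the wire is assumed to be gzip-encoded, and the HMAC is a shared-secret stand-in for a real public-key signature (which is what hooking into X.509 would actually call for). The one thing it does try to capture is the metadata problem: who signed, and whether the signature covers the bytes before or after the Content-Encoding.

```python
import base64
import gzip
import hashlib
import hmac

# Hypothetical header names; none of these are standardized.
SIG, SIGNER, SCOPE = "X-Example-Signature", "X-Example-Signer", "X-Example-Signed-Over"


def sign(payload: bytes, key: bytes, signer: str, scope: str) -> dict:
    """Build illustrative headers carrying a keyed digest of `payload`.

    `scope` records which representation was signed: "identity" for the
    decoded content, "encoded" for the bytes as they travel on the wire.
    HMAC is a shared-secret MAC, standing in for a public-key signature.
    """
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return {SIG: base64.b64encode(tag).decode("ascii"), SIGNER: signer, SCOPE: scope}


def verify(headers: dict, wire_body: bytes, key: bytes) -> bool:
    """Check the tag against a body that is assumed to arrive gzip-encoded."""
    if headers[SCOPE] == "identity":
        payload = gzip.decompress(wire_body)  # signature covers decoded content
    else:
        payload = wire_body  # signature covers the encoded bytes
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    given = base64.b64decode(headers[SIG])
    return hmac.compare_digest(expected, given)


key = b"demo-shared-secret"
body = b"<p>hello</p>"
wire = gzip.compress(body)

headers = sign(body, key, "alice", "identity")
ok = verify(headers, wire, key)
tampered = verify(headers, gzip.compress(b"<p>hellp</p>"), key)
```

Signing the decoded content survives an intermediary re-compressing the body; signing the encoded bytes does not, which is why the scope has to travel with the signature.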

I just remembered that HTML has sub-resource integrity now for <script> tags, though those are just plain hashes that are only implicitly authenticated by dint of being embedded in a page you presumably already trust.
Edited to add: I also just remembered JWT is a thing.
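For reference, the sub-resource integrity value mentioned above is just an algorithm name, a hyphen, and the base64 of the raw digest. A minimal sketch of computing and checking one follows; the function names are mine, and it ignores the fact that a real integrity attribute can list several space-separated values.

```python
import base64
import hashlib

ALLOWED = {"sha256", "sha384", "sha512"}  # the hash algorithms SRI accepts


def sri_value(content: bytes, algorithm: str = "sha384") -> str:
    """Return a string usable as the integrity attribute for `content`."""
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def sri_matches(content: bytes, integrity: str) -> bool:
    """Check content against a single integrity value (no multi-value support)."""
    algorithm, _, _ = integrity.partition("-")
    if algorithm not in ALLOWED:
        return False
    return sri_value(content, algorithm) == integrity


script = b"console.log('hi');"
value = sri_value(script)
```

Note what this does and doesn't give you: the hash says the script is the one the page author meant, but nothing about who the page author is.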

Message integrity of this kind I suspect is going to become increasingly important, especially in federated situations, where we’re all performing store-and-forward duties for pieces of content we didn’t create. It’s certainly going to be important to me going forward with Intertwingler, and the internet at large would benefit from it too. Another reason why it’s been on my mind is a different vulnerability recently found in libwebp. Now, WebP is Google’s slick new image compression format, and the fact that its reference implementation had an exploitable vulnerability in it is even more disconcerting than the WebKit one I mentioned. Why? Because images have way farther reach than Web pages do. Consider link previews in just about any chat app or feature from WhatsApp to Instagram to Slack to Mastodon (I’m thinking about this because Signal advised users to turn that feature off). For a while there, anything that used libwebp was liable to get popped. Here’s the problem though: link previews are useful, so how do we make sure they’re safe?

Binary file format codecs are notorious for arbitrary code execution bugs because not only are they usually sketchy as hell, the exploit payloads are hard to distinguish at a glance from valid content. Images are particularly pernicious because they get pulled in automatically by all sorts of events, and nothing thinks twice about parsing them.

Fetching the text of a link preview is a relatively safe operation, but that metadata can refer to an image that is only associated with the preview and nothing else—not that you would notice being hacked by an image containing a sufficiently well-crafted exploit, even if you were looking straight at it. If there was an integrity mechanism, though, you could download the image, but only open it if it was certified by an entity you trust.

This brings me to ads, inspired by YouTube’s new pattern of finger-wagging you (and even issuing “strikes”) if it detects you’re running an ad blocker. Well, part of the reason I run an ad blocker is because ads are a known attack vector. So, fine, Google, you want to insist I watch your ads? You certify, on pain of legal action, that they’re clean. In fact, shit, if I was Cytrox or NSO Group, why would I bother breaking into a data centre (or, you know, sending government black-baggers) to install some cockamamie man-in-the-middle proxy server? I could just take out an ad. Who cares how many other people got owned by it as long as the target did?

I mean I suppooooose something about zero-days being expensive and a move like that would make one more likely to be found and patched sooner than later, and probably something about having to navigate KYC with fake credentials which are expensive to produce and potentially expose you to attribution, blah blah whatever. It’s a different set of risks.
Or, you know, you just do it through official government accounts and if the ad carrier balks you threaten to cut them out of your country, because nobody who uses these things seems to give a shit about keeping it a secret, at least for very long. Moreover, if the ad networks are as good as they claim they are (lol), you should be able to send a payload to a specific person. You know, just in case anybody cared about collateral damage too (also lol).

Anyway, all this business gets me thinking about being generally less promiscuous about our outgoing network connections. URLs in particular are super easy to turn into canaries to detect if somebody has done something like look at an email; if you’re reading this on Substack, this has already been attempted on you. In fact, somebody did this to me the other day: I was like why the hell does this plain-text-ass lookin’ email have a banner across the top saying “external resources blocked” or whatever? And then I remembered that flap a while back about that company Superhuman that was an email client for marketroids that bugged every message you sent, so you could sneakily determine if the recipient had seen it. (This wasn’t that, it was a competitor.) Anyway, I hit “view source”, and sure enough there was a tracking pixel.
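If the canary trick sounds exotic, it isn't. Here is a rough sketch of the sender's side, with a made-up `PixelTracker` class and a placeholder `tracker.example` host; real products will differ in the details, but the shape is about this simple. Each recipient gets a URL with a random token in it, and any request for that URL says who opened the message.

```python
import secrets
from typing import Optional


class PixelTracker:
    """Hypothetical sender-side bookkeeping for per-recipient image URLs."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url.rstrip("/")
        self._tokens = {}  # token -> recipient

    def url_for(self, recipient: str) -> str:
        # 16 random bytes is 128 bits of entropy: unguessable, and unique
        # enough that one token maps to exactly one recipient.
        token = secrets.token_urlsafe(16)
        self._tokens[token] = recipient
        return f"{self.base_url}/{token}.gif"

    def who_fetched(self, requested_url: str) -> Optional[str]:
        # The mail client fetching the image is the whole "signal".
        token = requested_url.rsplit("/", 1)[-1]
        if token.endswith(".gif"):
            token = token[: -len(".gif")]
        return self._tokens.get(token)


tracker = PixelTracker("https://tracker.example/o")
alice_url = tracker.url_for("alice@example.com")
bob_url = tracker.url_for("bob@example.com")
opened_by = tracker.who_fetched(alice_url)
```

This is also why "external resources blocked" works as a defence: if the client never makes the request, the token never comes home.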

When I see something like this, I’m like, are you trying to trick me? Or do you think you’re entitled to information about when I look at your message? In either case it isn’t a good look; in fact I’m not sure which is worse. But it dovetails with the advertising situation. The point here is, like YouTube’s little tantrum—echoing many other companies—about why they should be entitled to make you look at ads: what are reasonable expectations here? The refrain is typically on the order of “if you come on my property, you’re subject to my rules”, but the code is being executed on my computer. So fuck you, Google, I’m not on your property, you’re on mine. Same deal for buddy with the bugged email.

So anyway, I dunno, on one side I’m saying that nobody has a right to run their software on your hardware unless you made some kind of binding agreement to do so (and even then, honestly, how would they ever know you were complying?). This includes deterministic stuff like making your computer download a 1x1 transparent pixel image from a URL with enough entropy in it to positively identify that it was you that did it. On the other, if an entity wants to condition a service on a tit-for-tat exchange of information (including running their code on your computer), they should have a duty of care to ensure that what they’re sending you is clean. But they can’t commit to this unless there’s some kind of mechanism.

Meta in Myanmar, Butthurt on Sand Hill Road

Erin Kissane has published a four-part, nigh-book-length survey of (I refuse to call them Meta) Facebook’s conduct around the genocide in Myanmar. You should read it. In fact, if Erin writes so much as a grocery list, you should read that too. I would also pair this with Evelyn Douek and Alex Stamos’s recent episode of Moderated Content—who really should have her on.

If nothing else it would be interesting because Stamos was featured in chapter 3. The tie-in here I was thinking though is that the episode is about content-moderating Israel-Gaza, but a chunk of it is dedicated to some inside baseball. If you weren’t paying attention, Marc Andreessen made himself the main character of the internet last week by disgorging a 5000-word “manifesto” citing none other than overt fascist Filippo Marinetti and making a shoutout to fictional Ayn Rand character John Galt. (Not sure if trolling?) I think Stamos is right that this declamation is motivated by getting his ass handed to him by a recent round of really bad bets, and I think Renée DiResta is right that this screed is a catechism that Andreessen will expect you to recite if you go to a16z for money. I did my part and made a spreadsheet I called pmarca’s pantheon, based on his list of “patron saints” at the bottom of the document. It mostly consists of run-of-the-mill libertarian economists (a pleonasm, I know) and tank-thinkers, only half of whom are actually dead (so much for “saints”). I don’t really have much else to say about it, other than more of them than I expected have titles (a funny thing for a libertarian to care about), and fewer of them than I expected have been on TED or guests on Econtalk. If you want an actual take, go read Dave Karpf, or Ed Zitron if you wanna go a little more maudlin. Ezra Klein just put one up too. (There are no doubt hundreds; these were just the ones that were handy.)

The Great Deshittification

Last week I helped Venkat Rao deshittify his flagship blog, Ribbonfarm, running atop the Augean stable known as WordPress. He asked me to write a post-mortem about it, which gets into the filthy technical details of MySQL and Unicode. If that’s the kind of thing you crave reading about, click away.

Programmable Software Is Accessible Software

The other of last week’s fires happened to be sited by my wonderful partner, who is a sound editor, tasked with cutting up hundreds of pew pews and explosions for one of the few productions that made it to post over the strikes this summer. She, like almost everybody in her industry, uses this antique piece of software called Pro Tools to do it. Holy shit is all I gotta say. Based on what I’ve seen, I’d roughly estimate a solid 60-70% of her job is ETL, schlepping around chunks of data big and small, shaping it, getting it in and out of this monstrosity. This is before she does anything she’s actually trained for.

From what I’m told, the reason why people still use Pro Tools is that pretty much every piece of data in the industry that is of any value is saved in its proprietary file format. Moving off of it is a huge middle finger that can only realistically be pulled off by some bigshot starting a fresh new project, who can dictate much more than what software their team uses.

Because this thing is almost four decades old, it has virtually no data semantics that aren’t directly coupled to the UI. People just didn’t make software like that back then. (Well, at least outside of Xerox PARC.) What this means is that in order to abridge boring, repetitive, error-prone data-schlepping tasks, you have to cobble together a hodgepodge of scripts that pretend to click on UI elements and sniff data out of dialogue boxes. I was looking for a way to read off a hierarchical structure of audio tracks so I could generate a soundboard to help her with the pew pew, and I found that the only way to get at that structure is by measuring where a widget is on the screen and constructing it by hand. And because you’re fake-manipulating the UI, all these operations are painfully slow by unencumbered computer standards, but way too fast for a human, so while it’s running, it looks like a poltergeist on meth is using your computer.

The other complicating factor about this too, is that Pro Tools requires a hardware dongle called an iLok to be plugged in while it’s running, and it shits its pants the instant you take it out. So the only time I can work on this stuff is on her physical work computer, at her house, while she’s not using it. Which is almost never, because she’s in the film industry and works twelve hours a day.

Anyway, the most fine-grained means available to “automate” Pro Tools is this thing called SoundFlow, which basically bolts a JavaScript interpreter onto the Pro Tools (among other similarly hobbled apps) UI. Let’s just say that this thing is clearly made for sound engineers who dabble in programming, not somebody coming at it from the direction of software development. The documentation basically reduces to what you can scrape out of forum posts, as well as an API reference that’s just a dump of all the classes and contains literally zero un-generated prose. Needless to say, even the tiniest little intervention is excruciatingly slow-going, trial-and-error, cargo-culting, plodding along.

One thing that is kind of neat about SoundFlow though is it mediates between your computer and something like a Stream Deck or iPad, meaning you can assign macros to buttons and even have it switch up depending on what context you’re working in. Getting that going would cleave hours off her day, not to mention significantly lower her blood pressure. Of course, to really do what she wants, the SoundFlow people say I will need a level of access that is only available through their developer program. They just got back to me about it and it’s a whole limited-seat training program for people planning to sell apps into their store. So, no go with that.

This experience trying to glue process automation onto a tool that was never intended to be used that way (and has no incentive to adapt) got me thinking about my own conviction that every UI operation should have a symbolic equivalent. Another way to say this would be “No UI Without API”. What I mean by this is that every meaningful thing you can do to the application state in the user interface should correspond to exactly one subroutine, appropriately parametrized—that way actions taken in the UI can’t straddle changes in state. This also entails that no aspect of the application state is opaque to the API; everything is addressable. This is not just a good design practice but it’s essential for things like a working undo function and, oh right, accessibility. The timeline in Pro Tools—the place where you do all your work—is some kind of home-spun UI control, and thus totally opaque to SoundFlow. If it’s opaque to SoundFlow, it’s also going to be opaque to any kind of screen reader, since they would use the same mechanism. If my partner had any kind of significant vision impairment, I have no idea how she would do her job.
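Here is a toy sketch of what “No UI Without API” could look like in practice. The `Session` class and its methods are invented for illustration and have nothing to do with how Pro Tools, SoundFlow, or any other real product is built. The point is only the shape: every state change is one named, parametrized subroutine, the UI handler is a thin wrapper around it, all state is readable, and undo falls out more or less for free.

```python
from typing import Callable, Dict, List


class Track:
    def __init__(self, name: str) -> None:
        self.name = name
        self.muted = False


class Session:
    """Hypothetical application state; every mutation is one API call."""

    def __init__(self) -> None:
        self.tracks: Dict[str, Track] = {}
        self._undo: List[Callable[[], None]] = []

    def add_track(self, name: str) -> None:
        if name in self.tracks:
            raise ValueError(f"track {name!r} already exists")
        self.tracks[name] = Track(name)
        # Each operation records exactly how to reverse itself.
        self._undo.append(lambda: self.tracks.pop(name))

    def set_muted(self, name: str, muted: bool) -> None:
        track = self.tracks[name]
        previous = track.muted
        track.muted = muted
        self._undo.append(lambda: setattr(track, "muted", previous))

    def undo(self) -> None:
        if self._undo:
            self._undo.pop()()

    def track_names(self) -> List[str]:
        # Nothing is opaque: a script or screen reader can ask for the state.
        return list(self.tracks)


def on_mute_button_clicked(session: Session, name: str) -> None:
    """UI handler: no logic of its own, just one call into the API."""
    session.set_muted(name, not session.tracks[name].muted)


session = Session()
session.add_track("pew pew")
on_mute_button_clicked(session, "pew pew")
muted_after_click = session.tracks["pew pew"].muted
session.undo()
muted_after_undo = session.tracks["pew pew"].muted
```

A macro button, a test script, and an assistive tool can all call `set_muted` directly, without pretending to click on anything.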

If you want an example of what this design pattern might be like, take a look under the hood of TLDraw.

So I guess the take-home here is that programmable software is also accessible software. It’s just an all-around good idea to design it that way. You begin with the semantics of the operation you’re trying to perform, and then you derive how that plays out in the user interface. This is natural, too, because when we describe the kinds of things people are trying to accomplish with a piece of software, and how it behaves in general, we tend to use words, occasionally with pictures or other props to clarify. What you would get from this exercise is a lexicon that was consistent from the code, to the API documentation, to the user manual, to the marketing material as well.

The Making of Making Sense
