In conversation with Aza Raskin

Many years ago I read The Humane Interface by Jef Raskin, which completely upended my notion of computing. It showed a vision of a world of computing which was radically different from the prevailing paradigms. Until then, I hadn't realised that it was even possible to question core artefacts like files or applications.

Jef led a team to make that vision a reality in the form of Archy.

Archy never quite fulfilled its grand ambition, but its legacy continued in the forms of Enso (a Windows-focused implementation), and then Ubiquity (for Firefox).

Aza Raskin, Jef's son, was the common thread between all of these projects. I recently spent some time getting Archy up and running properly on modern Macs, so I decided to chat with him to get some insight into this interesting chain of projects.

Team, Vision

HC: I wanted to start with the genesis of Archy. At what point did it coalesce and come together as a project?

AR: It’s been so long that I have to sort of dredge through the murky, muddy water of memory. So after Jef published his book, The Humane Interface, there was a strong desire to take many of those concepts that Jef had continued to work on around what are cognetics and the ergonomics of the mind and make a utility product that really worked the way our minds did. Many of them had first come to light in the Canon Cat after the Mac, before getting shut down by Canon when an electronic typewriter failed. They had put the Canon Cat into the same conceptual bucket as an electronic typewriter — just shows you what the thinking was like at that time. After selling something like 20,000 units they shut it down.

Canon Cat brochure, pages 10-11
– A scan of a Canon Cat advert by Marcin Wichary (source)

Archy really got going in my junior year of college, so that was 2003 when we started coding. We were working simultaneously on a contract for Samsung to redesign their phones and were thinking of doing it in a zooming concept and using that funding to fund the Raskin Centre for Humane Interfaces, RCHI, and turn that into the be-all end-all text editor, and that became Archy.

HC: Where did the team come from?

AR: I sort of imagine it to be like those 80s movies where the motley crew assembles to win the Super Bowl and has a sumo wrestler and a clown, that was the kind of crew. People who had been in Jef’s orbit for a long time, like David Alzofon who did a lot of the original manual writing for — it may have even been the Apple II — an incredible technical writer. Something Jef said all the time was write the manual first: if you’re having trouble explaining how your product will be used, your users will have trouble using it.

HC: I feel that Archy as it ended up was a partial implementation of a much bigger vision. What was the grand idea for what this could potentially be?

AR: Yeah, this was a turning of computing over on its head. The idea being that when you sat down at your computer it’s supporting the thing you need to be doing. Immediately your text editor was up and you could start working.

Archy Screenshot

You can think of applications as walled cities — they have to develop all of their own infrastructure. If you’re making Photoshop it needs to have spell check because you have a text editor in there, and if you’re making Word you need to have Photoshop abilities because you’re putting in photos and you’ll want to edit them. Over time the applications have to continue to increase in size and subsume more and more functionality until every application starts to converge from different directions on the same kind of application.

If that is where bloat comes from, and if 95% of the feature requests for features in Word are features that already existed in Word, maybe it’s the application as a framework which is broken. You want to tear it apart and just have functionality that you can use anywhere.

There were no files, because the best label for a file is the file itself, the content of it. Everything was in one long, conceptual document, but we know that human beings work very well with spatial memory, so what you want is to have all of your content and work projects stored spatially. It’s supposed to be a full-on zooming user-interface (ZUI), so no matter where you are you can zoom out and grab your bearings, and zoom in ad infinitum, a much better way of doing folders and taxonomies.

Computers have this incredible magical power and text can do so much more than just a Word document can do. There’s a recent project called Observable by Jeremy Ashkenas, and Mr. Doob, and while it’s a very different take I think it starts to get towards a little bit of what that vision of Archy was. That magic of being able to program a little, have parts of your text talk to other parts of your text, to have your documents be really alive and you can just cast magic at it. That it starts as simple as being able to type characters, that it has incremental search so that you can move at the speed of your own thought, that there are very few hand gestures that are unnecessary, all the way to any bit of text being able to refer to any other bit of text, and you can run any command and do any functionality from anywhere, that you could open up your tools and it was all coded in the same environment that you were making in, very Smalltalk or Alan Kay-like, and so your tools themselves can be modified in real-time to modify themselves if need be. I think that’s a little of the vision of what Archy was supposed to be like.

Lineage

HC: That’s fascinating. It seems there is that inspiration of lineage and indirectly drawing on the same resources. They are all modern environments which have little bits and pieces of that which feels like this resurgence of notebook-style editors.

AR: Exactly. Sort of like Jupyter but the next iteration. And all of these things start to point back to much older concepts, from HyperCard to even Xanadu.

HC: The Humane Interface and Jef’s work through the Canon Cat were big inspirations, but clearly you’re throwing around a lot of older projects as inspiration as well. Were there any others that you would say were concretely referenced as providing input?

AR: Yeah, originally we thought we might implement Archy on top of Smalltalk, and Scratch. I think it’s really interesting now that the next person picking up that mantle is Bret Victor now at Dynamicland. It’s a tangible version of very similar concepts.

HC: It’s interesting that you mention Smalltalk as it feels that there are similar ideas in there. For instance, no applications — you’ve just got content that you’re operating on.

AR: That’s exactly right. In some ways you can think of the command system of Archy as being just like Unix commands. You hold down a button (so it’s a quasimode), you start typing what you want to do, like ‘spell check this’. It knows what the input expects and what the output expects, so if you run it on text of course it spell checks the text, but if you run it on an image it should run OCR and then do spell check on top of that. It should be able to ask the system ‘I have an image but I expect text, do you have anything that turns images into text?’ then it should all just happen behind the scenes.
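The type-driven dispatch described here can be illustrated with a small sketch. It is not Archy's actual command API: every name below (`Image`, `CONVERTERS`, `COMMANDS`, `run`, the toy dictionary) is hypothetical, and the "OCR" step is faked with canned text. The point is only to show a command declaring the input it expects and the system looking for a converter when the selection has a different type.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Image:
    """Stand-in for image content; `embedded_text` fakes what OCR would find."""
    embedded_text: str


# Converters are registered by (source type, target type).
CONVERTERS: Dict[Tuple[type, type], Callable] = {}
# Commands are registered by name, together with the input type they expect.
COMMANDS: Dict[str, Tuple[type, Callable]] = {}


def converter(src: type, dst: type):
    def register(fn):
        CONVERTERS[(src, dst)] = fn
        return fn
    return register


def command(name: str, expects: type):
    def register(fn):
        COMMANDS[name] = (expects, fn)
        return fn
    return register


@converter(Image, str)
def fake_ocr(image: Image) -> str:
    # A real system would run OCR here; this sketch returns canned text.
    return image.embedded_text


KNOWN_WORDS = {"the", "cat", "sat", "on", "mat"}


@command("spell check", expects=str)
def spell_check(text: str) -> List[str]:
    """Return the words that are not in the toy dictionary."""
    return [w for w in text.lower().split() if w not in KNOWN_WORDS]


def run(name: str, selection):
    """Run a command on the selection, converting the selection first if needed."""
    expects, fn = COMMANDS[name]
    if not isinstance(selection, expects):
        # "I have an image but I expect text: does anything convert it?"
        convert = CONVERTERS.get((type(selection), expects))
        if convert is None:
            raise TypeError(
                f"cannot turn {type(selection).__name__} into {expects.__name__}"
            )
        selection = convert(selection)
    return fn(selection)


print(run("spell check", "the cat sat on teh mat"))  # text goes straight in
print(run("spell check", Image("the catt sat")))     # image is converted first
```

This sketch only looks one conversion step ahead. A fuller version would presumably search for a chain of converters, which is closer to how Unix pipes compose.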

HC: I can see that underlying similarity, and yet the concrete realisation of projects is obviously very different, you know, Smalltalk embraces iconic representation and involves heavy mouse usage, whereas the Archy system and what you’d built was almost a direct rejection of a lot of that.

AR: There’s certainly a very heavy keyboard focus because there’s so much emphasis put on thinking through the GOMS modelling, thinking through where errors happen and trying to minimise time and maximise information theoretic efficiency. The mouse was just not great for that.

Future Plans

HC: You’re quoted as saying that this was an environment of something that you’d like to boot up into, so maybe not an operating system per se, but essentially a complete computing environment. Am I correct in thinking that was the plan?

AR: Yeah absolutely. Imagine Chromebook-style. If we were re-implementing it now maybe we’d do it on top of the Web as a platform, and then you’re sort of done. You can boot up directly into it. That can be your world.

HC: So you were imagining hardware as well? From using it one thing I did feel was that the dedicated keys for LEAP and Command would certainly have been enormously useful.

AR: A LEAP key changes everything. If my memory serves we were thinking about a keyboard that has all of the software on it, so that you can walk around with your keyboard, plug it into a computer that would have Archy on a USB drive inside of it, and so the entirety of your text, history, preferences, everything would just come along with you. You just needed to carry your keyboard and plug it in.

— Detail of the LEAP keys on a Canon Cat (source)

HC: Having read The Humane Interface and the roadmap for Archy it seems that the ZUI was a big component that was missing. Had you started development on it?

AR: Yeah, we were really focusing on the text editing, maybe even the coding. If, conceptually, a spreadsheet, a document, and a code editor have a baby, what do you end up with?

Doug McKenna wrote a fully zooming map library behind the interface. One of the problems you end up having with infinite zooming spaces if you just do a naive implementation is that you can zoom in until you hit the end of the precision of floats and all of a sudden the ZUI just starts to like bounce around. That requires some careful thought.
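The precision problem is easy to reproduce. The sketch below is an illustration only, not a description of Doug McKenna's library. It first shows a naive float camera losing a one-pixel pan at deep zoom. It then shows one common workaround, assumed here for the sake of example: keep an exact integer cell index plus a small float offset inside that cell, so the float never has to span the whole world. The names `naive_pan` and `Camera` are hypothetical.

```python
import math
from fractions import Fraction


def naive_pan(centre: float, depth: int) -> float:
    """Pan by one 'pixel' of width 2**-depth and report how far the view moved."""
    pixel = 2.0 ** -depth
    return (centre + pixel) - centre


class Camera:
    """Position is an exact integer cell at `depth` plus a float offset in [0, 1)."""

    def __init__(self, cell: int = 0):
        self.depth = 0      # number of 2x zoom-ins so far
        self.cell = cell    # index of the current cell (Python ints never overflow)
        self.offset = 0.0   # position inside the cell, in units of the cell width

    def zoom_in(self) -> None:
        # Each cell splits into two children; re-express the offset in the child.
        self.depth += 1
        scaled = self.offset * 2.0
        half = int(scaled)  # 0 or 1: which child the camera lands in
        self.cell = self.cell * 2 + half
        self.offset = scaled - half

    def pan(self, cells: float) -> None:
        # Movement is measured in units of the current cell width.
        total = self.offset + cells
        whole = math.floor(total)
        self.cell += whole
        self.offset = total - whole

    def position(self) -> Fraction:
        """Exact world position, for checking the sketch."""
        return (Fraction(self.cell) + Fraction(self.offset)) / 2 ** self.depth


print(naive_pan(1.0, 10) == 2.0 ** -10)  # shallow zoom: the pan is exact
print(naive_pan(1.0, 60))                # deep zoom: the pan is rounded away

cam = Camera(cell=1)                     # start at world position 1.0
for _ in range(60):
    cam.zoom_in()
cam.pan(0.25)                            # a quarter-cell pan at depth 60
print(cam.position() - 1 == Fraction(1, 4 * 2 ** 60))
```

With the naive version the view stops responding, or jitters, once a pan step drops below the float's resolution at that coordinate. The re-based version keeps every local operation within a well-conditioned range.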

You might want a portal so that you zoom from one area into another — the equivalent of a symlink. Now you can teleport into another part of the ZUI, but when you zoom out you always want zoom out to be the equivalent of going back, like the back button in a browser. So you have to do careful hand-off of coordinate systems, a lot of really interesting caching problems, and so Doug wrote that for use with the cellphone prototype we were working on. We put a demo of how this might work, a Flash demo where you could zoom in and zoom out, see annotations on Web pages. Bret Victor actually got involved and he created a corollary zooming user interface prototype and we went back and forth with a couple of different iterations.
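The "zoom out always means go back" rule can be sketched as a navigation trail, much like browser history. This is a minimal illustration under assumed names (`Region`, `Navigator`, `take_portal`); it ignores the coordinate hand-off and caching that the real problem involves.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class Region:
    """A place in the zooming space; `portals` link to arbitrary other regions."""
    name: str
    parent: Optional["Region"] = None
    portals: Dict[str, "Region"] = field(default_factory=dict)


class Navigator:
    def __init__(self, start: Region):
        self.trail: List[Region] = [start]  # where we came from, oldest first

    @property
    def here(self) -> Region:
        return self.trail[-1]

    def zoom_into(self, child: Region) -> None:
        self.trail.append(child)

    def take_portal(self, label: str) -> None:
        # Teleport, but remember where we came from.
        self.trail.append(self.here.portals[label])

    def zoom_out(self) -> None:
        if len(self.trail) > 1:
            # Go back the way we came, even if we arrived through a portal.
            self.trail.pop()
        elif self.here.parent is not None:
            # No history left: fall back to the structural parent.
            self.trail[0] = self.here.parent


root = Region("root")
photos = Region("photos", parent=root)
work = Region("work", parent=root)
report = Region("report", parent=work)
photos.portals["see report"] = report

nav = Navigator(root)
nav.zoom_into(photos)
nav.take_portal("see report")
print(nav.here.name)  # report
nav.zoom_out()
print(nav.here.name)  # photos, not work: zooming out retraces the portal
```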

HC: There was some mailing list post from a while ago talking about the size of the project: ‘Archy turned out to be too big a project for what the small team were capable of. It was more complex given the number of specced features surrounding universal undo than any I know of.’ Can you elaborate on that?

AR: Oh yeah, it’s just really simple observations. Anything you do on the computer you should be able to undo. You shouldn’t feel like you’re walking through landmines, where if you do the wrong thing or touch the wrong button you won’t be able to recover from it. So the question is: if you are making an extensible system, how do you have it so that in every application you can always get back from the last step?

It’s something we want in real life, it’s the ability to take back the thing we just said, and at the very least we can do that on the computer.

But what does it mean to hit undo if you have a collaborative document that multiple people are editing, does hitting undo just undo yours, or does it undo everyone's? Should undo operate just in the document that I’m currently looking at, or does it undo the global thing?

There is a really interesting set of problems that comes from a very simple thing, which is the principle of being able to get back to where you were and keeping the environment always safe.
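The scoping questions raised above (mine or everyone's, this document or everything) can be made concrete with a toy edit log. This is a sketch under simplifying assumptions, not Archy's undo design. Edits here are append-only snippets, which sidesteps the hard part: later edits that depend on the one being undone. The names `Edit` and `Workspace` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Edit:
    author: str
    document: str
    text: str
    undone: bool = False


class Workspace:
    def __init__(self):
        self.log: List[Edit] = []  # one shared log, newest edit last

    def append(self, author: str, document: str, text: str) -> None:
        self.log.append(Edit(author, document, text))

    def undo(self, author: Optional[str] = None,
             document: Optional[str] = None) -> Optional[Edit]:
        """Undo the newest live edit matching the given scope (None means any)."""
        for edit in reversed(self.log):
            if edit.undone:
                continue
            if author is not None and edit.author != author:
                continue
            if document is not None and edit.document != document:
                continue
            edit.undone = True
            return edit
        return None  # nothing left to undo in this scope

    def contents(self, document: str) -> List[str]:
        return [e.text for e in self.log
                if e.document == document and not e.undone]


ws = Workspace()
ws.append("ann", "notes", "A1")
ws.append("bob", "notes", "B1")
ws.append("bob", "todo", "B2")

ws.undo(author="ann")        # "just mine": removes A1 although B1 and B2 are newer
print(ws.contents("notes"))  # ['B1']
ws.undo()                    # "the global thing": removes B2, the newest edit anywhere
print(ws.contents("todo"))   # []
ws.undo(document="notes")    # "this document": removes B1 regardless of author
print(ws.contents("notes"))  # []
```

Each scope gives a different answer to "what was the last step?", which is why the choice has to be designed rather than left to the implementation.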

HC: Were there any big ticket items or anything glaring that you didn’t get on the roadmap?

AR: One of the largest problems was figuring out how this thing could really become your new environment to work in. How do you interoperate? You’re going to want to be doing all of your editing and coding inside of this thing, but the predominant form of doing that now is lots and lots of files. How do we interop between a world where there’s folders and files, to a world where there’s just one really long document or ZUI? Figuring out all of those bridges so that you could gracefully upgrade people from the current day to the Archy world. I think it was sort of an open-ended question: ‘How do we want to do that bridge?’

HC: Did you ever think about compromising, or was it the case that Archy was this very opinionated stance and other things were going to have to figure out how to fit into it?

AR: We were definitely thinking more of the “this is a different way of doing computing”, so we would rather figure out the way of doing it in our world. Even coding. We thought a lot about, that seemed to make sense in the zooming world.

So those were the big ticket thought items where we had first stabs and gestures at it, maybe even a couple of prototypes, but they were super-early stage. I think we all knew those were going to be a place to put a lot of conceptual work. Just trying things out and figuring out. It’s always through making that you shine a flashlight on the ideal between concept and implementation.

HC: Interesting that you mention that because I think that, whilst I understood the roadmap and all the pieces, there were a few gaps where I didn’t quite see how it connected through, and one of them was the Web, and that left a bit of a question mark for me. The other one was creative media-style applications, audio workstations, video, drawing applications. In some places it seemed like that might be a hard conceptual gap to close.

AR: Yeah, I think the Web is a particularly interesting one, because our thought was wherever you have a link you would just embed the link, like the entire Webpage, as a thing next to a link that you could zoom in to. That’s the base layer, and you can start playing with that, and have something a little bit better, something new.

Drawing is much easier because it’s not an application, it’s not a thing you go to. It’s a set of tools that come to you. Any time, anywhere you’d be able to pull up the palette and just start drawing, whether it’s on a Web page or on a document that you’re typing. You can just draw on the canvas of the ZUI world in general. If you zoomed out you’d be near a set of photos from some trip, and you can just write using whatever photo editing or illustration tool you had, for instance, ‘My trip to Panama’, or just a giant heart. You can just modify the space.

That’s a big part of the shift, thinking about the nouns of the world, the substrate of the ZUI, of the text document, as just an object which you could call up different verbs to do whatever you want with them.

Beyond Archy

HC: I wanted to move on to the end of Archy and the beginning of Humanized and the work that you were doing there. Could you give us a quick recap of the circumstances surrounding that changeover?

AR: Around the time that Jef got very sick we also got a fairly large contract to work on a zooming operating system for a Samsung phone. So this was my final year of college, so pretty exciting to be like here is a concept of a phone that could get out to a whole slew of people. It was clear that — this is back in the day of flip phones and WAP browsing — so we were like, cool, instead of trying to fix the current system we’re trying to paint a picture of where these interfaces could go in the new form factor. Jef was really interested in a clear keyboard which you could type from behind, and we actually implemented it.

And then, and I guess this sometimes happens, there was some political stuff that happened where the partner who was working closely with Samsung I think didn’t like that I and a couple of other people were much younger and still in college and leading the charge, and he wanted to lead. So he sort of threatened the money over it. I think that whole thing sort of, in many ways, took away the momentum of the thing we were working on. I remember I think I got fired, then hired again, then fired, then they tried to hire me again, and I was like ‘Nah, I don’t really want to do this’.

A group of us, Atul Varma, Jono DiCarlo, Andrew Wilson, and I decided to start a company to take one particular aspect of Archy and bring it to the world. Andrew was my college roommate, the other two, Jono and Atul, were in Jef’s class that he taught at The University of Chicago on interface design. And that’s how Humanized was formed, and in particular how Enso, which then became Ubiquity for Firefox, ended up getting created.

HC: There’s an obvious connection between the two systems there. You were still iterating around the same idea, the vision that Jef had laid down. If Archy was the big vision, were you clear that you were still trying to get back to that, or would you rather see this broadly adopted, even if it’s just a small part?

AR: Yeah exactly, it was like “how do we take this to prototype and get it out?”.

For me the mantra that I was operating under then is that the best way to honour someone’s memory is to channel them into your own passion to make the world a better place.

The core concept that I was always super taken with in Archy is this idea that we need to switch to this model where you say what you want and the computer does it. So you can select some text and say ‘email this to Jono’, and it will know from my own past history whether to send an IM or text or an email. You can select something anywhere and say ‘map this’, and it will go off to the Web and find a map and even if you’re in Microsoft Word it will inject an image back or, if it can, something smarter, something live.

We thought that with this we could see how to implement it using accessibility controls to bring it to computers as they stood back then.

HC: So there was an acquisition by Mozilla between Enso and Ubiquity. So Ubiquity was Firefox focussed?

AR: Yeah, exactly. There was actually a hidden thing in there that Mark Shuttleworth of Canonical tried to buy our team first to have us go run design for Ubuntu. There was a good half-year where we were working on this deal where they wanted us to go take some of Jef’s ideas, our ideas, some of the concepts of Archy and make it part of, at that point, the third largest desktop OS in the world.

HC: I’m glad you mentioned that as I had this vague recollection that I had heard that rumour somewhere but I wasn’t able to source it.

AR: Yeah it was never published. We flew over to Spain, we met all the team, we spent a long time thinking about it, but it was never public, I guess. Now it can be. Well we never said we couldn’t, we just never talked about it.

HC: It seems like there were fans, or acknowledgment of Archy through the continuing work of Humanized. You mentioned Bret Victor, Mark Shuttleworth, are you aware of anyone else who was inspired by or affiliated with the program?

AR: Yeah, let’s see. Of course with Ubiquity we grew that up to two million users, which was pretty exciting. That got the largest adoption. You can see in Quicksilver and in Alfred many similar ideas. Nicholas Jitkoff and I certainly had a lot of conversations, so there was certainly inspiration that went both ways there. I never knew the team at Alfred but there was some sort of back and forth. iA Writer from Information Architects takes a lot of inspiration from Jef’s work and Archy, and they even thought about building LEAP keys at some point.

There’s a zooming user interface called Raskin, which I haven’t played around much with, but that’s clearly inspired by Jef’s work, and our ZUI work in particular.

And in many ways, and I don’t know the causal link, but the voice commands of today, Siri and Cortana, all these things, to me trace back to Ubiquity and then Archy, as a kind of way to speak to the computer to get it to do what you want it to do.

Even the zooming user interface with the iPhone, where you click on the folders and you sort of zoom in and zoom out. That’s like an actual ZUI used by hundreds of millions of people, as limited as it is.

HC: Even things like persistent documents, no-save and auto-save, kind of restore back to the current state of a simpler thing that’s now pretty pervasive. I don’t really recall it being that common.

AR: It was not. I remember going round the early Web and just talking about how ridiculous it was that we even still had the save icon which was the floppy disk. I was like, dude no one knows what this is. A lot of that kind of work, the Web 2.0 kind of work of, like don’t use a warning when you use undo, and having the undo features in Gmail, I think that certainly was in Jef’s work first. That one I don’t know how direct the inspiration was, but I think the argument could be made that Jef’s work was pretty influential there.

Retrospective

HC: A lot has happened since then. I’m thinking the big change to personal computing which came in around 2007, the introduction of the iPhone. Obviously Archy and in some sense the other interfaces were very keyboard driven, you seem to also be suggesting that it could have also been voice-driven? The ZUI might carry over to a multi-touch world to some extent?

AR: Yeah I feel like the ZUI world could fit very well on a phone. It could be a really natural way of seeing all of your photos and all of your documents, and having a sort of spatial map. I think the fundamental problem we always ran into with ZUIs when we started to actually implement them was that the amount of information you have to put in to get some place, versus just tapping on something, makes it a little harder. That was a problem we never solved. You’d have to navigate, zoom-in and zoom-out, as opposed to doing two or three taps to get some place. There was a tension in the design which we had never fully resolved.

HC: Is your gut feeling that that was something that was just inherent, or was it something that required engineering?

AR: Yeah I think it would require some real building and testing. The other paradigms have had thirty years now of A/B testing at an industry scale for what makes for a good GUI, and here we were starting from the very beginning. What makes a good ZUI? The rest of the world has a twenty to thirty year head start. So I think it would take some iteration to get it right.

HC: When you look back on the work you were doing with Archy and then through to Ubiquity, was there anything that you would have done differently, maybe if you’d had the full benefit of knowing what was to come?

AR: Oh man, so many things. Hindsight is a terrible mistress in that way. I think if I was going back I still think the direction that we’ve gone with Siri and Alexa, we could have pushed on that further. I’d have done it all open source, and I’d have done it cross platform, and focussed in on that idea. We could have pushed it further, faster. We took the quasi-modes very seriously, and some said that may have hampered adoption.

HC: Do you think that was too much of a habit shift for people?

AR: Yeah, and it was also really awkward on current keyboards. We always had to come up with these clever hacks that you could type at the same time and you had to hit shift twice, or you’d have to remap your caps lock key, it was just a little bit inelegant. So that I think would certainly be one change.

I think there was a big opportunity early, early Web to do a text editor. We were really inspired by MoonEdit, which is a collaborative text editor which had its own client pre-Web. I think if we started with ‘Okay, what does the Web enable?’ Let’s jump in with new things that the Web can do and use that as the thin end of the wedge to get sort of the Archy-style text editor going, instead of doing it in Python as a desktop app. I think that would have been a really interesting way to have brought the ideas out. You can imagine what would have happened if Google Docs had been Archy. That would have been fascinating.