Exploring the history of Interfaces to understand and build better interfaces.
I have been working at Postman for nearly 4 years now. One word that I hear or read almost every day is "API". My general understanding now is that APIs are hard to design, build and maintain, and the question of why keeps me awake at night. While trying to find the answer, and some sleep, I was drawn to a first-principles question:
What the heck is an interface? Where did it come from?
What follows in this article is an exploration of the history, purpose and meaning of interfaces. My belief is that if we can grasp the concept of an interface in its abstract form, we should be able to use what we learn to design and build better APIs today.
In simple terms, an interface is a point where two systems, subjects, organisations, organisms, etc., meet and interact.
In geometry, a point is a location that has no length, width, shape, or size. However counterintuitive it may feel, I believe the same holds for an interface. An interface is a point, and like a point, it has no dimensions. Extrapolating from this, it should not be too bold to claim that the act of interfacing is an abstract concept.
Let's time-travel back to a point in time where humans had evolved just enough to be able to communicate with each other. Within our frame of analysis, we have one subject, humans. Looking at how humans interfaced with other humans using verbal and non-verbal communication gives us some key insights into what an "interface" is.
The interface between humans is the idea of a language or a sign.
But does this claim not contradict the idea that an interface is abstract rather than concrete?
Say an English-speaking person is talking to a French person who doesn't understand English. Are they interfacing? I would say no! Although a language is being spoken and a language is being heard, there is no interface, because the language rules of the speaker and the receiver do not match. This specific matching of abstract rules between the two interacting subjects is an interface: a singular point of equity and equality.
Until this point, we have considered the interaction between only one kind of subject: humans. But soon enough, humans learnt to use the material around them to build tools. Tools are great examples of a heterogeneous interface, an interaction between two different kinds of subjects or objects.
For example, consider the simple stone hammer. A stone hammer would be built once, and humans would use it for various purposes such as cracking nuts, skinning animals, and starting fires. The same tool is being interfaced with differently for different tasks. A physical or metaphysical manual of knowledge was passed down the generations, explaining how to use a stone to crack a nut, cut with a stone to skin an animal, or strike two flints to start a fire. This manual is fundamentally the interface between humans and material objects. These are, again, rules of interaction that govern how a tool can be used.
A question begs to be asked here: why did we consider humans and materials as two different subjects? Are humans not also made up of materials? Yes, but the degree of intelligence that separates inanimate from animate objects is the essence of the division. Hence, replacing the human with any other intelligent organism does not make much difference. As humans can exhibit most of the actions that other organisms can, using human intelligence as a placeholder suffices as a real subject. Inanimate objects, on the other hand, have no decision-making ability and do not conditionally change their environment by virtue of a choice.
For a very long time, material had not shown any intelligence until, of course, computers became a thing. Computers, and especially software, are inanimate objects that show some sign of artificial intelligence. This means that we have a new subject whose interfaces we can analyse: a human-software interface (human-machine interface), a software-material interface, and, like the human-human interface, a software-software interface (API).
Computers are unique in the sense that, unlike humans and materials, they do not occur naturally. They were created out of material (a.k.a. hardware) and artificially given intelligence with software. This makes our analysis somewhat confusing, as at any moment it is hard to tell which part of the computer we are talking about. Therefore, we will treat the hardware the same as materials and consider only the software, which projects itself as a new kind of intelligence. There may be a future where software is as intelligent as humans, or more so, and the intelligence category would use the placeholder of a computer, making our diagram more straightforward. But until that time comes, software is a thing of its own to deal with.
Just like humans use language to communicate with each other, a programming language is used to communicate with computers. This bridges the foundational gap between humans and computers with a set of formal rules called grammar. The software part of a computer interacts with the hardware part by using instructions which are like manuals. Historically these instructions have been called drivers, instruction sets, control options, etc., and they form the backbone of the software-hardware interface.
There is another way a human can interact with a computer, i.e. by using a device like a mouse. In this case, the computer system exposes its hardware, rather than its software, as the means of interaction. It uses the signals generated by the hardware to understand the human's intention. When the user follows the rules of interaction of a mouse, the cursor on the screen moves the way the user wants it to. Moving the mouse up moves the cursor up; you get the idea. The other way around, if the machine wants to interface with the human, it uses a screen or printer to show symbols and glyphs for the user to comprehend.
Let's define the subject of data before we delve deeper into the interfaces of a computer. We understand that the human brain is where knowledge and information are stored. The general understanding is that information is stored with the help of neurons. But the exact science of whether the storage happens inside a neuron or through the interconnection of neurons is still a big mystery to most of us. Probably only a neuroscientist understands the workings.
Nevertheless, the brain can store, retrieve, and share information with other humans using language. However, the human species would not have reached dominance if we could not preserve data across a generational timeframe. Verbal information sharing is convenient, but it is a very lossy mode of communication.
A more permanent means of information exchange has been through writings and engravings. When we write something on a piece of paper or engrave something on any surface, we encode our thoughts and ideas onto the material. We take information from our brains and transfer it onto the material to be interfaced with later to extract it and feed it back into the brain.
Hence, we must recognise data and information as a subject in its own right. Although not much is known about how brains store information, we invented computers, so we know exactly how computers store information.
The energy source of a computer is electrical power, and transistors form its brain. A transistor behaves like a switch, i.e. it can turn on and off based on another electrical signal, so we can say that a transistor exhibits a binary state. Let's imagine we have 26 transistors, and each transistor represents one letter of the alphabet. If only one transistor is turned on at any moment, then by observing which transistors turn on and off over a period of time, we could express all kinds of written information.
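The 26-transistor thought experiment can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names are hypothetical, each "transistor" is modelled as a boolean, and only the lowercase English letters are supported.

```python
import string

LETTERS = string.ascii_lowercase  # one hypothetical transistor per letter


def to_switch_states(text):
    """Turn each letter into 26 on/off states with exactly one switch on."""
    frames = []
    for ch in text.lower():
        if ch not in LETTERS:
            raise ValueError(f"no transistor is assigned to {ch!r}")
        frames.append([ch == letter for letter in LETTERS])
    return frames


def from_switch_states(frames):
    """Read the message back by noting which switch was on at each moment."""
    return "".join(LETTERS[frame.index(True)] for frame in frames)


frames = to_switch_states("api")
print([frame.index(True) for frame in frames])  # which switch was on, moment by moment
print(from_switch_states(frames))               # the message recovered from the switches
```

The scheme works, but it spends 26 switches on every letter, which hints at why more compact rules for reading binary states were eventually agreed upon.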
In essence, a binary signal could mean anything and everything. Over many years, specific rules have been established to express what a series of binary states means. This is how computers interface with data, i.e. through data formats and encodings. The ASCII encoding of English letters and symbols, which gave computers a way to interact with humans, is one such set of rules.
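To see that the meaning comes from the rules and not from the bits, here is a small Python sketch that reads the same sixteen binary states under three different rules. The bit pattern is just an example I picked.

```python
bits = "01001000 01101001"  # two 8-bit groups of on/off states

# Rule 1: read each group as an unsigned integer.
numbers = [int(group, 2) for group in bits.split()]

# Rule 2: read the same groups as ASCII character codes.
text = bytes(numbers).decode("ascii")

# Rule 3: read all 16 bits as one big-endian integer.
combined = int.from_bytes(bytes(numbers), "big")

print(numbers)   # [72, 105]
print(text)      # Hi
print(combined)  # 18537
```

Nothing about the signal changed between the three readings; only the agreed rule did.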
Do you see a pattern emerging here? A pattern that explains why subjects interact and need to interface with each other? There are only two reasons.
- To preserve information.
- To control the environment.
A human-human language interface is used to exchange information so that knowledge is preserved; the human-material interface is made so that tools can craft clothes or cut trees. Computers are not that special. They also interface for these very two reasons.
Lost to Time
With the ever-increasing world population, the information density of the world is increasing exponentially. However, not all information is valuable. Some of it is gold and some of it is garbage; finding the valuable part is akin to finding gold in a dumpster.
To better understand the current circumstances, let's travel back again, this time to 1969, when ARPANET was established. Four research institutes formed its first nodes:
- The University of California, Los Angeles (UCLA) hosting the Network Measurement Center
- Stanford Research Institute hosting the Network Information Center
- The University of California, Santa Barbara (UCSB) hosting Culler-Fried interactive mathematics
- University of Utah hosting a machine with computer graphics.
They were connected through telecommunication lines to exchange data. It is striking to think that we decided to use the telephone lines that carried our voices to connect computers. Computers interfacing like humans.
For the first time, computers miles apart had been interfaced together using a set of physical and logical rules written in BBN Report 1822. In the following two decades, many essential technologies were invented. The first of them was the Network Control Protocol (NCP), a precursor of TCP/IP. The 1822 protocol was revolutionary, but once two computers were connected, multiple applications on each computer wanted to talk to their counterparts on the other. NCP formalised this by defining higher-level rules for flow-controlled, bidirectional communication between processes on different host computers. During this time, the popular TELNET and FTP protocols were also designed; they were later ported over to TCP/IP.
In 1974, the famous RFC 675 was published, formally defining the Internet Transmission Control Program (TCP for short). By 1982, ARPANET members had implemented nearly four versions of this protocol; version 4 was finally accepted for production use in 1983, replacing NCP.
A lot of important and incredible things happened in 1989. The Berlin Wall fell, signalling the end of the Cold War. The Tiananmen Square protests and massacre sealed the fate of China, and Nintendo released the Game Boy. In the middle of all this, scientists and engineers also completed the Internet protocol suite. These were all the rules that we humans decided computers needed to follow to interface. Surprisingly, more than 30 years later, most of the same technologies are still used extensively. This was a big moment in the history of computers and interfaces, but not as big as "Information Management: A Proposal" by Tim Berners-Lee.
Borrowing the first few lines from the original proposal:
"This proposal concerns the management of general information about accelerators and experiments at CERN. It discusses the problems of loss of information about complex evolving systems and derives a solution based on a distributed hypertext system."
I think it had become obvious by that time that there would be more computers in the future than there are humans, and that these computers would live longer than humans and remember more information than any human could. That they are all connected means that, unlike humans, computers exhibit a much more powerful collective mind.
The implementation of this proposal is what makes up most of our explorable internet today. And as the W3C tells us, URI specifications, HTTP, and HTML are still being refined as Web technology continues to spread, so the future is going to be no different.
In 1968, the term "application program interface" was recorded for the first time, in the paper "Data structures and techniques for remote computer graphics". Fittingly, it described the rules that a graphics program should follow to interact with the rest of the computer system, which freed the programmer from dealing with the graphics display device. It provided an abstraction for hardware independence, so that the computer or display could be replaced without needing any change to the program.
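The idea of hardware independence can be sketched in modern terms. The Python below is not how the 1968 paper did it; the class names, the device behaviours and the `draw_square` function are all made up by me to show a program written once against a set of rules and run against two interchangeable "devices".

```python
from abc import ABC, abstractmethod


class Display(ABC):
    """The rules a graphics program relies on, independent of any device."""

    @abstractmethod
    def draw_line(self, x1, y1, x2, y2):
        ...


class VectorDisplay(Display):
    """Hypothetical device that steers a beam."""

    def draw_line(self, x1, y1, x2, y2):
        return f"beam: move to ({x1},{y1}), sweep to ({x2},{y2})"


class PlotterDisplay(Display):
    """Hypothetical device that drags a pen."""

    def draw_line(self, x1, y1, x2, y2):
        return f"pen: down at ({x1},{y1}), drag to ({x2},{y2}), up"


def draw_square(display, size):
    """Written once against Display; knows nothing about the device behind it."""
    corners = [(0, 0), (size, 0), (size, size), (0, size)]
    return [
        display.draw_line(*corners[i], *corners[(i + 1) % 4])
        for i in range(4)
    ]


for device in (VectorDisplay(), PlotterDisplay()):
    print(draw_square(device, 2)[0])
```

Swapping the device changes what happens on the hardware side, while `draw_square` stays untouched.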
In 1990, Carl Malamud defined the API simply as "a set of services available to a programmer for performing certain tasks".
Remote procedure calls, the first implementation of a networked API, resulted from programmers wanting to call libraries located on other computers. Malamud's definition still holds in the age of Web APIs, where Representational State Transfer (REST) is an alternative to library-based APIs.
Furthermore, an API can be defined as a set of rules that a piece of software must follow in order to interact with another piece of software and use the services it provides. As discussed before, these services can be of two types: either providing information or mutating their environment, like turning a power plant on and off.
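As a minimal sketch of these two kinds of service, here is a toy power-plant API in Python. The class and method names are hypothetical and the "plant" is just a flag in memory; the point is only that one call provides information and the other mutates the environment, each under an explicit rule.

```python
class PowerPlantAPI:
    """Hypothetical service exposing one read rule and one write rule."""

    def __init__(self):
        self._running = False

    def get_status(self):
        """Provides information; calling it changes nothing."""
        return {"running": self._running}

    def set_running(self, running):
        """Mutates the environment; only a boolean satisfies the rule."""
        if not isinstance(running, bool):
            raise TypeError("running must be True or False")
        self._running = running
        return self.get_status()


plant = PowerPlantAPI()
print(plant.get_status())       # {'running': False}
print(plant.set_running(True))  # {'running': True}
```

A caller that sends `"on"` instead of `True` is not following the rules, so no interfacing takes place and the plant stays as it was.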
Many believe that JSON or XML is what makes an API, but that would be as arrogant as saying that English is what makes a language. English is just one choice of language. If one speaks English badly, no one will understand; similarly, if information is modelled badly in JSON, it will be incomprehensible to computers and humans alike.
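Here is a small Python sketch of what badly modelled JSON can look like. Both documents below are made up for illustration, and both are perfectly valid JSON; the difference lies in how much unwritten knowledge the reader needs.

```python
import json

# Valid JSON, but the meaning lives in positions and packed strings.
badly_modelled = json.loads('{"d": ["Ada", "36", "1815-12-10|London"]}')

# Equally valid JSON, with each piece of information named and typed.
well_modelled = json.loads(
    '{"name": "Ada", "age": 36,'
    ' "birth": {"date": "1815-12-10", "city": "London"}}'
)

# The consumer of the first document must know unwritten rules.
city_from_bad = badly_modelled["d"][2].split("|")[1]

# The consumer of the second document only follows the names.
city_from_good = well_modelled["birth"]["city"]

print(city_from_bad, city_from_good)
```

Both lookups return the same city, but only the second can be written by someone who has never spoken to the author of the data.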
A Work in Progress
It is 2021, and the world has already seen a change where computers provide services not only to other computers but also to humans. For example, computers are now responsible for all of our monetary transactions, our COVID vaccines were designed with the help of computers, a computer is the key to our cars, and robots now perform surgery on us. The term API no longer seems sufficient to explain how computers are being interfaced today.
One may argue that this kind of interfacing between a computer and a human is a solved problem called a User Interface, and they would be right to an extent. User interfaces have been very successful at showing humans information in a visual format that is easily understood. A user interface acts as a transparent conversion from the data representation of a computer to the data comprehension of a human. But there is a catch!
In the last decade or so, the utility of computers has risen one abstraction level up. It is no longer just a network of information, as Tim Berners-Lee suggests in his proposal.
Nowadays, computers operate as a collective to provide services to humans, even critical ones.
Is it still ok to call it an API or User Interface? I don't know! But I have a strong urge to call it something different so that we don't make untrue assumptions based on historical understanding. While the real intellectuals explore what the correct word is for this new age of Computer-Human Service Interface, we can maybe call it a CHSI for lack of a better term.
I believe that what makes a good interface is still a big question that many of us are trying to answer, but here are a few questions we can ask ourselves when designing an interface to make it a good one.
- What is the service being provided by the interface? Many interfaces are unclear about what service they provide; is it an API or a CHSI? Does it provide information, or does it control its environment?
- Is it easy to comprehend and parse the information? APIs tend to get optimised for humans rather than computers, and sometimes CHSIs become optimised for computers rather than humans.
- Have the rules been clearly defined? A few of these rules could be around authentication and authorisation or the limits under which the API/CHSIs must be used.
- What happens in case of inadequate interfacing? Does the receiver or transmitter explain what is wrong? Is it similar to how two people conversing with each other give cues when they don't understand something?
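To make the last two questions concrete, here is a minimal Python sketch of an interface that states its rules and gives a cue when a caller breaks one. Everything in it is hypothetical: the service, the key check standing in for authentication, and the call limit standing in for usage limits.

```python
class RuleViolation(Exception):
    """Raised with a message that tells the caller which rule was broken."""


class GreetingService:
    """Hypothetical interface with explicit rules for identity and usage limits."""

    def __init__(self, valid_keys, max_calls):
        self._valid_keys = set(valid_keys)
        self._max_calls = max_calls
        self._calls = {}  # calls made so far, per key

    def greet(self, api_key, name):
        if api_key not in self._valid_keys:
            raise RuleViolation("unknown api_key: request a key before calling greet")
        used = self._calls.get(api_key, 0)
        if used >= self._max_calls:
            raise RuleViolation(f"limit of {self._max_calls} calls reached for this key")
        if not isinstance(name, str) or not name:
            raise RuleViolation("name must be a non-empty string")
        self._calls[api_key] = used + 1
        return f"Hello, {name}!"


service = GreetingService({"k1"}, max_calls=1)
print(service.greet("k1", "Ada"))
try:
    service.greet("k1", "Bob")
except RuleViolation as cue:
    print(cue)  # the interface explains what went wrong
```

The error messages play the role of the raised eyebrow in a conversation: they tell the other side which rule did not match, instead of failing silently.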
I will continue exploring this idea and asking more questions until we all find the answer to what makes a good API or a good CHSI. Until then, here are a few useful links that I stumbled upon while writing this article and that you might be interested in.
Disclaimer: The ideas and opinions in this post are my own and not the views of my employer.