The Office Analogy of a Software Application | by Aidan Donnelly


Aug 26, 2023
I like analogies! They sometimes help me to communicate technical topics to non-technical people. What’s under the hood of a software application can be a total mystery to most people. We all use apps on our phones and on our desktops, like YouTube, Spotify, photo galleries, and blog posting websites like this one. But underneath the surface, the technology that provides these applications is not understood by most of the people who use them. That’s not a criticism, it is just a feature of any technology: the users don’t know how the technology is made. Most of us wouldn’t know where to start making even an ancient technology like a metal axe, let alone a modern technology like an automobile.
So in this digital age, I have family, friends and colleagues who struggle to understand what is behind a software application and how it works. Even though I work for a technology company, my colleagues can struggle to understand. Perhaps they work in product specification or design, or they have a background in sales, marketing or customer support. All of these really great colleagues doing excellent work can find it difficult to understand the details of why our software may be suffering from downtime, or may have performance issues. Even semi-technical colleagues can ask seemingly good questions and make suggestions: “Can you add more servers?” “What about a bigger database?” “What if we split the application into multiple smaller ones, that will help, right?”
I have found that analogies can help to explain things to a wider audience. In this article I use the analogy of a software system as a pre-computer-age office block, where the employees, the files, and the letters and phone calls they receive are the equivalents of modern technical infrastructure and software systems.
So, let’s begin. It’s the 1960s, and computers haven’t yet reached the office. A burgeoning HR company called People Corp occupies a 60-story building downtown in a large city. Business is booming, and the business is the outsourcing of human resource processes for small companies that cannot afford a dedicated team of HR professionals. They would like someone else to manage the paperwork involved when you recruit, sign contracts, evaluate performance, pay salaries and track absences of employees. All kinds of files or ‘records’ exist for these activities. The company has more and more customers every year as folks see the value of outsourcing their HR processes and no longer having to deal with all that paperwork.
Behind the office reception, and the pictures of the company executives, and posters of customers shaking hands with the company owner, there are frantic people getting the company’s work done. Let’s dive into how things happen…
The company gets instructions from customers through the mail, which is delivered constantly to the 1st floor reception. The mail contains all kinds of requests from customers to update their records, query their records, and so on. Customer files are all stored in the basement filing room. The business departments are spread out over different floors of the 60-floor building. So when a letter is received in reception, or a call is made to the 2nd floor telephone exchange desk, it has to be routed to the correct department on the correct floor, and then to the right worker within that department, who then answers the phone (or opens the envelope) and understands what the customer is asking for. The worker, let’s call him Sam, now has a task to perform.
Sam embarks on a journey to perform his task. His journey may involve visits to multiple other floors. He sometimes gives sub-tasks to other colleagues if he needs information from their department. They then go on their own journeys. Usually the workers have to go to the basement to consult the filing cabinets. They may have to walk up and down the filing room looking for the right cabinet, and then for the right file in the cabinet. Sometimes the layout of the files and cabinets is helpful: they are stored alphabetically. Sometimes the layout is ad hoc and files are hard to locate. Once the worker finds the file they were looking for, it might be an easy task to update the file there and then and put it back in the filing cabinet. But often, the task is more complicated. They have to call in on the colleagues they asked for other information, and eventually return to their own desk on their own floor of the building.
They might have to sit down with all the files they have collected and edit particular files, or create new files, and so on. In the worst-case scenarios, the task is complex, and only when they have finished the first part of the task do they realize they have to kick off another journey through the office in order to finish the customer request, perhaps again visiting other departments and making multiple additional trips to the filing room. Worse still, sometimes a worker gets back to their desk and just has to wait while their colleagues finish the sub-tasks that provide the information they need. Worse again, some files are sensitive, and the company does not allow the workers on the higher floors to enter the filing rooms. Only a special group of people (filing clerks) are allowed in there, so requestors have to relay the information they want to the filing clerks, and wait for the clerks to return from the filing room.
When Sam has finished his task he has to put all the files back where they belong and tell his colleagues he is finished, so that they can also put their related files back in the filing room in the basement. Eventually Sam will write a letter back to the customer, or call them back. But given how long the tasks can take, it could be days or even weeks before the customer receives a response.
The process of completing tasks can be very inefficient, but it works, and it has worked for a long time. The company has been successful and customers are happy with their service. However, as the company has grown, more floors have been added to the building, along with more files and more workers. There are times when a lot of letters are delivered at once, usually on Monday morning. There are also times when a department changes its process definition, and the change causes more work than is necessary. For example, a task may now require an extra piece of information to be sought or updated, and so need four trips to the basement filing room instead of two. There are also times when a new department is born, with a new service for customers. This means more files, and more workers trying to move up and down the floors and into the filing room in the basement.
In these busy times in the building, the underlying inefficiency of the work processes can suddenly magnify and cause the company to struggle to do anything. The whole building can be clogged with workers and it actually becomes pointless to open any letters from customers, because there are no workers available to take the tasks.
How do you even get around a 60-floor office building? Elevators! The building was built with eight elevators which can hold 20 people each. For many years, this was just fine, and nobody had to wait too long. Lately, and especially at the busy times of the week, these elevators become full, or take a long time to arrive, or stop at every floor because there are so many workers using them. The filing room also becomes full of people and it’s hard to move around and see the files. Sometimes workers are searching through the exact cabinet that another worker needs and they have to wait for them to finish. Sometimes the file a worker needs has been taken by someone else and they need to wait for it to return. This used to happen rarely, but now that the company is so busy, it is happening more and more. Workers are also reporting that the colleagues they depend on are not available because they too are busy with other customer tasks. They are not at their desks, so workers waste time just waiting for them to return.
Now that you understand how the company works, what do you think the company management should do to make things go more smoothly? Let me help you by listing some options:
- Plan to move the company into separate buildings and a separate department per building so that there is less contention on space and resources. Departments can scale to the space they need without impacting other departments. Departments may also want to invent their own work processes and maybe even move the filing cabinets directly beside the desks of the people who use those files most often.
- Stay in the same building but change the process by which tasks are handled so that it requires fewer people, fewer interactions, less travel around the building and ultimately less time to fulfill a customer request.
- Stay in the same building but add more infrastructure and people to manage the infrastructure better so that it scales to the usage peaks better: e.g.
  - Add more elevators, probably to the outside of the building because there are no more elevator shafts inside.
  - Extend the basement filing room or make a 2nd filing room so more people can fit inside.
  - Add a 3rd filing room which is only accessible through a dedicated elevator.
  - Add a 4th filing room for the department which uses the filing room the most.
  - Add a 5th filing room which is a special one where you can ask for a photocopy of an original file from the 1st filing room only.
  - Hire people to make sure that all the filing rooms are correctly signposted, cabinets and files are in alphabetical order.
  - Hire people to continually re-sort the filing cabinets and files.
  - Hire people to guide people through the filing rooms.
  - Hire people to tell workers to hurry up if they are spending too long looking through a filing cabinet, and record their name and report them for slow work.
  - Hire experts who just know stuff and can tell the workers what they want to know without needing to go search through the files.
  - Hire more workers in the mail room: more people to open letters and know what department to send them to.
Unfortunately this is an article and I can’t hear your thoughts on these options, or hear any other ideas you may have to solve the problems of the company. You might laugh at some of the options above, or you might consider some as valid but short term solutions, or you might see that some will only work in the long term and will require a great investment of time and money.
I will not decode the analogy fully, and spoil the exact mappings of this story to a software system but suffice to say the elevators represent the compute infrastructure, the filing rooms are the storage layers or databases, the letters are customers clicking on actions in the user interface of the applications, and the workers running around the building are the code paths (threads of requests) through the computer programs.
Some of the options to improve represent technical options like adding more compute (or computers), or continually increasing the sizes of individual databases, or adding read replica databases, caching layers, more database administrators, more support staff, or system and site reliability engineers to constantly look for low-level tweaks and improvements.
In a real, distributed or monolithic SaaS application, these options work for limited periods but at a certain scaling point become only marginally helpful — because no matter how much compute and storage you add the performance of the system is ultimately a function of the process of how the company works. If that process is inefficient, you can get away with it when you have fewer customers and low rates of requests. But as you grow, an inefficient process quickly becomes so time consuming that it prevents tasks from being done in any reasonable time, which in turn blocks other tasks from executing due to contention on resources such as database tables.
In our analogy the time it takes the worker to complete the task is a function of the way the building is laid out, the travel time around the building, the dependencies on other people in the building, and how the files are accessed and changed. It doesn’t matter how fast the filing system is, or how many filing rooms are available, or how easy it is to get an elevator to the filing rooms, it is the fact that you have to visit the filing room so often which makes the overall process slow.
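To put rough numbers on this, here is a minimal sketch in Python. It is only a toy cost model, not a measurement of any real system: the function name `task_time_ms` and every figure in it are assumptions I made up for illustration. Each round trip to the database stands for one visit to the filing room, the network latency is the elevator ride, and the query time is the search through the cabinets.

```python
# Hypothetical cost model: all names and numbers are illustrative assumptions.

def task_time_ms(round_trips, network_ms, query_ms, compute_ms):
    """Total time for one request.

    Every trip to the database pays the network latency (the elevator ride)
    plus the query time (finding the file). compute_ms is the desk work.
    """
    return round_trips * (network_ms + query_ms) + compute_ms


# The process as it stands: four trips to the filing room.
baseline = task_time_ms(round_trips=4, network_ms=10, query_ms=5, compute_ms=20)

# Same process, but the filing room is five times faster to search.
faster_database = task_time_ms(round_trips=4, network_ms=10, query_ms=1, compute_ms=20)

# Same filing room, but the process now needs a single trip.
fewer_trips = task_time_ms(round_trips=1, network_ms=10, query_ms=5, compute_ms=20)

print(baseline, faster_database, fewer_trips)  # 80 64 35
```

With these made-up figures, a filing room that is five times faster saves 16 ms per task, while changing the process from four trips to one saves 45 ms. The exact numbers do not matter; the shape of the result is what the analogy is trying to convey.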
That doesn’t mean that adding more computers and storage is not important. It can be the most critical short-term work that we can do to ensure that we have the capacity to support the existing customer load and the process by which that load is handled. It’s also important to keep the underlying data access as fast as it can be (because it will affect overall performance if it is slow). But the converse is not true: making the infrastructure capacity larger and making the infrastructure faster does not make the overall process of servicing customer requests faster. You can see that the business processes in the analogy are an incredibly inefficient waste of workers’ time. And the process depends on physical limitations: elevators, filing rooms, cabinets and the filing clerks who staff them. At a certain point, adding more of these doesn’t help the overall situation because the workers are going to be idle as they contend for access to the resources: the elevators and the filing rooms. That is the exact point where many software systems struggle to regain performance. They are on a logarithmic curve where the most gains have already been made and every additional effort has diminishing returns.
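The contention effect can also be sketched with a very simple bottleneck bound. This is my own simplification, not a model of any particular system: the function name `max_tasks_per_hour`, the rule that each filing room serves one worker at a time, and all of the timings are assumptions for illustration. The bound says the building can never finish more tasks than the workers have time for, nor more than the filing rooms can serve.

```python
# Hypothetical bottleneck bound: all names and numbers are illustrative assumptions.

def max_tasks_per_hour(workers, filing_rooms, minutes_in_filing_room, minutes_elsewhere):
    """Upper bound on tasks completed per hour.

    Assumes each task needs a fixed time inside a filing room (a shared
    resource serving one worker at a time) plus a fixed time elsewhere.
    """
    # Limit set by how many workers there are to carry tasks.
    worker_limit = workers * 60 / (minutes_in_filing_room + minutes_elsewhere)
    # Limit set by how much filing-room time exists in an hour.
    room_limit = filing_rooms * 60 / minutes_in_filing_room
    return min(worker_limit, room_limit)


# Hiring more workers stops helping once the single filing room is saturated.
for workers in (1, 2, 3, 4, 10):
    print(workers, max_tasks_per_hour(workers, 1, 20, 40))

# Ten workers: double the infrastructure, or change the process?
second_room = max_tasks_per_hour(10, 2, 20, 40)     # more infrastructure
shorter_visits = max_tasks_per_hour(10, 1, 5, 40)   # fewer, shorter visits
print(second_room, shorter_visits)  # 6.0 12.0
```

In this sketch the fourth worker adds nothing, and neither do the next six, because everyone is queuing for the one filing room. Building a second room doubles the ceiling, but changing the process so that each task spends a quarter of the time in the filing room raises it further without building anything.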
In the analogy and in real life, the first two options are the key solutions, but they require a much bigger overhaul of business process and thinking to achieve. That takes time, and sometimes requires talking to customers to explain that a feature they like can’t be delivered in the same way they always liked it. Maybe a large report with many columns worked great when it had 100 rows in it, but now there are 100,000 rows and it is slow to load. In reality, managing customer expectations is the slowest part of changing a software application, especially if your customer depends on you to run their business and therefore can’t easily accept that this process has to change on their side too. So the next time you are waiting more than a few seconds for a report to load, or you click a button on your favourite app and see a spinning wheel, spare a thought for the engineers who have thrown everything they can at the problem. They may be waiting for the business to simplify the processes, so they can speed up the software that implements those processes.