Software development is an intellectual activity. As such, it relies on people’s creativity and ability to solve problems and come up with good solutions to implement complex features. But as your teams grow, aligning everyone around a common goal can be challenging. Set up a command-and-control chain with too many processes and checkpoints, and you’ll end up with constrained, procedure-oriented teams that produce dull solutions. You may meet your SLAs but run the risk of never shipping exciting features.
On the other hand, let engineers loose, allowing them to work on their desired projects as they see fit, and you may see incomplete solutions that don’t resemble a cohesive product. Maybe you have a comprehensive set of development guides and standards, but these are seldom helpful in a low-alignment scenario.
So how can we make sure we’re all heading in the same direction, with a common understanding of what constitutes a robust implementation? It turns out that many open-source software projects face the same challenges, with potentially hundreds, or in some cases thousands, of developers with varying backgrounds and views about the project, all working on the same codebase.
So we turned to the well-known PEP 20, also known as The Zen of Python, to come up with our own set of principles and beliefs that should guide our design decisions towards what we think makes up mature and scalable software and teams. We call it The Zen of Engineering.
As the name implies, this should be taken as a contemplative text, and we encourage everyone to come up with their own interpretations and apply them to their everyday design decisions. The principles are arranged in no particular order and should be open to personal understanding, since they can be applied to a myriad of situations as you stumble across development dilemmas. We introduce them to every new engineer we onboard at Pismo to instil these principles from the outset. We do this openly: the principles are not set in stone, and we revisit them from time to time.
I’ll present them here along with some possible interpretations so we can see how these principles can be applied to a wide range of situations, avoiding some of the obvious ones. But remember these are just suggested ways of reading through those lines.
One thing to note here is that we’re not trying to avoid complexity altogether, considering we work in an inherently complex field. But still, we should strive for simplicity. Simple solutions tend to be more performant and maintainable. Ask yourself how easy (or difficult) it would be for you to come back to your code six months from now. Would it be a challenge to implement new features or fix a bug there? If another team member were to take over the development of this feature, would she have a hard time doing so?
Although we should aim for simplicity in designing our applications, simple solutions are not to be mistaken for naive implementations. You cannot expect an overly simplified solution to solve a complex problem. When refining your implementation, you may have to do some fine-tuning, but the simplistic alternatives are usually pretty obvious.
Even though we should embrace complexity, over-engineering and unnecessary layers of domains, services, or interdependencies can hinder the maintenance and performance of an application. In a sense, simplicity could be viewed as a search for the least amount of complexity.
We could read this in different ways. The first one is pretty straightforward: we should be explicit about our intentions in code. We should name a variable according to its intended use. We could leverage a language’s type system to clarify the expected inputs and outputs. But we could also apply this principle to domains beyond the source code and how it is architected. We as professionals should also be explicit about our intentions. Did I make myself clear in the last refinement meeting? Did every developer and project owner understand all the project’s requirements? This explicit mindset also leads to more scalable code and teams.
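As a minimal sketch of the code-level reading, compare an implicit function with an explicit one. Both function names, the late-fee scenario, and the rounding rule are assumptions made up for illustration; they do not come from any real codebase.

```python
from decimal import Decimal


# Implicit: the reader has to guess what `d` and `r` are and what comes back.
def calc(d, r):
    return d + d * r


# Explicit: names and type hints state the intent, inputs, and output.
def balance_with_late_fee(balance: Decimal, late_fee_rate: Decimal) -> Decimal:
    """Return the balance plus a proportional late fee, rounded to cents."""
    late_fee = (balance * late_fee_rate).quantize(Decimal("0.01"))
    return balance + late_fee


new_balance = balance_with_late_fee(Decimal("100.00"), late_fee_rate=Decimal("0.02"))
print(new_balance)
```

Both functions do roughly the same arithmetic, but only the second one can be read and reviewed without opening its call sites.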
Every API-based service deals with resource states and methods to change these states according to a given set of business rules. But having to deal with state changes scattered across your logic flow can be a code smell. Putting the business context inside a mutable hashmap and passing it around, adding new data to it, removing keys, changing values, and having your request handler magically spit out your response quickly leads to cumbersome code that’s hard to maintain and debug.
On the other hand, having strict schemas with strongly typed objects and passing just the right amount of values (not references!) to communicate state across the different domains makes your code easier to understand and fosters component decoupling. We should also consider that, as a RESTful product, our services are mostly I/O-bound, so the cost of copying values between method calls should be negligible. Even then, if copying values starts to become a concern because large data structures are being passed around, we should take a step back and evaluate whether we designed our domains properly in the first place.
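One way to picture this, as a hedged sketch rather than a description of any real service: each step receives a small, frozen, typed object and returns a new one instead of editing a shared dictionary. The class names, fields, and the authorisation rule below are hypothetical.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class AuthorizationRequest:
    account_id: str
    amount_cents: int


@dataclass(frozen=True)
class AuthorizationResult:
    account_id: str
    approved: bool
    available_limit_cents: int


def authorize(request: AuthorizationRequest, available_limit_cents: int) -> AuthorizationResult:
    """Build a new result object; neither input is modified."""
    approved = request.amount_cents <= available_limit_cents
    remaining = available_limit_cents - request.amount_cents if approved else available_limit_cents
    return AuthorizationResult(request.account_id, approved, remaining)


request = AuthorizationRequest(account_id="acc-123", amount_cents=2_500)
result = authorize(request, available_limit_cents=10_000)

# Frozen dataclasses reject in-place mutation of their fields.
mutation_blocked = False
try:
    request.amount_cents = 0
except FrozenInstanceError:
    mutation_blocked = True
```

Because every field is declared up front and cannot be reassigned, a reader can tell what flows between domains by looking at the signatures alone.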
The first thing that comes to mind when reading this one is the benefit of scaling your infrastructure horizontally instead of vertically. But we can have alternative interpretations as well, because this can hint at more profound meanings around how we work as a team. Having horizontal relationships among team members positively impacts the quality of interactions that lead to better feature design. This means we should be open to listening to others, taking in suggestions, and making sure every squad member has a chance to voice their opinion. It also means having meaningful one-on-one meetings, asking for and giving honest feedback, and being frank and straightforward when dealing with problems and potential blockers. In other words, being soft on people and hard on issues.
We could think of this as another way of stating how a task’s lifecycle could benefit from an agile perspective, with several small, incremental deliveries that each carry high business value. But we could apply those same aspects to how we design our systems. For instance, large amounts of data could be processed at once or in small batches. Would the latter scale better by sharing resources and avoiding long-lived lock mechanisms? Should we schedule a batch job, or should we go for a more reactive design?
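To make the small-batches option concrete, here is a minimal sketch of a helper that splits any iterable into fixed-size chunks so that each chunk can be processed (and any lock released) independently. The helper `in_batches` and the toy `settle` function are illustrative assumptions, not part of an existing library.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def in_batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def settle(batch: List[int]) -> int:
    """Stand-in for real work on one batch: here it just sums the values."""
    return sum(batch)


totals = [settle(batch) for batch in in_batches(range(1, 8), batch_size=3)]
print(totals)
```

Since the helper is a generator, only one batch is held in memory at a time, which is the property that makes this approach attractive for large inputs.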
Have you ever encountered those try/catch/finally blocks in which different errors are treated as one single kind of failure? These can become hard to debug as your code grows in complexity. Every error is unique and carries context about its surrounding state, and as such, should be treated uniquely. Go encourages this by design, since errors are ordinary return values that the caller is expected to check. Although this can get quite verbose, dealing with errors where they happen is a good pattern to follow. But this principle goes well beyond error handling in code and has a human dimension to it. At some point, a bug will certainly sneak into production. Most bugs will have minor effects on your system, but now and then, one will cause a major headache.
We design resilient systems that minimise the chances of seeing those critical situations, but none are 100% effective. A robust design accepts this as a fact and leverages its context to improve the process continually. Here at Pismo, we don’t go witch-hunting to apportion blame. Instead, we acknowledge that there may be loopholes hidden somewhere in our development processes. And the people who deployed a critical bug are precisely the ones best placed to lead the effort to close these loopholes and to come up with strategies for making sure these scenarios do not recur.
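Back on the code side of this principle, the sketch below shows one way to treat each failure as its own case rather than collapsing everything into a single catch-all. The exception classes, the `debit` function, and the status codes chosen for each case are hypothetical and only serve the illustration.

```python
from typing import Tuple


class InsufficientFundsError(Exception):
    """Raised when the balance cannot cover the requested amount."""


class AccountBlockedError(Exception):
    """Raised when the account is not allowed to transact."""


def debit(balance_cents: int, amount_cents: int, blocked: bool = False) -> int:
    """Return the new balance, or raise an error that says exactly what went wrong."""
    if blocked:
        raise AccountBlockedError("account is blocked")
    if amount_cents > balance_cents:
        raise InsufficientFundsError(
            f"balance {balance_cents} cannot cover {amount_cents}"
        )
    return balance_cents - amount_cents


def handle_debit(balance_cents: int, amount_cents: int, blocked: bool = False) -> Tuple[int, str]:
    """Map each failure to its own response instead of a generic error."""
    try:
        new_balance = debit(balance_cents, amount_cents, blocked)
    except InsufficientFundsError as error:
        return 402, str(error)
    except AccountBlockedError as error:
        return 403, str(error)
    return 200, f"new balance: {new_balance}"
```

Each `except` branch keeps the context of the specific failure, so whoever debugs the handler can see which rule was violated instead of a generic "something went wrong".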
These last two are closely related, approaching the same concept from different ends. This one means that if you’re having a hard time explaining a feature implementation, you probably don’t fully understand it yourself. Verbose documentation full of conditional explanations suggests that your domains are not well defined, or that your design is complicated, over-engineered, or even entirely wrong. If it takes too many words to explain something, we may have to go back to the drawing board.
On the other hand, you may find that you can easily put your features into simple words. That’s a good thing, since it may indicate that your solution was well thought out and that you fully understand all its implications. This alone is not sufficient to call it a good idea, but it certainly hints that you’re heading in the right direction.
And how about you? What is your take on these principles?