“Why say Why?” — Team Autonomy and Systems Evolution

Adapting trade studies in agile environments as a strategic catalyst for team autonomy and evolutionary excellence in engineering systems.

May 27, 2024

Plans fail for lack of counsel, but with many advisers, they succeed. —Proverbs 15:22

Have you ever stumbled across a line of code that didn’t make sense to you? It might have been syntactically correct, but you just couldn’t figure out why it was there. A good example is this:

public Response ValidateCustomer(ValidationRequest request)
{
    if (request.Amount == 0) request.Amount = 10000;
    // ... some other code
}

This code is clear in what it is doing, but it’s not clear why it was done.

Why aren’t we rejecting the request as a validation error?
Why set it to 10000? Why not 700?

The Git History:

You check the git history and see that the code was written by an engineer
that you respect. This means that it’s probably not random. The commit message reads:

hotfix(ABCD-123): Due to a broken flow on the mobile application, we do not
have the payment amount for some transaction types at this point. An amount
less than 10000 would lead to validation failures on the third party.
The fix would be effected in ticket(124) by the mobile team and deployed
in the next release.

Commit Message:

Reading this message, you realize that this was a hotfix for an issue that couldn’t be resolved at the source (i.e., the mobile app).
Having this extra information, you are now empowered to make the code 1% better. You simply have to verify that the issue has indeed been fixed and released, then you can remove the logic.

The Power of Contextual Cues:

At the code level, comments and commit messages serve as contextual cues to future readers, letting them know why certain decisions were made and the conditions under which they can be undone or improved. The power of these cues in evolving our code cannot be overstated.

A chain of thought started to go through my mind:

How can we translate the usefulness of this concept at a higher level?
What is the equivalent of this in designing systems?
What is the application of this in building teams?

Decision Making

Few things give engineering leaders more peace of mind than an autonomous team. Decision making is an integral part of autonomy, and being the sole decision maker in a team can be burdensome.

This article will explore a key practice that can transform the ability of your systems to evolve and train your team to be better at making decisions and acting autonomously.

If you work in an organization with a high degree of specialization, this article may not be directly relevant, as specialized teams already apply these practices in finer detail. However, if you wear multiple hats (like me, acting as an architect, engineering manager, technical product owner, and/or principal engineer), then adapting this into your team’s engineering process will serve the team well, especially in highly agile organizations such as startups or teams in their nascent stages.

Systems Thinking

Engineering leaders are constantly working with systems of people, processes, and technologies. Therefore, it is important for them to have a good understanding of systems thinking. To gain this understanding, I explored the discipline of systems engineering, particularly in mission-critical spaces.

Definition according to Wikipedia

Systems engineering is an interdisciplinary field of engineering and engineering management that focuses on how to design, integrate, and manage complex systems over their life cycles. At its core, systems engineering utilizes systems thinking principles to organize this body of knowledge.

Systems engineering is typically used for large-scale systems in critical
areas with significant financial and human implications. This is common in defense, space programs, and manufacturing plants. Following the entire engineering process results in properly documented and successful systems. It is a very thorough process.

After studying this discipline for hours, I discovered three key concepts that I was not familiar with but found to be quite interesting and relevant to my daily work as an engineering manager and architect:

Requirements Engineering
Trade Studies
Traceability

Trade Studies

Trade studies are decision-making activities used to identify the most acceptable technical solution among a set of proposed solutions.

Trade studies are an evaluation process that identifies a course of action
from multiple alternatives, leaving you with a transparently documented
decision and the rationale behind it.

Trade studies are something we do every day. For example, we might decide:

which toothpaste to use
which school to attend
which technology stack to build a new software application on.

Our daily lives are full of examples of trade studies.

How do we carry out trade studies?

There are five steps to conducting a trade study:

Identify the requirements or objectives. Clarify the requirements from the stakeholders. This could be in the form of a Product Requirement Document, a Solution Requirement Specification, or a high-level Software Design Specification. The objective of the trade study should be clear.
Define the evaluation criteria. These are quantitative or qualitative assessment points used to determine which solutions meet the requirements. They may be given to you, or you may draft them based on the requirements and constraints of the project. However they are chosen, they should always be agreed upon by the stakeholders.
Identify alternatives. Brainstorm possible paths that could lead you to your solution. Create an extensive list and then narrow it down to the best 2–4 alternatives.
Evaluate alternatives based on the criteria. Consider each alternative critically and judge it against the evaluation criteria.
Choose the best alternative. Compare and rank the results of your evaluation and determine the best alternative. This is a collaborative process, so if a clear solution cannot be determined, including more eyes could help narrow down the decision.
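
The inputs gathered in the first three steps can be written down as a small record before any evaluation starts. The following is only a minimal sketch in Python, not part of any standard trade-study tooling: the `TradeStudy` class, its fields, and the `readiness_issues` helper are hypothetical names invented for this illustration, and the sample values borrow the database example used later in this article.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TradeStudy:
    """Hypothetical record of the inputs to a trade study (steps 1 to 3)."""

    objective: str                                          # step 1: what must be decided
    criteria: List[str] = field(default_factory=list)       # step 2: evaluation criteria
    criteria_agreed: bool = False                           # step 2: stakeholder sign-off
    alternatives: List[str] = field(default_factory=list)   # step 3: the shortlist

    def readiness_issues(self) -> List[str]:
        """Return the reasons the study is not yet ready for evaluation (step 4)."""
        issues = []
        if not self.objective.strip():
            issues.append("objective is missing")
        if not self.criteria:
            issues.append("no evaluation criteria defined")
        elif not self.criteria_agreed:
            issues.append("criteria not yet agreed with stakeholders")
        if not 2 <= len(self.alternatives) <= 4:
            issues.append("shortlist should contain 2 to 4 alternatives")
        return issues


study = TradeStudy(
    objective="Choose a database that supports a 50% increase in user load",
    criteria=["Ease of scale", "Pricing"],
    alternatives=["PostgreSQL", "MongoDB", "CockroachDB"],
)
print(study.readiness_issues())
```

In this sketch the study is flagged because the stakeholders have not yet agreed on the criteria, which mirrors the point in step two that the criteria should be settled before alternatives are judged against them.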

Trade studies are a lot more than just a few simple steps. I would encourage you to do some more research on the topic. This article only explores the adaptation of the practice to a team with a mindset of “shipping fast.”

Since we are constantly building new features, products, and systems, it makes sense to have a framework that makes it easy for engineers to document and also think deeply about the trade-offs we make in fulfilling requirements.

Decision Making Framework

As part of every design document for a feature, system or component, a section for decision making is mandated. This section in the document would explore all the following:

Technical Decisions: Any choice on how a certain requirement can be delivered (e.g., choice of technology stack, choice of database).
Functional Decisions: Any choice about what the system should do.
Simplification Decisions: Decisions on reducing complexity of the system.
Operational Decisions: Choices on the day-to-day management and maintenance of a system.

Building any system requires trade-offs, so many decisions will be made. Not every decision needs to be documented. Record only the important ones so that the document keeps its focus.

Requirement

One of the three concepts from Systems Engineering that I found interesting is traceability. It is important to always reference the decision to a requirement, making it easy to understand where this evaluation stemmed from. To make the framework more concise, the Requirement segment can actually be removed, and the Problem Header can be a link to the requirement.

To put this into practice, we would build off of this requirement from the product team:

The system must be able to scale seamlessly and cost effectively to accommodate a 50% increase in user load within the next year.

Problem

This is probably the most important piece of information here. How the problem is phrased largely determines how effective the exploration is and whether the next engineer or architect will find it relevant. In line with the requirement, describe what made it necessary to explore alternatives.

A problem statement could be:

The scalability of a system is largely dependent on the database design. What would be the most effective database to meet the scaling demands?

Alternatives

Now is the best time to explore the database options. First, cast off restraint and just list them out. Then, you can narrow them down to the top 2 to 4 alternatives that closely meet the demands.

In this case, we could explore:

PostgreSQL
MongoDB
CockroachDB

Criteria

The key criteria for evaluating the alternatives can then be listed. Prioritize the criteria that directly impact the requirement, then the criteria that may affect other requirements, and finally the constraints that apply to the project as a whole.

Ease of scale
Pricing
Data Modelling Compatibility
Operational Overhead
Performance
Ease of Integration

Reasoning/Evaluation

At this stage, the different alternatives are evaluated against each criterion, weighing the pros and cons. It is mandatory to document each alternative, whether or not it is selected. There are three methods I’ve explored for presenting these evaluations.

Simple Explanation

The justification for choosing an option and why the other was not chosen can be simply explained in terms of the important considerations. For example, if a database is too expensive, we can simply state:

OPTION 3: Although CockroachDB is easy to scale, it is not within our budget 
for this project.

Pros and Cons

Listing the pros and cons of each alternative is a good way to present the evaluations, especially when there are only a few criteria. As more criteria emerge, the presentation becomes more cumbersome. Only the criteria that differ for each option are evaluated.

OPTION 1: POSTGRESQL
Pros:
It is an open-source database, so it is cost-effective due to no licensing 
costs.
Cons:
It is more complex to implement scaling needs, as sharding must be considered.

OPTION 2: MONGODB
Pros:
Horizontal scalability is built in, making it simple to scale with increasing 
demands.
The plans are cost-effective, as we can operate with less than $5 based on 
current projections.
Cons:
It is difficult to run complex queries and aggregations.

Criteria Evaluation Matrix

The tabular matrix uses scores to determine how each alternative performs against each criterion. The weighting of the criteria varies depending on their priority. For example, let’s use two criteria and assign them weights:

Pricing: 1.0
Ease of Integration: 0.3

Let’s say PostgreSQL scores 4/5 on pricing and 3/5 on ease of integration, while CockroachDB scores 3/5 on pricing and 5/5 on ease of integration. The aggregate scores would then be:

PostgreSQL: (4*1.0) + (3*0.3) = 4.9
CockroachDB: (3*1.0) + (5*0.3) = 4.5

Based on these scores, PostgreSQL is the stronger choice.
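
As the number of criteria and alternatives grows, the same arithmetic is easier to keep honest in a few lines of code. The sketch below is one possible way to compute the weighted totals in Python. It reuses the illustrative weights and scores from the example above, and the `weighted_score` function name is an assumption made for this illustration.

```python
def weighted_score(scores, weights):
    """Sum each criterion's score multiplied by that criterion's weight."""
    return sum(scores[criterion] * weight for criterion, weight in weights.items())


# Illustrative weights and 1-to-5 scores taken from the example in the text.
weights = {"Pricing": 1.0, "Ease of Integration": 0.3}
scores = {
    "PostgreSQL": {"Pricing": 4, "Ease of Integration": 3},
    "CockroachDB": {"Pricing": 3, "Ease of Integration": 5},
}

# Round to two decimals to hide floating-point noise in the totals.
totals = {name: round(weighted_score(s, weights), 2) for name, s in scores.items()}

# Highest total first.
ranking = sorted(totals, key=totals.get, reverse=True)

for name in ranking:
    print(f"{name}: {totals[name]}")
```

Keeping the weights in one place makes it easy to show reviewers how sensitive the ranking is: changing a single weight and rerunning the script reveals whether the preferred option still comes out on top.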

Decision

A clear statement of the chosen alternative is presented, highlighting what it achieves and the risks that have been accepted. This can be presented in an easy-to-understand manner. The reasoning behind each alternative’s evaluation would have already been expressed, so the reader can go over it to determine why it was chosen.

However, I find Architectural Decision Records (ADRs) to be a very useful technique for documenting the decision. An ADR gives the reader enough context to understand why an option was chosen without having to work through the full evaluation when that is not required.

The formats for ADRs include:

Short Format:

In the context of <use case/user story/requirement>, facing <concern> we decided for <option> to achieve <quality>, accepting <downside>.

Long Format:

In the context of <use case/user story/requirement>, facing <concern> we decided for <option> and neglected <other options>, to achieve <system qualities/desired consequences>, accepting <downside/undesired consequences>, because <additional rationale>.

Decision

In the context of ensuring the scalability of our system, facing the dual concerns of accommodating a 50% increase in user load within the next year and maintaining a low cost, we decided for MongoDB to achieve optimal scalability and cost efficiency, accepting the non-relational constraints of the database.
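
If the team writes many of these records, a small helper can keep the wording consistent and refuse to produce a record when a part, such as the accepted downside, has been left out. This is a rough sketch in Python of the short format only. The `short_adr` function and its parameter names are hypothetical, and the field values are taken from the example decision above.

```python
def short_adr(context, concern, option, quality, downside):
    """Fill the short ADR format, rejecting any blank part."""
    parts = {
        "context": context,
        "concern": concern,
        "option": option,
        "quality": quality,
        "downside": downside,
    }
    missing = [name for name, value in parts.items() if not value.strip()]
    if missing:
        raise ValueError("missing ADR parts: " + ", ".join(missing))
    return (
        f"In the context of {context}, facing {concern} "
        f"we decided for {option} to achieve {quality}, "
        f"accepting {downside}."
    )


statement = short_adr(
    context="ensuring the scalability of our system",
    concern="a 50% increase in user load within the next year and the need to keep costs low",
    option="MongoDB",
    quality="optimal scalability and cost efficiency",
    downside="the non-relational constraints of the database",
)
print(statement)
```

The check for blank parts is the useful piece here: a decision with no recorded downside usually means the trade-off has not been thought through yet.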

Tips on having an effective trade study

Order matters. Write your decisions in a logical order, such as in the order of their dependencies or unveiling. For example, if you need to decide on a technology stack before choosing a database, then write about the technology stack first.
Use an interactive editor. Use tools like Confluence that make it easy to give feedback and collaborate on decisions. This will help you keep track of the context of each decision and make it easier to trace back to corresponding statements, such as requirements or other decisions.
Bring the decision closer to the code. Use Markdown files to store architectural decisions alongside the code whenever possible. This is not always easy, as these decisions are often high-level and cut across multiple components. However, you should still consider storing them in the repositories they cut across, so they are easy to find when needed.
Keep it organized. Make sure your documents are well-organized, using tables, bullet points, and headings as needed. This will make them easier to read and understand.
Proofread. Proofread your decisions carefully to ensure that they are accurate and free of errors. A simple typo or an irrelevant sentence could make it difficult for the next reader to understand the thought process behind the decision.
Collaborate and review. Set up review processes to ensure that at least one other person reviews your decisions. If possible, have a senior, a junior, and a cross-functional team member review them.

Conclusion

Agility requires adaptation and evolution. Documenting the reasons behind decisions, not just the decisions themselves, makes it easier to evolve our systems in a seamless way. It is essential for engineers to document the choices they make for the long-term health of their systems. By providing contextual cues to future readers, we can empower them to make the code and systems we build better.

As an engineering leader, every choice you make in your engineering process is not only about efficiency, but also about developing the team’s values and skillset. Applying trade studies in your engineering process trains engineers to make decisions, which eventually leads to a more autonomous team.

Disclaimer: The techniques used in this article were not created or designed by me. They were only leveraged to solve a problem.

Feel free to leave a comment on how you handle decisions in your own team and the impact it’s had on the culture of the team and the evolution of the systems 😊😊.

Thank you for reading Engineering Leadership with Tobe (ELT). This post is public so feel free to share it.
