It is All The Same

The first step to understanding API testing is to understand that it is still the same as testing anything else. I have noted the primary differences to keep in mind below, but beyond these you will want to apply the testing principles you would for any other application (eg, performance testing, integration testing).

Identify Users

Users of APIs consist of two parts - the actual consumer (typically another app, such as a mobile app, or a website), and the developers writing the consumers.

Know who your targeted consumers are, the same way you would know your target users.

Acknowledge the User Interface

The user interface for an API consists of 2 primary things - the documentation and the response messages (both good and error).

A developer writing a consuming app needs to be able to use the documentation and response messages to fix any mistakes they have made and know when it is successful - the same way a web site would guide a user through a task.

User Flows

APIs are always written in service of something. There are tasks they enable the users of the consuming products to do. While API calls can be made independently of each other, there still exist primary user flows they enable. Identify these flows and understand how they fit together.

Automated Testing Basics

Automate the above section. Put yourself in your users’ shoes and use the documentation and responses to guide your path. If you have to ask the developers who worked on the part you are automating against, that’s a UI bug which should be addressed.

As the users of the API are primarily developers and their apps, automating tasks and interacting with the API the same way they do is one of the most important things you can do.

Further Resources

API Tester is my attempt to automate myself out of ever doing boilerplate API testing ever again.

It checks for many common API issues, such as whether error messaging is consistent, by making a variety of calls based on how you tell it the API works (ideally based on the API’s documentation).

While as of writing this API Tester’s DSL is rough to use, the concepts it represents are valuable ones any API Team should consider.

I subscribe to this idea: “Every test has a purpose and every concern has a layer” (if this phrase sounds nonsensical to you, please review the Swiss Cheese Model, which sounds nice and lovely).
As a result, I hereby propose we purge the phrase “Functional Tests” from our collective vocabulary. It is a phrase which tells you nothing other than that there are tests and they test functionality. Likewise, I propose we stop considering “Integration Tests” as its own test type and instead consider it an umbrella term for a test category.
What I am getting at is that those two phrases fail to convey meaning. What is the purpose of a test which belongs in the “Functional” suite? How do you know a test belongs there? When implementing functionality, how do you know when you need to write a functional test?

As a result, I think that asking unit vs integration vs functional is asking the wrong question.

Name test suites after their purpose. What is the testing protecting? What concern is the test addressing?

Note, I do usually keep unit as it is a term the industry recognizes and has some level of agreement, but I define unit as a test whose purpose is to protect and document the functionality of code, specifically in a line or method. To me the focus of these tests is the ease of writing, the speed of running, and the stability/reliability of the test. If it needs mocks to achieve that, then use mocks. If it needs to be a “social” unit test to achieve that, have it be social. The only exception is I do not allow internet connectivity in my unit tests as internet connections always slow them down. I ultimately believe that unit tests are written for the express purpose of letting developers know when they have made unintended changes to their code and whatever method allows for the quickest and most reliable tests of this is that project’s version of a unit test.

“Functional Tests” is the worst term as most tests are technically functional tests. Instead ask yourself - is this a user flow test? Am I protecting my own API contract? Is this a usability test?

For all tests, instead of “functional”, name them after their purpose. For example, frequently the tests named “Functional” are actually a variant of “User Flow” tests. Their purpose is to test the ability of the system to handle expected user journeys and interactions. By changing the name to “User Flow” thinking about these tests becomes easier. When implementing functionality, I can ask “what user flows have changed, been introduced, or removed as a result of this?” With that question, I am well armed to make appropriate changes to my User Flow suite in response to a card.

As another example, integration is another umbrella term. Many of these tests are concerned with contract or usability, or might not even need to exist. If you check the Swiss Cheese Model, you will see that the architecture section highlights that one should identify and characterize integrations. Those characteristics then point to what integration tests you would want. For example, if you have a dependency which is under development and volatile, you might consider a contract test to ensure that development does not break your project. Another example is if your dependency has frequent downtime, you might want to consider some form of monitoring or heartbeat. Finally, many integration tests are written without a purpose beyond “we integrate, therefore integration test”. These tests take time away from what could be useful tests. Identify your concerns and test those. Testing for the sake of testing helps no one.

It is much easier to enforce test boundaries by thinking of test purpose first.

Every test has a purpose and every concern has a layer.

Each test suite represents a layer (and things other than test suites can be layers), so make your layers based on what sort of concerns/problems you expect that layer to be able to catch.

*I recognize the irony of this post given this website has a (not yet fleshed out) Test Types section. That section is intended to demonstrate various test types from a purpose view point to give inspiration and examples for use in developing a test plan.

Note: This is my personal definition and represents how I approach my work, not necessarily what various companies in the industry hire for. While I feel like this is something companies should incorporate into their own QA positions, I recognize this is not the case everywhere.

A QA, within a software development team, is the Quality Advocate first and foremost. It is their job to ensure the quality and work required is balanced with speed of delivery. A QA will create and maintain the test strategy such that major concerns are addressed, bugs are found, and the customer experience is known.

A QA should know what the critical user flows for customers are, as well as knowing and understanding personas of users. They should be able to leverage this knowledge to understand how a change could impact users and how users can potentially read documentation.

A QA should know and understand the architecture of the product. They should be able to leverage this knowledge to identify points of concern (eg, integrating with a volatile dependency).

A QA seeks cooperation, not confrontation. The test strategy is everyone’s responsibility to add to and is owned by the team. The QA acts as the shepherd for the strategy - if there’s a problem with the strategy (eg, one layer takes too long to be useful), the QA will ensure it is fixed (which might involve the rest of the team, or accepting a solution from a team member). The QA will ensure quality related metrics are tracked and communicated when the metrics mean something has to change.

A QA can troubleshoot problems found (or do root cause analysis) and assist the developers in fixing it. They should also be able to understand how to change the test strategy to ensure such a problem is caught sooner in the future, or know when such a change costs more than it is worth.

A QA can learn how to appropriately leverage the rest of their development team when they need help. For example, if the QA workload has gotten large enough they feel they are falling behind, QAs can identify tests or other automated tasks developers can aid with. On the other hand, if the QA workload is particularly light, the QAs can look at what tests / automated tasks they can write to lower the developer workload. A QA should learn what tasks can easily be distributed, and which people on their team need a little training or prep to aid with.

A QA will analyze upcoming stories to understand how they affect the user flows and what is needed to test the stories. They will use this knowledge to prepare what they can ahead of time (eg, datasets) and start reaching out for the items they need.

A QA will build a test strategy which maximizes exploratory time via automation and other means. Exploratory tests are usually some of the most important forms of testing, but regression should never be neglected. QAs use their knowledge covered in other sections to identify critical areas for regression and how to implement the regression in an effective way.

My Definition

Quality is informed confidence. If my product is high quality, I should be confident it is doing the right things in the right way, and I should be able to prove this.

How to Accomplish


Identify the sources which would cause a lack of confidence - risks to the product, or risks the product could cause to the business (eg, does it add an attack surface to security?).

Learn about the concerns the stakeholders have regarding the product and what it would take to assuage those concerns. Sometimes they stem from the stakeholder’s own experiences you might not have considered, sometimes from a lack of knowledge.

Finally in analysis, you want to identify the goals of the project and ways to measure or otherwise prove it is doing its job.


The test strategy should follow one simple goal: “Every layer has a purpose and every concern has a layer”.

For the items identified in analysis, you want to ensure everything is covered in some way.

The strategy should be put together in such a way that you can pick up any piece of work and easily identify the needed tests using it.


Reiterating what was covered in analysis - what metrics can be placed around the product to prove it is both working and doing the right things?

Example Tech metrics

  • Uptime

Example Business metrics

  • Usage
  • Time required to do task (ensure you collect this metric prior to solution being implemented as well if relevant)

If you have read the Swiss Cheese Model section and are familiar with agile, you might have noticed something missing. In fact, if you read many of my posts or have worked with me, you will probably notice I rarely refer to the pyramid despite its ubiquity within the agile world.

The pyramid model is just that, a model. Models are useful for predicting and acting as a guide for decisions, but all models have limitations and understanding their limitations helps you make better decisions and understand when you need a new model.

I personally feel the pyramid is at its best in particular conversations – specifically, the cost vs. effect conversation.

If you look at the standard model:

Standard Test Pyramid

The test pyramid has a clear concern – certain test types cost more than others due to a combination of the length of the feedback cycle, stability, maintenance, brittleness, etc – so you should have more of the inexpensive tests and less of the expensive ones.

This focus makes the pyramid an effective tool for asking the question of where a test should be (the place which gives it the best cost v. effect ratio). This is a good thing to keep in mind and use within your own strategy building, but there is more to a strategy than its cost and feedback cycle.

The pyramid itself is a good rule of thumb for what was the standard project when it was created (websites with backends and maybe an outside dependency or two). However, the world of IT has moved in very different directions since its inception. Many projects exist in worlds far beyond the single website world (eg, microservices). The pyramid is an effective tool for some conversations, but the greater strategy for these projects needs a new model.

Similarly, many projects utilize tools which are great for their use case, but those tools do not support unit tests which fit the pyramid (eg, data transformations in Apache Spark). While ideally we pick tools which support unit testing, this is not always possible and we should be able to adapt to the situation. These projects also need a model beyond the test pyramid.

The testing realm of IT needs a flexible model which can handle the fact projects are not simple and tooling is not always ideal. We can keep the principles of the test pyramid (cost effectiveness and quick feedback) close while understanding there is more to a quality strategy.

What makes a good error message?

A. Informs the user what went wrong
B. Gives the user technical information
C. Assures the user things will be looked into
D. Tells the user what to do

Think about it before looking at the answer. Who is it for? What are they meant to do with the information? Why might they encounter these messages?

Think about your own experiences with things you use – what do you do with error messages? What do you wish you had been told in the error message?

If all of this led to you guessing D, congratulations!

None of the other options are helpful to the user. A and B can be harmful to the application, as they can expose information which could be exploited and so create security risks. C can be misleading. Only D gives the user actions they can take.

Sometimes the only action you can tell them is to try again or contact customer service and that is okay! The point behind these messages is to empower the user. If the error is something they can fix, they should know that. If the error is something they cannot fix, they should know that too!

Whenever you encounter an error message, you should always ask yourself – “What is this telling me?” and “Is this enough information for an average user to make a reasonable action?”
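As a minimal sketch of the idea above, the error shown to the user can be reduced to the action they should take, with technical details kept out of it. The failure codes, the wording, and the `to_user_action` helper are all hypothetical, made up for illustration only.

```python
# Hypothetical internal failure codes mapped to the action a user can take.
# Technical details (stack traces, hostnames) belong in logs, not in this text.
USER_ACTIONS = {
    "invalid_email": "Check the email address for typos and submit the form again.",
    "payment_declined": "Try a different card or contact your bank.",
}

# Used when the user cannot fix the problem themselves; they should know that too.
FALLBACK_ACTION = "Try again in a few minutes, or contact customer service."


def to_user_action(internal_code: str) -> str:
    """Translate an internal failure code into something the user can act on."""
    return USER_ACTIONS.get(internal_code, FALLBACK_ACTION)


print(to_user_action("invalid_email"))
print(to_user_action("database_unreachable"))
```

The fallback is deliberate: even an unknown failure still tells the user what to do next.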

Error messages are an important part of the user experience, whether you are working with a GUI, API, or some other form of customer interaction.


User flows give the team a big picture idea of what they are working towards – how the API will fit into a client program. Even APIs which will be consumed by multiple clients can benefit from understanding these flows.

These are models of how we would expect the user to use the system, screen by screen. The screens don’t need to be high fidelity (eg, a black box labeled “Login Screen” can easily suffice for a login), just a visual representation of where a particular screen would be. Connecting the screens would be transitions where any called services would be pointed out. You do not have to map all the user flows, but having your critical flows mapped can make following conversations easier.

For example, a flow involving logging in would only call out the happy path login. We would have sad paths in separate documentation unless the sad path is complicated and important enough to warrant drawing a flow around it.

These can be used to aid priority sessions (eg, which flows are highest priority of enablement), determine what is and isn’t out of scope for that particular discussion, and help with sharing context to new team members. This section will be referenced in later sections and I personally believe this is the most important suggestion.


In an API project, your primary customers are the teams consuming the API. Ideally, you would treat your documentation as your User Interface and as such, it should get just as much attention and testing as everything else.


The Swagger documentation tool enjoys high popularity. It results in pretty documentation which does come from the code, but it has the same problem comments do: it gets out of date. Simply generating documentation from code just gives objects, and the docs still need to be extended by hand. Additionally, the docs can still be inaccurate.


The ideal for documentation is a way to auto generate it, similar to Swagger, but connected to tests. The resulting documentation will still need some editing by hand to truly be useful, but coming from tests increases confidence in the docs.



Most API projects will be paired with a team building some sort of client, typically mobile. Sometimes this team only exists in theory, other times they will be already working, and other times they will be a project which starts up at the same time.

Best case scenario, you will be able to work closely together to define things like the contract, can test against each other easily, and have a good working relationship.

This part of the document would be completely unnecessary if the best case scenario ever happened. If you are on a unicorn project, skip this section and bravo to you.

Chances are pretty high an upstream team will lag behind, will not be co-located (or cannot work closely with you for some other reason), and will have a slightly different understanding of various parts than your team. This is understandable and gives the API team both an edge in influencing the upstream team and a concern about how things will look once the upstream team finishes.

The top recommendation for working with an upstream team is to develop some full stack user flows and integration tests which can run on every check in; this should partially alleviate concerns about interoperability. Getting a set of Pact tests going, where the two teams agree on the contract and set up independent tests to make sure both teams are obeying the contract at the same time, will help keep down integration problems.

Ultimately, understand the upstream team is your first and primary customer. They need to be able to use the API in an efficient manner which fits the flows they are enabling. When deciding on story priority order, it helps to understand (preferably using user flows) what features they are enabling next and what they will need from the API team.


Most API projects will also deal with one, if not multiple, dependencies. These dependencies could have volatile test environments, respond in unexpected ways, or, depending on the project, even have work going on in them while you are trying to develop against them!

In these instances, many of the above suggestions will still be helpful, but there are additional things you can do depending on which issues you are facing.


These dependencies will have a team actively working on them for various reasons. This team is hopefully an ally of some kind and will respond to requests, but even if the team isn’t actively helping yours, there are steps you can take.

Contract Tests are the big deal here. These tests simply check that the contract of the call you are using downstream has not changed. Ideally these tests would run every time the other team changes something, but if you do not have insight into that (which is normal), you can figure out approximately how often the other team is expected to push changes and schedule the tests based on that. Past projects have used every 2 hours during working hours, just once a day, and just once a week after the designated weekly “push time”.
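A contract test can be as small as comparing a response against the fields your own code actually reads. The sketch below assumes a hypothetical contract (`id`, `status`, `items`) and uses a canned response body in place of a live call so it runs offline; in a real suite the body would come from the dependency itself.

```python
from typing import Dict, List

# Hypothetical contract: the fields and types our code reads from the dependency.
EXPECTED_CONTRACT: Dict[str, type] = {"id": int, "status": str, "items": list}


def contract_violations(response_body: dict, contract: Dict[str, type]) -> List[str]:
    """Return human-readable contract violations (an empty list means none)."""
    problems = []
    for field, expected_type in contract.items():
        if field not in response_body:
            problems.append(f"missing field: {field}")
        elif not isinstance(response_body[field], expected_type):
            actual = type(response_body[field]).__name__
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}, got {actual}"
            )
    return problems


# Canned sample standing in for a live response from the dependency.
sample_body = {"id": 42, "status": "shipped", "items": [], "extra": "ignored"}
print(contract_violations(sample_body, EXPECTED_CONTRACT))
```

Extra fields are ignored on purpose: the check only fails when something the consumer depends on goes missing or changes type.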


The generic good practice is to set up some sort of health check monitoring which is capable of notifying the appropriate people when the dependency is down. Hopefully your team and whoever is supporting the dependency are able to work out a way to make it more reliable in the future.
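One possible shape for such a health check, sketched with stand-ins: `probe` is whatever call tells you the dependency is alive and `notify` is whatever reaches the right people. Both are hypothetical parameters supplied by the caller, not part of any particular monitoring tool.

```python
from typing import Callable


def check_dependency(
    probe: Callable[[], bool],
    notify: Callable[[str], None],
    name: str,
) -> bool:
    """Run one health probe and notify when the dependency looks down."""
    try:
        healthy = probe()
    except Exception as exc:  # a probe that raises counts as "down"
        healthy = False
        detail = f"{name} health check raised {type(exc).__name__}"
    else:
        detail = f"{name} health check reported unhealthy"
    if not healthy:
        notify(detail)
    return healthy


# Usage with stand-ins: collect notifications in a list instead of paging anyone.
alerts = []
check_dependency(lambda: False, alerts.append, "billing")
print(alerts)
```

Passing the probe and the notifier in as functions keeps the check itself testable without a network or a paging system.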


APIs are unique in that the lack of GUI does make it easier to run reliable performance tests against them. Most good practices from general software development still apply here (eg, have a separate environment, use the same data set for repeat runs for reliable results) with the addendum that if you can procure an environment just for testing, performance testing can absolutely be part of the pipeline.

Even if you cannot procure such an environment, you can still use throughput tests against the same environment you use for other automated tests. These throughput tests simply notify if the time it takes to make a call has significantly changed in some way.
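A throughput check of this kind might look like the sketch below. The 50% tolerance and the function names are assumptions chosen for illustration; a real project would pick a threshold from its own recorded baselines.

```python
import time
from typing import Callable


def timed_call(call: Callable[[], object]) -> float:
    """Return how long a single call took, in seconds."""
    start = time.perf_counter()
    call()
    return time.perf_counter() - start


def changed_significantly(duration: float, baseline: float, tolerance: float = 0.5) -> bool:
    """True when duration differs from baseline by more than the tolerance fraction."""
    return abs(duration - baseline) > baseline * tolerance


# Usage: compare a measured duration with a previously recorded baseline (seconds).
print(changed_significantly(duration=0.31, baseline=0.20))
print(changed_significantly(duration=0.25, baseline=0.20))
```

The check flags calls that became much faster as well as much slower, since a sudden speed-up can also mean something changed, such as a call that now fails early.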

Paired with monitoring metrics in production, you can build a reasonable, layered approach to watching the performance of your project.


The biggest thing to remember with other teams working on the same project is everyone is on the same side. We all succeed if the project succeeds. It is really easy to get caught in the trap of acting like other teams are an enemy of some kind and develop a toxic relationship.


They suck. If you are capable of replacing them with something which works better, I want you to teach me your ways.

These serve a purpose where most API projects wind up dealing with an external testing team who find defects and are not sure where the defects should go. The goal for this meeting, as in all meetings, is to be done with as accurate results as possible.

On previous projects, I have heavily encouraged the testing teams to contact the API teams about defects which could potentially be from our side so we can investigate them prior to the meeting – if we are lucky, we can narrow down how many defects need to be discussed this way and limit the meeting to defects which are tricky, have business repercussions (eg, solving it would result in a change or the bug itself is the result of an unexpected scenario which should be handled) or are otherwise not easily dealt with.



Useful for manual testing, sharing collections across the team, and the team behind it is adding many features.

Can attach a theoretical postman script of what should work to a story under development or to a defect


Useful for automated tests in Java due to its DSL.

On the surface, testing an API is similar to testing of other products – user flows, security, edge case behavior, etc, but different in that the first consumer of the API is a developer rather than a customer using a GUI of an app.

For this reason, part of the focus is on the dev experience – can a developer debug the problem with their code’s interaction with the API using the API error messages? Can they easily program their code to interpret errors? Can they use the documentation to figure out how to use the API for their purpose?

A lot of this is coverable by writing user flows for the API. By going through the experience of using it yourself, you can understand what the users will need to go through in order to accomplish their goals. Likewise, when it comes to debugging your user flows, always be looking for where the API is falling short on helping you. For example, when debugging the user flows, do the errors tell you when you are missing fields? Are you getting consistent error messages? Can you program your user flows against a consistent message format?
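The consistency question can itself be automated. This sketch assumes a hypothetical agreed error format with `code` and `message` keys, and canned error bodies in place of real responses; it reports which calls return errors in a different shape.

```python
from typing import Dict, List

# Hypothetical agreed error format: every error body carries these keys.
REQUIRED_ERROR_KEYS = {"code", "message"}


def calls_with_inconsistent_errors(error_bodies: Dict[str, dict]) -> List[str]:
    """Name the calls whose error body is missing any of the agreed keys."""
    return sorted(
        name
        for name, body in error_bodies.items()
        if not REQUIRED_ERROR_KEYS.issubset(body)
    )


# Canned error bodies standing in for real responses to deliberately bad requests.
sample_errors = {
    "POST /orders": {"code": "missing_field", "message": "Add the 'item' field."},
    "GET /orders/1": {"error": "not found"},
}
print(calls_with_inconsistent_errors(sample_errors))
```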

APIs are unique in that, unless they are asynchronous, they are easily covered by full stack automated tests. It is much easier to put a suite of maintainable tests around them than it is to put a similar suite around the end product (especially if that product is a mobile app). As a result, creating a large, robust suite of user flow tests (tests which attempt to accomplish goals a user of the end application would have) is a very good thing! You still want to make sure your tests give quick feedback and are reliable, but most APIs can have larger user flow test suites than UIs – whose user flow test suites can become flaky and slow.
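To make "user flow test" concrete, here is a small sketch. `FakeShopApi` is a hypothetical in-memory stand-in, invented so the example runs without a server; a real suite would make the same sequence of calls against the actual API.

```python
class FakeShopApi:
    """Hypothetical in-memory stand-in for a real API client."""

    def __init__(self):
        self._users = {}
        self._orders = []

    def register(self, email, password):
        self._users[email] = password
        return {"status": 201}

    def login(self, email, password):
        if self._users.get(email) != password:
            return {"status": 401, "action": "Check your email and password, then try again."}
        return {"status": 200, "token": f"token-for-{email}"}

    def place_order(self, token, item):
        if not token:
            return {"status": 401, "action": "Log in before placing an order."}
        self._orders.append(item)
        return {"status": 201, "order_id": len(self._orders)}


def new_customer_places_first_order(api):
    """User flow: register, log in, place an order. Returns the final response."""
    assert api.register("sam@example.com", "s3cret")["status"] == 201
    login = api.login("sam@example.com", "s3cret")
    assert login["status"] == 200
    return api.place_order(login["token"], "notebook")


print(new_customer_places_first_order(FakeShopApi()))
```

The test is named after the goal the end user has, not after the endpoints it touches, which is what separates a user flow test from a per-endpoint check.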

Do you need to know how to code to test an API?

Not completely, but it helps. Similar to how knowing what happens during air travel can inform testing an app intended for a traveler, understanding what a dev goes through helps with testing a tool intended for developers. I would highly recommend a QA write the user flows themselves, pairing with developers as desired, because the debugging process for the user flows IS part of testing the API.

Writing and maintaining the suite exposes the team to the same pain as consumers of their API will feel, specifically in the user flow portion of the test suite.

To begin your journey through quality, you need to first set up your machine. This post assumes you are looking at a full stack website.


To start with, you need to have the OS itself ready to go. Ideally your base for testing supports all the tooling you need. If you need to test things in a specific OS, VMs are generally good enough. These days, companies like Microsoft even offer free VMs for this purpose.


Your primary browser ought to be one which matches your target users and all browsers your team plans to support should be thoroughly tested. However, even if you are not supporting them, other browsers ought not reveal or allow anything untoward.

For example, you will want to use the Lynx browser to see what happens in an all text environment.

You will also want to ensure that, in one of your normal browsers, it is easy to turn off JavaScript, Flash, etc. Likewise, it should be easy to move security settings from nothing to most secure.


Most of the major browsers come with useful features for testing. Ensure your primary browser has the following features, either built in or accessible through add ons, and be comfortable accessing them. Later posts will go in depth on ways to use them.

View page source
Manipulate page source
JavaScript console
Network calls with response headers & bodies, request headers & bodies, and timing
Local storage & cookies viewing and manipulation
Mobile emulation
Ability to change user agent


There are a number of useful addons out there which will make testing far easier. Here’s a list of ones I commonly use:

Link Redirect Trace is a useful extension showing you all the hops through a redirect loop
Wave Evaluation Tool is a quick way to analyze the accessibility of your website


Additionally, tools outside the browser can greatly expand your capabilities

Burp Suite has a number of useful tools, allowing you to do things like crawl a website or use its proxy abilities to modify incoming and outgoing requests
Postman is very useful as you can build up a collection of requests and use its variable and environment capabilities to easily work directly with APIs
VM Box for testing functionality in other OSs.
Any document for tracking what you are doing. Specifically, if there is something repetitive you cannot automate, writing it down is a good thing to do.