This article, despite its innocent title, provoked such a lengthy discussion on Stack Overflow that we could not ignore it. The author set out to grasp the immensity, that is, to explain competent REST API design clearly, and apparently succeeded in many ways, though not completely. In any case, we hope to rival the original in the amount of discussion, and to join the army of Express fans.
Enjoy reading!
REST APIs are one of the most common types of web services available today. They allow various clients, including browser applications, to exchange information with a server.
Therefore, it is very important to design a REST API correctly so that we don't run into problems down the road. We have to consider security, performance, and ease of use for API consumers.
Otherwise, we create problems for the clients that use our API, which is frustrating and annoying. If we do not follow common conventions, we confuse both the maintainers of the API and the clients that use it, since the architecture will differ from what everyone expects.
This article looks at how to design REST APIs so that they are simple and understandable for everyone who consumes them. We will also make them durable, secure, and fast, since the data they serve to clients may be confidential.
Since a networked application can fail in many ways, we must make sure that errors in any REST API are handled gracefully and accompanied by standard HTTP status codes that help the consumer deal with the problem.
Accept JSON and return JSON in response
REST APIs should accept JSON for the request payload and also send JSON responses. JSON is the standard for transferring data. Almost every networked technology can use it: JavaScript has built-in methods for encoding and decoding JSON, whether through the Fetch API or another HTTP client. Server-side technologies have libraries that decode JSON with little or no effort on your part.
There are other ways to transfer data. XML is not widely supported by frameworks; we usually have to convert the data ourselves into something more convenient, which is usually JSON. On the client side, especially in the browser, this data is not easy to work with. We end up doing a lot of extra work just to get normal data transfer.
Form data is convenient for sending data, especially if we want to send files. But for text and numbers, we do not need form data, since most frameworks can transfer JSON without additional processing: we simply read the data on the client side. This is the most straightforward way to work with it.
To ensure that the client interprets the JSON received from our REST API as JSON, set the Content-Type response header to application/json after the request is made. Many server-side application frameworks set this response header automatically. Some HTTP clients look at the Content-Type response header and parse the data according to the format specified there.
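As a rough illustration of the client behavior described above, the sketch below parses a response body as JSON only when the Content-Type header says so. The helper name parseByContentType is hypothetical and not part of any library; real HTTP clients implement their own rules.

```javascript
// Hypothetical helper: decide how to interpret a response body
// based on the value of its Content-Type header.
function parseByContentType(contentType, bodyText) {
  // Keep only the media type, dropping parameters such as "; charset=utf-8".
  const mediaType = (contentType || '').split(';')[0].trim().toLowerCase();
  if (mediaType === 'application/json') {
    return JSON.parse(bodyText);
  }
  // Anything else is returned untouched as text.
  return bodyText;
}

const parsed = parseByContentType('application/json; charset=utf-8', '{"id": 1}');
console.log(parsed.id); // 1
```

Without the header, the same body would simply be treated as a string, which is why setting Content-Type correctly on the server matters.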
The only exception is when we send and receive files between the client and the server. Then we need to handle file responses and send form data from the client to the server. But that is a topic for another article.
We also need to make sure that our endpoints return JSON as the response. Many server-side frameworks have this feature built in.
Let's take an example of an API that accepts a JSON payload. This example uses the Express backend framework for Node.js. We can use the body-parser middleware to parse the JSON request body and then call the res.json method with the object we want to return as a JSON response. This is done like this:
const express = require('express');
const bodyParser = require('body-parser');

const app = express();

// Parse JSON request bodies into req.body
app.use(bodyParser.json());

// Echo the parsed payload back as a JSON response
app.post('/', (req, res) => {
  res.json(req.body);
});

app.listen(3000, () => console.log('server started'));
bodyParser.json() parses the JSON request body string into a JavaScript object and assigns the result to req.body.
The res.json method sets the Content-Type header in the response to application/json; charset=utf-8 without any extra work on our part. The approach shown above applies to most other backend frameworks.
Use nouns instead of verbs in endpoint paths
Endpoint paths should contain nouns, not verbs. The noun represents the entity that the endpoint retrieves or manipulates.
The reason is that the HTTP request method already contains the verb. Putting verbs in API endpoint paths is not useful; it makes the path unnecessarily long and adds no valuable information. The choice of verb would also depend on the developer's whim. For example, some prefer 'get' and some prefer 'retrieve', so it is better to let the familiar HTTP GET verb say what the endpoint does.
The action should be indicated by the HTTP method of the request we are making. The most common methods are GET, POST, PUT, and DELETE.
GET retrieves resources. POST submits new data to the server. PUT updates existing data. DELETE removes data. Each of these verbs corresponds to one of the CRUD operations.
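The correspondence between HTTP methods and CRUD operations described above can be written down as a small lookup table. This is only a sketch; the names CRUD_BY_METHOD and describeRequest are made up for illustration.

```javascript
// Mapping of HTTP methods to CRUD operations, as described in the text.
const CRUD_BY_METHOD = {
  POST: 'create',
  GET: 'read',
  PUT: 'update',
  DELETE: 'delete',
};

// Hypothetical helper: describe what a request does from its method and path.
// The verb comes from the method, so the path itself only needs a noun.
function describeRequest(method, path) {
  const operation = CRUD_BY_METHOD[method.toUpperCase()];
  if (!operation) {
    throw new Error(`Unsupported method: ${method}`);
  }
  return `${operation} ${path}`;
}

console.log(describeRequest('GET', '/articles'));    // read /articles
console.log(describeRequest('put', '/articles/42')); // update /articles/42
```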
Considering the two principles discussed above, to get news articles we should create a route of the form GET /articles/. Similarly, we use POST /articles/ to add a new article and PUT /articles/:id to update the article with the given id. DELETE /articles/:id deletes the article with the given ID.
/articles represents a REST API resource. For example, you can use Express to do the following with articles:
const express = require('express');
const bodyParser = require('body-parser');

const app = express();
app.use(bodyParser.json());

// List articles
app.get('/articles', (req, res) => {
  const articles = [];
  // ...
  res.json(articles);
});

// Add a new article
app.post('/articles', (req, res) => {
  // ...
  res.json(req.body);
});

// Update the article with the given id
app.put('/articles/:id', (req, res) => {
  const { id } = req.params;
  // ...
  res.json(req.body);
});

// Delete the article with the given id
app.delete('/articles/:id', (req, res) => {
  const { id } = req.params;
  // ...
  res.json({ deleted: id });
});

app.listen(3000, () => console.log('server started'));
In the above code, we have defined endpoints for manipulating articles. As you can see, there are no verbs in the path names, only nouns. Verbs appear only in the HTTP methods.
The POST, PUT, and DELETE endpoints all accept a JSON request body, and all of the endpoints, including GET, return a JSON response.
Name collections with plural nouns
Collections should be named with plural nouns. It's not often that we need to take just one item from a collection, so we should be consistent and use plural nouns in collection names.
The plural is also used for consistency with naming conventions in databases. As a rule, a table contains not one but many records, and the table is named accordingly.
With the /articles endpoint, we use the plural form for all endpoints.
Nest resources for hierarchical objects
The paths of endpoints that deal with nested resources should be built by appending the nested resource as a path segment after the parent resource.
We need to make sure that what we treat as a nested resource matches the way the information is related in our database tables. Otherwise, confusion is possible.
For example, if we want an endpoint that returns the comments on a news article, we should append the /comments path to the end of the /articles path. This assumes that comments is a child entity of article in our database.
For example, you can do this with the following code in Express:
const express = require('express');
const bodyParser = require('body-parser');

const app = express();
app.use(bodyParser.json());

// Return the comments that belong to one article
app.get('/articles/:articleId/comments', (req, res) => {
  const { articleId } = req.params;
  const comments = [];
  // ... look up comments by articleId
  res.json(comments);
});

app.listen(3000, () => console.log('server started'));
In the above code, we handle the GET method on the path '/articles/:articleId/comments'. We get the comments on the article identified by articleId and then return them in the response. We add 'comments' after the '/articles/:articleId' path segment to indicate that it is a child resource of /articles.
This makes sense since comments are child objects of articles, and each article is assumed to have its own set of comments. Otherwise, this structure can confuse the user, since it is generally used for accessing child objects. The same principle applies to the POST, PUT, and DELETE endpoints. They can all use the same kind of nesting structure for their path names.
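To make the parent and child relationship concrete, here is a minimal sketch that uses an in-memory array in place of a database. The data, the getCommentsForArticle helper, and the commentsPath helper are all assumptions made for illustration.

```javascript
// Illustrative in-memory data; a real API would query a database.
const comments = [
  { id: 1, articleId: 1, text: 'Great read' },
  { id: 2, articleId: 2, text: 'Thanks for sharing' },
  { id: 3, articleId: 1, text: 'Very helpful' },
];

// Hypothetical helper: find the comments that belong to one article.
// Route parameters arrive as strings, so convert before comparing.
function getCommentsForArticle(articleId) {
  const id = Number(articleId);
  return comments.filter((comment) => comment.articleId === id);
}

// Hypothetical helper: build the nested path for an article's comments.
function commentsPath(articleId) {
  return `/articles/${encodeURIComponent(articleId)}/comments`;
}

console.log(commentsPath(1));                              // /articles/1/comments
console.log(getCommentsForArticle('1').map((c) => c.id)); // [ 1, 3 ]
```

The path mirrors the data model: each comment carries the articleId of its parent, and the URL places comments under that parent.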
Handle errors gracefully and return standard error codes
To avoid confusion when an error occurs in the API, handle errors gracefully and return HTTP response codes that indicate what kind of error occurred. This gives API maintainers enough information to understand the problem. Errors must not be allowed to crash the system, so the API cannot leave them unhandled; it should report them so that the API consumer can deal with them.
The most common HTTP error codes are:
- 400 Bad Request - indicates that the input received from the client failed validation.
- 401 Unauthorized - means that the user has not logged in and therefore is not allowed to access the resource. Typically, this code is returned when the user is not authenticated.
- 403 Forbidden - indicates that the user is authenticated but does not have permission to access the resource.
- 404 Not Found - means that the resource was not found.
- 500 Internal Server Error - a generic server error that probably should not be thrown explicitly.
- 502 Bad Gateway - indicates an invalid response from an upstream server.
- 503 Service Unavailable - means that something unexpected happened on the server side, for example, server overload or failure of some parts of the system.
We should return the codes that correspond to the error our application has encountered. For example, if we want to reject data that came in as a request payload, we should return a 400 response, as in the following Express API:
const express = require('express');
const bodyParser = require('body-parser');

const app = express();

// Existing users (an in-memory stand-in for a database)
const users = [
  { email: 'abc@foo.com' },
];

app.use(bodyParser.json());

app.post('/users', (req, res) => {
  const { email } = req.body;
  const userExists = users.find(u => u.email === email);
  if (userExists) {
    // Reject the payload with a 400 and an explanatory message
    return res.status(400).json({ error: 'User already exists' });
  }
  res.json(req.body);
});

app.listen(3000, () => console.log('server started'));
In the above code, the users array holds a list of existing users with known email addresses.
If we then try to send a payload with an email value that is already present in users, we get a response with a 400 status code and the message 'User already exists', indicating that such a user already exists. With this information, the user can correct the request by replacing the email address with one that is not yet on the list.
Error codes should always be accompanied by messages that are informative enough to fix the error, but not so detailed that attackers could use the information to steal our data or bring down the system.
Whenever our API fails to complete a request successfully, we should fail gracefully by sending error information that makes it easier for the user to correct the situation.
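One possible way to keep error messages informative but not overly detailed is to distinguish errors that are safe to show from everything else. The sketch below is an assumption about how this could be organized; ApiError and toErrorResponse are hypothetical names, not part of Express.

```javascript
// Hypothetical error type carrying an HTTP status and a message
// that is safe to show to API consumers.
class ApiError extends Error {
  constructor(status, publicMessage) {
    super(publicMessage);
    this.status = status;
  }
}

// Convert any thrown error into a { status, body } pair.
// Known ApiErrors keep their status and message; anything else becomes a
// generic 500 so that internal details are not leaked to the client.
function toErrorResponse(err) {
  if (err instanceof ApiError) {
    return { status: err.status, body: { error: err.message } };
  }
  return { status: 500, body: { error: 'Internal server error' } };
}

console.log(toErrorResponse(new ApiError(400, 'User already exists')));
console.log(toErrorResponse(new Error('connection to db-01 refused')));
```

In an Express application, a function like this could be called from an error-handling middleware, with the full error written to a server-side log instead of the response.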
Allow sorting, filtering and pagination of data
The databases behind a REST API can grow very large. Sometimes there is so much data that it cannot all be returned at once, as this would slow the system down or even bring it down. Hence, we need a way to filter items.
We also need a way to paginate data so that we return only a few results at a time. We don't want to tie up resources for too long by trying to fetch all the requested data at once.
Both filtering and pagination improve performance by reducing the use of server resources. The more data accumulates in the database, the more important these two features become.
Here's a small example in which the API accepts a query string with various parameters. Let's filter the items by their fields:
const express = require('express');
const bodyParser = require('body-parser');

const app = express();

// Sample employee data (an in-memory stand-in for a database)
const employees = [
  { firstName: 'Jane', lastName: 'Smith', age: 20 },
  //...
  { firstName: 'John', lastName: 'Smith', age: 30 },
  { firstName: 'Mary', lastName: 'Green', age: 50 },
];

app.use(bodyParser.json());

app.get('/employees', (req, res) => {
  const { firstName, lastName, age } = req.query;
  let results = [...employees];
  // Apply a filter only for the query parameters that were provided
  if (firstName) {
    results = results.filter(r => r.firstName === firstName);
  }
  if (lastName) {
    results = results.filter(r => r.lastName === lastName);
  }
  if (age) {
    // Query values are strings, so compare as numbers
    results = results.filter(r => +r.age === +age);
  }
  res.json(results);
});

app.listen(3000, () => console.log('server started'));
In the above code, we use the req.query object to read the query parameters. We extract the individual values into variables with JavaScript's destructuring syntax.
Finally, we apply filter for each query parameter that is present to find the items we want to return.
With this done, we return results as the response. Hence, when we make a GET request to the following path with a query string:
/employees?lastName=Smith&age=30
we get:
[
{
"firstName": "John",
"lastName": "Smith",
"age": 30
}
]
as the returned response, because the items were filtered by lastName and age.
Likewise, we can accept a page query parameter and return the group of records occupying positions from (page - 1) * 20 to page * 20.
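The page arithmetic above can be sketched as a small function. The paginate helper and its fallback to page 1 for invalid input are assumptions made for this illustration; the page size of 20 comes from the formula in the text.

```javascript
const PAGE_SIZE = 20; // page size taken from the formula above

// Hypothetical helper: return the records for a 1-based page number.
function paginate(items, page, pageSize = PAGE_SIZE) {
  // Query parameters are strings; fall back to page 1 for invalid input.
  const pageNumber = Math.max(1, Number.parseInt(page, 10) || 1);
  const start = (pageNumber - 1) * pageSize;
  return items.slice(start, start + pageSize);
}

// 45 sample records numbered 1..45
const records = Array.from({ length: 45 }, (_, i) => i + 1);
console.log(paginate(records, '2')[0]);     // 21
console.log(paginate(records, '3').length); // 5
```

Because slice stops at the end of the array, the last page is simply shorter, and a page past the end yields an empty array.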
We can also specify the fields to sort by in the query string. For example, we may need to extract the query string from a URL like this:
http://example.com/articles?sort=+author,-datepublished
Here + means ascending and - means descending. Thus, we sort by author name in alphabetical order and by datepublished from newest to oldest.
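A sort parameter in this format could be handled roughly as follows. This is a sketch under stated assumptions: parseSort and sortBy are hypothetical helpers, the sample articles are invented, and dates are assumed to be ISO strings so that plain string comparison orders them correctly. Note also that many query-string parsers decode an unescaped + as a space, so the sketch accepts a leading space as ascending too.

```javascript
// Hypothetical helper: turn "+author,-datepublished" into sort rules.
function parseSort(sortParam) {
  return sortParam
    .split(',')
    .filter((part) => part.trim() !== '')
    .map((part) => {
      const descending = part.startsWith('-');
      // Strip a leading "+", "-", or space (a decoded "+").
      const field = part.replace(/^[+\- ]/, '').trim();
      return { field, direction: descending ? -1 : 1 };
    });
}

// Sort a copy of the items by each rule in order of priority.
function sortBy(items, rules) {
  return [...items].sort((a, b) => {
    for (const { field, direction } of rules) {
      if (a[field] < b[field]) return -1 * direction;
      if (a[field] > b[field]) return 1 * direction;
    }
    return 0;
  });
}

const articles = [
  { author: 'Bob', datepublished: '2020-01-05' },
  { author: 'Alice', datepublished: '2020-02-01' },
  { author: 'Alice', datepublished: '2020-03-10' },
];
const sorted = sortBy(articles, parseSort('+author,-datepublished'));
console.log(sorted.map((a) => `${a.author} ${a.datepublished}`));
// [ 'Alice 2020-03-10', 'Alice 2020-02-01', 'Bob 2020-01-05' ]
```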
Adhere to proven security practices
Most communication between client and server should be private, as we often send and receive confidential information. Hence, using SSL/TLS for security is a must.
An SSL certificate is not difficult to load onto a server, and the certificate itself is either free or very cheap. There is no reason not to make our REST APIs communicate over secure channels rather than open ones.
People should not be able to access more information than they requested. For example, an ordinary user should not be able to access another user's information, nor should they be able to view administrators' data.
To enforce the principle of least privilege, we need to add role checks, either for a single role or with more granular roles for each user.
If we decide to group users into a few roles, the roles should have permissions that cover everything the user needs to do, and no more. If we define more granular permissions for each feature that users have access to, we need to make sure that an administrator can grant those permissions to any user and take them away. In addition, we need some predefined roles that can be applied to a group of users, so that the necessary permissions do not have to be set manually for each user.
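A role check of the kind described above might look roughly like this. The role names, permission strings, and helper functions are all hypothetical and serve only to illustrate the principle of least privilege.

```javascript
// Hypothetical role-to-permission table; the names are illustrative only.
const ROLE_PERMISSIONS = {
  admin: ['users:read', 'users:write', 'articles:read', 'articles:write'],
  editor: ['articles:read', 'articles:write'],
  reader: ['articles:read'],
};

// A user may perform an action only if one of their roles grants it.
function hasPermission(user, permission) {
  return user.roles.some((role) =>
    (ROLE_PERMISSIONS[role] || []).includes(permission)
  );
}

// Ordinary users may read only their own profile; users holding
// the 'users:read' permission may read any profile.
function canReadProfile(user, profileOwnerId) {
  return user.id === profileOwnerId || hasPermission(user, 'users:read');
}

const alice = { id: 1, roles: ['reader'] };
const root = { id: 2, roles: ['admin'] };
console.log(canReadProfile(alice, 1)); // true: her own profile
console.log(canReadProfile(alice, 2)); // false: someone else's profile
console.log(canReadProfile(root, 1));  // true: admin role grants users:read
```

Unknown roles grant nothing, so a typo in a role name fails closed rather than open.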
Cache data to improve performance
We can add caching to return data from a local in-memory cache instead of querying the database every time users request the same data. The advantage of caching is that users get data faster. However, the data may be outdated. This can also cause problems when debugging in production environments, when something has gone wrong and we keep seeing the old data.
There are a variety of caching options available, such as Redis, in-memory caching, and more. You can change the way data is cached as needed.
For example, the apicache middleware for Express adds caching to an application without complicated configuration. Simple in-memory caching can be added to the server like this:
const express = require('express');
const bodyParser = require('body-parser');
const apicache = require('apicache');

const app = express();

// Cache every response for five minutes
const cache = apicache.middleware;
app.use(cache('5 minutes'));

// Sample employee data (an in-memory stand-in for a database)
const employees = [
  { firstName: 'Jane', lastName: 'Smith', age: 20 },
  //...
  { firstName: 'John', lastName: 'Smith', age: 30 },
  { firstName: 'Mary', lastName: 'Green', age: 50 },
];

app.use(bodyParser.json());

app.get('/employees', (req, res) => {
  res.json(employees);
});

app.listen(3000, () => console.log('server started'));
The above code simply references the apicache middleware through apicache.middleware and then calls:
app.use(cache('5 minutes'))
which is enough to apply caching to the whole application. Here, for example, we cache all results for five minutes. This value can be adjusted later as needed.
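To show the idea behind time-limited caching, including why cached data can be stale, here is a minimal in-memory cache with a time-to-live. It is a separate sketch, not a description of how apicache works internally; createCache and the injectable clock are assumptions made so that expiry can be demonstrated without waiting.

```javascript
// Minimal in-memory cache with a time-to-live, for illustration only.
// The clock is injectable so expiry can be shown without real delays.
function createCache(ttlMs, now = () => Date.now()) {
  const entries = new Map();
  return {
    get(key) {
      const entry = entries.get(key);
      if (!entry) return undefined;
      if (now() >= entry.expiresAt) {
        entries.delete(key); // stale: drop it and report a miss
        return undefined;
      }
      return entry.value;
    },
    set(key, value) {
      entries.set(key, { value, expiresAt: now() + ttlMs });
    },
  };
}

let fakeTime = 0;
const cache = createCache(5 * 60 * 1000, () => fakeTime);
cache.set('/employees', [{ firstName: 'Jane' }]);
console.log(cache.get('/employees') !== undefined); // true: still fresh
fakeTime = 5 * 60 * 1000;
console.log(cache.get('/employees')); // undefined: expired after five minutes
```

Until an entry expires, callers keep receiving the stored value even if the underlying data has changed, which is the staleness trade-off mentioned earlier.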
API versioning
We should have different versions of the API if we make changes that could break clients. Versioning can follow semantic versioning (for example, 2.0.6 means that the major version is 2 and this is the sixth patch), as most applications do nowadays.
This way we can gradually retire old endpoints rather than forcing everyone to switch to the new API at the same time. We can keep the v1 version for those who do not want to change anything, and provide the v2 version with all its new features for those who are ready to upgrade. This is especially important for public APIs. They need to be versioned so as not to break third-party applications that use them.
Versioning is usually done by adding /v1/, /v2/, etc. at the beginning of the API path.
For example, here's how to do it in Express:
const express = require('express');
const bodyParser = require('body-parser');

const app = express();
app.use(bodyParser.json());

// Version 1 of the employees endpoint
app.get('/v1/employees', (req, res) => {
  const employees = [];
  // ...
  res.json(employees);
});

// Version 2 of the employees endpoint
app.get('/v2/employees', (req, res) => {
  const employees = [];
  // ...
  res.json(employees);
});

app.listen(3000, () => console.log('server started'));
We just add the version number to the beginning of the path leading to the endpoint.
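The effect of the version prefix can also be shown without a framework. In the sketch below, the handlers table, the route function, and the two response shapes are invented for illustration; the point is only that v1 clients keep getting the old shape while v2 clients get the new one.

```javascript
// Hypothetical handlers for two API versions; the response shapes are made up.
const handlers = {
  v1: { '/employees': () => [{ name: 'Jane Smith' }] },
  v2: { '/employees': () => [{ firstName: 'Jane', lastName: 'Smith' }] },
};

// Split "/v2/employees" into its version prefix and the remaining path,
// then look up the matching handler.
function route(path) {
  const match = /^\/(v\d+)(\/.*)$/.exec(path);
  if (!match) return { status: 404, body: { error: 'Unknown API version' } };
  const [, version, rest] = match;
  const handler = handlers[version] && handlers[version][rest];
  if (!handler) return { status: 404, body: { error: 'Not found' } };
  return { status: 200, body: handler() };
}

console.log(route('/v1/employees').body); // [ { name: 'Jane Smith' } ]
console.log(route('/v2/employees').body); // [ { firstName: 'Jane', lastName: 'Smith' } ]
console.log(route('/v3/employees').status); // 404
```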
Conclusion
The most important takeaway from designing high-quality REST APIs is to maintain consistency by following the standards and conventions of the web. JSON, SSL / TLS, and HTTP Status Codes are a must-have on the modern web.
Performance is equally important. We can improve it by not returning too much data at once. In addition, we can use caching to avoid querying for the same data over and over again.
Endpoint paths must be named consistently. We should use nouns in their names, since the verbs are already in the HTTP methods. Paths of nested resources must follow the path of the parent resource. They should communicate what we are retrieving or manipulating, so that we do not have to consult additional documentation to understand what is happening.