Working with unexpected data in JavaScript





One of the main problems with dynamically typed languages is that you cannot always guarantee a correct data flow, because you cannot force a parameter or variable to hold a non-null value, for example. In such cases, we tend to write simple code like this:



function foo (mustExist) {
  if (!mustExist) throw new Error('Parameter cannot be null')
  return ...
}


The problem with this approach is code pollution: you have to test variables everywhere, and there is no way to guarantee that every developer will always run this check, especially in situations where a variable or parameter must never be null. Often we don't even know that a parameter can be undefined or null. This happens a lot when different people work on the client and server sides, which is the vast majority of cases.



To improve this scenario a bit, I started looking for the best strategies to minimize the surprise factor. That's when I came across a great article by Eric Elliott. The purpose of this text is not to refute his article, but to add interesting information that I have discovered over time through my experience in JavaScript development.



Before I start, I would like to go through some of the points covered in that article and give my opinion as a backend developer, since the other article is more client-oriented.



How it all began



Problems with data handling can have several causes. The main one, of course, is user input. However, there are other sources of malformed data in addition to those mentioned in the other article:



  • Database records
  • Functions that implicitly return null data
  • External APIs


Each of these cases calls for a different solution, and later we will analyze each of them in detail, keeping in mind that none is a silver bullet. Most of the problems are caused by human error: in many cases the language is prepared to work with null or undefined data, but the ability to handle it may be lost while the data is being transformed.



User entered data



In this case, there is not much we can do. If the problem lies in user input, it can be solved with so-called hydration: we take the raw input that the user sends us (for example, as part of an API payload) and transform it into something we can work with without errors.
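A hydration step can be sketched in plain JavaScript. This is only an illustration: hydrateUser, toNumberOrNull and the field names (name, email, age) are hypothetical and do not come from any library.

```javascript
// Hypothetical helper: accepts numbers or numeric strings, returns null otherwise
function toNumberOrNull (value) {
  if (typeof value !== 'number' && typeof value !== 'string') return null
  if (typeof value === 'string' && value.trim() === '') return null
  const n = Number(value)
  return Number.isFinite(n) ? n : null
}

// Hypothetical hydration step: turns a raw, untrusted payload into a
// predictable object. The field names are assumptions made for illustration.
function hydrateUser (raw) {
  const input = raw !== null && typeof raw === 'object' ? raw : {}

  return {
    // Strings are trimmed; anything else becomes an empty string
    name: typeof input.name === 'string' ? input.name.trim() : '',
    email: typeof input.email === 'string' ? input.email.trim().toLowerCase() : '',
    // Numbers may arrive as strings ("42"); invalid values become null
    age: toNumberOrNull(input.age)
  }
}
```

After this step the rest of the code can rely on every field having a known type, whatever the user actually sent.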



On the backend, when using a web server like Express, we can handle all user input coming from the front end with standard tools such as JSON Schema or Joi.



An example of what can be done using Express and AJV is given below:



const Ajv = require('ajv')
const Express = require('express')
const bodyParser = require('body-parser')
 
const app = Express()
const ajv = new Ajv()
 
app.use(bodyParser.json())
 
app.get('/foo', (req, res) => {
  const schema = {
    type: 'object',
    properties: {
      name: { type: 'string' },
      password: { type: 'string' },
      // In Ajv v8+, formats such as 'email' require the separate ajv-formats package
      email: { type: 'string', format: 'email' }
    },
    additionalProperties: false,
    required: ['name', 'password', 'email']
  }

  const valid = ajv.validate(schema, req.body)
  if (!valid) return res.status(422).json(ajv.errors)
    // ...
})
 
app.listen(3000)


Note that we are validating the body of the request. By default, this is the object that the body-parser package gives us as part of the payload. Here we pass it through the JSON Schema, so validation fails if one of these properties has a different type or, in the case of email, a different format.



Important! Note that we return HTTP 422 Unprocessable Entity. Many people treat a request error, such as an invalid body or query string, as a 400 Bad Request. This is partly true, but in this case the problem is not in the request itself but in the data that the user sent with it. So the best response is 422: it means that the request is well formed, but it cannot be processed because its content is not in the expected format.



Another option (besides using AJV directly) is a library I created together with Roz. We called it Expresso, and it is a set of libraries that make it a little easier to develop APIs with Express. One of these tools is @expresso/validator, which essentially does what we demonstrated above, but can be passed as middleware.
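The idea of handing validation over as middleware can be sketched without any library. This is not the actual @expresso/validator API: validateBody and userErrors are hypothetical names, and the hand-written check only stands in for a real schema validator.

```javascript
// Hypothetical middleware factory. `getErrors` is any function that receives
// the request body and returns an array of error messages (empty when valid).
function validateBody (getErrors) {
  return function (req, res, next) {
    const errors = getErrors(req.body)
    // Same status code as in the route above: well-formed request, bad content
    if (errors.length > 0) return res.status(422).json(errors)
    next()
  }
}

// Hand-written check used only for this sketch
function userErrors (body) {
  const errors = []
  if (!body || typeof body.name !== 'string') errors.push('name must be a string')
  if (!body || typeof body.email !== 'string') errors.push('email must be a string')
  return errors
}

// Usage with Express, assuming an `app` instance exists:
// app.post('/foo', validateBody(userErrors), (req, res) => { /* ... */ })
```

The route handler then only runs for payloads that passed the check, so it no longer needs its own validation code.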



Optional parameters with default values



In addition to what we validated above, a null value can still get into our application when an optional field is not sent. Imagine, for example, that we have a paginated route that takes two parameters, page and size, as query strings. They are optional and should fall back to a default value if not received.



Ideally, our controller should have a function that does something like this:



function searchSomething (filter, page = 1, size = 10) {
  // ...
}


Note. Just as with the 422 error returned earlier, it is important to return the correct status code for paginated requests: 206 Partial Content whenever the data returned is only part of a whole. When the user has reached the last page and there is no more data, we can return 200, and when the user requests a page outside the total page range, we return 204 No Content.
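The status codes from the note above could be chosen by a small helper. This is only a sketch of that convention: paginationStatus and its parameters are hypothetical names, page is assumed to be 1-based, and total is assumed to be the number of matching records.

```javascript
// Picks the status code for a paginated response.
// `page` is 1-based; `total` is the total number of matching records.
function paginationStatus (page, size, total) {
  const totalPages = Math.ceil(total / size)
  if (page > totalPages) return 204 // outside the page range: No Content
  if (page === totalPages) return 200 // last page: nothing left to fetch
  return 206 // more pages remain: Partial Content
}

paginationStatus(1, 10, 25) // 206
paginationStatus(3, 10, 25) // 200
paginationStatus(4, 10, 25) // 204
```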



This would solve the problem when we receive the two values empty, but it touches on a very controversial aspect of JavaScript in general. Optional parameters take their default value only if the argument is undefined; the rule does not apply to null, so if we do the following:



function foo (a = 10) {
  console.log(a)
}
 
foo(undefined) // 10
foo(20) // 20
foo(null) // null


and we need null to be treated the same way as a missing value, we cannot rely on default parameters alone. In such cases, we have two options:



1. Use if statements in the controller



function searchSomething (filter, page = 1, size = 10) {
  if (!page) page = 1
  if (!size) size = 10
  // ...
}


It doesn't look very good, and it is rather verbose.



2. Use JSON Schema directly on the route



Again, we can use AJV or @expresso/validator to validate this data:



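app.get('/foo', (req, res) => {
  const schema = {
    type: 'object',
    properties: {
      page: { type: 'number', default: 1 },
      size: { type: 'number', default: 10 }
    },
    additionalProperties: false
  }

  // page and size arrive as query strings, so we validate req.query.
  // Assumes ajv was created with { useDefaults: true, coerceTypes: true };
  // otherwise defaults are not applied and string values fail type: 'number'.
  const valid = ajv.validate(schema, req.query)
  if (!valid) return res.status(422).json(ajv.errors)
  // ...
})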
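Query-string values reach the route as strings, so the same defaults can also be applied by hand before the controller is called. The following is a minimal sketch without any library; parsePagination and toPositiveInt are hypothetical helper names.

```javascript
// Returns a positive integer parsed from `value`, or `fallback` when the
// value is missing, null, not numeric, or not greater than zero
function toPositiveInt (value, fallback) {
  const n = Number.parseInt(value, 10)
  return Number.isInteger(n) && n > 0 ? n : fallback
}

// `query` stands in for the parsed query string of the request
function parsePagination (query) {
  const source = query ?? {}
  return {
    page: toPositiveInt(source.page, 1),
    size: toPositiveInt(source.size, 10)
  }
}

parsePagination({ page: '3' }) // { page: 3, size: 10 }
parsePagination({ page: null, size: 'abc' }) // { page: 1, size: 10 }
```

Unlike default parameters, this treats null, undefined and invalid strings the same way.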


Working with Null and Undefined Values



I am personally not happy with the idea of using both null and undefined in JavaScript to represent an empty value, for several reasons. Besides the difficulty of reasoning about these concepts in the abstract, there is also the matter of optional parameters. If you still have doubts about these concepts, let me give you a great example from practice:







Now that we understand the definitions, we can say that 2020 brings two major features to JavaScript: the nullish coalescing operator and optional chaining. I will not go into details now, since I have already written an article about this (it's in Portuguese), but note that these two additions greatly simplify our task: we can deal with exactly these two concepts, null and undefined, using the appropriate operator (??), instead of relying on logical negations like !obj, which are fertile ground for mistakes.
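A short comparison shows why ?? is safer than a logical fallback. The settings object below is made up for illustration.

```javascript
// Made-up data: 0 and '' are legitimate values, author is missing
const settings = { page: 0, title: '', author: null }

// || falls back on every falsy value, including 0 and ''
const pageWithOr = settings.page || 1 // 1

// ?? falls back only on null and undefined
const pageWithNullish = settings.page ?? 1 // 0
const author = settings.author ?? 'unknown' // 'unknown'

// ?. stops at null or undefined instead of throwing a TypeError
const city = settings.author?.address?.city // undefined
const cityOrDefault = settings.author?.address?.city ?? 'n/a' // 'n/a'
```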



Functions that return null implicitly



This problem is much more difficult to solve because of its implicit nature. Some functions process data on the assumption that it will always be provided, but this is not always true. Let's consider a standard example:



function foo (num) {
  return 23*num
}


If num is null, the result of this function will be 0, which may not be what we expect. In such cases, we have no choice but to check the value. There are two ways to do this. The first is a simple if statement:



function foo (num) {
  if (!num) throw new Error('Error')
  return 23*num
}
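One caveat with this check: !num is also true for 0, so a legitimate zero would be rejected. If zero is valid input, a comparison against null is a possible alternative. The function name multiplyBy23 and the error message are made up for this sketch.

```javascript
function multiplyBy23 (num) {
  // == null matches both null and undefined, but not 0
  if (num == null) throw new Error('num is required')
  return 23 * num
}

multiplyBy23(12) // 276
multiplyBy23(0) // 0
// multiplyBy23(null) and multiplyBy23(undefined) both throw
```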


The second way is to use the Either monad, which is covered in detail in the article I mentioned. It is a great way to handle ambiguous data, that is, data that may or may not be null. JavaScript already has a built-in construct that supports two flows of actions, the Promise:



function exists (value) {
  return value != null ? Promise.resolve(value) : Promise.reject(`Invalid value: ${value}`)
}
 
async function foo (num) {
  return exists(num).then(v => 23 * v)
}


This is how you can delegate the catch statement from exists to the function that called foo:



function init (n) {
  foo(n)
    .then(console.log)
    .catch(console.error)
}
 
init(12) // 276
init(null) // Invalid value: null
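For comparison, the same two-track idea can be written synchronously with a hand-rolled Either. This is a minimal sketch, not the implementation from the article mentioned above; Right, Left, fromNullable and timesTwentyThree are names chosen for illustration.

```javascript
// Right holds a usable value; Left holds the reason there is none
const Right = value => ({
  map: fn => Right(fn(value)),
  fold: (onLeft, onRight) => onRight(value)
})

const Left = reason => ({
  map: () => Left(reason), // skips the transformation
  fold: (onLeft, onRight) => onLeft(reason)
})

// Chooses the track depending on whether the value is null or undefined
const fromNullable = value =>
  value != null ? Right(value) : Left(`Invalid value: ${value}`)

const timesTwentyThree = num =>
  fromNullable(num)
    .map(v => 23 * v)
    .fold(reason => reason, result => result)

timesTwentyThree(12) // 276
timesTwentyThree(null) // 'Invalid value: null'
```

As with the Promise version, the multiplication only runs when a value is present, and the caller decides what to do with each track.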


External APIs and database records



This is a very common case, especially when systems are built on top of databases that were created and populated earlier. For example, a new product that uses the same database as its successful predecessor, integrating users across different systems, and so on.



The big problem here is not that the database is unknown; that is actually the cause. Since we do not know what was done at the database level, we cannot tell whether the data will come back as null or undefined. Another case is poor documentation: when the database is not properly documented, we end up with the same problem.



There is not much we can do here, and I personally prefer to check the state of the data to make sure I can work with it. However, you cannot validate all of the data, since many of the returned objects may simply be too large. Therefore, before running an operation such as a map or a filter, it is good practice to check that the data involved in it is not undefined.
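Checking only the fields an operation actually touches might look like this. The shape of the rows (name, active) and the function name activeUserNames are assumptions made for the example.

```javascript
// `rows` stands in for whatever a legacy database or external API returned
function activeUserNames (rows) {
  // The whole result may be missing
  if (!Array.isArray(rows)) return []

  return rows
    // Individual rows may be null, and name may be absent or not a string
    .filter(row => row != null && typeof row.name === 'string')
    .filter(row => row.active === true)
    .map(row => row.name.trim())
}

activeUserNames(undefined) // []
activeUserNames([{ name: ' Ana ', active: true }, null, { active: true }]) // ['Ana']
```

Only name and active are inspected; the rest of each record is left unvalidated, which keeps the check cheap for large objects.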



Generating errors



It is good practice to use assertion functions for databases and external APIs. Essentially, these functions return the data if it exists and throw an error otherwise. The most common use case is an API that looks up some kind of data by identifier, the well-known findById:



async function findById (id) {
  if (!id) throw new InvalidIDError(id)
 
  const result = await entityRepository.findById(id)
  if (!result) throw new EntityNotFoundError(id)
  return result
}


Replace Entity with the name of your entity, such as UserNotFoundError.
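The error classes used above are not defined in the snippet. One possible way to declare them is shown below; the message texts are assumptions made for this sketch.

```javascript
// Thrown when the identifier itself is missing or malformed
class InvalidIDError extends Error {
  constructor (id) {
    super(`Invalid ID: ${id}`)
    this.name = 'InvalidIDError'
  }
}

// Thrown when the identifier is fine but nothing was found for it
class EntityNotFoundError extends Error {
  constructor (id) {
    super(`Entity with ID ${id} was not found`)
    this.name = 'EntityNotFoundError'
  }
}
```

Because each class extends Error, callers can tell the two situations apart with instanceof while still getting a regular stack trace.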



This is useful because, within the same controller, we can have one function that finds users by ID and another that uses this user to look up other data, for example, the user's profiles in another database collection. When calling the profile lookup function, we use the assertion to ensure that the user actually exists in our database. Otherwise, the function does not even run, and the error can be handled directly in the route:



async function findUser (id) {
  if (!id) throw new InvalidIDError(id)
 
  const result = await userRepository.findById(id)
  if (!result) throw new UserNotFoundError(id)
  return result
}
 
async function findUserProfiles (userId) {
  const user = await findUser(userId)
 
  const profile = await profileRepository.findById(user.profileId)
  if (!profile) throw new ProfileNotFoundError(user.profileId)
  return profile
}


Note that we will not make a database call if the user does not exist, since the first function ensures that the user exists. Now we can do something like this in the route:



app.get('/users/:id/profiles', handler)
 
// --- //
 
async function handler (req, res) {
  try {
    const userId = req.params.id
    const profile = await userService.getProfile(userId)
    return res.status(200).json(profile)
  } catch (e) {
    if (e instanceof UserNotFoundError || e instanceof ProfileNotFoundError) return res.status(404).json(e.message)
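    if (e instanceof InvalidIDError) return res.status(400).json(e.message)
    // Any other error is unexpected; respond instead of leaving the request hanging
    return res.status(500).json('Internal server error')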
  }
}


We can tell which kind of error was thrown simply by checking which error class it is an instance of.
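When the number of error classes grows, the chain of instanceof checks can be moved into a lookup. This is a sketch only: statusFor and statusByError are hypothetical names, and the classes are redeclared here just to keep the example self-contained.

```javascript
class UserNotFoundError extends Error {}
class ProfileNotFoundError extends Error {}
class InvalidIDError extends Error {}

// Ordered list of [error class, HTTP status] pairs
const statusByError = [
  [UserNotFoundError, 404],
  [ProfileNotFoundError, 404],
  [InvalidIDError, 400]
]

function statusFor (error) {
  const match = statusByError.find(([ErrorClass]) => error instanceof ErrorClass)
  return match ? match[1] : 500 // unknown errors are treated as server errors
}

statusFor(new UserNotFoundError()) // 404
statusFor(new InvalidIDError()) // 400
statusFor(new TypeError()) // 500
```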



Conclusion



There are several ways to handle data so that the flow of information stays continuous and predictable. Do you know any other tips? Leave them in the comments.



This article was originally posted on dev.to by Lucas Santos. If you have any questions or comments on the topic of the article, post them under the original article on dev.to


