Go Parse, Don't Validate

Go Parse, Don't Validate

One of the bugs I was tasked to solve at my #dayjob the second week there seemed simple on the surface. Potential applicants (aka leads), need to be submitted to the company wide server. The lead was original stored in our team's database so we'd retrieve the lead and then try to submit it. Basically the code ended up looking like this:

func SubmitLeadToBanner(lead Lead) error {
    if !pushLogicService.ShouldPushToBanner(lead) {
        return nil
    }
    ridm, err := prospectSubmission.SubmitToBanner(lead)
    if err != nil {
        return err
    }
    lead.Ridm = ridm
    return lead.Save()
}

The issue was that leads sometimes weren't being sent with their phones and emails. As a newcomer on the job the solution seemed simple; validate that there are phones and emails before submitting.

+ if len(*lead.Phones) == 0 || len(*lead.Emails) == 0 {
+     return errors.New("no phones or emails")
+ }
 ridm := prospectSubmission.SubmitToBanner(lead)

It worked and passed testing and so went into production. The problem is, I had just introduced another bug. Do you see it?

The issue was that lead.Phones and lead.Emails were a pointer to slices which means they could potentially be null. Although they often were not null, the problem was the database ORM would sometimes not return phones and emails and in that case the pointer was null instead of just an empty slice. So, of course, I just added another check.

-    if len(*lead.Phones) == 0 || len(*lead.Emails) == 0 {
+    if lead.Phones != nil && len(*lead.Phones) == 0 || lead.Emails != nil && len(*lead.Emails) == 0 {

Now, this worked and solved the problems. But it doesn't sit right with me for a few reason:

  • We check if the lead should be pushed inside the SubmitLeadToBanner

  • If the phones or emails field are null there is nothing we can do other than abandon trying to submit the lead

  • There is actually some retry logic above this but it's unclear how that happens from this code.

The Solution: Parsing

I enjoy programming. What I don't like is tracking down bugs. In my search to improve the bug-freeness of my code I came across the Haskell article Parse, Don't Validate and its sequel, Names are not Type Safety. The gist of the articles is that the type should describe the safe operations available. This means instead of using a bool to determine if a struct matches certain criteria we return a new type which describes this invariant. Take for example:

type NonEmpty[T any] struct {
    val []T  
}

func ParseNonEmpty[T any](val []T) (NonEmpty[T], error) {
    if len(val) == 0 {
        return NonEmpty[T]{}, errors.New("slice is empty")
    }
    return NonEmpty[T]{val: val}, nil
}

If the lead type used this struct for phones and emails it would then be an error for the database to try and read an empty slice. This is useful because it puts the error right where the issue originates, at the database. And now it is trivial to try querying again to see if they are there.

Keeping it Safe

One important aspect, that I think is often missed when implementing parsing instead of validation is that the type itself doesn't provide much safety if we circumvent it. Say there was some code that was modified to index the third element of a NonEmpty[T]. However, NonEmpty[T] does not have a way to index the third element. A quick solution is to just add a method.

func (n *NonEmpty[T]) Index(idx int) T {
    return n.val[idx]
}

However, this is not a safe operation to do. Indexing into a slice can throw a panic. Of course, an error field could be added to the return types. This works but isn't the best option. Being able to guarantee that the operation is safe is best. Adding an iterator function to NonEmpty which would be equivalent to a functional map operation is one solution but sometimes you truly want a single element. In that case, parse a new type:

type Atleast3[T any] struct {
    val []T
}

func ParseAtleast3[T any](val []T) (Atleast3[T], error) {
    if len(val) < 3 {
        return Atleast3[T]{}, errors.New("not enough elements")
    }
    return Atleast3[T]{val:val}, nil
}

func (a *Atleast3[T]) Index(idx ZeroUpToThree) T {
    return a.val[idx.val]
}

type ZeroUpToThree struct {
    val uint8
}

func ParseZeroUpToThree(val int) (ZeroUpToThree, error) { /* implementation omitted */ }

Now when Atleast3 is indexed it is guaranteed to be safe and never throw a panic.

I'm not sure I want to write code like this for everything as it adds a lot of extra code. Over time I would create a library that contained my most common invariants. For the boundaries between my service and others like receiving or sending JSON, I think it can be beneficial. It ensures that the code that can go wrong is all located early in the control flow making it simple to retry or provide accurate error messages to the caller. It also helps eliminate certain tests since the type guarantees certain conditions can never occur.

Did you find this article valuable?

Support Over-Engineered by becoming a sponsor. Any amount is appreciated!