If you’ve ever built a User object in a data model, you likely added an email attribute. Then, that email attribute needs a status column for whether it’s been confirmed.
But in order to build the confirmation functionality, you need a random secret token that can be used for the confirmation process. Now you have a user object with two extra columns that only relate to the email address.
If you’re naming those columns accurately, they’d likely end up with the word “email” in them, e.g. email_confirmed_at. While that’s not necessarily enough justification for a separate model just for email addresses, it should trigger your spidey-sense.
Those columns will invariably need additional methods that will also use the word “email” in their naming and likely won’t touch any other aspects of your user model.
What about a secure process to change the email? Time to add a column for new_email_address, and we’ll assume you can reuse the confirmation columns along with it.
This is workable, but it’s still not unquestionable justification for a dedicated model. It’s getting closer, though.
These days, it’s not uncommon for a user to have multiple email addresses, and that is a clear-cut case for a dedicated model. Add in the other accumulated signals, and it’s a safe bet that you need to compartmentalize all of the email-related logic as well.
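To make that concrete, here is a minimal sketch of what a dedicated email object might look like in plain Ruby. The class names, attribute names, and token handling are assumptions for illustration, not a prescribed design, and a real application would back this with its own persistence layer.

```ruby
require "securerandom"

# Hypothetical dedicated model: all email-related state and behavior lives here.
class EmailAddress
  attr_reader :address, :confirmation_token, :confirmed_at

  def initialize(address)
    @address = address
    @confirmation_token = SecureRandom.hex(16) # random secret for confirmation
    @confirmed_at = nil
  end

  def confirmed?
    !@confirmed_at.nil?
  end

  # Marks the address confirmed only when the supplied token matches.
  def confirm(token)
    return false unless token == @confirmation_token

    @confirmed_at = Time.now
    true
  end
end

# The user holds a reference to the email object instead of several email_* columns.
class User
  attr_reader :name, :email_address

  def initialize(name:, email_address:)
    @name = name
    @email_address = email_address
  end
end
```

With this shape, the confirmation token and timestamp no longer need “email” in their names, because the object they live on already says it.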
Easy enough, right? Once you build that email model, though, now you have to manage the relationship between your users and their email addresses. A method call that may have previously been
user.email is now
So it’s not all upside.
But, in most cases, you could still add an
And that leads to what has become my favorite benefit of recognizing seams. Everything becomes easier to test. With a non-trivial amount of email-related logic, I now only need an instance of
Scenarios like this aren’t uncommon with data modeling, and once you start recognizing them, you can’t unsee them. If you find that an object’s logic is proliferating in a way that references only a single column (or maybe a couple of related columns), that’s often a sign.
Or if you find yourself naming things in a way that relates to a single column or group of columns, that’s a sign.
Friction with writing tests has become another signal to me that a seam is lurking somewhere. If I’m writing tests and they feel tedious and complex to set up, it’s often a clue that I’m overlooking an independent object bolted on to another.
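As a rough illustration of how little setup a dedicated object can need, the sketch below exercises a change-of-address flow without ever building a user. The EmailAddress class, its request_change and confirm_change methods, and the token scheme are all hypothetical names chosen for this example.

```ruby
require "securerandom"

# Hypothetical email object with a pending-change flow; no User required.
class EmailAddress
  attr_reader :address, :new_address, :confirmation_token

  def initialize(address)
    @address = address
    @new_address = nil
    @confirmation_token = nil
  end

  # Stages a replacement address and issues a fresh token for it.
  def request_change(new_address)
    @new_address = new_address
    @confirmation_token = SecureRandom.hex(16)
  end

  # Promotes the staged address only when the token matches.
  def confirm_change(token)
    return false if @confirmation_token.nil? || token != @confirmation_token

    @address = @new_address
    @new_address = nil
    @confirmation_token = nil
    true
  end
end

# The whole flow runs with nothing but an EmailAddress instance.
email = EmailAddress.new("old@example.com")
token = email.request_change("new@example.com")
email.confirm_change("wrong-token") # => false, address unchanged
email.confirm_change(token)         # => true
email.address                       # => "new@example.com"
```

If the same logic lived on the user, each of these checks would first need a valid user with whatever unrelated attributes that requires.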
Other times, it’s likely that one object will eventually have multiple instances of something. Work and personal email addresses or phone numbers. Billing and mailing addresses. Payment methods.
In some cases, it will be incredibly obvious that some columns or groups of columns deserve their own objects, but others may be more nuanced. And when you see these relationships start to show themselves, you don’t immediately need to jump to a one-to-many relationship even if you think it’s likely.
These days, I’ll almost always model an email address as its own object with its own table, but out of the gate, it’s still one-to-one with users. That way, it’s not necessary to build the entire interface for managing multiple email addresses because that’s almost never a critical requirement.
If or when the day comes that multiple email addresses become a requirement, the data model is ready for it. In the meantime, email (or address or phone number) logic is nicely contained in a dedicated object, and the user object isn’t spilling over with ancillary concerns.
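One way to picture “one-to-one now, ready for more later” is the sketch below, which uses an in-memory array as a stand-in for the email table. The EmailAddressRow struct, the user_id key, and both reader methods are assumptions made for illustration only.

```ruby
# Hypothetical row in a separate email table, keyed back to its user.
EmailAddressRow = Struct.new(:user_id, :address)

class User
  attr_reader :id

  def initialize(id, email_rows)
    @id = id
    @email_rows = email_rows # stand-in for the email table
  end

  # Today: one-to-one, so callers only ever see a single address.
  def email_address
    @email_rows.find { |row| row.user_id == id }
  end

  # Later: the same rows can back a one-to-many reader without moving any data.
  def email_addresses
    @email_rows.select { |row| row.user_id == id }
  end
end

rows = [
  EmailAddressRow.new(1, "ada@example.com"),
  EmailAddressRow.new(2, "bob@example.com")
]
user = User.new(1, rows)
user.email_address.address # => "ada@example.com"
```

The second reader can stay unwritten until multiple addresses are actually required; the point is only that the storage would not need to change when that day comes.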
Be on the lookout for those seams, but make sure not to over-proliferate a bunch of dedicated objects because in all my years, I’ve never needed a dedicated model/table for each and every character in an email address.
That’s not to say there’s never been a reason for it, but hopefully just imagining email.characters.sort.join reinforces that diminishing returns come into play at some point.