For Maximum Accessibility, Be Careful About Using a .dev Domain
Thinking back on the process of troubleshooting why a small number of people weren't able to access PicPerf’s .dev domain. Best guess: old firewall rules still block certain TLDs out of caution.
It's been an interesting couple of weeks debugging a DNS/connectivity issue for PicPerf's [now former] domain, so I'm taking the time to write a few thoughts down before it all goes stale. Hopefully, they're helpful to others who run into issues one day.
Disclaimer: I’m far from an expert in DNS, networking, and firewalls. If you see any incorrect assumptions here, or have any helpful insight on the problem, feel free to butt in!
I purchased picperf.dev back in June and immediately began using it to power both the marketing site and every image request being proxied through it. For example, a URL like this would intercept the image and return an optimized version:
For the vast majority of visitors, everything was working flawlessly and I was pretty excited about PicPerf's momentum. That excitement only grew when Laravel News (whom I've followed for years) started using it to optimize images on their redesigned website. Things were good.
But then, a couple of reports started to come in. For a small number of users, a hodgepodge of errors were being thrown when images were requested, resulting in no images being rendered on LN's site. They weren't even consistent, including things like SSL_ERROR_ZERO_RETURN. Here's one of the screenshots showing what was going on.
It was a mess. And very odd because the errors seemed to go away when switching from a WiFi connection to mobile data. I was never personally able to replicate the issue, no matter what browser I was using or the network to which I was connected.
The internet lacked many resources on what might be happening. The closest example of a similar problem I came across was here, but even that wasn't happily resolved. My head immediately gravitated toward a couple different potential causes, and I began to dig in there.
First, I thought it might be related to how I had configured the domain's DNS and set it up with my Cloudflare Worker. At the time, I was using worker routes to catch every request:
But I thought that I had possibly missed something, causing the DNS to not play nicely with particular firewalls and networks. So, I configured a custom domain for the worker instead.
The benefit to this approach is that my worker becomes my origin – it's no longer just intercepting requests as they attempt to reach another destination. And on top of that, all of the DNS is managed by Cloudflare, so there'd be no chance of me screwing it up.
I was optimistic this would take care of the issue. But it did not. After some back & forth with the person experiencing the issue first-hand, nothing had changed.
My next thought was that my SSL certificates were in an odd state, causing hiccups before the DNS could ever successfully resolve. In my "edge certificates" settings, I saw several of them set up after moving to a custom worker domain. I honestly wasn't that familiar with why each was needed:
So, I blew them all away and generated a single new one. On my end, just as before, it worked fine:
By this time, I was already talking with a couple of other people who were having similar issues. They were even seeing other errors new to the mix, like
Regardless, I was optimistic my new SSL certificate would finally put this problem to rest, but no. It did not.
As I was talking to people and trying to get to the root of this, a few had said that this must be an issue with the user's network and/or firewall, and as far as I could tell, that checked out. There was no evidence of my worker even receiving these failed requests. It was crapping out at the DNS resolution stage. But reports (still very few of them, relative to the amount of traffic running through PicPerf) were still coming in every so often, and I couldn't swallow that this was an isolated quirk for a few networks. I was at a loss.
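For anyone facing something similar, one way to narrow down where a request dies is to walk the connection one stage at a time: resolve the name, open a TCP socket, then attempt the TLS handshake. The following is a minimal sketch using only Python's standard library, not the process actually followed during this troubleshooting. The function name `diagnose` and the stage labels are made up for illustration.

```python
import socket
import ssl


def diagnose(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return "ok", or the first stage ("dns", "tcp", "tls") that failed."""
    # Stage 1: DNS resolution through the system resolver.
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns"

    # Stage 2: plain TCP connection to the resolved host.
    try:
        raw = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "tcp"

    # Stage 3: TLS handshake, including certificate and hostname checks.
    try:
        context = ssl.create_default_context()
        with context.wrap_socket(raw, server_hostname=host):
            return "ok"
    except (ssl.SSLError, OSError):
        # Covers handshake errors as well as resets and timeouts mid-handshake.
        return "tls"
    finally:
        raw.close()


if __name__ == "__main__":
    print(diagnose("picperf.dev"))
```

Running the same script on an affected network and on a working one (mobile data, for instance) would show whether the two disagree at the DNS stage or only later, during the handshake.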
So, I mentally backed up and looked at the domain: picperf.dev. I remembered when Google made the .dev TLD available for purchase back in 2019. It personally impacted me because I was using it for local WordPress development at the time, and when .dev became a first-class TLD, I moved things over to .test instead. It wasn't just me - it was a pretty common practice. Many firms used .dev for their development and staging environments.
Keep in mind: I'm pretty inexperienced when it comes to this level of stuff. But it seemed like the root of the issue could be one of two things:
Maybe some firewalls are still stuck in the past, exercising rules intended to prevent malicious activity.
Totally made-up (but feasible) scenario: I'm a bad guy who hates Your Company™, and I know there's a chance your development environment is on yourcompany.dev. I might've purchased that domain as soon as I got the chance. I'd stick some bad code on there to steal all the secrets and do other bad-guy things. Without targeted firewall rules in place, an unsuspecting team member could navigate to that domain and BAM – I got 'em. It seems plausible.
Whether it's intentional or the remains of a no-longer-used setup, some development environments set up a local DNS service to intercept requests to a particular TLD. Laravel Valet is just one example, and it's something they explicitly addressed a long time ago. To verify it, I tried to manually set the TLD used by Valet to .dev, and was met with this warning:
I went for it anyway, and sure enough, I could no longer access picperf.dev after committing the change:
Being that this issue came up from people within the Laravel community, it wouldn't surprise me if this had been the cause, at least for a few of them. (Thanks to Eric for calling this out.)
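A quick way to check a machine for this kind of leftover setup is to resolve the domain with the system resolver and see whether the answer points back at the machine itself. The sketch below assumes the local DNS service answers with a loopback address (127.0.0.1 or ::1), which is how tools like Valet typically route a TLD to the local machine. The helper names `resolve` and `looks_locally_intercepted` are hypothetical.

```python
import ipaddress
import socket


def resolve(host: str) -> list[str]:
    """Resolve a hostname with the system resolver; return unique addresses."""
    infos = socket.getaddrinfo(host, 443, type=socket.SOCK_STREAM)
    # Each entry's sockaddr is its last element; the address comes first in it.
    return sorted({info[4][0] for info in infos})


def looks_locally_intercepted(addresses: list[str]) -> bool:
    """True if any resolved address is loopback, such as 127.0.0.1 or ::1."""
    return any(ipaddress.ip_address(addr).is_loopback for addr in addresses)


if __name__ == "__main__":
    addresses = resolve("picperf.dev")
    print(addresses, looks_locally_intercepted(addresses))
```

A public site should never resolve to a loopback address, so a `True` here is a strong hint that something on the machine is capturing the whole TLD. A `False` does not rule out a firewall or an upstream resolver blocking the domain.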
Regardless of the exact source, I moved forward with assuming it was all related to my TLD. As a test, I connected my worker to a subdomain on jamcomments.com. After hours of troubleshooting and many back & forth messages with multiple people, the issue went away entirely.
So, I took it to the next level and purchased picperf.io. A part of me was concerned that since the domain would be so new, I'd run into other firewall problems. But nope – everything worked flawlessly. Again. Needless to say, I'm using .io as the primary TLD, and redirecting all .dev traffic there now. It's good to see things calm down.
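The redirect itself boils down to swapping the host while leaving the path and query string alone, so that existing image URLs keep working. Here is a rough sketch of that rule as a plain Python function. It is not the actual PicPerf setup, and where such a rule runs (a Worker, a redirect rule, a web server) is left open. The name `redirect_target` and the sample path are made up for illustration.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "picperf.dev"
NEW_HOST = "picperf.io"


def redirect_target(url: str) -> Optional[str]:
    """Return the equivalent URL on the new host, or None if no redirect applies."""
    parts = urlsplit(url)
    # Only the bare old host is handled; subdomains are left untouched.
    if parts.hostname != OLD_HOST:
        return None
    # Swap the host and keep scheme, path, query, and fragment as they were.
    # Assumption: any explicit port on the old URL is dropped.
    return urlunsplit(parts._replace(netloc=NEW_HOST))


if __name__ == "__main__":
    print(redirect_target("https://picperf.dev/images/cat.jpg?width=300"))
```

Whatever serves the redirect would send the returned URL in a `Location` header, with a permanent status such as 301 or 308 once the move is final.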
It goes without saying, but moving forward, I'm gonna be a bit more hesitant about purchasing new-ish TLDs, especially if I'm building a product that needs to be maximally accessible at great scale. I initially chose a .dev domain because it sounded cool and I was still bought into the hype of it being made available. Not gonna make that mistake again, even if the problem was unique to the engineering community who may have old, weird demons crawling through their machines.
At this point, I'm exhausted and still a little dazed, but more than anything, I'm just glad the debacle is over.
I'd be remiss if I didn't give a hat tip to Eric Barnes w/ Laravel News and the handful of other guys who helped me troubleshoot this. It's one of those problems that probably wouldn't have been caught if it weren't being used at scale, and they were all incredibly gracious and patient as we figured it out.
Alex MacArthur is a software engineer working for Dave Ramsey in
Soli Deo gloria.