How and why you should change your blog URLs to dateless format

Tom Chantler

Summary

Some URLs contain the publication date and some don't. Articles on this website used to include dated URLs, but now I firmly believe that you should not include the publication date in the URL. Instead you should include it in a prominent position in your article, preferably near the top. In this article, I'll explain why.

Additionally, if you have a Ghost blog and you want to change to a dateless format, there's a safe way of doing it that won't break anything. This method will also work for most other web sites (not just Ghost) with little or no modification.

Background

When I published the first article on this blog, I chose the following URL:

https://tomssl.com/2014/12/31/welcome-to-tom-ssl

Ever since then, I've included the date as part of the URL. Until now.

I used to think that it was good to include the publication date in blog post URLs and that the only reason people didn't include the date was to conceal the age of their content. I'm sure I'm not the only one who's tried to follow a tutorial online, only to find that it was hopelessly out of date and that there was no way of finding out when it was written (just that it was too long ago).

I've changed my mind

Now I think that you should not include the date in the URL, but that you should include it somewhere prominent in the document instead. There are a few reasons for this.

What if you want to rewrite an article?

What happens if you have an article which ranks highly on search engines, but which needs to be updated or even rewritten? I've seen some people put a link to a new article at the top of the old article with a note saying "don't read this article, click here instead". I don't think that looks great. Why not just update the existing article and make it clear that's what you've done?

Ah, but what if you want to keep the old article for posterity? Well, that's up to you. Some articles should be kept and some shouldn't. Some should be updated and some shouldn't. You could have a link to the old article inside the new one if you like. You need to decide for yourself. At least this way you have that choice.

For example, consider my article about fixing the WiFi on your Surface Pro 3, originally written over five years ago. When I updated it, I just added a section to the top of the article. That was fine and I'm never going to rewrite that from scratch. If setting up your own email server with Mail-in-a-box changes fundamentally, I might well rewrite the articles about that, since it's great and I still use it all the time. As it happens, the instructions still work fine. Maybe I could just refresh the screenshots and put a brief note at the top saying it was updated?

Dates in URLs can put people off

Many technical articles are time-sensitive and, if you're looking for up-to-date information about a subject, you know that an article written five years ago won't have the latest information (unless it's been updated recently). I think this naturally extends to all articles and, when people see an ancient date in the URL for any article, they are naturally wary and have to weigh up in their mind if it's worth reading or not. That's just human nature.

Some articles are not time-sensitive, however. For example, my article about a sensible password strategy.

Or what about something in between; consider the article I wrote about creating a free reverse proxy in Azure? Despite being technical, it's still relevant and accurate, but it's also five years old. Would some people be put off from clicking by seeing the date in the URL in their search engine results? Probably.

If you update an article, you should probably add a section to the top to say that it's changed. If you completely rewrite it, you can change the publication date if you like.

What you must do if changing to dateless URLs

If you've decided to change an existing web site to use dateless URLs, there is only one thing you must do:

  • Make sure you redirect all of your old links to the new ones with a 301 (permanent) redirect so that you don't break anything.

In other words, the old URL must continue to work and then you issue a 301 redirect to say that the resource has been moved permanently.

You should also make sure you update your comments (if you've got comments enabled). In short, make sure everything continues to work seamlessly. You know how sad it is when you find a web page you want to read and it no longer exists. Don't let that happen.
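
As a rough illustration of what "the old URL must continue to work" means in practice, here is a minimal sketch in Python of a WSGI app that answers a known old path with a 301 and a Location header. The `OLD_TO_NEW` table and the `app` function are made up for this example; they are not part of Ghost or any particular blog engine.

```python
# Hypothetical lookup table: old dated path -> new dateless path.
OLD_TO_NEW = {
    "/2014/12/31/welcome-to-tom-ssl/": "/welcome-to-tom-ssl/",
}


def app(environ, start_response):
    """Minimal WSGI app: 301 for known old paths, 404 for everything else."""
    path = environ.get("PATH_INFO", "/")
    new_path = OLD_TO_NEW.get(path)
    if new_path is not None:
        # 301 tells browsers and search engines the move is permanent.
        start_response("301 Moved Permanently", [("Location", new_path)])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found"]
```

A real site would serve its normal pages instead of the blanket 404; the point is only that the old address still gets an answer, and that the answer is a permanent redirect.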

How to change a blog to dateless URLs

As previously mentioned, the only thing you have to do is to create a 301 redirect for every URL and point it to the new one. If you've got a sensible URL structure like this:

https://example.com/yyyy/mm/dd/description-of-post

and are migrating to the (even more) sensible:

https://example.com/description-of-post

then you can use a regular expression to strip out the date. Yes, this might mean that the date doesn't have to be correct, but that's okay as we're just trying to make sure we don't break the internet; it's not a test to make sure people know what the old URL was. Search engines know about 301 redirects, so they will update themselves accordingly.
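
To make that concrete, here is a small sketch in Python of stripping a yyyy/mm/dd prefix with a regular expression. The `DATED_PATH` pattern and the `dateless_path` helper are names invented for this illustration, and this is a simplified pattern rather than the one I use in Ghost.

```python
import re

# Matches a leading /yyyy/mm/dd/ prefix. It only checks the digit counts,
# so /2014/31/12/ is accepted just as readily as /2014/12/31/.
DATED_PATH = re.compile(r"^/\d{4}/\d{2}/\d{2}/(?P<slug>.+)$")


def dateless_path(path):
    """Return the dateless path, or None if the path has no date prefix."""
    match = DATED_PATH.match(path)
    if match is None:
        return None
    return "/" + match.group("slug")


print(dateless_path("/2014/12/31/welcome-to-tom-ssl"))  # /welcome-to-tom-ssl
print(dateless_path("/welcome-to-tom-ssl"))             # None
```

Returning None for a path that is already dateless matters: a rule that also rewrote the new URLs would redirect them to themselves.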

How to change Ghost to dateless URLs

Routes

For the last few years, by default, Ghost no longer includes the date in the URL. Thus, if you're including URLs with dates in a newer version of Ghost, then you will already have had to alter the routes file. Now you need to change it back, like this:

Download the routes.yaml file from Settings → Labs → Routes and change the permalink part of the collections section from this:

permalink: /{year}/{month}/{day}/{slug}/

to this (which is the default, so if you're migrating your blog from an older version of Ghost and removing the dates at the same time, you won't need to change it at all):

permalink: /{slug}/

And then upload it again.

Redirects

The regular expression to remove dates is slightly more complicated than you might think.

The important thing to remember is that any redirection script will alter the URL and then tell your browser to go to the new URL. This means you will pass through the redirection script again, so you need to make sure it doesn't do something daft, like getting stuck in a loop.

The other thing to note is that you have to escape the escape character.

\ is the escape character for regex and / is a special character, used to denote the boundaries of the regex. This means that, if we want to use a literal / in our regular expression, we need to escape it, like this \/.
To complicate matters further, JSON also uses \ as an escape character. This means that, in the redirects.json file, we also have to escape the \, so we end up with \\/, which ultimately means a single /. This sort of thing can be very confusing.
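
If the two layers of escaping are hard to picture, this short Python sketch peels them off one at a time. Python's `json` and `re` modules stand in here for whatever Ghost does internally, which is an assumption, but the escaping rules for this particular case are the same.

```python
import json
import re

# The text exactly as it would appear inside redirects.json:
# a JSON string containing backslash, backslash, slash.
json_text = r'"\\/"'

# Layer 1: JSON turns the doubled backslash into a single one.
regex_source = json.loads(json_text)
print(regex_source)        # \/
print(len(regex_source))   # 2

# Layer 2: the regex engine reads \/ as a literal slash.
print(re.fullmatch(regex_source, "/") is not None)  # True
```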

Removing the date from the URL

This first bit will remove the dates and also remove the trailing slash /. When I didn't do this, it would redirect to https://correcturl// with two slashes on the end and it would return a 404 (Not Found). That's because Ghost helpfully adds a slash on the end of the URL. As you can see, this expects the existing date to be of the format YYYY/MM/DD (or YYYY/DD/MM, of course; remember that it doesn't actually check the values, just the pattern). If that's not the case, you'll need to change the numbers in curly braces.

{
	"from": "^\\/\\d{4}/\\d{2}/\\d{2}/(.*[^\\/]{1,})([\\/]{1,}$)",
	"to": "/$1/",
	"permanent": true
}
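
If you want to see what this rule does before uploading it, the following Python sketch applies it to a few sample paths. It is only an approximation: Ghost evaluates the pattern as a JavaScript regular expression, so the `$1` is rewritten here as Python's `\1`, and the `apply_rule` helper is invented for this example.

```python
import json
import re

# The rule exactly as it appears in the JSON above.
rule = json.loads(r'''
{
    "from": "^\\/\\d{4}/\\d{2}/\\d{2}/(.*[^\\/]{1,})([\\/]{1,}$)",
    "to": "/$1/",
    "permanent": true
}
''')

pattern = re.compile(rule["from"])
# Python writes the first capture group as \1 rather than $1.
replacement = rule["to"].replace("$1", r"\1")


def apply_rule(path):
    """Return the rewritten path, or None if the rule does not match."""
    if pattern.search(path) is None:
        return None
    return pattern.sub(replacement, path, count=1)


for path in (
    "/2014/12/31/welcome-to-tom-ssl/",   # the normal case
    "/2014/12/31/welcome-to-tom-ssl//",  # extra slashes are collapsed
    "/2014/12/31/welcome-to-tom-ssl",    # no trailing slash: no match
    "/welcome-to-tom-ssl/",              # already dateless: no match
):
    print(path, "->", apply_rule(path))
```

The third case only gets redirected because Ghost adds the trailing slash first, at which point the rule matches.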

This next bit makes sure that any extra trailing slashes are removed, but only if there are two or more. If you think about it, this makes sense. Remember when we said Ghost helpfully added / to the end of the URL? That means that, if we strip the final / away, Ghost will add it and redirect us. And we'll strip it again. And then Ghost will add it and redirect and... you get the idea. We get stuck in a redirect loop. Thus, the following regex removes all of the terminal slashes, if there are two or more.

{
	"from": "(.*[^\\/]{1,})([\\/]{2,}$)",
	"to": "$1/",
	"permanent": true
}
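
One way to convince yourself that the two rules cannot loop is to follow the redirects by hand, or with a sketch like this one in Python. Several things here are assumptions made for the simulation: that the first matching rule wins, that Ghost's slash-adding can be modelled as a simple extra step, and that Python's regex engine behaves like Ghost's for these patterns. The `next_hop` and `follow` helpers are invented names.

```python
import re

# The two rules from this article, with JSON escaping already removed and
# "$1" rewritten as Python's "\1".
RULES = [
    (re.compile(r"^\/\d{4}/\d{2}/\d{2}/(.*[^\/]{1,})([\/]{1,}$)"), r"/\1/"),
    (re.compile(r"(.*[^\/]{1,})([\/]{2,}$)"), r"\1/"),
]


def next_hop(path):
    """Return the path one redirect later, or None if nothing redirects."""
    for pattern, replacement in RULES:
        if pattern.search(path):
            return pattern.sub(replacement, path, count=1)
    # Assumption: models Ghost adding a missing trailing slash.
    if not path.endswith("/"):
        return path + "/"
    return None


def follow(path, max_hops=10):
    """Follow redirects until the path settles; raise if it never does."""
    hops = [path]
    for _ in range(max_hops):
        new_path = next_hop(hops[-1])
        if new_path is None:
            return hops
        hops.append(new_path)
    raise RuntimeError("redirect loop: " + " -> ".join(hops))


print(follow("/2014/12/31/welcome-to-tom-ssl"))
print(follow("/welcome-to-tom-ssl///"))
```

Both sample paths settle on /welcome-to-tom-ssl/ and stop there, because neither rule matches a dateless path with exactly one trailing slash.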

Now all that remains to be done is to get these changes active in your blog.

Go to the Ghost Admin Panel, choose Settings → Labs → Redirects and download your redirects.json file. Add the two sections to the bottom of your file and upload it again. If you don't have any other redirects, your entire redirects.json file will look like this:

[
	{
		"from": "^\\/\\d{4}/\\d{2}/\\d{2}/(.*[^\\/]{1,})([\\/]{1,}$)",
		"to": "/$1/",
		"permanent": true
	},
	{
		"from": "(.*[^\\/]{1,})([\\/]{2,}$)",
		"to": "$1/",
		"permanent": true
	}
]
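
Because one stray backslash or comma is enough to break the file, it can be worth checking it locally before you upload it. The sketch below is one way to do that in Python; the `check_redirects` function is invented for this example and only tests that the text is a JSON array of rules with string "from" and "to" values whose patterns compile. Python's regex dialect is not identical to JavaScript's, so treat a pass as a good sign rather than a guarantee.

```python
import json
import re


def check_redirects(text):
    """Parse redirects JSON text and return a list of problems found."""
    problems = []
    try:
        rules = json.loads(text)
    except json.JSONDecodeError as error:
        return [f"not valid JSON: {error}"]
    if not isinstance(rules, list):
        return ["top level must be a JSON array"]
    for index, rule in enumerate(rules):
        if not isinstance(rule, dict):
            problems.append(f"rule {index}: not an object")
            continue
        for key in ("from", "to"):
            if not isinstance(rule.get(key), str):
                problems.append(f"rule {index}: missing string '{key}'")
        try:
            re.compile(rule.get("from") or "")
        except re.error as error:
            problems.append(f"rule {index}: bad regex: {error}")
    return problems


sample = r'''
[
    {"from": "(.*[^\\/]{1,})([\\/]{2,}$)", "to": "$1/", "permanent": true}
]
'''
print(check_redirects(sample))  # []
# Reports the missing "to" and the unbalanced parenthesis.
print(check_redirects('[{"from": "("}]'))
```

To check a real file, read its contents into a string and pass that to `check_redirects` instead of `sample`.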

Yep, I know that the to: sections are not quite the same. These are taken from my real file which is really working.

Update your comments

Now that you've changed the URL, you need to make sure that your comments are still attached to the right articles. If you're using Disqus then you can simply log in to your account at https://disqus.com/admin and click on Community → Tools → Migration Tools (which takes you to https://[your-disqus-shortname].disqus.com/admin/discussions/migrate/).

Disqus 301 Redirect Crawler

Now you can fix the comments by using the 301 crawler. It seems to work.

Conclusion

In this article I explained why I think you should not include dates in your blog article URLs. I also showed you how you can remove the dates from your URLs without breaking your site with a "simple" (as in rather complicated) regular expression (or two) and gave a specific example for use with a Ghost blog, which will be useful if you've been following my series of articles on how to upgrade Ghost. Finally, there's a note about migrating your comments, which is very easy with Disqus (and any other comment providers that have a 301 redirect crawler).


Main photo by Mona Mok on Unsplash (edited by me).