Let’s say you have a website built on a platform that excels at design and it’s available at example.com. But that platform falls short at blogging. So you think to yourself, “What if I could use a different blogging platform and make it available at example.com/blog?”
Most people would tell you that goes against how DNS and websites are supposed to work and to use a subdomain instead. But there are benefits to keeping your content on the root domain that we just don’t get with subdomains.
There’s a way to serve two different platforms on the same URL. And I’m going to show you the secret sauce so that, by the end of this article, we’ll make blog.example.com serve as example.com/blog.
Why you’d want to do this
Because you’re here, you probably already know why this is a path to pursue. But I’d like to ensure you are here for the primary reason to do this: SEO. Check out these 14 case studies that show positive results when people move their subdomains over to subdirectories. You want your blog and your domain to share SEO value. Putting it on a subdomain would somewhat disconnect the two.
This was my reason, and I wound up merging two platforms, where the main domain was on WordPress and the subdomain was on Drupal. But this tutorial is platform agnostic; it’ll work with just about any platform.
That said, the Cloudflare approach we’re covering in this tutorial is incompatible with Shopify unless you pay for Cloudflare’s Enterprise plan. That’s because Shopify also uses Cloudflare and does not allow us to proxy the traffic on their free pricing tier.
Step 0 (Preview)
Before I jump in, I want to give a high-level overview of what’s going to happen. In short, we’ll have two websites: our main one (example.com) and the subdomain (blog.example.com). I use “blog” as an example, but in my case, I needed to drop in Drupal with a different type of content. But a blog is the typical use case.
This approach relies on using Cloudflare for DNS and a little extra something that’ll provide the magic. We’re going to tell Cloudflare that when someone visits example.com/blog, it should:
- intercept that request (because example.com/blog doesn’t really exist),
- request a different domain (blog.example.com/blog) behind the scenes, and
- deliver the results from that last step to the visitor masked through example.com/blog.
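The three steps above boil down to swapping the hostname while leaving the path and query string alone. Here is a minimal sketch of that mapping as a standalone function. The name toOriginUrl is a hypothetical helper, not part of any Cloudflare API, and the hostnames are this article’s placeholders.

```javascript
// Hypothetical helper: map a public URL to the URL fetched behind the scenes.
// Assumes the public site is example.com and the blog lives on blog.example.com.
function toOriginUrl(requestUrl) {
  const url = new URL(requestUrl);
  // Swap only the hostname; the path (/blog/...) and query string stay the same.
  url.hostname = 'blog.example.com';
  return url.toString();
}

console.log(toOriginUrl('https://example.com/blog/my-post?page=2'));
// prints "https://blog.example.com/blog/my-post?page=2"
```

Setting the hostname on a URL object is a slightly stricter alternative to a plain string replace, since it cannot accidentally touch a matching string elsewhere in the path or query.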
Okay, let’s dive into it in more detail!
Step 1: Using Cloudflare
Again, we’re using Cloudflare for the DNS. Pointing your domain’s DNS there is the first step to getting started.
The reason for Cloudflare is that it allows us to create Workers that are capable of running a bit of code anytime somebody visits certain URLs (called Routes which we’ll create in step 3). This code will be responsible for switching the websites behind the scenes.
Cloudflare has an excellent guide to getting started. The goal is to point your domain’s DNS, wherever the domain is registered, to Cloudflare’s nameservers and confirm that the domain shows as connected in your Cloudflare account.
Step 2: Create the Worker
This code will be responsible for switching the websites behind the scenes. Head over to Workers and click Create a Service.
Name your service, then select “HTTP handler”:
Click Create Service and then Quick Edit.
Paste in the following code and replace the domain names with your own in the line that sets originUrl:
// Listen for every request and respond with our function.
// Note, this will only run on the routes configured in Cloudflare.
addEventListener('fetch', function(event) {
  event.respondWith(handleRequest(event.request))
})

// Our function to handle the response.
async function handleRequest(request) {
  // Only GET requests work with this proxy.
  if (request.method !== 'GET')
    return MethodNotAllowed(request);

  // The URL that is being requested.
  const url = new URL(request.url);

  // Request "origin URL" aka the real blog instead of what was requested.
  // This switches out the absolute URL leaving the relative path unchanged.
  const originUrl = url.toString().replace('https://example.com', 'https://blog.example.com');

  // The contents of the origin page.
  const originPage = await fetch(originUrl);

  // Give the response our origin page.
  const newResponse = new Response(originPage.body, originPage);
  return newResponse;
}

// Hey! GET requests only
function MethodNotAllowed(request) {
  return new Response(`Method ${request.method} not allowed.`, {
    status: 405,
    headers: { 'Allow': 'GET' }
  })
}
Lastly, click Save and Deploy.
Step 3: Add Routes
Now let’s inform Cloudflare which URLs (aka Routes) to run this code on. Head over to the website in Cloudflare, then click Workers.
There is the Workers section on the main screen of Cloudflare, where you edit the code, and then there is the Workers section on each website where you add the routes. They are two different places, and it’s confusing.
First off, click Add Route:
Because we are adding a blog that has many child pages, we’ll use https://example.com/blog*. Note the asterisk acts as a wild card for matching. This code will run on the blog page and every page that begins with blog.
This can have unintended consequences. Say, for example, you have a page that starts with “blog” but isn’t a part of the actual blog, like https://example.com/blogging-services. That would get picked up with this rule.
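One way to guard against that, sketched here, is to have the Worker check the path itself before proxying. The name isBlogPath is a hypothetical helper, and the sketch assumes real blog URLs are either exactly /blog or live under /blog/.

```javascript
// Hypothetical guard: true only for /blog itself or anything under /blog/.
function isBlogPath(pathname) {
  return pathname === '/blog' || pathname.startsWith('/blog/');
}

// The route https://example.com/blog* sends all three of these to the Worker,
// but only the first two pass the guard.
for (const href of [
  'https://example.com/blog',
  'https://example.com/blog/my-post',
  'https://example.com/blogging-services',
]) {
  console.log(href, isBlogPath(new URL(href).pathname));
}
```

Inside handleRequest, a request that fails the check could be passed through untouched with return fetch(request), so the main site keeps serving that page. If you later add routes for asset folders, the guard would need to allow those paths as well.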
Then, select the Worker in the Service dropdown.
We have a lot of the work done, but there are more routes we need to add — the CSS, JavaScript, and other file paths that the blog is dependent on (unless all the files are hosted on a different URL, such as on a CDN). A good way to find these is by testing your route and checking the console.
Head over to https://example.com/blog and make sure something is loading. It’ll look messed up because it’s missing the theme files. That’s fine for now, just as long as it’s not producing a 404 error. The important thing is to open up your browser’s DevTools, fire up the console, and make note of all the red URLs it can’t find or load (usually a 404 or 403) that are a part of your domain.
You’ll want to add those as routes… but only do the parent paths. So, if your red URL is https://example.com/wp-content/themes/file1.css, then do https://example.com/wp-content* as your route. You can add a child path, too, if you want to be more specific, but the idea is to use one route to catch most of the files.
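If the console lists a lot of failing files, a small script can collapse them into parent routes. This is only a sketch: toParentRoute and toRoutes are hypothetical helper names, the URLs are made-up examples, and it assumes every URL has at least one folder in its path.

```javascript
// Hypothetical helper: turn a failing asset URL into a route for its top-level folder.
// Assumes the URL has at least one path segment.
function toParentRoute(assetUrl) {
  const url = new URL(assetUrl);
  // First non-empty path segment, e.g. "wp-content".
  const [firstSegment] = url.pathname.split('/').filter(Boolean);
  return `${url.origin}/${firstSegment}*`;
}

// Collapse a list of failing URLs (copied from the console) into unique routes.
function toRoutes(assetUrls) {
  return [...new Set(assetUrls.map(toParentRoute))];
}

console.log(toRoutes([
  'https://example.com/wp-content/themes/file1.css',
  'https://example.com/wp-content/plugins/file2.js',
  'https://example.com/wp-includes/js/file3.js',
]));
// prints [ 'https://example.com/wp-content*', 'https://example.com/wp-includes*' ]
```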
Once you add those routes, check out your URL and see if it looks like your subdomain. If it doesn’t, check the previous steps. (Chances are you will need to add more routes.)
It’s best to do a quality check by navigating to multiple pages and seeing if anything is missing. I also recommend opening up DevTools and searching for your subdomain (blog.example.com). If that’s showing up, you either need to add routes to target those resources or do something with your platform to stop outputting those URLs. For example, my platform was outputting a canonical tag with my subdomain, so I found a plugin to modify the canonical URL to be my root domain.
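That search can also be scripted. The sketch below scans a string of HTML for URLs that still point at the subdomain. The name findSubdomainUrls is a hypothetical helper and the sample markup is invented for illustration.

```javascript
// Hypothetical helper: list every distinct URL in the markup that still points at the subdomain.
function findSubdomainUrls(html, subdomain = 'blog.example.com') {
  // Escape dots so they match literally, then read up to a quote, space, or angle bracket.
  const escaped = subdomain.replace(/\./g, '\\.');
  const pattern = new RegExp(`https?://${escaped}[^\\s"'<>]*`, 'g');
  return [...new Set(html.match(pattern) || [])];
}

// Invented sample markup: one leftover canonical tag, one asset on the root domain.
const sample = `
  <link rel="canonical" href="https://blog.example.com/blog/my-post">
  <img src="https://example.com/wp-content/uploads/photo.jpg">
`;
console.log(findSubdomainUrls(sample));
// prints [ 'https://blog.example.com/blog/my-post' ]
```

In the DevTools console, the same function could be pointed at the live page with findSubdomainUrls(document.documentElement.outerHTML).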
Step 4: The secretest of sauces (noindex)
You might see that we have a problem. Our pages are available at two different URLs. Yeah, we can use the canonical attribute to inform Google which URL is our “main” one, but let’s not leave it up to Google to pick the right one.
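First, set your entire subdomain as noindex (the way to do this will vary by platform). Because the Worker removes an HTTP header, the noindex needs to be sent as an X-Robots-Tag response header; a robots meta tag in the HTML would pass through the proxy untouched. Then, in the Cloudflare Worker, we are going to add the following line of code, which removes the noindex header when the page is served through the proxy.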
newResponse.headers.delete("x-robots-tag");
The full code solution is provided at the end of this article.
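To see what that header deletion does in isolation, here is a small sketch using the standard Response and Headers objects. The name stripNoindexHeader is a hypothetical helper, and the origin response is a stand-in for what a noindexed subdomain might send back.

```javascript
// Hypothetical helper mirroring the Worker's step: copy the origin response, then drop the header.
function stripNoindexHeader(originResponse) {
  // Responses returned by fetch() have read-only headers, so build a fresh, editable copy.
  const copy = new Response(originResponse.body, {
    status: originResponse.status,
    statusText: originResponse.statusText,
    headers: new Headers(originResponse.headers),
  });
  copy.headers.delete('x-robots-tag');
  return copy;
}

// Stand-in for the subdomain's response (assumed headers, for illustration only).
const origin = new Response('<h1>My post</h1>', {
  headers: { 'Content-Type': 'text/html', 'X-Robots-Tag': 'noindex' },
});
const proxied = stripNoindexHeader(origin);
console.log(proxied.headers.has('x-robots-tag')); // prints false
console.log(proxied.headers.get('content-type')); // prints "text/html"
```

Header names are case-insensitive, so deleting "x-robots-tag" also removes a header that the origin sent as "X-Robots-Tag". Every other header is carried over unchanged.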
Step 5: Modify the sitemap
The last thing to do is to modify the subdomain’s sitemap so it doesn’t use the subdomain in it. The way to do this will vary by platform, but the goal is to modify the base/absolute/domain in your sitemap so that it prints example.com/mypost instead of blog.example.com/mypost. Some plugins and modules will allow this without custom code.
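To make the target output concrete, here is a sketch of the rewrite applied to a sitemap held in a string. The name rewriteSitemapLocs is a hypothetical helper and the sitemap is a trimmed, invented example. Where this logic actually runs (a plugin, a module, or custom code) depends on your platform.

```javascript
// Hypothetical helper: point every <loc> entry at the root domain instead of the subdomain.
function rewriteSitemapLocs(xml) {
  return xml.replace(/<loc>https?:\/\/blog\.example\.com\//g, '<loc>https://example.com/');
}

// Trimmed, invented sitemap for illustration.
const sitemap = `<urlset>
  <url><loc>https://blog.example.com/blog/my-post</loc></url>
</urlset>`;
console.log(rewriteSitemapLocs(sitemap));
```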
That’s that! The solution should be working!
Limitations
This Cloudflare magic isn’t without its downsides. For example, it only accepts GET requests, meaning we can only get things from the server. We are unable to POST, which is what forms use. So, if you need to have your visitors log in or submit forms, there will be more work on top of what we’ve already done. I discussed several solutions for this in another article.
As noted earlier, another limitation is that using this approach on Shopify requires subscribing to Cloudflare’s Enterprise pricing tier. Again, that’s because Shopify also uses Cloudflare and restricts the ability to proxy traffic on their other plans.
You also might run into issues if you’re trying to merge two instances of the same platform (e.g. both your root domain and subdomain use WordPress). But in a case like that, you should be able to consolidate and use one instance of the platform.
Full solution
Here’s the code in all its glory:
// Listen for every request and respond with our function.
// Note, this will only run on the routes configured in Cloudflare.
addEventListener('fetch', function(event) {
  event.respondWith(handleRequest(event.request))
})

// Our function to handle the response.
async function handleRequest(request) {
  // Only GET requests work with this proxy.
  if (request.method !== 'GET') return MethodNotAllowed(request);

  // The URL that is being requested.
  const url = new URL(request.url);

  // Request "origin URL" aka the real blog instead of what was requested.
  // This switches out the absolute URL leaving the relative path unchanged.
  const originUrl = url.toString().replace('https://example.com', 'https://blog.example.com');

  // The contents of the origin page.
  const originPage = await fetch(originUrl);

  // Give the response our origin page.
  const newResponse = new Response(originPage.body, originPage);

  // Remove "noindex" from the origin domain.
  newResponse.headers.delete("x-robots-tag");

  // Remove Cloudflare cache as it's meant for WordPress.
  // If you are using Cloudflare APO and your blog isn't WordPress, (but
  // your main domain is), then stop APO from running on your origin URL.
  // newResponse.headers.set("cf-edge-cache", "no-cache");

  return newResponse;
}

// Hey! GET requests only
function MethodNotAllowed(request) {
  return new Response(`Method ${request.method} not allowed.`, {
    status: 405,
    headers: { 'Allow': 'GET' }
  })
}
If you need help along the way, I welcome you to reach out to me through my website CreateToday.io or check out my YouTube for a video demonstration.
I don’t think this is an elegant enough solution. It will also remove the deindex header from posts or pages I want Google to not index. And if managing that is required in the code then that’s a code smell as we should be managing content’s state in the CMS.
What is the reason the subdomain exists?
Why does your Drupal application not simply live in the sub-directory you want to serve it from?
Hey John,
Thanks for the tips. Using Cloudflare this way is helping us as it should to resolve our problem and works just fine.
I think that the concern should be that it looks like a black hat technique:
https://www.thesitewizard.com/sitepromotion/cloaked-domain-redirection-issues.shtml
What do you think? Does it make sense?
Unfortunately, newResponse.headers.delete("x-robots-tag"); is not removing the noindex tag on my WordPress site.