How to generate a robots.txt file in React Router 7 and Remix

Learn how to dynamically generate a robots.txt file for your React Router 7 or Remix app with environment-based rules and automatic sitemap integration.
A properly configured robots.txt file is crucial for controlling how search engines crawl and index your website. Instead of serving a static file, generating it dynamically allows you to adapt the rules based on your environment.
A robots.txt file tells search engines which parts of your site they're allowed to visit and which parts to skip. It helps control crawler behavior to avoid unnecessary load or indexing of unimportant routes. While it doesn't guarantee pages stay out of search results, it's useful for guiding how bots interact with your app.
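For context, a robots.txt file is just plain text: one or more groups of directives, each scoped to a user agent, optionally followed by sitemap references. A minimal example (the domain and path are placeholders):

```txt
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
```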
#Install the @forge42/seo-tools package
If you haven't already installed it for your sitemap generation, add the @forge42/seo-tools package. It provides utilities for handling SEO-related tasks and has a dedicated @forge42/seo-tools/remix export that works with Remix and React Router 7 applications.
npm install @forge42/seo-tools
#Create the robots.txt resource route
Create a file at app/routes/robots.txt.ts. This will serve as a resource route for dynamically generating your robots.txt.
```ts
import { generateRobotsTxt } from "@forge42/seo-tools/robots";
import { href } from "react-router";

import type { Route } from "./+types/robots.txt";

export async function loader({ request }: Route.LoaderArgs) {
  const isProductionDeployment = process.env.VERCEL_ENV === "production";
  const { origin } = new URL(request.url);

  const robotsTxt = generateRobotsTxt([
    {
      userAgent: "*",
      [isProductionDeployment ? "allow" : "disallow"]: ["/"],
      ...(isProductionDeployment
        ? {
            disallow: ["/api/"],
          }
        : {}),
      sitemap: [origin + href("/sitemap.xml")],
    },
  ]);

  return new Response(robotsTxt, {
    headers: {
      "Content-Type": "text/plain",
    },
  });
}
```
#Breaking Down the Implementation
Let's examine what each part of this code does:
#Environment-Based Access Control
const isProductionDeployment = process.env.VERCEL_ENV === "production";
I'm using Vercel as my hosting service here as an example, but you can adapt this check to your deployment platform (a combined helper is sketched after the list):
- Vercel: process.env.VERCEL_ENV === "production"
- Netlify: process.env.CONTEXT === "production"
- Generic: process.env.NODE_ENV === "production"
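If you deploy the same app to more than one platform, you could centralize the check in a small helper. This is only a sketch; the isProductionEnvironment function and the particular variables it reads are my own choices, not part of @forge42/seo-tools:

```ts
// Hypothetical helper, not part of @forge42/seo-tools: returns true only for
// real production deployments. Adjust the variables to whatever your platform exposes.
export function isProductionEnvironment(): boolean {
  return (
    process.env.VERCEL_ENV === "production" || // Vercel
    process.env.CONTEXT === "production" || // Netlify
    process.env.NODE_ENV === "production" // generic fallback
  );
}

// In the loader, you would then write:
// const isProductionDeployment = isProductionEnvironment();
```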
#Dynamic Allow/Disallow Rules
[isProductionDeployment ? "allow" : "disallow"]: ["/"],
This pattern uses computed property names to either:
- Production: allow all routes (allow: ["/"])
- Non-production: block all routes (disallow: ["/"])
Blocking crawlers on staging, preview, or development deployments helps prevent search engines from indexing incomplete or test content that could hurt your SEO.
#Production-Specific Rules
```ts
...(isProductionDeployment
  ? {
      disallow: ["/api/"],
    }
  : {})
```
In production, we still want to block certain paths like API endpoints that shouldn't be crawled by search engines.
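Put together with the previous rule, the entry effectively resolves to one of these two shapes (the sitemap field is omitted here; it's covered in the next section, and the variable names are just for illustration):

```ts
// Production deployment
const productionEntry = { userAgent: "*", allow: ["/"], disallow: ["/api/"] };

// Staging, preview, or local development
const previewOrDevelopmentEntry = { userAgent: "*", disallow: ["/"] };
```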
#Automatic Sitemap Integration
sitemap: [origin + href("/sitemap.xml")],
This automatically includes your sitemap URL, using:
- origin from the request to get the correct domain
- the href() function for type-safe route references
If you want to learn how to generate a sitemap in React Router 7, check out my sitemap generation guide.
#Register the route
Add the robots.txt route to your route configuration in app/routes.ts:
```ts
import { type RouteConfig, route } from "@react-router/dev/routes";

export default [
  // ...
  route("robots.txt", "routes/robots.txt.ts"),
] satisfies RouteConfig;
```
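If you're on Remix with v2 flat routes rather than React Router 7, there is no routes.ts to register; the file name itself maps to the URL, and square brackets escape the dot so the route matches /robots.txt literally. Here's a rough sketch with the href() and ./+types imports swapped for their Remix equivalents, based on my understanding of the flat-routes convention:

```ts
// app/routes/[robots.txt].ts — the brackets make the "." part of the URL
// segment instead of a nested-route separator.
import type { LoaderFunctionArgs } from "@remix-run/node";
import { generateRobotsTxt } from "@forge42/seo-tools/robots";

export async function loader({ request }: LoaderFunctionArgs) {
  const isProductionDeployment = process.env.VERCEL_ENV === "production";
  const { origin } = new URL(request.url);

  const robotsTxt = generateRobotsTxt([
    {
      userAgent: "*",
      [isProductionDeployment ? "allow" : "disallow"]: ["/"],
      ...(isProductionDeployment ? { disallow: ["/api/"] } : {}),
      // No type-safe href() helper here, so the sitemap path is a plain string.
      sitemap: [origin + "/sitemap.xml"],
    },
  ]);

  return new Response(robotsTxt, {
    headers: { "Content-Type": "text/plain" },
  });
}
```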
#Test the robots.txt output in the browser
Visit /robots.txt in your browser to see the generated file. You should see different output based on your environment:
Production environment:

```txt
User-agent: *
Allow: /
Disallow: /api/
Sitemap: https://yourdomain.com/sitemap.xml
```

Non-production environment:

```txt
User-agent: *
Disallow: /
Sitemap: http://localhost:5173/sitemap.xml
```
#Add dynamic logic or custom rules (optional)
You can extend the robots.txt with more sophisticated rules:
```ts
const robotsTxt = generateRobotsTxt([
  {
    userAgent: "*",
    [isProductionDeployment ? "allow" : "disallow"]: ["/"],
    ...(isProductionDeployment
      ? {
          disallow: [
            "/api/",
            "/admin/",
            "/private/",
            "/*.json$", // Block JSON files
            "/search?*", // Block search result pages
          ],
          crawlDelay: 1, // Be nice to servers
        }
      : {}),
    sitemap: [origin + href("/sitemap.xml")],
  },
  // Specific rules for different bots
  ...(isProductionDeployment
    ? [
        {
          userAgent: "Googlebot",
          allow: ["/api/og/*"], // Allow OG image generation
        },
      ]
    : []),
]);
```
#Validate your robots.txt
Run the generated file through a robots.txt validator, such as the robots.txt report in Google Search Console, to confirm it parses the way you expect.
For more advanced robots.txt configurations and other SEO utilities, check out the official @forge42/seo-tools documentation.