How to generate a robots.txt file in React Router 7 and Remix

Learn how to dynamically generate a robots.txt file for your React Router 7 or Remix app with environment-based rules and automatic sitemap integration.

A properly configured robots.txt file is crucial for controlling how search engines crawl and index your website. Instead of serving a static file, generating it dynamically allows you to adapt the rules based on your environment.

What exactly is a robots.txt file?

A robots.txt file tells search engines which parts of your site they’re allowed to visit and which parts to skip. It helps control crawler behavior to avoid unnecessary load or indexing of unimportant routes. While it doesn’t guarantee pages stay out of search results, it’s useful for guiding how bots interact with your app.
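For context, a minimal robots.txt that lets every crawler visit the site but skip one (hypothetical) admin section looks like this:

User-agent: *
Disallow: /admin/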

Install the @forge42/seo-tools package

If you haven't already installed it for your sitemap generation, add the @forge42/seo-tools package. It provides utilities for handling SEO-related tasks and has a dedicated @forge42/seo-tools/remix export that works with Remix and React Router 7 applications.

npm install @forge42/seo-tools

Create the robots.txt resource route

Create a file at app/routes/robots.txt.ts. This will serve as a resource route for dynamically generating your robots.txt.

app/routes/robots.txt.ts
import { generateRobotsTxt } from "@forge42/seo-tools/robots";
import { href } from "react-router";
import type { Route } from "./+types/robots.txt";

export async function loader({ request }: Route.LoaderArgs) {
  const isProductionDeployment = process.env.VERCEL_ENV === "production";
  const { origin } = new URL(request.url);

  const robotsTxt = generateRobotsTxt([
    {
      userAgent: "*",
      [isProductionDeployment ? "allow" : "disallow"]: ["/"],
      ...(isProductionDeployment
        ? {
            disallow: ["/api/"],
          }
        : {}),
      sitemap: [origin + href("/sitemap.xml")],
    },
  ]);

  return new Response(robotsTxt, {
    headers: {
      "Content-Type": "text/plain",
    },
  });
}

Breaking Down the Implementation

Let's examine what each part of this code does:

Environment-Based Access Control

const isProductionDeployment = process.env.VERCEL_ENV === "production";

I'm using Vercel as my hosting provider, so its environment variable is the example here, but you can adapt the check to whichever platform you deploy to (see the sketch after this list):

  • Vercel: process.env.VERCEL_ENV === "production"
  • Netlify: process.env.CONTEXT === "production"
  • Generic: process.env.NODE_ENV === "production"
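If you deploy to more than one platform, you could fold these checks into a small helper. This is only a sketch built from the environment variables listed above; adapt the variable names to your own setup:

// Hypothetical helper: treats the deployment as production if the
// platform-specific variable (or NODE_ENV as a fallback) says so.
function isProductionDeployment(): boolean {
  if (process.env.VERCEL_ENV) return process.env.VERCEL_ENV === "production"; // Vercel
  if (process.env.CONTEXT) return process.env.CONTEXT === "production"; // Netlify
  return process.env.NODE_ENV === "production"; // generic fallback
}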

Dynamic Allow/Disallow Rules

[isProductionDeployment ? "allow" : "disallow"]: ["/"],

This pattern uses computed property names to either:

  • Production: Allow all routes (allow: ["/"])
  • Non-production: Block all routes (disallow: ["/"])

Tip

Blocking crawlers on staging, preview, and development environments prevents search engines from indexing incomplete or test content that could hurt your SEO.

Production-Specific Rules

...(isProductionDeployment
  ? {
      disallow: ["/api/"],
    }
  : {})

In production, we still want to block certain paths like API endpoints that shouldn't be crawled by search engines.
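Spelled out by hand, the rule object passed to generateRobotsTxt resolves to one of these two shapes (purely illustrative, with the href() call replaced by its resulting path for clarity):

const rule = isProductionDeployment
  ? // Production: crawl everything except the API routes
    { userAgent: "*", allow: ["/"], disallow: ["/api/"], sitemap: [origin + "/sitemap.xml"] }
  : // Preview/development: block the whole site
    { userAgent: "*", disallow: ["/"], sitemap: [origin + "/sitemap.xml"] };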

Automatic Sitemap Integration

sitemap: [origin + href("/sitemap.xml")],

This automatically includes your sitemap URL, using:

  • origin from the request to get the correct domain
  • href() function for type-safe route references

Tip

If you want to learn how to generate a sitemap in React Router 7, check out my sitemap generation guide.
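As a quick illustration of that type safety (this snippet assumes the sitemap route from the guide above is registered in your route config):

// href() is type-checked against your registered routes, so this compiles
// only because a "/sitemap.xml" route exists.
const sitemapPath = href("/sitemap.xml"); // -> "/sitemap.xml"
// A typo such as href("/site-map.xml") would fail at compile time
// (assuming no such route is registered).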

Register the route

Add the robots.txt route to your route configuration:

app/routes.ts
import { type RouteConfig, route } from "@react-router/dev/routes";

export default [
  // ...
  route("robots.txt", "routes/robots.txt.ts"),
] satisfies RouteConfig;

Test the robots.txt output in the browser

Visit /robots.txt in your browser to see the generated file. You should see different output based on your environment:

Production environment:

User-agent: *
Allow: /
Disallow: /api/
Sitemap: https://yourdomain.com/sitemap.xml

Non-production environment:

User-agent: *
Disallow: /
Sitemap: http://localhost:5173/sitemap.xml
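You can also inspect the raw response from the command line, for example against a local dev server on Vite's default port:

curl -i http://localhost:5173/robots.txt

The -i flag includes the response headers, so you can confirm the Content-Type: text/plain header alongside the generated rules.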

Add dynamic logic or custom rules (optional)

You can extend the robots.txt with more sophisticated rules:

const robotsTxt = generateRobotsTxt([
  {
    userAgent: "*",
    [isProductionDeployment ? "allow" : "disallow"]: ["/"],
    ...(isProductionDeployment
      ? {
          disallow: [
            "/api/",
            "/admin/",
            "/private/",
            "/*.json$", // Block JSON files
            "/search?*", // Block search result pages
          ],
          crawlDelay: 1, // Be nice to servers
        }
      : {}),
    sitemap: [origin + href("/sitemap.xml")],
  },
  // Specific rules for different bots
  ...(isProductionDeployment
    ? [
        {
          userAgent: "Googlebot",
          allow: ["/api/og/*"], // Allow OG image generation
        },
      ]
    : []),
]);

Validate your robots.txt

Run the generated file through a robots.txt validator, such as the robots.txt report in Google Search Console, to confirm that your rules parse the way you expect.
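If you'd also like an automated check, here's a minimal sketch of a unit test for the loader. It assumes Vitest is set up, that ~ is aliased to your app directory, and that React Router's type generation has run; adjust the import path to match your project:

import { expect, test } from "vitest";
import { loader } from "~/routes/robots.txt";

test("robots.txt includes the sitemap for the requesting origin", async () => {
  const response = await loader({
    request: new Request("https://example.com/robots.txt"),
    params: {},
    context: {},
  } as Parameters<typeof loader>[0]);

  // The loader should answer with plain text and reference the sitemap
  // on the same origin as the incoming request.
  expect(response.headers.get("Content-Type")).toBe("text/plain");
  expect(await response.text()).toContain("https://example.com/sitemap.xml");
});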

For more advanced robots.txt configurations and SEO tools, check out the official documentation.