November 24, 2024

How to Create a Sitemap and robots.txt in Next.js App Router

Generate a dynamic sitemap.xml and robots.txt in Next.js App Router. Covers static pages, dynamic routes, blog posts, and sitemap indexing for large sites.

Ryel Banfield

Founder & Lead Developer

Sitemaps tell search engines what pages exist on your site and when they were last updated. robots.txt tells crawlers which URLs they may crawl and which to skip. Both are essential for technical SEO.

Sitemap

Basic Static Sitemap

// app/sitemap.ts
import type { MetadataRoute } from "next";

export default function sitemap(): MetadataRoute.Sitemap {
  return [
    {
      url: "https://yourdomain.com",
      lastModified: new Date(),
      changeFrequency: "weekly",
      priority: 1,
    },
    {
      url: "https://yourdomain.com/about",
      lastModified: new Date(),
      changeFrequency: "monthly",
      priority: 0.8,
    },
    {
      url: "https://yourdomain.com/services",
      lastModified: new Date(),
      changeFrequency: "monthly",
      priority: 0.9,
    },
    {
      url: "https://yourdomain.com/contact",
      lastModified: new Date(),
      changeFrequency: "monthly",
      priority: 0.7,
    },
  ];
}

This generates /sitemap.xml automatically.

Dynamic Sitemap with Blog Posts

// app/sitemap.ts
import type { MetadataRoute } from "next";

async function getBlogPosts() {
  // Fetch from your CMS, database, or file system.
  // getAllPosts() is a placeholder for your own data-access helper.
  const posts = await getAllPosts();
  return posts;
}

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const baseUrl = "https://yourdomain.com";

  // Static pages
  const staticPages: MetadataRoute.Sitemap = [
    {
      url: baseUrl,
      lastModified: new Date(),
      changeFrequency: "weekly",
      priority: 1,
    },
    {
      url: `${baseUrl}/about`,
      lastModified: new Date(),
      changeFrequency: "monthly",
      priority: 0.8,
    },
    {
      url: `${baseUrl}/services`,
      lastModified: new Date(),
      changeFrequency: "monthly",
      priority: 0.9,
    },
    {
      url: `${baseUrl}/blog`,
      lastModified: new Date(),
      changeFrequency: "daily",
      priority: 0.9,
    },
    {
      url: `${baseUrl}/contact`,
      lastModified: new Date(),
      changeFrequency: "monthly",
      priority: 0.7,
    },
  ];

  // Dynamic blog posts
  const posts = await getBlogPosts();
  const blogPages: MetadataRoute.Sitemap = posts.map((post) => ({
    url: `${baseUrl}/blog/${post.slug}`,
    lastModified: new Date(post.date),
    changeFrequency: "monthly" as const,
    priority: 0.6,
  }));

  return [...staticPages, ...blogPages];
}

Multiple Sitemaps for Large Sites

For sites with more than 50,000 URLs (Google's limit per sitemap file), split the sitemap by exporting generateSitemaps alongside the default sitemap function:

// app/sitemap.ts
import type { MetadataRoute } from "next";

const SITEMAP_SIZE = 50000; // Google's max URLs per sitemap file

export async function generateSitemaps() {
  // getPostCount() is your own data-access helper
  const totalPosts = await getPostCount();
  const numSitemaps = Math.ceil(totalPosts / SITEMAP_SIZE);

  // One entry per generated sitemap file
  return Array.from({ length: numSitemaps }, (_, i) => ({ id: i }));
}

export default async function sitemap({
  id,
}: {
  id: number;
}): Promise<MetadataRoute.Sitemap> {
  const start = id * SITEMAP_SIZE;
  const posts = await getPostsRange(start, start + SITEMAP_SIZE);

  return posts.map((post) => ({
    url: `https://yourdomain.com/blog/${post.slug}`,
    lastModified: new Date(post.date),
  }));
}

Next.js serves each generated file at /sitemap/[id].xml. Note that it does not emit a sitemap index automatically, so submit each generated file to Search Console or list them in robots.txt.
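The id math is easy to get subtly wrong (for example, an empty site should still yield one sitemap, not zero). A small pure helper, sketched here under the same 50,000-URL assumption, makes it testable in isolation:

```typescript
// Compute the list of sitemap ids for a given URL count.
// e.g. 120,001 URLs at 50,000 per file => ids 0, 1, 2.
export function sitemapIds(totalUrls: number, perSitemap = 50000) {
  const count = Math.max(1, Math.ceil(totalUrls / perSitemap));
  return Array.from({ length: count }, (_, i) => ({ id: i }));
}
```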

robots.txt

Basic robots.txt

// app/robots.ts
import type { MetadataRoute } from "next";

export default function robots(): MetadataRoute.Robots {
  return {
    rules: {
      userAgent: "*",
      allow: "/",
      disallow: ["/api/", "/admin/", "/private/"],
    },
    sitemap: "https://yourdomain.com/sitemap.xml",
  };
}

Environment-Aware robots.txt

Block indexing on staging/preview environments:

// app/robots.ts
import type { MetadataRoute } from "next";

export default function robots(): MetadataRoute.Robots {
  const baseUrl = process.env.NEXT_PUBLIC_URL || "https://yourdomain.com";
  const isProduction = process.env.NODE_ENV === "production"
    && baseUrl === "https://yourdomain.com";

  if (!isProduction) {
    return {
      rules: {
        userAgent: "*",
        disallow: "/",
      },
    };
  }

  return {
    rules: [
      {
        userAgent: "*",
        allow: "/",
        disallow: ["/api/", "/admin/"],
      },
    ],
    sitemap: `${baseUrl}/sitemap.xml`,
  };
}

This prevents staging and preview deployments from being indexed by search engines.
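NODE_ENV alone can be misleading on some hosts: Vercel, for example, builds preview deployments with NODE_ENV set to "production" and exposes a separate VERCEL_ENV variable instead. A hedged helper for that case (adapt the variable names to your host):

```typescript
// Returns true only for real production deploys. Prefers VERCEL_ENV when
// present (on Vercel it is "production" | "preview" | "development");
// otherwise falls back to NODE_ENV.
export function isIndexableDeploy(
  env: Record<string, string | undefined> = process.env,
): boolean {
  if (env.VERCEL_ENV) return env.VERCEL_ENV === "production";
  return env.NODE_ENV === "production";
}
```

Call isIndexableDeploy() in robots() in place of the isProduction check above.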

Blocking Specific Bots

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      {
        userAgent: "*",
        allow: "/",
        disallow: ["/api/", "/admin/"],
      },
      {
        userAgent: "GPTBot",
        disallow: "/", // Block OpenAI's crawler
      },
      {
        userAgent: "CCBot",
        disallow: "/", // Block Common Crawl
      },
    ],
    sitemap: "https://yourdomain.com/sitemap.xml",
  };
}
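To see what rules like these become on the wire, here is an illustrative serializer (not the actual Next.js implementation) that renders a rules array in roughly the robots.txt format Next.js emits:

```typescript
// Illustrative only: serialize a rules array into robots.txt text,
// approximating what Next.js outputs for the config above.
type Rule = { userAgent: string; allow?: string; disallow?: string | string[] };

export function renderRobots(rules: Rule[], sitemap?: string): string {
  const blocks = rules.map((r) => {
    const lines = [`User-Agent: ${r.userAgent}`];
    if (r.allow) lines.push(`Allow: ${r.allow}`);
    for (const d of [r.disallow ?? []].flat()) lines.push(`Disallow: ${d}`);
    return lines.join("\n");
  });
  const out = blocks.join("\n\n");
  return sitemap ? `${out}\n\nSitemap: ${sitemap}` : out;
}
```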

Priority Values Guide

Page Type | Suggested Priority
--- | ---
Homepage | 1.0
Main service/product pages | 0.9
Blog index, about page | 0.8
Individual blog posts | 0.6
Contact, legal pages | 0.5–0.7
Utility pages | 0.3

Priority values are relative hints to search engines, not absolute rankings.
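If you prefer to keep these conventions in code rather than scattered across entries, a small path-based helper can apply the table consistently (the route patterns here are assumptions about a typical site layout):

```typescript
// Map a route to a suggested sitemap priority per the table above.
export function priorityFor(path: string): number {
  if (path === "/") return 1.0;
  if (path === "/services") return 0.9;
  if (path === "/blog" || path === "/about") return 0.8;
  if (path.startsWith("/blog/")) return 0.6;
  if (path === "/contact") return 0.7;
  return 0.5;
}
```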

Change Frequency Guide

Frequency | Use For
--- | ---
always | Real-time content (stock prices, live feeds)
hourly | News sites, high-frequency updates
daily | Blog index, active forums
weekly | Homepage, frequently updated pages
monthly | Most content pages, blog posts
yearly | Legal pages, rarely changed content
never | Archived content

Verification

  1. Visit yourdomain.com/sitemap.xml — should show XML
  2. Visit yourdomain.com/robots.txt — should show rules
  3. Submit sitemap to Google Search Console
  4. Use Search Console's robots.txt report to verify rules (Google retired its standalone robots.txt tester)
  5. Check that blocked paths return the expected behavior

Common Mistakes

  1. Blocking CSS/JS files: Search engines need these to render pages
  2. Different robots.txt on staging: Staging should block all crawlers
  3. Stale lastModified dates: Use actual modification dates, not new Date()
  4. Missing sitemap reference: Always include sitemap URL in robots.txt
  5. Forgetting dynamic routes: Generate sitemap entries for all public pages
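Mistake 3 is the most common of these. One hedged fix, assuming your posts carry a publish date and an optional updatedAt field:

```typescript
// Use the post's real modification date when available; fall back to its
// publish date. Never default to new Date(), which tells crawlers every
// page changed on every build.
export function lastModifiedFor(post: { date: string; updatedAt?: string }): Date {
  return new Date(post.updatedAt ?? post.date);
}
```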

Need SEO Setup?

We configure sitemaps, robots.txt, structured data, and technical SEO for every website we build. Contact us for professional SEO implementation.

Tags: sitemap, robots.txt, SEO, Next.js, tutorial
