Code examples and configuration guides for implementing AIPREF across different platforms
AIPREF can be implemented using two standard methods: the Content-Usage HTTP response header and Content-Usage directives in robots.txt.
HTTP headers are preferred because they apply per-resource and take precedence over robots.txt. Both methods can be used together for maximum compatibility.
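To make the precedence rule concrete, here is a minimal sketch of how a consumer might combine the two sources. The helper names `parseContentUsage` and `effectivePreferences` are hypothetical, and the per-key merge (a key set in the header overrides the same key from robots.txt) is an assumption about how "take precedence" is applied, not something this guide specifies.

```javascript
// Parse a Content-Usage value such as "train-ai=n, search=y" into an object.
// Hypothetical helper for illustration; not part of any AIPREF library.
function parseContentUsage(value) {
  const prefs = {};
  if (!value) return prefs;
  for (const item of value.split(',')) {
    const [key, val] = item.split('=').map((s) => s.trim());
    // Keep only well-formed key=y / key=n pairs.
    if (key && (val === 'y' || val === 'n')) {
      prefs[key] = val;
    }
  }
  return prefs;
}

// Assumption: precedence is applied per key, so a key set in the header
// overrides the same key from robots.txt, and robots.txt fills in the rest.
function effectivePreferences(headerValue, robotsValue) {
  return {
    ...parseContentUsage(robotsValue),
    ...parseContentUsage(headerValue),
  };
}

const result = effectivePreferences('train-ai=n', 'train-ai=y, search=y');
console.log(result); // { 'train-ai': 'n', search: 'y' }
```

Sending both the header and the robots.txt directive, as suggested above, means a client that reads only one of them still sees a preference.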
Add the Content-Usage header to all responses or specific locations in your Nginx configuration.
server {
listen 80;
server_name example.com;
# Add Content-Usage header to all responses
add_header Content-Usage "train-ai=n, train-genai=n" always;
location / {
root /var/www/html;
index index.html;
}
}

server {
listen 80;
server_name example.com;
# Public documentation - allow all
location /docs/ {
add_header Content-Usage "bots=y, train-ai=y, train-genai=y, search=y" always;
root /var/www/html;
}
# Premium content - block AI training
location /premium/ {
add_header Content-Usage "train-ai=n, train-genai=n, search=y" always;
root /var/www/html;
}
# Private API - block everything except authenticated bots
location /api/ {
add_header Content-Usage "bots=n, train-ai=n, train-genai=n, search=n" always;
proxy_pass http://backend;
}
}

The always parameter ensures the header is added even for error responses (4xx, 5xx).

Configure Content-Usage headers using the mod_headers module in Apache.
# Enable mod_headers if not already enabled
# LoadModule headers_module modules/mod_headers.so

# Add Content-Usage header to all responses
Header set Content-Usage "train-ai=n, train-genai=n"

<VirtualHost *:80>
ServerName example.com
DocumentRoot /var/www/html
# Default: block AI training
Header set Content-Usage "train-ai=n, train-genai=n, search=y"
# Public docs: allow all
<Directory "/var/www/html/docs">
Header set Content-Usage "bots=y, train-ai=y, train-genai=y, search=y"
</Directory>
# Premium content: strict controls
<Directory "/var/www/html/premium">
Header set Content-Usage "bots=n, train-ai=n, train-genai=n, search=y"
</Directory>
</VirtualHost>

Implement Content-Usage headers in Next.js using middleware or custom headers in next.config.js.
Create middleware.ts in your project root:
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';
export function middleware(request: NextRequest) {
const response = NextResponse.next();
// Site-wide preference
response.headers.set(
'Content-Usage',
'train-ai=n, train-genai=n, search=y'
);
// Path-specific preferences
if (request.nextUrl.pathname.startsWith('/docs')) {
response.headers.set(
'Content-Usage',
'bots=y, train-ai=y, train-genai=y, search=y'
);
} else if (request.nextUrl.pathname.startsWith('/api')) {
response.headers.set(
'Content-Usage',
'bots=n, train-ai=n, train-genai=n, search=n'
);
}
return response;
}
export const config = {
matcher: [
'/((?!_next/static|_next/image|favicon.ico).*)',
],
};

// next.config.js
/** @type {import('next').NextConfig} */
const nextConfig = {
async headers() {
return [
{
source: '/:path*',
headers: [
{
key: 'Content-Usage',
value: 'train-ai=n, train-genai=n, search=y',
},
],
},
{
source: '/docs/:path*',
headers: [
{
key: 'Content-Usage',
value: 'bots=y, train-ai=y, train-genai=y, search=y',
},
],
},
];
},
};
module.exports = nextConfig;

Add Content-Usage headers using Express middleware.
const express = require('express');
const app = express();
// Global AIPREF middleware
app.use((req, res, next) => {
res.setHeader('Content-Usage', 'train-ai=n, train-genai=n, search=y');
next();
});
// Your routes
app.get('/', (req, res) => {
res.send('Hello World');
});
app.listen(3000);

const express = require('express');
const app = express();
// Middleware factory for AIPREF
function aipref(preferences) {
return (req, res, next) => {
res.setHeader('Content-Usage', preferences);
next();
};
}
// Public docs - allow all
app.use('/docs',
aipref('bots=y, train-ai=y, train-genai=y, search=y'),
express.static('public/docs')
);
// Premium content - block AI training
app.use('/premium',
aipref('train-ai=n, train-genai=n, search=y'),
express.static('public/premium')
);
// API - block all automated access
app.use('/api',
aipref('bots=n, train-ai=n, train-genai=n, search=n')
);
app.listen(3000);

Add Content-Usage directives to your robots.txt file at the root of your domain.
User-Agent: *
Allow: /
Content-Usage: train-ai=n, train-genai=n, search=y

User-Agent: *
Allow: /
# Default preference for most content
Content-Usage: train-ai=n, train-genai=n, search=y

# Public documentation - allow AI training
User-Agent: *
Allow: /docs/
Content-Usage: bots=y, train-ai=y, train-genai=y, search=y

# Premium content - strict controls
User-Agent: *
Allow: /premium/
Content-Usage: bots=n, train-ai=n, train-genai=n, search=y

# Private sections - block all
User-Agent: *
Disallow: /private/
Content-Usage: bots=n, train-ai=n, train-genai=n, search=n

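If the per-path rules live in code, the robots.txt file can be generated from them instead of maintained by hand. The sketch below assumes a hypothetical rule shape (`comment`, `allow`, `disallow`, `usage`) and a hypothetical `buildRobotsTxt` helper; it only reproduces the layout shown above and does not validate the directives.

```javascript
// Build robots.txt text from a list of rule groups, mirroring the layout above.
// The rule shape ({ comment, allow, disallow, usage }) is hypothetical.
function buildRobotsTxt(rules) {
  const groups = rules.map((rule) => {
    const lines = [];
    if (rule.comment) lines.push(`# ${rule.comment}`);
    lines.push('User-Agent: *');
    if (rule.allow) lines.push(`Allow: ${rule.allow}`);
    if (rule.disallow) lines.push(`Disallow: ${rule.disallow}`);
    lines.push(`Content-Usage: ${rule.usage}`);
    return lines.join('\n');
  });
  // Separate groups with a blank line and end the file with a newline.
  return groups.join('\n\n') + '\n';
}

const robotsTxt = buildRobotsTxt([
  { allow: '/', usage: 'train-ai=n, train-genai=n, search=y' },
  {
    comment: 'Private sections - block all',
    disallow: '/private/',
    usage: 'bots=n, train-ai=n, train-genai=n, search=n',
  },
]);
console.log(robotsTxt);
```

The resulting string can be written to the site root at build time, so the file and the header configuration are derived from one rule list.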
Use gatsby-plugin-netlify or gatsby-ssr.js:
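// gatsby-ssr.js
// Note: the gatsby-ssr.js hooks shape the rendered HTML; they do not set
// HTTP response headers, so the hooks below are placeholders only.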
export const onPreRenderHTML = ({ getHeadComponents }) => {
if (typeof window !== 'undefined') {
return;
}
};
export const onRenderBody = ({ setHeadComponents }) => {
setHeadComponents([]);
};
// Use gatsby-plugin-netlify for headers
// In gatsby-config.js:
module.exports = {
plugins: [
{
resolve: 'gatsby-plugin-netlify',
options: {
headers: {
'/*': [
'Content-Usage: train-ai=n, train-genai=n, search=y',
],
},
},
},
],
};

For Hugo, configure headers in your deployment platform (Netlify, Vercel) or use a _headers file:
# static/_headers (for Netlify)
/*
  Content-Usage: train-ai=n, train-genai=n, search=y
/docs/*
  Content-Usage: bots=y, train-ai=y, train-genai=y, search=y

After implementing AIPREF, verify that your headers are being sent correctly:
curl -I https://example.com
# Look for:
# Content-Usage: train-ai=n, train-genai=n, search=y

1. Open your website in a browser
2. Open DevTools (F12)
3. Go to Network tab
4. Reload the page
5. Click on the main document request
6. Check Response Headers for Content-Usage
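
The manual checks above can also be scripted. The following is a minimal sketch, assuming a hypothetical `checkContentUsage` helper that compares a header value copied from a response against the preferences you expect; it reports mismatches rather than enforcing anything.

```javascript
// Compare the Content-Usage value a server sent against expected preferences.
// Returns a list of human-readable problems; an empty list means a match.
// checkContentUsage is a hypothetical helper for illustration.
function checkContentUsage(actualValue, expected) {
  if (!actualValue) return ['Content-Usage header is missing'];

  // Parse "key=value, key=value" into an object.
  const actual = {};
  for (const item of actualValue.split(',')) {
    const [key, val] = item.split('=').map((s) => s.trim());
    if (key) actual[key] = val;
  }

  // Report every expected key whose value differs or is absent.
  const problems = [];
  for (const [key, want] of Object.entries(expected)) {
    if (actual[key] !== want) {
      problems.push(`${key}: expected ${want}, got ${actual[key] ?? 'nothing'}`);
    }
  }
  return problems;
}

const problems = checkContentUsage(
  'train-ai=n, train-genai=n, search=y',
  { 'train-ai': 'n', search: 'y' }
);
console.log(problems); // []
```

In a runtime with a global `fetch` (for example Node.js 18 or later), the first argument could come from `response.headers.get('content-usage')` after requesting each path you configured, including one that returns an error status.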