Repository Guidelines

Agent Discovery

This site publishes machine-readable resources for AI agents.

  • robots.txt with Content-Signal: search=yes, ai-train=no, ai-input=yes. Retrieval for grounding and search indexing is permitted. Training on this content is not.
  • /llms.txt and /llms-full.txt: short summary and full markdown bundle.
  • /posts.json: machine-readable post list with summaries, thread/surface labels, canonical URLs, and raw-markdown URLs.
  • /feed.xml, /sitemap.xml: standard feeds.
  • /.well-known/api-catalog: RFC 9727 linkset pointing to the resources above.
  • /.well-known/agent-skills/index.json: RFC-v0.2.0 Agent Skills index with research-context and blog-reader SKILL.md files.
  • HTML <link rel="..."> elements on every page advertise the same resources (used in place of HTTP Link headers, which GitHub Pages cannot set).
  • WebMCP: navigator.modelContext.provideContext() registers list_posts, get_post, search_posts, and list_resources tools for browser-resident agents.

Intentionally omitted

  • OAuth 2.0 / OIDC discovery, Protected Resource metadata: this site has no protected APIs. Publishing empty metadata would be misleading.
  • MCP Server Card: no MCP server runs on this origin. If a remote MCP server for this content is added later, its card will live at /.well-known/mcp/server-card.json.
  • HTTP Link response headers (RFC 8288): GitHub Pages on the default *.github.io domain cannot set custom response headers. HTML <link> elements are used as the discoverable substitute. A future move to a custom domain behind Cloudflare would enable true header-based negotiation and Accept: text/markdown content negotiation.

Project Structure & Module Organization

  • _posts/: Blog posts in YYYY-MM-DD-title.md with YAML front matter.
  • _layouts/: Templates (default.html, home.html, post.html) used by pages and posts.
  • assets/: Static assets (CSS at assets/css/style.css).
  • Pages: index.md, projects.md, blog.md, about.md.
  • _config.yml: Site settings (title, links, plugins).

Site Hierarchy & Content Placement

This is a personal research website, not a resume, sales site, or exhaustive archive. Keep the tone restrained, specific, and research-native. Avoid labels such as “portfolio”, “featured”, “case study”, “impact”, and broad marketing copy.

Core navigation

  • / (index.md): front door and curated orientation.
    • Research threads: a compact selection of active/high-signal research threads.
    • More writing: the newest posts where surface: blog.
  • /projects/ (projects.md): research index for frontier-lab readers.
    • Current threads: active or high-signal research systems/workstreams.
    • Papers: arXiv papers only, plus closely supporting code/model/docs links.
    • Production and OSS infrastructure: open-source systems contributions.
  • /blog/ (blog.md): essays, working notes, technical reports, and thinking in public. It lists surface: blog posts first, then surface: artifact posts.
  • /about/ (about.md): biography/background only. Do not turn it into another research index.

Post metadata contract

Every post in _posts/ must include:

  • description: one concise, front-matter-controlled summary used by indexes and SEO metadata. Do not rely on Jekyll automatic excerpts.
  • thread: one simple taxonomy label. Current labels include Scientific discovery, Verification, Post-training, Interpretability, Evaluation, and Learning systems.
  • surface: one of:
    • blog: appears first in /blog/ and may appear in homepage More writing.
    • artifact: research output/post exists as supporting material and appears below essays on /blog/.
    • archive: remains published and discoverable, but is not part of the core Writing index.

Placement rules

  • Put reflective essays, working notes, and thinking-in-public pieces in _posts/ with surface: blog.
  • Put papers, code/model releases, benchmark reports, and system artifacts in _posts/ with surface: artifact; add the strongest items manually to /projects/ when they belong in Current threads or Papers.
  • Put older or lower-priority posts in _posts/ with surface: archive when they should remain public but not occupy the core Writing page.
  • Add arXiv papers to /projects/ under Papers, not to Writing.
  • Add supporting company or prior-lab writing only when it strengthens a broader research entry; do not create standalone archive sections for it.
  • Keep homepage overlap intentional and minimal: it may repeat selected Research and Writing entries as navigation, but it should not become a complete index.

Build, Test, and Development Commands

  • bundle install: Install Ruby gems.
  • bundle exec jekyll serve --livereload: Run locally at http://localhost:4000/.
  • JEKYLL_ENV=production bundle exec jekyll build: Produce _site/ for a production build.
  • bundle update: Refresh locked dependencies when needed.

Coding Style & Naming Conventions

  • Indentation: 2 spaces for HTML, YAML, CSS, and Liquid.
  • Markdown first; use Liquid tags sparingly and keep logic in layouts.
  • Filenames: kebab-case; posts follow YYYY-MM-DD-title.md.
  • Front matter: lowercase keys; include layout, title, date, and optional categories.
  • Accessibility: add descriptive alt text for images.

Testing Guidelines

  • No formal test suite. Validate with:
    • bundle exec jekyll doctor to catch config issues.
    • A full local build and click-through to check links and formatting.
  • If changing layouts or CSS, include a screenshot of the affected pages.

Commit & Pull Request Guidelines

  • Commits: short, imperative, present tense (e.g., update blog, revise copy); add scope when helpful (e.g., layout: adjust header spacing).
  • PRs must include:
    • Summary of changes and rationale.
    • Screenshots/GIFs for visual changes.
    • Any config updates in _config.yml called out.
    • Confirmation that site builds locally.

Security & Configuration Tips

  • Do not commit secrets or API keys.
  • Keep external scripts minimal; prefer local assets in assets/.
  • GitHub Pages auto-deploys from main; use JEKYLL_ENV=production for parity when testing locally.