
Functional Taxonomy

This document organizes every documented file by functional purpose. Each file is assigned to exactly one category according to what it does, not the language it is implemented in. Each file also receives an importance score reflecting how critical its function is.

Importance Scoring

| Score | Meaning | Criteria |
| --- | --- | --- |
| 5 | Critical Path | Site breaks without it; core build/rendering |
| 4 | Major Feature | Key user-facing functionality |
| 3 | Supporting | Enables major features to work |
| 2 | Enhancement | Improves UX/performance, not essential |
| 1 | Peripheral | Utilities, edge cases, rarely used |

Category Overview

| Category | Description | Files | Critical (5) |
| --- | --- | --- | --- |
| Build Pipeline | Site generation, compilation, deployment | 15 | 3 |
| Annotation & Metadata | Link metadata scraping, storage, display | 33 | 2 |
| Popup System | Hover popups, popins, extract coordination | 12 | 2 |
| Link Processing | Archives, icons, auto-linking, IDs | 22 | 1 |
| Content Rendering | Transclusion, DOM rewriting, content loading | 10 | 2 |
| Typography & Layout | Text transforms, sidenotes, columns | 16 | 1 |
| Theming & UI | Dark mode, reader mode, colors, CSS | 20 | 0 |
| Utilities & Infrastructure | Helpers, config, templates, server | 72 | 1 |

Total: 200 files
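
The totals can be checked against the Category Overview figures. The sketch below transcribes the per-category counts by hand into a dictionary named `CATEGORIES` (an illustrative name, not something from the codebase) and sums them:

```python
# (files, critical) per category, transcribed from the Category Overview table.
CATEGORIES = {
    "Build Pipeline": (15, 3),
    "Annotation & Metadata": (33, 2),
    "Popup System": (12, 2),
    "Link Processing": (22, 1),
    "Content Rendering": (10, 2),
    "Typography & Layout": (16, 1),
    "Theming & UI": (20, 0),
    "Utilities & Infrastructure": (72, 1),
}

total_files = sum(files for files, _ in CATEGORIES.values())
total_critical = sum(critical for _, critical in CATEGORIES.values())

print(total_files, total_critical)  # 200 12
```

The file count matches the stated total of 200, and the critical count matches the 12 files listed under Critical Path Files.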


1. Build Pipeline

Core infrastructure that compiles Markdown into the deployed website. Without these, no site gets built.

| File | Score | Role |
| --- | --- | --- |
| sync-sh | 5 | Master build orchestrator (~1,900 lines), coordinates all phases |
| hakyll-hs | 5 | Hakyll site generator entry point, Pandoc transform pipeline |
| bash-sh | 5 | Bash helper library used by sync.sh |
| preprocess-markdown-hs | 4 | Markdown preprocessing before Hakyll |
| generate-directory-hs | 4 | Generates tag directory index pages |
| generate-link-bibliography-hs | 4 | Creates per-page link bibliographies |
| check-metadata-hs | 3 | Validates annotation metadata database |
| test-hs | 3 | Test suite runner |
| pre-commit-hook | 3 | Git hook triggering asset rebuilds |
| markdown-lint | 2 | Markdown linting checks |
| markdown-footnote-length-hs | 2 | Footnote length warnings |
| markdown-length-checker-hs | 2 | Code block line length checker |
| anchor-checker | 2 | HTML anchor validation |
| duplicate-quote-site-finder-hs | 1 | Near-duplicate quote detection |
| cycle-hs | 2 | Prevents infinite rewrite loops |
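
The role of cycle-hs is given only as preventing infinite rewrite loops; how it does so is not described here. One plausible check, offered as a sketch only, treats rewrite rules as (source, target) pairs with one target per source and follows each chain until it ends or revisits a key. The function name `find_rewrite_cycle` and the rule format are assumptions for illustration.

```python
def find_rewrite_cycle(rules):
    """Return one cycle as a list of keys, or None if every chain terminates.

    `rules` is a list of (source, target) pairs. If a source appears twice,
    the last pair wins (an assumption of this sketch).
    """
    targets = dict(rules)
    for start in targets:
        seen = [start]
        current = targets[start]
        # Follow the chain while the current value is itself rewritable.
        while current in targets:
            if current in seen:
                return seen[seen.index(current):]
            seen.append(current)
            current = targets[current]
    return None


print(find_rewrite_cycle([("a", "b"), ("b", "c")]))              # None
print(find_rewrite_cycle([("a", "b"), ("b", "c"), ("c", "a")]))  # ['a', 'b', 'c']
```

Running such a check at build time would reject a rule set before any rewrite is applied, rather than detecting a loop while it is spinning.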

2. Annotation & Metadata

The annotation system provides rich metadata for links (title, author, date, abstract, tags) displayed in popups.

Core System

| File | Score | Role |
| --- | --- | --- |
| link-metadata-hs | 5 | Central manager: lookup, creation, caching |
| annotation-hs | 5 | Master dispatcher routing URLs to scrapers |
| link-metadata-types-hs | 4 | Type definitions (Metadata, MetadataItem) |
| gtx-hs | 4 | GTX format parser/writer |
| annotations-js | 4 | Frontend annotation data loading/caching |

Scrapers

| File | Score | Role |
| --- | --- | --- |
| annotation-arxiv-hs | 3 | Arxiv API metadata extraction |
| annotation-biorxiv-hs | 3 | BioRxiv/MedRxiv HTML scraping |
| annotation-gwernnet-hs | 3 | Local page annotation extraction |
| annotation-openreview-hs | 3 | OpenReview.net scraping |
| annotation-pdf-hs | 3 | PDF metadata via exiftool |
| annotation-dump-hs | 2 | Annotation database dump utility |
| openreviewabstract | 2 | Shell wrapper for OpenReview |
| preprocess-annotation | 2 | Annotation preprocessing |

Metadata Processing

| File | Score | Role |
| --- | --- | --- |
| metadata-author-hs | 3 | Author name canonicalization |
| metadata-date-hs | 3 | Date parsing and validation |
| metadata-format-hs | 3 | Abstract/title HTML cleaning |
| metadata-title-hs | 3 | Title extraction from pages |
| paragraph-hs | 3 | gpt-4o-mini paragraph splitting |
| date-guesser | 2 | LLM date extraction |
| title-cleaner | 2 | LLM title cleanup |
| paragraphizer | 2 | LLM paragraph splitting |
| tagguesser | 2 | LLM tag suggestions |
| italicizer | 1 | LLM italicization |
| text2epositive | 1 | Text sentiment analysis |

Config

| File | Score | Role |
| --- | --- | --- |
| config-metadata-author-hs | 2 | Author canonicalization rules |
| config-metadata-format-hs | 2 | HTML rewrite patterns |
| config-metadata-title-hs | 2 | Bad title string filters |

3. Popup System

Displays preview content when hovering over links—the most distinctive user-facing feature.

Core Popup/Popin

| File | Score | Role |
| --- | --- | --- |
| popups-js | 5 | Main popup positioning, windowing (~2,700 lines) |
| extracts-js | 5 | Orchestrates content → popup coordination |
| popins-js | 4 | Mobile-friendly popin variant |
| extracts-annotations-js | 4 | Annotation-specific extract handling |
| extracts-content-js | 4 | Content extraction for popups |
| extracts-options-js | 3 | User preferences for extracts |
| extracts-load-js | 3 | Extract system bootstrapping |

Backend Support

| File | Score | Role |
| --- | --- | --- |
| link-live-hs | 3 | Determines if URL can be iframed |
| config-link-live-hs | 2 | Domain whitelist/blacklist for live popups |
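
A whitelist/blacklist decision of the kind link-live-hs is described as making can be sketched as follows. The domain lists, the function names, the subdomain-matching rule, and the choice to let the blacklist take precedence are all assumptions for illustration; the real entries and logic live in config-link-live-hs and link-live-hs.

```python
from urllib.parse import urlparse

# Hypothetical example lists; not the site's real configuration.
GOOD_DOMAINS = {"example.org"}
BAD_DOMAINS = {"example.com"}


def domain_matches(host, domains):
    """True if host equals a listed domain or is a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)


def can_show_live(url):
    """Return True (whitelisted), False (blacklisted), or None (unknown)."""
    host = urlparse(url).hostname or ""
    if domain_matches(host, BAD_DOMAINS):
        return False
    if domain_matches(host, GOOD_DOMAINS):
        return True
    return None


print(can_show_live("https://blog.example.org/post"))  # True
print(can_show_live("https://example.com/"))           # False
print(can_show_live("https://unknown.test/x"))         # None
```

Returning `None` for unlisted domains keeps "not yet classified" distinct from "known not to work in an iframe".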

Templates

| File | Score | Role |
| --- | --- | --- |
| pop-frame-title-standard | 3 | Standard popup title template |
| annotation-blockquote-inside | 3 | Annotation quote template variant |
| annotation-blockquote-outside | 3 | Annotation quote template variant |

4. Link Processing

Multiple subsystems for enriching, archiving, and managing links.

| File | Score | Role |
| --- | --- | --- |
| link-archive-hs | 5 | Rewrites external URLs to local mirrors |
| config-link-archive-hs | 3 | Archive whitelist/blacklist (~800 domains) |
| link-archive | 3 | SingleFile-based archiving script |
| deconstruct-singlefile | 2 | Splits monolithic archives |

| File | Score | Role |
| --- | --- | --- |
| link-icon-hs | 4 | Assigns icon based on URL/domain |
| config-link-icon-hs | 3 | URL→icon mapping rules |
| build-icon-sprite-file | 2 | SVG icon sprite generation |

Auto-Linking & Interwiki

| File | Score | Role |
| --- | --- | --- |
| link-auto-hs | 4 | Auto-hyperlinks ~1000 terms/citations |
| config-link-auto-hs | 3 | Regex patterns for auto-linking |
| interwiki-hs | 3 | !W → Wikipedia expansion |
| config-interwiki-hs | 2 | Interwiki patterns |

| File | Score | Role |
| --- | --- | --- |
| link-id-hs | 3 | Generates citation-style IDs (foo-2020) |
| config-link-id-hs | 2 | ID override mappings |
| link-suggester-hs | 2 | Generates Emacs link suggestions |
| link-extractor-hs | 2 | Extracts URLs from Markdown |
| link-prioritize-hs | 2 | Ranks unannotated links |
| link-titler-hs | 2 | Adds tooltips to bare links |
| link-tooltip-hs | 2 | Parses tooltips to metadata |

| File | Score | Role |
| --- | --- | --- |
| links-css | 4 | Link icons and annotation styling |
| links | 4 | CSS link styling rules |
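
link-id-hs is described above as generating citation-style IDs such as `foo-2020`. The exact rules are not given here, so the following is a sketch that assumes the ID is the first author's surname, lowercased, joined to the four-digit year. `make_link_id` and its input formats (a comma-separated author string and an ISO-style date) are hypothetical.

```python
import re


def make_link_id(authors, date):
    """Build an ID like 'foo-2020' from an author string and an ISO date.

    Returns None when no surname or four-digit year can be extracted.
    """
    first_author = authors.split(",")[0].strip()
    surname = first_author.split()[-1].lower() if first_author else ""
    # Keep only characters that are safe in an HTML id / URL fragment.
    surname = re.sub(r"[^a-z0-9-]", "", surname)
    year = date[:4]
    if not surname or not re.fullmatch(r"\d{4}", year):
        return None
    return f"{surname}-{year}"


print(make_link_id("Jane Foo, John Bar", "2020-05-01"))  # foo-2020
```

Cases this rule handles badly (for example, two papers by the same first author in the same year) are presumably what the override mappings in config-link-id-hs exist for.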

5. Content Rendering

Loads, transforms, and displays content—the backbone of the frontend.

Core Frontend Framework

| File | Score | Role |
| --- | --- | --- |
| initial-js | 5 | GW namespace, notification center (~1,335 lines) |
| rewrite-js | 5 | 80+ DOM transformation handlers |
| content-js | 4 | Polymorphic content loading |
| transclude-js | 4 | Include-link resolution |
| rewrite-initial-js | 3 | Early DOM rewrites |
| utility-js | 4 | General JavaScript utilities |

Backend Generators

| File | Score | Role |
| --- | --- | --- |
| generate-backlinks-hs | 3 | Generates reverse citations |
| generate-similar-hs | 3 | RP-tree embedding search |
| generate-similar-links-hs | 3 | Generates similar link HTML |
| link-backlink-hs | 3 | Backlinks database I/O |

6. Typography & Layout

Text transformation, formatting, and page structure.

Typography Transforms

| File | Score | Role |
| --- | --- | --- |
| typography-hs | 5 | Title case, wbr, rulers, citations |
| config-typography-hs | 3 | Typography settings |
| typography-js | 3 | Client-side typography enhancements |

Layout Components

| File | Score | Role |
| --- | --- | --- |
| sidenotes-js | 4 | Sidenote positioning algorithm |
| collapse-js | 3 | Collapsible section handling |
| layout-js | 3 | Block layout primitives |
| columns-hs | 2 | Multi-column list detection |

Inflation Adjustment

| File | Score | Role |
| --- | --- | --- |
| inflation-hs | 3 | Dollar/Bitcoin inflation adjustment |
| config-inflation-hs | 2 | CPI/PCE/Bitcoin rate data |
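
Inflation adjustment against a price index generally scales a nominal amount by the ratio of the target year's index to the source year's index. The sketch below shows that arithmetic only. The index values are made up for illustration and are not real CPI or PCE data, and `adjust_for_inflation` is a hypothetical name rather than a function from inflation-hs.

```python
# Illustrative price-index values only; not real CPI data.
PRICE_INDEX = {2000: 100.0, 2010: 125.0, 2020: 150.0}


def adjust_for_inflation(amount, from_year, to_year, index=PRICE_INDEX):
    """Scale a nominal amount by the ratio of the two years' index values."""
    return amount * index[to_year] / index[from_year]


print(adjust_for_inflation(10.0, 2000, 2020))  # 15.0
```

With these example values, $10 in 2000 corresponds to $15 in 2020 because the index rose from 100 to 150.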

Image Processing

| File | Score | Role |
| --- | --- | --- |
| image-hs | 3 | Dimensions, inversion detection, lazy loading |
| image-focus-js | 3 | Lightbox/image viewer |
| invertornot | 2 | GPT-4V inversion classifier |
| image-margin-checker | 1 | Image margin analysis |
| should-image-have-outline | 2 | Corner analysis for outline CSS |
| build-inlined-images | 2 | Base64 inline images |

7. Theming & UI

Visual customization, styling, and user interface.

Dark Mode

| File | Score | Role |
| --- | --- | --- |
| dark-mode-js | 4 | Dark mode toggle and persistence |
| dark-mode-initial-js | 4 | Early dark mode (FOUC prevention) |
| dark-mode-adjustments | 3 | Dark mode image filters |
| dark-mode-adjustments-css | 3 | Frontend dark mode CSS |
| build-mode-css | 2 | Generates dark mode CSS |

Reader Mode

| File | Score | Role |
| --- | --- | --- |
| reader-mode-js | 3 | Reader mode functionality |
| reader-mode-initial-js | 3 | Early reader mode setup |
| reader-mode-css | 3 | Reader mode CSS |
| reader-mode-initial | 3 | Initial reader mode CSS |

Colors & Styling

| File | Score | Role |
| --- | --- | --- |
| color-js | 3 | Color manipulation utilities |
| colors | 4 | Light mode color variables |
| colors-css | 4 | Frontend color definitions |
| color-scheme-convert | 2 | Color scheme conversion |

Core CSS

| File | Score | Role |
| --- | --- | --- |
| default | 4 | Main site stylesheet |
| default-css | 4 | Frontend default CSS |
| initial | 4 | Initial/critical path CSS |
| initial-css | 4 | Frontend initial CSS |
| light-mode-adjustments | 2 | Light mode specific adjustments |

Special Features

| File | Score | Role |
| --- | --- | --- |
| special-occasions-js | 2 | Holiday themes (Halloween, Christmas) |
| special-occasions-css | 2 | Holiday CSS |

8. Utilities & Infrastructure

Supporting code, configuration, templates, and server setup.

Backend Utilities

| File | Score | Role |
| --- | --- | --- |
| utils-hs | 5 | ~150 utility functions |
| query-hs | 3 | Pandoc AST queries |
| unique-hs | 2 | Duplicate detection |
| string-replace-hs | 2 | Parallel string replacement |
| text-regex-hs | 2 | Regex utilities and pattern matching |
| rename-hs | 2 | Page rename script generator |
| gwernnet-cabal | 2 | Cabal project configuration |
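
The table does not say what "parallel" means for string-replace-hs. Reading it as simultaneous multi-pattern replacement (an assumption; it could equally refer to concurrent execution), the idea is to apply all replacements in a single pass so that the output of one rule is never fed into another. A sketch of that technique, with the hypothetical name `replace_many`:

```python
import re


def replace_many(text, replacements):
    """Apply all replacements in one pass so outputs are never re-replaced."""
    if not replacements:
        return text
    # Longest keys first so overlapping keys prefer the longer match.
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: replacements[m.group(0)], text)


print(replace_many("a b", {"a": "b", "b": "a"}))  # b a
```

Applying the same two rules one after the other would give `a a` instead, since the first rule's output would be rewritten by the second.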

Frontend Utilities

| File | Score | Role |
| --- | --- | --- |
| misc-js | 2 | Miscellaneous features |
| console-js | 2 | Console utilities |
| 404-guesser-js | 2 | 404 page redirect suggestions |
| gwtar-js | 2 | JavaScript tarball archive handling |
| gwtar-noscript-html | 1 | Noscript fallback for gwtar |

Tags & Navigation

| File | Score | Role |
| --- | --- | --- |
| tags-hs | 4 | Tag management, directory listing |
| config-tags-hs | 3 | Tag aliases, hierarchy |
| guess-tag-hs | 2 | Tag suggestion from partial input |
| change-tag-hs | 2 | Batch tag operations |

Content Features

| File | Score | Role |
| --- | --- | --- |
| blog-hs | 3 | Blog entry generation |
| x-of-the-day-hs | 2 | Quote/site/annotation of the day |
| config-x-of-the-day-hs | 2 | XOTD database paths |

PHP Asset Pipeline

| File | Score | Role |
| --- | --- | --- |
| build_unified_assets | 4 | CSS/JS concatenation |
| build_asset_versions | 3 | Asset version manifest |
| build_font_css | 3 | Font CSS generation |
| build_versioned_font_css | 3 | Versioned font CSS |
| build_head_includes | 3 | Head HTML includes |
| build_body_includes | 3 | Body HTML includes |
| build_standalone_includes | 3 | Standalone page includes |
| build_paths | 2 | Build path constants |
| build_variables | 2 | Build variables |
| build_functions | 2 | Shared build utilities |
| version_asset_links | 2 | Cache busting |
| asset | 2 | Asset management |
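
Cache busting, the role listed for version_asset_links, usually means giving an asset a new URL whenever its content changes so that browsers do not serve a stale cached copy. The sketch below shows one common approach, a content hash in a query parameter. It is written in Python for illustration, and the `?v=` parameter, the 8-character digest, and the example path are assumptions, not details taken from the PHP scripts.

```python
import hashlib


def versioned_url(path, content):
    """Append a short content hash so changed assets get a new URL."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    return f"{path}?v={digest}"


# Hypothetical asset path and contents.
old = versioned_url("/static/css/default.css", b"body { color: black; }")
new = versioned_url("/static/css/default.css", b"body { color: navy; }")
print(old != new)  # True
```

A modification timestamp would serve the same purpose; a content hash has the property that rebuilding an unchanged file leaves its URL, and therefore the cached copy, intact.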

SVG/Font Processing

| File | Score | Role |
| --- | --- | --- |
| svg-squeeze | 2 | SVG optimization |
| svg-strip-background | 1 | SVG background removal |
| font-spec | 2 | Font specification |

Python Utilities

| File | Score | Role |
| --- | --- | --- |
| clean-pdf | 2 | GPT-4 OCR/formatting cleanup |
| daterange-checker | 1 | Date range validation |
| collapse-checker | 1 | Collapse validation |
| htmlAttributesExtract | 1 | HTML attribute extraction |
| latex2unicode | 2 | LaTeX to Unicode conversion |
| seriate | 2 | Optimal ordering for similarity matrices |

Shell Utilities

| File | Score | Role |
| --- | --- | --- |
| embed | 3 | OpenAI embeddings API wrapper |
| upload | 2 | File upload with processing |
| download-title | 2 | Download and extract title |
| gwsed | 2 | Site-wide string replacement |
| compress-gif | 2 | GIF compression optimization |
| compress-png | 2 | PNG compression optimization |

Configuration Modules

| File | Score | Role |
| --- | --- | --- |
| config-misc-hs | 3 | Global settings |
| config-paragraph-hs | 2 | Paragraph settings |
| config-link-suggester-hs | 2 | Link suggester config |
| config-generate-similar-hs | 2 | Embedding settings |
| nginx-redirect-guesser-hs | 1 | 404 redirect suggestions |

HTML Templates

| File | Score | Role |
| --- | --- | --- |
| default | 4 | Main HTML template |
| sourcecode | 3 | Source code display template |
| annotation-blockquote-not | 2 | Annotation without blockquote |
| annotation-partial-inline | 2 | Partial annotation template |
| github-issue-blockquote-not | 2 | GitHub issue template |
| github-issue-blockquote-outside | 2 | GitHub issue template variant |
| google-cse | 2 | Google custom search template |
| google-search | 2 | Google search template |
| tweet-blockquote-not | 2 | Tweet template |
| tweet-blockquote-outside | 2 | Tweet template variant |
| wikipedia-entry-blockquote-inside | 2 | Wikipedia template |
| wikipedia-entry-blockquote-not | 2 | Wikipedia template variant |
| wikipedia-entry-blockquote-title-not | 2 | Wikipedia template variant |

Include Templates

| File | Score | Role |
| --- | --- | --- |
| include-footer | 3 | Footer include |
| include-sidebar | 3 | Sidebar include |
| include-inlined-asset-links | 2 | Asset link include |
| include-inlined-head | 2 | Head include |
| include-inlined-standalone | 2 | Standalone include |
| template-html5-articleedit | 2 | Article edit template |

Server & Nginx

| File | Score | Role |
| --- | --- | --- |
| gwern-net-conf | 4 | Main nginx configuration |
| redirect-nginx | 3 | Nginx redirect rules |
| redirect-nginx-broken | 2 | Broken redirect handling |
| memoriam-sh | 1 | Memorial system |
| rsyncd-conf | 2 | rsync daemon config |
| twdne-conf | 1 | TWDNE subdomain config |

Misc Templates

| File | Score | Role |
| --- | --- | --- |
| idealconditionsdonotexistandwillneverhappen | 1 | Error template |
| unfortunatelytheclockisticking | 1 | Error template |

Data Flow Summary

┌───────────────────────────────────────────────────────────┐
│                   BUILD TIME (sync.sh)                    │
├───────────────────────────────────────────────────────────┤
│                                                           │
│  Markdown → Pandoc AST → Haskell Transforms → HTML + JSON │
│                                                           │
│  Key transforms:                                          │
│    • Typography.hs: text polish                           │
│    • LinkAuto.hs: auto-linking                            │
│    • LinkMetadata.hs: annotation marking                  │
│    • LinkArchive.hs: archive localization                 │
│    • LinkIcon.hs: icon assignment                         │
│                                                           │
│  Asset generation:                                        │
│    • PHP scripts: CSS/JS bundling                         │
│    • Annotation fragments: HTML for popups                │
│    • Embeddings: similar link computation                 │
│                                                           │
└───────────────────────────────────────────────────────────┘
                     ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓
┌───────────────────────────────────────────────────────────┐
│                     RUNTIME (Browser)                     │
├───────────────────────────────────────────────────────────┤
│                                                           │
│  initial.js → GW.notificationCenter (pub/sub event bus)   │
│                                                           │
│  Content events: GW.contentDidLoad → GW.contentDidInject  │
│                                                           │
│  Phase order: transclude → rewrite → eventListeners       │
│                                                           │
│  Key modules:                                             │
│    • extracts.js: popup/popin coordination                │
│    • popups.js: windowing system                          │
│    • transclude.js: include-link resolution               │
│    • rewrite.js: DOM transformations                      │
│                                                           │
└───────────────────────────────────────────────────────────┘
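
The runtime half of the diagram describes a pub/sub event bus whose handlers run in a fixed phase order. The sketch below illustrates that pattern in Python rather than the site's JavaScript. The class and method names (`NotificationCenter`, `add_handler`, `fire`) are hypothetical and are not the `GW.notificationCenter` API; only the event name and the phase names come from the diagram.

```python
from collections import defaultdict

# Phase names taken from the diagram; handlers run in this order.
PHASES = ["transclude", "rewrite", "eventListeners"]


class NotificationCenter:
    """Minimal pub/sub bus that orders handlers by phase."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def add_handler(self, event, handler, phase):
        self._handlers[event].append((PHASES.index(phase), handler))

    def fire(self, event, info):
        # sorted() is stable, so handlers within a phase keep registration order.
        for _, handler in sorted(self._handlers[event], key=lambda h: h[0]):
            handler(info)


log = []
center = NotificationCenter()
# Register in the "wrong" order on purpose; the phase decides execution order.
center.add_handler("GW.contentDidLoad", lambda info: log.append("listeners"), "eventListeners")
center.add_handler("GW.contentDidLoad", lambda info: log.append("rewrite"), "rewrite")
center.add_handler("GW.contentDidLoad", lambda info: log.append("transclude"), "transclude")
center.fire("GW.contentDidLoad", {"source": "demo"})
print(log)  # ['transclude', 'rewrite', 'listeners']
```

Ordering by phase rather than by registration time lets modules load in any order while ensuring, for example, that event listeners are attached only after transclusion and rewriting have finished changing the DOM.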

Critical Path Files (Score 5)

These 12 files are essential—the site won't build or function without them:

| Category | File | Why Critical |
| --- | --- | --- |
| Build | sync-sh | Master orchestrator |
| Build | hakyll-hs | Site generator |
| Build | bash-sh | Build helpers |
| Annotation | link-metadata-hs | Annotation system core |
| Annotation | annotation-hs | Scraper dispatcher |
| Popup | popups-js | Popup display system |
| Popup | extracts-js | Content coordination |
| Content | initial-js | Event system foundation |
| Content | rewrite-js | DOM transformation |
| Typography | typography-hs | Text transforms |
| Link | link-archive-hs | Archive system |
| Utilities | utils-hs | Shared helpers |

See Also