Functional Taxonomy
This document organizes every documented file by functional purpose. Each file is assigned to exactly one category according to what it does, not the language it is implemented in. Each file also receives an importance score from 1 to 5 that reflects how critical its function is to the site.
Importance Scoring
| Score | Meaning | Criteria |
|---|---|---|
| 5 | Critical Path | Site breaks without it; core build/rendering |
| 4 | Major Feature | Key user-facing functionality |
| 3 | Supporting | Enables major features to work |
| 2 | Enhancement | Improves UX/performance, not essential |
| 1 | Peripheral | Utilities, edge cases, rarely used |
Category Overview
| Category | Description | Files | Critical (5) |
|---|---|---|---|
| Build Pipeline | Site generation, compilation, deployment | 15 | 3 |
| Annotation & Metadata | Link metadata scraping, storage, display | 27 | 2 |
| Popup System | Hover popups, popins, extract coordination | 12 | 2 |
| Link Processing | Archives, icons, auto-linking, IDs | 20 | 1 |
| Content Rendering | Transclusion, DOM rewriting, content loading | 10 | 2 |
| Typography & Layout | Text transforms, sidenotes, columns | 15 | 1 |
| Theming & UI | Dark mode, reader mode, colors, CSS | 20 | 0 |
| Utilities & Infrastructure | Helpers, config, templates, server | 78 | 1 |
Total: 197 files
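The per-category counts in the overview can be derived mechanically from the detailed tables. The following is a minimal sketch, assuming each table row is represented as a `(category, file, score)` tuple; the `ROWS` sample and the `summarize` helper are hypothetical illustrations and not part of the codebase.

```python
from collections import Counter

# Score meanings, copied from the Importance Scoring table.
SCORE_MEANING = {
    5: "Critical Path",
    4: "Major Feature",
    3: "Supporting",
    2: "Enhancement",
    1: "Peripheral",
}

# Hypothetical sample: a handful of rows taken from the tables in this document.
ROWS = [
    ("Build Pipeline", "sync-sh", 5),
    ("Build Pipeline", "hakyll-hs", 5),
    ("Build Pipeline", "markdown-lint", 2),
    ("Popup System", "popups-js", 5),
    ("Popup System", "popins-js", 4),
]

def summarize(rows):
    """Return {category: (file_count, critical_count)} for the given rows."""
    files = Counter(category for category, _, _ in rows)
    critical = Counter(category for category, _, score in rows if score == 5)
    # Counter returns 0 for categories that have no score-5 file.
    return {category: (files[category], critical[category]) for category in files}

for category, (n_files, n_critical) in summarize(ROWS).items():
    print(f"{category}: {n_files} files, {n_critical} critical")
```

Feeding every row of the detailed tables through a helper like this is one way to keep the overview and the Critical Path list in agreement with the tables.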
1. Build Pipeline
Core infrastructure that compiles Markdown into the deployed website. Without these, no site gets built.
| File | Score | Role |
|---|---|---|
| sync-sh | 5 | Master build orchestrator (~1,900 lines), coordinates all phases |
| hakyll-hs | 5 | Hakyll site generator entry point, Pandoc transform pipeline |
| bash-sh | 5 | Bash helper library used by sync.sh |
| preprocess-markdown-hs | 4 | Markdown preprocessing before Hakyll |
| generate-directory-hs | 4 | Generates tag directory index pages |
| generate-link-bibliography-hs | 4 | Creates per-page link bibliographies |
| check-metadata-hs | 3 | Validates annotation metadata database |
| test-hs | 3 | Test suite runner |
| pre-commit-hook | 3 | Git hook triggering asset rebuilds |
| markdown-lint | 2 | Markdown linting checks |
| markdown-footnote-length-hs | 2 | Footnote length warnings |
| markdown-length-checker-hs | 2 | Code block line length checker |
| anchor-checker | 2 | HTML anchor validation |
| duplicate-quote-site-finder-hs | 1 | Near-duplicate quote detection |
| cycle-hs | 2 | Prevents infinite rewrite loops |
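To illustrate what "prevents infinite rewrite loops" can mean in practice, here is a small sketch of cycle detection over rewrite rules. It assumes rules are plain `(before, after)` pairs and uses a depth-first search; the actual Haskell module may represent rules and detect cycles differently, and `find_cycle` is a hypothetical name.

```python
def find_cycle(rules):
    """Return a list of keys forming a rewrite cycle, or None if there is none.

    `rules` is a list of (before, after) pairs, each meaning
    "rewrite `before` to `after`".
    """
    graph = {}
    for before, after in rules:
        graph.setdefault(before, []).append(after)

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                # Back edge: nxt is already on the current path, so we looped.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for start in list(graph):
        if color.get(start, WHITE) == WHITE:
            found = visit(start)
            if found:
                return found
    return None
```

A build step could call such a check on each rule list at startup and abort with the offending chain, rather than letting a rewrite pass spin forever.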
2. Annotation & Metadata
The annotation system provides rich metadata for links (title, author, date, abstract, tags) displayed in popups.
Core System
| File | Score | Role |
|---|---|---|
| link-metadata-hs | 5 | Central manager: lookup, creation, caching |
| annotation-hs | 5 | Master dispatcher routing URLs to scrapers |
| link-metadata-types-hs | 4 | Type definitions (Metadata, MetadataItem) |
| gtx-hs | 4 | GTX format parser/writer |
| annotations-js | 4 | Frontend annotation data loading/caching |
Scrapers
| File | Score | Role |
|---|---|---|
| annotation-arxiv-hs | 3 | arXiv API metadata extraction |
| annotation-biorxiv-hs | 3 | bioRxiv/medRxiv HTML scraping |
| annotation-gwernnet-hs | 3 | Local page annotation extraction |
| annotation-openreview-hs | 3 | OpenReview.net scraping |
| annotation-pdf-hs | 3 | PDF metadata via exiftool |
| annotation-dump-hs | 2 | Annotation database dump utility |
| openreviewabstract | 2 | Shell wrapper for OpenReview |
| preprocess-annotation | 2 | Annotation preprocessing |
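The dispatcher-plus-scrapers arrangement can be pictured as a routing function from URL to scraper. The sketch below is only an illustration under stated assumptions: the routing rules (host suffix, then `.pdf` extension, then site-local path), the `HOST_SCRAPERS` table, and the placeholder scraper functions are all hypothetical, and the real dispatcher is written in Haskell.

```python
from urllib.parse import urlparse

# Placeholder scrapers: each returns a label instead of doing real network I/O.
def scrape_arxiv(url):
    return "arxiv"

def scrape_biorxiv(url):
    return "biorxiv"

def scrape_openreview(url):
    return "openreview"

def scrape_pdf(url):
    return "pdf"

def scrape_local(url):
    return "gwernnet"

# Hypothetical routing table: host suffix -> scraper.
HOST_SCRAPERS = {
    "arxiv.org": scrape_arxiv,
    "biorxiv.org": scrape_biorxiv,
    "medrxiv.org": scrape_biorxiv,
    "openreview.net": scrape_openreview,
}

def dispatch(url):
    """Pick a scraper for `url`, or return None when no scraper applies."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    for suffix, scraper in HOST_SCRAPERS.items():
        if host == suffix or host.endswith("." + suffix):
            return scraper
    if parsed.path.lower().endswith(".pdf"):
        return scrape_pdf
    if url.startswith("/"):  # site-local page
        return scrape_local
    return None
```

Keeping the routing in one table means that adding a scraper is a one-line change, which matches the one-module-per-source layout of the table above.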
Metadata Processing
| File | Score | Role |
|---|---|---|
| metadata-author-hs | 3 | Author name canonicalization |
| metadata-date-hs | 3 | Date parsing and validation |
| metadata-format-hs | 3 | Abstract/title HTML cleaning |
| metadata-title-hs | 3 | Title extraction from pages |
| paragraph-hs | 3 | GPT-4o-mini paragraph splitting |
| date-guesser | 2 | LLM date extraction |
| title-cleaner | 2 | LLM title cleanup |
| paragraphizer | 2 | LLM paragraph splitting |
| tagguesser | 2 | LLM tag suggestions |
| italicizer | 1 | LLM italicization |
| text2epositive | 1 | Text sentiment analysis |
Config
| File | Score | Role |
|---|---|---|
| config-metadata-author-hs | 2 | Author canonicalization rules |
| config-metadata-format-hs | 2 | HTML rewrite patterns |
| config-metadata-title-hs | 2 | Bad title string filters |
3. Popup System
Displays preview content when the reader hovers over a link. This is the site's most distinctive user-facing feature.
Core Popup/Popin
| File | Score | Role |
|---|---|---|
| popups-js | 5 | Main popup positioning, windowing (~2,700 lines) |
| extracts-js | 5 | Orchestrates content → popup coordination |
| popins-js | 4 | Mobile-friendly popin variant |
| extracts-annotations-js | 4 | Annotation-specific extract handling |
| extracts-content-js | 4 | Content extraction for popups |
| extracts-options-js | 3 | User preferences for extracts |
| extracts-load-js | 3 | Extract system bootstrapping |
Backend Support
| File | Score | Role |
|---|---|---|
| link-live-hs | 3 | Determines whether a URL can be shown in an iframe |
| config-link-live-hs | 2 | Domain whitelist/blacklist for live popups |
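A whitelist/blacklist check of this kind has three outcomes: known good, known bad, and untested. The sketch below assumes exact host matching and that the blacklist takes precedence; the `ALLOW` and `DENY` sets and the `live_status` helper are hypothetical examples, not the contents of the real configuration module.

```python
from urllib.parse import urlparse

# Hypothetical example lists; the real lists live in the config module.
ALLOW = {"en.wikipedia.org", "example.org"}
DENY = {"example.com"}

def live_status(url):
    """Return True (known to work in an iframe), False (known not to),
    or None (domain not yet tested)."""
    host = (urlparse(url).hostname or "").lower()
    if host in DENY:  # assumption: the blacklist wins over the whitelist
        return False
    if host in ALLOW:
        return True
    return None
```

Returning `None` for unknown hosts, rather than guessing, leaves room for a separate step that lists untested domains for manual review.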
Templates
| File | Score | Role |
|---|---|---|
| pop-frame-title-standard | 3 | Standard popup title template |
| annotation-blockquote-inside | 3 | Annotation quote template variant |
| annotation-blockquote-outside | 3 | Annotation quote template variant |
4. Link Processing
Multiple subsystems for enriching, archiving, and managing links.
Link Archives (Preemptive link-rot prevention)
| File | Score | Role |
|---|---|---|
| link-archive-hs | 5 | Rewrites external URLs to local mirrors |
| config-link-archive-hs | 3 | Archive whitelist/blacklist (~800 domains) |
| link-archive | 3 | SingleFile-based archiving script |
| deconstruct-singlefile | 2 | Splits monolithic archives |
Link Icons
| File | Score | Role |
|---|---|---|
| link-icon-hs | 4 | Assigns icon based on URL/domain |
| config-link-icon-hs | 3 | URL→icon mapping rules |
| build-icon-sprite-file | 2 | SVG icon sprite generation |
Auto-Linking & Interwiki
| File | Score | Role |
|---|---|---|
| link-auto-hs | 4 | Auto-hyperlinks ~1,000 terms/citations |
| config-link-auto-hs | 3 | Regex patterns for auto-linking |
| interwiki-hs | 3 | !W → Wikipedia expansion |
| config-interwiki-hs | 2 | Interwiki patterns |
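As an illustration of the `!W` expansion, the sketch below turns a link whose target is the literal string `!W` into an English Wikipedia URL built from the anchor text. The optional `title` override and the function name `expand_interwiki` are assumptions made for this example; the real module handles more prefixes and edge cases.

```python
from urllib.parse import quote

def expand_interwiki(anchor_text, target, title=""):
    """Expand a `!W` link target into an English Wikipedia URL.

    Any other target is returned unchanged.
    """
    if target != "!W":
        return target
    # Assumption: a non-empty title overrides the anchor text as article name.
    article = (title or anchor_text).strip().replace(" ", "_")
    return "https://en.wikipedia.org/wiki/" + quote(article)
```

For example, `expand_interwiki("Bayesian inference", "!W")` yields a URL ending in `Bayesian_inference`, while an ordinary `https://` target passes through untouched.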
Link IDs & Bibliography
| File | Score | Role |
|---|---|---|
| link-id-hs | 3 | Generates citation-style IDs (foo-2020) |
| config-link-id-hs | 2 | ID override mappings |
| link-suggester-hs | 2 | Generates Emacs link suggestions |
| link-extractor-hs | 2 | Extracts URLs from Markdown |
| link-prioritize-hs | 2 | Ranks unannotated links |
| link-titler-hs | 2 | Adds tooltips to bare links |
| link-tooltip-hs | 2 | Parses tooltips to metadata |
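A citation-style ID such as `foo-2020` combines author surname and year. The following sketch shows one plausible scheme; the rules for two authors (`foo-bar-2020`) and three or more (`foo-et-al-2020`) are assumptions for illustration, as is the helper name `citation_id`.

```python
import re

def citation_id(authors, date):
    """Build an ID such as 'foo-2020' from an author list and an ISO date string.

    Returns "" when there is no author or no four-digit year to work with.
    """
    authors = [a for a in authors if a.strip()]
    year = date[:4]
    if not authors or not re.fullmatch(r"\d{4}", year):
        return ""
    # Lowercased surname (last word of the name), ASCII letters only.
    surnames = [re.sub(r"[^a-z]", "", a.split()[-1].lower()) for a in authors]
    if len(surnames) == 1:
        stem = surnames[0]
    elif len(surnames) == 2:
        stem = f"{surnames[0]}-{surnames[1]}"
    else:
        stem = f"{surnames[0]}-et-al"
    return f"{stem}-{year}"
```

Because two different works can produce the same ID under any such scheme, an override table like the one in config-link-id-hs is the natural place to resolve collisions.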
Links CSS
| File | Score | Role |
|---|---|---|
| links-css | 4 | Link icons and annotation styling |
| links | 4 | CSS link styling rules |
5. Content Rendering
Loads, transforms, and displays content—the backbone of the frontend.
Core Frontend Framework
| File | Score | Role |
|---|---|---|
| initial-js | 5 | GW namespace, notification center (~1,335 lines) |
| rewrite-js | 5 | 80+ DOM transformation handlers |
| content-js | 4 | Polymorphic content loading |
| transclude-js | 4 | Include-link resolution |
| rewrite-initial-js | 3 | Early DOM rewrites |
| utility-js | 4 | General JavaScript utilities |
Backend Generators
| File | Score | Role |
|---|---|---|
| generate-backlinks-hs | 3 | Generates reverse citations |
| generate-similar-hs | 3 | RP-tree embedding search |
| generate-similar-links-hs | 3 | Generates similar link HTML |
| link-backlink-hs | 3 | Backlinks database I/O |
6. Typography & Layout
Text transformation, formatting, and page structure.
Typography Transforms
| File | Score | Role |
|---|---|---|
| typography-hs | 5 | Title case, wbr, rulers, citations |
| config-typography-hs | 3 | Typography settings |
| typography-js | 3 | Client-side typography enhancements |
Layout Components
| File | Score | Role |
|---|---|---|
| sidenotes-js | 4 | Sidenote positioning algorithm |
| collapse-js | 3 | Collapsible section handling |
| layout-js | 3 | Block layout primitives |
| columns-hs | 2 | Multi-column list detection |
Inflation Adjustment
| File | Score | Role |
|---|---|---|
| inflation-hs | 3 | Dollar/Bitcoin inflation adjustment |
| config-inflation-hs | 2 | CPI/PCE/Bitcoin rate data |
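Inflation adjustment reduces to scaling an amount by the ratio of two price-index levels: adjusted = amount × index(target year) / index(source year). The sketch below shows that arithmetic with a made-up index table; the `PRICE_INDEX` values are illustrative only and are not real CPI or PCE data, and `adjust` is a hypothetical helper.

```python
# Hypothetical price-index table (year -> index level); values are illustrative only.
PRICE_INDEX = {2000: 100.0, 2010: 125.0, 2020: 150.0}

def adjust(amount, from_year, to_year, index=PRICE_INDEX):
    """Convert `amount` in `from_year` money into `to_year` money."""
    return amount * index[to_year] / index[from_year]

# With the illustrative table, 100 units from 2000 correspond to 150 units in 2020.
print(adjust(100.0, 2000, 2020))
```

The same ratio works for any index series, so a Bitcoin exchange-rate table could be swapped in for the price index without changing the function.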
Image Processing
| File | Score | Role |
|---|---|---|
| image-hs | 3 | Dimensions, inversion detection, lazy loading |
| image-focus-js | 3 | Lightbox/image viewer |
| invertornot | 2 | GPT-4V inversion classifier |
| image-margin-checker | 1 | Image margin analysis |
| should-image-have-outline | 2 | Corner analysis for outline CSS |
| build-inlined-images | 2 | Base64 inline images |
7. Theming & UI
Visual customization, styling, and user interface.
Dark Mode
| File | Score | Role |
|---|---|---|
| dark-mode-js | 4 | Dark mode toggle and persistence |
| dark-mode-initial-js | 4 | Early dark mode (FOUC prevention) |
| dark-mode-adjustments | 3 | Dark mode image filters |
| dark-mode-adjustments-css | 3 | Frontend dark mode CSS |
| build-mode-css | 2 | Generates dark mode CSS |
Reader Mode
| File | Score | Role |
|---|---|---|
| reader-mode-js | 3 | Reader mode functionality |
| reader-mode-initial-js | 3 | Early reader mode setup |
| reader-mode-css | 3 | Reader mode CSS |
| reader-mode-initial | 3 | Initial reader mode CSS |
Colors & Styling
| File | Score | Role |
|---|---|---|
| color-js | 3 | Color manipulation utilities |
| colors | 4 | Light mode color variables |
| colors-css | 4 | Frontend color definitions |
| color-scheme-convert | 2 | Color scheme conversion |
Core CSS
| File | Score | Role |
|---|---|---|
| default | 4 | Main site stylesheet |
| default-css | 4 | Frontend default CSS |
| initial | 4 | Initial/critical path CSS |
| initial-css | 4 | Frontend initial CSS |
| light-mode-adjustments | 2 | Light mode specific adjustments |
Special Features
| File | Score | Role |
|---|---|---|
| special-occasions-js | 2 | Holiday themes (Halloween, Christmas) |
| special-occasions-css | 2 | Holiday CSS |
8. Utilities & Infrastructure
Supporting code, configuration, templates, and server setup.
Backend Utilities
| File | Score | Role |
|---|---|---|
| utils-hs | 5 | ~150 utility functions |
| query-hs | 3 | Pandoc AST queries |
| unique-hs | 2 | Duplicate detection |
| string-replace-hs | 2 | Parallel string replacement |
| text-regex-hs | 2 | Regex utilities and pattern matching |
| rename-hs | 2 | Page rename script generator |
| gwernnet-cabal | 2 | Cabal project configuration |
Frontend Utilities
| File | Score | Role |
|---|---|---|
| misc-js | 2 | Miscellaneous features |
| console-js | 2 | Console utilities |
| 404-guesser-js | 2 | 404 page redirect suggestions |
| gwtar-js | 2 | JavaScript tarball archive handling |
| gwtar-noscript-html | 1 | Noscript fallback for gwtar |
Tags & Navigation
| File | Score | Role |
|---|---|---|
| tags-hs | 4 | Tag management, directory listing |
| config-tags-hs | 3 | Tag aliases, hierarchy |
| guess-tag-hs | 2 | Tag suggestion from partial input |
| change-tag-hs | 2 | Batch tag operations |
Content Features
| File | Score | Role |
|---|---|---|
| blog-hs | 3 | Blog entry generation |
| x-of-the-day-hs | 2 | Quote/site/annotation of the day |
| config-x-of-the-day-hs | 2 | XOTD database paths |
PHP Asset Pipeline
| File | Score | Role |
|---|---|---|
| build_unified_assets | 4 | CSS/JS concatenation |
| build_asset_versions | 3 | Asset version manifest |
| build_font_css | 3 | Font CSS generation |
| build_versioned_font_css | 3 | Versioned font CSS |
| build_head_includes | 3 | Head HTML includes |
| build_body_includes | 3 | Body HTML includes |
| build_standalone_includes | 3 | Standalone page includes |
| build_paths | 2 | Build path constants |
| build_variables | 2 | Build variables |
| build_functions | 2 | Shared build utilities |
| version_asset_links | 2 | Cache busting |
| asset | 2 | Asset management |
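Cache busting usually means attaching a version token to each asset URL so that browsers refetch a file only when it changes. The sketch below shows the idea in Python for consistency with the other examples, although the real pipeline is PHP. The `VERSIONS` manifest, the `?v=` query parameter, and the helper name `version_links` are assumptions for illustration.

```python
import re

# Hypothetical version manifest: asset path -> version token.
VERSIONS = {"/static/css/default.css": "1700000000"}

def version_links(html, versions=VERSIONS):
    """Append ?v=<version> to href/src attributes whose path is in the manifest."""
    def add_version(match):
        attr, quote_char, path = match.group(1), match.group(2), match.group(3)
        version = versions.get(path)
        if version is None:
            return match.group(0)  # unknown asset: leave the attribute unchanged
        return f"{attr}={quote_char}{path}?v={version}{quote_char}"

    # Paths that already carry a query string or fragment do not match and are skipped.
    return re.sub(r'\b(href|src)=(["\'])([^"\'?#]+)\2', add_version, html)
```

Deriving the token from a file's modification time or content hash makes the URL change exactly when the asset does, which lets the server set long cache lifetimes safely.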
SVG/Font Processing
| File | Score | Role |
|---|---|---|
| svg-squeeze | 2 | SVG optimization |
| svg-strip-background | 1 | SVG background removal |
| font-spec | 2 | Font specification |
Python Utilities
| File | Score | Role |
|---|---|---|
| clean-pdf | 2 | GPT-4 OCR/formatting cleanup |
| daterange-checker | 1 | Date range validation |
| collapse-checker | 1 | Collapse validation |
| htmlAttributesExtract | 1 | HTML attribute extraction |
| latex2unicode | 2 | LaTeX to Unicode conversion |
| seriate | 2 | Optimal ordering for similarity matrices |
Shell Utilities
| File | Score | Role |
|---|---|---|
| embed | 3 | OpenAI embeddings API wrapper |
| upload | 2 | File upload with processing |
| download-title | 2 | Download and extract title |
| gwsed | 2 | Site-wide string replacement |
| compress-gif | 2 | GIF compression optimization |
| compress-png | 2 | PNG compression optimization |
Configuration Modules
| File | Score | Role |
|---|---|---|
| config-misc-hs | 3 | Global settings |
| config-paragraph-hs | 2 | Paragraph settings |
| config-link-suggester-hs | 2 | Link suggester config |
| config-generate-similar-hs | 2 | Embedding settings |
| nginx-redirect-guesser-hs | 1 | 404 redirect suggestions |
HTML Templates
| File | Score | Role |
|---|---|---|
| default | 4 | Main HTML template |
| sourcecode | 3 | Source code display template |
| annotation-blockquote-not | 2 | Annotation without blockquote |
| annotation-partial-inline | 2 | Partial annotation template |
| github-issue-blockquote-not | 2 | GitHub issue template |
| github-issue-blockquote-outside | 2 | GitHub issue template variant |
| google-cse | 2 | Google custom search template |
| google-search | 2 | Google search template |
| tweet-blockquote-not | 2 | Tweet template |
| tweet-blockquote-outside | 2 | Tweet template variant |
| wikipedia-entry-blockquote-inside | 2 | Wikipedia template |
| wikipedia-entry-blockquote-not | 2 | Wikipedia template variant |
| wikipedia-entry-blockquote-title-not | 2 | Wikipedia template variant |
Include Templates
| File | Score | Role |
|---|---|---|
| include-footer | 3 | Footer include |
| include-sidebar | 3 | Sidebar include |
| include-inlined-asset-links | 2 | Asset link include |
| include-inlined-head | 2 | Head include |
| include-inlined-standalone | 2 | Standalone include |
| template-html5-articleedit | 2 | Article edit template |
Server & Nginx
| File | Score | Role |
|---|---|---|
| gwern-net-conf | 4 | Main nginx configuration |
| redirect-nginx | 3 | Nginx redirect rules |
| redirect-nginx-broken | 2 | Broken redirect handling |
| memoriam-sh | 1 | Memorial system |
| rsyncd-conf | 2 | rsync daemon config |
| twdne-conf | 1 | TWDNE subdomain config |
Misc Templates
| File | Score | Role |
|---|---|---|
| idealconditionsdonotexistandwillneverhappen | 1 | Error template |
| unfortunatelytheclockisticking | 1 | Error template |
Data Flow Summary
┌───────────────────────────────────────────────────────────┐
│                   BUILD TIME (sync.sh)                    │
├───────────────────────────────────────────────────────────┤
│                                                           │
│  Markdown → Pandoc AST → Haskell Transforms → HTML + JSON │
│                                                           │
│  Key transforms:                                          │
│    • Typography.hs: text polish                           │
│    • LinkAuto.hs: auto-linking                            │
│    • LinkMetadata.hs: annotation marking                  │
│    • LinkArchive.hs: archive localization                 │
│    • LinkIcon.hs: icon assignment                         │
│                                                           │
│  Asset generation:                                        │
│    • PHP scripts: CSS/JS bundling                         │
│    • Annotation fragments: HTML for popups                │
│    • Embeddings: similar link computation                 │
│                                                           │
└───────────────────────────────────────────────────────────┘
                     ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓
┌───────────────────────────────────────────────────────────┐
│                     RUNTIME (Browser)                     │
├───────────────────────────────────────────────────────────┤
│                                                           │
│  initial.js → GW.notificationCenter (pub/sub event bus)   │
│                                                           │
│  Content events: GW.contentDidLoad → GW.contentDidInject  │
│                                                           │
│  Phase order: transclude → rewrite → eventListeners       │
│                                                           │
│  Key modules:                                             │
│    • extracts.js: popup/popin coordination                │
│    • popups.js: windowing system                          │
│    • transclude.js: include-link resolution               │
│    • rewrite.js: DOM transformations                      │
│                                                           │
└───────────────────────────────────────────────────────────┘
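The runtime half of the diagram describes a publish/subscribe bus whose handlers run in a fixed phase order. The sketch below models that idea in Python; it is not the JavaScript implementation, and the class name, method names, and default phase are hypothetical. Only the event names and the phase order (transclude, rewrite, eventListeners) come from the diagram.

```python
# Phase order taken from the diagram above.
PHASES = ["transclude", "rewrite", "eventListeners"]

class NotificationCenter:
    """Minimal pub/sub bus that runs handlers phase by phase."""

    def __init__(self):
        # event name -> list of (phase index, registration order, handler)
        self._handlers = {}

    def add_handler(self, event, handler, phase="rewrite"):
        entries = self._handlers.setdefault(event, [])
        entries.append((PHASES.index(phase), len(entries), handler))

    def fire(self, event, info=None):
        # Sort by phase first, then by registration order within a phase.
        ordered = sorted(self._handlers.get(event, []), key=lambda e: e[:2])
        for _, _, handler in ordered:
            handler(info or {})

log = []
center = NotificationCenter()
center.add_handler("GW.contentDidLoad", lambda info: log.append("listeners"), phase="eventListeners")
center.add_handler("GW.contentDidLoad", lambda info: log.append("rewrite"), phase="rewrite")
center.add_handler("GW.contentDidLoad", lambda info: log.append("transclude"), phase="transclude")
center.fire("GW.contentDidLoad")
print(log)
```

Handlers are registered here in reverse order on purpose: the phase sort, not registration order, decides that transclusion runs before rewrites and that event listeners attach last.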
Critical Path Files (Score 5)
These 12 files are essential; without them the site will not build or function:
| Category | File | Why Critical |
|---|---|---|
| Build | sync-sh | Master orchestrator |
| Build | hakyll-hs | Site generator |
| Build | bash-sh | Build helpers |
| Annotation | link-metadata-hs | Annotation system core |
| Annotation | annotation-hs | Scraper dispatcher |
| Popup | popups-js | Popup display system |
| Popup | extracts-js | Content coordination |
| Content | initial-js | Event system foundation |
| Content | rewrite-js | DOM transformation |
| Typography | typography-hs | Text transforms |
| Link | link-archive-hs | Archive system |
| Utilities | utils-hs | Shared helpers |
See Also
- Page Lifecycle - How a page transforms from Markdown to final HTML
- Documentation Home - Main index
- sync.sh - Master build orchestrator coordinating all phases
- hakyll.hs - Hakyll site generator and Pandoc pipeline
- LinkMetadata.hs - Central annotation database manager
- popups.js - Main popup windowing system
- extracts.js - Coordinates content for popups and popins
- Typography.hs - Text transforms: title case, citations, rulers