Config.XOfTheDay

Path: build/Config/XOfTheDay.hs | Language: Haskell | Lines: ~499

Configuration constants for the "X of the Day" feature: paths, thresholds, and domain exclusions


Overview

This module centralizes all configuration for gwern.net's "X of the Day" features—Quote of the Day, Site of the Day, and Annotation of the Day. Each feature displays a rotating selection of curated content on the homepage, drawn from metadata databases.

The module defines three categories of configuration: (1) file paths to the source databases and output HTML files, (2) quality thresholds that filter which entries qualify for rotation, and (3) a comprehensive domain blacklist preventing certain sites from appearing in the Site of the Day feature.

The blacklist is notably large (~460 domains). It excludes common platforms (GitHub, Wikipedia, Twitter/X), major publications (NYT, Guardian, Nature), academic repositories (arXiv, JSTOR), and gwern.net itself. The rationale is that these sites are either too common to be interesting discoveries or would produce self-referential recommendations.


Public API

quoteDBPath :: FilePath

Path to the quote database: metadata/quotes.hs

quotePath :: FilePath

Output path for today's quote: metadata/today-quote.html

siteDBPath :: FilePath

Path to the site database: metadata/sites.hs

sitePath :: FilePath

Output path for today's site: metadata/today-site.html

annotDayDB :: String

Path to the annotation database: metadata/annotations.hs

annotPath :: String

Output path for today's annotation: metadata/today-annotation.html

minAnnotationAbstractLength :: Int

Minimum character length for annotation abstracts to qualify: 2000

An inline comment records that this threshold yielded 3,313 qualifying entries, versus 10,046 at >500 characters (the latter measured on 2023-03-08).

siteLinkMin :: Int

Minimum number of links required for a site to qualify: 3

siteBlackList :: [T.Text]

List of domain names excluded from Site of the Day recommendations.
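
Taken together, the declarations listed above might look roughly like the following. This is a reconstruction from the documented names, types, and values, not a copy of the source file; in particular, the three blacklist entries are placeholder guesses standing in for the real ~460 domains.

```haskell
{-# LANGUAGE OverloadedStrings #-}
-- Sketch of the documented declarations; not the actual source file.
import qualified Data.Text as T

-- Database and output paths, as documented.
quoteDBPath, quotePath :: FilePath
quoteDBPath = "metadata/quotes.hs"
quotePath   = "metadata/today-quote.html"

siteDBPath, sitePath :: FilePath
siteDBPath = "metadata/sites.hs"
sitePath   = "metadata/today-site.html"

annotDayDB, annotPath :: String
annotDayDB = "metadata/annotations.hs"
annotPath  = "metadata/today-annotation.html"

-- Quality thresholds, as documented.
minAnnotationAbstractLength :: Int
minAnnotationAbstractLength = 2000

siteLinkMin :: Int
siteLinkMin = 3

-- ASSUMPTION: placeholder entries only; the exact domain strings are guesses.
siteBlackList :: [T.Text]
siteBlackList = ["arxiv.org", "github.com", "twitter.com"]
```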


Internal Architecture

The module contains no logic; it consists purely of declarative constants. The blacklist is a plain list literal of T.Text domain strings, so a membership check with elem is O(n); consumers that check many domains can convert it to a Set for O(log n) lookups.
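
To illustrate the two lookup strategies, here is a minimal sketch. The helper names (isBlacklistedList, blackListSet, isBlacklistedSet) are hypothetical and do not come from the module, and the short blacklist is a stand-in for the real one.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Set as Set
import qualified Data.Text as T

-- ASSUMPTION: a tiny stand-in for the real ~460-entry list.
siteBlackList :: [T.Text]
siteBlackList = ["arxiv.org", "github.com", "twitter.com"]

-- Linear scan over the list: O(n) per lookup.
isBlacklistedList :: T.Text -> Bool
isBlacklistedList domain = domain `elem` siteBlackList

-- Convert once; every later lookup is O(log n).
blackListSet :: Set.Set T.Text
blackListSet = Set.fromList siteBlackList

isBlacklistedSet :: T.Text -> Bool
isBlacklistedSet domain = domain `Set.member` blackListSet
```

With these sample entries, both functions return True for "github.com" and False for "example.com". At a few hundred entries either approach is cheap; the Set mainly pays off when a consumer tests thousands of candidate domains per build.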

Database File Format

The .hs extension on database files suggests they use Haskell's Read/Show serialization rather than JSON or YAML. This allows type-safe deserialization directly into Haskell data structures.
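
As a sketch of what Read/Show serialization looks like in practice, the following round-trips a small database through its textual form. The Quote record and the encodeDB/decodeDB helpers are hypothetical; the real database schema is not documented here.

```haskell
import Text.Read (readMaybe)

-- HYPOTHETICAL record type; the real schema may differ.
data Quote = Quote
  { quoteText   :: String
  , quoteAuthor :: String
  , quoteShown  :: Bool
  } deriving (Show, Read, Eq)

-- Serialize a database to the text that would be stored in a .hs file.
encodeDB :: [Quote] -> String
encodeDB = show

-- Parse the text back; malformed input yields Nothing rather than a crash.
decodeDB :: String -> Maybe [Quote]
decodeDB = readMaybe
```

Because the derived Read instance only accepts text that matches the type, a file with a missing field or a wrong type fails to parse as a whole instead of producing partially valid data.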

Output Files

The today-*.html files are pre-rendered HTML fragments included in page templates, regenerated during each build cycle.


Key Patterns

Threshold Tuning Comments: The module includes inline documentation about threshold selection rationale:

-- at >500, yielded 10,046 on 2023-03-08; >2,000 yielded a more reasonable 3,313

This captures the empirical reasoning for configuration choices, useful for future adjustments.
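
The kind of measurement behind that comment can be sketched as a count of abstracts exceeding each candidate threshold. The function names below are hypothetical, and the strict ">" comparison follows the wording of the comment rather than any confirmed implementation.

```haskell
-- Count the abstracts strictly longer than a candidate threshold.
countQualifying :: Int -> [String] -> Int
countQualifying threshold = length . filter ((> threshold) . length)

-- Pair each candidate threshold with its qualifying count, for comparison.
thresholdTable :: [Int] -> [String] -> [(Int, Int)]
thresholdTable thresholds abstracts =
  [ (t, countQualifying t abstracts) | t <- thresholds ]
```

Running thresholdTable [500, 2000] over the full set of abstracts would give the two counts needed to repeat the comparison after the database has grown.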

Domain Blacklist Organization: Domains are listed alphabetically, about five per line, which keeps diffs readable and reduces merge conflicts.


Configuration

All values are compile-time constants; changing one means editing the module and rebuilding. Current settings:

| Setting | Current Value | Effect |
| --- | --- | --- |
| minAnnotationAbstractLength | 2000 | Raises bar: fewer but meatier annotations |
| siteLinkMin | 3 | Ensures sites have meaningful presence in annotations |
| siteBlackList | ~460 domains | Prevents generic/common sites from appearing |

Integration Points

Consumers: The build system modules that generate X of the Day content import these constants:

  • Selection logic uses thresholds to filter candidates
  • HTML generators write to the configured output paths
  • Recommendation engine checks blacklist before selecting sites
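
The interaction between the link threshold and the blacklist can be sketched as follows. This is an illustration under assumptions, not the consuming module's actual logic: domainCounts and candidateSites are hypothetical names, the input is assumed to be a list of domains already extracted from links, and the blacklist is a two-entry stand-in.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set
import qualified Data.Text as T

siteLinkMin :: Int
siteLinkMin = 3

-- ASSUMPTION: stand-in entries for the real blacklist.
siteBlackList :: [T.Text]
siteBlackList = ["arxiv.org", "github.com"]

-- Count how many times each domain occurs in the input.
domainCounts :: [T.Text] -> Map.Map T.Text Int
domainCounts domains = Map.fromListWith (+) [ (d, 1) | d <- domains ]

-- Keep domains seen at least siteLinkMin times that are not blacklisted,
-- in ascending alphabetical order.
candidateSites :: [T.Text] -> [T.Text]
candidateSites domains =
  [ d
  | (d, n) <- Map.toAscList (domainCounts domains)
  , n >= siteLinkMin
  , d `Set.notMember` blocked
  ]
  where
    blocked = Set.fromList siteBlackList
```

Given three occurrences of "example.org", four of "github.com", and one of "rare.example", candidateSites returns only ["example.org"]: the second domain is blacklisted and the third falls below the link minimum.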

Related metadata files:

  • metadata/quotes.hs - Quote database
  • metadata/sites.hs - Site recommendation database
  • metadata/annotations.hs - Full annotation database

See Also

  • XOfTheDay.hs - Main module consuming this configuration
  • Hakyll.hs - Site generator that invokes XOTD generation
  • Annotation.hs - Populates the annotation database this module references
  • Config.Misc - Related configuration module with global constants
  • LinkMetadata.hs - Annotation database used for annotation-of-the-day
  • Blog.hs - Related daily content system for blog posts
  • sync.sh - Build orchestrator coordinating content generation