AI News Social

How this publication is produced

AI News Social is a weekly bilingual research publication on the social aspects of AI, AI literacy, AI tools, and higher education. Every edition is the output of a named, auditable pipeline. This page explains it at three levels of depth — pick the one you need.

Layer 1 · The short answer

Nine-criterion rubric, four categories, one Longer View, weekly.

Each week, a research pipeline searches roughly forty curated sources across news, academic repositories, and civic-tech outlets. Candidate articles are scored on nine inclusion criteria. Those that clear the threshold are analyzed across eight critical-thinking dimensions and sorted into four topical categories. A library-grounded editorial layer then produces a Longer View essay, category reports, audience briefings, podcasts, and columnist pieces. Every empirical claim carries a direct citation.

Editions published: 1 · Rubric criteria: 9 · CT dimensions: 8 · Categories: 4

Latest edition: April 26, 2026 · Last updated: April 27, 2026

Layer 2 · Rubric, categories, and grounding

What gets in, and how it's evaluated

The nine-criterion inclusion rubric

Every candidate article is scored on these nine criteria. Articles below a weighted threshold are excluded from the edition. Scores are preserved in the week's archive and exposed on the analysis page.

Criterion | What it rewards | Weight
Relevance | Direct thematic fit with the edition's four categories. | 10%
Source authority | Institutional, academic, or domain-expert provenance. | 10%
Recency | Published within the edition's calendar window. | 8%
Reasoning quality | Clear argumentative structure; identifiable claims and warrants. | 14%
Evidence | Primary sources, data, or documented reporting — not opinion-only. | 14%
Specificity | Concrete cases, numbers, or named actors beat generalities. | 10%
Argumentative depth | Engages counter-arguments and edge cases, not one-sided. | 12%
Methodological transparency | For research: disclosed method, sample, limitations. | 12%
Novelty | Adds something the edition does not already cover. | 10%
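Scoring against the rubric reduces to a weighted sum. A minimal sketch, assuming per-criterion scores on a 0–10 scale; the WEIGHTS dict and weighted_score helper are illustrative names, not the pipeline's actual code:

```python
# Rubric weights from the table above; they sum to 1.0.
WEIGHTS = {
    "relevance": 0.10,
    "source_authority": 0.10,
    "recency": 0.08,
    "reasoning_quality": 0.14,
    "evidence": 0.14,
    "specificity": 0.10,
    "argumentative_depth": 0.12,
    "methodological_transparency": 0.12,
    "novelty": 0.10,
}

def weighted_score(scores):
    """Weighted sum of per-criterion scores (criterion -> value on a 0-10 scale)."""
    return sum(WEIGHTS[c] * scores[c] for c in WEIGHTS)
```

With this shape, the inclusion decision is a single cutoff on the weighted score: an article scoring 10 on every criterion scores 10.0 overall.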

The four categories

SA

Social Aspects of AI

Labor, inequality, governance, public reasoning, cultural effects — the human consequences of AI deployment.

AIL

AI Literacy

What people need to know to reason about AI systems — conceptual, practical, critical.

AIT

AI Tools

Concrete capabilities, product releases, technical developments — what's actually shipping and how it works.

HE

Higher Education

The professional audience layer: teaching, research, governance, policy in colleges and universities.

The eight critical-thinking dimensions

After articles pass the rubric, each is analyzed along eight dimensions used to build the synthesis layer.

Library grounding

Editorial pieces (the Longer View, category essays, audience briefings, thinker columns) are written against the current week's accepted articles plus two read-only libraries of prior thinkers' work: a general English intellectual library and a LATAM intellectual library (for the Spanish edition). This keeps the editorial voice grounded in the tradition it draws from. Library retrieval is silent — the editorial never invokes a thinker who isn't retrievable in the corpus.
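The "never invoke a thinker who isn't retrievable" rule amounts to a simple gate before generation. A sketch under assumed names — corpus_index, mapping a thinker to their retrieved passages, is hypothetical, not the pipeline's actual data structure:

```python
def invokable_thinkers(candidates, corpus_index):
    """Keep only thinkers with at least one retrievable passage in the library corpus."""
    return [name for name in candidates if corpus_index.get(name)]
```

Anything filtered out here simply never reaches the editorial prompt, which is why the retrieval stays silent from the reader's point of view.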

Layer 3 · The full pipeline

Cell-by-cell: what runs each week

The pipeline is organized into numbered stages. Each stage is a set of Python "cells" — small, restartable modules. A full run takes roughly 8–12 hours wall-clock, bottlenecked by the evaluation and long-form generation steps. Below is the run order.

STAGE 00  Config + orchestrator bring-up
STAGE 10  Search: SerpAPI + curated RSS + academic feeds
          → raw_search_results/*.json
STAGE 20  Extraction: Diffbot + fallback scrapers
          → accepted_articles_*.csv (pre-rubric pool)
STAGE 25  Editorial instinct (V3 additions):
          25_01 intellectual_lineage
          25_02 knowledge_gap
          25_03 topic_trajectory
          25_04 topic_arc_analysis      ← Longer View engine
          25_05 longer_view_topic_picker
STAGE 30  Scoring: nine-criterion rubric + dimension tagging
          → accepted_articles_*.csv (post-rubric, ranked)
STAGE 40  Content synthesis:
          40_01 briefing_context (per category)
          40_02 podcast_generation (Opus public_content)
          40_03 thinker_columns (McLuhan / Toffler / Asimov|Freire)
STAGE 50  Editorial essays (per category; Opus, library-grounded)
          50_04 named-thinker deep columns
STAGE 60  Publishing:
          60_01 briefings (4 HE audiences)
          60_02 essays (4 categories + Longer View)
          60_03–06 PDF compile (xelatex → Tufte layout)
          60_07 HTML assembly (template_v3.html)
          60_08 longer_view_en + longer_view_es
          60_09 methodology page refresh
STAGE 80  Audio production: Chatterbox multilingual
          80_01 podcast_images
          80_02 podcast_composition (TTS + stitching)
          80_03 website_video_prep
STAGE 90  Deploy + archive:
          90_01 archive_previous (COPY semantics; landmine-safe)
          90_02 deploy_website (FTP to ainews.social)
          90_02_5 email_buttondown (draft to EN + ES lists)
          90_03 post_deploy_verify
          90_04 youtube_upload
          90_05 corpus_update (sources + work planes)
          90_06 weekly_intelligence_archive
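Restartability is what makes a stage list like the one above practical: a cell that already completed is skipped on rerun. A minimal sketch of that idea — run_stage, the .done marker files, and the run_state/ directory are illustrative, not the orchestrator's actual implementation:

```python
from pathlib import Path

def run_stage(stage_id, cells, state_dir=Path("run_state")):
    """Run each (name, fn) cell unless a completion marker already exists."""
    state_dir.mkdir(exist_ok=True)
    for name, fn in cells:
        marker = state_dir / f"{stage_id}_{name}.done"
        if marker.exists():
            continue          # completed in a previous run; skip
        fn()                  # execute the cell
        marker.touch()        # record completion so a rerun skips it
```

Under this scheme an interrupted 8–12 hour run resumes from the first cell without a marker rather than from STAGE 00.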
Model routing

All model selection goes through config/models.yaml, never hard-coded into cells.

  • Public-facing content (editorials, essays, briefings, thinker columns, podcast scripts, Longer View): Anthropic Claude Opus — current routing: public_content → claude-opus-4-7
  • Image prompts (editorial illustrations, category hero images): Opus 4.6 (explicit override — see feedback note)
  • Image generation: Google Gemini Nano Banana Pro
  • Analytical work (scoring, dimension extraction, synthesis at scale): DeepSeek cloud chat + DeepSeek-R1 reasoner
  • Embeddings: OpenAI text-embedding-3-large (3,072-dim) for 2026-vintage collections; 1,536-dim retained for legacy stats
  • TTS: Chatterbox multilingual (local GPU) for production; ElevenLabs for one-time voice design only
  • STT / audio alignment: Faster-Whisper (local GPU)
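Routing through config/models.yaml means a cell requests a role, never a model name. A sketch with the routing shown as an already-parsed dict (in production it would come from something like yaml.safe_load); the dict shape and resolve_model are assumptions, though the public_content entry matches the routing listed above:

```python
# Parsed form of a hypothetical config/models.yaml; only the
# public_content entry is taken from the routing listed above.
MODEL_ROUTING = {
    "public_content": "claude-opus-4-7",
    "analysis": "deepseek-reasoner",
    "embeddings": "text-embedding-3-large",
}

def resolve_model(role):
    """Cells ask for a role; the concrete model name never appears in cell code."""
    if role not in MODEL_ROUTING:
        raise KeyError(f"no model routed for role {role!r}")
    return MODEL_ROUTING[role]
```

Swapping a model then touches one config file, not every cell that calls it.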

Reliability & audit

Every week's rubric scores are kept in the edition's archive and exposed on the /analysis/{week_id}/ page. Cronbach's alpha for the nine-criterion rubric is computed from paired independent re-scoring on a random sub-sample each week. Current operational target: α ≥ 0.75 across criteria. Rubric refinements are versioned in config/rubric/.
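For a k-criterion rubric, Cronbach's alpha is α = k/(k−1) · (1 − Σ per-criterion variances / variance of total scores). A self-contained sketch of that computation (not the pipeline's code):

```python
from statistics import variance

def cronbach_alpha(scores):
    """scores: one row per re-scored article, one column per rubric criterion."""
    k = len(scores[0])
    item_vars = [variance(column) for column in zip(*scores)]  # per-criterion spread
    total_var = variance([sum(row) for row in scores])         # spread of total scores
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```

Perfectly consistent re-scoring yields α = 1.0; the operational target here is α ≥ 0.75.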

Editorial grounding is audited by the citation validator (orchestrator/citation_validator): every numerical claim or direct assertion in Longer View / essays / briefings must resolve to a bracketed reference pointing at an accepted article or a library passage. Unresolved citations are flagged at 60_07 and block deploy.
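The validator's core check can be sketched as set arithmetic over bracketed reference numbers; the regex and the unresolved_citations name are illustrative, not the orchestrator's actual implementation:

```python
import re

BRACKET_REF = re.compile(r"\[(\d+)\]")

def unresolved_citations(text, accepted_ids):
    """Bracketed reference numbers that resolve to no accepted article or library passage."""
    cited = {int(n) for n in BRACKET_REF.findall(text)}
    return sorted(cited - set(accepted_ids))
```

A non-empty result for any editorial piece is what gets flagged at 60_07 and blocks deploy.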

Chat grounding & privacy

The per-edition chat widget is grounded only in the current week's edition — Longer View, category reports and essays, audience briefings, podcast transcripts, thinker columns, and accepted articles. It does not retrieve from prior weeks unless you explicitly ask it to compare editions.

Conversations are not used to train models. They are stored in short-lived logs for abuse prevention only, then discarded. No personal data is required to chat. The email subscription is handled by Buttondown under their privacy terms; we do not sell, share, or enrich subscriber lists.

Open-source and reproducibility

The pipeline is published under CC-BY-NC for the content and MIT-style permissive licensing for reusable library code. Full source: github.com/diegobonilla/ai-news-social (V3 branch). The prompt templates that drive the editorial layer live in config/prompts_v2/ and are versioned alongside every released edition.

Questions

If something here is unclear — or if you think we're getting something wrong — write to diego@csus.edu. Corrections and criticisms are welcome and acknowledged.