Xu Ben Project: An Open Research Log — June 2026

June 17, 2026

This is a research log, not a finished paper. It documents how a vague intuition — "Xu Ben seems to talk about civilisation more than institutions these days" — was tested, corrected, and rebuilt into a different kind of question over two days of intensive AI-assisted research.

The most important results are not about Xu Ben. They are about why the obvious analytical approach — count keywords, track trends — is inadequate, and what a better approach looks like.

The Original Question

Why did Chinese-language public intellectuals in the 2020s appear to shift from specific institutional, historical, and social analysis toward diagnostic discourse centred on "subjectivity," "the complete human," "civilisational crisis," and "meaning"?

Xu Ben (徐贲), a Chinese-American professor who has published extensively in both mainland Chinese media and overseas, was chosen as the entry point — not as a representative case, but as one author whose output is large enough and spans enough platforms to test the question.

How The Hypothesis Changed

The project began with a simple expectation: aggregate keyword frequency would show a decline in institutional vocabulary (制度 "institution," 公民 "citizen," 公共 "public") and a rise in diagnostic vocabulary (人类 "humanity," 文明 "civilisation," 意义 "meaning," 主体性 "subjectivity") across Xu Ben's writing over time.
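The kind of count behind this expectation can be sketched in a few lines. This is a minimal illustration, not the project's actual scoring script: the function names are hypothetical, the keyword lists are only the examples named above, and the definition of "diagnostic share" as diagnostic hits divided by all tracked hits is an assumption.

```python
# Keyword groups taken from the examples in this log; the project's real
# keyword groups (v0.2) are not published here and may be larger.
INSTITUTIONAL = ("制度", "公民", "公共")
DIAGNOSTIC = ("人类", "文明", "意义", "主体性")


def count_hits(text, keywords):
    """Total non-overlapping occurrences of every keyword in the text."""
    return sum(text.count(kw) for kw in keywords)


def density_per_10k(text, keywords):
    """Keyword hits per 10,000 characters (all characters counted: an assumption)."""
    if not text:
        return 0.0
    return count_hits(text, keywords) * 10_000 / len(text)


def diagnostic_share(text):
    """Diagnostic hits as a fraction of all tracked hits (assumed definition)."""
    inst = count_hits(text, INSTITUTIONAL)
    diag = count_hits(text, DIAGNOSTIC)
    total = inst + diag
    return diag / total if total else None
```

Normalising by length matters because articles differ greatly in size; raw counts would mostly track how long a piece is.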

This expectation survived first contact with data — barely. Then it was corrected five times.

Correction 1: Genre Mix Effect

The aggregate diagnostic-share trend (25% → 31% → 35%) across three periods was partly a composition effect. Within academic essays (论文) alone, the trend flattened at 29% in 2021-2026. The aggregate was pulled up by a small number of high-diagnostic book reviews (读书). Genre-stratified reporting became mandatory.
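A composition effect of this kind is easy to reproduce with toy numbers. The sketch below uses invented counts, not project data, and hypothetical field names (`genre`, `diag`, `inst`); it shows a pooled share rising between two periods while the share inside each genre stays the same.

```python
from collections import defaultdict


def pooled_share(records):
    """Diagnostic share over all records, ignoring genre."""
    diag = sum(r["diag"] for r in records)
    inst = sum(r["inst"] for r in records)
    total = diag + inst
    return diag / total if total else None


def share_by_genre(records):
    """Diagnostic share computed separately within each genre."""
    totals = defaultdict(lambda: [0, 0])
    for r in records:
        totals[r["genre"]][0] += r["diag"]
        totals[r["genre"]][1] += r["inst"]
    return {g: d / (d + i) for g, (d, i) in totals.items() if d + i}


# Invented counts: within-genre shares are identical in both periods,
# but the later period contains far more book-review text.
early = [
    {"genre": "essay", "diag": 30, "inst": 70},
    {"genre": "review", "diag": 6, "inst": 4},
]
late = [
    {"genre": "essay", "diag": 30, "inst": 70},
    {"genre": "review", "diag": 60, "inst": 40},
]
```

Here `share_by_genre` returns the same values for both periods, while `pooled_share` is higher for `late` simply because reviews make up more of the material. Reporting both side by side is one way to keep the two effects apart.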

Correction 2: Platform Selection Effect

The decline in institutional vocabulary seen on Aisixiang (爱思想, a mainland academic repost platform), from 99 to 52 per 10,000 characters, did not appear on the Caixin blog (财新博客) over the same period (71 → 69, essentially flat). A paired comparison of the 14 articles that appear on both platforms showed nearly identical keyword density for each article. The divergence therefore comes from content selection (which articles each platform collected), not from platform editing.
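The paired check can be sketched as follows. It assumes each platform has already been reduced to a mapping from article title to keyword density, and that exact title equality identifies the same article; both are simplifying assumptions, and `paired_density_gap` is a hypothetical name.

```python
from statistics import mean


def paired_density_gap(platform_a, platform_b):
    """Compare keyword density only for articles present on both platforms.

    Each argument maps an article title to its density per 10,000 characters.
    """
    shared = sorted(platform_a.keys() & platform_b.keys())
    diffs = [platform_a[t] - platform_b[t] for t in shared]
    if not diffs:
        return {"pairs": 0, "mean_diff": None, "max_abs_diff": None}
    return {
        "pairs": len(shared),
        "mean_diff": mean(diffs),
        "max_abs_diff": max(abs(d) for d in diffs),
    }
```

If the per-article gaps are close to zero while the platform-level averages differ, the difference has to come from which articles each platform holds, since the shared articles themselves score the same.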

Correction 3: Publication Pipeline Effect

Checking book tables of contents against platform article titles revealed that essay collections showed 46–89% chapter-article overlap. The Caixin blog functioned as a draft workspace for one book (《颓废与沉默》, "Decadence and Silence," 2015); Aisixiang functioned as an archive layer for another (《人以什么理由来记忆》, "For What Reason Do People Remember," 2008). Different platforms serve different books at different lifecycle stages. The "unit of analysis" shifted from the individual article to the publication pipeline.
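One way to compute such an overlap is to normalise titles and test for exact matches. This is a sketch under assumptions: the log does not say how matches were decided, so the normalisation rule and the names `normalize_title` and `chapter_overlap` are illustrative only. Retitled or merged chapters would need fuzzy matching or manual checking.

```python
import re
import unicodedata


def normalize_title(title):
    """Fold full-width forms, then drop punctuation, spaces and case."""
    title = unicodedata.normalize("NFKC", title)
    return re.sub(r"[\W_]+", "", title).lower()


def chapter_overlap(chapters, articles):
    """Return the chapters whose normalised title matches an article title,
    plus the matched share of all chapters."""
    article_keys = {normalize_title(a) for a in articles}
    matched = [c for c in chapters if normalize_title(c) in article_keys]
    share = len(matched) / len(chapters) if chapters else 0.0
    return matched, share
```

For example, a chapter titled `记忆：伦理` (full-width colon) and an article titled `记忆:伦理` would count as the same item; both titles are made up for illustration.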

Correction 4: The Same Word Does Different Work

Close reading of 6 articles (assisted by four AI reviewers across multiple rounds, with errors exposed and corrected along the way) revealed that the word 制度 performs at least four distinct text functions:

Table 1
Type | What 制度 does
Institution-building | Constructive framework: civic education, democratic governance
Institution-failure analysis | Domination mechanism: propaganda, totalitarian education
Public-intellectual-failure diagnosis | Background: focus on intellectual silence, cynicism
Humanistic-meaning diagnosis | Exits the frame: AI, human subjectivity, civilisation

The same author could produce high-institutional and high-diagnostic texts in the same year (2016). The "decline" was not abandonment; it was a shift in which text function dominated.

Correction 5: Batch-Import Dates Distort Chronology

Aisixiang's "update time" (更新时间) includes batch-import events. One day in 2008 accounts for 12 articles uploaded within a single hour — clearly not a natural publication pattern. These are archive-loading dates, not writing dates. Any time-series analysis must detect and flag such clusters.
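Flagging such clusters can be done with a sliding window over sorted timestamps. The sketch below is one possible approach, not the project's own script; the one-hour window follows the example above, while the threshold of five articles and the name `flag_batch_imports` are assumptions.

```python
from datetime import datetime, timedelta


def flag_batch_imports(timestamps, window=timedelta(hours=1), min_size=5):
    """Return the timestamps that belong to a run of at least `min_size`
    items falling within `window` of each other."""
    ordered = sorted(timestamps)
    flagged = set()
    start = 0
    for end in range(len(ordered)):
        # Shrink the window from the left until it spans at most `window`.
        while ordered[end] - ordered[start] > window:
            start += 1
        if end - start + 1 >= min_size:
            flagged.update(ordered[start:end + 1])
    return flagged


# Placeholder dates for illustration only: 12 uploads five minutes apart,
# plus three isolated updates on other days.
batch = [datetime(2008, 1, 1, 9, 0) + timedelta(minutes=5 * i) for i in range(12)]
isolated = [datetime(2008, 3, 1, 10, 0), datetime(2009, 6, 2, 14, 30), datetime(2010, 1, 5, 8, 15)]
flagged = flag_batch_imports(batch + isolated)
```

In this toy input only the 12 clustered timestamps are flagged. Flagged items can then be excluded from time-series plots or marked as "archive date, writing date unknown" instead of being silently dropped.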

The Current Model

Xu Ben's public writing is best modelled as a circulation system. The same author produces different text types (institution-building, institution-failure analysis, public-intellectual-failure diagnosis, and humanistic-meaning diagnosis) across different platforms, genres, and publication cycles. The visible vocabulary shift is real in the materials sampled so far, but it reflects text-function composition and pipeline position, not only intellectual trajectory.

The strongest finding is methodological: aggregate keyword trends mislead if they do not control for text function and lifecycle position.

Research Timeline

Table 2
Date | Stage | Key Event
2026-06-16 | Seed note | Published observation notes on Xu Ben's 2025-2026 AI-era writings
2026-06-16 | Aisixiang pilot | 89 articles across 5 genres; keyword groups v0.2; genre-stratified analysis
2026-06-16 | Methodology | Literature review (9 clusters); peer review by 3 AI reviewers
2026-06-17 | Caixin comparison | 42 Caixin articles; cross-platform paired comparison (14 pairs)
2026-06-17 | Book overlap | 4 book TOCs; 57 chapter-article matches; lifecycle direction analysis
2026-06-17 | Close reading | 6 articles × 4 AI reviewers; four-type text-function model
2026-06-17 | Model consolidation | v0.2 methodology summary; codebook; manual coding

What This Project Does Not Claim

  • "Xu Ben abandoned institutional analysis." (2016 samples prove otherwise.)
  • "Aisixiang proves a general author-level transformation." (Caixin does not show the same trend.)
  • "Censorship caused the diagnostic turn." (No evidence of same-article political editing; the mechanism is content selection, not content modification.)
  • "Chinese public intellectuals as a group shifted in the same way." (Only one author studied.)
  • "Media agenda explains the shift." (No empirical media-agenda data collected yet.)

Public Research Files

The following cleaned research files are available for review:

Full article texts, raw crawl data, AI reviewer outputs, and internal working notes are retained locally but not published (copyright, verification status, and relevance concerns).

AI-Assisted Research Statement

This project was conducted by the site author with assistance from multiple AI systems:

  • Claude Code (cc): data collection scripts, keyword scoring, cross-platform analysis, statistical comparison, close-reading file preparation, version control
  • Codex (coco): methodology design, findings consolidation, media-agenda framework, four-type text-function model proposal, editorial oversight
  • GPT: close reading (2 slots), theoretical framing, critical review
  • Gemini: close reading (1 slot + predictions), corrected twice on platform-editing claims
  • Grok: close reading (all 6 slots), self-corrected after re-reading, operational suggestions

AI systems contributed analysis, pattern detection, literature orientation, and error checking. AI systems did not make final interpretive judgments — those were made by the human researcher based on data, close reading, and cross-verification between AI outputs. The most valuable AI contribution was not agreement but disagreement: divergent readings between reviewers exposed incorrect assumptions (e.g., the "platform censorship" explanation) that would have survived unchecked in a single-reviewer workflow.

All research decisions, hypothesis revisions, and publication choices are the responsibility of the human author.