Xu Ben Project: An Open Research Log — June 2026 | ygpgsgl

This is a research log, not a finished paper. It documents how a vague intuition — "Xu Ben seems to talk about civilisation more than institutions these days" — was tested, corrected, and rebuilt into a different kind of question over two days of intensive AI-assisted research.

The most important results are not about Xu Ben. They are about why the obvious analytical approach — count keywords, track trends — is inadequate, and what a better approach looks like.

The Original Question

Why did Chinese-language public intellectuals in the 2020s appear to shift from specific institutional, historical, and social analysis toward diagnostic discourse centred on "subjectivity," "the complete human," "civilisational crisis," and "meaning"?

Xu Ben (徐贲), a Chinese-American professor who has published extensively in both mainland Chinese media and overseas, was chosen as the entry point — not as a representative case, but as one author whose output is large enough and spans enough platforms to test the question.

How The Hypothesis Changed

The project began with a simple expectation: aggregate keyword frequency would show a decline in institutional vocabulary (制度, 公民, 公共) and a rise in diagnostic vocabulary (人类, 文明, 意义, 主体性) across Xu Ben's writing over time.
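The expectation above rests on a simple scoring step, which can be sketched as follows. This is a minimal illustration, not the project's scoring script: the two keyword lists are only the examples named in the text (the actual Keyword Groups v0.2 file may contain more terms), and the definition of "diagnostic share" as diagnostic hits divided by diagnostic plus institutional hits is an assumption.

```python
# Example keyword groups copied from the text; the real groups may differ.
INSTITUTIONAL = ["制度", "公民", "公共"]
DIAGNOSTIC = ["人类", "文明", "意义", "主体性"]


def strip_whitespace(text: str) -> str:
    """Remove all whitespace so that layout does not affect character counts."""
    return "".join(text.split())


def count_hits(text: str, keywords: list[str]) -> int:
    """Count non-overlapping occurrences of every keyword in the cleaned text."""
    clean = strip_whitespace(text)
    return sum(clean.count(kw) for kw in keywords)


def density_per_10k(text: str, keywords: list[str]) -> float:
    """Keyword hits per 10,000 characters of cleaned text."""
    chars = len(strip_whitespace(text))
    if chars == 0:
        return 0.0
    return count_hits(text, keywords) / chars * 10_000


def diagnostic_share(text: str) -> float:
    """Assumed definition: diagnostic / (diagnostic + institutional) hits."""
    diag = count_hits(text, DIAGNOSTIC)
    inst = count_hits(text, INSTITUTIONAL)
    total = diag + inst
    return diag / total if total else 0.0
```

Scoring each article this way and averaging by period would give the kind of aggregate trend the project started from, which is exactly the measure the corrections below call into question.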

This expectation survived first contact with data — barely. Then it was corrected five times.

Correction 1: Genre Mix Effect

The aggregate diagnostic-share trend (25% → 31% → 35%) across three periods was partly a composition effect. Within academic essays (论文) alone, the trend flattened at 29% in 2021-2026. The aggregate was pulled up by a small number of high-diagnostic book reviews (读书). Genre-stratified reporting became mandatory.
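A composition effect of this kind is easy to reproduce with toy numbers. The sketch below computes diagnostic share for any grouping of rows, so the same function gives both the aggregate and the genre-stratified view. The row layout (`period`, `genre`, `diag`, `inst`) and the counts are invented for illustration and are not the project's data.

```python
from collections import defaultdict


def share_by(rows, *keys):
    """Diagnostic share per group, where a group is the tuple of the given keys."""
    totals = defaultdict(lambda: [0, 0])  # group -> [diag hits, inst hits]
    for row in rows:
        group = tuple(row[k] for k in keys)
        totals[group][0] += row["diag"]
        totals[group][1] += row["inst"]
    return {g: d / (d + i) for g, (d, i) in totals.items() if d + i > 0}


# Toy data: essays are unchanged between periods, but period P2 adds
# a few high-diagnostic reviews.
rows = [
    {"period": "P1", "genre": "essay", "diag": 30, "inst": 70},
    {"period": "P2", "genre": "essay", "diag": 30, "inst": 70},
    {"period": "P2", "genre": "review", "diag": 40, "inst": 10},
]

aggregate = share_by(rows, "period")            # rises from P1 to P2
stratified = share_by(rows, "period", "genre")  # essays stay flat
```

In this toy case the aggregate share rises between periods even though the essay share does not move at all, which is why the stratified table has to be reported alongside the aggregate.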

Correction 2: Platform Selection Effect

The institutional vocabulary decline seen on Aisixiang (爱思想, a mainland academic repost platform) — from 99 to 52 per 10,000 characters — did not appear on the Caixin blog (财新博客) for the same period (71 → 69, essentially flat). Paired same-article comparison (14 articles appearing on both platforms) showed nearly identical keyword density, confirming the divergence was content selection — which articles each platform collected — not platform editing.
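The paired check can be sketched as a per-article difference in keyword density between the two copies of the same text. The function names and the input format (a list of density pairs, one pair per article) are assumptions made for this illustration; the project's own comparison may use a different statistic.

```python
from statistics import mean, median


def paired_differences(pairs):
    """pairs: (density on platform A, density on platform B) for the same article."""
    return [a - b for a, b in pairs]


def summarize_pairs(pairs):
    """Summary of how far the two copies of each article diverge."""
    diffs = paired_differences(pairs)
    return {
        "n": len(diffs),
        "mean_diff": mean(diffs),
        "median_diff": median(diffs),
        "max_abs_diff": max(abs(d) for d in diffs),
    }
```

The logic of the test is that if per-article differences cluster near zero while the platform-level averages still diverge, the divergence has to come from which articles each platform holds, not from edits to the articles themselves.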

Correction 3: Publication Pipeline Effect

Checking book tables of contents against platform article titles revealed that essay collections showed 46–89% chapter-article overlap. The Caixin blog functioned as a draft workspace for one book (《颓废与沉默》, 2015); Aisixiang functioned as an archive layer for another (《人以什么理由来记忆》, 2008). Different platforms serve different books at different lifecycle stages. The "unit of analysis" shifted from the individual article to the publication pipeline.
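One way to compute a chapter-article overlap rate is exact matching on normalized titles, sketched below. This is an assumption about method: the log does not say how titles were matched, and real matching would probably need fuzzy comparison to catch retitled chapters. The titles in the usage example are placeholders, not actual chapter names.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """NFKC-normalize, lowercase, and drop punctuation, underscores, and whitespace."""
    folded = unicodedata.normalize("NFKC", title).lower()
    return re.sub(r"[\W_]+", "", folded)


def chapter_overlap(chapters, articles):
    """Return (share of chapters with a matching article title, matched chapters)."""
    article_keys = {normalize_title(a) for a in articles}
    matched = [c for c in chapters if normalize_title(c) in article_keys]
    rate = len(matched) / len(chapters) if chapters else 0.0
    return rate, matched


# Placeholder titles: full-width vs half-width punctuation should still match.
chapters = ["章节甲", "章节乙：副题", "章节丙"]
articles = ["章节甲", "章节乙:副题 ", "其他文章"]
rate, matched = chapter_overlap(chapters, articles)
```

Running the same function once per book and per platform would show which platform each collection draws on most heavily.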

Correction 4: The Same Word Does Different Work

Close reading of 6 articles (assisted by four AI reviewers across multiple rounds, with errors exposed and corrected along the way) revealed that the word 制度 performs at least four distinct text functions:

Table 1

Type	What 制度 does
Institution-building	Constructive framework: civic education, democratic governance
Institution-failure analysis	Domination mechanism: propaganda, totalitarian education
Public-intellectual-failure diagnosis	Background: focus on intellectual silence, cynicism
Humanistic-meaning diagnosis	Exits the frame: AI, human subjectivity, civilisation

The same author could produce high-institutional and high-diagnostic texts in the same year (2016). The "decline" was not abandonment; it was a shift in which text function dominated.

Correction 5: Batch-Import Dates Distort Chronology

Aisixiang's "update time" (更新时间) includes batch-import events. One day in 2008 accounts for 12 articles uploaded within a single hour — clearly not a natural publication pattern. These are archive-loading dates, not writing dates. Any time-series analysis must detect and flag such clusters.
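Detecting such clusters can be done with a sliding window over sorted timestamps, as in the sketch below. The one-hour window and the threshold of five articles are illustrative assumptions, not values taken from the project, and the dates in the usage example are hypothetical.

```python
from datetime import datetime, timedelta


def flag_batch_imports(timestamps, window=timedelta(hours=1), min_count=5):
    """Return the timestamps that fall in any window holding >= min_count items."""
    ts = sorted(timestamps)
    flagged = set()
    start = 0
    for end in range(len(ts)):
        # Shrink the window from the left until it spans at most `window`.
        while ts[end] - ts[start] > window:
            start += 1
        if end - start + 1 >= min_count:
            flagged.update(range(start, end + 1))
    return [ts[i] for i in sorted(flagged)]


# Hypothetical example: 12 uploads within one hour, plus two isolated dates.
base = datetime(2008, 1, 1, 10, 0)
batch = [base + timedelta(minutes=5 * i) for i in range(12)]
isolated = [datetime(2008, 3, 1, 9, 0), datetime(2008, 6, 15, 14, 30)]
suspect = flag_batch_imports(batch + isolated)
```

Flagged timestamps should be treated as archive-loading dates and either excluded from the time series or replaced with an independently sourced writing date.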

The Current Model

Xu Ben's public writing is best modelled as a circulation system. The same author produces four text types (institution-building, institution-failure analysis, public-intellectual-failure diagnosis, and humanistic-meaning diagnosis) across different platforms, genres, and publication cycles. The visible vocabulary shift is real in the current sampled materials, but it reflects text-function composition and pipeline position as well as intellectual trajectory.

The strongest finding is methodological: aggregate keyword trends mislead if they do not control for text function and lifecycle position.

Research Timeline

Table 2

Date	Stage	Key Event
2026-06-16	Seed note	Published observation notes on Xu Ben's 2025-2026 AI-era writings
2026-06-16	Aisixiang pilot	89 articles across 5 genres; keyword groups v0.2; genre-stratified analysis
2026-06-16	Methodology	Literature review (9 clusters); peer review by 3 AI reviewers
2026-06-17	Caixin comparison	42 Caixin articles; cross-platform paired comparison (14 pairs)
2026-06-17	Book overlap	4 book TOCs; 57 chapter-article matches; lifecycle direction analysis
2026-06-17	Close reading	6 articles × 4 AI reviewers; four-type text-function model
2026-06-17	Model consolidation	v0.2 methodology summary; codebook; manual coding

What This Project Does Not Claim

"Xu Ben abandoned institutional analysis." (2016 samples prove otherwise.)
"Aisixiang proves a general author-level transformation." (Caixin does not show the same trend.)
"Censorship caused the diagnostic turn." (No evidence of same-article political editing; the mechanism is content selection, not content modification.)
"Chinese public intellectuals as a group shifted in the same way." (Only one author has been studied.)
"Media agenda explains the shift." (No empirical media-agenda data collected yet.)

Public Research Files

The following cleaned research files are available for review:

Methodology Summary v0.2 — 2-page overview: hypothesis → corrections → current model
Current Findings — 8 findings with evidence and interpretation
Source Map — Platform survey across Aisixiang, Caixin, CDT, and others
Publication Map — Xu Ben's book bibliography with publication clusters
Book-Article Overlap Table — Which chapters match which platform articles
Text-Function Codebook v0.1 — Coding rules for the four text-function types
Keyword Groups v0.2 — The vocabulary classification used for scoring

Full article texts, raw crawl data, AI reviewer outputs, and internal working notes are retained locally but not published, for reasons of copyright, verification status, and relevance.

AI-Assisted Research Statement

This project was conducted by the site author with assistance from multiple AI systems:

Claude Code (cc): data collection scripts, keyword scoring, cross-platform analysis, statistical comparison, close-reading file preparation, version control
Codex (coco): methodology design, findings consolidation, media-agenda framework, four-type text-function model proposal, editorial oversight
GPT: close reading (2 slots), theoretical framing, critical review
Gemini: close reading (1 slot + predictions), corrected twice on platform-editing claims
Grok: close reading (all 6 slots), self-corrected after re-reading, operational suggestions

AI systems contributed analysis, pattern detection, literature orientation, and error checking. AI systems did not make final interpretive judgments — those were made by the human researcher based on data, close reading, and cross-verification between AI outputs. The most valuable AI contribution was not agreement but disagreement: divergent readings between reviewers exposed incorrect assumptions (e.g., the "platform censorship" explanation) that would have survived unchecked in a single-reviewer workflow.

All research decisions, hypothesis revisions, and publication choices are the responsibility of the human author.