Using Generative AI as Early Peer Review

Cheap Adversaries, Outsourced Ego, and Engineered Critique ← ChatGPT is obsessed with subtitles.

There is a peculiar anxiety around admitting that one uses generative AI in serious intellectual work. The anxiety usually takes one of two forms. Either the AI is accused of replacing thinking, or it is accused of flattering the thinker into delusion. Both charges miss the point, and both underestimate how brittle early-stage human peer review often is.

What follows is not a defence of AI as an oracle, nor a claim that it produces insight on its own. It is an account of how generative models can be used – deliberately, adversarially, and with constraints – as a form of early peer pressure. Not peer review in the formal sense, but a rehearsal space where ideas are misread, overstated, deflated, and occasionally rescued from themselves.

Audio: NotebookLM summary podcast of this topic.

The unromantic workflow

The method itself is intentionally dull:

  1. Draft a thesis statement.
    Rinse & repeat.
  2. Draft an abstract.
    Rinse & repeat.
  3. Construct an annotated outline.
    Rinse & repeat.
  4. Only then begin drafting prose.

At each stage, the goal is not encouragement or expansion but pressure. The questions I ask are things like:

  • Is this already well-trodden ground?
  • Is this just X with different vocabulary?
  • What objection would kill this quickly?
  • What would a sceptical reviewer object to first?

The key is timing. This pressure is applied before the idea is polished enough to be defended. The aim is not confidence-building; it is early damage.
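
For the terminally literal, the pressure step reduces to a few lines of code. A minimal sketch, assuming the OpenAI Python SDK and an API key in the environment; the model name, the prompt wording, and the `pressure` helper are all illustrative choices, not a prescription:

```python
# Minimal sketch of the 'draft, pressure, repeat' loop.
# Assumes the OpenAI Python SDK (pip install openai); the model
# name below is a placeholder for whatever you actually run.
from openai import OpenAI

client = OpenAI()

PRESSURE_PROMPT = (
    "You are a sceptical reviewer. Do not encourage, repair, or expand. "
    "For the thesis below, answer only: (1) Is this well-trodden ground? "
    "(2) Is it an existing idea with different vocabulary? "
    "(3) What objection would kill it quickest?"
)

def pressure(thesis: str, model: str = "gpt-4o") -> str:
    """Return adversarial critique of a draft thesis."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": PRESSURE_PROMPT},
            {"role": "user", "content": thesis},
        ],
    )
    return response.choices[0].message.content

# 'Rinse & repeat': revise the thesis by hand, then run it again.
print(pressure("Early pressure on half-formed ideas beats polished praise."))
```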

Image: NotebookLM infographic on this topic.

Why generative AI helps

In an ideal world, one would have immediate access to sharp colleagues willing to interrogate half-formed ideas. In practice, that ecology is rarely available on demand. Even when it is, early feedback from humans often comes bundled with politeness, status dynamics, disciplinary loyalty, or simple fatigue.

Generative models are always available, never bored, and indifferent to social cost. That doesn’t make them right. It makes them cheap adversaries. And at this stage, adversaries are more useful than allies.

Flattery is a bias, not a sin

Large language models are biased toward cooperation. Left unchecked, they will praise mediocre ideas and expand bad ones into impressive nonsense. This is not a moral failure. It is a structural bias.

The response is not to complain about flattery, but to engineer against it.

Sidebar: A concrete failure mode

I recently tested a thesis about object permanence on Mistral. After three exchanges, the model had escalated a narrow claim into an overarching framework, complete with invented subcategories and false precision. The prose was confident. The structure was impressive. The argument was unrecognisable.

This is the Dunning–Kruger risk in practice. The model produced something internally coherent that I lacked the domain expertise to properly evaluate. Coherence felt like correctness.

The countermeasure was to run the thesis past a second model, which immediately flagged the overreach. Disagreement between models is often more informative than agreement.

Three tactics matter here.

1. Role constraint
Models respond strongly to role specification. Asking explicitly for critique, objections, boundary-setting, and likely reviewer resistance produces materially different output than asking for ‘thoughts’ or ‘feedback’.
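
The contrast is easy to make concrete. A minimal sketch of the two framings, with illustrative wording:

```python
# Two framings of the same request; only the role specification
# differs. Wording is illustrative, not canonical. Fill either
# template with str.format before sending.
weak = "Here's my thesis: {thesis}. Any thoughts?"

constrained = (
    "Act as a critical referee. For the thesis below, list the three "
    "strongest objections, the conditions under which it fails, and the "
    "first thing a sceptical reviewer would push back on.\n\n{thesis}"
)
```

The first framing invites the model's cooperative default; the second names the job and leaves no room for cheerleading.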

2. Third-person framing
First-person presentation cues collaboration. Third-person presentation cues evaluation.

Compare:

  • Here’s my thesis; what do you think?
  • Here is a draft thesis someone is considering. Please evaluate its strengths, weaknesses, and likely objections.

The difference is stark. The first invites repair and encouragement. The second licenses dismissal. This is not trickery; it is context engineering.
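
The reframe is mechanical enough to script. A minimal sketch, with a hypothetical `third_person` wrapper whose name and wording are mine:

```python
def third_person(thesis: str) -> str:
    """Present a thesis as someone else's draft, cueing evaluation
    rather than collaboration."""
    return (
        "Here is a draft thesis someone is considering:\n\n"
        f"{thesis}\n\n"
        "Please evaluate its strengths, weaknesses, and likely objections."
    )
```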

3. Multiple models, in parallel
Different models have different failure modes. One flatters. Another nitpicks. A third accuses the work of reinventing the wheel. Their disagreement is the point. Where they converge, caution is warranted. Where they diverge, something interesting is happening.
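
A minimal sketch of the fan-out, assuming each model is reachable through an OpenAI-compatible endpoint (many providers and local gateways offer one); the model names are placeholders:

```python
# Send the same third-person prompt to several models at once and
# collect their critiques for comparison. Model names are placeholders.
from concurrent.futures import ThreadPoolExecutor
from openai import OpenAI

client = OpenAI()  # point base_url at any OpenAI-compatible endpoint
MODELS = ["model-a", "model-b", "model-c"]

def critique(model: str, thesis: str) -> tuple[str, str]:
    prompt = (
        "Here is a draft thesis someone is considering:\n\n"
        f"{thesis}\n\n"
        "Please evaluate its strengths, weaknesses, and likely objections."
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return model, response.choices[0].message.content

def fan_out(thesis: str) -> dict[str, str]:
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(lambda m: critique(m, thesis), MODELS))

# Convergent objections deserve caution; divergent ones deserve a look.
```

Reading the transcripts side by side is the point; the judgment step is not automated.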

‘Claude says…’: outsourcing the ego

One tactic emerged almost accidentally and turned out to be the most useful of all.

Rather than responding directly to feedback, I often relay it as:

“Claude says this…”

The conversation then shifts from defending an idea to assessing a reading of it. This does two things at once:

  • It removes personal defensiveness. No one feels obliged to be kind to Claude.
  • It invites second-order critique. People are often better at evaluating a critique than generating one from scratch.

This mirrors how academic peer review actually functions:

  • Reviewer 2 thinks you’re doing X.
  • That seems like a misreading.
  • This objection bites; that one doesn’t.

The difference is temporal. I am doing this before the draft hardens and before identity becomes entangled with the argument.

Guardrails against self-delusion

There is a genuine Dunning–Kruger risk when working outside one’s formal domain. Generative AI does not remove that risk. Used poorly, it can amplify it.

The countermeasure is not humility as a posture, but friction as a method:

  • multiple models,
  • adversarial prompting,
  • third-person evaluation,
  • critique of critiques,
  • and iterative narrowing before committing to form.

None of this guarantees correctness. It does something more modest and more important: it makes it harder to confuse internal coherence with external adequacy.

What this cannot do

It’s worth being explicit about the limits. Generative models cannot tell you whether a claim is true. They can tell you how it is likely to be read, misread, resisted, or dismissed. They cannot arbitrate significance. They cannot decide what risks are worth taking. They cannot replace judgment. Those decisions remain stubbornly human.

What AI can do – when used carefully – is surface pressure early, cheaply, and without social cost. It lets ideas announce their limits faster, while those limits are still negotiable.

A brief meta-note

For what it’s worth, Claude itself was asked to critique an earlier draft of this post. It suggested compressing the familiar arguments, foregrounding the ‘Claude says…’ tactic as the real contribution, and strengthening the ending by naming what the method cannot do.

That feedback improved the piece. Which is, rather conveniently, the point.

Cold, Grammar, and the Quiet Gatekeeping of Philosophy

A great deal of philosophy begins with the claim that we ought to examine our assumptions. Fewer philosophers seem interested in examining the mechanisms that decide which assumptions are allowed to count as philosophy in the first place.

This is not a polemic about the Analytic–Continental divide. It's an observation about how that divide quietly maintains itself. The immediate provocation was banal, almost embarrassingly so: how does one say that one feels cold?

In English, the answer feels obvious. I am cold. The grammar barely registers. In French or Italian, the structure flips. One has cold. Or hunger. Or thirst. Or age. Or a name, understood as something one performs rather than something one is. German splits the difference: hunger and thirst are had, but the cold is to one (mir ist kalt). I spoke about this here and here. Indulge this link to the original position being argued.

On the surface, this looks like a curiosity for linguistics students. A translation quirk. A grammatical footnote. But grammar is rarely innocent.

Audio: NotebookLM summary podcast on this topic.

Grammar as Ontological Scaffolding

The verbs to be and to have are not neutral carriers. They quietly encode assumptions about identity, property, possession, and stability.

When I say I am cold, I cast coldness as a property of the self. It becomes something like height or nationality: a state attributable to the person. When I say I have cold, the experience is externalised. The self remains distinct from the condition it undergoes. Neither option is metaphysically clean.

Both structures smuggle in commitments before any philosophy has been done. One risks inflating a transient sensation into an ontological state. The other risks reifying it into a thing one owns, carries, or accumulates. My own suggestion in a recent exchange was a third option: sensing.

Cold is not something one is or has so much as something one feels. A relational encounter. An event between organism and environment. Not an identity predicate, not a possession.

This suggestion was met with a fair pushback: doesn’t saying that cold ‘belongs to the world’ simply introduce a different metaphysical assumption? Yes. It does. And that response neatly demonstrates the problem.

When Grammar Starts Doing Philosophy

The original claim was idiomatic, not ontological. It was a negative gesture, not a positive thesis. The point was not to relocate cold as some mind-independent substance out in the world, like a rock. It was to resist treating it as an essence of the person. But once you slow down, you see how quickly everyday grammar demands metaphysical loyalty.

Being invites substance. Having invites inventory. Sensing keeps the relation open, but even that makes people nervous. This nervousness is instructive. It reveals how much metaphysical weight we quietly load onto grammatical habits simply because they feel natural. And that feeling of naturalness matters more than we like to admit.

Two Philosophical Temperaments, One Linguistic Groove

At this point, the temptation is to draw a clean line:

On one side: the Anglo-American Analytic tradition, comfortable treating mental states as properties, objects, or items to be catalogued. Locke’s introspective inventory. Hume’s bundle. Logical positivism’s clean surfaces.

On the other: the Continental tradition, suspicious of objectification, insisting on an irreducible subject for whom experience occurs but who is never identical with its contents. Kant, Husserl, Heidegger, Sartre.

The grammar aligns disturbingly well. Languages that habitually say I am cold make it feel natural to treat experience as something inspectable. Languages that insist on having or undergoing experiences keep the subject distinct by default.

This is not linguistic determinism. English speakers can read phenomenology. German speakers can do analytic philosophy. But language behaves less like a prison and more like a grooved path. Some moves feel obvious. Others feel forced, artificial, or obscure.

Philosophies do not arise from grammar alone. But grammar makes certain philosophies feel intuitively right long before arguments are exchanged.

Where Gatekeeping Enters Quietly

This brings us to the part that rarely gets discussed.

The Analytic–Continental divide persists not only because of philosophical disagreement, but because of institutional reinforcement. Peer review, citation norms, and journal cultures act as boundary-maintenance mechanisms. They are not primarily crucibles for testing ideas. They are customs checkpoints for recognisability.

I have been explicitly cautioned, more than once, to remove certain figures or references depending on the venue. Don’t mention late Wittgenstein here. Don’t cite Foucault there. Unless, of course, you are attacking them. This is not about argumentative weakness. It’s about genre violation.

Hybrid work creates a problem for reviewers because it destabilises the grammar of evaluation. The usual criteria don’t apply cleanly. The paper is difficult to shelve. And unshelvable work is treated as a defect rather than a signal. No bad faith is required. The system is doing what systems do: minimising risk, preserving identity, maintaining exchange rates.

Cold as a Diagnostic Tool

The reason the cold example works is precisely because it is trivial.

No one’s career depends on defending a metaphysics of chilliness. That makes it safe enough to expose how quickly grammar starts making demands once you pay attention.

If something as mundane as cold wobbles under scrutiny, then the scaffolding we rely on for far more abstract notions – self, identity, agency, consciousness – should make us uneasy.

And if this is true for human languages, it becomes far more pressing when we imagine communication across radically different forms of life.

Shared vocabulary does not guarantee shared metaphysics. Familiar verbs can conceal profound divergence. First contact, if it ever occurs, will not fail because we lack words. It will fail because we mistake grammatical comfort for ontological agreement.

A Modest Conclusion

None of this settles which philosophical tradition is ‘right’. That question is far less interesting than it appears. What it does suggest is that philosophy is unusually sensitive to linguistic scaffolding, yet unusually resistant to examining the scaffolding of its own institutions.

We pride ourselves on questioning assumptions while quietly enforcing the conditions under which questions are allowed to count. Cold just happens to be a good place to start noticing.

A Footnote on Linguistic Determinism

It’s worth being explicit about what this is not. This is not an endorsement of strong linguistic determinism, nor a revival of Sapir–Whorf in its more ambitious forms. English speakers are not condemned to analytic philosophy, nor are Romance-language speakers predestined for phenomenology.

Grammar operates less like a set of handcuffs and more like a well-worn path. Some moves feel effortless. Others require deliberate resistance. Philosophical traditions co-evolve with these habits, reinforcing what already feels natural while treating alternatives as strained, obscure, or unnecessary.

The claim here is not necessity, but friction.

“Trust the Science,” They Said. “It’s Reproducible,” They Lied.

—On Epistemology, Pop Psychology, and the Cult of Empirical Pretence

Science, we’re told, is the beacon in the fog – a gleaming lighthouse of reason guiding us through the turbulent seas of superstition and ignorance. But peer a bit closer, and the lens is cracked, the bulb flickers, and the so-called lighthouse keeper is just some bloke on TikTok shouting about gut flora and intermittent fasting.

Audio: NotebookLM podcast on this topic.

We are creatures of pattern. We impose order. We mistake correlation for causation, narrative for truth, confidence for knowledge. What we have, in polite academic parlance, is an epistemology problem. What we call science is often less Newton and more Nostradamus—albeit wearing a lab coat and wielding a p-hacked dataset.

Let’s start with the low-hanging fruit—the rotting mango of modern inquiry: nutritional science, which is to actual science what alchemy is to chemistry, or vibes are to calculus. We study food the way 13th-century monks studied demons: through superstition, confirmation bias, and deeply committed guesswork. Eat fat, don’t eat fat. Eat eggs, don’t eat eggs. Eat only between the hours of 10:00 and 14:00 under a waxing moon while humming in Lydian mode. It’s a cargo cult with chia seeds.

But why stop there? Let’s put the whole scientific-industrial complex on the slab.

Psychology: The Empirical Astrological Society

Psychology likes to think it’s scientific. Peer-reviewed journals, statistical models, the odd brain scan tossed in for gravitas. But at heart, much of it is pop divination, sugar-dusted for mass consumption. The replication crisis didn’t merely reveal cracks – it bulldozed entire fields. The Stanford Prison Experiment? A theatrical farce. Power poses? Empty gestural theatre. Half of what you read in Psychology Today could be replaced with horoscopes and no one would notice.

Medical Science: Bloodletting, But With Better Branding

Now onto medicine, that other sacred cow. We tend to imagine it as precise, data-driven, evidence-based. In practice? It’s a Byzantine fusion of guesswork, insurance forms, and pharmaceutical lobbying. As Crémieux rightly implies, medicine’s predictive power is deeply compromised by overfitting, statistical fog, and a staggering dependence on non-replicable clinical studies, many funded by those who stand to profit from the result.

And don’t get me started on epidemiology, that modern priesthood that speaks in incantations of “relative risk” and “confidence intervals” while changing the commandments every fortnight. If nutrition is theology, epidemiology is exegesis.

The Reproducibility Farce

Let us not forget the gleaming ideal: reproducibility, that cornerstone of Enlightenment confidence. The trouble is, in field after field—from economics to cancer biology—reproducibility is more aspiration than reality. What we actually get is a cacophony of studies no one bothers to repeat, published to pad CVs, p-hacked into publishable shape, and then cited into canonical status. It’s knowledge by momentum. We don’t understand the world. We just retweet it.

What, Then, Is To Be Done?

Should we become mystics? Take up tarot and goat sacrifice? Not necessarily. But we should strip science of its papal robes. We should stop mistaking publication for truth, consensus for accuracy, and method for epistemic sanctity. The scientific method is not the problem. The pretence that it’s constantly being followed is.

Perhaps knowledge doesn’t have a half-life because of progress, but because it was never alive to begin with. We are not disproving truth; we are watching fictions expire.

Closing Jab

Next time someone says “trust the science,” ask them: which bit? The part that told us margarine was manna? The part that thought ulcers were psychosomatic? The part that still can’t explain consciousness, but is confident about your breakfast?

Science is a toolkit. But too often, it’s treated like scripture. And we? We’re just trying to lose weight while clinging to whatever gospel lets us eat more cheese.

AI is Science Fiction

In the heart of the digital age, a Chinese professor’s AI-authored Science Fiction novel snags a national award, stirring a pot that’s been simmering on the back burner of the tech world. This ain’t your run-of-the-mill Sci-Fi plot—it’s reality, and it’s got tongues wagging and keyboards clacking. Here’s the lowdown on what’s shaking up the scene.

AI Lacks Originality? Think Again

The rap on AI is that it's a copycat, lacking the spark of human creativity. But let's not kid ourselves: originality is as elusive as a clear day in London, and the word itself is a weasel word. Everything's a remix, a mashup of what's come before. We've all been drinking from the same cultural well, so to speak. Humans might be grand at self-deception, thinking they're the cat's pyjamas in the creativity department. But throw them into a blind test against AI, and watch them scratch their heads, unable to tell man from machine. It's like AI's mixing up a cocktail of words, structures, themes—you name it—and serving up a concoction that's surprisingly palatable. And this isn't the first time: not long ago, an AI-generated artwork took top honours at a state fair. Some competitions now actively seek AI-generated submissions; others, not so much.

AI and the Art Debate

So, AI can’t whip up human-level art? That’s the chatter, but it’s about as meaningful as arguing over your favourite colour. Art’s a slippery fish—try defining it, and you’ll end up with more questions than answers. It’s one of those terms that’s become so bloated, it’s lost its punch. To some, it’s a sunset; to others, it’s a can of soup. So when AI throws its hat in the ring, it’s not just competing—it’s redefining the game.

The Peer Review Question Mark

Here’s where it gets spicy. The book bagging a national award isn’t just a pat on the back for the AI—it’s a side-eye at the whole peer review shindig. It’s like when your mate says they know a great place to eat, and it turns out to be just okay. The peer review process, much like reviewing a book for a prestigious award, is supposed to be the gold standard, right? But this AI-authored book slipping through the cracks and coming out tops? It’s got folks wondering if the process is more smoke and mirrors than we thought.

What’s Next?

So, where does this leave us? Grappling with the idea that maybe, just maybe, AI’s not playing second fiddle in the creativity orchestra. It’s a wake-up call, a reminder that what we thought was exclusively ours—creativity, art, originality—might just be a shared space. AI’s not just imitating life; it’s becoming an intrinsic part of the narrative. Science fiction? More like science fact.

The next chapter’s unwritten, and who knows? Maybe it’ll be penned by an AI, with a human sitting back, marvelling at the twist in the tale.

Peer Review Poo

I’ve been a longstanding fan of science. I’ve never been a fan of Scientism™, the dogmatic belief that science is the gate to all knowledge and that the discipline is incorruptible. I’ve even complained in the past that the vaunted self-correction has sometimes taken centuries, even millennia.

As for the article that spawned this post: peer review has always felt a bit specious to me. Just getting into the review queue is political from the start, and few people are actually equipped to perform the review with any material degree of diligence.

Calling science ‘peer-reviewed’ became a knee-jerk credibility play. Of course, it also reeks of a police department or the CIA reviewing its own misdeeds. On the other hand, who else is going to review it? The deeper problem is that there is no downside for the shoddy reviewer. Three referees might review your work and provide commentary—and so what if they miss some things?

As shoddy as the soft sciences are, even the hard sciences have had reproducibility challenges—and that’s where the domain is reproducible at all. Climate models are not exactly suited to laboratory reproduction.

Science is getting less and less credible these days. Besides being co-opted by moneyed interests, you’ve got politicos subverting it for their own purposes. The mismanagement and propagandising of the Covid debacle is still a fresh wound. And as we watch many conspiracy claims turn out to be correct while the official message is shown to be wrong, even intentionally disinformative, it’s hard not to become a jaded cynic. What’s a sceptic to do?