Manuscript Review with LLMs


Main event

I’m an active AI user. It’s no secret. My top uses are research and enquiry, but AI is also instrumental in my review and revision process.

Audio: NotebookLM summary podcast of this topic.

I am trying to wrap up my latest manuscript. I’m about 5 revisions through, so I felt I was finally in a position to check for cracks and missing elements, as well as the strength of my overall position and approach. It’s not a good idea to simply prompt, ‘What do you think about this?’

I’d tried prompts as simple as, ‘Act as a referee and be adversarial against this piece’ or ‘I got this from somewhere, and I want a critique’. These approaches help shield you from AI’s programmed sycophantic tendencies, but they aren’t enough. You still need to create guidelines and guardrails, which include orientating the AI; otherwise, it will likely wander off course.

This is the actual prompt I last employed to various LLMs:

The attached is a complete development draft of Architecture of Willing, a philosophical monograph arguing that the vocabulary of will, intent, motive, choice, decision, and related terms operates through a two-stage grammatical mechanism – compression of action-patterns into portable nouns, followed by inversion of those nouns into apparent upstream authors of the very patterns from which they were abstracted. The book calls this mechanism authoring displacement and uses it to argue that retributive desert cannot be stably grounded in the vocabulary on which it depends.

The book is deliberately diagnostic rather than prescriptive. It does not propose a replacement psychology, a reformed legal code, or a new theory of agency. It refuses to settle the traditional free-will debate on either side. These refusals are intentional and are argued for within the text.

What I am asking for is a critical engagement from a position of maximum philosophical resistance. Specifically:

The book rests on a claim about what retributive practice requires – namely, a stable inward authoring source capable of making suffering genuinely owed rather than merely institutionally imposed. If that characterisation of retributivism’s requirements is wrong, or if it applies only to unsophisticated versions while leaving the strongest contemporary defences untouched, the central argument is significantly weakened. I would like to know whether that is the case, and if so, where exactly the book’s account of retributivism’s commitments fails to engage its best defenders.

More broadly: the book is a diagnosis of grammar. The question I want pressed is whether a grammatical diagnosis can do the normative work the book needs it to do – whether there is a gap between ‘this noun cannot stably support the load placed on it’ and ‘therefore practices depending on this noun are normatively unjustified’. If there is such a gap, what would close it, and does the book close it?

Please do not soften objections in the direction of ‘this is a good book with some gaps’. If the argument is unsound, say so and say where. If it is sound against some targets but not others, identify the targets it misses. The manuscript has already received generous assessments; what it needs now is the strongest case against it.

Of course, this prompt is specific to me and my project, but one may feel free to use it as a model for similar purposes.

Among the gaps returned were arguments I had not been aware of. In fact, in a couple of places, I had already cited authors, but the AI returned additional books or essays by the same people. In other cases, it offered material by authors I hadn’t considered. Obviously, I am interested in creating solid, watertight arguments, so this only helps my case.

For this project, my LLMs of choice have been Claude, ChatGPT, Gemini, Grok, and Kimi K2. I used Perplexity, Mistral, DeepSeek, and Z.ai GLM in earlier iterations.

Peer review

Another application is to take the critique output from one LLM into another with a prompt to evaluate the critique. My modus operandi here is to pick a ‘master’ LLM – typically in a Claude or ChatGPT project context – and treat it as my primary partner; the others are virtual subcontractors. This means that I can get a half-dozen or more reactions in minutes, which are then digested by the, let’s say, project manager, for assessment and a proposed action plan, typically in the form of a punch list. I recommend this approach as well.
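For anyone who wants to script this rather than copy and paste between browser tabs, here is a minimal sketch of the fan-out and hand-off. Everything in it is an assumption for illustration: `collect_critiques`, `build_manager_prompt`, and the stand-in reviewers are hypothetical names, and the lambdas simply return canned text where a real call to a model would go. It is not how any particular vendor's API works.

```python
from typing import Callable, Dict

# Hypothetical type: any function that takes a prompt and returns a model's reply.
ModelFn = Callable[[str], str]


def collect_critiques(manuscript: str, prompt: str,
                      reviewers: Dict[str, ModelFn]) -> Dict[str, str]:
    """Send the same adversarial prompt plus manuscript to every reviewer model."""
    request = f"{prompt}\n\n---\n\n{manuscript}"
    return {name: ask(request) for name, ask in reviewers.items()}


def build_manager_prompt(critiques: Dict[str, str]) -> str:
    """Bundle the critiques so the 'master' model can assess them together."""
    instructions = (
        "Below are independent critiques of the same manuscript. "
        "Evaluate each critique, say which objections bite and which misread "
        "the text, and propose a punch list of revisions."
    )
    sections = [
        f"Critique from reviewer {name}:\n{text}"
        for name, text in critiques.items()
    ]
    return instructions + "\n\n" + "\n\n".join(sections)


# Stand-in reviewers (assumed) so the sketch runs without network access.
reviewers = {
    "A": lambda request: "The account of retributivism targets only weak versions.",
    "B": lambda request: "The normative gap is asserted, not closed.",
}

critiques = collect_critiques("(manuscript text)", "Act as a hostile referee.", reviewers)
manager_prompt = build_manager_prompt(critiques)
print(manager_prompt)
```

Labelling the reviewers by letter rather than by model name is a deliberate choice in this sketch: it keeps the project manager focused on the objections rather than on who raised them.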

NotebookLM Infographic on this topic.

Closing shot

When I was in grad school, this part of the project would have taken months. As it is, I’ve been working on this project since COVID-19, but it’s been an on-and-off affair, accumulating research information and documentation all the while. The manuscript will be better off, and my position honed sharper over this expanse of time, so the delay was beneficial.

Would more time also be beneficial? Probably, but one needs to stop somewhere, and I’m likely facing diminishing marginal returns. If I go the way of Wittgenstein, I’ll reverse track and recant everything. And so it goes…

Using Generative AI as Early Peer Review


Cheap Adversaries, Outsourced Ego, and Engineered Critique ← ChatGPT is obsessed with subtitles.

There is a peculiar anxiety around admitting that one uses generative AI in serious intellectual work. The anxiety usually takes one of two forms. Either the AI is accused of replacing thinking, or it is accused of flattering the thinker into delusion. Both charges miss the point, and both underestimate how brittle early-stage human peer review often is.

What follows is not a defence of AI as an oracle, nor a claim that it produces insight on its own. It is an account of how generative models can be used – deliberately, adversarially, and with constraints – as a form of early peer pressure. Not peer review in the formal sense, but a rehearsal space where ideas are misread, overstated, deflated, and occasionally rescued from themselves.

Audio: NotebookLM summary podcast of this topic.

The unromantic workflow

The method itself is intentionally dull:

  1. Draft a thesis statement.
    Rinse & repeat.
  2. Draft an abstract.
    Rinse & repeat.
  3. Construct an annotated outline.
    Rinse & repeat.
  4. Only then begin drafting prose.

At each stage, the goal is not encouragement or expansion but pressure. The questions I ask are things like:

  • Is this already well-trodden ground?
  • Is this just X with different vocabulary?
  • What objection would kill this quickly?
  • What would a sceptical reviewer object to first?

The key is timing. This pressure is applied before the idea is polished enough to be defended. The aim is not confidence-building; it is early damage.

Image: NotebookLM infographic on this topic.

Why generative AI helps

In an ideal world, one would have immediate access to sharp colleagues willing to interrogate half-formed ideas. In practice, that ecology is rarely available on demand. Even when it is, early feedback from humans often comes bundled with politeness, status dynamics, disciplinary loyalty, or simple fatigue.

Generative models are always available, never bored, and indifferent to social cost. That doesn’t make them right. It makes them cheap adversaries. And at this stage, adversaries are more useful than allies.

Flattery is a bias, not a sin

Large language models are biased toward cooperation. Left unchecked, they will praise mediocre ideas and expand bad ones into impressive nonsense. This is not a moral failure. It is a structural bias.

The response is not to complain about flattery, but to engineer against it.

Sidebar: A concrete failure mode

I recently tested a thesis on Mistral about object permanence. After three exchanges, the model had escalated a narrow claim into an overarching framework, complete with invented subcategories and false precision. The prose was confident. The structure was impressive. The argument was unrecognisable.

This is the Dunning–Kruger risk in practice. The model produced something internally coherent that I lacked the domain expertise to properly evaluate. Coherence felt like correctness.

The countermeasure was using a second model, which immediately flagged the overreach. Disagreement between models is often more informative than agreement.

Three tactics matter here.

1. Role constraint
Models respond strongly to role specification. Asking explicitly for critique, objections, boundary-setting, and likely reviewer resistance produces materially different output than asking for ‘thoughts’ or ‘feedback’.

2. Third-person framing
First-person presentation cues collaboration. Third-person presentation cues evaluation.

Compare:

  • Here’s my thesis; what do you think?
  • Here is a draft thesis someone is considering. Please evaluate its strengths, weaknesses, and likely objections.

The difference is stark. The first invites repair and encouragement. The second licenses dismissal. This is not trickery; it is context engineering.
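If you run this framing often, it can be wrapped in a small helper so the first-person version never slips out by habit. This is a sketch only: `frame_for_evaluation` and its `role` parameter are hypothetical names, and the wording is lifted from the second example above with a role constraint bolted on the front.

```python
def frame_for_evaluation(thesis: str, role: str = "a sceptical reviewer") -> str:
    """Wrap a thesis in third-person, role-constrained wording."""
    return (
        f"Act as {role}. "
        "Here is a draft thesis someone is considering:\n\n"
        f"{thesis.strip()}\n\n"
        "Please evaluate its strengths, weaknesses, and likely objections."
    )


# Example thesis text is invented for illustration.
prompt = frame_for_evaluation("  The vocabulary of will is a grammatical artefact.  ")
print(prompt)
```

The same helper can be pointed at an abstract or an outline; only the noun in the framing sentence would need to change.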

3. Multiple models, in parallel
Different models have different failure modes. One flatters. Another nitpicks. A third accuses the work of reinventing the wheel. Their disagreement is the point. Where they converge, caution is warranted. Where they diverge, something interesting is happening.

‘Claude says…’: outsourcing the ego

One tactic emerged almost accidentally and turned out to be the most useful of all.

Rather than responding directly to feedback, I often relay it as:

“Claude says this…”

The conversation then shifts from defending an idea to assessing a reading of it. This does two things at once:

  • It removes personal defensiveness. No one feels obliged to be kind to Claude.
  • It invites second-order critique. People are often better at evaluating a critique than generating one from scratch.

This mirrors how academic peer review actually functions:

  • Reviewer 2 thinks you’re doing X.
  • That seems like a misreading.
  • This objection bites; that one doesn’t.

The difference is temporal. I am doing this before the draft hardens and before identity becomes entangled with the argument.

Guardrails against self-delusion

There is a genuine Dunning–Kruger risk when working outside one’s formal domain. Generative AI does not remove that risk. Used poorly, it can amplify it.

The countermeasure is not humility as a posture, but friction as a method:

  • multiple models,
  • adversarial prompting,
  • third-person evaluation,
  • critique of critiques,
  • and iterative narrowing before committing to form.

None of this guarantees correctness. It does something more modest and more important: it makes it harder to confuse internal coherence with external adequacy.

What this cannot do

It’s worth being explicit about the limits. Generative models cannot tell you whether a claim is true. They can tell you how it is likely to be read, misread, resisted, or dismissed. They cannot arbitrate significance. They cannot decide what risks are worth taking. They cannot replace judgment. Those decisions remain stubbornly human.

What AI can do – when used carefully – is surface pressure early, cheaply, and without social cost. It lets ideas announce their limits faster, while those limits are still negotiable.

A brief meta-note

For what it’s worth, Claude itself was asked to critique an earlier draft of this post. It suggested compressing the familiar arguments, foregrounding the ‘Claude says…’ tactic as the real contribution, and strengthening the ending by naming what the method cannot do.

That feedback improved the piece. Which is, rather conveniently, the point.

Editing Is Hard and Propensity


Well, not so much hard as not particularly or inherently enjoyable.

I estimate I’ve got about a day left to complete this manuscript – ‘done’ done. When I open InDesign, it shames me: I last touched this document 3 days ago. It doesn’t feel like 3 days have passed, but time flies.

On the right is an older version. I began reworking it into this new version over the summer, and here I am come autumn. It’s even worse if I use the Chinese calendar. Evidently, 7th November is the first day of winter. They can’t wait until the solstice.

Anyway, just a brief update. This isn’t going to edit itself, and I can’t afford to pay an editor for a passion project. Besides – and let’s be honest – I can’t afford an editor in general – or at least can’t cost-justify it – and all my writing is a passion project.

Of course, editors (and cover artists) insist that one would sell more books if only they were edited or professionally rendered. There is an element of truth to this, but I’ve read some gawdawful books that were professionally edited and published through a traditional publisher, because publishers publish.

Me, I operate on razor-thin margins. Most of my publications haven’t even broken even – even if I ignore opportunity costs, which I can’t because I’m an economist. Accountants get to play that trick.

This said, I do hire reviewers, editors, and artists in small doses – homoeopathic as they might be – and I’ve had mixed results.

I’m rambling

Must really be avoiding the editing process…

Recently, I wanted to redesign the cover of one of my Ridley Park fiction books.

Image Comparison: A Tale of Two Propensities

The cover on the left is the original. It is intentionally a minimal 2-D construction – a representation of the first section of the book, the first 15 chapters.

The cover on the right is the update. It is also minimalist, representing the second section of Propensity. I’m not sure how I would depict the third section. If it comes to me, I may render a third version.

There’s a story to this. I reached out to some cover artists and told them I was unhappy with my original design but had no visual ideas. I’d leave this to the artist. It turns out that some artists don’t want full control over the design process. I can understand the hesitation.

They asked for covers that I might like, so I researched some covers and saved them to a Pinterest board.

As it turned out, after some inspiration, I decided to render this one myself, too. Hey, I tried.

What happened to the rest of the time?

OK, so there’s more. I also created a video book trailer in the evening.

It was fun enough. Give it a watch. It also represents part one of Propensity.

OK, this time for real. Let me know what you think…about anything in particular.

Understanding Generative AI

Ok. I admit this is an expansive claim, but I write about the limitations of generative artificial intelligence relative to writers. I wrote this after encountering several Reddit responses by writers who totally misunderstand how AI works. They won’t read this, but you might want to.

Click to visit the Ridley Park Blog for this article and podcast
Video: Cybernetic robot assisting a female writer (or stealing her work)

The Indexing Abyss: A Cautionary Tale in Eight Chapters

There, I said it.

I’m almost finished with A Language Insufficiency Hypothesis, the book I’ve been labouring over for what feels like the gestation period of a particularly reluctant elephant. To be clear: the manuscript is done. Written. Edited. Blessed. But there remains one final circle of publishing hell—the index.

Now, if you’re wondering how motivated I am to return to indexing, consider this: I’m writing this blog post instead. If that doesn’t scream avoidance with an airhorn, nothing will.

Audio: NotebookLM podcast on this topic.

I began indexing over a month ago. I made it through two chapters of eight, then promptly wandered off to write a couple of novellas. As you do. One started as a short story—famous last words—and evolved into a novella. The muse struck again. Another “short story” appeared, and like an unattended sourdough starter, it fermented into a 15,000-word novelette. Apparently, I write short stories the way Americans pour wine: unintentionally generous.

With several unpublished manuscripts loitering on my hard drive like unemployed theatre majors, I figured it was time to release one into the wild. So I did. I published the novelette to Kindle, and just today, the paperback proof landed in my postbox like a smug little trophy.

And then, because I’m an unrepentant completionist (or a masochist—jury’s out), I thought: why not release the novella too? I’ve been told novellas and novelettes are unpopular due to “perceived value.” Apparently, people would rather buy a pound of gristle than 200 grams of sirloin. And yet, in the same breath, they claim no one has time for long books anymore. Perhaps these are different tribes of illiterates. I suppose we’ll find out.

Let’s talk logistics. Writing a book is only the beginning—and frankly, it’s the easy part. Fingers to keyboard, ideas to page. Done. I use Word, like most tragically conventional authors. Planning? Minimal. These were short stories, remember? That was the plan.

Next comes layout. Enter Adobe InDesign—because once you’ve seen what Word does to complex layouts, you never go back. Export to PDF, pray to the typographic gods, and move on.

Then there’s the cover. I lean on Illustrator and Photoshop. Photoshop is familiar, like a worn-in shoe; Illustrator is the smug cousin who turns up late but saves the day with scalable vectors. This time, I used Illustrator for the cover—lesson learnt from past pixelation traumas. Hardback to paperback conversion? A breeze when your artwork isn’t made of crayon scribbles and hope.

Covers, in case you’ve never assembled one, are ridiculous. Front. Back. Spine. Optional dust jacket if you’re feeling fancy (I wasn’t). You need titles, subtitles, your name in a legible font, and let’s not forget the barcode, which you will place correctly on the first attempt exactly never.

Unlike my first novel, where I enlisted someone with a proper design eye to handle the cover text, this time I went full minimalist. Think Scandinavian furniture catalogue meets existential despair. Classy.

Once the cover and interior are done, it’s time to wrestle with the publishing platforms. Everything is automated these days—provided you follow their arcane formatting commandments, avoid forbidden fonts, and offer up your soul. Submitting each book takes about an hour, not including the time lost choosing a price that balances “undervalued labour” and “won’t scare away cheapskates.”

Want a Kindle version? That’s another workflow entirely, full of tortured formatting, broken line breaks, and wondering why your chapter headings are now in Wingdings. Audiobooks? That’s a whole other circus, with its own animals and ringmasters. Honestly, it’s no wonder authors hire publishers. Or develop drinking problems.

But I’m stubborn. Which brings us full circle.

I’ve now got two books heading for daylight, a few more waiting in the wings, and one bloody non-fiction beast that won’t see release until I finish the damn index. No pseudonym this time. No hiding. Just me, owning my sins and hoping the final product lands somewhere between “insightful” and “mercifully short.”

So yes, life may well be a journey. But indexing is the bit where the satnav breaks, the road floods, and the boot falls off the car. Give me the destination any day. The journey can fuck right off.

Language Insufficiency Hypothesis, Chapter One

Well, a word cloud anyway.

I’m busy focusing on the manuscript for Chapter One of my Language Insufficiency Hypothesis. Scanning the copy—over 4,000 words—I noticed some repetition. My eye caught ‘fundamental’, which I use often in speech as well. I decided to try a word cloud to illustrate common words.

As this is a hypothesis on the limitations of language, the fact that these are prominent is no surprise, but 34 instances of ‘fundamental’ seem fundamentally off—see what I did there? And this is only the first chapter of eight.
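A word cloud shows the pattern, but a plain count gives the number. Here is a rough sketch of how a tally like this could be reproduced with Python's standard library. The helper names `word_counts` and `stem_count` are my own inventions, and the sample text is made up for the demonstration rather than taken from the chapter.

```python
import re
from collections import Counter


def word_counts(text: str) -> Counter:
    """Count lowercase word tokens; apostrophes inside words are kept."""
    return Counter(re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower()))


def stem_count(counts: Counter, stem: str) -> int:
    """Total occurrences of every word beginning with the stem,
    so 'fundamental' also catches 'fundamentally'."""
    return sum(n for word, n in counts.items() if word.startswith(stem))


# Invented sample text, standing in for the chapter.
sample = (
    "These are fundamental features. This fundamentally reframes the problem. "
    "The constraints are fundamental, not circumstantial."
)

counts = word_counts(sample)
print(counts.most_common(5))              # the raw material of a word cloud
print(stem_count(counts, "fundamental"))  # 3 for this sample
```

Counting by stem matters here because a search for the exact word would miss the adverb, which is just as much a verbal tic.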

I’ll review the content to see if I can remove some instances—perhaps entire sentences—or choose a synonym to see where it leads.

The manuscript is in its fourth full revision, and it feels closer to completion. I aim for an early 2025 release date, but time will tell.

As usual, I employ AI (Claude, in this case) to do some heavy lifting.

Having analysed the chapter, I’ve identified several places where “fundamental” could be substituted or removed to enhance the linguistic diversity and rhetorical effectiveness of the text. Here are some examples:

  1. “These limitations, we argue, manifest not merely as practical challenges but as fundamental features of how meaning systems operate…”
    • Alternative: “These limitations, we argue, manifest not merely as practical challenges but as intrinsic features of how meaning systems operate…”
  2. “This theoretical perspective fundamentally reframes several longstanding problems in linguistics and philosophy.”
    • Alternative: “This theoretical perspective radically reframes several longstanding problems in linguistics and philosophy.”
  3. “These patterns suggest inherent rather than circumstantial constraints – limitations that emerge from language’s structure itself rather than from inadequate definition or imprecise usage.”
    • Here, “fundamental” could be omitted entirely without losing the core meaning of the sentence.
  4. “Understanding these constraints as systematic features rather than implementation failures provides a crucial theoretical foundation for the Language Insufficiency Hypothesis.”
    • Alternative: “Understanding these constraints as systematic features rather than implementation failures provides a vital theoretical foundation for the Language Insufficiency Hypothesis.”
  5. “These cognitive constraints manifest not as mere practical impediments but as necessary features of how meaning is processed and transmitted through human neural networks.”
    • Here, “fundamental” could be replaced with “essential” or simply omitted.

Notice that 3 of the 5 recommendations didn’t include the offending word. Rather, they already employed synonyms. This gives us insights into how LLMs translate language with fuzzy logic. Perhaps that’s an article for another day.