
Ghost Writer Documentation

Complete reference for the Ghost Protocol methodology, 5-stage pipeline, 40-point QA system, platform specs, voice profiles, detection evasion, and API integration.

1. Ghost Protocol Methodology

Ghost Writer operates under three core laws that govern every piece of content:

  1. Controlled Chaos — Deliberate imperfection in every piece. Sentence fragments, unexpected metaphors, paragraphs that run long because the thought demands it. Never uniform.
  2. Signature Voice — Every content type gets a specific voice profile. The engine writes as a defined persona with consistent tone, vocabulary, and structural habits.
  3. Invisible Architecture — Detection evasion is baked into generation, not bolted on. The writing passes because of how it's constructed, not because of post-processing.

2. 5-Stage Pipeline

Every piece of content flows through five sequential stages:

  1. Profile — Load voice profile, set perplexity/burstiness targets, identify platform constraints.
  2. Writer — GPT-5.4 generation with Ghost Protocol system prompt.
  3. QA Engine — 40-point check across 10 blocks.
  4. Adapter — Format for target platform (18 supported).
  5. Polish — Human-pass simulation with 2–3 small edits.

3. The 40-Point QA System

Every piece is validated against 40 checks organized into 10 blocks. Hard checks must pass; soft checks inform quality scoring.
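The hard/soft split can be sketched as a small aggregation step. This is an illustration only: the `CheckResult` type, the `summarize` function, and the rule that the soft score is the fraction of soft checks passed are assumptions, not the engine's documented scoring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    check_id: int   # 1..40, as numbered in Blocks A-J
    hard: bool      # True for a hard check, False for a soft check
    passed: bool


def summarize(results: list[CheckResult]) -> dict:
    """Hard checks gate acceptance; soft checks only feed a quality score."""
    hard_failures = [r.check_id for r in results if r.hard and not r.passed]
    soft = [r for r in results if not r.hard]
    # Assumed scoring rule: fraction of soft checks that passed.
    soft_score = sum(r.passed for r in soft) / len(soft) if soft else 1.0
    return {
        "accepted": not hard_failures,
        "hard_failures": hard_failures,
        "soft_score": round(soft_score, 2),
        "passed": sum(r.passed for r in results),
        "failed": sum(not r.passed for r in results),
    }


results = [
    CheckResult(1, hard=True, passed=True),
    CheckResult(2, hard=False, passed=False),
    CheckResult(5, hard=True, passed=True),
    CheckResult(6, hard=False, passed=True),
]
summary = summarize(results)
```

Under these assumptions a piece with one failed soft check is still accepted, while a single failed hard check rejects it regardless of the soft score.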

Block A: Statistical (#1–7)

| ID | Name | Type | Target | Description |
|----|------|------|--------|-------------|
| #1 | Sentence Length Variance | Hard | stdev ≥ 5 | Sentence length standard deviation must meet minimum for burstiness. |
| #2 | Vocabulary Richness TTR | Soft | ≥ 0.45 | Type-token ratio indicates lexical diversity. |
| #3 | Hapax Legomena Ratio | Soft | ≥ 0.25 | Ratio of words used once to total unique words. |
| #4 | Average Sentence Length | Soft | 8–25 words | Within human-typical range. |
| #5 | Short Sentence Presence | Hard | ≥ 1 sentence ≤ 5 words | At least one short sentence or fragment. |
| #6 | Long Sentence Presence | Soft | ≥ 1 sentence ≥ 25 words | At least one longer, complex sentence. |
| #7 | N-gram Diversity | Soft | Varied distribution | Token distribution should not be overly predictable. |
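Checks #1 through #6 are plain text statistics, so they can be approximated in a few lines. The sketch below is not the QA Engine's code: the function names, the naive sentence splitter, the lowercase word tokenizer, and the use of population standard deviation are all assumptions.

```python
import re
from collections import Counter
from statistics import mean, pstdev


def split_sentences(text: str) -> list[str]:
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def block_a_metrics(text: str) -> dict:
    sentences = split_sentences(text)
    lengths = [len(s.split()) for s in sentences]
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    hapax = sum(1 for c in counts.values() if c == 1)
    return {
        "length_stdev": pstdev(lengths) if lengths else 0.0,    # check #1
        "ttr": len(counts) / len(words) if words else 0.0,     # check #2
        "hapax_ratio": hapax / len(counts) if counts else 0.0, # check #3
        "avg_length": mean(lengths) if lengths else 0.0,       # check #4
        "has_short": any(n <= 5 for n in lengths),             # check #5
        "has_long": any(n >= 25 for n in lengths),             # check #6
    }


sample = "Stop. The quick brown fox jumps over the lazy dog near the river bank today."
metrics = block_a_metrics(sample)
```

For the two-sentence sample, the sentence lengths are 1 and 14 words, so the length standard deviation is 6.5 and the short-sentence check passes while the long-sentence check does not. Check #7 is left out because the table does not define a measurable target for it.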

Block B: Classifier Resistance (#8–12)

| ID | Name | Type | Target | Description |
|----|------|------|--------|-------------|
| #8 | Conjunction Starters | Hard | ≥ 1 paragraph | At least one paragraph starts with And/But/So. |
| #9 | Fragment Usage | Soft | Contains fragments | Content includes sentence fragments. |
| #10 | Parenthetical Asides | Soft | Contains () or — | Parentheticals or em-dashes present. |
| #11 | Temperature Variance | Soft | 0.85–0.95 | Generation temperature at creation. |
| #12 | Model Attribution Defense | Soft | Varied patterns | Patterns that resist model-specific attribution. |
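Checks #8 and #10 are simple pattern tests. A minimal sketch follows, assuming paragraphs are separated by blank lines and that a parenthetical means a balanced `(...)` pair; both function names are hypothetical.

```python
import re


def has_conjunction_starter(text: str) -> bool:
    # Check #8: at least one paragraph begins with And, But, or So.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    # \b keeps words such as "Sober" or "Andrew" from matching.
    return any(re.match(r"(And|But|So)\b", p) for p in paragraphs)


def has_parenthetical(text: str) -> bool:
    # Check #10: a balanced (...) pair or an em-dash character (U+2014).
    return bool(re.search(r"\([^()]+\)", text)) or "\u2014" in text


draft = "We shipped it on Friday.\n\nBut nobody noticed (not even QA)."
starter_ok = has_conjunction_starter(draft)
aside_ok = has_parenthetical(draft)
```

The remaining checks in this block (#9, #11, #12) depend on parsing or generation settings that are not specified here, so they are not sketched.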

Block C: Linguistic (#13–18)

| ID | Name | Type | Target | Description |
|----|------|------|--------|-------------|
| #13 | Phrase Blacklist | Hard | 0 hits | Zero hits from 120+ banned AI-detectable phrases. |
| #14 | Lexical Diversity | Soft | TTR ≥ 0.50 | Vocabulary richness threshold. |
| #15 | Readability Variance | Soft | Flesch-Kincaid 20–100 | Readability score within range. |
| #16 | Syntactic Variety | Soft | stdev ≥ 4 | Sentence structure variation. |
| #17 | Emotional Authenticity | Soft | Voice-driven | Tone matches voice profile. |
| #18 | Metaphor/Analogy Presence | Soft | ≥ 1 | At least one metaphor or analogy. |

Block D: Watermark (#19–20)

| ID | Name | Type | Target | Description |
|----|------|------|--------|-------------|
| #19 | Unicode Normalization | Hard | Clean | No invisible characters or watermark artifacts. |
| #20 | Metadata Clean | Hard | None | No embedded metadata or hidden markers. |
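One way to approximate check #19 is to scan for Unicode format characters, the category that covers zero-width spaces, joiners, and byte-order marks. This is a sketch under that assumption; `find_invisible_chars` is a hypothetical name, and the actual check may use a different definition of "invisible".

```python
import unicodedata


def find_invisible_chars(text: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for every format-category (Cf) character."""
    return [
        (i, unicodedata.name(ch, f"U+{ord(ch):04X}"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


hits = find_invisible_chars("a\u200bb")
```

Some format characters are legitimate, for example the zero-width joiner inside emoji sequences, so a real check would likely need an allowlist rather than rejecting every hit.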

Block E: Scoring (#21–25)

Confidence targeting, sentence-level clean, plagiarism check, anti-humanizer resistance, language authenticity.

Block F: Bias (#26–28)

Non-native bias clear, domain patterns validated, length optimization.

Block G: Adversarial (#29–31)

Pattern diversity, translation proof, authorship consistency.

Block H: Infrastructure (#32–34)

Multi-detector validation, plain text normalization, platform compliance.

Block I: Evaluation (#35–37)

Third-party benchmark, FPR exploitation clear, AI-assisted classification.

Block J: Governance (#38–40)

Disclosure compliance, audit trail, provenance proof.

4. Platform Specs

All 18 supported platforms with character limits, truncation rules, format, and best practices.

| Platform | Max Chars | Truncation | Format | Best Length | Key Rules |
|----------|-----------|------------|--------|-------------|-----------|
| LinkedIn | 3,000 | 140 mobile | plain | 300–1200 | 3 hashtags max, line breaks only |
| X/Twitter | 280 / 25K premium | | plain | 200–280 | Thread format (1/n) |
| Reddit | 40,000 | | markdown | 400–1500 words | TL;DR for >300 words |
| Instagram | 2,200 | 125 | plain | 125–500 | 3–5 hashtags, emojis=2 chars |
| Facebook | 63,206 | 125 mobile | plain | 40–250 | Front-load message |
| TikTok | 4,000 | 100 | plain | 100–300 | Hook-first |
| YouTube | 5,000 | 200 | plain | 200–1000 | Timestamps, chapters |
| Substack | unlimited | | html/markdown | 800–3000 words | H2/H3, pull quotes |
| Email | | | html+plain | 50–300 words | subject<60, preheader<90 |
| Blog | unlimited | | markdown | 800–2500 words | H2/H3, meta<160 |
| White Paper | unlimited | | markdown | 2000–5000 words | Exec summary, citations |
| Threads | 500 | 500 | plain | 100–500 | Complete thought |
| Medium | unlimited | | markdown | 800–2500 words | 5 tags, subtitle |
| Pinterest | 500 | | plain | 100–300 | Keyword-rich, no hashtag spam |
| GBP | 1,500 | | plain | 150–300 | CTA button types |
| Website | unlimited | | html | 300–800 words | Conversion copy |
| Reply | context-matched | | context-matched | 50–200 words | Acknowledge + answer |
| Reddit Comment | 10,000 | | markdown | 50–300 words | Conversational |
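The character limits and truncation points lend themselves to a lookup table. The sketch below copies a handful of rows from the table above; the `PLATFORM_SPECS` structure and the `check_platform_fit` function are hypothetical and do not reflect how the Adapter stage is actually built.

```python
from typing import Optional

# Values copied from the platform table; None means "unlimited" or not listed.
PLATFORM_SPECS: dict[str, dict[str, Optional[int]]] = {
    "linkedin": {"max_chars": 3000, "truncation": 140},
    "instagram": {"max_chars": 2200, "truncation": 125},
    "tiktok": {"max_chars": 4000, "truncation": 100},
    "threads": {"max_chars": 500, "truncation": 500},
    "substack": {"max_chars": None, "truncation": None},
}


def check_platform_fit(platform: str, text: str) -> dict:
    spec = PLATFORM_SPECS[platform]
    max_chars, cutoff = spec["max_chars"], spec["truncation"]
    return {
        "within_limit": max_chars is None or len(text) <= max_chars,
        # What a reader sees before the cutoff, if one is listed.
        "preview": text if cutoff is None else text[:cutoff],
    }


fit = check_platform_fit("linkedin", "x" * 200)
```

Note that `len` counts Unicode code points, so this does not model the "emojis=2 chars" rule listed for Instagram.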

5. Voice Profiles

Four built-in profiles plus custom import:

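| Profile | Style | Avg Words | Stdev |
|---------|-------|-----------|-------|
| john-williams | Direct, opinionated, coaching analogies | 16 | 9 |
| agency | Professional, data-driven | 18 | 7 |
| technical | Precise, specification-heavy | 14 | 6 |
| casual | Conversational, fragment-heavy | 12 | 11 |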

Voice Import

Paste 2–5 writing samples → engine analyzes sentence length, vocabulary, structure, tone → extracts fingerprint → saves as custom profile.
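The measurable part of a fingerprint (sentence length and vocabulary) can be sketched as follows. The output keys mirror the `fingerprint` object shown in the API Reference, but `extract_fingerprint`, the naive tokenization, and the rounding are assumptions; structure and tone analysis are not modeled.

```python
import re
from statistics import mean, pstdev


def extract_fingerprint(samples: list[str]) -> dict:
    if not 2 <= len(samples) <= 5:
        raise ValueError("expected 2 to 5 writing samples")
    lengths: list[int] = []
    words: list[str] = []
    for sample in samples:
        # Naive sentence split: break after ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", sample.strip()):
            if sentence:
                lengths.append(len(sentence.split()))
        words.extend(re.findall(r"[a-z']+", sample.lower()))
    if not lengths or not words:
        raise ValueError("samples contain no text")
    return {
        "sentenceLength": {
            "mean": round(mean(lengths), 1),
            "stdev": round(pstdev(lengths), 1),
        },
        "vocabulary": {"ttr": round(len(set(words)) / len(words), 2)},
    }


fingerprint = extract_fingerprint(
    ["Ship it. Then fix what breaks.", "Plans are nice. Deadlines are real."]
)
```

Pooling sentences across all samples, rather than averaging per-sample statistics, keeps one long sample from being outweighed by several short ones; whether the engine does the same is not stated.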

6. Detection Methodology

How each detector works and how Ghost Writer defeats it:

GPTZero

Uses perplexity, burstiness, and 7 other indicators. We target perplexity >30 and burstiness stdev >5, and we inject fragments and unexpected word choices.

Pangram v3

Classifies as AI / AI-Assisted / Human with model attribution. We vary token distribution patterns and use voice-specific vocabulary.

Originality.ai v2

Advertises 99% accuracy and catches paraphrased text. We generate with human patterns from scratch rather than paraphrasing AI text.

7. API Reference

POST /api/writing-agent — Generate content

// Request
{
  "platform": "linkedin",
  "voice": "john-williams",
  "topic": "Why PMax works better with brand campaigns",
  "context": "B2B SaaS audience",
  "length": "500"
}

// Response
{
  "content": "...",
  "platform": "linkedin",
  "checks": { "passed": 38, "failed": 2, "details": [...] }
}
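A client call to this endpoint might be assembled as below, using only the Python standard library. The base URL is a placeholder because the real host is not given here, no authentication header is shown because none is documented, and both function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://example.com"  # placeholder; substitute the real host


def build_generate_request(
    platform: str, voice: str, topic: str, context: str, length: str
) -> urllib.request.Request:
    # Field names follow the request body documented above.
    payload = {
        "platform": platform,
        "voice": voice,
        "topic": topic,
        "context": context,
        "length": length,  # sent as a string, matching the documented example
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/api/writing-agent",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_counts(response_body: str) -> tuple[int, int]:
    # Pull (passed, failed) out of the documented "checks" object.
    checks = json.loads(response_body)["checks"]
    return checks["passed"], checks["failed"]


request = build_generate_request(
    "linkedin", "john-williams", "Why PMax works better with brand campaigns",
    "B2B SaaS audience", "500",
)
```

The request object is only built, not sent; passing it to `urllib.request.urlopen` would perform the call once a real host is configured.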

POST /api/writing-agent-check — Check existing text

// Request
{
  "text": "Your existing content to analyze..."
}

// Response
{
  "readability": { "fleschKincaid": 65, "grade": "8th grade", ... },
  "tone": ["confident", "direct"],
  "aiScore": 0.12,
  "suggestions": [...]
}

POST /api/writing-agent-voice — Voice profile management

// Request (analyze samples)
{
  "action": "analyze",
  "samples": ["Sample 1...", "Sample 2...", "Sample 3..."]
}

// Response
{
  "fingerprint": {
    "sentenceLength": { "mean": 16, "stdev": 9, ... },
    "vocabulary": { "ttr": 0.52, "domainTerms": [...], ... }
  }
}

8. Research & Citations