Microsoft Word is still the default for many documentation teams. It works for the first ten pages and the first two reviewers. Then version chaos creeps in, track changes pile up, and the same paragraph lives in three slightly different files. This guide shows you how to migrate from Word to a docs-as-code workflow, step by step, with concrete commands, cleanup tips, and a setup that scales.

Why Move Away From Word?

Word is a great word processor. It is not a documentation system. The pain shows up the moment a document grows past one author and one release.

  • Track-changes hell. It starts as soon as several reviewers comment on the same paragraph in parallel.
  • No real modularity. Reusing a paragraph in three manuals means three copies that drift apart.
  • No single-source pipeline. PDF, web, and print versions are produced manually and quickly disagree.
  • Vendor lock-in. .docx is a complex format that only Word and a handful of clones render reliably.
  • Format drift. Headings, bullet styles, and table widths change every time someone “just fixes a spacing issue”.

Docs-as-code fixes these by treating documentation like source code: plain text, Git, reviews, and automated builds. For the full background, see our Docs as Code pillar guide. This article focuses purely on the migration from Word.

What You’ll Need Before You Start

You don’t need a build server to get started. You need four things:

  • Pandoc for the actual conversion. On macOS: brew install pandoc. On Windows or Linux, see the Pandoc install page.
  • adoc Studio as the editor. The Community Edition is free and enough for the entire pilot.
  • Git (optional for now). You can introduce it after the first document is migrated.
  • One representative Word file. Pick a real document, not a sample. The conversion has to survive your messiest paragraphs, not a clean test case.

The Migration Flow in Five Phases

  • Prepare (Steps 1 and 2). Pick a pilot and clean the Word file.
  • Convert (Step 3). Pandoc turns .docx into AsciiDoc.
  • Refine (Steps 4 to 6). Clean up, modularize, and style in adoc Studio.
  • Export (Step 7). Generate a PDF and check it against the original.
  • Scale (Steps 8 and 9). Add Git and roll out to the rest of your library.

Phase 1: Prepare

The work you do before running Pandoc determines the quality of the result. Pick the right pilot document and clean it up so the conversion runs cleanly.

Step 1: Inventory and Pick a Pilot

Resist the urge to migrate everything in week one. Start with one document that represents the typical complexity of your library. Good pilot candidates have:

  • a clear table of contents
  • a mix of headings, bullet lists, and numbered lists
  • at least one or two tables
  • embedded images
  • cross-references to other sections

Migrating one file like that teaches you 80 percent of what you’ll need for the rest. Bonus: you have a real before/after to show stakeholders.
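
If your library is large, a quick script can help you shortlist pilot candidates. The sketch below counts headings, tables, and images in a .docx file by looking inside the package, which is a ZIP archive. The function name docx_profile is made up for this example, and the heading count assumes the English style IDs Heading1 to Heading9; localized Word installations may use different IDs.

```python
import re
import zipfile


def docx_profile(path):
    """Count structural features that make a document a useful pilot."""
    with zipfile.ZipFile(path) as docx:
        body = docx.read("word/document.xml").decode("utf-8")
        # Embedded images are stored as separate files under word/media/.
        images = [n for n in docx.namelist() if n.startswith("word/media/")]
    return {
        # Assumption: headings use the built-in English style IDs.
        "headings": len(re.findall(r'<w:pStyle w:val="Heading\d"', body)),
        # "<w:tbl>" opens a table; "<w:tblPr>" and friends do not match.
        "tables": len(re.findall(r"<w:tbl>", body)),
        "images": len(images),
    }
```

Run it over a folder of Word files and look for a document where all three counts are above zero. It is a rough filter, not a replacement for opening the file and reading it.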

Step 2: Clean the Word File Before Conversion

Every minute spent cleaning the Word source saves ten minutes after the conversion. Open the pilot file and do the following:

  • Accept all tracked changes and turn track changes off.
  • Replace direct formatting with styles. A bold paragraph that should be a heading must actually use the Heading style. Pandoc relies on Word’s styles to detect structure.
  • Simplify tables. Remove merged cells where possible. Convert decorative tables that are really layout tricks into normal paragraphs.
  • Export embedded objects. Extract embedded Excel sheets, equations, and SmartArt as PNG or SVG. Pandoc cannot reach inside those.
  • Delete dead content. Old comments, hidden text, leftover scratch notes. They will show up in AsciiDoc as noise.

You don’t need a perfect Word file. You need one that uses styles consistently and has no hidden surprises.
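
To check whether the cleanup actually worked, you can scan the .docx for leftovers before you run Pandoc. This is a minimal sketch, assuming a standard .docx package: tracked insertions and deletions appear as w:ins and w:del elements in word/document.xml, hidden text as w:vanish, and comments in word/comments.xml. The function name audit_docx is hypothetical, and the counts are a rough signal rather than an exact inventory.

```python
import re
import zipfile

# The character class after each name avoids false hits on unrelated
# elements such as <w:insideH> (table borders) or <w:delText>.
MARKERS = {
    "tracked insertions": re.compile(r"<w:ins[\s>]"),
    "tracked deletions": re.compile(r"<w:del[\s>]"),
    "hidden text runs": re.compile(r"<w:vanish[\s/>]"),
}


def audit_docx(path):
    """Return counts of leftovers that should be zero before conversion."""
    with zipfile.ZipFile(path) as docx:
        body = docx.read("word/document.xml").decode("utf-8")
        report = {label: len(rx.findall(body)) for label, rx in MARKERS.items()}
        # Comments live in their own part, which is absent if there are none.
        if "word/comments.xml" in docx.namelist():
            comments = docx.read("word/comments.xml").decode("utf-8")
            report["comments"] = len(re.findall(r"<w:comment[\s>]", comments))
        else:
            report["comments"] = 0
    return report
```

If every value in the returned dictionary is 0, the file is probably ready for Step 3. Anything else is a reason to open Word again first.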

Phase 2: Convert

Pandoc takes over now. A single command turns your .docx file into a first AsciiDoc draft together with an images folder.

Step 3: Convert With Pandoc

The actual conversion is one command. From the folder that contains your .docx file:

pandoc handbook.docx -f docx -t asciidoc --wrap=none --extract-media=./images -o handbook.adoc

What the flags do:

  • -f docx tells Pandoc the source is a Word file.
  • -t asciidoc sets the target format.
  • --wrap=none keeps paragraphs on a single line, which makes diffs in Git much cleaner.
  • --extract-media=./images pulls every embedded image out of the .docx and into an images/ folder next to the AsciiDoc file. Pandoc usually nests the files in a media/ subfolder there.
  • -o handbook.adoc is the output file.

The result is a handbook.adoc file plus an images/ folder. Don’t be surprised by some artifacts: deeply nested lists, empty paragraphs, inline style spans. We clean those up in the next step.
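
Once the single command works, you may want to wrap it so you can reuse it for other files later. The sketch below builds the same Pandoc call as above for any .docx path. The names pandoc_command and convert are made up for this example, and it assumes pandoc is on your PATH.

```python
import subprocess
from pathlib import Path


def pandoc_command(docx_path, media_dir="./images"):
    """Build the Pandoc call from Step 3 for one .docx file."""
    source = Path(docx_path)
    target = source.with_suffix(".adoc")  # handbook.docx -> handbook.adoc
    return [
        "pandoc", str(source),
        "-f", "docx",
        "-t", "asciidoc",
        "--wrap=none",
        f"--extract-media={media_dir}",
        "-o", str(target),
    ]


def convert(docx_path, media_dir="./images"):
    """Run Pandoc and raise CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(docx_path, media_dir), check=True)
```

Calling convert("handbook.docx") from the folder that contains the file reproduces the command above. If you convert several documents into the same folder, consider passing a separate media_dir per document, because extracted image file names may collide.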

Phase 3: Refine

Pandoc’s raw output is the starting point, not the destination. In adoc Studio you clean up the file, pull out reusable building blocks, and give the document a proper look.

Step 4: Open in adoc Studio

Drag handbook.adoc into a new project in adoc Studio. Activate the live preview so you see the rendered document next to the source.

Two cleanup tasks usually pay off immediately:

  • Find and replace the most common Pandoc artifacts. Empty [.underline] spans and stray +++ blocks are typical.
  • Reformat tables. AsciiDoc tables use a clean pipe syntax. Reflowing them once makes the rest of the document much easier to maintain.

You’re now in a real editor for the first time, with a structured language and a live preview. The document is no longer locked inside .docx.
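
If the same artifacts show up in every converted file, a small script can do the first pass before you open the editor. This is a sketch under assumptions: it treats an empty role span as text like [.underline]## with nothing inside, collapses runs of blank lines, and only reports lines containing +++ instead of deleting them, because some of those may be intentional passthroughs. Your Pandoc version may produce different patterns, so check the output before you trust it. The function names are hypothetical.

```python
import re

# Assumption: an empty span looks like "[.underline]##" or "[.underline]# #".
# The lookahead leaves real spans such as "[.mark]##text##" untouched.
EMPTY_SPAN = re.compile(r"\[\.[\w-]+\]#[ \t]*#(?=\s|$)", re.MULTILINE)
BLANK_RUN = re.compile(r"\n{3,}")


def clean_artifacts(text):
    """Drop empty role spans and collapse runs of blank lines."""
    text = EMPTY_SPAN.sub("", text)
    # Removing a span can leave trailing spaces behind; trim them per line.
    text = "\n".join(line.rstrip() for line in text.split("\n"))
    return BLANK_RUN.sub("\n\n", text)


def passthrough_lines(text):
    """Return 1-based numbers of lines that contain '+++' for manual review."""
    return [n for n, line in enumerate(text.split("\n"), start=1) if "+++" in line]
```

The split between an automatic fix and a review list is deliberate. Empty spans are safe to delete, while a +++ sequence can also be part of a legitimate passthrough block.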

Step 5: Restructure for Reuse

This is the step that makes docs-as-code worth the move. Take the converted file and split out anything that repeats:

  • Move the legal footer into _includes/legal-footer.adoc and include it where needed.
  • Define product names and version numbers as attributes at the top of the file:
    :product-name: Acme CRM
    :product-version: 4.2

    Then use {product-name} in the body. Renaming the product is now a one-line change.

  • Pull glossary entries, callouts, and disclaimers into their own small AsciiDoc files. Future you will thank present you.

Don’t over-engineer this on day one. Refactor what already repeats, leave the rest until you see the pattern.
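
Replacing a product name by hand across a long handbook is tedious, so here is a rough sketch of how you could automate it. It swaps literal values for attribute references and generates the matching attribute lines. The function names are made up, and the script does not know about code blocks or URLs, so review the diff before you keep the result.

```python
import re


def attribute_lines(attributes):
    """Render attribute entries such as ':product-name: Acme CRM'."""
    return "".join(f":{name}: {value}\n" for name, value in attributes.items())


def use_attributes(text, attributes):
    """Replace literal values in the body with {name} references."""
    patterns = [
        # The lookarounds prevent hits inside longer words ("Acme CRMs").
        (re.compile(rf"(?<!\w){re.escape(value)}(?!\w)"), "{" + name + "}")
        for name, value in attributes.items()
    ]
    result = []
    for line in text.split("\n"):
        # Leave existing attribute entries alone so they keep their value.
        if not line.startswith(":"):
            for pattern, reference in patterns:
                line = pattern.sub(reference, line)
        result.append(line)
    return "\n".join(result)
```

Start with distinctive values such as the product name. A short value like 4.2 can also match section numbers or measurements, so version numbers are usually safer to replace by hand.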

Step 6: Apply a CSS Style

adoc Studio ships with several built-in styles, and you can also point it at your own CSS file. Pick a style that matches your brand, tweak the typography in the live preview, and keep that CSS file in the project. The same CSS now drives both the HTML preview and the PDF export.

This is where the difference becomes visible to stakeholders. A clean PDF, generated from plain text, beats a Word document that was hand-formatted by three different writers.

Phase 4: Export

Now the comparison matters. A PDF from adoc Studio next to the original shows whether the migration passes the test.

Step 7: Export and Compare

Export the document as PDF from adoc Studio. Open it side by side with the original Word PDF and walk through this checklist:

  • Headings render at the right level.
  • Tables are intact, with no missing rows or merged cells gone wrong.
  • Cross-references actually link to the right target.
  • The table of contents reflects the real structure.
  • Images appear in the right place and the right size.

Anything that fails the check goes back to Step 4. Most issues, however, come from Word styles that weren’t applied consistently. In that case, go back to Step 2 instead: fix the source file once and rerun Pandoc.
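
For the heading and table-of-contents checks, it can help to print the outline of the AsciiDoc file and compare it with the Word table of contents. The following sketch extracts headings with their levels. The function name outline is hypothetical, and it only skips listing and literal blocks that use the four-character delimiters ---- and ....; other block types are not handled.

```python
import re

HEADING = re.compile(r"^(={1,6})[ \t]+(\S.*?)[ \t]*$")


def outline(adoc_text):
    """Return (level, title) pairs; '=' is level 0, '==' is level 1."""
    entries = []
    fence = None  # delimiter of the block we are currently inside, if any
    for line in adoc_text.split("\n"):
        stripped = line.rstrip()
        if fence is None and stripped in ("----", "...."):
            fence = stripped  # entering a listing or literal block
            continue
        if fence is not None:
            if stripped == fence:
                fence = None  # leaving the block
            continue
        match = HEADING.match(line)
        if match:
            entries.append((len(match.group(1)) - 1, match.group(2)))
    return entries
```

Printing each title indented by its level gives you a plain-text table of contents. A heading that sits one level too deep or is missing entirely usually points back to an inconsistent style in the Word source.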

Phase 5: Scale

Repeat what worked on the pilot. Git adds versioning and reviews, and a clear plan pulls the rest of the library along.

Step 8: Add Git When You’re Ready

You’ve now proven the workflow on one document. This is the point where Git starts to pay off.

  • Initialize a Git repository inside the project. adoc Studio has built-in Git support, so you don’t need a terminal.
  • Make your first commit with the converted document. That’s your baseline.
  • Create a branch for the next round of edits. Push to GitHub or GitLab when you want subject matter experts to review.
  • SMEs can comment directly in the browser, line by line, in plain text. No more “v4_FINAL_marvin_edits.docx”.

If your team isn’t ready for Git yet, that’s fine. iCloud sync between Mac, iPad, and iPhone covers most solo and small-team workflows.

Step 9: Scale to the Rest of Your Library

With the pilot done, scale in waves rather than all at once.

  • Group remaining documents by type (manuals, release notes, internal SOPs). Migrate one type at a time.
  • Capture lessons learned from the pilot in a short style guide. What attributes do you use? Where do includes live? Which CSS class belongs to which document type?
  • Build a project template. A new manual is now “copy the template, replace the attributes, start writing”.

For larger libraries, see our no-nonsense migration playbook, which adds an effort calculator and a 30-day roadmap.
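
The "copy the template, replace the attributes" step can be scripted once you have a template you like. This is a minimal sketch, assuming the template is a folder of .adoc files that define their attributes as lines such as :product-name: at the start of a line. The function name new_project and the folder layout are hypothetical.

```python
import re
import shutil
from pathlib import Path


def new_project(template_dir, target_dir, attributes):
    """Copy a template project and set attribute values in its .adoc files."""
    target = Path(target_dir)
    # copytree refuses to overwrite, so an existing project stays safe.
    shutil.copytree(template_dir, target)
    for adoc in target.rglob("*.adoc"):
        lines = adoc.read_text(encoding="utf-8").split("\n")
        for i, line in enumerate(lines):
            match = re.match(r"^:([\w-]+):", line)
            # Only rewrite attributes that the caller supplied a value for.
            if match and match.group(1) in attributes:
                name = match.group(1)
                lines[i] = f":{name}: {attributes[name]}"
        adoc.write_text("\n".join(lines), encoding="utf-8")
    return target
```

A call such as new_project("template", "crm-manual", {"product-name": "Acme CRM"}) leaves the template untouched and gives the new manual its own attribute values.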

Common Pitfalls

A few things will trip up almost every team. Watch for them early:

  • Embedded Excel tables lose their formulas. Export them as proper tables or images first.
  • Equations built with Word’s equation editor don’t survive cleanly. Re-author them in AsciiDoc with the stem:[...] macro or a [stem] block, written in LaTeX or AsciiMath.
  • Auto-numbered figures and tables in Word become static numbers in AsciiDoc. Strip those static numbers from the converted captions and let the new pipeline number them.
  • Cross-references sometimes lose their targets. Search for << and xref: after the conversion and verify each link.
  • Track changes leftovers. If you skipped Step 2, expect strikethrough text and stray comment markers in your AsciiDoc. Go back, accept changes in Word, and rerun Pandoc.
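
The cross-reference check in particular is easy to script. The sketch below lists references whose target has no explicit anchor in the same file. It is a rough filter under assumptions: it only knows anchors written as [[id]] or [#id], it skips references into other files, and it does not know the section IDs that an AsciiDoc processor generates automatically. Treat every hit as a link to verify, not as a proven error. The function name dangling_xrefs is hypothetical.

```python
import re

# Explicit anchors: [[id]], [[id,label]], [#id], [#id.role]
ANCHOR = re.compile(r"\[\[([^\],]+)(?:,[^\]]*)?\]\]|\[#([^\],.%]+)[^\]]*\]")
# References: <<id>>, <<id,label>>, xref:id[label]
XREF = re.compile(r"<<([^>,]+)(?:,[^>]*)?>>|xref:([^\[\s]+)\[")


def dangling_xrefs(adoc_text):
    """Return (line_number, target) pairs without a matching explicit anchor."""
    anchors = {a or b for a, b in ANCHOR.findall(adoc_text)}
    missing = []
    for number, line in enumerate(adoc_text.split("\n"), start=1):
        for a, b in XREF.findall(line):
            target = (a or b).strip()
            if target.startswith("#"):
                target = target[1:]
            # Skip references that point into another document.
            if "#" in target or target.endswith(".adoc"):
                continue
            if target not in anchors:
                missing.append((number, target))
    return missing
```

Run it on the converted file right after Step 3 and again after Step 5, since moving content into includes is another common way to lose a target.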

Where to Go Next

You now have a working Word-to-AsciiDoc workflow with a real document, a clean PDF, and the option to add Git when you need it.

Word is the right tool for letters and memos. For documentation that grows, ships, and gets reviewed, the move to docs-as-code pays for itself in the first migrated handbook.