Microsoft Word is still the default for many documentation teams. It works for the first ten pages and the first two reviewers. Then version chaos creeps in, track changes pile up, and the same paragraph lives in three slightly different files. This guide shows you how to migrate from Word to a docs-as-code workflow, step by step, with concrete commands, cleanup tips, and a setup that scales.
Why Move Away From Word?
Word is a great word processor. It is not a documentation system. The pain shows up the moment a document grows past one author and one release.
- Track changes hell when several reviewers comment on the same paragraph in parallel.
- No real modularity. Reusing a paragraph in three manuals means three copies that drift apart.
- No single-source pipeline. PDF, web, and print versions are produced manually and quickly disagree.
- Vendor lock-in. .docx is a complex format that only Word and a handful of clones render reliably.
- Format drift. Headings, bullet styles, and table widths change every time someone “just fixes a spacing issue”.
Docs-as-code fixes these by treating documentation like source code: plain text, Git, reviews, and automated builds. For the full background, see our Docs as Code pillar guide. This article focuses purely on the migration from Word.
What You’ll Need Before You Start
You don’t need a build server to get started. You need four things:
- Pandoc for the actual conversion. On macOS: brew install pandoc. On Windows or Linux, see the Pandoc install page.
- adoc Studio as the editor. The Community Edition is free and enough for the entire pilot.
- Git (optional for now). You can introduce it after the first document is migrated.
- One representative Word file. Pick a real document, not a sample. The conversion has to survive your messiest paragraphs, not a clean test case.
Phase 1: Prepare
The work you do before running Pandoc determines the quality of the result. Pick the right pilot document and clean it up so the conversion runs cleanly.
Step 1: Inventory and Pick a Pilot
Resist the urge to migrate everything in week one. Start with one document that represents the typical complexity of your library. Good pilot candidates have:
- a clear table of contents
- a mix of headings, bullet lists, and numbered lists
- at least one or two tables
- embedded images
- cross-references to other sections
Migrating one file like that teaches you 80 percent of what you’ll need for the rest. Bonus: you have a real before/after to show stakeholders.
Step 2: Clean the Word File Before Conversion
Every minute spent cleaning the Word source saves ten minutes after the conversion. Open the pilot file and do the following:
- Accept all tracked changes and turn track changes off.
- Replace direct formatting with styles. A bold paragraph that should be a heading must actually use the Heading style. Pandoc relies on Word’s styles to detect structure.
- Simplify tables. Remove merged cells where possible. Convert decorative tables that are really layout tricks into normal paragraphs.
- Export embedded objects. Extract embedded Excel sheets, equations, and SmartArt as PNG or SVG. Pandoc cannot reach inside those.
- Delete dead content. Old comments, hidden text, leftover scratch notes. They will show up in AsciiDoc as noise.
You don’t need a perfect Word file. You need one that uses styles consistently and has no hidden surprises.
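A quick way to confirm the cleanup worked is to look inside the .docx itself, which is a ZIP archive with the body text in word/document.xml. The sketch below counts leftover tracked insertions, tracked deletions, and comment anchors. The function name and the choice of markers are illustrative assumptions, and the check only covers the main document part, not headers or footnotes.

```python
import re
import zipfile

# WordprocessingML elements that indicate unresolved review state.
# The trailing character class keeps <w:delText> and <w:instrText> from matching.
MARKERS = {
    "insertions": rb"<w:ins[ >]",
    "deletions": rb"<w:del[ >]",
    "comment anchors": rb"<w:commentRangeStart[ />]",
}


def leftover_review_marks(docx_path):
    """Count tracked-change and comment markers in the main part of a .docx file."""
    with zipfile.ZipFile(docx_path) as archive:
        xml = archive.read("word/document.xml")
    return {name: len(re.findall(pattern, xml)) for name, pattern in MARKERS.items()}
```

Calling leftover_review_marks("handbook.docx") should return zero for every key before you move on to the conversion. Anything above zero means there are still changes to accept or comments to delete in Word.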
Phase 2: Convert
Pandoc takes over now. A single command turns your .docx file into a first AsciiDoc draft together with an images folder.
Step 3: Convert With Pandoc
The actual conversion is one command. From the folder that contains your .docx file:
  pandoc handbook.docx -f docx -t asciidoc --wrap=none --extract-media=./images -o handbook.adoc
What the flags do:
- -f docx tells Pandoc the source is a Word file.
- -t asciidoc sets the target format.
- --wrap=none keeps paragraphs on a single line, which makes diffs in Git much cleaner.
- --extract-media=./images pulls every embedded image out of the .docx and into an images/ folder next to the AsciiDoc file.
- -o handbook.adoc is the output file.
The result is a handbook.adoc file plus an images/ folder. Don’t be surprised by some artifacts: deeply nested lists, empty paragraphs, inline style spans. We clean those up in the next step.
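If you expect to rerun the conversion often, it can help to wrap the Step 3 command in a small script. The following is a minimal sketch that uses exactly the flags shown above. The function names are hypothetical, and it assumes pandoc is installed and on your PATH.

```python
import subprocess
from pathlib import Path


def pandoc_command(docx, media_dir="./images"):
    """Build the Pandoc argument list for one Word file."""
    source = Path(docx)
    target = source.with_suffix(".adoc")  # handbook.docx -> handbook.adoc
    return [
        "pandoc", str(source),
        "-f", "docx",
        "-t", "asciidoc",
        "--wrap=none",
        f"--extract-media={media_dir}",
        "-o", str(target),
    ]


def convert(docx):
    """Run Pandoc and return the path of the generated AsciiDoc file."""
    command = pandoc_command(docx)
    subprocess.run(command, check=True)  # raises CalledProcessError if Pandoc fails
    return Path(command[-1])
```

Keeping the argument list in one function means every document is converted with the same flags, which matters once you scale beyond the pilot.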
Phase 3: Refine
Pandoc’s raw output is the starting point, not the destination. In adoc Studio you clean up the file, pull out reusable building blocks, and give the document a proper look.
Step 4: Open in adoc Studio
Drag handbook.adoc into a new project in adoc Studio. Activate the live preview so you see the rendered document next to the source.
Two cleanup tasks usually pay off immediately:
- Find and replace the most common Pandoc artifacts. Empty [.underline] spans and stray +++ blocks are typical.
- Reformat tables. AsciiDoc tables use a clean pipe syntax. Reflowing them once makes the rest of the document much easier to maintain.
You’re now in a real editor for the first time, with a structured language and a live preview. The document is no longer locked inside .docx.
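For long documents, the find-and-replace pass from this step can also be scripted. The sketch below is one possible approach, not a complete cleaner. The three patterns are assumptions about what the artifacts look like (an empty role span such as [.underline]##, an empty +++ passthrough, and runs of blank lines), so compare them with your own Pandoc output before running this on a real file.

```python
import re

# Assumed artifact shapes; verify against your own converted file first.
# An empty role span: [.underline]## followed by whitespace, punctuation, or end of text.
EMPTY_ROLE_SPAN = re.compile(r"\[\.[\w-]+\]##(?=\s|$|[.,;:!?)])")
# Exactly two +++ markers with nothing but spaces or tabs between them.
EMPTY_PASSTHROUGH = re.compile(r"(?<!\+)\+{3}[ \t]*\+{3}(?!\+)")
# Three or more newlines in a row, i.e. more than one blank line.
EXTRA_BLANK_LINES = re.compile(r"\n{3,}")


def clean_pandoc_output(text):
    """Remove empty spans and passthroughs, then collapse repeated blank lines."""
    text = EMPTY_ROLE_SPAN.sub("", text)
    text = EMPTY_PASSTHROUGH.sub("", text)
    return EXTRA_BLANK_LINES.sub("\n\n", text)
```

Spans that still wrap real text, such as [.underline]##this##, are left alone on purpose. Whether to keep or drop underlining is an editorial decision better made in the live preview.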
Step 5: Restructure for Reuse
This is the step that makes docs-as-code worth the move. Take the converted file and split out anything that repeats:
- Move the legal footer into _includes/legal-footer.adoc and include it where needed.
- Define product names and version numbers as attributes at the top of the file:
  :product-name: Acme CRM
  :product-version: 4.2
  Then use {product-name} in the body. Renaming the product is now a one-line change.
- Pull glossary entries, callouts, and disclaimers into their own small AsciiDoc files. Future you will thank present you.
Don’t over-engineer this on day one. Refactor what already repeats, leave the rest until you see the pattern.
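Swapping hard-coded names for attribute references is mechanical enough to script. Here is a minimal sketch, with a hypothetical function name, that returns the attribute entries for the top of the file together with the rewritten body. It uses plain string replacement, so a value like 4.2 is replaced everywhere it appears, including places where it is not the product version. Review the diff before you commit.

```python
def to_attributes(body, attributes):
    """Return (attribute entries, body with values swapped for {name} references)."""
    entries = "".join(f":{name}: {value}\n" for name, value in attributes.items())
    for name, value in attributes.items():
        # Plain substring replacement: every occurrence of the value is rewritten.
        body = body.replace(value, "{" + name + "}")
    return entries, body


entries, body = to_attributes(
    "Acme CRM 4.2 ships today. Install Acme CRM first.\n",
    {"product-name": "Acme CRM", "product-version": "4.2"},
)
```

Here entries holds the two definition lines from the example above, and body reads "{product-name} {product-version} ships today. Install {product-name} first."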
Step 6: Apply a CSS Style
adoc Studio ships with several built-in styles, and you can also point it at your own CSS file. Pick a style that matches your brand, tweak the typography in the live preview, and keep that CSS file in the project. The same CSS now drives both the HTML preview and the PDF export.
This is where the difference becomes visible to stakeholders. A clean PDF, generated from plain text, beats a Word document that was hand-formatted by three different writers.
Phase 4: Export
Now the comparison matters. A PDF from adoc Studio next to the original shows whether the migration passes the test.
Step 7: Export and Compare
Export the document as PDF from adoc Studio. Open it side by side with the original Word PDF and walk through this checklist:
- Headings render at the right level.
- Tables are intact, with no missing rows or merged cells gone wrong.
- Cross-references actually link to the right target.
- The table of contents reflects the real structure.
- Images appear in the right place and the right size.
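Anything that fails the check goes back to Step 4 for cleanup in adoc Studio. Most issues, however, come from Word styles that weren’t applied consistently. For those, fix the Word source as in Step 2 and rerun Pandoc. Rerunning overwrites handbook.adoc, so do it before you invest in manual cleanup.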
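The heading and table-of-contents items on the checklist can be spot-checked from the source as well. The sketch below lists every section title with its level (the number of = signs) and flags titles that sit more than one level below the previous one, which is a common result of misapplied Word styles. It assumes delimited blocks use exactly four delimiter characters, and the function names are illustrative.

```python
import re

HEADING = re.compile(r"^(={1,6}) +(\S.*)$")
BLOCK_DELIMITERS = ("----", "....", "++++", "////")


def outline(adoc_text):
    """Return (level, title) pairs, ignoring lines inside verbatim or comment blocks."""
    entries = []
    fence = None  # delimiter of the block we are currently inside, if any
    for line in adoc_text.splitlines():
        stripped = line.rstrip()
        if fence is not None:
            if stripped == fence:
                fence = None
            continue
        if stripped in BLOCK_DELIMITERS:
            fence = stripped
            continue
        match = HEADING.match(stripped)
        if match:
            entries.append((len(match.group(1)), match.group(2)))
    return entries


def level_jumps(entries):
    """Titles that are more than one level deeper than the title before them."""
    return [
        title
        for (previous, _), (level, title) in zip(entries, entries[1:])
        if level > previous + 1
    ]
```

Compare the outline with the table of contents in the original Word PDF. An empty result from level_jumps does not prove the structure is right, but a non-empty one points at headings worth checking first.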
Phase 5: Scale
Repeat what worked on the pilot. Git adds versioning and reviews, and a clear plan pulls the rest of the library along.
Step 8: Add Git When You’re Ready
You’ve now proven the workflow on one document. This is the point where Git starts to pay off.
- Initialize a Git repository inside the project. adoc Studio has built-in Git support, so you don’t need a terminal.
- Make your first commit with the converted document. That’s your baseline.
- Create a branch for the next round of edits. Push to GitHub or GitLab when you want subject matter experts to review.
- SMEs can comment directly in the browser, line by line, in plain text. No more “v4_FINAL_marvin_edits.docx”.
If your team isn’t ready for Git yet, that’s fine. iCloud sync between Mac, iPad, and iPhone covers most solo and small-team workflows.
Step 9: Scale to the Rest of Your Library
With the pilot done, scale in waves rather than all at once.
- Group remaining documents by type (manuals, release notes, internal SOPs). Migrate one type at a time.
- Capture lessons learned from the pilot in a short style guide. What attributes do you use? Where do includes live? Which CSS class belongs to which document type?
- Build a project template. A new manual is now “copy the template, replace the attributes, start writing”.
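The "replace the attributes" part of a project template can be automated. As a rough sketch, assuming the template keeps its attributes as :name: value lines, the hypothetical helper below rewrites only the entries you pass in and leaves everything else untouched.

```python
import re


def fill_template(template_text, values):
    """Rewrite `:name: value` entries whose name appears in `values`."""

    def swap(match):
        name = match.group(1)
        if name in values:
            return f":{name}: {values[name]}"
        return match.group(0)  # unknown attributes stay exactly as they were

    return re.sub(r"^:([\w-]+):.*$", swap, template_text, flags=re.MULTILINE)
```

Because the body uses {product-name} style references, filling in the header is all a new manual needs before writing starts.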
For larger libraries, see our no-nonsense migration playbook, which adds an effort calculator and a 30-day roadmap.
Common Pitfalls
A few things will trip up almost every team. Watch for them early:
- Embedded Excel tables lose their formulas. Export them as proper tables or images first.
- Equations built with Word’s equation editor don’t survive cleanly. Re-author them in AsciiDoc with stem: blocks or LaTeX.
- Auto-numbered figures and tables in Word become static numbers in AsciiDoc unless you let AsciiDoc take over numbering. Strip the manual numbers and let the new pipeline number them.
- Cross-references sometimes lose their targets. Search for <<>> and xref: after the conversion and verify each link.
- Track changes leftovers. If you skipped Step 2, expect strikethrough text and stray comment markers in your AsciiDoc. Go back, accept changes in Word, and rerun Pandoc.
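The cross-reference check in particular is tedious by hand. The sketch below collects explicit anchors written as [[id]] or [#id] and reports every <<id>> or xref:id[...] reference that has no matching anchor in the same file. It is deliberately conservative: it ignores references into other files, and it does not know about section IDs that the AsciiDoc processor generates automatically, so treat its output as a list of links to verify rather than a list of confirmed errors. The function name is hypothetical.

```python
import re

# Explicit anchors: [[id]], [[id,label]], [#id], [#id.role]
ANCHOR = re.compile(r"\[\[([\w:-]+)(?:,[^\]]*)?\]\]|\[#([\w:-]+)[^\]]*\]")
# In-document references: <<id>>, <<id,text>>, xref:id[text]
REFERENCE = re.compile(r"<<#?([\w:-]+)(?:,[^>]*)?>>|xref:#?([\w:-]+)\[")


def dangling_references(adoc_text):
    """Return referenced IDs that have no explicit anchor in the same file."""
    anchors = {first or second for first, second in ANCHOR.findall(adoc_text)}
    references = {first or second for first, second in REFERENCE.findall(adoc_text)}
    return sorted(references - anchors)
```

Run it on the converted file and walk through the returned IDs in the live preview. Each one is either a lost target or a reference to an auto-generated section ID.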
Where to Go Next
You now have a working Word-to-AsciiDoc workflow with a real document, a clean PDF, and the option to add Git when you need it. From here:
- Read the Docs as Code pillar guide for the bigger picture.
- See how adoc Studio compares to Microsoft Word head to head.
- Explore the Static Site Generator and Translation Management features for the next stage.
- Download adoc Studio and run the pilot on one of your own documents.
Word is the right tool for letters and memos. For documentation that grows, ships, and gets reviewed, the move to docs-as-code pays for itself in the first migrated handbook.