Before fixing a table cell bug, invest in the tools to understand it—this commit added debug-friendly test infrastructure, modernized code with Java 17 patterns, and built a stable assertion layer for complex WordprocessingML structures, proving that infrastructure work is feature work.
Commit: 7b387f5
What the commit actually does
The best way to fix a bug is to make it impossible to hide. Before tackling a tricky table cell processing issue, this commit invested in the tools to see the problem clearly: inspectable test output, modern Java patterns for readability, and a paragraph extraction utility that turns brittle assertions into stable contracts. No algorithm changes, no user-facing fixes—just building a workshop where the real work can happen safely.
Key changes by category
| What Changed | Why It Matters |
|---|---|
| Added Lombok | Cut boilerplate → faster reading during debugging |
| Pattern matching for instanceof | Removes explicit casts, so object walkers carry less noise |
| KEEP_OUTPUT_FILE toggle | Debug tests by opening actual .docx files |
| ParagraphCollector utility | Stable assertions for nested structures |
| Stream-based text extraction | Self-documenting data flow |
1. Dependency & tooling updates
<!-- pom.xml: Added Lombok for boilerplate reduction -->
<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <version>1.18.26</version>
</dependency>
2. Code modernization using Java features
The most visible theme is pattern matching for instanceof (a preview feature in Java 14, final since Java 16) and general simplification:
// ObjectDeleter.java - Before:
if (paragraph.getParent() instanceof Tc) {
    Tc parentCell = (Tc) paragraph.getParent(); // ← Redundant cast
    deleteFromCell(parentCell, paragraph);
}
// After:
if (paragraph.getParent() instanceof Tc parentCell) {
    deleteFromCell(parentCell, paragraph);
}
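To see the pattern outside of docx4j, here is a minimal, self-contained sketch. `Cell`, `Body`, `Paragraph`, and `deleteFromParent` are hypothetical stand-ins invented for illustration; they are not the library's types or the commit's code. It only shows how each branch tests the type and binds a typed variable in one step.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for a table cell and a document body (not docx4j types).
final class Cell {
    final List<Object> content = new ArrayList<>();
}

final class Body {
    final List<Object> content = new ArrayList<>();
}

final class Paragraph {
    final Object parent;

    Paragraph(Object parent) {
        this.parent = parent;
    }
}

final class PatternMatchingSketch {
    // Removes the paragraph from whichever container owns it and reports
    // which branch matched. No explicit cast is needed in either branch.
    static String deleteFromParent(Paragraph paragraph) {
        Object parent = paragraph.parent;
        if (parent instanceof Cell cell) {
            cell.content.remove(paragraph);
            return "cell";
        } else if (parent instanceof Body body) {
            body.content.remove(paragraph);
            return "body";
        }
        return "unknown";
    }
}
```

The binding (`cell`, `body`) is only in scope where the test succeeded, so a walker with many branches cannot accidentally reuse a variable from the wrong branch.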
One predicate was replaced by its negation, which reads as more intention-revealing:
// TableCellUtil.java - Before:
public static boolean hasAtLeastOneParagraphOrTable(Tc cell);
// After:
public static boolean hasNoParagraphOrTable(Tc cell);
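A rough sketch of why the negated name can read better, under two assumptions that the excerpt does not confirm: that callers previously wrote `!hasAtLeastOneParagraphOrTable(cell)`, and that the repair for an empty cell is to add an empty paragraph. `Para`, `Table`, and `ensureValid` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for paragraph and table content (not docx4j types).
record Para(String text) {}
record Table(int rows) {}

final class CellChecks {
    // True when the cell content holds neither a paragraph nor a table.
    static boolean hasNoParagraphOrTable(List<Object> cellContent) {
        return cellContent.stream()
                .noneMatch(o -> o instanceof Para || o instanceof Table);
    }

    // The call site states the condition directly, without a leading '!'.
    // Adding an empty paragraph is an assumed repair, shown for illustration.
    static List<Object> ensureValid(List<Object> cellContent) {
        if (hasNoParagraphOrTable(cellContent)) {
            List<Object> fixed = new ArrayList<>(cellContent);
            fixed.add(new Para(""));
            return fixed;
        }
        return cellContent;
    }
}
```

Because the boolean meaning is inverted, a rename like this only stays behavior-preserving if every call site is inverted along with it.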
Stream-based collection replaced manual loops:
// ParagraphWrapper.java - Before:
private String getText(List<IndexedRun> runs) {
    StringBuilder builder = new StringBuilder();
    for (IndexedRun run : runs) {
        builder.append(RunUtil.getText(run.getRun()));
    }
    return builder.toString();
}
// After (joining is a static import of java.util.stream.Collectors.joining):
public String getText() {
    return runs.stream()
        .map(IndexedRun::getRun)
        .map(RunUtil::getText)
        .collect(joining());
}
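The equivalence of the two versions can be checked in isolation. The sketch below uses hypothetical `Run` and `IndexedRun` records in place of the docx4j-backed classes, so it illustrates the loop-to-stream rewrite only and is not the project's `ParagraphWrapper`.

```java
import static java.util.stream.Collectors.joining;

import java.util.List;

// Hypothetical stand-ins: a Run is a piece of text, an IndexedRun adds its position.
record Run(String text) {}
record IndexedRun(int index, Run run) {}

final class ParagraphTextSketch {
    private final List<IndexedRun> runs;

    ParagraphTextSketch(List<IndexedRun> runs) {
        this.runs = runs;
    }

    // Loop version: appends each run's text to a StringBuilder.
    String getTextWithLoop() {
        StringBuilder builder = new StringBuilder();
        for (IndexedRun indexed : runs) {
            builder.append(indexed.run().text());
        }
        return builder.toString();
    }

    // Stream version: unwrap the run, read its text, concatenate in encounter order.
    String getText() {
        return runs.stream()
                .map(IndexedRun::run)
                .map(Run::text)
                .collect(joining());
    }
}
```

`joining()` with no arguments concatenates without a separator, which matches what the `StringBuilder` loop does.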
3. Test infrastructure improvements
The test base class gained crucial debugging capabilities:
// AbstractDocx4jTest.java
public static final boolean KEEP_OUTPUT_FILE =
    Boolean.parseBoolean(System.getenv()
        .getOrDefault("keepOutputFile", "false"));

protected OutputStream getOutputStream() throws IOException {
    if (KEEP_OUTPUT_FILE) {
        Path temporaryFile = Files.createTempFile(/*...*/);
        logger.info("Saving DocxStamper output to temporary file %s".formatted(temporaryFile));
        // streams tracked for later inspection
    }
    // ... (remainder of the method is elided in this excerpt)
}
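Since the excerpt above elides most of the method, here is one way such a toggle could be wired up, as a sketch under stated assumptions. `OutputStreamFactory`, `fileOf`, the in-memory fallback, and the `stamper-output-` file prefix are all hypothetical; only the `keepOutputFile` environment variable name comes from the excerpt.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

final class OutputStreamFactory {
    // Remembers which file backs each stream so a test can log or reopen it later.
    private final Map<OutputStream, Path> files = new LinkedHashMap<>();
    private final boolean keepOutputFile;

    OutputStreamFactory(boolean keepOutputFile) {
        this.keepOutputFile = keepOutputFile;
    }

    // Reads the toggle from the environment, defaulting to false when unset.
    static OutputStreamFactory fromEnvironment() {
        return new OutputStreamFactory(Boolean.parseBoolean(
                System.getenv().getOrDefault("keepOutputFile", "false")));
    }

    // Returns an in-memory stream by default, or a stream backed by a
    // temporary .docx file when the toggle is on.
    OutputStream newOutputStream() {
        if (!keepOutputFile) {
            return new ByteArrayOutputStream();
        }
        try {
            Path file = Files.createTempFile("stamper-output-", ".docx");
            OutputStream out = Files.newOutputStream(file);
            files.put(out, file);
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The backing file, or null for in-memory streams.
    Path fileOf(OutputStream out) {
        return files.get(out);
    }
}
```

Keeping the default in memory means CI runs leave nothing behind, while a local run with the variable set to `true` produces files that can be opened in a word processor.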
New utility for paragraph extraction:
// ParagraphCollector.java - NEW
public class ParagraphCollector extends TraversalUtilVisitor<P> {
    private final List<P> paragraphs = new ArrayList<>();

    // Not shown in the original excerpt; assumed here so the list is actually filled.
    @Override
    public void apply(P paragraph) {
        paragraphs.add(paragraph);
    }

    public Stream<P> paragraphs() {
        return paragraphs.stream();
    }
}

// AbstractDocx4jTest.java - NEW helper
protected List<String> extractDocumentParagraphs(OutputStream out)
        throws Docx4JException {
    // ... load document, traverse with ParagraphCollector
}
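The idea behind the helper, flattening a nested document into an ordered list of paragraph texts, can be sketched without docx4j. `Node`, `TextParagraph`, `Container`, and `ParagraphTexts` below are hypothetical stand-ins; the real helper's body is elided in the excerpt, so this is an illustration of the technique rather than its implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for WordprocessingML nodes (not docx4j types).
interface Node {}
record TextParagraph(String text) implements Node {}
// A Container plays the role of a body, table, row, or cell.
record Container(List<Node> children) implements Node {}

final class ParagraphTexts {
    // Depth-first walk that returns paragraph texts in document order.
    static List<String> extract(Node root) {
        List<String> texts = new ArrayList<>();
        collect(root, texts);
        return texts;
    }

    private static void collect(Node node, List<String> texts) {
        if (node instanceof TextParagraph paragraph) {
            texts.add(paragraph.text());
        } else if (node instanceof Container container) {
            container.children().forEach(child -> collect(child, texts));
        }
    }
}
```

A test can then compare the result against a plain `List<String>`, so the assertion does not change when the nesting depth around a paragraph does.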
4. Test hygiene
- Deleted empty/stub files (ProxyMethodHandlerTest.java, ITestInterface.java)
- Moved the Name record into the test classes where it’s used (eliminated shared context class)
- Added ReplaceWordWithIntegrationTest with a corresponding .docx fixture (16KB binary)
Why Infrastructure Work Is Feature Work
For the solo maintainer:
- Every hour spent on clear test infrastructure saves 10 hours of debugging.
- Readable code is debuggable code. Pattern matching isn’t cosmetic—it’s maintenance strategy.
- When you’re the only person who can fix production issues, tools that surface problems faster are survival tools.
For enterprise adopters:
- Test utilities like ParagraphCollector mean the library can be tested in complex scenarios—which means it will be fixed when issues arise.
- Modern Java features signal active maintenance and forward compatibility.
- Debug toggles prove the maintainer tests the same way enterprise teams do: with real artifacts.
🎯 Testing Principle: Separation of Concerns
This commit follows a deliberate two-phase approach:
- Phase 1 (this commit): Build the tools to observe and assert on table behavior
- Phase 2 (next commit): Write failing tests, then fix them
Why separate? Because infrastructure changes are behavior-preserving and should pass CI immediately. Mixing them with bug fixes creates tangled history and makes bisecting regressions harder.
What’s next
The stage is set for:
- Actual assertions in ReplaceWordWithIntegrationTest that exercise comment processing in table cells
- Fixes to TableCellUtil, ObjectDeleter, or DocumentWalker that make those assertions pass
- Confidence that the test harness can detect regressions in nested table structures
References
- Commit: 7b387f5
- Issue: #67 replacewordwith doesn’t work within a table cell
- Added tests: ReplaceWordWithIntegrationTest.java (shell), ParagraphCollector.java (new utility)
- Refactored: ObjectDeleter, TableCellUtil, ParagraphWrapper, AbstractDocx4jTest