AIKIDO-2026-10359

pythainlp is vulnerable to Path Traversal

Path Traversal Pre-CVE
Found by Aikido Intel before public disclosure or CVE publication.
Published Mar 13, 2026

Score: 65 (Medium Risk)

This Affects:

Python: pythainlp
Affected versions: 3.0.7 - 5.2.0
Fixed in: 5.3.0

TL;DR

Affected versions extract corpus tar and zip archives using tarfile.TarFile.extractall() and zipfile.ZipFile.extractall() without validating member paths or symlink targets. A malicious or tampered archive (e.g. one served by a compromised corpus source or injected through a man-in-the-middle attack) can contain members whose names include path traversal sequences (e.g. ../) or symlinks that point outside the extraction directory. Extracting such an archive can write or overwrite files outside the intended data directory. The patch adds _is_within_directory(), _safe_extract_tar(), and _safe_extract_zip(), which validate all member paths and symlink targets before extraction, and uses them in the corpus download flow.
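The kind of check described above can be sketched as follows. This is a minimal illustration, not the code from the pythainlp patch: the function names is_within_directory, safe_extract_tar, and safe_extract_zip are hypothetical stand-ins, and the exact rules the patch applies may differ.

```python
import os
import tarfile
import zipfile


def is_within_directory(base_dir: str, target: str) -> bool:
    """Return True if target resolves to a location inside base_dir."""
    base = os.path.realpath(base_dir)
    resolved = os.path.realpath(target)
    return os.path.commonpath([base, resolved]) == base


def safe_extract_tar(archive_path: str, dest_dir: str) -> None:
    """Validate every member and link target, then extract (hypothetical helper)."""
    with tarfile.open(archive_path) as tar:
        for member in tar.getmembers():
            member_path = os.path.join(dest_dir, member.name)
            if not is_within_directory(dest_dir, member_path):
                raise ValueError(f"unsafe member path: {member.name}")
            if member.issym():
                # Symlink targets are relative to the directory holding the link.
                link_target = os.path.join(os.path.dirname(member_path), member.linkname)
            elif member.islnk():
                # Hard link targets are relative to the archive root.
                link_target = os.path.join(dest_dir, member.linkname)
            else:
                continue
            if not is_within_directory(dest_dir, link_target):
                raise ValueError(f"unsafe link target: {member.linkname}")
        tar.extractall(dest_dir)


def safe_extract_zip(archive_path: str, dest_dir: str) -> None:
    """Validate every member name, then extract (hypothetical helper)."""
    with zipfile.ZipFile(archive_path) as zf:
        for name in zf.namelist():
            if not is_within_directory(dest_dir, os.path.join(dest_dir, name)):
                raise ValueError(f"unsafe member path: {name}")
        zf.extractall(dest_dir)
```

Validation runs over all members before anything is written, so a single bad entry rejects the whole archive. On Python versions that support tarfile extraction filters, passing filter="data" to extractall() is a complementary safeguard.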

Who does this affect?

You are affected if you are using pythainlp version 3.0.7 through 5.2.0.
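One way to check an environment against that range is sketched below. It assumes the bounds listed in this advisory (3.0.7 through 5.2.0, inclusive) and uses a deliberately naive version parser that only looks at the first three numeric components; the helper names are illustrative and not part of pythainlp.

```python
from importlib.metadata import PackageNotFoundError, version

# Bounds taken from this advisory; both ends are treated as inclusive.
VULNERABLE_MIN = (3, 0, 7)
VULNERABLE_MAX = (5, 2, 0)


def parse_release(version_string: str) -> tuple:
    """Naively parse 'X.Y.Z...' into a 3-tuple of ints, ignoring suffixes."""
    parts = []
    for piece in version_string.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_in_vulnerable_range(version_string: str) -> bool:
    """Return True if the version falls within the advisory's affected range."""
    return VULNERABLE_MIN <= parse_release(version_string) <= VULNERABLE_MAX


def installed_pythainlp_is_affected():
    """Return True/False for the installed pythainlp, or None if not installed."""
    try:
        installed = version("pythainlp")
    except PackageNotFoundError:
        return None
    return is_in_vulnerable_range(installed)


if __name__ == "__main__":
    print(installed_pythainlp_is_affected())
```

Because the parser drops pre-release and post-release suffixes, treat a borderline result as a prompt to check the exact installed version by hand.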

Background info

pythainlp is vulnerable to Path Traversal in versions 3.0.7 - 5.2.0.

How to fix this

Upgrade the pythainlp library to the patched version, 5.3.0.