pythainlp is vulnerable to Deserialization of Untrusted Data
Score: 85 (High Risk)
The library loads model and vocabulary data from pickle files in thai2fit and w2p, as well as in corpus loading paths. Unpickling can execute arbitrary code, so loading a file whose content is untrusted or attacker-influenced (for example, a malicious corpus or model file) is unsafe. Before the fix, an attacker who could supply or influence such a pickle file could execute code in the process that loaded it. The fix removes pickle from these code paths: thai2fit now uses JSON for its vocabulary, w2p uses npz for its models, and corpus loading validates fields before processing them.
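The risk described above can be illustrated with a minimal, standard-library-only sketch. It is not pythainlp's code: the `Payload` class and the `load_vocab` helper are hypothetical names introduced for illustration, and the harmless `print` call stands in for whatever an attacker would actually run. The second half shows the general idea behind the fix, which is to read a data-only format (JSON here) and validate it before use.

```python
import json
import pickle


class Payload:
    """Hypothetical stand-in for attacker-controlled pickle content."""

    def __reduce__(self):
        # On load, pickle calls the returned callable with these arguments.
        # An attacker could name any importable callable here.
        return (print, ("code ran during pickle.loads",))


malicious_bytes = pickle.dumps(Payload())

# Merely loading the bytes triggers the call; no method is invoked by the caller.
result = pickle.loads(malicious_bytes)


def load_vocab(text: str) -> dict[str, int]:
    """Hypothetical loader: parse JSON, then validate the shape before use."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("vocabulary must be a JSON object")
    for token, index in data.items():
        # bool is a subclass of int, so reject it explicitly.
        if not isinstance(index, int) or isinstance(index, bool):
            raise ValueError(f"bad index for token {token!r}")
    return data


# JSON parsing only builds plain data (dicts, lists, strings, numbers).
vocab = load_vocab('{"สวัสดี": 0, "ครับ": 1}')
```

For array data such as model weights, NumPy's `numpy.load` refuses pickled object arrays when `allow_pickle=False` (the default in current NumPy releases), which serves the same purpose for npz files.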
You are affected if you are using pythainlp in the vulnerable version range, 0.0.1 through 5.3.0.
Upgrade the pythainlp library to a patched version, that is, a release later than 5.3.0.
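To check whether an environment falls in the affected range, a small script along the following lines may help. It is a sketch under stated assumptions: `parse_release`, `is_vulnerable`, and `installed_version` are hypothetical helpers, the comparison looks only at the leading numeric part of the version string (pre-release and post-release suffixes are ignored), and it treats every release up to and including 5.3.0 as affected, as this advisory states.

```python
import re
from importlib import metadata

# Last affected release according to the advisory.
LAST_VULNERABLE = (5, 3, 0)


def parse_release(version: str) -> tuple[int, ...]:
    """Return the leading numeric components, e.g. '5.3.0' -> (5, 3, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_vulnerable(version: str) -> bool:
    """True if the version is at or below the last affected release."""
    release = parse_release(version)
    # Pad short versions such as '5.3' so they compare as (5, 3, 0).
    release = release + (0,) * (3 - len(release))
    return release <= LAST_VULNERABLE


def installed_version(package: str = "pythainlp"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


current = installed_version()
if current is None:
    print("pythainlp is not installed in this environment")
elif is_vulnerable(current):
    print(f"pythainlp {current} is in the affected range; upgrade it")
else:
    print(f"pythainlp {current} is outside the affected range")
```

Upgrading itself is typically done with the package manager already in use, for example `pip install --upgrade pythainlp`.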