Intel

AIKIDO-2026-10636

tokenizers is vulnerable to Cross-Site Scripting (XSS)

Cross-Site Scripting (XSS) Pre-CVE
Found by Aikido Intel before public disclosure or CVE publication.
Published Apr 29, 2026

61

Medium Risk

This Affects:

PYTHONtokenizers
0.10.0 - 0.22.2
Fixed in 0.23.1
Are you affected? Scan for Free

TL;DR

The Python EncodingVisualizer in bindings/python/py_src/tokenizers/tools/visualizer.py embedded span_text directly into generated HTML without escaping. When visualization content includes attacker-controlled text, rendered output can inject unintended HTML or script into the consumer context. This creates a cross-site scripting risk in workflows that display the visualizer output in a browser or notebook frontend. The fix applies html.escape before interpolation and also closes any remaining open annotation span tags for safer, well-formed output.

Who does this affect?

You are affected if you are using a version that falls within the vulnerable range.

Background info

tokenizers is vulnerable to Cross-Site Scripting (XSS) in versions 0.10.0 - 0.22.2.

How to fix this

Upgrade the tokenizers library to the patch version.