Apache Parquet CVE-2025-30065

A critical vulnerability (CVE-2025-30065) in Apache Parquet's parquet-avro module allows arbitrary code execution through schema manipulation and crafted files. Cloudera advises upgrading to supported versions with fixes once they become available and implementing mitigations in the meantime.

Background

On April 1, 2025, a critical vulnerability in the parquet-avro module of Apache Parquet (CVE-2025-30065, CVSS score 10.0) was announced.

Cloudera has determined the list of affected products, and is issuing this TSB to provide details of remediation for affected versions.

Upgraded versions are being released for all currently affected supported releases of Cloudera products. Customers using older versions are advised to upgrade to a supported release that has the remediation, once it becomes available.

Vulnerability Details

An attacker can exploit this vulnerability only by modifying the accepted schema used for translating Parquet files and then submitting a specially crafted malicious file.

CVE-2025-30065: Schema parsing in the parquet-avro module of Apache Parquet 1.15.0 and previous versions allows bad actors to execute arbitrary code.

CVE:
NVD - CVE-2025-30065
Severity (Critical):
CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:H/SI:H/SA:H
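
The vector string above packs the CVSS v4.0 base metrics into slash-separated key:value pairs. Read left to right, it describes a network-reachable attack of low complexity that needs no privileges or user interaction and has high impact on the confidentiality, integrity, and availability of both the vulnerable system and subsequent systems. The following Python sketch is illustrative only: the function name parse_cvss_vector and the METRIC_NAMES table are introduced here for the example and are not part of any Cloudera or NVD tooling.

```python
# Illustrative sketch: split a CVSS v4.0 vector string into its base metrics.
# METRIC_NAMES and parse_cvss_vector are names made up for this example.
METRIC_NAMES = {
    "AV": "Attack Vector",
    "AC": "Attack Complexity",
    "AT": "Attack Requirements",
    "PR": "Privileges Required",
    "UI": "User Interaction",
    "VC": "Vulnerable System Confidentiality",
    "VI": "Vulnerable System Integrity",
    "VA": "Vulnerable System Availability",
    "SC": "Subsequent System Confidentiality",
    "SI": "Subsequent System Integrity",
    "SA": "Subsequent System Availability",
}


def parse_cvss_vector(vector):
    """Return (version, {metric: value}) for a CVSS vector string."""
    prefix, *parts = vector.split("/")
    if not prefix.startswith("CVSS:"):
        raise ValueError("not a CVSS vector: %r" % vector)
    version = prefix.split(":", 1)[1]
    # Each remaining part looks like "AV:N"; split it into key and value.
    metrics = dict(part.split(":", 1) for part in parts)
    return version, metrics


version, metrics = parse_cvss_vector(
    "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:H/SI:H/SA:H"
)
for key, value in metrics.items():
    print(f"{METRIC_NAMES.get(key, key)}: {value}")
```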

Releases affected

  • Cloudera Flow Management (CFM): 2.1.7.2000 (SP2) and lower when using a record-oriented 'Parquet Reader' with the following NiFi processors:

    • PutIceberg
    • PutIcebergCDC
    • ParquetReader

Impact

Schema parsing in the parquet-avro module of Apache Parquet 1.15.0 and previous versions allows bad actors to execute arbitrary code. Attackers may be able to modify unexpected objects or data that was assumed to be safe from modification. Deserialized data or code could be modified without using the provided accessor functions, or unexpected functions could be invoked.

Deserialization vulnerabilities most commonly lead to undefined behavior, such as memory modification or remote code execution.

Mitigation

If you determine that your dataflow does not use any of the processors identified above, you can remove the affected NiFi Archive (NAR file) from the CFM library directory. Cloudera has created a script that you can run on the flow.json.gz file to check and remove the NAR file easily.

To run the script (hosted on GitHub) without downloading it first:

curl -sL https://raw.githubusercontent.com/cloudera/DiM/main/CVE-2025-30065/cloudera-check-parquet.py | python3 - <path to your flow.json.gz>

Detailed instructions on what the script does, as well as its expected output, can be found in the README file.
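
If you prefer to inspect the flow yourself before running anything fetched from the network, the check can be approximated locally. The Python sketch below is not the Cloudera script and does not remove any NAR file. It only decompresses flow.json.gz and searches the text for the component names listed under "Releases affected". The function name find_affected_components is hypothetical, and the plain-text search is an assumption made to avoid depending on the internal layout of the flow file, so it can over-report (for example, when a name appears only in a comment or label).

```python
import gzip
import re
import sys

# Component names taken from the "Releases affected" list in this bulletin.
AFFECTED_COMPONENTS = ("PutIceberg", "PutIcebergCDC", "ParquetReader")


def find_affected_components(flow_path):
    """Return the affected component names that appear in a flow.json.gz file.

    Hypothetical helper: it does a whole-word text search rather than
    walking the JSON structure, so treat a match as a hint to investigate.
    """
    with gzip.open(flow_path, "rt", encoding="utf-8") as handle:
        flow_text = handle.read()
    found = []
    for name in AFFECTED_COMPONENTS:
        # Word boundaries keep "PutIceberg" from matching "PutIcebergCDC".
        if re.search(r"\b" + re.escape(name) + r"\b", flow_text):
            found.append(name)
    return found


if __name__ == "__main__" and len(sys.argv) == 2:
    matches = find_affected_components(sys.argv[1])
    if matches:
        print("Affected components referenced:", ", ".join(matches))
    else:
        print("No affected components referenced in", sys.argv[1])
```

Saved under a name of your choice, the sketch takes the path to flow.json.gz as its only argument. An empty result suggests that the dataflow does not reference the affected components, but confirm with the Cloudera script and README before removing the NAR file.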

For the latest updates on this issue, see the corresponding Knowledge article.

Addressed in release/refresh/patch

This issue has been addressed in the following Cloudera Flow Management releases: