Julien Delange Friday, October 28, 2022

How to avoid insecure hash functions in Python (CWE-328)


AUTHOR

Julien Delange, Founder and CEO

Julien is the CEO of Codiga. Before starting Codiga, Julien was a software engineer at Twitter and Amazon Web Services.

Julien has a PhD in computer science from Universite Pierre et Marie Curie in Paris, France.


What is a hash function?

A hash function is an algorithm that transforms an input (a file, a byte stream, etc.) into an output (a hash or digest) with the following properties:

  1. one cannot feasibly recover the input from the output (hash or digest)
  2. each time the hash function is invoked with the same input, it produces the same value

A hash function should also avoid collisions, which means avoiding (or at least reducing as much as possible) cases where different input values produce the same hash or digest. In other words, the hash/digest values should (mostly) be unique.
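As a small illustration of these properties, the sketch below hashes the same input twice and a slightly different input once. The helper name sha256_hex is an assumption made for this example; it only wraps the standard hashlib module.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


first = sha256_hex(b"My message")
second = sha256_hex(b"My message")
other = sha256_hex(b"My message!")

# Same input, same digest
print(first == second)  # True
# Different inputs are expected to give different digests
print(first == other)   # False
# A SHA-256 digest is 32 bytes, i.e. 64 hexadecimal characters
print(len(first))       # 64
```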

There is a long list of

What is a weak hash function?

A weak hash function is a function where:

  1. Collisions are too frequent or too easy to find, so different inputs produce the same hash
  2. The input can be guessed from the output

Examples of weak hash functions are MD4, MD5, and SHA-1.

How to use hash functions in Python?

With Python, the hashlib module is your one-stop shop for using secure hash algorithms and generating digests.

Using hashlib is simple. Here is an example of how to generate a message digest with the SHA-256 algorithm.

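import hashlib

# Create a SHA-256 hash object and feed it the message as bytes
message = hashlib.sha256()
message.update(b"My message")
# Retrieve the digest as a hexadecimal string
hex_digest = message.hexdigest()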

After this code runs, hex_digest contains the hash as a hexadecimal string.
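The update() method can be called several times, which is convenient when the data arrives in chunks (for example, when reading a large file). The short sketch below, with chunk boundaries chosen arbitrarily for illustration, checks that feeding the data in pieces gives the same digest as passing it all at once to the constructor.

```python
import hashlib

# Incremental form: feed the data in several chunks
incremental = hashlib.sha256()
for chunk in (b"My ", b"message"):
    incremental.update(chunk)

# One-shot form: pass all the data to the constructor
one_shot = hashlib.sha256(b"My message")

print(incremental.hexdigest() == one_shot.hexdigest())  # True
```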

How to check for insecure hash functions in Python?

The hashlib Python module implements some hashing algorithms that are known to be insecure. While these functions are necessary sometimes (e.g. for backward compatibility), they should not be used for any new systems.

For example, new systems should not use MD5 (or MD4 or SHA-1) when implementing a security feature. The use of such hashing algorithms should be prohibited.
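One way to enforce this inside a code base is to route hash creation through a small wrapper that refuses known-weak algorithms. The following is a minimal sketch under stated assumptions: the names WEAK_ALGORITHMS and new_secure_hash are hypothetical, and the deny-list only covers the algorithms named in this post.

```python
import hashlib

# Hypothetical deny-list for this sketch, based on the algorithms named above
WEAK_ALGORITHMS = {"md4", "md5", "sha1"}


def new_secure_hash(name: str, data: bytes = b""):
    """Create a hashlib object, refusing algorithms on the deny-list."""
    # Normalize spellings such as "SHA-1" or "MD5" before the check
    normalized = name.lower().replace("-", "")
    if normalized in WEAK_ALGORITHMS:
        raise ValueError(f"{name} is considered weak; use SHA-256 or stronger")
    return hashlib.new(name, data)


digest = new_secure_hash("sha256", b"My message").hexdigest()

try:
    new_secure_hash("MD5", b"My message")
except ValueError as error:
    print(error)  # MD5 is considered weak; use SHA-256 or stronger
```

A wrapper like this only helps if the rest of the code base actually calls it instead of hashlib directly, which is why an automated check remains useful.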

Automatically detect weak hash functions

To prevent developers from using weak cryptographic algorithms, they should get instant feedback whenever they use one.
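To give an idea of what such a check involves, here is a deliberately simplified sketch built on Python's standard ast module. It is not how the Codiga rule is implemented; the names WEAK_HASHLIB_CALLS and find_weak_hash_calls are hypothetical, and it only catches direct calls of the form hashlib.md5(...) or hashlib.sha1(...). It would miss from hashlib import md5 or hashlib.new("md5"), for instance.

```python
import ast

# Hypothetical set of hashlib functions treated as weak in this sketch
WEAK_HASHLIB_CALLS = {"md5", "sha1"}


def find_weak_hash_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, function name) for each hashlib.md5()/sha1() call."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match attribute calls whose receiver is the bare name "hashlib"
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "hashlib"
            and func.attr in WEAK_HASHLIB_CALLS
        ):
            findings.append((node.lineno, func.attr))
    return sorted(findings)


sample = '''import hashlib
a = hashlib.md5(b"data")
b = hashlib.sha256(b"data")
c = hashlib.sha1(b"data")
'''
print(find_weak_hash_calls(sample))  # [(2, 'md5'), (4, 'sha1')]
```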

With the Codiga plugin, developers automatically detect unsafe or insecure Python code in real time. It integrates with your IDE as well as your CI/CD pipelines, thanks to all our integrations. You can even try the following rule to avoid weak hash functions (such as MD5 or SHA-1).

Detect Weak Hash Functions

To use this rule consistently, all you need to do is install the integration in your IDE (VS Code or JetBrains) or code management system and add a codiga.yml file at the root of your project with the following content:

rulesets:
  - python-security

It will then check all your Python code against 100+ rules that detect unsafe and insecure code and suggest fixes for each of them.
