Computing checksums with Python and Hashlib


In cybersecurity and threat hunting, computing checksums is a common way to check data integrity and, when the result is compared against a value from a trusted source, to confirm that a file is the one you expect.

Checksums are fixed-length values derived from a file's contents, and they act like digital fingerprints: in practice, any change to the file produces a different value. Python's standard library includes a module called hashlib that implements several hash algorithms, such as MD5, SHA1, and SHA256. In this article, I will walk you through computing checksums using Python and the hashlib module.
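As a quick illustration of the fingerprint idea, the sketch below hashes two short byte strings that differ by a single character. The inputs are made up for this example; only standard hashlib calls are used.

```python
import hashlib

# Two hypothetical inputs that differ by a single character.
original = b"threat report v1"
modified = b"threat report v2"

digest_original = hashlib.sha256(original).hexdigest()
digest_modified = hashlib.sha256(modified).hexdigest()

# A SHA256 hex digest is always 64 characters, regardless of input size.
print(len(digest_original))

# Even a one-character change yields a different digest.
print(digest_original == digest_modified)  # False
```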


Step-by-Step Python Guide

Step 1: Import the Required Module

To get started, we need to import the hashlib module, which provides the necessary functions for computing checksums.

import hashlib
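If you are unsure whether a particular Python build supports all three algorithms, a small check like the one below may help. It is only a sketch: the `needed` set is an illustrative name, and `hashlib.algorithms_available` reports what the running interpreter offers.

```python
import hashlib

# Names of the algorithms used in this article (illustrative variable name).
needed = {"md5", "sha1", "sha256"}

# algorithms_available lists what this interpreter's build actually supports.
missing = needed - hashlib.algorithms_available

# An empty list means all three algorithms can be used.
print(sorted(missing))
```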

Step 2: Define the Function

Next, we define a function called compute_checksums that takes a file_path as its parameter. This function will compute the MD5, SHA1, and SHA256 checksums for the given file.

def compute_checksums(file_path):
    hash_md5 = hashlib.md5()
    hash_sha1 = hashlib.sha1()
    hash_sha256 = hashlib.sha256()

Step 3: Open the File

Now, we open the file specified by file_path in binary mode ("rb"), because the hash functions operate on bytes. The with statement ensures the file is closed properly, even if an error occurs.

    with open(file_path, "rb") as f:

Step 4: Read and Update Checksums

We read the file in chunks of 4096 bytes and update all three checksums with each chunk. Because only one chunk is held in memory at a time, this approach works well even for very large files.

        for chunk in iter(lambda: f.read(4096), b""):
            hash_md5.update(chunk)
            hash_sha1.update(chunk)
            hash_sha256.update(chunk)
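To see why chunked reading is safe, the sketch below compares a one-shot hash with an incremental one over the same bytes. The payload and the in-memory stream are assumptions made for illustration; the stream simply stands in for a file opened in "rb" mode.

```python
import hashlib
import io

# Hypothetical payload, larger than one 4096-byte chunk.
data = b"A" * 10000

# One-shot: hash the whole payload at once.
one_shot = hashlib.sha256(data).hexdigest()

# Incremental: feed the same bytes in 4096-byte chunks, as in Step 4.
incremental = hashlib.sha256()
stream = io.BytesIO(data)  # stands in for a file opened in "rb" mode
for chunk in iter(lambda: stream.read(4096), b""):
    incremental.update(chunk)

# Both approaches produce the same digest.
print(one_shot == incremental.hexdigest())  # True
```

Calling update() repeatedly is equivalent to hashing the concatenation of all the chunks, so the chunk size affects memory use and speed but not the result.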

Step 5: Return the Checksums

After processing the entire file, we return the computed checksums as a dictionary containing the MD5, SHA1, and SHA256 values.

    return {"md5": hash_md5.hexdigest(), "sha1": hash_sha1.hexdigest(), "sha256": hash_sha256.hexdigest()}

Full Code

Here’s the complete code that you can copy and use in your work:

import hashlib

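def compute_checksums(file_path):
    """Return the MD5, SHA1, and SHA256 hex digests of a file."""
    # One hash object per algorithm.
    hash_md5 = hashlib.md5()
    hash_sha1 = hashlib.sha1()
    hash_sha256 = hashlib.sha256()

    # Open in binary mode and read 4096 bytes at a time until f.read()
    # returns b"" (end of file).
    with open(file_path, "rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            hash_md5.update(chunk)
            hash_sha1.update(chunk)
            hash_sha256.update(chunk)

    # hexdigest() gives each checksum as a lowercase hexadecimal string.
    return {"md5": hash_md5.hexdigest(), "sha1": hash_sha1.hexdigest(), "sha256": hash_sha256.hexdigest()}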

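With this code, you can compute the MD5, SHA1, and SHA256 checksums of any file you specify. Comparing the results against known values, such as hashes published by a vendor or listed in a threat intelligence feed, helps confirm a file's integrity or identify a known sample. Because practical collision attacks exist for MD5 and SHA1, rely on SHA256 when a file may have been deliberately crafted.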
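A typical follow-up is to compare a file's checksum with a known value. The sketch below is one possible way to do that and is not part of the original code: sha256_of_file and matches_expected are hypothetical helper names, and a throwaway temporary file stands in for a real download or sample.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of_file(file_path, chunk_size=4096):
    """Return the SHA256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(file_path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(file_path, expected_sha256):
    """Hypothetical helper: compare a file's SHA256 with a known value."""
    actual = sha256_of_file(file_path)
    # Published hashes may be uppercase or padded with spaces, so normalize.
    return hmac.compare_digest(actual, expected_sha256.strip().lower())


# Demo with a throwaway file so the example runs anywhere.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"sample content")
    sample_path = tmp.name

expected = hashlib.sha256(b"sample content").hexdigest()
ok = matches_expected(sample_path, expected)   # hash matches
bad = matches_expected(sample_path, "0" * 64)  # hash does not match
os.remove(sample_path)

print(ok)   # True
print(bad)  # False
```

In real use, the expected value would come from a source you trust, since a checksum only proves that the file matches that reference.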

Reza Rafati https://cyberwarzone.com

Reza Rafati, based in the Netherlands, is the founder of Cyberwarzone.com. An industry professional providing insightful commentary on infosec, cybercrime, cyberwar, and threat intelligence, Reza dedicates his work to bolster digital defenses and promote cyber awareness.
