Find similar images using image hashing or perceptual hashing with OpenCV and Python

Ahin Das
May 26, 2021

by Ahin Subhra Das

What is image hashing or perceptual hashing?

Image hashing is the process of using an algorithm to assign a hash value to an image. Exact duplicate copies of an image all have the same hash value, and with a perceptual hash, visually similar images get hash values that differ in only a few bits.

For this reason, it is sometimes referred to as a ‘digital fingerprint’.

In this article we find similar images using image hashing. To do that, we need to compare two images and measure how similar they are.

The image hashing algorithms used here involve scaling the original image to an 8x8 grayscale image and then performing calculations on each of the 64 pixels. We use the imagehash library in Python to compute the hash of each image, and then calculate the Hamming distance between the hashes to find the similar ones.
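To make the "calculations on each of the 64 pixels" concrete, here is a minimal sketch of the simplest scheme of this kind, an average hash, in plain Python. It is an illustration under stated assumptions, not what imagehash.whash does internally (whash is a wavelet-based hash). The function name average_hash_bits and the sample grid are hypothetical, and the grid stands in for an image that has already been resized to 8x8 and converted to grayscale.

```python
def average_hash_bits(gray_8x8):
    """Return a 64-character bit string for an 8x8 grid of grayscale values.

    Assumption: gray_8x8 is a list of 8 rows with 8 values each (0..255),
    i.e. the image was already resized to 8x8 and converted to grayscale.
    """
    pixels = [value for row in gray_8x8 for value in row]
    if len(pixels) != 64:
        raise ValueError("expected an 8x8 grid (64 pixels)")
    mean = sum(pixels) / len(pixels)
    # One bit per pixel: 1 if the pixel is brighter than the mean, else 0.
    return "".join("1" if value > mean else "0" for value in pixels)


# Hypothetical input: left half of every row dark, right half bright.
grid = [[10, 10, 10, 10, 200, 200, 200, 200] for _ in range(8)]
bits = average_hash_bits(grid)
print(bits[:8])  # first row of the hash: 00001111
```

Two images with a similar layout of light and dark areas produce bit strings that mostly agree, which is why counting the differing bits works as a similarity measure.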

As an example, we take three shirt images (.jpg files): first a blue check shirt, then a violet shirt, and finally a light blue shirt. The blue check shirt is our test (constant) image, and the violet shirt and the light blue shirt are dataset 1 and dataset 2 respectively.

Screenshot of the three shirt images in one window.

Code:

from PIL import Image
import imagehash

# Reference image: the blue check shirt
image_one = 'D:\\ip_camp\\blueshirt2.jpg'
img = Image.open(image_one)
image_one_hash = imagehash.whash(img)
print("hash image1:", image_one_hash)

# Image to compare: the violet shirt
image_two = 'D:\\ip_camp\\violet.jpg'
img2 = Image.open(image_two)
image_two_hash = imagehash.whash(img2)
print("hash image2:", image_two_hash)

# Exact comparison of the two opened images
if img == img2:
    print("Both dresses are perfectly similar")
else:
    print("Both dresses are not perfectly similar")

# Subtracting one hash from the other gives the Hamming distance
similarity = image_one_hash - image_two_hash
print(similarity, "[a smaller hamming distance means that they are more similar…]")

Output:

As a result, we get a Hamming distance of 6 for the violet shirt and, running the same code with the light blue shirt as the second image, a Hamming distance of 4 for the light blue shirt. Hence the light blue shirt is more similar to the blue check shirt than the violet shirt is.
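Choosing the most similar image amounts to picking the candidate with the smallest distance. A minimal sketch, using only the two distances reported in this article; the dictionary and variable names are hypothetical:

```python
# Hamming distances from the blue check shirt, as reported in the article.
distances = {"violet shirt": 6, "light blue shirt": 4}

# The most similar image is the one with the smallest distance.
closest = min(distances, key=distances.get)
print(closest, distances[closest])  # light blue shirt 4
```

With more than two candidates, the same idea extends to sorting the dictionary by distance and taking the first few entries.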

Code explanation:

First we need to install two packages in our virtual environment:

pip install imagehash
pip install pillow

From these we import:

from PIL import Image
import imagehash

Then we put the path of the blue check shirt image into the image_one variable. To open the image at that path we use Image.open(image_one).

To get the hash value of the image we use imagehash.whash, and then print the hash value to the console.

We follow the same steps for the second image. Finally, to get the Hamming distance, we subtract image_two_hash from image_one_hash.

A smaller Hamming distance means that the images are more similar.

In this way we find the nearest image using image hashing or perceptual hashing.
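For intuition, the Hamming distance is the number of bit positions in which two hashes differ. The sketch below computes it for two hashes given as hexadecimal strings, which is the form imagehash uses when it prints a hash. The function name hamming_distance and the two 64-bit hex values are made up for illustration; they are not the hashes of the shirt images.

```python
def hamming_distance(hex_a, hex_b):
    """Count the differing bits between two equal-length hex hash strings."""
    if len(hex_a) != len(hex_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 bit wherever the two hashes disagree.
    differing = int(hex_a, 16) ^ int(hex_b, 16)
    return bin(differing).count("1")


# Made-up 64-bit hashes that differ only in their last hex digit.
hash_a = "ffd8c0c0e0f0f8fc"
hash_b = "ffd8c0c0e0f0f8f0"
print(hamming_distance(hash_a, hash_b))  # c ^ 0 = 0b1100, so 2 bits differ
```

Identical hashes give a distance of 0, and for a 64-bit hash the largest possible distance is 64.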
