Perceptual hashing (pHash) lets you generate a compact signature of an image that reflects its visual appearance, and calculate the "distance" between images using a few bitwise arithmetic operations.
You can ask Cloudinary to calculate and return the pHash value of an image as part of a standard upload() method call. The NodeJS example below uploads the Cloudinary sample image by passing a remote URL as the upload source, with the phash parameter set to true in the options object of the upload call. That parameter tells Cloudinary to calculate the pHash value and return it in the response for the uploaded image. To run this example, add your cloud_name, api_key and api_secret to the configuration (lines 5-7) and then click the green run button. This uploads the Cloudinary sample image to your account and then prints the returned pHash value to the console.
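For readers working in Python rather than NodeJS, the same request might look roughly like the sketch below. It assumes the official cloudinary Python package is installed and configured with your credentials, and that the sample image is reachable at the demo URL shown. The helper name upload_with_phash and its optional uploader argument are illustrative only and are not part of the SDK.

```python
# Assumed remote URL of the Cloudinary sample image.
SAMPLE_URL = "https://res.cloudinary.com/demo/image/upload/sample.jpg"


def upload_with_phash(source, uploader=None):
    """Upload `source` and return the pHash string from the response.

    `uploader` is a hypothetical injection point (useful for testing);
    by default the function falls back to the `cloudinary.uploader` module.
    """
    if uploader is None:
        # Imported lazily so this file loads even without the SDK installed.
        import cloudinary.uploader as uploader
    # phash=True asks Cloudinary to include the pHash value in the response.
    response = uploader.upload(source, phash=True)
    return response.get("phash")


# Usage (requires credentials, e.g. set through cloudinary.config(...)):
# print(upload_with_phash(SAMPLE_URL))
```

The returned value is the hex string found under the phash key of the upload response, or None if the response does not contain one.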
If you're performing unsigned uploads, you can have Cloudinary return the pHash value for the uploaded image by editing the upload preset you are using (Edit upload preset -> Media Analysis and AI tab): scroll down to the 'Perceptual hash' option, turn it ON and save the changes. From then on, every upload that uses that preset returns the pHash value in the response, as in the earlier API example.
The above covers requesting the pHash value for an image at upload time. You can also ask Cloudinary to return the pHash value for any existing image in your account by calling the explicit() method with the phash parameter set to true in the options, the same way as in the previous upload() example.
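As a rough Python sketch of that explicit() call: it assumes the cloudinary Python package is installed and configured, and that the existing image was stored with the "upload" delivery type. The helper name fetch_phash and its optional uploader argument are hypothetical and exist only for illustration.

```python
def fetch_phash(public_id, uploader=None):
    """Return the pHash of an already uploaded image via explicit().

    `uploader` is a hypothetical injection point (useful for testing);
    by default the function falls back to the `cloudinary.uploader` module.
    """
    if uploader is None:
        # Imported lazily so this file loads even without the SDK installed.
        import cloudinary.uploader as uploader
    # explicit() needs the asset's delivery type; "upload" is assumed here.
    response = uploader.explicit(public_id, type="upload", phash=True)
    return response.get("phash")


# Usage (requires credentials and an existing asset named "sample"):
# print(fetch_phash("sample"))
```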
Note: Generating the pHash value requires Cloudinary to read and re-process the image. Therefore, using the explicit() method to request the pHash value of an already uploaded image counts as 1 transformation against your quota.
Besides retrieving the pHash value of your uploaded images through Cloudinary as described above, you may sometimes need to calculate it on your side. Open-source libraries exist for that purpose, such as https://github.com/jenssegers/imagehash for PHP or https://github.com/bjlittle/imagehash for Python.
Note: The pHash value generated by the Python library will be different from the one returned by a Cloudinary upload, which includes normalization.
The below example uses the Python imagehash library from above to calculate the pHash value for two example images stored locally (as part of the replit snippet) and then compare the pHash "distance" (similarity) between them.
A similarity score closer to 0.5 indicates the least amount of similarity between the two images, while values closer to 1 show high similarity. If the pHash values are the same we have an exact match.
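To make that scale concrete, here is a minimal sketch of how such a score could be derived from two hex-encoded pHash strings. It assumes 64-bit hashes (16 hex characters) and assumes the score is computed as 1 - d/64, where d is the number of differing bits. Under that assumption, unrelated images tend to differ in roughly half of their bits, which is why scores near 0.5 mark the low end. The function names are illustrative.

```python
HASH_BITS = 64  # assumption: pHash values are 64 bits (16 hex characters)


def hamming_distance(phash_a, phash_b):
    """Count the bits that differ between two hex-encoded pHash strings."""
    return bin(int(phash_a, 16) ^ int(phash_b, 16)).count("1")


def similarity(phash_a, phash_b, bits=HASH_BITS):
    """Map the bit distance to a score where 1.0 means identical hashes."""
    return 1 - hamming_distance(phash_a, phash_b) / bits


# Two hashes that differ only in their last hex digit (4 bits out of 64).
print(similarity("ffffffffffffffff", "fffffffffffffff0"))  # 0.9375
```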
You can use the above to upload your own images using the Files menu on the top left and then update the code snippet to load the new image filenames. Then running the code snippet would calculate and compare the pHash for the custom images you upload rather than the 'sample' and 'dog' examples included by default.
Another approach to calculating the "distance" or "difference" between two images is to compare their pHash values directly. To do that, you can use an approach like the one below (demonstrated using PHP and the gmp library), where $phash1 and $phash2 hold the two hex-encoded pHash strings and $diff ends up as the number of bits that differ:
$p1 = gmp_init($phash1, 16);
$p2 = gmp_init($phash2, 16);
$diff = gmp_popcount(gmp_xor($p1, $p2));
Lastly, please also see the following Cloudinary blog post on the topic of pHash and image similarity:
https://cloudinary.com/blog/how_to_automatically_identify_similar_images_using_phash
Comments
how to do this using express(node.js)
Hi Jay,
Please take a look here at: https://github.com/pwlmaciejewski/imghash for implementation.
Regards,
Mo