Images carry a lot more information than audio or text, and image processing is a major field of research for robotics as well as search engines. In this article we will explore how to find the similarity between digital images using Python. Then we will use our program to find the top 10 search results inside a dataset of images for a given picture. It won't be as good as Google's search engine because of the technique we will be using to find similarity between images, but what we are going to make will be pretty cool. So let's start.
Setting up the Environment
The code we are going to write requires a few tools which we need to install first. I will try to be as precise as I can, and if you get stuck installing a tool you can drop a comment below and I will help you sort out the problem. Here are the tools and the steps to install them on Ubuntu (16.04, but they should work on any version). The steps are not independent, so follow them in order.
- Install virtualenv
sudo pip install virtualenv
- Create environment directory and activate virtual environment.
mkdir env
virtualenv env
cd env
source bin/activate
- Install Numpy
pip install numpy
- Install Pillow (Python image library)
pip install pillow
- Install Tkinter (For GUI)
sudo apt-get install python-tk
Our Algorithm
The approach we will be using is to find the Euclidean distances between the color histograms of images. This is a very basic approach: it searches images by their colors and not by their features. So with this approach the best search result for a zebra might be a yin-yang symbol, though that depends on the dataset. It is still a reasonable approach for a simple search. Let's see what the step-by-step process looks like:
- Create the color histogram of each image in the dataset
- Create the color histogram of the image to be searched
- Calculate the Euclidean distance between the histogram of the image to be searched and the histograms of the images in the dataset.
- Select the 10 smallest distances; those are the search results.
This might sound a little confusing because we haven't yet covered a couple of things in the algorithm, namely the color histogram and the Euclidean distance. Let us understand these things.
- What is a color histogram:
A color histogram is a table that records, for every possible value of every color component of an image, the number of pixels that have that value. Consider a simple RGB image in which each pixel has one red, one green and one blue component, so each pixel can be represented by a tuple of 3 values (red, green, blue). Each of these values ranges from 0 to 255, and choosing a value for each of the 3 components gives a unique color. Thus (0,0,0) represents black and (255,255,255) represents white. There are 256 possible values for each color component, and thus we have a total of 768 possible component values (this is not the total number of colors, only the number of color component values).
Now the color histogram for this image will be a table having 768 entries in which each entry contains the number of pixels having that component value.
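To make the 768-entry layout concrete, here is a minimal pure-Python sketch that builds such a table from a list of (red, green, blue) tuples. The function name color_histogram and the sample pixels are made up for illustration. Entries 0-255 count red values, 256-511 green values and 512-767 blue values, which is also the order in which Pillow's histogram() method concatenates the bands of an RGB image.

```python
def color_histogram(pixels):
    """Count how often each red, green and blue value (0-255) occurs."""
    hist = [0] * 768
    for red, green, blue in pixels:
        hist[red] += 1            # entries 0-255: red values
        hist[256 + green] += 1    # entries 256-511: green values
        hist[512 + blue] += 1     # entries 512-767: blue values
    return hist


# Hypothetical 2x2 image: three pure red pixels and one pixel with all components 0.
pixels = [(255, 0, 0), (255, 0, 0), (255, 0, 0), (0, 0, 0)]
hist = color_histogram(pixels)

print(len(hist))   # 768
print(hist[255])   # 3 pixels with red = 255
print(hist[0])     # 1 pixel with red = 0
print(hist[256])   # 4 pixels with green = 0
print(hist[512])   # 4 pixels with blue = 0
```

Every pixel adds one count to each of the three bands, so the entries always sum to 3 times the number of pixels.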
- What is Euclidean distance:
It is nothing special, just the ordinary distance between 2 vectors. If you know about vectors then you probably already know how the distance between 2 vectors is calculated. In case you don't, let's assume that we have 2 vectors p and q with N dimensions, so each vector has N components. The distance between these 2 vectors can be calculated as below.
d(p, q) = sqrt((p1 - q1)^2 + (p2 - q2)^2 + ... + (pN - qN)^2)
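As a quick check of the distance calculation, here is a small sketch in plain Python. The function name euclidean_distance is only an illustration and is not part of the program we build later, which does the same arithmetic with NumPy.

```python
import math


def euclidean_distance(a, b):
    """Return the straight-line distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    # Sum the squared component differences, then take the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


print(euclidean_distance([0, 0], [3, 4]))        # 5.0
print(euclidean_distance([1, 2, 3], [1, 2, 3]))  # 0.0
```

Identical vectors are at distance 0, so the more similar two histograms are, the smaller the number we get.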
How the code looks
So far we have set up our environment and learnt about the algorithm we are going to use. Now it's time to write the actual code. Create 2 files named hist.py and show_images.py. Below is the code for the hist.py file.
from PIL import Image
from numpy import *
import os
import show_images

DATASETDIR = '/path/to/your/dataset/directory/'

def perform_search(filename):
    # create an image object using the Image.open method for the given image
    im = Image.open(filename)
    # we can use the histogram method of the image object to build our histogram,
    # then convert the histogram list to a numpy array to perform calculations
    search_histo = array(im.histogram())
    # create an empty list to store distances
    dist = []
    # get all the images of the dataset directory
    files = os.listdir(DATASETDIR)
    # declare the structure of the data for the dist list;
    # it is only needed to perform sorting using numpy
    dtype = [('name', 'S100'), ('distance', float)]
    # now calculate the euclidean distance between search_histo and every image histogram
    for file in files:
        imob = Image.open(os.path.join(DATASETDIR, file))
        histo = array(imob.histogram())
        # Euclidean distance calculation
        try:
            diff = histo - search_histo
            sq = square(diff)
            total = sum(sq)
            result = sqrt(total)
            dist.append((file, result))
        except ValueError:
            pass
    # convert our list to a numpy array with the given data type
    distance = array(dist, dtype=dtype)
    # sort the array in increasing order to get the top 10 results
    sort_dist = sort(distance, order='distance')
    top10 = sort_dist[:11]
    # show the result images in a window
    show_images.show_images(top10[1:])
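The ranking step does not strictly need a structured NumPy array. As an alternative sketch (not the code above), the standard library's heapq.nsmallest can pick the closest entries from a list of (name, distance) pairs. The function name closest_matches, the file names and the distances are invented for illustration. The skip_first option rests on my assumption that top10[1:] is there to drop the searched image itself, which shows up with distance 0 when it is part of the dataset.

```python
import heapq


def closest_matches(distances, count=10, skip_first=True):
    """Return the `count` entries with the smallest distance.

    `distances` is a list of (file_name, distance) tuples. With
    skip_first=True the single best match is dropped, on the assumption
    that it is the searched image itself (distance 0).
    """
    extra = 1 if skip_first else 0
    best = heapq.nsmallest(count + extra, distances, key=lambda item: item[1])
    return best[extra:]


# Made-up distances for a tiny dataset.
sample = [("zebra.jpg", 0.0), ("beach.jpg", 812.5),
          ("yin_yang.jpg", 40.2), ("forest.jpg", 377.9)]

for name, distance in closest_matches(sample, count=2):
    print(name, distance)
# yin_yang.jpg 40.2
# forest.jpg 377.9
```

If the searched image is not in the dataset, passing skip_first=False keeps the best match in the results.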
Let's build the GUI
We have our backbone ready. Now we just need to show the images, for which we need a GUI library. You can use PyQt or Tkinter, but for this article we are going to use Tkinter. If you followed the first section then you already have Tkinter installed on your system; if not, I suggest you install it first. Let's build our GUI program. The following code is for show_images.py
from Tkinter import *
from PIL import Image, ImageTk
import os

DATASETDIR = '/path/to/your/dataset/directory/'

class MainFrame(Frame):
    def __init__(self, parent, *args, **kw):
        Frame.__init__(self, parent, *args, **kw)
        # create a canvas object and a vertical scrollbar for scrolling it
        vscrollbar = Scrollbar(self, orient=VERTICAL)
        vscrollbar.pack(fill=Y, side=RIGHT, expand=False)
        canvas = Canvas(self, bd=0, highlightthickness=0, yscrollcommand=vscrollbar.set)
        canvas.pack(side=LEFT, fill=BOTH, expand=True)
        vscrollbar.config(command=canvas.yview)
        # reset the view
        canvas.yview_moveto(0)
        # create a frame inside the canvas which will be scrolled with it
        self.interior = interior = Frame(canvas)
        interior_id = canvas.create_window(0, 0, window=interior, anchor=NW)

        # track changes to the canvas and frame width and sync them,
        # also updating the scrollbar
        def _configure_interior(event):
            # update the scrollbars to match the size of the inner frame
            size = (interior.winfo_reqwidth(), interior.winfo_reqheight())
            canvas.config(scrollregion="0 0 %s %s" % size)
            if interior.winfo_reqwidth() != canvas.winfo_width():
                # update the canvas's width to fit the inner frame
                canvas.config(width=interior.winfo_reqwidth())
        interior.bind('<Configure>', _configure_interior)

        def _configure_canvas(event):
            if interior.winfo_reqwidth() != canvas.winfo_width():
                # update the inner frame's width to fill the canvas
                canvas.itemconfigure(interior_id, width=canvas.winfo_width())
        canvas.bind('<Configure>', _configure_canvas)

def show_images(top10, root=None):
    root = Tk()
    root.title("Similar Images")
    root.imageframe = MainFrame(root)
    root.imageframe.pack(fill=BOTH, expand=True)
    images = []
    imagetks = []
    imagepanels = []
    r = 0
    size = 128, 128
    for image in top10:
        imagename = os.path.splitext(image[0])[0]
        img = Image.open(os.path.join(DATASETDIR, image[0]))
        img.thumbnail(size)
        images.append(img)
        imgtk = ImageTk.PhotoImage(images[-1])
        imagetks.append(imgtk)
        panel = Label(root.imageframe.interior, image=imagetks[-1])
        imagepanels.append(panel)
        imagepanels[-1].grid(row=r, column=0)
        r = r + 1
    root.mainloop()
I wish I could explain why the GUI code looks the way it does, but that would take us off topic. I will consider writing another article on making GUI applications with Tkinter; until then you can use the above code as it is. Make sure you have specified the dataset directory in both program files. Now that we have our main program and our GUI program, it's time to test the program.
- Run the Python interpreter using the command python
- Now import our hist module using the statement import hist
- Call hist.perform_search with the full path to the image to be searched
- You should see a nice window showing the search results.
Additional Techniques
The technique we used above is pretty simple and does not take into account other features of images such as shape, orientation and scale. So you might be tempted to use a better search technique. I know 2 other methods, which involve using the following concepts.
That's it for now. If you encounter any problem or have a question then don't hesitate to drop a comment below. Your feedback is also valuable, so tell us your thoughts about the article in the comments.
Hi. is this going to work on windows or only on ubuntu?
The code will work on windows too.
Tkinter comes with python so there is no need to explicitly install it. Also the sudo commands won't work in windows which are required for environment setup. But the code is good to go.
Hi. Sorry for disturbing. Maybe i can get an email to further the conversation. Thank you
Also, I tried running the code but i got an error message
Traceback (most recent call last):
File "C:\Users\DANIEL FAREMI\Desktop\CBIRNEW\show_images.py", line 7, in
class MainFrame(Frame):
File "C:\Users\DANIEL FAREMI\Desktop\CBIRNEW\show_images.py", line 59, in MainFrame
for image in top10:
NameError: name 'top10' is not defined