Build a Media Analysis Dashboard with Python & Cloudinary

In this tutorial, we'll build a media library and content analysis dashboard with Python & Cloudinary.

First, you'll learn how to work with the Cloudinary API to upload, store, and automatically tag images, and also how to retrieve images again and search for specific tags.

Then, we go on and build a web app using Streamlit. The app contains an image gallery that allows for filtering by tags, and also an interactive dashboard built with Plotly.

Set up and configure Cloudinary

Everything in this tutorial can be achieved with the free tier. You can sign up for free here. Next, create a new project and install the Cloudinary Python SDK and python-dotenv:

pip install cloudinary python-dotenv

Then, create a .env file with the CLOUDINARY_URL that you find in your Dashboard. The same steps are also explained in this Python quick start guide.

CLOUDINARY_URL=cloudinary://<api_key>:<api_secret>@<cloud_name>
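
Before going further, it can help to confirm that the value in .env has the expected shape. The following is a minimal sketch and not part of the Cloudinary SDK: parse_cloudinary_url is a hypothetical helper that uses only the standard library to split the URL into its three parts, and the credentials in the usage line are placeholders.

```python
from urllib.parse import urlparse


def parse_cloudinary_url(url):
    """Split a CLOUDINARY_URL into its parts (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme != "cloudinary":
        raise ValueError("CLOUDINARY_URL must start with cloudinary://")
    if not (parsed.username and parsed.password and parsed.hostname):
        raise ValueError(
            "expected cloudinary://<api_key>:<api_secret>@<cloud_name>"
        )
    return {
        "api_key": parsed.username,
        "api_secret": parsed.password,
        # urlparse lowercases the host part of the URL.
        "cloud_name": parsed.hostname,
    }


# Placeholder credentials, for illustration only.
parts = parse_cloudinary_url("cloudinary://123456:abcdef@demo")
print(parts["cloud_name"])
```

A check like this only validates the format. Whether the credentials are accepted is decided by Cloudinary when the first API call is made.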

Upload images with Cloudinary and Python

Let's learn how to work with the Cloudinary Python SDK. Create a file named cloudinary_service.py, configure Cloudinary, and define a helper function to upload an image:

from dotenv import load_dotenv
load_dotenv()

import pathlib

import cloudinary
import cloudinary.uploader
import cloudinary.api

config = cloudinary.config(secure=True)

def upload_image(filename, folder="my_photos"):
    stem = pathlib.Path(filename).stem
    res = cloudinary.uploader.upload(
        filename,
        public_id=stem,
        folder=folder,
    )
    return res

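You can now call upload_image() with an image filename and it will upload the file and store it in your Cloudinary Media Library. For video files, uploader.upload() typically also needs resource_type="video" (or "auto"), since it treats files as images by default.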
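
To see what upload_image() derives from a filename, here is a small sketch. build_upload_options is a hypothetical helper introduced only for illustration; it returns the same public_id and folder values that the function above passes to the uploader, without contacting Cloudinary.

```python
import pathlib


def build_upload_options(filename, folder="my_photos"):
    """Return the keyword arguments upload_image() passes along (illustrative)."""
    return {"public_id": pathlib.Path(filename).stem, "folder": folder}


# Made-up file name; only the stem (without directory and suffix) is used.
options = build_upload_options("holiday/IMG_0042.jpg")
print(options)
```

One consequence of using the stem is that two files such as b.JPG and b.png map to the same public_id, so it is worth keeping file names unique within the upload folder.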

Automatically tag images with Cloudinary

To enable auto-tagging, we need to enable the Cloudinary AI Content Analysis Add-on in our Dashboard. You can subscribe to the free plan.

Tip: You can also play around with other Add-ons. E.g., the Google Auto Tagging Add-on can also be used for image tagging.

Then add this function to upload and tag images:

def upload_and_tag_image(filename, folder="my_photos"):
    stem = pathlib.Path(filename).stem
    res = cloudinary.uploader.upload(
        filename,
        public_id=stem,
        folder=folder,
        detection="openimages",
        auto_tagging=0.25,
    )
    return res
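
The auto_tagging value acts as a minimum confidence score: detected labels below it are not added as tags. The sketch below illustrates that idea on made-up data. It does not reproduce the add-on's response format, tags_above_threshold is a hypothetical helper, and treating the threshold as inclusive is an assumption.

```python
def tags_above_threshold(detections, threshold=0.25):
    """Keep labels whose confidence is at least the threshold."""
    return sorted(
        label for label, score in detections.items() if score >= threshold
    )


# Made-up confidence scores, not real add-on output.
detections = {"dog": 0.91, "grass": 0.40, "ball": 0.12}
kept = tags_above_threshold(detections)
print(kept)
```

Raising the threshold gives fewer but more reliable tags; lowering it gives more tags at the cost of more false positives.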

Upload and tag all images in a folder

To upload and tag all images in a folder, we need to iterate over all files. Add the following helper function that relies on the upload_and_tag_image function. You can modify the supported_files tuple to include all the file types you want.

import os

supported_files = (".png", ".jpg", ".jpeg", ".heic")

def upload_folder(folder_name):
    n = 0
    for file in sorted(os.listdir(folder_name)):
        if pathlib.Path(file).suffix.lower() in supported_files:
            try:
                print(file)
                upload_and_tag_image(os.path.join(folder_name, file))
                n += 1
            except Exception as e:
                print("failed for ", file)
                print(e)
    print(n, " photos uploaded")
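
The file filter can be tried out without uploading anything. The following sketch uses pathlib instead of os.listdir and runs against a temporary directory with made-up file names; list_supported_files and SUPPORTED_SUFFIXES are hypothetical names introduced for this example.

```python
import pathlib
import tempfile

SUPPORTED_SUFFIXES = (".png", ".jpg", ".jpeg", ".heic")


def list_supported_files(folder_name):
    """Return sorted paths of files in folder_name with a supported suffix."""
    folder = pathlib.Path(folder_name)
    return sorted(
        path
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES
    )


# Demonstrate on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.JPG", "a.png", "notes.txt"):
        (pathlib.Path(tmp) / name).touch()
    found_names = [path.name for path in list_supported_files(tmp)]

print(found_names)
```

Because the suffix is lowercased before the comparison, b.JPG is picked up while notes.txt is skipped.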

Then, you only need to prepare a folder with all images you want to upload and call the upload_folder() function. Congrats! You can now automatically upload, store, and tag your images. Let's learn how to retrieve the images again and build a nice-looking app around them.

Search and access tags and images

Now, let's add three more helper functions that utilize Cloudinary's Search API and Admin API:

  • get_all_tags(): Returns a list of all tags.
  • search_img(): Demonstrates how to use the Search API to search for images with a specific tag.
  • get_all_images_with_tags(): Returns a list of all uploaded images including their tags. Each resource is a dictionary with url and tags keys that can be used for displaying the results.

def get_all_tags():
    all_tags = []
    # The Admin API parameter is max_results (plural).
    tags = cloudinary.api.tags(max_results=100)
    all_tags.extend(tags["tags"])
    next_cursor = tags.get("next_cursor")

    # Keep requesting pages until no cursor is returned.
    while next_cursor:
        tags = cloudinary.api.tags(max_results=100,
                                   next_cursor=next_cursor)
        all_tags.extend(tags["tags"])
        next_cursor = tags.get("next_cursor")
    return all_tags


def search_img(tag_name):
    result = (
        cloudinary.Search()
        .expression(f"resource_type:image AND tags={tag_name}")
        .sort_by("public_id", "desc")
        .execute()
    )
    return result


def get_all_images_with_tags():
    all_resources = []
    result = cloudinary.api.resources(
        type="upload",
        resource_type="image",
        prefix="my_photos",
        tags=True,
        max_results=100,
    )
    all_resources.extend(result["resources"])
    next_cursor = result.get("next_cursor")

    # Keep requesting pages until no cursor is returned.
    while next_cursor:
        result = cloudinary.api.resources(
            type="upload",
            resource_type="image",
            prefix="my_photos",
            tags=True,
            max_results=100,
            next_cursor=next_cursor,
        )
        all_resources.extend(result["resources"])
        next_cursor = result.get("next_cursor")
    return all_resources
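
Both get_all_tags() and get_all_images_with_tags() follow the same cursor-based pagination pattern. As a sketch of that pattern in isolation, the hypothetical fetch_all helper below takes any function that returns one page and keeps calling it until no next_cursor comes back. The fake_tags function and FAKE_PAGES data are stand-ins so the example runs without network access.

```python
def fetch_all(fetch_page, key):
    """Collect items from every page of a cursor-paginated endpoint."""
    items = []
    next_cursor = None
    while True:
        # Only pass the cursor once the previous page has provided one.
        kwargs = {"next_cursor": next_cursor} if next_cursor else {}
        page = fetch_page(**kwargs)
        items.extend(page[key])
        next_cursor = page.get("next_cursor")
        if not next_cursor:
            return items


# Stand-in for an API call: two fake pages linked by a cursor.
FAKE_PAGES = {
    None: {"tags": ["cat", "dog"], "next_cursor": "abc"},
    "abc": {"tags": ["tree"]},
}


def fake_tags(next_cursor=None):
    return FAKE_PAGES[next_cursor]


collected = fetch_all(fake_tags, "tags")
print(collected)
```

With the real SDK this might be called as fetch_all(lambda **kw: cloudinary.api.tags(max_results=100, **kw), "tags"), assuming the call accepts next_cursor as a keyword argument in the same way as in the functions above.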

Implement the app

For our web app, we use Streamlit and Plotly. Install all dependencies:

pip install streamlit pandas plotly

Next, create another file app.py in the same folder and add the following code. We'll wrap our helper function in another function and decorate it with st.cache_data so that the results are cached. (Older Streamlit releases used st.cache for this, which has since been deprecated.)

import streamlit as st
import plotly.express as px
import pandas as pd

from collections import Counter
from itertools import combinations

import cloudinary_service


@st.cache_data
def get_images_with_tags():
    return cloudinary_service.get_all_images_with_tags()


all_images = get_images_with_tags()

Now, build a list of the most common tags, and also the most common combinations of two tags.

For this, we can utilize itertools.combinations. We also create another list called sorted_tag_strings with strings that are used for displaying later.

all_tags = []
all_tags_lists = []
for image in all_images:
    tags = image["tags"]
    # Skip untagged images and images that show a person.
    if not tags or "person" in tags:
        continue
    all_tags.extend(tags)
    all_tags_lists.append(tags)

tag_counter = Counter(all_tags)

sorted_tags = sorted(tag_counter.items(), key=lambda x: -x[1])
sorted_tag_strings = [f"{item[0]} ({item[1]})" for item in sorted_tags]

combs = []
for tags_per_image in all_tags_lists:
    # Sort first so that (a, b) and (b, a) count as the same pair.
    for comb in combinations(sorted(tags_per_image), 2):
        combs.append(comb)

most_common_combs = Counter(combs).most_common(20)
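
To make the counting step concrete, here is a small worked example on made-up tag lists. Sorting each tag list before pairing is a choice made in this sketch: it ensures that ("cat", "dog") and ("dog", "cat") are counted as the same combination even if the tags arrive in a different order for different images.

```python
from collections import Counter
from itertools import combinations

# Made-up tag lists for three images.
sample_tag_lists = [["dog", "cat"], ["cat", "dog", "tree"], ["cat"]]

# Count how often each single tag occurs.
tag_counts = Counter(tag for tags in sample_tag_lists for tag in tags)

# Count how often each unordered pair of tags occurs on the same image.
pair_counts = Counter(
    pair
    for tags in sample_tag_lists
    for pair in combinations(sorted(tags), 2)
)

print(tag_counts.most_common(1))   # [('cat', 3)]
print(pair_counts.most_common(1))  # [(('cat', 'dog'), 2)]
```

An image with a single tag, like the third one here, contributes to the tag counts but produces no pairs.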

Now, create the functions to display the Image Gallery page. We create one helper function show_images() that displays all images in an image grid with 3 columns. Since .heic image files are not supported, we can change the format on the fly with Cloudinary just by using a different file extension in the URL.

In the image_page() function we first add an st.selectbox to select one of the possible tags or tag combinations. Then we filter all images, keeping only those that contain the selected tags. Lastly, we pass those images to show_images().

def show_images(images):
    columns = st.columns(3)
    for idx, img in enumerate(images):
        col = columns[idx % 3]
        url = img["url"]
        # Let Cloudinary convert .heic files by requesting a .jpg URL.
        if url.endswith(".heic"):
            url = url[:-5] + ".jpg"
        with col:
            st.image(url)
            st.markdown(f"[Link]({url})")


def image_page():
    options = sorted_tag_strings[:20]
    for item in most_common_combs:
        options.append(f"{item[0][0]}, {item[0][1]} ({item[1]})")

    tag = st.selectbox("Select tag", options)

    # Strip the trailing " (count)"; rfind tolerates "(" inside a tag name.
    idx = tag.rfind("(")
    tag = tag[: idx - 1]
    if "," in tag:
        # multiple tags
        tag1, tag2 = tag.split(",")
        tag1, tag2 = tag1.strip(), tag2.strip()
        images_with_tag = [
            img for img in all_images if tag1 in img["tags"] and tag2 in img["tags"]
        ]
    else:
        images_with_tag = [img for img in all_images if tag in img["tags"]]
    show_images(images_with_tag)
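
The select box labels have the form "tag (count)" or "tag1, tag2 (count)", so the tag names have to be recovered from the label before filtering. The sketch below isolates that step so it can be tried without Streamlit. parse_option and has_all_tags are hypothetical helpers, the image dictionaries are made up, and the code assumes every label ends with a count in parentheses.

```python
def parse_option(label):
    """Turn 'cat, dog (12)' into ['cat', 'dog'] (hypothetical helper)."""
    # Drop the trailing '(count)' part; rfind tolerates '(' inside a tag name.
    names = label[: label.rfind("(")].strip()
    return [name.strip() for name in names.split(",")]


def has_all_tags(image, wanted):
    """True if the image dict carries every tag in wanted."""
    return all(tag in image["tags"] for tag in wanted)


# Made-up image records with the same keys the gallery relies on.
images = [
    {"url": "a.jpg", "tags": ["cat", "dog"]},
    {"url": "b.jpg", "tags": ["cat"]},
]
wanted = parse_option("cat, dog (1)")
matches = [img["url"] for img in images if has_all_tags(img, wanted)]
print(wanted, matches)
```

Returning a list of tags means single tags and combinations go through the same filter, at the cost of assuming that tag names themselves contain no commas.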

To add the second page with the Dashboard, create another helper function stats_page() that adds three Plotly plots:

  • A pie chart with the most common tags
  • A bar chart with the top 5 tags
  • And a horizontal bar chart with the most common combinations

def stats_page(min_tags_number=20):
    # Keep only tags that occur at least min_tags_number times.
    filtered_tags = {
        k: v
        for k, v in sorted(tag_counter.items(), key=lambda x: -x[1])
        if v >= min_tags_number
    }
    labels = list(filtered_tags.keys())
    counts = list(filtered_tags.values())

    st.markdown(f"#### Tags used at least {min_tags_number} times")

    df = pd.DataFrame(list(zip(labels, counts)), columns=["Tags", "Counts"])
    fig = px.pie(df, values="Counts", names="Tags")
    fig.update_traces(textinfo="label+percent")
    fig.update_layout(width=700, height=700)
    st.plotly_chart(fig)

    st.markdown("#### Top 5 Tags")
    df = pd.DataFrame(list(zip(labels[:5], counts[:5])), columns=["Tags", "Counts"])
    fig = px.bar(df, x="Tags", y="Counts")
    fig.update_layout(width=800, height=500)
    st.plotly_chart(fig)

    st.markdown("#### Most common combinations")

    # Reverse so that the most common combination ends up at the top.
    labels = [str(x[0]) for x in most_common_combs][::-1]
    counts = [x[1] for x in most_common_combs][::-1]

    df = pd.DataFrame(list(zip(labels, counts)), columns=["Combinations", "Counts"])
    fig = px.bar(df, x="Counts", y="Combinations", orientation="h")
    fig.update_layout(width=800, height=600)
    st.plotly_chart(fig)
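
Note that min_tags_number is a minimum count, not a number of tags: a tag appears in the pie chart only if it occurs at least that many times. The following sketch shows the filter on made-up counts; tags_with_min_count is a hypothetical helper that mirrors the dictionary comprehension in stats_page().

```python
from collections import Counter


def tags_with_min_count(tag_counter, min_count):
    """Return (tag, count) pairs with count >= min_count, most frequent first."""
    return [
        (tag, count)
        for tag, count in tag_counter.most_common()
        if count >= min_count
    ]


# Made-up counts.
counts = Counter({"cat": 30, "dog": 25, "tree": 4})
selected = tags_with_min_count(counts, 20)
print(selected)
```

With a small photo collection the default of 20 may filter out every tag, so a lower value such as stats_page(min_tags_number=2) might be a better starting point.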

And as the last step, we only need to combine all the helper functions and add another st.selectbox to choose which page will be displayed:

if __name__ == "__main__":
    options = ("Image Gallery", "Image Stats")
    selection = st.sidebar.selectbox("Menu", options)

    if selection == "Image Gallery":
        st.title("Image Gallery")
        image_page()
    else:
        st.title("Image Stats")
        stats_page()

Run the app

Now you can run the app with the following command:

streamlit run app.py

In the sidebar you can then switch between two different pages:

  • The Image Gallery page is where all your images are displayed. You can filter them by tags.
  • The Image Stats page is where the tags are analyzed and different plots are shown.

And that's it! Congratulations 🥳 You should now have an app that can automatically upload and tag your images, and then analyze all tags and display the results in some nice-looking plots.

* This is a sponsored link. By clicking on it you will not have any additional costs. Instead you will support me. Thank you so much for the support! 🙏

Hope you enjoyed the read!
