Tutorial: Intelligent Search for Content Understanding

Intelligent search is one of Coactive's core product offerings: it lets you search millions of unstructured visual assets at scale. In this notebook, we demonstrate how to use Coactive's Python SDK to run intelligent search with multimodal queries (i.e., text, images, or both).

1. Import dependencies

First, we import the necessary dependencies. Note that you must first install the Coactive SDK package in your local Python environment.

# From python core and other common packages
from PIL import Image
from IPython.display import HTML

# From coactive SDK package
import coactive
from coactive.apis import SimilaritySearchApi
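
If the imports above fail, the SDK is probably missing from the active environment. The following is a minimal sketch of a check you could run first. The helper name `is_installed` is hypothetical and not part of the SDK; it relies only on the standard library.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found in the current environment."""
    return importlib.util.find_spec(package_name) is not None


# Hypothetical pre-flight check before importing the SDK
if not is_installed("coactive"):
    print("The coactive SDK package is not installed in this environment.")
```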

2. Authenticate

Next, we load the authentication and environment variables needed to call Coactive's APIs. Note that these variables are environment-specific and will be provided by Coactive.

from auth_credentials import COACTIVE_HOST, CLIENT_ID, CLIENT_SECRET
from env_variables import CLIENT_EMBEDDING_ID

access_token = f"{CLIENT_ID}:{CLIENT_SECRET}"
configuration = coactive.Configuration(
    host = COACTIVE_HOST,
    access_token = access_token,
)
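
The `auth_credentials` module imported above is not shown in this tutorial. As one possible approach, the sketch below reads the same three values from environment variables and builds the `CLIENT_ID:CLIENT_SECRET` token used above. The function names and the choice of environment variables are assumptions for illustration, not part of the Coactive SDK.

```python
import os
from typing import Dict, Mapping

# Variable names mirror the ones imported from auth_credentials above
REQUIRED_KEYS = ("COACTIVE_HOST", "CLIENT_ID", "CLIENT_SECRET")


def load_credentials(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect the required credentials, failing early if any are missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError(f"Missing credentials: {', '.join(missing)}")
    return {key: env[key] for key in REQUIRED_KEYS}


def build_access_token(client_id: str, client_secret: str) -> str:
    """Join the client ID and secret in the format used in this tutorial."""
    return f"{client_id}:{client_secret}"


# Example with placeholder values (not real credentials)
example_env = {
    "COACTIVE_HOST": "https://example.invalid",
    "CLIENT_ID": "my-client-id",
    "CLIENT_SECRET": "my-client-secret",
}
creds = load_credentials(example_env)
token = build_access_token(creds["CLIENT_ID"], creds["CLIENT_SECRET"])
```

Keeping secrets in environment variables rather than in a checked-in Python module makes it harder to commit them by accident.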

3. Intelligent search

Coactive's intelligent search lets you submit multimodal queries for semantic search through the SimilaritySearch API. More specifically:

  • Your search query can include text, images, or both
  • Coactive then semantically searches your images, videos, and any associated metadata

Below, we use these capabilities to find specific visual assets among millions of images and videos.
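
Since a query can contain text, an image, or both, there are three search modes. The sketch below shows one way to classify a query before choosing which API call to make. The `search_mode` helper and its return values are hypothetical and only illustrate the branching.

```python
from typing import Optional


def search_mode(text_query: Optional[str] = None,
                image_query: Optional[str] = None) -> str:
    """Classify a query as 'text', 'image', or 'multimodal'."""
    if text_query and image_query:
        return "multimodal"
    if text_query:
        return "text"
    if image_query:
        return "image"
    # Neither input was provided, so there is nothing to search for
    raise ValueError("Need at least one query input.")
```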

To make it simple to run multiple API requests with the SDK and to visually validate the results in this notebook, we define the following custom function.

from typing import Optional

from IPython.display import display

from helpers import get_images_df_from_api_response, get_image_from_url, image_formatter

def intelligent_search(
    text_query: Optional[str] = None,
    image_query: Optional[str] = None,
    limit: int = 5) -> None:
    '''
    Example function that displays the results of calling the SimilaritySearch API.

    text_query: text describing the assets to search for.
    image_query: path to a local image file to use as the query.
    limit: value passed to the API as the result limit.
    '''
    if not (text_query or image_query):
        raise ValueError('Need at least one query input.')

    # Display query
    if text_query:
        print(f'Text query: {text_query}')
    if image_query:
        print(f'Image query: {image_query}')
        with Image.open(image_query) as img:
            img.thumbnail(size=(300, 300))
            display(img)

    # Call the SimilaritySearch API
    with coactive.ApiClient(configuration) as api_client:
        search_api = SimilaritySearchApi(api_client)

        try:
            # For text-only search
            if text_query and not image_query:
                api_response = search_api.search_images_by_text_grouped(
                    embedding_id=CLIENT_EMBEDDING_ID,
                    query=text_query,
                    limit=limit)

            # For image-only search
            elif not text_query and image_query:
                # Open the file in a context manager so the handle is always closed
                with open(image_query, 'rb') as image_file:
                    api_response = search_api.get_similar_images_grouped_semantic(
                        embedding_id=CLIENT_EMBEDDING_ID,
                        file=image_file,
                        group_by='',
                        limit=limit)

            # For multimodal search
            elif text_query and image_query:
                with open(image_query, 'rb') as image_file:
                    api_response = search_api.get_similar_images_grouped_semantic(
                        embedding_id=CLIENT_EMBEDDING_ID,
                        file=image_file,
                        metadata_value=text_query,
                        group_by='',
                        limit=limit)

        except coactive.ApiException as e:
            # Without a response there is nothing to display, so stop here
            print(f'Exception when calling SimilaritySearchApi: {e}\n')
            return

    # Display results
    df = get_images_df_from_api_response(api_response)
    display(HTML(df.to_html(formatters={'image': image_formatter}, escape=False, index=False)))
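
The `helpers` module is not shown in this tutorial, so the following is only a guess at what a formatter such as `image_formatter` might look like. It is a hypothetical stand-in, not Coactive's implementation. Because the table is rendered with `escape=False`, the formatter is responsible for escaping the URL itself before embedding it in HTML.

```python
import html


def image_formatter(url: str, width: int = 150) -> str:
    """Render an image URL as an HTML <img> tag for display in a results table."""
    # Escape the URL so quotes or angle brackets cannot break out of the attribute
    safe_url = html.escape(url, quote=True)
    return f'<img src="{safe_url}" width="{width}">'


# Example: the string a table cell would contain for one result
cell = image_formatter("http://example.invalid/photo.jpg")
```

A function like this can be passed per column through the `formatters` argument of pandas' `DataFrame.to_html`, as the display call above does for the `image` column.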

3.1 Visual search with a text query

intelligent_search(text_query='a mannequin wearing a green dress')
Output:

Text query: a mannequin wearing a green dress
coactive_image_id url image
3b5e3c54-3142-4e03-8c27-afe23a314a88 http://farm8.staticflickr.com/3178/2707126899_9a572cd5e8_o.jpg
5d34f94b-2df3-4a99-9fd2-fc11232d75b6 http://c4.staticflickr.com/4/3917/14659686687_fe5d6e5fa2_o.jpg
c0097ba3-1156-44f0-8cb4-ddd9a8c7f28f http://c8.staticflickr.com/6/5569/14843870296_e0efdee779_o.jpg
97df220e-3346-49d6-a462-6815e8a34543 http://farm6.staticflickr.com/196/505517021_3942a09b5b_o.jpg
b1f5ac52-8ab0-4fe6-9c17-5750e12801ef http://farm4.staticflickr.com/5509/14288811531_a5291ba62a_o.jpg

3.2 Visual search with an image query

intelligent_search(image_query='query_images/sneaker.jpg')
Output:

Image query: query_images/sneaker.jpg

coactive_image_id url image
172141d9-7bc2-4b4d-8007-17a5cd3bcc6b http://farm2.staticflickr.com/8576/16720391621_f23e59b2ed_o.jpg
94967883-3147-4f91-95df-38c629838227 http://farm8.staticflickr.com/7646/16975223482_f816d138bc_o.jpg
af752100-58af-42df-b551-328cef01335c http://c4.staticflickr.com/9/8570/16547880660_4209b88f79_o.jpg
1dc99439-05d1-4399-88dd-11f2ea07d159 http://farm4.staticflickr.com/8569/16533577930_4bfea7d818_o.jpg
9999057d-e230-42c6-b709-7feeb65ea1c8 http://c2.staticflickr.com/9/8663/16719396281_5fdb05784b_o.jpg

3.3 Visual search with a multimodal query

intelligent_search(text_query='a blue sofa',
                   image_query='query_images/sofa.jpg')
Output:

Text query: a blue sofa
Image query: query_images/sofa.jpg

coactive_image_id url image
a9406d38-e0e2-486c-bef6-e99a7082d22e http://c2.staticflickr.com/5/4033/4640898099_36a8394b1b_o.jpg
169c5f07-dd48-447f-9d97-7ceddbef3d3b http://farm5.staticflickr.com/8573/16501018541_6fa6cf3444_o.jpg
f752d710-3b41-46c9-bad6-8bec820fb13a http://c4.staticflickr.com/8/7526/15678097529_bc6c639cf6_o.jpg
de6d4836-01d3-4782-8d62-2ec695f0d824 http://farm3.staticflickr.com/1366/1394988944_b3b44a83c3_o.jpg
14f399ff-5865-4e04-9d36-182ed098fca9 http://c7.staticflickr.com/4/3739/10588986074_6164f720d9_o.jpg