!pip install /dbfs/FileStore/coactive_sdk/python/coactive-0.1.22-py3-none-any.whl
Processing /dbfs/FileStore/coactive_sdk/python/coactive-0.1.22-py3-none-any.whl
Requirement already satisfied: urllib3>=1.25.3 in /databricks/python3/lib/python3.10/site-packages (from coactive==0.1.22) (1.26.11)
Requirement already satisfied: python-dateutil in /databricks/python3/lib/python3.10/site-packages (from coactive==0.1.22) (2.8.2)
Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil->coactive==0.1.22) (1.16.0)
Installing collected packages: coactive
Successfully installed coactive-0.1.22
[notice] A new release of pip available: 22.2.2 -> 23.2.1
[notice] To update, run: pip install --upgrade pip
2. Authenticate
Next, load the authentication credentials needed to call the coactive APIs using Databricks' Secrets API. Note that these values are environment-specific and will be provided by Coactive.
import coactive

# Load credentials
CLIENT_ID = dbutils.secrets.get("coactive_sdk", "client_id")
CLIENT_SECRET = dbutils.secrets.get("coactive_sdk", "client_secret")
COACTIVE_HOST = "https://app.coactive.ai"

# Set up auth
access_token = f"{CLIENT_ID}:{CLIENT_SECRET}"
configuration = coactive.Configuration(
    host=COACTIVE_HOST,
    access_token=access_token,
)
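A missing or empty secret tends to surface only later, as an opaque authentication error. As a minimal sketch, the helper below assembles the same CLIENT_ID:CLIENT_SECRET string used above and fails early if either value is absent. The name build_access_token is hypothetical and not part of the coactive SDK, and the credential values shown are placeholders.

```python
def build_access_token(client_id: str, client_secret: str) -> str:
    """Join two credentials into the "id:secret" form used above.

    Hypothetical helper for illustration; not part of the coactive SDK.
    """
    for label, value in (("client_id", client_id), ("client_secret", client_secret)):
        # Reject None, empty strings, and whitespace-only strings.
        if not value or not value.strip():
            raise ValueError(f"Missing credential: {label}")
    return f"{client_id}:{client_secret}"


# Placeholder values for illustration only. In a notebook, pass the values
# returned by dbutils.secrets.get(...) instead.
example_token = build_access_token("my-client-id", "my-client-secret")
print(example_token)
```

In the notebook, the result could be passed as access_token to coactive.Configuration in place of the inline f-string.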
3. Select a dataset
Before you can get started with coactive's core functions, such as intelligent search and real-time analytics, you will need to specify which dataset of visual data these functions will run on. More specifically, you will need the corresponding dataset_id and embedding_id for the chosen dataset.
The code below shows how to get these IDs given a dataset name.
Note: This section assumes that you have already initialized a dataset on the Coactive platform. For further details on how to do this, see our Quickstart guide.
from coactive.apis import DatasetApi
from coactive.apis import EmbeddingApi

# Target dataset name
dataset_name = "Open_Images_10K"

# Get the dataset_id
with coactive.ApiClient(configuration) as api_client:
    try:
        api_response = DatasetApi(api_client).get_datasets()
    except coactive.ApiException as e:
        print("Exception when calling DatasetApi->get_datasets: %s\n" % e)

for dataset in api_response['data']:
    if dataset["name"] == dataset_name:
        dataset_id = dataset["dataset_id"]

# Get the embedding_id
with coactive.ApiClient(configuration) as api_client:
    try:
        api_response = EmbeddingApi(api_client).get_embeddings_by_dataset_id(dataset_id)
    except coactive.ApiException as e:
        print("Exception when calling EmbeddingApi: %s\n" % e)

for embedding in api_response['data']:
    if embedding["name"] == "multimodal":
        embedding_id = embedding.embedding_id
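Note that the loops above leave dataset_id or embedding_id undefined when no name matches, which only shows up later as a NameError. One way to make that failure explicit is sketched below. The helper find_id_by_name is hypothetical, the sample records and IDs are made up, and the sketch assumes the items support ["key"] access as in the loops above. Unlike those loops, it returns the first match rather than the last.

```python
from typing import Any, Iterable, Mapping


def find_id_by_name(items: Iterable[Mapping[str, Any]], name: str, id_key: str) -> Any:
    """Return item[id_key] for the first item whose "name" equals `name`.

    Hypothetical helper for illustration; not part of the coactive SDK.
    """
    seen_names = []
    for item in items:
        if item["name"] == name:
            return item[id_key]
        seen_names.append(item["name"])
    # No match: report which names were actually available.
    raise LookupError(f"No entry named {name!r}; available names: {seen_names}")


# Stand-in for api_response["data"]; these IDs are invented.
sample_datasets = [
    {"name": "Other_Dataset", "dataset_id": "id-1"},
    {"name": "Open_Images_10K", "dataset_id": "id-2"},
]
dataset_id_example = find_id_by_name(sample_datasets, "Open_Images_10K", "dataset_id")
print(dataset_id_example)
```

The same helper could be reused for the embedding lookup by passing "multimodal" and "embedding_id".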
4. Search your images
You can use coactive's intelligent search to quickly find the visual assets you are looking for. coactive's Search API allows you to semantically search through millions of images, videos, and any associated metadata using text and/or images. An example of intelligent search on images using a text query is shown below.
import pandas as pd
from coactive.apis import SimilaritySearchApi

text_query = 'a modern bathroom'
limit = 5  # Only show the top 5 results

with coactive.ApiClient(configuration) as api_client:
    try:
        search_response = SimilaritySearchApi(api_client).search_images_by_text_grouped(
            embedding_id=embedding_id,
            query=text_query,
            limit=limit)
    except coactive.ApiException as e:
        print("Exception when calling SimilaritySearchApi: %s\n" % e)

search_results = [
    {
        "coactive_image_id": result["items"][0]["coactive_image_id"],
        "url": result["items"][0]["preview_images"]["full"]["url"],
    }
    for result in search_response["data"]
]
search_df = pd.DataFrame(search_results)
display(search_df)
coactive_image_id | url |
---|---|
0e3b4fb6-1537-473b-9abd-146e15cef1e9 | https://app.coactive.ai/assets/v0/images/coactive/Open_Images_10K_1692743735659/0e3b4fb6-1537-473b-9abd-146e15cef1e9?size=full |
88b6b7d0-4160-4a11-b236-a47d4aec912c | https://app.coactive.ai/assets/v0/images/coactive/Open_Images_10K_1692743735659/88b6b7d0-4160-4a11-b236-a47d4aec912c?size=full |
22bc48fe-75c5-4b48-a8b4-2dc93e7b9874 | https://app.coactive.ai/assets/v0/images/coactive/Open_Images_10K_1692743735659/22bc48fe-75c5-4b48-a8b4-2dc93e7b9874?size=full |
84a63f76-38af-4664-b87b-0bd9723c779a | https://app.coactive.ai/assets/v0/images/coactive/Open_Images_10K_1692743735659/84a63f76-38af-4664-b87b-0bd9723c779a?size=full |
77e06da0-5168-4aab-85a2-4b0f1c829118 | https://app.coactive.ai/assets/v0/images/coactive/Open_Images_10K_1692743735659/77e06da0-5168-4aab-85a2-4b0f1c829118?size=full |
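The list comprehension above takes the first item of every result group and would raise an IndexError if a group had no items. Whether the API can return an empty group is not stated here, so the following is only a defensive sketch. The function first_item_rows is a hypothetical name, and the sample response uses invented IDs and placeholder URLs that mirror the nesting accessed above.

```python
from typing import Any, Dict, List, Mapping


def first_item_rows(search_data: List[Mapping[str, Any]]) -> List[Dict[str, str]]:
    """Build one row per result group from its first item, skipping empty groups.

    Hypothetical helper for illustration; not part of the coactive SDK.
    """
    rows = []
    for group in search_data:
        items = group["items"]
        if not items:
            # Nothing to show for this group.
            continue
        first = items[0]
        rows.append({
            "coactive_image_id": first["coactive_image_id"],
            "url": first["preview_images"]["full"]["url"],
        })
    return rows


# Stand-in for search_response["data"]; IDs and URLs are placeholders.
sample_search_data = [
    {"items": [{"coactive_image_id": "img-1",
                "preview_images": {"full": {"url": "https://example.com/img-1"}}}]},
    {"items": []},
    {"items": [{"coactive_image_id": "img-2",
                "preview_images": {"full": {"url": "https://example.com/img-2"}}}]},
]
sample_rows = first_item_rows(sample_search_data)
print(sample_rows)
```

The returned list of dictionaries can be passed to pd.DataFrame in the same way as search_results.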
5. Run real-time queries over your visual data
coactive's real-time analytics engine allows you to query your unstructured visual data using SQL by providing a structured view of your visual data (i.e., rows = visual assets, columns = metadata).
Visual dynamic tags combined with the standard capabilities of SQL can easily answer analytical questions spanning visual and structured data. No heavy lifting or deep technical expertise is required!
Here is an example of how coactive's Query API can help answer an analytical question using standard SQL syntax:
Note: This section assumes that you have already created a Concept on the Coactive platform. For further details on how to do this, see our Quickstart guide.
from coactive.apis import QueryApi
from coactive.model.query_request import QueryRequest

# Question: Which 10 images are most likely to show a modern bathroom?
sql_query = '''
SELECT *
FROM coactive_table_adv
ORDER BY modern_bathroom_prob DESC
LIMIT 10
'''

# Run the query
with coactive.ApiClient(configuration) as api_client:
    try:
        api_response = QueryApi(api_client).execute_query(
            QueryRequest(query=sql_query, embedding_id=embedding_id))
    except coactive.ApiException as e:
        print("Exception when calling QueryApi: %s\n" % e)

query_id = api_response.query_id
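Because execute_query returns a query_id rather than the results, the query may still be running when you first ask for them. A generic polling loop is one way to wait. The sketch below is not part of the coactive SDK: wait_until is a hypothetical helper, and the "status" field and its values in the demo are invented stand-ins, so check the actual response of get_query_by_id for the real completion indicator.

```python
import time
from typing import Any, Callable


def wait_until(fetch: Callable[[], Any], is_done: Callable[[Any], bool],
               timeout_s: float = 300.0, interval_s: float = 2.0) -> Any:
    """Call fetch() repeatedly until is_done(result) is true or time runs out.

    Hypothetical helper for illustration; not part of the coactive SDK.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        if is_done(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Condition not met within {timeout_s} seconds")
        time.sleep(interval_s)


# Demo with fake responses; the "status" key and its values are assumptions.
fake_responses = iter([{"status": "running"}, {"status": "running"}, {"status": "done"}])
final_response = wait_until(
    fetch=lambda: next(fake_responses),
    is_done=lambda response: response["status"] == "done",
    timeout_s=5.0,
    interval_s=0.01,
)
print(final_response)
```

In the notebook, fetch could wrap the QueryApi(api_client).get_query_by_id(query_id) call, with is_done testing whatever completion field that response actually exposes.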
Once the query has finished running, you can view the results as a pandas.DataFrame, as shown in the code below:
with coactive.ApiClient(configuration) as api_client:
    try:
        query_response = QueryApi(api_client).get_query_by_id(query_id)
    except coactive.ApiException as e:
        print("Exception when calling QueryApi: %s\n" % e)

query_df = pd.DataFrame([row['data'] for row in query_response.to_dict()['results']['data']])
display(query_df)
coactive_image_id | path | format | created_dt | coactive_shot_id | coactive_video_id | keyframe_time_ms | keyframe_index | modern_bathroom_prob | dessert_prob | modern_bathroom | dessert |
---|---|---|---|---|---|---|---|---|---|---|---|
25148aee-3732-482e-aaae-097919450031 | s3://coactive-demo-datasets/RandomOpenImages10k/https___c3.staticflickr.com_9_8483_8195419476_37f38ab4e2_o.jpg | .jpg | 2023-08-22T23:01:20.443+0000 | null | null | null | null | 0.6387478797 | 0.152396264 | 0.0 | 0.0 |
97e353ba-70ec-4163-9f9e-b35ac1b45b97 | s3://coactive-demo-datasets/RandomOpenImages10k/https___c2.staticflickr.com_1_299_19027943784_ecf012215b_o.jpg | .jpg | 2023-08-22T23:01:21.991+0000 | null | null | null | null | 0.5896980228 | 0.1179050834 | 0.0 | 0.0 |
0e3b4fb6-1537-473b-9abd-146e15cef1e9 | s3://coactive-demo-datasets/RandomOpenImages10k/https___c1.staticflickr.com_5_4033_4693399784_c7b7d34604_o.jpg | .jpg | 2023-08-22T23:01:19.790+0000 | null | null | null | null | 0.5159772334 | 0.1409304643 | 0.0 | 0.0 |
ec3a9d32-ef88-4b6e-bb4e-f986746cde91 | s3://coactive-demo-datasets/RandomOpenImages10k/https___c1.staticflickr.com_5_4133_5121820686_1f3652589d_o.jpg | .jpg | 2023-08-22T23:01:21.013+0000 | null | null | null | null | 0.4802939965 | 0.1286086056 | 0.0 | 0.0 |
b5a11226-17c3-43c2-b723-9dd13fd498ed | s3://coactive-demo-datasets/RandomOpenImages10k/https___c3.staticflickr.com_1_301_19134500775_be7bcd5f0b_o.jpg | .jpg | 2023-08-22T23:01:19.758+0000 | null | null | null | null | 0.4785106668 | 0.1199321677 | 0.0 | 0.0 |
e6cf1576-3a9c-4919-8ccb-d5081ace633c | s3://coactive-demo-datasets/RandomOpenImages10k/https___c3.staticflickr.com_4_3094_3212572852_0eb206bd95_o.jpg | .jpg | 2023-08-22T23:01:21.602+0000 | null | null | null | null | 0.4403114618 | 0.1092961075 | 0.0 | 0.0 |
b95be019-7720-4f21-8bd2-c9becf766f10 | s3://coactive-demo-datasets/RandomOpenImages10k/https___c2.staticflickr.com_2_1070_1392744065_4751a445b7_o.jpg | .jpg | 2023-08-22T23:01:21.228+0000 | null | null | null | null | 0.3905662469 | 0.0730388757 | 0.0 | 0.0 |
57d998d1-a593-4acb-af62-4333caeb5803 | s3://coactive-demo-datasets/RandomOpenImages10k/https___c2.staticflickr.com_1_179_380488357_331083f746_o.jpg | .jpg | 2023-08-22T23:01:20.179+0000 | null | null | null | null | 0.3630601063 | 0.161972273 | 0.0 | 0.0 |
88b6b7d0-4160-4a11-b236-a47d4aec912c | s3://coactive-demo-datasets/RandomOpenImages10k/https___c5.staticflickr.com_1_190_446668008_5e66ad0ed6_o.jpg | .jpg | 2023-08-22T23:01:19.368+0000 | null | null | null | null | 0.3363838938 | 0.2882942516 | 0.0 | 0.0 |
d303aafc-181d-42b5-b1bf-4c075aecce9e | s3://coactive-demo-datasets/RandomOpenImages10k/https___c4.staticflickr.com_6_5302_5571454601_e494196ea1_o.jpg | .jpg | 2023-08-22T23:01:20.629+0000 | null | null | null | null | 0.3015868355 | 0.1850784195 | 0.0 | 0.0 |
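The result table exposes one probability column per concept (modern_bathroom_prob and dessert_prob here), so the same top-N query can be generated for any concept. The sketch below assumes that the <concept>_prob naming seen in this table holds for other concepts, which is inferred from the output above rather than documented. The function top_k_concept_query is a hypothetical helper, and it validates names before interpolating them into SQL.

```python
import re


def top_k_concept_query(concept: str, k: int = 10, table: str = "coactive_table_adv") -> str:
    """Build a SQL query ranking rows by the `<concept>_prob` column.

    Hypothetical helper; the `<concept>_prob` naming is an assumption
    inferred from the example output.
    """
    identifier = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
    for value in (concept, table):
        # Only allow plain identifiers so nothing unexpected reaches the SQL.
        if not identifier.fullmatch(value):
            raise ValueError(f"Not a plain SQL identifier: {value!r}")
    if not isinstance(k, int) or k < 1:
        raise ValueError("k must be a positive integer")
    return f"SELECT *\nFROM {table}\nORDER BY {concept}_prob DESC\nLIMIT {k}"


dessert_query = top_k_concept_query("dessert", k=5)
print(dessert_query)
```

The returned string can be passed as the query argument of QueryRequest, in the same way as sql_query.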
6. Export results to Databricks
Because the results above are pandas DataFrames, any of them can be exported as tables to your Databricks Lakehouse environment.
The example code below shows how to save these results as managed tables.
spark.createDataFrame(search_df).write.mode("overwrite").saveAsTable("demo_search_results")
spark.createDataFrame(query_df).write.mode("overwrite").saveAsTable("demo_query_results")
display(spark.sql("SHOW TABLES"))
database | tableName | isTemporary |
---|---|---|
default | demo_query_results | false |
default | demo_search_results | false |