Tutorial: Visual Content Moderation for Trust and Safety
As a core product offering, coactive allows users to easily perform real-time monitoring at scale across millions of unstructured visual assets. In this notebook we'll demonstrate how you can use coactive's Python SDK to perform real-time classification and analytics for content moderation.
1. Import dependencies
First, we import the necessary dependencies. Note that you will first need to install the coactive SDK package in your local Python environment.
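Before importing, it can help to confirm that the package is visible to the interpreter. The sketch below uses only the standard library. The module name `coactive` is an assumption based on the SDK's name, so substitute whatever name your installation instructions specify.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the import system can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# "coactive" is an assumed module name, not confirmed by this tutorial.
if not is_installed("coactive"):
    print("coactive SDK not found; install it into this environment first")
```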
2. Authenticate
Next, we load the authentication and environment variables necessary for calling coactive's APIs. Note that these variables are environment-specific and will be provided by Coactive.
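One way to load such variables is to read them from the process environment and fail early if any are missing. This is a minimal sketch: the variable names `COACTIVE_API_KEY` and `COACTIVE_HOST` are hypothetical placeholders, and the actual names are whatever Coactive provides for your environment.

```python
import os
from typing import Mapping


def load_settings(required: list[str], env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect the required variables, failing early with a clear message."""
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in required}


# Hypothetical names and values, passed as an explicit mapping for illustration.
settings = load_settings(
    ["COACTIVE_API_KEY", "COACTIVE_HOST"],
    env={"COACTIVE_API_KEY": "example-key", "COACTIVE_HOST": "https://example.invalid"},
)
```

In a real notebook you would omit the `env` argument so the values come from `os.environ`.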
3. Real-time analytics
coactive's real-time analytics engine allows you to perform classification using our Classification API. Additionally, you can query your unstructured visual data with SQL through our Query API. More specifically, coactive:
- provides a structured view of your visual data (i.e. rows = visual asset, columns = metadata) giving you the ability to run analytical SQL queries
- can generate metadata on demand for your visual assets using a library of visual concepts
- gives you full control over these visual concepts used in classification and queries, allowing you to seamlessly create and update these concepts as your tasks or visual data change over time
Below, we highlight the core functions of our real-time analytics engine to monitor a dataset of user-generated visual assets for problematic content.
We define custom functions to make it simple to run multiple API requests using the SDK and to visually validate the results in this notebook. These custom functions can be found below.
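The original helper functions are not reproduced in this text, so the following is only a sketch of what a batching helper could look like. `run_batch` and `fake_classify` are hypothetical names. `fake_classify` stands in for an SDK request and does not reflect the SDK's real interface.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_batch(call: Callable[[T], R], items: Iterable[T], max_workers: int = 4) -> list[R]:
    """Apply `call` to every item concurrently, returning results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(call, items))


def fake_classify(path: str) -> dict:
    # Stand-in for an SDK request; a real call would go over the network.
    return {"path": path, "tags": []}


results = run_batch(fake_classify, ["a.jpg", "b.jpg", "c.jpg"])
```

Threads suit this kind of helper because each request spends most of its time waiting on the network rather than computing.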
3.1 Classify visual assets
In this example, we perform real-time classification of 5 images using an existing library of visual concepts for content moderation.
Below, we show the classification probability for the visual concepts that were flagged in at least one image (i.e. 'fight', 'handgun' and 'syringe') along with the classification tags.
image | fight | syringe | handgun | classification |
---|---|---|---|---|
(image) | 0.000372 | 0.000253 | 0.000031 | [] |
(image) | 0.000033 | 0.002870 | 0.997537 | [handgun] |
(image) | 0.000581 | 0.000090 | 0.001383 | [] |
(image) | 0.855864 | 0.000447 | 0.000023 | [fight] |
(image) | 0.000043 | 0.999982 | 0.169217 | [syringe] |
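The relationship between the probabilities and the classification tags can be illustrated with a simple threshold. The cutoff of 0.5 is an assumption (the tutorial does not state how tags are derived), but it reproduces the tags shown in the table.

```python
THRESHOLD = 0.5  # assumed cutoff, not stated in the tutorial

# Probabilities copied from the table above, in row order.
rows = [
    {"fight": 0.000372, "syringe": 0.000253, "handgun": 0.000031},
    {"fight": 0.000033, "syringe": 0.002870, "handgun": 0.997537},
    {"fight": 0.000581, "syringe": 0.000090, "handgun": 0.001383},
    {"fight": 0.855864, "syringe": 0.000447, "handgun": 0.000023},
    {"fight": 0.000043, "syringe": 0.999982, "handgun": 0.169217},
]


def tags_for(probabilities: dict[str, float], threshold: float = THRESHOLD) -> list[str]:
    """Return the concepts whose probability meets the threshold."""
    return [name for name, p in probabilities.items() if p >= threshold]


for row in rows:
    print(tags_for(row))
```

Note that the last image scores 0.169217 for handgun yet is tagged only with syringe, which is consistent with a cutoff somewhere above that value.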
3.2 Run SQL queries over your visual data
In this example, we run a few queries to demonstrate how you can perform real-time analytical queries over your visual data to gain valuable insights.
3.2.1 Find additional examples
In this first example, we find the IDs and paths of 10 images that contain people fighting, using the fight concept shown in the previous example.
Query run time (hrs:min:sec) = 0:00:04.226605
index | coactive_image_id | path |
---|---|---|
0 | a44e7d09-459f-46f7-8034-027342cecaaa | s3://coactive-demo-datasets/trust_and_safety_1... |
1 | a01ec94e-1e93-416b-94d7-ed6bd0a041e2 | s3://coactive-demo-datasets/trust_and_safety_1... |
2 | f51cf45c-d0d0-4c36-bdfc-24a07b30f8f6 | s3://coactive-demo-datasets/trust_and_safety_1... |
3 | a1923c4f-2015-49c9-ba7d-2abbfdbbda94 | s3://coactive-demo-datasets/trust_and_safety_1... |
4 | 860bd2ce-7254-4a75-b1be-57d45dfca16b | s3://coactive-demo-datasets/trust_and_safety_1... |
5 | 3ae5251e-21a7-4d57-8d89-ed617d387603 | s3://coactive-demo-datasets/trust_and_safety_1... |
6 | a8dc01d6-aa5a-493f-97e1-bcd10a3a95fb | s3://coactive-demo-datasets/trust_and_safety_1... |
7 | 37066042-e862-46af-b492-830e86a5c40d | s3://coactive-demo-datasets/trust_and_safety_1... |
8 | e89facaf-d026-4e7f-ab38-04bc50b6cea9 | s3://coactive-demo-datasets/trust_and_safety_1... |
9 | c12424aa-5dc6-436f-a55d-793234188cf8 | s3://coactive-demo-datasets/trust_and_safety_1... |
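The query for this subsection is not reproduced in this text. The sketch below shows the general shape such a query might take, run against an in-memory SQLite table as a stand-in for the Query API. The table name `images`, the `fight` probability column, the 0.5 cutoff, and the rows themselves are all assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (coactive_image_id TEXT, path TEXT, fight REAL)")
conn.executemany(
    "INSERT INTO images VALUES (?, ?, ?)",
    [
        ("id-1", "s3://example-bucket/1.jpg", 0.91),
        ("id-2", "s3://example-bucket/2.jpg", 0.02),
        ("id-3", "s3://example-bucket/3.jpg", 0.77),
    ],
)

# Select up to 10 assets whose fight probability clears the assumed cutoff.
query = """
    SELECT coactive_image_id, path
    FROM images
    WHERE fight >= 0.5
    ORDER BY fight DESC
    LIMIT 10
"""
matches = conn.execute(query).fetchall()
conn.close()
```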
3.2.2 Count the number of occurrences
In this second example, we perform an aggregate count to find the number of images that contain a handgun, rifle or shotgun. We define this custom count as weapon_count.
Query run time (hrs:min:sec) = 0:00:06.685277
index | weapon_count |
---|---|
0 | 181.0 |
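An aggregate of this kind can be sketched in plain SQL. As before, SQLite stands in for the Query API, and the table name, column names, 0.5 cutoff, and rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (handgun REAL, rifle REAL, shotgun REAL)")
conn.executemany(
    "INSERT INTO images VALUES (?, ?, ?)",
    [
        (0.98, 0.01, 0.02),  # handgun only
        (0.03, 0.90, 0.85),  # rifle and shotgun: still counted once
        (0.01, 0.02, 0.01),  # no weapon
    ],
)

# Count each image at most once, however many weapon concepts it matches.
query = """
    SELECT COUNT(*) AS weapon_count
    FROM images
    WHERE handgun >= 0.5 OR rifle >= 0.5 OR shotgun >= 0.5
"""
(weapon_count,) = conn.execute(query).fetchone()
conn.close()
```

Combining the conditions with OR in a single WHERE clause avoids double-counting images that match more than one concept, which summing three separate counts would not.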
3.2.3 Define custom classification metrics
In this third example, we show how you can find the top 20 images that maximize a custom classification metric (i.e. problematic_score) based on the classification probability of a subset of concepts that are particularly violent (e.g. blood), sexual (e.g. brassiere) and drug-related (e.g. syringe).
Query run time (hrs:min:sec) = 0:00:08.380099
index | coactive_image_id | path | problematic_score |
---|---|---|---|
0 | ecc5a391-85ae-47d4-a0c5-282c5bf80f1d | s3://coactive-demo-datasets/trust_and_safety_1... | 1.000000 |
1 | 4a683536-fca3-498b-b802-10a2a734951c | s3://coactive-demo-datasets/trust_and_safety_1... | 1.000000 |
2 | 00cb0324-bf9c-44d1-9bc1-51c7172e47b8 | s3://coactive-demo-datasets/trust_and_safety_1... | 1.000000 |
3 | ff6e65b4-5229-45b7-95b1-56fff246d471 | s3://coactive-demo-datasets/trust_and_safety_1... | 0.999998 |
4 | b0022a11-16f5-4c08-ba3e-9ad20e057fab | s3://coactive-demo-datasets/trust_and_safety_1... | 0.999998 |
5 | 0dd0143e-9eb8-44f3-936e-f1c3fcb8e1f3 | s3://coactive-demo-datasets/trust_and_safety_1... | 0.999997 |
6 | 96420eb5-34dd-4ede-b76c-b8eec6e810b7 | s3://coactive-demo-datasets/trust_and_safety_1... | 0.999994 |
7 | e22a2d6c-f5a2-4a83-9ac8-88368595bbba | s3://coactive-demo-datasets/trust_and_safety_1... | 0.999990 |
8 | d51ac9d7-8852-431a-92d0-83bccb5012ff | s3://coactive-demo-datasets/trust_and_safety_1... | 0.999981 |
9 | 18ceef73-11c8-4358-a1d5-1e0259391e95 | s3://coactive-demo-datasets/trust_and_safety_1... | 0.999979 |
10 | d89a94f1-84d1-43c1-b097-d04050f12173 | s3://coactive-demo-datasets/trust_and_safety_1... | 0.999979 |
11 | 7440c3ca-74d8-40c6-a014-b410fa043b7d | s3://coactive-demo-datasets/trust_and_safety_1... | 0.999978 |
12 | c98035cb-7d66-484a-98a6-79e7efaaef1e | s3://coactive-demo-datasets/trust_and_safety_1... | 0.999972 |
13 | d37d1d65-5da0-4490-a450-ab59c2c201fa | s3://coactive-demo-datasets/trust_and_safety_1... | 0.999969 |
14 | 2eb49b9a-9cb6-478f-9103-b885863e6c2b | s3://coactive-demo-datasets/trust_and_safety_1... | 0.999968 |
15 | 8703245e-13ed-4fdf-89a5-3c9511549d71 | s3://coactive-demo-datasets/trust_and_safety_1... | 0.999964 |
16 | 234330f4-5f27-4c87-bba0-234dcf8ae64c | s3://coactive-demo-datasets/trust_and_safety_1... | 0.999956 |
17 | 34a38325-b6d6-4e72-ab60-e9bea628eba5 | s3://coactive-demo-datasets/trust_and_safety_1... | 0.999953 |
18 | 14807279-afd1-4d62-8e49-232b6663f229 | s3://coactive-demo-datasets/trust_and_safety_1... | 0.999946 |
19 | 617be723-3ece-41ef-9281-6ed043e3da3f | s3://coactive-demo-datasets/trust_and_safety_1... | 0.999940 |
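The formula behind problematic_score is not given in this text. One plausible way to combine several concept probabilities into a single score bounded by 1 is a noisy-OR, sketched below. This is a swapped-in technique chosen for illustration, not necessarily the metric used above, and the asset IDs and probabilities are made up.

```python
import heapq
import math


def problematic_score(probabilities: list[float]) -> float:
    """Noisy-OR: probability that at least one concept is present,
    treating the concepts as independent (an assumption)."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)


# Made-up probabilities for (blood, brassiere, syringe), keyed by a toy ID.
assets = {
    "id-1": [0.90, 0.10, 0.00],
    "id-2": [0.20, 0.20, 0.20],
    "id-3": [0.00, 0.00, 0.99],
}

scored = {asset_id: problematic_score(p) for asset_id, p in assets.items()}

# Keep the two highest-scoring assets, best first.
top = heapq.nlargest(2, scored, key=scored.get)
```

A noisy-OR rewards an image for scoring highly on any one concept, whereas an average would dilute a single strong signal across the others.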