Automation Metrics API
The Automation Metrics API lets you query extraction performance metrics for any deployed solution over the last 90 days. The metrics are based on validation success and human review modifications.
To authorize your request, the Automation Metrics API requires an Authorization header with the value Bearer XYZ, where XYZ is your access token. See API authorization.
In this document, URL_BASE refers to the root URL of your Instabase instance, such as https://www.instabase.com.
import requests
url_base = "https://www.instabase.com"
automation_metrics_api_url = url_base + '/api/v2/automation-metrics'
Query for Metrics
Method | Syntax |
---|---|
POST | URL_BASE/api/v2/automation-metrics |
Description
Query automation metrics over a specific time range, using one or more aggregations.
Request body
The request body is a JSON object that describes the metrics you want to query and how you want those metrics aggregated in the API response.
Parameters are required unless marked as optional.
Name | Type | Description | Values |
---|---|---|---|
solution_name | string | The name of the deployed solution to query metrics from. | A valid name of an existing deployed solution |
start_time | int (epoch-milliseconds) | Beginning of the lookback window for the query range | An epoch within the last 3 months |
end_time | int (epoch-milliseconds) | End of the lookback window for the query range | An epoch within the last 3 months and later than start_time |
aggregations | list[dict] | List of aggregation actions | Valid aggregation dictionaries |
username | string | Optional. Username of the human reviewer | Username for an existing Instabase user |
job-id | string | Optional. A unique identifier for a specific flow job | Valid flow job ID |
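The parameters above can be assembled in Python as in the following minimal sketch. The helper names `to_epoch_ms` and `build_query` are hypothetical and not part of the API; the sketch only shows one way to produce epoch-millisecond timestamps and to leave out optional keys that are not set.

```python
from datetime import datetime, timedelta, timezone


def to_epoch_ms(dt):
    """Convert a timezone-aware datetime to integer epoch milliseconds."""
    return int(dt.timestamp() * 1000)


def build_query(solution_name, start, end, aggregations,
                username=None, job_id=None):
    """Assemble a request body (hypothetical helper, not part of the API)."""
    if end <= start:
        raise ValueError("end_time must be later than start_time")
    body = {
        "solution_name": solution_name,
        "start_time": to_epoch_ms(start),
        "end_time": to_epoch_ms(end),
        "aggregations": aggregations,
    }
    # Optional parameters are only included when a value is provided.
    if username is not None:
        body["username"] = username
    if job_id is not None:
        body["job-id"] = job_id  # key spelled as in the parameter table
    return body


# A 30-day window ending at a fixed, illustrative date.
end = datetime(2024, 1, 31, tzinfo=timezone.utc)
start = end - timedelta(days=30)
query = build_query("My Example Solution", start, end, [
    {"name": "total_processed_records", "type": "SUM",
     "options": {"metric_name": "PROCESSED-RECORDS"}},
])
```

In practice, the window would be computed relative to the current time so that both timestamps fall inside the supported lookback period.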
Aggregation types
Each aggregation type is associated with specific values in the aggregation dictionary sent through the aggregations parameter. Supported aggregation types are:
TIME-SERIES
CUSTOM-METRIC
SUM
Aggregation dictionary format
{
  "name": "", // str
  "type": "", // str
  "options": {} // Dict[str, Any]
}
Key | Type | Description |
---|---|---|
name | string | A custom user-defined name |
type | string | One of TIME-SERIES, CUSTOM-METRIC, SUM |
options | dict | Required values depend on the aggregation type |
Aggregation type: Time Series
A time series aggregation request returns bucketed results for the given sub-aggregations. The results are divided into evenly sized buckets across the time interval, and each requested sub-aggregation runs inside each bucket.
Request
{
  "name": "<custom name>",
  "type": "TIME-SERIES",
  "options": {
    "buckets": 1, // int
    "timezone": "", // str (optional)
    "aggregations": [] // List[Dict]
  }
}
Key | Type | Description |
---|---|---|
buckets | int | Number of buckets that the queried time series data is divided into |
aggregations | list[dict] | A list of sub-aggregations to perform on the data in each bucket. At least one sub-aggregation is required. |
timezone | string | Optional. Timezone used to divide bucket intervals. Defaults to UTC. For example, a query with 30 buckets over 30 days of data produces 1-day buckets with boundaries at midnight UTC. |
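The bucket arithmetic described above can be sketched as follows. This is an illustration only: it assumes buckets are evenly sized and that the first bucket starts at start_time, and it does not model how the service aligns boundaries to a timezone. The function name `bucket_starts` is hypothetical.

```python
from typing import List

DAY_MS = 24 * 60 * 60 * 1000  # milliseconds in one day


def bucket_starts(start_ms: int, end_ms: int, buckets: int) -> List[int]:
    """Return the assumed start timestamp of each evenly sized bucket."""
    if buckets < 1:
        raise ValueError("buckets must be at least 1")
    if end_ms <= start_ms:
        raise ValueError("end_ms must be later than start_ms")
    span = end_ms - start_ms
    # Integer arithmetic avoids floating-point drift on large timestamps.
    return [start_ms + (span * i) // buckets for i in range(buckets)]


# 30 buckets over 30 days give 1-day buckets.
starts = bucket_starts(0, 30 * DAY_MS, 30)
widths = {b - a for a, b in zip(starts, starts[1:])}
```

Here `widths` contains a single value, `DAY_MS`, which matches the 30-buckets-over-30-days case in the table.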
Response
Returns a list of buckets, each of them containing a timestamp and all requested sub-aggregations.
[
  {
    "timestamp": int (epoch milliseconds),
    <sub aggregation name>: <sub aggregation value>,
    <sub aggregation name>: <sub aggregation value>
  }
]
Aggregation type: Custom Metric
A custom metric aggregation returns a predefined, custom response for a specific metric type.
Request
{
  "name": "<custom name>",
  "type": "CUSTOM-METRIC",
  "options": {
    "metric_name": "" // str
  }
}
Key | Type | Description |
---|---|---|
metric_name | string | Name of the custom metric. Supported options are FIELD-LEVEL-ACCUMULATE and CLASS-LEVEL-ACCUMULATE. |
Response
Each custom metric returns its own response format.
CLASS-LEVEL-ACCUMULATE
[
  {
    "version": str,
    "class_name": str,
    "pages_processed": int,
    "class_counters": {
      "invalid_modified": int,
      "invalid_unmodified": int,
      "no_extraction": int,
      "valid_modified": int,
      "no_validation_unmodified": int,
      "valid_unmodified": int,
      "no_validation_modified": int
      // Each integer is the number of records with classification results in that validation state.
      // Includes all records that the flow originally classified as <classname>.
      // These results can be used to determine the reclassification rate.
    },
    "field_counters": {
      "invalid_modified": int,
      "invalid_unmodified": int,
      "no_extraction": int,
      "valid_modified": int,
      "no_validation_unmodified": int,
      "valid_unmodified": int,
      "no_validation_modified": int
      // Each integer is the number of fields with extraction results in that validation state.
      // These results can be used to determine the class-level automation rate.
    }
  }
]
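As a rough illustration of how these counters might be turned into rates, the sketch below computes two ratios from one CLASS-LEVEL-ACCUMULATE entry. The formulas are assumptions and are not defined by the API: the reclassification rate is taken to be the share of records in any modified state, and the automation rate is taken to be the share of fields in the valid_unmodified state. The function names and the sample numbers are hypothetical.

```python
MODIFIED_KEYS = ("invalid_modified", "valid_modified", "no_validation_modified")


def modified_share(counters):
    """Share of items in any *_modified state (assumed reclassification rate)."""
    total = sum(counters.values())
    if total == 0:
        return 0.0
    return sum(counters.get(key, 0) for key in MODIFIED_KEYS) / total


def valid_unmodified_share(counters):
    """Share of items in the valid_unmodified state (assumed automation rate)."""
    total = sum(counters.values())
    if total == 0:
        return 0.0
    return counters.get("valid_unmodified", 0) / total


# Made-up entry shaped like one element of the response list.
entry = {
    "version": "1",
    "class_name": "invoice",
    "pages_processed": 120,
    "class_counters": {
        "invalid_modified": 3, "invalid_unmodified": 2, "no_extraction": 0,
        "valid_modified": 5, "no_validation_unmodified": 0,
        "valid_unmodified": 90, "no_validation_modified": 0,
    },
    "field_counters": {
        "invalid_modified": 30, "invalid_unmodified": 10, "no_extraction": 10,
        "valid_modified": 50, "no_validation_unmodified": 0,
        "valid_unmodified": 400, "no_validation_modified": 0,
    },
}

reclassification_rate = modified_share(entry["class_counters"])
automation_rate = valid_unmodified_share(entry["field_counters"])
```

Whether states such as no_validation_unmodified should count toward automation depends on how your solution uses validations, so adjust the numerator to match your own definition.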
FIELD-LEVEL-ACCUMULATE
[
  {
    "version": str,
    "class_name": str,
    "field_name": str,
    "field_counters": {
      "invalid_modified": int,
      "invalid_unmodified": int,
      "no_extraction": int,
      "valid_modified": int,
      "no_validation_unmodified": int,
      "valid_unmodified": int,
      "no_validation_modified": int
      // Each integer is the number of occurrences of the field in that validation state.
      // These results can be used to determine a field-level confusion matrix.
    }
  }
]
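One possible way to arrange the field counters into a confusion-matrix-like table is sketched below. The layout is an assumption rather than something the API prescribes: rows are validation outcomes, columns are whether a human reviewer modified the value, and no_extraction is reported separately. The function name and sample numbers are hypothetical.

```python
VALIDATION_STATES = ("valid", "invalid", "no_validation")


def confusion_matrix(field_counters):
    """Arrange counters as {validation state: {"modified": n, "unmodified": n}}."""
    return {
        state: {
            "modified": field_counters.get(state + "_modified", 0),
            "unmodified": field_counters.get(state + "_unmodified", 0),
        }
        for state in VALIDATION_STATES
    }


# Made-up counters shaped like the field_counters object above.
sample_counters = {
    "invalid_modified": 3,
    "invalid_unmodified": 1,
    "no_extraction": 2,
    "valid_modified": 4,
    "no_validation_unmodified": 5,
    "valid_unmodified": 80,
    "no_validation_modified": 5,
}

matrix = confusion_matrix(sample_counters)
missing = sample_counters.get("no_extraction", 0)
```

With this layout, `matrix["valid"]["modified"]` counts values that passed validation but were still corrected in review, and `matrix["invalid"]["unmodified"]` counts values that failed validation but were left unchanged.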
Aggregation type: Sum
A sum aggregation returns the integer sum of the requested metric.
Request
{
  "name": "<custom name>",
  "type": "SUM",
  "options": {
    "metric_name": "" // str
  }
}
Key | Type | Description |
---|---|---|
metric_name | string | Metric to sum over. Supported options are listed below. |
Supported Metric Names
ERRORED-RECORDS
MODIFIED-RECORDS
PROCESSED-RECORDS
REMAPPED-RECORDS
STP-RECORDS
RECLASSIFIED-RECORDS
FAILED-VALIDATION-RECORDS
PROCESSED-PAGES
PROCESSED-DOCUMENTS
FAILED-VALIDATION-DOCUMENTS
STP-DOCUMENTS
ERRORED-DOCUMENTS
Response
Returns an integer sum.
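A small helper can build SUM aggregation dictionaries and catch misspelled metric names before a request is sent. This is a sketch: the helper name `sum_aggregation` is hypothetical, and the set of names is copied from the list of supported metric names above.

```python
SUPPORTED_SUM_METRICS = frozenset({
    "ERRORED-RECORDS",
    "MODIFIED-RECORDS",
    "PROCESSED-RECORDS",
    "REMAPPED-RECORDS",
    "STP-RECORDS",
    "RECLASSIFIED-RECORDS",
    "FAILED-VALIDATION-RECORDS",
    "PROCESSED-PAGES",
    "PROCESSED-DOCUMENTS",
    "FAILED-VALIDATION-DOCUMENTS",
    "STP-DOCUMENTS",
    "ERRORED-DOCUMENTS",
})


def sum_aggregation(name, metric_name):
    """Build a SUM aggregation dictionary (hypothetical client-side helper)."""
    if metric_name not in SUPPORTED_SUM_METRICS:
        raise ValueError("Unsupported SUM metric: {}".format(metric_name))
    return {
        "name": name,
        "type": "SUM",
        "options": {"metric_name": metric_name},
    }


aggregation = sum_aggregation("total_processed_pages", "PROCESSED-PAGES")
```

The resulting dictionary can be placed directly in the aggregations list of a request, or in the aggregations option of a TIME-SERIES aggregation.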
Response status
Status | Meaning |
---|---|
200 OK | Metrics were successfully queried and aggregated |
400 Bad Request | The request body is malformed or contains invalid values |
401 Unauthorized | Authorized user does not have permissions to the Deployed Solutions app |
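Client code might translate these status codes into errors along the following lines. This is a sketch only: `MetricsQueryError`, `describe_status`, and `ensure_ok` are hypothetical names, and the messages paraphrase the table above.

```python
class MetricsQueryError(Exception):
    """Raised when a metrics query does not return 200 OK (hypothetical)."""


STATUS_MEANINGS = {
    200: "Metrics were successfully queried and aggregated",
    400: "The request body is malformed or contains invalid values",
    401: "The user does not have permissions to the Deployed Solutions app",
}


def describe_status(status_code):
    """Return the documented meaning of a status code, if there is one."""
    return STATUS_MEANINGS.get(
        status_code, "Unexpected status code {}".format(status_code))


def ensure_ok(status_code):
    """Raise MetricsQueryError for any status other than 200."""
    if status_code != 200:
        raise MetricsQueryError(
            "{}: {}".format(status_code, describe_status(status_code)))
```

With the requests library, this could be called as `ensure_ok(resp.status_code)` before reading the response body.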
Examples
Request
import json

# `token` is your access token (see API authorization). `requests` and
# `automation_metrics_api_url` are defined in the snippet at the top of this page.
headers = {
    'Authorization': 'Bearer {0}'.format(token)
}
data = json.dumps({
    'solution_name': 'My Example Solution',
    'start_time': 12345,
    'end_time': 67890,
    'aggregations': [
        {
            'name': 'time_series_data',
            'type': 'TIME-SERIES',
            'options': {
                'buckets': 2,
                'aggregations': [
                    {
                        'name': 'class_accumulate_results',
                        'type': 'CUSTOM-METRIC',
                        'options': {
                            'metric_name': 'CLASS-LEVEL-ACCUMULATE'
                        }
                    },
                    {
                        'name': 'field_accumulate_results',
                        'type': 'CUSTOM-METRIC',
                        'options': {
                            'metric_name': 'FIELD-LEVEL-ACCUMULATE'
                        }
                    },
                    {
                        'name': 'total_errored_records',
                        'type': 'SUM',
                        'options': {
                            'metric_name': 'ERRORED-RECORDS'
                        }
                    }
                ]
            }
        },
        {
            'name': 'total_processed_records',
            'type': 'SUM',
            'options': {
                'metric_name': 'PROCESSED-RECORDS'
            }
        }
    ]
})
resp = requests.post(automation_metrics_api_url, headers=headers, data=data)
This request queries time series data for three aggregations: class-level accumulate, field-level accumulate, and the number of errored records. Each of these aggregations is computed in two buckets. The request also queries the total number of records processed across the entire time range.
Response
HTTP STATUS CODE 200
{
  "status": "OK",
  "data": {
    "time_series_data": [
      {
        "timestamp": 12345,
        "class_accumulate_results": ...,
        "field_accumulate_results": ...,
        "total_errored_records": 4
      },
      {
        "timestamp": 34567,
        "class_accumulate_results": ...,
        "field_accumulate_results": ...,
        "total_errored_records": 2
      }
    ],
    "total_processed_records": 100
  }
}
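A response of this shape could be read as in the following sketch. The sample payload mirrors the example above, except that the accumulate results, which are elided in the example, are left out. The function name `metric_by_bucket` is hypothetical.

```python
import json

# Sample payload shaped like the example response. The accumulate results
# are elided in the example, so they are omitted here.
sample = json.loads("""
{
  "status": "OK",
  "data": {
    "time_series_data": [
      {"timestamp": 12345, "total_errored_records": 4},
      {"timestamp": 34567, "total_errored_records": 2}
    ],
    "total_processed_records": 100
  }
}
""")


def metric_by_bucket(payload, series_name, metric_name):
    """Map each bucket timestamp to the value of one sub-aggregation."""
    return {
        bucket["timestamp"]: bucket[metric_name]
        for bucket in payload["data"][series_name]
    }


errored = metric_by_bucket(sample, "time_series_data", "total_errored_records")
errored_total = sum(errored.values())
processed_total = sample["data"]["total_processed_records"]
```

The keys under data are the custom names given to each aggregation in the request. With the requests library, the payload can be obtained with `resp.json()` instead of the sample string.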