Welcome to the Aito HTTP API reference documentation. You can also test out the queries in the interactive Swagger UI.
All requests must specify an API key in the x-api-key header. There are two types of authentication keys:
read-only
Allows only read queries. Good for sharing access with third parties.
read/write
Allows all queries.
We recommend setting up your Aito instance configuration using environment variables as follows:
Environment variable | Value |
---|---|
AITO_INSTANCE_URL | your-aito-instance-url |
AITO_API_KEY | your-api-key |
These environment variables are recognized by the Aito Python SDK. The URL and API keys can be found on the instance overview page in the Aito Console.
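As a minimal sketch of the setup above, the following Python reads the two environment variables and builds the headers that every request needs. The function names `aito_config` and `aito_headers` are hypothetical helpers for illustration, not part of the Aito Python SDK.

```python
import os


def aito_config(env=None):
    """Read the instance URL and API key from environment variables.

    env defaults to os.environ; passing a dict makes the function easy to test.
    """
    env = os.environ if env is None else env
    # Strip a trailing slash so endpoint paths can be appended safely.
    url = env["AITO_INSTANCE_URL"].rstrip("/")
    api_key = env["AITO_API_KEY"]
    return url, api_key


def aito_headers(api_key):
    """Headers sent with every request; x-api-key carries the API key."""
    return {"x-api-key": api_key, "content-type": "application/json"}
```

A missing variable raises `KeyError`, which surfaces a misconfigured environment early instead of sending an unauthenticated request.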
Your instance might have monthly and burst API call limits. Refer to the limits on the pricing page (https://aito.ai).
Payload size is limited to 10MB per message. This includes data and all headers. For uploading larger datasets to the database, the file upload API can be used to overcome this limit.
Queries must be completed within 30 seconds, or they will time out. In this case, API Gateway returns an HTTP(500) Gateway Timeout error. This value and behavior are a hard limit set by AWS and cannot be modified or extended.
API access limits cannot be enforced on an IP or hostname basis. The authentication is based on an API key. The API is served only over secure HTTP.
Some endpoints use pagination to limit the number of results returned at once. The pagination is based on offset and limit parameters, similar to SQL and many other APIs.
As an example, to get the first result set of 10 items with a Search query you can request:
{ "from": "products", "offset": 0, "limit": 10 }
The response will have a total field, which tells you how many items were found in total:
{ "offset": 0, "total": 81, "hits": [ ... ] }
If this exceeds the number of items in the hits array, some results were left out of the response. To request the next 10 items, you can query:
{ "from": "products", "offset": 10, "limit": 10 }
The default values for pagination parameters are the following.
Parameter | Default value |
---|---|
offset | 0 |
limit | 10 |
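The offset and limit scheme above can be sketched as a loop that keeps requesting pages until `total` items have been seen. This is only an illustration: `run_query` is a hypothetical caller-supplied function that sends the query body to the Search endpoint and returns the decoded JSON response.

```python
def iter_hits(run_query, table, limit=10):
    """Yield every hit of a Search query, one page at a time.

    run_query is a hypothetical callable: it takes the query body (a dict)
    and returns the decoded response with "total" and "hits" fields.
    """
    offset = 0
    while True:
        page = run_query({"from": table, "offset": offset, "limit": limit})
        hits = page["hits"]
        yield from hits
        offset += len(hits)
        # Stop when all `total` items were seen or the page came back empty.
        if not hits or offset >= page["total"]:
            break
```

Advancing the offset by the number of hits actually returned, rather than by `limit`, keeps the loop correct even if a page is shorter than requested.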
Responses are served by default as compact JSON. If you want server-side pretty-printed responses, as was the earlier default, you can add the header x-aito-prettyprint: true to your API request.
All responses are served with access-control-allow-origin: * headers. This is useful for browser applications.
We aim for a familiar API, but in some cases Aito's default behavior differs from that of other databases.
By default, Aito sorts everything from the largest to the smallest. This is a design choice, dictated by the fact that within the domain of statistical reasoning the highest values are often the most interesting ones.
For example: the items with the highest probabilities, the highest frequencies, the highest similarities, the highest mutual information, and the highest scores are often the most desired ones.
Use $asc to sort values from the smallest to the largest, as shown in the example:
{ "from": "products", "where": { "category.id": 89 }, "orderBy": { "$asc": "price" } }
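The two ordering forms can be produced by a small query builder. This is a sketch only; `search_query` is a hypothetical helper name, and the two `orderBy` shapes it emits are the ones shown in this documentation.

```python
def search_query(table, order_by, ascending=False):
    """Build a Search query body ordered by one field.

    A plain field name keeps the default largest-first order;
    wrapping it in {"$asc": ...} requests smallest-first.
    """
    order = {"$asc": order_by} if ascending else order_by
    return {"from": table, "orderBy": order}
```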
Aito has been designed to work well even with small data sets. One example of this is how personalised recommendations work. This is easiest to understand through an example, so let's take a digital grocery store.
When requesting product recommendations for a customer who's a vegetarian, Aito also considers what non-vegetarians purchase. If, for example, the customer were the only vegetarian user of the grocery web shop, they could receive meat recommendations if customers in general purchased a lot of meat.
This behavior is usually a good default. In book, music, movie, and many other recommendations, you commonly want to find new items instead of getting recommendations only from your own history. However, in some cases the behavior might lead to unexpected predictions. For example, if we predicted how likely a vegetarian is to purchase bacon, Aito could return that it is very likely because, based on the data, that's the common average.
An example recommend query could look like this:
{ "from": "impressions", "where": { "context.user": "veronica" }, "recommend": "product", "goal": { "purchase": true } }
Even if we limit the data to impressions by veronica, Aito still considers other data points.
In error cases, we return proper HTTP status codes. Error responses:
400 Bad Request
Returned when there's an error with the given request payload, for example invalid query syntax.
429 Too Many Requests
Returned when a query limit is hit. The response contains a header called x-error-cause, which indicates the cause of the error: either Quota Exceeded or Throttled. The query quota is reset each month. You can increase it by upgrading the tier of your instance; see the terms of service for details. A request is throttled when there are too many requests per second. A throttled request should be retried after a short delay and will likely succeed as soon as the overall request rate drops.
Example error
Error returned when trying to use an incorrect table name. Instead of prodjucts, it should be products.
{ "charOffset": 17, "lineNumber": 3, "columnNumber": 13, "error": "failed to open 'prodjucts'", "status": 400, "message": "3:13: failed to open 'prodjucts'\n\n \"from\": \"prodjucts\"\n ^\n", "messageLines": [ "3:13: failed to open 'prodjucts'", "", " \"from\": \"prodjucts\"", " ^" ] }
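The retry advice for 429 responses can be sketched as follows. This is one possible approach, not a prescribed one: the exponential backoff schedule is an assumption, and `send` is a hypothetical caller-supplied function that performs the request and returns a `(status, headers, body)` tuple with lowercase header names.

```python
import time


def call_with_retry(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Retry a request while it is throttled.

    send is a hypothetical callable returning (status, headers, body),
    where headers is a dict with lowercase header names.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        cause = headers.get("x-error-cause")
        if status == 429 and cause == "Throttled" and attempt < max_attempts - 1:
            # Exponential backoff (assumed schedule): 0.5s, 1s, 2s, ...
            sleep(base_delay * 2 ** attempt)
            continue
        # Success, a non-retryable error, or "Quota Exceeded":
        # a short wait will not help, so return the response as-is.
        return status, headers, body
```

Only `Throttled` is retried. `Quota Exceeded` is returned immediately, because the quota resets monthly and a short delay cannot clear it.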
Aito Database names cannot contain whitespace (spaces, tabs, linefeeds, etc.) or any of the following characters:
/".$
These name validation rules were progressively applied in June. If you have invalid tables, you can use the schema/_rename endpoint to rename them.
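A client-side check of the naming rule might look like the sketch below. `is_valid_name` is a hypothetical helper, and rejecting the empty string is an extra assumption that the rule above does not state.

```python
# The four characters listed above: slash, double quote, dot, dollar sign.
FORBIDDEN_CHARS = set('/".$')


def is_valid_name(name):
    """Return True if name contains no whitespace and no forbidden character.

    Rejecting the empty string is an assumption, not a documented rule.
    """
    return bool(name) and not any(
        ch.isspace() or ch in FORBIDDEN_CHARS for ch in name
    )
```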
We take our quality seriously and aim for the smoothest developer experience possible. If you run into problems, please send an email to support@aito.ai containing reproduction steps and we'll fix it as soon as possible.
The query language operations.
POST /api/v1/_search
Search rows.
Allows you to search, filter, and order rows. You can also select only specific columns. Similar to SELECT in SQL.
The results are in descending order by default.
Aito supports intuitive link following. If your products table has a link column called category which links to another table called categories, you can simply use the following convenience in the query selection:
{ "from": "products", "where": { "category.id": 89 }, "orderBy": "price" }
You can easily select all rows from a table with the following query:
{ "from": "products" }
Note: the number of results is limited to 10 by default.
If you want to get search results with highlights, see Generic query.
Name | Type | Description |
---|---|---|
body (required) | object | Search query |
Response | Type | Description |
---|---|---|
200 OK | object | Search results |
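For readers who prefer Python over curl, the request can be assembled with the standard library alone. The sketch below only builds the request and does not send it. The instance URL and API key are placeholders, and `build_search_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_search_request(instance_url, api_key, query):
    """Build (but do not send) a POST request for /api/v1/_search."""
    return urllib.request.Request(
        url=instance_url.rstrip("/") + "/api/v1/_search",
        data=json.dumps(query).encode("utf-8"),
        headers={"content-type": "application/json", "x-api-key": api_key},
        method="POST",
    )


request = build_search_request(
    "https://your-aito-instance-url",  # placeholder
    "your-api-key",                    # placeholder
    {"from": "products", "where": {"category.id": 89}, "orderBy": "price"},
)
# Sending is left to the caller, for example:
#     with urllib.request.urlopen(request, timeout=30) as response:
#         result = json.load(response)
```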
Find by id
The examples are using the dataset of our grocery store demo app. To get deeper understanding of the data context, you can check out the demo app.
You can copy-paste the example curl command to your terminal.
curl -X POST \
  https://aito-demo.aito.app/api/v1/_search \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '{ "from": "products", "where": { "id": "6411300000494" } }'
Response
{ "offset": 0, "total": 1, "hits": [ { "category": "108", "id": "6411300000494", "name": "Juhla Mokka coffee 500g sj", "price": 3.95, "tags": "coffee" } ] }
Where price is greater than
You can copy-paste the example curl command to your terminal.
curl -X POST \
  https://aito-demo.aito.app/api/v1/_search \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '{ "from": "products", "where": { "price": { "$gt": 1.5 } }, "limit": 2 }'
Response
{ "offset": 0, "total": 21, "hits": [ { "category": "101", "id": "6437002001454", "name": "VAASAN Ruispalat 660g 12 pcs fullcorn rye bread", "price": 1.69, "tags": "gluten bread" }, { "category": "101", "id": "6411402202208", "name": "Fazer Puikula fullcorn rye bread 9 pcs/500g", "price": 1.85, "tags": "gluten bread" } ] }
Find products with search term
curl -X POST \
  https://aito-demo.aito.app/api/v1/_search \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '{ "from": "products", "where": { "name": { "$match": "coffee" } } }'
Response
{ "offset": 0, "total": 4, "hits": [ { "category": "108", "id": "6411300000494", "name": "Juhla Mokka coffee 500g sj", "price": 3.95, "tags": "coffee" }, { "category": "108", "id": "6420101441542", "name": "Kulta Katriina filter coffee 500g", "price": 3.45, "tags": "coffee" }, { "category": "108", "id": "6411300164653", "name": "Juhla Mokka Dark Roast coffee 500g hj", "price": 3.95, "tags": "coffee" }, { "category": "108", "id": "6410405181190", "name": "Pirkka Costa Rica filter coffee 500g UTZ", "price": 2.89, "tags": "coffee pirkka" } ] }
More complex where proposition
Find all products priced over 1.5€ which have the tag drink or whose name matches coffee.
curl -X POST \
  https://aito-demo.aito.app/api/v1/_search \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '{ "from": "products", "where": { "$and": [ { "$or": [ { "tags": { "$has": "drink" } }, { "name": { "$match": "coffee" } } ] }, { "price": { "$gt": 1.5 } } ] }, "limit": 2 }'
Response
{ "offset": 0, "total": 6, "hits": [ { "category": "104", "id": "6408430000258", "name": "Valio eila™ Lactose-free semi-skimmed milk drink 1l", "price": 1.95, "tags": "lactose-free drink" }, { "category": "108", "id": "6411300000494", "name": "Juhla Mokka coffee 500g sj", "price": 3.95, "tags": "coffee" } ] }
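Nested propositions like the one in the last example are easier to read when composed from small helpers. The following sketch rebuilds that same query body. `all_of` and `any_of` are hypothetical helper names, and only the `$and`, `$or`, `$has`, `$match` and `$gt` operators shown above are used.

```python
def all_of(*propositions):
    """Combine propositions so that every one of them must hold."""
    return {"$and": list(propositions)}


def any_of(*propositions):
    """Combine propositions so that at least one of them must hold."""
    return {"$or": list(propositions)}


# Products priced over 1.5 that are tagged "drink" or whose name matches "coffee".
query = {
    "from": "products",
    "where": all_of(
        any_of(
            {"tags": {"$has": "drink"}},
            {"name": {"$match": "coffee"}},
        ),
        {"price": {"$gt": 1.5}},
    ),
    "limit": 2,
}
```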
POST /api/v1/_predict
Predict the likelihood of a feature given a hypothesis.
For example, predict what other products a user could add to their e-commerce shopping cart, based on the existing cart.
To understand why Aito predicts certain results, you can select "$why".
Related information
The exclusiveness option is explained in the Exclusiveness chapter.
Name | Type | Description |
---|---|---|
body (required) | object | Predict query |
Response | Type | Description |
---|---|---|
200 OK | object | Predict results |
Predict purchase likelihood
The examples are using the dataset of our grocery store demo app. To get deeper understanding of the data context, you can check out the demo app.
In the example we're predicting how likely the customer with username larry is to purchase the product "Finnish bread cheese 120g lactose-free" (6410405197764). In the example data, Larry purchases a lot of lactose-free products, but has never purchased any cheese. Aito detects that the "lactose-free" tag is a commonly occurring feature in the data, and predicts that Larry would also quite likely purchase the cheese.
The query format depends on how the data has been structured in Aito (schema). In the example dataset, the impressions table contains each individual product a user has seen during their shop visit (=session) and whether they bought the product or not.
curl -X POST \
  https://aito-demo.aito.app/api/v1/_predict \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '{ "from": "impressions", "where": { "context.user": "larry", "product.id": "6410405216120" }, "predict": "purchase" }'
Response
{ "offset": 0, "total": 2, "hits": [ { "$p": 0.948364489629863, "field": "purchase", "feature": false }, { "$p": 0.051635510370137104, "field": "purchase", "feature": true } ] }
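Reading a predict response usually means picking the probability of one feature. The sketch below does this for the response above. `probability_of` is a hypothetical helper name. In this particular response the two probabilities add up to 1.

```python
# The response shown above, as a Python dict.
response = {
    "offset": 0,
    "total": 2,
    "hits": [
        {"$p": 0.948364489629863, "field": "purchase", "feature": False},
        {"$p": 0.051635510370137104, "field": "purchase", "feature": True},
    ],
}


def probability_of(response, feature):
    """Return $p for the hit with the given feature, or None if absent."""
    for hit in response["hits"]:
        if hit["feature"] == feature:
            return hit["$p"]
    return None


purchase_probability = probability_of(response, True)
```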
Explain the prediction
Same example as above, but we ask Aito to explain why it predicted the results.
To understand the response, see the "$why" section.
curl -X POST \
  https://aito-demo.aito.app/api/v1/_predict \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '{ "from": "impressions", "where": { "context.user": "larry", "product.id": "6410405216120" }, "select": ["$why"], "predict": "purchase" }'
Response
{ "offset": 0, "total": 2, "hits": [ { "$why": { "type": "product", "factors": [ { "type": "baseP", "value": 0.948364354525726, "proposition": { "purchase": { "$has": false } } }, { "type": "product", "factors": [ { "type": "normalizer", "name": "exclusiveness", "value": 1 }, { "type": "normalizer", "name": "trueFalseExclusiveness", "value": 1.0000000000135638 } ] }, { "type": "relatedPropositionLift", "proposition": { "product.id": { "$has": "6410405216120" } }, "value": 1.0000001816851178 }, { "type": "relatedPropositionLift", "proposition": { "context.user": { "$has": "larry" } }, "value": 0.9999999607614847 } ] } }, { "$why": { "type": "product", "factors": [ { "type": "baseP", "value": 0.05163564547427404, "proposition": { "purchase": { "$has": true } } }, { "type": "product", "factors": [ { "type": "normalizer", "name": "exclusiveness", "value": 1 }, { "type": "normalizer", "name": "trueFalseExclusiveness", "value": 1.0000000000135638 } ] }, { "type": "relatedPropositionLift", "proposition": { "product.id": { "$has": "6410405216120" } }, "value": 0.9999966627545696 }, { "type": "relatedPropositionLift", "proposition": { "context.user": { "$has": "larry" } }, "value": 1.0000007207445256 } ] } } ] }
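One way to read the "$why" tree above is to multiply its factors together. The sketch below assumes that a node of type product multiplies its child factors and that every other node contributes its own value. This reading is an assumption based on the node type names, not a documented rule. For the first hit it reproduces the $p value from the earlier prediction response to within rounding. The proposition fields are omitted for brevity.

```python
import math

# The "$why" of the first hit above, with "proposition" fields omitted.
why = {
    "type": "product",
    "factors": [
        {"type": "baseP", "value": 0.948364354525726},
        {"type": "product", "factors": [
            {"type": "normalizer", "name": "exclusiveness", "value": 1},
            {"type": "normalizer", "name": "trueFalseExclusiveness",
             "value": 1.0000000000135638},
        ]},
        {"type": "relatedPropositionLift", "value": 1.0000001816851178},
        {"type": "relatedPropositionLift", "value": 0.9999999607614847},
    ],
}


def evaluate_why(node):
    """Multiply out a $why tree (assumed semantics, see the text above)."""
    if node["type"] == "product":
        return math.prod(evaluate_why(factor) for factor in node["factors"])
    return node["value"]


estimated_p = evaluate_why(why)
```

Read this way, a lift above 1 raises the base probability and a lift below 1 lowers it.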
Example request
In the example we're predicting three suitable tags for a hypothetical new product based on its name. Tags are predicted based on what tags existing products have.
curl -X POST \
  https://aito-demo.aito.app/api/v1/_predict \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '{ "from": "products", "where": { "name": "Hovis Seed Sensations Seven Seeds Original 800g" }, "predict": "tags", "exclusiveness": false, "limit": 3 }'
Response
{ "offset": 0, "total": 25, "hits": [ { "$p": 0.3409090909090909, "field": "tags", "feature": "pirkka" }, { "$p": 0.29545454545454547, "field": "tags", "feature": "food" }, { "$p": 0.25, "field": "tags", "feature": "meat" } ] }
POST /api/v1/_recommend
Recommend a row which optimizes a given goal.
For example, you could ask Aito to choose a product which maximizes the click likelihood when the user id equals 4543.
Recommend differs from predict and match in the following way: recommend always optimizes a goal, while predict and match merely mimic the existing behavior patterns in the data. As an example, consider the problem of matching employees to projects. With predict and match, you can mimic the way the projects are staffed currently, and Aito will mimic both the good and the bad staffing practices. With recommend, Aito seeks to maximize the success rate and avoid decisions that lead to bad outcomes, even if these decisions were a popular practice.
The chapter Personalisation also explains a characteristic of the recommendations.
Name | Type | Description |
---|---|---|
body (required) | object | Recommend query |
Response | Type | Description |
---|---|---|
200 OK | object | Recommend results |
Recommend top 5 products for a customer
The examples are using the dataset of our grocery store demo app. To get deeper understanding of the data context, you can check out the demo app.
In the example we're recommending the top 5 products which veronica (user id) would most likely purchase, based on her behavior history stored in the impressions table. The table contains information about which products she has seen and which of those were bought.
This query could be used to generate a campaign email which recommends relevant products for a customer.
curl -X POST \
  https://aito-demo.aito.app/api/v1/_recommend \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '{ "from": "impressions", "where": { "context.user": "veronica" }, "recommend": "product", "goal": { "purchase": true }, "limit": 5 }'
Response
{ "offset": 0, "total": 42, "hits": [ { "$p": 0.4824468810415186, "category": "100", "id": "2000818700008", "name": "Pirkka banana", "price": 0.166, "tags": "fresh fruit pirkka" }, { "$p": 0.4125168211770318, "category": "100", "id": "2000503600002", "name": "Chiquita banana", "price": 0.28054, "tags": "fresh fruit" }, { "$p": 0.3889196836779399, "category": "111", "id": "6414880021620", "name": "Ilta Sanomat weekend news", "price": 2.3, "tags": "news" }, { "$p": 0.3805216224063433, "category": "100", "id": "6410405060457", "name": "Pirkka bio cherry tomatoes 250g international 1lk", "price": 1.29, "tags": "fresh vegetable pirkka tomato" }, { "$p": 0.37278546698503334, "category": "100", "id": "6410405093677", "name": "Pirkka iceberg salad Finland 100g 1st class", "price": 1.29, "tags": "fresh vegetable pirkka salad" } ] }
Recommend top products with additional filtering
This example is the same as above, but we're adding an additional criterion: the product name should match the 'Banana' search query.
This query could be used to build a personalised search functionality.
curl -X POST \
  https://aito-demo.aito.app/api/v1/_recommend \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '{ "from": "impressions", "where": { "product.name": { "$match": "Banana" }, "context.user": "veronica" }, "recommend": "product", "goal": { "purchase": true }, "limit": 5 }'
Response
{ "offset": 0, "total": 2, "hits": [ { "$p": 0.4824468810415186, "category": "100", "id": "2000818700008", "name": "Pirkka banana", "price": 0.166, "tags": "fresh fruit pirkka" }, { "$p": 0.4125168211770318, "category": "100", "id": "2000503600002", "name": "Chiquita banana", "price": 0.28054, "tags": "fresh fruit" } ] }
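The two recommend examples differ only in the optional name filter, which suggests a small builder for a personalised search feature. The sketch below is tied to the demo dataset's impressions table. `recommend_query` is a hypothetical helper name.

```python
def recommend_query(user, name_search=None, limit=5):
    """Build a Recommend query body for the demo impressions table.

    name_search is optional; when given, only products whose name
    matches it are considered.
    """
    where = {"context.user": user}
    if name_search is not None:
        where["product.name"] = {"$match": name_search}
    return {
        "from": "impressions",
        "where": where,
        "recommend": "product",
        "goal": {"purchase": True},
        "limit": limit,
    }
```

Calling `recommend_query("veronica")` yields the first query body above, and `recommend_query("veronica", "Banana")` yields the filtered one.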
POST /api/v1/_evaluate
Evaluate performance and accuracy.
The query supports evaluation of Predict, Match, Similarity, and Generic queries.
Evaluate operation is in alpha stage. The syntax might change in the future.
The evaluation is performed by first specifying the train and test data split:
The testing data is specified using the test proposition or the TestSource. The training data is the remaining data that is not the testing data.
The evaluating query is specified following the evaluate keyword.
After that, a simulated evaluation scenario is run: Aito simulates inserting the training data into a table, then runs the given query for each sample (= row in a table) in the test data and measures how good the results were.
It is also possible to group multiple entries into a single test case and evaluate using the EvaluateGroupedQuery.
Response | Type | Description |
---|---|---|
200 OK | object | Evaluate results |
Example request
The examples are using the dataset of our grocery store demo app. To get deeper understanding of the data context, you can check out the demo app.
In the example we're evaluating how good the results are when Aito predicts tags for a new hypothetical product. The results give us the accuracy and performance of the prediction example shown in the Predict operation's documentation.
$index is a built-in variable which tells the insertion index of a row. In the example, we select 1/4 of the rows in the products table to be used as test data. The rest of the rows are automatically used as training data.
Aito iterates through each product in the test data, and tests how accurate the prediction of tags for a given product name was.
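The split described above can be imitated locally to see which rows end up in each set. The sketch assumes that $index starts at zero, that `"$mod": [4, 0]` selects rows whose index leaves remainder 0 when divided by 4, and that the products table holds 42 rows. All three are assumptions made for illustration.

```python
def split_by_index(row_count, divisor=4, remainder=0):
    """Split row indices into test and train sets by index modulo.

    Assumes zero-based insertion indices, which is not confirmed here.
    """
    test = [i for i in range(row_count) if i % divisor == remainder]
    train = [i for i in range(row_count) if i % divisor != remainder]
    return test, train


# 42 is an assumed row count for the demo products table.
test_rows, train_rows = split_by_index(42)
print(len(test_rows), len(train_rows))  # prints "11 31"
```

Under these assumptions the counts agree with the testSamples and trainSamples fields in the response for this example.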
curl -X POST \
  https://aito-demo.aito.app/api/v1/_evaluate \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '{ "test": { "$index": { "$mod": [4, 0] } }, "evaluate": { "from": "products", "where": { "name": { "$get": "name" } }, "predict": "tags" } }'
Response
{ "n": 11, "testSamples": 11, "trainSamples": 31, "features": 203, "error": 0.09090909090909094, "baseError": 0.7272727272727273, "accuracy": 0.9090909090909091, "baseAccuracy": 0.2727272727272727, "accuracyGain": 0.6363636363636364, "meanRank": 2.090909090909091, "baseMeanRank": 4.636363636363637, "rankGain": 2.545454545454546, "informationGain": 2.2066064654562325, "mxe": 1.4400780671492726, "h": 3.646684532605505, "geomMeanP": 0.3685473609393994, "baseGmp": 0.0798433170066807, "geomMeanLift": 4.615882390113653, "meanNs": 8236222.181818182, "meanUs": 8236.222181818182, "meanMs": 8.236222181818182, "medianNs": 8355539, "medianUs": 8355.539, "medianMs": 8.355539, "allNs": [ 4709341, 5930798, 7958994, 8355539, 4094623, 6398333, 11801244, 10139013, 8999718, 11887446, 10323395 ], "allUs": [ 4709, 5930, 7958, 8355, 4094, 6398, 11801, 10139, 8999, 11887, 10323 ], "allMs": [4, 5, 7, 8, 4, 6, 11, 10, 8, 11, 10], "warmingMs": 0, "accurateOffsets": [0, 1, 2, 3, 4, 5, 6, 7, 8, 10], "errorOffsets": [9], "cases": [ { "offset": 0, "testCase": { "category": "100", "id": "2000818700008", "name": "Pirkka banana", "price": 0.166, "tags": "fresh fruit pirkka" }, "accurate": true, "top": { "$p": 0.3863059328219029, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.3863059328219029, "field": "tags", "feature": "pirkka" } }, { "offset": 1, "testCase": { "category": "100", "id": "6410405093677", "name": "Pirkka iceberg salad Finland 100g 1st class", "price": 1.29, "tags": "fresh vegetable pirkka salad" }, "accurate": true, "top": { "$p": 0.34808924483006115, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.34808924483006115, "field": "tags", "feature": "pirkka" } }, { "offset": 2, "testCase": { "category": "101", "id": "6413467282508", "name": "Fazer Puikula fullcorn rye bread 330g", "price": 1.29, "tags": "gluten bread" }, "accurate": true, "top": { "$p": 0.42760884737731225, "field": "tags", "feature": "bread" }, "correct": { "$p": 0.42760884737731225, "field": 
"tags", "feature": "bread" } }, { "offset": 3, "testCase": { "category": "102", "id": "6410405205483", "name": "Pirkka Finnish beef-pork minced meat 20% 400g", "price": 2.79, "tags": "meat food protein pirkka" }, "accurate": true, "top": { "$p": 0.48471284123013486, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.48471284123013486, "field": "tags", "feature": "pirkka" } }, { "offset": 4, "testCase": { "category": "103", "id": "6412000030026", "name": "Saarioinen Maksalaatikko liver casserole 400g", "price": 1.99, "tags": "meat food" }, "accurate": true, "top": { "$p": 0.2528185612401276, "field": "tags", "feature": "food" }, "correct": { "$p": 0.2528185612401276, "field": "tags", "feature": "food" } }, { "offset": 5, "testCase": { "category": "104", "id": "6410405082657", "name": "Pirkka Finnish semi-skimmed milk 1l", "price": 0.81, "tags": "lactose drink pirkka" }, "accurate": true, "top": { "$p": 0.30775002048928934, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.30775002048928934, "field": "tags", "feature": "pirkka" } }, { "offset": 6, "testCase": { "category": "104", "id": "6408430000258", "name": "Valio eilaâ„¢ Lactose-free semi-skimmed milk drink 1l", "price": 1.95, "tags": "lactose-free drink" }, "accurate": true, "top": { "$p": 0.33827403115335014, "field": "tags", "feature": "drink" }, "correct": { "$p": 0.33827403115335014, "field": "tags", "feature": "drink" } }, { "offset": 7, "testCase": { "category": "108", "id": "6420101441542", "name": "Kulta Katriina filter coffee 500g", "price": 3.45, "tags": "coffee" }, "accurate": true, "top": { "$p": 0.5601975730961072, "field": "tags", "feature": "coffee" }, "correct": { "$p": 0.5601975730961072, "field": "tags", "feature": "coffee" } }, { "offset": 8, "testCase": { "category": "109", "id": "6411401015090", "name": "Fazer Sininen milk chocolate slab 200g", "price": 2.19, "tags": "candy lactose" }, "accurate": true, "top": { "$p": 0.39760749608842766, "field": "tags", "feature": 
"lactose" }, "correct": { "$p": 0.39760749608842766, "field": "tags", "feature": "lactose" } }, { "offset": 9, "testCase": { "category": "111", "id": "6413200330206", "name": "Lotus Soft Embo 8 rll toilet paper", "price": 3.35, "tags": "toilet-paper" }, "accurate": false, "top": { "$p": 0.18373901890597413, "field": "tags", "feature": "pirkka" } }, { "offset": 10, "testCase": { "category": "115", "id": "6410402010318", "name": "Pirkka tuna fish pieces in oil 200g/150g", "price": 1.69, "tags": "meat food protein pirkka" }, "accurate": true, "top": { "$p": 0.2829491378278684, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.2829491378278684, "field": "tags", "feature": "pirkka" } } ], "accurateCases": [ { "offset": 0, "testCase": { "category": "100", "id": "2000818700008", "name": "Pirkka banana", "price": 0.166, "tags": "fresh fruit pirkka" }, "accurate": true, "top": { "$p": 0.3863059328219029, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.3863059328219029, "field": "tags", "feature": "pirkka" } }, { "offset": 1, "testCase": { "category": "100", "id": "6410405093677", "name": "Pirkka iceberg salad Finland 100g 1st class", "price": 1.29, "tags": "fresh vegetable pirkka salad" }, "accurate": true, "top": { "$p": 0.34808924483006115, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.34808924483006115, "field": "tags", "feature": "pirkka" } }, { "offset": 2, "testCase": { "category": "101", "id": "6413467282508", "name": "Fazer Puikula fullcorn rye bread 330g", "price": 1.29, "tags": "gluten bread" }, "accurate": true, "top": { "$p": 0.42760884737731225, "field": "tags", "feature": "bread" }, "correct": { "$p": 0.42760884737731225, "field": "tags", "feature": "bread" } }, { "offset": 3, "testCase": { "category": "102", "id": "6410405205483", "name": "Pirkka Finnish beef-pork minced meat 20% 400g", "price": 2.79, "tags": "meat food protein pirkka" }, "accurate": true, "top": { "$p": 0.48471284123013486, "field": "tags", "feature": 
"pirkka" }, "correct": { "$p": 0.48471284123013486, "field": "tags", "feature": "pirkka" } }, { "offset": 4, "testCase": { "category": "103", "id": "6412000030026", "name": "Saarioinen Maksalaatikko liver casserole 400g", "price": 1.99, "tags": "meat food" }, "accurate": true, "top": { "$p": 0.2528185612401276, "field": "tags", "feature": "food" }, "correct": { "$p": 0.2528185612401276, "field": "tags", "feature": "food" } }, { "offset": 5, "testCase": { "category": "104", "id": "6410405082657", "name": "Pirkka Finnish semi-skimmed milk 1l", "price": 0.81, "tags": "lactose drink pirkka" }, "accurate": true, "top": { "$p": 0.30775002048928934, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.30775002048928934, "field": "tags", "feature": "pirkka" } }, { "offset": 6, "testCase": { "category": "104", "id": "6408430000258", "name": "Valio eilaâ„¢ Lactose-free semi-skimmed milk drink 1l", "price": 1.95, "tags": "lactose-free drink" }, "accurate": true, "top": { "$p": 0.33827403115335014, "field": "tags", "feature": "drink" }, "correct": { "$p": 0.33827403115335014, "field": "tags", "feature": "drink" } }, { "offset": 7, "testCase": { "category": "108", "id": "6420101441542", "name": "Kulta Katriina filter coffee 500g", "price": 3.45, "tags": "coffee" }, "accurate": true, "top": { "$p": 0.5601975730961072, "field": "tags", "feature": "coffee" }, "correct": { "$p": 0.5601975730961072, "field": "tags", "feature": "coffee" } }, { "offset": 8, "testCase": { "category": "109", "id": "6411401015090", "name": "Fazer Sininen milk chocolate slab 200g", "price": 2.19, "tags": "candy lactose" }, "accurate": true, "top": { "$p": 0.39760749608842766, "field": "tags", "feature": "lactose" }, "correct": { "$p": 0.39760749608842766, "field": "tags", "feature": "lactose" } }, { "offset": 10, "testCase": { "category": "115", "id": "6410402010318", "name": "Pirkka tuna fish pieces in oil 200g/150g", "price": 1.69, "tags": "meat food protein pirkka" }, "accurate": true, "top": { 
"$p": 0.2829491378278684, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.2829491378278684, "field": "tags", "feature": "pirkka" } } ], "errorCases": [ { "offset": 9, "testCase": { "category": "111", "id": "6413200330206", "name": "Lotus Soft Embo 8 rll toilet paper", "price": 3.35, "tags": "toilet-paper" }, "accurate": false, "top": { "$p": 0.18373901890597413, "field": "tags", "feature": "pirkka" } } ], "alpha_binByTopScore": [ { "meanScore": 0.25681418461581484, "maxScore": 0.30775002048928934, "minScore": 0.18373901890597413, "accuracy": 0.75, "n": 4, "accurateOffsets": [4, 10, 5], "errorOffsets": [9] }, { "meanScore": 0.3675691762234355, "maxScore": 0.39760749608842766, "minScore": 0.33827403115335014, "accuracy": 1, "n": 4, "accurateOffsets": [6, 1, 0, 8], "errorOffsets": [] }, { "meanScore": 0.4908397539011848, "maxScore": 0.5601975730961072, "minScore": 0.42760884737731225, "accuracy": 1, "n": 3, "accurateOffsets": [2, 3, 7], "errorOffsets": [] } ] }
POST /api/v1/_similarity
Similarity can be used to return entries that are similar to the given sample object.
The sample object can be either a complete or a partial row. The similarity operation uses TF-IDF for scoring the documents.
The chapter Personalisation also explains a characteristic of the similarity model.
Name | Type | Description |
---|---|---|
body (required) | object | Similarity query |
Response | Type | Description |
---|---|---|
200 OK | object | Similarity results |
Example request
The examples are using the dataset of our grocery store demo app. To get deeper understanding of the data context, you can check out the demo app.
In the example we're finding similar products to a given existing product. Aito assumes that the given sample object is a hypothetical new object, which is why in this example the exact same product is also in the results.
curl -X POST \
  https://aito-demo.aito.app/api/v1/_similarity \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '{ "from": "products", "similarity": { "category": "108", "id": "6411300000494", "name": "Juhla Mokka coffee 500g sj", "price": 3.95, "tags": "coffee" }, "limit": 3 }'
Response
{ "offset": 0, "total": 42, "hits": [ { "$score": 4347.31478753839, "category": "108", "id": "6411300000494", "name": "Juhla Mokka coffee 500g sj", "price": 3.95, "tags": "coffee" }, { "$score": 368.87273582619054, "category": "108", "id": "6411300164653", "name": "Juhla Mokka Dark Roast coffee 500g hj", "price": 3.95, "tags": "coffee" }, { "$score": 18.108373975215677, "category": "108", "id": "6420101441542", "name": "Kulta Katriina filter coffee 500g", "price": 3.45, "tags": "coffee" } ] }
Example request
In the example we're finding similar products based on just a product name.
curl -X POST \
  https://aito-demo.aito.app/api/v1/_similarity \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '{ "from": "products", "similarity": { "name": "Hovis Seed Sensations Seven Seeds Original 800g" }, "limit": 3 }'
Response
{ "offset": 0, "total": 42, "hits": [ { "$score": 1, "category": "100", "id": "2000818700008", "name": "Pirkka banana", "price": 0.166, "tags": "fresh fruit pirkka" }, { "$score": 1, "category": "100", "id": "2000604700007", "name": "Cucumber Finland", "price": 0.9765, "tags": "fresh vegetable" }, { "$score": 1, "category": "100", "id": "6410405060457", "name": "Pirkka bio cherry tomatoes 250g international 1lk", "price": 1.29, "tags": "fresh vegetable pirkka tomato" } ] }
POST /api/v1/_match
Match the most likely value/feature of a column or any column of a linked table to a given hypothesis.
While match is similar to the Predict query, there are fine-grained differences explained below.
Match can return A) the row behind a link or B) the value inside a text field. If match is done against a non-analyzed field, it works similarly to predict, except that the inference algorithm is somewhat different.
Predict treats features as 'black boxes', and it does statistical reasoning purely based on the feature's own statistics. Match does 'glass box' statistical reasoning by using all the features found behind the link or within a field.
For example, if you are predicting a product, the predict query will look at the history of each individual product id. If there is no history for the product, Aito will not be able to do proper inference. On the other hand, if you are matching the product, Aito will look at the product category, title and description. This enables Aito to match products it has never seen before, as long as it is familiar with their internal features.
The chapter Personalisation also explains a characteristic of the matching.
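To make the difference in request shape concrete, the sketch below prepares a Match request next to the corresponding Predict body, using only the Python standard library. The helper name `build_request` and the fallback values for the environment variables are assumptions made for this illustration. Nothing is sent over the network.

```python
import json
import os
import urllib.request

# Assumed fallbacks for illustration; set the environment variables for real use.
INSTANCE_URL = os.environ.get("AITO_INSTANCE_URL", "https://aito-demo.aito.app")
API_KEY = os.environ.get("AITO_API_KEY", "YOUR_API_KEY")


def build_request(endpoint: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request for a query endpoint."""
    return urllib.request.Request(
        url=f"{INSTANCE_URL.rstrip('/')}/api/v1/{endpoint}",
        data=json.dumps(body).encode("utf-8"),
        headers={"content-type": "application/json", "x-api-key": API_KEY},
        method="POST",
    )


where = {"context.user": "larry"}

# Match reasons over the features found behind the `product` link.
match_body = {"from": "impressions", "where": where, "match": "product", "limit": 5}

# Predict treats each product id as an opaque value with its own statistics.
predict_body = {"from": "impressions", "where": where, "predict": "product", "limit": 5}

match_request = build_request("_match", match_body)
# To actually send it: urllib.request.urlopen(match_request, timeout=30)
```

The two bodies differ only in the operator key, which is why the choice between them is about the inference behavior described above and not about the query syntax.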
Name | Type | Description |
---|---|---|
body (required) | object | Match query |
Response | Type | Description |
---|---|---|
200 OK | object | Match results |
Match user to products
The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.
In the example we're matching a user to products.
curl -X POST \
  https://aito-demo.aito.app/api/v1/_match \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "from": "impressions",
    "where": { "context.user": "larry" },
    "match": "product",
    "limit": 5
  }'
Response
{ "offset": 0, "total": 42, "hits": [ { "$p": 0.03483352113522869, "category": "108", "id": "6411300000494", "name": "Juhla Mokka coffee 500g sj", "price": 3.95, "tags": "coffee" }, { "$p": 0.03477540068839571, "category": "108", "id": "6420101441542", "name": "Kulta Katriina filter coffee 500g", "price": 3.45, "tags": "coffee" }, { "$p": 0.03477540068839571, "category": "108", "id": "6410405181190", "name": "Pirkka Costa Rica filter coffee 500g UTZ", "price": 2.89, "tags": "coffee pirkka" }, { "$p": 0.033224315232439634, "category": "100", "id": "6410405060457", "name": "Pirkka bio cherry tomatoes 250g international 1lk", "price": 1.29, "tags": "fresh vegetable pirkka tomato" }, { "$p": 0.03295429335429589, "category": "108", "id": "6411300164653", "name": "Juhla Mokka Dark Roast coffee 500g hj", "price": 3.95, "tags": "coffee" } ] }
POST /api/v1/_relate
Relate provides statistical information about data relationships.
It calculates correlations between a pair of features, which can be used, for example, to find causation and correlation.
By default, the hits are ordered by the relation.mi field, which indicates how strong the correlation is.
Name | Type | Description |
---|---|---|
body (required) | object | Relate query |
Response | Type | Description |
---|---|---|
200 OK | object | Relate results |
What features of products affect purchasing
The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.
In the example we ask Aito to explain which properties of products affect whether people purchase them. With $exists, we tell Aito to get all properties of the product (the impressions table links to the products table), and relate those to the condition {"purchase": true }.
The response may seem overwhelming, but it contains a lot of useful information.
In the second hit of the response below, the condition is { "product.name": { "$has": "ilta" } } and the "lift" value is high (compared to 1.0). It means that when the product name contains the word ilta, the product is ~2.9x more likely to be purchased compared to the average product (= base probability).
The lift is calculated as the probability of { "purchase": true } given the condition, divided by the base probability of { "purchase": true }. With the response field names, the formula is: ps.pOnCondition / ps.p.
In the response below, 4496 of the 87089 impressions led to a purchase (fs.f / fs.n), so the base probability (ps.p) is about 0.05.
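As a quick check of the lift formula, the following sketch recomputes the lift from the "ps" values of the second hit in the response shown in this section. The function name `lift` is chosen for this illustration only.

```python
import math


def lift(ps: dict) -> float:
    """Lift = P(related | condition) / P(related), using the `ps` field names."""
    return ps["pOnCondition"] / ps["p"]


# Values copied from the second hit of the response in this section.
second_hit_ps = {"p": 0.05163564547427404, "pOnCondition": 0.15204439523557498}
reported_lift = 2.944562691897145

computed_lift = lift(second_hit_ps)
# The recomputed value agrees with the "lift" field up to floating-point rounding.
agrees = math.isclose(computed_lift, reported_lift, rel_tol=1e-6)
```

A lift above 1.0 means the condition makes the related feature more likely than its base probability, and a lift below 1.0 means it makes it less likely.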
curl -X POST \
  https://aito-demo.aito.app/api/v1/_relate \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "from": "impressions",
    "where": { "$exists": "product" },
    "relate": [ { "purchase": true } ],
    "limit": 2
  }'
Response
{ "offset": 0, "total": 2, "hits": [ { "related": { "purchase": { "$has": true } }, "condition": { "product.category": { "$has": "111" } }, "lift": 2.339270518921214, "fs": { "f": 4496, "fOnCondition": 658.59375, "fOnNotCondition": 3837.40625, "fCondition": 5447.66154661017, "n": 87089 }, "ps": { "p": 0.05163564547427404, "pOnCondition": 0.12078974318343687, "pOnNotCondition": 0.04701024867056299, "pCondition": 0.06255282381234953 }, "info": { "h": 0.2933048451230419, "mi": 0.052056347717906876, "miTrue": 0.14809531647853316, "miFalse": -0.09603896876062629 }, "relation": { "n": 87089, "varFs": [5447.66154661017, 4496], "stateFs": [77803.93220338984, 4789.06779661017, 3837.40625, 658.59375], "mi": 0.0035604469151132847 } }, { "related": { "purchase": { "$has": true } }, "condition": { "product.name": { "$has": "ilta" } }, "lift": 2.944562691897145, "fs": { "f": 4496, "fOnCondition": 264, "fOnNotCondition": 4232, "fCondition": 1732, "n": 87089 }, "ps": { "p": 0.05163564547427404, "pOnCondition": 0.15204439523557498, "pOnNotCondition": 0.04958775636552271, "pCondition": 0.019887742693174074 }, "info": { "h": 0.2933048451230419, "mi": 0.09998855952306565, "miTrue": 0.23689328541005905, "miFalse": -0.1369047258869934 }, "relation": { "n": 87089, "varFs": [1732, 4496], "stateFs": [81125, 1468, 4232, 264], "mi": 0.0020498727155220694 } } ] }
POST /api/v1/_query
Generic query is a powerful expert interface.
It provides the functionality of every other query type in the API. Search, Similarity, Match, and Recommend can be seen as convenience APIs for the generic query.
The query format resembles the Search query, except that it supports a "get" statement. Since this endpoint provides the functionality of all other queries, "get": "product" is used as a replacement for the "predict": "product", "recommend": "product", and "match": "product" counterparts.
The chapter Personalisation also explains a characteristic of the inference model.
The "get" operation changes the namespaces of the "select" and "orderBy" operations. The namespace is changed from the "from" table to the linked table (specified with "get").
As an example, consider the following query. The impressions table has a column called product, which links to a row in the products table. The price and title fields are columns of products.
{ "from": "impressions", "where": { "query": "macbook air 2018" }, "get": "product", "orderBy": ["price"], "select": ["title", "$highlight"] }
When using "select" and "orderBy", we are already in the products table namespace, so there is no need to write product.title or product.price.
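The replacement of "predict", "recommend", and "match" by "get" can be sketched as a small rewrite of the query body. The helper `to_generic_query` is hypothetical and covers only the linked-column case described above. It does not attempt to translate anything else, such as the "orderBy" clauses used in the generic examples of this section.

```python
CONVENIENCE_KEYS = ("predict", "recommend", "match")


def to_generic_query(query: dict) -> dict:
    """Rename the convenience operator to "get"; all other keys are kept as-is."""
    generic = dict(query)  # shallow copy, the input is left unchanged
    for key in CONVENIENCE_KEYS:
        if key in generic:
            generic["get"] = generic.pop(key)
            break
    return generic


match_query = {
    "from": "impressions",
    "where": {"context.user": "larry"},
    "match": "product",
    "limit": 5,
}
generic_query = to_generic_query(match_query)
# generic_query now carries "get": "product" instead of "match": "product".
```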
Related information: $p and $lift
Name | Type | Description |
---|---|---|
body (required) | object | Generic query |
Response | Type | Description |
---|---|---|
200 OK | object | Query results |
Search query
Simple search query with the generic query.
curl -X POST \
  https://aito-demo.aito.app/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "from": "products",
    "where": { "id": "6410402010318" }
  }'
Response
{ "offset": 0, "total": 1, "hits": [ { "category": "115", "id": "6410402010318", "name": "Pirkka tuna fish pieces in oil 200g/150g", "price": 1.69, "tags": "meat food protein pirkka" } ] }
Search query with highlighted results
Search query which returns related products ordered by similarity. The response also contains the highlighted words that matched the search term.
curl -X POST \
  https://aito-demo.aito.app/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "from": "products",
    "where": { "name": { "$match": "coffee" } },
    "select": ["id", "name", "tags", "price", "$score", "$highlight"],
    "orderBy": "$similarity"
  }'
Response
{ "offset": 0, "total": 4, "hits": [ { "id": "6411300000494", "name": "Juhla Mokka coffee 500g sj", "tags": "coffee", "price": 3.95, "$score": 2.1726635013471625, "$highlight": [ { "score": 1.1194647495169912, "field": "name", "highlight": "Juhla Mokka <font color=\"green\">coffee</font> 500g sj" } ] }, { "id": "6420101441542", "name": "Kulta Katriina filter coffee 500g", "tags": "coffee", "price": 3.45, "$score": 2.1726635013471625, "$highlight": [ { "score": 1.1194647495169912, "field": "name", "highlight": "Kulta Katriina filter <font color=\"green\">coffee</font> 500g" } ] }, { "id": "6411300164653", "name": "Juhla Mokka Dark Roast coffee 500g hj", "tags": "coffee", "price": 3.95, "$score": 2.1726635013471625, "$highlight": [ { "score": 1.1194647495169912, "field": "name", "highlight": "Juhla Mokka Dark Roast <font color=\"green\">coffee</font> 500g hj" } ] }, { "id": "6410405181190", "name": "Pirkka Costa Rica filter coffee 500g UTZ", "tags": "coffee pirkka", "price": 2.89, "$score": 2.1726635013471625, "$highlight": [ { "score": 1.1194647495169912, "field": "name", "highlight": "Pirkka Costa Rica filter <font color=\"green\">coffee</font> 500g UTZ" } ] } ] }
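On the client side, the highlighted words can be pulled out of the "$highlight" entries with a small amount of string handling. The sketch below assumes the markup looks exactly like in the response above (a font tag with color="green"). The helper names are made up for this illustration, and other highlight formats are not handled.

```python
import re
from typing import List

# Assumption: highlights are wrapped as in the sample response above.
HIGHLIGHT_PATTERN = re.compile(r'<font color="green">(.*?)</font>')


def highlighted_terms(hit: dict) -> List[str]:
    """Collect the highlighted words from every "$highlight" entry of a hit."""
    terms: List[str] = []
    for entry in hit.get("$highlight", []):
        terms.extend(HIGHLIGHT_PATTERN.findall(entry["highlight"]))
    return terms


def plain_text(highlight: str) -> str:
    """Remove the font tags and keep only the text."""
    return re.sub(r"</?font[^>]*>", "", highlight)


sample_hit = {
    "name": "Juhla Mokka coffee 500g sj",
    "$highlight": [
        {
            "score": 1.1194647495169912,
            "field": "name",
            "highlight": 'Juhla Mokka <font color="green">coffee</font> 500g sj',
        }
    ],
}
terms = highlighted_terms(sample_hit)
```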
Generic similarity query
In the example we're finding similar products based on the given hypothetical new product name.
curl -X POST \
  https://aito-demo.aito.app/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "from": "products",
    "orderBy": { "$similarity": { "name": "Atria bratwurst 175g" } },
    "limit": 2
  }'
Response
{ "offset": 0, "total": 42, "hits": [ { "$score": 2.7310670795288075, "category": "102", "id": "6407870070333", "name": "Atria lauantaimakkara bread sausage 225g", "price": 0.89, "tags": "meat sausage with-bread" }, { "$score": 2.7310670795288075, "category": "102", "id": "6407870071224", "name": "Atria Gotler ham sausage 300g", "price": 1.75, "tags": "meat sausage with-bread" } ] }
Generic predict query
In the example we're predicting which tags a new hypothetical product could have.
curl -X POST \
  https://aito-demo.aito.app/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "from": "products",
    "where": { "name": "Atria bratwurst 175g" },
    "get": "tags.$feature",
    "orderBy": "$p",
    "limit": 5
  }'
Response
{ "offset": 0, "total": 25, "hits": [ { "$p": 0.30473884347328356, "field": "", "feature": "meat" }, { "$p": 0.2715004188138674, "field": "", "feature": "sausage" }, { "$p": 0.05475553563545456, "field": "", "feature": "food" }, { "$p": 0.03369571423720281, "field": "", "feature": "protein" }, { "$p": 0.031178962371886616, "field": "", "feature": "pirkka" } ] }
Recommend products which a customer would most likely purchase
In the example we're finding the top 5 products which veronica (user id) would most likely purchase, based on her behavior history stored in the impressions table.
This example is the same as in the documentation of the Recommendation endpoint, but made with the generic query.
curl -X POST \
  https://aito-demo.aito.app/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "from": "impressions",
    "where": { "context.user": "veronica" },
    "get": "product",
    "orderBy": { "$p": { "$context": { "purchase": true } } },
    "limit": 5
  }'
Response
{ "offset": 0, "total": 42, "hits": [ { "$p": 0.4824468810415186, "category": "100", "id": "2000818700008", "name": "Pirkka banana", "price": 0.166, "tags": "fresh fruit pirkka" }, { "$p": 0.4125168211770318, "category": "100", "id": "2000503600002", "name": "Chiquita banana", "price": 0.28054, "tags": "fresh fruit" }, { "$p": 0.3889196836779399, "category": "111", "id": "6414880021620", "name": "Ilta Sanomat weekend news", "price": 2.3, "tags": "news" }, { "$p": 0.3805216224063433, "category": "100", "id": "6410405060457", "name": "Pirkka bio cherry tomatoes 250g international 1lk", "price": 1.29, "tags": "fresh vegetable pirkka tomato" }, { "$p": 0.37278546698503334, "category": "100", "id": "6410405093677", "name": "Pirkka iceberg salad Finland 100g 1st class", "price": 1.29, "tags": "fresh vegetable pirkka salad" } ] }
Query with custom scoring
In the example we're finding the top 3 products which veronica (user id) would most likely purchase, but in addition we're boosting products which have a higher price. This recommends products which are relevant for the user but also bring higher revenue to the shop. It demonstrates a situation where multiple factors should be considered in recommendations.
curl -X POST \
  https://aito-demo.aito.app/api/v1/_query \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "from": "impressions",
    "where": { "context.user": "veronica" },
    "get": "product",
    "orderBy": {
      "$multiply": [
        { "$p": { "$context": { "purchase": true } } },
        "price"
      ]
    },
    "limit": 3
  }'
Response
{ "offset": 0, "total": 42, "hits": [ { "$score": 0.9740401547366683, "category": "111", "id": "6413200330206", "name": "Lotus Soft Embo 8 rll toilet paper", "price": 3.35, "tags": "toilet-paper" }, { "$score": 0.8945152724592617, "category": "111", "id": "6414880021620", "name": "Ilta Sanomat weekend news", "price": 2.3, "tags": "news" }, { "$score": 0.794395202320012, "category": "108", "id": "6410405181190", "name": "Pirkka Costa Rica filter coffee 500g UTZ", "price": 2.89, "tags": "coffee pirkka" } ] }
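The custom score can be checked by hand for the one product that appears in both this response and the plain recommendation response earlier in this section (Ilta Sanomat weekend news). The sketch below multiplies its "$p" by its price. The function name `multiply_score` is an assumption for this illustration.

```python
import math


def multiply_score(p: float, price: float) -> float:
    """Score produced by multiplying the purchase probability by the price."""
    return p * price


# "$p" from the recommendation response, "price" from the product row.
ilta_sanomat_p = 0.3889196836779399
ilta_sanomat_price = 2.3
reported_score = 0.8945152724592617  # "$score" in the response above

computed_score = multiply_score(ilta_sanomat_p, ilta_sanomat_price)
matches = math.isclose(computed_score, reported_score, rel_tol=1e-9)
```

This also shows why the ordering changes: a product with a lower purchase probability can outrank a more likely one if its price is high enough.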
POST /api/v1/jobs/{query}
Create a job for queries that last longer than 30 seconds. The regular endpoints reach a timeout after 30 seconds.
You can create a job for the Predict, Match, Similarity, Generic, and Evaluate query endpoints. The query used is the same as you would use for the regular endpoint.
The API also supports running some of the more time-consuming database operations as jobs. For these operations, the jobs API is the recommended way to call the API, due to the query timeout limit. The available operations are the Batch Data Insert, Data Delete, and Optimize endpoints. The payload format is identical to the regular operations.
Name | Type | Description |
---|---|---|
query (required) | string | Any of the Aito query endpoints |
Response | Type | Description |
---|---|---|
200 OK | object | Job info |
Example request
The examples use the dataset of our grocery store demo app. To get a deeper understanding of the data context, you can check out the demo app.
The example query is exactly the same as it would be when using the regular _evaluate endpoint.
In the example we're evaluating how good Aito's results are when we predict tags for a new hypothetical product. The results give us the accuracy and performance of the prediction example shown in the Predict operation's documentation.
$index is a built-in variable which tells the insertion index of a row. In the example, we select 1/4 of the rows in the products table to be used as test data. The rest of the rows are automatically used as training data. Aito iterates through each product in the test data, and tests how accurate the prediction of tags for a given product name was.
curl -X POST \
  https://aito-demo.aito.app/api/v1/jobs/_evaluate \
  -H 'content-type: application/json' \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi' \
  -d '
  {
    "test": { "$index": { "$mod": [4, 0] } },
    "evaluate": {
      "from": "products",
      "where": { "name": { "$get": "name" } },
      "predict": "tags"
    }
  }'
Response
{ "id": "1a97404d-a40e-4e50-b689-9edc3f7eebee", "parameters": { }, "path": "_evaluate", "startedAt": "2024-11-10T15:49:31.720835Z" }
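The test/training split selected by "$mod": [4, 0] can be reproduced with plain modulo arithmetic. The sketch below assumes that $index starts from 0, that [4, 0] means "index modulo 4 equals 0", and that the products table has 42 rows (the "total" reported by the product queries in this reference). The function name `split_by_index` is made up for this example.

```python
from typing import List, Tuple


def split_by_index(row_count: int, divisor: int, remainder: int) -> Tuple[List[int], List[int]]:
    """Return (test, train) row indices for a {"$mod": [divisor, remainder]} test filter."""
    test = [i for i in range(row_count) if i % divisor == remainder]
    train = [i for i in range(row_count) if i % divisor != remainder]
    return test, train


test_rows, train_rows = split_by_index(row_count=42, divisor=4, remainder=0)
# Under these assumptions the split has 11 test rows and 31 training rows.
split_sizes = (len(test_rows), len(train_rows))
```

Under these assumptions the sizes agree with the "testSamples" and "trainSamples" values in the job result shown later in this section.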
GET /api/v1/jobs/
List all jobs that exist currently.
Response | Type | Description |
---|---|---|
200 OK | object | Job statuses |
Example request
curl -X GET \
  https://aito-demo.aito.app/api/v1/jobs/ \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi'
Response
[ { "expiresAt": "2024-11-10T16:04:32.093043Z", "finishedAt": "2024-11-10T15:49:32.093043Z", "id": "1a97404d-a40e-4e50-b689-9edc3f7eebee", "parameters": { }, "path": "_evaluate", "startedAt": "2024-11-10T15:49:31.720835Z" } ]
GET /api/v1/jobs/{uuid}
Get the status of a previously started job by its ID.
Response | Type | Description |
---|---|---|
200 OK | object | Job status |
Example request
curl -X GET \
  https://aito-demo.aito.app/api/v1/jobs/1a97404d-a40e-4e50-b689-9edc3f7eebee \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi'
Response
{ "expiresAt": "2024-11-10T16:04:32.093043Z", "finishedAt": "2024-11-10T15:49:32.093043Z", "id": "1a97404d-a40e-4e50-b689-9edc3f7eebee", "parameters": { }, "path": "_evaluate", "startedAt": "2024-11-10T15:49:31.720835Z" }
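A client typically polls this endpoint until the job is done and then fetches the result. The sketch below is one possible way to do that with the Python standard library. It assumes that a job is finished once its status contains "finishedAt" (as in the response above, and unlike the response returned right after job creation). The function names, the polling interval, and the retry limit are choices made for this illustration.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_json(url: str, api_key: str) -> dict:
    """GET a URL with the x-api-key header and decode the JSON body."""
    request = urllib.request.Request(url, headers={"x-api-key": api_key})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def is_finished(status: dict) -> bool:
    # Assumption: "finishedAt" is only present once the job has completed.
    return "finishedAt" in status


def wait_for_job(
    base_url: str,
    job_id: str,
    api_key: str,
    fetch: Callable[[str, str], dict] = fetch_json,
    poll_seconds: float = 2.0,
    max_polls: int = 150,
) -> str:
    """Poll the job status and return the URL of the result once it is finished."""
    status_url = f"{base_url.rstrip('/')}/api/v1/jobs/{job_id}"
    for _ in range(max_polls):
        if is_finished(fetch(status_url, api_key)):
            return f"{status_url}/result"
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

The `fetch` parameter exists so that the polling logic can be exercised without network access. In real use, the returned URL can be passed to `fetch_json` to read the job result.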
GET /api/v1/jobs/{uuid}/result
Get the query result for a created job.
Response | Type | Description |
---|---|---|
200 OK | object | Evaluate job result |
Example request
curl -X GET \
  https://aito-demo.aito.app/api/v1/jobs/1a97404d-a40e-4e50-b689-9edc3f7eebee/result \
  -H 'x-api-key: yg4rTlXkqDzm4y8gPeY75HCKaNwfbTQ2si64ONTi'
Response
{ "n": 11, "testSamples": 11, "trainSamples": 31, "features": 203, "error": 0.09090909090909094, "baseError": 0.7272727272727273, "accuracy": 0.9090909090909091, "baseAccuracy": 0.2727272727272727, "accuracyGain": 0.6363636363636364, "meanRank": 2.090909090909091, "baseMeanRank": 4.636363636363637, "rankGain": 2.545454545454546, "informationGain": 2.2066064654562325, "mxe": 1.4400780671492726, "h": 3.646684532605505, "geomMeanP": 0.3685473609393994, "baseGmp": 0.0798433170066807, "geomMeanLift": 4.615882390113653, "meanNs": 7991752.7272727275, "meanUs": 7991.7527272727275, "meanMs": 7.991752727272727, "medianNs": 7369366, "medianUs": 7369.366, "medianMs": 7.369366, "allNs": [ 6607197, 7901433, 8642123, 10903722, 5531592, 7369366, 11605853, 7301086, 6841940, 7317825, 7887143 ], "allUs": [6607, 7901, 8642, 10903, 5531, 7369, 11605, 7301, 6841, 7317, 7887], "allMs": [6, 7, 8, 10, 5, 7, 11, 7, 6, 7, 7], "warmingMs": 0, "accurateOffsets": [0, 1, 2, 3, 4, 5, 6, 7, 8, 10], "errorOffsets": [9], "cases": [ { "offset": 0, "testCase": { "category": "100", "id": "2000818700008", "name": "Pirkka banana", "price": 0.166, "tags": "fresh fruit pirkka" }, "accurate": true, "top": { "$p": 0.3863059328219029, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.3863059328219029, "field": "tags", "feature": "pirkka" } }, { "offset": 1, "testCase": { "category": "100", "id": "6410405093677", "name": "Pirkka iceberg salad Finland 100g 1st class", "price": 1.29, "tags": "fresh vegetable pirkka salad" }, "accurate": true, "top": { "$p": 0.34808924483006115, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.34808924483006115, "field": "tags", "feature": "pirkka" } }, { "offset": 2, "testCase": { "category": "101", "id": "6413467282508", "name": "Fazer Puikula fullcorn rye bread 330g", "price": 1.29, "tags": "gluten bread" }, "accurate": true, "top": { "$p": 0.42760884737731225, "field": "tags", "feature": "bread" }, "correct": { "$p": 0.42760884737731225, "field": 
"tags", "feature": "bread" } }, { "offset": 3, "testCase": { "category": "102", "id": "6410405205483", "name": "Pirkka Finnish beef-pork minced meat 20% 400g", "price": 2.79, "tags": "meat food protein pirkka" }, "accurate": true, "top": { "$p": 0.48471284123013486, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.48471284123013486, "field": "tags", "feature": "pirkka" } }, { "offset": 4, "testCase": { "category": "103", "id": "6412000030026", "name": "Saarioinen Maksalaatikko liver casserole 400g", "price": 1.99, "tags": "meat food" }, "accurate": true, "top": { "$p": 0.2528185612401276, "field": "tags", "feature": "food" }, "correct": { "$p": 0.2528185612401276, "field": "tags", "feature": "food" } }, { "offset": 5, "testCase": { "category": "104", "id": "6410405082657", "name": "Pirkka Finnish semi-skimmed milk 1l", "price": 0.81, "tags": "lactose drink pirkka" }, "accurate": true, "top": { "$p": 0.30775002048928934, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.30775002048928934, "field": "tags", "feature": "pirkka" } }, { "offset": 6, "testCase": { "category": "104", "id": "6408430000258", "name": "Valio eilaâ„¢ Lactose-free semi-skimmed milk drink 1l", "price": 1.95, "tags": "lactose-free drink" }, "accurate": true, "top": { "$p": 0.33827403115335014, "field": "tags", "feature": "drink" }, "correct": { "$p": 0.33827403115335014, "field": "tags", "feature": "drink" } }, { "offset": 7, "testCase": { "category": "108", "id": "6420101441542", "name": "Kulta Katriina filter coffee 500g", "price": 3.45, "tags": "coffee" }, "accurate": true, "top": { "$p": 0.5601975730961072, "field": "tags", "feature": "coffee" }, "correct": { "$p": 0.5601975730961072, "field": "tags", "feature": "coffee" } }, { "offset": 8, "testCase": { "category": "109", "id": "6411401015090", "name": "Fazer Sininen milk chocolate slab 200g", "price": 2.19, "tags": "candy lactose" }, "accurate": true, "top": { "$p": 0.39760749608842766, "field": "tags", "feature": 
"lactose" }, "correct": { "$p": 0.39760749608842766, "field": "tags", "feature": "lactose" } }, { "offset": 9, "testCase": { "category": "111", "id": "6413200330206", "name": "Lotus Soft Embo 8 rll toilet paper", "price": 3.35, "tags": "toilet-paper" }, "accurate": false, "top": { "$p": 0.18373901890597413, "field": "tags", "feature": "pirkka" } }, { "offset": 10, "testCase": { "category": "115", "id": "6410402010318", "name": "Pirkka tuna fish pieces in oil 200g/150g", "price": 1.69, "tags": "meat food protein pirkka" }, "accurate": true, "top": { "$p": 0.2829491378278684, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.2829491378278684, "field": "tags", "feature": "pirkka" } } ], "accurateCases": [ { "offset": 0, "testCase": { "category": "100", "id": "2000818700008", "name": "Pirkka banana", "price": 0.166, "tags": "fresh fruit pirkka" }, "accurate": true, "top": { "$p": 0.3863059328219029, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.3863059328219029, "field": "tags", "feature": "pirkka" } }, { "offset": 1, "testCase": { "category": "100", "id": "6410405093677", "name": "Pirkka iceberg salad Finland 100g 1st class", "price": 1.29, "tags": "fresh vegetable pirkka salad" }, "accurate": true, "top": { "$p": 0.34808924483006115, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.34808924483006115, "field": "tags", "feature": "pirkka" } }, { "offset": 2, "testCase": { "category": "101", "id": "6413467282508", "name": "Fazer Puikula fullcorn rye bread 330g", "price": 1.29, "tags": "gluten bread" }, "accurate": true, "top": { "$p": 0.42760884737731225, "field": "tags", "feature": "bread" }, "correct": { "$p": 0.42760884737731225, "field": "tags", "feature": "bread" } }, { "offset": 3, "testCase": { "category": "102", "id": "6410405205483", "name": "Pirkka Finnish beef-pork minced meat 20% 400g", "price": 2.79, "tags": "meat food protein pirkka" }, "accurate": true, "top": { "$p": 0.48471284123013486, "field": "tags", "feature": 
"pirkka" }, "correct": { "$p": 0.48471284123013486, "field": "tags", "feature": "pirkka" } }, { "offset": 4, "testCase": { "category": "103", "id": "6412000030026", "name": "Saarioinen Maksalaatikko liver casserole 400g", "price": 1.99, "tags": "meat food" }, "accurate": true, "top": { "$p": 0.2528185612401276, "field": "tags", "feature": "food" }, "correct": { "$p": 0.2528185612401276, "field": "tags", "feature": "food" } }, { "offset": 5, "testCase": { "category": "104", "id": "6410405082657", "name": "Pirkka Finnish semi-skimmed milk 1l", "price": 0.81, "tags": "lactose drink pirkka" }, "accurate": true, "top": { "$p": 0.30775002048928934, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.30775002048928934, "field": "tags", "feature": "pirkka" } }, { "offset": 6, "testCase": { "category": "104", "id": "6408430000258", "name": "Valio eilaâ„¢ Lactose-free semi-skimmed milk drink 1l", "price": 1.95, "tags": "lactose-free drink" }, "accurate": true, "top": { "$p": 0.33827403115335014, "field": "tags", "feature": "drink" }, "correct": { "$p": 0.33827403115335014, "field": "tags", "feature": "drink" } }, { "offset": 7, "testCase": { "category": "108", "id": "6420101441542", "name": "Kulta Katriina filter coffee 500g", "price": 3.45, "tags": "coffee" }, "accurate": true, "top": { "$p": 0.5601975730961072, "field": "tags", "feature": "coffee" }, "correct": { "$p": 0.5601975730961072, "field": "tags", "feature": "coffee" } }, { "offset": 8, "testCase": { "category": "109", "id": "6411401015090", "name": "Fazer Sininen milk chocolate slab 200g", "price": 2.19, "tags": "candy lactose" }, "accurate": true, "top": { "$p": 0.39760749608842766, "field": "tags", "feature": "lactose" }, "correct": { "$p": 0.39760749608842766, "field": "tags", "feature": "lactose" } }, { "offset": 10, "testCase": { "category": "115", "id": "6410402010318", "name": "Pirkka tuna fish pieces in oil 200g/150g", "price": 1.69, "tags": "meat food protein pirkka" }, "accurate": true, "top": { 
"$p": 0.2829491378278684, "field": "tags", "feature": "pirkka" }, "correct": { "$p": 0.2829491378278684, "field": "tags", "feature": "pirkka" } } ], "errorCases": [ { "offset": 9, "testCase": { "category": "111", "id": "6413200330206", "name": "Lotus Soft Embo 8 rll toilet paper", "price": 3.35, "tags": "toilet-paper" }, "accurate": false, "top": { "$p": 0.18373901890597413, "field": "tags", "feature": "pirkka" } } ], "alpha_binByTopScore": [ { "meanScore": 0.25681418461581484, "maxScore": 0.30775002048928934, "minScore": 0.18373901890597413, "accuracy": 0.75, "n": 4, "accurateOffsets": [4, 10, 5], "errorOffsets": [9] }, { "meanScore": 0.3675691762234355, "maxScore": 0.39760749608842766, "minScore": 0.33827403115335014, "accuracy": 1, "n": 4, "accurateOffsets": [6, 1, 0, 8], "errorOffsets": [] }, { "meanScore": 0.4908397539011848, "maxScore": 0.5601975730961072, "minScore": 0.42760884737731225, "accuracy": 1, "n": 3, "accurateOffsets": [2, 3, 7], "errorOffsets": [] } ] }
Operations which manipulate the Aito database.
GET /api/v1/schema
Get the schema for the database.
Response | Type | Description |
---|---|---|
200 OK | object | The current active schema |
Example request
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X GET \
  https://your-env-name.aito.app/api/v1/schema \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'
Response
{ "schema": { "products": { "columns": { "description": { "analyzer": "english", "nullable": false, "type": "Text" }, "id": { "nullable": false, "type": "Int" }, "name": { "nullable": false, "type": "String" }, "price": { "nullable": false, "type": "Decimal" } }, "type": "table" } } }
PUT /api/v1/schema
Create or update the schema for the entire database.
Note:
Name | Type | Description |
---|---|---|
body (required) | object | The Aito schema definition |
Response | Type | Description |
---|---|---|
200 OK | object | The current active schema |
Example request
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X PUT \
  https://your-env-name.aito.app/api/v1/schema \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "schema": {
      "products": {
        "type": "table",
        "columns": {
          "id": { "type": "Int" },
          "name": { "type": "String" },
          "price": { "type": "Decimal" },
          "description": { "type": "Text", "analyzer": "English" }
        }
      }
    }
  }'
Response
{ "schema": { "products": { "columns": { "description": { "analyzer": "english", "type": "Text" }, "id": { "type": "Int" }, "name": { "type": "String" }, "price": { "type": "Decimal" } }, "type": "table" } } }
DELETE /api/v1/schema
Delete the entire database schema.
The operation deletes all data and contents of the database! The action is irreversible.
Response | Type | Description |
---|---|---|
200 OK | object | The summary of deletion |
Example request
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X DELETE \
  https://your-env-name.aito.app/api/v1/schema \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'
Response
{ "deleted": ["products"] }
GET /api/v1/schema/{table}
Get the schema of the specified table.
Name | Type | Description |
---|---|---|
table (required) | string | The name of the table |
Response | Type | Description |
---|---|---|
200 OK | object | The current schema of the table |
Example request
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X GET \
  https://your-env-name.aito.app/api/v1/schema/products \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'
Response
{ "columns": { "description": { "analyzer": "english", "nullable": false, "type": "Text" }, "id": { "nullable": false, "type": "Int" }, "name": { "nullable": false, "type": "String" }, "price": { "nullable": false, "type": "Decimal" } }, "type": "table" }
PUT /api/v1/schema/{table}
Update the schema of the specified table.
Note:
Name | Type | Description |
---|---|---|
table (required) | string | The name of the table |
body (required) | object | The new schema of the table |
Response | Type | Description |
---|---|---|
200 OK | object | The current schema of the table |
Example request
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X PUT \
  https://your-env-name.aito.app/api/v1/schema/products \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "type": "table",
    "columns": {
      "id": { "type": "Int" },
      "name": { "type": "String" },
      "price": { "type": "Decimal" },
      "description": { "type": "Text", "analyzer": "English" }
    }
  }'
Response
{ "columns": { "description": { "analyzer": "english", "type": "Text" }, "id": { "type": "Int" }, "name": { "type": "String" }, "price": { "type": "Decimal" } }, "type": "table" }
DELETE /api/v1/schema/{table}
Delete a single table in the schema.
The operation deletes all data and contents of the table! The action is irreversible.
Note: The delete operation fails if it would leave the database schema in a broken state.
For example, given the following schema:
{ "schema": { "users": { "type": "table", "columns": { "username": { "type": "String" } } }, "sessions" : { "type": "table", "columns": { "id" : { "type" : "String" }, "user" : { "type" : "String", "link": "users.username" } } } } }
The users table cannot be deleted before first changing the sessions table so that sessions.user is no longer linked to the users table.
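Before deleting a table, a client could check the schema for links that point into it. The sketch below walks a schema object of the shape shown above and lists the linking columns. The helper `columns_linking_to` is hypothetical and is not part of the API. The server performs its own validation regardless.

```python
from typing import List


def columns_linking_to(schema: dict, table: str) -> List[str]:
    """List "table.column" entries in other tables whose link targets `table`."""
    blockers: List[str] = []
    for other_name, other in schema.items():
        if other_name == table:
            continue
        for column_name, column in other.get("columns", {}).items():
            link = column.get("link", "")
            # A link has the form "targetTable.targetColumn".
            if link.split(".", 1)[0] == table:
                blockers.append(f"{other_name}.{column_name}")
    return blockers


schema = {
    "users": {"type": "table", "columns": {"username": {"type": "String"}}},
    "sessions": {
        "type": "table",
        "columns": {
            "id": {"type": "String"},
            "user": {"type": "String", "link": "users.username"},
        },
    },
}
blockers = columns_linking_to(schema, "users")
```

For the schema above, the only blocker for deleting users is sessions.user, while sessions itself can be deleted freely.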
Name | Type | Description |
---|---|---|
table (required) | string | The name of the table |
Response | Type | Description |
---|---|---|
200 OK | object | The summary of deletion |
Example request
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X DELETE \
  https://your-env-name.aito.app/api/v1/schema/products \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'
Response
{ "deleted": ["products"] }
GET /api/v1/schema/{table}/{column}
Get the schema of a column.
Name | Type | Description |
---|---|---|
table (required) | string | The name of the table |
column (required) | string | The name of the column |
Response | Type | Description |
---|---|---|
200 OK | object | The current schema of the column |
Example request
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X GET \
  https://your-env-name.aito.app/api/v1/schema/products/name \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY'
Response
{ "nullable": false, "type": "String" }
PUT /api/v1/schema/{table}/{column}
Add or replace a column of a table.
If a column with the same name already exists, the operation deletes all data and contents of that column! The action is irreversible.
Name | Type | Description |
---|---|---|
table (required) | string | The name of the table |
column (required) | string | The name of the column |
body (required) | object | The schema of the column |
Response | Type | Description |
---|---|---|
200 OK | object | The schema of the column |
Example request
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X PUT \
  https://your-env-name.aito.app/api/v1/schema/products/quantity \
  -H 'content-type: application/json' \
  -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \
  -d '
  {
    "type": "Int",
    "nullable": false,
    "value": 0
  }'
Response
{ "nullable": false, "type": "Int", "value": 0 }
DELETE /api/v1/schema/{table}/{column}
Delete a column from a table.
The operation deletes all data and contents of the column! The action is irreversible.
Note: The delete operation fails if it would leave the database schema in a broken state.
For example, given the following schema:
{ "schema": { "users": { "type": "table", "columns": { "username": { "type": "String" }, "name": { "type": "String" } } }, "sessions" : { "type": "table", "columns": { "id" : { "type" : "String" }, "user" : { "type" : "String", "link": "users.username" } } } } }
The column username of the users table cannot be deleted until the sessions table has been changed so that sessions.user is no longer linked to users.username.
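Before deleting a column, it can help to check client-side which other columns link to it. The sketch below is only an illustration: the helper name `find_linking_columns` is hypothetical, and it assumes the schema has the same shape as the example above.

```python
def find_linking_columns(schema, target):
    """Return every "table.column" whose "link" points at `target`."""
    linking = []
    for table_name, table in schema["schema"].items():
        for column_name, column in table.get("columns", {}).items():
            if column.get("link") == target:
                linking.append(f"{table_name}.{column_name}")
    return linking


# The example schema from the text above.
example_schema = {
    "schema": {
        "users": {
            "type": "table",
            "columns": {
                "username": {"type": "String"},
                "name": {"type": "String"},
            },
        },
        "sessions": {
            "type": "table",
            "columns": {
                "id": {"type": "String"},
                "user": {"type": "String", "link": "users.username"},
            },
        },
    }
}

# Lists the columns that would block deleting users.username.
print(find_linking_columns(example_schema, "users.username"))
```

An empty result suggests that no link in the schema would block the deletion.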
Name | Type | Description |
---|---|---|
tablerequired | string | The name of the table that contains the column |
columnrequired | string | The name of the column |
Response | Type | Description |
---|---|---|
200 OK | object | The summary of deletion |
Example request
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X DELETE \ https://your-env-name.aito.app/api/v1/schema/products/description \ -H 'content-type: application/json' \ -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \ -d ' {}'
Response
{ "deleted": ["description"] }
POST /api/v1/schema/_rename
Rename a table to the specified name.
Rename the table in the 'from' field to the specified name in the rename field. Set 'replace' to true, if you want to replace an existing table with the specified name.
The new table name must be valid. See Valid Table Names section for more information.
Name | Type | Description |
---|---|---|
bodyrequired | object | The request body |
Response | Type | Description |
---|---|---|
200 OK | object | Rename Table results |
Example request
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X POST \ https://your-env-name.aito.app/api/v1/schema/_rename \ -H 'content-type: application/json' \ -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \ -d ' { "from": "products", "rename": "renamed_products" }'
Response
{}
POST /api/v1/schema/_copy
Copy a table. This operation creates a copy of the table with the given name. The operation can be very fast, because the copying is done by copying the reference to the underlying immutable data structure.
The 'from' field must contain the name of the table to copy. The 'copy' field must contain the name of the new copy. Set the 'replace' field to true if you want to replace any existing table with the target name.
The new table name must be valid. See Valid Table Names section for more information.
Name | Type | Description |
---|---|---|
bodyrequired | object | The request body |
Response | Type | Description |
---|---|---|
200 OK | object | Copy Table results |
Example request
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X POST \ https://your-env-name.aito.app/api/v1/schema/_copy \ -H 'content-type: application/json' \ -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \ -d ' { "from": "products", "copy": "old_products" }'
Response
{}
POST /api/v1/data/{table}
Insert an entry into a table.
Name | Type | Description |
---|---|---|
tablerequired | string | The name of the table to add data to |
bodyrequired | object | Any object which is valid according to the provisioned schema |
Response | Type | Description |
---|---|---|
200 OK | object | The inserted entry |
Example request
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X POST \ https://your-env-name.aito.app/api/v1/data/products \ -H 'content-type: application/json' \ -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \ -d ' { "id": 1, "name": "Apple iPhone 8 64 Gt, spacegray", "price": 648.9, "description": "A11 processor and wireless charging." }'
Response
{ "description": "A11 processor and wireless charging.", "id": 1, "name": "Apple iPhone 8 64 Gt, spacegray", "price": 648.9 }
POST /api/v1/data/{table}/batch
Import multiple entries into the database.
The batch import can be used to upload multiple entries to a single table. The payload needs to be a valid JSON array (instead of ndjson).
The batch import can also be run as a job. The path for running batch as a job is /api/v1/jobs/data/<TABLE>/batch.
Note: batch API supports max 10MB payloads.
Name | Type | Description |
---|---|---|
tablerequired | string | The name of the table to add data to |
bodyrequired | array | An array of objects which are valid according to the provisioned schema |
Response | Type | Description |
---|---|---|
200 OK | object | Summary of the inserted entries |
Example request
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X POST \ https://your-env-name.aito.app/api/v1/data/products/batch \ -H 'content-type: application/json' \ -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \ -d ' [ { "id": 1, "name": "Apple iPhone 8 64 Gt, spacegray", "price": 648.9, "description": "A11 processor and wireless charging." }, { "id": 2, "name": "Apple iPhone X 32 GB, space gray", "price": 1048.9, "description": "All‑screen design. Longest battery life ever in an iPhone." }, { "id": 3, "name": "Samsung Galaxy S9", "price": 698.2, "description": "The Camera. Reimagined." } ]'
Response
{ "entries": 3, "status": "ok" }
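The same batch request can be prepared from Python with only the standard library. This is a minimal sketch, not the Aito SDK: the helper name `build_batch_request` is hypothetical, and the size check covers only the JSON body, so it is a rough guard rather than an exact reproduction of the 10MB rule.

```python
import json
import urllib.request

# Rough guard for the documented 10MB limit (body only; an assumption).
MAX_PAYLOAD_BYTES = 10 * 1024 * 1024


def build_batch_request(instance_url, api_key, table, entries):
    """Build (but do not send) a POST request for the batch endpoint."""
    body = json.dumps(entries).encode("utf-8")  # a JSON array, not ndjson
    if len(body) > MAX_PAYLOAD_BYTES:
        raise ValueError("payload exceeds 10MB; consider the file upload API")
    return urllib.request.Request(
        url=f"{instance_url}/api/v1/data/{table}/batch",
        data=body,
        headers={"content-type": "application/json", "x-api-key": api_key},
        method="POST",
    )


request = build_batch_request(
    "https://your-env-name.aito.app",
    "YOUR_READ_WRITE_API_KEY",
    "products",
    [{"id": 3, "name": "Samsung Galaxy S9", "price": 698.2}],
)
# Sending requires your own Aito environment:
# urllib.request.urlopen(request)
```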
POST /api/v1/data/_delete
Delete entries with a Search-like interface.
You can describe the target table and the filters that select which entries to delete. The delete operation must walk over each entry in the table and can thus be expensive. Delete can be run as a job, which prevents timeout errors. The path for running delete as a job is /api/v1/jobs/data/<TABLE_NAME>/_delete.
An empty proposition will match and delete everything!
Name | Type | Description |
---|---|---|
bodyrequired | object | To be clarified |
Response | Type | Description |
---|---|---|
200 OK | object | Delete results |
Example request
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X POST \ https://your-env-name.aito.app/api/v1/data/_delete \ -H 'content-type: application/json' \ -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \ -d ' { "from": "products", "where": { "id": 1 } }'
Response
{ "total": 1 }
POST /api/v1/data/{table}/file
Initiate a file upload session.
The file API allows circumventing the batch upload API payload size limit by allowing upload of large data sets. The file API accepts data in gzip compressed ndjson format, stored into a file.
The file must be gzip-compressed ndjson; normal JSON arrays are not accepted.
The data file is uploaded to AWS S3 and processed asynchronously. The file must be compressed with gzip before uploading to reduce the size of the transferred data.
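The file API is not a single API call, but requires a minimum of three calls (per table). The sequence is as follows:
1. Initiate the file upload session (POST /api/v1/data/{table}/file). The response contains the S3 upload url, the HTTP method to use, and the id of the job.
2. Upload the gzip-compressed ndjson file to the returned url using the returned method.
3. Start the processing of the uploaded file (POST /api/v1/data/{table}/file/{uuid}).
4. Optionally, poll the processing status (GET /api/v1/data/{table}/file/{uuid}) until it is finished.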
You can find the bash implementation of the flow at our tools repository. See the upload-file.sh script.
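Preparing the payload is the step that most often goes wrong, since the file must be gzip-compressed ndjson rather than a JSON array. A minimal sketch of that conversion using only the Python standard library is shown below; the helper name `to_gzipped_ndjson` is hypothetical and is not part of any Aito tool.

```python
import gzip
import json


def to_gzipped_ndjson(entries):
    """Serialize entries as newline-delimited JSON and gzip the result."""
    lines = [json.dumps(entry) for entry in entries]  # one JSON object per line
    ndjson = "\n".join(lines) + "\n"
    return gzip.compress(ndjson.encode("utf-8"))


entries = [
    {"id": 1, "name": "Apple iPhone 8 64 Gt, spacegray", "price": 648.9},
    {"id": 3, "name": "Samsung Galaxy S9", "price": 698.2},
]
payload = to_gzipped_ndjson(entries)

# Round-trip check: decompress and parse each line back into an object.
decoded = [
    json.loads(line)
    for line in gzip.decompress(payload).decode("utf-8").splitlines()
]
print(decoded == entries)
```

The resulting bytes are what would be uploaded to the url returned by the initiate call.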
Name | Type | Description |
---|---|---|
tablerequired | string | The name of the table to add data to |
Response | Type | Description |
---|---|---|
200 OK | object | The details to execute the S3 upload and the job's id |
Example request
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X POST \ https://your-env-name.aito.app/api/v1/data/products/file \ -H 'x-api-key: YOUR_READ_WRITE_API_KEY'
Response
{ "expires": "2024-11-10T16:09:35", "id": "2473f65d-2b45-4e9b-85b8-20f7446a626b", "method": "PUT", "url": "https://aitoai-customer-uploads.s3.eu-west-1.amazonaws.com/localhost/products/2473f65d-2b45-4e9b-85b8-20f7446a626b?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20241110T154936Z&X-Amz-SignedHeaders=host&X-Amz-Expires=1199&X-Amz-Credential=AKIA42C7USIXZFCKEUJT%2F20241110%2Feu-west-1%2Fs3%2Faws4_request&X-Amz-Signature=98223a48118327dac25d64174b46fc721beab0e46b7c927fe8ae74b3ff27e8e4" }
POST /api/v1/data/{table}/file/{uuid}
Start the processing of a previously uploaded file.
Note: This operation is part of the file upload sequence. If you want to read how to execute a full file upload flow, see Initiate file upload documentation.
Name | Type | Description |
---|---|---|
tablerequired | string | The name of the table to add data to |
uuidrequired | string | The assigned id of the operation |
Response | Type | Description |
---|---|---|
200 OK | object | Processing started status |
Example request
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X POST \ https://your-env-name.aito.app/api/v1/data/products/file/47052890-c0c9-4062-a9be-6fa4565a3a90 \ -H 'x-api-key: YOUR_READ_WRITE_API_KEY'
Response
{ "id": "47052890-c0c9-4062-a9be-6fa4565a3a90", "status": "started" }
GET /api/v1/data/{table}/file/{uuid}
Get the file upload progress.
The response is probabilistic and might not contain the very last result, since the status update is asynchronous, and the upload happens in multiple parallel streams. The response, however, will give an idea of approximate progress.
Note: This operation is part of the file upload sequence. If you want to read how to execute a full file upload flow, see Initiate file upload documentation.
Name | Type | Description |
---|---|---|
tablerequired | string | The name of the table to add data to |
uuidrequired | string | The assigned id of the operation |
Response | Type | Description |
---|---|---|
200 OK | object | The file processing status |
Request during processing
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X GET \ https://your-env-name.aito.app/api/v1/data/products/file/ef8ac649-325f-43db-937e-77e4caed6ae6 \ -H 'x-api-key: YOUR_READ_WRITE_API_KEY'
Response
The example shows what the response looks like while data processing is still in progress.
{ "errors": { "message": "Last 0 failing rows", "rows": null }, "status": { "phase": "AitoDatabaseInsert", "finished": false, "completedCount": 3, "lastSuccessfulElement": { "description": "The Camera. Reimagined.", "id": 3, "name": "Samsung Galaxy S9", "price": 698.2 }, "startedAt": "20241110T154937.567Z", "throughput": "2.71/s" } }
Request after processing
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X GET \ https://your-env-name.aito.app/api/v1/data/products/file/0c908fcb-3ea2-42f7-878f-4c55e0a7b7df \ -H 'x-api-key: YOUR_READ_WRITE_API_KEY'
Response
The example shows what the response looks like after data processing has finished successfully.
{ "errors": { "message": "Last 0 failing rows", "rows": null }, "status": { "totalDurationMs": 1028, "phase": "Finished", "finished": true, "completedCount": 3, "lastSuccessfulElement": { "description": "The Camera. Reimagined.", "id": 3, "name": "Samsung Galaxy S9", "price": 698.2 }, "totalDuration": "1 second and 28 milliseconds", "startedAt": "20241110T154938.887Z", "finishedAt": "20241110T154939.915Z", "throughput": "2.92/s" } }
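When polling this endpoint, a client usually needs only a few fields of the response. The sketch below reduces a status response to those fields; the helper name is hypothetical, and treating a non-null "rows" value as a list is an assumption, since the examples above only show null.

```python
def summarize_upload_status(response):
    """Reduce a file-status response to the fields a polling loop needs."""
    status = response["status"]
    # "rows" is null when nothing failed; a list is assumed otherwise.
    failing_rows = response["errors"]["rows"] or []
    return {
        "finished": status["finished"],
        "phase": status["phase"],
        "completed": status["completedCount"],
        "failed": len(failing_rows),
    }


# Trimmed copy of the in-progress example response shown above.
in_progress = {
    "errors": {"message": "Last 0 failing rows", "rows": None},
    "status": {
        "phase": "AitoDatabaseInsert",
        "finished": False,
        "completedCount": 3,
    },
}
print(summarize_upload_status(in_progress))
```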
POST /api/v1/data/{table}/optimize
Optimize the database for query performance.
Note: The recommended way to run optimize is as a job, because the optimize operation easily times out for any non-trivial database. The path for running optimize as a job is /api/v1/jobs/data/<TABLE_NAME>/optimize.
The Aito.ai database is implemented as a log-structured merge-tree. Because of this architecture, Aito's tables are implemented internally as a tree of table segments.
The complexity of the table tree has major implications for both query speed and write speed. The fewer segments Aito maintains in the tree, the faster the queries are, but the slower the writes are, because Aito needs to rewrite parts of the tree regularly. Similarly, the more segments are allowed, the slower the queries are, but the faster the writes become.
Aito seeks to maintain approximately O(log N) segments in the table tree in order to keep a reasonable compromise between query and write speeds.
Still, there can be situations where it is beneficial to rewrite the entire database as a single segment to get the optimal query speed. The optimize operation does this.
It may take minutes or hours to optimize a big table. This means that optimize should be used to improve query performance only in situations where the database and the results need to be updated rarely, for example nightly.
Optimize maintains a write lock on the database for the entire operation. This means that you cannot add data while the optimize operation is running. Queries will still work normally. After the optimize is finished, the optimized table needs to be reloaded, which can induce significant latency for the following query.
Name | Type | Description |
---|---|---|
bodyrequired | object | An empty object |
Response | Type | Description |
---|---|---|
200 OK | object | An empty object |
Example request
This request sample is not directly copy-pasteable. Your own Aito environment is required.
curl -X POST \ https://your-env-name.aito.app/api/v1/data/products/optimize \ -H 'content-type: application/json' \ -H 'x-api-key: YOUR_READ_WRITE_API_KEY' \ -d ' {}'
Response
{}
The Aito database requires a schema to operate. The schema defines:
Please refer to the Defining a database schema guide for more details.
Any schema which is a valid Aito table schema.
A table schema describes the structure of the table in a formal language. The schema describes all fields (or columns), the data types of the fields, and information that helps Aito preprocess your data, for example which language a text field contains.
The contents of the schema depend on the data that will be inserted into the database.
Example
{ "type": "table", "columns": { "id": { "type": "Int", "nullable": false }, "name": { "type": "String", "nullable": false }, "price": { "type": "Decimal", "nullable": false }, "description": { "type": "Text", "nullable": false, "analyzer": "English" } } }
Type of the column.
Describes an individual field (or column), its type, and information that helps Aito preprocess your data, for example which language a text field contains.
Examples
{ "type": "int", "nullable": false }
{ "type": "string", "nullable": false }
{ "type": "decimal", "nullable": false }
{ "type": "text", "nullable": false, "analyzer": "english" }
{ "type": "json", "nullable": true }
Boolean column type.
When a column is a boolean, the only accepted values are true and false.
Example
{ "type": "boolean" }
Double-precision floating-point number.
Example
{ "type": "Decimal", "nullable": false }
Integer column type.
Examples
{ "type": "Int" }
{ "type": "Int", "link": "users.id" }
String column type.
The string data type is a primitive version of the Text type.
The value is turned into a single feature. For example, "lazy black cat" becomes 1 feature: "lazy black cat".
Examples
{ "type": "String", "nullable": false }
{ "type": "String", "link": "messages.id" }
Text column type.
The text data type enables smart textual analysis of strings. A text column has an analyzer which defines how the text can be split into words or tokens, which are used as features during inference.
Example
{ "type": "Text", "analyzer": "English", "nullable": false }
Json column type.
The json data type can be any atomic JSON value, object, array or nested structure.
The json value is turned into a single feature. For example, {"list":[1, 2], "value":true} becomes 1 feature: {"list":[1, 2], "value":true}.
Example
{ "type": "Json", "nullable": true }
Aito analyzers break the Text type data into features that can be used for inference.
Let's take a look at an example of predicting the category of a product using its description using the following data:
description | tags |
---|---|
Brazilian organic orange | organic, fruit, imported |
Local organic spinach | organic, vegetable, local |
Lentil snack | snack |
Examples
"standard"
"whitespace"
"english"
"en"
{ "type": "delimiter", "delimiter": "," }
{ "type": "language", "language": "en" }
{ "type": "char-ngram", "minGram": 2, "maxGram": 3 }
{ "type": "token-ngram", "source": "english", "minGram": 1, "maxGram": 2 }
Aito has several built-in analyzers and they are selected by using their name in the "analyzer" field of a text column. For instance:
{ "analyzer": "english" }
The built-in analyzers include:
Examples
"standard"
"whitespace"
"english"
"en"
The Character N-gram Analyzer breaks text into n-gram features.
For example, the following n-gram analyzer:
{ "type": "char-ngram", "minGram": 3, "maxGram": 3 }
would break the text "the cats are running" into the following list of features:
["the", "he ", "e c", " ca", "cat", "ats", "ts ", "s a", " ar", "are", "re ", "e r", " ru", "run", "unn", "nni", "nin", "ing"]
The analyzer can be useful for languages that don’t use spaces or that have long compound words, like German.
Example
{ "type": "char-ngram", "minGram": 2, "maxGram": 3 }
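The character n-gram splitting described above can be sketched in a few lines of Python. This is an illustration of the technique, not Aito's implementation; the function name is hypothetical, and the ordering of features when minGram and maxGram differ is an assumption.

```python
def char_ngrams(text, min_gram, max_gram):
    """Slide over the text and emit every substring of the allowed sizes."""
    features = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(text):
                features.append(text[start:start + size])
    return features


# Same input and settings as the minGram=3, maxGram=3 example above.
print(char_ngrams("the cats are running", 3, 3))
```

With minGram and maxGram both set to 3, this yields the 18 trigrams listed in the example.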
The Delimiter Analyzer breaks text into features whenever it encounters a specified delimiter character.
With the trimWhitespace option, the analyzer trims the whitespace surrounding a feature.
For example, the following analyzer:
{ "type": "delimiter", "delimiter": ",", "trimWhitespace": true }
would break the text "the, cats,are, running" into 4 features:
["the", "cats", "are", "running"]
Examples
{ "type": "delimiter", "delimiter": "," }
{ "type": "delimiter", "delimiter": "\n", "trimWhitespace": true }
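The behavior of the delimiter analyzer can be approximated with a plain string split. The sketch below is illustrative only; the function name is hypothetical, and how Aito treats empty features (for example between two consecutive delimiters) is not covered here.

```python
def delimiter_features(text, delimiter, trim_whitespace=False):
    """Split text on the delimiter, optionally trimming each feature."""
    parts = text.split(delimiter)
    if trim_whitespace:
        parts = [part.strip() for part in parts]
    return parts


# Same input as the trimWhitespace example above.
print(delimiter_features("the, cats,are, running", ",", trim_whitespace=True))
```

Without trimming, the second and fourth features would keep their leading space.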
Language Analyzers aim to analyze text of a specific language.
When using a language analyzer, text is analyzed into lower-case word stem features. For example, using the following english analyzer:
{ "type": "language", "language": "english" }
a text "the cats are running" will be broken into 4 word stem features:
["the", "cat", "ar", "run"]
The value of the "language" parameter specifies which language will be used. The value can be the name or the ISO 639-1 code of the language. The full list is shown below:
Language | Name | ISO code |
---|---|---|
Arabic | arabic | ar |
Armenian | armenian | hy |
Basque | basque | eu |
Brazilian Portuguese | brazilian | pt-br |
Bulgarian | bulgarian | bg |
Catalan | catalan | ca |
Chinese, Japanese, Korean | cjk | cjk |
Czech | czech | cs |
Danish | danish | da |
Dutch | dutch | nl |
English | english | en |
Finnish | finnish | fi |
French | french | fr |
Galician | galician | gl |
German | german | de |
Greek | greek | el |
Hindi | hindi | hi |
Hungarian | hungarian | hu |
Indonesian | indonesian | id |
Irish | irish | ga |
Italian | italian | it |
Latvian | latvian | lv |
Norwegian | norwegian | no |
Persian | persian | fa |
Portuguese | portuguese | pt |
Romanian | romanian | ro |
Russian | russian | ru |
Spanish | spanish | es |
Swedish | swedish | sv |
Thai | thai | th |
Turkish | turkish | tr |
The language analyzers support filtering the stop words (common words that are normally not useful). Each language has a list of default stop words for filtering that can be enabled through the "useDefaultStopWords" parameter. Some common English stop words are:
"a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the", "their", "then", "there", "these", "they", "this", "to", "was", "will", "with"
By default, "useDefaultStopWords" is set to false. The following analyzer:
{ "type": "language", "language": "english", "useDefaultStopWords": true }
would break the text "the cats are running" into 2 features:
["cat", "run"]
It is also possible to specify a set of words that would be filtered through the "customStopWords" parameter and a set of words that would not be analyzed through the "customKeyWords" parameter. The following analyzer:
{ "type": "language", "language": "english", "useDefaultStopWords": false, "customStopWords": ["cats"], "customKeyWords": ["running"] }
would break the text "the cats are running" into 3 features:
["the", "ar", "running"]
Examples
{ "type": "language", "language": "en" }
{ "type": "language", "language": "english", "useDefaultStopWords": true, "customStopWords": ["flower"], "customKeyWords": ["animal"] }
The Token N-gram Analyzer breaks text into token n-grams (shingles) based on a source analyzer. In other words, it combines the features of the source analyzer into new features.
For example, the following Token N-gram Analyzer:
{ "type": "token-ngram", "source": "english", "minGram": 1, "maxGram": 2, "tokenSeparator": "_" }
would break the text "the cats are running" into the following list of features:
["the", "the_cat", "cat", "cat_ar", "ar", "ar_run", "run"]
Examples
{ "type": "token-ngram", "source": "english", "minGram": 1, "maxGram": 2 }
{ "type": "token-ngram", "source": { "type": "delimiter", "delimiter": "," }, "minGram": 1, "maxGram": 3, "tokenSeparator": "_" }
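Token n-grams (shingles) can be sketched as a sliding window over the tokens produced by the source analyzer. The code below is an illustration rather than Aito's implementation: the function name is hypothetical, and the input tokens are the stems taken from the language analyzer example, since no stemmer is implemented here.

```python
def token_ngrams(tokens, min_gram, max_gram, separator):
    """Join every run of min_gram..max_gram consecutive tokens."""
    features = []
    for start in range(len(tokens)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(tokens):
                features.append(separator.join(tokens[start:start + size]))
    return features


# Stems of "the cats are running", as produced by the english analyzer above.
stems = ["the", "cat", "ar", "run"]
print(token_ngrams(stems, 1, 2, "_"))
```

For minGram 1 and maxGram 2 this reproduces the seven features listed in the example.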
The reference documentation for Aito query language.
To analyze the data better, Aito splits fields into features under the hood. How the featurization is done depends on the field type. For example, the Text type supports an "analyzer" option, which allows you to control how a text field is split into features.
Some queries, for example Relate, return the features instead of the actual values of the field.
Exclusiveness is an option in predictions. In summary, it describes whether the predicted field can have multiple values at the same time or not.
Understanding the concept is easiest through an example. If we were predicting tags for a product, we would want to set "exclusiveness": false, because a product can have multiple tags. A product could be described with the following tags:
However, if we were predicting the user who would most likely purchase a product, we would want to use "exclusiveness": true (the default behavior), because the value can only be one user at a time.
If we were trying to find a customer who is best characterized by a message, we'd need to understand the difference between $p and $lift. To make the difference clear, consider the following situation:
Querying users by $p quite likely finds Alice, because she may overall be the more likely person to mention "iPhone". Querying users by $lift, on the other hand, will almost certainly find Bob, because $lift describes how characteristic the feature "iPhone" is for the user.
A more mathematical and technical description for the phenomenon is the following:
Aito uses Bayesian probability inference to estimate P(X | C), the probability of X in the context C, so that P(X | C) = P(X) * lift(X | C), where the probability lift component is lift(X | C) = P(X | C) / P(X).
The probability lift component describes how much more likely X is to be true in the specified context compared to the average.
In Aito query syntax, $p stands for P(X | C), while $lift stands for the lift(X | C) component.
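A small worked example can make the difference concrete. The numbers below are entirely hypothetical, and the calculation uses plain frequency counts rather than Aito's Bayesian estimates. Lift is read here as the probability in the context divided by the average probability, which is an interpretation of the description above.

```python
# Hypothetical data: user -> (messages written, messages mentioning "iPhone")
messages = {"alice": (1000, 50), "bob": (10, 8), "carol": (990, 2)}

total_messages = sum(total for total, _ in messages.values())
total_mentions = sum(mentions for _, mentions in messages.values())


def p_and_lift(user):
    """Return (probability of user given "iPhone", lift over the average)."""
    total, mentions = messages[user]
    p = mentions / total_mentions    # share of "iPhone" messages by this user
    prior = total / total_messages   # share of all messages by this user
    return p, p / prior


best_by_p = max(messages, key=lambda user: p_and_lift(user)[0])
best_by_lift = max(messages, key=lambda user: p_and_lift(user)[1])
print(best_by_p, best_by_lift)
```

Alice writes most of the "iPhone" messages simply because she writes most messages, so she wins on probability. Bob mentions "iPhone" in most of his few messages, so he wins on lift.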
Useful for creating conditional queries with text fields.
Operator to check if a textual field fuzzy matches a given string. Case insensitive. The matched text is split into tokens with the analyzer specified for the field in the schema. For example, { "$match": "great programmers" } will match the strings "Bob is the greatest programmer!" and "Programmers are having great fun" if the field is properly analyzed with the English analyzer.
Examples
{ "$match": "coffee" }
{ "from": "products", "where": { "name": { "$match": "coffee" } } }
Operator to check if a textual field starts with a given string. Case sensitive.
Examples
{ "$startsWith": "Cucumber" }
{ "from": "products", "where": { "name": { "$startsWith": "Cucumber" } } }
Useful for creating conditional queries.
Operator to check if a field is greater than a given value.
Examples
{ "$gt": 8 }
{ "$gt": 231.1 }
{ "$gt": "20150308" }
{ "from": "products", "where": { "price": { "$gt": 2.14 } } }
Operator to check if a field is greater than or equal to a given value.
Examples
{ "$gte": -2 }
{ "$gte": 0 }
{ "$gte": "20180502" }
{ "from": "products", "where": { "price": { "$gte": 2 } } }
Operator to check if a field is less than a given value.
Examples
{ "$lt": 4 }
{ "$lt": -12.1 }
{ "$lt": "20180502" }
{ "from": "products", "where": { "price": { "$lt": 1.24 } } }
Operator to check if a field is less than or equal to a given value.
Examples
{ "$lte": 8 }
{ "$lte": 0 }
{ "$lte": "20180502" }
{ "from": "products", "where": { "price": { "$lte": 1 } } }
The $has operation checks whether the field has the specified feature.
$has is a low-level operation that operates at the feature level. The features can differ significantly from the original data, specifically in the case of text, when analyzers are used.
For example, if you have a field called content with the text "programmers and horses", the field would have the features 'programmer' and 'hors', which are the stems produced by the English analyzer.
Examples
{ "$has": "drink" }
{ "from": "products", "where": { "tags": { "$has": "drink" } } }
Operator to select rows based on whether a nullable field has been defined or not.
Example
{ "$defined": true }
An operator to get features of given field(s).
Examples
{ "$exists": ["query", "product.tags"] }
{ "from": "impressions", "where": { "$on": [ { "$exists": ["query", "customer.tags"] }, { "click": true } ] }, "relate": ["product.title", "product.tags"] }
Useful for combining multiple conditions in conditional queries.
Performs a logical and operation on the given array containing two or more Propositions.
The following query will always find products whose description is "super slim laptop" and whose price is greater than 200:
{ "from": "products", "where": { "$and": [ { "description": "super slim laptop" }, { "price": { "$gt" : 200 } } ] } }
In an inference query such as the following, however:
{ "from": "products", "where": { "$and": [ { "description": "super slim laptop" }, { "price": { "$gt" : 200 } } ] }, "price": "tag" }
Aito might look for products with a price greater than 200 that do not match the description "super slim laptop", or products that match the description but do not meet the price condition. This is because there might not be enough data (e.g. not enough products in the price range) to make a sophisticated prediction.
To guarantee that all propositions are met in an inference query, refer to $atomic.
Examples
{ "$and": [ { "$gt": 10 }, { "$lt": 20 } ] }
{ "from": "products", "where": { "price": { "$and": [ { "$gt": 1.5 }, { "$lt": 2.1 } ] } } }
Performs a logical or operation on the given array containing two or more Propositions.
Examples
{ "$or": [ { "tags": "cover" }, { "tags": "laptop" } ] }
{ "from": "products", "where": { "price": { "$or": [ { "$lt": 0.9 }, { "$gt": 2.1 } ] } } }
Performs a logical not operation on the given Proposition.
Examples
{ "$not": { "tags": "laptop" } }
{ "$not": { "$lt": 0 } }
{ "from": "products", "where": { "price": { "$not": { "$lt": 1.1 } } } }
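When used as plain filters on a single field, the logical operators compose recursively with the comparison operators. The sketch below evaluates such a proposition against one value. It is a hypothetical helper that only models strict boolean filtering, not the statistical treatment Aito applies in inference queries.

```python
def matches(value, proposition):
    """Evaluate a single-key proposition such as {"$gt": 1.5} against a value."""
    operator, operand = next(iter(proposition.items()))
    if operator == "$and":
        return all(matches(value, part) for part in operand)
    if operator == "$or":
        return any(matches(value, part) for part in operand)
    if operator == "$not":
        return not matches(value, operand)
    if operator == "$gt":
        return value > operand
    if operator == "$gte":
        return value >= operand
    if operator == "$lt":
        return value < operand
    if operator == "$lte":
        return value <= operand
    raise ValueError(f"unsupported operator: {operator}")


# Price propositions taken from the examples above.
in_range = {"$and": [{"$gt": 1.5}, {"$lt": 2.1}]}
out_of_range = {"$or": [{"$lt": 0.9}, {"$gt": 2.1}]}
not_cheap = {"$not": {"$lt": 1.1}}
print(matches(2.0, in_range), matches(2.0, out_of_range), matches(2.0, not_cheap))
```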
Can be used in the "orderBy" clause to declare the sorting order of the result.
Sort returned hits in ascending order (A-Z) based on the given attribute or custom scoring function.
Examples
{ "$asc": "price" }
{ "$asc": "product.price" }
{ "$asc": { "$multiply": ["product.price", "$p"] } }
Sort returned hits in ascending order (A-Z) based on the given attribute (or column).
Example
{ "$asc": "lift" }
Sort returned hits in descending order (Z-A) based on the given attribute or custom scoring function.
Examples
{ "$desc": "price" }
{ "$desc": "product.price" }
{ "$desc": { "$multiply": ["product.price", "$p"] } }
Sort returned hits in descending (Z-A) order based on the given attribute (or column).
Example
{ "$desc": "info.miTrue" }
Can be used in conditional queries or for scoring in "orderBy" clauses.
Operator to check if the value of a field divided by a divisor has the specified remainder.
In other words perform a modulo operation. This operator supports object or array form. Note that the field will be converted to an integer (effectively a math floor) before the modulo operation.
Examples
{ "$mod": [2, 0] }
{ "$mod": { "divisor": 2, "remainder": 0 } }
{ "from": "products", "where": { "price": { "$mod": { "divisor": 2, "remainder": 0 } } } }
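The floor-then-modulo behavior described above can be sketched as follows. The helper names are hypothetical. Reading the array form as [divisor, remainder] is inferred from the two equivalent-looking examples, and the handling of negative values follows Python's semantics, which may differ from Aito's.

```python
import math


def parse_mod(spec):
    """Accept the array form [divisor, remainder] or the object form."""
    if isinstance(spec, dict):
        return spec["divisor"], spec["remainder"]
    divisor, remainder = spec
    return divisor, remainder


def matches_mod(value, spec):
    """Floor the value to an integer first, then compare the remainder."""
    divisor, remainder = parse_mod(spec)
    return math.floor(value) % divisor == remainder


# 4.9 is floored to 4, which is even; 5.2 is floored to 5, which is odd.
print(matches_mod(4.9, [2, 0]), matches_mod(5.2, {"divisor": 2, "remainder": 0}))
```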
Multiplication operation.
Example
{ "$multiply": ["price", 2] }
Division operation.
Example
{ "$divide": ["cost", 4] }
Exponentiation operation. First item raised to the power of the second.
Example
{ "$pow": ["width", 2] }
Sum operation.
Example
{ "$sum": ["priceNet", "priceVat"] }
Subtraction operation.
Example
{ "$subtract": ["price", 2] }
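The arithmetic operators can be nested to build a score from the fields of a row. The sketch below evaluates such an expression in Python. The evaluator is hypothetical: it assumes that a string operand names a field of the row and that a number is a literal, which matches the examples above but is not a statement about how Aito resolves operands internally.

```python
import math


def evaluate(expression, row):
    """Evaluate a nested arithmetic expression against a row of fields."""
    if isinstance(expression, (int, float)):
        return expression        # literal number
    if isinstance(expression, str):
        return row[expression]   # field lookup (assumption)
    operator, operands = next(iter(expression.items()))
    values = [evaluate(operand, row) for operand in operands]
    if operator == "$multiply":
        return math.prod(values)
    if operator == "$sum":
        return sum(values)
    if operator == "$subtract":
        return values[0] - values[1]
    if operator == "$divide":
        return values[0] / values[1]
    if operator == "$pow":
        return values[0] ** values[1]
    raise ValueError(f"unsupported operator: {operator}")


row = {"price": 3.5, "priceNet": 10, "priceVat": 2, "width": 3, "cost": 8}
print(evaluate({"$multiply": ["price", 2]}, row))
print(evaluate({"$sum": ["priceNet", "priceVat"]}, row))
```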
More advanced operators which can improve query results in certain situations.
Transforms a statement into a 'black box' proposition.
This prevents Aito from analyzing the proposition and using its parts separately in the statistical reasoning.
In practice, the difference between normal 'white box' expressions and $atomic's black box expressions is that the atomic expressions have a smaller bias, but a higher measurement error.
Consider the following example:
{ "tags": "pen", "price": { "$gte": 200 } }
During the statistical reasoning, Aito may recognize that pens are often sold and that purchases of products costing over 200€ are somewhat common. As a result, Aito might assume that a pen costing over 200€ is a popular product.
Now, consider the expression:
{ "$atomic": { "tags": "pen", "price": { "$gte" : 200 } } }
The results of this expression will depend on the amount of data. If there are no pens costing over 200€ in the data, Aito will make no assumptions about the proposition's effect. On the other hand, if you have the data, Aito will correctly recognize that pens costing over 200€ are bought extremely rarely.
Examples
{ "$atomic": { "tags": "pen", "price": { "$gte": 200 } } }
{ "from": "products", "where": { "$atomic": { "tags": "pen", "price": { "$gte": 200 } } } }
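The pen example can be illustrated with a toy calculation. The counts below are hypothetical, and the way the 'white box' estimate combines the two parts (multiplying their lifts as if they were independent) is a simplification chosen for illustration, not Aito's actual estimator.

```python
# Hypothetical counts: (is_pen, costs_200_or_more) -> (times shown, times purchased)
counts = {
    (True, False): (100, 40),
    (False, True): (100, 30),
    (True, True): (10, 0),
    (False, False): (190, 10),
}


def purchase_rate(condition):
    """Purchase rate over all groups whose key satisfies the condition."""
    shown = sum(s for key, (s, _) in counts.items() if condition(key))
    bought = sum(b for key, (_, b) in counts.items() if condition(key))
    return bought / shown


base = purchase_rate(lambda key: True)         # 80 / 400
pen = purchase_rate(lambda key: key[0])        # 40 / 110
expensive = purchase_rate(lambda key: key[1])  # 30 / 110

# "White box": combine the two parts as if they were independent.
white_box = base * (pen / base) * (expensive / base)
# "Black box": only count rows where both conditions hold at once.
black_box = purchase_rate(lambda key: key[0] and key[1])  # 0 / 10

print(round(white_box, 3), black_box)
```

The combined estimate is well above the base rate, while the joint estimate correctly shows that expensive pens are not bought in this data. The joint estimate rests on only 10 rows, which reflects the higher measurement error mentioned above.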
Provides the ability to access the fields of the table specified in "from", instead of the fields of the table in "get".
Examples
{ "$context": { "click": true } }
{ "from": "impressions", "where": { "customerEmail": "john.doe@aito.ai", "query": "laptop" }, "get": "product", "orderBy": { "$p": { "$context": { "click": true } } } }
Provides ability to access the fields of the hit.
Examples
{ "$hit": "price" }
{ "$hit": "$similarity" }
{ "from": "impressions", "where": { "product.title": { "$match": "iphone" } }, "get": "product", "orderBy": { "$multiply": [ { "$hit": "$similarity" }, { "$hit": "price" } ] } }
The $on operator is used to define conditional propositions or hard filters. This is useful when you have a limited amount of data and the condition would help to limit the context and provide better results. This can be done by providing a list consisting of two items: the first object (or "prop") is the hypothesis and the second object (or "on") is the conditional.
In Aito, the where clause contains propositions which aren't hard filters. Instead, Aito will turn all the propositions into features (the user's ID, every word in a text field, etc.). There are many of these and they are not statistically independent. Aito picks a subset of these features that are the best predictors of the field that is to be predicted. So what goes into the "where" is a description of the situation you're in, and Aito tells you what you should expect to find if you look in a field. But the description is not taken at face value: Aito will ignore parts of it if they don't help the prediction.
However, there is another way to achieve this: the "$on" proposition. It is modeled after conditional probability. It is divided into two parts, the normal "where" part and the conditional part ("hard filters"). The "$on" parameters explained:
{ "from": "...", "where": { "$on": [ { "message": "hello, world", "something": true, // other things you put in your "where" clause }, { // The subset of data that exactly matches these conditions "userId": 42, "day": "monday" } ] }, "predict": "..." }
The $on operator can also be combined with a normal query. If the $on condition is too strong, you could move parts of the filtering back to the where clause:
{ "from": "...", "where": { "$on": [ { "message": "hello, world", "something": true, // other things you put in your "where" clause }, { // The subset of data that exactly matches these conditions "day": "monday" } ], "user_id": 42 }, "predict": "..." }
Examples
{ "$on": { "prop": { "click": true }, "on": { "user.tags": "nyc" } } }
{ "$on": [ { "click": true }, { "user.tags": "nyc" } ] }
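Since $on is modeled after conditional probability, its effect can be illustrated with a simple frequency count over a hard-filtered subset. The rows and the helper below are hypothetical, and the calculation is a plain relative frequency, not Aito's Bayesian estimate.

```python
def conditional_probability(rows, prop, on):
    """Share of rows matching `prop` among the rows that exactly match `on`."""
    def holds(row, conditions):
        return all(row.get(field) == value for field, value in conditions.items())

    subset = [row for row in rows if holds(row, on)]  # the hard filter
    if not subset:
        return None  # nothing matches the condition
    return sum(holds(row, prop) for row in subset) / len(subset)


# Hypothetical impressions, using the field names from the examples above.
rows = [
    {"user.tags": "nyc", "click": True},
    {"user.tags": "nyc", "click": True},
    {"user.tags": "nyc", "click": False},
    {"user.tags": "nyc", "click": True},
    {"user.tags": "sf", "click": False},
    {"user.tags": "sf", "click": False},
]
print(conditional_probability(rows, {"click": True}, {"user.tags": "nyc"}))
```

Three of the four "nyc" rows are clicks, so the result is 0.75. The "sf" rows are ignored entirely, which is what makes the condition a hard filter.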
The $knn operator is an adaptation of the classic k-nearest neighbor algorithm.
Aito's $knn operator identifies the k rows most similar to the conditions defined in the 'near' parameter. The similarity metric is the same metric used in the similarity query. The k nearest rows can be used in inference.
The $knn operator can be useful in situations where there is no training data. For example:
{ "from": "impressions", "where": { "product.name": "Columbian Coffee", "product.tags": "high quality coffee" }, "predict": "purchase" }
The query would not yield sensible results since no such product exists in the current data. This can be improved by using the $knn
operator:
{ "from": "impressions", "where": { "$knn": { "k": 5, "near": { "product.name": "Columbian Coffee", "product.tags": "high quality coffee" } } }, "predict": "purchase" }
In the query above, Aito would look for the 5 entries that are most similar to the given criteria in "near"
and use them for inference.
Examples
{ "$knn": [ 4, { "tags": "laptop" } ] }
{ "$knn": { "k": 4, "near": { "tags": "laptop" } } }
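As a rough sketch of how such a query might be prepared from Python using only the standard library: the snippet below builds the "$knn" predict query from this section and wraps it in an HTTP request carrying the x-api-key header. The endpoint path `/api/v1/_predict`, the fallback URL, and the helper name `build_knn_predict_request` are assumptions made for illustration. The request is built but not sent.

```python
import json
import os
import urllib.request


def build_knn_predict_request(table, k, near, predict):
    """Build (but do not send) a predict request that uses "$knn" in "where"."""
    query = {
        "from": table,
        "where": {"$knn": {"k": k, "near": near}},
        "predict": predict,
    }
    # Assumption: the instance URL and API key come from the environment and
    # the predict endpoint lives at /api/v1/_predict. Adjust to your setup.
    base_url = os.environ.get("AITO_INSTANCE_URL", "https://example.invalid")
    api_key = os.environ.get("AITO_API_KEY", "your-api-key")
    request = urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/_predict",
        data=json.dumps(query).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    return query, request


query, request = build_knn_predict_request(
    table="impressions",
    k=5,
    near={"product.name": "Columbian Coffee", "product.tags": "high quality coffee"},
    predict="purchase",
)
print(json.dumps(query, indent=2))
# To actually send the request: urllib.request.urlopen(request)
```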
The $nn
operator is similar to the classic k-nearest neighbor algorithm, except that it
matches a dynamic number of entries that are roughly the same as the specified proposition.
Aito's $nn
operator identifies all rows that are roughly the same as the conditions defined in the 'near' parameter.
This group of rows can be used in inference. This rough sameness is based on the same score used
in $sameness, and you can inspect the score of the matching values with the following
query:
{ "from": "rfps", "where": { "$nn": [{ "question": "Does your company comply to ISO 27001?" }] }, "orderBy": "$sameness" }
$nn
also accepts a threshold parameter that can be used to make matching stricter or looser, as here:
{ "from": "rfps", "where": { "$nn": [{ "question": "Does your company comply to ISO 27001?" }, 0.5] }, "orderBy": "$sameness" }
The default threshold is 1.0.
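The array form used in the two queries above can be generated with a small helper. This is only a sketch: the function names `nn_proposition` and `sameness_query` are hypothetical, and the optional second array element is assumed to be the threshold, as in the example above.

```python
import json


def nn_proposition(near, threshold=None):
    """Array form of "$nn": [near] or [near, threshold]."""
    args = [near]
    if threshold is not None:
        args.append(threshold)
    return {"$nn": args}


def sameness_query(table, near, threshold=None):
    """Query that lists the rows matched by "$nn", ordered by "$sameness"."""
    return {
        "from": table,
        "where": nn_proposition(near, threshold),
        "orderBy": "$sameness",
    }


question = {"question": "Does your company comply to ISO 27001?"}
default_query = sameness_query("rfps", question)       # threshold left at its default
custom_query = sameness_query("rfps", question, 0.5)   # explicit threshold of 0.5
print(json.dumps(custom_query))
```

Inspecting the "$sameness" scores with a few thresholds before using "$nn" in a predict query is one way to choose a value that suits the data.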
One example of using $nn
in inference is a question answering setting, where there is a desire
to avoid the false positives present in classic $knn
or the default Bayesian inference.
An example of using $nn
for answering an RFP question is the following:
{ "from": "rfps", "where": { "question": { "$nn": ["Does your company comply to ISO 27001?"] } }, "predict": "answer" }
This specific question will match similar questions in the database. Still, because this question may also match questions like 'Does your company comply with ISO 9001?', it makes sense to also use the question's own features as conditional features in inference, like this:
{ "from": "rfps", "where": { "question": { "$on": [ "Does your company comply to ISO 27001?", {"$nn": ["Does your company comply to ISO 27001?"]} ], "$nn": ["Does your company comply to ISO 27001?"] } }, "predict": "answer" }
In this example, $nn identifies a group of similar questions and uses these in the inference. At the same time, the $on structure allows the inference to see e.g. ISO 27001 as a separate feature inside this group. In this way, the system can focus on similar questions, while using individual features like ISO 27001 to infer the right answer.
Examples
{ "$nn": [ { "tags": "laptop" } ] }
{ "$nn": { "near": { "tags": "laptop" } } }
Operator to check if a numeric field fuzzy matches a given number.
By default, numbers are compared exactly against one another. The $numeric proposition signifies that comparisons should be inexact and that the target is somewhere close to the specified number. The size of the region depends on the spread and density of the data.
Examples
{ "$numeric": 42 }
{ "$numeric": 3.14 }
$hash converts the field value into a hash integer.
The hash code can be used to split non-integer data pseudo-randomly in the evaluate query.
Example
{ "$hash": { "$mod": [2, 1] } }
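To illustrate the idea of splitting data pseudo-randomly by hash, here is a small Python sketch. It does not reproduce Aito's hash function: `zlib.crc32` is used as a stand-in, and the example above is assumed to read as "hash modulo 2 equals 1". The helper names are hypothetical.

```python
import zlib


def stand_in_hash(value):
    """Deterministic integer hash of a value (a stand-in, NOT Aito's hash)."""
    return zlib.crc32(str(value).encode("utf-8"))


def split_by_hash(values, divisor=2, remainder=1):
    """Partition values by whether hash(value) % divisor == remainder."""
    selected, rest = [], []
    for value in values:
        if stand_in_hash(value) % divisor == remainder:
            selected.append(value)
        else:
            rest.append(value)
    return selected, rest


users = ["alice", "bob", "larry", "veronica"]
test_set, train_set = split_by_hash(users)
print("test:", test_set, "train:", train_set)
```

Because the hash is deterministic, the same value always lands in the same partition, which is what makes this kind of split repeatable across evaluation runs.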
$toString
operator is used to convert a numeric value to a string.
This is useful when you want to use a numeric value as input for an operator or a field that requires text input. For example:
{ "description": { "$match": { "$toString": { "$get": "id" } } } }
Example
{ "$toString": 4 }
Can be used in "orderBy"
clause to sort or create an advanced scoring algorithm.
"$lift"
can be used in the "orderBy"
clause of the Generic query to get the most likely values based on lifts of features with regard to other features.
In the grocery dataset, running the following query would yield products with a name similar to "lactose" that have the highest lift of being purchased:
{ "from": "impressions", "where": { "product.name": {"$match": "lactose"} }, "get": "purchase", "orderBy": "$lift" }
Running the following query would yield the most likely product based on all the fields of the linked product
table:
{ "from": "impressions", "where": { "context.user": "bob", "purchase": true }, "get": "product", "orderBy": "$lift" }
Since the product
field in the impressions
table is linked to the products
table, Aito would find all the statistical relations between what is declared inside the "where"
clause and the features of all the fields of a product, that is, id
, name
, category
, price
, tag
. In this case, the lift score is the product of the lifts of each field's features. We can investigate this by opening up the explanation, adding the $why operator to the "select"
clause (e.g. "select": ["$score", "$why"]
):
"$why": { "type": "product", "factors": [ { "type": "hitPropositionLift", "proposition": { "id" : 6410405093677 }, "value": 1.9827806375460209, "factors": [ { "type": "relatedPropositionLift", "proposition": { "purchase" : true }, "value": 1.9827806375460209 } ] }, { "type": "hitPropositionLift", "proposition": { "$not": { "name" : {"$has": "puikula" } } }, "value": 1.0472308585357502, "factors": [ { "type": "relatedPropositionLift", "proposition" : { "purchase" : true }, "value": 1.0472308585357502 } ] } ... ] }
We can see that the lift score is composed of the lift of an id
feature, a name
feature and others.
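The way the factors combine can be checked by multiplying them out. The Python sketch below uses only the two factors that are visible in the explanation above (the original output is truncated with "..."), so the result is a partial product, not the full score of the hit. The function name `product_of_factors` is hypothetical.

```python
import math

# Only the two factors shown in the explanation above; the rest are elided there.
why = {
    "type": "product",
    "factors": [
        {
            "type": "hitPropositionLift",
            "proposition": {"id": 6410405093677},
            "value": 1.9827806375460209,
        },
        {
            "type": "hitPropositionLift",
            "proposition": {"$not": {"name": {"$has": "puikula"}}},
            "value": 1.0472308585357502,
        },
    ],
}


def product_of_factors(explanation):
    """Multiply the "value" of every top-level factor of a "product" explanation."""
    return math.prod(factor["value"] for factor in explanation["factors"])


partial_lift = product_of_factors(why)
print(partial_lift)
```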
See also $p
and $lift
.
Examples
"$lift"
{ "from": "messages", "get": "user", "orderBy": "$lift", "where": { "message": { "$match": "dog" } } }
"$p"
can be used in the "orderBy"
clause of the Generic query to get
the most probable values.
When used this way, it is similar to the Match query.
In the grocery dataset, running the following query would yield products with a name similar to "lactose" that have the highest probability of being purchased:
{ "from": "impressions", "where": { "product.name": {"$match": "lactose"} }, "get": "purchase", "orderBy": "$p" }
Similar to the Match query, running the following query would yield the most likely product based on all the fields of the linked product
table:
{ "from": "impressions", "where": { "context.user": "bob", "purchase": true }, "get": "product", "orderBy": "$p" }
Since the product
field in the impressions
table is linked to the products
table, Aito would find all the statistical relations between what is declared inside the "where"
clause and the features of all the fields of a product, that is, id
, name
, category
, price
, tag
. In this case, the probability score is the normalized product of the lifts of each field's features. We can investigate this by opening up the explanation, adding the $why operator to the "select"
clause (e.g. "select": ["$score", "$why"]
):
"$why": { "type": "product", "factors": [ { "type": "hitPropositionLift", "proposition": { "id" : 6410405093677 }, "value": 1.9827806375460209, "factors": [ { "type": "relatedPropositionLift", "proposition": { "purchase" : true }, "value": 1.9827806375460209 } ] }, { "type": "hitPropositionLift", "proposition": { "$not" : { "name" : { "$has": "puikula" } } }, "value": 1.0472308585357502, "factors": [ { "type": "relatedPropositionLift", "proposition": { "purchase" : true }, "value": 1.0472308585357502 } ] } ... ] }
We can see that the probability score is composed of the lift of an id
feature, a name
feature and others.
See also $p
and $lift
.
Examples
"$p"
{ "from": "messages", "get": "user", "orderBy": "$p", "where": { "message": { "$match": "dog" } } }
"$similarity"
can be used in Generic query to get the most similar
rows based on the contents of the "where"
clause.
Consider the following example. It returns all the products that contain 'iphone' in the title. It also sorts the results by their similarity to 'iphone' and highlights the 'iphone' term in the product title field.
{ "from": "product", "where": { "title": { "$match": "iphone" } }, "get": "message", "orderBy": "$similarity", "select": ["title", "$highlight"] }
Examples
"$similarity"
{ "from": "product", "get": "message", "orderBy": "$similarity", "where": { "title": { "$match": "iphone" } } }
"$sameness"
can be used in Generic query to get the rows that are most nearly the same,
based on the contents of the "where"
clause.
Consider the following example. It returns all the questions that are roughly the same as 'How can I order a sim card?', ordered by how closely they match the question.
{ "from": "questions", "where": { "title": { "$nn": ["How can I order a sim card?"] } }, "get": "message", "orderBy": "$sameness" }
Examples
"$sameness"
{ "from": "product", "orderBy": "$sameness", "where": { "title": { "$match": "iphone" } } }
Conceptually similar to the plain $lift operator, but allows using a customized proposition for the lift score calculation.
This $lift
operator enables more options to customize the lift score calculation, especially when getting the values of a linked table.
Narrow down the fields that are used to calculate the lift:
This is similar to the behavior of the "basedOn"
clause of the Match query.
When calculating the lift of a linked field, Aito uses all the fields of the linked table (see $lift for how the lift is calculated for a linked field).
If you would like to narrow down how the lift is calculated, you can add the field name following the $lift
. For example, find the most likely product based on only the product name
:
{ "from": "impressions", "where": { "context.user": "bob", "purchase": true }, "get": "product", "orderBy": { "$lift": "name" } }
You can also calculate the lift based on multiple fields by using the array format. For instance:
{ "$lift": ["category", "tag"] }
Calculate the lift based on a specific context:
This is similar to the behavior of the Recommend query.
By combining with the $context operator, the lift score can be defined as the lift of a context. For instance, to find the products with the highest lift of getting purchased:
{ "from": "impressions", "where": { "context.user": "bob" }, "get": "product", "orderBy": { "$lift": {"$context": {"purchase": true}} } }
Examples
{ "$lift": "tags" }
{ "$lift": ["tags", "title"] }
{ "$lift": { "$context": { "click": true } } }
{ "from": "impressions", "where": { "product.title": { "$match": "iphone" } }, "get": "product", "orderBy": { "$lift": "tags" } }
{ "from": "impressions", "where": { "product.title": { "$match": "iphone" } }, "get": "product", "orderBy": { "$lift": ["tags", "title"] } }
{ "from": "impressions", "where": { "product.title": { "$match": "iphone" } }, "get": "product", "orderBy": { "$lift": { "$context": { "click": true } } } }
Conceptually similar to the plain $p operator, but allows using a customized proposition for the probability score calculation.
This $p
operator enables more options to customize the probability score calculation, especially when getting the values of a linked table:
Narrow down the fields that are used to calculate the probability:
This is similar to the behavior of the "basedOn"
clause of the Match query.
When calculating the probability of a linked field, Aito uses all the fields of the linked table (see $p for how the probability is calculated for a linked field).
If you would like to narrow down how the probability is calculated, you can add the field name following the $p
. For example, find the most likely product based on only the product name
:
{ "from": "impressions", "where": { "context.user": "bob", "purchase": true }, "get": "product", "orderBy": { "$p": "name" } }
You can also calculate the probability based on multiple fields by using the array format. For instance:
{ "$p": ["category", "tag"] }
Calculate the probability based on a specific context:
This is similar to the behavior of the Recommend query.
By combining with the $context operator, the probability score can be defined as the probability of a context. For instance, to find the products with the highest probability that the product would be purchased:
{ "from": "impressions", "where": { "context.user": "bob" }, "get": "product", "orderBy": { "$p": {"$context": {"purchase": true}} } }
Examples
{ "$p": "tags" }
{ "$p": ["tags", "title"] }
{ "$p": { "$context": { "click": true } } }
{ "from": "impressions", "where": { "product.title": { "$match": "iphone" } }, "get": "product", "orderBy": { "$p": "tags" } }
{ "from": "impressions", "where": { "product.title": { "$match": "iphone" } }, "get": "product", "orderBy": { "$p": ["tags", "title"] } }
{ "from": "impressions", "where": { "product.title": { "$match": "iphone" } }, "get": "product", "orderBy": { "$p": { "$context": { "click": true } } } }
Conceptually similar to the plain $similarity operator, but allows using a customized proposition for the similarity score calculation.
The plain $similarity
operator calculates the similarity score based on the "where"
clause contents, whereas this $similarity
operator calculates the similarity score based on the given proposition.
These Generic Queries would yield the same results:
{ "from": "products", "where": { "name": {"$match": "coffee"} }, "orderBy": "$similarity" }
{ "from": "products", "orderBy": { "$similarity": { "name": "coffee" } } }
This $similarity
operation is useful for customizing scoring, as in the example below. Please refer to the Generic query with custom scoring example.
{ "from": "impressions", "where": { "context.user": "veronica" }, "get": "product", "orderBy": { "$multiply": [ { "$p": { "$context": { "purchase": true } } }, { "$similarity": { "name": "coffee" } } ] } }
Examples
{ "$similarity": { "title": "apple iphone", "tags": "premium ios phone" } }
{ "from": "products", "orderBy": { "$similarity": { "title": "apple iphone", "tags": "premium ios phone" } } }
This operator provides a score that reflects whether the returned entry is roughly the same as the given proposition / data.
E.g. if you have a phrase "How to order a sim card?" it should be judged roughly the same as "How can I order a sim card?". Values above 1 are considered to be roughly the same and values under 1 are considered to be roughly distinct.
$sameness works in a similar way to $similarity, except that it does stricter matching. E.g. the query "sim card" gets a similarity score above 1 with "How can I order sim card?", but a sameness score significantly under 1. $sameness works better in situations where more restrictive scoring is needed to avoid false matches. An example of this is a question answering situation, where one needs to match questions more strictly in order to avoid false positives.
Examples
{ "$sameness": { "title": "apple iphone", "tags": "premium ios phone" } }
{ "from": "products", "orderBy": { "$sameness": { "title": "apple iphone", "tags": "premium ios phone" } } }
This operator provides a score that reflects whether the returned entry is analogous to the given value with respect to some other values.
E.g. if you have a query word 'cheap' it should have a high $analogy
score with words like
affordable and inexpensive.
Examples
{ "$analogy": { "with": "veronica", "basedOn": "purchases" } }
{ "from": "products", "get": "title", "orderBy": { "$analogy": { "with": "ideapad", "basedOn": "tags" } } }
"$f"
can be used in the "orderBy"
clause of the Generic query to get the frequency of a feature.
Examples
"$f"
{ "from": "impressions", "get": "product", "orderBy": "$f" }
$normalize
operator can be used in the "orderBy"
clause of the Generic query to make the scores sum to 1.
For example, you can normalize the plain $lift or a customized $lift object so that the scores sum to 1:
{ "from": "impressions", "where": { "product.name": {"$match": "lactose"} }, "get": "purchase", "orderBy": { "$normalize": "$lift" } }
or
{ "from": "impressions", "where": { "context.user": "bob", "purchase": true }, "get": "product", "orderBy": { "$normalize": { "$lift": { "$context": { "click": "true" } } } } }
Examples
{ "$normalize": "$lift" }
{ "$normalize": { "$lift": "name" } }
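The effect of normalization can be pictured with a few lines of Python. This is a sketch under the assumption that normalizing means dividing each score by the total of all scores. The reference does not spell out the formula, and the lift values below are made up for illustration.

```python
import math


def normalize(scores):
    """Scale scores so that they sum to 1 (assumed: divide each by the total)."""
    total = sum(scores.values())
    if total == 0:
        raise ValueError("cannot normalize scores that sum to zero")
    return {key: value / total for key, value in scores.items()}


# Made-up lift values for three candidate products.
lifts = {"milk": 2.0, "coffee": 1.5, "bread": 0.5}
normalized = normalize(lifts)
print(normalized)
```

Normalizing keeps the ranking of the hits unchanged while making the scores comparable to probabilities over the returned alternatives.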
Can be used in "select"
clause.
$sort
is a built-in field that can be used to access the sort value used in the orderBy-clause.
Example
"$sort"
$score
is a built-in field that can be used to access the sort value used in the orderBy-clause,
when the sort-value is a numeric score like a probability.
Example
"$score"
$index
is a built-in variable which indicates the insertion index of a row. It can be
used together with $mod to select parts of a table. It's useful
for example in Evaluate query for selecting training or test data.
Example
"$index"
$p
is a built-in field that can be used to access the value used by orderBy-clause,
when the sort-value is a probability. See $p for more information.
Example
"$p"
When selecting $why
, Aito opens up why a certain result was predicted.
The explanation contains three different factors, which are explained below.
The three different factors are for an estimate of form:
"baseP"
The base probability.
"normalizer"
Aito has two different normalizers: exclusiveness and trueFalseExclusiveness.
These normalizers are often grouped into a single 'product' component.
{ "type" : "product", "factors" : [ { "type" : "normalizer", "name" : "exclusiveness", "value" : 1.0119918068684681 }, { "type" : "normalizer", "name" : "trueFalseExclusiveness", "value" : 1.09917613448721 } ] }
The exclusiveness
normalizer is only used when exclusiveness is on. In this case, it is assumed
that only one feature can be true at a time, and that exactly one feature will be true.
In practice, exclusiveness enforces the probabilities of alternative features to sum to 1.0.
The normalizer is of form:
Aito makes a probability estimation for both X and ¬X in the background and uses the
trueFalseExclusiveness
normalizer to assert that the probabilities P(X) and P(¬X) sum to 1.0.
The normalizer is of form:
"relatedVariableLift"
Probability lifts. For example: the lift may say a product is clicked with 2.3x likelihood (or 130% higher likelihood), when it has 5 stars.
A probability lift is of form:
Example
"$why"
$value
is a built-in field which contains the value of the returned object.
$value can be used to access the field value referred to in the predict, match, recommend
and get-clauses, when the returned item is either a field value or a field feature/proposition.
$value
is intended to replace the 'feature' field in the long term.
The $value field has been added to contain the information in the 'feature' field, so that for the query:
{ "from" : "products", "where" : { "title" : "apple iphone" }, "predict": "tags", "select" : ["$p", "$value"], "limit":3 }
The result will be the following:
{ "offset" : 0, "total" : 10, "hits" : [ { "$p" : 0.3656914544001758, "$value" : "premium" }, { "$p" : 0.1546922568903658, "$value" : "cover" }, { "$p" : 0.09493670104339776, "$value" : "macosx" } ] }
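On the client side, the result above can be read by pairing each hit's "$value" with its "$p". A minimal Python sketch using only the standard library, with the response body copied from the example above:

```python
import json

# Response body copied from the example result above.
response_text = """
{ "offset" : 0, "total" : 10, "hits" : [
  { "$p" : 0.3656914544001758, "$value" : "premium" },
  { "$p" : 0.1546922568903658, "$value" : "cover" },
  { "$p" : 0.09493670104339776, "$value" : "macosx" } ] }
"""

response = json.loads(response_text)

# Pair each predicted value with its probability, in the order returned.
predictions = [(hit["$value"], hit["$p"]) for hit in response["hits"]]

for value, probability in predictions:
    print(f"{value}: {probability:.3f}")

# "total" counts all matches, while "hits" holds only the requested page.
print(len(predictions), "of", response["total"], "hits shown")
```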
$value works similarly when predicting the field value using the generic query.
{ "from" : "products", "where" : { "title" : "apple iphone" }, "get": "tags", "orderBy" : "$p", "select" : ["$p", "$value"], "limit":3 }
Or when predicting the field features with the generic query:
{ "from" : "products", "where" : { "title" : "apple iphone" }, "get": "tags.$feature", "orderBy" : "$p", "select" : ["$p", "$value"], "limit":3 }
Example
"$value"
$proposition
is a built-in field which contains the proposition object of
the returned feature. The returned proposition is compatible with the
proposition format and it can be used as such in the where
clause.
Consider the following query:
{ "from": "products", "where": { "title": "Apple" }, "predict": { "$on": [ { "$exists": "tags" }, { "$and": [ { "tags": { "$match": "phone" } }, { "$not": { "tags": { "$match": "laptop" } } } ] } ] }, "select": ["$p", "$value", "$proposition"], "limit": 1 }
This provides the following results:
{ "offset" : 0, "total" : 10, "hits" : [ { "$p" : 0.22622976807854914, "$value" : "phone", "$proposition" : { "$on" : [ { "tags" : { "$has" : "phone" } }, { "$and" : [ { "tags" : { "$has" : "phone" } }, { "$not" : { "tags" : { "$has" : "laptop" } } } ] } ] } } ] }
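Since the returned proposition follows the normal proposition format, it can be fed back into a later query. The Python sketch below copies the "$proposition" of the hit shown above into the "where" clause of a follow-up query. The follow-up query itself (table and limit) is an illustrative assumption.

```python
import copy
import json

# The single hit from the result above.
hit = {
    "$p": 0.22622976807854914,
    "$value": "phone",
    "$proposition": {
        "$on": [
            {"tags": {"$has": "phone"}},
            {
                "$and": [
                    {"tags": {"$has": "phone"}},
                    {"$not": {"tags": {"$has": "laptop"}}},
                ]
            },
        ]
    },
}

# Reuse the proposition as-is in the "where" clause of a new query.
# deepcopy keeps the original hit untouched if the query is edited later.
follow_up_query = {
    "from": "products",
    "where": copy.deepcopy(hit["$proposition"]),
    "limit": 10,
}
print(json.dumps(follow_up_query, indent=2))
```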
Example
"$proposition"
Explanation object when using the "$why"
operator.
Conceptually similar to BaseProbabilityExplanation, but shows the prior lift instead of the prior probability.
See more in Probability vs. Lift.
Example
{ "type": "baseLift", "value": 31 }
Explain the initial weight of a feature. It can be understood as the prior probability of a feature.
Let's take a look at an example of a Predict query:
{ "from": "products", "where": { "name": "Columbian coffee" }, "predict": "tags", "select": ["$p", "feature", "$why"] }
When opening up the explanation with the "$why" operator, the tag feature "coffee" has a BaseProbabilityExplanation:
{ "type": "baseP", "value": 0.16 }
This explanation tells us that Aito gives the feature "coffee" a prior probability of 0.16.
Example
{ "proposition": { "click": true }, "type": "baseP", "value": 0.5 }
Explain how an exponent score was calculated.
The ExponentExplanation most commonly appears when opening up the explanation (e.g. using the $why operator) of an exponent score such as:
1. The tf-idf score used to calculate the similarity in the Similarity query.
2. The score of the $pow operator.
Example
{ "base": { "type": "idf", "value": 1.7551720221592049 }, "power": { "type": "tf", "value": 1 }, "type": "exponent", "value": 1.7551720221592049 }
Explain how a field score was calculated.
The field explanation most commonly appears when opening up the explanation (e.g: using the $why operator) of a score that was calculated using:
A field value
{ "from" : "impressions", "where" : { "product.name":{"$match": "coffee"} }, "get":"product", "orderBy" : { "$multiply": ["$p", "price"] }, "select": ["$score", "$why"] }
The explanation would contain the value of the "price"
field that was used in the $multiply operator.
{ "type": "field", "field": "price", "value": 3.95 }
A field feature (e.g: $f operator for frequency):
{ "from" : "impressions", "where" : { "product.name":{"$match": "coffee"} }, "get":"product", "orderBy" : "$f", "select": ["$score", "$why"] }
The explanation would contain the frequency of the feature.
{ "type": "field", "field": "$f", "value": 152.0 }
Example
{ "field": "price", "type": "field", "value": 1500 }
Explain how a proposition's lift was calculated.
A hit score is calculated by aggregating the scores of its propositions (features). The HitPropositionLiftExplanation explains how the lift of each proposition was calculated.
A HitPropositionLift can be:
A similarity score
A hit's field can contain a word that matches the stem of the given similarity condition. That word would have a HitPropositionLift that is a similarity score. Let's take a look at an example of a Similarity query:
{ "from": "products", "similarity": { "name": "Columbian coffee", "tags": "expansive coffee" }, "select": ["$score", "name", "tags", "$why"] }
When opening up the explanation with "$why" operator, we can see that a hit with name "Juhla Mokka coffee 500g sj" containing the word coffee has a HitPropositionLiftExplanation as follows:
{ "type": "hitPropositionLift", "proposition": "name:coffe", "value": 2.1726635013471625, "factors": [ { "type": "exponent", "value": 2.1726635013471625, "base": { "type": "idf", "value": 2.1726635013471625 }, "power": { "type": "tf", "value": 1.0 } } ] }
An aggregated score of BaseLift and RelatedPropositionLift
Let's take a look at an example of a Match query:
{ "from": "impressions", "where": { "context.user": "larry" }, "match": "product", "select": ["$score", "name", "$why"] }
When opening up the explanation with "$why" operator, the first hit has a HitPropositionLiftExplanation as follows:
{ "type": "hitPropositionLift", "proposition": { "id" : { "$has" : 6410405216120 } }, "value": 599.5491890842981, "factors": [ { "type": "baseLift", "value": 265.0 }, { "type": "relatedPropositionLift", "proposition": { "context.user": { "$has" : "larry" } }, "value": 2.2624497701294266 } ] }
This explains that the initial lift of the feature "id:6410405216120"
is 265 and when the user is Larry, the relatedPropositionLift is 2.2624497701294266. Hence the aggregated lift is 265 × 2.2624497701294266 ≈ 599.549.
Example
{ "factors": [ { "type": "baseLift", "value": 31 } ], "proposition": { "field": 4 }, "type": "hitPropositionLift", "value": 31 }
Explain how a proposition's lift was calculated.
HitLinkPropositionLiftExplanation explains the impact of the value that links to the table containing the returned hits.
Let's consider an example where there is an impression table that has a numeric field 'product' that links to the product table. In such a case, the HitLinkPropositionLift would explain the significance of the field 'product' in the impression table. E.g., if the product link's value is 4, the HitLinkPropositionLiftExplanation will explain the effect of the proposition { "product" : 4 }. If the value is 2.0, it means that product 4 is estimated to be twice as probable just based on the statistics of the linking column.
Example
{ "proposition": { "product": 5 }, "type": "hitLinkPropositionLift", "value": 2.32 }
Explain the inverse document frequency score.
The inverse document frequency score is one component of the term frequency-inverse document frequency score which is used in Aito's similarity metrics.
Example
{ "type": "idf", "value": 1.7551720221592049 }
Explain how a special named score was calculated.
The NamedExplanation currently only appears when calculating a score with exclusiveness. In this case, it explains the normalizer that enforces the probabilities of a feature to sum to 1.0.
Example
{ "name": "exclusiveness", "type": "normalizer", "value": 0.2982788431762749 }
Explain how a probability was calculated.
The PredictExplanation most commonly appears when opening up the explanation (e.g. using the $why operator) of a Predict query. Let's take a look at an example of a Predict query:
{ "from": "products", "where": { "name": "Columbian coffee" }, "predict": "tags", "select": ["$p", "feature", "$why"], "limit": 22 }
The first hit has an explanation of:
{ "type": "product", "factors": [ { "type": "baseP", "value": 0.16 }, { "type" : "product", "factors" : [ { "type" : "normalizer", "name" : "exclusiveness", "value" : 1.0119918068684681 }, { "type" : "normalizer", "name" : "trueFalseExclusiveness", "value" : 1.09917613448721 } ] }, { "type": "relatedVariableLift", "variable": "name:coffe", "value": 8.45603245079726 } ] }
Example
{ "factors": [ { "type": "baseP", "value": 0.8048780487804879 }, { "name": "exclusiveness", "type": "normalizer", "value": 0.04604801347746731 } ], "type": "product" }
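A "product" explanation such as the Example above can be evaluated by multiplying its factors together. The recursive Python sketch below assumes that nested "product" nodes are multiplied out and that every other node contributes its "value". The function name `evaluate` is hypothetical.

```python
import math


def evaluate(explanation):
    """Multiply out nested "product" nodes; other nodes contribute their "value"."""
    if explanation["type"] == "product":
        return math.prod(evaluate(factor) for factor in explanation["factors"])
    return explanation["value"]


# The Example explanation above: a base probability and one normalizer.
explanation = {
    "type": "product",
    "factors": [
        {"type": "baseP", "value": 0.8048780487804879},
        {"type": "normalizer", "name": "exclusiveness", "value": 0.04604801347746731},
    ],
}

probability = evaluate(explanation)
print(probability)
```

The recursion matters because normalizers are often grouped into their own 'product' component inside the outer product, as in the explanation of the first hit shown earlier in this section.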
This explanation object explains how a product score was calculated. It occurs in $why results, if $multiply operation is used.
The ProductExplanation most commonly appears when opening up the explanation (e.g: using the $why operator) of:
The final score is a product of multiple score components:
{ "from": "impressions", "where": { "context.user": "larry" }, "match": "product", "select": ["$score", "name", "$why"] }
{ "type": "product", "factors": [ { "type": "hitPropositionLift", "proposition": { "id" : 6410405216120 }, "value": 599.5491890842981, "factors": [ { "type": "baseLift", "value": 265.0 }, { "type": "relatedPropositionLift", "proposition": { "context.user" : "larry" }, "value": 2.2624497701294266 } ] }, ... ] }
Example
{ "factors": [ { "factors": [ { "type": "baseLift", "value": 31 } ], "proposition": { "id": 3 }, "type": "hitPropositionLift", "value": 31 }, { "field": "price", "type": "field", "value": 1500 } ], "type": "product" }
Example
{ "dividend": { "field": "return", "type": "field", "value": 400000 }, "divisor": { "field": "investment", "type": "field", "value": 250000 }, "type": "division" }
Explain how a related variable's lift was calculated.
A related variable (feature) most commonly appears when doing inference with some conditions. The RelatedVariableLiftExplanation explains how a variable of the conditions affects the lift of a hit's variable.
Let's take a look at an example of a Match query:
{ "from": "impressions", "where": { "context.user": "larry" }, "match": "product", "select": ["$score", "name", "$why"] }
When opening up the explanation with "$why" operator, the first hit has an explanation as follows:
{ "type": "hitVariableLift", "variable": "id:6410405216120", "value": 599.5491890842981, "factors": [ { "type": "baseLift", "value": 265.0 }, { "type": "relatedVariableLift", "variable": "context.user:larry", "value": 2.2624497701294266 } ] }
This explains that the feature "context.user:larry"
extracted from the conditions "where": { "context.user": "larry" }
increases the likelihood of the product with an id of 6410405216120, with a lift of 2.2624497701294266.
Examples coming later.
Explain how a score was calculated.
Example
{ "type": "baseP", "value": 0.28 }
Example
{ "terms": [ { "field": "id", "type": "field", "value": 4 }, { "field": "price", "type": "field", "value": 1500 } ], "type": "sum" }
Example
{ "minuend": { "field": "price", "type": "field", "value": 119.5 }, "subtrahend": { "field": "cost", "type": "field", "value": 100.5 }, "type": "subtraction" }
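The arithmetic explanation objects (the "sum" and "subtraction" Examples above and the "division" Example shown earlier) can be read as small expression trees. The Python sketch below evaluates them, assuming each node type combines its children in the obvious way and that leaf nodes such as "field" carry their number in "value". The function name `evaluate_arithmetic` is hypothetical.

```python
def evaluate_arithmetic(explanation):
    """Evaluate "sum", "subtraction" and "division" explanation trees."""
    kind = explanation["type"]
    if kind == "sum":
        return sum(evaluate_arithmetic(term) for term in explanation["terms"])
    if kind == "subtraction":
        return (evaluate_arithmetic(explanation["minuend"])
                - evaluate_arithmetic(explanation["subtrahend"]))
    if kind == "division":
        return (evaluate_arithmetic(explanation["dividend"])
                / evaluate_arithmetic(explanation["divisor"]))
    return explanation["value"]  # leaf node, e.g. {"type": "field", ...}


def field(name, value):
    """Shorthand for a "field" leaf as it appears in the Examples."""
    return {"field": name, "type": "field", "value": value}


sum_example = {"terms": [field("id", 4), field("price", 1500)], "type": "sum"}
subtraction_example = {"minuend": field("price", 119.5),
                       "subtrahend": field("cost", 100.5),
                       "type": "subtraction"}
division_example = {"dividend": field("return", 400000),
                    "divisor": field("investment", 250000),
                    "type": "division"}

print(evaluate_arithmetic(sum_example))          # 4 + 1500
print(evaluate_arithmetic(subtraction_example))  # 119.5 - 100.5
print(evaluate_arithmetic(division_example))     # 400000 / 250000
```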
Explain the term frequency score.
The term frequency score is one component of the term frequency-inverse document frequency score which is used in Aito's similarity metrics.
Example
{ "type": "tf", "value": 1 }
Default value explanation describes the default score for some operation.
For example TF-IDF scoring assigns default lift of 1.0 for all rows without matching terms.
Example
{ "type": "default", "value": 1 }
Default value explanation describes a constant value, typically given by the user.
Examples coming later.
Explanation proposition object when using the "$why"
or "relate"
operator.
FieldPropositionExplanation expresses a statement about a document field.
For example, the expression
{ "tags": { "$has" : "laptop" } }
states that the tags
field contains the "laptop" feature.
Examples coming later.
This format is used in the $why and relate explanations for the '$has'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.
Examples coming later.
This format is used in the $why and relate explanations for the '$and'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.
Examples coming later.
This format is used in the $why and relate explanations for the '$or'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.
Examples coming later.
This format is used in the $why and relate explanations for the '$on'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.
Examples coming later.
This format is used in the $why and relate explanations for the '$not'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.
Examples coming later.
This format is used in the $why and relate explanations for the '$startsWith'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.
Examples coming later.
This format is used in the $why and relate explanations for the '$gt'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.
Examples coming later.
This format is used in the $why and relate explanations for the '$gte'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.
Examples coming later.
This format is used in the $why and relate explanations for the '$lt'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.
Examples coming later.
Examples coming later.
This format is used in the $why and relate explanations for the '$defined'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.
Examples coming later.
This format is used in the $why and relate explanations for the '$numeric'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.
Examples coming later.
This format is used in the $why and relate explanations for the '$knn'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.
Examples coming later.
This format is used in the $why and relate explanations for the '$nn'-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.
Examples coming later.
This format is used in the $why and relate explanations for the is-proposition. Explanations of this format follow the normal proposition syntax and it can be reused in the where-clause.
Examples coming later.
This is the format for all propositions used in the $why and relate explanations.
Examples coming later.
All other API types.
Name of a column in a table. Links are supported.
Examples
"id"
"age"
"product.id"
Example
{ "from": "impressions", "get": "product", "limit": 20, "offset": 10 }
A query that can be evaluated in an EvaluateGroupedQuery. Currently only the Generic query and the Recommend query are supported.
Examples
{ "from": "impressions", "goal": { "purchase": "true" }, "recommend": "product", "where": { "product.name": { "$get": "query" }, "session.user": { "$get": "session.user" } } }
{ "from": "impressions", "get": "product", "orderBy": { "$p": { "purchase": true } }, "where": { "product.name": { "$get": "query" }, "session.user": { "$get": "session.user" } } }
The EvaluateGroupedQuery is similar to the EvaluateQuery, with an additional option to group multiple entries into a single test case.
For example, if the "impressions" table contains a "customerCohort" identifier, we can evaluate by customerCohort
instead of by individual customer with the following EvaluateGroupedQuery:
{ "evaluate": { "from": "impressions", "where": { "customer": { "$get": "customer" } }, "recommend": "product", "goal": { "purchase": true } }, "group": "customerCohort", "test": { "customerCohort": { "$gte": 5 } }, "select": ["trainSamples", "testSamples", "meanRank"] }
Example
{ "evaluate": { "from": "impressions", "goal": { "purchase": "true" }, "recommend": "product", "where": { "product.name": { "$get": "query" }, "session.user": { "$get": "session.user" } } }, "group": "userGroup", "select": ["accuracy", "meanRank", "n"], "test": { "userGroup": { "$gte": 5 } } }
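To make the grouping idea concrete, here is a minimal Python sketch of how rows that share a group value could be collected into a single test case. It only illustrates the concept on local data and is not Aito's implementation; the helper name group_test_cases and the sample rows are hypothetical.

```python
from collections import defaultdict


def group_test_cases(rows, group_field, min_value):
    """Collect rows into one test case per group value.

    Only groups whose value is >= min_value are kept, mirroring a
    test filter such as {"customerCohort": {"$gte": 5}}.
    """
    cases = defaultdict(list)
    for row in rows:
        value = row[group_field]
        if value >= min_value:
            cases[value].append(row)
    return dict(cases)


# Hypothetical impressions rows, used only for illustration.
rows = [
    {"customerCohort": 4, "product": "a", "purchase": True},
    {"customerCohort": 5, "product": "b", "purchase": False},
    {"customerCohort": 5, "product": "c", "purchase": True},
    {"customerCohort": 6, "product": "d", "purchase": True},
]
cases = group_test_cases(rows, "customerCohort", 5)
```

In this sketch the two rows of cohort 5 end up in the same test case, while cohort 4 is left out by the filter.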
Example
{ "from": "impressions", "limit": 2, "match": "prevProduct", "offset": 2, "select": ["title", "description", "price"], "where": { "customer": 4, "query": "laptop" } }
The Generic Query to be evaluated in an EvaluateGroupedQuery
Example
{ "from": "impressions", "get": "product", "orderBy": { "$p": { "purchase": true } }, "where": { "product.name": { "$get": "query" }, "session.user": { "$get": "session.user" } } }
Operation to be evaluated.
Examples
{ "from": "messages", "get": "product", "similarity": { "description": { "$get": "message" }, "title": { "$get": "message" } } }
{ "from": "products", "get": "product", "orderBy": "$p", "where": { "name": { "$get": "name" } } }
{ "from": "products", "predict": "category", "where": { "name": { "$get": "name" } } }
{ "from": "messages", "match": "product", "where": { "message": { "$get": "message" } } }
Example
{ "from": "products", "predict": "category", "where": { "name": { "$get": "name" } } }
A query to evaluate:
Example
{ "evaluate": { "from": "products", "predict": "category", "where": { "name": { "$get": "name" } } }, "select": ["accuracy", "meanRank", "n"], "test": { "$index": { "$mod": [10, 1] } } }
Example
{ "from": "impressions", "goal": { "purchase": "true" }, "recommend": "product", "where": { "product.name": { "$get": "query" }, "session.user": { "$get": "session.user" } } }
A Similarity query to be evaluated in the Evaluate query
Example
{ "from": "messages", "get": "product", "similarity": { "description": { "$get": "message" }, "title": { "$get": "message" } } }
Define the base and the exponent of the $pow operator in the array format.
The first item of the array is the base and the second item of the array is the exponent.
Example
["width", 2]
Define the base and the exponent of the $pow operator in the object format.
Example
{ "base": "width", "exponent": 2 }
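The two $pow formats carry the same information. The Python sketch below shows one way to normalize them and compute the resulting score for a single row. It is an illustration only: the helper names are hypothetical, and treating a string base as a field lookup is an assumption drawn from the examples above.

```python
def normalize_pow(args):
    """Return (base, exponent) from either the array or the object format."""
    if isinstance(args, dict):
        return args["base"], args["exponent"]
    base, exponent = args  # array format: [base, exponent]
    return base, exponent


def eval_pow(args, row):
    """Evaluate $pow for one row.

    Assumption: a string base names a field of the row, a number is used as-is.
    """
    base, exponent = normalize_pow(args)
    base_value = row[base] if isinstance(base, str) else base
    return base_value ** exponent


row = {"width": 3}  # hypothetical row
array_score = eval_pow(["width", 2], row)
object_score = eval_pow({"base": "width", "exponent": 2}, row)
```

Both calls resolve to the same score, since the array and the object describe the same base and exponent.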
The empty response object may contain more information in the future.
Examples coming later.
FieldProposition expresses statements about a field in a table.
For example, the following expression
"price": {"$lt": 500 }
describes a statement that price is under 500.
Examples coming later.
Examples
{ "from": "impressions", "where": { "click": true } }
"impressions"
{ "from": { "from": "impressions", "where": { "click": true } }, "orderBy": "$p", "where": { "query": "laptop" } }
{ "from": "impressions" }
From expression declares the table to use.
Examples
"impressions"
"products"
"customers"
"messages"
From expression declares the examined table.
Examples
"impressions"
"products"
"customers"
"messages"
FromWhere expression allows you to narrow the examined table.
When FromWhere is used, Aito only considers that narrowed slice of the table.
For instance, this query:
{ "from": { "from": "impressions", "where": { "context.user": "larry"} }, "match": "product" }
is different from:
{ "from": "impressions", "where": { "context.user": "larry" }, "match": "product" }
In the first query, Aito matches Larry with products based only on Larry's impressions data, while in the second query, Aito matches Larry with products based on the impressions data of Larry and other users.
Example
{ "from": "impressions", "where": { "click": true } }
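The difference between a plain table name and a FromWhere object can be pictured as a difference in which rows are examined. The following Python sketch shows that on local data. It is a toy model, not Aito's implementation: the helper names and sample rows are hypothetical, and only exact-match conditions are handled.

```python
def matches(row, where):
    """True when every field in `where` equals the row's value."""
    return all(row.get(field) == value for field, value in where.items())


def rows_considered(rows, from_expr):
    """Return the rows examined for a given "from" value.

    A plain table name keeps every row; a FromWhere object keeps only
    the rows that satisfy its "where" clause.
    """
    if isinstance(from_expr, dict):
        return [row for row in rows if matches(row, from_expr["where"])]
    return list(rows)


# Hypothetical impressions rows with flattened field names.
impressions = [
    {"context.user": "larry", "product": 1},
    {"context.user": "larry", "product": 2},
    {"context.user": "veronica", "product": 1},
]
narrowed = rows_considered(
    impressions,
    {"from": "impressions", "where": {"context.user": "larry"}},
)
whole_table = rows_considered(impressions, "impressions")
```

With FromWhere only Larry's two rows are examined, whereas the plain table name keeps all three rows available.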
Get expression defines what items are returned as query results.
By default, the hits are from the table defined in the "from" clause.
In some cases, you may want to declare propositions like 'query is laptop' in the impressions table, while returning results from the separate products table, based on click likelihood. In this case, you may use a query such as:
{ "from": "impressions", "where": { "query": "laptop" }, "get": "product", "orderBy": { "$p": { "$context": { "click": true } } } }
The "get" expression takes a field name as a parameter. If the field is a link, the returned results are from the linked table. If the field is not a link, the field values are returned as results.
Normally, the result of a query consists of the field values that best fulfill the query conditions. Field analyzers extract features from text fields, and the $feature property can be used to return features instead of complete field values. For instance, the following example demonstrates how to discover product tags which are likely to lead to sales:
{ "from": "impressions", "where": { "query": "cheap phone" }, "get": "product.tags.$feature", "orderBy": { "$p": { "$context": { "click": true } } } }
The $feature syntax also allows you to examine the values/features of a link field as if it were a regular field.
Examples
"product"
"user"
"text.$feature"
"link.field"
"link.$feature"
"link.text.$feature"
$get is used to access external variables in the evaluate query.
$get is currently only used in Evaluate queries. An Evaluate query tests a specified query by examining the table rows one by one. $get allows accessing the tested row's properties.
Consider the following example.
Given a table containing products data with the following schema:
"products": { "type": "table", "columns": { "title": { "type": "Text", "analyzer": "English" }, "description": { "type": "Text", "analyzer": "English" } } }
and a table containing impressions data with the following schema:
"impressions": { "type": "table", "columns": { "customer": { "type": "Int", "link": "customers.id" }, "product": { "type": "Int", "link": "products.id" }, "query": { "type": "Text", "analyzer": "English" } } }
The goal is to test how well the traditional TF-IDF similarity metric works for finding a product. The $get is used in the similarity query to compare the product's title and description fields with the impression table's query field.
{ "test": { "click": true }, "evaluate": { "from": "impressions", "get": "product", "similarity": { "title": { "$get": "query" }, "description": { "$get": "query" } } }, "select": ["trainSamples", "n", "accuracy", "baseAccuracy", "meanRank", "mxe"] }
Examples
{ "$get": "query" }
{ "$get": "click" }
{ "$get": "product.title" }
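One way to picture $get is as a placeholder that is filled in from the row currently being tested. The Python sketch below performs that substitution on a query template. It illustrates the idea only and is not how Aito resolves $get internally; the function name resolve_get and the sample row are hypothetical, and linked paths such as "product.title" are not handled.

```python
def resolve_get(expr, row):
    """Return a copy of expr where every {"$get": field} is replaced
    by the value of that field in the tested row."""
    if isinstance(expr, dict):
        if set(expr) == {"$get"}:
            return row[expr["$get"]]
        return {key: resolve_get(value, row) for key, value in expr.items()}
    if isinstance(expr, list):
        return [resolve_get(item, row) for item in expr]
    return expr


template = {
    "from": "impressions",
    "get": "product",
    "similarity": {
        "title": {"$get": "query"},
        "description": {"$get": "query"},
    },
}
# Hypothetical row of the impressions table being tested.
row = {"customer": 4, "product": 17, "query": "laptop", "click": True}
resolved = resolve_get(template, row)
```

For this row, both similarity fields are compared against the row's "query" value, and the template itself is left unchanged so it can be reused for the next row.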
Specifies a goal to maximize.
Results are ordered by the likelihood of the goal in descending order.
Examples
{ "purchase": true }
{ "click": true }
Example
[ { "$p": 0.16772371915637704, "category": "100", "id": "6410405060457", "name": "Pirkka bio cherry tomatoes 250g international 1st class", "price": 1.29, "tags": "fresh vegetable pirkka tomato" }, { "$p": 0.16772371915637704, "category": "100", "id": "6410405093677", "name": "Pirkka iceberg salad Finland 100g 1st class", "price": 1.29, "tags": "fresh vegetable pirkka" } ]
The syntax { "field": { "$is": "yourvalue" } } is equivalent to { "field": "yourvalue" }.
Example
{ "$is": "value" }
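The equivalence between the short form and the explicit $is form can be sketched in Python as a small rewriting step. This is a hypothetical helper for illustration, not part of any Aito SDK. It assumes that any object value already contains an operator and leaves it untouched.

```python
def expand_is(proposition):
    """Rewrite {"field": primitive} as {"field": {"$is": primitive}}."""
    expanded = {}
    for field, value in proposition.items():
        if isinstance(value, dict):
            # Already uses an operator such as $is or $lt; keep as-is.
            expanded[field] = value
        else:
            expanded[field] = {"$is": value}
    return expanded


short_form = {"field": "yourvalue"}
explicit_form = expand_is(short_form)
```

Applying the helper to a proposition that is already explicit returns it unchanged.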
Define the 'k' and the 'near' parameter of the $knn operator in the array format.
The first item of the array is the 'k' parameter and the second item of the array is the 'near' parameter.
Example
[ 4, { "tags": "laptop" } ]
Define the 'k' and the 'near' parameter of the $knn operator in the object format.
Example
{ "k": 4, "near": { "tags": "laptop" } }
Define the 'near' and the 'threshold' parameters of the $nn operator in the array format.
The first item of the array is the 'near' parameter and the second item of the array is the 'threshold' parameter.
Example
[ { "tags": "laptop" } ]
Define the 'near' and the 'threshold' parameters of the $nn operator in the object format.
Example
{ "near": { "tags": "laptop" } }
Define the divisor and the remainder of the $mod operator in the array format.
The first item of the array is the divisor and the second item of the array is the remainder.
Example
[2, 0]
Define the divisor and the remainder of the $mod operator in the object format.
Example
{ "divisor": 2, "remainder": 0 }
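A $mod condition keeps the values whose remainder after division by the divisor equals the given remainder. The Python sketch below applies that rule to row indices, which is how a test set can be carved out with { "$index": { "$mod": [10, 1] } }. The helper names are hypothetical, and zero-based indexing is an assumption made for this illustration.

```python
def normalize_mod(args):
    """Return (divisor, remainder) from either the array or the object format."""
    if isinstance(args, dict):
        return args["divisor"], args["remainder"]
    divisor, remainder = args  # array format: [divisor, remainder]
    return divisor, remainder


def select_indices(n_rows, mod_args):
    """Indices in range(n_rows) where index % divisor == remainder.

    Assumption: indices start at 0.
    """
    divisor, remainder = normalize_mod(mod_args)
    return [i for i in range(n_rows) if i % divisor == remainder]


even_rows = select_indices(10, [2, 0])
same_rows = select_indices(10, {"divisor": 2, "remainder": 0})
test_rows = select_indices(30, [10, 1])
```

With a divisor of 10, roughly one row in ten is selected, which makes $mod a convenient way to hold out a fixed share of the rows for testing.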
Define the hypothesis and the condition of the $on operator in the array format.
The first item of the array is the hypothesis and the second item of the array is the condition.
Example
[ { "click": true }, { "user.tags": "nyc" } ]
Define the hypothesis and the condition of the $on operator in the object format. The "prop" field holds the hypothesis and the "on" field holds the condition.
Example
{ "on": { "user.tags": "nyc" }, "prop": { "click": true } }
Declares the sorting order of the result by a field or by a user-defined score.
Examples
"product.price"
{ "$asc": "product.price" }
{ "$desc": "product.price" }
{ "$multiply": ["$p", "prices"] }
{ "$pow": ["product.width", 2] }
PrimitiveProposition states a field's value.
It should always be used inside a field declaration of a document proposition.
For example, in the proposition { "field": "value" } the string "value" is the primitive proposition.
Examples
4
3.1
false
null
"text"
Proposition expression describes a fact, or a statement.
For instance, the following proposition:
{ "customer.id": 4 }
describes a customer with the id of 4.
{ "clicked": true }
describes that the customer has clicked the item.
You can also combine multiple propositions by declaring them in an object clause. The propositions will be combined by the $and operator. For instance:
{ "price": { "$gt": 20, "$lte": 40 } }
describes an item whose price is greater than 20 and less than or equal to 40.
This proposition is equivalent to:
{ "price": { "$and": [ { "$gt": 20 }, { "$lte": 40 } ] } }
This proposition can be used, for example, in a Search Query to find an item that matches these price criteria:
{ "from": "products", "where": { "price": { "$gt": 20, "$lte": 40 } } }
Examples
{ "customer": 4, "query": { "$match": "laptop" } }
{ "price": { "$gte": 50, "$lt": 100 } }
{ "tags": { "$matches": "laptop" } }
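The rule that several operators declared in one object are combined by $and can be sketched as a rewriting step in Python. The helper below is hypothetical and only illustrates the equivalence described above. It assumes every key of a multi-key field object is an operator.

```python
def expand_and(proposition):
    """Rewrite fields that declare several operators into the explicit $and form."""
    expanded = {}
    for field, value in proposition.items():
        if isinstance(value, dict) and len(value) > 1:
            # One single-operator proposition per declared operator.
            expanded[field] = {"$and": [{op: arg} for op, arg in value.items()]}
        else:
            expanded[field] = value
    return expanded


combined = {"price": {"$gt": 20, "$lte": 40}}
explicit = expand_and(combined)
```

Fields with a single operator or a primitive value pass through unchanged, so the helper can be applied to a whole where-clause.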
PropositionSet expression is used to describe a collection of propositions. This collection of statements can be the alternative values in a field.
Examples
"product.tags"
"query"
"product"
"tags"
Declares the sorting order.
The sorting order can be any attribute of the Relate query hit.
Examples
{ "$desc": "info.miTrue" }
{ "$asc": "lift" }
Examples
{ "name": "My product", "price": 172.19 }
{ "$score": 0.22350516297675496, "$value": "coffee", "$why": { "factors": [ { "factors": [ { "proposition": { "name": { "$has": "coffee" } }, "type": "relatedPropositionLift", "value": 8.45603245079726 } ], "proposition": "coffee", "type": "hitPropositionLift", "value": 8.45603245079726 } ], "type": "product" } }
Score expression resolves to a numeric score value or probability.
All scores can be used in both highlights ($highlight) and explanations ($why).
Examples
2
"product.margin"
"$p"
"$similarity"
{ "$multiply": ["$p", "margin"] }
Describes the fields and/or built-in attributes to return.
Examples
["user.name", "query", "product.title", "click"]
["$why"]
TestSource provides more options for choosing the testing data in the Evaluate Query. Using the TestSource, you can specify the testing data as a specific slice of the same table as the training data, or as a slice of a completely different table.
Example
{ "limit": 100, "select": ["query"], "where": { "$index": { "$mod": [5, 1] } } }
Any object which is valid according to the database schema.
The contents of the object depend on the data inserted into the database. If, for example, you have a products table which has the fields name and price, your object could look like:
{ "name": "My product", "price": 172.19 }
Example
{ "name": "My product", "price": 172.19 }
Value expression resolves to a primitive like int or json, a score, a probability, or an individual feature.
Value expression can refer to any field in the table with expressions like "query" or "product.price". Value expression can refer to the overall likelihood of the narrowed document, for example "$p" after "get": "message" refers to the message's likelihood. Value can also refer to the likelihood of a proposition with expressions such as { "$p": { "tags": "cover" } }, or use { "$p": { "$context": "click" } } to refer to the context table's fields.
Example
"product.id"