Datasets REST API
Overview
The Dataset REST APIs allow you to manage Datasets.
The Dataset APIs available in Fire Insights are listed below. Execute them after you have logged in to Fire Insights.
GET List of Datasets by Application
Returns the list of Datasets for the logged-in user for a given application id (passed as projectId in the request):
curl -X GET --header 'Accept: application/json' --header 'api_key: cookies' 'http://localhost:8080/api/v1/datasets?sortPara=dsc&projectId=1'
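The same request can be prepared from a script. The following is a minimal Python sketch, assuming the host, port, and api_key header value shown in the curl example; `build_list_request` is a hypothetical helper name, not part of Fire Insights. The request is only built here, because sending it needs a running server.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # assumption: same host and port as the curl example


def build_list_request(project_id, sort_para="dsc", api_key="cookies"):
    """Build (but do not send) the GET request that lists datasets for a project."""
    query = urlencode({"sortPara": sort_para, "projectId": project_id})
    return Request(
        f"{BASE_URL}/api/v1/datasets?{query}",
        headers={"Accept": "application/json", "api_key": api_key},
        method="GET",
    )


req = build_list_request(1)
print(req.get_method(), req.full_url)

# To send it against a running server:
#   import json, urllib.request
#   datasets = json.load(urllib.request.urlopen(req))
```

Building the query string with `urlencode` keeps the parameter values correctly escaped if they ever contain reserved characters.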
Create / Update Dataset
If the id value is not passed, a new dataset is created:
JSON
{
"id": 13,
"version": 0,
"name": "spam",
"header": true,
"path": "data\/spam.csv",
"delimiter": ",",
"schemaModel": {
"schemaColList": [
{
"colName": "label",
"colType": "DOUBLE",
"colFormat": "",
"colMLType": "NUMERIC"
},
{
"colName": "message",
"colType": "STRING",
"colFormat": "",
"colMLType": "TEXT"
},
{
"colName": "id",
"colType": "DOUBLE",
"colFormat": "",
"colMLType": "NUMERIC"
}
]
}
}
Curl
curl -X POST --header 'Content-Type: application/json' --header 'Accept: */*' -d '{"id":13,"version":0,"name":"spam","header":true,"path":"data/spam.csv","delimiter":",","schemaModel":{"schemaColList":[{"colName":"label","colType":"DOUBLE","colFormat":"","colMLType":"NUMERIC"},{"colName":"message","colType":"STRING","colFormat":"","colMLType":"TEXT"},{"colName":"id","colType":"DOUBLE","colFormat":"","colMLType":"NUMERIC"}]}}' localhost:8080/dataset/save -b /tmp/cookies.txt
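The difference between creating and updating comes down to whether the payload carries an id. The sketch below illustrates this in Python under a few assumptions: the endpoint and field names are copied from the example above, `version` is set to 0 only because the example does so, and `build_save_request` is a hypothetical helper. The session cookie that the curl command reads from /tmp/cookies.txt is not handled here.

```python
import json
from urllib.request import Request

SAVE_URL = "http://localhost:8080/dataset/save"  # endpoint taken from the curl example


def build_save_request(name, path, columns, dataset_id=None, delimiter=",", header=True):
    """Build (but do not send) a save request; leave dataset_id unset to create."""
    payload = {
        "version": 0,  # assumption: copied from the example payload
        "name": name,
        "header": header,
        "path": path,
        "delimiter": delimiter,
        "schemaModel": {
            "schemaColList": [
                {"colName": col, "colType": typ, "colFormat": "", "colMLType": ml}
                for col, typ, ml in columns
            ]
        },
    }
    if dataset_id is not None:
        payload["id"] = dataset_id  # the id is included only for an existing dataset
    return Request(
        SAVE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


columns = [
    ("label", "DOUBLE", "NUMERIC"),
    ("message", "STRING", "TEXT"),
    ("id", "DOUBLE", "NUMERIC"),
]
create_req = build_save_request("spam", "data/spam.csv", columns)
update_req = build_save_request("spam", "data/spam.csv", columns, dataset_id=13)
print("id" in json.loads(create_req.data), json.loads(update_req.data)["id"])
```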
Delete Dataset
- datasetId: 98
- projectId: 33
An example request for deleting a dataset:
curl -X DELETE --header 'Accept: text/plain' 'http://localhost:8080/api/v1/datasets/98?projectId=33'
An example response:
Dataset with id 98 deleted successfully
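In the request above, datasetId becomes the last path segment and projectId a query parameter. A small Python sketch of that mapping follows; `build_dataset_request` is a hypothetical helper, and the base URL is assumed to match the curl example. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # assumption: same host and port as the curl example


def build_dataset_request(method, dataset_id, project_id, accept):
    """Build (but do not send) a request addressed to a single dataset."""
    query = urlencode({"projectId": project_id})
    return Request(
        f"{BASE_URL}/api/v1/datasets/{dataset_id}?{query}",
        headers={"Accept": accept},
        method=method,
    )


delete_req = build_dataset_request("DELETE", 98, 33, "text/plain")
print(delete_req.get_method(), delete_req.full_url)
```

The same URL shape with the GET method and an `application/json` Accept header addresses a single dataset for reading.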
Get Dataset by Id
- datasetId: 65
- projectId: 33
An example request for getting a dataset by id:
curl -X GET --header 'Accept: application/json' 'http://localhost:8080/api/v1/datasets/65?projectId=33'
An example response:
{
"id": 65,
"userId": 33,
"uuid": "1e13ec2a-4094-405e-a6e7-ffed3bd027f7",
"version": 0,
"name": "Test-dataset",
"category": null,
"description": "Test",
"header": true,
"readOptions": null,
"path": "/user/sparkflows/Clickthru.csv",
"delimiter": ",",
"datasetType": "CSV",
"filterLinesContaining": null,
"datasetSchema": "{colNames:[\"Timestamp\",\"UserId\",\"IP Address\",\"Product Id\"],colTypes:[\"STRING\",\"INTEGER\",\"STRING\",\"INTEGER\"],colFormats:[\"\",\"\",\"\",\"\"],colMLTypes:[\"TEXT\",\"NUMERIC\",\"TEXT\",\"NUMERIC\"]}",
"dateCreated": 1566880637842,
"dateLastUpdated": 1566880637846,
"permission": null,
"readOptionsModel": null,
"schemaModel": {
"schemaColList": [
{
"colName": "Timestamp",
"colType": "STRING",
"colFormat": "",
"colMLType": "TEXT"
},
{
"colName": "UserId",
"colType": "INTEGER",
"colFormat": "",
"colMLType": "NUMERIC"
},
{
"colName": "IP Address",
"colType": "STRING",
"colFormat": "",
"colMLType": "TEXT"
},
{
"colName": "Product Id",
"colType": "INTEGER",
"colFormat": "",
"colMLType": "NUMERIC"
}
]
},
"sampleData": {
"headers": [
"Timestamp",
"UserId",
"IP Address",
" Product Id"
],
"cells": [
[
"9:03 AM",
"275",
"207.51.113.192",
"1"
],
[
"12:57 AM",
"586",
"62.34.98.94",
"2"
],
[
"2:45 AM",
"508",
"20.237.172.182",
"3"
],
[
"2:13 PM",
"378",
"69.215.255.150",
"4"
],
[
"9:27 AM",
"965",
"56.101.183.251",
"5"
],
[
"8:18 AM",
"263",
"9.151.97.180",
"6"
],
[
"9:40 AM",
"670",
"101.195.1.186",
"7"
],
[
"7:14 AM",
"447",
"232.29.216.95",
"8"
],
[
"12:57 AM",
"33",
"85.119.50.62",
"9"
],
[
"12:56 AM",
"589",
"185.132.243.178",
"10"
],
[
"11:04 PM",
"22",
"120.212.232.218",
"11"
],
[
"8:29 PM",
"504",
"226.70.25.117",
"12"
],
[
"5:18 PM",
"228",
"213.53.100.18",
"13"
],
[
"2:56 PM",
"536",
"60.65.25.167",
"14"
],
[
"3:57 AM",
"46",
"149.156.17.120",
"15"
],
[
"8:05 AM",
"812",
"23.213.182.107",
"16"
],
[
"12:02 PM",
"980",
"93.20.165.16",
"17"
],
[
"12:53 PM",
"915",
"24.180.112.147",
"18"
],
[
"11:32 AM",
"814",
"110.81.139.11",
"19"
],
[
"11:01 PM",
"429",
"115.123.246.193",
"20"
]
]
},
"json": "{\"id\":65,\"userId\":33,\"uuid\":\"1e13ec2a-4094-405e-a6e7-ffed3bd027f7\",\"version\":0,\"name\":\"Test-dataset\",\"description\":\"Test\",\"header\":true,\"path\":\"/user/sparkflows/Clickthru.csv\",\"delimiter\":\",\",\"datasetType\":\"CSV\",\"datasetSchema\":\"{colNames:[\\\"Timestamp\\\",\\\"UserId\\\",\\\"IP Address\\\",\\\"Product Id\\\"],colTypes:[\\\"STRING\\\",\\\"INTEGER\\\",\\\"STRING\\\",\\\"INTEGER\\\"],colFormats:[\\\"\\\",\\\"\\\",\\\"\\\",\\\"\\\"],colMLTypes:[\\\"TEXT\\\",\\\"NUMERIC\\\",\\\"TEXT\\\",\\\"NUMERIC\\\"]}\",\"dateCreated\":\"Aug 27, 2019 4:37:17 AM\",\"dateLastUpdated\":\"Aug 27, 2019 4:37:17 AM\",\"schemaModel\":{\"schemaColList\":[{\"colName\":\"Timestamp\",\"colType\":\"STRING\",\"colFormat\":\"\",\"colMLType\":\"TEXT\"},{\"colName\":\"UserId\",\"colType\":\"INTEGER\",\"colFormat\":\"\",\"colMLType\":\"NUMERIC\"},{\"colName\":\"IP Address\",\"colType\":\"STRING\",\"colFormat\":\"\",\"colMLType\":\"TEXT\"},{\"colName\":\"Product Id\",\"colType\":\"INTEGER\",\"colFormat\":\"\",\"colMLType\":\"NUMERIC\"}]},\"projectId\":33}",
"projectId": 33
}
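A client will usually want only a few fields from this response. The Python sketch below reads the column types from schemaModel and converts dateCreated to a timestamp. It assumes dateCreated is epoch milliseconds in UTC, which is consistent with the formatted date in the json field but is not stated by the API. It uses schemaModel rather than datasetSchema because the datasetSchema string in the example has unquoted keys and is not valid JSON. The helper names are hypothetical.

```python
from datetime import datetime, timezone

# Trimmed copy of the example response above (only the fields used here).
response = {
    "id": 65,
    "name": "Test-dataset",
    "dateCreated": 1566880637842,
    "schemaModel": {
        "schemaColList": [
            {"colName": "Timestamp", "colType": "STRING", "colFormat": "", "colMLType": "TEXT"},
            {"colName": "UserId", "colType": "INTEGER", "colFormat": "", "colMLType": "NUMERIC"},
            {"colName": "IP Address", "colType": "STRING", "colFormat": "", "colMLType": "TEXT"},
            {"colName": "Product Id", "colType": "INTEGER", "colFormat": "", "colMLType": "NUMERIC"},
        ]
    },
}


def column_types(dataset):
    """Map each column name to its declared type."""
    return {col["colName"]: col["colType"] for col in dataset["schemaModel"]["schemaColList"]}


def created_at(dataset):
    """Convert dateCreated to a datetime (assumption: epoch milliseconds, UTC)."""
    return datetime.fromtimestamp(dataset["dateCreated"] / 1000, tz=timezone.utc)


types = column_types(response)
print(types)
print(created_at(response).isoformat(timespec="seconds"))
```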
Get Dataset Count
Returns the count of datasets available:
curl -X GET --header 'Accept: application/json' --header 'api_key: cookies' 'http://localhost:8080/api/v1/datasets/count'
Get sample data
Delimiter and header are optional values.
- path: data/spam.csv
- schema: {"colNames":["0.0","this is not a spam","3.0"],"colTypes":["DOUBLE","STRING","DOUBLE"],"colFormats":["","",""],"colMLTypes":["NUMERIC","TEXT","NUMERIC"]}
CURL:
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' --header 'api_key: cookies' -d '{"colNames":["0.0","this is not a spam","3.0"],"colTypes":["DOUBLE","STRING","DOUBLE"],"colFormats":["","",""],"colMLTypes":["NUMERIC","TEXT","NUMERIC"]}' http://localhost:8080/api/v1/datasets/sample-data
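The schema in this request is written as four parallel arrays, whereas the save and get-by-id examples describe columns as a schemaColList of objects. The following Python sketch converts one form into the other so the request body can be generated from an existing column list. `to_parallel_schema` is a hypothetical helper, and the field names are taken from the examples in this page.

```python
import json


def to_parallel_schema(schema_col_list):
    """Convert a list of column objects into the parallel-array schema form."""
    return {
        "colNames": [col["colName"] for col in schema_col_list],
        "colTypes": [col["colType"] for col in schema_col_list],
        "colFormats": [col.get("colFormat", "") for col in schema_col_list],
        "colMLTypes": [col["colMLType"] for col in schema_col_list],
    }


cols = [
    {"colName": "0.0", "colType": "DOUBLE", "colFormat": "", "colMLType": "NUMERIC"},
    {"colName": "this is not a spam", "colType": "STRING", "colFormat": "", "colMLType": "TEXT"},
    {"colName": "3.0", "colType": "DOUBLE", "colFormat": "", "colMLType": "NUMERIC"},
]

# Compact separators reproduce the body used in the curl example.
body = json.dumps(to_parallel_schema(cols), separators=(",", ":"))
print(body)
```

Because every array is derived from the same list, the four arrays always have the same length and the same column order.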
Returns schema of the files in the given path using the given delimiter
- Delimiter and header are optional values.
- path: data/spam.csv
- schema: {"colNames":["0.0","this is not a spam","3.0"],"colTypes":["DOUBLE","STRING","DOUBLE"],"colFormats":["","",""],"colMLTypes":["NUMERIC","TEXT","NUMERIC"]}
CURL:
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' --header 'api_key: cookies' -d '{"colNames":["0.0","this is not a spam","3.0"],"colTypes":["DOUBLE","STRING","DOUBLE"],"colFormats":["","",""],"colMLTypes":["NUMERIC","TEXT","NUMERIC"]}' http://localhost:8080/api/v1/datasets/schema
Get Latest Five Datasets
Returns the five most recently updated datasets:
curl -X GET --header 'Accept: application/json' --header 'api_key: cookies' 'http://localhost:8080/api/v1/datasets/latest'
Get the list of files/directories in the given path
- path: data/transaction.csv
CURL:
curl -X GET --header 'Content-Type: application/json' --header 'Accept: application/json' -d 'data/transaction.csv' 'http://localhost:8080/filesInPathJSON' -b /tmp/cookies.txt