Reading/Writing from S3

Fire is fully integrated with AWS S3. The Dataset Processors of Fire can read data directly from S3 if the access policies allow it.
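To make "if the access policies allow it" concrete, here is a minimal sketch of the kind of IAM policy document that grants list, read, and write access on a single bucket. The helper name `build_s3_access_policy` is hypothetical, and the exact set of actions Fire needs is an assumption, not something this page states. The bucket name is the sample bucket used in the examples on this page.

```python
import json


def build_s3_access_policy(bucket):
    """Return an IAM policy document (as a dict) for one S3 bucket.

    Assumption: listing the bucket plus getting and putting objects is
    enough for reading and writing datasets. Adjust to your environment.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reading and writing apply to the objects in the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


policy = build_s3_access_policy("sparkflow-sample-data")
print(json.dumps(policy, indent=2))
```

The printed JSON could be attached to whichever role or user the Fire process runs as. How credentials are supplied to Fire is outside the scope of this sketch.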

Dataset Processors

Dataset Processors include:

The path specified for reading from S3 takes the form s3://…
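As a rough illustration of how such a path breaks down, the sketch below splits an S3 path into its scheme, bucket, and object key using only the Python standard library. The function name `split_s3_path` is hypothetical, and the list of accepted schemes is an assumption. The sample path is the one used in the Read CSV example on this page.

```python
from urllib.parse import urlparse


def split_s3_path(path):
    """Split an S3 path into (scheme, bucket, key).

    Assumption: only the s3, s3a, and s3n schemes are accepted here.
    """
    parsed = urlparse(path)
    if parsed.scheme not in ("s3", "s3a", "s3n"):
        raise ValueError(f"not an S3 path: {path!r}")
    if not parsed.netloc:
        raise ValueError(f"missing bucket name in: {path!r}")
    # The bucket is the authority part; the key is the path without
    # its leading slash.
    return parsed.scheme, parsed.netloc, parsed.path.lstrip("/")


scheme, bucket, key = split_s3_path(
    "s3a://sparkflow-sample-data/data/Clickthru.csv"
)
print(scheme, bucket, key)
```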

Below is an example Workflow. It reads a CSV file from S3, parses it and prints out the first 10 records.

In the dialog box of the Read CSV processor the path is specified as s3a://sparkflow-sample-data/data/Clickthru.csv
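For readers who want to see what this workflow amounts to in code, here is a rough PySpark equivalent. It is a sketch, not Fire's actual implementation. The function name `read_csv_preview` is hypothetical, and the `header` and `inferSchema` options are assumptions about how the file should be parsed. The `spark` argument is expected to be a `pyspark.sql.SparkSession` that already has working S3 credentials and the S3A connector on its classpath.

```python
def read_csv_preview(spark, path, n=10):
    """Read a CSV file at `path` and return its first `n` rows.

    `spark` is expected to be a pyspark.sql.SparkSession configured
    for S3 access.
    """
    df = (
        spark.read
        .option("header", "true")       # assumption: first line holds column names
        .option("inferSchema", "true")  # assumption: let Spark infer column types
        .csv(path)
    )
    # take(n) returns a list of at most n Row objects.
    return df.take(n)
```

With a live session, a call might look like `read_csv_preview(spark, "s3a://sparkflow-sample-data/data/Clickthru.csv")`, which mirrors the "first 10 records" step of the workflow.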

Below is an example Workflow. It reads a CSV file and saves it to the specified S3 path.

In the dialog box of the Save CSV processor, the path is specified as s3a://sparkflow-sample-data/write/
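A rough PySpark equivalent of the save step is sketched below. The function name `save_csv` is hypothetical, and writing a header row is an assumption. The default mode, `errorifexists`, is Spark's own default and fails if the target path already exists. Note that Spark writes a directory of part files under the given path rather than a single CSV file.

```python
def save_csv(df, path, mode="errorifexists"):
    """Write the DataFrame `df` as CSV part files under `path`.

    `mode` is passed to DataFrameWriter.mode; common values are
    "errorifexists", "overwrite", "append", and "ignore".
    """
    (
        df.write
        .mode(mode)
        .option("header", "true")  # assumption: include column names
        .csv(path)
    )
```

With a live DataFrame, a call might look like `save_csv(df, "s3a://sparkflow-sample-data/write/")`.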

Execution Result

Once the above workflow has completed successfully, the saved data can be viewed under DATABROWSERS/AWS S3 Location at the specified path.

Below is an example workflow in Sparkflows, where data is read from S3 and the final Spark ML model is saved to an S3 location.

Workflow:

Configure ReadCSV

Configure SaveMlModel
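Outside of Fire, persisting a fitted Spark ML model to a path follows a similar pattern. The sketch below uses the standard `write()`, `overwrite()`, and `save()` calls that PySpark exposes on writable ML models. The function name `save_ml_model` is hypothetical, and whether the SaveMlModel processor overwrites existing output is an assumption this page does not confirm.

```python
def save_ml_model(model, path, overwrite=False):
    """Persist a fitted Spark ML model (or PipelineModel) to `path`.

    `model` is expected to be an MLWritable object from pyspark.ml,
    for example the result of Pipeline.fit(...).
    """
    writer = model.write()
    if overwrite:
        # Replace any model already stored at the target path.
        writer = writer.overwrite()
    writer.save(path)
```

The `path` argument would be an S3 location of the same form as in the CSV examples, with a bucket and prefix of your choosing.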

Execution Result:

Below is an example workflow in Sparkflows, where the final H2O ML model is saved to an S3 location.

Workflow:

Configure Save H2O ML Model

Execution Result: