Sparkflows
01-IO
02-ReadStructured
  ReadExcel
  EmptyDataset
  ReadCSV
  ReadAvro
  ReadXML
  QueryJDBCConnection
  JDBCIncrementalLoad
  DB2 JDBC
  ReadParquet
  ReadDatabricksTable
  JDBCConnection
  CreateDataset
  DatasetStructured
  ReadJDBC
  ReadHanaCsv
  URLSingleRecordJSONReader
  ReadLibsvm
  ReadJSON
  URLTextFileReader
  ReadShapeFile

03-ReadUnstructured
  TextFiles
  WholeTextFiles
  Tika
  PDF
  PDFImageOCR
  BinaryFiles

03-Save
  SaveJDBC
  UpsertJDBC
  SaveCSV
  SaveJSON
  KafkaProducer
  SaveParquet
  SaveORC
  InsertIntoHIVETable
  SaveAsHIVETable
  SaveAvro

01-Connectors
  Salesforce
  ReadMarketo
  SaveRedshift-AWS
  WriteToSnowFlake
  SaveCassandra
  ExecuteQueryInSnowFlake
  ReadMongoDB
  SaveMongoDB
  ReadDatabricksTable
  ReadHIVETable
  SaveHBase
  SaveElasticSearch
  ReadFromSnowFlake
  SFTP
  ReadCassandra
  SaveDatabricksTable
  ReadRedshift-AWS
  ReadElasticSearch