Configuring Pipelines

Fire uses Apache Airflow for executing Pipelines. Hence Airflow has to be installed on the same machine as Fire.

Below are the configurations needed in Fire for Airflow.

Pipelines Configuration

Airflow Installation

It explain the steps involved in installing Airflow on Centos and RHEL. Detailed Airflow Install Instructions is at:

https://airflow.apache.org/installation.html

  • Login to machine
  • Before installing airflow update installed package:
  • yum -y update
  • Install python-pip and any required packages:
  • sudo yum install epel-release
  • sudo yum install python-pip
  • Check the version of pip that is installed and if reqd upgrade:
  • pip -V
  • pip install –upgrade setuptools
  • Note that for 1.10 you now need to preface install commands or export this env var:
  • export SLUGIFY_USES_TEXT_UNIDECODE=yes
  • Install gcc , gcc-c++ and dependencies for python 2.7
  • sudo yum -y install gcc gcc-c++ kernel-devel
  • sudo yum -y install python-devel libxslt-devel libffi-devel openssl-devel
  • Airflow needs a home, ~/airflow is the default
  • export AIRFLOW_HOME=~/airflow
  • Install from pypi using pip
  • pip install apache-airflow
  • To check airflow version
  • airflow version
airflow
  • Generate a Fernet key for Airflow(optional)
  • python -c “from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())”
  • fgrc0MPUG1n3Q352Fp705A-bysNHX6EFRr7nYFTmXXA=
  • update in airflow.cfa
  • fernet key: fgrc0MPUG1n3Q352Fp705A-bysNHX6EFRr7nYFTmXXA=
  • Initialize the Airflow database
  • airflow initdb
  • Start the web server, its default port is 8080, If any other application is running on 8080, we can update other port for airflow
  • airflow webserver -p 8090
airflow
  • Start the scheduler
  • airflow scheduler
airflow
  • Login in browser
airflow