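In the previous post we installed Redis as the message queue; this time we install Airflow itself.
This guide installs into a Python virtual environment. The steps are the same without one; I use a virtual environment only because I do not want to disturb the system Python environment.
Install the Python virtual environment tool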
pip install virtualenv
Set the environment variables
sudo vi /etc/profile
Append the following to the end of the file
export PYTHON_HOME=/usr/local/python3
export PATH=$PATH:$PYTHON_HOME/bin
source /etc/profile
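Create a directory to hold the virtual environment
mkdir -p /softwares/pyenv_for_airflow
cd /softwares/pyenv_for_airflow/
Create the Python virtual environment
# on virtualenv 20 and later, drop --no-site-packages: the flag was removed and isolation is the default
virtualenv --no-site-packages airflow_env
Grant execute permission
chmod +x -R *
Activate the virtual environment
cd airflow_env/bin
source ./activate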
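To confirm that the activation took effect, one option is to ask the interpreter itself. The sketch below is a minimal check and not part of the original steps; `in_virtualenv` is a hypothetical helper name. It relies on `sys.prefix` differing from `sys.base_prefix` inside an environment, and also looks for `sys.real_prefix`, which older virtualenv releases set.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix inside an environment.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases make sys.prefix differ from sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

Run it with the `python` that is first on PATH after `source ./activate`; the printed prefix should point into the airflow_env directory.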
Install the dependencies
yum -y install gcc zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel
yum -y install python-devel mysql-devel
yum -y install python3-devel
yum -y install cyrus-sasl cyrus-sasl-devel cyrus-sasl-lib
pip install paramiko
pip install pymysql
pip install sqlalchemy
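Before moving on, it can help to verify that the three pip packages above are actually importable from the active environment. The following is a minimal sketch under that assumption; `missing_modules` is a hypothetical helper, and the list of import names is taken from the pip commands above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names of the three packages installed with pip above.
    required = ["paramiko", "pymysql", "sqlalchemy"]
    missing = missing_modules(required)
    if missing:
        print("missing packages:", ", ".join(missing))
    else:
        print("all required packages are importable")
```

The check only looks the modules up without importing them, so it is quick and has no side effects.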
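Add the Airflow environment variables to /etc/profile
vi /etc/profile
export AIRFLOW_HOME=/softwares/airflow
export SLUGIFY_USES_TEXT_UNIDECODE=yes
# take effect immediately
source /etc/profile
Install Airflow with all extras
pip install 'apache-airflow[all]'
# I chose the full install because when I tried installing only some of the extras, certain features ran into bugs.
Initialize the database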
cd /softwares/pyenv_for_airflow/airflow_env/lib/python3.7/site-packages/airflow/bin
./airflow initdb
View the files it generated
cd /softwares/airflow/
Create the MySQL backend database
<code>create database airflow_db default charset utf8 collate utf8_general_ci;
create user 'airflow'@'%' identified by 'airflow_db';
create user 'airflow'@'localhost' identified by 'airflow_db';
grant all on airflow_db.* to 'airflow'@'%';
grant all on airflow_db.* to 'airflow'@'localhost';
flush privileges;
-- Alternative: use this block instead of the one above for the utf8mb4 character set
create database airflow_db default charset utf8mb4 collate utf8mb4_unicode_ci;
create user 'airflow'@'%' identified by 'airflow_db';
create user 'airflow'@'localhost' identified by 'airflow_db';
grant all on airflow_db.* to 'airflow'@'%';
grant all on airflow_db.* to 'airflow'@'localhost';
flush privileges;/<code>
· Configure Airflow to use the LocalExecutor and the MySQL database
vi /softwares/airflow/airflow.cfg
executor = LocalExecutor
sql_alchemy_conn = mysql://root:[email protected]:3306/airflow_db
[webserver]
base_url = http://airflow.mn01:8085
web_server_port = 8085
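The `sql_alchemy_conn` value is a SQLAlchemy database URL, so characters such as `@` or `:` in a password have to be percent-encoded or the URL is parsed incorrectly. The sketch below shows one way to assemble such a URL; `build_mysql_url` is a hypothetical helper and the credentials are placeholders, not the ones used in this setup. Any further escaping that airflow.cfg itself may require for `%` is not covered here.

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, port, database):
    """Build a mysql:// SQLAlchemy URL, percent-encoding the user and password."""
    return "mysql://{}:{}@{}:{}/{}".format(
        quote_plus(user), quote_plus(password), host, port, database
    )


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_mysql_url("airflow", "p@ss:word", "127.0.0.1", 3306, "airflow_db"))
    # prints mysql://airflow:p%40ss%3Aword@127.0.0.1:3306/airflow_db
```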
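Adjust the time zone
default_timezone = Asia/Shanghai
Three files also need to be modified
#1. Fix the time shown in the top-right corner of the webserver page:
# (if Airflow was installed into the virtual environment, look under /softwares/pyenv_for_airflow/airflow_env/lib/python3.7/site-packages instead)
vi ${PYTHON_HOME}/lib/python3.7/site-packages/airflow/www/templates/admin/master.html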
<code>var UTCseconds = (x.getTime() + x.getTimezoneOffset()*60*1000);
$("#clock").clock({
"dateFormat":"Y-m-d ",
"timeFormat":"H:i:s %UTC%",
"timestamp":UTCseconds
}).click(function(){
alert('{{ hostname }}');
});
Change it to:
var UTCseconds = x.getTime();
$("#clock").clock({
"dateFormat":"Y-m-d ",
"timeFormat":"H:i:s",
"timestamp":UTCseconds
}).click(function(){
alert(
/<code>
#2. Modify airflow/utils/timezone.py
<code># Add the following below the line utc = pendulum.timezone('UTC') (line 27)
from airflow import configuration as conf
try:
    tz = conf.get("core", "default_timezone")
    if tz == "system":
        utc = pendulum.local_timezone()
    else:
        utc = pendulum.timezone(tz)
except Exception:
    pass
# Modify the utcnow() function (at line 69)
# d = dt.datetime.utcnow()
d = dt.datetime.now()/<code>
#3. Modify airflow/utils/sqlalchemy.py
<code># Add the following below the line utc = pendulum.timezone('UTC') (line 37)
from airflow import configuration as conf
try:
    tz = conf.get("core", "default_timezone")
    if tz == "system":
        utc = pendulum.local_timezone()
    else:
        utc = pendulum.timezone(tz)
except Exception:
    pass/<code>
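The logic of the two patches above can be tried outside Airflow. The sketch below mirrors it with the standard library `zoneinfo` module (Python 3.9 or later) in place of pendulum, so it is an illustration of the idea rather than the patched Airflow code; `resolve_timezone` is a hypothetical name. It maps a `default_timezone`-style setting to a tzinfo object and keeps UTC when the name is unknown, just as the `except`/`pass` in the patch leaves `utc` unchanged.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def resolve_timezone(name):
    """Map a default_timezone-style setting to a tzinfo, falling back to UTC."""
    if name == "system":
        # Use whatever time zone the local machine is configured with.
        return datetime.now().astimezone().tzinfo
    try:
        return ZoneInfo(name)
    except (ZoneInfoNotFoundError, ValueError):
        # Unknown or malformed names keep the UTC default.
        return timezone.utc


if __name__ == "__main__":
    tz = resolve_timezone("Asia/Shanghai")
    print("configured zone:", tz)
    print("aware now:", datetime.now(tz).isoformat())
```

Note that `datetime.now(tz)` returns a time-zone-aware value, whereas the `dt.datetime.now()` in the patch is naive local time, so the patch depends on the server clock being set to the intended zone.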
Re-initialize the database
./airflow initdb
Start the service
cd /softwares/pyenv_for_airflow/airflow_env/lib/python3.7/site-packages/airflow/bin
./airflow webserver -D
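Possible errors
<code>Error 1:
Startup may fail with: FileNotFoundError: [Errno 2] No such file or directory: 'gunicorn', meaning gunicorn cannot be found.
When airflow webserver starts, it calls subprocess.Popen to create a child process; the webserver runs on gunicorn, with these launch arguments:
1: ['gunicorn', '-w', '4', '-k', 'sync', '-t', '120', '-b', '0.0.0.0:8080', '-n', 'airflow-webserver', '-p', '/home/admin/airflow/airflow-webserver.pid', '-c', 'airflow.www.gunicorn_config', '--access-logfile', '-', '--error-logfile', '-', 'airflow.www.app:cached_app()']
The gunicorn launch fails because the command cannot be found in PATH.
Create a symlink for gunicorn (adjust the source path to wherever gunicorn is installed on your machine)
ln -fs /home/admin/python3.6/bin/gunicorn/bin/gunicorn /bin/gunicorn
Or add /usr/local/python3/bin to PATH: export PATH=$PATH:/usr/local/python3/bin
# take effect immediately
source /etc/profile
Error 2:
The webserver may fail to start; check the err log.
The usual error says that a pid already exists; in that case delete the airflow-webserver-monitor.pid file in the airflow directory./<code>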
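Both errors can be diagnosed before starting the webserver. The sketch below is a rough pre-flight check under stated assumptions: the helper names are hypothetical, the pid file location is the /softwares/airflow directory used in this guide, and the process check relies on POSIX `os.kill` with signal 0. It reports whether gunicorn is on PATH and whether the pid file points at a process that no longer exists.

```python
import os
import shutil


def pid_is_running(pid):
    """Return True if a process with this pid currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def pidfile_is_stale(path):
    """Return True if the pid file exists but does not name a running process."""
    if not os.path.isfile(path):
        return False  # nothing to clean up
    try:
        with open(path) as handle:
            pid = int(handle.read().strip())
    except ValueError:
        return True  # unreadable content cannot belong to a live process
    if pid <= 0:
        return True  # not a valid single-process id
    return not pid_is_running(pid)


if __name__ == "__main__":
    print("gunicorn found at:", shutil.which("gunicorn"))
    pidfile = "/softwares/airflow/airflow-webserver-monitor.pid"
    print("stale pid file:", pidfile_is_stale(pidfile))
```

Deleting the pid file only when it is stale avoids removing it from under a webserver that is still running.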
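Start the other services
./airflow scheduler -D
# the worker is not needed in LocalExecutor mode
./airflow worker -D
# start flower
./airflow flower -D
The default port is 5555; enter "http://hostip:5555" in the browser address bar to open flower and monitor the Celery message queue.
Set the services to start on boot
#1. Create the startup shell script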
cd /softwares/
mkdir shellscripts
cd shellscripts/
touch startairflow.sh
vi startairflow.sh
<code>#!/bin/bash
# chkconfig: 2345 10 90
# description: Airflow start-on-boot script
# A leftover pid file makes startup fail, so delete any existing pid files before starting the services
airflow_path="/softwares/airflow/"
airflow_webserver_monitor_name="airflow-webserver-monitor.pid"
airflow_webserver_pid_name="airflow-webserver.pid"
airflow_scheduler_pid_name="airflow-scheduler.pid"
airflow_worker_pid_name="airflow-worker.pid"
if [ -d "$airflow_path" ]; then
    echo "$airflow_path exists"
    cd "$airflow_path" || exit 1
    if [ -f "$airflow_webserver_monitor_name" ]; then
        echo "$airflow_webserver_monitor_name exists, deleting it"
        rm -f "$airflow_webserver_monitor_name"
    fi
    if [ -f "$airflow_webserver_pid_name" ]; then
        echo "$airflow_webserver_pid_name exists, deleting it"
        rm -f "$airflow_webserver_pid_name"
    fi
    if [ -f "$airflow_scheduler_pid_name" ]; then
        echo "$airflow_scheduler_pid_name exists, deleting it"
        rm -f "$airflow_scheduler_pid_name"
    fi
    if [ -f "$airflow_worker_pid_name" ]; then
        echo "$airflow_worker_pid_name exists, deleting it"
        rm -f "$airflow_worker_pid_name"
    fi
fi
# Enter the Python virtual environment directory
cd /softwares/pyenv_for_airflow/airflow_env/bin
# Activate the virtual environment
source ./activate
# Start the Airflow services
/softwares/pyenv_for_airflow/airflow_env/lib/python3.7/site-packages/airflow/bin/airflow webserver -D
/softwares/pyenv_for_airflow/airflow_env/lib/python3.7/site-packages/airflow/bin/airflow scheduler -D
# LocalExecutor mode does not need a worker
#/softwares/pyenv_for_airflow/airflow_env/lib/python3.7/site-packages/airflow/bin/airflow worker -D /<code>
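#2. Copy the bash script to init.d
sudo cp startairflow.sh /etc/init.d/startairflow
#3. Register it to start on boot
# add execute permission
cd /etc/init.d/
sudo chmod +x startairflow
# enable start on boot
sudo chkconfig startairflow on
# check that it was registered; runlevels 2345 showing "on" means it is set up correctly
chkconfig --list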
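· Add the airflow command to the PATH system variable so that you do not have to change to the airflow bin directory every time
sudo vi /etc/profile
# append the following to the end (for a virtual environment install, point AIRFLOW_CLI_HOME at /softwares/pyenv_for_airflow/airflow_env/lib/python3.7/site-packages/airflow/ instead)
export AIRFLOW_CLI_HOME=/usr/local/python3/lib/python3.7/site-packages/airflow/
export PATH=$PATH:$AIRFLOW_CLI_HOME/bin
# take effect immediately
source /etc/profile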