
How to use Fluentd Docker

Sending logs through fluentd using Docker

You can send logs to the Analytics server using Docker, without installing Fluentd directly. When you run hive_fluentd.sh, it automatically builds and runs a Docker image.


Before you start

Prepare an environment where Docker can be used, and download the automation script.

Setting up the Docker environment

To use Docker, you need to install docker engine and docker-compose. For details, see the official Docker documentation.
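As a quick sanity check, a sketch like the following reports whether both tools are on the PATH (illustrative only; on Docker Compose v2 installations the command is `docker compose` rather than `docker-compose`):

```shell
# Print an installed/not-found status line for each required tool.
for tool in docker docker-compose; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: installed"
  else
    echo "$tool: not found"
  fi
done
```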

Downloading the automation script

Download and extract the file:

wget https://developers.withhive.com/wp-content/uploads/2024/08/hive_fluentd_docker.tar  # Download the file
tar -xvf hive_fluentd_docker.tar # Extract the file


After extracting the archive, you will find the configuration file hive.conf and the script file hive_fluentd.sh in the hive_fluentd_docker folder.

Configuring hive.conf

There are configurations for sending logs through a language-specific library (Java, Python, etc.) and for sending specific log files.

Configuration for using a language-specific library

Configure only the build environment (sandbox or live). See the example below.

### hive.conf ###
# Only use either sandbox or live for the build environment. logs are sent to the Hive server corresponding to each environment.
# Do not modify any other text
build:sandbox

### Don't delete this line ####


When sending logs through a library, Docker forwards the logs to the log server through port 24224.

Configuration for sending specific log files

Enter the file path (absolute path) and tag. If there are multiple log folders, you must add a path and tag together for each of them.

The following is an example with a single log folder.

### hive.conf ###

build:sandbox # Only use either sandbox or live for the build environment. logs are sent to the Hive server corresponding to each environment.

# Path: folder path where the file to be sent is located (absolute path).
# Text lines added to all files in that path after fluentd is run are sent.
path:/home/user1/docker/shell_test/game

# Tag: tag name to be applied to the file, ha2union.game.log category name
tag:ha2union.game.log.test

### Don't delete this line ####


The following is an example with multiple log folders.

### hive.conf ###

build:sandbox # Only use either sandbox or live for the build environment. logs are sent to the Hive server corresponding to each environment.

# Path: folder path where the file to be sent is located (absolute path).
# Text lines added to all files in that path after fluentd is run are sent.
path:/home/user1/docker/shell_test/game

# Tag: tag name to be applied to the file, ha2union.game.log category name
tag:ha2union.game.log.test

# To add multiple file paths, follow the method below.
# Path and tag must be added together; otherwise, an error will occur if either is empty.
path:/home/user1/docker/shell_test/game2
tag:ha2union.game.log.test2

path:/home/user1/docker/shell_test/game3
tag:ha2union.game.log.test3

### Don't delete this line ####

Do not modify the text at the top and bottom of the configuration file.

build

This is the type of Analytics server the logs are sent to. It can be sandbox or live, meaning the Analytics sandbox server or the Analytics live server.

path

This is the folder path (absolute path) where the log files are located. The log files must be in JSON format. After the hive_fluentd.sh script runs successfully, lines added from that point on to any file under that path are sent to the server. This means that even if the folder already contains log files, once Docker starts operating it reads and sends logs starting from the newly added lines. The read position is remembered, so even if Docker restarts, sending resumes from the line after the last one transmitted.
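For illustration, appending a complete JSON line to a file under the configured path is all that is needed for it to be picked up. The directory and file name below are hypothetical stand-ins for the absolute path set in hive.conf:

```shell
# Hypothetical log directory standing in for the path: value in hive.conf.
LOG_DIR=./game
mkdir -p "$LOG_DIR"

# Each appended line must be one complete JSON object; only newly added lines are sent.
printf '{"category":"login","user_id":"u001"}\n' >> "$LOG_DIR/app.log"
tail -n 1 "$LOG_DIR/app.log"
```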

tag

Enter the tag to apply to the logs under path. The tag must be created in the format ha2union.game.name_to_be_created.
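A quick way to sanity-check a tag before writing it into hive.conf is a pattern match against the required ha2union.game. prefix (a validation sketch, not part of the script itself):

```shell
TAG="ha2union.game.log.test"   # tag to validate
case "$TAG" in
  ha2union.game.*) echo "valid tag: $TAG" ;;
  *)               echo "invalid tag: $TAG" ;;
esac
```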

Running and checking the automation script

After completing the configuration, run hive_fluentd.sh:

bash hive_fluentd.sh

Running the script automatically handles everything from building the Docker image to running Docker and transmitting logs. If an execution error occurs, check again that the configuration is set correctly. If the following message appears, the run succeeded. Ignore the message about pull access denied.

[+] Running 1/1
! fluentd Warning pull access denied for com2usplatform/hive_analytics_fluentd_docker, repository does not exist or may require 'docker login': deni...                  2.8s
[+] Building 2.1s (8/8) FINISHED                                                                                                                                docker:default
=> [fluentd internal] load build definition from Dockerfile                                                                                                              0.0s
=> => transferring dockerfile: 145B                                                                                                                                      0.0s
=> [fluentd internal] load metadata for docker.io/fluent/fluentd:v1.11.4-debian-1.0                                                                                      1.9s
=> [fluentd auth] fluent/fluentd:pull token for registry-1.docker.io                                                                                                     0.0s
=> [fluentd internal] load .dockerignore                                                                                                                                 0.0s
=> => transferring context: 2B                                                                                                                                           0.0s
=> [fluentd 1/2] FROM docker.io/fluent/fluentd:v1.11.4-debian-1.0@sha256:b70acf966c6117751411bb638bdaa5365cb756ad13888cc2ccc0ba479f13aee7                                0.0s
=> CACHED [fluentd 2/2] RUN ["gem", "install", "fluent-plugin-forest"]                                                                                                   0.0s
=> [fluentd] exporting to image                                                                                                                                          0.0s
=> => exporting layers                                                                                                                                                   0.0s
=> => writing image sha256:857dc72492380aeb31bbee36b701f13ae5ae1a933b46945662657246b28964a5                                                                              0.0s
=> => naming to docker.io/com2usplatform/hive_analytics_fluentd_docker:sandbox                                                                                           0.0s
=> [fluentd] resolving provenance for metadata file                                                                                                                      0.0s
[+] Running 2/2
 Network shell_test_default                       Created                                                                                                               0.1s
 Container hive_analytics_fluentd_docker_sandbox  Started                                                                                                               0.5s


Once execution completes successfully, it starts sending logs.

Note

Here you can find the commands, how to check logs, the file structure, and how to check forest logs.

Commands

The following is a collection of commands.

Restart

Regenerates the fluentd.conf and docker-compose files according to the hive.conf file, and restarts the container. Even after a restart, if the previously created folders and pos files still exist, sending resumes from the point after the last logs were sent.
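The resume behavior can be illustrated with a tiny sketch of how a position file works (illustrative only; fluentd's actual pos file format differs):

```shell
echo "line1" > demo.log
wc -c < demo.log > demo.pos                     # remember the byte offset already read
echo "line2" >> demo.log                        # new data arrives while stopped
tail -c +$(( $(cat demo.pos) + 1 )) demo.log    # resume: prints only "line2"
```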

bash hive_fluentd.sh restart

Stop

Stops the Docker container.

bash hive_fluentd.sh stop

Resume a stopped container

Resumes the stopped Docker container.

bash hive_fluentd.sh start

Remove the Docker container

Removes the container.

bash hive_fluentd.sh down

How to check logs

Check the logs with the command below:

bash hive_fluentd.sh logs


The following is an example of the logs when the command executes normally:

hive_analytics_fluentd_docker_sandbox  | 2024-08-19 07:19:31 +0000 [info]: parsing config file is succeeded path="/fluentd/etc/fluent.conf"
hive_analytics_fluentd_docker_sandbox  | 2024-08-19 07:19:31 +0000 [info]: gem 'fluent-plugin-forest' version '0.3.3'
hive_analytics_fluentd_docker_sandbox  | 2024-08-19 07:19:31 +0000 [info]: gem 'fluentd' version '1.11.4'
hive_analytics_fluentd_docker_sandbox  | 2024-08-19 07:19:31 +0000 [info]: adding forwarding server 'sandbox-analytics-hivelog' host="sandbox-analytics-hivelog.withhive.com" port=24224 weight=60 plugin_id="object:3fcfa2188060"
hive_analytics_fluentd_docker_sandbox  | 2024-08-19 07:19:31 +0000 [info]: using configuration file: 
hive_analytics_fluentd_docker_sandbox  | <ROOT>
hive_analytics_fluentd_docker_sandbox  |   <system>
hive_analytics_fluentd_docker_sandbox  |     log_level info
hive_analytics_fluentd_docker_sandbox  |   </system>
hive_analytics_fluentd_docker_sandbox  |   <source>
hive_analytics_fluentd_docker_sandbox  |     @type forward
hive_analytics_fluentd_docker_sandbox  |     skip_invalid_event true
hive_analytics_fluentd_docker_sandbox  |     chunk_size_limit 10m
hive_analytics_fluentd_docker_sandbox  |     chunk_size_warn_limit 9m
hive_analytics_fluentd_docker_sandbox  |     port 24224
hive_analytics_fluentd_docker_sandbox  |     bind "0.0.0.0"
hive_analytics_fluentd_docker_sandbox  |     source_address_key "fluentd_sender_ip"
hive_analytics_fluentd_docker_sandbox  |   </source>
hive_analytics_fluentd_docker_sandbox  |   <match ha2union.**>
hive_analytics_fluentd_docker_sandbox  |     @type copy
hive_analytics_fluentd_docker_sandbox  |     <store>
hive_analytics_fluentd_docker_sandbox  |       @type "forest"
hive_analytics_fluentd_docker_sandbox  |       subtype "file"
hive_analytics_fluentd_docker_sandbox  |       <template>
hive_analytics_fluentd_docker_sandbox  |         time_slice_format %Y%m%d%H
hive_analytics_fluentd_docker_sandbox  |         time_slice_wait 10s
hive_analytics_fluentd_docker_sandbox  |         path /fluentd/forest/${tag}/${tag}
hive_analytics_fluentd_docker_sandbox  |         compress gz
hive_analytics_fluentd_docker_sandbox  |         format json
hive_analytics_fluentd_docker_sandbox  |       </template>
hive_analytics_fluentd_docker_sandbox  |     </store>
hive_analytics_fluentd_docker_sandbox  |     <store>
hive_analytics_fluentd_docker_sandbox  |       @type "forward"
hive_analytics_fluentd_docker_sandbox  |       <buffer>
hive_analytics_fluentd_docker_sandbox  |         @type "file"
hive_analytics_fluentd_docker_sandbox  |         path "/fluentd/buffer/foward_buffer/"
hive_analytics_fluentd_docker_sandbox  |         chunk_limit_size 10m
hive_analytics_fluentd_docker_sandbox  |         flush_interval 3s
hive_analytics_fluentd_docker_sandbox  |         total_limit_size 16m
hive_analytics_fluentd_docker_sandbox  |         flush_thread_count 16
hive_analytics_fluentd_docker_sandbox  |         queued_chunks_limit_size 16
hive_analytics_fluentd_docker_sandbox  |       </buffer>
hive_analytics_fluentd_docker_sandbox  |       <server>
hive_analytics_fluentd_docker_sandbox  |         name "sandbox-analytics-hivelog"
hive_analytics_fluentd_docker_sandbox  |         host "sandbox-analytics-hivelog.withhive.com"
hive_analytics_fluentd_docker_sandbox  |         port 24224
hive_analytics_fluentd_docker_sandbox  |       </server>
hive_analytics_fluentd_docker_sandbox  |       <secondary>
hive_analytics_fluentd_docker_sandbox  |         @type "secondary_file"
hive_analytics_fluentd_docker_sandbox  |         directory "/fluentd/failed/log/forward-failed/send-failed-file"
hive_analytics_fluentd_docker_sandbox  |         basename "dump.${tag}.${chunk_id}"
hive_analytics_fluentd_docker_sandbox  |       </secondary>
hive_analytics_fluentd_docker_sandbox  |     </store>
hive_analytics_fluentd_docker_sandbox  |   </match>
hive_analytics_fluentd_docker_sandbox  | </ROOT>
hive_analytics_fluentd_docker_sandbox  | 2024-08-19 07:19:31 +0000 [info]: starting fluentd-1.11.4 pid=7 ruby="2.6.6"
hive_analytics_fluentd_docker_sandbox  | 2024-08-19 07:19:31 +0000 [info]: spawn command to main:  cmdline=["/usr/local/bin/ruby", "-Eascii-8bit:ascii-8bit", "/usr/local/bundle/bin/fluentd", "-c", "/fluentd/etc/fluent.conf", "-p", "/fluentd/plugins", "--under-supervisor"]
hive_analytics_fluentd_docker_sandbox  | 2024-08-19 07:19:31 +0000 [info]: adding match pattern="ha2union.**" type="copy"
hive_analytics_fluentd_docker_sandbox  | 2024-08-19 07:19:31 +0000 [info]: #0 adding forwarding server 'sandbox-analytics-hivelog' host="sandbox-analytics-hivelog.withhive.com" port=24224 weight=60 plugin_id="object:3f8185f73f64"
hive_analytics_fluentd_docker_sandbox  | 2024-08-19 07:19:31 +0000 [info]: adding source type="forward"
hive_analytics_fluentd_docker_sandbox  | 2024-08-19 07:19:31 +0000 [info]: #0 starting fluentd worker pid=17 ppid=7 worker=0
hive_analytics_fluentd_docker_sandbox  | 2024-08-19 07:19:31 +0000 [info]: #0 listening port port=24224 bind="0.0.0.0"
hive_analytics_fluentd_docker_sandbox  | 2024-08-19 07:19:31 +0000 [info]: #0 fluentd worker is now running worker=0

File structure

When the script runs for the first time, it automatically creates the necessary configuration files and folders as shown below. Unless you delete them yourself, these configuration files and folders are preserved even if you stop or remove the Docker container.

---- hive_fluentd_docker  # The folder created when the downloaded file is decompressed
├-- hive.conf  # Environment configuration file
├-- hive_fluentd.sh  # Automation script
├-- docker-compose.yaml  # Configuration file for docker image creation and volume mount
├-- buffer
│  ├-- foward_buffer  # Folder where buffer files are temporarily stored
│  │  ├-- {tag1}  # Temporarily stores files to be sent in folders created based on tag names as buffer files
│  │  ├-- {tag2}.....
│  ├-- pos   # Folder where files that remember the location of read files are stored (data loss or duplicate transmission may occur if this folder is deleted, tampered with, etc.)
│  │  ├-- {tag1}  # Stores pos files in folders created based on tag names.
│  │  ├-- {tag2}.....
├-- conf
│  └-- fluentd.conf  # Fluentd configuration file
├-- failed  # Stores logs that failed to be sent as files
└-- forest  # Stores and compresses the successfully sent files on an hourly basis.

Checking forest logs

In the forest folder, a folder is created using the tag name that was used when sending the log files. Logs accumulate there as a temporary file for a while and are finally saved as a gzip file. You can tell whether logs are being sent correctly by checking that files are created in the folder and that their size is increasing.

drwxr-xr-x 2 root     root     4096 Aug 23 13:00 ha2union.game.sample.login/
-rw-r--r-- 1 root     root     4515 Aug 23 13:00 ha2union.game.sample.login.2024082303_0.log.gz
-rw-r--r-- 1 root     root     4515 Aug 23 12:00 ha2union.game.sample.login.2024082302_0.log.gz


Check the logs inside a compressed file with the following command:

zgrep "text to search for" filename.gz
# Example
zgrep "category" ha2union.game.sample.login.2024080104_0.log.gz
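A self-contained variant that can be run anywhere: create a sample gzip log and count the matching lines (the file name and content below are made up for the demo):

```shell
# Build a one-line gzip sample, then count lines containing "category".
printf '{"category":"login"}\n' | gzip > sample.log.gz
zgrep -c "category" sample.log.gz    # prints 1: one line matches
```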