Docker Centralized Logging with AWS S3, Athena, Glue and QuickSight


Collectord ships with an embedded default configuration. Changing this configuration lets you control how often data is uploaded to S3, which host logs are forwarded, the default sampling for logs, and more.

Review configuration

You can review the full configuration applied to collectord at any time by running the following command in one of the running collectord containers.

docker exec -it collectord-cloudwatch /collectord show-config


Override configuration

Overriding configuration with environment variables

You can override configuration values with environment variables in the following format:

--env "COLLECTOR__<ANYUNIQUENAME>=<section>__<key>=<value>"
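<gr_replace_placeholder_never_emitted/>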

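To make the format concrete, the sketch below splits an override value of the form <section>__<key>=<value> into its three parts using POSIX shell parameter expansion. The function name parse_override is hypothetical and only illustrates how the value is structured; it is not part of collectord. The sample value is taken from the upload frequency example on this page.

```shell
# Split an override value of the form <section>__<key>=<value>
# into its section, key, and value parts.
parse_override() {
    spec=$1
    section=${spec%%__*}   # text before the first "__"
    rest=${spec#*__}       # text after the first "__"
    key=${rest%%=*}        # text before the first "="
    value=${rest#*=}       # text after the first "="
    printf 'section=%s key=%s value=%s\n' "$section" "$key" "$value"
}

parse_override "output.s3__frequency=10m"
```

Here the section is output.s3, the key is frequency, and the value is 10m. An empty value, as in general__telemetryEndpoint=, parses to an empty string for the value part.
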
Overriding configuration by embedding configuration files

You can create your own configuration files that override the default values in 001-general.conf. Place only the values that you want to replace in the file. For example, create a file named 002-conf.conf.
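
As an illustration, such a file might look like the sketch below. It assumes the configuration files use an INI-style layout in which the section and key match the <section>__<key> names used in the environment variable overrides, and that # starts a comment. The values are examples only, not recommendations.

```ini
# 002-conf.conf: list only the values that differ from the defaults
# (section layout and value choices are assumptions for illustration)
[output.s3]
frequency = 5m
fileSize = 50M
```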

Create a Dockerfile

FROM outcoldsolutions/collectord:6.0.301

COPY 002-conf.conf /config/s3/docker/002-conf.conf

Build the image

docker build -t <image-name> .

Use this image in place of the default one when you start collectord by following the deployment instructions.

Change how often to upload files to S3

By default, collectord uploads a file every 10 minutes or when the file reaches 100M. Decreasing these values produces more files on S3, which requires more PUT requests to upload them and more GET requests to search them with Athena.

docker run \
    ... \
    --env "COLLECTORD__OUTPUT_S3_FREQUENCY=output.s3__frequency=10m" \
    --env "COLLECTORD__OUTPUT_S3_FILESIZE=output.s3__fileSize=100M" \
    ... \
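
To get a feel for the trade-off, the following sketch estimates how many time-triggered uploads a single stream of files produces per day for a given frequency. The function name uploads_per_day is hypothetical, and the estimate ignores uploads triggered by the file size limit as well as the number of distinct files collectord writes per interval, so treat it as a rough lower bound on PUT requests.

```shell
# Estimate time-triggered uploads per day for one stream of files,
# given the upload frequency in minutes.
# Assumes the fileSize limit is never reached before the timer fires.
uploads_per_day() {
    minutes=$1
    echo $(( 24 * 60 / minutes ))
}

uploads_per_day 10   # the default frequency of 10 minutes
uploads_per_day 1    # a much shorter frequency of 1 minute
```

With the default of 10 minutes this gives 144 uploads per day, and with 1 minute it gives 1440, which is the tenfold increase in PUT requests that a shorter frequency implies.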

Manage the space used by compressed files

Collectord monitors disk space usage. By default, it uses at most 1GB of disk space. If the file system can report statistics for the mount, collectord additionally makes sure that at least 20% of the disk space stays free.

docker run \
    ... \
    --env "COLLECTORD__OUTPUT_S3_MAXSIZEONDISK=output.s3__maxSizeOnDisk=1GB" \
    --env "COLLECTORD__OUTPUT_S3_FREESPACEPERCENT=output.s3__manageFreeSpaceOnDiskPercent=20" \
    --env "COLLECTORD__OUTPUT_S3_FREESPACEPERIOD=output.s3__manageFreeSpaceOnDiskPeriod=30s" \
    ... \
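
Before changing these limits, it can help to check how much free space the relevant mount currently has. The sketch below is one way to do that from a shell using df. It is not how collectord measures free space, the function name free_percent is hypothetical, and the path / is only a placeholder for the mount where collectord keeps its files.

```shell
# Print the approximate percentage of free space on the file system
# that holds the given path.
free_percent() {
    # df -P prints POSIX-format output; on the second line, column 5
    # is the used capacity, for example "37%".
    used=$(df -P "$1" | awk 'NR==2 { sub("%", "", $5); print $5 }')
    echo $(( 100 - used ))
}

free_percent /   # replace / with the mount used by collectord
```

If the printed value is close to or below the configured manageFreeSpaceOnDiskPercent, lowering maxSizeOnDisk alone may not be enough to keep the mount within the limit.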

Number of uploaders

By default, collectord uses 10 uploader threads, which allows it to upload 10 files in parallel.

docker run \
    ... \
    --env "COLLECTORD__OUTPUT_S3_UPLOADERS=output.s3__uploaders=10" \
    ... \

Disable telemetry

Collectord forwards very basic telemetry about performance and enabled configuration options. You can disable it by setting the telemetry endpoint to an empty value:

docker run \
    ... \
    --env "COLLECTORD__TELEMETRY=general__telemetryEndpoint=" \
    ... \
