Collectord

Kubernetes Centralized Logging with AWS S3, Athena, Glue and QuickSight

Configuration

Collectord ships with an embedded default configuration. Overriding it lets you control how often data is forwarded to S3, which host logs are forwarded, the default sampling for logs, and more.

Review configuration

You can review the full configuration applied to Collectord at any time by running a command in one of the running Collectord containers.

Get the list of pods in the collectord-s3 namespace:

kubectl get pods -n collectord-s3

The output will look like this:

NAME                                  READY   STATUS    RESTARTS   AGE
collectord-s3-4n52x                   1/1     Running   0          18s
collectord-s3-addon-6b6bbdfdd-g8qhm   1/1     Running   0          18s

Two workloads are running. The first is a DaemonSet that runs on every node and forwards host, container and application logs. The second is a Deployment that forwards Kubernetes events.

To get the configuration from a pod, run the following command, replacing the pod name with one from the list. The output from a pod scheduled by the DaemonSet differs from the output of the pod scheduled by the Deployment.

kubectl exec -it -n collectord-s3 collectord-s3-4n52x -- /collectord show-config
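
Because the DaemonSet and Deployment pods report different configurations, it can help to save the output of every pod and compare the files. The following is a minimal sketch of that idea, not something shipped with Collectord: the function name dump_collectord_configs and the output directory are hypothetical, and it assumes kubectl already points at the right cluster.

```shell
#!/bin/sh
# Sketch: save the effective configuration of every pod in a namespace.
# dump_collectord_configs and the output layout are illustrative names,
# not part of Collectord.
dump_collectord_configs() {
  namespace="${1:-collectord-s3}"      # namespace used in the commands above
  outdir="${2:-./collectord-config}"   # hypothetical output directory
  mkdir -p "$outdir" || return 1

  # "-o name" prints one "pod/<name>" entry per line.
  for pod in $(kubectl get pods -n "$namespace" -o name); do
    name="${pod#pod/}"
    # No -it here: the output goes to a file, so no terminal is needed.
    kubectl exec -n "$namespace" "$name" -- /collectord show-config \
      > "$outdir/$name.conf" || return 1
  done
}
```

Calling dump_collectord_configs collectord-s3 ./out would write one .conf file per pod into ./out, which can then be compared with diff.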

Overriding the configuration

The installation instructions include a YAML template with a ConfigMap that lets you override the default configuration.

apiVersion: v1
kind: ConfigMap
metadata:
  name: collectord-s3
  namespace: collectord-s3
  labels:
    app: collectord-s3
data:
  101-general.conf: |
    [general]
    # Review SLA at https://www.outcoldsolutions.com/docs/license-agreement/ and accept the license
    acceptLicense = false
    # Request the trial license with automated form https://www.outcoldsolutions.com/trial/request/
    license = 
    # If you are planning to set up log aggregation for multiple clusters, name the cluster
    fields.cluster = -

    [aws]
    # Specify AWS Region
    region = 

    [output.s3]
    # Specify Bucket Name
    bucket = 
  102-daemonset.conf: |

  103-addon.conf: |
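
The template leaves license, region and bucket empty and ships with acceptLicense = false. A quick check before applying the file can catch values that were left untouched. This is only a sketch under the assumption that all four values need to be filled in, as the template comments suggest; check_collectord_template is a hypothetical helper, not a Collectord tool.

```shell
#!/bin/sh
# Sketch: report settings from the template that are still unset.
# check_collectord_template is an illustrative name, not a Collectord tool.
check_collectord_template() {
  template="$1"
  missing=0

  # license, region and bucket ship with empty values in the template.
  for key in license region bucket; do
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*$" "$template"; then
      echo "missing value: $key"
      missing=1
    fi
  done

  # The template ships with acceptLicense = false.
  if grep -Eq '^[[:space:]]*acceptLicense[[:space:]]*=[[:space:]]*false[[:space:]]*$' "$template"; then
    echo "license not accepted: acceptLicense"
    missing=1
  fi

  return "$missing"
}
```

The function prints one line per unset value and returns a non-zero status if anything is missing, so it can gate a later kubectl apply.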

Change how often to upload files to S3

By default, Collectord uploads a file every 10 minutes or when the file reaches 100M. Decreasing these values can result in more files on S3, which requires more PUT requests to upload them and more GET requests to search them with Athena.

  101-general.conf: |

    ...

    [output.s3]
    frequency = 10m
    fileSize = 100M
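
To get a feel for how the frequency affects the number of files, the arithmetic below counts time-triggered uploads per day for a single file that is written continuously. It is a rough sketch: uploads_per_day is a hypothetical helper, it ignores uploads triggered by fileSize, and the 5m and 1m values are examples rather than recommendations.

```shell
#!/bin/sh
# Sketch: time-triggered uploads per day for one continuously written file.
# uploads_per_day is an illustrative helper; it ignores size-triggered uploads.
uploads_per_day() {
  frequency_minutes="$1"
  echo $(( 24 * 60 / frequency_minutes ))
}

uploads_per_day 10   # default frequency = 10m, prints 144
uploads_per_day 5    # example frequency = 5m, prints 288
uploads_per_day 1    # example frequency = 1m, prints 1440
```

Each uploaded file costs at least one PUT request, so halving the frequency roughly doubles the request count for the same stream of logs.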

Manage the space used by compressed files

Collectord monitors disk space usage. By default it uses at most 1GB of disk space. If the file system can report statistics for the mount, Collectord also makes sure that at least 20% of the disk space stays free.

  101-general.conf: |

    ...

    [output.s3]
    # maximum size on disk of the files
    maxSizeOnDisk = 1GB

    # manage to keep at least this percent of the disk space
    manageFreeSpaceOnDiskPercent = 20

    # how often to check the free space on the disk
    manageFreeSpaceOnDiskPeriod = 30s
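
To see how close a node is to the manageFreeSpaceOnDiskPercent threshold, the free space of a mount can be checked by hand. The sketch below is an approximation based on df, not how Collectord itself measures free space; free_space_percent is a hypothetical helper, and the directory where Collectord keeps its files is not stated here, so the path is left as an argument.

```shell
#!/bin/sh
# Sketch: print the approximate percentage of free space on the file system
# that holds a given path. free_space_percent is an illustrative helper,
# not a Collectord command.
free_space_percent() {
  target="${1:-/}"
  # "df -P" prints a header line and one POSIX-format data line;
  # column 5 is the used capacity, for example "37%".
  df -P "$target" | awk 'NR == 2 { sub(/%/, "", $5); print 100 - $5 }'
}
```

For example, free_space_percent / prints a number between 0 and 100; a value below 20 would be under the default threshold shown above.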

Number of uploaders

By default, Collectord uses 10 uploader threads, which allows it to upload 10 files in parallel.

  101-general.conf: |

    ...

    [output.s3]
    # number of uploader threads
    uploaders = 10

Disable telemetry

Collectord forwards very basic telemetry about performance and enabled configurations. You can disable it by setting the telemetry endpoint to an empty string:

  101-general.conf: |
    ...

    [general]
    # telemetry report endpoint, set it to empty string to disable telemetry
    telemetryEndpoint =