Collectord

Kubernetes Centralized Logging with AWS CloudWatch Logs

Installation

Prerequisites

Collectord automatically forwards container logs, application logs, host logs, and events to LogGroups and LogStreams named in the following format:

  • Container logs:
    • LogGroup: /kubernetes/{{cluster}}/container_logs/{{namespace}}/{{::coalesce(daemonset_name, deployment_name, statefulset_name, cronjob_name, job_name, replicaset_name, pod_name)}}/
    • LogStream: /{{pod_name}}/{{container_name}}/{{container_id}}/{{stream}}@{{host}}
  • Application logs:
    • LogGroup: /kubernetes/{{cluster}}/container_logs/{{namespace}}/{{::coalesce(daemonset_name, deployment_name, statefulset_name, cronjob_name, job_name, replicaset_name, pod_name)}}/
    • LogStream: /{{pod_name}}/{{container_name}}/{{container_id}}/{{volume_name}}/{{file_path}}@{{host}}
  • Host Logs:
    • LogGroup: /kubernetes/{{cluster}}/host_logs/{{host}}
    • LogStream: /{{file_path}}
  • Events:
    • LogGroup: /kubernetes/{{cluster}}/events/{{namespace}}/
    • LogStream: /
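To illustrate how the container LogGroup name resolves, here is a minimal shell sketch. It assumes that coalesce picks the first non-empty value in the order listed in the template. The function names and the sample pod are hypothetical, and this is not Collectord's own implementation.

```shell
# coalesce: print the first non-empty argument
# (assumed behavior of the ::coalesce(...) helper in the template).
coalesce() {
  for value in "$@"; do
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  done
  return 1
}

# container_log_group CLUSTER NAMESPACE DAEMONSET DEPLOYMENT STATEFULSET CRONJOB JOB REPLICASET POD
# Pass an empty string for owners the pod does not have.
container_log_group() {
  cluster=$1
  namespace=$2
  shift 2
  workload=$(coalesce "$@") || return 1
  printf '/kubernetes/%s/container_logs/%s/%s/\n' "$cluster" "$namespace" "$workload"
}

# Hypothetical pod owned by deployment "web" through replicaset "web-5d9c7b".
container_log_group test default "" web "" "" "" web-5d9c7b web-5d9c7b-x2x9k
# prints /kubernetes/test/container_logs/default/web/
```

Under this reading, all pods of one deployment share a LogGroup, and a standalone pod falls back to its own pod name.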

You need to create an IAM user with programmatic access that has permissions to create new LogGroups and LogStreams and to put log events (logs:PutLogEvents).

Create IAM User with Programmatic access

Collectord uses an IAM user to authenticate when pushing data to CloudWatch Logs. You can create the IAM user with the CLI or the web console by following the guidance at https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html. The example policy below restricts Collectord to writing only to log groups and allows it to create new log groups.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream",
                "logs:PutLogEvents",
                "logs:PutRetentionPolicy"
            ],
            "Resource": [
                "arn:aws:logs:*:*:log-group:/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": "logs:CreateLogGroup",
            "Resource": "*"
        }
    ]
}

CLI Example

Create policy

aws iam create-policy --policy-name collectord-cloudwatch --policy-document "{\"Version\": \"2012-10-17\", \"Statement\": [{\"Effect\": \"Allow\", \"Action\": [\"logs:CreateLogStream\", \"logs:PutLogEvents\", \"logs:PutRetentionPolicy\"], \"Resource\": [\"arn:aws:logs:*:*:log-group:/*\" ] }, {\"Effect\": \"Allow\", \"Action\": \"logs:CreateLogGroup\", \"Resource\": \"*\"} ] }"

Note the Arn of the policy in the output (for example, arn:aws:iam::999999999999:policy/collectord-cloudwatch).

Create user

aws iam create-user --user-name collectord-cloudwatch

Attach the newly created policy to the user

Replace the ARN with the policy ARN you noted in the Create policy step

aws iam attach-user-policy --user-name collectord-cloudwatch --policy-arn arn:aws:iam::999999999999:policy/collectord-cloudwatch

Create Access Key and Secret for collectord-cloudwatch user

aws iam create-access-key --user-name collectord-cloudwatch

From the output note the AccessKeyId and SecretAccessKey
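The CLI steps above can be chained so that the policy ARN does not have to be copied by hand. The sketch below assumes the policy document shown earlier is saved in a local file (policy.json is a hypothetical name) and uses the AWS CLI --query and --output options to extract fields. The function name is illustrative only.

```shell
# Sketch: create the policy and user, passing the policy ARN between steps.
# Assumes $1 (default: policy.json, a hypothetical file) holds the policy document.
setup_collectord_iam() {
  policy_file=${1:-policy.json}
  name=collectord-cloudwatch

  # Policy.Arn is the value the manual steps ask you to note.
  policy_arn=$(aws iam create-policy \
    --policy-name "$name" \
    --policy-document "file://$policy_file" \
    --query 'Policy.Arn' --output text) || return 1

  aws iam create-user --user-name "$name" >/dev/null || return 1
  aws iam attach-user-policy --user-name "$name" --policy-arn "$policy_arn" || return 1

  # Prints AccessKeyId and SecretAccessKey separated by a tab.
  # Treat this output as a secret; it cannot be retrieved again later.
  aws iam create-access-key --user-name "$name" \
    --query 'AccessKey.[AccessKeyId,SecretAccessKey]' --output text
}
```

Call setup_collectord_iam policy.json once and store the two printed values for the secret you create later in this guide.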

Container Runtime

Collectord works out of the box with CRI-O and Docker as runtime engines.

Our default configuration is optimized for Kubernetes clusters deployed in production environments, so some data might not be available with minikube. For example, minikube forwards host logs to journald without persisting them on disk and combines multiple control plane components into one process.

Docker Container Runtime

If you use Docker as the container runtime, Collectord uses the JSON files generated by the JSON logging driver as the source for container logs.

Some Linux distributions, CentOS for example, enable the journald logging driver by default instead of the JSON logging driver. You can verify which driver is used by default:

$ docker info | grep "Logging Driver"
Logging Driver: json-file
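If you need to run this check across many nodes, a small helper can turn the output into a pass/fail result. The sketch below is an illustration only: the function name is hypothetical, and it assumes the "Logging Driver:" line appears in the docker info output as shown above.

```shell
# Read `docker info` output on stdin and report whether json-file is the default driver.
# Returns non-zero when a different (or no) driver is reported.
check_logging_driver() {
  driver=$(sed -n 's/^ *Logging Driver: *//p')
  if [ "$driver" = "json-file" ]; then
    echo "ok: json-file"
  else
    echo "change needed: ${driver:-unknown}"
    return 1
  fi
}

# Usage on a node:
#   docker info 2>/dev/null | check_logging_driver
# The driver alone can also be printed with:
#   docker info --format '{{.LoggingDriver}}'
```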

If the Docker configuration file is /etc/sysconfig/docker (common on CentOS/RHEL with Docker 1.13), you can change the driver and restart the Docker daemon with the following commands.

$ sed -i 's/--log-driver=journald/--log-driver=json-file --log-opt max-size=100M --log-opt max-file=3/' /etc/sysconfig/docker
$ systemctl restart docker
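Because sed -i edits the file in place, it can help to preview the result first. The sketch below prints only the lines the expression would change, without touching the file. The function name is hypothetical, and the expression is the same one used in the command above.

```shell
# Print the lines of the Docker sysconfig file as they would look after the change.
# The file itself is not modified; no output means nothing would change.
preview_log_driver_change() {
  config=${1:-/etc/sysconfig/docker}
  sed -n 's/--log-driver=journald/--log-driver=json-file --log-opt max-size=100M --log-opt max-file=3/p' "$config"
}

# Usage:
#   preview_log_driver_change /etc/sysconfig/docker
```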

If you configure the Docker daemon with /etc/docker/daemon.json (common on Debian/Ubuntu), set the following options and restart the Docker daemon.

{
  "log-driver": "json-file",
  "log-opts" : {
    "max-size" : "100m",
    "max-file" : "3"
  }
}

$ systemctl restart docker

Please follow the Docker manual to learn how to configure the default logging driver for containers:

JSON logging driver configuration

With the default configuration, Docker does not rotate JSON log files, so over time they can grow large and consume all disk space. That is why the configurations above specify max-size and max-file. See Configure and troubleshoot the Docker daemon for more details.

Install Collectord

The YAML file below deploys two workloads: a DaemonSet that runs on each node to collect container logs, host logs, and application logs, and a Deployment that forwards Kubernetes events.

Deploying Collectord on Kubernetes

Copy the following YAML file and save it as collectord-cloudwatch.yaml

apiVersion: v1
kind: Namespace
metadata:
  labels:
    app: collectord-cloudwatch
  name: collectord-cloudwatch
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app: collectord-cloudwatch
  name: collectord-cloudwatch
  namespace: collectord-cloudwatch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app: collectord-cloudwatch
  name: collectord-cloudwatch
rules:
- apiGroups: ['extensions']
  resources: ['podsecuritypolicies']
  verbs:     ['use']
  resourceNames:
  - privileged
- apiGroups:
  - ""
  - apps
  - batch
  - extensions
  - monitoring.coreos.com
  - etcd.database.coreos.com
  - vault.security.coreos.com
  resources:
  - alertmanagers
  - cronjobs
  - daemonsets
  - deployments
  - endpoints
  - events
  - jobs
  - namespaces
  - nodes
  - nodes/proxy
  - pods
  - prometheuses
  - replicasets
  - replicationcontrollers
  - scheduledjobs
  - services
  - statefulsets
  - vaultservices
  - etcdclusters
  verbs:
  - get
  - list
  - watch
- nonResourceURLs:
  - /metrics
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app: collectord-cloudwatch
  name: collectord-cloudwatch
  namespace: collectord-cloudwatch
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: collectord-cloudwatch
subjects:
  - kind: ServiceAccount
    name: collectord-cloudwatch
    namespace: collectord-cloudwatch
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: collectord-cloudwatch
  namespace: collectord-cloudwatch
  labels:
    app: collectord-cloudwatch
data:
  101-general.conf: |
    [general]
    # Review SLA at https://www.outcoldsolutions.com/docs/license-agreement/ and accept the license
    acceptLicense = false
    # Request the trial license with automated form https://www.outcoldsolutions.com/trial/request/
    license = 
    # If you are planning to setup log aggregation for multiple cluster, name the cluster
    fields.cluster = -

    [aws]
    # Specify AWS Region
    region = 

    [output.cloudwatch.logs]

  102-daemonset.conf: |

  103-addon.conf: |

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: collectord-cloudwatch
  namespace: collectord-cloudwatch
  labels:
    app: collectord-cloudwatch
spec:
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      daemon: collectord-cloudwatch
  template:
    metadata:
      name: collectord-cloudwatch
      labels:
        daemon: collectord-cloudwatch
    spec:
      dnsPolicy: ClusterFirstWithHostNet
      hostNetwork: true
      serviceAccountName: collectord-cloudwatch
      tolerations:
      - operator: "Exists"
        effect: "NoSchedule"
      - operator: "Exists"
        effect: "NoExecute"
      containers:
      - name: collectord-cloudwatch
        # Collectord version
        image: outcoldsolutions/collectord:6.0.301
        imagePullPolicy: Always
        securityContext:
          runAsUser: 0
        resources:
          limits:
            cpu: 2
            memory: 256Mi
          requests:
            cpu: 200m
            memory: 64Mi
        env:
        - name: COLLECTORDCONFIG_PATH
          value: /config/cloudwatch/kubernetes/daemonset/
        - name: KUBERNETES_NODENAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        volumeMounts:
        # We store state in /data folder (file positions)
        - name: collectord-cloudwatch-state
          mountPath: /var/lib/collectord-cloudwatch/data/
        # Configuration file deployed with ConfigMap
        - name: collectord-cloudwatch-config
          mountPath: /config/cloudwatch/kubernetes/daemonset/user
          readOnly: true
        - name: collectord-cloudwatch-secret
          mountPath: /config/cloudwatch/kubernetes/daemonset/secret
          readOnly: true
        # Location of docker root (for container logs and metadata)
        - name: docker-root
          mountPath: /rootfs/var/lib/docker/
          readOnly: true
        # Host logs location (including CRI-O logs)
        - name: logs
          mountPath: /rootfs/var/log/
          readOnly: true
        # Docker socket
        - name: docker-unix-socket
          mountPath: /rootfs/var/run/docker.sock
          readOnly: true
        # CRI-O socket (if using CRI-O runtime)
        - name: crio-unix-socket
          mountPath: /rootfs/var/run/crio/
          readOnly: true
        # Application logs
        - name: volumes-root
          mountPath: /rootfs/var/lib/kubelet/
          readOnly: true
        # correct timezone
        - name: localtime
          mountPath: /etc/localtime
          readOnly: true
      volumes:
      # We store state directly on host, change this location, if
      # your persistent volume is somewhere else
      - name: collectord-cloudwatch-state
        hostPath:
          path: /var/lib/collectord-cloudwatch/data/
      # Location of docker root (for container logs and metadata)
      - name: docker-root
        hostPath:
          path: /var/lib/docker/
      # Host logs location (including CRI-O logs)
      - name: logs
        hostPath:
          path: /var/log
      # Docker socket
      - name: docker-unix-socket
        hostPath:
          path: /var/run/docker.sock
      # CRI-O socket (if using CRI-O runtime)
      - name: crio-unix-socket
        hostPath:
          path: /var/run/crio/
      # Location for kubelet mounts, to autodiscover application logs
      - name: volumes-root
        hostPath:
          path: /var/lib/kubelet/
      # correct timezone
      - name: localtime
        hostPath:
          path: /etc/localtime
      # credentials from Secret and configuration from ConfigMap
      - name: collectord-cloudwatch-secret
        secret:
          secretName: collectord-cloudwatch
      - name: collectord-cloudwatch-config
        configMap:
          name: collectord-cloudwatch
          items:
          - key: 101-general.conf
            path: 101-general.conf
          - key: 102-daemonset.conf
            path: 102-daemonset.conf
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: collectord-cloudwatch-addon
  namespace: collectord-cloudwatch
  labels:
    app: collectord-cloudwatch
spec:
  replicas: 1

  selector:
    matchLabels:
      daemon: collectord-cloudwatch

  template:
    metadata:
      name: collectord-cloudwatch-addon
      labels:
        daemon: collectord-cloudwatch
    spec:
      serviceAccountName: collectord-cloudwatch
      containers:
      - name: collectord-cloudwatch
        image: outcoldsolutions/collectord:6.0.301
        imagePullPolicy: Always
        securityContext:
          runAsUser: 0
          privileged: true
        resources:
          limits:
            cpu: 500m
            memory: 256Mi
          requests:
            cpu: 50m
            memory: 64Mi
        env:
        - name: COLLECTORDCONFIG_PATH
          value: /config/cloudwatch/kubernetes/addon
        - name: KUBERNETES_NODENAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        volumeMounts:
        - name: collectord-cloudwatch-state
          mountPath: /var/lib/collectord-cloudwatch/data/
        - name: collectord-cloudwatch-config
          mountPath: /config/cloudwatch/kubernetes/addon/user
          readOnly: true
        - name: collectord-cloudwatch-secret
          mountPath: /config/cloudwatch/kubernetes/addon/secret
          readOnly: true
      volumes:
      - name: collectord-cloudwatch-state
        hostPath:
          path: /var/lib/collectord-cloudwatch/data/
      - name: collectord-cloudwatch-secret
        secret:
          secretName: collectord-cloudwatch
      - name: collectord-cloudwatch-config
        configMap:
          name: collectord-cloudwatch
          items:
          - key: 101-general.conf
            path: 101-general.conf
          - key: 103-addon.conf
            path: 103-addon.conf

In the ConfigMap, replace the values

[general]
# Review SLA at https://www.outcoldsolutions.com/docs/license-agreement/ and accept the license
acceptLicense = false
# Request the trial license with automated form https://www.outcoldsolutions.com/trial/request/
license = 
# If you are planning to setup log aggregation for multiple cluster, name the cluster
fields.cluster = -

[aws]
# Specify AWS Region
region = 

For example (request a trial license at https://www.outcoldsolutions.com/trial/request/)

[general]
# Review SLA at https://www.outcoldsolutions.com/docs/license-agreement/ and accept the license
acceptLicense = true
# Request the trial license with automated form https://www.outcoldsolutions.com/trial/request/
license = Qkc1MTgzUTQ0SUUyTTowOjoz....
# If you are planning to setup log aggregation for multiple cluster, name the cluster
fields.cluster = test

[aws]
# Specify AWS Region
region = us-west-2

Apply the deployment

kubectl apply -f collectord-cloudwatch.yaml

If you need to change the default configuration, please read configuration. For example, you can disable forwarding of host logs, stop forwarding container logs by default (opt-out behavior), or set the default sampling percentage for container logs.

Create a secret with Access Key and Secret Key

Create a file named 100-general.conf with the following content (replace AccessKeyId and SecretAccessKey with the values you noted when creating the access key)

[aws]
accessKeyID = AccessKeyId
secretAccessKey = SecretAccessKey

For example

[aws]
accessKeyID = AKIAI7SNNYYAV456WBVQ
secretAccessKey = CKnnNl8DqPrqeudQxV2vPgXga6BCR0y4RMrlirmC

Create the secret

kubectl create secret generic collectord-cloudwatch --from-file=./100-general.conf --namespace collectord-cloudwatch
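Since 100-general.conf contains long-lived credentials, you may prefer to generate it with restrictive permissions and delete it once the secret exists. The sketch below is one way to do that. The function name is hypothetical, and the file format is the one shown above.

```shell
# Write the [aws] credentials file with owner-only permissions.
# Arguments: ACCESS_KEY_ID SECRET_ACCESS_KEY [OUTPUT_FILE]
write_aws_conf() {
  access_key_id=$1
  secret_access_key=$2
  out=${3:-100-general.conf}
  (
    umask 077   # new file is readable and writable by the current user only
    printf '[aws]\naccessKeyID = %s\nsecretAccessKey = %s\n' \
      "$access_key_id" "$secret_access_key" > "$out"
  )
}

# Usage (values are placeholders):
#   write_aws_conf "$ACCESS_KEY_ID" "$SECRET_ACCESS_KEY"
#   kubectl create secret generic collectord-cloudwatch --from-file=./100-general.conf --namespace collectord-cloudwatch
#   rm -f 100-general.conf
```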

Verify that Pods are running

Verify that the pods are running with the following command

kubectl get pods -n collectord-cloudwatch

The output should show the pods running successfully in the collectord-cloudwatch namespace.
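To script this check, you can filter the pod list for anything that is not in the Running state. The sketch below assumes the default kubectl column order (NAME, READY, STATUS, RESTARTS, AGE) and the --no-headers flag. The function name is hypothetical.

```shell
# Read `kubectl get pods --no-headers` output on stdin.
# Prints "NAME STATUS" for every pod that is not Running and
# returns non-zero if any such pod is found.
not_running_pods() {
  awk '$3 != "Running" { print $1 " " $3; bad = 1 } END { exit bad }'
}

# Usage:
#   kubectl get pods -n collectord-cloudwatch --no-headers | not_running_pods \
#     && echo "all pods are running"
```

A pod that stays in ContainerCreating typically means the collectord-cloudwatch secret has not been created yet.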

After that you can find all the Log Groups with logs and events in the CloudWatch console.

About Outcold Solutions

Outcold Solutions provides solutions for building centralized logging infrastructure and monitoring Kubernetes, OpenShift and Docker clusters. We provide easy-to-set-up centralized logging infrastructure with AWS services. We offer Splunk applications, which give you insights across all container environments. We help businesses reduce complexity related to logging and monitoring by providing solutions for Linux and Windows containers that are easy to use and deploy.