Collectord

Kubernetes Centralized Logging with AWS CloudWatch Logs

Troubleshooting

Verify configuration

Get the list of the pods

kubectl get pods -n collectord-cloudwatch

The output will be similar to

NAME                                          READY   STATUS    RESTARTS   AGE
collectord-cloudwatch-4n52x                   1/1     Running   0          96m
collectord-cloudwatch-addon-6b6bbdfdd-g8qhm   1/1     Running   0          96m

We deploy two workload types: a DaemonSet that runs on every node, and a Deployment addon (collectord-cloudwatch-addon). Verify one pod from each (in the example below, change the pod names to the pods that are running on your cluster).

kubectl exec -n collectord-cloudwatch collectord-cloudwatch-addon-6b6bbdfdd-g8qhm -- /collectord verify
kubectl exec -n collectord-cloudwatch collectord-cloudwatch-4n52x -- /collectord verify

For each command you will see an output similar to

Version = 6.0.300
Build date = 190308
Environment = kubernetes


  General:
  + conf: OK
  + db: OK
  + db-meta: OK
  + instanceID: OK
    instanceID = 2M563HM3871R8KDT6P74V17RD8
  + license load: OK
  + license expiration: OK
  + license connection: OK

  Kubernetes configuration:
  + api: OK
  + volumes root: OK
  + runtime: OK
    docker

  Docker configuration:
  + connect: OK
    containers = 22
  + path: OK
  + files: OK

  CRI-O configuration:
  - ignored: OK
    kubernetes uses other container runtime

  File Inputs:
  x input(syslog): FAILED
    no matches
  x input(logs): FAILED
    no matches
  x input(journald): FAILED
    err = stat /rootfs/var/log/journal/: no such file or directory

Errors: 3

The number of errors is reported at the end. Our example shows output from minikube, where some inputs fail to initialize:

  • input(syslog) - minikube does not persist syslog output to disk, so these logs will not be available
  • input(logs) - minikube does not have any host log files under /var/log
  • input(journald) - minikube does not persist journald logs on disk

If you find an error in the configuration, apply the change with kubectl apply -f ./collectord-cloudwatch.yaml and then recreate the pods. The easiest way is to delete all pods in our namespace with kubectl delete pods --all -n collectord-cloudwatch; the workloads will recreate them.

Collect diagnostic information

If you need to open a support case, you can collect diagnostic information, including performance data, metrics and configuration.

1. Collect diagnostic information by running the following command

Choose the pod from which you want to collect the diagnostic information.

The following command takes several minutes.

kubectl exec -n collectord-cloudwatch collectord-cloudwatch-4n52x -- /collectord diag --stream 1>diag.tar.gz

You can extract the tar archive to verify the information that we collect. It includes performance data, memory usage, basic telemetry metrics, an info file with the host Linux version, and basic information about the license.

2. Collect logs

kubectl logs -n collectord-cloudwatch --timestamps collectord-cloudwatch-bwmwr  1>collectord-cloudwatch.log 2>&1

3. Run verify

kubectl exec -n collectord-cloudwatch collectord-cloudwatch-bwmwr -- /collectord verify > verify.log

4. Prepare tar archive

tar -czvf collectord-cloudwatch-$(date +%s).tar.gz verify.log collectord-cloudwatch.log diag.tar.gz
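The packaging step can be sanity-checked locally. The sketch below uses placeholder files standing in for the artifacts produced by steps 1-3, then lists the archive contents to confirm everything is included:

```shell
# Placeholder files standing in for the real artifacts from steps 1-3
printf 'verify output\n' > verify.log
printf 'collectord logs\n' > collectord-cloudwatch.log
printf 'diag\n' > diag.tar.gz

# Package everything into a timestamped archive for the support case
archive="collectord-cloudwatch-$(date +%s).tar.gz"
tar -czvf "$archive" verify.log collectord-cloudwatch.log diag.tar.gz

# List the archive contents to confirm all three files made it in
tar -tzf "$archive"
```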

Pod is not getting scheduled

Verify that the daemonset has scheduled pods on the nodes

kubectl get daemonset --namespace collectord-cloudwatch

If in the output the numbers under DESIRED, CURRENT, READY or UP-TO-DATE are 0, something may be wrong with the configuration

NAME                            DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE-SELECTOR   AGE
collectord-cloudwatch          0         0         0         0            0           <none>          1m

You can run the following command to describe the current state of daemonset/collectord-cloudwatch

$ kubectl describe daemonsets --namespace collectord-cloudwatch

In the output there will be one daemonset. The last lines list the events reported for this daemonset, for example

...
Events:
  Type     Reason            Age                From                  Message
  ----     ------            ----               ----                  -------
  Warning  FailedCreate      31m                daemonset-controller  Error creating: pods "collectord-cloudwatch-" is forbidden: SecurityContext.RunAsUser is forbidden

This error means that you are using Pod Security Policies; in that case you need to add our Cluster Role to the privileged Pod Security Policy, with

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app: collectord-cloudwatch
  name: collectord-cloudwatch
rules:
- apiGroups: ['extensions']
  resources: ['podsecuritypolicies']
  verbs:     ['use']
  resourceNames:
  - privileged
- apiGroups:
  ...

Failed to pull the image

When you run the command

$ kubectl get daemonsets --namespace collectord-cloudwatch

You can find that the number under READY does not match DESIRED

NAMESPACE   NAME                     DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE-SELECTOR   AGE
default     collectord-cloudwatch   1         1         0         1            0           <none>          6m

Try to find the pods that Kubernetes failed to start

$ kubectl get pods --namespace collectord-cloudwatch

You may see that the collectord-cloudwatch- pod has an ImagePullBackOff error, as in the example below

NAMESPACE   NAME                             READY     STATUS             RESTARTS   AGE
default     collectord-cloudwatch-55t61     0/1       ImagePullBackOff   0          2m

In that case, verify that your Kubernetes cluster has access to the hub.docker.com registry.

You can run the command

$ kubectl describe pods --namespace collectord-cloudwatch

This shows the output for each pod, including the events raised for every pod

Events:
  FirstSeen LastSeen    Count   From            SubObjectPath               Type        Reason      Message
  --------- --------    -----   ----            -------------               --------    ------      -------
  3m        2m      4   kubelet, localhost  spec.containers{collectord-cloudwatch}  Normal      Pulling     pulling image "hub.docker.com/outcoldsolutions/collectord:6.0.301"
  3m        1m      6   kubelet, localhost  spec.containers{collectord-cloudwatch}  Normal      BackOff     Back-off pulling image "hub.docker.com/outcoldsolutions/collectord:6.0.301"
  3m        1m      11  kubelet, localhost                      Warning     FailedSync  Error syncing pod

Blocked access to external registries

If you block external registries (hub.docker.com) for security reasons, you can copy the image to your own registry using a host that has access to the external registry.

Copying image from hub.docker.com to your own registry

$ docker pull outcoldsolutions/collectord:6.0.301

After that you can re-tag it by prefixing it with your own registry

docker tag  outcoldsolutions/collectord:6.0.301 [YOUR_REGISTRY]/outcoldsolutions/collectord:6.0.301

And push it to your registry

docker push [YOUR_REGISTRY]/outcoldsolutions/collectord:6.0.301

After that you will need to change your configuration YAML file to specify the image from the new location

image: [YOUR_REGISTRY]/outcoldsolutions/collectord:6.0.301
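For context, the image line lives in the container spec of the workloads defined in the configuration YAML; a hypothetical excerpt (field names around the image line are illustrative, your file may differ):

```yaml
# hypothetical excerpt from collectord-cloudwatch.yaml; only the image line changes
spec:
  containers:
  - name: collectord-cloudwatch
    image: [YOUR_REGISTRY]/outcoldsolutions/collectord:6.0.301
```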

If you need to move the image between hosts, you can export it to a tar file

$ docker image save outcoldsolutions/collectord:6.0.301 > collectord.tar

And load it on a different docker host

$ cat collectord.tar | docker image load

Pod is crashing or running, but you don't see any data

Take a look at the pod logs using kubectl (if the pod has restarted, add --previous to see the logs from the previous container instance)

kubectl logs -n collectord-cloudwatch collectord-cloudwatch-4n52x
INFO 2019/03/09 19:42:07.888657 outcoldsolutions.com/collectord/main.go:294: Build date = 190308, version = 6.0.300
INFO 2019/03/09 19:42:07.888921 outcoldsolutions.com/collectord/main.go:92: reading configuration from /config/cloudwatch/kubernetes/daemonset/001-general.conf
INFO 2019/03/09 19:42:07.889049 outcoldsolutions.com/collectord/main.go:92: reading configuration from /config/cloudwatch/kubernetes/daemonset/002-daemonset.conf
INFO 2019/03/09 19:42:07.889157 outcoldsolutions.com/collectord/main.go:92: reading configuration from /config/cloudwatch/kubernetes/daemonset/secret/100-general.conf
INFO 2019/03/09 19:42:07.889226 outcoldsolutions.com/collectord/main.go:92: reading configuration from /config/cloudwatch/kubernetes/daemonset/user/101-general.conf
INFO 2019/03/09 19:42:07.889272 outcoldsolutions.com/collectord/main.go:92: reading configuration from /config/cloudwatch/kubernetes/daemonset/user/102-daemonset.conf
INFO 2019/03/09 19:42:07.889869 outcoldsolutions.com/collectord/main.go:282: InstanceID = 2M563HM3871R8KDT6P74V17RD8, created = 2019-03-09 19:42:07.889852952 +0000 UTC m=+0.005088022
INFO 2019/03/09 19:42:07.919469 outcoldsolutions.com/collectord/pipeline/input/file/dir/watcher.go:85: watching /rootfs/var/log//(glob = , match = ^(([\w\-.]+\.log(.[\d\-]+)?)|(docker))$)
INFO 2019/03/09 19:42:07.919503 outcoldsolutions.com/collectord/pipeline/input/file/dir/watcher.go:85: watching /rootfs/var/log//(glob = , match = ^(syslog|messages)(.\d+)?$)
INFO 2019/03/09 19:42:07.919549 outcoldsolutions.com/collectord/environment/instance.go:1292: journald input: cannot get stat of the path /rootfs/var/log/journal/
INFO 2019/03/09 19:42:07.937845 outcoldsolutions.com/collectord/pipeline/watcher/watcher.go:305: kubernetes_watcher - watching 23320379-42a3-11e9-9920-0800277a19a4
INFO 2019/03/09 19:42:07.942340 outcoldsolutions.com/collectord/pipeline/watcher/watcher.go:305: kubernetes_watcher - watching 242bb597-42a3-11e9-9920-0800277a19a4
INFO 2019/03/09 19:42:07.944736 outcoldsolutions.com/collectord/pipeline/watcher/watcher.go:305: kubernetes_watcher - watching 44cf598f-42a3-11e9-9920-0800277a19a4
INFO 2019/03/09 19:42:07.946291 outcoldsolutions.com/collectord/pipeline/watcher/watcher.go:305: kubernetes_watcher - watching 48fb8e3d-42a3-11e9-9920-0800277a19a4
INFO 2019/03/09 19:42:07.963262 outcoldsolutions.com/collectord/pipeline/watcher/watcher.go:305: kubernetes_watcher - watching 2319f05f-42a3-11e9-9920-0800277a19a4
INFO 2019/03/09 19:42:07.965792 outcoldsolutions.com/collectord/pipeline/watcher/watcher.go:305: kubernetes_watcher - watching 2330b439-42a3-11e9-9920-0800277a19a4
INFO 2019/03/09 19:42:07.967132 outcoldsolutions.com/collectord/pipeline/watcher/watcher.go:305: kubernetes_watcher - watching 4305fa0e-42a3-11e9-9920-0800277a19a4
INFO 2019/03/09 19:42:07.969090 outcoldsolutions.com/collectord/pipeline/watcher/watcher.go:305: kubernetes_watcher - watching 44cf44d7-42a3-11e9-9920-0800277a19a4
INFO 2019/03/09 19:42:07.970973 outcoldsolutions.com/collectord/pipeline/watcher/watcher.go:305: kubernetes_watcher - watching 46990b04-42a3-11e9-9920-0800277a19a4
INFO 2019/03/09 19:42:08.053709 outcoldsolutions.com/collectord/pipeline/watcher/watcher.go:305: kubernetes_watcher - watching 66351eb6-42a3-11e9-9920-0800277a19a4
INFO 2019/03/09 19:42:08.096823 outcoldsolutions.com/collectord/pipeline/watcher/watcher.go:305: kubernetes_watcher - watching 663f7aea-42a3-11e9-9920-0800277a19a4
INFO 2019/03/09 19:42:08.239163 outcoldsolutions.com/collectord/license/license_check_pipe.go:155: license-check kubernetes BG5183Q44IE2M 0 0 ouMQMBLYDcTo3Lfen4CBmD7geZL8J5Uh1ueP4JKA3ZA 1552160527 1552160527 6.0.300 1552003200 true true 0
INFO 2019/03/09 19:42:08.291102 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:657: log group already exists /kubernetes/test/container_logs/collectord-cloudwatch/collectord-cloudwatch/
INFO 2019/03/09 19:42:08.297877 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:657: log group already exists /kubernetes/test/container_logs/kube-system/kube-controller-manager-minikube/
INFO 2019/03/09 19:42:08.298004 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:636: log group created /kubernetes/test/container_logs/kube-system/coredns/
INFO 2019/03/09 19:42:08.331389 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:657: log group already exists /kubernetes/test/container_logs/kube-system/etcd-minikube/
INFO 2019/03/09 19:42:08.332011 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:657: log group already exists /kubernetes/test/container_logs/kube-system/kube-apiserver-minikube/
INFO 2019/03/09 19:42:08.332964 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:657: log group already exists /kubernetes/test/container_logs/kube-system/kube-addon-manager-minikube/
INFO 2019/03/09 19:42:08.333177 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:657: log group already exists /kubernetes/test/container_logs/kube-system/coredns/
INFO 2019/03/09 19:42:08.339094 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:657: log group already exists /kubernetes/test/container_logs/collectord-cloudwatch/collectord-cloudwatch-addon/
INFO 2019/03/09 19:42:08.345735 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:636: log group created /kubernetes/test/container_logs/kube-system/kube-scheduler-minikube/
INFO 2019/03/09 19:42:08.355388 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:636: log group created /kubernetes/test/container_logs/kube-system/kube-proxy/
INFO 2019/03/09 19:42:08.359191 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:708: log stream created /kubernetes/test/container_logs/collectord-cloudwatch/collectord-cloudwatch/ - /collectord-cloudwatch-4n52x/collectord-cloudwatch/1194e4e30dd2ebb22f8e319a97859021cacb374de3c25c6b9d181529cd031842/stdout@minikube
INFO 2019/03/09 19:42:08.403311 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:708: log stream created /kubernetes/test/container_logs/kube-system/kube-controller-manager-minikube/ - /kube-controller-manager-minikube/kube-controller-manager/6792eb897b5c2051d0588e0dba63a690d1246943056b12fc93b52572ab2a05b2/stderr@minikube
INFO 2019/03/09 19:42:08.421739 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:708: log stream created /kubernetes/test/container_logs/kube-system/etcd-minikube/ - /etcd-minikube/etcd/d14c365c498beb652871665ea254b1896551a761878c11bd6660dc19b66f6a95/stderr@minikube
INFO 2019/03/09 19:42:08.464595 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:708: log stream created /kubernetes/test/container_logs/kube-system/kube-apiserver-minikube/ - /kube-apiserver-minikube/kube-apiserver/7e9867d5b52bf30f4fca0754d094d054dbe994122449df98821178ea00ce188f/stderr@minikube
INFO 2019/03/09 19:42:08.474708 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:708: log stream created /kubernetes/test/container_logs/kube-system/coredns/ - /coredns-86c58d9df4-8mkp5/coredns/78909b5cfc5244810177be44ef61637b89e48ae3f15e5ba897e96e8088dbfa7f/stdout@minikube
INFO 2019/03/09 19:42:08.475052 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:708: log stream created /kubernetes/test/container_logs/kube-system/kube-addon-manager-minikube/ - /kube-addon-manager-minikube/kube-addon-manager/dda763c668d53c43a4a2a480996a298e31f2139031dc79d9ffd7f24c87e07ebc/stdout@minikube
INFO 2019/03/09 19:42:08.508908 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:708: log stream created /kubernetes/test/container_logs/collectord-cloudwatch/collectord-cloudwatch-addon/ - /collectord-cloudwatch-addon-6b6bbdfdd-g8qhm/collectord-cloudwatch/4c769d19f7e40b2189881b2e3d4edd1c78982a2d19f9a828972a6d99bbe480d0/stdout@minikube
INFO 2019/03/09 19:42:08.527153 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:657: log group already exists /kubernetes/test/container_logs/kube-system/kube-addon-manager-minikube/
INFO 2019/03/09 19:42:08.527864 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:708: log stream created /kubernetes/test/container_logs/kube-system/coredns/ - /coredns-86c58d9df4-f7l4w/coredns/11aa186acca747593911322c3e87a541281b7e025a99c3dc4af8e50309e40287/stdout@minikube
INFO 2019/03/09 19:42:08.565959 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:708: log stream created /kubernetes/test/container_logs/kube-system/kube-scheduler-minikube/ - /kube-scheduler-minikube/kube-scheduler/2492eb18b0e95d8c56c6f833ddac6c745f59e92e82c1f426baf736bd4ca89007/stderr@minikube
INFO 2019/03/09 19:42:08.577661 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:708: log stream created /kubernetes/test/container_logs/kube-system/kube-proxy/ - /kube-proxy-mbbkc/kube-proxy/83704cb209ddd41da4bcd6fda566cdd380114d2418ff62df5a53e7834ddb6377/stderr@minikube
INFO 2019/03/09 19:42:08.584763 outcoldsolutions.com/collectord/pipeline/output/cloudwatchlogs/output.go:708: log stream created /kubernetes/test/container_logs/kube-system/kube-addon-manager-minikube/ - /kube-addon-manager-minikube/kube-addon-manager/dda763c668d53c43a4a2a480996a298e31f2139031dc79d9ffd7f24c87e07ebc/stderr@minikube

There could be warnings letting you know about existing issues, such as an incorrect policy for the AWS service or an invalid configuration.
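The watching lines in the log above show the filename patterns applied under /rootfs/var/log. To experiment with which host log files the logs input would pick up, you can translate its pattern to POSIX character classes and run it through grep (an approximation for illustration only, not Collectord's own matcher):

```shell
# POSIX-translated approximation of the logs-input pattern from the watcher line:
#   ^(([\w\-.]+\.log(.[\d\-]+)?)|(docker))$
pattern='^(([[:alnum:]_.-]+\.log(\.[0-9-]+)?)|(docker))$'

# Sample filenames: the first three match; syslog and messages.1 are
# covered by the separate syslog pattern ^(syslog|messages)(.\d+)?$
printf '%s\n' kern.log app-1.log.2019-03-09 docker syslog messages.1 \
  | grep -E "$pattern"
# prints: kern.log, app-1.log.2019-03-09, docker
```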

Documentation does not help?

Contact us or use the intercom chat (bottom right corner).


About Outcold Solutions

Outcold Solutions provides solutions for building centralized logging infrastructure and monitoring Kubernetes, OpenShift and Docker clusters. We provide an easy-to-set-up centralized logging infrastructure based on AWS services. We offer Splunk applications that give you insights across all container environments. We help businesses reduce the complexity of logging and monitoring by providing easy-to-use and easy-to-deploy solutions for Linux and Windows containers.