Collectord

Kubernetes Centralized Logging with AWS S3, Athena, Glue and QuickSight

Access control

If you need to restrict access to specific partitions, you can do so with policies on the S3 buckets. Let's create a user who can access only a limited set of logs: those from the cluster named dev and the namespace guestbook.

Create a user with the limited policy

First, we create a limited policy that allows reading data only from specific paths on S3.

This policy allows the user to use Athena (run queries, access the metadata) and to read information about existing databases, tables and partitions from Glue, while limiting access to the S3 bucket where we store the logs.

The policy allows reading only from the paths /kubernetes/container_logs/cluster=dev/namespace=guestbook/* and /kubernetes/events/cluster=dev/namespace=guestbook/*, so a user with this policy can read container logs and events only from the namespace guestbook in the cluster dev.

The last statement allows writing and reading query results in the S3 bucket aws-athena-query-results-999999999999-us-west-2, but only under the path /guestbook/.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "athena:Start*",
                "athena:Get*",
                "athena:BatchGet*",
                "athena:List*",
                "athena:Run*",
                "glue:Get*",
                "glue:BatchGet*"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:Head*",
                "s3:Get*",
                "s3:List*"
            ],
            "Resource": [
                "arn:aws:s3:::example.com.logs",
                "arn:aws:s3:::example.com.logs/kubernetes/container_logs/cluster=dev/namespace=guestbook/*",
                "arn:aws:s3:::example.com.logs/kubernetes/events/cluster=dev/namespace=guestbook/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:Head*",
                "s3:Get*",
                "s3:List*",
                "s3:Put*"
            ],
            "Resource": [
                "arn:aws:s3:::aws-athena-query-results-999999999999-us-west-2",
                "arn:aws:s3:::aws-athena-query-results-999999999999-us-west-2/guestbook/*"
            ]
        }
    ]
}
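If several namespaces need the same kind of restricted access, the policy above can be generated instead of copied by hand. The following is a minimal Python sketch under stated assumptions: the function name build_limited_policy and its parameters are hypothetical (not part of Collectord or AWS), and it assumes the query results prefix is named after the namespace, as in the guestbook example.

```python
import json


def build_limited_policy(logs_bucket, results_bucket, cluster, namespace):
    """Return an IAM policy document (as a dict) mirroring the policy above."""
    partition = f"cluster={cluster}/namespace={namespace}"
    read_only = ["s3:Head*", "s3:Get*", "s3:List*"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Athena queries and Glue metadata lookups.
                "Effect": "Allow",
                "Action": [
                    "athena:Start*", "athena:Get*", "athena:BatchGet*",
                    "athena:List*", "athena:Run*",
                    "glue:Get*", "glue:BatchGet*",
                ],
                "Resource": "*",
            },
            {
                # Read-only access to the logs of one cluster and namespace.
                "Effect": "Allow",
                "Action": read_only,
                "Resource": [
                    f"arn:aws:s3:::{logs_bucket}",
                    f"arn:aws:s3:::{logs_bucket}/kubernetes/container_logs/{partition}/*",
                    f"arn:aws:s3:::{logs_bucket}/kubernetes/events/{partition}/*",
                ],
            },
            {
                # Read and write query results.
                # Assumption: the results prefix equals the namespace name.
                "Effect": "Allow",
                "Action": read_only + ["s3:Put*"],
                "Resource": [
                    f"arn:aws:s3:::{results_bucket}",
                    f"arn:aws:s3:::{results_bucket}/{namespace}/*",
                ],
            },
        ],
    }


policy = build_limited_policy(
    "example.com.logs",
    "aws-athena-query-results-999999999999-us-west-2",
    cluster="dev",
    namespace="guestbook",
)
print(json.dumps(policy, indent=4))
```

Called with the values from this example, the sketch prints a document equivalent to the policy shown above.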

After that, you can create a new user and attach this policy.
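The same step can also be scripted. Below is a minimal Python sketch, assuming an IAM client that behaves like boto3.client("iam") is passed in; the helper create_limited_user, the user name, the policy name and the policy.json file are hypothetical and only illustrate the order of the calls.

```python
import json


def create_limited_user(iam, user_name, policy_name, policy_document):
    """Create a managed policy and a user, then attach the policy to the user.

    `iam` is expected to behave like boto3.client("iam").
    Returns the ARN of the created policy.
    """
    created = iam.create_policy(
        PolicyName=policy_name,
        PolicyDocument=json.dumps(policy_document),
    )
    policy_arn = created["Policy"]["Arn"]
    iam.create_user(UserName=user_name)
    iam.attach_user_policy(UserName=user_name, PolicyArn=policy_arn)
    return policy_arn


# Example usage (requires the boto3 package and AWS credentials):
#   import boto3
#   with open("policy.json") as f:  # hypothetical file holding the limited policy
#       document = json.load(f)
#   create_limited_user(
#       boto3.client("iam"), "guestbook-logs", "guestbook-logs-read", document
#   )
```

Passing the client as an argument keeps the helper easy to try against a stub before running it against a real account.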

Querying the data

After signing in as this user, make sure to change the query result location to s3://aws-athena-query-results-999999999999-us-west-2/guestbook/

With this policy, the user cannot access files on S3 that are outside the allowed paths.
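To see which objects fall inside the allowed paths, the resource patterns of the policy can be matched against object keys. The sketch below is a simplified illustration in Python, not how IAM evaluates policies internally: it only models the two object-level patterns from the limited policy, treats "*" as any run of characters and "?" as a single character, and ignores explicit denies and bucket-level permissions. The object keys in the usage example are hypothetical.

```python
import re

# Object-level resource patterns from the limited policy.
ALLOWED_RESOURCES = [
    "arn:aws:s3:::example.com.logs/kubernetes/container_logs/cluster=dev/namespace=guestbook/*",
    "arn:aws:s3:::example.com.logs/kubernetes/events/cluster=dev/namespace=guestbook/*",
]


def _pattern_to_regex(pattern):
    # "*" matches any run of characters (including "/"), "?" matches one character.
    escaped = re.escape(pattern)
    return re.compile(escaped.replace(r"\*", ".*").replace(r"\?", "."), re.DOTALL)


def can_read_object(bucket, key, resources=ALLOWED_RESOURCES):
    """Return True if the object ARN matches one of the allowed patterns."""
    arn = f"arn:aws:s3:::{bucket}/{key}"
    return any(_pattern_to_regex(p).fullmatch(arn) for p in resources)


# Hypothetical object keys, used only for illustration.
inside = "kubernetes/container_logs/cluster=dev/namespace=guestbook/example.log"
outside = "kubernetes/container_logs/cluster=dev/namespace=default/example.log"

print(can_read_object("example.com.logs", inside))   # True
print(can_read_object("example.com.logs", outside))  # False
```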

To help the user always search within the guestbook namespace and the dev cluster, we can create a view container_logs_7d_dev_guestbook (as usual, we can also filter to the last 7 days).

CREATE OR REPLACE VIEW container_logs_7d_dev_guestbook AS
SELECT *
FROM container_logs
WHERE namespace = 'guestbook' and cluster='dev' and dt>=date_format(date_add('day', -7, now()), '%Y%m%d');
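The filter in this view works because dt is a zero-padded YYYYMMDD string, so comparing strings orders them by date. The Python sketch below mirrors the WHERE clause to make that explicit. It assumes now() is evaluated in UTC; the helper names dt_cutoff and in_view and the sample rows are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def dt_cutoff(days=7, now=None):
    """Mirror date_format(date_add('day', -days, now()), '%Y%m%d')."""
    now = now or datetime.now(timezone.utc)  # assumption: session time zone is UTC
    return (now - timedelta(days=days)).strftime("%Y%m%d")


def in_view(row, cutoff):
    """Apply the same conditions as the WHERE clause of the view."""
    # dt is a zero-padded YYYYMMDD string, so string comparison orders by date.
    return (
        row["namespace"] == "guestbook"
        and row["cluster"] == "dev"
        and row["dt"] >= cutoff
    )


cutoff = dt_cutoff(now=datetime(2019, 3, 5, tzinfo=timezone.utc))
print(cutoff)  # 20190226

rows = [
    {"cluster": "dev", "namespace": "guestbook", "dt": "20190301"},
    {"cluster": "dev", "namespace": "guestbook", "dt": "20190201"},
    {"cluster": "dev", "namespace": "default", "dt": "20190301"},
]
print([in_view(r, cutoff) for r in rows])  # [True, False, False]
```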

Athena Workgroups

With Athena Workgroups you can control not only access, but cost as well. Read more at Using Workgroups to Control Query Access and Costs.

About Outcold Solutions

Outcold Solutions provides solutions for building centralized logging infrastructure and monitoring Kubernetes, OpenShift and Docker clusters. We provide an easy-to-set-up centralized logging infrastructure with AWS services. We offer Splunk applications, which give you insights across all container environments. We help businesses reduce the complexity of logging and monitoring by providing solutions for Linux and Windows containers that are easy to use and deploy.