Getting Started with the Elastic Stack

Applications running in production, test, and development environments produce massive log files filled with endless lines of text. Mining the data in these files manually is a daunting, nearly impossible effort. This is where log management tools come into play. Two of the most popular are Splunk and the Elastic Stack. Both are excellent options, each with its own pros and cons. This article makes no claim as to which tool is best for your organization; there are simply too many variables to consider to make a blanket statement that one is better than the other.

The intent of this article is to give the reader the basic instructions needed to quickly set up a simple proof of concept of a working Elastic Stack. The stack was formerly known as the ELK Stack, an orchestration of three open-source projects managed by Elastic: Elasticsearch, Logstash, and Kibana. With the release of version 5.x of the projects and the inclusion of the Beats project, it is now known as the Elastic Stack, and all projects are released at the same time with the same version numbers.

Through this guide, you will be exposed to the following technologies. The only software you need to install locally is Docker; everything else is defined within Docker containers.

Docker
Elasticsearch
Logstash
Kibana
Beats
Apache

Don’t worry if any of these are new to you. Through the rest of this article, we will walk through a working example that you will pull down from our Elastic Stack Example GitHub repository.

Now let’s get started!

Install Docker

Download, install, and run Docker per the instructions for your operating system. After installing and running Docker, create a top-level directory for the project; let’s call it elk-poc.

Download the Example

From the console, navigate to your home directory (referenced as ~ going forward in this article), and pull down the example project.

git clone https://github.com/agiletrailblazers/elk-poc.git

In the following steps, we will highlight the key elements of the example, build the corresponding Docker images, and execute the example stack.

Create the Elasticsearch Docker Container

Elasticsearch is the server-side search and analytics engine that will index and store the data records collected from our Apache web server.

cd ~/elk-poc/elasticsearchDocker

We will install and run Elasticsearch in its own Docker container. Since we do not need to make any configuration changes to Elasticsearch for this example, we simply use the official Elasticsearch 5.4 Docker image as our base image; custom configuration could easily be supported in the future by updating this Dockerfile. This image includes a fully configured Elasticsearch instance listening on port 9200. Execute the following to build the Docker image. We will use this image later when configuring the Docker Compose stack.

docker build -t elk-poc-elasticsearch .

Create the Logstash Docker Container

Logstash is the server-side data processing pipeline that will ingest the log records sent to it from our Apache web server. It will feed those records to Elasticsearch, where they will be indexed for searching.

cd ~/elk-poc/logstashDocker

Take a look at the logstash.conf file:

input {
  beats {
    port => "5043"
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}

output {
  elasticsearch {
    action => "index"
    index => "elk-poc"
    hosts => "elk-poc-elasticsearch-service:9200"
    user => "elastic"
    password => "changeme"
  }
  stdout {
    codec => rubydebug
  }
}

Here, we are configuring Logstash to listen for input on port 5043 using the Beats input plugin, which is included with Logstash by default. The log records are matched and parsed using the built-in grok pattern for Apache combined logs. All matched records are then sent to the two configured outputs: standard out and Elasticsearch. The action for Elasticsearch is to index the log record into the elk-poc index on the specified host and port. In our case, the host is elk-poc-elasticsearch-service, the name that will be given to the Elasticsearch server we defined and created earlier. Elasticsearch is listening on its default port, 9200.
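To get a feel for what the grok filter extracts, here is a minimal Python sketch that parses one Apache combined-format line with a regular expression. The regex is a simplified approximation written for this illustration, not the actual COMBINEDAPACHELOG pattern that Logstash ships, and the sample log line is made up. The group names follow the field names that pattern is generally documented to produce (clientip, verb, request, response, and so on).

```python
import re

# Simplified stand-in for grok's COMBINEDAPACHELOG pattern. This is an
# approximation for illustration, not the pattern Logstash actually uses.
COMBINED_LOG = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+)(?: HTTP/(?P<httpversion>[\d.]+))?" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of named fields, or None when the line does not match."""
    match = COMBINED_LOG.match(line)
    return match.groupdict() if match else None


# Made-up sample line in Apache combined log format.
sample = (
    '172.18.0.1 - - [12/Jun/2017:14:03:21 +0000] "GET / HTTP/1.1" 200 11321 '
    '"-" "Mozilla/5.0"'
)
fields = parse_line(sample)
print(fields["clientip"], fields["verb"], fields["request"], fields["response"])
```

Each named group becomes a separate field, which is roughly what makes the parsed records searchable by field once they reach Elasticsearch.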

We will install and run Logstash in its own Docker container. The Docker image is based on the official Logstash 5.4 Docker image. We override the default Logstash configuration by placing our logstash.conf file into the appropriate Logstash configuration directory. Execute the following to build the Docker image. We will use this image later when configuring the Docker Compose stack.

docker build -t elk-poc-logstash .

Create the Kibana Docker Container

Kibana lets you visualize the Elasticsearch data. In our case, it will allow us to search and visualize the log records from our Apache web server.

cd ~/elk-poc/kibanaDocker

Take a look at the kibana.yml file:

# Kibana is served by a back end server. This setting specifies the port to use.
server.port: 5601

# Specifies the address to which the Kibana server will bind. IP addresses and host names are both valid values.
# The default is 'localhost', which usually means remote machines will not be able to connect.
# To allow connections from remote users, set this parameter to a non-loopback address.
server.host: "elk-poc-kibana-service"

# The Kibana server's name. This is used for display purposes.
server.name: "elk-poc-kibana-service"

# The URL of the Elasticsearch instance to use for all your queries.
elasticsearch.url: "http://elk-poc-elasticsearch-service:9200"

Here, we are configuring Kibana to expose its web interface on port 5601, the default port used by Kibana. The server host and name provided for this Kibana instance, elk-poc-kibana-service, will be defined later when we configure the Docker Compose stack. The final setting is the URL of the Elasticsearch instance that contains the data we wish to search. In our case, the host is elk-poc-elasticsearch-service, the name that will be given to the Elasticsearch server we defined and created earlier, and the port is 9200, the port the Elasticsearch image listens on.

We will install and run Kibana in its own Docker container. The Docker image is based on the official Kibana 5.4 Docker image. We override the default Kibana configuration by placing our kibana.yml file into the appropriate Kibana configuration directory. Execute the following to build the Docker image. We will use this image later when configuring the Docker Compose stack.

docker build -t elk-poc-kibana .

Create the Apache/Filebeat Docker Container

Now let's create an Apache web server instance so that we can generate some log files. The log files will be shipped to Logstash using the Beats platform, specifically the Filebeat data shipper.

cd ~/elk-poc/filebeatApacheDocker

Take a look at the filebeat.yml file:

filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/apache2/*.log

output.logstash:
  hosts: ["elk-poc-logstash-service:5043"]

Here, we are configuring Filebeat's inputs and outputs. The input is configured to monitor every log file found in the default Apache log directory and read the records written to it. The output is configured to ship those log records to Logstash. In our case, the host is elk-poc-logstash-service, the name that will be given to the Logstash server we defined and created earlier. Logstash is listening on port 5043, as we configured earlier in logstash.conf.
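As a rough illustration of which files the paths glob picks up, the following Python sketch applies the same pattern to a few file names. The file names are hypothetical, and Python's path matching only approximates Filebeat's own globbing, so treat the result as a guide rather than a guarantee.

```python
from pathlib import PurePosixPath

# Glob copied from the paths entry in filebeat.yml.
PATTERN = "/var/log/apache2/*.log"

# Hypothetical file names, chosen only to illustrate the matching rule.
candidates = [
    "/var/log/apache2/access.log",
    "/var/log/apache2/error.log",
    "/var/log/apache2/access.log.1",
    "/var/log/apache2/archive/access.log",
]


def is_harvested(path, pattern=PATTERN):
    """Return True when the whole path matches the glob pattern."""
    return PurePosixPath(path).match(pattern)


matched = [path for path in candidates if is_harvested(path)]
print(matched)
```

Under these assumptions, only files ending in .log directly inside /var/log/apache2 match; rotated files such as access.log.1 and files in subdirectories do not.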

We will install and run Apache and Filebeat together in a single Docker container. The Docker image consists of an Ubuntu Linux server running Java, Apache, and the latest 5.x version of Filebeat. We override the default Filebeat configuration by placing our filebeat.yml file into the appropriate Filebeat configuration directory. Execute the following to build the Docker image. We will use this image later when configuring the Docker Compose stack.

docker build -t elk-poc-filebeat-apache .

Run the Example using the Docker Compose Stack

Compose is a tool for defining and running multi-container Docker applications. We use it in our example to run all of the containers we just defined and created. We are not going to discuss this configuration in any detail; suffice it to say that the docker-compose.yml included in our example defines a stack containing all of the Docker containers we have already defined and ensures that they can communicate with each other using the hostnames and ports we have configured. If you are interested in learning more about Docker Compose, I recommend starting here.

Let's start up the example stack.

cd ~/elk-poc

docker-compose up

You can shut down the example stack later by pressing Ctrl-C in the console window.
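The containers can take a little while to come up. If you would rather script the wait than watch the console, here is a minimal Python sketch that polls TCP ports until they accept connections. It assumes the stack publishes Apache, Elasticsearch, and Kibana on localhost ports 80, 9200, and 5601, which are the ports used in the browser steps of this article; the function names and retry settings are illustrative choices, not part of the example project.

```python
import socket
import time


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for(ports, host="localhost", attempts=30, delay=2.0, probe=port_open):
    """Poll until every port accepts connections; return the ports still closed."""
    pending = list(ports)
    for _ in range(attempts):
        pending = [port for port in pending if not probe(host, port)]
        if not pending:
            break
        time.sleep(delay)
    return pending


# Example call (commented out so the sketch does not block when pasted):
# still_closed = wait_for([80, 9200, 5601])
# print("ready" if not still_closed else f"not reachable: {still_closed}")
```

An open port only shows that the container is accepting connections, so a service may still need a few more seconds before it answers requests.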

Let's See It in Action

Browse to http://localhost to display the default Apache welcome page. The entries from the Apache access log will be forwarded automatically to the Logstash server, where the records will be parsed and sent to Elasticsearch for indexing. They will also be written to standard output, so they appear in the console window.
You can search Elasticsearch directly via its REST API. Browse to http://localhost:9200/elk-poc/_search?q=source:access.log. This search returns the records that were sourced from the Apache access log. If you are prompted to log in, use the default credentials (username: elastic, password: changeme). Note: by default, only 10 records are returned; the search behavior is controlled via additional parameters to the search API.
You can search Elasticsearch visually using Kibana. Browse to http://localhost:5601 and, when prompted to log in, use the default credentials (username: elastic, password: changeme). You will land in the Management section, where you must configure an index pattern; this is where you tell Kibana which Elasticsearch index to use. Enter elk-poc for the index name and then click the Create button. Upon successful configuration, you will see all of the fields that are searchable in the elk-poc index. To execute the same search as done previously using the Elasticsearch REST API, click Discover in the left-side navigation, enter source:access.log into the search field, and click the search icon. The matching log entries will be displayed in the search results below, along with a timeline of when the events occurred.
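The REST search described above can also be scripted. The sketch below builds the same search URL and adds the size parameter to raise the 10-record default. The helper names are illustrative, and fetch_hits assumes the stack is running locally with the default credentials mentioned above, so the call to it is left commented out.

```python
import base64
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def search_url(index, query, size=10, base="http://localhost:9200"):
    """Build a URI search request; size caps the number of hits returned."""
    params = urlencode({"q": query, "size": size})
    return f"{base}/{index}/_search?{params}"


def fetch_hits(url, user="elastic", password="changeme"):
    """Run the search with HTTP basic auth and return the list of hits."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = Request(url, headers={"Authorization": f"Basic {token}"})
    with urlopen(request) as response:
        body = json.load(response)
    return body["hits"]["hits"]


url = search_url("elk-poc", "source:access.log", size=50)
print(url)

# With the stack running, the hits could be retrieved like this:
# for hit in fetch_hits(url):
#     print(hit["_source"]["message"])
```

The colon in the query is percent-encoded by urlencode, which Elasticsearch decodes back to source:access.log on receipt.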

Summary

Obviously, this is a simple example intended to introduce you to the Elastic Stack. The real versatility of the Elastic Stack, and of Splunk for that matter, lies in the plugins and the ecosystem of tools written to support and enhance them. Use this example as a starting point: download it from https://github.com/agiletrailblazers/elk-poc, start experimenting, and figure out how it can become a valuable business intelligence tool for your organization.
