[[common-log-format-example]]
== Example: Parse logs in the Common Log Format
++++
Example: Parse logs
++++
In this tutorial, you’ll use an ingest pipeline to parse
server logs in the {wikipedia}/Common_Log_Format[Common Log Format] before
indexing. Before starting, check the prerequisites for
ingest pipelines.

The logs you want to parse look similar to this:

[source,log]
----
212.87.37.154 - - [05/May/2099:16:21:15 +0000] "GET /favicon.ico HTTP/1.1" 200 3638 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"
----
// NOTCONSOLE
These logs contain a timestamp, IP address, and user agent. You want to give
each of these three items its own field in {es} for faster searches and
visualizations. You also want to know where the request is coming from.
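
To make the structure of such a line concrete, the following Python sketch
pulls the same three items out of the sample line with a regular expression.
It is only an illustration and not part of the pipeline: `CLF_PATTERN` and
`parse_line` are hypothetical names, and the regex is a simplified stand-in
for the grok pattern used later in this tutorial.

[source,python]
----
import re

# Simplified regex for the sample log line (an assumption for illustration;
# the pipeline itself uses a grok pattern instead).
CLF_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '             # client IP, identity, user
    r'\[(?P<timestamp>[^\]]+)\] '       # [05/May/2099:16:21:15 +0000]
    r'"[^"]*" \d{3} (?:-|\d+) '         # request line, status, bytes
    r'"[^"]*" "(?P<user_agent>[^"]*)"'  # referrer, user agent
)


def parse_line(line):
    """Return the IP, timestamp, and user agent, or None if no match."""
    match = CLF_PATTERN.match(line)
    return match.groupdict() if match else None


line = (
    '212.87.37.154 - - [05/May/2099:16:21:15 +0000] '
    '"GET /favicon.ico HTTP/1.1" 200 3638 "-" '
    '"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"'
)
fields = parse_line(line)
print(fields["ip"], fields["timestamp"], sep="\n")
----
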
. In {kib}, open the main menu and click **Stack Management** > **Ingest
Pipelines**.
+
[role="screenshot"]
image::images/ingest/ingest-pipeline-list.png[Kibana's Ingest Pipelines list view,align="center"]
. Click **Create pipeline > New pipeline**.
. Set **Name** to `my-pipeline` and optionally add a description for the
pipeline.
. Add a grok processor to parse the log message:
.. Click **Add a processor** and select the **Grok** processor type.
.. Set **Field** to `message` and **Patterns** to the following
grok pattern:
+
[source,grok]
----
%{IPORHOST:source.ip} %{USER:user.id} %{USER:user.name} \[%{HTTPDATE:@timestamp}\] "%{WORD:http.request.method} %{DATA:url.original} HTTP/%{NUMBER:http.version}" %{NUMBER:http.response.status_code:int} (?:-|%{NUMBER:http.response.body.bytes:int}) %{QS:http.request.referrer} %{QS:user_agent}
----
// NOTCONSOLE
+
.. Click **Add** to save the processor.
.. Set the processor description to `Extract fields from 'message'`.
. Add processors for the timestamp, IP address, and user agent fields. Configure
the processors as follows:
+
--
[options="header"]
|====
| Processor type | Field | Additional options | Description
| Date
| `@timestamp`
| **Formats**: `dd/MMM/yyyy:HH:mm:ss Z`
| `Format '@timestamp' as 'dd/MMM/yyyy:HH:mm:ss Z'`
| GeoIP
| `source.ip`
| **Target field**: `source.geo`
| `Add 'source.geo' GeoIP data for 'source.ip'`
| User agent
| `user_agent`
|
| `Extract fields from 'user_agent'`
|====

Your form should look similar to this:

[role="screenshot"]
image::images/ingest/ingest-pipeline-processor.png[Processors for Ingest Pipelines,align="center"]

The four processors will run sequentially: +
Grok > Date > GeoIP > User agent +
You can reorder processors using the arrow icons.

Alternatively, you can click the **Import processors** link and define the
processors as JSON:

[source,js]
----
{
include::common-log-format-example.asciidoc[tag=common-log-pipeline]
}
----
// NOTCONSOLE
////
[source,console]
----
PUT _ingest/pipeline/my-pipeline
{
// tag::common-log-pipeline[]
"processors": [
{
"grok": {
"description": "Extract fields from 'message'",
"field": "message",
"patterns": ["%{IPORHOST:source.ip} %{USER:user.id} %{USER:user.name} \\[%{HTTPDATE:@timestamp}\\] \"%{WORD:http.request.method} %{DATA:url.original} HTTP/%{NUMBER:http.version}\" %{NUMBER:http.response.status_code:int} (?:-|%{NUMBER:http.response.body.bytes:int}) %{QS:http.request.referrer} %{QS:user_agent}"]
}
},
{
"date": {
"description": "Format '@timestamp' as 'dd/MMM/yyyy:HH:mm:ss Z'",
"field": "@timestamp",
"formats": [ "dd/MMM/yyyy:HH:mm:ss Z" ]
}
},
{
"geoip": {
"description": "Add 'source.geo' GeoIP data for 'source.ip'",
"field": "source.ip",
"target_field": "source.geo"
}
},
{
"user_agent": {
"description": "Extract fields from 'user_agent'",
"field": "user_agent"
}
}
]
// end::common-log-pipeline[]
}
----
// TEST[skip:This can output a warning, and asciidoc doesn't support allowed_warnings]
////
--
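+
--
The **Formats** value for the date processor uses Java date pattern letters.
As a rough cross-check outside {es}, this Python sketch parses the sample
timestamp with the closest `strptime` equivalent and renders it as an ISO 8601
UTC string. The `STRPTIME_FORMAT` constant and `to_iso8601` helper are
hypothetical names used for illustration, and `%b` assumes an English locale
for month abbreviations.

[source,python]
----
from datetime import datetime, timezone

# Closest strptime equivalent of the format `dd/MMM/yyyy:HH:mm:ss Z`
# (an approximation for illustration; Elasticsearch uses Java patterns).
STRPTIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def to_iso8601(raw):
    """Parse a Common Log Format timestamp and render it as ISO 8601 UTC."""
    parsed = datetime.strptime(raw, STRPTIME_FORMAT)
    utc = parsed.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


print(to_iso8601("05/May/2099:16:21:15 +0000"))
----

For the sample timestamp, this prints `2099-05-05T16:21:15.000Z`, the same
`@timestamp` value that appears in the search response at the end of this
tutorial.
--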
. To test the pipeline, click **Add documents**.
. In the **Documents** tab, provide a sample document for testing:
+
[source,js]
----
[
{
"_source": {
"message": "212.87.37.154 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
}
}
]
----
// NOTCONSOLE
. Click **Run the pipeline** and verify the pipeline worked as expected.
. If everything looks correct, close the panel, and then click **Create
pipeline**.
+
You’re now ready to index the log data to a data stream.

. Create an index template with
data stream enabled.
+
[source,console]
----
PUT _index_template/my-data-stream-template
{
"index_patterns": [ "my-data-stream*" ],
"data_stream": { },
"priority": 500
}
----
// TEST[continued]
. Index a document with the pipeline you created.
+
[source,console]
----
POST my-data-stream/_doc?pipeline=my-pipeline
{
"message": "89.160.20.128 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
}
----
// TEST[s/my-pipeline/my-pipeline&refresh=wait_for/]
// TEST[continued]
. To verify, search the data stream to retrieve the document. The following
search uses `filter_path` to return only
the document source.
+
--
[source,console]
----
GET my-data-stream/_search?filter_path=hits.hits._source
----
// TEST[continued]

The API returns:

[source,console-result]
----
{
"hits": {
"hits": [
{
"_source": {
"@timestamp": "2099-05-05T16:21:15.000Z",
"http": {
"request": {
"referrer": "\"-\"",
"method": "GET"
},
"response": {
"status_code": 200,
"body": {
"bytes": 3638
}
},
"version": "1.1"
},
"source": {
"ip": "89.160.20.128",
"geo": {
"continent_name" : "Europe",
"country_name" : "Sweden",
"country_iso_code" : "SE",
"city_name" : "Linköping",
"region_iso_code" : "SE-E",
"region_name" : "Östergötland County",
"location" : {
"lon" : 15.6167,
"lat" : 58.4167
}
}
},
"message": "89.160.20.128 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\"",
"url": {
"original": "/favicon.ico"
},
"user": {
"name": "-",
"id": "-"
},
"user_agent": {
"original": "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\"",
"os": {
"name": "Mac OS X",
"version": "10.11.6",
"full": "Mac OS X 10.11.6"
},
"name": "Chrome",
"device": {
"name": "Mac"
},
"version": "52.0.2743.116"
}
}
}
]
}
}
----
--
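
The response nests fields such as `http.request.method` because ingest
processors treat each dot in a field name as a step into a nested object. The
following Python sketch models that expansion in a simplified way. It is not
{es} code, and the `set_dotted` helper is a hypothetical name introduced only
for this illustration.

[source,python]
----
def set_dotted(doc, path, value):
    """Store value in doc, treating each dot in path as a nested object."""
    *parents, leaf = path.split(".")
    node = doc
    for key in parents:
        # Create the intermediate object if it does not exist yet.
        node = node.setdefault(key, {})
    node[leaf] = value
    return doc


doc = {}
set_dotted(doc, "http.request.method", "GET")
set_dotted(doc, "http.response.status_code", 200)
set_dotted(doc, "url.original", "/favicon.ico")
print(doc)
----

The resulting dictionary has the same shape as the `http` and `url` objects in
the response above.
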
////
[source,console]
----
DELETE _data_stream/*
DELETE _index_template/*
----
// TEST[continued]
////