70 commits

Author SHA1 Message Date
Tray Torrance
10cd07c809 Add UTC to the TZ grok pattern 2013-08-26 09:43:59 -07:00
Hugo Lopes Tavares
1e8f5d8b10 Add "emergency" to LOGLEVEL grok pattern
Apache, nginx, syslog, and many systems use emergency level,
and it was missing in logstash.

Also add tests to cover all scenarios of `LOGLEVEL` expansion.
2013-08-02 11:24:12 -04:00
Jordan Sissel
48409efc59 Revert "Update HOSTNAME in grok-patterns"
This reverts commit a17f72150d.
This change caused a syntax error in the HOSTNAME pattern I believe.
2013-06-26 15:06:28 -07:00
Jordan Sissel
93fe8c011f Merge pull request #520 from erezzarum/fix-pattern
Europe date metric compliance is dd/mm/yyyy
2013-06-23 23:32:23 -07:00
Erez Zarum
c113556765 Europe date metric compliance is dd/mm/yyyy 2013-06-17 19:27:33 +00:00
xiaclo
a17f72150d Update HOSTNAME in grok-patterns
RFC952 states of a hostname: "The last character must not be a minus sign or period."
https://tools.ietf.org/html/rfc952

Some of the limitations in RFC952 were lifted by RFC1123, but not this one.
https://tools.ietf.org/html/rfc1123

The updated regex still allows single character hostnames, but does not allow the final character in any section to be a '-'.
2013-06-10 14:11:43 +10:00
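The rule described in a17f72150d (single-character hostnames allowed, no section ending in '-') can be sketched as a Python regex. This is only an illustration of the idea, not the pattern that was committed to grok-patterns; the names LABEL, HOSTNAME and is_hostname are hypothetical.

```python
import re

# Hypothetical label rule: begins and ends with an alphanumeric character;
# hyphens may appear only in the interior. A single character is a valid label.
LABEL = r"[0-9A-Za-z](?:[0-9A-Za-z-]*[0-9A-Za-z])?"

# A hostname is one or more labels joined by periods (no trailing period).
HOSTNAME = re.compile(rf"{LABEL}(?:\.{LABEL})*")


def is_hostname(text: str) -> bool:
    """Return True when the whole string satisfies the sketch above."""
    return HOSTNAME.fullmatch(text) is not None
```

Using fullmatch keeps the check strict: a trailing '-' or '.' anywhere makes the whole string fail rather than matching a shorter prefix.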
Oluf Lorenzen
2bf6a9c0d6 make numbers match w/o word boundaries 2013-04-22 18:24:58 +03:00
Oluf Lorenzen
19f3bf2fb3 fix TTY (make subdir optional)
Seems I did not test the other patch.
2013-04-22 17:34:50 +03:00
Oluf Lorenzen
a49c52aab9 fix typo 2013-04-22 17:27:18 +03:00
Oluf Lorenzen
17c1ca2deb shorten/cleanup/fix TTY-pattern
Removed the BSD/Linux-specific TTYs, as there are several more TTY names, even under Linux, than /dev/pts/${NONNEGINT}.
This also allows
 * "/dev/ttyUSB0"
 * "/dev/ttyS0"
2013-04-18 19:15:03 +03:00
emergion
0ea3cbca40 Periods are common in usernames, allowed in most cases and RFC2617 thinks they are ok 2013-03-14 17:18:55 +11:00
Jordan Sissel
0503b11260 Merge pull request #316 from xiaclo/patch-2
Update patterns/grok-patterns
2013-02-27 09:00:31 -08:00
alexkoltun
9d26770a5b Update patterns/grok-patterns
Fix the hour pattern to accept single-digit hours. This fixes an issue with timestamps such as "2013-02-21 6:23:46".
2013-02-21 09:42:39 +02:00
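A minimal sketch of an hour pattern that accepts both "6" and "06", written as a Python regex. The exact pattern in grok-patterns may differ; HOUR, TIME and find_time are names assumed for this example.

```python
import re

# Assumed hour rule: 20-23, or an optional leading 0/1 followed by one digit,
# so "6", "06" and "16" are all accepted.
HOUR = r"(?:2[0-3]|[01]?[0-9])"

# The lookarounds stop the match from starting or ending inside a longer number.
TIME = re.compile(rf"(?<![0-9]){HOUR}:[0-5][0-9]:[0-5][0-9](?![0-9])")


def find_time(line: str):
    """Return the first H:MM:SS or HH:MM:SS substring, or None."""
    match = TIME.search(line)
    return match.group(0) if match else None
```

With the timestamp from the commit message, find_time("2013-02-21 6:23:46") returns "6:23:46".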
Aaron Blew
e2a29e159f Added : as a valid separator between seconds and subseconds 2013-01-24 17:22:31 -08:00
xiaclo
c070cbd055 Update patterns/grok-patterns
This is a personal preference, but for web logs, I prefer the parser to capture what it can.  Currently with an invalid request, it fails completely rather than capturing the other log information such as date, bytes transferred and HTTP status.

This patch captures the invalid request into @fields.rawrequest and leaves @fields.verb, @fields.request and @fields.httpversion as nulls if it cannot be properly parsed.

Here is a sample of invalid requests I have from my logs:
115.70.170.86 - - [31/Oct/2012:06:41:24 +1100] "G" 408 0 "-" "-"
165.86.71.20 - - [31/Oct/2012:04:27:01 +1100] "GET http://dis.us.criteo.com/dis/dis.aspx?&t1=sendEvent&c=2&p=3937&p1=v%3D2%26wi%3D7715628%26pt1%3D0%26pt2%3D1%26si%3D1&cb=21664477550&ref=&sc_r=1280x1024&sc_d=32 HTTP/1.0" 400 672 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; .NET4.0E)"

Obviously these are not valid requests, and I prefer to handle them this way, but the change is up to you.
2013-01-14 14:39:03 +11:00
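The fallback described in c070cbd055 can be sketched with a plain alternation: try to parse verb, request and HTTP version first, and capture the raw text only when that fails. The Python regex below is an assumption made for illustration, not the grok pattern itself; REQUEST and parse_request are hypothetical names.

```python
import re

# First alternative: a well-formed request line, HTTP version optional.
# Second alternative: anything else between the quotes, kept as rawrequest.
REQUEST = re.compile(
    r'"(?:(?P<verb>\w+) (?P<request>\S+)(?: HTTP/(?P<httpversion>\d+(?:\.\d+)?))?'
    r'|(?P<rawrequest>[^"]*))"'
)


def parse_request(line: str):
    """Return the named groups for the first quoted field, or None."""
    match = REQUEST.search(line)
    return match.groupdict() if match else None
```

Groups from the alternative that did not match come back as None, which mirrors the "leave verb, request and httpversion as nulls" behaviour the commit describes.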
xiaclo
3c89bea927 Update patterns/grok-patterns
The hyphens in the regexes are creating ranges and need to be escaped.  Without this change, parsing fails for logs containing URIs such as:

/test/page.html?arg=hypenated-arg
2013-01-11 12:04:14 +11:00
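To illustrate why an unescaped hyphen inside a character class is a problem, here is a small Python sketch. The character classes are made up for the example and are not the URIPARAM pattern; UNESCAPED and ESCAPED are hypothetical names.

```python
import re

# In "[a-z=&.-_]" the sequence ".-_" is read as a range from "." (0x2E)
# to "_" (0x5F). That range covers digits and uppercase letters but NOT
# the hyphen itself (0x2D).
UNESCAPED = re.compile(r"\?[a-z=&.-_]*")

# Escaping the hyphen makes it a literal member of the class.
ESCAPED = re.compile(r"\?[a-z=&.\-_]*")

query = "?arg=hypenated-arg"
unescaped_ok = UNESCAPED.fullmatch(query) is not None
escaped_ok = ESCAPED.fullmatch(query) is not None
```

The accidental range cuts both ways: it rejects the hyphenated argument, and it silently accepts characters such as uppercase letters that the class was never meant to include.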
Frank Rosquin
698baed405 Fixed year pattern.
Year was matching any digit, one or more times. This could lead to way
too eager matching.

Match years as either a group of 2, or a group of 4 digits.
2013-01-08 15:45:46 +01:00
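The difference between the old and new year matching can be sketched in Python as follows. The regexes are assumptions that follow the description in the commit message (any run of digits versus one or two groups of two digits); EAGER_YEAR, YEAR and leading_year are hypothetical names.

```python
import re

# Old behaviour as described: any digit, one or more times.
EAGER_YEAR = re.compile(r"\d+")

# New behaviour as described: a group of two digits, taken once or twice.
YEAR = re.compile(r"(?:\d\d){1,2}")


def leading_year(pattern, text: str):
    """Return what the pattern consumes at the start of text, or None."""
    match = pattern.match(text)
    return match.group(0) if match else None
```

On a compact timestamp such as "20130108154546", the eager pattern swallows the whole string, while the bounded pattern stops after "2013".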
Jordan Sissel
124a14461f Add '.' as a valid date separator for EU dates (requested by rarruda in irc) 2012-12-21 01:34:09 -08:00
Avishai Ish-Shalom
9d5649b845 fixed missing | 2012-12-04 22:41:12 +02:00
Avishai Ish-Shalom
e3a250e9bc Added TRACE to LOGLEVEL 2012-12-04 22:33:47 +02:00
MikeSchuette
e25a7701de Match invalid URI characters in COMBINEDAPACHELOG
Apache generally logs whatever is requested, which is not guaranteed to be valid.
2012-11-27 13:56:59 -06:00
MikeSchuette
cd0e08e29d Fix URIPARAM to allow square brackets
PHP uses these all the time.
2012-11-27 11:55:20 -06:00
Jordan Sissel
919329320c - Use atomic grouping for PATH and its siblings. Fixes LOGSTASH-701 2012-11-13 13:06:13 -08:00
Jordan Sissel
68258c1944 fix spec/examples/parse-apache-logs failure due to QUOTEDSTRING not matching empty "" 2012-10-28 21:25:09 -07:00
Jordan Sissel
6f74511067 - use atomic groups (no backtracking) in QUOTEDSTRING - should prevent
some additional watchdog timeouts due to Oniguruma getting stuck.
  LOGSTASH-644
2012-10-24 17:54:14 -07:00
olagache
71f471c60b Update patterns/grok-patterns 2012-09-27 18:28:46 +03:00
Jordan Sissel
06f91394c6 Hopefully fix some apache parsing issues 2012-09-26 23:08:03 -07:00
Matthew Baxa
528daa1114 Added '?' to URIPARAM
Added the '?' character to URIPARAM to handle an edge case
2012-09-26 15:14:00 -05:00
Jordan Sissel
99d88eb0ae - facility/severity can be zero. 2012-09-10 20:26:16 -07:00
Jordan Sissel
481472ec0c - don't capture 'ZONE' by name. (LOGSTASH-251) 2012-09-08 11:23:32 -07:00
Corry Haines
a0cea051a0 Add FATAL loglevel to grok pattern
It may not be in syslog, but it is somewhat common.
2012-08-14 12:36:50 -07:00
Chris Mague
0b8e3ee904 Update patterns/grok-patterns
Add pipes as an acceptable character in URIPARAM as some sites use them.

eg http://b.foo.com/shop/uk/fr/omesuff?iid=Suff%20tail|foo|%2Fbuy%2Fuk%2Ffr%2F22
2012-08-13 14:11:59 -07:00
John A. Barbuto
a411cdca0d Added NONNEGINT to patterns
Commit e62536a introduced a complication: there are times when one
wants to match against zero as well as the positive integers (such
as in the LINUXTTY pattern).  For these times, NONNEGINT can be used.

Existing users of POSINT might continue to expect zero to match, so
this change should probably be mentioned in the release notes (on the
other hand, some could be using POSINT without wanting it to match
zero, as happened to me).

Ref: Paragraph 3 of http://en.wikipedia.org/wiki/Natural_number
2012-06-22 12:01:26 -07:00
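The distinction between the two patterns can be sketched in Python. The regexes below are assumptions that follow the commit messages (POSINT excludes zero, NONNEGINT includes it) and are not copied from grok-patterns.

```python
import re

# Assumed POSINT: a whole number whose first digit is 1-9, so "0" is rejected.
POSINT = re.compile(r"\b[1-9][0-9]*\b")

# Assumed NONNEGINT: any run of digits, so "0" is accepted.
NONNEGINT = re.compile(r"\b[0-9]+\b")


def first_int(pattern, text: str):
    """Return the first integer the pattern finds in text, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None
```

A TTY path such as "/dev/pts/0" shows the complication the commit mentions: first_int(POSINT, "/dev/pts/0") finds nothing, while first_int(NONNEGINT, "/dev/pts/0") returns "0".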
Pete Fritchman
e9cd3446fb Merge commit 'e62536a' 2012-06-22 09:52:54 -04:00
John A. Barbuto
e62536a614 Zero isn't a positive integer :) 2012-06-19 18:49:05 -07:00
Pete Fritchman
5f8ac852e5 Merge remote-tracking branch 'blewa/master' 2012-06-18 12:14:43 -04:00
Pete Fritchman
584d07de36 Merge pull request #158 from prune998/patch-1
Changed the PROG pattern to match Cisco PROG name starting with a percen...
2012-06-18 01:27:11 -07:00
Jeremiah Shirk
15f7567389 Allow HTTP version to be absent in apache logs of HTTP/1.0 requests 2012-06-18 04:18:16 -04:00
Jeremiah Shirk
18307bdca0 Apache log can have "-" for the request on a 408 (timeout) 2012-06-18 04:17:37 -04:00
Jeremiah Shirk
6c1b208ab9 Add ; and = to support URI path segment parameters
RFC 3986 (the URI specification) describes the , ; and =
characters used for including parameters in path segments.
Typically these are seen only on the final segment, just before
any query parameters, i.e.
    http://www.site.com/path1/path2;jsessionid=OI24B9ASD7BSSD

Adding ; and = to the regex, as , is already included
2012-06-18 04:16:49 -04:00
prune998
c1c1f443c8 Changed the PROG pattern to match Cisco PROG name starting with a percent (%).
Description : 
Usual syslog message :
<85>Jun 14 15:19:47 localhost sudo:     root : TTY=pts/1 ; PWD=/opt/logstash ; USER=root ; COMMAND=/bin/bash

Cisco typical message :
<166> Jun 14 15:30:00 10.100.252.52 %ASA-6-302021:  Teardown ICMP connection for faddr 10.100.120.120/0 gaddr 10.100.252.1/0 laddr 10.100.252.1/0

----> the program name starts with a %

This can be reproduced by sending a manual syslog message with a Python script:


import logging
from logging.handlers import SysLogHandler

# Usual syslog message (commented out):
#message='Jun 14 15:19:47 localhost sudo:     root : TTY=pts/1 ; PWD=/opt/logstash ; USER=root ; COMMAND=/bin/bash'
# Cisco-style message whose program name starts with '%':
message=' Jun 15 09:47:36 10.100.252.1 %ASA-6-111116:  Teardown UDP connection 6201992 for internet:192.168.1.1/1026 to interne:10.100.120.120/427 duration 0:02:04 bytes 588'

logger = logging.getLogger()
logger.setLevel(logging.INFO)
# Send to the listener on localhost:5544 (SysLogHandler uses UDP by default).
syslog = SysLogHandler(address=('localhost',5544))
#syslog = SysLogHandler(address='/dev/log')
#formatter = logging.Formatter('%(name)s: %(levelname)s %(message)s')
#syslog.setFormatter(formatter)
logger.addHandler(syslog)
logger.warning(message)

This leads to a "NOT SYSLOG" message in the logs and no @fields{} values.

With this change the fields are OK and "NOT SYSLOG" message is gone. I still have a "@tags":["_grokparsefailure"], error though...
2012-06-15 11:43:13 -03:00
Nicholas Padilla
6644013a07 Merge remote-tracking branch 'upstream/master'
Conflicts:
	CONTRIBUTORS
	Makefile
	lib/logstash/outputs/elasticsearch_http.rb
	lib/logstash/outputs/sns.rb
	test/logstash/outputs/sns_test.rb
	test/test_helper.rb
2012-05-22 10:04:17 -06:00
Jordan Sissel
b9c8d269f5 - Fix QUOTEDSTRING pattern (LOGSTASH-446)
In some cases, Oniguruma gets confused about negative matches, so
  previously a pattern of '%{QS} something', on a false match, would
  cause Oniguruma to loop frantically. I haven't yet dug into
  the part of Oni that does this, but it's common for some regexp
  engines to have this behavior. The easy fix is moving to non-backtracking
  matches.
2012-05-13 01:14:43 -07:00
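What "non-backtracking" means here can be shown with a small Python sketch. It does not reproduce the QUOTEDSTRING pattern or the Oniguruma behaviour. It only demonstrates that an atomic group never gives characters back once it has matched. Python's re module has native (?>...) syntax from 3.11 onwards, so the sketch emulates the atomic group with a lookahead plus backreference so that it also runs on older versions. BACKTRACKING, ATOMIC and matches are hypothetical names.

```python
import re

# Ordinary group: \d+ can give back its last digit so that \w still matches.
BACKTRACKING = re.compile(r"\d+\w")

# Emulated atomic group: the lookahead captures the longest digit run once,
# and the backreference consumes exactly that run. Lookaheads are not
# re-entered on failure, so the run can never be shortened afterwards.
ATOMIC = re.compile(r"(?=(\d+))\1\w")


def matches(pattern, text: str) -> bool:
    """Return True when the pattern matches anywhere in text."""
    return pattern.search(text) is not None
```

On "1234" the ordinary pattern succeeds by backtracking one digit, while the atomic version fails immediately. On "1234a" both succeed. Failing fast instead of retrying shorter alternatives is the property the commit relies on.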
Nicholas Padilla
473fa4541a rebase with upstream/master 2012-05-10 08:55:33 -06:00
Aaron Blew
f4ddbc051c Added UUID type
Added {} to URIPATH and URIPARAM patterns
2012-05-04 16:02:25 -07:00
Jordan Sissel
67495940b8 Merge pull request #134 from shaftoe/master
Adding a grok pattern that I use to parse log levels (INFO, Warning, err, ...)
2012-04-28 18:57:04 -07:00
Robin Bowes
df33b88f3a Add ; to chars allowed in URIPATH 2012-04-17 14:18:43 +00:00
Alexander Fortin
670c99ec87 Adding LOGLEVEL to grok-patterns 2012-03-31 18:31:57 +02:00
Jelle Smet
5514388117 Added space filter 2012-01-20 21:35:56 +01:00
Jordan Sissel
1c9b2ff4c9 - merge ruby-grok's pattern data in again after some fixes 2011-12-17 15:25:36 -08:00