Commit graph

10842 commits

Author SHA1 Message Date
Pete Fritchman
e8efeea14a - move on-disk indexes to $LOGSTASH_HOME/var/indexes/log:type
- move index creation into the ::Log module
- calculate the next synctime after we run our .commits
2009-08-17 02:53:56 +00:00
Jordan Sissel
f8c0627acd - Move MessageStream class to its own file 2009-08-16 18:35:29 +00:00
Jordan Sissel
6af4e677a4 - Rather than having message handlers return a single message response, let's
support streaming multiple responses for a request and yield messages instead
  of returning them.
2009-08-14 09:31:45 +00:00
Jordan Sissel
6cdb98559a - Add LogStash::Net::MessageClientConnectionReset
- Add LogStash::Net::NoSocket
- Add special 'signal' socketpair that is always used in select() as a reader
  so that we can, from sendmsg(), notify select() that we should end and loop
  again
- MessageSocketMux#connect now returns true if the connection succeeded. An
  exception is thrown otherwise.
- Only include writers who have populated output streams.
- If we had a client receiver (via #connect(..)) before, but it went away,
  assume the connection was destroyed/reset by some other means and that this is
  an error to be handled by the client.
- Don't immediately add a socket to the @writers list when it's created.
- If _sendmsg has a socket that is nil, raise NoSocket. If _sendmsg is called
  with a nil socket, and @receiver (created through #connect(...)) is also nil,
  this is an error and we throw NoSocket.
- Delete @receiver if it is a socket that is being removed.
- Delete socket from @writers once we flush its data out the socket.
- Raise MessageClientConnectionReset on EOF/IOError or ECONNRESET
- indexer: Sync index every 60 seconds
- Add alpha version of Agent. It watches /var/log/messages.

Tested:
  - Both the agent and indexer server are capable of recovering from disconnections.
  - searcher test code (sandbox/searchclient.rb) works while Agent is feeding
    the indexer
2009-08-14 09:05:44 +00:00
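
The 'signal' socketpair described in 6cdb98559a can be sketched roughly as follows. This is a minimal illustration, not the LogStash source: the class `WakeableLoop`, the method `run_once`, and the `@pending` queue are hypothetical names, and only the idea (a socketpair whose read end always sits in the select() read set, so that sendmsg() can wake the loop) comes from the commit message.

```ruby
require "socket"

# Hypothetical sketch of a select() loop that can be woken from sendmsg().
class WakeableLoop
  def initialize
    # One end is always passed to select() as a reader; the other end is
    # written to whenever new outbound work is queued.
    @signal_reader, @signal_writer = UNIXSocket.pair
    @pending = []
    @lock = Mutex.new
  end

  # Queue a message, then write one byte so a blocked select() returns.
  def sendmsg(message)
    @lock.synchronize { @pending << message }
    @signal_writer.write("x")
  end

  # One loop iteration: wait until signalled (or until the timeout expires),
  # discard the wake-up bytes, and hand back whatever was queued.
  def run_once(timeout = nil)
    readable, = IO.select([@signal_reader], nil, nil, timeout)
    return [] if readable.nil? # timed out, nothing was signalled

    @signal_reader.read_nonblock(4096)
    @lock.synchronize do
      drained = @pending.dup
      @pending.clear
      drained
    end
  end
end
```

In a real multiplexer the signal reader would be one entry among the client sockets in the read set; the sketch keeps only the wake-up path.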
Jordan Sissel
ac4298f2d8 - clean up whitespace 2009-08-14 07:38:48 +00:00
Jordan Sissel
f6df3a60ec - Add stub for success? method to ResponseMessage 2009-08-14 07:38:30 +00:00
Jordan Sissel
3b657e8e69 - Add MessageClient and MessageServer. These are just empty subclasses of
MessageSocketMux, but the subclass names detail the intent much more clearly.
  Later, we may want to move MessageSocketMux#listen to MessageServer, and
  #connect to MessageClient.
- Some style fixings:
  * All private methods are prefixed with 'private' on the preceding line.
  * End all methods with 'end # def <method name>'
  * Make methods private that should be private
  * Add documentation to many methods
- Add array support to MessageSocketMux#_sendmsg (and thus #sendmsg). This lets
  a client internally queue things and send multiple messages to the output
  queue at the same time.
- Add 'success?' method to ResponseMessage and IndexEventResponse.
- Raise exception if we try to read a message that is too large and is likely
  incorrect/corrupt.
- Refactor MessageReader#each to be much more concise
- Added MessageReader#get (and private methods #ready? and #next_length)
2009-08-14 07:19:36 +00:00
Greg Retkowski
ff21452966 Slightly improved INSTALL
a few fixes to module references
require rubygems to get at json gem
2009-08-14 06:23:59 +00:00
Jordan Sissel
59279531b3 - Add Search{Message,Response}
- Support SearchRequest in indexer server
- Add playtest searchclient.rb
2009-08-12 06:17:58 +00:00
Jordan Sissel
af54d862d6 - sandbox client should ping and try to index log lines
- remove blank line in srv.rb
2009-08-11 07:11:22 +00:00
Jordan Sissel
f5da7b3420 - Fully remove a sock if we get EPIPE or ECONNRESET
- Add Ping{Request,Response} message
- Better MessageReader buffer handling (handle multiple message blocks if we can)
2009-08-11 07:11:01 +00:00
Jordan Sissel
3ca59b3dbb - Make the Indexer server actually index logs
- Also add ping handling (this should be refactored into another class that can
  be included to add ping support to any server)
2009-08-11 07:09:18 +00:00
Jordan Sissel
e2d87ac44b - Reliable networking stuff again. I can send 10000 messages and get them all
ACK'd using the test client and server.
- Comment out old debugging stuff
- IO.select() with nil timeout (block until there is data)
- Split MessageSocketMux#remove into remove_writer and remove_reader because
  quite often we only want to close one.
- In MessageReader, since we are doing buffered IO, we need to defer any system
  EOFError exceptions until our buffer is exhausted.
2009-08-10 09:00:16 +00:00
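
The EOF deferral mentioned in e2d87ac44b can be illustrated with the sketch below. It is an assumption-laden example, not the actual MessageReader: the class name `BufferedFrameReader` and the wire format (a 4-byte big-endian length prefix before each payload) are invented for illustration, since the commit log does not state the real framing. The point shown is that an EOF from the underlying stream is recorded rather than raised, and only surfaces once the buffer no longer holds a complete message.

```ruby
require "stringio"

# Hypothetical reader; the 4-byte length prefix is an assumed wire format.
class BufferedFrameReader
  def initialize(io)
    @io = io
    @buffer = "".b
    @eof = false
  end

  # Return the next payload. EOFError is raised only when the stream has
  # ended AND the buffer cannot supply another complete frame.
  def get
    until ready?
      raise EOFError, "stream closed" if @eof
      fill
    end
    length = next_length
    @buffer.slice!(0, 4)       # drop the length prefix
    @buffer.slice!(0, length)  # return the payload
  end

  private

  def next_length
    @buffer.unpack1("N")
  end

  def ready?
    @buffer.bytesize >= 4 && @buffer.bytesize >= 4 + next_length
  end

  # Read a large chunk; remember EOF instead of raising it immediately.
  def fill
    @buffer << @io.readpartial(16_384)
  rescue EOFError
    @eof = true
  end
end
```

A short usage example, with two frames followed by end of stream:

```ruby
wire = [5].pack("N") + "hello" + [3].pack("N") + "abc"
reader = BufferedFrameReader.new(StringIO.new(wire))
reader.get # first payload
reader.get # second payload; a third call raises EOFError
```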
Jordan Sissel
ed69d85b19 - Generate message ID only when needed
- Lock around critical sections (message handling, sendmsg, etc)

Having some weird error where we are handling closes incorrectly.
Nondeterministic behavior observed by the client. (Close before all messages received, etc.)
2009-08-10 07:43:10 +00:00
Jordan Sissel
ab6c2e0177 - Add MessageReader class for extrapolating network stream parsing
- Move message ID sequencing to RequestMessage (Responses shouldn't generate
  new IDs under the current protocol)
2009-08-10 07:11:20 +00:00
Jordan Sissel
3bff2c835d - remove unused module 2009-08-10 04:35:23 +00:00
Jordan Sissel
216673ddd3 - update comment 2009-08-10 04:34:32 +00:00
Jordan Sissel
70fe2e37bf - comment out debugging 2009-08-10 03:41:32 +00:00
Jordan Sissel
005ae3a41c - Buffer messages with sendmsg() and flush them when select() says we can write
to a socket.
- MessageSocketMux#run now will return if there's no work left to do (like if
  our socket is dead)
- Updated client play code to quit when we get all messages ack'd
2009-08-10 03:40:57 +00:00
Jordan Sissel
716b15f002 - Comment-out some debug lines 2009-08-10 02:31:01 +00:00
Jordan Sissel
539e58d006 - Refactor ::Net::Server into ::Net::SocketMux. This class is used for both
client and server networking.

  A SocketMux can handle any message type. The only difference between a client
  and a server is who initiates the TCP connection.
    Clients call SocketMux#connect
    Servers call SocketMux#listen
- Only strip upper-ascii in message values that are strings
- Shuffle namespacing around. Flat LogStash to hierarchical LogStash::Net::...
- Add stub Indexer server
2009-08-10 02:29:25 +00:00
Pete Fritchman
40f830d85a - capture pid in SYSLOGPROG
- add some linux-syslog patterns
- sample config for linux-syslog
2009-08-10 02:13:05 +00:00
Pete Fritchman
68183f9c6a - revive test_parse_entry
- LOGSTASH_HOME
- don't display @LINE
2009-08-10 02:12:13 +00:00
Pete Fritchman
ee69c120c7 - OK, so I'm crazy. File load order does not matter, and it shouldn't. 2009-08-10 02:05:58 +00:00
Pete Fritchman
3d1745efc4 - work around a weird grok bug (?), load patterns in sort order, and
always load the grok-patterns first
2009-08-10 01:59:51 +00:00
Pete Fritchman
064721b299 - refactor Log's required/optional keys for less duplication
- handle all date stuff in Log, not the sub-classes
- use "encoding" and "type" properly, rather than "type" and "name" 
- switch to having a group of Groks, and doing first-pattern-match
- make example config.rb a little easier to read
2009-08-10 00:42:48 +00:00
Jordan Sissel
937bfb3c11 - When setting message values, always strip upper ascii (byte >= 128) values
because JSON.dump assumes UTF-8
- server: when we decode a message, if we have a handler for <msgname>Handler,
  we should call that handler with the message as an argument.
2009-08-10 00:18:14 +00:00
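
The byte stripping described in 937bfb3c11 might look something like the sketch below. The helper name `strip_upper_ascii` is hypothetical, and passing non-string values through unchanged is a choice made for this example; only the rule itself (drop bytes >= 128 so that JSON.dump sees valid UTF-8) comes from the commit message.

```ruby
require "json"

# Hypothetical helper: remove every byte >= 128 from a string value so the
# result is plain ASCII, which is always valid UTF-8 for JSON.dump.
def strip_upper_ascii(value)
  return value unless value.is_a?(String)

  value.bytes.select { |byte| byte < 128 }.pack("C*").force_encoding("UTF-8")
end

# Example: a Latin-1 encoded log line that is not valid UTF-8.
line = "caf\xE9 ok".b
puts JSON.dump({ "message" => strip_upper_ascii(line) })
```

This is lossy (the accented character is dropped rather than transcoded), which matches the commit's stated behavior but not necessarily what a later version would want.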
Pete Fritchman
d5c3a36087 - move everything to the LogStash namespace 2009-08-09 23:29:44 +00:00
Pete Fritchman
f622532240 - move grok-patterns to patterns/ subdir
- load all pattern files under patterns/
- grok captures don't include @LINE, so put it there ourselves
- properly filter grok captures per comments
- tear out grok cmdline stuff
- switch to seconds since epoch for @DATE
- add firewalls pattern with initial netscreen session close pattern
2009-08-09 20:40:36 +00:00
Pete Fritchman
71c6cb138f - all reflected in wiki/Design now 2009-08-09 18:43:00 +00:00
Jordan Sissel
c0fd574685 - Read larger chunks into our buffer when reading messages
- Make the client read from a file and dump 20 messages at a time to the server.
2009-08-09 11:07:48 +00:00
Jordan Sissel
c79ed129cf - MessageStream should keep a count of its messages
- Add MessageStream.clear for wiping messages in the stream.
- Make socket reading faster and more reliable (since read(N) may return less
  than N bytes)

  Fun stats, dumping lines of apache logs into sandbox/srv.rb 
  from sandbox/client.rb:
    1421000 finished @ 163/sec => 8680.2 secs
    1422000 finished @ 163/sec => 8678.5 secs
    1423000 finished @ 163/sec => 8679.1 secs

  8500 lines per second? That's about 2MB/sec. Not bad?
2009-08-09 10:45:12 +00:00
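
The short-read problem noted in c79ed129cf is usually handled with a loop like the one below. This is a generic sketch, not the code from the commit; `read_exactly` is a hypothetical name, and it assumes only that the stream object responds to `readpartial` (as Ruby sockets, pipes, and StringIO do).

```ruby
# Hypothetical helper: keep reading until exactly `length` bytes have
# arrived, because a single read on a socket may return fewer bytes than
# requested. IO#readpartial raises EOFError if the stream ends first.
def read_exactly(io, length)
  data = "".b
  while data.bytesize < length
    data << io.readpartial(length - data.bytesize)
  end
  data
end
```

Called as `read_exactly(socket, 4)` for a length header and then `read_exactly(socket, n)` for the body, this avoids treating a partial read as a complete message.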
Jordan Sissel
45663365a5 - Start work on network layer. Messages are automagically decoded off the wire
to the correct message instance (an IndexEvent request becomes an
  IndexEventRequest instance, etc)
- We use some metaprogramming tricks to bind Message JSON fields to
  functions, see BindToHash and hashbind in net/message.rb
- Protocol versioning is poor right now, but it is present.
- The server code is not well-written, yet. I just wanted real client/server
  encode/decode testing.
- Add some play code to sandbox/
2009-08-09 09:40:57 +00:00
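
The hash-binding trick mentioned in 45663365a5 can be approximated as below. The names `BindToHash` and `hashbind` appear in the commit message, but their real signatures are not shown in this log, so the `hashbind(method, key)` form, the `@data` hash, and the `ExampleMessage` class are all assumptions made for illustration.

```ruby
require "json"

# Assumed shape of the helper: define a reader and a writer that proxy to
# a key in the message's backing hash.
module BindToHash
  def hashbind(method, key)
    define_method(method) { @data[key] }
    define_method("#{method}=") { |value| @data[key] = value }
  end
end

# Hypothetical message class whose fields live in a JSON-friendly hash.
class ExampleMessage
  extend BindToHash

  hashbind :id, "id"
  hashbind :log_type, "log_type"

  def initialize(data = {})
    @data = data
  end

  def to_json(*args)
    @data.to_json(*args)
  end

  def self.from_json(text)
    new(JSON.parse(text))
  end
end
```

With this, `message.log_type = "syslog"` writes straight into the hash that is later serialized, so encoding and decoding need no per-field code.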
Pete Fritchman
e7a6f57274 - respect LOGSTASH_HOME 2009-08-09 03:30:17 +00:00
Pete Fritchman
a6f282c4d7 - let ferret index Arrays, it does the right thing. 2009-08-09 02:28:51 +00:00
Pete Fritchman
3d9cf8e58f - add profiling (temporary, but very useful for now) 2009-08-07 00:52:17 +00:00
Jordan Sissel
127394a84b - Use RubyGrok instead of IO.popen("grok ...")
Requires installed: http://semicomplete.googlecode.com/svn/cgrok/ruby/
- use $HOME for logstash index directory.
- Have import use Time.now.to_f to get higher-precision time values.
2009-08-06 08:39:05 +00:00
Pete Fritchman
b92be62794 - remove the grok teardown stuff, since we're going to be using cgrok's
new ruby api
- ignore fields not explicitly named in grok (i.e. %{FOO} is ignored,
  but %{FOO:bar} is imported as key=bar)
2009-08-06 05:50:52 +00:00
Pete Fritchman
002c080abb - display import rate 2009-08-06 05:38:20 +00:00
Pete Fritchman
944939a7ee - search by @DATE, by default 2009-08-06 05:37:59 +00:00
Pete Fritchman
94944089d9 - actually create an index from the default FieldInfos. now that we're
using these defaults, importing is faster and there is a much better
  log:index size ratio.
2009-08-06 05:37:23 +00:00
Pete Fritchman
82fdc11f2e - early early early-stage logstash prototype 2009-08-05 01:01:23 +00:00