[[geoip-processor]]
=== GeoIP processor
++++
GeoIP
++++

The `geoip` processor adds information about the geographical location of an IPv4 or IPv6 address.

[[geoip-automatic-updates]]
By default, the processor uses the GeoLite2 City, GeoLite2 Country, and GeoLite2 ASN GeoIP2 databases from http://dev.maxmind.com/geoip/geoip2/geolite2/[MaxMind], shared under the CCA-ShareAlike 4.0 license. {es} automatically downloads updates for these databases from the Elastic GeoIP endpoint: https://geoip.elastic.co/v1/database. To get download statistics for these updates, use the <>.

If your cluster can't connect to the Elastic GeoIP endpoint or you want to manage your own updates, see <>.

If {es} can't connect to the endpoint for 30 days, all updated databases become invalid. {es} then stops enriching documents with geoip data and adds a `tags: ["_geoip_expired_database"]` field instead.

[[using-ingest-geoip]]
==== Using the `geoip` Processor in a Pipeline

[[ingest-geoip-options]]
.`geoip` options
[options="header"]
|======
| Name | Required | Default | Description
| `field` | yes | - | The field to get the IP address from for the geographical lookup.
| `target_field` | no | geoip | The field that will hold the geographical information looked up from the MaxMind database.
| `database_file` | no | GeoLite2-City.mmdb | The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the `ingest-geoip` config directory.
| `properties` | no | [`continent_name`, `country_iso_code`, `country_name`, `region_iso_code`, `region_name`, `city_name`, `location`] * | Controls what properties are added to the `target_field` based on the geoip lookup.
| `ignore_missing` | no | `false` | If `true` and `field` does not exist, the processor quietly exits without modifying the document.
| `first_only` | no | `true` | If `true`, only the first geoip data found is returned, even if `field` contains an array.
|======

*Depends on what is available in `database_file`:

* If the GeoLite2 City database is used, then the following fields may be added under the `target_field`: `ip`, `country_iso_code`, `country_name`, `continent_name`, `region_iso_code`, `region_name`, `city_name`, `timezone`, `latitude`, `longitude` and `location`. The fields actually added depend on what has been found and which properties were configured in `properties`.
* If the GeoLite2 Country database is used, then the following fields may be added under the `target_field`: `ip`, `country_iso_code`, `country_name` and `continent_name`. The fields actually added depend on what has been found and which properties were configured in `properties`.
* If the GeoLite2 ASN database is used, then the following fields may be added under the `target_field`: `ip`, `asn`, `organization_name` and `network`. The fields actually added depend on what has been found and which properties were configured in `properties`.

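As a minimal sketch of how the options in the table combine, the following pipeline looks up autonomous system data only. The pipeline name `geoip-asn` and the target field `asn_info` are hypothetical names chosen for illustration; the option names and the `asn` and `organization_name` properties come from the table and list above.

[source,console]
--------------------------------------------------
PUT _ingest/pipeline/geoip-asn
{
  "description" : "Add ASN info (illustrative example)",
  "processors" : [
    {
      "geoip" : {
        "field" : "ip",
        "target_field" : "asn_info",
        "database_file" : "GeoLite2-ASN.mmdb",
        "properties" : ["asn", "organization_name"],
        "ignore_missing" : true
      }
    }
  ]
}
--------------------------------------------------

With `ignore_missing` set to `true`, documents that lack an `ip` field should pass through this pipeline unchanged rather than cause a failure.
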
Here is an example that uses the default city database and adds the geographical information to the `geoip` field based on the `ip` field:

[source,console]
--------------------------------------------------
PUT _ingest/pipeline/geoip
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "ip"
      }
    }
  ]
}
PUT my-index-000001/_doc/my_id?pipeline=geoip
{
  "ip": "8.8.8.8"
}
GET my-index-000001/_doc/my_id
--------------------------------------------------

Which returns:

[source,console-result]
--------------------------------------------------
{
  "found": true,
  "_index": "my-index-000001",
  "_id": "my_id",
  "_version": 1,
  "_seq_no": 55,
  "_primary_term": 1,
  "_source": {
    "ip": "8.8.8.8",
    "geoip": {
      "continent_name": "North America",
      "country_name": "United States",
      "country_iso_code": "US",
      "location": { "lat": 37.751, "lon": -97.822 }
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"_seq_no": \d+/"_seq_no" : $body._seq_no/ s/"_primary_term":1/"_primary_term" : $body._primary_term/]

Here is an example that uses the default country database and adds the geographical information to the `geo` field based on the `ip` field. Note that this database is included in the module.
So this:

[source,console]
--------------------------------------------------
PUT _ingest/pipeline/geoip
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "ip",
        "target_field" : "geo",
        "database_file" : "GeoLite2-Country.mmdb"
      }
    }
  ]
}
PUT my-index-000001/_doc/my_id?pipeline=geoip
{
  "ip": "8.8.8.8"
}
GET my-index-000001/_doc/my_id
--------------------------------------------------

returns this:

[source,console-result]
--------------------------------------------------
{
  "found": true,
  "_index": "my-index-000001",
  "_id": "my_id",
  "_version": 1,
  "_seq_no": 65,
  "_primary_term": 1,
  "_source": {
    "ip": "8.8.8.8",
    "geo": {
      "continent_name": "North America",
      "country_name": "United States",
      "country_iso_code": "US"
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"_seq_no": \d+/"_seq_no" : $body._seq_no/ s/"_primary_term" : 1/"_primary_term" : $body._primary_term/]

Not every IP address has geo information in the database. When a lookup finds nothing, no `target_field` is inserted into the document.
Here is an example of how a document is indexed when information for "80.231.5.0" cannot be found:

[source,console]
--------------------------------------------------
PUT _ingest/pipeline/geoip
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "ip"
      }
    }
  ]
}

PUT my-index-000001/_doc/my_id?pipeline=geoip
{
  "ip": "80.231.5.0"
}

GET my-index-000001/_doc/my_id
--------------------------------------------------

Which returns:

[source,console-result]
--------------------------------------------------
{
  "_index" : "my-index-000001",
  "_id" : "my_id",
  "_version" : 1,
  "_seq_no" : 71,
  "_primary_term": 1,
  "found" : true,
  "_source" : {
    "ip" : "80.231.5.0"
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"_seq_no" : \d+/"_seq_no" : $body._seq_no/ s/"_primary_term" : 1/"_primary_term" : $body._primary_term/]

[[ingest-geoip-mappings-note]]
===== Recognizing Location as a Geopoint

Although this processor enriches your document with a `location` field containing the estimated latitude and longitude of the IP address, this field is not indexed as a {ref}/geo-point.html[`geo_point`] type in Elasticsearch unless you explicitly define it as such in the mapping.
You can use a mapping like the following, shown here for an index named `my_ip_locations`:

[source,console]
--------------------------------------------------
PUT my_ip_locations
{
  "mappings": {
    "properties": {
      "geoip": {
        "properties": {
          "location": { "type": "geo_point" }
        }
      }
    }
  }
}
--------------------------------------------------

////
[source,console]
--------------------------------------------------
PUT _ingest/pipeline/geoip
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "ip"
      }
    }
  ]
}

PUT my_ip_locations/_doc/1?refresh=true&pipeline=geoip
{
  "ip": "8.8.8.8"
}

GET /my_ip_locations/_search
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "geo_distance": {
          "distance": "1m",
          "geoip.location": {
            "lon": -97.822,
            "lat": 37.751
          }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[continued]

[source,console-result]
--------------------------------------------------
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value": 1,
      "relation": "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "my_ip_locations",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "geoip" : {
            "continent_name" : "North America",
            "country_name" : "United States",
            "country_iso_code" : "US",
            "location" : {
              "lon" : -97.822,
              "lat" : 37.751
            }
          },
          "ip" : "8.8.8.8"
        }
      }
    ]
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"took" : 3/"took" : $body.took/]
////

[[manage-geoip-database-updates]]
==== Manage your own GeoIP2 database updates

If you can't <> your GeoIP2 databases from the Elastic endpoint, you have a few other options:

* <>
* <>
* <>

[[use-proxy-geoip-endpoint]]
**Use a proxy endpoint**

If you can't connect directly to the Elastic GeoIP endpoint, consider setting up a secure proxy. You can then specify the proxy endpoint URL in the <> setting of each node’s `elasticsearch.yml` file.

[[use-custom-geoip-endpoint]]
**Use a custom endpoint**

You can create a service that mimics the Elastic GeoIP endpoint. You can then get automatic updates from this service.

. Download your `.mmdb` database files from the http://dev.maxmind.com/geoip/geoip2/geolite2[MaxMind site].

. Copy your database files to a single directory.

. From your {es} directory, run:
+
[source,sh]
----
./bin/elasticsearch-geoip -s my/source/dir [-t target/directory]
----

. Serve the static database files from your directory. For example, you can use Docker to serve the files from an nginx server:
+
[source,sh]
----
docker run -v my/source/dir:/usr/share/nginx/html:ro nginx
----

. Specify the service's endpoint URL in the <> setting of each node’s `elasticsearch.yml` file.
+
By default, {es} checks the endpoint for updates every three days. To use another polling interval, use the <> to set <>.

[[manually-update-geoip-databases]]
**Manually update your GeoIP2 databases**

. Use the <> to set `ingest.geoip.downloader.enabled` to `false`. This disables automatic updates that may overwrite your database changes. This also deletes all downloaded databases.

. Download your `.mmdb` database files from the http://dev.maxmind.com/geoip/geoip2/geolite2[MaxMind site].
+
You can also use custom city, country, and ASN `.mmdb` files. These files must be uncompressed and use the respective `-City.mmdb`, `-Country.mmdb`, or `-ASN.mmdb` extensions.

. On {ess} deployments, upload the database using a {cloud}/ec-custom-bundles.html[custom bundle].

. On self-managed deployments, copy the database files to `$ES_CONFIG/ingest-geoip`.

. In your `geoip` processors, configure the `database_file` parameter to use a custom database file.

[[ingest-geoip-settings]]
===== Node Settings

The `geoip` processor supports the following setting:

`ingest.geoip.cache_size`::

    The maximum number of results that should be cached. Defaults to `1000`.

Note that these settings are node settings and apply to all `geoip` processors, i.e.
there is one cache for all defined `geoip` processors.

[[geoip-cluster-settings]]
===== Cluster settings

[[ingest-geoip-downloader-enabled]]
`ingest.geoip.downloader.enabled`::
(<>, Boolean)
If `true`, {es} automatically downloads and manages updates for GeoIP2 databases from the `ingest.geoip.downloader.endpoint`. If `false`, {es} does not download updates and deletes all downloaded databases. Defaults to `true`.

[[ingest-geoip-downloader-endpoint]]
`ingest.geoip.downloader.endpoint`::
(<>, string)
Endpoint URL used to download updates for GeoIP2 databases. Defaults to `https://geoip.elastic.co/v1/database`. {es} stores downloaded database files in each node's <> at `$ES_TMPDIR/geoip-databases/`.

[[ingest-geoip-downloader-poll-interval]]
`ingest.geoip.downloader.poll.interval`::
(<>, <>)
How often {es} checks for GeoIP2 database updates at the `ingest.geoip.downloader.endpoint`. Must be greater than `1d` (one day). Defaults to `3d` (three days).
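
As a sketch of how the polling interval might be changed at runtime, the following request sets it to two days. It assumes that the API referred to in this section is the cluster update settings API (`PUT _cluster/settings`) and that a persistent setting is wanted; the `2d` value is only illustrative.

[source,console]
--------------------------------------------------
PUT _cluster/settings
{
  "persistent" : {
    "ingest.geoip.downloader.poll.interval" : "2d"
  }
}
--------------------------------------------------

Under the same assumption, setting `ingest.geoip.downloader.enabled` to `false` in the same request body disables automatic updates and, as described above, deletes all downloaded databases.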