elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-06-30 02:13:33 -04:00

Author	SHA1	Message	Date
bellengao	b17ce85f13	Add copy_from parameter for set ingest processor (#63540 )	2020-11-02 10:40:05 -06:00
Jason Tedor	0d4494f121	Clarify ingest-geoip database_file docs (#64340 ) The docs for the geoip processor database_file option appear to indicate that all geoip databases are in the config directory. This is leftover legacy from when this was the case when ingest-geoip was a plugin, but it is no longer true as the built-in databases now ship inside the ingest-geoip module that is bundled by default. This commit clarifies those docs. Co-authored-by: Jakob Reiter <jakommo@users.noreply.github.com>	2020-10-29 13:27:17 -04:00
István Zoltán Szabó	6093518f4a	[DOCS] Changes experimental flag to beta in DFA related docs (#63992 )	2020-10-26 17:02:46 +01:00
bellengao	0c88c19c1d	Add country_name to the default properties of geoip ingest processor (#62915 )	2020-09-30 14:06:51 -05:00
Lisa Cawley	ecf9e929ba	[DOCS] Add experimental tag to inference processor and bucket aggregation (#63023 )	2020-09-30 07:20:38 -07:00
Jakob Reiter	534b179c33	[DOCS] Updated target_field description of the json ingest processor (#61968 ) Co-authored-by: Dan Hermann <danhermann@users.noreply.github.com>	2020-09-30 08:43:29 -04:00
Peter Ansell	b40bdd3093	Add network from MaxMind Geo ASN database (#61676 ) This adds the network property from the MaxMind Geo ASN database. This enables analysis of IP data based on the subnets that MaxMind have previously identified for ASN networks. closes #60942	2020-09-24 11:51:50 -05:00
Dan Hermann	80ea415e0f	[DOCS] allow_duplicates option for append processor (#62336 )	2020-09-15 09:01:25 -05:00
Dan Hermann	9b8e8aa7ed	[DOCS] Sort option for the grok patterns endpoint (#62092 )	2020-09-14 12:36:21 -05:00
James Rodewig	b0336111af	[DOCS] Fix Gsub processor snippet (#61720 )	2020-08-31 10:14:54 -04:00
Dan Hermann	0ba8d82c1b	[DOCS] Configurable output format for date processor (#61440 )	2020-08-24 11:07:13 -05:00
James Rodewig	bccd58b2f1	[DOCS] Fix `field` def for join processor (#61395 )	2020-08-21 08:35:56 -04:00
James Rodewig	a94e5cb7c4	[DOCS] Replace Wikipedia links with attribute (#61171 )	2020-08-17 09:44:24 -04:00
James Rodewig	a0f4edff66	[DOCS] Fix chunking in query docs (#61053 ) Changes: * Moves "Notes" sections for the joining queries and percolate query pages to the parent page * Adds related redirects for the moved "Notes" pages * Assigns explicit anchor IDs to other "Notes" headings. This was required for the redirects to work.	2020-08-12 13:45:49 -04:00
James Rodewig	4eb09cb31e	[DOCS] Fix case of ingest processor titles (#61024 ) Converts page headings to sentence case. Adds a title abbreviation.	2020-08-12 11:28:00 -04:00
James Rodewig	56c778235c	[DOCS] Fix metadata field refs (#60764 )	2020-08-05 13:21:00 -04:00
Alexander Reelsen	c7ac9e7073	[DOCS] http -> https, remove outdated plugin docs (#60380 ) Plugin discovery documentation contained information about installing Elasticsearch 2.0 and installing an Oracle JDK, both of which are no longer valid. Because the instructions also used cleartext HTTP to install packages, this commit replaces HTTP links with HTTPS where possible. In addition, a few community links have been removed, as they do not seem to exist anymore.	2020-07-31 15:58:38 -04:00
James Rodewig	441c3a21b1	[DOCS] Update my-index examples (#60132 ) Changes the following example index names to `my-index-000001` for consistency: * `my-index` * `my_index` * `myindex`	2020-07-27 14:46:39 -04:00
James Rodewig	80b674fb25	[DOCS] Reformat snippets to use two-space indents (#59973 )	2020-07-21 12:24:26 -04:00
Shahzad	24e5da7851	Update regex file for es user agent node processor (#59697 )	2020-07-17 16:54:34 +02:00
James Rodewig	2be9db01c8	[DOCS] Replace `datatype` with `data type` (#58972 )	2020-07-07 13:52:10 -04:00
David Kyle	bf245e4c07	Make Inference processor field_map and inference_config optional (#58868 ) Relaxes the requirement that the inference ingest processor must have a field_map and inference_config defined even if they are empty.	2020-07-03 08:36:57 +01:00
István Zoltán Szabó	d0042fb791	[DOCS] Updates results_field description in the inference processor docs (#58554 )	2020-06-29 11:28:17 +02:00
Jake Landis	5088ab151a	Update hh to HH in date processor example (#58089 ) (#58142 ) Co-authored-by: Leaf-Lin <39002973+Leaf-Lin@users.noreply.github.com>	2020-06-15 17:03:42 -05:00
bellengao	efc4c9a210	Add ignore_empty_value parameter in set ingest processor (#57030 )	2020-06-15 07:26:57 -05:00
Jake Landis	f5910664b7	Ensure Joni warnings are logged at debug (#57302 ) When Joni, the regex engine that powers grok, emits a warning, it does so by default to System.err. System.err logs are all bucketed together in the server log at WARN level. When Joni emits a warning, it can be extremely verbose, logging a message for each execution against that pattern. For an ingest node, that means a message for every document that is run through Grok. Fortunately, Joni provides a callback hook to push these warnings to a custom location. This commit implements Joni's callback hook to push the Joni warnings to the Elasticsearch server logger (logger.org.elasticsearch.ingest.common.GrokProcessor) at debug level. Generally these warnings indicate a possible issue with the regular expression, so creation of the Grok processor will do a "test run" of the expression and log the result (if any) at WARN level. This WARN level log should only occur on pipeline creation, which is a much lower frequency than every document. Additionally, the documentation is updated with instructions for how to set the logger to debug level.	2020-06-09 13:33:27 -05:00
Lisa Cawley	8b9293b3bf	[DOCS] Replace docdir attribute with es-repo-dir (#57489 )	2020-06-01 15:55:05 -07:00
Adam Locke	d77388f919	[DOCS] Add links to `flattened` datatype (#56794 ) * Changes for #52239. * Incorporating review feedback from Julie T. Also single-sourcing nested options in the Mapping page and referencing them in the Nested page. * Moving tip after the introduction and clarifying limits. * Update docs/reference/mapping.asciidoc Co-authored-by: James Rodewig <james.rodewig@elastic.co> * Update docs/reference/mapping/types/nested.asciidoc Co-authored-by: James Rodewig <james.rodewig@elastic.co> Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-05-19 13:40:26 -04:00
István Zoltán Szabó	ca2f98382f	[DOCS] Changes feature importance links to point to the new page (#55531 ) * [DOCS] Changes feature importance links to point to the new page. * [DOCS] Fixes line breaks.	2020-04-28 09:02:14 +02:00
Benjamin Trent	c1afda4a23	[ML] adding prediction_field_type to inference config (#55128 ) Data frame analytics dynamically determines the classification field type. This field type then dictates the encoded JSON that is written to Elasticsearch. Inference needs to know about this field type so that it may provide the EXACT SAME predicted values as analytics. This commit adds a new field `prediction_field_type` which indicates the desired type. Options are: `string` (DEFAULT), `number`, `boolean` (where close_to(1.0) == true, false otherwise). Analytics provides the default `prediction_field_type` when the model is created from the process.	2020-04-15 08:32:48 -04:00
István Zoltán Szabó	a0662399c7	[DOCS] Makes PUT inference API docs collapsible (#54653 ) Co-authored-by: lcawl <lcawley@elastic.co>	2020-04-03 09:45:42 +02:00
Benjamin Trent	4e1ff31c3c	[ML] add new inference_config field to trained model config (#54421 ) A new field called `inference_config` is now added to the trained model config object. This new field allows for default inference settings from analytics or some external model builder. The inference processor can still override whatever is set as the default in the trained model config.	2020-04-02 10:34:17 -04:00
lcawl	2641a39fd5	[DOCS] Fixes shared attribute for feature importance	2020-04-01 14:46:38 -07:00
István Zoltán Szabó	a65e95e093	[DOCS] Adds feature importance mapping subsection to inference processor docs (#54190 )	2020-03-26 09:22:12 +01:00
bellengao	8ffe5d1f94	Support array for all string ingest processors	2020-03-17 15:22:30 -05:00
Benjamin Trent	970f726c1f	[ML] renaming inference processor field field_mappings to new name field_map (#53433 ) This renames the `inference` processor configuration field `field_mappings` to `field_map`. `field_mappings` is now deprecated.	2020-03-12 12:49:25 -04:00
Benjamin Trent	4e1f029b04	[ML][Inference] adds new default_field_map field to trained models (#53294 ) Adds a new `default_field_map` field to trained model config objects. This allows the model creator to supply field map if it knows that there should be some map for inference to work directly against the training data. The use case internally is having analytics jobs supply a field mapping for multi-field fields. This allows us to use the model "out of the box" on data where we trained on `foo.keyword` but the `_source` only references `foo`.	2020-03-11 12:23:56 -04:00
David Pilato	e51b8a51aa	[DOCS] Fix typo in CSV processor docs (#52649 ) Corrects an example array in a snippet of the CSV processor docs.	2020-02-25 08:47:58 -05:00
Benjamin Trent	20f54272f0	[ML] Adds feature importance to option to inference processor (#52218 ) This adds machine learning model feature importance calculations to the inference processor. The new flag in the configuration matches the analytics parameter name: `num_top_feature_importance_values` Example: ``` "inference": { "field_mappings": {}, "model_id": "my_model", "inference_config": { "regression": { "num_top_feature_importance_values": 3 } } } ``` This will write to the document as follows: ``` "inference" : { "feature_importance" : { "FlightTimeMin" : -76.90955548511226, "FlightDelayType" : 114.13514762158526, "DistanceMiles" : 13.731580450792187 }, "predicted_value" : 108.33165831875137, "model_id" : "my_model" } ``` This is done through calculating the [SHAP values](https://arxiv.org/abs/1802.03888). It requires that models have populated `number_samples` for each tree node. This is not available to models that were created before 7.7. Additionally, if the inference config is requesting feature_importance, and not all nodes have been upgraded yet, it will not allow the pipeline to be created. This is to safe-guard in a mixed-version environment where only some ingest nodes have been upgraded. NOTE: the algorithm is a Java port of the one laid out in ml-cpp: https://github.com/elastic/ml-cpp/blob/master/lib/maths/CTreeShapFeatureImportance.cc usability blocked by: https://github.com/elastic/ml-cpp/pull/991	2020-02-21 16:36:21 -05:00
Yang Wang	5c9f79534f	Expose more authentication info to ingest pipeline (#51305 ) The changes add more granularity for identifying the data ingestion user. The ingest pipeline can now be configured to record authentication realm and type. It can also record API key name and ID when one is in use. This improves traceability when data are being ingested from multiple agents and will become more relevant with the incoming support of required pipelines (#46847) Resolves: #49106	2020-02-10 13:56:07 +11:00
Przemko Robakowski	5560135542	Add empty_value parameter to CSV processor (#51567 ) * Add empty_value parameter to CSV processor This change adds `empty_value` parameter to the CSV processor. This value is used to fill empty fields. Fields will be skipped if this parameter is omitted. This behavior is the same for both quoted and unquoted fields. * docs updated * Fix compilation problem Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-02-05 22:36:00 +01:00
David Kyle	34743bcd6f	[ML] Remove stray field from inference docs (#51870 ) model_info_field is not a valid option	2020-02-05 10:49:36 +00:00
Florian Kelbert	bd52041f92	[DOCS] Remove unneeded comma from CSV processor example (#51859 )	2020-02-04 09:23:43 -05:00
István Zoltán Szabó	4e0e6e83e0	[DOCS] Fixes indentation in inference processor code snippet (#51252 )	2020-01-21 16:21:17 +01:00
Martijn van Groningen	2b2935fd52	Add pipeline name to ingest metadata (#50467 ) This commit adds the name of the current pipeline to ingest metadata. This pipeline name is accessible under the following key: '_ingest.pipeline'. Example usage in pipeline: PUT /_ingest/pipeline/2 { "processors": [ { "set": { "field": "pipeline_name", "value": "{{_ingest.pipeline}}" } } ] } Closes #42106	2020-01-15 16:17:05 +01:00
Igor Motov	7f81467378	Geo: Switch generated GeoJson type names to camel case (#50285 ) (#50400 ) Switches generated GeoJson type names to camel case to conform to the standard. Closes #49568	2019-12-20 04:47:42 -10:00
István Zoltán Szabó	b8cae37374	[DOCS] Adds inference processor documentation (#50204 ) Co-Authored-By: Lisa Cawley <lcawley@elastic.co>	2019-12-19 12:19:44 +01:00
Igor Motov	a26e4d1e5e	Geo: Switch generated WKT to upper case (#50285 ) Switches generated WKT to upper case to conform to the standard recommendation. Relates #49568	2019-12-18 07:28:56 -10:00
Przemko Robakowski	64e1a774fc	CSV ingest processor (#49509 ) * CSV Processor for Ingest This change adds new ingest processor that breaks line from CSV file into separate fields. By default it conforms to RFC 4180 but can be tweaked. Closes #49113	2019-12-11 14:52:04 +01:00
Przemko Robakowski	c57032f622	Allow list of IPs in geoip ingest processor (#49573 ) * Allow list of IPs in geoip ingest processor This change lets you use array of IPs in addition to string in geoip processor source field. It will set array containing geoip data for each element in source, unless first_only parameter option is enabled, then only first found will be returned. Closes #46193	2019-12-06 21:57:06 +01:00


103 commits