elasticsearch/docs/plugins
David Pilato 564ff9db88
Extract more standard metadata from binary files (#78754)
Until now, we have extracted only a small number of fields from the binary files sent to the ingest attachment plugin:

* `content`,
* `title`,
* `author`,
* `keywords`,
* `date`,
* `content_type`,
* `content_length`,
* `language`.

Tika defines more standard properties that can also be extracted:

* `modified`,
* `format`,
* `identifier`,
* `contributor`,
* `coverage`,
* `modifier`,
* `creator_tool`,
* `publisher`,
* `relation`,
* `rights`,
* `source`,
* `type`,
* `description`,
* `print_date`,
* `metadata_date`,
* `latitude`,
* `longitude`,
* `altitude`,
* `rating`,
* `comments`.

This commit exposes those new fields.
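
As an illustration of how the fields listed above might be requested, here is a minimal Python sketch that builds an ingest pipeline body with a single `attachment` processor. It assumes the processor's `properties` option accepts the new field names in the same way as the original ones. The helper `build_attachment_pipeline`, the source field name `data`, and the client-side check are hypothetical and not part of this commit.

```python
import json

# Field names copied from the commit message above.
ORIGINAL_FIELDS = [
    "content", "title", "author", "keywords", "date",
    "content_type", "content_length", "language",
]
NEW_FIELDS = [
    "modified", "format", "identifier", "contributor", "coverage",
    "modifier", "creator_tool", "publisher", "relation", "rights",
    "source", "type", "description", "print_date", "metadata_date",
    "latitude", "longitude", "altitude", "rating", "comments",
]
KNOWN_FIELDS = set(ORIGINAL_FIELDS) | set(NEW_FIELDS)


def build_attachment_pipeline(source_field, properties):
    """Return a pipeline body with one attachment processor.

    Hypothetical helper: it only assembles a dict and rejects field
    names that do not appear in the commit message.
    """
    unknown = [name for name in properties if name not in KNOWN_FIELDS]
    if unknown:
        raise ValueError(f"unknown attachment properties: {unknown}")
    return {
        "description": "Extract attachment information",
        "processors": [
            {
                "attachment": {
                    "field": source_field,
                    # Assumption: new names are valid values here.
                    "properties": list(properties),
                }
            }
        ],
    }


pipeline = build_attachment_pipeline(
    "data", ["content", "title", "creator_tool", "latitude", "longitude"]
)
print(json.dumps(pipeline, indent=2))
```

The resulting JSON could then be sent as the body of a pipeline creation request. Restricting `properties` to the fields actually needed keeps the indexed documents small.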

Related to #22339.

Co-authored-by: Keith Massey <keith.massey@elastic.co>
2021-11-23 05:01:08 +01:00
analysis-icu.asciidoc [DOCS] Fix double spaces (#71082) 2021-03-31 09:57:47 -04:00
analysis-kuromoji.asciidoc [DOCS] Fix double spaces (#71082) 2021-03-31 09:57:47 -04:00
analysis-nori.asciidoc [DOCS] Reformat Plugin snippets to use two-space indents (#59895) 2020-07-20 14:14:51 -04:00
analysis-phonetic.asciidoc [DOCS] Fix double spaces (#71082) 2021-03-31 09:57:47 -04:00
analysis-smartcn.asciidoc [DOCS] Swap [float] for [discrete] (#60124) 2020-07-23 11:48:22 -04:00
analysis-stempel.asciidoc [DOCS] Fix double spaces (#71082) 2021-03-31 09:57:47 -04:00
analysis-ukrainian.asciidoc [DOCS] http -> https, remove outdated plugin docs (#60380) 2020-07-31 15:58:38 -04:00
analysis.asciidoc [DOCS] Audit community plugins and integrations (#69378) 2021-02-22 16:10:17 -05:00
api.asciidoc [DOCS] Audit community plugins and integrations (#69378) 2021-02-22 16:10:17 -05:00
authors.asciidoc Fix docs path for docs PR to pass 2021-06-01 16:55:13 +02:00
discovery-azure-classic.asciidoc [DOCS] Fix double spaces (#71082) 2021-03-31 09:57:47 -04:00
discovery-ec2.asciidoc [DOCS] Update xrefs to the units sections in the ES guide (#74726) 2021-06-29 18:09:10 -07:00
discovery-gce.asciidoc [DOCS] Improve discovery-gce docs (#72338) 2021-04-28 12:26:52 -04:00
discovery.asciidoc [DOCS] Audit community plugins and integrations (#69378) 2021-02-22 16:10:17 -05:00
index.asciidoc Remove quota-aware-fs plugin (#76352) 2021-08-11 15:12:50 -07:00
ingest-attachment.asciidoc Extract more standard metadata from binary files (#78754) 2021-11-23 05:01:08 +01:00
ingest-user-agent.asciidoc Fix ingest cross-doc links 2018-12-22 20:51:18 -05:00
ingest.asciidoc [DOCS] Refactor ingest pipeline docs (#70253) 2021-03-15 12:22:57 -04:00
install_remove.asciidoc [DOCS] Add checksum links for plugin downloads (#64949) (#64957) 2020-11-11 13:30:14 -05:00
integrations.asciidoc [DOCS] Audit community plugins and integrations (#69378) 2021-02-22 16:10:17 -05:00
mapper-annotated-text.asciidoc [DOCS] Fix a typo in annotated text examples (#78683) 2021-10-05 07:23:04 -04:00
mapper-murmur3.asciidoc [DOCS] Update my-index examples (#60132) 2020-07-27 14:46:39 -04:00
mapper-size.asciidoc Update mapper-size.asciidoc 2021-04-01 11:05:34 +02:00
mapper.asciidoc [DOCS] Fix metadata field refs (#60764) 2020-08-05 13:21:00 -04:00
plugin-script.asciidoc Document the declarative plugins configuration file (#80760) 2021-11-17 21:11:00 +00:00
redirects.asciidoc [DOCS] Fix double spaces (#71082) 2021-03-31 09:57:47 -04:00
repository-azure.asciidoc [DOCS] Clarify the type of Azure storage for snapshots (#72826) (#72976) 2021-05-12 10:20:18 -04:00
repository-gcs.asciidoc [DOCS] Update service account creation docs for GCS repository plugin (#73561) (#73664) 2021-06-02 09:01:06 -04:00
repository-hdfs.asciidoc [DOCS] http -> https, remove outdated plugin docs (#60380) 2020-07-31 15:58:38 -04:00
repository-s3.asciidoc [DOCS] Overhaul snapshot and restore docs (#79081) 2021-11-15 12:45:07 -05:00
repository-shared-settings.asciidoc [DOCS] Fix double spaces (#71082) 2021-03-31 09:57:47 -04:00
repository.asciidoc [DOCS] Swap [float] for [discrete] (#60124) 2020-07-23 11:48:22 -04:00
store-smb.asciidoc Deprecate SimpleFS and replace it with NIOFS (#75156) (#75196) 2021-07-09 18:22:41 -04:00
store.asciidoc [DOCS] Swap [float] for [discrete] (#60124) 2020-07-23 11:48:22 -04:00