mirror of
https://github.com/elastic/elasticsearch.git
synced 2025-07-01 10:53:25 -04:00
234 lines
No EOL
9.6 KiB
Text
234 lines
No EOL
9.6 KiB
Text
[#es-postgresql-connector-client-tutorial]
|
||
=== PostgreSQL self-managed connector tutorial
|
||
++++
|
||
<titleabbrev>Tutorial</titleabbrev>
|
||
++++
|
||
|
||
This tutorial walks you through the process of creating a self-managed connector for a PostgreSQL data source.
|
||
You'll be using the <<es-build-connector, self-managed connector>> workflow in the Kibana UI.
|
||
This means you'll be deploying the connector on your own infrastructure.
|
||
Refer to the <<es-connectors-postgresql, Elastic PostgreSQL connector reference>> for more information about this connector.
|
||
|
||
You'll use the {connectors-python}[connector framework^] to create the connector.
|
||
In this exercise, you'll be working in both the terminal (or your IDE) and the Kibana UI.
|
||
|
||
If you want to deploy a self-managed connector for another data source, use this tutorial as a blueprint.
|
||
Refer to the list of available <<es-build-connector,self-managed connectors>>.
|
||
|
||
[TIP]
|
||
====
|
||
Want to get started quickly testing a self-managed connector using Docker Compose?
|
||
Refer to this https://github.com/elastic/connectors/tree/main/scripts/stack#readme[README] in the `elastic/connectors` repo for more information.
|
||
====
|
||
|
||
[discrete#es-postgresql-connector-client-tutorial-prerequisites]
|
||
==== Prerequisites
|
||
|
||
[discrete#es-postgresql-connector-client-tutorial-prerequisites-elastic]
|
||
===== Elastic prerequisites
|
||
|
||
First, ensure you satisfy the <<es-build-connector-prerequisites, prerequisites>> for self-managed connectors.
|
||
|
||
[discrete#es-postgresql-connector-client-tutorial-postgresql-prerequisites]
|
||
===== PostgreSQL prerequisites
|
||
|
||
You need:
|
||
|
||
* PostgreSQL version 11+.
|
||
* Tables must be owned by a PostgreSQL user.
|
||
* Database `superuser` privileges are required to index all database tables.
|
||
|
||
[TIP]
|
||
====
|
||
You should enable recording of the commit time of PostgreSQL transactions.
|
||
Otherwise, _all_ data will be indexed in every sync.
|
||
By default, `track_commit_timestamp` is `off`.
|
||
|
||
Enable this by running the following command on the PosgreSQL server command line:
|
||
|
||
[source,shell]
|
||
----
|
||
ALTER SYSTEM SET track_commit_timestamp = on;
|
||
----
|
||
|
||
Then restart the PostgreSQL server.
|
||
====
|
||
|
||
[discrete#es-postgresql-connector-client-tutorial-steps]
|
||
==== Steps
|
||
|
||
To complete this tutorial, you'll need to complete the following steps:
|
||
|
||
. <<es-postgresql-connector-client-tutorial-create-index, Create an Elasticsearch index>>
|
||
. <<es-postgresql-connector-client-tutorial-setup-connector, Set up the connector>>
|
||
. <<es-postgresql-connector-client-tutorial-run-connector-service, Run the `connectors` connector service>>
|
||
. <<es-postgresql-connector-client-tutorial-sync-data-source>>
|
||
|
||
[discrete#es-postgresql-connector-client-tutorial-create-index]
|
||
==== Create an Elasticsearch index
|
||
|
||
Elastic connectors enable you to create searchable, read-only replicas of your data sources in Elasticsearch.
|
||
The first step in setting up your self-managed connector is to create an index.
|
||
|
||
In the {kibana-ref}[Kibana^] UI, navigate to *Search > Content > Elasticsearch indices* from the main menu, or use the {kibana-ref}/kibana-concepts-analysts.html#_finding_your_apps_and_objects[global search field].
|
||
|
||
Create a new connector index:
|
||
|
||
. Under *Select an ingestion method* choose *Connector*.
|
||
. Choose *PostgreSQL* from the list of connectors.
|
||
. Name your index and optionally change the language analyzer to match the human language of your data source.
|
||
(The index name you provide is automatically prefixed with `search-`.)
|
||
. Save your changes.
|
||
|
||
The index is created and ready to configure.
|
||
|
||
[discrete#es-postgresql-connector-client-tutorial-gather-elastic-details]
|
||
.Gather Elastic details
|
||
****
|
||
Before you can configure the connector, you need to gather some details about your Elastic deployment:
|
||
|
||
* *Elasticsearch endpoint*.
|
||
** If you're an Elastic Cloud user, find your deployment’s Elasticsearch endpoint in the Cloud UI under *Cloud > Deployments > <your-deployment> > Elasticsearch*.
|
||
** If you're running your Elastic deployment and the connector service in Docker, the default Elasticsearch endpoint is `http://host.docker.internal:9200`.
|
||
* *API key.*
|
||
You'll need this key to configure the connector.
|
||
Use an existing key or create a new one.
|
||
* *Connector ID*.
|
||
Your unique connector ID is automatically generated when you create the connector.
|
||
Find this in the Kibana UI.
|
||
****
|
||
|
||
[discrete#es-postgresql-connector-client-tutorial-setup-connector]
|
||
==== Set up the connector
|
||
|
||
Once you've created an index, you can set up the connector.
|
||
You will be guided through this process in the UI.
|
||
|
||
. *Edit the name and description for the connector.*
|
||
This will help your team identify the connector.
|
||
. *Clone and edit the connector service code.*
|
||
For this example, we'll use the {connectors-python}[Python framework^].
|
||
Follow these steps:
|
||
** Clone or fork that repository locally with the following command: `git clone https://github.com/elastic/connectors`.
|
||
** Open the `config.yml` configuration file in your editor of choice.
|
||
** Replace the values for `host`, `api_key`, and `connector_id` with the values you gathered <<es-postgresql-connector-client-tutorial-gather-elastic-details,earlier>>.
|
||
Use the `service_type` value `postgresql` for this connector.
|
||
+
|
||
.*Expand* to see an example `config.yml` file
|
||
[%collapsible]
|
||
====
|
||
Replace the values for `host`, `api_key`, and `connector_id` with your own values.
|
||
Use the `service_type` value `postgresql` for this connector.
|
||
[source,yaml]
|
||
----
|
||
elasticsearch:
|
||
host: <https://<my-elastic-deployment.es.us-west2.gcp.elastic-cloud.com>> # Your Elasticsearch endpoint
|
||
api_key: '<YOUR-API-KEY>' # Your top-level Elasticsearch API key
|
||
...
|
||
connectors:
|
||
-
|
||
connector_id: "<YOUR-CONNECTOR-ID>"
|
||
api_key: "'<YOUR-API-KEY>" # Your scoped connector index API key (optional). If not provided, the top-level API key is used.
|
||
service_type: "postgresql"
|
||
|
||
|
||
|
||
# Self-managed connector settings
|
||
connector_id: '<YOUR-CONNECTOR-ID>' # Your connector ID
|
||
service_type: 'postgresql' # The service type for your connector
|
||
|
||
sources:
|
||
# mongodb: connectors.sources.mongo:MongoDataSource
|
||
# s3: connectors.sources.s3:S3DataSource
|
||
# dir: connectors.sources.directory:DirectoryDataSource
|
||
# mysql: connectors.sources.mysql:MySqlDataSource
|
||
# network_drive: connectors.sources.network_drive:NASDataSource
|
||
# google_cloud_storage: connectors.sources.google_cloud_storage:GoogleCloudStorageDataSource
|
||
# azure_blob_storage: connectors.sources.azure_blob_storage:AzureBlobStorageDataSource
|
||
postgresql: connectors.sources.postgresql:PostgreSQLDataSource
|
||
# oracle: connectors.sources.oracle:OracleDataSource
|
||
# sharepoint: connectors.sources.sharepoint:SharepointDataSource
|
||
# mssql: connectors.sources.mssql:MSSQLDataSource
|
||
# jira: connectors.sources.jira:JiraDataSource
|
||
----
|
||
====
|
||
|
||
[discrete#es-postgresql-connector-client-tutorial-run-connector-service]
|
||
==== Run the connector service
|
||
|
||
Now that you've configured the connector code, you can run the connector service.
|
||
|
||
In your terminal or IDE:
|
||
|
||
. `cd` into the root of your `connectors` clone/fork.
|
||
. Run the following command: `make run`.
|
||
|
||
The connector service should now be running.
|
||
The UI will let you know that the connector has successfully connected to Elasticsearch.
|
||
|
||
Here we're working locally.
|
||
In production setups, you'll deploy the connector service to your own infrastructure.
|
||
If you prefer to use Docker, refer to the {connectors-python}/docs/DOCKER.md[repo docs^] for instructions.
|
||
|
||
[discrete#es-postgresql-connector-client-tutorial-sync-data-source]
|
||
==== Sync your PostgreSQL data source
|
||
|
||
[discrete#es-postgresql-connector-client-tutorial-sync-data-source-details]
|
||
===== Enter your PostgreSQL data source details
|
||
|
||
Once you've configured the connector, you can use it to index your data source.
|
||
|
||
You can now enter your PostgreSQL instance details in the Kibana UI.
|
||
|
||
Enter the following information:
|
||
|
||
* *Host*.
|
||
Server host address for your PostgreSQL instance.
|
||
* *Port*.
|
||
Port number for your PostgreSQL instance.
|
||
* *Username*.
|
||
Username of the PostgreSQL account.
|
||
* *Password*.
|
||
Password for that user.
|
||
* *Database*.
|
||
Name of the PostgreSQL database.
|
||
* *Comma-separated list of tables*.
|
||
`*` will fetch data from all tables in the configured database.
|
||
|
||
Once you've entered all these details, select *Save configuration*.
|
||
|
||
[discrete#es-postgresql-connector-client-tutorial-sync-data-source-launch-sync]
|
||
===== Launch a sync
|
||
|
||
If you navigate to the *Overview* tab in the Kibana UI, you can see the connector's _ingestion status_.
|
||
This should now have changed to *Configured*.
|
||
|
||
It's time to launch a sync by selecting the *Sync* button.
|
||
|
||
If you navigate to the terminal window where you're running the connector service, you should see output like the following:
|
||
|
||
[source,shell]
|
||
----
|
||
[FMWK][13:22:26][INFO] Fetcher <create: 499 update: 0 |delete: 0>
|
||
[FMWK][13:22:26][INF0] Fetcher <create: 599 update: 0 |delete: 0>
|
||
[FMWK][13:22:26][INFO] Fetcher <create: 699 update: 0 |delete: 0>
|
||
...
|
||
[FMWK][23:22:28][INF0] [oRXQwYYBLhXTs-qYpJ9i] Sync done: 3864 indexed, 0 deleted.
|
||
(27 seconds)
|
||
----
|
||
|
||
This confirms the connector has fetched records from your PostgreSQL table(s) and transformed them into documents in your Elasticsearch index.
|
||
|
||
Verify your Elasticsearch documents in the *Documents* tab in the Kibana UI.
|
||
|
||
If you're happy with the results, set a recurring sync schedule in the *Scheduling* tab.
|
||
This will ensure your _searchable_ data in Elasticsearch is always up to date with changes to your PostgreSQL data source.
|
||
|
||
[discrete#es-postgresql-connector-client-tutorial-learn-more]
|
||
==== Learn more
|
||
|
||
* <<es-build-connector, Overview of self-managed connectors and frameworks>>
|
||
* {connectors-python}[Elastic connector framework repository^]
|
||
* <<es-connectors-postgresql, Elastic PostgreSQL connector reference>>
|
||
* <<es-connectors, Overview of all Elastic connectors>>
|
||
* <<es-native-connectors, Elastic managed connectors in Elastic Cloud>> |