mirror of
https://github.com/elastic/kibana.git
synced 2025-06-27 18:51:07 -04:00
Adds guidelines for designing HTTP APIs (#224348)
This PR adds guidelines for designing Kibana HTTP APIs that are terraform-provider developer friendly. fix https://github.com/elastic/kibana/issues/224643 ## Summary Kibana doesn't have specific guidelines for designing HTTP APIs. With increasing constraints, it's time to document what was previously tribal knowledge. Elasticsearch is far further along this road, and other teams have compiled their own. This document serves as guidelines to designing _public_ HTTP APIs that are suitable for managing with Terraform. ## How to test this (recommended for easier reading) - pull this PR - setup [`docs.elastic.dev`](https://docs.elastic.dev/docs/local-dev-docs-setup) locally - run `yarn dev` from `docs.elastic.dev` - review the docs live!  ### Checklist Check the PR satisfies following conditions. Reviewers should verify this PR satisfies this list as well. - [x] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials --------- Co-authored-by: florent-leborgne <florent.leborgne@elastic.co> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
This commit is contained in:
parent
3d6954e252
commit
94b1174254
2 changed files with 505 additions and 0 deletions
502
dev_docs/contributing/kibana_http_api_guidelines.mdx
Normal file
502
dev_docs/contributing/kibana_http_api_guidelines.mdx
Normal file
|
@ -0,0 +1,502 @@
|
|||
---
|
||||
id: kibHttpApiGuidelines
|
||||
slug: /kibana-dev-docs/contributing/http-api-guidelines
|
||||
title: Guidelines for Terraform friendly HTTP APIs
|
||||
description: Guidelines for designing Terraform friendly HTTP APIs
|
||||
date: 2025-06-16
|
||||
tags: ['kibana','contributor', 'dev', 'http', 'api']
|
||||
---
|
||||
|
||||
Kibana's role has expanded beyond the UI. APIs have become the backbone for how customers automate their workflows, build custom scripts, and create their own integrations. With GitOps becoming the standard and Terraform establishing itself as the go-to infrastructure-as-code tool, HTTP APIs are now business-critical. The rise in Search AI means that it's more important than ever to make sure our APIs are easily understood by LLMs.
|
||||
|
||||
For Terraform to be a first class citizen, the provider needs to cover resources that weren't available before—think Dashboards, Visualizations, Data Views, and Detection Rules. This means we need to build REST APIs that follow the OpenAPI Specification properly, making it straightforward for teams to manage these resources as code through GitOps workflows.
|
||||
|
||||
These guidelines will help you build APIs that are easy to use, whether someone's automating with scripts or managing infrastructure at scale.
|
||||
|
||||
## Terraform provider developer-friendly API design
|
||||
|
||||
Terraform can work with any API, but some APIs are easier to deal with than others. Here are some tips to make your API more Terraform provider-friendly:
|
||||
|
||||
### Think in terms of resources
|
||||
|
||||
APIs that stick to a consistent set of arguments or parameters are much easier to map into Terraform. Terraform can describe the "thing" your API is managing. RPC-based APIs are harder to work with since they’re not declarative. Your HTTP APIs should describe REST-like actions (GET, POST, DELETE, etc.) against resources not remote procedures like: executeJob.
|
||||
|
||||
APIs designed around resources are easier to support than those focused on action-oriented endpoints.
|
||||
|
||||
✅ Preferred: REST-like actions against resources
|
||||
|
||||
```
|
||||
GET /api/alerting/rule/{id} # Kibana Alerting API
|
||||
POST /api/alerting/rule/{id} # Create rule
|
||||
PUT /api/alerting/rule/{id} # Update rule
|
||||
DELETE /api/alerting/rule/{id} # Delete rule
|
||||
```
|
||||
|
||||
⚠️ Alternative: action-style endpoints
|
||||
|
||||
```
|
||||
POST /api/executeJob
|
||||
POST /api/processSpaceUpdate
|
||||
POST /api/triggerIndexRebalance
|
||||
```
|
||||
|
||||
#### Example
|
||||
|
||||
The [Index Lifecycle Management API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ilm-put-lifecycle) (PUT /\_ilm/policy/\{policy\}) declaratively defines the desired lifecycle state rather than imperatively executing phase transitions:
|
||||
|
||||
```
|
||||
PUT _ilm/policy/my_ilm_policy
|
||||
{
|
||||
"policy": {
|
||||
"_meta": {
|
||||
"description": "used for nginx log",
|
||||
"project": {
|
||||
"name": "myProject",
|
||||
"department": "myDepartment"
|
||||
}
|
||||
},
|
||||
"phases": {
|
||||
"warm": {
|
||||
"min_age": "10d",
|
||||
"actions": {
|
||||
"forcemerge": {
|
||||
"max_num_segments": 1
|
||||
}
|
||||
}
|
||||
},
|
||||
"delete": {
|
||||
"min_age": "30d",
|
||||
"actions": {
|
||||
"delete": {}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
[Corresponding resource configuration](https://registry.terraform.io/providers/elastic/elasticstack/latest/docs/resources/elasticsearch_index_lifecycle)
|
||||
|
||||
```terraform
|
||||
provider "elasticstack" {
|
||||
elasticsearch {}
|
||||
}
|
||||
|
||||
resource "elasticstack_elasticsearch_index_lifecycle" "my_ilm" {
|
||||
name = "my_ilm_policy"
|
||||
|
||||
warm {
|
||||
min_age = "10d"
|
||||
forcemerge {
|
||||
"max_num_segments": 1
|
||||
}
|
||||
}
|
||||
|
||||
delete {
|
||||
min_age = "30d"
|
||||
delete {}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
|
||||
#### API imposed challenges
|
||||
|
||||
API design decisions can create real challenges for Terraform resource implementation. The Elasticsearch [Index Settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-settings) (\`PUT /\{index\}/\_settings\`) treats static settings (which require index recreation) and dynamic settings (which can be updated in place) identically. This creates a complex situation forcing the implementation to handle all settings together because the API doesn't offer a way to distinguish them upfront.
|
||||
|
||||
```bash
|
||||
{
|
||||
"error": {
|
||||
"reason": "Can't update non dynamic settings [[index.codec, index.number_of_shards]] for open indices"
|
||||
|
||||
}
|
||||
```
|
||||
|
||||
[Index settings update bug](https://github.com/elastic/terraform-provider-elasticstack/issues/52)
|
||||
|
||||
A better API design would separate these concerns with different endpoints or provide metadata about which settings require recreation.
|
||||
|
||||
### Design human-friendly APIs
|
||||
|
||||
APIs with clear, descriptive field names and logical resource structures make both manual testing and Terraform resource development much smoother.
|
||||
|
||||
#### Use intuitive field names
|
||||
|
||||
Clear, descriptive field names translate to intuitive Terraform resources. Elasticsearch APIs generally follow this pattern, and we end up self-documenting configuration.
|
||||
|
||||
[Index resource](https://github.com/elastic/terraform-provider-elasticstack/blob/main/docs/resources/elasticsearch_index.md)
|
||||
```terraform
|
||||
resource "elasticstack_elasticsearch_index" "example" {
|
||||
name = "my-index" # Obvious purpose
|
||||
number_of_shards = 1 # Self-documenting
|
||||
number_of_replicas = 2 # Clear meaning
|
||||
deletion_protection = true # Prevent accidents
|
||||
search_idle_after = "20s" # Human-readable duration
|
||||
}
|
||||
```
|
||||
|
||||
##### Complex configuration handling
|
||||
|
||||
Complex mappings (e.g. saved objects) make life difficult but not impossible. One *can* make complex mappings readable through JSON encoding:
|
||||
|
||||
```terraform
|
||||
mappings = jsonencode({
|
||||
properties = {
|
||||
user = {
|
||||
properties = {
|
||||
name = { type = "text" }
|
||||
age = { type = "integer" }
|
||||
}
|
||||
}
|
||||
}
|
||||
})
|
||||
```
|
||||
|
||||
The hickup here is with validation. Terraform just isn't designed to do validation (or testing!) like we would with conventional languages.
|
||||
It doesn’t validate what strings contain, only that they're parsable during terraform plan.
|
||||
Resource developers often fall back to the API layer for validation checks, that can lead to plan/apply issues.
|
||||
|
||||
### Return errors early—and all at once
|
||||
|
||||
If your API can validate inputs upfront and return all the errors in one go, do it\! Terraform handles it and shows them in a helpful way:
|
||||
|
||||
```typescript
|
||||
// PUT /api/alerting/rule/{id}
|
||||
...
|
||||
response: {
|
||||
200: {
|
||||
body: () => ruleResponseSchemaV1,
|
||||
description: 'Indicates a successful call.',
|
||||
},
|
||||
400: {
|
||||
description: 'Indicates an invalid schema or parameters.',
|
||||
},
|
||||
403: {
|
||||
description: 'Indicates that this call is forbidden.',
|
||||
},
|
||||
404: {
|
||||
description: 'Indicates a rule with the given ID does not exist.',
|
||||
},
|
||||
409: {
|
||||
description: 'Indicates that the rule has already been updated by another user.',
|
||||
},
|
||||
},
|
||||
...
|
||||
```
|
||||
When validation only happens during execution, catching errors early becomes impossible. For example, ILM policy structure errors only surface during API calls:
|
||||
|
||||
```bash
|
||||
Error: Failed to build [allocate] after last required field arrived
|
||||
Detail: [allocate] exclude doesn't support values of type: VALUE_STRING
|
||||
```
|
||||
|
||||
[ILM import issues](https://github.com/elastic/terraform-provider-elasticstack/issues/87)
|
||||
|
||||
The cost of not catching errors early is interrupted workflow, repetitive plan/apply failures and frustrated consumers.
|
||||
|
||||
### No side effects during HTTP GET or read requests
|
||||
|
||||
Terraform expects GET requests to be non-destructive and side-effect-free. If your API creates objects, updates data, or does anything else during a GET request, you’re in for trouble.
|
||||
|
||||
If you have to have side effects (like updating a ‘last-read’ timestamp), consumers have to work around it by, for example, limiting the number of times something’s read or building custom tooling to handle state drift.
|
||||
|
||||
### Standardize patterns across your API
|
||||
|
||||
Consistency is key for Terraform resource development. Using standard patterns make reusable abstractions and helper functions possible.
|
||||
|
||||
For example, all Elasticsearch resources use the same authentication patterns, whether you're working with indices, security roles, or lifecycle policies:
|
||||
|
||||
```terraform
|
||||
provider "elasticstack" {
|
||||
elasticsearch {
|
||||
username = "elastic"
|
||||
password = "changeme"
|
||||
endpoints = ["http://localhost:9200"]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
[Provider configuration example](https://github.com/elastic/terraform-provider-elasticstack/blob/main/examples/provider/provider.tf)
|
||||
|
||||
OpenAPI specifications make generating consistent Kibana and Fleet clients possible:
|
||||
|
||||
```bash
|
||||
# From Makefile - generates standardized clients
|
||||
internal/clients/fleet/:
|
||||
oapi-codegen -package fleet fleet-openapi.json
|
||||
```
|
||||
|
||||
[Provider makefile](https://github.com/elastic/terraform-provider-elasticstack/blob/main/Makefile)
|
||||
|
||||
Standardization means authentication, error handling, and request patterns become reusable across all resources.
|
||||
|
||||
### Return as much as you can, and handle defaults carefully
|
||||
|
||||
Terraform needs complete information to track resource state effectively. If your API doesn't return enough details, Terraform can't properly detect changes or manage drift.
|
||||
|
||||
**The challenge with defaults**
|
||||
|
||||
Many Kibana APIs have extensive configuration options. Requiring users to specify every single value would create verbose, unwieldy configurations. However, being too sparse with returned data creates problems for Terraform.
|
||||
|
||||
**The golden rule**
|
||||
|
||||
A resource returned from a GET request should match as closely as possible to what a user specifies in a PUT request. When users don't specify values that affect critical behavior, consider returning those computed values too.
|
||||
|
||||
```bash
|
||||
# User creates resource with minimal config
|
||||
POST /api/resource { "name": "my-resource" }
|
||||
|
||||
# API applies defaults internally
|
||||
# name: "my-resource"
|
||||
# timeout: 30s (default)
|
||||
# retries: 3 (default)
|
||||
|
||||
# Later, GET returns only user-specified values
|
||||
GET /api/resource/my-resource
|
||||
{ "name": "my-resource" }
|
||||
|
||||
# Terraform can't detect if defaults changed!
|
||||
```
|
||||
|
||||
**Why this matters for import operations**
|
||||
|
||||
Terraform import brings existing resources under management by reading their current state. If your API returns different data than what users originally configured, it can cause [import state loss bugs](https://github.com/elastic/terraform-provider-elasticstack/issues/53).
|
||||
|
||||
### Be predictable
|
||||
|
||||
Terraform relies on the same input resulting in the same output. Avoid using fields like "last modified time"—Terraform can’t compare those meaningfully. This can be tricky with sensitive fields or when your API relies on randomness.
|
||||
|
||||
Constantly changing fields in API responses create a difficult situation for Terraform implementations because they cause constant drift.
|
||||
|
||||
Timestamps and other time-based fields aren’t predictable:
|
||||
|
||||
```
|
||||
settings {
|
||||
setting {
|
||||
name = "index.creation_date" # Timestamp varies
|
||||
value = "1643651976221" # Unix timestamp changes
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The [Alerting Rules API](https://www.elastic.co/docs/api/doc/kibana/operation/operation-post-alerting-rule-id) is another example, where execution status data changes on every read:
|
||||
|
||||
```
|
||||
"execution_status": {
|
||||
"status": "active",
|
||||
"last_execution_date": "2023-12-07T22:36:41.358Z", // Changes on every read
|
||||
"last_duration": 736
|
||||
}
|
||||
```
|
||||
|
||||
Users see Terraform detecting "changes" on every refresh, even when nothing in their configuration has actually changed\!
|
||||
|
||||
An alternative is to separate volatile runtime data from stable configuration data in responses:
|
||||
|
||||
```json
|
||||
{
|
||||
"config": {
|
||||
"name": "my-alert",
|
||||
"enabled": true,
|
||||
"params": {...}
|
||||
},
|
||||
"runtime_info": {
|
||||
"last_execution": "2023-12-07T22:36:41.358Z",
|
||||
"execution_count": 42
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Volatile fields create a challenging choice: include them (causing constant configuration drift) or exclude them (losing valuable runtime information).
|
||||
|
||||
Including and separating them makes it easier for the TF provider to specifically ignore sets of fields (marking them as [readonly](https://registry.terraform.io/providers/elastic/elasticstack/latest/docs/resources/kibana_alerting_rule#read-only)) but needs to be built into the client generator.
|
||||
|
||||
### Keep fields isomorphic
|
||||
|
||||
**Changing field types break terraform**
|
||||
|
||||
Terraform expects fields to maintain the same data type throughout a resource's lifecycle. When a field that was once a string suddenly becomes an array or object, it forces breaking changes that disrupt existing user configurations.
|
||||
|
||||
This is exactly what happened when Elasticsearch APIs [changed field types](https://github.com/elastic/terraform-provider-elasticstack/issues/926):
|
||||
|
||||
```terraform
|
||||
# Version 1.0 - field was a string
|
||||
resource "elasticstack_transform" "example" {
|
||||
group_by = "field_name" # String type
|
||||
}
|
||||
|
||||
# Version 2.0 - same field became an array
|
||||
resource "elasticstack_transform" "example" {
|
||||
group_by = ["field_name"] # Array type - BREAKING CHANGE!
|
||||
}
|
||||
```
|
||||
|
||||
Another example is when the Kibana SLO API changed `group_by` from `string` to `string | string[]`.
|
||||
|
||||
Changing a field to make it more lenient might look harmless on the surface but means that the API will return values clients don't expect.
|
||||
|
||||
**Design for consistency**
|
||||
|
||||
If your API needs to support multiple data types for the same logical concept, consider these approaches:
|
||||
|
||||
1. **Use separate, clearly named fields**
|
||||
```json
|
||||
{
|
||||
"group_by_field": "field_name", // Single field
|
||||
"group_by_fields": ["field1", "field2"] // Multiple fields
|
||||
}
|
||||
```
|
||||
|
||||
2. **Use the most flexible type from the start**
|
||||
```json
|
||||
{
|
||||
"group_by": ["field_name"] // Always an array, even for single values
|
||||
}
|
||||
```
|
||||
|
||||
**When you have to support flexible types**
|
||||
|
||||
If polymorphic behavior is unavoidable, JSON strings provide a workaround—but use them sparingly:
|
||||
|
||||
```terraform
|
||||
resource "elasticstack_elasticsearch_index" "example" {
|
||||
# Structured fields for simple, predictable values
|
||||
number_of_shards = 1
|
||||
number_of_replicas = 2
|
||||
|
||||
# JSON strings only for truly complex, variable structures
|
||||
mappings = jsonencode({
|
||||
properties = {
|
||||
user = { type = "keyword" }
|
||||
timestamp = { type = "date" }
|
||||
}
|
||||
})
|
||||
}
|
||||
```
|
||||
|
||||
[Index resource schema](https://github.com/elastic/terraform-provider-elasticstack/blob/main/internal/elasticsearch/index/schema.go)
|
||||
|
||||
**What's the issue with JSON strings?**
|
||||
|
||||
`jsonencode()` and `jsondecode()` come with significant downsides:
|
||||
|
||||
- **Limited validation**: Terraform can only validate JSON syntax during plan, not the actual content structure
|
||||
- **Poor user experience**: Users lose autocompletion, type checking, and clear error messages
|
||||
- **Complex validation logic**: Content validation is left to the API layer, leading to runtime errors that could have been plan-time validation
|
||||
|
||||
**Bottom line**
|
||||
|
||||
Consistent field types make everyone's life easier—API consumers, Terraform provider developers, and your future self. When designing your API, choose field types that can accommodate future needs rather than changing them later.
|
||||
|
||||
### Make requests transactional
|
||||
|
||||
Terraform expects changes to take effect immediately. Design your API so that changes take effect immediately upon successful response.
|
||||
|
||||
Consider the scenario:
|
||||
|
||||
A user wants to create alerting rules with different configurations, depending on the space they are in.
|
||||
|
||||
In Kibana’s UI, they’d create a space, change into that space and then create the rule.
|
||||
|
||||
Through curl, the flow would be similar, although they’d probably GET the space to make sure it exists before POSTing the rule.
|
||||
|
||||
In terraform, it’s different. Both resources can (and often are) configured at the same time! In the scenario above, the client [needs the space ID](https://github.com/elastic/terraform-provider-elasticstack/blame/0826f4385a29fa58a79f785249f204e13ee3d47c/internal/clients/kibana/alerting.go#L192) to create the rule:
|
||||
|
||||
```go
|
||||
req := client.CreateRuleId(ctxWithAuth, rule.SpaceID, rule.RuleID).KbnXsrf("true").CreateRuleRequest(reqModel)
|
||||
```
|
||||
|
||||
When requests aren’t transactional, we’d need to build the POST & GET workflow into the client, adding complexity to what could have been simple.
|
||||
|
||||
Make sure this is the case by, for example, [setting \`refresh: wait\_for\`](https://github.com/elastic/kibana/blob/188669773e9cc1981b5e89f24afcfdc379ad3058/x-pack/platform/plugins/shared/alerting/server/alerts_client/alerts_client.ts#L644-L657) on the relevant indices in POST requests.
|
||||
|
||||
```ts
|
||||
private async persistAlertsHelper() {
|
||||
...
|
||||
await resolveAlertConflicts({
|
||||
logger: this.options.logger,
|
||||
esClient,
|
||||
bulkRequest: {
|
||||
refresh: 'wait_for', // Ensure the index is refreshed after the operation
|
||||
index: this.indexTemplateAndPattern.alias,
|
||||
require_alias: !this.isUsingDataStreams(),
|
||||
operations: bulkBody,
|
||||
},
|
||||
bulkResponse: response,
|
||||
ruleId: this.options.rule.id,
|
||||
ruleName: this.options.rule.name,
|
||||
ruleType: this.ruleType.id,
|
||||
});
|
||||
...
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
### Offer compare-and-swap
|
||||
|
||||
Between refreshing state and applying changes, there’s a gap where someone could modify your API. Generally it's ok to take the approach of last-write-wins.
|
||||
|
||||
If you have to consider concurrency control, support mechanisms like ETags and checksums, and the \`version\` property on saved objects.
|
||||
|
||||
### Ignore the robustness principle: don’t normalize output
|
||||
|
||||
The robustness principle says to accept input in various formats but return output in a consistent format, for example converting strings to lowercase, ordering elements in a list or changing whitespace in \`json\` strings.
|
||||
|
||||
For Terraform, don’t do this.
|
||||
|
||||
Terraform does byte-for-byte comparisons, so normalized output forces provider developers to implement logic to handle different data formats, ordering, capitalization variations and other unnecessary complexity.
|
||||
|
||||
Fortunately, the kibana client hasn’t needed to handle complex normalization yet. Let’s keep it that way, return data exactly as you received it.
|
||||
|
||||
Follow these tips, and you’ll make your API a dream to work with for Terraform provider developers.
|
||||
|
||||
### Example
|
||||
|
||||
The spaces API follows these guidelines and makes resource management with Terraform ideal.
|
||||
|
||||
Compare the [OAS specs](https://www.elastic.co/docs/api/doc/kibana/v9/operation/operation-get-spaces-space-id) with the [spaces resource configuration](https://registry.terraform.io/providers/elastic/elasticstack/latest/docs/resources/kibana_space) and you'll see why:
|
||||
|
||||
*Get a space*
|
||||
|
||||
```
|
||||
GET /api/spaces/space/{id}
|
||||
```
|
||||
|
||||
```
|
||||
curl \
|
||||
--request GET 'http://localhost:5622/api/spaces/space/{id}' \
|
||||
--header "Authorization: $API_KEY" \
|
||||
--header "elastic-api-version: 2023-10-31"
|
||||
```
|
||||
|
||||
Returns 200
|
||||
|
||||
```
|
||||
{
|
||||
"id": "test_space",
|
||||
"name": "Test Space",
|
||||
"color": null,
|
||||
"imageUrl": "",
|
||||
"initials": "ts",
|
||||
"solution": "es",
|
||||
"description": "A fresh space for testing visualisati*ons",
|
||||
"disabledFeatures": [ingestManager", "enterpriseSearch"]
|
||||
}
|
||||
```
|
||||
|
||||
*elasticstack_kibana_space*
|
||||
|
||||
```
|
||||
provider "elasticstack" {
|
||||
kibana {}
|
||||
}
|
||||
|
||||
resource "elasticstack_kibana_space" "example" {
|
||||
space_id = "test_space"
|
||||
name = "Test Space"
|
||||
description = "A fresh space for testing visualisations"
|
||||
disabled_features = ["ingestManager", "enterpriseSearch"]
|
||||
initials = "ts"
|
||||
}
|
||||
```
|
||||
|
|
@ -46,6 +46,9 @@
|
|||
{
|
||||
"id": "kibStyleGuide"
|
||||
},
|
||||
{
|
||||
"id": "kibHttpApiGuidelines"
|
||||
},
|
||||
{
|
||||
"id": "ktRFCProcess"
|
||||
},
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue