[HTTP] Follow up on dev doc additions for terraform-friendly HTTP APIs (#225317)

## Summary

Follow up PR https://github.com/elastic/kibana/pull/224348

Expanded the original document with 3 new sections.

<img width="753" alt="Screenshot 2025-06-25 at 17 13 13"
src="https://github.com/user-attachments/assets/9ccb5da4-dbd0-4c35-bd76-3b4d5cd7fa2f"
/>

<img width="760" alt="Screenshot 2025-06-25 at 17 13 19"
src="https://github.com/user-attachments/assets/32ba114a-e50d-4e38-9a0d-f62dc14f988b"
/>

(We can consider deleting this last section, as I'm not sure it'll be
worth it)
<img width="756" alt="Screenshot 2025-06-25 at 17 13 28"
src="https://github.com/user-attachments/assets/143666aa-78fa-42ab-880a-f5428ae4183f"
/>

---------

Co-authored-by: Christiane (Tina) Heiligers <christiane.heiligers@elastic.co>
Co-authored-by: florent-leborgne <florent.leborgne@elastic.co>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Jean-Louis Leysens 2025-06-27 12:38:49 +02:00 committed by GitHub
parent e1cbc3c9f6
commit 506db10ae9


@ -19,7 +19,7 @@ Terraform can work with any API, but some APIs are easier to deal with than othe
### Think in terms of resources
APIs that stick to a consistent set of arguments or parameters are much easier to map into Terraform. Terraform can describe the "thing" your API is managing. RPC-based APIs are harder to work with since they're not declarative. Your HTTP APIs should describe REST-like actions (GET, POST, DELETE, etc.) against resources, not remote procedures like `executeJob`.
APIs designed around resources are easier to support than those focused on action-oriented endpoints.
@ -99,16 +99,140 @@ resource "elasticstack_elasticsearch_index_lifecycle" "my_ilm" {
}
```
### Implement complete CRUD operations
For Terraform to properly manage resources, your API must implement a complete set of CRUD operations on each resource. This is essential for the Terraform lifecycle (create, read, update, delete) to work properly.
#### Required endpoints for each resource
For every resource type, implement these HTTP endpoints:
```
GET /api/resource/{id} # Read - retrieve an existing resource
POST /api/resource # Create - create a new resource
PUT /api/resource/{id} # Update - update an existing resource
DELETE /api/resource/{id} # Delete - remove an existing resource
GET /api/resource # List - retrieve all resources (with pagination)
```
<DocCallOut title="Terraform import">
The "List" operation is critical for Terraform's data source and import functionality ([see the docs](https://developer.hashicorp.com/terraform/cli/import)).
</DocCallOut>
#### Implementation considerations
1. **Consistent response structures**: Ensure GET and POST/PUT responses return the same structure with identical fields.
2. **Idempotent operations**: POST for creation and PUT for updates should be idempotent - running the same request multiple times should result in the same state. Relatedly, reads (GETs) should be free of side effects.
3. **Complete state**: After each operation, return the complete state of the resource, not just acknowledgment.
```json
// Good - returns complete state
{
  "id": "my-resource", // it is best to call this field "id"
  "name": "My Resource",
  "config": { "setting1": "value1" },
  "created_at": "2025-06-24T08:15:30Z",
  "updated_at": "2025-06-24T08:15:30Z"
}

// Bad - returns only acknowledgment
{
  "result": "success",
  "message": "Resource created successfully"
}
```
4. **HTTP status codes**: Use appropriate HTTP status codes:
- `200/201` for successful operations
- `404` when a resource doesn't exist
- `409` for conflicts (e.g., resource already exists)
- `400` for validation errors
5. **Validation**: Validate input at creation/update time and return comprehensive errors.
#### Example: Complete API for a resource
```typescript
// Retrieve 1 resource
router.get(
  {
    path: '/api/resource/{id}',
    validate: {
      params: schema.object({
        id: schema.string(),
      }),
      // note: GET routes should not accept request bodies
    },
  },
  async (context, request, response) => {
    // Implementation...
    return response.ok({ body: completeResourceState });
  }
);

// Create, update
router.(post|put)(
  {
    path: '/api/resource',
    validate: {
      body: schema.object({
        name: schema.string(),
        config: schema.object({...}),
      }),
    },
  },
  async (context, request, response) => {
    // Implementation...
    return response.ok({ body: completeResourceState });
  }
);

// Delete
router.delete(
  {
    path: '/api/resource/{id}',
    validate: {
      params: schema.object({
        id: schema.string(),
      }),
    },
  },
  async (context, request, response) => {
    // Implementation...
    return response.ok();
  }
);

// List many resources
router.get(
  {
    path: '/api/resource',
    validate: {
      query: schema.object({
        page: schema.maybe(schema.number()),
        perPage: schema.maybe(schema.number()),
      }),
    },
  },
  async (context, request, response) => {
    // Implementation...
    return response.ok({
      body: {
        items: resources,
        total: totalCount,
        page: currentPage,
        perPage: itemsPerPage,
      },
    });
  }
);
```
#### API-imposed challenges
API design decisions can create real challenges for Terraform resource implementation. The Elasticsearch [Index Settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-settings) (`PUT /{index}/_settings`) treats static settings (which require index recreation) and dynamic settings (which can be updated in place) identically. This forces the implementation to handle all settings together, because the API doesn't offer a way to distinguish them upfront.
```json
{
  "error": {
    "reason": "Can't update non dynamic settings [[index.codec, index.number_of_shards]] for open indices"
  }
}
```
@ -116,6 +240,100 @@ API design decisions can create real challenges for Terraform resource implement
A better API design would separate these concerns with different endpoints or provide metadata about which settings require recreation.
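As a sketch of that alternative, the settings response could carry per-setting metadata telling clients whether a change can be applied in place. All names below are hypothetical, not an existing Elasticsearch API:

```typescript
// Hypothetical response shape: each setting declares whether it is dynamic.
interface SettingDescriptor {
  value: string | number;
  dynamic: boolean; // false => changing it requires index recreation
}

// What a metadata-aware GET of index settings could return
const settings: Record<string, SettingDescriptor> = {
  'index.number_of_replicas': { value: 2, dynamic: true },
  'index.number_of_shards': { value: 1, dynamic: false },
  'index.codec': { value: 'best_compression', dynamic: false },
};

// A Terraform provider could then split a plan into an in-place update and a
// recreation without hard-coding knowledge about individual settings.
const needsRecreation = Object.entries(settings)
  .filter(([, setting]) => !setting.dynamic)
  .map(([name]) => name);
```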
### Handle resource references and dependencies
Terraform configurations frequently define multiple resources that depend on each other. Your API design needs to account for these interdependencies in a way that enables Terraform's declarative model to work smoothly.
#### Use stable, predictable resource identifiers
Resource references must use identifiers that remain stable throughout a resource's lifecycle and across operations:
```json
// Resource reference using stable ID
{
  "name": "My Report",
  "space_id": "marketing", // Reference to a space by ID
  "visualization_ids": ["vis-123"] // References to visualizations by ID
}
```
Note: provide users with a way to choose their own ID when creating resources; otherwise, use auto-generated UUIDs.
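A minimal sketch of that pattern (the handler logic and names are assumptions, using Node's built-in UUID generator):

```typescript
import { randomUUID } from 'crypto';

// Sketch: the create body marks `id` as optional (e.g. schema.maybe(schema.string())),
// so callers can pick a stable, meaningful id; otherwise we generate a UUID.
interface CreateResourceBody {
  id?: string; // client-chosen identifier, optional
  name: string;
}

function resolveId(body: CreateResourceBody): string {
  return body.id ?? randomUUID();
}
```

Client-chosen IDs matter for Terraform because they make `terraform import` and cross-resource references predictable.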
#### Support referencing resources by identifiers, not just names
Ensure your API accepts references by ID, not just by name or other properties that may change:
```typescript
// Good: Reference by ID
router.post(
  {
    path: '/api/alerting/rule',
    validate: {
      body: schema.object({
        name: schema.string(),
        space_id: schema.string(), // Space ID reference
        actions: schema.arrayOf(schema.object({
          connector_id: schema.string(), // Connector ID reference
          group: schema.string(),
          // etc.
        })),
      }),
    },
  },
  handler
);
```
#### Use consistent reference patterns across APIs
Apply the same patterns for referencing resources across all your APIs:
1. **Consistent naming**: Use the same suffix for IDs (e.g., `space_id`, `dashboard_id`, etc.).
2. **Consistent structure**: Use the same structure for referencing resources in all APIs.
3. **Consistent validation**: Apply the same validation rules across all APIs.
#### Example: Resource with dependencies
```typescript
// Generic object API that references other objects and spaces
router.post(
  {
    path: '/api/objects/object',
    validate: {
      body: schema.object({
        title: schema.string(),
        description: schema.maybe(schema.string()),
        // Reference to space by ID
        space_id: schema.string(),
        // References to other objects by ID
        other_objects: schema.arrayOf(
          schema.object({
            object_id: schema.string(),
          })
        ),
      }),
    },
  },
  async (context, request, response) => {
    // Implementation with dependency validation
    const validationResult = await validateDependencies(request);
    if (!validationResult.valid) {
      return response.badRequest({
        body: {
          message: 'Validation failed',
          validation: {
            dependencies: validationResult.errors,
          },
        },
      });
    }
    // Proceed with creation
    // ...
  }
);
```
### Design human-friendly APIs
APIs with clear, descriptive field names and logical resource structures make both manual testing and Terraform resource development much smoother.
@ -152,8 +370,8 @@ mappings = jsonencode({
})
```
The hiccup here is with validation. Terraform just isn't designed to do validation (or testing!) like we would with conventional languages.
It doesn't validate what strings contain, only that they're parsable during `terraform plan`.
Resource developers often fall back to the API layer for validation checks, which can lead to plan/apply issues.
### Return errors early—and all at once
@ -222,7 +440,7 @@ OpenAPI specifications make generating consistent Kibana and Fleet clients possi
```bash
# From Makefile - generates standardized clients
internal/clients/fleet/:
oapi-codegen -package fleet fleet-openapi.json
```
@ -264,9 +482,9 @@ Terraform import brings existing resources under management by reading their cur
### Be predictable
Terraform relies on the same input resulting in the same output. Avoid using fields like "last modified time"—Terraform can't compare those meaningfully. This can be tricky with sensitive fields or when your API relies on randomness.
Constantly changing fields in API responses create a difficult situation for Terraform implementations because they cause constant drift.
Timestamps and other time-based fields aren't predictable:
@ -283,7 +501,7 @@ The [Alerting Rules API](https://www.elastic.co/docs/api/doc/kibana/operation/op
```
"execution_status": {
"status": "active",
"status": "active",
"last_execution_date": "2023-12-07T22:36:41.358Z", // Changes on every read
"last_duration": 736
}
@ -291,7 +509,7 @@ The [Alerting Rules API](https://www.elastic.co/docs/api/doc/kibana/operation/op
Users see Terraform detecting "changes" on every refresh, even when nothing in their configuration has actually changed!
An alternative is to separate volatile runtime data from stable configuration data in responses:
```json
{
@ -307,7 +525,7 @@ An alternative is to separate volatile runtime data from stable configuration da
}
```
Volatile fields create a challenging choice: include them (causing constant configuration drift) or exclude them (losing valuable runtime information).
Including and separating them makes it easier for the TF provider to specifically ignore sets of fields (marking them as [readonly](https://registry.terraform.io/providers/elastic/elasticstack/latest/docs/resources/kibana_alerting_rule#read-only)) but needs to be built into the client generator.
@ -325,13 +543,13 @@ resource "elasticstack_transform" "example" {
group_by = "field_name" # String type
}
# Version 2.0 - same field became an array
resource "elasticstack_transform" "example" {
group_by = ["field_name"] # Array type - BREAKING CHANGE!
}
```
Another example is when the Kibana SLO API changed `group_by` from `string` to `string | string[]`.
Changing a field to make it more lenient might look harmless on the surface but means that the API will return values clients don't expect.
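One mitigation (a sketch, not how the SLO API actually behaves) is to keep echoing the narrow shape the caller used, even after widening the accepted input type:

```typescript
type GroupBy = string | string[];

// Store a canonical array internally, but remember which shape the caller sent
// so responses keep matching what clients (and Terraform state) already hold.
function presentGroupBy(stored: string[], sentAsString: boolean): GroupBy {
  return sentAsString && stored.length === 1 ? stored[0] : stored;
}
```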
@ -363,7 +581,6 @@ resource "elasticstack_elasticsearch_index" "example" {
# Structured fields for simple, predictable values
number_of_shards = 1
number_of_replicas = 2
# JSON strings only for truly complex, variable structures
mappings = jsonencode({
properties = {
@ -396,9 +613,9 @@ Consider the scenario:
A user wants to create alerting rules with different configurations, depending on the space they are in.
In Kibana's UI, they'd create a space, change into that space and then create the rule.
Through curl, the flow would be similar, although they'd probably GET the space to make sure it exists before POSTing the rule.
In Terraform, it's different: both resources can be (and often are) configured at the same time! In the scenario above, the client [needs the space ID](https://github.com/elastic/terraform-provider-elasticstack/blame/0826f4385a29fa58a79f785249f204e13ee3d47c/internal/clients/kibana/alerting.go#L192) to create the rule:
@ -434,7 +651,7 @@ private async persistAlertsHelper() {
### Offer compare-and-swap
Between refreshing state and applying changes, there's a gap in which someone else could modify the resource through your API. Generally it's ok to take the approach of last-write-wins.
If you have to consider concurrency control, support mechanisms like ETags and checksums, and the `version` property on saved objects.
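A compare-and-swap check could look like this sketch, with a content hash standing in for a real ETag or saved-object `version` (the helper names are illustrative):

```typescript
import { createHash } from 'crypto';

// Derive an entity tag from the stored document.
const etagOf = (doc: unknown): string =>
  createHash('sha256').update(JSON.stringify(doc)).digest('hex');

// If the client sends If-Match, reject the write when the tag no longer matches
// (someone else wrote in between); otherwise fall back to last-write-wins.
function writeAllowed(currentDoc: unknown, ifMatch?: string): boolean {
  if (ifMatch === undefined) return true; // no CAS requested
  return ifMatch === etagOf(currentDoc); // false => respond 412 Precondition Failed
}
```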
@ -442,13 +659,12 @@ If you have to consider concurrency control, support mechanisms like ETags and c
The robustness principle says to accept input in various formats but return output in a consistent format, for example converting strings to lowercase, ordering elements in a list or changing whitespace in `json` strings.
For Terraform, don't do this.
Terraform does byte-for-byte comparisons, so normalized output forces provider developers to implement logic to handle different data formats, ordering, capitalization variations and other unnecessary complexity.
Fortunately, the Kibana client hasn't needed to handle complex normalization yet. Let's keep it that way: return data exactly as you received it.
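Concretely, that can mean storing and echoing the caller's raw bytes instead of re-serializing them (a sketch; `storeAndEcho` is a made-up helper):

```typescript
// Validate the payload without normalizing it: parse to check it's JSON,
// but store and return the original string untouched.
function storeAndEcho(rawBody: string): { stored: string; echoed: string } {
  JSON.parse(rawBody); // throws on invalid JSON
  return { stored: rawBody, echoed: rawBody }; // never JSON.stringify(parsed)
}
```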
Follow these tips, and you'll make your API a dream to work with for Terraform provider developers.
### Example
@ -500,3 +716,129 @@ resource "elasticstack_kibana_space" "example" {
}
```
### (Optional) Support bulk operations
Terraform configurations may manage many resources at once. When operating at scale, API efficiency may be a crucial factor. Supporting bulk operations allows Terraform providers to optimize performance and provide a better experience for users managing many resources.
<DocCallOut title="Reach out to Core for guidance" color="warning">
You may not need to implement bulk operations. Using bulk APIs may require additional work in
the Terraform provider implementation, and your bulk API needs careful design to work well as Terraform configuration.
</DocCallOut>
#### Why bulk operations matter for Terraform
When a Terraform configuration manages hundreds of similar resources, individual API calls for each resource can lead to:
- **Performance bottlenecks**: Each HTTP request adds network latency
- **Rate limiting issues**: Many sequential calls may trigger rate limiting
#### Implement efficient bulk endpoints
Add these bulk operation endpoints to complement your individual resource operations:
```
POST /api/resource/_bulk_create # Create multiple resources
POST /api/resource/_bulk_update # Update multiple resources
POST /api/resource/_bulk_delete # Delete multiple resources
```
#### Design principles for bulk APIs
1. **Atomic operations**: If one operation fails, provide clear options:
- Allow partial success with detailed reports
- Support all-or-nothing transactions when needed
2. **Consistent response format**: Return individual status for each item in the batch:
```json
{
  "items": [
    {
      "id": "resource-1",
      "status": "created",
      "result": { /* complete resource state */ }
    },
    {
      "id": "resource-2",
      "status": "error",
      "error": { "message": "Validation failed", "code": 400 }
    }
  ],
  "took": 42,
  "errors": true
}
```
3. **Reasonable batch sizes**: Document recommended batch sizes and implement server-side limits
4. **Idempotency**: Ensure bulk operations are idempotent for retry safety
5. **Consistent error handling**: Provide detailed errors for each item in the batch
#### Example: Bulk resource creation
```typescript
// Bulk create resources
router.post(
  {
    path: '/api/resources/_bulk_create',
    validate: {
      body: schema.arrayOf(
        schema.object({
          id: schema.string(), // Client-specified ID
          title: schema.string(),
          type: schema.string(),
          space_id: schema.string(),
          // etc.
        })
      ),
    },
  },
  async (context, request, response) => {
    const resources = request.body;
    const results = [];
    // Process each item in the batch
    for (const resource of resources) {
      try {
        const result = await createResource(resource);
        results.push({
          id: resource.id,
          status: 'created',
          result,
        });
      } catch (error) {
        results.push({
          id: resource.id,
          status: 'error',
          error: {
            message: error.message,
            code: error.statusCode || 500,
          },
        });
      }
    }
    const hasErrors = results.some(item => item.status === 'error');
    return response.ok({
      body: {
        items: results,
        took: Date.now() - request.info.received,
        errors: hasErrors,
      },
    });
  }
);
```
#### Optimizations for Terraform providers
When implementing bulk operations, consider these Terraform-specific optimizations:
1. **Batch size recommendations**: Document optimal batch sizes to help provider developers maximize performance
2. **Error mapping**: Design error responses that map cleanly to Terraform error handling patterns
3. **State synchronization**: Include complete resource state in responses to avoid additional GET requests