ECS Tooling Usage

In addition to the published schema and artifacts, the ECS repo contains tools to generate artifacts based on ECS schemas and your custom field definitions.

Why Use ECS Tooling?

  • Subset Generation: ECS has ~850 fields. Generate mappings for only the fields you need.
  • Custom Fields: Painlessly maintain your own custom field mappings alongside ECS.
  • Multiple Formats: Generate Elasticsearch templates, Beats configs, CSV exports, and documentation.

For detailed developer documentation, see scripts/docs/README.md.

NOTE - These tools and their functionality are considered experimental.

Quick Start Example

Here's a complete example that generates artifacts with:

  • ECS 9.1 fields as the base
  • A subset of only needed fields
  • Custom fields added on top
  • Custom template settings

python scripts/generator.py \
  --ref v9.1.0 \
  --semconv-version v1.38.0 \
  --subset ../my-project/fields/subset.yml \
  --include ../my-project/fields/custom/ \
  --out ../my-project/

This generates:

  • my-project/generated/elasticsearch/composable/ - Modern Elasticsearch templates
  • my-project/generated/elasticsearch/legacy/ - Legacy templates
  • my-project/generated/beats/ - Beats field definitions
  • my-project/generated/csv/ - CSV field reference

Note: The --semconv-version flag is required. Use the version from the otel-semconv-version file or a specific version like v1.38.0.
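
When the generator is driven from a build script, it can help to assemble the command line in one place. The following is a minimal sketch, not part of the ECS tooling: the helper names `read_semconv_version` and `build_generator_command` are hypothetical, and the sketch assumes the otel-semconv-version file contains nothing but the version string. Only flags that appear in this document are used.

```python
from pathlib import Path
from typing import List, Optional, Sequence


def read_semconv_version(repo_root: Path) -> str:
    """Read the pinned semconv version.

    Assumption: the otel-semconv-version file holds only the version string.
    """
    return (repo_root / "otel-semconv-version").read_text(encoding="utf-8").strip()


def build_generator_command(
    semconv_version: str,
    ref: Optional[str] = None,
    subset: Optional[str] = None,
    include: Sequence[str] = (),
    out: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for scripts/generator.py (hypothetical helper)."""
    # --semconv-version is always passed because the flag is required.
    cmd = ["python", "scripts/generator.py", "--semconv-version", semconv_version]
    if ref:
        cmd += ["--ref", ref]
    if subset:
        cmd += ["--subset", subset]
    if include:
        cmd += ["--include", *include]
    if out:
        cmd += ["--out", out]
    return cmd


if __name__ == "__main__":
    # Rebuild the Quick Start command and print it for inspection.
    command = build_generator_command(
        "v1.38.0",
        ref="v9.1.0",
        subset="../my-project/fields/subset.yml",
        include=["../my-project/fields/custom/"],
        out="../my-project/",
    )
    print(" ".join(command))
```

The resulting list can be handed to `subprocess.run(command, check=True)` from the ECS repo root.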

Setup and Install

Requirements: Python 3.8+, git

Clone and setup:

git clone https://github.com/elastic/ecs
cd ecs
git checkout v9.1.0  # Optional: target specific version
pip install -r scripts/requirements.txt  # virtualenv recommended

Basic Usage

Generate artifacts from the current ECS schema:

make generate
# or
python scripts/generator.py --semconv-version v1.38.0

Key points:

  • Artifacts are created in the generated/ directory
  • Documentation is written to docs/reference/
  • Each run rewrites the entire generated/ directory
  • Must be run from the ECS repo root
  • The --semconv-version flag is required for OTel integration validation

For complete documentation on how the generator works, see scripts/docs/README.md.

Key Generator Options

Include Custom Fields

Add custom fields to ECS schemas:

python scripts/generator.py \
  --semconv-version v1.38.0 \
  --include ../myproject/custom-fields/ \
  --out ../myproject/out/

Custom field format - Use the same YAML format as ECS schemas:

---
- name: widgets
  title: Widgets
  group: 2
  short: Fields describing widgets
  description: Widget-related fields
  type: group
  fields:
    - name: id
      level: extended
      type: keyword
      short: Unique identifier of the widget
      description: Unique identifier of the widget.

Supports: Directories, multiple paths, wildcards (*.yml), combining with --ref

See also: Schema format documentation
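
In the generated artifacts, a field nested under a fieldset is addressed by a dotted name, so `id` inside `widgets` becomes `widgets.id`. The sketch below illustrates that naming with the widgets definition written as plain Python data (so no YAML parser is needed). `flat_field_names` is a hypothetical helper for illustration, not the generator's own code, and it ignores special cases such as fieldsets whose fields sit at the document root.

```python
from typing import Dict, List

# The widgets fieldset from the YAML example, trimmed to the keys used here.
CUSTOM_SCHEMA: List[dict] = [
    {
        "name": "widgets",
        "type": "group",
        "fields": [
            {"name": "id", "level": "extended", "type": "keyword"},
        ],
    }
]


def flat_field_names(schema: List[dict]) -> Dict[str, str]:
    """Map each dotted field name to its declared type (illustrative only)."""
    flat: Dict[str, str] = {}
    for fieldset in schema:
        for field in fieldset.get("fields", []):
            flat[f"{fieldset['name']}.{field['name']}"] = field["type"]
    return flat


if __name__ == "__main__":
    print(flat_field_names(CUSTOM_SCHEMA))
```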

Subset - Use Only Needed Fields

Generate artifacts with only the fields you need (reduces mapping size):

python scripts/generator.py \
  --semconv-version v1.38.0 \
  --subset ../myproject/subset.yml

Example subset file:

---
name: web_logs
fields:
  base:
    fields:
      "@timestamp": {}
  http:
    fields: "*"        # All http fields
  url:
    fields: "*"        # All url fields
  user_agent:
    fields:
      original: {}     # Specific fields only

Subset format:

  • name: Subset name (used for output directory)
  • fields: Declares which fieldsets/fields to include
    • fields: "*" - Include all fields in fieldset
    • field_name: {} - Include specific field
    • docs_only: true - Include in docs only, not artifacts

Tips:

  • Combine with --include for custom fields (custom fields must also be listed in the subset)
  • Always include the base fieldset with at least @timestamp

For detailed subset documentation with examples, see scripts/docs/schema-pipeline.md
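
To make the selection rules concrete, here is a simplified sketch of how a subset definition like web_logs narrows a field inventory. It is an illustration under stated assumptions, not the generator's implementation: `AVAILABLE` is a hypothetical, heavily shortened inventory, nested fields are written as flat names, `docs_only` is ignored, and base fields are assumed to appear without a `base.` prefix.

```python
from typing import Dict, List

# The web_logs subset from the example above, as Python data.
SUBSET = {
    "name": "web_logs",
    "fields": {
        "base": {"fields": {"@timestamp": {}}},
        "http": {"fields": "*"},
        "url": {"fields": "*"},
        "user_agent": {"fields": {"original": {}}},
    },
}

# Hypothetical, shortened inventory: fieldset name -> field names.
AVAILABLE: Dict[str, List[str]] = {
    "base": ["@timestamp", "message", "tags"],
    "http": ["request.method", "response.status_code"],
    "url": ["full", "path"],
    "user_agent": ["name", "original", "version"],
    "process": ["pid", "name"],
}


def select_fields(subset: dict, available: Dict[str, List[str]]) -> List[str]:
    """Return the dotted names of all fields the subset keeps."""
    selected: List[str] = []
    for fieldset, spec in subset["fields"].items():
        wanted = spec["fields"]  # either "*" or a mapping of field names
        for field in available.get(fieldset, []):
            if wanted == "*" or field in wanted:
                # Assumption: base fields are not prefixed with "base."
                name = field if fieldset == "base" else f"{fieldset}.{field}"
                selected.append(name)
    return sorted(selected)


if __name__ == "__main__":
    for name in select_fields(SUBSET, AVAILABLE):
        print(name)
```

Fieldsets that the subset does not mention, such as `process` here, drop out entirely, which is what keeps the resulting mapping small.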

Ref - Target Specific ECS Version

Generate artifacts from a specific ECS version:

python scripts/generator.py \
  --semconv-version v1.38.0 \
  --ref v9.0.0

Combines with other options:

# Generate from ECS v9.0.0 + experimental + custom fields
python scripts/generator.py \
  --semconv-version v1.38.0 \
  --ref v9.0.0 \
  --include experimental/schemas ../myproject/fields/custom

Loads schemas from git history (tags, branches, commits). Requires git.

Other Options

--out <directory> - Output to custom directory

python scripts/generator.py --semconv-version v1.38.0 --out ../myproject/

--exclude <files> - Remove specific fields (for testing deprecation impact)

python scripts/generator.py --semconv-version v1.38.0 --exclude deprecated-fields.yml

--strict - Enable strict validation (required for CI/CD)

python scripts/generator.py --semconv-version v1.38.0 --strict

Strict mode enforces the following conditions. If any of them is violated, the script exits with an exception:

  • Short descriptions must be less than or equal to 120 characters.
  • Example values containing arrays or objects must be quoted to avoid unexpected YAML interpretation when the schema files or artifacts are relied on downstream.
  • If a regex pattern is defined, the example values will be checked against it.
  • If expected_values is defined, the example value(s) will be checked against the list.
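
The conditions above can be approximated with a short standalone check, which may be handy for linting custom field definitions before running the generator. This is a sketch only: `strict_violations` is a hypothetical helper, the field is assumed to be a plain dict with the keys `short`, `example`, `pattern` and `expected_values`, and matching the pattern from the start of the example with `re.match` is an assumption about how the comparison is done.

```python
import re
from typing import List

MAX_SHORT_LENGTH = 120


def strict_violations(field: dict) -> List[str]:
    """Return human-readable problems for one field definition (illustrative)."""
    problems: List[str] = []

    short = field.get("short", "")
    if "\n" in short or len(short) > MAX_SHORT_LENGTH:
        problems.append(
            f"short description must be a single line of at most "
            f"{MAX_SHORT_LENGTH} characters (current length: {len(short)})"
        )

    example = field.get("example")
    if isinstance(example, (list, dict)):
        # An unquoted array or object in YAML is parsed as a list or dict.
        problems.append("example is an array or object; quote it so it stays a string")
    elif isinstance(example, str):
        pattern = field.get("pattern")
        if pattern and not re.match(pattern, example):
            problems.append(f"example {example!r} does not match pattern {pattern!r}")
        expected = field.get("expected_values")
        if expected and example not in expected:
            problems.append(f"example {example!r} is not one of the expected values")

    return problems


if __name__ == "__main__":
    too_long = {"name": "number", "short": "x" * 134}
    for problem in strict_violations(too_long):
        print(problem)
```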

Example error when running with --strict:

$ python scripts/generator.py --ref v1.4.0 --semconv-version v1.38.0 --strict
Loading schemas from git ref v1.4.0
Running generator. ECS version 1.4.0
...
ValueError: Short descriptions must be single line, and under 120 characters (current length: 134).
Offending field or field set: number
Short description:
  Unique number allocated to the autonomous system. The autonomous system number (ASN) uniquely identifies each network on the Internet.

Without --strict, the same issue produces a warning and the script continues:

$ python scripts/generator.py --ref v1.4.0 --semconv-version v1.38.0
Loading schemas from git ref v1.4.0
Running generator. ECS version 1.4.0
~/dev/ecs/scripts/generators/ecs_helpers.py:176: UserWarning: Short descriptions must be single line, and under 120 characters (current length: 134).
Offending field or field set: number
Short description:
  Unique number allocated to the autonomous system. The autonomous system number (ASN) uniquely identifies each network on the Internet.

This will cause an exception when running in strict mode.

--template-settings / --mapping-settings - Custom Elasticsearch template settings

python scripts/generator.py \
  --semconv-version v1.38.0 \
  --template-settings ../myproject/template.json \
  --mapping-settings ../myproject/mapping.json

This is an example template.json to be passed with --template-settings-legacy:

{
  "index_patterns": ["mylog-*"],
  "order": 1,
  "settings": {
    "index": {
      "mapping": {
        "total_fields": {
          "limit": 10000
        }
      },
      "refresh_interval": "1s"
    }
  },
  "template": {
    "mappings": {}
  }
}

This is an example mapping.json to be passed with --mapping-settings:

{
  "_meta": {
    "version": "1.5.0"
  },
  "date_detection": false,
  "dynamic_templates": [
    {
      "strings_as_keyword": {
        "mapping": {
          "ignore_above": 1024,
          "type": "keyword"
        },
        "match_mapping_type": "string"
      }
    }
  ],
  "properties": {}
}

The mappings object in template.json and the properties object in mapping.json are left empty. The script fills them in automatically.
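
As a rough picture of what that filling-in amounts to, the sketch below nests dotted field names into an empty `properties` object. It is an assumption-laden illustration rather than the generator's code: `fill_properties` is a hypothetical helper, the field list is invented for the example, and each field is reduced to its `type`.

```python
import copy
import json
from typing import Dict

# A trimmed version of the mapping.json example, with properties left empty.
MAPPING_SETTINGS = {
    "_meta": {"version": "1.5.0"},
    "date_detection": False,
    "properties": {},
}


def fill_properties(mapping_settings: dict, flat_fields: Dict[str, str]) -> dict:
    """Return a copy of the settings with dotted fields nested under properties."""
    result = copy.deepcopy(mapping_settings)  # leave the caller's dict untouched
    for dotted_name, field_type in flat_fields.items():
        node = result["properties"]
        *parents, leaf = dotted_name.split(".")
        for parent in parents:
            # Each parent becomes an object that carries its own properties.
            node = node.setdefault(parent, {}).setdefault("properties", {})
        node[leaf] = {"type": field_type}
    return result


if __name__ == "__main__":
    # Hypothetical field list for illustration.
    fields = {"@timestamp": "date", "widgets.id": "keyword"}
    print(json.dumps(fill_properties(MAPPING_SETTINGS, fields), indent=2))
```

Settings outside `properties`, such as `date_detection`, pass through unchanged, which is why they can be customized safely in the input file.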

--intermediate-only - Generate only intermediate files (for debugging)

--force-docs - Generate docs even with --subset/--include/--exclude

Additional Resources

Complete Documentation

Module-Specific Guides

Contributing