A Logstash filter plugin that parses input events from both JSON and XML files and modifies the events according to the configuration specified for the plugin. It supports the following settings:

| Setting | Input Type | Required | Default Value |
|---|---|---|---|
| document | String | No | message |
| type | String | No | type |
| mainProp | String(Uri) | Yes | - |
| cacheSize | Long | No | - |
| multipathId | Boolean | No | false |

The document setting names the event field from which the document is read. The default value is "message".

The type setting names the event field from which the document type is read. The event is processed only if that type is json or xml. The default value is "type".
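
How the document and type settings interact can be sketched as follows. This is only an illustration in Python, not the plugin's own code: the function name `select_document` and the use of a plain dict in place of a Logstash event are assumptions made for the example, and the exact-match comparison on the type value is likewise assumed.

```python
from typing import Optional, Tuple

# Only these two document types are processed, per the settings description.
SUPPORTED_TYPES = {"json", "xml"}


def select_document(event: dict,
                    document_field: str = "message",
                    type_field: str = "type") -> Optional[Tuple[str, str]]:
    """Return (doc_type, document) if the event should be processed, else None.

    `event` is a plain dict standing in for a Logstash event (an assumption).
    """
    doc_type = event.get(type_field)
    document = event.get(document_field)
    if document is None or doc_type not in SUPPORTED_TYPES:
        return None  # events of any other type are left untouched
    return doc_type, document
```

With the defaults, `select_document({"type": "json", "message": "{}"})` selects the event, while an event whose type is neither json nor xml is skipped.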

The mainProp setting is the path of the main properties file. It is required and must be a valid file path. The file must define the following four properties:

- identifier.attribute.path.xml: XPath of the identifier field
- identifier.attribute.path.json: JSONPath of the identifier field
- config.location.xml: path of the folder that holds the document-type-specific configuration files for XML documents
- config.location.json: path of the folder that holds the document-type-specific configuration files for JSON documents
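
As a rough illustration of the requirement above, the following Python sketch parses the text of a main properties file and checks that the four properties are present. The function name `parse_main_properties` is hypothetical, and the parser handles only simple `key = value` lines; it does not reflect how the plugin itself reads the file.

```python
REQUIRED_KEYS = (
    "identifier.attribute.path.xml",
    "identifier.attribute.path.json",
    "config.location.xml",
    "config.location.json",
)


def parse_main_properties(text: str) -> dict:
    """Parse simple `key = value` lines and check the four required keys."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        # Split on the first '=' only, so '=' inside an XPath predicate survives.
        key, sep, value = line.partition("=")
        if not sep:
            continue  # simplified: lines without '=' are ignored
        props[key.strip()] = value.strip()
    missing = [k for k in REQUIRED_KEYS if not props.get(k)]
    if missing:
        raise ValueError("missing required properties: " + ", ".join(missing))
    return props
```

Failing fast with a list of the missing keys is a design choice of this sketch; the plugin's behaviour for an incomplete file is not documented here.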

The cacheSize setting is the maximum number of configuration files that can be kept in the in-memory cache. The cache avoids reading the document-type-specific configuration files from disk multiple times, which would otherwise degrade performance during event filtering. If cacheSize is not specified, the cache is unbounded, which may exhaust memory.
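
The idea behind cacheSize can be sketched with a small bounded cache. This Python example is an assumption-laden illustration: the class name `ConfigCache`, the `loader` callback, and the least-recently-used eviction policy are all choices made for the sketch, since the eviction policy the plugin uses is not stated.

```python
from collections import OrderedDict
from typing import Callable, Optional


class ConfigCache:
    """Keep at most `max_size` loaded configurations; None means unbounded."""

    def __init__(self, loader: Callable[[str], dict], max_size: Optional[int] = None):
        self._loader = loader          # hypothetical function that reads <id>.conf from disk
        self._max_size = max_size
        self._entries: "OrderedDict[str, dict]" = OrderedDict()

    def get(self, identifier: str) -> dict:
        if identifier in self._entries:
            self._entries.move_to_end(identifier)  # mark as most recently used
            return self._entries[identifier]
        config = self._loader(identifier)          # cache miss: load from disk
        self._entries[identifier] = config
        if self._max_size is not None and len(self._entries) > self._max_size:
            self._entries.popitem(last=False)      # evict the least recently used entry
        return config

    def __len__(self) -> int:
        return len(self._entries)
```

With `max_size=None` the cache never evicts, which mirrors the unbounded behaviour described above and its memory risk.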

The multipathId setting controls whether the document id may be located at different paths in different documents. If it is true, the document id can be at a different path in each document. Otherwise, the document id has to be at the same path in all documents.

When multipathId is true, the main properties file can list several alternative paths separated by |OR|:

    identifier.attribute.path.xml=[xpathExpr1] |OR| [xpathExpr2] |OR| ..|OR|..[xpathExprN]
    identifier.attribute.path.json=[jsonpathExpr1] |OR| [jsonpathExpr2] |OR| ..|OR|..[jsonpathExprN]

Otherwise, each property in the main properties file holds a single path:

    identifier.attribute.path.xml=[xpathExpr]
    identifier.attribute.path.json=[jsonpathExpr]

Be careful when setting multipathId to true: if values are found at more than one of the given paths for a single document and those values are not the same, the _documentparsefailure tag is added to the event.
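
The multipathId rule can be sketched as below. This Python example is illustrative only: `resolve_identifier`, the `lookup` callback (standing in for an XPath or JSONPath evaluation), and the dict-based event are hypothetical, and returning `None` without a tag when no path matches is an assumption, since that case is not described here.

```python
from typing import Callable, List, Optional

OR_SEPARATOR = "|OR|"
FAILURE_TAG = "_documentparsefailure"


def split_paths(property_value: str) -> List[str]:
    """Split 'path1 |OR| path2 |OR| ...' into individual path expressions."""
    return [p.strip() for p in property_value.split(OR_SEPARATOR) if p.strip()]


def resolve_identifier(property_value: str,
                       lookup: Callable[[str], Optional[str]],
                       event: dict) -> Optional[str]:
    """Return the document id, or None (tagging the event) on conflicting values."""
    found = set()
    for path in split_paths(property_value):
        value = lookup(path)  # hypothetical path evaluation against the document
        if value is not None:
            found.add(value)
    if len(found) == 1:
        return next(iter(found))  # one distinct value, even if seen at several paths
    if len(found) > 1:
        event.setdefault("tags", []).append(FAILURE_TAG)  # conflicting ids
    return None
```

Collecting the values into a set means that the same id found at two paths is accepted, while two different ids trigger the failure tag.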

- Takes both XML and JSON documents from Logstash input events.
- Extracts fields from the documents according to the XML-specific and JSON-specific values in the configuration files.
- Sets the extracted values as fields of the output events, as specified in the configuration files.
- main-config.properties contains the configuration folder paths and the paths of the identifier attribute. A sample main-config.properties is given below:

      identifier.attribute.path.xml = parent/child/grandchild/id
      identifier.attribute.path.json = $.parent.child.grandchild.id
      config.location.xml = <path to xml config folder>
      config.location.json = <path to json config folder>

- Suppose the value at the identifier path in a document is id1. Then a file named id1.conf should be present in both the XML and the JSON config folder. A sample id1.conf for XML looks like:

      parent/child/grandchild/field1 => field1
      parent/child/grandchild/field2 => field2

  A sample id1.conf for JSON looks like:

      $.parent.child.grandchild.field1 => field1
      $.parent.child.grandchild.field2 => field2

With this configuration, for every document that has the value id1 at the identifier attribute path (in the document field of the input event), the plugin adds the fields field1 and field2 to the Logstash output event, each set to the value found at its respective path.
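
To make the mapping concrete, here is a small Python sketch that applies a JSON id1.conf to an event. It is not the plugin's implementation: the function names are hypothetical, the event is a plain dict, and the path resolver handles only simple dotted paths such as `$.a.b.c` rather than full JSONPath.

```python
import json


def parse_mapping(conf_text: str) -> dict:
    """Parse lines of the form '<path> => <field>' into {path: field}."""
    mapping = {}
    for line in conf_text.splitlines():
        path, sep, field = line.partition("=>")
        if sep and path.strip() and field.strip():
            mapping[path.strip()] = field.strip()
    return mapping


def get_by_simple_jsonpath(document, path: str):
    """Resolve dotted paths like '$.a.b.c'; full JSONPath is NOT supported."""
    if not path.startswith("$."):
        return None
    node = document
    for key in path[2:].split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def apply_mapping(event: dict, conf_text: str, document_field: str = "message") -> dict:
    """Copy each mapped value from the JSON document into a field of the event."""
    document = json.loads(event[document_field])
    for path, field in parse_mapping(conf_text).items():
        value = get_by_simple_jsonpath(document, path)
        if value is not None:
            event[field] = value
    return event
```

Given the sample id1.conf for JSON and a document whose grandchild object holds field1 and field2, `apply_mapping` leaves the original fields in place and adds field1 and field2 to the event.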

- Clone this filter plugin repository or download it as a zip archive.
- Obtain a copy of the Logstash codebase with the following git command:

      git clone --branch <branch_name> --single-branch https://github.com/elastic/logstash.git <target_folder>

  where branch_name is the major version of Logstash on which you want to install the plugin, and target_folder is the location of the Logstash codebase on your local system (call this LS_HOME).
- Run ./gradlew assemble in a terminal from LS_HOME. This should produce $LS_HOME/logstash-core/build/libs/logstash-core-x.y.z.jar, where x, y, and z refer to the version of Logstash.
- Create a gradle.properties file in the root folder of this cloned plugin project.
- Run ./gradlew gem from the root folder of this cloned plugin project. This task produces a gem file in the root directory of the plugin's codebase, named logstash-filter-json_xml_path_filter-1.0.0-SNAPSHOT.gem.
- Go to your logstash deployment folder.
- Run the command: bin/logstash-plugin install --no-verify --local /path/to/logstash-filter-json_xml_path_filter-1.0.0-SNAPSHOT.gem

Create a configuration file like the one below:

    input {
      file {
        path => "<some_folder_path>/*.json"
        start_position => "beginning"
        sincedb_path => "/dev/jsonDb"
        exclude => "*.gz"
        codec => json
      }
      file {
        path => "<some_folder_path>/*.xml"
        start_position => "beginning"
        sincedb_path => "/dev/xmlDb"
        exclude => "*.gz"
        codec => json
      }
    }
    filter {
      json_xml_path_filter {
        mainProp => "<some_folder_path>/testProp.properties"
        cacheSize => 10
      }
    }
    output {
      stdout { codec => rubydebug }
    }

Name the configuration file test-config.conf and place it inside the config folder of the Logstash deployment. Run the plugin with the following command from the Logstash deployment home:

    bin/logstash -f config/test-config.conf

Available at: https://drive.google.com/drive/folders/1U9Xi62tcozdczyvy79H00hoF9_sfIAT8?usp=sharing