-
})
OTRM Mask Rules
-
Create an OTRM mask rule to replace an expression with a mask string.
+
})
OTRM Hash and Mask Rules
+
+Create an OTRM hash or mask rule to replace an expression with a hash code or a mask string.
diff --git a/docs/send-data/opentelemetry-collector/remote-management/processing-rules/mask-rules.md b/docs/send-data/opentelemetry-collector/remote-management/processing-rules/mask-rules.md
index c4362748c6..14ab6bd494 100644
--- a/docs/send-data/opentelemetry-collector/remote-management/processing-rules/mask-rules.md
+++ b/docs/send-data/opentelemetry-collector/remote-management/processing-rules/mask-rules.md
@@ -1,21 +1,118 @@
---
id: mask-rules
-title: OpenTelemetry Remote Management Mask Rules
-sidebar_label: Mask Rules
-description: Create an OpenTelemetry collector remote management mask rule to replace an expression with a mask string.
+title: OpenTelemetry Remote Management Hash and Mask Rules
+sidebar_label: Hash and Mask Rules
+description: Use hash and mask processing rules to replace an expression with a hash code or a mask string.
---
+## OpenTelemetry Remote Management Hash Rules
+
+A hash rule is a processing rule that allows you to replace an expression with a hash code generated for that value. Hashed data is completely hidden (obfuscated) before being sent to Sumo Logic. This can be very useful in situations where certain types of data must not leave your premises, such as credit card numbers and Social Security numbers. Each unique value will have a unique hash code.
+
+The hash algorithm used is **SHA-256**.
+
+Ingestion volume is calculated after the hash filter is applied. If the hash reduces the log size, the smaller size will be measured against ingestion limits.
+
:::note
-This document does not cover masking logs for Windows source templates. For details on masking logs for Windows, refer to [Mask Rules for the Windows Source Template](mask-rules-windows.md).
+Hash rules are currently available only for the Local File source template.
+:::
+
+### How it works
+
+When you add a hash rule action to your processing rules, you need to provide two inputs:
+
+1. **Expression**: A regular expression that must contain exactly **one capture group** `( )`. The string value matched by this capture group will be hashed using SHA-256. If multiple parts of the string need to be hashed, add additional hashing rules for them.
+
+2. **Replacement Format**: The formatted replacement string that will replace the matching string in the log. Use `%s` to refer to the hashed value from the SHA-256 function. The `%s` reference is mandatory and can only be used once.
+
+### Examples
+
+#### Hash a password
+
+For example, to hash the password `Welcome123` from this log:
+
+```
+user=sumo password=Welcome123
+```
+
+You could use the following configuration:
+
+**Expression:**
+```
+password=([A-Za-z0-9]+)
+```
+
+**Replacement Format:**
+```
+password=%s
+```
+
+**Result:**
+- **Matching string**: `password=Welcome123`
+- **Capture group**: `Welcome123` (this value is hashed)
+- **Output log**: `user=sumo password=<SHA-256 hash>`
+
+Where `<SHA-256 hash>` is the SHA-256 hash of `Welcome123`.
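The password example above can be approximated outside the collector. This is a minimal Python sketch, not the collector's implementation: the function name `apply_hash_rule` is hypothetical, Python's `re` module stands in for the RE2 engine, and it assumes the hash is the lowercase hex digest of the UTF-8 bytes of the captured value.

```python
import hashlib
import re


def apply_hash_rule(line, expression, replacement_format):
    """Replace each match of `expression` with `replacement_format`,
    substituting %s with the SHA-256 hex digest of capture group 1."""
    pattern = re.compile(expression)

    def _replace(match):
        # Assumption: the digest is taken over the UTF-8 bytes of the captured text.
        digest = hashlib.sha256(match.group(1).encode("utf-8")).hexdigest()
        # %s is expected exactly once in the replacement format.
        return replacement_format.replace("%s", digest, 1)

    return pattern.sub(_replace, line)


hashed = apply_hash_rule(
    "user=sumo password=Welcome123",
    r"password=([A-Za-z0-9]+)",
    "password=%s",
)
print(hashed)  # user=sumo password= followed by 64 hexadecimal characters
```

Running a sketch like this against a sample log line is a quick way to confirm that an expression captures only the value you intend to hash.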
+
+#### Hash member IDs
+
+To hash member IDs from this log:
+
+```
+2012-05-16 09:43:39,607 -0700 DEBUG [hostId=prod-cass-raw-8] [module=RAW] [logger=scala.raw.InboundRawProtocolHandler] [memberid=dan@demo.com] [remote_ip=98.248.40.103] [web_session=19zefhqy...] [session=80F1BD83AEBDF4FB] [customer=0000000000000005] [call=InboundRawProtocol.getMessages]
+```
+
+You could use the following configuration:
+
+**Expression:**
+```
+memberid=([^\]]+)
+```
+
+**Replacement Format:**
+```
+memberid=%s
+```
+
+**Resulting hashed log:**
+
+```
+2012-05-16 09:43:39,607 -0700 DEBUG [hostId=prod-cass-raw-8] [module=RAW] [logger=scala.raw.InboundRawProtocolHandler] [memberid=906e9cc124c8e1085b10e1cec4cc6526f3637558be361d3b4bb54bb537e49a49] [remote_ip=98.248.40.103] [web_session=19zefhqy...] [session=80F1BD83AEBDF4FB] [customer=0000000000000005] [call=InboundRawProtocol.getMessages]
+```
+
+:::important
+Any hashing expression should be tested and verified on a sample source file before being applied to your production logs.
+:::
+
+### Rules and limitations
+
+* The regular expression must contain exactly **one capture group** enclosed in `( )`. Values inside this capture group will be hashed. If multiple parts of the string need to be hashed, add additional hashing rules for them.
+
+* You can use an anchor to detect specific values in your logs. Only the value within the capture group will be hashed.
+
+* The hash algorithm is **SHA-256** (MD5 is not supported for OpenTelemetry collectors).
+
+* Make sure you do not specify a regular expression that matches a full log line. Doing so will hash the entire log line.
+
+* The replacement format must include `%s` exactly once to reference the hashed value.
+
+* Do not unnecessarily match on more of the log than needed. Use precise regular expressions to ensure that only the intended sensitive information is hashed, not the surrounding context.
+
+* Each unique value will produce a unique hash code. The same input value will always produce the same hash output, allowing you to correlate occurrences while keeping the actual value hidden.
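The capture-group and `%s` constraints above can be checked before a rule is deployed. The following is a hedged sketch: `validate_hash_rule` is a hypothetical helper, and Python's `re` module is used in place of RE2, so an expression that compiles here is not guaranteed to be RE2 compliant.

```python
import re


def validate_hash_rule(expression, replacement_format):
    """Return a list of problems; an empty list means the rule passes these checks."""
    try:
        pattern = re.compile(expression)
    except re.error as exc:
        return [f"invalid regular expression: {exc}"]

    problems = []
    # pattern.groups counts capturing groups only; (?:...) groups are not counted.
    if pattern.groups != 1:
        problems.append(f"expected exactly one capture group, found {pattern.groups}")
    if replacement_format.count("%s") != 1:
        problems.append("replacement format must contain %s exactly once")
    return problems


print(validate_hash_rule(r"memberid=([^\]]+)", "memberid=%s"))  # []
```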
+
+## OpenTelemetry Remote Management Mask Rules
+
+:::note
+This document does not cover masking logs for Windows source templates. For details on masking logs for Windows, refer to [OpenTelemetry Remote Management Windows Source Template Mask Rules](/docs/send-data/opentelemetry-collector/remote-management/processing-rules/mask-rules-windows/).
:::
A mask rule is a processing rule that hides irrelevant or sensitive information from logs before they are ingested. When you create a mask rule, the selected expression will be replaced with a mask string before the data is sent to Sumo Logic. You can either specify a custom mask string or use the default `"#####"`.
Ingestion volume is calculated after applying the mask filter. If the mask reduces the size of the log, the smaller size will be measured against ingestion limits. Masking is an effective method to reduce overall ingestion volume.
-## Examples
+### Examples
-### Mask an email address
+#### Mask an email address
For example, to mask the email address `dan@demo.com` from this log:
@@ -36,13 +133,11 @@ Using the masking string `auth=User:AAA` would produce the following result:
Any masking expression should be tested and verified with a sample source file before applying it to your production logs.
:::
-### Mask credit card numbers
+#### Mask credit card numbers
You can mask credit card numbers from log messages using a regular expression within a mask rule. Once masked with a known string, you can then perform a search for that string within your logs to detect if credit card numbers may be leaking into your log files.
-To mask credit card numbers in logs, you can use a masking filter with the following regular expression:
-
-The following regular expression can be used within a masking filter to mask American Express, Visa (16 digit only), Mastercard, and Discover credit card numbers:
+The following regular expression can be used within a masking filter to mask American Express, Visa (16-digit only), Mastercard, and Discover credit card numbers:
```
((?:(?:4\d{3})|(?:5[1-5]\d{2})|6(?:011|5[0-9]{2}))(?:-?|\040?)(?:\d{4}(?:-?|\040?)){3}|(?:3[4,7]\d{2})(?:-?|\040?)\d{6}(?:-?|\040?)\d{5})
@@ -58,7 +153,7 @@ Samples include:
* **Discover**. 6011-0009-9013-9424 \| 6500000000000002 \| 6011 0009 9013 9424
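To try the credit card expression against sample values before applying it, a small Python sketch such as the one below may help. It copies the regular expression from this page verbatim and uses the default mask string `#####`. The name `mask_cards` is hypothetical, and Python's `re` module is assumed to behave like the collector's RE2 engine for this pattern.

```python
import re

# The expression from this page, split across two raw strings for readability.
CARD_PATTERN = re.compile(
    r"((?:(?:4\d{3})|(?:5[1-5]\d{2})|6(?:011|5[0-9]{2}))(?:-?|\040?)(?:\d{4}(?:-?|\040?)){3}"
    r"|(?:3[4,7]\d{2})(?:-?|\040?)\d{6}(?:-?|\040?)\d{5})"
)


def mask_cards(line, mask="#####"):
    """Replace every credit card number matched by CARD_PATTERN with the mask string."""
    return CARD_PATTERN.sub(mask, line)


for sample in ("6011-0009-9013-9424", "6500000000000002", "6011 0009 9013 9424"):
    print(mask_cards(f"card={sample} status=ok"))
```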
-## Rules and limitations
+### Rules and limitations
* Expressions that you want masked must be matched by the regular expression you provide. The masking string replaces the entire string matched by the regular expression.
diff --git a/docs/send-data/opentelemetry-collector/remote-management/processing-rules/overview.md b/docs/send-data/opentelemetry-collector/remote-management/processing-rules/overview.md
index 66171f4454..7409dd8eda 100644
--- a/docs/send-data/opentelemetry-collector/remote-management/processing-rules/overview.md
+++ b/docs/send-data/opentelemetry-collector/remote-management/processing-rules/overview.md
@@ -2,6 +2,7 @@
id: overview
title: OpenTelemetry Remote Management Processing Rules
sidebar_label: Overview
+description: Get an overview of how to use processing rules to specify what kind of data is sent to Sumo Logic using OpenTelemetry remote management.
---
import useBaseUrl from '@docusaurus/useBaseUrl';
@@ -10,11 +11,12 @@ Processing rules affect only the data sent to Sumo Logic; logs and metrics on y
## Logs collection
-Processing rules for logs collection support the following rule types:
+Processing rules for log collection support the following rule types:
-* [Exclude messages that match](include-and-exclude-rules.md). Remove messages that you do not want to send to Sumo Logic at all ("denylist" filter). These messages are skipped by OpenTelemetry Collector and are not uploaded to Sumo Logic.
+* [Exclude messages that match](include-and-exclude-rules.md). Remove messages that you do not want to send to Sumo Logic at all ("denylist" filter). These messages are skipped by the OpenTelemetry Collector and are not uploaded to Sumo Logic.
* [Include messages that match](include-and-exclude-rules.md). Send only the data you'd like in your Sumo Logic account (an "allowlist" filter). This type of rule can be useful, for example, if you only want to include messages coming from a firewall.
-* [Mask messages that match](mask-rules.md). Replace an expression with a mask string that you can customize. This is another way to your protect data, such as passwords, that you do not normally track.
+* [Mask messages that match](mask-rules.md). Replace an expression with a customizable mask string. This is another way to protect data you do not normally track, such as passwords.
+* [Hash messages that match](mask-rules.md). Replace an expression with a hash code generated for that value. This completely obscures sensitive data, such as credit card numbers and Social Security numbers, before it is sent to Sumo Logic.
## Metrics collection
@@ -25,13 +27,13 @@ Processing rules for metrics collection support the following rule types:
## How do processing rules work together?
-You can create one or more processing rules for a source template, combining the different types of filters to generate the exact data set you want sent to Sumo Logic.
+You can create one or more processing rules for a source template, combining different filter types to generate the exact dataset you want sent to Sumo Logic.
-When a Source has multiple rules they are processed in the following order: includes, excludes, masks.
+When a source has multiple rules, they are processed in the following order: include rules, then exclude rules, then hash and mask rules in the order in which they are defined.
-Exclude rules take priority over include rules. Include rules are processed first, however, if an exclude rule matches data that matched the include rule filter, the data is excluded.
+Exclude rules take priority over include rules. Include rules are processed first. However, if an exclude rule matches data that matched the include rule filter, the data is excluded.
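The ordering described above can be illustrated with a short Python sketch. This is an illustration under stated assumptions rather than the collector's actual logic: the `process` function and its parameters are hypothetical, and it assumes that when include rules exist a message must match at least one of them.

```python
import re


def process(line, includes=(), excludes=(), transforms=()):
    """Return the transformed line, or None if the line is dropped."""
    # 1. Include rules: assumed to require a match with at least one include rule.
    if includes and not any(re.search(p, line) for p in includes):
        return None
    # 2. Exclude rules take priority: an excluded line is dropped even if it was included.
    if any(re.search(p, line) for p in excludes):
        return None
    # 3. Hash and mask rules run in the order in which they are listed.
    for transform in transforms:
        line = transform(line)
    return line


def mask_password(line):
    """Hypothetical mask rule using the default mask string."""
    return re.sub(r"password=\S+", "password=#####", line)


print(process("fw user=a password=x", includes=[r"fw"], transforms=[mask_password]))
print(process("fw debug", includes=[r"fw"], excludes=[r"debug"]))
```

The second call returns `None` because the exclude rule matches a line that the include rule had accepted.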
## Limitations
* Regular expressions must be [RE2 compliant](https://github.com/google/re2/wiki/Syntax).
-* Processing rules are tested with maximum of 20 rules.
+* Processing rules are tested with a maximum of 20 rules.
diff --git a/sidebars.ts b/sidebars.ts
index a531616215..310812ab11 100644
--- a/sidebars.ts
+++ b/sidebars.ts
@@ -304,6 +304,7 @@ module.exports = {
collapsed: true,
link: {type: 'doc', id: 'send-data/opentelemetry-collector/remote-management/processing-rules/index'},
items:[
+ 'send-data/opentelemetry-collector/remote-management/processing-rules/overview',
'send-data/opentelemetry-collector/remote-management/processing-rules/include-and-exclude-rules',
'send-data/opentelemetry-collector/remote-management/processing-rules/mask-rules',
'send-data/opentelemetry-collector/remote-management/processing-rules/mask-rules-windows',