26 changes: 26 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,26 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Added
- **Parser Options and Configurability**: Added opt-in lenient parsing modes for handling real-world HL7 messages containing unknown segments or malformed data.
  - New `ParserOptions` class for configuring parser behavior (thread-safe)
  - New `ParserConfiguration` static class for global defaults
  - New `ParserWarning` class for structured warning metadata
  - New `ParseResult<T>` class for deserialization results with warnings
  - New enums: `UnknownSegmentHandling`, `MalformedSegmentHandling`, `ParserWarningType`
  - Optional `ParserOptions` parameter on `MessageSerializer.Deserialize<T>()` (backward compatible)
  - New `MessageSerializer.DeserializeWithWarnings<T>()` method
  - Warning collection and event-based notification at both instance and global levels
  - Thread-safe implementation using `ConcurrentBag<T>` and defensive copying with `Array.AsReadOnly()`

### Notes
- Default behavior remains strict (throw on errors) for full backward compatibility
- No breaking changes - all existing code continues to work unchanged
- 42 new comprehensive tests added (1037 total tests passing)
- No external dependencies added - uses built-in .NET types
219 changes: 219 additions & 0 deletions README.md
@@ -265,6 +265,225 @@ IMessage message = MessageHelper.NewInstance(Hl7Version.V281);
// An instance of ClearHl7.V281.Message
```

## Fault-Tolerant Parsing with Parser Options

clear-hl7-net provides configurable parser options for handling real-world HL7 messages that may contain unknown segments or malformed data. By default, the parser is strict and throws exceptions on errors, but you can opt in to lenient parsing modes when needed.

### Basic Usage

#### Lenient Parsing with Unknown Segments

```csharp
using ClearHl7;
using ClearHl7.Serialization;
using ClearHl7.V282;

// Configure parser to skip unknown segments
var options = new ParserOptions
{
    UnknownSegmentHandling = UnknownSegmentHandling.Skip,
    CollectWarnings = true
};

string hl7String = "MSH|^~\\&|SendApp||RecvApp||20240101120000|||||2.8.2\rPID|1|123\rZZZ|unknown segment\rPV1|1";

// Parse with lenient options
var message = MessageSerializer.Deserialize<Message>(hl7String, options);

// Check for warnings
if (options.Warnings.Count > 0)
{
    foreach (var warning in options.Warnings)
    {
        Console.WriteLine($"Warning: {warning.Message} (Segment: {warning.SegmentId}, Line: {warning.LineNumber})");
    }
}
// Output: Warning: Unknown segment 'ZZZ' skipped (Segment: ZZZ, Line: 2)
```

#### Parsing with Warning Collection

Use `DeserializeWithWarnings<T>()` to automatically collect warnings:

```csharp
using ClearHl7.Serialization;
using ClearHl7.V282;

var result = MessageSerializer.DeserializeWithWarnings<Message>(hl7String);

if (result.HasWarnings)
{
    Console.WriteLine($"Parsed with {result.Warnings.Count} warnings:");
    foreach (var warning in result.Warnings)
    {
        Console.WriteLine($"  {warning.SegmentId}: {warning.Message}");
    }
}

// Process the successfully parsed message
ProcessMessage(result.Message);
```

### Configuration Options

#### Unknown Segment Handling

Controls how the parser handles segments with unknown segment IDs:

```csharp
var options = new ParserOptions
{
    UnknownSegmentHandling = UnknownSegmentHandling.Throw,            // Default - throw exception (strict)
    // UnknownSegmentHandling = UnknownSegmentHandling.Skip,          // Skip unknown segments
    // UnknownSegmentHandling = UnknownSegmentHandling.CreateGeneric  // Create generic placeholder (future)
};
```

#### Malformed Segment Handling

Controls how the parser handles malformed segments (too short, invalid format):

```csharp
var options = new ParserOptions
{
    MalformedSegmentHandling = MalformedSegmentHandling.Throw,          // Default - throw exception (strict)
    // MalformedSegmentHandling = MalformedSegmentHandling.Skip,        // Skip malformed segments
    // MalformedSegmentHandling = MalformedSegmentHandling.BestEffort   // Parse what's possible
};
```
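
To make the skip-style behavior concrete, here is a minimal, library-independent sketch. It is not the parser's actual implementation: the `IsWellFormed` rule (a three-character segment ID followed by a field separator) and the warning text are assumptions chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical rule for this sketch: a segment needs a 3-character ID,
// optionally followed by a '|' field separator, to count as well formed.
static bool IsWellFormed(string segment) =>
    segment.Length >= 3 && (segment.Length == 3 || segment[3] == '|');

string[] segments = { "MSH|^~\\&|SendApp", "PI", "PV1|1" };
var kept = new List<string>();
var warnings = new List<string>();

for (int i = 0; i < segments.Length; i++)
{
    if (IsWellFormed(segments[i]))
    {
        kept.Add(segments[i]);
    }
    else
    {
        // Skip-style handling: record a warning and continue with the next segment.
        warnings.Add($"Malformed segment at position {i}: '{segments[i]}'");
    }
}

Console.WriteLine(kept.Count);      // 2
Console.WriteLine(warnings.Count);  // 1
```

A strict mode would throw at the point where this sketch records a warning, while a best-effort mode would try to keep whatever fields can still be read from the segment.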

#### Warning Collection

Enable warning collection to review parsing issues:

```csharp
var options = new ParserOptions
{
    CollectWarnings = true // Collect warnings during parsing
};
```

### Global Configuration

Set default parser options globally for your entire application:

```csharp
using ClearHl7.Serialization;
using ClearHl7.V282;

// At application startup
ParserConfiguration.DefaultOptions = new ParserOptions
{
    UnknownSegmentHandling = UnknownSegmentHandling.Skip,
    MalformedSegmentHandling = MalformedSegmentHandling.Skip,
    CollectWarnings = true
};

// All subsequent parsing uses these defaults
var message = MessageSerializer.Deserialize<Message>(hl7String);

// Access global warnings
foreach (var warning in ParserConfiguration.GlobalWarnings)
{
    Console.WriteLine($"Global warning: {warning.Message}");
}

// Clear global warnings
ParserConfiguration.ClearGlobalWarnings();
```

### Event-Based Notification

Subscribe to parsing warnings for real-time logging or monitoring:

```csharp
// Instance-level events
var options = new ParserOptions
{
    UnknownSegmentHandling = UnknownSegmentHandling.Skip
};

options.ParserWarning += (sender, e) =>
{
    // A message template keeps the warning text as a named log property
    _logger.LogWarning("Parser warning: {Message}", e.Warning.Message);
};

// Global events
ParserConfiguration.ParserWarning += (sender, e) =>
{
    _telemetry.TrackEvent("HL7ParserWarning", new Dictionary<string, string>
    {
        { "Type", e.Warning.Type.ToString() },
        { "SegmentId", e.Warning.SegmentId },
        { "Message", e.Warning.Message }
    });
};
```

### Per-Call Override

Override global configuration for specific parsing operations:

```csharp
// Global configuration is lenient
ParserConfiguration.DefaultOptions = new ParserOptions
{
    UnknownSegmentHandling = UnknownSegmentHandling.Skip
};

// But enforce strict parsing for this specific message
var strictOptions = new ParserOptions
{
    UnknownSegmentHandling = UnknownSegmentHandling.Throw
};

try
{
    var message = MessageSerializer.Deserialize<Message>(hl7String, strictOptions);
}
catch (ArgumentException ex)
{
    Console.WriteLine($"Strict parsing failed: {ex.Message}");
}
```

### Warning Details

Each `ParserWarning` provides detailed information about parsing issues:

```csharp
public class ParserWarning
{
    public ParserWarningType Type { get; set; }   // UnknownSegment, MalformedSegment, ParseError
    public string SegmentId { get; set; }         // e.g., "ZZZ", "PID"
    public int LineNumber { get; set; }           // Segment position in message
    public string Message { get; set; }           // Human-readable description
    public string RawSegment { get; set; }        // Original segment string
    public Exception Exception { get; set; }      // Exception if applicable
    public DateTime Timestamp { get; set; }       // When warning occurred
}
```
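
These fields make it straightforward to summarize warnings after a parse, for example by grouping them by type. The sketch below is library-independent: the `(Type, SegmentId)` tuples are hypothetical stand-ins for real `ParserWarning` instances, and the sample values are made up for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for ParserWarning values, reduced to (Type, SegmentId) tuples.
var warnings = new[]
{
    (Type: "UnknownSegment", SegmentId: "ZZZ"),
    (Type: "MalformedSegment", SegmentId: "PID"),
    (Type: "UnknownSegment", SegmentId: "ZQ1"),
};

// Group by warning type and list the affected segment IDs for each group.
var summary = warnings
    .GroupBy(w => w.Type)
    .Select(g => g.Key + ": " + g.Count() + " (" + string.Join(", ", g.Select(w => w.SegmentId)) + ")")
    .ToList();

foreach (var line in summary)
{
    Console.WriteLine(line);
}
// UnknownSegment: 2 (ZZZ, ZQ1)
// MalformedSegment: 1 (PID)
```

With real warnings, the same query could group on the `Type` property of each `ParserWarning` instead of a tuple element.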

### Thread Safety

All parser options components are thread-safe:

- `ParserOptions` uses `ConcurrentBag<T>` for warning collection
- `ParserConfiguration` uses defensive copying and locking for global state
- Warning snapshots are exposed as read-only copies (`IReadOnlyList<T>`), so later warnings do not change a snapshot already taken
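
The pattern in the list above can be sketched with built-in .NET types alone. This is an illustration of the general technique (a concurrent collection plus a defensive, read-only copy), not the library's actual code; the `warnings` store and its string entries are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for the warning store; not the library's actual code.
var warnings = new ConcurrentBag<string>();

// Many threads can add warnings concurrently without explicit locking.
Parallel.For(0, 100, i => warnings.Add($"warning {i}"));

// Defensive copy: ToArray() takes a snapshot, AsReadOnly() prevents modification.
IReadOnlyList<string> snapshot = Array.AsReadOnly(warnings.ToArray());

// Warnings added later do not appear in the earlier snapshot.
warnings.Add("late warning");

Console.WriteLine(snapshot.Count);  // 100
Console.WriteLine(warnings.Count);  // 101
```

Note that `ConcurrentBag<T>` is unordered, so a snapshot taken this way should not be assumed to list warnings in the order they were raised.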

### Backward Compatibility

Parser options are fully backward compatible:

```csharp
// Existing code works unchanged - default is strict parsing
var message = MessageSerializer.Deserialize<Message>(hl7String);

// Opt in to lenient parsing when needed
var options = new ParserOptions { UnknownSegmentHandling = UnknownSegmentHandling.Skip };
var lenientMessage = MessageSerializer.Deserialize<Message>(hl7String, options);
```

## Customizing
### Delimiter Characters
The HL7 specification calls out default delimiters to use for fields (pipe `|`), components (caret `^`), subcomponents (ampersand `&`), escaping (backslash `\`), and repetition (tilde `~`). Most will use these defaults. But if the consumer of your messages supports it, it is possible to define your own delimiters.
26 changes: 26 additions & 0 deletions src/ClearHl7/Serialization/MalformedSegmentHandling.cs
@@ -0,0 +1,26 @@
namespace ClearHl7.Serialization
{
    /// <summary>
    /// Defines how the parser handles malformed segments (e.g., too short, parsing errors).
    /// </summary>
    public enum MalformedSegmentHandling
    {
        /// <summary>
        /// Throw an ArgumentException when a malformed segment is encountered.
        /// This is the default behavior and maintains backward compatibility.
        /// </summary>
        Throw = 0,

        /// <summary>
        /// Skip the malformed segment and continue parsing.
        /// A warning is added to the warnings collection if CollectWarnings is true.
        /// </summary>
        Skip = 1,

        /// <summary>
        /// Attempt to parse what is possible from the malformed segment.
        /// A warning is added for any parsing errors if CollectWarnings is true.
        /// </summary>
        BestEffort = 2
    }
}