[TESTING][SECURITY]: File upload security manual test plan (malicious files, size limits, MIME validation, zip bombs)

# [TESTING][SECURITY]: Resource content security manual test plan (content validation, size limits, MIME format, injection prevention)

> **Note:** This test plan has been rescoped from "file upload security" to "resource content security" because the gateway does not support arbitrary binary file uploads. Resources require UTF-8 text content.

## Goal

Produce a **comprehensive manual test plan** for validating resource content security, including content size limits, UTF-8 validation, MIME type format checking, and injection prevention in resource content.

## Why Now?

Security testing is critical for GA release:

1. **Production Readiness**: Security must be validated before release
2. **Compliance**: Required by security standards and audits
3. **Defense in Depth**: Validates multiple protection layers
4. **Attack Mitigation**: Confirms that common attacks (oversized payloads, invalid encodings, stored XSS) are blocked
5. **User Trust**: Security issues erode confidence

---

## Scope Clarification

### What Exists

| Feature | Status | Location |
|---------|--------|----------|
| Content size limit | ✅ 1MB max | `schemas.py:1637`, `config.py:1874` |
| UTF-8 requirement | ✅ Enforced | `schemas.py:1640-1644` |
| MIME type format | ✅ Regex validation | `validators.py:1126` |
| HTML injection check | ✅ Pattern matching | `schemas.py:1648` |

### What Does NOT Exist

| Feature | Status |
|---------|--------|
| Binary file upload | ❌ Not implemented (UTF-8 required) |
| Zip bomb detection | ❌ Not implemented |
| Archive extraction | ❌ Not implemented |
| Executable detection | ❌ Not implemented |
| Magic byte validation | ❌ Not implemented |

---

## User Stories

### Story 1: Content Size Limits

**As a** security administrator
**I want** content size limits enforced
**So that** storage and memory are protected

#### Acceptance Criteria

```gherkin
Feature: Content size limits
  Scenario: Oversized content rejected
    When content exceeds 1MB (MAX_CONTENT_LENGTH)
    Then the request should be rejected with validation error
    And the error should indicate size limit exceeded
```

### Story 2: UTF-8 Content Validation

**As a** security administrator
**I want** only valid UTF-8 content accepted
**So that** binary data and encoding attacks are prevented

#### Acceptance Criteria

```gherkin
Feature: UTF-8 validation
  Scenario: Non-UTF-8 bytes rejected
    When content contains non-UTF-8 bytes
    Then the request should be rejected
    And the error should indicate UTF-8 requirement

  Scenario: Valid UTF-8 accepted
    When content is valid UTF-8 text
    Then the content should be accepted
```

### Story 3: Injection Prevention

**As a** security administrator
**I want** injection attacks in content prevented
**So that** stored XSS and similar attacks are blocked

#### Acceptance Criteria

```gherkin
Feature: Injection prevention
  Scenario: HTML tags detected in content
    When content contains <script>, <iframe>, or similar tags
    Then the content should be flagged or sanitized
    And potentially dangerous content should not be stored raw
```

---

## Architecture

```
┌─────────────────────────────────────────────────────────────────────┐
│                   Resource Content Validation                        │
│                                                                      │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │ ResourceCreate Schema Validation (schemas.py)               │    │
│  │                                                             │    │
│  │  1. Size Check (MAX_CONTENT_LENGTH = 1MB)                  │    │
│  │     └── if len(v) > MAX_CONTENT_LENGTH: reject            │    │
│  │                                                             │    │
│  │  2. UTF-8 Validation                                       │    │
│  │     └── bytes.decode("utf-8") or reject                   │    │
│  │                                                             │    │
│  │  3. HTML Pattern Detection                                  │    │
│  │     └── DANGEROUS_HTML_PATTERN matching                   │    │
│  │     └── Flags <script>, <iframe>, etc.                    │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                                                                      │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │ MIME Type Validation (validators.py)                        │    │
│  │                                                             │    │
│  │  • Format validation via regex                             │    │
│  │  • Pattern: type/subtype format                            │    │
│  │  • No allowlist enforcement (format only)                  │    │
│  └─────────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────┘
```
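
The validation order in the diagram can be sketched in Python as follows. This is a minimal illustration, not the code in `schemas.py`: the function name `validate_content`, the `DANGEROUS_HTML_SKETCH` regex, and the use of `ValueError` are assumptions. Only `MAX_CONTENT_LENGTH` and the three-step order come from the diagram.

```python
import re

MAX_CONTENT_LENGTH = 1048576  # 1MB default, as listed in this plan

# Hypothetical stand-in for DANGEROUS_HTML_PATTERN; the real pattern may differ.
DANGEROUS_HTML_SKETCH = re.compile(r"<\s*(script|iframe)\b", re.IGNORECASE)


def validate_content(value):
    """Apply the three checks from the diagram, in order (illustrative only)."""
    # 1. Size check: only values strictly longer than the limit are rejected.
    if len(value) > MAX_CONTENT_LENGTH:
        raise ValueError("content exceeds maximum length")

    # 2. UTF-8 validation: bytes must decode cleanly.
    if isinstance(value, bytes):
        try:
            text = value.decode("utf-8")
        except UnicodeDecodeError as exc:
            raise ValueError("content must be valid UTF-8") from exc
    else:
        text = value

    # 3. HTML pattern detection.
    if DANGEROUS_HTML_SKETCH.search(text):
        raise ValueError("content contains potentially dangerous HTML")

    return text
```

Because the size check runs first in this sketch, an oversized payload that also contains a `<script>` tag would be reported as a size error; whether the gateway behaves the same way is worth noting during testing.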

---

## Test Environment Setup

### Using Docker Compose with Makefile (Recommended)

```bash
# Build and start
make docker-prod && make compose-down && make testing-up

# Key settings (config.py defaults):
# MAX_CONTENT_LENGTH = 1048576 (1MB)
# MAX_DESCRIPTION_LENGTH = 8192 (8KB)
# MAX_TEMPLATE_LENGTH = 65536 (64KB)
```

### Create Test Token

```bash
export JWT_SECRET_KEY="${JWT_SECRET_KEY:-my-test-key}"
export TOKEN=$(python -m mcpgateway.utils.create_jwt_token \
  --username tester@example.com --exp 10080 --secret "$JWT_SECRET_KEY")

# Verify
curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8080/health | jq .
```

---

## Test Cases

### TC-CONTENT-001: Content Size Limit Enforced

**Objective:** Verify content exceeding 1MB is rejected

**Steps:**

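```bash
# Step 1: Write a JSON body with content > 1MB to a file.
# A payload this large cannot be passed inline with -d: on Linux a single
# command-line argument is typically capped at 128KB ("Argument list too long").
python -c "import json; print(json.dumps({'uri': 'test://large', 'name': 'Large', 'content': 'A' * 1100000}))" \
  > /tmp/large_content.json

# Step 2: Attempt to create resource with large content
curl -s -w "\nStatus: %{http_code}\n" \
  -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  --data-binary @/tmp/large_content.json \
  http://localhost:8080/resources

# Step 3: Check for size limit error
```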

**Expected Results:**
- Request rejected with 422 Unprocessable Entity
- Error message indicates content exceeds maximum length
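
Where shell argument limits get in the way of large payloads, the same request can be sent from Python using only the standard library. The helper names `build_request`, `send`, and `oversized_body` below are hypothetical; the endpoint, headers, and field names are taken from the curl steps above.

```python
import json
import urllib.error
import urllib.request


def build_request(base_url, token, body):
    """Build a POST /resources request carrying `body` as JSON."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/resources",
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send(request):
    """Return (status_code, response_text) without raising on HTTP error statuses."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", "replace")


def oversized_body(length=1100000):
    """Body for TC-CONTENT-001: content well above the 1MB limit."""
    return {"uri": "test://large", "name": "Large", "content": "A" * length}
```

A typical call would be `send(build_request("http://localhost:8080", os.environ["TOKEN"], oversized_body()))` after `import os`, with the returned status compared against the 422 expected above.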

---

### TC-CONTENT-002: UTF-8 Requirement Enforced

**Objective:** Verify non-UTF-8 content is rejected

**Steps:**

```bash
# Step 1: Create file with invalid UTF-8 bytes
printf '{"uri":"test://binary","name":"Binary","content":"Hello \x80\x81 World"}' > /tmp/bad_utf8.json

# Step 2: Attempt to create resource
curl -s -w "\nStatus: %{http_code}\n" \
  -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  --data-binary @/tmp/bad_utf8.json \
  http://localhost:8080/resources

# Step 3: Verify rejection
```

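**Expected Results:**
- Request rejected with a 4xx status; the rejection may come from JSON body decoding before schema validation runs, so record which layer responded
- Error indicates that the body or content is not valid UTF-8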
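
The bytes `\x80\x81` used in Step 1 are UTF-8 continuation bytes with no preceding lead byte, which is why they cannot decode. A short standard-library check illustrates this; `is_valid_utf8` and `json_parse_fails` are hypothetical helper names, and the second helper only shows how Python's own `json` module treats such a body, not how the gateway does.

```python
import json

# Same body that Step 1 writes with printf.
BAD_BODY = b'{"uri":"test://binary","name":"Binary","content":"Hello \x80\x81 World"}'


def is_valid_utf8(raw):
    """Return True if the byte string decodes as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def json_parse_fails(raw):
    """Return True if json.loads cannot decode the raw body at all."""
    try:
        json.loads(raw)
    except UnicodeDecodeError:
        return True
    return False
```

If the gateway's JSON layer behaves like `json.loads`, the request would fail during body parsing, so the error text may differ from the schema-level UTF-8 message.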

---

### TC-CONTENT-003: MIME Type Format Validation

**Objective:** Verify MIME type format is validated

**Steps:**

```bash
# Step 1: Invalid MIME type format
curl -s -w "\nStatus: %{http_code}\n" \
  -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"uri":"test://mime","name":"Mime Test","content":"test","mimeType":"invalid-mime"}' \
  http://localhost:8080/resources

# Step 2: Valid MIME type
curl -s -w "\nStatus: %{http_code}\n" \
  -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"uri":"test://mime2","name":"Mime Test 2","content":"test","mimeType":"text/plain"}' \
  http://localhost:8080/resources
```

**Expected Results:**
- Invalid format rejected
- Valid format accepted
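
To make "format only" validation concrete, here is a sketch of a `type/subtype` check. The pattern is an assumption modeled on the RFC 6838 restricted-name grammar; it is not copied from `_MIME_TYPE_RE` in `validators.py`, which may accept or reject different inputs (for example, parameters such as `; charset=utf-8`).

```python
import re

# Hypothetical pattern: each name starts with a letter or digit, followed by
# letters, digits, or any of ! # $ & ^ _ . + -
_NAME = r"[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]*"
MIME_SKETCH_RE = re.compile(rf"{_NAME}/{_NAME}")


def looks_like_mime_type(value):
    """Return True if `value` has a type/subtype shape (no allowlist check)."""
    return MIME_SKETCH_RE.fullmatch(value) is not None
```

Under this sketch `text/plain` passes and `invalid-mime` fails because it has no slash. A format-only check also accepts unregistered types such as `foo/bar`, which matches the "no allowlist enforcement" note in the architecture diagram.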

---

### TC-CONTENT-004: HTML Injection Detection

**Objective:** Verify HTML tags in content are detected

**Steps:**

```bash
# Step 1: Content with script tag
curl -s -w "\nStatus: %{http_code}\n" \
  -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"uri":"test://xss","name":"XSS Test","content":"<script>alert(1)</script>"}' \
  http://localhost:8080/resources

# Step 2: Content with iframe tag
curl -s -w "\nStatus: %{http_code}\n" \
  -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"uri":"test://iframe","name":"Iframe Test","content":"<iframe src=\"evil.com\"></iframe>"}' \
  http://localhost:8080/resources
```

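**Expected Results:**
- Dangerous HTML detected by the pattern check
- Content flagged or rejected, depending on configuration; record the status code returned and confirm the raw payload is not stored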
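
Hand-escaping quotes inside JSON inside a shell string is error-prone, so the request bodies can be generated instead. The sketch below builds them with `json.dumps`. The first two probes are the ones used in the steps above; the other two are illustrative additions, and this plan does not state whether the gateway's pattern catches them, so outcomes should be recorded rather than assumed.

```python
import json

# Probe label -> content. "xss" and "iframe" come from the steps above;
# the remaining entries are hypothetical extras.
HTML_PROBES = {
    "xss": "<script>alert(1)</script>",
    "iframe": '<iframe src="evil.com"></iframe>',
    "xss-mixed-case": "<ScRiPt>alert(1)</sCrIpT>",
    "img-onerror": "<img src=x onerror=alert(1)>",
}


def probe_bodies():
    """Return {label: JSON request body} with all quoting handled by json.dumps."""
    return {
        label: json.dumps(
            {"uri": f"test://{label}", "name": f"{label} test", "content": content}
        )
        for label, content in HTML_PROBES.items()
    }
```

Each body can be written to a file and sent with `curl --data-binary @file`, in the same way as the other steps.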

---

### TC-CONTENT-005: Valid Content Accepted

**Objective:** Verify valid text content is accepted

**Steps:**

```bash
# Step 1: Plain text content
curl -s -w "\nStatus: %{http_code}\n" \
  -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"uri":"test://valid","name":"Valid Resource","content":"This is valid UTF-8 text content.","mimeType":"text/plain"}' \
  http://localhost:8080/resources

# Step 2: JSON content
curl -s -w "\nStatus: %{http_code}\n" \
  -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"uri":"test://json","name":"JSON Resource","content":"{\"key\":\"value\"}","mimeType":"application/json"}' \
  http://localhost:8080/resources
```

**Expected Results:**
- Both resources created successfully (200/201)
- Content stored correctly

---

### TC-CONTENT-006: Edge Cases

**Objective:** Test boundary conditions

**Steps:**

```bash
# Step 1: Empty content
curl -s -w "\nStatus: %{http_code}\n" \
  -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"uri":"test://empty","name":"Empty","content":""}' \
  http://localhost:8080/resources

# Step 2: Content exactly at 1MB limit (written to a file; too large for an inline -d argument)
python -c "import json; print(json.dumps({'uri': 'test://exact', 'name': 'Exact', 'content': 'A' * 1048576}))" \
  > /tmp/exact_1mb.json
curl -s -w "\nStatus: %{http_code}\n" \
  -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  --data-binary @/tmp/exact_1mb.json \
  http://localhost:8080/resources

# Step 3: Unicode content
curl -s -w "\nStatus: %{http_code}\n" \
  -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"uri":"test://unicode","name":"Unicode","content":"Hello 世界 🌍 مرحبا"}' \
  http://localhost:8080/resources
```

**Expected Results:**
- Empty content handled per schema rules
- Content at exactly 1MB accepted (the check is `len(v) > MAX_CONTENT_LENGTH`, so a value equal to the limit passes)
- Unicode UTF-8 content accepted
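
Two details behind these edge cases can be checked locally. The sketch below mirrors the strict `>` comparison from the architecture diagram and compares character count with encoded byte count for the Unicode sample. It assumes the validated value is a Python `str`, which this plan does not confirm; if so, `len` counts code points rather than bytes, and content at the character limit could exceed 1MB on the wire.

```python
MAX_CONTENT_LENGTH = 1048576  # 1MB default, as listed in this plan


def exceeds_limit(content):
    """Mirror the diagram's check: only strictly longer content is rejected."""
    return len(content) > MAX_CONTENT_LENGTH


# Unicode sample from Step 3.
SAMPLE = "Hello 世界 🌍 مرحبا"
CHAR_COUNT = len(SAMPLE)                   # code points
BYTE_COUNT = len(SAMPLE.encode("utf-8"))   # encoded size in bytes
```

If the limit turns out to be measured in bytes instead, a useful follow-up test is multibyte content whose character count is under 1048576 but whose encoded size is over it.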

---

## Test Matrix

| Test Case | Size Limit | UTF-8 | MIME Format | HTML Detection |
|-----------|------------|-------|-------------|----------------|
| TC-CONTENT-001 | ✓ | | | |
| TC-CONTENT-002 | | ✓ | | |
| TC-CONTENT-003 | | | ✓ | |
| TC-CONTENT-004 | | | | ✓ |
| TC-CONTENT-005 | | ✓ | ✓ | |
| TC-CONTENT-006 | ✓ | ✓ | | |

---

## Success Criteria

- [ ] All 6 test cases pass
- [ ] Content > 1MB rejected
- [ ] Non-UTF-8 bytes rejected
- [ ] Invalid MIME format rejected
- [ ] Dangerous HTML detected
- [ ] Valid content accepted
- [ ] Edge cases handled correctly

---

## Related Files

### Resource Creation
- `mcpgateway/main.py` - POST `/resources` JSON API (line 3788)
- `mcpgateway/admin.py` - POST `/admin/resources` form handler (line 11046)
- `mcpgateway/schemas.py` - ResourceCreate schema with validation (lines 1580-1651)

### Validation
- `mcpgateway/common/validators.py` - SecurityValidator class
  - MIME validation (line 1126)
  - _MIME_TYPE_RE pattern (line 77)
- `mcpgateway/config.py` - Size limits (line 1874)

### Makefile Targets
- `make docker-prod` - Build production container
- `make compose-down` - Stop stack
- `make testing-up` - Start testing stack

---

## Related Issues

- None identified

---

## Appendix: What This Test Plan Does NOT Cover

The following features do **not exist** in the gateway and are out of scope:

- Binary file upload (gateway requires UTF-8 text)
- Zip bomb detection
- Archive extraction and scanning
- Executable file detection
- Magic byte validation
- File extension checking

If these features are needed, they would require new implementation.




[TESTING][SECURITY]: File upload security manual test plan (malicious files, size limits, MIME validation, zip bombs) #2417
