
Commit 7762808

Feature/safo6 nrl 721 seed sandbox data (#1140)
- Deletes all items in a DynamoDB table and reseeds the table with X (default is 2) pointers of each type for X (default is 2) custodians
- The scripts can be executed either locally or via the lambda
- When running locally, the delete_all_table_items.py and seed_sandbox_table.py scripts can be run independently or via the orchestrator script reset_sandbox_table.py
- The lambda (index.py) handles the orchestration of the delete and seed scripts, so the reset_sandbox_table.py script is not required for the lambda
- The lambda is deployed account-wide to dev and/or test (since we wouldn't want to reseed prod tables) and only when a table in the account has been specified for reseeding
- Logs showing the lambda working successfully can be seen [here](https://360016957465-rdeomcad.eu-west-2.console.aws.amazon.com/cloudwatch/home?region=eu-west-2#logsV2:log-groups/log-group/$252Faws$252Flambda$252Fnhsd-nrlf--dev--sandbox-seeder/log-events/2026$252F02$252F17$252F$255B$2524LATEST$255D79f615325d694728adfd7a13783facb7)
- Because it's account-wide, we have needed to make the lambda layers available in the account-wide infrastructure
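To make the seeding volume in the first bullet concrete, here is a minimal sketch of how the item count could scale. It assumes the seeder creates `pointers_per_type` pointers for every combination of custodian and pointer type; the function name `plan_seed_items` and the type and custodian values are hypothetical and do not come from the commit.

```python
from itertools import product


def plan_seed_items(pointer_types, custodians, pointers_per_type=2):
    """Return one (custodian, pointer_type, index) tuple per pointer to create.

    Assumption: every custodian receives `pointers_per_type` pointers of
    every pointer type. The real seed script may distribute them differently.
    """
    return [
        (custodian, pointer_type, index)
        for custodian, pointer_type in product(custodians, pointer_types)
        for index in range(pointers_per_type)
    ]


# Hypothetical values, for illustration only.
plan = plan_seed_items(
    pointer_types=["type-a", "type-b", "type-c"],
    custodians=["custodian-1", "custodian-2"],
)
print(len(plan))  # 3 types x 2 custodians x 2 pointers each = 12
```

Under that assumption the table size after a reset grows linearly with each of the three inputs, which is why a small default keeps the sandbox tables cheap to clear and reseed.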
1 parent df578fd commit 7762808

34 files changed, +2508 -439 lines changed

.github/workflows/deploy-account-wide-infra.yml

Lines changed: 25 additions & 0 deletions
@@ -51,13 +51,27 @@ jobs:
           echo "${HOME}/.asdf/bin" >> $GITHUB_PATH
           poetry install --no-root
 
+      - name: Build Lambda Layers
+        run: |
+          make build-layers
+          make build-dependency-layer
+
+      - name: Build Seed Sandbox Lambda
+        run: make build-seed-sandbox-lambda
+
       - name: Configure Management Credentials
         uses: aws-actions/configure-aws-credentials@7474bc4690e29a8392af63c5b98e7449536d5c3a #v4.3.1
         with:
           aws-region: eu-west-2
           role-to-assume: ${{ secrets.MGMT_ROLE_ARN }}
           role-session-name: github-actions-ci-${{ inputs.environment }}-${{ github.run_id }}
 
+      - name: Add S3 Permissions to Lambda Layer
+        env:
+          ACCOUNT_NAME: ${{ vars.ACCOUNT_NAME }}
+        run: |
+          make get-s3-perms ENV=${ACCOUNT_NAME}
+
       - name: Retrieve Server Certificates
         env:
           ACCOUNT_NAME: ${{ vars.ACCOUNT_NAME }}
@@ -92,6 +106,11 @@ jobs:
           aws s3 cp terraform/account-wide-infrastructure/$ACCOUNT_NAME/tfplan.txt s3://nhsd-nrlf--mgmt--ci-data/acc-$ACCOUNT_NAME/${{ github.run_id }}/tfplan.txt
           aws s3 cp terraform/account-wide-infrastructure/modules/glue/files/src.zip s3://nhsd-nrlf--mgmt--ci-data/acc-$ACCOUNT_NAME/${{ github.run_id }}/glue-src.zip
 
+          aws s3 cp dist/nrlf.zip s3://nhsd-nrlf--mgmt--ci-data/acc-$ACCOUNT_NAME/${{ github.run_id }}/nrlf.zip
+          aws s3 cp dist/dependency_layer.zip s3://nhsd-nrlf--mgmt--ci-data/acc-$ACCOUNT_NAME/${{ github.run_id }}/dependency_layer.zip
+          aws s3 cp dist/nrlf_permissions.zip s3://nhsd-nrlf--mgmt--ci-data/acc-$ACCOUNT_NAME/${{ github.run_id }}/nrlf_permissions.zip
+          aws s3 cp dist/seed_sandbox.zip s3://nhsd-nrlf--mgmt--ci-data/acc-$ACCOUNT_NAME/${{ github.run_id }}/seed_sandbox.zip
+
   terraform-apply:
     name: Terraform Apply - ${{ inputs.environment }}
     needs: [terraform-plan]
@@ -126,6 +145,12 @@ jobs:
           mkdir -p terraform/account-wide-infrastructure/modules/glue/files
           aws s3 cp s3://nhsd-nrlf--mgmt--ci-data/acc-$ACCOUNT_NAME/${{ github.run_id }}/glue-src.zip terraform/account-wide-infrastructure/modules/glue/files/src.zip
 
+          mkdir -p dist
+          aws s3 cp s3://nhsd-nrlf--mgmt--ci-data/acc-$ACCOUNT_NAME/${{ github.run_id }}/nrlf.zip dist/nrlf.zip
+          aws s3 cp s3://nhsd-nrlf--mgmt--ci-data/acc-$ACCOUNT_NAME/${{ github.run_id }}/dependency_layer.zip dist/dependency_layer.zip
+          aws s3 cp s3://nhsd-nrlf--mgmt--ci-data/acc-$ACCOUNT_NAME/${{ github.run_id }}/nrlf_permissions.zip dist/nrlf_permissions.zip
+          aws s3 cp s3://nhsd-nrlf--mgmt--ci-data/acc-$ACCOUNT_NAME/${{ github.run_id }}/seed_sandbox.zip dist/seed_sandbox.zip
+
       - name: Retrieve Server Certificates
         env:
           ACCOUNT_NAME: ${{ vars.ACCOUNT_NAME }}
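The workflow change hands the four build artifacts from the plan job to the apply job through S3, keyed by account name and run id. The sketch below only illustrates that key layout; the bucket name and artifact names are taken from the diff, while the helper names `artifact_uri` and `transfer_pairs` are hypothetical and not part of the repository.

```python
BUCKET = "nhsd-nrlf--mgmt--ci-data"
ARTIFACTS = [
    "nrlf.zip",
    "dependency_layer.zip",
    "nrlf_permissions.zip",
    "seed_sandbox.zip",
]


def artifact_uri(account_name, run_id, artifact):
    """Build the S3 URI shared by the plan (upload) and apply (download) jobs."""
    return f"s3://{BUCKET}/acc-{account_name}/{run_id}/{artifact}"


def transfer_pairs(account_name, run_id, direction):
    """Return (source, destination) pairs for `aws s3 cp`.

    direction is "upload" for the plan job or "download" for the apply job.
    """
    pairs = []
    for artifact in ARTIFACTS:
        local = f"dist/{artifact}"
        remote = artifact_uri(account_name, run_id, artifact)
        pairs.append((local, remote) if direction == "upload" else (remote, local))
    return pairs


for source, destination in transfer_pairs("dev", "123", "upload"):
    print(f"aws s3 cp {source} {destination}")
```

Because both jobs derive the key from the same run id, the apply job picks up exactly the archives that the plan job built.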

Makefile

Lines changed: 5 additions & 1 deletion
@@ -58,7 +58,11 @@ check-deploy: ## check the deploy environment is setup correctly
 check-deploy-warn:
 	@SHOULD_WARN_ONLY=true ./scripts/check-deploy-environment.sh
 
-build: check-warn build-api-packages build-layers build-dependency-layer ## Build the project
+build: check-warn build-api-packages build-layers build-dependency-layer build-seed-sandbox-lambda ## Build the project
+
+build-seed-sandbox-lambda:
+	@echo "Building seed_sandbox Lambda"
+	@cd lambdas/seed_sandbox && make build
 
 build-dependency-layer:
 	@echo "Building Lambda dependency layer"

README.md

Lines changed: 1 addition & 2 deletions
@@ -375,8 +375,7 @@ In order to deploy to a sandbox environment (`dev-sandbox`, `qa-sandbox`, `int-s
 
 ### Sandbox database clear and reseed
 
-Any workspace suffixed with `-sandbox` has a small amount of additional infrastructure deployed to clear and reseed the DynamoDB tables (auth and document pointers) using a Lambda running
-on a cron schedule that can be found in the `cron/seed_sandbox` directory in the root of this project. The data used to seed the DynamoDB tables can found in the `cron/seed_sandbox/data` directory.
+The dev and test environments have a small amount of additional infrastructure deployed to clear and reseed specified DynamoDB sandbox tables with realistic data using a Lambda running on an EventBridge schedule. You can specify the tables to be reseeded in `terraform/account-wide-infrastructure/{env}/lambda__seed-sandbox.tf`. If you want to perform this manually on an ad hoc basis, you can use `./scripts/reset_sandbox_table.py`.
 
 ### Sandbox authorisation

lambdas/seed_sandbox/Makefile

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+.PHONY: build clean
+
+build: clean
+	@echo "Building Lambda deployment package..."
+	mkdir -p build
+
+	# Copy the handler
+	cp index.py build/
+
+	# Copy the required scripts
+	mkdir -p build/scripts
+	cp ../../scripts/delete_all_table_items.py build/scripts/
+	cp ../../scripts/seed_sandbox_table.py build/scripts/
+	cp ../../scripts/seed_utils.py build/scripts/
+
+	# Copy the pointer template data
+	mkdir -p build/tests/data/samples
+	cp -r ../../tests/data/samples/*.json build/tests/data/samples/
+
+	# Create the zip file in root dist
+	mkdir -p ../../dist
+	cd build && zip -r ../../../dist/seed_sandbox.zip . -x "*.pyc" -x "__pycache__/*" -x ".DS_Store"
+
+	@echo "✓ Lambda package created: ../../dist/seed_sandbox.zip"
+
+clean:
+	@echo "Cleaning build artifacts..."
+	rm -rf build
+	@echo "✓ Clean complete"
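The packaging step in this Makefile zips the staged `build` directory so that archive paths are relative to it, which is what lets the handler import `scripts.*` at the archive root. As a rough illustration of that layout, here is a Python sketch using the standard `zipfile` module. The function name `package_lambda` is hypothetical, and the sketch skips `__pycache__` directories at any depth, which is a simplification rather than a claim about how the `zip -x` patterns behave.

```python
import tempfile
import zipfile
from pathlib import Path

EXCLUDED_SUFFIXES = (".pyc",)
EXCLUDED_NAMES = {".DS_Store"}


def package_lambda(build_dir: Path, zip_path: Path) -> list[str]:
    """Zip build_dir so that archive paths are relative to build_dir."""
    zip_path.parent.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(build_dir.rglob("*")):
            relative = path.relative_to(build_dir)
            if not path.is_file():
                continue
            if path.suffix in EXCLUDED_SUFFIXES or path.name in EXCLUDED_NAMES:
                continue
            if "__pycache__" in relative.parts:
                continue
            archive.write(path, relative.as_posix())
            written.append(relative.as_posix())
    return written


# Stage a tiny stand-in for the build directory and package it.
with tempfile.TemporaryDirectory() as tmp:
    build = Path(tmp) / "build"
    (build / "scripts" / "__pycache__").mkdir(parents=True)
    (build / "index.py").write_text("# handler\n")
    (build / "scripts" / "seed_utils.py").write_text("# helpers\n")
    (build / "scripts" / "__pycache__" / "seed_utils.cpython-312.pyc").write_bytes(b"")
    (build / ".DS_Store").write_bytes(b"")
    names = package_lambda(build, Path(tmp) / "dist" / "seed_sandbox.zip")

print(names)
```

Keeping `index.py` at the archive root matters because the Lambda handler setting is resolved relative to the top of the deployment package.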

lambdas/seed_sandbox/index.py

Lines changed: 108 additions & 0 deletions
@@ -0,0 +1,108 @@
+# flake8: noqa: T201
+
+import json
+import os
+
+from scripts.delete_all_table_items import delete_all_table_items
+from scripts.seed_sandbox_table import seed_sandbox_table
+
+
+def handler(event, context):
+    """
+    Lambda handler that orchestrates the reset of specified pointer tables in the dev & test accounts, deleting all items and reseeding with fresh data.
+
+    The tables to be reset and number of pointers per type can be specified in `terraform/account-wide-infrastructure/{env}/lambda__seed-sandbox.tf`
+
+    """
+    table_names_str = os.environ.get("TABLE_NAMES", "")
+    pointers_per_type = int(os.environ.get("POINTERS_PER_TYPE", "2"))
+
+    if not table_names_str:
+        error_msg = "TABLE_NAMES environment variable is required"
+        print(f"ERROR: {error_msg}")
+        return {"statusCode": 500, "body": json.dumps({"error": error_msg})}
+
+    table_names = [name.strip() for name in table_names_str.split(",") if name.strip()]
+
+    if not table_names:
+        error_msg = "No valid table names provided in TABLE_NAMES"
+        print(f"ERROR: {error_msg}")
+        return {"statusCode": 500, "body": json.dumps({"error": error_msg})}
+
+    print(
+        f"Starting table reset for {len(table_names)} table(s): {', '.join(table_names)}"
+    )
+    print(f"Pointers per type: {pointers_per_type}")
+
+    results = []
+    failed_tables = []
+
+    for table_name in table_names:
+        print(f"\n{'='*60}")
+        print(f"Processing table: {table_name}")
+        print(f"{'='*60}")
+
+        try:
+            print("Step 1: Deleting all items from table...")
+            pointers_deleted_count = delete_all_table_items(table_name=table_name)
+            print(f"✓ Deleted {pointers_deleted_count} items")
+
+            print("Step 2: Seeding table with fresh data...")
+            seed_result = seed_sandbox_table(
+                table_name=table_name,
+                pointers_per_type=pointers_per_type,
+                force=True,
+                write_csv=False,
+            )
+            print(f"✓ Created {seed_result['successful']} pointers")
+
+            results.append(
+                {
+                    "table_name": table_name,
+                    "status": "success",
+                    "pointers_deleted": pointers_deleted_count,
+                    "pointers_created": seed_result["successful"],
+                    "pointers_attempted": seed_result["attempted"],
+                    "pointers_failed": seed_result["failed"],
+                }
+            )
+
+        except Exception as e:
+            error_msg = f"Failed to reset table {table_name}: {str(e)}"
+            print(f"ERROR: {error_msg}")
+            failed_tables.append(table_name)
+            results.append(
+                {
+                    "table_name": table_name,
+                    "status": "failed",
+                    "error": str(e),
+                }
+            )
+
+    if failed_tables:
+        status_code = 500 if len(failed_tables) == len(table_names) else 207
+        message = (
+            f"Failed to reset {len(failed_tables)} table(s): {', '.join(failed_tables)}"
+        )
+    else:
+        status_code = 200
+        message = f"Successfully reset {len(table_names)} table(s)"
+
+    result = {
+        "statusCode": status_code,
+        "body": json.dumps(
+            {
+                "message": message,
+                "tables_processed": len(table_names),
+                "tables_succeeded": len(table_names) - len(failed_tables),
+                "tables_failed": len(failed_tables),
+                "results": results,
+                "pointers_per_type": pointers_per_type,
+            }
+        ),
+    }
+
+    print(f"\n{'='*60}")
+    print(f"RESULT: {message}")
+    print(f"{'='*60}")
+    return result
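The handler reports its outcome through the returned payload: 200 when every table is reset, 207 when some fail, and 500 when all fail or the configuration is invalid. Because it returns a status code instead of raising, a caller has to read the payload to notice a failed reset. The sketch below shows one way a caller might do that; `summarise_reset_response` and the sample payload are hypothetical, with field names copied from the handler above.

```python
import json


def summarise_reset_response(response: dict) -> str:
    """Turn the handler's return value into a one-line summary."""
    body = json.loads(response["body"])
    if "error" in body:
        # Early-exit path: TABLE_NAMES missing or empty.
        return f"configuration error: {body['error']}"

    failed = [r["table_name"] for r in body["results"] if r["status"] == "failed"]
    status = response["statusCode"]
    if status == 200:
        return f"all {body['tables_processed']} table(s) reset"
    if status == 207:
        return f"partial failure, retry: {', '.join(failed)}"
    return f"all tables failed: {', '.join(failed)}"


# Hand-built payload in the same shape as the handler's result.
sample = {
    "statusCode": 207,
    "body": json.dumps(
        {
            "message": "Failed to reset 1 table(s): table2",
            "tables_processed": 2,
            "tables_succeeded": 1,
            "tables_failed": 1,
            "results": [
                {"table_name": "table1", "status": "success"},
                {"table_name": "table2", "status": "failed", "error": "Access denied"},
            ],
            "pointers_per_type": 2,
        }
    ),
}
print(summarise_reset_response(sample))
```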
Lines changed: 178 additions & 0 deletions
@@ -0,0 +1,178 @@
+import json
+from unittest.mock import MagicMock, patch
+
+import pytest
+
+from lambdas.seed_sandbox.index import handler
+
+
+@pytest.fixture
+def mock_lambda_context():
+    mock_context = MagicMock()
+    mock_context.function_name = "test-function"
+    mock_context.invoked_function_arn = (
+        "arn:aws:lambda:eu-west-2:123456789012:function:test-function"
+    )
+    return mock_context
+
+
+@pytest.fixture
+def mock_env_vars():
+    with patch.dict(
+        "os.environ", {"TABLE_NAMES": "test-table", "POINTERS_PER_TYPE": "2"}
+    ):
+        yield
+
+
+@patch("lambdas.seed_sandbox.index.seed_sandbox_table")
+@patch("lambdas.seed_sandbox.index.delete_all_table_items")
+def test_single_table_reset_success(
+    mock_delete, mock_seed, mock_lambda_context, mock_env_vars
+):
+    mock_delete.return_value = 10
+    mock_seed.return_value = {"successful": 8, "attempted": 8, "failed": 0}
+
+    result = handler({}, mock_lambda_context)
+
+    assert result["statusCode"] == 200
+    body = json.loads(result["body"])
+    assert body["message"] == "Successfully reset 1 table(s)"
+    assert body["tables_processed"] == 1
+    assert body["tables_succeeded"] == 1
+    assert body["tables_failed"] == 0
+    assert len(body["results"]) == 1
+    assert body["results"][0]["table_name"] == "test-table"
+    assert body["results"][0]["status"] == "success"
+    assert body["results"][0]["pointers_deleted"] == 10
+    assert body["results"][0]["pointers_created"] == 8
+
+    mock_delete.assert_called_once_with(table_name="test-table")
+    mock_seed.assert_called_once_with(
+        table_name="test-table", pointers_per_type=2, force=True, write_csv=False
+    )
+
+
+@patch("lambdas.seed_sandbox.index.seed_sandbox_table")
+@patch("lambdas.seed_sandbox.index.delete_all_table_items")
+def test_multiple_table_reset_success(
+    mock_delete, mock_seed, mock_lambda_context, mock_env_vars
+):
+    with patch.dict(
+        "os.environ",
+        {"TABLE_NAMES": "table1,table2,table3", "POINTERS_PER_TYPE": "5"},
+    ):
+        mock_delete.return_value = 15
+        mock_seed.return_value = {"successful": 20, "attempted": 20, "failed": 0}
+
+        result = handler({}, mock_lambda_context)
+
+        assert result["statusCode"] == 200
+        body = json.loads(result["body"])
+        assert body["message"] == "Successfully reset 3 table(s)"
+        assert body["tables_processed"] == 3
+        assert body["tables_succeeded"] == 3
+        assert body["tables_failed"] == 0
+        assert len(body["results"]) == 3
+        assert all(r["status"] == "success" for r in body["results"])
+
+        assert mock_delete.call_count == 3
+        assert mock_seed.call_count == 3
+
+
+@patch("lambdas.seed_sandbox.index.seed_sandbox_table")
+@patch("lambdas.seed_sandbox.index.delete_all_table_items")
+def test_partial_failure(mock_delete, mock_seed, mock_lambda_context):
+    with patch.dict(
+        "os.environ",
+        {"TABLE_NAMES": "table1,table2,table3", "POINTERS_PER_TYPE": "2"},
+    ):
+        # First and third tables succeed, second fails during delete
+        mock_delete.side_effect = [10, Exception("Access denied"), 5]
+        mock_seed.side_effect = [
+            {"successful": 8, "attempted": 8, "failed": 0},
+            {"successful": 8, "attempted": 8, "failed": 0},
+        ]
+
+        result = handler({}, mock_lambda_context)
+
+        assert result["statusCode"] == 207
+        body = json.loads(result["body"])
+        assert "Failed to reset 1 table(s): table2" in body["message"]
+        assert body["tables_processed"] == 3
+        assert body["tables_succeeded"] == 2
+        assert body["tables_failed"] == 1
+        assert len(body["results"]) == 3
+        assert body["results"][0]["status"] == "success"
+        assert body["results"][1]["status"] == "failed"
+        assert body["results"][1]["error"] == "Access denied"
+        assert body["results"][2]["status"] == "success"
+
+
+@patch("lambdas.seed_sandbox.index.seed_sandbox_table")
+@patch("lambdas.seed_sandbox.index.delete_all_table_items")
+def test_complete_failure(mock_delete, mock_seed, mock_lambda_context):
+    with patch.dict(
+        "os.environ", {"TABLE_NAMES": "table1,table2", "POINTERS_PER_TYPE": "2"}
+    ):
+        mock_delete.side_effect = Exception("Database error")
+
+        result = handler({}, mock_lambda_context)
+
+        assert result["statusCode"] == 500
+        body = json.loads(result["body"])
+        assert "Failed to reset 2 table(s)" in body["message"]
+        assert body["tables_processed"] == 2
+        assert body["tables_succeeded"] == 0
+        assert body["tables_failed"] == 2
+
+
+def test_missing_table_names_env_var(mock_lambda_context):
+    with patch.dict("os.environ", {}, clear=True):
+        result = handler({}, mock_lambda_context)
+
+        assert result["statusCode"] == 500
+        body = json.loads(result["body"])
+        assert body["error"] == "TABLE_NAMES environment variable is required"
+
+
+def test_empty_table_names(mock_lambda_context):
+    with patch.dict("os.environ", {"TABLE_NAMES": " ", "POINTERS_PER_TYPE": "2"}):
+        result = handler({}, mock_lambda_context)
+
+        assert result["statusCode"] == 500
+        body = json.loads(result["body"])
+        assert body["error"] == "No valid table names provided in TABLE_NAMES"
+
+
+@patch("lambdas.seed_sandbox.index.seed_sandbox_table")
+@patch("lambdas.seed_sandbox.index.delete_all_table_items")
+def test_table_names_with_whitespace(mock_delete, mock_seed, mock_lambda_context):
+    with patch.dict(
+        "os.environ",
+        {"TABLE_NAMES": " table1 , table2 , ", "POINTERS_PER_TYPE": "2"},
+    ):
+        mock_delete.return_value = 5
+        mock_seed.return_value = {"successful": 4, "attempted": 4, "failed": 0}
+
+        result = handler({}, mock_lambda_context)
+
+        assert result["statusCode"] == 200
+        body = json.loads(result["body"])
+        assert body["tables_processed"] == 2
+        assert body["results"][0]["table_name"] == "table1"
+        assert body["results"][1]["table_name"] == "table2"
+
+
+@patch("lambdas.seed_sandbox.index.seed_sandbox_table")
+@patch("lambdas.seed_sandbox.index.delete_all_table_items")
+def test_seed_with_failures(mock_delete, mock_seed, mock_lambda_context, mock_env_vars):
+    mock_delete.return_value = 5
+    mock_seed.return_value = {"successful": 6, "attempted": 8, "failed": 2}
+
+    result = handler({}, mock_lambda_context)
+
+    assert result["statusCode"] == 200
+    body = json.loads(result["body"])
+    assert body["results"][0]["pointers_created"] == 6
+    assert body["results"][0]["pointers_attempted"] == 8
+    assert body["results"][0]["pointers_failed"] == 2
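One path these tests do not exercise is a malformed `POINTERS_PER_TYPE`. The handler converts it with a bare `int(...)` before any error handling, so a non-numeric value would raise `ValueError` rather than produce the structured 500 response. A minimal sketch of a stricter parser follows; `parse_pointers_per_type` is a hypothetical helper, not part of this commit, and the lower bound of 1 is an assumption.

```python
def parse_pointers_per_type(raw, default=2):
    """Parse the POINTERS_PER_TYPE setting, falling back to a default.

    Raises ValueError with a descriptive message for non-numeric or
    non-positive values so the caller can report a clean error.
    """
    if raw is None or not raw.strip():
        return default
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(
            f"POINTERS_PER_TYPE must be an integer, got {raw!r}"
        ) from None
    if value < 1:
        # Assumed lower bound: seeding zero or negative pointers is treated as a mistake.
        raise ValueError(f"POINTERS_PER_TYPE must be at least 1, got {value}")
    return value


print(parse_pointers_per_type("5"))
print(parse_pointers_per_type(None))
```

A handler could wrap this call in the same early-return pattern it already uses for `TABLE_NAMES`, so that configuration mistakes surface through the payload like the other validation errors.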
