
Fix AWS infrastructure deployment issues preventing successful stack … #424

Open
ssimmie wants to merge 1 commit into master from fix/aws-infrastructure-deployment

ssimmie (Owner) commented Nov 2, 2025

…creation

Resolves three critical issues that would prevent AWS deployment from working:

  1. Fixed Keyspace dependency race condition where CloudFormation attempted to create Cassandra tables before keyspace existed, causing "Keyspace todos does not exist" errors. Added explicit CloudFormation dependencies using addDependency().

  2. Replaced PRIVATE_ISOLATED subnets with VPC Endpoints to enable AWS service access without NAT Gateway. PRIVATE_ISOLATED subnets have no route to the internet, which prevented ECS tasks from pulling ECR images or connecting to Keyspaces. Added S3 Gateway endpoint (free) and Interface endpoints for ECR, CloudWatch Logs, and Keyspaces (~$29/month vs $32/month for NAT Gateway).

  3. Made ECS service creation optional on first deploy to avoid referencing non-existent Docker images. Infrastructure now deploys in two phases: Phase 1 creates VPC, Keyspaces, and ECR repository; Phase 2 (after image push) creates ECS service. Controlled via CDK context parameter createEcsService (default: false).
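The cost comparison in item 2 can be roughly reproduced with a short calculation. The sketch below is only an estimate under assumptions that the PR does not state: 730 billable hours per month, an hourly rate of $0.01 per interface endpoint per Availability Zone, $0.045 per NAT Gateway hour, a single Availability Zone, and four interface endpoints (ECR API, ECR Docker, CloudWatch Logs, Keyspaces). Data processing charges are ignored, and the class and method names are hypothetical.

```java
/** Rough monthly cost estimate; all rates below are assumptions, not values from the PR. */
final class EndpointCostEstimate {

    // Assumed average number of billable hours in a month.
    static final double HOURS_PER_MONTH = 730.0;

    // Assumed hourly price (USD) of one interface endpoint in one Availability Zone.
    static final double INTERFACE_ENDPOINT_PER_AZ_HOUR = 0.01;

    // Assumed hourly price (USD) of one NAT Gateway.
    static final double NAT_GATEWAY_PER_HOUR = 0.045;

    /** Fixed monthly charge for the given number of interface endpoints across the given AZ count. */
    static double interfaceEndpointsMonthly(int endpoints, int availabilityZones) {
        return endpoints * availabilityZones * INTERFACE_ENDPOINT_PER_AZ_HOUR * HOURS_PER_MONTH;
    }

    /** Fixed monthly charge for the given number of NAT Gateways. */
    static double natGatewayMonthly(int gateways) {
        return gateways * NAT_GATEWAY_PER_HOUR * HOURS_PER_MONTH;
    }
}
```

Under these assumptions, `interfaceEndpointsMonthly(4, 1)` comes to about 29.2 and `natGatewayMonthly(1)` to about 32.85, in line with the figures quoted above. Passing `2` as the Availability Zone count doubles the endpoint figure, so the saving depends on how many zones the endpoints span.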

Updated GitHub Actions workflow to implement two-phase deployment strategy with image build and push between infrastructure phases.
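The two phases are switched by the `createEcsService` context value, which defaults to false. A value passed as `-c createEcsService=true` arrives as a string, so the flag needs to be interpreted defensively. The following is a minimal sketch of one way to do that; `DeployFlags` is a hypothetical helper, and it is an assumption that the stack reads the raw value with something like `tryGetContext("createEcsService")`.

```java
/** Hypothetical helper for interpreting the createEcsService context value. */
final class DeployFlags {

    /**
     * Returns true only when the raw context value reads as "true" (case-insensitive,
     * surrounding whitespace ignored). A missing value (null) or any other value yields
     * false, which matches the documented default of not creating the ECS service.
     */
    static boolean createEcsService(Object rawContextValue) {
        if (rawContextValue == null) {
            return false;
        }
        return Boolean.parseBoolean(rawContextValue.toString().trim());
    }
}
```

With this helper, a first deploy with no context value skips the ECS service, and a second deploy with `createEcsService=true` (after the image has been pushed) creates it. Treating unrecognised values as false keeps a typo in the workflow from creating a service that references a missing image.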

All tests passing: 21 unit tests, 99% JaCoCo coverage, PMD/Checkstyle/SpotBugs/fmt clean.

🤖 Generated with Claude Code


Note

Adds VPC endpoints, enforces Keyspaces table dependencies, makes ECS service creation optional, and updates CI to a two-phase deploy with image build/push.

  • Infrastructure/CDK:
    • ECS (EcsStack):
      • Make service creation optional via -c createEcsService=true; output ServiceArn only when created and always output EcrRepositoryUri/ClusterName.
      • Refactor to store cluster and taskDefinition fields.
    • Keyspaces (KeyspacesStack):
      • Return CfnTable instances and add explicit dependencies on CfnKeyspace to ensure creation order.
    • Networking (NetworkStack):
      • Add VPC endpoints: S3 gateway, ECR API, ECR Docker, CloudWatch Logs, and Keyspaces.
      • Introduce VpcEndpointSecurityGroup with ingress from app SG on ports 443 and 9142.
  • CI/CD (.github/workflows/ci.yml):
    • Split deploy into two phases: deploy infra without ECS service, build/push image to ECR, then deploy ECS service with createEcsService=true.
    • Add steps to build native image, tag/push to ECR, and verify ECS deployment.
  • Tests:
    • Update EcsStackTest and NetworkStackTest for new constructor flag, outputs, and VPC endpoint SG accessors; add cases for service creation on/off.

Written by Cursor Bugbot for commit f27e0c8. This will update automatically on new commits.
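The Keyspaces ordering fix relies on the fact that resources with no declared dependency between them may be created in any order. The sketch below is a toy model of dependency-ordered creation in plain Java, not CloudFormation's scheduler and not the CDK API. The resource names `TodosKeyspace` and `TodosTable` are hypothetical; in the PR the edge is declared by calling `addDependency()` on each table.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of dependency-ordered resource creation (illustrative only). */
final class CreationOrder {

    /**
     * Returns one valid creation order in which every resource appears after all
     * resources it depends on. Keys are resource names; values are the names of the
     * resources each key depends on (an empty set means no dependencies).
     */
    static List<String> resolve(Map<String, Set<String>> dependsOn) {
        // Number of not-yet-created dependencies per resource.
        Map<String, Integer> remaining = new LinkedHashMap<>();
        dependsOn.forEach((name, deps) -> remaining.put(name, deps.size()));

        // Resources with no outstanding dependencies can be created immediately.
        Deque<String> ready = new ArrayDeque<>();
        remaining.forEach((name, count) -> {
            if (count == 0) {
                ready.add(name);
            }
        });

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String created = ready.remove();
            order.add(created);
            // Release every resource that was waiting on the one just created.
            dependsOn.forEach((name, deps) -> {
                if (deps.contains(created) && remaining.merge(name, -1, Integer::sum) == 0) {
                    ready.add(name);
                }
            });
        }

        if (order.size() != dependsOn.size()) {
            throw new IllegalStateException("Dependency cycle or unknown dependency");
        }
        return order;
    }
}
```

Given a map in which `TodosTable` depends on `TodosKeyspace`, `resolve` always places the keyspace first. If the table is given an empty dependency set instead, the model is free to place it before the keyspace, which corresponds to the "Keyspace todos does not exist" failure described in the PR.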


cursor bot left a comment

Bug: Missing Port Mapping Breaks Health Check Startup

The container definition is missing port mappings. The health check attempts to access port 8181 (http://localhost:8181/actuator/health), but no port mapping is configured to expose this port from the container. This will cause the health check to fail and prevent the ECS service from starting successfully. The container needs a port mapping configuration added to ContainerDefinitionOptions, such as .portMappings(List.of(PortMapping.builder().containerPort(8181).protocol(Protocol.TCP).build())).

infrastructure/src/main/java/net/ssimmie/todos/infrastructure/EcsStack.java#L137-L167

// Add container with secure environment variables
taskDefinition.addContainer(
    "TodosContainer",
    software.amazon.awscdk.services.ecs.ContainerDefinitionOptions.builder()
        .image(ContainerImage.fromRegistry(ecrRepository.getRepositoryUri() + ":latest"))
        .environment(
            Map.of(
                "SPRING_PROFILES_ACTIVE",
                "aws",
                "SPRING_DATA_CASSANDRA_KEYSPACE_NAME",
                keyspaceName,
                "SPRING_DATA_CASSANDRA_LOCAL_DATACENTER",
                this.getRegion(),
                "SPRING_DATA_CASSANDRA_CONTACT_POINTS",
                "cassandra." + this.getRegion() + ".amazonaws.com",
                "SPRING_DATA_CASSANDRA_PORT",
                "9142",
                "SPRING_DATA_CASSANDRA_SSL",
                "true"))
        .logging(
            LogDrivers.awsLogs(
                software.amazon.awscdk.services.ecs.AwsLogDriverProps.builder()
                    .logGroup(logGroup)
                    .streamPrefix("todos")
                    .build()))
        .healthCheck(
            software.amazon.awscdk.services.ecs.HealthCheck.builder()
                .command(
                    List.of(
                        "CMD-SHELL", "curl -f http://localhost:8181/actuator/health || exit 1"))
                .timeout(Duration.seconds(5))
