Commit 0d6c6da

antonis and claude committed
fix(ci): Fix E2E test flakiness with stable checks instead of retries
Replace retry-based approach (PR #5830) with deterministic fixes:

### Simulator stability (Cirrus Labs Tart VMs)
- `wait_for_boot: true` / `erase_before_boot: false` on simulator-action
- `xcrun simctl bootstatus booted -b` to block until boot completes
- Settings.app warm-up for SpringBoard/system service initialization
- `MAESTRO_DRIVER_STARTUP_TIMEOUT` bumped to 180s

### e2e-v2 test runner (cli.mjs)
- Per-flow process isolation via individual `maestro test` calls
- Maestro driver warm-up flow before real tests (non-fatal)
- crash.yml runs first so the next flow verifies post-crash recovery
- `execSync` → `execFileSync` to avoid shell interpolation
- SENTRY_AUTH_TOKEN redaction in debug logs

### Sample application test fixes
- Search all envelopes for app start transaction (slow VM delivery)
- Sort envelopes by timestamp for deterministic ordering
- Allow-list for TTID/TTFD ops (`navigation`, `ui.load`)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
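The `execSync` → `execFileSync` item in the commit message can be illustrated with a small sketch. It assumes Node.js and uses the running Node binary (`process.execPath`) as a stand-in for `maestro`; `trickyValue` and `echoLastArg` are names made up for this demonstration and do not appear in the commit.

```javascript
const { execFileSync } = require('node:child_process');

// Hypothetical value containing characters a shell would interpret.
const trickyValue = 'abc"; echo injected; $(whoami)';

// execFileSync hands each array element to the child as one argv entry.
// No shell is spawned by default, so quotes, semicolons and $() arrive
// in the child process as literal text.
function echoLastArg(value) {
  return execFileSync(
    process.execPath, // stand-in for the `maestro` binary
    ['-e', 'process.stdout.write(process.argv.at(-1))', `TOKEN=${value}`],
    { encoding: 'utf8' },
  );
}

const received = echoLastArg(trickyValue);
console.log(received === `TOKEN=${trickyValue}`);
```

With `execSync` and a template string, the same value would be parsed by the shell before the child process ever saw it, which is the interpolation the commit message refers to.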
1 parent c0cfa19 commit 0d6c6da

8 files changed: +170 -45 lines changed

.github/workflows/e2e-v2.yml

Lines changed: 15 additions & 1 deletion
@@ -508,12 +508,26 @@ jobs:
         with:
           model: ${{ env.IOS_DEVICE }}
           os_version: ${{ env.IOS_VERSION }}
+          wait_for_boot: true
+          erase_before_boot: false
+
+      - name: Wait for iOS simulator to be fully ready
+        if: ${{ steps.platform-check.outputs.skip != 'true' && matrix.platform == 'ios' }}
+        run: |
+          # Wait for boot to complete at the system level
+          xcrun simctl bootstatus booted -b
+          # Launch and dismiss Settings.app to ensure SpringBoard and system services
+          # are fully initialized — this avoids Maestro connecting to a half-booted
+          # simulator on Cirrus Labs Tart VMs.
+          xcrun simctl launch booted com.apple.Preferences
+          sleep 5
+          xcrun simctl terminate booted com.apple.Preferences

       - name: Run tests on iOS
         if: ${{ steps.platform-check.outputs.skip != 'true' && matrix.platform == 'ios' }}
         env:
           # Increase timeout for Maestro iOS driver startup (default is 60s, some CI runners need more time)
-          MAESTRO_DRIVER_STARTUP_TIMEOUT: 120000
+          MAESTRO_DRIVER_STARTUP_TIMEOUT: 180000
         run: ./dev-packages/e2e-tests/cli.mjs ${{ matrix.platform }} --test

       - name: Upload logs

.github/workflows/sample-application.yml

Lines changed: 11 additions & 1 deletion
@@ -14,7 +14,7 @@ concurrency:
 env:
   SENTRY_AUTH_TOKEN: ${{ secrets.SENTRY_AUTH_TOKEN }}
   MAESTRO_VERSION: '2.3.0'
-  MAESTRO_DRIVER_STARTUP_TIMEOUT: 90000 # Increase timeout from default 30s to 90s for CI stability
+  MAESTRO_DRIVER_STARTUP_TIMEOUT: 180000 # Increase timeout from default 30s to 180s for CI stability
   RN_SENTRY_POD_NAME: RNSentry
   IOS_APP_ARCHIVE_PATH: sentry-react-native-sample.app.zip
   ANDROID_APP_ARCHIVE_PATH: sentry-react-native-sample.apk.zip
@@ -332,6 +332,16 @@ jobs:
         with:
           model: ${{ env.IOS_DEVICE }}
           os_version: ${{ env.IOS_VERSION }}
+          wait_for_boot: true
+          erase_before_boot: false
+
+      - name: Wait for iOS simulator to be fully ready
+        if: ${{ steps.platform-check.outputs.skip != 'true' && matrix.platform == 'ios' }}
+        run: |
+          xcrun simctl bootstatus booted -b
+          xcrun simctl launch booted com.apple.Preferences
+          sleep 5
+          xcrun simctl terminate booted com.apple.Preferences

       - name: Run iOS Tests
         if: ${{ steps.platform-check.outputs.skip != 'true' && matrix.platform == 'ios' }}

dev-packages/e2e-tests/cli.mjs

Lines changed: 85 additions & 24 deletions
@@ -290,35 +290,96 @@ if (actions.includes('test')) {
   if (!sentryAuthToken) {
     console.log('Skipping maestro test due to unavailable or empty SENTRY_AUTH_TOKEN');
   } else {
+    const maestroDir = path.join(e2eDir, 'maestro');
+    const flowFiles = fs.readdirSync(maestroDir)
+      .filter(f => f.endsWith('.yml') && !f.startsWith('utils'))
+      .sort((a, b) => {
+        // Run crash.yml first — it kills the app via nativeCrash(), and the
+        // NEXT flow's launchTestAppClear.yml verifies post-crash recovery
+        // (killApp + launchApp + assertTestReady). Per-flow process isolation
+        // ensures the crashed Maestro session doesn't affect other flows.
+        if (a === 'crash.yml') return -1;
+        if (b === 'crash.yml') return 1;
+        return a.localeCompare(b);
+      });
+
+    console.log(`Discovered ${flowFiles.length} Maestro flows: ${flowFiles.join(', ')}`);
+
+    // Warm up Maestro's driver connection before running test flows.
+    // The first Maestro launchApp after simulator boot can fail on Cirrus
+    // Labs Tart VMs because the IDB/XCUITest driver isn't fully connected.
+    // Running a lightweight warmup flow ensures the driver is ready.
+    const warmupFlow = path.join('maestro', 'utils', 'warmup.yml');
+    console.log('\n--- Warming up Maestro driver ---');
     try {
-      execSync(
-        `maestro test maestro \
-          --env=APP_ID="${appId}" \
-          --env=SENTRY_AUTH_TOKEN="${sentryAuthToken}" \
-          --debug-output maestro-logs \
-          --flatten-debug-output`,
-        {
-          stdio: 'inherit',
-          cwd: e2eDir,
-        },
-      );
-    } finally {
-      // Always redact sensitive data, even if the test fails
-      const redactScript = `
-        if [[ "$(uname)" == "Darwin" ]]; then
-          find ./maestro-logs -type f -exec sed -i '' "s/${sentryAuthToken}/[REDACTED]/g" {} +
-          echo 'Redacted sensitive data from logs on MacOS'
-        else
-          find ./maestro-logs -type f -exec sed -i "s/${sentryAuthToken}/[REDACTED]/g" {} +
-          echo 'Redacted sensitive data from logs on Ubuntu'
-        fi
-      `;
+      execFileSync('maestro', [
+        'test',
+        warmupFlow,
+        '--env', `APP_ID=${appId}`,
+        '--env', `SENTRY_AUTH_TOKEN=${sentryAuthToken}`,
+      ], {
+        stdio: 'inherit',
+        cwd: e2eDir,
+      });
+      console.log('--- Maestro driver warm-up: OK ---');
+    } catch (error) {
+      console.warn('--- Maestro driver warm-up failed (non-fatal, continuing with tests) ---');
+    }
+
+    const failedFlows = [];

+    // Run each flow in its own process to prevent crash cascade —
+    // when crash.yml kills the app, a shared Maestro session would fail
+    // all subsequent flows.
+    for (const flow of flowFiles) {
+      const flowPath = path.join('maestro', flow);
+      console.log(`\n--- Running flow: ${flow} ---`);
       try {
-        execSync(redactScript, { stdio: 'inherit', cwd: e2eDir, shell: '/bin/bash' });
+        execFileSync('maestro', [
+          'test',
+          flowPath,
+          '--env', `APP_ID=${appId}`,
+          '--env', `SENTRY_AUTH_TOKEN=${sentryAuthToken}`,
+          '--debug-output', 'maestro-logs',
+          '--flatten-debug-output',
+        ], {
+          stdio: 'inherit',
+          cwd: e2eDir,
+        });
+        console.log(`--- Flow ${flow}: PASSED ---`);
       } catch (error) {
-        console.warn('Failed to redact sensitive data from logs:', error.message);
+        console.error(`--- Flow ${flow}: FAILED ---`);
+        failedFlows.push(flow);
       }
     }
+
+    // Always redact sensitive data, even if some tests failed
+    try {
+      const logDir = path.join(e2eDir, 'maestro-logs');
+      if (fs.existsSync(logDir)) {
+        const redactFiles = (dir) => {
+          for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
+            const fullPath = path.join(dir, entry.name);
+            if (entry.isDirectory()) {
+              redactFiles(fullPath);
+            } else {
+              const content = fs.readFileSync(fullPath, 'utf8');
+              if (content.includes(sentryAuthToken)) {
+                fs.writeFileSync(fullPath, content.replaceAll(sentryAuthToken, '[REDACTED]'));
+              }
+            }
+          }
+        };
+        redactFiles(logDir);
+        console.log('Redacted sensitive data from logs');
+      }
+    } catch (error) {
+      console.warn('Failed to redact sensitive data from logs:', error.message);
+    }
+
+    if (failedFlows.length > 0) {
+      console.error(`\nFailed flows: ${failedFlows.join(', ')}`);
+      process.exit(1);
+    }
   }
 }
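The in-process redaction that replaces the `sed` script can be tried in isolation. The sketch below is a standalone approximation, not the code from cli.mjs: `redactSecret`, the temporary directory, the file names and the sample token are all assumptions made for the demonstration. It uses the same literal `replaceAll` call, so a token containing `/` or regex metacharacters needs no escaping.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Recursively replace every occurrence of `secret` in files under `dir`.
// Returns the number of files that were rewritten.
function redactSecret(dir, secret, placeholder = '[REDACTED]') {
  if (!secret) return 0; // an empty secret would match everywhere
  let rewritten = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      rewritten += redactSecret(fullPath, secret, placeholder);
    } else {
      const content = fs.readFileSync(fullPath, 'utf8');
      if (content.includes(secret)) {
        // String pattern, so the secret is matched literally.
        fs.writeFileSync(fullPath, content.replaceAll(secret, placeholder));
        rewritten += 1;
      }
    }
  }
  return rewritten;
}

// Demonstration on a throwaway directory with a made-up token.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'maestro-logs-'));
fs.mkdirSync(path.join(demoDir, 'nested'));
const leakyFile = path.join(demoDir, 'nested', 'flow.log');
fs.writeFileSync(leakyFile, 'auth=s3cr3t/token+1 again s3cr3t/token+1');
fs.writeFileSync(path.join(demoDir, 'clean.log'), 'nothing sensitive');

const rewrittenCount = redactSecret(demoDir, 's3cr3t/token+1');
const redactedContent = fs.readFileSync(leakyFile, 'utf8');
fs.rmSync(demoDir, { recursive: true, force: true });

console.log(rewrittenCount, redactedContent);
```

Reading each file as UTF-8 is a simplification that suits text logs; binary artifacts in the log directory would need separate handling.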

dev-packages/e2e-tests/maestro/crash.yml

Lines changed: 1 addition & 5 deletions
@@ -2,8 +2,4 @@ appId: ${APP_ID}
 jsEngine: graaljs
 ---
 - runFlow: utils/launchTestAppClear.yml
-- tapOn: "Crash"
-
-- launchApp
-
-- runFlow: utils/assertTestReady.yml
+- tapOn: 'Crash'
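Because crash.yml now ends right after the crash, recovery has to be checked by whichever flow runs next, and the runner pins crash.yml to the front of the list so that a next flow exists. A minimal sketch of that ordering follows; `orderFlows` is a hypothetical wrapper around the filter and comparator from cli.mjs, and every flow name other than crash.yml is invented for the example.

```javascript
// Keep top-level .yml flows, put crash.yml first, sort the rest by name.
function orderFlows(entries) {
  return entries
    .filter(name => name.endsWith('.yml') && !name.startsWith('utils'))
    .sort((a, b) => {
      if (a === 'crash.yml') return -1;
      if (b === 'crash.yml') return 1;
      return a.localeCompare(b);
    });
}

// Hypothetical directory listing: two flows, the utils folder, crash.yml.
const orderedFlows = orderFlows(['scope.yml', 'utils', 'crash.yml', 'captureMessage.yml']);
console.log(orderedFlows);
// → [ 'crash.yml', 'captureMessage.yml', 'scope.yml' ]
```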

dev-packages/e2e-tests/maestro/utils/sentryApi.js

Lines changed: 16 additions & 2 deletions
@@ -62,8 +62,22 @@ switch (fetch) {
     break;
   }
   case 'replay': {
-    const event = json(fetchFromSentry(`${baseUrl}/events/${eventId}/json/`));
-    const replayId = event._dsc.replay_id.replace(/\-/g, '');
+    // The replay_id may not be available immediately after the event is
+    // created — Sentry needs time to process and link the replay. Retry
+    // fetching the event until _dsc.replay_id is present.
+    let replayId;
+    for (let attempt = 0; attempt < RETRY_COUNT; attempt++) {
+      const event = json(fetchFromSentry(`${baseUrl}/events/${eventId}/json/`));
+      if (event._dsc && event._dsc.replay_id) {
+        replayId = event._dsc.replay_id.replace(/\-/g, '');
+        break;
+      }
+      console.log(`replay_id not yet available, retrying: ${attempt + 1}/${RETRY_COUNT}`);
+      sleep(RETRY_INTERVAL);
+    }
+    if (!replayId) {
+      throw new Error(`replay_id not available after ${RETRY_COUNT} retries`);
+    }
     const replay = json(fetchFromSentry(`${baseUrl}/replays/${replayId}/`));
     const segment = fetchFromSentry(`${baseUrl}/replays/${replayId}/videos/0/`);
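The bounded polling loop added for `replay_id` can be exercised without Sentry or Maestro. The sketch below is a generic stand-in, not the code in sentryApi.js: `pollFor`, `fakeResponses` and the `onWait` callback (which takes the place of `sleep(RETRY_INTERVAL)`) are assumptions for illustration, and the sample IDs are made up.

```javascript
// Call fetchOnce up to `retries` times until extract() yields a value.
function pollFor(fetchOnce, extract, { retries, onWait }) {
  for (let attempt = 0; attempt < retries; attempt++) {
    const value = extract(fetchOnce());
    if (value !== undefined && value !== null) return value;
    onWait(attempt + 1); // in the real script this is where sleep() runs
  }
  throw new Error(`value not available after ${retries} attempts`);
}

// Simulated event payloads: the replay link appears on the third fetch.
const fakeResponses = [{}, { _dsc: {} }, { _dsc: { replay_id: '12ab-34cd-56ef' } }];
let fetchCount = 0;
const waits = [];

const replayId = pollFor(
  () => fakeResponses[fetchCount++],
  event => event._dsc?.replay_id?.replace(/-/g, ''),
  { retries: 5, onWait: attempt => waits.push(attempt) },
);

console.log(replayId, waits);
// → 12ab34cd56ef [ 1, 2 ]
```

Throwing after the last attempt keeps the failure explicit, which matches the `throw new Error(...)` in the diff rather than letting a later lookup fail on an undefined ID.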

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+appId: ${APP_ID}
+jsEngine: graaljs
+---
+# Warm up Maestro's IDB/XCUITest driver connection on the simulator.
+# The very first Maestro launchApp after simulator boot can fail on Cirrus
+# Labs Tart VMs — running a lightweight flow first ensures the driver is
+# fully connected before real test flows start.
+- launchApp:
+    clearState: true
+- extendedWaitUntil:
+    visible: "E2E Tests Ready"
+    timeout: 300_000 # 5 minutes
+- killApp

samples/react-native/e2e/tests/captureErrorScreenTransaction/captureErrorsScreenTransaction.test.ts

Lines changed: 17 additions & 9 deletions
@@ -31,15 +31,23 @@ describe('Capture Errors Screen Transaction', () => {
   });

   it('envelope contains transaction context', async () => {
-    const envelope = getErrorsEnvelope();
-
-    const items = envelope[1];
-    const transactions = items.filter(([header]) => header.type === 'transaction');
-    const appStartTransaction = transactions.find(([_header, payload]) => {
-      const event = payload as any;
-      return event.transaction === 'ErrorsScreen' &&
-        event.contexts?.trace?.origin === 'auto.app.start';
-    });
+    // The app start transaction may arrive in a separate envelope on slow CI VMs,
+    // so search all matching envelopes instead of just the first one.
+    const allEnvelopes = sentryServer.getAllEnvelopes(
+      containingTransactionWithName('ErrorsScreen'),
+    );
+
+    let appStartTransaction: EventItem | undefined;
+    for (const envelope of allEnvelopes) {
+      const items = envelope[1];
+      const transactions = items.filter(([header]) => header.type === 'transaction') as EventItem[];
+      appStartTransaction = transactions.find(([_header, payload]) => {
+        const event = payload as any;
+        return event.transaction === 'ErrorsScreen' &&
+          event.contexts?.trace?.origin === 'auto.app.start';
+      });
+      if (appStartTransaction) break;
+    }

     expect(appStartTransaction).toBeDefined();
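The change from "first envelope" to "any envelope" can be reproduced on plain data. This is a sketch in plain JavaScript rather than the test's TypeScript; `findAppStartTransaction` and the sample envelopes are hypothetical, and `'hypothetical.other'` is a placeholder origin, not a real Sentry value. Envelopes are modelled as `[envelopeHeader, items]` with each item as `[itemHeader, payload]`, following the indexing used in the diff.

```javascript
// Return the first matching app start transaction item across all envelopes.
function findAppStartTransaction(envelopes, transactionName) {
  for (const [, items] of envelopes) {
    const match = items.find(
      ([header, payload]) =>
        header.type === 'transaction' &&
        payload.transaction === transactionName &&
        payload.contexts?.trace?.origin === 'auto.app.start',
    );
    if (match) return match;
  }
  return undefined;
}

// The first envelope carries the screen transaction with another origin;
// the app start transaction only shows up in the second envelope.
const sampleEnvelopes = [
  [{}, [[{ type: 'transaction' },
         { transaction: 'ErrorsScreen', contexts: { trace: { origin: 'hypothetical.other' } } }]]],
  [{}, [[{ type: 'event' }, { message: 'unrelated' }],
        [{ type: 'transaction' },
         { transaction: 'ErrorsScreen', contexts: { trace: { origin: 'auto.app.start' } } }]]],
];

const found = findAppStartTransaction(sampleEnvelopes, 'ErrorsScreen');
const firstOnly = findAppStartTransaction(sampleEnvelopes.slice(0, 1), 'ErrorsScreen');
console.log(found !== undefined, firstOnly === undefined);
// → true true
```

Looking only at the first envelope misses the transaction in this data set, which is the failure mode the test comment attributes to slow CI VMs.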

samples/react-native/e2e/tests/captureSpaceflightNewsScreenTransaction/captureSpaceflightNewsScreenTransaction.test.ts

Lines changed: 12 additions & 3 deletions
@@ -42,6 +42,13 @@ describe('Capture Spaceflight News Screen Transaction', () => {
     await waitForSpaceflightNewsTx;

     newsEnvelopes = sentryServer.getAllEnvelopes(containingNewsScreen);
+    // Sort by transaction timestamp — envelope delivery order may vary on slow CI VMs,
+    // but test assertions depend on chronological order.
+    newsEnvelopes.sort((a, b) => {
+      const aItem = getItemOfTypeFrom<EventItem>(a, 'transaction');
+      const bItem = getItemOfTypeFrom<EventItem>(b, 'transaction');
+      return (aItem?.[1]?.timestamp ?? 0) - (bItem?.[1]?.timestamp ?? 0);
+    });
     allTransactionEnvelopes = sentryServer.getAllEnvelopes(
       containingTransaction,
     );
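The timestamp sort can be checked on sample data. In this sketch `firstTransaction` is a simplified stand-in for the test helper `getItemOfTypeFrom`, whose real implementation is not shown in the diff; the envelope contents and names are invented. Unlike the in-place `sort` in the test, the helper here copies the array first, purely to keep the demonstration free of side effects.

```javascript
// Stand-in for getItemOfTypeFrom(envelope, 'transaction').
function firstTransaction(envelope) {
  return envelope[1].find(([header]) => header.type === 'transaction');
}

// Order envelopes by the timestamp of their transaction item.
// Envelopes without a timestamp fall back to 0 and sort first.
function sortByTransactionTimestamp(envelopes) {
  const timestampOf = env => firstTransaction(env)?.[1]?.timestamp ?? 0;
  return [...envelopes].sort((a, b) => timestampOf(a) - timestampOf(b));
}

const makeEnvelope = (name, timestamp) =>
  [{}, [[{ type: 'transaction' }, { transaction: name, timestamp }]]];

// Delivery order differs from chronological order.
const delivered = [makeEnvelope('late', 30), makeEnvelope('early', 10), makeEnvelope('middle', 20)];
const chronological = sortByTransactionTimestamp(delivered)
  .map(env => firstTransaction(env)[1].transaction);

console.log(chronological);
// → [ 'early', 'middle', 'late' ]
```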
@@ -64,9 +71,11 @@ describe('Capture Spaceflight News Screen Transaction', () => {
     allTransactionEnvelopes
       .filter(envelope => {
         const item = getItemOfTypeFrom<EventItem>(envelope, 'transaction');
-        // Only check navigation transactions, not user interaction transactions
-        // User interaction transactions (ui.action.touch) don't have time-to-display measurements
-        return item?.[1]?.contexts?.trace?.op !== 'ui.action.touch';
+        // Only navigation and app start transactions have time-to-display measurements.
+        // Filter with an allow-list — other ops like 'ui.action.touch' or
+        // 'navigation.processing' do not include TTID/TTFD.
+        const op = item?.[1]?.contexts?.trace?.op;
+        return op === 'navigation' || op === 'ui.load';
       })
       .forEach(envelope => {
         expectToContainTimeToDisplayMeasurements(
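The difference between the old deny-list and the new allow-list shows up as soon as an op other than `ui.action.touch` appears. A small sketch, with `TIME_TO_DISPLAY_OPS`, `allowListed`, `denyListed` and the sample op list introduced only for this comparison:

```javascript
// Ops the updated test treats as carrying TTID/TTFD measurements.
const TIME_TO_DISPLAY_OPS = new Set(['navigation', 'ui.load']);

const allowListed = op => TIME_TO_DISPLAY_OPS.has(op); // new behaviour
const denyListed = op => op !== 'ui.action.touch';     // previous behaviour

// Sample ops, including one the old filter did not anticipate and a
// transaction with no op at all.
const sampleOps = ['navigation', 'ui.action.touch', 'navigation.processing', 'ui.load', undefined];

const keptByAllowList = sampleOps.filter(allowListed);
const keptByDenyList = sampleOps.filter(denyListed);

console.log(keptByAllowList);       // → [ 'navigation', 'ui.load' ]
console.log(keptByDenyList.length); // → 4
```

The deny-list lets `navigation.processing` and the op-less transaction through to the TTID/TTFD assertion, while the allow-list only passes the two ops named in the diff.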
