Skip to content

Commit b68c9a4

Browse files
Add temporalmetricsdemo sample (cloud + worker metrics via Prometheus/Grafana)
1 parent 081a7a4 commit b68c9a4

File tree

21 files changed

+10877
-0
lines changed

21 files changed

+10877
-0
lines changed
Lines changed: 142 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,142 @@
1+
# Temporal Cloud OpenMetrics → Prometheus → Grafana (Step-by-step)
2+
3+
This demo shows how to **scrape Temporal Cloud OpenMetrics(https://docs.temporal.io/cloud/metrics/openmetrics/)**
4+
with **Prometheus** and **visualize them in Grafana**.
5+
6+
It uses the Grafana Temporal mixin dashboard template:
7+
https://github.com/grafana/jsonnet-libs/blob/master/temporal-mixin/dashboards/temporal-overview.json
8+
9+
Once imported/provisioned, the dashboard lets you view the key Temporal metrics in a ready-made layout.
10+
it may take a second to load, refresh if it takes longer than that.
11+
12+
![Grafana dashboard 4](docs/images/img4.png)
13+
14+
Grafana dashboard view :-
15+
16+
**temporal cloud openmetrics**
17+
![Grafana dashboard 1](docs/images/img1.png)
18+
![Grafana dashboard 2](docs/images/img2.png)
19+
20+
21+
**worker metrics**
22+
![Grafana dashboard 3](docs/images/img6.png)
23+
![Grafana dashboard 4](docs/images/img7.png)
24+
25+
Prometheus :-
26+
27+
**cloud metrics**
28+
![Prometheus Cloud metrics](docs/images/img3.png)
29+
**worker metrics**
30+
![Prometheus Worker metrics](docs/images/img3.png)
31+
32+
---
33+
34+
## 1) Create Service Account + API Key (Temporal Cloud)
35+
36+
OpenMetrics auth reference:
37+
https://docs.temporal.io/production-deployment/cloud/metrics/openmetrics/api-reference#authentication
38+
39+
In Temporal Cloud UI:
40+
- **Settings → Service Accounts**
41+
- Create a service account with **Metrics Read-Only** role
42+
- Generate an **API key** ( copy this, it will be needed later)
43+
---
44+
45+
46+
## 2) Create the `secrets/` folder + API key file
47+
48+
From the repo root (same folder as `docker-compose.yml`), run:
49+
50+
```
51+
cd temporalmetricsdemo
52+
mkdir -p secrets
53+
echo "put your CLOUD API KEY HERE" > secrets/temporal_cloud_api_key
54+
```
55+
now the folder will look like below and temporal_cloud_api_key will have the above api key that we generated in step 1
56+
57+
```
58+
temporalmetricsdemo/
59+
├── docker-compose.yml
60+
├── prometheus/
61+
├── grafana/
62+
├── secrets/
63+
│ └── temporal_cloud_api_key
64+
```
65+
66+
## 3) Configure TemporalConnection.java
67+
68+
Edit `TemporalConnection.java` and set your defaults:
69+
70+
```
71+
public static final String NAMESPACE = env("TEMPORAL_NAMESPACE", "<namespace>.<account-id>");
72+
public static final String ADDRESS = env("TEMPORAL_ADDRESS", "<namespace>.<account-id>.tmprl.cloud:7233");
73+
public static final String CERT = env("TEMPORAL_CERT", "/path/to/client.pem");
74+
public static final String KEY = env("TEMPORAL_KEY", "/path/to/client.key");
75+
public static final String TASK_QUEUE = env("TASK_QUEUE", "openmetrics-task-queue");
76+
public static final int WORKER_SECONDS = envInt("WORKER_SECONDS", 60);
77+
```
78+
79+
## 4) 4) Update Prometheus scrape config
80+
81+
prometheus/config.yml
82+
Update it to use your namespace
83+
```
84+
params:
85+
namespaces: [ '<namespace>.<account-id>' ]
86+
```
87+
88+
89+
## 5) Start Prometheus + Grafana
90+
91+
docker compose up -d
92+
docker compose ps
93+
94+
95+
## 6) View Grafana dashboard
96+
97+
http://localhost:3001/
98+
99+
- Username: admin
100+
- Password: admin
101+
102+
You should see the Temporal Cloud OpenMetrics dashboard.
103+
104+
## 7) Verify metrics in Prometheus
105+
106+
Prometheus: http://localhost:9093/
107+
108+
Go to:
109+
Status → Targets (make sure the scrape target is UP)
110+
Graph tab (search for Temporal metrics and run a query)
111+
112+
## 8) Ran the sample and view the cloud metrics
113+
114+
- `./gradlew -q execute -PmainClass=io.temporal.samples.temporalmetricsdemo.WorkerMain`
115+
- `METRICS_PORT=9465 ./gradlew -q execute -PmainClass=io.temporal.samples.temporalmetricsdemo.Starter`
116+
117+
starter logs
118+
```
119+
➜ samples-java git:(deepika/openmetrics-demo) ✗ METRICS_PORT=9465 ./gradlew -q execute -PmainClass=io.temporal.samples.temporalmetricsdemo.Starter
120+
Worker metrics exposed at http://0.0.0.0:9465/metrics
121+
13:58:44.102 { } [main] INFO i.t.s.WorkflowServiceStubsImpl - Created WorkflowServiceStubs for channel: ManagedChannelOrphanWrapper{delegate=ManagedChannelImpl{logId=1, target=deepika-test-namespace.a2dd6.tmprl.cloud:7233}}
122+
123+
=== Starting scenario: success workflowId=scenario-success-6120c15f-b2f0-40cb-b3a6-f39bf0af6698 ===
124+
Scenario=success Result=Hello Temporal
125+
126+
=== Starting scenario: fail workflowId=scenario-fail-0ff499af-1c3c-4a78-b75d-7d24c66ef46d ===
127+
Scenario=fail ended: WorkflowFailedException - Workflow execution {workflowId='scenario-fail-0ff499af-1c3c-4a78-b75d-7d24c66ef46d', runId='', workflowType='ScenarioWorkflow'} failed. Metadata: {closeEventType='EVENT_TYPE_WORKFLOW_EXECUTION_FAILED', retryState='RETRY_STATE_RETRY_POLICY_NOT_SET', workflowTaskCompletedEventId=10'}
128+
129+
=== Starting scenario: timeout workflowId=scenario-timeout-e63954fb-c39b-4fb4-9dc9-ee2fa2835b52 ===
130+
Scenario=timeout ended: WorkflowFailedException - Workflow execution {workflowId='scenario-timeout-e63954fb-c39b-4fb4-9dc9-ee2fa2835b52', runId='', workflowType='ScenarioWorkflow'} timed out. Metadata: {closeEventType='EVENT_TYPE_WORKFLOW_EXECUTION_TIMED_OUT', retryState='RETRY_STATE_RETRY_POLICY_NOT_SET'}
131+
132+
=== Starting scenario: continue workflowId=scenario-continue-26e780ab-1f0a-4222-9b2b-5c09c62cb824 ===
133+
Scenario=continue Result=Hello Temporal
134+
135+
=== Starting scenario: cancel workflowId=scenario-cancel-08c9d1f6-6bfb-4ee9-9a5d-23b35fe1af7c ===
136+
Scenario=cancel ended: WorkflowFailedException - Workflow execution {workflowId='scenario-cancel-08c9d1f6-6bfb-4ee9-9a5d-23b35fe1af7c', runId=''} was cancelled. Metadata: {closeEventType='EVENT_TYPE_WORKFLOW_EXECUTION_CANCELED', retryState='RETRY_STATE_NON_RETRYABLE_FAILURE'}
137+
<===========--> 87% EXECUTING [18m 6s]
138+
139+
```
140+
there will be some failures in the worker logs as we are intentionally failing workflows for the data generation purpose.
141+
give few seconds to see the data in both the dashboard and try to run couple of workflows so the rate
142+
queries show properly.
Lines changed: 60 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,60 @@
1+
package io.temporal.samples.temporalmetricsdemo;
2+
3+
import io.temporal.client.WorkflowClient;
4+
import io.temporal.client.WorkflowOptions;
5+
import io.temporal.client.WorkflowStub;
6+
import io.temporal.samples.temporalmetricsdemo.workflows.ScenarioWorkflow;
7+
import java.time.Duration;
8+
import java.util.UUID;
9+
10+
public class Starter {
11+
12+
public static void main(String[] args) throws Exception {
13+
WorkflowClient client = TemporalConnection.client();
14+
15+
String name = "Temporal";
16+
String[] scenarios = {"success", "fail", "timeout", "continue", "cancel"};
17+
18+
for (String scenario : scenarios) {
19+
String wid = "scenario-" + scenario + "-" + UUID.randomUUID();
20+
21+
WorkflowOptions.Builder optionsBuilder =
22+
WorkflowOptions.newBuilder()
23+
.setTaskQueue(TemporalConnection.TASK_QUEUE)
24+
.setWorkflowId(wid);
25+
26+
// workflow timeout
27+
if ("timeout".equalsIgnoreCase(scenario)) {
28+
optionsBuilder.setWorkflowRunTimeout(Duration.ofSeconds(3));
29+
}
30+
31+
ScenarioWorkflow wf = client.newWorkflowStub(ScenarioWorkflow.class, optionsBuilder.build());
32+
33+
System.out.println("\n=== Starting scenario: " + scenario + " workflowId=" + wid + " ===");
34+
35+
try {
36+
if ("cancel".equalsIgnoreCase(scenario)) {
37+
WorkflowClient.start(wf::run, scenario, name);
38+
Thread.sleep(2000);
39+
WorkflowStub untyped = client.newUntypedWorkflowStub(wid);
40+
untyped.cancel();
41+
untyped.getResult(String.class);
42+
continue;
43+
}
44+
45+
// normal synchronous execution
46+
String result = wf.run(scenario, name);
47+
System.out.println("Scenario=" + scenario + " Result=" + result);
48+
49+
} catch (Exception e) {
50+
System.out.println(
51+
"Scenario="
52+
+ scenario
53+
+ " ended: "
54+
+ e.getClass().getSimpleName()
55+
+ " - "
56+
+ e.getMessage());
57+
}
58+
}
59+
}
60+
}
Lines changed: 141 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,141 @@
1+
package io.temporal.samples.temporalmetricsdemo;
2+
3+
import com.sun.net.httpserver.HttpServer;
4+
import com.uber.m3.tally.RootScopeBuilder;
5+
import com.uber.m3.tally.Scope;
6+
import com.uber.m3.tally.StatsReporter;
7+
import com.uber.m3.util.Duration;
8+
import io.grpc.netty.shaded.io.grpc.netty.GrpcSslContexts;
9+
import io.grpc.netty.shaded.io.netty.handler.ssl.SslContext;
10+
import io.grpc.netty.shaded.io.netty.handler.ssl.SslContextBuilder;
11+
import io.micrometer.prometheus.PrometheusConfig;
12+
import io.micrometer.prometheus.PrometheusMeterRegistry;
13+
import io.temporal.client.WorkflowClient;
14+
import io.temporal.client.WorkflowClientOptions;
15+
import io.temporal.common.reporter.MicrometerClientStatsReporter;
16+
import io.temporal.serviceclient.WorkflowServiceStubs;
17+
import io.temporal.serviceclient.WorkflowServiceStubsOptions;
18+
import java.io.FileInputStream;
19+
import java.io.InputStream;
20+
import java.io.OutputStream;
21+
import java.net.InetSocketAddress;
22+
import java.nio.charset.StandardCharsets;
23+
24+
public final class TemporalConnection {
25+
private TemporalConnection() {}
26+
27+
// Read from environment (docker-compose env_file: .env)
28+
public static final String NAMESPACE = env("TEMPORAL_NAMESPACE", "<namespace>.<account-id>");
29+
public static final String ADDRESS =
30+
env("TEMPORAL_ADDRESS", "<namespace>.<account-id>.tmprl.cloud:7233");
31+
public static final String CERT =
32+
env("TEMPORAL_CERT", "path/to/client.pem");
33+
public static final String KEY =
34+
env("TEMPORAL_KEY", "path/to/client.key");
35+
public static final String TASK_QUEUE = env("TASK_QUEUE", "openmetrics-task-queue");
36+
private static final int METRICS_PORT = envInt("METRICS_PORT", 9464);
37+
private static final int METRICS_REPORT_SECONDS = envInt("METRICS_REPORT_SECONDS", 10);
38+
39+
private static volatile WorkflowClient CLIENT;
40+
private static volatile WorkflowServiceStubs SERVICE;
41+
42+
private static volatile boolean METRICS_STARTED = false;
43+
private static volatile PrometheusMeterRegistry PROM_REGISTRY;
44+
45+
public static WorkflowClient client() {
46+
if (CLIENT != null) return CLIENT;
47+
synchronized (TemporalConnection.class) {
48+
if (CLIENT != null) return CLIENT;
49+
50+
SERVICE = serviceStubs();
51+
CLIENT =
52+
WorkflowClient.newInstance(
53+
SERVICE, WorkflowClientOptions.newBuilder().setNamespace(NAMESPACE).build());
54+
return CLIENT;
55+
}
56+
}
57+
58+
// create service stubs used by worker + starter
59+
private static WorkflowServiceStubs serviceStubs() {
60+
try (InputStream clientCert = new FileInputStream(CERT);
61+
InputStream clientKey = new FileInputStream(KEY)) {
62+
63+
SslContext sslContext =
64+
GrpcSslContexts.configure(SslContextBuilder.forClient().keyManager(clientCert, clientKey))
65+
.build();
66+
67+
Scope metricsScope = metricsScope(); // ✅ tally scope that writes into Prometheus registry
68+
69+
WorkflowServiceStubsOptions options =
70+
WorkflowServiceStubsOptions.newBuilder()
71+
.setTarget(ADDRESS)
72+
.setSslContext(sslContext)
73+
.setMetricsScope(metricsScope) // ✅ Temporal SDK emits metrics here
74+
.build();
75+
76+
return WorkflowServiceStubs.newServiceStubs(options);
77+
} catch (Exception e) {
78+
throw new RuntimeException("Failed to create Temporal TLS connection", e);
79+
}
80+
}
81+
82+
private static Scope metricsScope() {
83+
synchronized (TemporalConnection.class) {
84+
if (PROM_REGISTRY == null) {
85+
PROM_REGISTRY = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
86+
}
87+
88+
StatsReporter reporter = new MicrometerClientStatsReporter(PROM_REGISTRY);
89+
90+
Scope scope =
91+
new RootScopeBuilder()
92+
.reporter(reporter)
93+
.reportEvery(Duration.ofSeconds(METRICS_REPORT_SECONDS));
94+
95+
if (!METRICS_STARTED) {
96+
METRICS_STARTED = true;
97+
startMetricsHttpServer(PROM_REGISTRY);
98+
}
99+
100+
return scope;
101+
}
102+
}
103+
104+
private static void startMetricsHttpServer(PrometheusMeterRegistry registry) {
105+
try {
106+
HttpServer server = HttpServer.create(new InetSocketAddress("0.0.0.0", METRICS_PORT), 0);
107+
server.createContext(
108+
"/metrics",
109+
exchange -> {
110+
byte[] body = registry.scrape().getBytes(StandardCharsets.UTF_8);
111+
exchange
112+
.getResponseHeaders()
113+
.add("Content-Type", "text/plain; version=0.0.4; charset=utf-8");
114+
exchange.sendResponseHeaders(200, body.length);
115+
try (OutputStream os = exchange.getResponseBody()) {
116+
os.write(body);
117+
}
118+
});
119+
server.setExecutor(null);
120+
server.start();
121+
System.out.println("Worker metrics exposed at http://0.0.0.0:" + METRICS_PORT + "/metrics");
122+
} catch (Exception e) {
123+
throw new RuntimeException("Failed to start /metrics endpoint", e);
124+
}
125+
}
126+
127+
private static String env(String key, String def) {
128+
String v = System.getenv(key);
129+
return (v == null || v.isBlank()) ? def : v;
130+
}
131+
132+
private static int envInt(String key, int def) {
133+
String v = System.getenv(key);
134+
if (v == null || v.isBlank()) return def;
135+
try {
136+
return Integer.parseInt(v.trim());
137+
} catch (Exception e) {
138+
return def;
139+
}
140+
}
141+
}
Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,30 @@
1+
package io.temporal.samples.temporalmetricsdemo;
2+
3+
import io.temporal.client.WorkflowClient;
4+
import io.temporal.samples.temporalmetricsdemo.activities.ScenarioActivitiesImpl;
5+
import io.temporal.samples.temporalmetricsdemo.workflows.ScenarioWorkflowImpl;
6+
import io.temporal.worker.WorkerFactory;
7+
8+
public class WorkerMain {
9+
public static void main(String[] args) throws Exception {
10+
WorkflowClient client = TemporalConnection.client();
11+
12+
WorkerFactory factory = WorkerFactory.newInstance(client);
13+
io.temporal.worker.Worker worker = factory.newWorker(TemporalConnection.TASK_QUEUE);
14+
15+
worker.registerWorkflowImplementationTypes(ScenarioWorkflowImpl.class);
16+
worker.registerActivitiesImplementations(new ScenarioActivitiesImpl());
17+
18+
factory.start();
19+
System.out.println(
20+
"Worker started. namespace="
21+
+ TemporalConnection.NAMESPACE
22+
+ " taskQueue="
23+
+ TemporalConnection.TASK_QUEUE
24+
+ " metrics=http://0.0.0.0:"
25+
+ System.getenv().getOrDefault("METRICS_PORT", "9464")
26+
+ "/metrics");
27+
28+
Thread.currentThread().join();
29+
}
30+
}
Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,10 @@
1+
package io.temporal.samples.temporalmetricsdemo.activities;
2+
3+
import io.temporal.activity.ActivityInterface;
4+
import io.temporal.activity.ActivityMethod;
5+
6+
@ActivityInterface
7+
public interface ScenarioActivities {
8+
@ActivityMethod
9+
String doWork(String name, String scenario);
10+
}

0 commit comments

Comments
 (0)