@pvillard31
Contributor

Summary

NIFI-15448 - Add option for using predefined schemas in GenerateRecord

Sample output for each predefined schema, using a JSON Record Writer with explicit patterns for the date, time, and timestamp formats:

  • Person
{
  "id" : "e86c234e-133d-48d7-a267-54c3a632d077",
  "firstName" : "Mercedez",
  "lastName" : "Rodriguez",
  "email" : "[email protected]",
  "phoneNumber" : "(447) 857-5497",
  "dateOfBirth" : "11/01/2002",
  "age" : 70,
  "active" : false,
  "address" : {
    "street" : "28368 Cartwright Landing",
    "city" : "West Sunshine",
    "state" : "Vermont",
    "zipCode" : "18616",
    "country" : "Denmark"
  }
}
  • Order
{
  "orderId" : "a19b9fbc-486c-4445-8c20-6fa0f45b88a0",
  "customerId" : "23105625-d197-4ad7-82ae-2b2e010e978c",
  "customerName" : "Elouise Nikolaus",
  "customerEmail" : "[email protected]",
  "orderDate" : "04/17/2025",
  "orderTime" : "18:44:47",
  "orderTimestamp" : "04/17/2025 18:44:47",
  "totalAmount" : 4496.86,
  "currency" : "CAD",
  "status" : "SHIPPED",
  "shipped" : true,
  "itemCount" : 3,
  "lineItems" : [ {
    "productId" : "PRD-71258615",
    "productName" : "Heavy Duty Wooden Car",
    "quantity" : 7,
    "unitPrice" : 164.17
  }, {
    "productId" : "PRD-63121751",
    "productName" : "Fantastic Paper Shoes",
    "quantity" : 9,
    "unitPrice" : 327.25
  }, {
    "productId" : "PRD-42145137",
    "productName" : "Enormous Paper Table",
    "quantity" : 2,
    "unitPrice" : 201.21
  } ]
}
  • Event
{
  "eventId" : "dc8a0971-bce7-40b8-8a9c-af9a1d8ef18d",
  "eventType" : "WARNING",
  "eventDate" : "12/10/2025",
  "eventTime" : "17:38:01",
  "eventTimestamp" : "12/10/2025 17:38:01",
  "source" : "api-gateway",
  "severity" : "CRITICAL",
  "message" : "Illo quibusdam eligendi fugiat a quaerat eos laborum.",
  "processed" : true,
  "retryCount" : 0,
  "durationMs" : 4472,
  "tags" : [ "automated", "pending" ],
  "metadata" : {
    "environment" : "staging",
    "correlationId" : "85e7317e-6550-4d1b-a14d-646bcf4809c4",
    "region" : "eu-west-1",
    "version" : "1.5"
  }
}
  • Sensor
{
  "sensorId" : "SNS-3385333161",
  "deviceType" : "MULTI",
  "manufacturer" : "EnviroMonitor",
  "readingTimestamp" : "01/09/2026 12:39:43",
  "temperature" : 39.71,
  "humidity" : 61.94,
  "pressure" : 1017.96,
  "batteryLevel" : 38,
  "signalStrength" : -46,
  "online" : false,
  "location" : {
    "latitude" : 55.63841,
    "longitude" : -134.873745,
    "altitude" : 336.13
  }
}
  • Product
{
  "productId" : "48083f6d-7fc7-4a90-8eca-3303d527e4d5",
  "sku" : "SKU-43443749",
  "name" : "Gorgeous Copper Car",
  "description" : "Voluptatibus nesciunt possimus totam nobis. Eum dicta deserunt. Expedita accusantium quisquam. Reiciendis porro modi officiis numquam necessitatibus. Repellat asperiores distinctio est velit.",
  "category" : "Jewelry & Tools",
  "brand" : "Zemlak and Sons",
  "price" : 1441.33,
  "currency" : "USD",
  "inStock" : true,
  "quantity" : 166,
  "rating" : 4.8,
  "reviewCount" : 3081,
  "createdDate" : "01/12/2025",
  "lastUpdated" : "01/03/2026 17:03:25",
  "tags" : [ "bestseller", "exclusive" ],
  "dimensions" : {
    "length" : 9.24,
    "width" : 68.33,
    "height" : 77.48,
    "weight" : 48.9
  }
}
  • Stock Trade
{
  "tradeId" : "d3400be3-e1b9-4dcb-ab2a-ef025750f49e",
  "symbol" : "GOOGL",
  "companyName" : "Alphabet Inc.",
  "exchange" : "NYSE",
  "tradeType" : "BUY",
  "tradeTimestamp" : "01/08/2026 18:57:43",
  "price" : 2341.3815,
  "quantity" : 6428,
  "totalValue" : 1.505040028E7,
  "currency" : "USD",
  "bidPrice" : 2339.0401,
  "askPrice" : 2343.7229,
  "high52Week" : 3512.0723,
  "low52Week" : 1404.8289,
  "marketCap" : 1227598881693,
  "settled" : false
}
  • Complete Example
{
  "id" : "4043aad4-1160-4aa2-a60e-49171c9a4d54",
  "active" : true,
  "score" : 44,
  "count" : 186350,
  "rating" : 4.99,
  "price" : 743.37,
  "balance" : 44918.92,
  "initial" : "M",
  "flags" : 77,
  "rank" : 824,
  "createdDate" : "03/31/2025",
  "lastLoginTime" : "21:26:28",
  "lastModified" : "01/05/2026 01:02:56",
  "tags" : [ "automated", "important", "verified", "pending" ],
  "scores" : [ 91, 93, 96 ],
  "metadata" : {
    "environment" : "staging",
    "source" : "web",
    "region" : "us-east-1",
    "version" : "1.5"
  },
  "profile" : {
    "firstName" : "Cinthia",
    "lastName" : "Windler",
    "email" : "[email protected]",
    "age" : 79,
    "verified" : false,
    "address" : {
      "street" : "8130 Jenee Ford",
      "city" : "West Lewisport",
      "state" : "Indiana",
      "zipCode" : "15860",
      "country" : "French Southern Territories",
      "coordinates" : {
        "latitude" : -73.280023,
        "longitude" : -98.408473
      }
    }
  },
  "orders" : [ {
    "orderId" : "ORD-55940080",
    "amount" : 268.44,
    "currency" : "USD",
    "placed" : "12/14/2025",
    "shipped" : true
  } ]
}
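
The date, time, and timestamp values in the samples above (for example "04/17/2025", "18:44:47", and "04/17/2025 18:44:47") look consistent with the patterns MM/dd/yyyy, HH:mm:ss, and MM/dd/yyyy HH:mm:ss. The PR does not state the exact patterns, so those three strings are an assumption, and RecordWriterPatternSketch is a hypothetical stand-alone class rather than NiFi code. A minimal sketch of how such patterns render one value:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of NiFi: renders one value with three patterns.
class RecordWriterPatternSketch {
    // Assumed patterns, inferred from the sample output rather than stated in the PR.
    static final DateTimeFormatter DATE = DateTimeFormatter.ofPattern("MM/dd/yyyy");
    static final DateTimeFormatter TIME = DateTimeFormatter.ofPattern("HH:mm:ss");
    static final DateTimeFormatter TIMESTAMP = DateTimeFormatter.ofPattern("MM/dd/yyyy HH:mm:ss");

    static String date(LocalDateTime value) {
        return DATE.format(value);
    }

    static String time(LocalDateTime value) {
        return TIME.format(value);
    }

    static String timestamp(LocalDateTime value) {
        return TIMESTAMP.format(value);
    }

    public static void main(String[] args) {
        // Same instant as the Order sample: 17 April 2025, 18:44:47.
        LocalDateTime order = LocalDateTime.of(2025, 4, 17, 18, 44, 47);
        System.out.println(date(order));      // 04/17/2025
        System.out.println(time(order));      // 18:44:47
        System.out.println(timestamp(order)); // 04/17/2025 18:44:47
    }
}
```

In NiFi itself these patterns would be set as properties on the record writer rather than in code; the sketch only shows what the resulting strings look like.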

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull request contains commits signed with a registered key indicating Verified status

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using ./mvnw clean install -P contrib-check
    • JDK 21
    • JDK 25

Licensing

  • New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • Documentation formatting appears as expected in rendered files

Contributor

@exceptionfactory left a comment

Thanks for proposing this improvement @pvillard31.

The concept of a predefined schema sounds helpful in some scenarios, but I'm somewhat concerned about the choice of various values. Particularly for certain things like stock symbols or cloud provider regions, there appear to be a number of specific choices in this implementation.

As an alternative, it seems like maintaining example schemas somewhere else, such as the Confluence Wiki pages, would avoid putting a lot of these particular choices in project code.

If there are other established generic types, that could be another approach.

Contributor

@exceptionfactory left a comment

This also raises questions about field names themselves. Schema.org is one general place for community-based definition of many things, so that could be one potential pattern.

@pvillard31
Contributor Author

Thanks for the review @exceptionfactory. I made some changes to:

  • use Faker's providers whenever possible - there are only a few places left where I keep a list of values, but those are very standard (log level, etc.)
  • rename the fields to align more closely with what schema.org suggests for similar objects

I would definitely not go with the suggestion of keeping some Avro schemas somewhere else (even on an additionalDetails page for the processor), because an Avro schema does not rely on Faker's providers. Generation from a schema only looks at the field types and produces completely random values, which is really not great.

Besides, the goal is really to make it dead easy for users to generate some data. I find myself wasting a lot of time configuring this processor when I want to generate some basic data for quick demos/tests. I think having this option would be very helpful for users.
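
The contrast drawn in this comment, between generating from field types alone and generating from a provider that knows the domain, can be sketched in plain Java. This is an illustration only: GenerationStyleSketch, randomString, logLevel, and the LOG_LEVELS list are all hypothetical, and neither method is NiFi or Faker API.

```java
import java.util.List;
import java.util.Random;

// Hypothetical sketch; it does not reproduce the GenerateRecord implementation.
class GenerationStyleSketch {
    // Assumed curated list, standing in for the "very standard" values such as log levels.
    static final List<String> LOG_LEVELS = List.of("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL");

    // Type-only generation: the field is a string, so any lowercase letters satisfy it.
    static String randomString(Random random, int length) {
        StringBuilder builder = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            builder.append((char) ('a' + random.nextInt(26)));
        }
        return builder.toString();
    }

    // Provider-style generation: the value is drawn from a domain-specific source.
    static String logLevel(Random random) {
        return LOG_LEVELS.get(random.nextInt(LOG_LEVELS.size()));
    }

    public static void main(String[] args) {
        Random random = new Random();
        System.out.println("type-only severity:      " + randomString(random, 8));
        System.out.println("provider-style severity: " + logLevel(random));
    }
}
```

Both values are valid strings for a "severity" field, but only the second reads like real data, which is the point the comment makes about schemas that carry types and nothing else.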
