Practice...

Practice...
SQL vs NoSQL? - Actually, Relational and NON-Relational databases
- Relational DB
  - Some interesting details
- NON-Relational DB

Repository structure:

durable_execution_engine: folder with the durable execution engines
images: folder with images - mostly used in this readme
rules_engine: folder with different rules engines
README.md: overview of this project + some info on two way sync

What is a rule engine?

A rule engine uses rules, which are conditional statements, to evaluate inputs. By having rules in a yaml file we can more easily maintain the code for the engine and better scale to use different rules for different purposes (the engine just consumes the rules).

Rules Rule - practice_4

https://martinfowler.com/bliki/RulesEngine.html

https://pypi.org/project/funnel-rules-engine/

https://github.com/funnel-io/funnel-rules-engine

Concrete example

https://coplane.com/perspectives/building_reliable_ai

practice_7

Durable Execution Engine

practice_5

Implemented 2 types:

just a simple one to execute jobs sequentially without support for parallel execution.
A more complex one with support for parallel execution and we can define dependencies for each job - if no dependency then it runs async until it hits a job with dependency then it waits for all the async jobs to finish - it then checks if depedencies are met and it either executes or moves on to the next job.

Simple Engine

Async Engine

Two way sync

FinOps recommendation engine

A engine that gathers data from different sources, generates recommendations to save money, then distributes these recommendations as tickets in different Jira boards to be assigned to the owner of that infrastructure/resource. It also syncs back from Jira to the engine itself to maintain the state of the ticket in our DB.

recommendations table

id, created_at, recommendation, resource
1,10,resize,VM1
2,12,delete,VM2
3,15,delete,VM3
4,17,resize,VM1
5,18,delete,VM4
6,19,resize,VM5

recommendations_generated

id, last_successful_recommendation_generated
1,16
2,20

if recommendation, resource are the same, for example in entry 1 and 4 where resize,VM1 == resize,VM1 then it's a repeated recommendation. We should only create the ticket for 1 and then 4 is just the "last_created"

to know if ticket created or not check recommendation_ticket_map table when making a recommendation. When querying - limit it to search by created_at - if script runs every 1 week to generate recommendation, then to search for "repeated" recommendations in that limit - we can get the last recommendation date by getting last_successful_recommendation_generated from the "generated recommendations table"

tickets table

id, created_at, status, 
1,110,created,
2,112,created,
3,115,created,
4,118,closed,
5,119,closed,

recommendation_ticket_map table id(supressed),recommendation_id,first_recommendation_id,ticket_id

1,1,1
2,2,2
3,3,3
4,1,1
5,5,4
6,6,5

a distributor to push to 3 party a "monitoring" to pull data from the 3 party in a set interval - how can we query the 3 party? what are the options we have? GET data FROM users WHERE updated=(from last_pool_time to now)

last_pool_time we keep in our own side to know when was the time of the last successful data pull.

How to resume a execution that failed???

We can have partial runs - so what we can do is that for each run to create the tickets we are actually checking for last_successful_recommendation_generated - if it was 2 weeks ago then it failed....

How can we improve this? We need to track a few things in the DB:

current recommendation we are generating (like the durable workflow engine we need to keep track of the current recommendation/jira ticket being created so we can resume later)
... WIP

WIP/Notes

maintaining data consistency and enabling seamless collaboration between different platforms. defining sync rules, mapping fields, automating updates, and ensuring secure, real-time data exchange between the platforms

https://exalate.com/blog/two-way-integration/

Infrastructure deployment automation and orchestration

We would probably be looking into CI/CD and using the info from state file.. rules in OPA as part of the pipeline to enforce some of our rules..

Important things in any pipeline:

Security scans
- Vulnerabilities
- Secrets
Tests
- passing unit tests
- functional
- integration
- end-2-end
- smoke..
Terraform and/or tf Workspace config:
- tracking branch/tag..
- auto-apply
- auto-apply run triggers (cascade changes into other workspaces that are orchestrating resources that must be changed in case a previous workspace was modified)
- a workspace with "global outputs" to help orchestrate
  - Remote state sharing organization wide.
  - have several "global" workspaces, 1 for global networks, 1 for users info, etc and then from the "global_outputs" we can agreggate those global ones into only one.
- reusable modules - publish them in a resgistry, version them and make sure we have a workflow to orchestrate changes in all repositories

Some vendors..

spacelift: https://spacelift.io/ terragrunt: https://terragrunt.gruntwork.io/

Orchestrating Infrastructure

SaaS vs Dedicated vs Self-Hosted

Control plane always SaaS? Data stored centrally? Maintenance? Cost?

SaaS vs Dedicated vs Self-Hosted

Feature/Question	SaaS	Dedicated	Self-Hosted
Control plane	Always managed by provider	Can be managed or customer	Fully managed by customer
Data storage	Centrally (provider's cloud)	Can be isolated per customer	On-premises or customer cloud
Maintenance	Handled by provider	Shared or customer responsibility	Fully customer responsibility
Cost	Subscription, pay-as-you-go	Higher, often per-customer	Upfront infra + maintenance

Other...

Control plane always SaaS?
Data stored centrally?
Maintenance?
Scalability?
Cost for SaaS is usually lower as infra is shared
SaaS hybrid with namespace isolation for workloads and other types of isolation for data

SQL vs NoSQL? - Actually, Relational and NON-Relational databases

SQL - query language; and NoSQL - refers to non-relational databases that usually use different languages...

Relational DB

Diff tables and each table represents one type of data model. Many to many relationships: foreign keys.. used to represent relationship between rows in diff tables - "normalized". Can levarage Join - Left, inner and outer?

Downside: distributed transactions can be slow.

Good for data represented by relationship...

relational/normalized data - one reference to the one piece of data and have it stay consistent across the database.

Common features:

Tend to be B-Tree based for indexing. (better reads?)
Single leader replication. (avoid write conflicts?) but multi-leader replication is also possible.
Configurable isolation levels - ACID
- atomicity, consistency, isolation, durability

Some interesting details

Skip lock - allows to avoid deadlocks by skipping locked rows

Scale vertically

NON-Relational DB

Better data locality but has repeated data.. Keep all the data we need in one single record. Will lead to more writes as well as it's denormalized. Writes can be pretty complicated if we have to write the data in multiple places

Good to use when data can be kept as separate messages/records - no relationship between them.

Good to scale horizontally - Kind of a key value store.. easier to split workload and only part of DB owned by a workload - diff partitions - where is an item stored? Key value store.. key determines in which partition the item is stored - where to store new items and retrieve existing one? Hash and then go to the host assign to that range's hash.

Schemaless - each item does not need to have the same structure... pretty good if app and data structure is constantly changing/evolving.

Bad for data analysis... eventually consistent..

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Practice...

Table of Contents

What is a rule engine?

Useful links and Libs..

Videos...

PyCon Sweden

Concrete example

Durable Execution Engine

Simple Engine

Async Engine

Two way sync

FinOps recommendation engine

WIP/Notes

Infrastructure deployment automation and orchestration

Some vendors..

Orchestrating Infrastructure

SaaS vs Dedicated vs Self-Hosted

SQL vs NoSQL? - Actually, Relational and NON-Relational databases

Relational DB

Some interesting details

NON-Relational DB

FilesExpand file tree

README.md

Latest commit

History

README.md

File metadata and controls

Practice...

Table of Contents

What is a rule engine?

Useful links and Libs..

Videos...

PyCon Sweden

Concrete example

Durable Execution Engine

Simple Engine

Async Engine

Two way sync

FinOps recommendation engine

WIP/Notes

Infrastructure deployment automation and orchestration

Some vendors..

Orchestrating Infrastructure

SaaS vs Dedicated vs Self-Hosted

SQL vs NoSQL? - Actually, Relational and NON-Relational databases

Relational DB

Some interesting details

NON-Relational DB