Commit 0816a27

[python] Introduce branch_manager API for pypaimon (#7448)
1 parent e92b27a commit 0816a27

14 files changed: +1596 -0 lines changed

docs/content/pypaimon/python-api.md

Lines changed: 102 additions & 0 deletions
@@ -843,6 +843,108 @@ branch_manager = manager.with_branch('feature_branch')
print(branch_manager.consumers()) # Consumers on feature branch
```

## Branch Management

Branch management allows you to create multiple versions of a table, enabling parallel development and experimentation. Paimon supports creating branches from the current state or from specific tags.

{{< hint info >}}
PyPaimon provides two implementations of `BranchManager`:
- **FileSystemBranchManager**: For tables accessed directly via the filesystem (default for the filesystem catalog)
- **CatalogBranchManager**: For tables accessed via a catalog (e.g., REST catalog)

The `table.branch_manager()` method automatically returns the appropriate implementation based on the table's catalog environment.
{{< /hint >}}

### Create Branch

Create a new branch from the current table state:

```python
from pypaimon import CatalogFactory

catalog = CatalogFactory.create({'warehouse': 'file:///path/to/warehouse'})
table = catalog.get_table('database_name.table_name')

# Create a branch from current state
table.branch_manager().create_branch('feature_branch')
```

Create a branch from a specific tag:

```python
# Create a branch from tag 'v1.0'
table.branch_manager().create_branch('feature_branch', tag_name='v1.0')
```

Create a branch, doing nothing if it already exists:

```python
# No error if branch already exists
table.branch_manager().create_branch('feature_branch', ignore_if_exists=True)
```

### List Branches

List all branches for a table:

```python
# Get all branch names
branches = table.branch_manager().branches()

for branch in branches:
    print(f"Branch: {branch}")
```

### Check Branch Exists

Check if a specific branch exists:

```python
if table.branch_manager().branch_exists('feature_branch'):
    print("Branch exists")
else:
    print("Branch does not exist")
```

### Drop Branch

Delete an existing branch:

```python
# Drop a branch
table.branch_manager().drop_branch('feature_branch')
```

### Fast Forward

Fast forward the main branch to a specific branch:

```python
# Fast forward main to feature branch
# This is useful when you want to merge changes from a feature branch back to main
table.branch_manager().fast_forward('feature_branch')
```

{{< hint warning >}}
The fast-forward operation is irreversible: it replaces the current state of the main branch with the target branch's state.
{{< /hint >}}
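
Because the operation cannot be undone, it can help to check the target branch name before calling `fast_forward`. The following is a minimal sketch modeled on the checks in the `fast_forward_validate` helper added in this commit. The function name `check_fast_forward` and the wording of the first error message are illustrative assumptions, not part of the PyPaimon API.

```python
DEFAULT_MAIN_BRANCH = "main"


def check_fast_forward(branch_name: str, current_branch: str) -> None:
    """Raise ValueError if `branch_name` is not a usable fast-forward target.

    Hypothetical standalone helper; it applies the same three checks as
    BranchManager.fast_forward_validate, in the same order.
    """
    if branch_name == DEFAULT_MAIN_BRANCH:
        raise ValueError(f"Branch '{branch_name}' cannot be used in fast-forward.")
    if not branch_name or not branch_name.strip():
        raise ValueError("Branch name is blank.")
    if branch_name == current_branch:
        raise ValueError(
            f"Fast-forward from the current branch '{current_branch}' is not allowed."
        )


# A regular feature branch passes silently.
check_fast_forward("feature_branch", current_branch="main")

# The main branch and blank names are rejected.
for bad_target in ("main", "   ", ""):
    try:
        check_fast_forward(bad_target, current_branch="main")
    except ValueError as err:
        print(f"rejected {bad_target!r}: {err}")
```

Running such a check first surfaces an invalid target as a `ValueError` before any table state is touched.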

### Branch Path Structure

Paimon organizes branches in the file system as follows:

- **Main branch**: Stored directly in the table directory (e.g., `/path/to/table/`)
- **Feature branches**: Stored in a `branch` subdirectory (e.g., `/path/to/table/branch/branch-feature_branch/`)
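
The layout above can be sketched as a small path function. This mirrors the `BranchManager.branch_path` static method added in this commit, rewritten as a standalone function so it runs without PyPaimon installed. The table path is a placeholder.

```python
BRANCH_PREFIX = "branch-"
DEFAULT_MAIN_BRANCH = "main"


def branch_path(table_path: str, branch: str) -> str:
    """Return the directory that holds `branch` for the table at `table_path`."""
    if branch == DEFAULT_MAIN_BRANCH:
        # The main branch lives directly in the table directory.
        return table_path
    # Every other branch gets its own directory under `branch/`.
    return f"{table_path}/branch/{BRANCH_PREFIX}{branch}"


print(branch_path("/path/to/table", "main"))
# /path/to/table
print(branch_path("/path/to/table", "feature_branch"))
# /path/to/table/branch/branch-feature_branch
```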

### Branch Name Validation

A branch name must satisfy the following constraints:

- Cannot be "main" (the default branch)
- Cannot be blank or whitespace only
- Cannot be a pure numeric string

Valid examples: `feature`, `develop`, `feature-123`, `my-branch`
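
The three constraints can be expressed as a single predicate. This is a sketch only: the `validate_branch` method added in this commit raises `ValueError` on an invalid name, whereas the hypothetical `is_valid_branch_name` below returns a boolean so the rules are easy to try out.

```python
DEFAULT_MAIN_BRANCH = "main"


def is_valid_branch_name(name: str) -> bool:
    """Return True if `name` satisfies the three branch-name constraints."""
    if name == DEFAULT_MAIN_BRANCH:
        return False  # "main" is reserved for the default branch
    if not name or not name.strip():
        return False  # blank or whitespace only
    if name.strip().isdigit():
        return False  # pure numeric string
    return True


for candidate in ["feature", "feature-123", "main", "   ", "123"]:
    print(f"{candidate!r}: {is_valid_branch_name(candidate)}")
```

Only `feature` and `feature-123` pass. The other three candidates each violate one constraint.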

## Supported Features

The following shows the supported features of Python Paimon compared to Java Paimon:
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

from .branch_manager import BranchManager, DEFAULT_MAIN_BRANCH
from .catalog_branch_manager import CatalogBranchManager
from .filesystem_branch_manager import FileSystemBranchManager

__all__ = ['BranchManager', 'DEFAULT_MAIN_BRANCH', 'CatalogBranchManager', 'FileSystemBranchManager']
Lines changed: 190 additions & 0 deletions
@@ -0,0 +1,190 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

import logging
from typing import List, Optional

logger = logging.getLogger(__name__)

BRANCH_PREFIX = "branch-"
DEFAULT_MAIN_BRANCH = "main"


class BranchManager:
    """
    Manager for Branch.

    This is a base class for managing table branches in Paimon.
    Branches allow multiple lines of development on the same table.
    """

    def create_branch(
        self,
        branch_name: str,
        tag_name: Optional[str] = None,
        ignore_if_exists: bool = False
    ) -> None:
        """
        Create a branch from the current state or from a tag.

        Args:
            branch_name: Name of the branch to create
            tag_name: Optional tag name to create branch from, None for current state
            ignore_if_exists: If true, do nothing when branch already exists;
                if false, throw exception

        Raises:
            NotImplementedError: Subclasses must implement this method
        """
        raise NotImplementedError("Subclasses must implement create_branch")

    def drop_branch(self, branch_name: str) -> None:
        """
        Drop a branch.

        Args:
            branch_name: Name of the branch to drop

        Raises:
            NotImplementedError: Subclasses must implement this method
        """
        raise NotImplementedError("Subclasses must implement drop_branch")

    def fast_forward(self, branch_name: str) -> None:
        """
        Fast forward the current branch to the specified branch.

        Args:
            branch_name: The branch to fast forward to

        Raises:
            NotImplementedError: Subclasses must implement this method
        """
        raise NotImplementedError("Subclasses must implement fast_forward")

    def branches(self) -> List[str]:
        """
        List all branches.

        Returns:
            List of branch names

        Raises:
            NotImplementedError: Subclasses must implement this method
        """
        raise NotImplementedError("Subclasses must implement branches")

    def branch_exists(self, branch_name: str) -> bool:
        """
        Check if a branch exists.

        Args:
            branch_name: Name of the branch to check

        Returns:
            True if branch exists, False otherwise
        """
        return branch_name in self.branches()

    @staticmethod
    def branch_path(table_path: str, branch: str) -> str:
        """
        Return the path string of a branch.

        Args:
            table_path: The table path
            branch: The branch name

        Returns:
            The path to the branch
        """
        if BranchManager.is_main_branch(branch):
            return table_path
        return f"{table_path}/branch/{BRANCH_PREFIX}{branch}"

    @staticmethod
    def normalize_branch(branch: str) -> str:
        """
        Normalize branch name.

        Args:
            branch: The branch name to normalize

        Returns:
            The normalized branch name
        """
        if not branch or not branch.strip():
            return DEFAULT_MAIN_BRANCH
        return branch.strip()

    @staticmethod
    def is_main_branch(branch: str) -> bool:
        """
        Check if the branch is the main branch.

        Args:
            branch: The branch name to check

        Returns:
            True if the branch is the main branch, False otherwise
        """
        return branch == DEFAULT_MAIN_BRANCH

    @staticmethod
    def validate_branch(branch_name: str) -> None:
        """
        Validate branch name.

        Args:
            branch_name: The branch name to validate

        Raises:
            ValueError: If branch name is invalid
        """
        if BranchManager.is_main_branch(branch_name):
            raise ValueError(
                f"Branch name '{branch_name}' is the default branch and cannot be used."
            )
        if not branch_name or not branch_name.strip():
            raise ValueError("Branch name is blank.")
        if branch_name.strip().isdigit():
            raise ValueError(
                f"Branch name cannot be pure numeric string but is '{branch_name}'."
            )

    @staticmethod
    def fast_forward_validate(branch_name: str, current_branch: str) -> None:
        """
        Validate fast-forward parameters.

        Args:
            branch_name: The branch to fast forward to
            current_branch: The current branch name

        Raises:
            ValueError: If parameters are invalid
        """
        if branch_name == DEFAULT_MAIN_BRANCH:
            raise ValueError(
                f"Branch name '{branch_name}' do not use in fast-forward."
            )
        if not branch_name or not branch_name.strip():
            raise ValueError("Branch name is blank.")
        if branch_name == current_branch:
            raise ValueError(
                f"Fast-forward from the current branch '{current_branch}' is not allowed."
            )
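
The base class leaves `create_branch`, `drop_branch`, `fast_forward`, and `branches` to subclasses, while `branch_exists` works as soon as `branches` is implemented. The sketch below illustrates that contract with a hypothetical `InMemoryBranchManager` that keeps branch names in a set. It is not one of the implementations in this commit. The base class is reduced to a stand-in so the example runs on its own, and raising `ValueError` for an existing or missing branch is an assumption.

```python
from typing import List, Optional


class BranchManager:
    """Reduced stand-in for the base class in this file (illustration only)."""

    def create_branch(self, branch_name: str, tag_name: Optional[str] = None,
                      ignore_if_exists: bool = False) -> None:
        raise NotImplementedError("Subclasses must implement create_branch")

    def drop_branch(self, branch_name: str) -> None:
        raise NotImplementedError("Subclasses must implement drop_branch")

    def branches(self) -> List[str]:
        raise NotImplementedError("Subclasses must implement branches")

    def branch_exists(self, branch_name: str) -> bool:
        return branch_name in self.branches()


class InMemoryBranchManager(BranchManager):
    """Hypothetical subclass that stores branch names in memory."""

    def __init__(self) -> None:
        self._branches = set()

    def create_branch(self, branch_name: str, tag_name: Optional[str] = None,
                      ignore_if_exists: bool = False) -> None:
        # tag_name is accepted for signature compatibility but unused here.
        if branch_name in self._branches:
            if ignore_if_exists:
                return
            raise ValueError(f"Branch '{branch_name}' already exists.")
        self._branches.add(branch_name)

    def drop_branch(self, branch_name: str) -> None:
        if branch_name not in self._branches:
            raise ValueError(f"Branch '{branch_name}' does not exist.")
        self._branches.remove(branch_name)

    def branches(self) -> List[str]:
        return sorted(self._branches)


manager = InMemoryBranchManager()
manager.create_branch("feature_branch")
manager.create_branch("feature_branch", ignore_if_exists=True)  # no error
print(manager.branches())                       # ['feature_branch']
print(manager.branch_exists("feature_branch"))  # True, inherited from the base class
manager.drop_branch("feature_branch")
print(manager.branch_exists("feature_branch"))  # False
```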
