Commit cbdfd5b

Content Rev Nov21-1 (#19)
1 parent 6c26e7d commit cbdfd5b

10 files changed: 274 additions, 3 deletions

docs/getting-started/first-pipeline.mdx

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ import clsx from "clsx";
 
 Build and run your first machine learning pipeline in under 10 minutes using the TangleML editor's drag-and-drop interface.
 
-<CTASection actions={<Link className={clsx(ctaStyles.button, ctaStyles.buttonOutline)} to={DEMO_URL}>🎮 Build and Run Your First Pipeline</Link>}>
+<CTASection actions={<Link className={clsx(ctaStyles.button, ctaStyles.buttonOutline)} to={DEMO_URL}>Build and Run Your First Pipeline</Link>}>
 <h2>Want to try it now?</h2>
 </CTASection>

docs/index.mdx

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ import clsx from "clsx";
 Tangle is a service and Web app that allows users to build and run Machine Learning pipelines using drag and drop without having to set up a development environment.
 
 <CTASection actions={<>
-<Link className={clsx(ctaStyles.button, ctaStyles.buttonOutline)} to={DEMO_URL}>🎮 Run Your First Pipeline In Seconds</Link>
+<Link className={clsx(ctaStyles.button, ctaStyles.buttonOutline)} to={DEMO_URL}>Run Your First Pipeline In Seconds</Link>
 </>}>
 <h2>Jump To Tangle</h2>
 <p>

docs/user-guide/assets/HF_0.png

149 KB

docs/user-guide/assets/HF_1.png

116 KB

docs/user-guide/assets/HF_2.png

277 KB

docs/user-guide/assets/HF_3.png

24.1 KB

docs/user-guide/assets/HF_4.png

117 KB
Lines changed: 270 additions & 0 deletions
@@ -0,0 +1,270 @@
1+
---
2+
id: hugging-face-deployment
3+
title: Running Tangle on Hugging Face
4+
---
5+
6+
import { ImageAnnotation } from "@site/src/components/ImageAnnotation";
7+
import { HelpCircle, Info, Delete, Maximize2, PlusSquare, Copy, ListRestart } from "lucide-react";
8+
9+
# Running Tangle on Hugging Face
10+
11+
This guide covers everything you need to know about deploying and running TangleML on Hugging Face Spaces, from accessing the public playground to setting up your own team instance.
12+
13+
## Accessing Tangle on Hugging Face
14+
15+
There are two primary ways to access the Tangle application on Hugging Face:
16+
17+
### Main Hugging Face Interface
18+
Navigate to the Tangle Space in the TangleML organization on Hugging Face at [https://huggingface.co/spaces/tangleml/tangle](https://huggingface.co/spaces/tangleml/tangle). You can start using the application directly from the Space page.
19+
20+
<img src={require("./assets/HF_1.png").default} alt="Main Hugging Face Interface" style={{width: "100%", borderRadius: "6px"}} />
21+
22+
23+
This interface includes:
24+
- A header with access to files and community features
25+
- The main Tangle application embedded in an iframe
26+
- Options to duplicate the space or run locally
27+
28+
<img src={require("./assets/HF_3.png").default} alt="Main Hugging Face Interface" style={{width: "200px", borderRadius: "6px"}} />
29+
30+
31+
### Embedded Full-Screen Version
32+
There's also an embedded version that provides a better user experience with:
33+
- More vertical screen space
34+
- Proper URLs for individual runs
35+
- No iframe limitations
36+
37+
```html
38+
<iframe
39+
src="https://tangleml-tangle.hf.space"
40+
frameborder="0"
41+
width="850"
42+
height="450"
43+
></iframe>
44+
```
45+
46+
Link to the embedded version: https://tangleml-tangle.hf.space/
47+
48+
:::tip
49+
The embedded version is recommended when sharing run URLs or when you need maximum screen real estate for pipeline editing.
50+
:::
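
The direct URL above looks like the usual Space subdomain form: the owner and the Space name joined by a hyphen under `hf.space`. The following is a minimal sketch of that mapping, not an official API. The function name `direct_space_url` is hypothetical, and the normalization rule (lowercase, non-alphanumeric runs collapsed to a hyphen) is an assumption that this guide does not state.

```python
import re


def direct_space_url(space_id: str) -> str:
    """Build the direct (non-iframe) URL for a Space ID such as 'owner/name'."""
    owner, sep, name = space_id.partition("/")
    if not sep or not owner or not name:
        raise ValueError(f"expected 'owner/name', got {space_id!r}")
    # Assumption: the subdomain is lowercase and every run of
    # non-alphanumeric characters becomes a single hyphen.
    subdomain = re.sub(r"[^a-z0-9]+", "-", f"{owner}-{name}".lower()).strip("-")
    return f"https://{subdomain}.hf.space"


# The shared instance from this guide:
print(direct_space_url("tangleml/tangle"))  # https://tangleml-tangle.hf.space
```

If you duplicate the Space, the same helper gives a first guess at the direct URL of your own instance. Confirm the real URL on the Space page before sharing it.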
51+
52+
## Multi-Tenant Architecture
53+
54+
The main Tangle instance on Hugging Face operates as a **multi-tenant system**, where:
55+
56+
- Each user works in complete isolation
57+
- Every user has their own database for runs, components, and metadata
58+
- Each user has separate data artifact storage
59+
- Users cannot see or access other users' work
60+
61+
### Data Privacy and Storage
62+
63+
When using the multi-tenant Tangle:
64+
65+
- **Output artifacts** are stored in your personal Hugging Face dataset repository (e.g., `your-username/tangle-data`)
66+
- These repositories are **private by default**
67+
- You maintain full ownership of your artifacts
68+
- You can optionally make your data public through repository settings
69+
70+
<img src={require("./assets/HF_2.png").default} alt="Data Privacy and Storage" style={{width: "100%", borderRadius: "6px"}} />
71+
72+
:::warning
73+
The run database containing metadata is currently stored in Tangle's persistent storage, not in your personal repository. This may change in future updates.
74+
:::
75+
76+
### Where Executions Run
77+
78+
Pipeline executions run as **Hugging Face jobs** in your own account:
79+
- Jobs run under your username, not under tangleml/tangle
80+
- You can view and monitor job execution directly in Hugging Face
81+
- Each execution links to its corresponding Hugging Face job
82+
83+
84+
## Requirements and Costs
85+
86+
### What You Need
87+
88+
To use the shared Tangle instance on Hugging Face:
89+
90+
1. **Hugging Face Account**: Create and log in to your account
91+
2. **Permissions**: Grant Tangle access to:
92+
- Create repositories
93+
- Write to repositories
94+
- Create jobs
95+
3. **Pro Subscription**: Required for job execution ($9/month for individuals)
96+
97+
### Cost Breakdown
98+
99+
:::tip
100+
**Free to try**: You can explore the interface and create pipelines without a subscription. You only need Pro status to actually run jobs.
101+
:::
102+
103+
- **Pro Subscription**: $9/month (required for job execution)
104+
- **CPU Jobs**: Very affordable, almost negligible cost
105+
- **GPU Jobs**: More expensive, depending on hardware and duration
106+
- **Storage**: Your artifacts use your Hugging Face storage quota
107+
108+
## Creating Your Own Tangle Instance
109+
110+
Teams or individuals who want their own dedicated Tangle instance can duplicate the space.
111+
112+
### How to Duplicate
113+
114+
1. Navigate to the tangleml/tangle space
115+
2. Click the three-dots menu
116+
3. Select "Duplicate Space"
117+
118+
<img src={require("./assets/HF_4.png").default} alt="Duplicate Space" style={{width: "100%", borderRadius: "6px"}} />
119+
120+
### Configuration Options
121+
122+
When duplicating, you'll need to configure:
123+
124+
**Owner**: Choose your user account or organization
125+
126+
**Space Name**: Name your Tangle instance
127+
128+
**Visibility**:
129+
- Private (default) - only invited users can access
130+
- Public - anyone can view runs (read-only)
131+
132+
**Hardware**:
133+
- CPU Basic is sufficient for most users
134+
- No GPU required for the space itself
135+
136+
**Persistent Storage**:
137+
- A minimum of 20 GB is recommended ($5/month)
138+
- Required to preserve runs and components
139+
140+
:::warning
141+
**Avoid ephemeral mode!** Without persistent storage, you'll lose all data when the space restarts.
142+
:::
143+
144+
**Hugging Face Token**: Create a token with permissions for:
145+
- Repository management ("manage repos" or "contribute repos")
146+
- Job submission ("jobs" permission)
147+
148+
149+
## Single-Tenant vs Multi-Tenant Differences
150+
151+
Your duplicated space operates differently from the shared instance:
152+
153+
### Authentication
154+
- Uses the configured token instead of individual user tokens
155+
- Allows fine-grained permission control
156+
- Can access private repositories if the token has the required permissions
157+
158+
### Multi-User Support
159+
- Multiple team members can use the same instance
160+
- "Initiated by" field shows different users
161+
- All users share the same runs database
162+
163+
### Permissions Model
164+
User permissions in your Tangle instance mirror their organization roles:
165+
- **Read-only** org members → Read-only in Tangle
166+
- **Write** access → Can submit runs
167+
- **Admin** → Full Tangle admin capabilities
168+
169+
:::tip
170+
If you duplicate to a personal account (not an organization), you'll automatically be the admin of your instance.
171+
:::
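
The role mapping described above can be written as a small lookup table. This is an illustrative sketch only: the role keys (`"read"`, `"write"`, `"admin"`), the `TangleAccess` enum, and the `tangle_access` function are hypothetical names and do not come from Tangle's code.

```python
from enum import Enum
from typing import Optional


class TangleAccess(Enum):
    READ_ONLY = "read-only"
    SUBMIT_RUNS = "can submit runs"
    ADMIN = "full admin"


# Hypothetical role keys that mirror the three levels listed in this guide.
ORG_ROLE_TO_ACCESS = {
    "read": TangleAccess.READ_ONLY,
    "write": TangleAccess.SUBMIT_RUNS,
    "admin": TangleAccess.ADMIN,
}


def tangle_access(org_role: Optional[str], personal_owner: bool = False) -> TangleAccess:
    """Return the Tangle access level for an organization role."""
    # A Space duplicated to a personal account makes its owner the admin.
    if personal_owner:
        return TangleAccess.ADMIN
    if org_role not in ORG_ROLE_TO_ACCESS:
        raise ValueError(f"unknown organization role: {org_role!r}")
    return ORG_ROLE_TO_ACCESS[org_role]


print(tangle_access("write").value)
print(tangle_access(None, personal_owner=True).value)
```

An unknown role raises an error instead of falling back to a default, because this guide does not say how such roles are handled.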
172+
173+
### Data Storage
174+
- Artifacts are stored in the `your-space-name_data` repository
175+
- All team members share the same artifact storage
176+
- Database remains in the space's persistent storage
177+
178+
## Subscription Requirements by Setup
179+
180+
### Individual Users
181+
- **Shared Instance**: Pro subscription ($9/month)
182+
- **Personal Instance**: Pro subscription ($9/month) + storage costs
183+
184+
### Teams and Organizations
185+
- **Organization Instance**: Team subscription ($20/user/month)
186+
- **Required for**: Running jobs in organization namespace
187+
- **Includes**: Collaboration features and shared resources
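
To compare the setups, the fixed monthly fees quoted in this guide can be added up. This is a rough sketch with two assumptions: an organization instance also pays for the recommended 20 GB of persistent storage, and job compute (CPU or GPU time) is excluded because it depends on usage. The function name and the setup labels are hypothetical.

```python
# Prices quoted in this guide, in USD per month.
PRO_SUBSCRIPTION = 9
TEAM_SUBSCRIPTION_PER_USER = 20
PERSISTENT_STORAGE_20GB = 5


def monthly_fixed_cost(setup: str, users: int = 1) -> int:
    """Fixed monthly cost in USD, excluding job compute charges."""
    if users < 1:
        raise ValueError("users must be at least 1")
    if setup == "shared":
        return PRO_SUBSCRIPTION
    if setup == "personal":
        return PRO_SUBSCRIPTION + PERSISTENT_STORAGE_20GB
    if setup == "organization":
        # Assumption: the duplicated Space uses the recommended 20 GB storage.
        return TEAM_SUBSCRIPTION_PER_USER * users + PERSISTENT_STORAGE_20GB
    raise ValueError(f"unknown setup: {setup!r}")


print(monthly_fixed_cost("shared"))                 # 9
print(monthly_fixed_cost("personal"))               # 14
print(monthly_fixed_cost("organization", users=5))  # 105
```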
188+
189+
## Limitations on Hugging Face
190+
191+
### Storage Constraints
192+
193+
The primary limitation is data storage:
194+
195+
- **Only dataset repositories** are available for artifact storage
196+
- No direct mounting of storage (unlike Kubernetes deployments)
197+
- All data must be committed via Git operations
198+
- Input/output requires explicit download/upload steps
199+
200+
:::warning
201+
This adds overhead to pipeline execution as data must be downloaded before processing and uploaded after completion.
202+
:::
203+
204+
### Container Compatibility
205+
206+
Some technical requirements for containers:
207+
208+
- Must support Python installation (for Hugging Face CLI)
209+
- Requires compatibility with uv package manager
210+
- Very old Alpine images (4-5 years old) are known to cause issues
211+
- musl-based containers are more likely to cause problems than glibc-based ones
212+
213+
:::tip
214+
Most modern containers work without issues. Problems typically only occur with outdated or specialized minimal containers.
215+
:::
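
One quick way to check a container image is to run a short Python script inside it and see whether glibc is detected. The sketch below uses only the standard library's `platform.libc_ver()`. It assumes that an empty or non-glibc result means the image may be musl-based, which is a heuristic and not a definitive test. The helper name `compatibility_hint` is hypothetical.

```python
import platform


def compatibility_hint(libc_name: str) -> str:
    """Turn the libc name reported by platform.libc_ver() into a short hint."""
    if libc_name.lower() == "glibc":
        return "glibc detected: no libc-related issues expected"
    # Heuristic: libc_ver() often reports an empty name on non-glibc systems.
    return "glibc not detected: image may be musl-based (for example Alpine); test it before use"


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    print(f"libc: {lib or 'unknown'} {version}".strip())
    print(compatibility_hint(lib))
```

Run it inside the image you plan to use for a component. A "glibc not detected" result does not prove the image is incompatible, but it is a reason to test the image before building on it.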
216+
217+
### Component Considerations
218+
219+
When creating components for Hugging Face deployment:
220+
221+
**Data Import/Export**:
222+
- Use Hugging Face-specific upload/download components
223+
- Standard library components for HF repositories are in development
224+
- Web downloads work universally
225+
226+
**Cross-Cloud Operations**:
227+
- Authentication challenges when accessing other clouds (GCS, AWS)
228+
- Requires credential management through private HF repositories
229+
- Private repos can act as secret managers for credentials
230+
231+
**Cloud-Specific Services**:
232+
- BigQuery, Vertex AI, etc. require special authentication
233+
- Consider using Hugging Face native services (inference, serving)
234+
- Plan for credential distribution in multi-cloud scenarios
235+
236+
## Public Space Access
237+
238+
Making your duplicated space public enables:
239+
240+
- Read-only access for non-authenticated users
241+
- Public viewing of runs and logs
242+
- Artifact previews and visualizations (planned for a future release)
243+
- Shareable URLs for demonstrations
244+
245+
:::warning
246+
Keep your space private if you're working with sensitive data. Public spaces allow anyone to view your pipeline runs.
247+
:::
248+
249+
## Best Practices
250+
251+
### For Individuals
252+
1. Start with the shared multi-tenant instance
253+
2. Upgrade to Pro only when ready to run jobs
254+
3. Monitor job costs, especially for GPU workloads
255+
256+
### For Teams
257+
1. Duplicate to your organization namespace
258+
2. Configure appropriate persistent storage
259+
3. Set up team permissions before inviting members
260+
4. Consider data privacy requirements
261+
262+
### For Component Development
263+
1. Test containers for Python/HF CLI compatibility
264+
2. Plan data flow considering HF repository constraints
265+
3. Handle credentials securely for cross-cloud operations
266+
4. Leverage Hugging Face native services where possible
267+
268+
## Summary
269+
270+
Hugging Face provides a flexible deployment target for Tangle, with both shared and dedicated instances. While there are some limitations around storage and container compatibility, the platform offers a cost-effective way to run ML pipelines with built-in collaboration features. The "write once, run everywhere" philosophy of Cloud Pipelines ensures your components remain portable across different deployment targets.

docusaurus.config.ts

Lines changed: 1 addition & 1 deletion
@@ -121,7 +121,7 @@ const config: Config = {
 label: "Documentation",
 },
 // { to: "/updates", label: "Updates", position: "left" },
-{ to: DEMO_URL, label: "Demo", position: "left", className: "button button--primary button--lg" },
+{ to: DEMO_URL, label: "Playground", position: "left", className: "button button--primary button--lg" },
 ],
 },
 footer: {

sidebars.ts

Lines changed: 1 addition & 0 deletions
@@ -30,6 +30,7 @@ const sidebars: SidebarsConfig = {
 'user-guide/studio-app-ui-overview',
 'getting-started/pipelines-persistence',
 'component-development/adding-components',
+'user-guide/hugging-face-deployment',
 ],
 },
 {
