| 1 | +--- |
| 2 | +id: hugging-face-deployment |
| 3 | +title: Running Tangle on Hugging Face |
| 4 | +--- |
| 5 | + |
| 6 | +import { ImageAnnotation } from "@site/src/components/ImageAnnotation"; |
| 7 | +import { HelpCircle, Info, Delete, Maximize2, PlusSquare, Copy, ListRestart } from "lucide-react"; |
| 8 | + |
| 9 | +# Running Tangle on Hugging Face |
| 10 | + |
| 11 | +This guide covers everything you need to know about deploying and running TangleML on Hugging Face Spaces, from accessing the public playground to setting up your own team instance. |
| 12 | + |
| 13 | +## Accessing Tangle on Hugging Face |
| 14 | + |
| 15 | +There are two primary ways to access the Tangle application on Hugging Face: |
| 16 | + |
| 17 | +### Main Hugging Face Interface |
| 18 | +Open the Tangle space in the TangleML organization on Hugging Face at [https://huggingface.co/spaces/tangleml/tangle](https://huggingface.co/spaces/tangleml/tangle) and click it to start using the application. |
| 19 | + |
| 20 | +<img src={require("./assets/HF_1.png").default} alt="Main Hugging Face Interface" style={{width: "100%", borderRadius: "6px"}} /> |
| 21 | + |
| 22 | + |
| 23 | +This interface includes: |
| 24 | +- A header with access to files and community features |
| 25 | +- The main Tangle application embedded in an iframe |
| 26 | +- Options to duplicate the space or run locally |
| 27 | + |
| 28 | +<img src={require("./assets/HF_3.png").default} alt="Main Hugging Face Interface" style={{width: "200px", borderRadius: "6px"}} /> |
| 29 | + |
| 30 | + |
| 31 | +### Embedded Full-Screen Version |
| 32 | +There's also an embedded version that provides a better user experience with: |
| 33 | +- More vertical screen space |
| 34 | +- Proper URLs for individual runs |
| 35 | +- No iframe limitations |
| 36 | + |
| 37 | +```html |
| 38 | +<iframe |
| 39 | + src="https://tangleml-tangle.hf.space" |
| 40 | + frameborder="0" |
| 41 | + width="850" |
| 42 | + height="450" |
| 43 | +></iframe> |
| 44 | +``` |
| 45 | + |
| 46 | +Link to the embedded version: https://tangleml-tangle.hf.space/ |
| 47 | + |
| 48 | +:::tip |
| 49 | +The embedded version is recommended when sharing run URLs or when you need maximum screen real estate for pipeline editing. |
| 50 | +::: |
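
The relationship between the Space ID (`tangleml/tangle`) and the direct URL above (`tangleml-tangle.hf.space`) can be sketched in a few lines of Python. This is a minimal sketch: the helper name `space_direct_url` is hypothetical, and the normalization rule (lowercase, other characters replaced by hyphens) is an assumption inferred from the URL shown in this guide, not a documented guarantee.

```python
import re


def space_direct_url(space_id: str) -> str:
    """Return the direct (non-iframe) URL for a Space ID like 'owner/name'."""
    owner, name = space_id.split("/", 1)
    # Assumption: the subdomain is '<owner>-<name>', lowercased, with any
    # character outside [a-z0-9-] replaced by a hyphen.
    subdomain = re.sub(r"[^a-z0-9-]", "-", f"{owner}-{name}".lower())
    return f"https://{subdomain}.hf.space"


print(space_direct_url("tangleml/tangle"))  # https://tangleml-tangle.hf.space
```

The same helper may be handy for building links to a duplicated space, provided you verify the resulting URL in a browser first.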
| 51 | + |
| 52 | +## Multi-Tenant Architecture |
| 53 | + |
| 54 | +The main Tangle instance on Hugging Face operates as a **multi-tenant system**, where: |
| 55 | + |
| 56 | +- Each user works in complete isolation |
| 57 | +- Every user has their own database for runs, components, and metadata |
| 58 | +- Each user has separate data artifact storage |
| 59 | +- Users cannot see or access other users' work |
| 60 | + |
| 61 | +### Data Privacy and Storage |
| 62 | + |
| 63 | +When using the multi-tenant Tangle: |
| 64 | + |
| 65 | +- **Output artifacts** are stored in your personal Hugging Face dataset repository (e.g., `your-username/tangle-data`) |
| 66 | +- These repositories are **private by default** |
| 67 | +- You maintain full ownership of your artifacts |
| 68 | +- You can optionally make your data public through repository settings |
| 69 | + |
| 70 | +<img src={require("./assets/HF_2.png").default} alt="Data Privacy and Storage" style={{width: "100%", borderRadius: "6px"}} /> |
| 71 | + |
| 72 | +:::warning |
| 73 | +The run database containing metadata is currently stored in Tangle's persistent storage, not in your personal repository. This may change in future updates. |
| 74 | +::: |
| 75 | + |
| 76 | +### Where Executions Run |
| 77 | + |
| 78 | +Pipeline executions run as **Hugging Face jobs** in your own account: |
| 79 | +- Jobs run under your username, not under tangleml/tangle |
| 80 | +- You can view and monitor job execution directly in Hugging Face |
| 81 | +- Each execution links to its corresponding Hugging Face job |
| 82 | + |
| 83 | + |
| 84 | +## Requirements and Costs |
| 85 | + |
| 86 | +### What You Need |
| 87 | + |
| 88 | +To use the shared Tangle instance on Hugging Face: |
| 89 | + |
| 90 | +1. **Hugging Face Account**: Create and log in to your account |
| 91 | +2. **Permissions**: Grant Tangle access to: |
| 92 | + - Create repositories |
| 93 | + - Write to repositories |
| 94 | + - Create jobs |
| 95 | +3. **Pro Subscription**: Required for job execution ($9/month for individuals) |
| 96 | + |
| 97 | +### Cost Breakdown |
| 98 | + |
| 99 | +:::tip |
| 100 | +**Free to try**: You can explore the interface and create pipelines without a subscription. You only need Pro status to actually run jobs. |
| 101 | +::: |
| 102 | + |
| 103 | +- **Pro Subscription**: $9/month (required for job execution) |
| 104 | +- **CPU Jobs**: Very affordable, almost negligible cost |
| 105 | +- **GPU Jobs**: More expensive, depending on hardware and duration |
| 106 | +- **Storage**: Your artifacts use your Hugging Face storage quota |
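
To budget ahead of time, the items above can be combined into a rough monthly estimate. The sketch below uses the $9/month Pro price from this guide; the function name and the hourly job rate are placeholders (this guide does not list job prices), so substitute the rate for the hardware you actually select.

```python
PRO_SUBSCRIPTION_USD = 9.0  # per month, as stated in this guide


def estimate_monthly_cost(
    job_hours: float,
    hourly_rate_usd: float,
    storage_usd: float = 0.0,
    subscription_usd: float = PRO_SUBSCRIPTION_USD,
) -> float:
    """Subscription + optional storage + job compute, in USD per month."""
    compute_usd = job_hours * hourly_rate_usd
    return round(subscription_usd + storage_usd + compute_usd, 2)


# Placeholder numbers: 10 job-hours at a made-up $0.50/hour, plus $5 storage.
print(estimate_monthly_cost(job_hours=10, hourly_rate_usd=0.5, storage_usd=5.0))
```

With these placeholder inputs the estimate is 19.0: $9 subscription, $5 storage, and $5 of compute.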
| 107 | + |
| 108 | +## Creating Your Own Tangle Instance |
| 109 | + |
| 110 | +Teams or individuals who want their own dedicated Tangle instance can duplicate the space. |
| 111 | + |
| 112 | +### How to Duplicate |
| 113 | + |
| 114 | +1. Navigate to the tangleml/tangle space |
| 115 | +2. Click the three-dots menu |
| 116 | +3. Select "Duplicate Space" |
| 117 | + |
| 118 | +<img src={require("./assets/HF_4.png").default} alt="Duplicate Space" style={{width: "100%", borderRadius: "6px"}} /> |
| 119 | + |
| 120 | +### Configuration Options |
| 121 | + |
| 122 | +When duplicating, you'll need to configure: |
| 123 | + |
| 124 | +**Owner**: Choose your user account or organization |
| 125 | + |
| 126 | +**Space Name**: Name your Tangle instance |
| 127 | + |
| 128 | +**Visibility**: |
| 129 | +- Private (default) - only invited users can access |
| 130 | +- Public - anyone can view runs (read-only) |
| 131 | + |
| 132 | +**Hardware**: |
| 133 | +- CPU Basic is sufficient for most users |
| 134 | +- No GPU required for the space itself |
| 135 | + |
| 136 | +**Persistent Storage**: |
| 137 | +- Minimum 20GB recommended ($5/month) |
| 138 | +- Required to preserve runs and components |
| 139 | + |
| 140 | +:::warning |
| 141 | +**Avoid ephemeral mode!** Without persistent storage, you'll lose all data when the space restarts. |
| 142 | +::: |
| 143 | + |
| 144 | +**Hugging Face Token**: Create a token with permissions for: |
| 145 | +- Repository management ("manage repos" or "contribute repos") |
| 146 | +- Job submission ("jobs" permission) |
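
The same configuration can, in principle, be scripted with the `huggingface_hub` Python client instead of the web dialog. The sketch below assumes a recent `huggingface_hub` release in which `HfApi.duplicate_space` accepts `hardware`, `storage`, and `secrets`. The helper names and the secret key `HF_TOKEN` are hypothetical; check the space's own configuration for the secret name it actually expects.

```python
def build_duplicate_kwargs(owner: str, space_name: str, hf_token: str) -> dict:
    """Collect the duplication options described above into keyword arguments."""
    return {
        "from_id": "tangleml/tangle",
        "to_id": f"{owner}/{space_name}",
        "private": True,          # Visibility: private
        "hardware": "cpu-basic",  # Hardware: CPU Basic
        "storage": "small",       # Assumption: "small" is the 20GB tier
        # Hypothetical secret name; the space may expect a different key.
        "secrets": [{"key": "HF_TOKEN", "value": hf_token}],
    }


def duplicate_tangle(owner: str, space_name: str, hf_token: str):
    """Duplicate the space. Performs a network call and may incur storage costs."""
    from huggingface_hub import HfApi  # imported lazily; requires huggingface_hub

    api = HfApi(token=hf_token)
    return api.duplicate_space(**build_duplicate_kwargs(owner, space_name, hf_token))


# Usage (not executed here): duplicate_tangle("my-org", "tangle", "hf_...")
```

Separating the keyword arguments from the API call makes it possible to review the configuration before anything is created.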
| 147 | + |
| 148 | + |
| 149 | +## Single-Tenant vs Multi-Tenant Differences |
| 150 | + |
| 151 | +Your duplicated space operates differently from the shared instance: |
| 152 | + |
| 153 | +### Authentication |
| 154 | +- Uses the configured token instead of individual user tokens |
| 155 | +- Allows fine-grained permission control |
| 156 | +- Can access private repositories if the token has the required permissions |
| 157 | + |
| 158 | +### Multi-User Support |
| 159 | +- Multiple team members can use the same instance |
| 160 | +- "Initiated by" field shows different users |
| 161 | +- All users share the same runs database |
| 162 | + |
| 163 | +### Permissions Model |
| 164 | +User permissions in your Tangle instance mirror their organization roles: |
| 165 | +- **Read-only** org members → Read-only in Tangle |
| 166 | +- **Write** access → Can submit runs |
| 167 | +- **Admin** → Full Tangle admin capabilities |
| 168 | + |
| 169 | +:::tip |
| 170 | +If you duplicate to a personal account (not an organization), you'll automatically be the admin of your instance. |
| 171 | +::: |
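
The mapping above can be pictured as a small lookup table. This is an illustrative sketch only, not Tangle's actual implementation: the enum, the role strings, and the fallback to read-only for unlisted roles are all assumptions made for the example.

```python
from enum import Enum


class TanglePermission(Enum):
    READ_ONLY = "read-only"
    SUBMIT_RUNS = "submit-runs"
    ADMIN = "admin"


# Organization role -> Tangle permission, following the list above.
ORG_ROLE_TO_PERMISSION = {
    "read": TanglePermission.READ_ONLY,
    "write": TanglePermission.SUBMIT_RUNS,
    "admin": TanglePermission.ADMIN,
}


def tangle_permission(org_role: str, is_personal_owner: bool = False) -> TanglePermission:
    """Resolve a user's Tangle permission from their organization role."""
    if is_personal_owner:
        # Duplicating to a personal account makes the owner the admin.
        return TanglePermission.ADMIN
    # Assumption: roles not listed in this guide fall back to read-only.
    return ORG_ROLE_TO_PERMISSION.get(org_role.lower(), TanglePermission.READ_ONLY)


print(tangle_permission("write").value)  # submit-runs
```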
| 172 | + |
| 173 | +### Data Storage |
| 174 | +- Artifacts are stored in the `your-space-name_data` repository |
| 175 | +- All team members share the same artifact storage |
| 176 | +- Database remains in the space's persistent storage |
| 177 | + |
| 178 | +## Subscription Requirements by Setup |
| 179 | + |
| 180 | +### Individual Users |
| 181 | +- **Shared Instance**: Pro subscription ($9/month) |
| 182 | +- **Personal Instance**: Pro subscription ($9/month) + storage costs |
| 183 | + |
| 184 | +### Teams and Organizations |
| 185 | +- **Organization Instance**: Team subscription ($20/user/month) |
| 186 | +- **Required for**: Running jobs in organization namespace |
| 187 | +- **Includes**: Collaboration features and shared resources |
| 188 | + |
| 189 | +## Limitations on Hugging Face |
| 190 | + |
| 191 | +### Storage Constraints |
| 192 | + |
| 193 | +The primary limitation is data storage: |
| 194 | + |
| 195 | +- **Only dataset repositories** are available for artifact storage |
| 196 | +- No direct mounting of storage (unlike Kubernetes deployments) |
| 197 | +- All data must be committed via Git operations |
| 198 | +- Input/output requires explicit download/upload steps |
| 199 | + |
| 200 | +:::warning |
| 201 | +This adds overhead to pipeline execution as data must be downloaded before processing and uploaded after completion. |
| 202 | +::: |
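
The download, process, upload pattern described above can be sketched as follows. This is an illustration of the data flow, not Tangle's launcher code: `run_step` and `hf_dataset_io` are hypothetical helpers, while `hf_hub_download` and `upload_file` are the standard `huggingface_hub` functions for dataset repositories.

```python
from pathlib import Path
from typing import Callable, Tuple


def run_step(
    download: Callable[[str], Path],
    process: Callable[[Path], Path],
    upload: Callable[[Path, str], None],
    input_path: str,
    output_path: str,
) -> None:
    """Explicit I/O around a step: fetch the input, compute, push the output."""
    local_input = download(input_path)   # overhead before processing
    local_output = process(local_input)  # the actual work
    upload(local_output, output_path)    # overhead after completion


def hf_dataset_io(repo_id: str) -> Tuple[Callable[[str], Path], Callable[[Path, str], None]]:
    """Build download/upload callables backed by a Hugging Face dataset repo."""
    from huggingface_hub import hf_hub_download, upload_file  # lazy import

    def download(path_in_repo: str) -> Path:
        return Path(hf_hub_download(repo_id=repo_id, filename=path_in_repo, repo_type="dataset"))

    def upload(local_path: Path, path_in_repo: str) -> None:
        upload_file(
            path_or_fileobj=str(local_path),
            path_in_repo=path_in_repo,
            repo_id=repo_id,
            repo_type="dataset",
        )

    return download, upload
```

Because the I/O callables are injected, the same `run_step` can be exercised locally with plain file copies before pointing it at a real repository.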
| 203 | + |
| 204 | +### Container Compatibility |
| 205 | + |
| 206 | +Some technical requirements for containers: |
| 207 | + |
| 208 | +- Must support Python installation (for the Hugging Face CLI) |
| 209 | +- Must be compatible with the uv package manager |
| 210 | +- Very old Alpine images (4-5 years old) are known to cause issues |
| 211 | +- musl-based containers are more likely to cause problems than glibc-based ones |
| 212 | + |
| 213 | +:::tip |
| 214 | +Most modern containers work without issues. Problems typically only occur with outdated or specialized minimal containers. |
| 215 | +::: |
| 216 | + |
| 217 | +### Component Considerations |
| 218 | + |
| 219 | +When creating components for Hugging Face deployment: |
| 220 | + |
| 221 | +**Data Import/Export**: |
| 222 | +- Use Hugging Face-specific upload/download components |
| 223 | +- Standard library components for HF repositories are being developed |
| 224 | +- Web downloads work universally |
| 225 | + |
| 226 | +**Cross-Cloud Operations**: |
| 227 | +- Accessing other clouds (GCS, AWS) requires separate authentication |
| 228 | +- Credentials must be managed through private HF repositories |
| 229 | +- Private repos can act as secret managers for credentials |
| 230 | + |
| 231 | +**Cloud-Specific Services**: |
| 232 | +- BigQuery, Vertex AI, etc. require special authentication |
| 233 | +- Consider using Hugging Face native services (inference, serving) |
| 234 | +- Plan for credential distribution in multi-cloud scenarios |
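
As one possible shape for the "private repo as secret manager" idea under Cross-Cloud Operations, a component could fetch a key file from a private dataset repository and expose it to Google client libraries. This is a sketch under stated assumptions: the helper names, the repository ID, and the file name are hypothetical. `GOOGLE_APPLICATION_CREDENTIALS` is the standard environment variable read by Google's Application Default Credentials.

```python
import os
from pathlib import Path


def fetch_credentials(repo_id: str, filename: str, token: str) -> Path:
    """Download a credentials file from a private HF dataset repository."""
    from huggingface_hub import hf_hub_download  # lazy import

    return Path(
        hf_hub_download(repo_id=repo_id, filename=filename, repo_type="dataset", token=token)
    )


def use_gcp_credentials(key_file: Path) -> None:
    """Point Google client libraries at the downloaded service-account key."""
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(key_file)


# Usage (not executed here; repo and file names are placeholders):
# key = fetch_credentials("my-org/tangle-secrets", "gcp-key.json", token="hf_...")
# use_gcp_credentials(key)
```

Anyone with read access to that repository can read the key, so keep the repository private and scope the token narrowly.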
| 235 | + |
| 236 | +## Public Space Access |
| 237 | + |
| 238 | +Making your duplicated space public enables: |
| 239 | + |
| 240 | +- Read-only access for non-authenticated users |
| 241 | +- Public viewing of runs and logs |
| 242 | +- Future: Artifact previews and visualizations |
| 243 | +- Shareable URLs for demonstrations |
| 244 | + |
| 245 | +:::warning |
| 246 | +Keep your space private if you're working with sensitive data. Public spaces allow anyone to view your pipeline runs. |
| 247 | +::: |
| 248 | + |
| 249 | +## Best Practices |
| 250 | + |
| 251 | +### For Individuals |
| 252 | +1. Start with the shared multi-tenant instance |
| 253 | +2. Upgrade to Pro only when ready to run jobs |
| 254 | +3. Monitor job costs, especially for GPU workloads |
| 255 | + |
| 256 | +### For Teams |
| 257 | +1. Duplicate to your organization namespace |
| 258 | +2. Configure appropriate persistent storage |
| 259 | +3. Set up team permissions before inviting members |
| 260 | +4. Consider data privacy requirements |
| 261 | + |
| 262 | +### For Component Development |
| 263 | +1. Test containers for Python/HF CLI compatibility |
| 264 | +2. Plan data flow considering HF repository constraints |
| 265 | +3. Handle credentials securely for cross-cloud operations |
| 266 | +4. Leverage Hugging Face native services where possible |
| 267 | + |
| 268 | +## Summary |
| 269 | + |
| 270 | +Hugging Face provides a flexible deployment option for Tangle with both shared and dedicated instance options. While there are some limitations around storage and container compatibility, the platform offers a cost-effective way to run ML pipelines with built-in collaboration features. The "write once, run everywhere" philosophy of Cloud Pipelines ensures your components remain portable across different deployment targets. |