# LANL EDGE Core
A bioinformatics platform for managing sequencing workflows and data analysis. EDGE (Empowering the Development of Genomics Expertise) provides a web-based interface for submitting, monitoring, and analyzing genomic sequencing projects using containerized workflows run by the Nextflow and Cromwell engines.
## Overview
EDGE Core is a full-stack application consisting of:
- **Web Server**: Node.js/Express backend with REST API and MongoDB database
- **Web Client**: React-based frontend built with Vite, CoreUI, and Material-UI
- **Workflow Engine**: Support for Nextflow and Cromwell workflow orchestration
- **Data Management**: Project and bulk submission management with file upload capabilities
## Features
- **Project Management**: Create and manage genomic analysis projects
- **Bulk Submissions**: Process multiple samples in batch operations
- **Workflow Orchestration**: Execute Nextflow and Cromwell-based workflows
- **File Management**: Upload and organize analysis data
- **Database Backups**: Automated database backup system
- **Email Notifications**: SMTP-based notifications for workflow status
- **Authentication**: JWT-based authentication and authorization
- **API Documentation**: Swagger UI for REST API exploration
## Tech Stack
### Backend
- **Runtime**: Node.js 20.19+
- **Framework**: Express.js
- **Database**: MongoDB
- **Authentication**: Passport.js with JWT
- **Validation**: Express-validator
- **Logging**: Winston with daily rotation
- **Task Scheduling**: node-cron
- **Email**: Nodemailer with Mailgun transport
- **Testing**: Jest with supertest
### Frontend
- **Framework**: React 18
- **Build Tool**: Vite
- **UI Libraries**: CoreUI, Material-UI
- **State Management**: Redux Toolkit
- **HTTP Client**: Axios
- **Forms**: React Hook Form
- **Styling**: SCSS with PostCSS
- **Tables**: Material React Table
- **Date Handling**: Moment.js
### Workflows
- **Nextflow**: Scalable workflow engine for genomics
- **Cromwell**: Workflow management system supporting WDL
## Project Structure
```
edge-core/
├── webapp/
│   ├── server/                 # Express backend
│   │   ├── config.js           # Configuration
│   │   ├── server.js           # Entry point
│   │   ├── cronServer.js       # Scheduled tasks
│   │   ├── indexRouter.js      # API routes
│   │   ├── edge-api/           # API controllers
│   │   ├── crons/              # Scheduled job definitions
│   │   ├── workflow/           # Workflow utilities
│   │   ├── mailers/            # Email templates
│   │   ├── tests/              # Test suite
│   │   └── utils/              # Helper utilities
│   └── client/                 # React frontend
│       ├── vite.config.mjs     # Vite configuration
│       ├── src/                # React components
│       └── public/             # Static assets
├── workflows/
│   ├── Nextflow/               # Nextflow pipeline definitions
│   ├── Cromwell/               # Cromwell workflow definitions
│   └── docs/                   # Workflow documentation
├── io/
│   ├── projects/               # Project data
│   ├── bulkSubmissions/        # Bulk submission data
│   ├── nextflow/               # Nextflow execution data
│   ├── db/                     # Database backups
│   ├── upload/                 # User uploads
│   └── sra/                    # SRA data
└── installation/
    ├── install.sh              # Installation script
    └── README.md               # Installation guide
```
## Installation
### Prerequisites
- **Node.js** 20.19 or higher
- **MongoDB** Community Edition or compatible
- **npm** (comes with Node.js)
- **pm2** (for production deployment)
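The Node.js version requirement above can be checked before installing. The following is a minimal sketch, not part of the repository: `version_ge` is a hypothetical helper that compares dotted version strings numerically, and the `20.19` threshold is taken from the prerequisites list.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of EDGE Core): succeeds when version $1 >= version $2.
# Compares up to three dot-separated numeric fields; missing fields count as 0.
version_ge() {
  local IFS=.
  local -a have=($1) want=($2)
  local i h w
  for i in 0 1 2; do
    h=${have[i]:-0}
    w=${want[i]:-0}
    # 10# forces base 10 so a field such as "08" is not read as octal.
    if (( 10#$h > 10#$w )); then return 0; fi
    if (( 10#$h < 10#$w )); then return 1; fi
  done
  return 0
}

# Report on the local Node.js install, if there is one.
if command -v node >/dev/null 2>&1; then
  node_version=$(node --version)   # e.g. "v20.19.0"
  node_version=${node_version#v}   # drop the leading "v"
  if version_ge "$node_version" 20.19; then
    echo "Node.js $node_version satisfies the 20.19+ requirement"
  else
    echo "Node.js $node_version is older than 20.19"
  fi
else
  echo "Node.js was not found on PATH"
fi
```

The comparison is numeric per field, so `20.9` is correctly treated as older than `20.19`, which a plain string comparison would get wrong.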
### Quick Start
1. **Clone the repository**
```bash
git clone https://github.com/LANL-Bioinformatics/edge-core.git
cd edge-core
```
2. **Run the installation script**
```bash
cd installation
./install.sh
cd ..   # return to the repository root for the following steps
```
3. **Configure environment variables**
Client configuration:
```bash
cp webapp/client/.env.example webapp/client/.env
```
Edit `webapp/client/.env` with your API endpoint and settings.
Server configuration:
```bash
cp webapp/server/.env.example webapp/server/.env
```
Edit `webapp/server/.env` with your database, email, and other settings.
4. **Build the client**
```bash
cd webapp/client
npm run build
cd ../..
```
5. **Start MongoDB**
```bash
# Using Homebrew (macOS)
brew services start mongodb-community
# Using system service (Linux)
sudo systemctl start mongod
```
6. **Start the application**
```bash
pm2 start pm2.config.js
pm2 save
```
## Development
### Start Development Servers
**Terminal 1 - Backend**
```bash
cd webapp/server
npm install
npm test # Run tests
npm run lint # Check code quality
node server.js # Start server
```
**Terminal 2 - Frontend**
```bash
cd webapp/client
npm install
npm start # Development server (Vite)
npm run lint # Check code quality
```
The client will be available at `http://localhost:5173` by default.
The API server runs on the port specified in your `.env` file (typically 3000).
### Available Scripts
**Backend**
- `npm test` - Run test suite with Jest
- `npm run lint` - Check code style with ESLint
- `npm run lint:fix` - Auto-fix code style issues
**Frontend**
- `npm start` - Start development server
- `npm run build` - Create production build
- `npm run serve` - Preview production build
- `npm run lint` - Check code style
## Configuration
### Server Environment Variables
Key variables in `webapp/server/.env`:
```env
# Database
MONGODB_URI=mongodb://localhost:27017/edge-core
# Email/Notifications
MAILGUN_API_KEY=your_mailgun_key
MAILGUN_DOMAIN=your_mailgun_domain
# Authentication
JWT_SECRET=your_jwt_secret
JWT_EXPIRE=7d
# File Upload
MAX_UPLOAD_SIZE=5000000000 # 5GB
# Workflow
WORKFLOW_ENGINE=nextflow # or cromwell
```
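A missing or empty variable in `webapp/server/.env` is easy to overlook. As a hedged sketch (the `missing_env_vars` helper is hypothetical, and treating these particular variables as required is an assumption based on the listing above), the file can be scanned like this:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of EDGE Core): print each variable name that
# has no non-empty NAME=value line in the given env file.
missing_env_vars() {
  local file=$1
  shift
  local name
  for name in "$@"; do
    # Require at least one character after "=" so empty values count as missing.
    if ! grep -Eq "^${name}=.+" "$file"; then
      printf '%s\n' "$name"
    fi
  done
}

# Usage (assumed set of required variables, taken from the listing above):
# missing_env_vars webapp/server/.env MONGODB_URI JWT_SECRET MAILGUN_API_KEY MAILGUN_DOMAIN
```

An empty output means every listed variable has a value. The check does not detect placeholder values such as `your_jwt_secret` that were never replaced.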
### Client Environment Variables
Key variables in `webapp/client/.env`:
```env
VITE_API_BASE_URL=http://localhost:3000/api
```
## API Documentation
Once the server is running, access the Swagger API documentation at:
```
http://localhost:3000/api-docs
```
## Database Management
### Backup Database
```bash
mongodump --out ./io/db/db-backup_$(date +%Y-%m-%d:%H:%M:%S)
```
### Restore Database
```bash
# Replace DATE with the timestamp suffix of the backup directory to restore
mongorestore ./io/db/db-backup_DATE
```
The application automatically creates periodic backups in the `io/db/` directory.
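To restore the most recent backup without typing its timestamp, the directory name can be resolved first. This is a sketch under two assumptions: backups in `io/db/` follow the `db-backup_<timestamp>` naming used by the manual backup command, and `latest_backup` is a hypothetical helper, not something the repository provides.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of EDGE Core): print the newest backup directory.
# Assumes names like db-backup_2025-01-31:23:59:59; because the timestamp is
# zero-padded and starts with the year, alphabetical order is chronological order.
latest_backup() {
  local dir=$1 path latest=""
  # Glob results arrive sorted, so the last directory seen is the newest.
  for path in "$dir"/db-backup_*; do
    if [ -d "$path" ]; then
      latest=$path
    fi
  done
  if [ -z "$latest" ]; then
    echo "no backups found in $dir" >&2
    return 1
  fi
  printf '%s\n' "$latest"
}

# Usage: restore only when a backup directory was actually found.
# backup=$(latest_backup ./io/db) && mongorestore "$backup"
```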
## Workflow Management
### Nextflow Workflows
Nextflow pipelines are located in `workflows/Nextflow/`:
- `sra2fastq/` - Convert SRA files to FASTQ format
Run a Nextflow workflow:
```bash
nextflow run workflows/Nextflow/sra2fastq/main.nf -profile standard
```
### Cromwell Workflows
Cromwell workflows are defined in `workflows/Cromwell/`:
- `sra2fastq/` - SRA to FASTQ conversion workflow
Run a Cromwell workflow:
```bash
java -jar cromwell.jar run workflows/Cromwell/sra2fastq/main.wdl
```
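The Nextflow and Cromwell invocations differ only in engine and file layout. As an illustration (the `workflow_command` helper is hypothetical, and whether the server consults `WORKFLOW_ENGINE` in this way is an assumption rather than documented behavior), a small function can print the matching command for a given engine and workflow name:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of EDGE Core): print the command line for a
# workflow, following the directory layout shown in this README.
workflow_command() {
  local engine=$1 name=$2
  case "$engine" in
    nextflow)
      printf 'nextflow run workflows/Nextflow/%s/main.nf -profile standard\n' "$name"
      ;;
    cromwell)
      printf 'java -jar cromwell.jar run workflows/Cromwell/%s/main.wdl\n' "$name"
      ;;
    *)
      printf 'unknown workflow engine: %s\n' "$engine" >&2
      return 1
      ;;
  esac
}

# Usage: print the command for the engine named in WORKFLOW_ENGINE.
# workflow_command "${WORKFLOW_ENGINE:-nextflow}" sra2fastq
```

Printing the command instead of executing it keeps the sketch safe to run on a machine where neither engine is installed.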
## Monitoring & Logging
### View Application Logs
```bash
# All processes
pm2 logs
# Specific process
pm2 logs edge-server
# Real-time monitoring
pm2 monit
```
### Log Files
Application logs are written to:
- `io/log/` - Daily rotated log files
- `io/projects/*/log.txt` - Per-project logs
- `io/bulkSubmissions/*/log.txt` - Per-submission logs
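When many projects exist, scanning the per-project logs by hand is slow. A rough sketch, assuming failures are recorded with the word "error" somewhere in `log.txt` (the log format is not documented here, so adjust the pattern as needed) and using a hypothetical `projects_with_errors` helper:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of EDGE Core): print the directory name of every
# project under <io-root>/projects whose log.txt mentions "error" (any case).
projects_with_errors() {
  local io_root=$1 log
  for log in "$io_root"/projects/*/log.txt; do
    # Skip the unexpanded pattern when no log files exist.
    [ -f "$log" ] || continue
    if grep -qi 'error' "$log"; then
      basename "$(dirname "$log")"
    fi
  done
}

# Usage:
# projects_with_errors ./io
```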
## Testing
### Run Server Tests
```bash
cd webapp/server
npm test
```
### Run with Coverage
```bash
npm test -- --coverage
```
## Email Configuration
The application uses Nodemailer with Mailgun for sending notifications. To configure:
1. Get your Mailgun API credentials from https://mailgun.com
2. Set `MAILGUN_API_KEY` and `MAILGUN_DOMAIN` in `webapp/server/.env`
3. Email templates are in `webapp/server/email_templates/`
## Troubleshooting
### MongoDB Connection Issues
```bash
# Check if MongoDB is running
brew services list # macOS
sudo systemctl status mongod # Linux
# Test connection
mongosh mongodb://localhost:27017/edge-core
```
### Port Already in Use
```bash
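# Stop the process listening on port 3000 (API server)
lsof -ti :3000 | xargs kill
# Stop the process listening on port 5173 (Vite dev server)
lsof -ti :5173 | xargs kill
# If a process ignores the default SIGTERM, repeat the command with `kill -9`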
```
### Clear Cache & Rebuild
```bash
# Client (run from the repository root)
cd webapp/client
rm -rf node_modules dist
npm install
npm run build
cd ../..
# Server (run from the repository root)
cd webapp/server
rm -rf node_modules
npm install
```
## Production Deployment
### Using PM2
1. **Review the ecosystem file** (`pm2.config.js`, already included)
2. **Start with PM2**
```bash
pm2 start pm2.config.js
pm2 save
pm2 startup # Enable auto-start on reboot
```
3. **Monitor**
```bash
pm2 logs
pm2 monit
```
### Using Docker (Optional)
Build and run containers:
```bash
docker build -t edge-core .
docker run -d -p 3000:3000 -p 5173:5173 edge-core
```
## Contributing
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
Please ensure:
- Code follows the ESLint configuration
- Tests pass and maintain coverage
- Commits are properly formatted
## License
This project is licensed under the GNU General Public License v3.0 - see the [LICENSE](LICENSE) file for details.
## Support
For issues, questions, or contributions, please visit:
- GitHub Issues: https://github.com/LANL-Bioinformatics/edge-core/issues
- LANL Bioinformatics: https://www.lanl.gov/
## Acknowledgments
EDGE Core is developed by the Bioinformatics team at Los Alamos National Laboratory (LANL).