M1522.006300 Distributed Systems

This is the course project repository of Group 17 for M1522.006300 Distributed Systems.

Project Description

The goal of this project is to deploy and manage a prototype cloud cluster that runs batch-processing WordLetterCount applications. There are two WordLetterCount implementations: one uses the Spark API, and the other uses the WordCount API together with a self-designed resource scheduler.
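
As a rough illustration of the kind of batch job involved, the sketch below counts word and letter frequencies in a piece of text using plain Python. It is only an assumption about what WordLetterCount computes; the authoritative definition is in Specification.md, and the function name `word_letter_count` and the case-insensitive, letters-only tokenization are hypothetical choices made for this example. It does not use the Spark API or the project's scheduler.

```python
import re
from collections import Counter


def word_letter_count(text):
    """Return (word_counts, letter_counts) for text, ignoring case.

    Hypothetical helper: words are taken to be runs of ASCII letters,
    which may differ from the rules in the project specification.
    """
    words = re.findall(r"[a-z]+", text.lower())
    word_counts = Counter(words)
    # Count letters only inside the extracted words, so digits,
    # punctuation and whitespace are skipped.
    letter_counts = Counter(ch for word in words for ch in word)
    return word_counts, letter_counts


if __name__ == "__main__":
    words, letters = word_letter_count("The cat and the hat")
    print(words.most_common(3))
    print(letters.most_common(3))
```

In a distributed setting the same two tallies would be computed per input partition and then merged, which is the part the Spark API or a custom scheduler takes care of.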

Developer Tutorials

Refer to the docs folder for useful guides. The project specification is in Specification.md.

Refer to the GCP guide for a detailed tutorial on how to configure, access, and use your GCP clusters.

Our project ID is peaceful-fact-294309. You can use the web-based GCP Console dashboard to view our cluster, VMs, and Pods.

To-Dos

- Deploy Google Dataproc on GKE (ref: Dataproc on Google Kubernetes Engine)
- Install WordCount locally to test
- Test WordCount on GKE
- Deploy Hadoop on GKE
- Tweak Hadoop deployment, integration with GCS

