Dataset Cleanup and Completion #5387

@BradReesWork

Description

This issue is an extension to the current Datasets Epic - #3357

Goals at the Python (aka cugraph) layer:

  • All Python code should be using the Datasets API

  • All example code should be using the Datasets API

  • All functions should be well documented

  • All functions should be fully coded and not be stubs

  • Provide results as a Graph: cugraph.Graph or nx.Graph

  • Provide results as a dataframe: cuDF or Pandas

  • Provide an option to return data as a SciPy sparse matrix

  • Download the data file if it is not found in the specified dataset folder

  • Allow the data file to be cleaned up at the end of use (via a function argument)

  • The "dataset" folder currently consumes 4.6 GB; it would be nice to reduce that

  • Remove all files currently in Datasets

  • Move all other files to an "extra" folder and start determining whether they are used

  • Validate the download script for the C code

  • Fix race condition
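
As a rough illustration of the "download if not found" and "clean up at the end of use" goals, here is a minimal standard-library sketch. It is not the cugraph Datasets API: `ensure_dataset`, `release_dataset`, the `fetch` callback, and the `cleanup` argument are all hypothetical names chosen for this example. The issue does not say which race condition is meant; the sketch assumes the case of two processes fetching the same file at once and sidesteps it by writing to a unique temporary file and renaming it into place.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def ensure_dataset(name: str, folder: Path, fetch: Callable[[Path], None]) -> Path:
    """Return the path of `name` inside `folder`, fetching it only if missing.

    `fetch` is a hypothetical callback that writes the file contents to the
    path it is given (for example, a thin wrapper around an HTTP download).
    """
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / name
    if target.exists():
        return target  # already present: no download needed

    # Write into a unique temporary file in the same folder, then rename it
    # into place. os.replace is atomic on the same filesystem, so a concurrent
    # reader never observes a half-written file under the final name.
    fd, tmp_name = tempfile.mkstemp(dir=folder, suffix=".part")
    os.close(fd)
    tmp = Path(tmp_name)
    try:
        fetch(tmp)
        os.replace(tmp, target)
    finally:
        # If fetch failed, remove the leftover partial file.
        if tmp.exists():
            tmp.unlink()
    return target


def release_dataset(path: Path, cleanup: bool = False) -> None:
    """Delete the data file when the caller asks for cleanup (assumed flag)."""
    if cleanup and path.exists():
        path.unlink()
```

For a real download, `fetch` could be something like `lambda dest: urllib.request.urlretrieve(url, dest)`, with `url` supplied by the dataset's metadata; that wiring is left out here because the issue does not specify it.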

Labels: improvement (Improvement / enhancement to an existing function)
