
Dependency Graph Design

Elliotte Rusty Harold edited this page Jan 22, 2020 · 16 revisions

Definitions

Artifact

An artifact is a resource in the Maven repository system that has Maven coordinates. An artifact has a name, Maven coordinates, and a sequence of bytes. It is often instantiated as a file or a Web resource. However, the absolute and relative paths to the file, the created and modified times of the file, and other file system metadata are not part of the artifact. Furthermore, the same artifact can exist in many files and many file systems at the same time. Two copies of a resource with the same Maven coordinates in two different repositories are the same artifact. An artifact does not have dependencies.

The name is usually derived from the Maven coordinates and is fixed. The byte sequence is usually fixed after the artifact is first published. The notable exception is snapshot artifacts, whose bytes can change between builds. Two artifacts are considered to be the same if they have the same coordinates, even if the name or the bytes have changed.

TBD: do we even need to consider names here?

Maven Coordinates

Maven coordinates are a colon-separated string that uniquely identifies a Maven artifact. The exact syntax of the coordinates is defined by the Maven Project's POM reference. Maven coordinates contain up to five colon-separated parts:

  • Group ID
  • Artifact ID
  • Version
  • Classifier
  • Packaging

Two coordinate strings that are character-by-character identical identify the same artifact. Furthermore, if the classifier or the packaging is omitted, it defaults to the empty string or "jar" respectively.

There is no guarantee that an artifact identified by syntactically correct Maven coordinates exists or can be located in any particular repository.
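The defaulting rule can be sketched as a small value class. This is a minimal illustration, not an existing Maven or Aether type: the class name `CoordinatesSketch` and its factory method are hypothetical, and it assumes that coordinates differing only by an omitted default are meant to identify the same artifact.

```java
import java.util.Objects;

/** Hypothetical value class for illustration; not part of Maven or Aether. */
final class CoordinatesSketch {
    final String groupId;
    final String artifactId;
    final String version;
    final String classifier; // "" when omitted
    final String packaging;  // "jar" when omitted

    CoordinatesSketch(String groupId, String artifactId, String version,
                      String classifier, String packaging) {
        this.groupId = Objects.requireNonNull(groupId);
        this.artifactId = Objects.requireNonNull(artifactId);
        this.version = Objects.requireNonNull(version);
        // Apply the documented defaults when a part is omitted (null here).
        this.classifier = classifier == null ? "" : classifier;
        this.packaging = packaging == null ? "jar" : packaging;
    }

    /** Three-part form: classifier and packaging take their defaults. */
    static CoordinatesSketch of(String groupId, String artifactId, String version) {
        return new CoordinatesSketch(groupId, artifactId, version, null, null);
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof CoordinatesSketch)) {
            return false;
        }
        CoordinatesSketch that = (CoordinatesSketch) other;
        return groupId.equals(that.groupId)
            && artifactId.equals(that.artifactId)
            && version.equals(that.version)
            && classifier.equals(that.classifier)
            && packaging.equals(that.packaging);
    }

    @Override
    public int hashCode() {
        return Objects.hash(groupId, artifactId, version, classifier, packaging);
    }

    public static void main(String[] args) {
        CoordinatesSketch shortForm = of("com.example", "lib", "1.0");
        CoordinatesSketch longForm =
            new CoordinatesSketch("com.example", "lib", "1.0", "", "jar");
        System.out.println(shortForm.equals(longForm)); // prints "true"
    }
}
```

Normalizing in the constructor means equality never has to reason about omitted parts; whether the real library should compare normalized values or raw strings is left open here.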

Project Coordinates

Project coordinates are a colon-separated string that uniquely identifies a Maven project. The exact syntax of the coordinates is defined by the Maven Project's POM reference. Project coordinates contain exactly three colon-separated strings:

  • Group ID
  • Artifact ID
  • Version

Two strings that are character by character identical identify the same project.

A Maven artifact belongs to the project that has the same group ID, artifact ID, and version as the artifact.

Dependency

A dependency belongs to a project and is defined by a dependency element in the pom.xml. A dependency contains:

  • The group ID of a Maven project
  • The artifact ID of a Maven project
  • A version range
  • An ordered list of the dependencies of the project identified by the first three (TBD: how to consider ranges?)
  • A scope
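The five parts listed above could be modeled roughly as follows. This is only a sketch under stated assumptions: the class name `DependencySketch` is hypothetical, the version range is stored as raw text because range handling is still TBD, and the `com.example` coordinates are made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of a dependency; all names are illustrative. */
final class DependencySketch {
    final String groupId;
    final String artifactId;
    final String versionRange; // raw text such as "[1.0,2.0)"; not interpreted here
    final String scope;        // for example "compile" or "test"
    private final List<DependencySketch> children = new ArrayList<>();

    DependencySketch(String groupId, String artifactId,
                     String versionRange, String scope) {
        this.groupId = groupId;
        this.artifactId = artifactId;
        this.versionRange = versionRange;
        this.scope = scope;
    }

    /** Appends a child; the list preserves declaration order. */
    void addChild(DependencySketch child) {
        children.add(child);
    }

    /** Read-only view of the ordered child dependencies. */
    List<DependencySketch> children() {
        return Collections.unmodifiableList(children);
    }

    public static void main(String[] args) {
        DependencySketch app =
            new DependencySketch("com.example", "app", "1.0", "compile");
        app.addChild(new DependencySketch("com.example", "first", "[1.0,2.0)", "compile"));
        app.addChild(new DependencySketch("com.example", "second", "3.1", "test"));
        for (DependencySketch child : app.children()) {
            System.out.println(child.artifactId + " (" + child.scope + ")");
        }
    }
}
```

An ordered list is used for the children because the order of dependency elements can affect which version a build tool selects.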

Project

A Maven project is defined by a pom.xml file. It has a single group ID, artifact ID, and version. It contains zero or more dependencies, each of which is identified by a dependency element in the dependencies section of the pom.xml file. It also contains one or more artifacts. Most projects contain a single artifact. However, a project can use different classifiers and packaging types to produce multiple artifacts.

Projects also have various other metadata such as organization, copyright, issue tracker URL, developers, and more that we do not need to consider or model.

effective POM????

pom.xml

Dependency Graph

Dependency Tree

There is no such thing. A Maven dependency graph is not a tree. It can, and more often than not does, contain cycles.
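Because cycles are possible, any code that walks the graph has to remember which nodes it has already visited. The following is a minimal sketch, assuming a hypothetical adjacency-map representation in which each node is a plain string; it is not the planned DependencyGraph class.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper showing a cycle-safe breadth-first walk. */
final class GraphWalkSketch {
    /** Returns every node reachable from root, each exactly once, in BFS order. */
    static List<String> reachable(Map<String, List<String>> edges, String root) {
        List<String> order = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        seen.add(root);
        queue.add(root);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            order.add(node);
            for (String next : edges.getOrDefault(node, Collections.<String>emptyList())) {
                // Set.add returns false for a node seen before, which breaks cycles.
                if (seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> edges = new HashMap<>();
        edges.put("a", Arrays.asList("b"));
        edges.put("b", Arrays.asList("c"));
        edges.put("c", Arrays.asList("a")); // closes the cycle a -> b -> c -> a
        System.out.println(reachable(edges, "a")); // prints "[a, b, c]"
    }
}
```

A naive recursive traversal written for a tree would never terminate on this input.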

Classpath

An ordered list of jar files, zip files, and directories, each of which contains Java class files. Maven-repository-based build tools such as Maven, Gradle, and Ivy use different algorithms to convert a dependency graph into a classpath. That is, there is not a unique classpath for each dependency graph. javac and the Java virtual machine only read the classpath and do not consider the dependency graph.
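To illustrate why one graph does not imply one classpath, the sketch below resolves the same graph with two simplified strategies: "nearest wins", in the spirit of Maven's version mediation, and "highest version wins", in the spirit of Gradle's default conflict resolution. Everything here is an assumption made for illustration: nodes are hypothetical `name:version` strings with integer versions, and the real tools handle scopes, exclusions, ranges, and ordering in ways this code ignores.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical comparison of two conflict-resolution strategies. */
final class ClasspathStrategiesSketch {
    static String name(String node) {
        return node.substring(0, node.indexOf(':'));
    }

    static int version(String node) {
        return Integer.parseInt(node.substring(node.indexOf(':') + 1));
    }

    static List<String> children(Map<String, List<String>> edges, String node) {
        return edges.getOrDefault(node, Collections.<String>emptyList());
    }

    /** Breadth-first walk; the first version seen for a name (closest to root) wins. */
    static List<String> nearestWins(Map<String, List<String>> edges, String root) {
        Map<String, String> chosen = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>(children(edges, root));
        while (!queue.isEmpty()) {
            String node = queue.remove();
            if (chosen.containsKey(name(node))) {
                continue; // a nearer version already won; do not expand this one
            }
            chosen.put(name(node), node);
            queue.addAll(children(edges, node));
        }
        return new ArrayList<>(chosen.values());
    }

    /** Breadth-first walk over every node; the highest version of each name wins. */
    static List<String> highestWins(Map<String, List<String>> edges, String root) {
        Map<String, String> chosen = new LinkedHashMap<>();
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>(children(edges, root));
        while (!queue.isEmpty()) {
            String node = queue.remove();
            if (!seen.add(node)) {
                continue; // already expanded; also guards against cycles
            }
            String current = chosen.get(name(node));
            if (current == null || version(node) > version(current)) {
                chosen.put(name(node), node); // re-putting a key keeps its position
            }
            queue.addAll(children(edges, node));
        }
        return new ArrayList<>(chosen.values());
    }

    public static void main(String[] args) {
        Map<String, List<String>> edges = new HashMap<>();
        edges.put("app:1", Arrays.asList("a:1", "lib:1"));
        edges.put("a:1", Arrays.asList("lib:2"));
        System.out.println(nearestWins(edges, "app:1")); // prints "[a:1, lib:1]"
        System.out.println(highestWins(edges, "app:1")); // prints "[a:1, lib:2]"
    }
}
```

The two results differ only in the version of `lib`, yet that is enough for the same source code to compile or link against different classes depending on the build tool.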

Library Design (not yet implemented)

This is how we model the above concepts.

Artifacts are represented by the Aether Artifact class.

Dependencies are represented by the Dependency class.

Dependency graphs are represented by the DependencyGraph class.

Instead of producing a pure classpath, we produce an annotated classpath. This is an ordered list of jar files, the same as Maven or Gradle would produce. However, in our data structure each node in the list is annotated with its Maven coordinates and with a pointer to the corresponding Dependency node in the graph.
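One possible shape for an annotated classpath entry is sketched below. Since the library is not yet implemented, every name here (`AnnotatedClasspathSketch`, `Entry`, `Node`, `toClasspath`) is hypothetical, and `Node` is only a stand-in for the Dependency node in the graph.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of an annotated classpath; not the real library API. */
final class AnnotatedClasspathSketch {
    /** Stand-in for a Dependency node in the dependency graph. */
    static final class Node {
        final String coordinates;

        Node(String coordinates) {
            this.coordinates = coordinates;
        }
    }

    /** One classpath entry: the jar plus its coordinates and graph node. */
    static final class Entry {
        final Path jar;
        final String coordinates;
        final Node node;

        Entry(Path jar, Node node) {
            this.jar = jar;
            this.coordinates = node.coordinates;
            this.node = node;
        }
    }

    /** Drops the annotations and returns the plain classpath string. */
    static String toClasspath(List<Entry> entries) {
        List<String> jars = new ArrayList<>();
        for (Entry entry : entries) {
            jars.add(entry.jar.toString());
        }
        return String.join(File.pathSeparator, jars);
    }

    public static void main(String[] args) {
        List<Entry> classpath = new ArrayList<>();
        classpath.add(new Entry(Paths.get("lib", "first-1.0.jar"),
            new Node("com.example:first:1.0")));
        classpath.add(new Entry(Paths.get("lib", "second-3.1.jar"),
            new Node("com.example:second:3.1")));
        for (Entry entry : classpath) {
            System.out.println(entry.jar + " <- " + entry.coordinates);
        }
        System.out.println(toClasspath(classpath));
    }
}
```

Keeping a reference to the graph node on each entry lets a caller trace a jar back to the path through the graph that introduced it, which a plain classpath string cannot do.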
