How to create Link property and save the data in OrientDB from Spark? #18

@LianaN

OrientDB Version: 2.2

Scala Version: 2.11.8

I use the spark-orientdb connector to store a DataFrame in OrientDB.

<dependency>
    <groupId>com.orientechnologies</groupId>
    <artifactId>orientdb-graphdb</artifactId>
    <version>2.2.2</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-orientdb-2.2.1_2.11</artifactId>
    <version>1.4</version>
</dependency>

Everything works fine except for one case. One of the fields of the OrientDB table User should be of type Link. This field, called ContentLink, should point to the record in the table Content that has the matching ContentId.

val cClass = graph.getVertexType("Content")
userA.createProperty("ContentLink", OType.LINK, cClass)

Then I save the data in Spark as follows:

    users
      .write
      .format("org.apache.spark.orientdb.graphs")
      .option("dburl", uri)
      .option("user", username)
      .option("password", password)
      .option("vertextype", "User")
      .mode(SaveMode.Overwrite)
      .save()

When I create properties using createProperty as shown above, the database structure seems to be correct and the field ContentLink is of the type Link. However, when I write the data into the table, this field is changed to String and the link is lost.

I think this happens because a Link must hold the RecordId (the physical address, @rid) of the relevant row in the Content table, not a plain value.
I tried to retrieve every @rid of the table Content in Spark, but the query that selects @rid fails with an IndexOutOfBounds error:

    val df_idrid = spark.read
      .format("org.apache.spark.orientdb.graphs")
      .option("dburl", uri)
      .option("user", username)
      .option("password", password)
      .option("vertextype", "Content")
      .option("query", s"select @rid, id from Content")
      .load()
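
To make clear what I am trying to achieve with that query, here is a minimal plain-Scala sketch of the lookup, with no Spark or connector involved. The names User, LinkedUser and ResolveLinks and the sample record IDs are hypothetical and only illustrate the idea: each user's ContentId is replaced by the @rid of the matching Content record, in OrientDB's `#cluster:position` form. It does not show how the connector itself would have to write a Link value, which is the part I am missing.

```scala
// Hypothetical shapes, for illustration only; not part of the connector API.
final case class User(name: String, contentId: Long)
final case class LinkedUser(name: String, contentLink: Option[String])

object ResolveLinks {
  // Replace each user's ContentId with the @rid of the matching Content record.
  // Users whose ContentId has no matching record get None instead of a link.
  def resolve(users: Seq[User], ridByContentId: Map[Long, String]): Seq[LinkedUser] =
    users.map(u => LinkedUser(u.name, ridByContentId.get(u.contentId)))

  def main(args: Array[String]): Unit = {
    // Made-up (ContentId -> @rid) pairs, as the failing query would return them.
    val ridByContentId = Map(1L -> "#12:0", 2L -> "#12:1")
    val users = Seq(User("alice", 1L), User("bob", 3L))
    resolve(users, ridByContentId).foreach(println)
  }
}
```

In Spark this would presumably be a join between the users DataFrame and the (id, @rid) DataFrame, which is why I need the query above to work.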

How can I solve this issue?
