Allow mapping empty cells (columns) as "absent" properties (i.e. skip) by CsvReadFeature.EMPTY_UNQUOTED_STRING_AS_MISSING #355

@hosswald

When trying to parse a CSV with missing cells (in present columns) into a Kotlin data class whose fields have default values, I found (FasterXML/jackson-module-kotlin#605) that I needed to combine CsvParser.Feature.EMPTY_STRING_AS_NULL and KotlinFeature.NullIsSameAsDefault to achieve this.
Looking through the test cases, it looks to me as if the CSV handling of missing columns could be improved.
Quoting myself from the above-mentioned ticket:

the usage of default parameter should be the default when parsing a CSV and encountering an empty cell, regardless of whether or not EMPTY_STRING_AS_NULL is used, in my opinion.

That is because CSVs don't have explicit nulls (unlike JSON, which does).
If I'm not mistaken, when a field is missing from a JSON document, the default value of the respective field is kept. So far so good.
However, in CSVs there is a difference between missing columns (defaults are used) and missing cells in present columns (which requires me to mix feature flags from CsvParser and the Kotlin module to get the defaults).
I'm not sure about the situation for Java/POJOs. Looking at the existing tests, MissingColumnsTest::testDefaultMissingHandling() handles missing columns the way I would like missing cells (in a present column) to be handled, but NullReadTest::testEmptyStringAsNull330() ensures empty cells are handled as null even if there is a default value.

As an infrequent user, I find the different handling of missing/present cells/columns confusing. In my opinion, either all three of the following assertions should succeed, or there should be a single flag to make them succeed (something like EMPTY_AND_MISSING_AS_DEFAULT):

    static class PojoWithDefault {
        public Integer id;
        public String value = "default";
    }

    public void testDefault() throws Exception {
        CsvSchema headerSchema = CsvSchema.emptySchema().withHeader();

        PojoWithDefault missingColumn = MAPPER
                .readerFor(PojoWithDefault.class)
                .with(headerSchema)
                .<PojoWithDefault>readValues("id\n"
                                             + "1").next();
        Assert.assertEquals("default", missingColumn.value); // succeeds

        PojoWithDefault missingCellInPresentColumnWithoutComma = MAPPER
                .readerFor(PojoWithDefault.class)
                .with(headerSchema)
                .<PojoWithDefault>readValues("id,value\n"
                                    + "1").next();
        Assert.assertEquals("default", missingCellInPresentColumnWithoutComma.value); // succeeds

        PojoWithDefault missingCellInPresentColumnWithComma = MAPPER
                .readerFor(PojoWithDefault.class)
                .with(headerSchema)
                .<PojoWithDefault>readValues("id,value\n"
                                    + "1,").next();
        Assert.assertEquals("default", missingCellInPresentColumnWithComma.value); // fails
    }
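
To make the requested semantics concrete, here is a minimal plain-Java sketch of "empty cell means absent property". It does not use Jackson at all: the class EmptyCellAsMissingSketch and the helpers presentCells and bind are hypothetical names made up for illustration, and the naive split on commas ignores quoting, so a quoted empty string ("") is not distinguished from an unquoted one. It only shows the idea that a column is treated as set when its cell exists and is non-empty, so all three inputs from the test end up with the default.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EmptyCellAsMissingSketch {

    // Same shape as PojoWithDefault in the test case.
    public static class PojoWithDefault {
        public Integer id;
        public String value = "default";
    }

    // Hypothetical helper: pairs header names with cells and skips every
    // column whose cell is absent (short row) or empty (trailing comma).
    // Assumption: no quoting or escaping, cells are split on plain commas.
    public static Map<String, String> presentCells(String headerLine, String rowLine) {
        String[] names = headerLine.split(",", -1);
        String[] cells = rowLine.split(",", -1); // -1 keeps trailing empty cells
        Map<String, String> present = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            if (i < cells.length && !cells[i].isEmpty()) {
                present.put(names[i], cells[i]);
            }
        }
        return present;
    }

    // Hypothetical helper: only properties that are present overwrite the
    // defaults set by the field initializers.
    public static PojoWithDefault bind(String headerLine, String rowLine) {
        Map<String, String> present = presentCells(headerLine, rowLine);
        PojoWithDefault pojo = new PojoWithDefault();
        if (present.containsKey("id")) {
            pojo.id = Integer.valueOf(present.get("id"));
        }
        if (present.containsKey("value")) {
            pojo.value = present.get("value");
        }
        return pojo;
    }

    public static void main(String[] args) {
        System.out.println(bind("id", "1").value);        // missing column
        System.out.println(bind("id,value", "1").value);  // missing cell, no comma
        System.out.println(bind("id,value", "1,").value); // missing cell, with comma
        System.out.println(bind("id,value", "1,x").value);
    }
}
```

With this rule the first three calls print "default" and the last prints "x". Whether the real parser should behave this way by default or only behind a flag is exactly the open question of this issue.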
