Step_dummy_hash() #304

@nhward

In step_dummy_hash(), keep_original_cols = TRUE has no effect when collapse = TRUE: the original columns are dropped anyway.

library(modeldata)
library(recipes)
library(textrecipes)
data("tate_text", package = "modeldata")

With collapse = FALSE, the original columns artist, title, and medium are still present after step_dummy_hash(), as expected with keep_original_cols = TRUE.

recipes::recipe(tate_text, ~., strings_as_factors = FALSE) |>
  textrecipes::step_dummy_hash(artist, title, medium, signed = TRUE, collapse = FALSE, keep_original_cols = TRUE) |>
  recipes::prep() |>
  recipes::bake(new_data = NULL) |>
  colnames()

With collapse = TRUE, the original columns artist, title, and medium are no longer present after step_dummy_hash(), despite keep_original_cols = TRUE. This is not the expected behavior.

recipes::recipe(tate_text, ~., strings_as_factors = FALSE) |>
  textrecipes::step_dummy_hash(artist, title, medium, signed = TRUE, collapse = TRUE, keep_original_cols = TRUE) |>
  recipes::prep() |>
  recipes::bake(new_data = NULL) |>
  colnames()
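
To make the difference between the two calls explicit, the comparison can be scripted rather than read off the colnames() output. The following is only a sketch: missing_original_cols() is a hypothetical helper written for this report (it is not part of recipes or textrecipes), and the recipe part is guarded so it runs only if textrecipes, modeldata, and text2vec are installed.

```r
# Hypothetical helper (not part of recipes or textrecipes): returns the
# selected columns that are absent from the baked output.
missing_original_cols <- function(baked_names, original_cols) {
  setdiff(original_cols, baked_names)
}

original_cols <- c("artist", "title", "medium")

# Run the comparison only when the required packages are available.
if (requireNamespace("textrecipes", quietly = TRUE) &&
    requireNamespace("modeldata", quietly = TRUE) &&
    requireNamespace("text2vec", quietly = TRUE)) {
  data("tate_text", package = "modeldata")

  for (collapse_flag in c(FALSE, TRUE)) {
    # Same recipe as in the report, with only `collapse` varying.
    baked_names <- recipes::recipe(tate_text, ~., strings_as_factors = FALSE) |>
      textrecipes::step_dummy_hash(
        artist, title, medium,
        signed = TRUE, collapse = collapse_flag, keep_original_cols = TRUE
      ) |>
      recipes::prep() |>
      recipes::bake(new_data = NULL) |>
      colnames()

    dropped <- missing_original_cols(baked_names, original_cols)
    message(
      "collapse = ", collapse_flag, "; original columns missing: ",
      if (length(dropped) == 0) "none" else paste(dropped, collapse = ", ")
    )
  }
}
```

If the behavior described here is reproduced, the collapse = TRUE iteration should report artist, title, and medium as missing, while the collapse = FALSE iteration should report none.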

The creation of the collapsed column "artist_title_medium" is an undocumented side effect of collapse = TRUE. Curiously, this is the column that is dropped when keep_original_cols = FALSE (along with artist, title, and medium).

Labels: bug (an unexpected problem or unintended behavior)