Better handling for different input formats

Problems arise when inputs to the ML model are not in the standard format:

- Data processing currently expects an R data.frame; it should also accept other data formats.
- When a model expects one-hot encoded categorical variables, we do not account for this and end up treating the indicator columns as continuous variables, which causes problems when fitting the Distillation.
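
A minimal sketch of how both points might be handled, using base R only. The helper names `as_model_frame` and `collapse_one_hot` are hypothetical and not part of the existing code, and the sketch assumes the caller knows which columns form a one-hot block rather than detecting them automatically.

```r
# Hypothetical helper: coerce common tabular inputs to a base data.frame.
as_model_frame <- function(x) {
  if (is.data.frame(x)) {
    # Also covers data.frame subclasses (e.g. tibble, data.table).
    return(as.data.frame(x))
  }
  if (is.matrix(x) || is.list(x)) {
    return(as.data.frame(x, stringsAsFactors = FALSE))
  }
  stop("Unsupported input type: ", paste(class(x), collapse = "/"))
}

# Hypothetical helper: replace a block of one-hot indicator columns
# with a single factor column so it is no longer treated as continuous.
collapse_one_hot <- function(df, cols, name) {
  block <- as.matrix(df[, cols, drop = FALSE])
  # A valid one-hot block contains only 0/1 with exactly one 1 per row.
  stopifnot(all(block %in% c(0, 1)), all(rowSums(block) == 1))
  # max.col() returns, for each row, the index of the column holding the 1.
  df[[name]] <- factor(cols[max.col(block, ties.method = "first")],
                       levels = cols)
  # Drop the original indicator columns, keeping the new factor.
  df[, setdiff(names(df), cols), drop = FALSE]
}

# Usage: a numeric matrix with a one-hot encoded "color" variable.
m <- cbind(age         = c(30, 41, 25),
           color_red   = c(1, 0, 0),
           color_blue  = c(0, 1, 0),
           color_green = c(0, 0, 1))
df <- as_model_frame(m)
df <- collapse_one_hot(df, c("color_red", "color_blue", "color_green"), "color")
```

After these two calls `df` holds the numeric `age` column and a single factor `color`. To call the original model, the factor would still need to be expanded back into indicator columns in the layout that model expects.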