Impute missing values by sampling from each variable's distribution — impute_by_sampling

This function uses generate_synthetic_object to impute missing data. See help(generate_synthetic_object) for more information.
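
As a rough illustration of the general idea, the sketch below fills each NA with a value drawn at random, with replacement, from the observed values of the same vector. It is a simplification and not the package's implementation: the helper name sketch_impute_vector is hypothetical, and the example output further down this page contains imputed values (such as 22.7) that do not occur in mtcars$mpg, which suggests the real function draws from an estimated distribution rather than resampling observed values.

```r
# Hypothetical helper, not part of the package: fills NA entries by drawing
# with replacement from the observed (non-missing) values of the same vector.
sketch_impute_vector <- function(x) {
  missing_idx <- which(is.na(x))
  observed <- x[!is.na(x)]
  if (length(missing_idx) == 0 || length(observed) == 0) {
    return(x)  # nothing to impute, or nothing to sample from
  }
  # Sampling positions with sample.int() avoids sample()'s special handling
  # of a length-one numeric input.
  draws <- sample.int(length(observed), length(missing_idx), replace = TRUE)
  x[missing_idx] <- observed[draws]
  x
}

imputed <- sketch_impute_vector(c(mtcars$mpg, NA, NA, NA))
anyNA(imputed)  # FALSE
```

Because the draws come from the observed values, the imputed entries follow the empirical distribution of the variable, at the cost of never producing a value that was not already present.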

Usage

impute_by_sampling_distribution(obj, set_seed = NULL)

Arguments

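obj: A data frame, numeric vector, or character/factor vector.
set_seed: Seed for the random number generator. Set it when reproducible results are needed; the default NULL leaves the generator state unchanged.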
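
The effect of a seed argument can be sketched with a stand-in function. The helper sketch_impute_seeded below is hypothetical and only mirrors the documented signature; how the package itself handles set_seed internally is an assumption here. The point it shows is that two calls with the same seed return identical imputed values.

```r
# Hypothetical stand-in that mirrors the documented signature; it is not the
# package's code. A non-NULL set_seed fixes the random number generator state.
sketch_impute_seeded <- function(obj, set_seed = NULL) {
  if (!is.null(set_seed)) set.seed(set_seed)
  missing_idx <- which(is.na(obj))
  observed <- obj[!is.na(obj)]
  if (length(missing_idx) > 0 && length(observed) > 0) {
    # Draw positions with replacement from the observed values
    draws <- sample.int(length(observed), length(missing_idx), replace = TRUE)
    obj[missing_idx] <- observed[draws]
  }
  obj
}

x <- c("a", "b", "b", "c", NA, NA, NA)
first  <- sketch_impute_seeded(x, set_seed = 123)
second <- sketch_impute_seeded(x, set_seed = 123)
identical(first, second)  # TRUE: same seed, same imputed values
```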

Value

The same object, with its NA values replaced by imputed values.

Examples


impute_by_sampling_distribution(c(mtcars$mpg,NA,NA,NA,NA,NA))
#>  [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
#> [16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
#> [31] 15.0 21.4 28.9 31.7 20.5 25.5 20.2

data_temp <- data.frame(
    x = c(mtcars$mpg,NA,NA,NA,NA,NA),
    y = c(mtcars$cyl,NA,NA,NA,NA,NA))

as.data.frame(impute_by_sampling_distribution(data_temp))
#> Error in impute_by_sampling_distribution(data_temp): object 'obj_label' not found

dplyr::mutate(data_temp, x_impute = impute_by_sampling_distribution(x))
#>       x  y x_impute
#> 1  21.0  6     21.0
#> 2  21.0  6     21.0
#> 3  22.8  4     22.8
#> 4  21.4  6     21.4
#> 5  18.7  8     18.7
#> 6  18.1  6     18.1
#> 7  14.3  8     14.3
#> 8  24.4  4     24.4
#> 9  22.8  4     22.8
#> 10 19.2  6     19.2
#> 11 17.8  6     17.8
#> 12 16.4  8     16.4
#> 13 17.3  8     17.3
#> 14 15.2  8     15.2
#> 15 10.4  8     10.4
#> 16 10.4  8     10.4
#> 17 14.7  8     14.7
#> 18 32.4  4     32.4
#> 19 30.4  4     30.4
#> 20 33.9  4     33.9
#> 21 21.5  4     21.5
#> 22 15.5  8     15.5
#> 23 15.2  8     15.2
#> 24 13.3  8     13.3
#> 25 19.2  8     19.2
#> 26 27.3  4     27.3
#> 27 26.0  4     26.0
#> 28 30.4  4     30.4
#> 29 15.8  8     15.8
#> 30 19.7  6     19.7
#> 31 15.0  8     15.0
#> 32 21.4  4     21.4
#> 33   NA NA     22.7
#> 34   NA NA     15.4
#> 35   NA NA     18.0
#> 36   NA NA     17.7
#> 37   NA NA     23.9