Missing value imputation and Outlier treatment

Ujjawal
Should missing value imputation and outlier treatment be done before splitting the data into training and validation sets? Suppose I have split my data into training and validation sets, imputed missing values with the median, and capped values at the 1st and 99th percentiles in the training set. When imputing missing values and treating outliers in the validation set, should I use the same median and capping values that were calculated on the training data, or would it be fine to calculate the median and percentiles from the validation set itself? And will the same process hold in future for a new data set that we score? I know this is not an SPSS question, but since many analytics professionals are active in this forum, I thought I would get an answer if I posted it here :-)
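
To make the first option concrete (reusing the training-set median and caps on the validation set), here is a minimal Python sketch using only the standard library. The function names `fit_treatment` and `apply_treatment` and the toy data are hypothetical, made up for illustration, and the percentile interpolation rule is an assumption; other tools may compute percentiles slightly differently.

```python
import math
from statistics import median, quantiles


def is_missing(v):
    """Treat None and NaN as missing (an assumption about how gaps are coded)."""
    return v is None or math.isnan(v)


def fit_treatment(train_values):
    """Learn the median and the 1st/99th percentile caps from training data only."""
    observed = [v for v in train_values if not is_missing(v)]
    # quantiles(..., n=100) returns 99 cut points: the 1st through 99th percentiles.
    cuts = quantiles(observed, n=100, method="inclusive")
    return {"median": median(observed), "lower": cuts[0], "upper": cuts[-1]}


def apply_treatment(values, params):
    """Impute missing values with the stored median, then cap at the stored limits."""
    treated = []
    for v in values:
        if is_missing(v):
            v = params["median"]
        treated.append(min(max(v, params["lower"]), params["upper"]))
    return treated


# Hypothetical toy data: one variable, with one missing value in each set.
train = [float(x) for x in range(1, 101)] + [None]
validation = [None, -50.0, 42.0, 1000.0]

params = fit_treatment(train)                      # computed on training data only
train_treated = apply_treatment(train, params)
validation_treated = apply_treatment(validation, params)  # same values reused
```

Under this approach a new scoring data set would go through `apply_treatment` with the same stored `params`. The second option in my question would instead call `fit_treatment` on the validation set itself.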

Thanks in anticipation!