drop_defective_table()


kero.DataHandler.DataTransform.py

def drop_defective_table(dataframe):
  return clean_df, crippled_df
dataframe: pandas data frame.
return clean_df, crippled_df: clean_df and crippled_df are pandas data frames.

Drops the defective rows of a data frame. The function returns clean_df, which is the input data frame with the defective rows removed, and crippled_df, which contains only the defective rows of the input data frame. Both returned objects are pandas data frames. What constitutes a defect is determined by the function column_defect_index().
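
For readers without kero installed, the split described above can be approximated in plain pandas. This is only a sketch, not kero's implementation: it assumes that a "defective" row is one containing at least one missing value (NaN or None), which matches the example output on this page but may not cover everything column_defect_index() checks. The function name split_rows_with_missing_values is hypothetical.

```python
import numpy as np
import pandas as pd


def split_rows_with_missing_values(dataframe):
    """Hypothetical stand-in for drop_defective_table (not kero's code).

    Assumption: a row is defective if it has at least one missing value.
    """
    # Boolean Series: True for every row with at least one NaN/None entry.
    has_missing = dataframe.isna().any(axis=1)
    # Renumber both results from 0, as in the example output on this page.
    clean_df = dataframe[~has_missing].reset_index(drop=True)
    crippled_df = dataframe[has_missing].reset_index(drop=True)
    return clean_df, crippled_df


# Small made-up table: only the first row is free of missing values.
df = pd.DataFrame({
    "first": [3.0, 2.0, np.nan, 1.0],
    "second": [15.1, np.nan, 16.8, 13.6],
    "third": ["not", "gg", "not", None],
})
clean_df, crippled_df = split_rows_with_missing_values(df)
print(clean_df)
print(crippled_df)
```

Every input row ends up in exactly one of the two frames, so their lengths add up to the length of the input.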

Example usage 1.

import pandas as pd
import kero.DataHandler.DataTransform as dt
import kero.DataHandler.Debuggers as dhdeb

rdf = dhdeb.check_initiate_random_table()
df = pd.read_csv(r"check_table_defect_index.csv")
# Split the table into rows without defects and rows with defects.
clean_data, crippled_data = dt.drop_defective_table(df)

print(df)
print("\nclean df:\n")
print(clean_data)
print("\ncrippled df:\n")
print(crippled_data)

The above prints the full table, followed by “clean df”, i.e. the table with the defective rows removed, and “crippled df”, i.e. the table containing only the defective rows. In the output below, both returned tables are re-indexed from 0.

    first     second third fourth
0     3.0  15.106383   not     us
1     2.0  17.234043   not     id
2     1.0  11.489362   not     sg
3     2.0        NaN    gg     bf
4     2.0        NaN   not     my
5     1.0  13.617021    gg     jp
6     3.0        NaN    gg     jp
7     2.0  18.510638    gg     bf
8     3.0  18.936170    gg     jp
9     2.0  17.234043   NaN     jp
10    NaN  16.808511   not     sg
11    3.0  14.893617   NaN     sg
12    3.0  17.234043    gg     jp
13    2.0  19.361702    gg    NaN
14    2.0  12.340426   not     us
15    1.0  15.744681    gg     id
16    1.0  15.744681   not     sg
17    3.0  17.659574   not     id
18    3.0  12.765957    gg     bf
19    3.0  14.042553   not     id

clean df:

    first     second third fourth
0     3.0  15.106383   not     us
1     2.0  17.234043   not     id
2     1.0  11.489362   not     sg
3     1.0  13.617021    gg     jp
4     2.0  18.510638    gg     bf
5     3.0  18.936170    gg     jp
6     3.0  17.234043    gg     jp
7     2.0  12.340426   not     us
8     1.0  15.744681    gg     id
9     1.0  15.744681   not     sg
10    3.0  17.659574   not     id
11    3.0  12.765957    gg     bf
12    3.0  14.042553   not     id

crippled df:

   first     second third fourth
0    2.0        NaN    gg     bf
1    2.0        NaN   not     my
2    3.0        NaN    gg     jp
3    2.0  17.234043   NaN     jp
4    NaN  16.808511   not     sg
5    3.0  14.893617   NaN     sg
6    2.0  19.361702    gg    NaN

kero version: 0.1 and above