initiate_random_table()

As the name suggests, this function initiates a random table. For each column we can specify its name and the values it may contain, which in turn determine the column's data type.

kero.DataHandler.RandomDataFrame.py

class RandomDataFrame:
  def initiate_random_table(self, n_row, *argv, panda=True, with_unique_ID=None):
    return out
n_row (integer) Number of rows in the table.
*argv (dictionary) Each positional argument argv[k] is a dictionary describing one column, with two entries (as in the usage examples):

"column_name" : name of the column

"items" : list of possible data points in the column

Each column k is populated uniformly at random with x, where x is any one of the elements of "items".

panda (Bool) True : output tuple (df,[]) where df is the random pandas DataFrame
False: output (x,y) such that x is the matrix that forms the randomly initiated table and y is the set of column names
with_unique_ID (String) If set to None, then nothing happens.

If set to a string, a unique ID is added as the first column in this manner:

– if with_unique_ID="person" then the IDs will be person1, person2, etc., in order

row_name_list (String) If set to None, then nothing happens.

If set to a string str, the rows are named str0, str1, … (numbering starts at 0, as in example usage 3).

return out tuple, either (df,[]) or (x,y) as specified by the argument panda above
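
As a rough illustration of the behaviour described above, the following standard-library sketch mimics the panda=False output form (x, y). It is not kero's implementation: the function name sketch_random_table, the seed parameter, and the "ID" column name are assumptions made for this example only.

```python
import random


def sketch_random_table(n_row, *columns, with_unique_ID=None, seed=None):
    """Hypothetical stand-in for the documented behaviour, not kero's code.

    Each positional argument is a dict with the keys "column_name" and
    "items", as in the usage examples. Returns (x, y): x is a list of rows
    and y is the list of column names.
    """
    rng = random.Random(seed)  # seed is an assumption, added for repeatability
    names = [col["column_name"] for col in columns]
    pools = [list(col["items"]) for col in columns]

    rows = []
    for i in range(n_row):
        # Draw one value per column, uniformly at random from its items.
        row = [rng.choice(pool) for pool in pools]
        if with_unique_ID is not None:
            # IDs run in order: person1, person2, and so on.
            row.insert(0, f"{with_unique_ID}{i + 1}")
        rows.append(row)

    if with_unique_ID is not None:
        names = ["ID"] + names  # the "ID" column name is assumed
    return rows, names


col1 = {"column_name": "first", "items": [1, 2, 3]}
col2 = {"column_name": "second", "items": ["a", "b"]}
x, y = sketch_random_table(4, col1, col2, with_unique_ID="person", seed=0)
print(y)
for row in x:
    print(row)
```

Each row holds one ID followed by one randomly chosen item per column, so the table always has n_row rows whatever the sizes of the item lists.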

Example usage 1.

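import kero.DataHandler.RandomDataFrame as RDF
import numpy as np

rdf=RDF.RandomDataFrame()
# Column "first" draws from three integers.
col1={"column_name": "first", "items":[1,2,3]}
# Column "second" draws from 8 evenly spaced floats between 10 and 20.
itemlist=list(np.linspace(10,20,8))
col2={"column_name": "second", "items": itemlist}
# Build a 4-row table; with panda=True the first element of the returned tuple is the DataFrame.
df,_=rdf.initiate_random_table(4,col1,col2,panda=True)
print(df)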

Since the table is random, the output differs from run to run. One run looks like the following.

   first     second
0      3  17.142857
1      3  18.571429
2      1  15.714286
3      1  11.428571
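
The values in the second column are drawn from the eight evenly spaced candidates that np.linspace(10,20,8) produces. The small sketch below reproduces those candidates with plain Python, so it runs without numpy, and checks that the values printed above are among them. The variable names are illustrative only.

```python
# Reproduce the candidates of np.linspace(10, 20, 8) with plain Python:
# 8 points from 10 to 20 inclusive, so the step is (20 - 10) / 7.
start, stop, num = 10, 20, 8
candidates = [start + i * (stop - start) / (num - 1) for i in range(num)]
rounded = [round(v, 6) for v in candidates]
print(rounded)

# Values copied from the "second" column of the output above.
shown = [17.142857, 18.571429, 15.714286, 11.428571]
print(all(v in rounded for v in shown))
```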

Example usage 2.

See example usage 1 in section “Puncture a table” in this link.

Example usage 3.

import kero.DataHandler.RandomDataFrame as RDF

rdf = RDF.RandomDataFrame()
itemlist = range(100)
col1 = {"column_name": "first", "items": itemlist}
col2 = {"column_name": "second", "items": itemlist}
col3 = {"column_name": "third", "items": itemlist}
col4 = {"column_name": "fourth", "items": itemlist}
N_row = 20
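# The two commented-out lines show the manual way to name the rows:
# build the table first, then overwrite the index of rdf.clean_df.
# rdf.initiate_random_table(N_row, col1, col2, col3, col4, panda=True)
# rdf.clean_df.index = ["".join(('aa', str(x))) for x in range(N_row)]
# Passing row_name_list='aa' does the naming in a single call.
rdf.initiate_random_table(N_row, col1, col2, col3, col4, panda=True, row_name_list='aa')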
print(rdf.clean_df)

The example output is partially shown here.

      first  second  third  fourth
aa0      50      14     25       3
aa1      91      28     52      90
aa2      36      90      4      82
aa3      54      82     85      46
aa4      81      57     68      94
aa5      89      52     51      24
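
The row labels aa0, aa1 and so on follow the pattern used in the commented-out line of the example: the prefix joined with the row position, counted from 0. Below is a minimal sketch of that naming rule on its own, with no kero or pandas dependency. The helper name make_row_names is hypothetical.

```python
def make_row_names(prefix, n_row):
    """Hypothetical helper: label rows prefix0, prefix1, ... as in the output above."""
    return [f"{prefix}{i}" for i in range(n_row)]


names = make_row_names("aa", 6)
print(names)

# The list comprehension from the commented-out line gives the same labels.
manual = ["".join(("aa", str(x))) for x in range(6)]
print(names == manual)
```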

Example usage 4.

See example usage 1 in repair_single_column. The example shows how a random table is initiated with unique ID.

kero version: 0.1 and above