Parameter Estimation for Differential Equations using SciPy Least Squares


This post is about parameter estimation using SciPy's least_squares optimizer, specifically for the differential equation \frac{dn}{dt}=G-k_1 n-k_2 n^2-k_3 n^3; that is, we are looking for a set of parameters (G,k_1,k_2,k_3) that best fits the experimental data. We will go through choosing initial parameter guesses and running the optimization. This post is explained more fully in this note: Differential Equation Parameter Estimations. The data used here is a fraction of the sample data I was given. The code and sample data can also be found here.
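Before going through the scripts, here is a minimal, self-contained sketch of the whole idea on synthetic data. It is illustration only: names like rhs_with and t_data and the toy "true" parameters are made up for this sketch, and it uses solve_ivp for brevity, while the actual scripts below drive scipy's RK45 class directly.

import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

def rhs_with(p): # p = [G, k1, k2, k3]
    def rhs(t, n):
        return p[0] - p[1]*n - p[2]*n**2 - p[3]*n**3
    return rhs

t_data = np.linspace(0.0, 5.0, 50)
n0 = 800.0
p_true = [0.0, 1e-2, 1e-3, 1e-7]
n_data = solve_ivp(rhs_with(p_true), (t_data[0], t_data[-1]), [n0],
                   t_eval=t_data).y[0] # stand-in "experimental" data

def residuals(p): # one residual per data point, as least_squares expects
    sol = solve_ivp(rhs_with(p), (t_data[0], t_data[-1]), [n0], t_eval=t_data)
    return sol.y[0] - n_data

# bounds keep all parameters nonnegative so the ODE stays well behaved
res = least_squares(residuals, [0.0, 1e-3, 0.5e-3, 1.2e-7],
                    bounds=(0, np.inf))
print("fitted [G, k1, k2, k3] =", res.x, " cost =", res.cost)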

As usual, do use a virtual environment for cleaner package management (see here). The required packages can be installed via pip.

pip install numpy
pip install scipy
pip install matplotlib

Put the scripts scipyode_header.py, cubicODE.py and cubicODE_post_processing.py into the project folder (inside the virtual environment folder if you are using one). The code can be found at the end of the post.

  1. PART 1. Loading data. In scipyode_header.py, customize load_data() as you see fit. As written, it loads the data starting from the maximum point and assumes that the csv file contains 2 columns: column 1 is the variable t, column 2 is n. The file has no header.
    Go into cubicODE.py and set the following toggles.

    plot_data True
    initial_guess_mode False
    optimize_mode False

    Run cubicODE.py. You will see a figure like figure 1. In this example, the relevant data points (blue) are the points that overlap with the orange plot.

  2. PART 1. Define the ODE. Write the ODE in def f(t,n,f_args).
  3. PART 2. Choosing initial guesses. Set as many initial guesses as you want to try via collection_of_initial_guesses. Since we optimize [G,k1,k2,k3], an example guess is [0,1e-3,0.5e-3,1.2e-7]. Set the following toggles and then run cubicODE.py. A plot like figure 1 will be generated.
    plot_data False
    initial_guess_mode True
    optimize_mode False
  4. PART 2. Optimization. Once the initial guesses are decided upon, set the following toggles and then run cubicODE.py. The optimized parameters from each initial guess will be saved to a .par file, whose name is set by the variable save_name. The code in our example uses 3 initial guesses and save_name = "20190110_cubic", so 20190110_cubic_1.par, 20190110_cubic_2.par and 20190110_cubic_3.par will be created.
    plot_data False
    initial_guess_mode False
    optimize_mode True
  5. Loading optimized parameters. Go to cubicODE_post_processing.py. To load all 3 saved .par files, set load_name = "20190110_cubic" and serial_number = [1,2,3], then run cubicODE_post_processing.py. The resulting plots are shown in figure 2(A, B, C) and show a good fit. The optimized parameters for each initial guess can be read off from the console, as in figure 2(D). Generally, the smaller the cost value, the better the fit. A minimal loading sketch follows this list.
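Each .par file is simply a pickled Python dict with keys "initial_guess", "optimized" and "cost" (see cubicODE.py below), so it can also be inspected by hand; a minimal sketch:

import pickle

load_name = "20190110_cubic"
for k in [1, 2, 3]:
    with open(load_name + "_" + str(k) + ".par", "rb") as pkl_file:
        saved = pickle.load(pkl_file)
    print(k, saved["optimized"], saved["cost"])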

The optimized parameters seem to produce a good fit. We are done!


Figure 1. The green plot shows the experimental data points, while the rest are generated with RK45 from 3 different initial guesses, given by collection_of_initial_guesses = [ [0,1e-3,0.5e-3,1.2e-7], [0,1e-2,0,1e-7], [0,0,1.5e-3,1e-7] ].


Figure 2. (A, B, C) show the results of parameter estimation from 3 different initial guesses. (D) shows the results in the console. For each serial number, the optimized parameters are printed as [G, k1, k2, k3].

scipyode_header.py

import numpy as np
from scipy.integrate import RK45
import matplotlib.pyplot as plt
import csv

def RK45_wrapper(x0, time_checkpoints, f, f_args, stepsize=None):
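	# Integrate dx/dt = f(t, x, f_args) from checkpoint to checkpoint with
	# scipy's RK45, collecting every internal step (x_set, t_set) and the
	# state reached at each checkpoint (x_checkpoint, t_checkpoint_new).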
	if stepsize is None:
		stepsize = 0.01*(time_checkpoints[-1]-time_checkpoints[0])
	t_set = [time_checkpoints[0]]
	x_set = [x0]
	t_checkpoint_new = [time_checkpoints[0]]
	x_checkpoint = [x0]

	f_temp = lambda t,y : f(t,y, f_args)

	for i in range(len(time_checkpoints)-1):	
		t1 = time_checkpoints[i+1]
		if i>0:
			x0 = ox.y
			t0 = ox.t
			
		else:
			t0 = time_checkpoints[i]
		ox = RK45(f_temp,t0,x0,t1,max_step=stepsize)
		
		while ox.t < t1:
			msg = ox.step()
			t_set.append(ox.t)
			x_set.append(ox.y)
		x_checkpoint.append(ox.y)
		t_checkpoint_new.append(ox.t)
	return x_set, t_set, x_checkpoint, t_checkpoint_new

def read_csv(filename, header_skip=1, get_the_first_N_rows=0):
    # filename : string, name of csv file without extension
    # header_skip : nonnegative integer, skip the first few rows
    # get_the_first_N_rows : integer, the first few rows after header_skip to be read
    #   if all rows are to be read, set to zero

    # example
    # xx = read_csv('tryy', 3)
    # xx = [[float(x) for x in y] for y in xx]
    # print(xx)
    # print(np.sum(xx[0]))

    out = []
    with open(str(filename)+'.csv') as csv_file:
        data_reader = csv.reader(csv_file)
        count = 0
        i = 0
        for row in data_reader:
            if count < header_skip:
                count = count + 1
            else:
                out.append(row)
                if get_the_first_N_rows > 0:
                    i = i + 1
                    if i == get_the_first_N_rows:
                        break
    return out

def load_data(plot_data=False):
	print(" --+ load_data().")

	# load data
	filename = 'sample_data'
	all_data = read_csv(filename, header_skip=0)
	all_data = [[float(x) for x in y] for y in all_data]

	X=[x[0] for x in all_data]
	Y=[x[1] for x in all_data]

	# customize your data loading here
	max_index = np.argmax(Y)
	relevant_data= all_data[max_index:]
	Xr = [x[0] for x in relevant_data]
	Yr = [x[1] for x in relevant_data]

	print("     data size = ", len(Xr))
	print("     peak = ", X[max_index],",", Y[max_index])
	if plot_data:
		fig = plt.figure()
		ax=fig.add_subplot(111)
		ax.scatter(X,Y,3)
		ax.scatter(X[max_index],Y[max_index])
		ax.plot(Xr,Yr,'r')
		ax.set_xlabel("t")
		ax.set_ylabel("n")
		plt.show()

	return Xr, Yr
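As a quick sanity check of RK45_wrapper on its own, here is a toy run; the decaying ODE dn/dt = -n is an assumed example for illustration, not part of the post's data.

import numpy as np
import scipyode_header as sh

def f(t, y, f_args): # toy ODE: dn/dt = G - k1*n with G = 0, k1 = 1
    return [f_args[0] - f_args[1] * y[0]]

x_set, t_set, x_chk, t_chk = sh.RK45_wrapper(
    np.array([1.0]), np.linspace(0, 10, 11), f, [0.0, 1.0])
print(t_chk) # checkpoint times actually reached by the integrator
print([x[0] for x in x_chk]) # should decay roughly like exp(-t)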

cubicODE.py

import numpy as np
import matplotlib.pyplot as plt
import scipyode_header as sh
import pickle
from scipy.optimize import least_squares

# TOGGLES:
plot_data = 0
initial_guess_mode = 0
optimize_mode = 1
# SETTINGS:
save_name = "20190110_cubic"

# --------------- PART 1 ----------------
#
# Load data, prepare the differential equation

def f(t,n, f_args): # right-hand side of dn/dt = G - k1*n - k2*n**2 - k3*n**3
	dndt = [f_args[0] - f_args[1] * n - f_args[2] * n**2 - f_args[3] * n**3]
	return dndt
print("\n------------- LOADING DATA ----------------\n")
Xr, Yr = sh.load_data(plot_data=plot_data)

x0 = np.array([Yr[0]])
t_set_expt = [np.array([x]) for x in Xr]
n_set_expt = [np.array([x]) for x in Yr]
print("x0 = ",x0)
print("Xr[:5] = ",Xr[:5])
print("Yr[:5] = ",Yr[:5])
print("t_set_expt[:5] = ",t_set_expt[:5])
print("n_set_expt[:5] = ",n_set_expt[:5])

# --------------- PART 2 ----------------
#
# try out initial guesses

# initial guess [G, k1, k2, k3]
collection_of_initial_guesses = [ 
 [0,1e-3,0.5e-3,1.2e-7],
 [0,1e-2,0,1e-7],
 [0,0,1.5e-3,1e-7]
]

if initial_guess_mode:
	print("\n------------- INITIAL TESTING ----------------\n")
	color_scheme = np.linspace(0,1,len(collection_of_initial_guesses))
	fig = plt.figure()
	ax = fig.add_subplot(111)
	ax.scatter(Xr,Yr,3,color='g',label="expt data")
	for i in range(len(collection_of_initial_guesses)):
		f_args0 = collection_of_initial_guesses[i]
		# print(f_args0)
		this_label = "initial guess " + str(i+1)
		_, _, x_test, t_test = sh.RK45_wrapper(x0, t_set_expt, f, f_args0, stepsize=None)		
		ax.scatter(t_test,x_test,3,color=(1-color_scheme[i],0,color_scheme[i]),label=this_label)
	ax.legend()
	ax.set_xlabel("t")
	ax.set_ylabel("n")	

if optimize_mode:
	def target_function(f_args):
		# "checkpoint" here refers to the intended data points
		#   since RK45 will generate more points in between the checkpoints.
		#   What happens is just that with smaller time steps in between the checkpoints,
		#   we can compute the checkpoints more precisely.
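		#   least_squares() minimizes 0.5 * sum(f_i**2) over these residuals;
		#   each f_i below is the distance between model and data at checkpoint i.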
		_, _, x_checkpoint, t_checkpoint_new = sh.RK45_wrapper(x0, t_set_expt, f, f_args, stepsize=None)
		
		collection_of_f_i = []
		# each i is a data point in this particular example
		for x1, x2 in zip(x_checkpoint,n_set_expt):
			this_norm = np.linalg.norm(np.array(x1)-np.array(x2))
			collection_of_f_i.append(this_norm)
		return np.array(collection_of_f_i)
	print("\n------------- OPTIMIZATION STAGE ----------------\n")

	for i in range(len(collection_of_initial_guesses)):
		f_args0 = collection_of_initial_guesses[i]
		print(" initial guess = ", f_args0)
		optimized_res = least_squares(target_function,f_args0)
		f_args_optimized = optimized_res.x

		SAVE_DATA = {
			"initial_guess": f_args0,
			"optimized": f_args_optimized,
			"cost":optimized_res.cost
		}

		print(" + optimized result = ",f_args_optimized)
		print(" + cost = ",optimized_res.cost)
		output = open(save_name+"_"+ str(i+1) +".par", 'wb')
		pickle.dump(SAVE_DATA, output)
		output.close()


plt.show()

cubicODE_post_processing.py

import numpy as np
import matplotlib.pyplot as plt
import pickle
import scipyode_header as sh

# SETTINGS
load_name = "20190110_cubic"
serial_number = [1,2,3] 

print("\n------------- LOADING DATA ----------------\n")
def f(t,n, f_args): # right-hand side of dn/dt = G - k1*n - k2*n**2 - k3*n**3
	dndt = [f_args[0] - f_args[1] * n - f_args[2] * n**2 - f_args[3] * n**3]
	return dndt
Xr, Yr = sh.load_data(plot_data=0)
x0 = np.array([Yr[0]])
t_set_expt = [np.array([x]) for x in Xr]
n_set_expt = [np.array([x]) for x in Yr]
print("x0 = ",x0)
print("Xr[:5] = ",Xr[:5])
print("Yr[:5] = ",Yr[:5])
print("t_set_expt[:5] = ",t_set_expt[:5])
print("n_set_expt[:5] = ",n_set_expt[:5])

print("\n------------- PLOTTING ----------------\n")
color_scheme = np.linspace(0,1,len(serial_number))

for j in range(len(serial_number)):
	fig = plt.figure()
	ax = fig.add_subplot(111)
	ax.scatter(Xr,Yr,3,color='g',label="expt data")
	i = serial_number[j]
	this_label = "initial guess " + str(i)
	pkl_file = open(load_name+'_'+str(i)+'.par', 'rb')
	SAVE_DATA = None
	SAVE_DATA = pickle.load(pkl_file)
	pkl_file.close()
	# print(SAVE_DATA)
	f_args_initial = SAVE_DATA["initial_guess"]
	f_args_optimized = SAVE_DATA["optimized"]
	cost  = SAVE_DATA["cost"]
	_, _, x_op, t_op = sh.RK45_wrapper(x0, t_set_expt, f, f_args_optimized, stepsize=None)	
	print("Serial number:",j+1, " - Optimized parameters = " )
	print(" ",f_args_optimized)
	print(" cost = ", cost)
	title_msg = load_name +":"+ str(j+1)
	ax.plot(t_op,x_op,color=(1-color_scheme[j],0,color_scheme[j]),label=this_label)
	ax.legend()
	ax.set_title(title_msg)
	ax.set_xlabel("t")
	ax.set_ylabel("n")

plt.show()

Parameter Estimation for Differential Equations using Gradient Descent


This post is about parameter estimation based on gradient descent, specifically for the differential equation \frac{dn}{dt}=G-k_1 n-k_2 n^2-k_3 n^3; that is, we are looking for a set of parameters (G,k_1,k_2,k_3) that best fits the experimental data. We will go through choosing initial parameter guesses and running the optimization. This post is explained more fully in this note: Differential Equation Parameter Estimations. The data used here is a fraction of the sample data I was given. The code and sample data can also be found here.

Consider the post Parameter Estimation for Differential Equations using SciPy Least Squares instead, since it uses the more robust optimization routines from scipy.

Place fitting.py, fitting_post_processing.py and param_estimate.py (the code is at the end of the post) in the project folder. The required Python packages include matplotlib, scipy and kero; install them via pip.

  1. In the command line, cd into the directory where the project is.
  2. Preliminary. Prepare sample_data.csv containing data points (t,n): the first column is t (horizontal axis), the second column is n (vertical axis). Set the following toggles in fitting.py.
    plot_data True
    do_initial_testing False
    start_op False

    Run python fitting.py. This plots the sample data, shown in figure 1. In this context, the relevant data points are the data after the peak. The orange curve shows the plot of only the extracted relevant data.

  3. Initial testing stage. Set the following toggles in fitting.py.
    plot_data False
    do_initial_testing True
    start_op False

    Set collection_of_p_init. You can place as many guesses as you want. In this example,

    collection_of_p_init =[
      [0,1e-3,0.5e-3,1.2e-7], # guess 1
      [0,1e-2,0,1e-7], # guess 2
      [0,0,1.5e-3,1e-7], # guess 3
    ]

    we try 3 different initial parameters, each of the format [G,k1,k2,k3], corresponding to the 4 parameters we want to fit. Then run the command: python fitting.py. The results are shown in figure 2. The first and third guesses are closer to the experimental data; we might want to use parameters near these values for the random initialization in the optimization stage (but for demonstration's sake, we will not do that here).

  4. Optimization stage. Set the following toggles in fitting.py.
    plot_data False
    do_initial_testing False
    start_op True

    Set the preferred settings. p_init_min and p_init_max specify the range from which random initial parameters are drawn (a minimal sketch of this draw follows the list). The number of attempts is set via the variable no_of_tries, and each attempt starts with a random initial parameter from that range. Run the command: python fitting.py. You will see the optimization in progress, with decreasing mse values.

  5. Viewing the optimization result. We want to load all 3 fitting_data files we have created. Thus, in fitting_post_processing.py, set
    filename = 'fitting_data'
    save_data_series = [1,2,3]

    Run the command: python fitting_post_processing.py. Figure 3 shows how, after every iteration, the resulting curve approaches the experimental data.
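For reference, the random initialization in the optimization stage amounts to one uniform draw per parameter between p_init_min and p_init_max (variable names from fitting.py below); a minimal sketch:

import numpy as np

p_init_min = [1e-10, 0, 0, 0.9e-7] # [Gmin, k1min, k2min, k3min]
p_init_max = [1e-10, 2e-2, 2e-3, 1.2e-7] # [Gmax, k1max, k2max, k3max]
p_init = [np.random.uniform(a, b) for a, b in zip(p_init_min, p_init_max)]
print(p_init) # one random starting point [G, k1, k2, k3]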

Finally, we want to see the best parameter p. When viewing the optimization result, the parameters are printed to the console. We are more or less done!

What else to do?

In this post I have only run a few iterations over a small number of random parameter initializations. To obtain better results, run more iterations and try more random initial parameters.

That’s all, have fun~



Figure 1. A plot of n against t from the sample data is in blue. The orange curve shows only the extracted relevant data, i.e. the points after the peak.

Figure 2. The curves show 3 different plots from 3 different guesses of initial parameters. The red curve is the experimental data. The console shows the recommended setting for the learning_rate variable.


Figure 3. As the algorithm iterates, the curve goes from red to blue. The figures show that the curve approaches the experimental data and that the MSE decreases with the iteration i.

fitting.py

import numpy as np
import kero.utils.utils as ut
import param_estimate as pe
import matplotlib.pyplot as plt
from scipy.integrate import RK45
import csv
import pickle

es = pe.DE_param_estimator()

print("----------------- Load Data -------------------------------")
plot_data = 0 # TOGGLE (step 2): set to 1 to plot the raw sample data
Xr, Yr = pe.load_data(plot_data=plot_data)

x0 = np.array([Yr[0]])
t_set_expt = [np.array([x]) for x in Xr]
n_set_expt = [np.array([x]) for x in Yr]
print("x0 = ",x0)
print("Xr[:5] = ",Xr[:5])
print("Yr[:5] = ",Yr[:5])
print("t_set_expt[:5] = ",t_set_expt[:5])
print("n_set_expt[:5] = ",n_set_expt[:5])

#.#

print("----------------- Prepare Model ---------------------------") 
# MODEL used to fit this data
#
# We are estimating the parameters of the following differential equations:
#   dn/dt = F(n, p) = G - k1 * n - k2 * n**2 - k3 * n**3
#   where p is the parameter (G,k1,k2,k3)
def F(y,p):
	return p[0] - p[1] * y -p[2] * y**2 - p[3] * y**3
diff_part_G = lambda x: 1
diff_part_k1 = lambda x: -x 
diff_part_k2 = lambda x: -x**2 
diff_part_k3 = lambda x: -x**3 
list_of_derivatives = [diff_part_G, diff_part_k1, diff_part_k2, diff_part_k3]
print("  Model prepared.")
#.#

do_initial_testing = 0 # TOGGLE (step 3): try out initial guesses
if do_initial_testing:
	print("---------------- Prepare Initial guess --------------------")
	# Before optimizing, try playing around the p_init values here
	# We will choose the range of uniform random values for guess p based on tests conducted here
	# Write down choices that seem good and any remarks here 
	# 
	#   1. p_init = [0,5e-2,2e-3,1.1e-7] # near the real value. all n less than expt data
	#   2. p_init = [0,5e-2,1e-3,1.1e-7] # seems to be better than 1
	#   3. p_init = [0,5e-2,0,1.1e-7] # all n above expt data 
	#   4. p_init = [0,1e-1,1e-4,1.1e-7] # small n above expt data, larger n below
	#   5. p_init = [0,1e-2,1e-3,1.1e-7] # exceedingly close! Let's vary around these values instead
	#
	#
	#

	collection_of_p_init =[
		[0,1e-3,0.5e-3,1.2e-7], # guess 1
		[0,1e-2,0,1e-7], # guess 2
		[0,0,1.5e-3,1e-7], # guess 3
	]

	fig = plt.figure()
	ax = fig.add_subplot(111)
	no_of_test = len(collection_of_p_init)
	for i in range(no_of_test):
		p_init = collection_of_p_init[i] # guess p
		time_checkpoints = t_set_expt
		def f(t,y):
			return np.array([F(y[0],p_init)])
		_,_, x_checkpoint, t_checkpoint_new = pe.RK45_wrapper(x0, time_checkpoints, f, stepsize=None)
		ax.plot(t_checkpoint_new,x_checkpoint, label="guess "+str(i+1), color=(0,1-1*(i+1)/no_of_test,1*(i+1)/no_of_test))

		# To find what is the best learning_rate
		print(" testing initial guess parameters: ",i+1)
		es.update_p(t_set_expt, n_set_expt, F, p_init, list_of_derivatives, verbose=11)
	ax.plot(Xr,Yr, color = "r", label="expt values")
	ax.legend( )
	ax.set_xlabel("t")
	ax.set_ylabel("n")

	plt.show()
#.#


# ------------------------------------------------------------------------------- #
start_op = 1 # TOGGLE (step 4): run the optimization
if start_op:	
	print("----------------- Start Optimizing ---------------------------")
	
	# ********* Settings *************
	p_init_min = [1e-10, 0, 0, 0.9e-7] # [Gmin, k1min, k2min, k3min]
	p_init_max = [1e-10, 2e-2, 2e-3, 1.2e-7] # [Gmax, k1max, k2max, k3max]
	no_of_tries = 3
	save_name = 'fitting_data'
	es.learning_rate= 1e-16
	no_of_iterations = 10
	save_interval = 1
	# ********************************

	n_set_expt_MAT = pe.RK45_output_to_list(n_set_expt) # MATRIX list form
	
	for j in range(no_of_tries):
		p_init = []
		for a,b in zip(p_init_min, p_init_max):
			p_init.append(np.random.uniform(a,b))
		p_now= p_init
		save_count = 0
		SAVE_DATA = []
		print("TRY ", j+1," / ",no_of_tries)
		mse_list = []
		for i in range(no_of_iterations + 1):
			if i>0:
				# for i =0 , initial state, do not iterate yet
				p_now = es.update_p(t_set_expt, n_set_expt, F, p_now, list_of_derivatives, verbose=0)
			
			#------------------------ FOR REALTIME OBSERVATION -----------
			def f(t,y):
				return np.array([F(y[0],p_now)])
			time_checkpoints = t_set_expt
			_,_, n_set_next, t_set_next = pe.RK45_wrapper(x0, time_checkpoints, f, stepsize=None)
			n_set_next_MAT = pe.RK45_output_to_list(n_set_next)
			mse = pe.MSE(n_set_next_MAT,n_set_expt_MAT)
			mse_list.append(mse)
			print(" i = ", i , " -- > mse = ", mse)
			# print("   p_new = ", p_now)

			#------------------------ TO BE SAVED -------------------------
			save_count = save_count + 1
			if save_count == save_interval or i==0:
				save_count = 0
				data_store = {
				"t_set":t_set_next,
				"n_set":n_set_next,
				"p":p_now,
				"learning_rate":es.learning_rate,
				"mse_list": mse_list
				}
				SAVE_DATA.append(data_store)
		output = open(save_name+"_"+ str(j+1) +".par", 'wb')
		pickle.dump(SAVE_DATA, output)
		output.close()

print("\nClosing Program...")

fitting_post_processing.py

import param_estimate as pe
import matplotlib.pyplot as plt
import numpy as np
import time
import pickle

filename = 'fitting_data'
save_data_series = [1,2,3]

for j in save_data_series:
	pkl_file = open(filename+'_'+str(j)+'.par', 'rb')
	SAVE_DATA = None
	SAVE_DATA = pickle.load(pkl_file)
	pkl_file.close()


	fig = plt.figure(j)
	ax = None
	ax2 = None
	ax = fig.add_subplot(211)

	# ------------------- EXPT DATA ---------------------------------=
	Xr, Yr = pe.load_data(plot_data=0)
	ax.scatter(Xr,Yr,3,marker="x",c="g")

	# ------------------- DATA FROM OPTIMIZATION ---------------------
	color_scheme = np.linspace(0,1,len(SAVE_DATA))
	for i in range(len(SAVE_DATA)):
		data_store = SAVE_DATA[i]
		n_set = data_store["n_set"]
		t_set = data_store["t_set"]
		p = data_store["p"]
		mse_list = data_store["mse_list"]
		print(" i = ", i)
		print("   p = ", p )
		if i==0:
			this_label = "init curve" 
		else:
			this_label = str(i)
		ax.plot(t_set,n_set, color=(1-color_scheme[i],0,color_scheme[i]), label=this_label)
		if i ==0:
			this_title = "initial G,k1,k2,k3 = " + str(p)
	
	plt.title(this_title)
	ax2 = fig.add_subplot(212)
	ax2.plot(range(len(mse_list)),mse_list)
	ax.legend()
	ax.set_xlabel("t")
	ax.set_ylabel("n")
	ax2.set_xlabel("i")
	ax2.set_ylabel("MSE")
plt.show()	

print("\nClosing Post-Processing Program...")

param_estimate.py

# We are estimating the parameters of the following differential equations:
#   dn/dt = F(n, p) = G - k1 * n - k2 * n**2 - k3 * n**3
#   where p is the parameter (G,k1,k2,k3)

import numpy as np
import kero.utils.utils as ut
from scipy.integrate import RK45
import matplotlib.pyplot as plt
import csv


class DE_param_estimator:
	def __init__(self):
		self.learning_rate = 1e-21
		return
	# l2
	def update_p(self,t_set_expt, n_set_expt, F, p_now, list_of_derivatives,
		verbose = False):
		# t_set_expt: list of numpy array
		#   for example: [array([25.2047]), array([25.279]), array([25.3533]), array([25.4277]),...]
		# n_set_expt: list of numpy array
		#   for example: [array([789.696]), array([784.109]), array([763.528]), array([726.782]),...]
		# p_now: list of parameters
		# 
		#
		# return
		#   p_next: list of parameters
		nabla_mse = self.nabla_MSE(t_set_expt, n_set_expt, F, p_now, list_of_derivatives)
		if verbose > 10:
			temp2 = np.transpose(np.matrix(p_now))
			if verbose > 30:
				print("    nabla_mse = ")
				ut.print_numpy_matrix(nabla_mse, formatting="%s", no_of_space=10)
				print("    p_now =")
				ut.print_numpy_matrix(temp2, formatting="%s", no_of_space=10)
			# recommend a learning rate that changes each parameter by ~1%
			det_list = []
			with np.errstate(divide='ignore'):
				for k in range(len(p_now)):
					det = 1e-2 * p_now[k] / nabla_mse.item(k)
					if verbose > 20:
						print("    p_now[k]/(nabla_mse[k]*100) =", det)
					det2 = np.log10(np.abs(det))
					if (det2 != np.inf) and (det2 != -np.inf) and (not np.isnan(det2)):
						det_list.append(det2)
			if verbose > 20:
				print("      det_list = ", det_list)
			print("       recommended learning rate ~ ", 10**np.min(det_list))
		# gradient-descent step: p_next = p_now - learning_rate * grad(MSE)
		p_next = np.transpose(np.matrix(p_now)) - self.learning_rate * nabla_mse
		p_next = np.transpose(p_next).tolist()
		p_next = p_next[0]
		return p_next
	# L3	
	def dkdMSE(self, t_set_expt, n_set_expt, F, p_now, diff_part):
		# lambda function : d/dk d/dt of n
		# t_set_expt, n_set_expt : experimental data
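		# Approximates d(MSE)/dk as (1/N) * sum_i (n_model_i - n_expt_i) *
		# (running average of dF/dk along the model trajectory up to point i),
		# where dF/dk is supplied as diff_part (e.g. dF/dk1 = -n).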
		N = len(t_set_expt)
		dkmse = 0 

		def f(t,y):
			return np.array([F(y[0],p_now)])

		x0 = n_set_expt[0] # array
		time_checkpoints= t_set_expt
		x_set, t_set, x_checkpoint, t_checkpoint_new = RK45_wrapper(x0, time_checkpoints, f, stepsize=None)

		for i in range(1,len(n_set_expt)):			
			ni = x_checkpoint[i][0]
			# print(" ni =", ni)
			m = i+1
			diff_term = 0
			for j in range(m):
				diff_term = diff_term + diff_part(x_checkpoint[j][0])
			diff_term = diff_term / m 
			dkmse = dkmse + ( ni - n_set_expt[i][0])* diff_term
		dkmse = 1/N*dkmse 
		return dkmse

	def nabla_MSE(self, t_set_expt, n_set_expt, F, p_now, list_of_derivatives,
		verbose = False):
		# list_of_derivatives: list of lambda functions

		dGdMSE = self.dkdMSE(t_set_expt, n_set_expt, F, p_now, list_of_derivatives[0])
		dk1mse = self.dkdMSE(t_set_expt, n_set_expt, F, p_now, list_of_derivatives[1])
		dk2mse = self.dkdMSE(t_set_expt, n_set_expt, F, p_now, list_of_derivatives[2])
		dk3mse = self.dkdMSE(t_set_expt, n_set_expt, F, p_now, list_of_derivatives[3])
		# print("dGdMSE=", dGdMSE)
		# print("dk1mse=",dk1mse)
		# print("dk2mse=",dk2mse)
		# print("dk3mse=",dk3mse)
		nabla_mse = np.transpose(np.matrix([dGdMSE, dk1mse, dk2mse, dk3mse]))
		return nabla_mse

def load_data(plot_data=False):
	print(" --+ load_data().")

	# load data
	filename = 'sample_data'
	all_data = read_csv(filename, header_skip=0)
	all_data = [[float(x) for x in y] for y in all_data]

	X=[x[0] for x in all_data]
	Y=[x[1] for x in all_data]

	# customize your data loading here
	max_index = np.argmax(Y)
	relevant_data= all_data[max_index:]
	Xr = [x[0] for x in relevant_data]
	Yr = [x[1] for x in relevant_data]

	print("     data size = ", len(Xr))
	print("     peak = ", X[max_index],",", Y[max_index])
	if plot_data:
		fig = plt.figure()
		ax=fig.add_subplot(111)
		ax.scatter(X,Y,3)
		ax.scatter(X[max_index],Y[max_index])
		ax.plot(Xr,Yr,'r')
		ax.set_xlabel("t")
		ax.set_ylabel("n")
		plt.show()

	return Xr, Yr

def read_csv(filename, header_skip=1, get_the_first_N_rows=0):
    # filename : string, name of csv file without extension
    # header_skip : nonnegative integer, skip the first few rows
    # get_the_first_N_rows : integer, the first few rows after header_skip to be read
    #   if all rows are to be read, set to zero

    # example
    # xx = read_csv('tryy', 3)
    # xx = [[float(x) for x in y] for y in xx]
    # print(xx)
    # print(np.sum(xx[0]))

    out = []
    with open(str(filename)+'.csv') as csv_file:
        data_reader = csv.reader(csv_file)
        count = 0
        i = 0
        for row in data_reader:
            if count < header_skip:
                count = count + 1
            else:
                out.append(row)
                if get_the_first_N_rows > 0:
                    i = i + 1
                    if i == get_the_first_N_rows:
                        break
    return out

def MSE(Y_set,Y0_set):
	# Y_set is a list of Y
	# Y0_set is a list of Y0
	# Assume Y_set and Y0_set have the same size, otherwise error will be raised
	#
	# Y (numpy matrix) is a vector: computed data from model
	# Y0 (numpy matrix) is a vector: observed/true data
	# Assume Y and Y0 have the same size, otherwise error will be raised
	# Assume all entries are real
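	# Note: the sum of squared differences is divided by 2*n; the factor of 2
	# is a common convention that simplifies the gradient of the MSE.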
	
	mse = 0
	n = len(Y_set)
	if not (n == len(Y0_set)):
		print("MSE(). Error. Inconsistent dimension between Y_set and Y0_set. Return nothing.")
		return
	for i in range(n):
		Y = Y_set[i]
		Y0 = Y0_set[i]
		Nout1 = len(Y.tolist())
		Nout2 = len(Y0.tolist())
		if not (Nout1==Nout2):
			print("MSE(). Error. Inconsistent dimension between Y and Y0. Return nothing.")
			return
		# now compute the norms between true and model output values
		for j in range(Nout1):
			mse = mse + (np.abs(Y0.item(j)-Y.item(j)))**2
	mse = mse / (2*n)

	return mse

def RK45_wrapper(x0, time_checkpoints, f, stepsize=None):
	# example
	#
	#
	#
	# def f(t,y):
	# 	# dx/dt = f(x,t)
	# 	return np.array([-1*y[0]])
	# x0 = np.array([1.0]) # initial value
	# time_checkpoints = np.linspace(0,10,11)
	# x_set, t_set, x_checkpoint, t_checkpoint_new = RK45_wrapper(x0, time_checkpoints, f, stepsize=None)

	# t_checkpt_diff = []
	# for t,t_new in zip(time_checkpoints, t_checkpoint_new):
	# 	t_checkpt_diff.append(abs(t-t_new))
	# print("max checkpt diff = ", np.max(t_checkpt_diff))

	if stepsize is None:
		stepsize = 0.01*(time_checkpoints[-1]-time_checkpoints[0])
	t_set = [time_checkpoints[0]]
	x_set = [x0]
	t_checkpoint_new = [time_checkpoints[0]]
	x_checkpoint = [x0]

	for i in range(len(time_checkpoints)-1):	
		t1 = time_checkpoints[i+1]
		if i>0:
			x0 = ox.y
			t0 = ox.t
			
		else:
			t0 = time_checkpoints[i]
		ox = RK45(f,t0,x0,t1,max_step=stepsize)
		
		while ox.t < t1:
			msg = ox.step()
			t_set.append(ox.t)
			x_set.append(ox.y)
		x_checkpoint.append(ox.y)
		t_checkpoint_new.append(ox.t)
	return x_set, t_set, x_checkpoint, t_checkpoint_new

def RK45_output_to_list(xx):
	return [np.matrix((x.tolist())[0]) for x in xx]