Preface
This series is aimed at providing tools for an electrical engineer to analyze data and solve problems in design. The focus is on applying linear algebra to systems of equations or large sets of matrix data.
Introduction
This article will demonstrate the use of polynomial interpolation.
If you are not familiar with linear algebra or need a refresher, I recommend Linear Algebra and Its Applications by Lay, Lay, and McDonald. It provides an excellent review of theory and applications.
This article also assumes you are familiar with Python or can stumble your way through it.
The data and code are available here: linear3.py and linear3.xlsx.
Concepts
Interpolation: a method by which an equation is fitted to match a set of data. The equation describes the relationship between the variables in the data, which lets you evaluate points that are not in the data set and analyze the behavior of a system at operating points you did not actually measure. An interpolated function requires an exact fit (i.e. the equation must pass through all data points).
Regression Analysis: a statistical method for analyzing relationships of variables in data, including curve fitting (linear regression and least squares) and statistical analysis. Often further analysis is pursued, for example forecasting and profiling noise (as a probability distribution).
Curve Fitting: a method by which an equation is fitted to match a set of data, usually overdetermined and containing noise, with the least error possible. The equation does not have to pass through all data points. In essence, the measured data is smoothed or low-pass filtered. Common methods include linear regression and least squares.
Least Squares: a curve fitting method by which a function is found that minimizes the error between the function and the data set. The method is called least squares because it seeks to minimize the sum of the squares of the residuals, where a residual is the difference between a data point and the value of the estimated function at that point, taken over all points in the data.
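To make the least squares definition concrete, here is a minimal sketch that fits a straight line to the four data points used later in this article and computes the sum of squared residuals. The choice of a straight line and the helper name sum_sq_residuals are assumptions made for illustration only; they are not part of linear3.py.

```python
import numpy as np

# The four data points used in this article.
x = np.array([1.0, -1.0, 2.0, -3.0])
y = np.array([29.0, -35.0, 31.0, -19.0])

# Design matrix for a straight line y = m*x + b: one column of x, one of ones.
A = np.column_stack([x, np.ones_like(x)])

# Solve the overdetermined system A * [m, b] ~= y in the least squares sense.
(m, b), _, _, _ = np.linalg.lstsq(A, y, rcond=None)


def sum_sq_residuals(slope, intercept):
    """Sum of squared differences between the data and the line (hypothetical helper)."""
    residuals = y - (slope * x + intercept)
    return float(np.sum(residuals ** 2))


best = sum_sq_residuals(m, b)
nudged = sum_sq_residuals(m + 0.5, b - 0.5)  # a slightly different line

print('slope = %.3f, intercept = %.3f' % (m, b))
print('sum of squared residuals at the fit: %.3f' % best)
print('sum of squared residuals after nudging: %.3f' % nudged)
```

The fitted line does not pass through the data points, so its residuals are nonzero. Moving the slope or intercept away from the fitted values can only increase the sum of squared residuals, which is what "least squares" means.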
Importing Your Data Set
I will use the Anaconda package suite with Python, numpy and the Spyder IDE for mathematical analysis. It is cross-platform, free and open source. Importing data from Excel is also easy once you have the code snippet.
I usually have the following boilerplate code in my scripts:
import sys
import xlrd
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
np.set_printoptions(threshold=sys.maxsize)
print(sys.version)
print(__name__)
The data set in this example is four data points: (1,29), (-1,-35), (2,31), and (-3,-19).
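The code to read the data from the spreadsheet with xlrd is straightforward (note that xlrd 2.0 and later read only .xls files, so reading this .xlsx workbook requires an earlier xlrd release):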
filename='linear_3.xlsx'
print('')
print('Opening ' + filename)
# Open the workbook; on_demand=True loads worksheets only when requested
ss=xlrd.open_workbook(filename,on_demand=True)
sheet_index=0
print(' Worksheet ' + ss.sheet_names()[sheet_index])
sh=ss.sheet_by_index(sheet_index)
print(' Reading data values')
# Column 0 holds the x values, column 1 the measured y values
xa=[]
ya=[]
for row_index in range(0,sh.nrows):
    xa.append(sh.cell_value(row_index,0))
    ya.append(sh.cell_value(row_index,1))
# Convert the Python lists to numpy arrays
xa=np.array(xa)
ya=np.array(ya)
print(' Design x array size: %d' % len(xa))
print(' Design y array size: %d' % len(ya))
Here we assemble two arrays: one with the x values and one with the measured y values.
The last bit of code converts the Python lists to numpy arrays, which is necessary for the code below.
Finding the Interpolated Equation
Now that the test results and input variables have been imported, let's build the interpolating function and evaluate it at new values.
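# interp1d returns a callable interpolant, not a list of coefficients
coeff = interp1d(xa, ya, kind='cubic')
# Evaluate the interpolant at 125 evenly spaced points across the data range
xnew = np.linspace(-3, 2, num=125, endpoint=True)
ynew=coeff(xnew)
print('')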
The types of interpolation available are ‘linear’, ‘nearest’, ‘zero’, ‘slinear’, ‘quadratic’, and ‘cubic’ (where ‘slinear’, ‘quadratic’ and ‘cubic’ refer to a spline interpolation of first, second or third order). The kind argument may also be an integer specifying the order of the spline interpolator to use. The default is ‘linear’.
This is the output:
2.7.11 |Anaconda 2.5.0 (64-bit)| (default, Jan 29 2016, 14:26:21) [MSC v.1500 64 bit (AMD64)]
__main__
Opening linear_3.xlsx
Worksheet Sheet1
Reading data values
Design x array size: 4
Design y array size: 4
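Since this series is about linear algebra, it is worth seeing how the same four points define a system of linear equations. A cubic has four unknown coefficients, and each data point supplies one equation, so the coefficients can be found by solving a 4x4 Vandermonde system. The sketch below is an illustration using plain numpy and is not part of linear3.py; the variable names V and c are my own.

```python
import numpy as np

# The four data points from this article.
x = np.array([1.0, -1.0, 2.0, -3.0])
y = np.array([29.0, -35.0, 31.0, -19.0])

# Each row of V is [x**3, x**2, x, 1] for one data point.
V = np.vander(x, 4)

# Four equations, four unknowns: solve V * c = y for the coefficients.
c = np.linalg.solve(V, y)
print(np.round(c, 6))  # coefficients, highest power first

# The polynomial reproduces every data point (an exact fit).
print(np.allclose(np.polyval(c, x), y))
```

For these points the solution works out to y = -4x^3 - 2x^2 + 36x - 1, which you can check by substituting each x value. Unlike the callable returned by interp1d, this gives you the equation itself.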
Now plot the original values and the interpolated values:
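plt.axis([-10,10,-50,50])   # x from -10 to 10, y from -50 to 50
plt.plot(xa,ya,'ro')        # original data as red circles
plt.plot(xnew,ynew)         # interpolated curve
plt.show()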
Unfortunately, interp1d is not suited to trend analysis outside the min/max values of the original data array: by default it raises a ValueError when asked for a value outside that range.
If you want to forecast values beyond the original data, you are better off using a least squares curve fit.
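The difference can be sketched as follows. This example is an illustration only: the straight-line (degree 1) trend is an arbitrary choice, and the names f, trend and out_of_range_error are my own rather than anything from linear3.py.

```python
import numpy as np
from scipy.interpolate import interp1d

# The four data points from this article.
x = np.array([1.0, -1.0, 2.0, -3.0])
y = np.array([29.0, -35.0, 31.0, -19.0])

f = interp1d(x, y, kind='cubic')

# By default interp1d refuses to evaluate outside [min(x), max(x)].
try:
    f(5.0)
    out_of_range_error = False
except ValueError:
    out_of_range_error = True
print('ValueError outside the data range: %s' % out_of_range_error)

# A least squares polynomial is an ordinary function and can be evaluated
# anywhere. Degree 1 (a straight-line trend) is an assumption for this sketch.
trend = np.poly1d(np.polyfit(x, y, 1))
print('trend line at x = 5: %.2f' % trend(5.0))
```

interp1d also accepts fill_value='extrapolate' if you really do want values outside the range, but whether any extrapolated value is meaningful depends on how well the chosen model describes the system beyond the measured data.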
Next Up
In the next article we will analyze networks using adjacency matrices.