Title: | Penalized Data Sharpening for Local Polynomial Regression |
---|---|
Description: | Functions and data sets for data sharpening. Nonparametric regressions are computed subject to smoothness and other kinds of penalties. |
Authors: | W.J. Braun [aut], D. Wang [aut, cre], X.J. Hu [aut, ctb] |
Maintainer: | D. Wang <[email protected]> |
License: | Unlimited |
Version: | 2.0 |
Built: | 2025-02-27 07:18:08 UTC |
Source: | https://github.com/cran/sharpPen |
Penalized data sharpening has been proposed as a way to enforce certain constraints on a local polynomial regression estimator. In addition to a bandwidth, a coefficient of the penalty term is also required. We propose propose systematic approaches for choosing these tuning parameters, in part, by considering the optimization problem from the perspective of ridge regression.
data_sharpening(xx,yy,zz,p,h=NULL,gammaest=NULL,penalty,lambda=NULL)
data_sharpening(xx,yy,zz,p,h=NULL,gammaest=NULL,penalty,lambda=NULL)
xx |
numeric vector of x data. Missing values are not accepted. |
yy |
numeric vector of y data.
This must be same length as |
zz |
numeric vector of gridpoint z data. Missing values are not accepted. |
p |
degree of local polynomial used. |
h |
the kernel bandwidth smoothing parameter. If NULL, this value will be estimated by function dpill for Local Linear Regression, and will be estimated by function dpilc for Local Quadratic and Cubic Regression. |
gammaest |
the shape constraint parameter. Cannot be NULL for Periodic shape constraint. Can be NULL for Exponential shape constraint. |
penalty |
the type of shape constraint, can be "Exponential" and "Periodicity". |
lambda |
a coefficient of the penalty term, default is NULL. |
the sharpened response variable.
D.Wang and W.J.Braun
set.seed(1234567) gam<-4 gamest<-gam g <- function(x) 3*sin(x*(gam*pi))+5*cos(x*(gam*pi))+6*x sigma<-3 xx<-seq(0,1,length=100) yy<-g(xx)+rnorm(100,sd=sigma) zz<-xx h1<-dpilc(xx,yy) local_fit<-t(lprOperator(h=h1,xx=xx,zz=zz,p=2))%*%yy y_sharp<-data_sharpening(xx=xx,yy=yy,zz=zz,p=2,gammaest=gamest,penalty="Periodicity") sharp_fit<-t(lprOperator(h=h1,xx=xx,zz=zz,p=2))%*%y_sharp plot(c(min(xx),max(xx)),c(min(yy)-0.5,max(yy)+0.5),type="n",,xlab="x",ylab="y") legend("bottomright",legend=c("curve_local","curve_sharpen"), col=c(1,3),bty="n",pch=c("-","-")) lines(xx,local_fit) lines(xx,sharp_fit,col=3, lwd=2) points(xx,yy,col= rgb(0.8,0.2,0.2,0.2))
set.seed(1234567) gam<-4 gamest<-gam g <- function(x) 3*sin(x*(gam*pi))+5*cos(x*(gam*pi))+6*x sigma<-3 xx<-seq(0,1,length=100) yy<-g(xx)+rnorm(100,sd=sigma) zz<-xx h1<-dpilc(xx,yy) local_fit<-t(lprOperator(h=h1,xx=xx,zz=zz,p=2))%*%yy y_sharp<-data_sharpening(xx=xx,yy=yy,zz=zz,p=2,gammaest=gamest,penalty="Periodicity") sharp_fit<-t(lprOperator(h=h1,xx=xx,zz=zz,p=2))%*%y_sharp plot(c(min(xx),max(xx)),c(min(yy)-0.5,max(yy)+0.5),type="n",,xlab="x",ylab="y") legend("bottomright",legend=c("curve_local","curve_sharpen"), col=c(1,3),bty="n",pch=c("-","-")) lines(xx,local_fit) lines(xx,sharp_fit,col=3, lwd=2) points(xx,yy,col= rgb(0.8,0.2,0.2,0.2))
Construct a shape constraint matrix at a corresponding sequence of x data and sequence of gridpoint z.
derivOperator(penalty,gamma,h, xx,zz,p)
derivOperator(penalty,gamma,h, xx,zz,p)
penalty |
the type of shape constraint, can be "drv1", "drv2", "drv3", "drv4", "Exponential" and "Periodicity". |
gamma |
the shape constraint parameter |
h |
the kernel bandwidth smoothing parameter. |
xx |
numeric vector of x data. Missing values are not accepted. |
zz |
numeric vector of gridpoint z data. Missing values are not accepted. |
p |
degree of local polynomial used. |
shape constraint matrix
X.J. Hu
Use direct plug-in methodology to select the bandwidth
of a local quadratic and local cubic Gaussian kernel regression estimate,
as an extension of Wand's dpill
function.
dpilc(xx, yy, blockmax = 5, divisor = 20, trim = 0.01, proptrun = 0.05, gridsize = 401L, range.x = range(x))
dpilc(xx, yy, blockmax = 5, divisor = 20, trim = 0.01, proptrun = 0.05, gridsize = 401L, range.x = range(x))
xx |
numeric vector of x data. Missing values are not accepted. |
yy |
numeric vector of y data.
This must be same length as |
blockmax |
the maximum number of blocks of the data for construction of an initial parametric estimate. |
divisor |
the value that the sample size is divided by to determine a lower limit on the number of blocks of the data for construction of an initial parametric estimate. |
trim |
the proportion of the sample trimmed from each end in the
|
proptrun |
the proportion of the range of |
gridsize |
number of equally-spaced grid points over which the function is to be estimated. |
range.x |
vector containing the minimum and maximum values of |
This function is a local cubic (also quadratic) extension of
the dpill
function of Wand's KernSmooth package.
The kernel is the standard normal density.
Least squares octic fits over blocks of data are used to
obtain an initial estimate. As in Wand's implementation
of the Ruppert, Sheather and Wand selector,
Mallow's is used to select
the number of blocks. An option is available to
make use of a periodic penalty (with possible trend)
relating the 4th derivative of the regression function
to a constant (gamma) times the 2nd derivative. This
avoids the need to calculate the octic fits and reverts
back to the original quartic fits of
dpill
with
appropriate adjustments to the estimated functionals
needed in the direct-plug-in bandwidth calculation. This
code is similar to dpilq
but uses a 6th degree
polyomial approximation instead of an 8th degree polynomial
approximation.
the selected bandwidth.
If there are severe irregularities (i.e. outliers, sparse regions)
in the x
values then the local polynomial smooths required for the
bandwidth selection algorithm may become degenerate and the function
will crash. Outliers in the y
direction may lead to deterioration
of the quality of the selected bandwidth.
D.Wang and W.J.Braun
Ruppert, D., Sheather, S. J. and Wand, M. P. (1995). An effective bandwidth selector for local least squares regression. Journal of the American Statistical Association, 90, 1257–1270.
Wand, M. P. and Jones, M. C. (1995). Kernel Smoothing. Chapman and Hall, London.
x <- faithful$eruptions y <- faithful$waiting plot(x, y) h <- dpill(x, y) fit <- locpoly(x, y, bandwidth = h, degree=1) lines(fit) h <- dpilc(x, y) fit <- locpoly(x, y, bandwidth = h, degree=2) lines(fit, col=3, lwd=2) fit <- locpoly(x, y, bandwidth = h, degree=3) lines(fit, col=2, lwd=2)
x <- faithful$eruptions y <- faithful$waiting plot(x, y) h <- dpill(x, y) fit <- locpoly(x, y, bandwidth = h, degree=1) lines(fit) h <- dpilc(x, y) fit <- locpoly(x, y, bandwidth = h, degree=2) lines(fit, col=3, lwd=2) fit <- locpoly(x, y, bandwidth = h, degree=3) lines(fit, col=2, lwd=2)
The function uses raw or trimmed data, applies grid-based binning, and estimates local bandwidth based on the provided parameters.
dpilc_PTW(xx, yy, blockmax = 5, divisor = 20, trim = 0.01, proptrun = 0.05, gridsize = 401L, range.x = range(x), use_raw_data = FALSE)
dpilc_PTW(xx, yy, blockmax = 5, divisor = 20, trim = 0.01, proptrun = 0.05, gridsize = 401L, range.x = range(x), use_raw_data = FALSE)
xx |
A numeric vector of x-values. |
yy |
A numeric vector of y-values, corresponding to the x-values in |
blockmax |
An integer specifying the maximum number of blocks to be used in the blockwise selection. Default is 5. |
divisor |
An integer that controls the block size. Default is 20. |
trim |
A numeric value between 0 and 1 specifying the proportion of data to be trimmed from the extremes. Default is 0.01. |
proptrun |
A numeric value between 0 and 1 indicating the proportion of data to be excluded from the running procedure. Default is 0.05. |
gridsize |
An integer specifying the number of grid points to be used. Default is 401L. |
range.x |
A numeric vector of length 2 indicating the range over which the smoothing is applied. Default is the range of |
use_raw_data |
A logical value indicating whether to use the raw data (TRUE) or trimmed data (FALSE) for analysis (default is FALSE). |
This function provides a point-wise calculation of the functional theta(4,4) at each data point \(x_i\). It employs various auxiliary functions for binning the data, calculating local polynomial estimations, and performing necessary cross-validation. The function returns a point-wise estimate of the relevant quantity, which is used for localized analysis of the data distribution.
The core methodology used in this function is based on the nonparametric regression framework. For detailed information on the theoretical aspects and derivations, refer to Chapter 3 of Fan and Gijbels (1996).
A list with three elements:
x |
A numeric vector of x-values. |
y |
A numeric vector of y-values. |
h |
A numeric vector of bandwidth values computed for each corresponding x-value. |
D.Wang and W.J.Braun
Fan, J., & Gijbels, I. (1996). Local Polynomial Modelling and its Applications. Chapman and Hall/CRC.
# Example usage: x <- rnorm(100) y <- rnorm(100) # Run the pointwise estimation result <- dpilc_PTW(x, y, blockmax = 5, divisor = 20, trim = 0.01, proptrun = 0.05, gridsize = 40, range.x = range(x), use_raw_data = TRUE) # Inspect the result plot(result$x, result$h, type = "l", col = "blue", xlab = "X", ylab = "Bandwidth Estimate")
# Example usage: x <- rnorm(100) y <- rnorm(100) # Run the pointwise estimation result <- dpilc_PTW(x, y, blockmax = 5, divisor = 20, trim = 0.01, proptrun = 0.05, gridsize = 40, range.x = range(x), use_raw_data = TRUE) # Inspect the result plot(result$x, result$h, type = "l", col = "blue", xlab = "X", ylab = "Bandwidth Estimate")
Local linear regression is applied to bivariate data. The response is ‘sharpened’ or perturbed in a way to render a curve estimate that satisfies some specified shape constraints.
DR_sharpen( x, y, xgrid=NULL, M=200, h=NULL, mode=NULL, ratio_1=0.14,ratio_2=0.14,ratio_3=0.14,ratio_4=0.14, constraint_1=NULL, constraint_2=NULL, constraint_3=NULL, constraint_4=NULL, norm="l2", augmentation=FALSE, maxit = 10^5)
DR_sharpen( x, y, xgrid=NULL, M=200, h=NULL, mode=NULL, ratio_1=0.14,ratio_2=0.14,ratio_3=0.14,ratio_4=0.14, constraint_1=NULL, constraint_2=NULL, constraint_3=NULL, constraint_4=NULL, norm="l2", augmentation=FALSE, maxit = 10^5)
x |
a vector of explanatory variable observations |
y |
binary vector of responses |
xgrid |
gridpoints on x-axis where estimates are taken |
M |
number of equally-spaced gridpoints (if xgrid not specified) |
h |
bandwidth |
mode |
the location of the peak on the x-axis, valid in the unimode case |
ratio_1 |
control the first derivative shape constraint gap aroud the peak, valid in the unimode case |
ratio_2 |
control the second derivative shape constraint gap aroud the peak, valid in the unimode case |
ratio_3 |
control the third derivative shape constraint gap aroud the peak, valid in the unimode case |
ratio_4 |
control the fourth derivative shape constraint gap aroud the peak, valid in the unimode case |
constraint_1 |
a vector of the first derivative shape constraint |
constraint_2 |
a vector of the second derivative shape constraint |
constraint_3 |
a vector of the third derivative shape constraint |
constraint_4 |
a vector of the fourth derivative shape constraint |
norm |
the smallest possible distance type: "l2", "l1" or "linf". Default is "l2" |
augmentation |
data augmentation: "TRUE" or "FALSE", default is "FALSE" |
maxit |
maximum iterarion number, default is 10^5 |
Data are perturbed the smallest possible L2 or L1 or Linf distance subject to the constraint that the local linear estimate satisfies some specified shape constraints.
ysharp |
sharpened responses |
iteration |
number of iterations the function has been spend for the convergence |
D.Wang and W.J.Braun
Wang, D. (2022). Penalized and constrained data sharpening methods for kernel regression (Doctoral dissertation, University of British Columbia).
set.seed(1234567) gam<-4 g <- function(x) (3*sin(x*(gam*pi))+5*cos(x*(gam*pi))+6*x)*x n<-100 M<-200 noise <- 1 x<-sort(runif(n,0,1)) y<-g(x)+rnorm(n,sd=noise) z<- seq(min(x)+1/M, max(x)-1/M, length=M) ############xgrid points h1<-dpill(x,y) A<-lprOperator(h=h1,x=x,z=z,p=1) local_fit<-t(A) ss_1<-c(sign(numericalDerivative(z,g,k=1))) DR_sharpen(x=x, y=y, xgrid=z, h=h1, constraint_1=ss_1, norm="linf",maxit =10^3)
set.seed(1234567) gam<-4 g <- function(x) (3*sin(x*(gam*pi))+5*cos(x*(gam*pi))+6*x)*x n<-100 M<-200 noise <- 1 x<-sort(runif(n,0,1)) y<-g(x)+rnorm(n,sd=noise) z<- seq(min(x)+1/M, max(x)-1/M, length=M) ############xgrid points h1<-dpill(x,y) A<-lprOperator(h=h1,x=x,z=z,p=1) local_fit<-t(A) ss_1<-c(sign(numericalDerivative(z,g,k=1))) DR_sharpen(x=x, y=y, xgrid=z, h=h1, constraint_1=ss_1, norm="linf",maxit =10^3)
Construct a matrix based on the local polynomial estimation at a corresponding sequence of x data and sequence of gridpoint z.
lprOperator(h,xx,zz,p)
lprOperator(h,xx,zz,p)
h |
the kernel bandwidth smoothing parameter. |
xx |
numeric vector of x data. Missing values are not accepted. |
zz |
numeric vector of gridpoint z data. Missing values are not accepted. |
p |
degree of local polynomial used. |
local polynomial estimator matrix
X.J. Hu
Time Series of noon temperature observations from the Winnipeg International Airport from January 1, 1960 through December 31, 1980.
data(noontemp)
data(noontemp)
A single vector.
Cubic spline interpolation of columns of a matrix for purpose of computing numerical derivatives at a corresponding sequence of gridpoints.
numericalDerivative(x, g, k, delta=.001)
numericalDerivative(x, g, k, delta=.001)
x |
numeric vector |
g |
numeric-valued function of x |
k |
number of derivatives to be computed |
delta |
denominator of Newton quotient approximation |
numeric vector of kth derivative of g(x)
W.J. Braun
Compute the projection operator for rectangle or nonnegative space. For example, we construct
,
where can be rectangle or nonnegative space.
projection_C( lambda,family=c("rectangle","nonnegative"), input,bound=c(-1,1))
projection_C( lambda,family=c("rectangle","nonnegative"), input,bound=c(-1,1))
lambda |
parameter |
family |
type of |
input |
input x in the above equation |
bound |
lower bound and upper bound for rectangle |
Take as input,
as parameter.
Calculate
for a given
family
projection |
|
D.Wang and W.J.Braun
set.seed(1234567) family <- "nonnegative" temp_p1<-runif(10,-1,1) projection_C(0.5,family=family,temp_p1)
set.seed(1234567) family <- "nonnegative" temp_p1<-runif(10,-1,1) projection_C(0.5,family=family,temp_p1)
Compute the projection operator for norm balls. For example, we construct
,
where can be
-norm,
-norm, and
-norm.
projection_nb( lambda,radius,family=c("norm2","norm1","norminf"), input)
projection_nb( lambda,radius,family=c("norm2","norm1","norminf"), input)
lambda |
parameter |
radius |
parameter |
family |
select the norm ball type, can be |
input |
input x in the above equation |
Take as input,
and
as parameters.
Calculate
for a given norm ball type.
projection |
|
D.Wang and W.J.Braun
set.seed(1234567) family <- "norm1" temp_p1<-rep(10,100) projection_nb(3,1,family=family,temp_p1)
set.seed(1234567) family <- "norm1" temp_p1<-rep(10,100) projection_nb(3,1,family=family,temp_p1)
This is a function to shrink responses towards their estimations of local polynomial regression with large bandwidth as a form of data sharpening to remove roughness, prior to use in local polynomial regression.
relsharp_bigh(x, y, alpha, bigh)
relsharp_bigh(x, y, alpha, bigh)
x |
numeric vector of equally spaced x data. Missing values are not accepted. |
y |
vector of y data. Missing values are not accepted. |
alpha |
the elasticnet mixing parameter vector, with alpha in [0,1]. |
bigh |
the kernel bandwidth smoothing parameter. |
Note that the predictor values are assumed to be equally spaced.
numeric matrix of sharpened responses, with each column corresponding to different values of alpha
D.Wang
x<-seq(0,10,length=100) g <- function(x) sin(x) y<-g(x)+rnorm(100) ys<-relsharp_bigh(x, y,alpha=c(0.2,0.8), dpill(x,y)*4) y.lp2<-locpoly(x,ys[,1],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp8<-locpoly(x,ys[,2],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp<-locpoly(x,y,bandwidth=dpill(x,y),degree=1,gridsize=100) curve(g,x,xlim=c(0,10)) lines(y.lp2,col=2) lines(y.lp8,col=3) lines(y.lp,col=5) norm(as.matrix(g(x) - y.lp2$y),type="2") norm(as.matrix(g(x) - y.lp8$y),type="2") norm(as.matrix(g(x) - y.lp$y),type="2")
x<-seq(0,10,length=100) g <- function(x) sin(x) y<-g(x)+rnorm(100) ys<-relsharp_bigh(x, y,alpha=c(0.2,0.8), dpill(x,y)*4) y.lp2<-locpoly(x,ys[,1],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp8<-locpoly(x,ys[,2],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp<-locpoly(x,y,bandwidth=dpill(x,y),degree=1,gridsize=100) curve(g,x,xlim=c(0,10)) lines(y.lp2,col=2) lines(y.lp8,col=3) lines(y.lp,col=5) norm(as.matrix(g(x) - y.lp2$y),type="2") norm(as.matrix(g(x) - y.lp8$y),type="2") norm(as.matrix(g(x) - y.lp$y),type="2")
This is a function to shrink responses towards their estimations of local polynomial regression with large bandwidth and then apply residual sharpening as a form of data sharpening to remove roughness, prior to use in local polynomial regression.
relsharp_bigh_c(x, y, alpha, bigh, hband)
relsharp_bigh_c(x, y, alpha, bigh, hband)
x |
numeric vector of equally spaced x data. Missing values are not accepted. |
y |
vector of y data. Missing values are not accepted. |
alpha |
the elasticnet mixing parameter vector, with alpha in [0,1]. |
bigh |
the kernel bandwidth smoothing parameter. |
hband |
the kernel bandwidth smoothing parameter, which will be used in the residual sharpening method. |
Note that the predictor values are assumed to be equally spaced.
numeric matrix of sharpened responses, with each column corresponding to different values of alpha
D.Wang
x<-seq(0,10,length=100) g <- function(x) sin(x) y<-g(x)+rnorm(100) ys<-relsharp_bigh_c(x, y,alpha=c(0.2,0.8), dpill(x,y)*4, dpill(x,y)) y.lp2<-locpoly(x,ys[,1],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp8<-locpoly(x,ys[,2],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp<-locpoly(x,y,bandwidth=dpill(x,y),degree=1,gridsize=100) curve(g,x,xlim=c(0,10)) lines(y.lp2,col=2) lines(y.lp8,col=3) lines(y.lp,col=5) norm(as.matrix(g(x) - y.lp2$y),type="2") norm(as.matrix(g(x) - y.lp8$y),type="2") norm(as.matrix(g(x) - y.lp$y),type="2")
x<-seq(0,10,length=100) g <- function(x) sin(x) y<-g(x)+rnorm(100) ys<-relsharp_bigh_c(x, y,alpha=c(0.2,0.8), dpill(x,y)*4, dpill(x,y)) y.lp2<-locpoly(x,ys[,1],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp8<-locpoly(x,ys[,2],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp<-locpoly(x,y,bandwidth=dpill(x,y),degree=1,gridsize=100) curve(g,x,xlim=c(0,10)) lines(y.lp2,col=2) lines(y.lp8,col=3) lines(y.lp,col=5) norm(as.matrix(g(x) - y.lp2$y),type="2") norm(as.matrix(g(x) - y.lp8$y),type="2") norm(as.matrix(g(x) - y.lp$y),type="2")
This is a function to shrink responses towards their estimations of linear regression as a form of data sharpening to remove roughness, prior to use in local polynomial regression.
relsharp_linear(x, y, alpha)
relsharp_linear(x, y, alpha)
x |
numeric vector of equally spaced x data. Missing values are not accepted. |
y |
vector of y data. Missing values are not accepted. |
alpha |
the elasticnet mixing parameter vector, with alpha in [0,1]. |
Note that the predictor values are assumed to be equally spaced.
numeric matrix of sharpened responses, with each column corresponding to different values of alpha
D.Wang
x<-seq(0,10,length=100) g <- function(x) sin(x) y<-g(x)+rnorm(100) ys<-relsharp_linear(x, y,alpha=c(0.2,0.8)) y.lp2<-locpoly(x,ys[,1],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp8<-locpoly(x,ys[,2],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp<-locpoly(x,y,bandwidth=dpill(x,y),degree=1,gridsize=100) curve(g,x,xlim=c(0,10)) lines(y.lp2,col=2) lines(y.lp8,col=3) lines(y.lp,col=5) norm(as.matrix(g(x) - y.lp2$y),type="2") norm(as.matrix(g(x) - y.lp8$y),type="2") norm(as.matrix(g(x) - y.lp$y),type="2")
x<-seq(0,10,length=100) g <- function(x) sin(x) y<-g(x)+rnorm(100) ys<-relsharp_linear(x, y,alpha=c(0.2,0.8)) y.lp2<-locpoly(x,ys[,1],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp8<-locpoly(x,ys[,2],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp<-locpoly(x,y,bandwidth=dpill(x,y),degree=1,gridsize=100) curve(g,x,xlim=c(0,10)) lines(y.lp2,col=2) lines(y.lp8,col=3) lines(y.lp,col=5) norm(as.matrix(g(x) - y.lp2$y),type="2") norm(as.matrix(g(x) - y.lp8$y),type="2") norm(as.matrix(g(x) - y.lp$y),type="2")
This is a function to shrink responses towards their estimations of linear regression and then apply residual sharpening as a form of data sharpening to remove roughness, prior to use in local polynomial regression.
relsharp_linear_c(x, y, alpha, hband)
relsharp_linear_c(x, y, alpha, hband)
x |
numeric vector of equally spaced x data. Missing values are not accepted. |
y |
vector of y data. Missing values are not accepted. |
alpha |
the elasticnet mixing parameter vector, with alpha in [0,1]. |
hband |
the kernel bandwidth smoothing parameter, which will be used in the residual sharpening method. |
Note that the predictor values are assumed to be equally spaced.
numeric matrix of sharpened responses, with each column corresponding to different values of alpha
D.Wang
x<-seq(0,10,length=100) g <- function(x) sin(x) y<-g(x)+rnorm(100) ys<-relsharp_linear_c(x, y,alpha=c(0.2,0.8),dpill(x,y)) y.lp2<-locpoly(x,ys[,1],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp8<-locpoly(x,ys[,2],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp<-locpoly(x,y,bandwidth=dpill(x,y),degree=1,gridsize=100) curve(g,x,xlim=c(0,10)) lines(y.lp2,col=2) lines(y.lp8,col=3) lines(y.lp,col=5) norm(as.matrix(g(x) - y.lp2$y),type="2") norm(as.matrix(g(x) - y.lp8$y),type="2") norm(as.matrix(g(x) - y.lp$y),type="2")
x<-seq(0,10,length=100) g <- function(x) sin(x) y<-g(x)+rnorm(100) ys<-relsharp_linear_c(x, y,alpha=c(0.2,0.8),dpill(x,y)) y.lp2<-locpoly(x,ys[,1],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp8<-locpoly(x,ys[,2],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp<-locpoly(x,y,bandwidth=dpill(x,y),degree=1,gridsize=100) curve(g,x,xlim=c(0,10)) lines(y.lp2,col=2) lines(y.lp8,col=3) lines(y.lp,col=5) norm(as.matrix(g(x) - y.lp2$y),type="2") norm(as.matrix(g(x) - y.lp8$y),type="2") norm(as.matrix(g(x) - y.lp$y),type="2")
This is a function to shrink responses towards their mean as a form of data sharpening to remove roughness, prior to use in local polynomial regression.
relsharp_mean(y, alpha)
relsharp_mean(y, alpha)
y |
vector of y data. Missing values are not accepted. |
alpha |
The elasticnet mixing parameter vector, with alpha in [0,1]. |
Note that the predictor values are assumed to be equally spaced.
numeric matrix of sharpened responses, with each column corresponding to different values of alpha
D.Wang
x<-seq(0,10,length=100) g <- function(x) sin(x) y<-g(x)+rnorm(100) ys<-relsharp_mean(y,alpha=c(0.2,0.8)) y.lp2<-locpoly(x,ys[,1],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp8<-locpoly(x,ys[,2],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp<-locpoly(x,y,bandwidth=dpill(x,y),degree=1,gridsize=100) curve(g,x,xlim=c(0,10)) lines(y.lp2,col=2) lines(y.lp8,col=3) lines(y.lp,col=5) norm(as.matrix(g(x) - y.lp2$y),type="2") norm(as.matrix(g(x) - y.lp8$y),type="2") norm(as.matrix(g(x) - y.lp$y),type="2")
x<-seq(0,10,length=100) g <- function(x) sin(x) y<-g(x)+rnorm(100) ys<-relsharp_mean(y,alpha=c(0.2,0.8)) y.lp2<-locpoly(x,ys[,1],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp8<-locpoly(x,ys[,2],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp<-locpoly(x,y,bandwidth=dpill(x,y),degree=1,gridsize=100) curve(g,x,xlim=c(0,10)) lines(y.lp2,col=2) lines(y.lp8,col=3) lines(y.lp,col=5) norm(as.matrix(g(x) - y.lp2$y),type="2") norm(as.matrix(g(x) - y.lp8$y),type="2") norm(as.matrix(g(x) - y.lp$y),type="2")
This is a function to shrink responses towards their mean and then apply residual sharpening as a form of data sharpening to remove roughness, prior to use in local polynomial regression.
relsharp_mean_c(x, y, alpha, hband)
relsharp_mean_c(x, y, alpha, hband)
x |
numeric vector of equally spaced x data. Missing values are not accepted. |
y |
vector of y data. Missing values are not accepted. |
alpha |
the elasticnet mixing parameter vector, with alpha in [0,1]. |
hband |
the kernel bandwidth smoothing parameter, which will be used in the residual sharpening method. |
Note that the predictor values are assumed to be equally spaced.
numeric matrix of sharpened responses, with each column corresponding to different values of alpha
D.Wang
x<-seq(0,10,length=100) g <- function(x) sin(x) y<-g(x)+rnorm(100) ys<-relsharp_mean_c(x, y,alpha=c(0.2,0.8), dpill(x,y)) y.lp2<-locpoly(x,ys[,1],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp8<-locpoly(x,ys[,2],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp<-locpoly(x,y,bandwidth=dpill(x,y),degree=1,gridsize=100) curve(g,x,xlim=c(0,10)) lines(y.lp2,col=2) lines(y.lp8,col=3) lines(y.lp,col=5) norm(as.matrix(g(x) - y.lp2$y),type="2") norm(as.matrix(g(x) - y.lp8$y),type="2") norm(as.matrix(g(x) - y.lp$y),type="2")
x<-seq(0,10,length=100) g <- function(x) sin(x) y<-g(x)+rnorm(100) ys<-relsharp_mean_c(x, y,alpha=c(0.2,0.8), dpill(x,y)) y.lp2<-locpoly(x,ys[,1],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp8<-locpoly(x,ys[,2],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp<-locpoly(x,y,bandwidth=dpill(x,y),degree=1,gridsize=100) curve(g,x,xlim=c(0,10)) lines(y.lp2,col=2) lines(y.lp8,col=3) lines(y.lp,col=5) norm(as.matrix(g(x) - y.lp2$y),type="2") norm(as.matrix(g(x) - y.lp8$y),type="2") norm(as.matrix(g(x) - y.lp$y),type="2")
This is a data sharpening function to remove roughness, prior to use in local polynomial regression.
relsharpen(x, y, h, alpha, p=2, M=51)
relsharpen(x, y, h, alpha, p=2, M=51)
x |
numeric vector of equally spaced x data. Missing values are not accepted. |
y |
vector of y data. Missing values are not accepted. |
h |
the kernel bandwidth smoothing parameter. |
alpha |
the elasticnet mixing parameter vector, with alpha in [0,1]. |
p |
the order of the polynomial regression. |
M |
the length of the constraint points. |
Note that the predictor values are assumed to be equally spaced.
numeric matrix of sharpened responses, with each column corresponding to different values of alpha
D.Wang
x<-seq(0,10,length=100) g <- function(x) sin(x) y<-g(x)+rnorm(100) ys<-relsharpen(x, y, dpill(x,y), alpha=c(0.2,0.8), p=2, M=51) y.lp2<-locpoly(x,ys[,1],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp8<-locpoly(x,ys[,2],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp<-locpoly(x,y,bandwidth=dpill(x,y),degree=1,gridsize=100) curve(g,x,xlim=c(0,10)) lines(y.lp2,col=2) lines(y.lp8,col=3) lines(y.lp,col=5) norm(as.matrix(g(x) - y.lp2$y),type="2") norm(as.matrix(g(x) - y.lp8$y),type="2") norm(as.matrix(g(x) - y.lp$y),type="2")
x<-seq(0,10,length=100) g <- function(x) sin(x) y<-g(x)+rnorm(100) ys<-relsharpen(x, y, dpill(x,y), alpha=c(0.2,0.8), p=2, M=51) y.lp2<-locpoly(x,ys[,1],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp8<-locpoly(x,ys[,2],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp<-locpoly(x,y,bandwidth=dpill(x,y),degree=1,gridsize=100) curve(g,x,xlim=c(0,10)) lines(y.lp2,col=2) lines(y.lp8,col=3) lines(y.lp,col=5) norm(as.matrix(g(x) - y.lp2$y),type="2") norm(as.matrix(g(x) - y.lp8$y),type="2") norm(as.matrix(g(x) - y.lp$y),type="2")
This is a function to shrink responses towards their mean/estimations of local polynomial regression with large bandwidth/estimations of linear regression as a form of data sharpening to remove roughness, and reduce the bias (when "combine=TRUE"), prior to use in local polynomial regression.
RELsharpening(x,y,alpha,type,bigh,hband,combine)
RELsharpening(x,y,alpha,type,bigh,hband,combine)
x |
numeric vector of equally spaced x data. Missing values are not accepted. |
y |
vector of y data. Missing values are not accepted. |
alpha |
the elasticnet mixing parameter vector, with alpha in [0,1]. |
type |
The type of the base line. In total, we have three types: "mean", "big_h", and "linear". |
bigh |
the kernel bandwidth smoothing parameter. |
hband |
the kernel bandwidth smoothing parameter, which will be used in the residual sharpening method. |
combine |
Should the smoother combined with residual method or not, default=FALSE. |
Note that the predictor values are assumed to be equally spaced.
numeric matrix of sharpened responses, with each column corresponding to different values of alpha
D.Wang
x<-seq(0,10,length=100) g <- function(x) sin(x) y<-g(x)+rnorm(100) ys<-RELsharpening(x, y,alpha=c(0.2,0.8),"big_h", dpill(x,y)*4, dpill(x,y),combine=TRUE) y.lp2<-locpoly(x,ys[,1],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp8<-locpoly(x,ys[,2],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp<-locpoly(x,y,bandwidth=dpill(x,y),degree=1,gridsize=100) curve(g,x,xlim=c(0,10)) lines(y.lp2,col=2) lines(y.lp8,col=3) lines(y.lp,col=5) norm(as.matrix(g(x) - y.lp2$y),type="2") norm(as.matrix(g(x) - y.lp8$y),type="2") norm(as.matrix(g(x) - y.lp$y),type="2")
x<-seq(0,10,length=100) g <- function(x) sin(x) y<-g(x)+rnorm(100) ys<-RELsharpening(x, y,alpha=c(0.2,0.8),"big_h", dpill(x,y)*4, dpill(x,y),combine=TRUE) y.lp2<-locpoly(x,ys[,1],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp8<-locpoly(x,ys[,2],bandwidth=dpill(x,y),degree=1,gridsize=100) y.lp<-locpoly(x,y,bandwidth=dpill(x,y),degree=1,gridsize=100) curve(g,x,xlim=c(0,10)) lines(y.lp2,col=2) lines(y.lp8,col=3) lines(y.lp,col=5) norm(as.matrix(g(x) - y.lp2$y),type="2") norm(as.matrix(g(x) - y.lp8$y),type="2") norm(as.matrix(g(x) - y.lp$y),type="2")
Functions that can be used in simulations to test the effectiveness of the sharpening procedures.
testfun(x, k)
testfun(x, k)
x |
numeric vector |
k |
a numeric constant that controls the height of the peak of the test function; if missing, a periodic function is supplied |
numeric vector of function output
D.Wang