A variable selection approach based on the Delta Test for Extreme Learning Machine models
Fernando Mateo (1) and Amaury Lendasse (2)

(1) Universidad Politécnica de Valencia - Dept. Ingeniería Electrónica
Camino de Vera s/n, 46022 Valencia - Spain
(2) Helsinki University of Technology - Adaptive Informatics Research Centre
Konemiehentie 2, 02150 Espoo - Finland

Abstract. The Extreme Learning Machine (ELM) is a recently proposed learning algorithm for single-layer feedforward neural networks (SLFNs), and it has been shown to offer an excellent compromise between learning speed and estimation accuracy. In this paper, a methodology for function approximation based on the Optimally Pruned ELM (OP-ELM), enhanced with variable selection using the Delta Test, is introduced. The least angle regression (LARS) algorithm is used after variable selection to rank the input variables, and scaling is also introduced as a way to estimate the influence of each input on the output value. The performance is assessed on a dataset of anthropometric measurements for children's weight prediction. The accurate results show that this combination of techniques is very promising for solving real-world problems and represents a good alternative to classic backpropagation methods.

1 Introduction
In many real-life problems it is convenient to reduce the number of involved features (variables) in order to reduce the complexity, especially when the number of features is large compared to the number of observations. There are several criteria for tackling this variable reduction problem. Three of the most common are: maximization of the mutual information (MI) between the inputs and the outputs, minimization of the k-nearest neighbors (k-NN) leave-one-out generalization error estimate, and minimization of a nonparametric noise estimator (NNE).
The Extreme Learning Machine (ELM) [1] is a learning technique for training single-layer feedforward neural networks (SLFNs) that chooses the input weights randomly and determines the output weights analytically. The algorithm is designed to build models that provide the best possible generalization in the shortest time. Given its success, it has already been applied to several fields of machine learning, such as text classification [2] and time series prediction [3].
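To make the "random input weights, analytic output weights" idea concrete, the following is a minimal Python sketch of ELM training. It is an illustration rather than the paper's implementation; the sigmoid activation, the number of hidden neurons, and the names elm_train and elm_predict are assumptions.

```python
import numpy as np

def elm_train(X, y, n_hidden=50, seed=None):
    """Minimal ELM sketch: random hidden layer, analytic output weights."""
    rng = np.random.default_rng(seed)
    # Input weights and biases are drawn at random and never trained.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    # Hidden-layer activations with a sigmoid nonlinearity.
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    # Output weights solved analytically via the Moore-Penrose pseudoinverse.
    beta = np.linalg.pinv(H) @ y
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```

Because only beta is fitted, and via a single pseudoinverse, training reduces to one matrix computation, which is what gives the ELM its speed advantage over iterative backpropagation.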
This work makes use of the methodology described in [4], which proposes a combination of the Extreme Learning Machine with optimal pruning (OP-ELM) and variable selection. In this case, we focus on the use of an NNE as the selection criterion, specifically by using the Delta Test (DT) as the estimator. The applicability of the method is assessed on a dataset of children's anthropometric measurements.

This paper is structured as follows: Section 2 explains the variable selection methodology using the Delta Test as a criterion and how it is integrated into the forward-backward search (FBS) algorithm. Section 3 describes the LARS methodology for input ranking, and Section 4 gives a mathematical perspective on the Extreme Learning Machine method. In Section 5 the experiments are described and some relevant results are presented, while Section 6 summarizes the conclusions of this work.

2 Variable selection
2.1 The Delta Test
The Delta Test, first introduced by Pi and Peterson for time series [5], is a technique for estimating the variance of the noise, that is, the mean squared error (MSE) that can be achieved without overfitting. Given N input-output pairs (x_i, y_i) ∈ R^M × R, the relationship between x_i and y_i can be expressed as

y_i = f(x_i) + r_i,   i = 1, ..., N,

where f is the unknown function and r is the noise. The DT estimates the variance of the noise r.
The DT is useful for evaluating the nonlinear correlation between two random variables, namely, input and output pairs. The DT can also be applied to input selection: the set of inputs that minimizes the DT is the one that is selected. Indeed, according to the DT, the selected set of inputs is the one that represents the relationship between inputs and output in the most deterministic way. The DT is based on hypotheses derived from the continuity of the regression function: if two points x and x′ are close in the input space, continuity implies that the outputs f(x) and f(x′) are also close in the output space. Conversely, if the corresponding output values are not close in the output space, this must be due to the influence of the noise.
The DT can be interpreted as a particularization of the Gamma Test [6] considering only the first nearest neighbor. Let us denote the first nearest neighbor of a point x_i in the R^M space by x_{NN(i)}. The nearest-neighbor formulation of the DT estimates Var[r] by

Var[r] ≈ δ = (1 / 2N) Σ_{i=1}^{N} (y_i − y_{NN(i)})²,   with Var[δ] → 0 for N → ∞,

where y_{NN(i)} is the output associated with x_{NN(i)}.
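As a concrete reading of this formula, here is a small Python sketch of the estimator, using a k-d tree to find each point's first nearest neighbor. The function name delta_test and the use of scipy are assumptions for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree

def delta_test(X, y):
    """Estimate the noise variance Var[r] as
    delta = 1/(2N) * sum_i (y_i - y_NN(i))^2."""
    tree = cKDTree(X)
    # Query k=2 because the closest point to x_i is x_i itself (index 0).
    _, idx = tree.query(X, k=2)
    nn = idx[:, 1]  # index of the first nearest neighbor of each x_i
    return np.mean((y - y[nn]) ** 2) / 2.0
```

For input selection, this estimator is evaluated on candidate subsets of the columns of X, and the subset yielding the lowest value is retained, as described above.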

2.2 Forward-backward search methodology
To overcome the difficulties and the high computational cost that an exhaustive search would entail (i.e., 2^N − 1 input combinations, where N is the number of variables), several other search strategies exist. These strategies are suboptimal because they do not test every input combination, and they are clearly affected
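As an illustration of how such a strategy can be driven by the Delta Test, the following Python sketch toggles one variable at a time (adding it if absent, removing it if present) and keeps the single change that lowers the DT the most, stopping when no toggle improves it. It reuses the delta_test helper sketched in Section 2.1; the initialization from the empty set and all names are assumptions, a sketch of the general FBS idea rather than necessarily the paper's exact procedure.

```python
import numpy as np

def forward_backward_search(X, y, criterion=delta_test):
    """Greedy FBS sketch: at each step, try toggling every variable and
    keep the single add/remove that most reduces the criterion."""
    n_vars = X.shape[1]
    selected, best = set(), np.inf
    improved = True
    while improved:
        improved = False
        for j in range(n_vars):
            candidate = selected ^ {j}  # add j if absent, drop it if present
            if not candidate:
                continue                # skip the empty variable set
            score = criterion(X[:, sorted(candidate)], y)
            if score < best:
                best, best_set = score, candidate
                improved = True
        if improved:
            selected = best_set
    return sorted(selected), best
```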
