Model-free estimation of a psychometric function

modelfree v1.2

Home	Downloads	Demonstration	Documentation	Examples	Functions	Contacts

Overview of methodology

Parametric fitting

The psychometric function underlying a given set of data is traditionally fitted by assuming that it has a standard shape, e.g. a Gaussian or Weibull function, and adjusting its parameters for best fit. The correct model function, however, is rarely known, and as a consequence, the fit may be biased, leading to incorrect inferences.

In parametric fitting, a psychometric function P is estimated by a generalized linear model with binomial distribution. The model for the psychometric function consists of a link function l (ensuring the response does not take impossible values) and a linear function of the stimulus levels x. The link function transforms the psychometric function to a linear function η; that is l(P(x)) = η = a₀ + a₁ x. Equivalently, P(x) = l^–1(a₀ + a₁ x). The link l is specified a priori, and it is defined by the chosen model function and the specified guessing and lapsing rates. For example, if the chosen model function is Gaussian and both guessing and lapsing rates are assumed to be zero, then the link function is simply the inverse of the Gaussian cumulative distribution function (see McCullagh and Nelder (1989), and Żychaluk and Foster (2009) Download PDF). The parameters a₀ and a₁ are estimated by maximizing the log-likelihood,

l(a₀, a₁) = ∑_i[k_i ln(P(x_i)) + (m_i – k_i) ln(1 – P(x_i))].
The fit is strongly dependent on the chosen link function. The correct link function, however, is rarely known, and as a consequence, the ﬁt may be biased, leading to incorrect inferences.

Local linear fitting

To avoid these problems, the psychometric function can be modelled locally. The link function is still needed to ensure that the psychometric function does not take prohibited values but the function after transformation is no longer assumed to be linear. The only assumption is that it is smooth. By a Taylor expansion, any smooth function can be approximated locally by a linear function

l(P(x_i)) = η ≈ a₀ + a₁ (x_i – x),

where the parameters α₀, α₁ depend on x and their values are estimated by maximizing local log-likelihood,

l(x) = ∑_i[k_i ln(P(x_i)) + (m_i – k_i) ln(1 – P(x_i))] w(x,x_i).

The weights w(x,x_i) determine the influence of each point x_i on the estimate at a point x: the further the points x and x_i are apart, the smaller the value of w(x,x_i).

The weight funciton is defined as w(x,x_i) = K((x – x_i)/h)/h for a kernel K and a bandwidth h. The kernel is a function usually assumed to be positive and symmetric about zero, e.g. a Gaussian function. It is known that the influence of the weight function w depends less on the shape of the kernel K than on the bandwidth h. The choice is crucial, as h controls the domain of influence of w and hence the smoothness of the local linear estimate. For small h, the estimate follows the data very closely and is therefore susceptible to the random variations in the data. For large h, the estimate is very smooth, close to that obtained by a parametric method, and therefore potentially biased.

To decide what is an optimal bandwidth, first the appropriate loss function needs to be chosen. This is a function which measures discrepancy between the true psychometric function and the estimate. The most commonly used measures are deviance (McCullagh and Nelder (1989), Żychaluk and Foster (2009) Download PDF) and integrated squared error (ISE),

ISE(ĝ) = ∫ (g(x) – ĝ(x))² dx,

where g is the function of interest and ĝ the estimator of this function. Here, the ISE was considered both for the psychometric function P and for the function η. Thus there are three possible loss functions: the deviance, ISE for the psychometric function P, and ISE for the function η. There are also three main methods of bandwidth choice available in this package: the plug-in, cross-validation, and bootstrap. The plug-in bandwidth estimates the ISE-optimal bandwidth for the function η. For cross-validation and bootstrap, there are three possible loss functions as just described.

Finally, once the data have been fitted, there are several ways to assess the goodness of fit, such as the deviance or Pearson's X². With psychometric functions, the responses are binomially distributed, but if they were normally distributed, then the deviance would coincide with the residual sum of squares.

Reference: McCullagh, P. and Nelder, J. A. Generalized linear models, 1989, London: Chapman & Hall.

A how-to guide to finding nonparametric estimates of psychometric functions with this package (MatLab option) follows.

Loading data from a text file

The data in a text file should be in three columns, with no header: first column for stimulus levels, second for number of successes, and third for number of trials. Each line in the file should correspond to a stimulus level and the values should be separated by spaces. An example follows

0.1 0 3 0.2 0 3 0.3 0 5 0.4 2 13 0.5 2 10 0.6 2 20 0.7 2 14 0.9 3 11 1.1 4 5 1.3 18 20

To load the data in MatLab write

[ x, r, m ] = textread( '<path\filename>' );

Save and Load data in MatLab format

To save the loaded data in MatLab format, type

save('<path\filename>', 'x', 'r', 'm');

The filename should have extension ".mat" or no extension. Once created it can be loaded by typing

load('<path\filename>');

Local linear fitting

Local linear fitting is performed by the function locglmfit. The arguments are xfit – points in which the function is to be estimated, r – the number of successful responses,m – the number of trials, x – stimulus levels, and bwd – bandwidth. The syntax is

pfit = locglmfit( xfit, r, m, x, bwd );

Bandwidth selection

The selection of bandwidth is critical in local linear fitting. In this package there are three functions to select the bandwidth, bandwidth_bootstrap, bandwidth_cross_validation, and bandwidth_plugin. For all the methods, the number of successes r in m trials at each stimulus level x are mandatory parameters.

For the function bandwidth_bootstrap a search interval bwdInterval for the optimal bandwidth and the number of bootstrap replications are also mandatory. The syntax is

bwd = bandwidth_bootstrap( r, m, x, bwdInterval, replications );

For the function bandwidth_cross_validation a search interval bwdInterval for the optimal bandwidth is also mandatory. The syntax is

bwd = bandwidth_cross_validation( r, m, x, bwdInterval );

The function bandwidth_plugin calculates an estimate of the AMISE optimal bandwidth for the local polynomial estimator of a psychometric function. The syntax is

bwd = bandwidth_plugin( r, m, x);

An additional function, used only for testing purposes, is the function bandwidth_optimal. This function finds the actual optimal bandwidth for synthetic data with known, true psychometric function ptrue. The arguments for this function are the same as for bandwidth_cross_validation, except that the true psychometric function ptrue is also required. The syntax is

bwd = bandwidth_optimal( ptrue, r, m, x, bwdInterval );

Deviance

Calculates deviance D for data and fitted values of the psychometric function pfit. The syntax is

D = deviance( r, m, pfit );

Parametric fitting

The package also provides fits for several parametric models, such as Gaussian, Weibull, reverse Weibull, and logistic cumulative distribution functions. The parameters are estimated in functions binomfit_lims, binom_weib, and binom_revweib. The fitted values are obtained with the function binomval_lims.

To fit a logistic function and calculate the fitted values, the syntax is

b = binomfit_lims( r, m, x ); fest = binomval_lims( b, xfit );

To fit other functions, e.g. the Gaussian cumulative distribution, check the optional inputs of the function binomfit_lims.

To fit a Weibull function, the optimal exponent K for the Weibull needs to be estimated in addition to the linear coefficients b. To obtain the estimated psychometric function at values xfit, the guessing and lapsing rate (in this example both are equal to 0) and the estimated power parameter K need to be provided in binomval_lims. The syntax is

[ b, K ] = binom_weib( r, m, x ); fest = binomval_lims( b, xfit, 'weibull', 0, 0, K );

To fit a reverse Weibull function and calculate the fitted values, replace the function binom_weib by binom_revweib in the previous code and the parameter 'weibull' by 'revweibull' in the function binomval_lims.

Link, kernel, and loss functions available

The link functions available in the package are

logit

probit

loglog

comploglog

weibull

revweibull

The kernels available in the package are

normpdf

epanechikov

triangular

tricube

bisquare

uniform

The loss functions available to estimate the optimal bandwidth by bandwidth_bootstrap, bandwidth_cross_validation, or bandwidth_optimal are

ISE: Integrated squared error

ISEeta: Integrated squared error on eta scale(eta is the function that is related to the psychometric function via the link function)

deviance function

Check the Functions to see how to modify both links and kernels in the different functions.