Pliska Studia Mathematica Bulgarica
Volume 22, 2013
C O N T E N T S
- Yanev, N. M. Preface. (pp. 3−4)
- Atanasov, D., V. Stoimenova. A Computational Approach for the Statistical Estimation of Discrete Time Branching Processes with Immigration (pp. 5−24)
- Dackova, D., P. Mateev. Classification of Texts' Authorship Using a Regression Model on Compressed Data (pp. 25−32)
- Dimitrov, D. M., D. Atanasov. Group Comparisons on Cognitive Attributes Using the Least Squares Distance Model of Cognitive Diagnosis (pp. 33−40)
- Grigorova, D., R. Gueorguieva. Implementation of the EM Algorithm for Maximum Likelihood Estimation of a Random Effects Model for One Longitudinal Ordinal Outcome (pp. 41−56)
- Hyrien, O., K. Mitov, N. Yanev. Supercritical Markov Branching Processes with Non-Homogeneous Poisson Immigration (pp. 57−70)
- Jacob, C. A Generalized Quasi-Likelihood Estimator for Nonstationary Stochastic Processes − Asymptotic Properties and Examples (pp. 71−88)
- Jordanova, P., M. Stehlík, Z. Fabián, L. Střelec. On Estimation and Testing for Pareto Tails (pp. 89−108)
- Kolkovska, E. T., J. A. López-Mimbela. Sub- and Super-solutions of a Nonlinear PDE, and Application to a Semilinear SPDE (pp. 109−116)
- Kostadinova, K. Y., L. D. Minkova. On the Poisson Process of Order k (pp. 117−128)
- Nonchev, B. Minimum Description Length Principle in Discriminating Marginal Distributions (pp. 129−142)
- Prodanova, K., S. Pashkunova. Modeling Data for Complications in Diabetics Using Logistic Regression (pp. 143−158)
- Sečkárová, V. On Supra-Bayesian Weighted Combination of Available Data Determined by Kerridge Inaccuracy and Entropy (pp. 159−168)
- Slavtchova-Bojkova, M. Time to Extinction in Branching Processes and its Application in Epidemiology (pp. 169−194)
- Stehlík, M., F. Wartner, M. Minárová. Fractal Analysis for Cancer Research: Case Study and Simulation of Fractals (pp. 195−206)
- Trayanov, P. Crump-Mode-Jagers Branching Process: Modelling and Application for Human Population (pp. 207−224)
- Veleva, E. Marginal Densities of the Wishart Distribution (pp. 225−236)
- Yanev, G. P., S. Chakraborty. Characterizations of Exponential Distribution Based on Sample of Size Three (pp. 237−244)
A B S T R A C T S
A COMPUTATIONAL APPROACH FOR THE STATISTICAL ESTIMATION OF DISCRETE TIME BRANCHING PROCESSES WITH IMMIGRATION
Dimitar Atanasov
datanasov@nbu.bg,
Vessela Stoimenova
stoimenova@fmi.uni-sofia.bg
2010 Mathematics Subject Classification: 60J80.
Key words: branching processes, immigration, estimation, statistical software.
It is well known that estimating the parameters of branching processes (BP), an important step in studying and predicting their behavior, is labor-intensive and faces many computational difficulties. The task is even more complicated in the presence of outliers ("wrong", "untypical" or "contaminated" data), which require a different statistical approach. The existing asymptotic results for the classical estimators can be combined with a generic method for constructing robust estimators, based on the trimmed likelihood and called weighted least trimmed estimators (WLTE). Despite the computational intensity of the procedure, it gives well interpretable results even when only minor asymptotic requirements are satisfied a priori. In the paper we explain the main outlines of this routine and show some classical estimators and their robust modifications in the important class of discrete-time branching processes with immigration. We present a software package for MATLAB for simulation, plotting and estimation of the process parameters. The package is available on the Internet, under the GNU License.
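As a rough illustration of the kind of classical estimator the abstract refers to, the Python sketch below simulates a discrete-time branching process with immigration and computes conditional least squares estimates of the offspring mean and the immigration mean by regressing Z_n on Z_{n-1}. It is neither the authors' MATLAB package nor the WLTE procedure; the function names, the Poisson offspring and immigration laws, and all parameter values are assumptions made for the example.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method (small means only)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_bpi(n_gen, offspring_mean, immigration_mean, z0=1, seed=0):
    """Simulate Z_0..Z_n with Poisson offspring and Poisson immigration (assumed laws)."""
    rng = random.Random(seed)
    path = [z0]
    for _ in range(n_gen):
        children = sum(poisson(rng, offspring_mean) for _ in range(path[-1]))
        path.append(children + poisson(rng, immigration_mean))
    return path


def cls_estimates(path):
    """Conditional least squares: fit Z_n = m * Z_{n-1} + lam by ordinary least squares."""
    x, y = path[:-1], path[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("the path is constant, so the slope is not identifiable")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    m_hat = sxy / sxx
    lam_hat = mean_y - m_hat * mean_x
    return m_hat, lam_hat


if __name__ == "__main__":
    # Subcritical example so the simulated population stays small.
    path = simulate_bpi(n_gen=500, offspring_mean=0.7, immigration_mean=3.0)
    print(cls_estimates(path))
```

A single gross outlier in the path can shift both estimates noticeably, which is the situation the robust modifications mentioned in the abstract are meant to address.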
CLASSIFICATION OF TEXTS' AUTHORSHIP USING A REGRESSION MODEL ON COMPRESSED DATA
Diana Dackova
diana.dackova@gmail.com,
Plamen Mateev
p.mateev@gmail.com
2010 Mathematics Subject Classification: 68T50,62H30,62J05.
Key words: Text authorship identification, Classification, Compression, Linear Regression.
An algorithm for text authorship identification is proposed. The procedure is based on the Kolmogorov complexity and uses regression models on the length of the compressed texts. The classification uses the estimates of the regression parameters. Different combinations of compressor parameters and preliminary processing of the data are examined using prose texts of a few English classics.
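The abstract does not specify the regression design, so the following Python sketch shows only one plausible reading of the idea: the compressed length of growing prefixes of a text is regressed on the prefix length, and the fitted slope and intercept serve as features of the text. The use of zlib, the prefix scheme and all function names are assumptions for illustration, not the authors' procedure.

```python
import zlib


def compressed_length(text, level=9):
    """Length in bytes of the zlib-compressed UTF-8 encoding of the text."""
    return len(zlib.compress(text.encode("utf-8"), level))


def length_profile(text, step):
    """Pairs (prefix length, compressed length) for prefixes growing by `step` characters."""
    return [(n, compressed_length(text[:n])) for n in range(step, len(text) + 1, step)]


def fit_line(points):
    """Ordinary least squares fit y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    sample = "It was the best of times, it was the worst of times. " * 40
    print(fit_line(length_profile(sample, step=200)))
```

In a classification setting one could compare the fitted parameters of an unknown text with those obtained from texts of known authors; how that comparison is made in the paper is not stated in the abstract.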
GROUP COMPARISONS ON COGNITIVE ATTRIBUTES USING THE LEAST SQUARES DISTANCE MODEL OF COGNITIVE DIAGNOSIS
Dimiter M. Dimitrov
ddimitro@gmu.edu,
Dimitar Atanasov
datanasov@nbu.bg
2010 Mathematics Subject Classification: 62P15.
Key words: cognitive diagnosis modeling, item response theory, assessment.
Because the cognitive operations are hypothesized according to a cognitive theory in the context of a study, they are latent (hidden) in nature and cannot be measured and scored directly from the test. The least squares distance model (LSDM) of cognitive diagnosis uses estimates of the item parameters under a specific item-response theory (IRT) model to estimate the probability that a person correctly processes any cognitive attribute, given the person’s location on the IRT logit scale. In this paper a methodology is presented for comparing two (or more) groups of individuals according to their performance on a given set of cognitive attributes.
IMPLEMENTATION OF THE EM ALGORITHM FOR MAXIMUM LIKELIHOOD ESTIMATION OF A RANDOM EFFECTS MODEL FOR ONE LONGITUDINAL ORDINAL OUTCOME
Denitsa Grigorova
dgrigorova@fmi.uni-sofia.bg,
Ralitza Gueorguieva
ralitza.gueorguieva@yale.edu
2010 Mathematics Subject Classification: 62J99.
Key words: correlated probit model, EM algorithm, free software environment for statistical computing and graphics R, ordinal longitudinal data.
Longitudinal data arise when we have repeated measures on subjects over time. The correlated probit model is frequently used for ordered longitudinal data since it allows different correlation structures to be incorporated seamlessly. The estimation of the probit model parameters based on direct maximization of the limited information maximum likelihood is a numerically intensive procedure, especially when we have repeated measures on subjects. We propose an extension of the EM algorithm for obtaining maximum likelihood estimates for one ordinal longitudinal outcome. The algorithm is implemented in the free software environment for statistical computing and graphics R. We use simulations to examine the performance of the developed algorithm and apply the model to data from the Health and Retirement Study (HRS). We apply a bootstrap approach for standard error approximation. Advantages of the presented algorithm include the potential of dealing with high-dimensional random effects and of extending the algorithm to combinations of ordinal and continuous longitudinal outcomes.
SUPERCRITICAL MARKOV BRANCHING PROCESSES WITH NON-HOMOGENEOUS POISSON IMMIGRATION
Ollivier Hyrien
Ollivier_Hyrien@urmc.rochester.edu,
Kosto V. Mitov
kmitov@yahoo.com,
Nikolay M. Yanev
yanev@math.bas.bg
2010 Mathematics Subject Classification: 60J80.
Key words: Branching processes, Immigration, Poisson process, Limit theorems.
The paper proposes an extension of the Sevastyanov (1957) model, which allows immigration at the moments of a homogeneous Poisson process. Markov branching processes with time-nonhomogeneous Poisson immigration are considered as models in cell proliferation kinetics, and limit theorems are proved in the supercritical case. Some of the limiting results can be interpreted as generalizations of the classical result of Sevastyanov (1957), and new effects are obtained due to the non-homogeneity.
A GENERALIZED QUASI-LIKELIHOOD ESTIMATOR FOR NONSTATIONARY STOCHASTIC PROCESSES − ASYMPTOTIC PROPERTIES AND EXAMPLES
Christine Jacob
christine.jacob@jouy.inra.fr
2010 Mathematics Subject Classification: 62F12, 62M05, 62M09, 62M10, 60G42.
Key words: Quasi-likelihood estimator, minimum contrast estimator, least-squares estimator, least absolute deviation estimator, maximum likelihood estimator, uniform strong law of large numbers for martingales, nonstationary stochastic process, stochastic regression, consistency, asymptotic distribution.
Let $\{Z_n\}_{n\in\mathbb{N}}$ be a real stochastic process on $(\Omega, \mathcal{F}, P_{\theta_0})$, where $\theta_0$ is an unknown $p$-dimensional parameter. We propose a GQLE (Generalized Quasi-Likelihood Estimator) of $\theta_0$ based on a single trajectory of the process and defined by
$$\hat{\theta}_n := \arg\min_{\theta} \sum_{k=1}^{n} \Psi_k(Z_k, \theta),$$
where $\Psi_k(z, \theta)$ is $\mathcal{F}_{k-1}$-measurable, $\{\mathcal{F}_n\}_n$ being an increasing sequence of $\sigma$-algebras. This class of estimators includes many different types of estimators such as conditional least squares estimators, least absolute deviation estimators and maximum likelihood estimators, and allows missing data, outliers, or infinite conditional variance. We give general conditions leading to the strong consistency and the asymptotic normality of $\hat{\theta}_n$. The key tool is a uniform strong law of large numbers for martingales. We illustrate the results in the branching processes setting.
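To make the definition concrete, here is a short worked special case: the conditional least squares estimator, which the abstract lists as a member of the GQLE class. The Galton-Watson example with offspring mean m is an assumption chosen for illustration and is not taken from the paper.

```latex
% Conditional least squares as a GQLE (illustrative special case)
\Psi_k(z,\theta) = \bigl(z - \mathbb{E}_\theta[Z_k \mid \mathcal{F}_{k-1}]\bigr)^2 .

% Galton-Watson process with offspring mean m (assumed example):
% \mathbb{E}_m[Z_k \mid \mathcal{F}_{k-1}] = m Z_{k-1}, hence
\hat{m}_n = \arg\min_{m} \sum_{k=1}^{n} (Z_k - m Z_{k-1})^2
          = \frac{\sum_{k=1}^{n} Z_k Z_{k-1}}{\sum_{k=1}^{n} Z_{k-1}^2},
\qquad \text{provided } \sum_{k=1}^{n} Z_{k-1}^2 > 0 .
```

The closed form follows by setting the derivative of the sum of squares with respect to m to zero.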
ON ESTIMATION AND TESTING FOR PARETO TAILS
Pavlina Jordanova
pavlina_kj@abv.bg,
Milan Stehlík
Milan.Stehlik@jku.at,
Zdeněk Fabián
zdenek@cs.cas.cz,
Luboš Střelec
lubos.strelec@mendelu.cz
2010 Mathematics Subject Classification: 62F10, 62F12.
Key words: Point estimation, asymptotic properties of estimators, testing against heavy tails.
The t-Hill estimator for independent data was introduced by Fabián and Stehlík (2009). It estimates the extreme value index of a distribution function with a regularly varying tail. This paper considers sampling from an infinite moving average model. We prove that in this case the t-Hill estimator is weakly consistent. However, in contrast to the independent identically distributed case, it is shown here that the t-Hill and the Hill estimators applied to the moving average model are not robust with respect to large observations.
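For orientation, the Python sketch below computes the classical Hill estimator and a harmonic-mean variant from the k largest observations of a positive sample. The Hill formula is standard. The t-Hill form used here, the reciprocal of the mean ratio of the threshold to the exceedances minus one, is the version usually quoted in the literature; it is included as an assumption and should be checked against Fabián and Stehlík (2009). Function names are hypothetical.

```python
import math


def _top_k(sample, k):
    """Return the k largest values and the threshold (the (k+1)-th largest value)."""
    xs = sorted(sample, reverse=True)
    if not 1 <= k < len(xs):
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    if xs[k] <= 0:
        raise ValueError("the threshold must be positive")
    return xs[:k], xs[k]


def hill(sample, k):
    """Classical Hill estimator of the extreme value index."""
    top, threshold = _top_k(sample, k)
    return sum(math.log(x / threshold) for x in top) / k


def t_hill(sample, k):
    """Harmonic-mean (t-Hill type) estimator; assumed form, see the note above."""
    top, threshold = _top_k(sample, k)
    mean_ratio = sum(threshold / x for x in top) / k
    return 1.0 / mean_ratio - 1.0
```

For an exact Pareto tail with index 1/γ the ratio threshold/exceedance has mean 1/(1 + γ), which is why inverting the mean ratio and subtracting one targets γ.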
SUB- AND SUPER-SOLUTIONS OF A NONLINEAR PDE, AND APPLICATION TO A SEMILINEAR SPDE
E. T. Kolkovska
todorova@cimat.mx,
J. A. López-Mimbela
jalfredo@cimat.mx
2010 Mathematics Subject Classification: 35R60, 60H15, 74H35.
Key words: Blowup of semi-linear equations, stochastic partial differential equations, sub-solutions and super-solutions.
We obtain upper and lower bounds for the explosion time of a semi-linear heat equation on a bounded $d$-dimensional domain, perturbed by white noise. The bounds we get are expressed in terms of exponential functionals of one-dimensional Brownian motion, whose density function can be explicitly calculated.
ON THE POISSON PROCESS OF ORDER k
Krasimira Y. Kostadinova
kostadinova@shu-bg.net,
Leda D. Minkova
leda@fmi.uni-sofia.bg
2010 Mathematics Subject Classification: 60E05, 62P05.
Key words: Distributions of order k, compound distributions, Poisson process, ruin probability.
In these notes, the Poisson process of order k is analyzed as a compound Poisson process. We give a brief review of the distributions of order k. Then some properties of the Poisson process of order k are given, as well as its probability mass function and recursion formulas. We then describe the defined process as a compound birth process. As an application we consider the standard risk model whose counting process is the Poisson process of order k. For the Poisson of order k risk model we derive the joint distribution of the time to ruin and the deficit at ruin. As a limiting case we obtain an equation for the ruin probability. We discuss in detail the particular case of exponentially distributed claims.
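The paper's own recursion formulas are not reproduced in the abstract. As a minimal sketch, the Python function below uses the standard representation of the Poisson distribution of order k as a compound Poisson law with probability generating function exp{−kλ + λ(s + s² + … + s^k)}, which gives the Panjer-type recursion n·p_n = λ·Σ_{j=1}^{min(n,k)} j·p_{n−j} with p_0 = e^{−kλ}. For the process at time t one would replace λ by λt. The function name is hypothetical.

```python
import math


def poisson_order_k_pmf(lam, k, n_max):
    """Probabilities p_0..p_{n_max} of the Poisson distribution of order k.

    Recursion: n * p_n = lam * sum_{j=1}^{min(n, k)} j * p_{n-j},  p_0 = exp(-k * lam).
    """
    p = [math.exp(-k * lam)]
    for n in range(1, n_max + 1):
        total = sum(j * p[n - j] for j in range(1, min(n, k) + 1))
        p.append(lam * total / n)
    return p
```

With k = 1 the recursion reduces to the ordinary Poisson probabilities, and the mean of the distribution is λ·k(k + 1)/2.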
MINIMUM DESCRIPTION LENGTH PRINCIPLE IN DISCRIMINATING MARGINAL DISTRIBUTIONS
Bono Nonchev
bono.nonchev@gmail.com
2010 Mathematics Subject Classification: 94A17, 62B10, 62F03.
Key words: MDL, Model Selection, Complexity, Distribution Selection.
In this paper the MDL principle is explored in discriminating between a model with normal marginal distributions and a model with Student-t marginal distributions. The shape complexity of a distribution is defined with insights from the closed-form solution for model complexity for the normal distribution. An optimised numerical approach for the Student-t distribution is devised with the aim of extending it to the fat-tailed distributions commonly found in econometric time series.
MODELING DATA FOR COMPLICATIONS IN DIABETICS USING LOGISTIC REGRESSION
K. Prodanova
kprod@tu-sofia.bg,
S. Pashkunova
pashkunovasylvia@yahoo.com
2010 Mathematics Subject Classification: 62P10.
Key words: diabetes-related complications, genotypes, logistic regression.
A prospective study of the relationship between some clinical parameters, genetic markers and complications in patients with diabetes is considered. About 200 patients (male and female) have been examined. The patients are classified into five groups according to the type of diabetes. The data obtained for each patient cover the type of complication (macrovascular, retina pathology, neuron pathology and nephrite pathology), 12 clinical parameters and 7 genetic markers. Data on the same genetic markers for 94 healthy persons (control group) are compared with those of the diabetic patients. The association between the genetic markers and the different types of diabetes-related complications is investigated. A logistic regression is performed to identify which factors are associated with the complications. The associations between pathogenesis and gene genotypes are investigated for the first time for the population of diabetics in Bulgaria.
ON SUPRA-BAYESIAN WEIGHTED COMBINATION OF AVAILABLE DATA DETERMINED BY KERRIDGE INACCURACY AND ENTROPY
Vladimíra Sečkárová
seckarov@utia.cas.cz
2010 Mathematics Subject Classification: 94A17.
Key words: Kerridge inaccuracy, maximum entropy principle, parameter estimation.
Every process in our environment can be described with a statistical model containing inner properties expressed by parameters. These are usually unknown and the determination of their values is of interest in the statistical branch called parameter estimation. This branch involves many methods solving different estimation cases, e.g. the estimation of location and scale parameters. To obtain the parameter estimate we exploit the data given by data sources. In particular, the estimate is their combination. Improving the parameter estimates involves assigning weights to the data sources, resulting in a weighted combination of data. Unfortunately this approach brings difficulties regarding the determination of the weights and their subjective influence. In the recently introduced Supra-Bayesian approach it is proposed to use the Kerridge inaccuracy and the maximum entropy principle to overcome the problem of subjective influence. In this paper we focus on the derivation of the weights arising within the Supra-Bayesian approach and on a simulation study of their behaviour and the behaviour of the final estimate.
TIME TO EXTINCTION IN BRANCHING PROCESSES AND ITS APPLICATION IN EPIDEMIOLOGY
M. Slavtchova-Bojkova
bojkova@fmi.uni-sofia.bg
2010 Mathematics Subject Classification: Primary 60J80; Secondary 92D30.
Key words: age-dependent branching process, extinction time, epidemiological modeling, vaccination level.
The contemporary state of the theory of branching processes implies their application to any abstract population where individuals produce a set of new individuals. In this survey paper some recent developments in the study of time to extinction of continuous-time branching processes (BP), motivated by their applications in epidemiological modeling, are presented. The developed methodology and results concern Bellman−Harris (age-dependent) BP as well as the more general Sevast'yanov BP.
FRACTAL ANALYSIS FOR CANCER RESEARCH: CASE STUDY AND SIMULATION OF FRACTALS
Milan Stehlík
Milan.Stehlik@jku.at,
Fabian Wartner
f_wartner@gmx.at,
Mária Minárová
minarova@math.sk
2010 Mathematics Subject Classification: 65D18.
Key words: cancer research, Fractal, genetic algorithm, simulation of fractal.
This paper discusses the possibilities of application of fractal geometry for cancer research. Fractal geometry is a new tool that can be extremely useful for many problems in almost every scientific field. Recent studies in medicine show that fractals can be applied for cancer detection and the description of the pathological architecture of tumors. This fact is not surprising, as due to their irregular structure, cancerous cells can be interpreted as fractals. Cancer diagnosis can be done via determination of the fractal dimension. A likelihood ratio test for the Hausdorff dimension is employed in [7]. We empirically checked the obtained tests on the Sierpinski carpet and on cancer data. However, several issues arose, especially those related to the simulation of fractals which may mimic tissues. These are discussed in the present paper.
CRUMP-MODE-JAGERS BRANCHING PROCESS: MODELLING AND APPLICATION FOR HUMAN POPULATION
Plamen Trayanov
plamentrayanov@gmail.com
2010 Mathematics Subject Classification: 60J85, 92D25.
Key words: branching process, human population, Malthusian parameter.
The future human population count in a country depends on many factors which influence birth and death. The interaction of birth and death determines the rate at which the population grows or diminishes. Modelling the population can give us information about the current condition of a country. The paper describes a methodology based on Crump-Mode-Jagers branching process theory (see [2]) for modelling human population and shows how the Malthusian parameter can be numerically estimated using the model. A population with a greater Malthusian parameter is expected to have a greater population count from some point on in the future. The model results from a comparison between Sweden, Greece, Slovenia and Bulgaria using official EUROSTAT data (see [1]) show that the Malthusian parameter for Greece has been declining over the past few years due to the crisis. Before that Greece was comparable to a country with good social and demographic policy like Sweden. The General Branching Process (GBP) model could also be used for population projection. The results for Bulgaria show that the model expects a decreasing population count.
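For a Crump-Mode-Jagers process the Malthusian parameter α is the root of the equation ∫ e^{−αt} μ(dt) = 1, where μ is the reproduction measure. The Python sketch below solves a discretized version of this equation by bisection. It illustrates the definition only and is not the estimation procedure of the paper; the function name, the age grid and the bracketing interval are assumptions, and `expected_births[i]` stands for the expected number of children born to an individual at age `ages[i]`, already accounting for survival to that age.

```python
import math


def malthusian_parameter(ages, expected_births, lo=-1.0, hi=1.0, tol=1e-12):
    """Solve sum_i expected_births[i] * exp(-alpha * ages[i]) = 1 for alpha by bisection."""

    def excess(alpha):
        return sum(b * math.exp(-alpha * a) for a, b in zip(ages, expected_births)) - 1.0

    # With positive ages and non-negative births, excess() is decreasing in alpha,
    # so a sign change on [lo, hi] brackets the unique root.
    if excess(lo) < 0 or excess(hi) > 0:
        raise ValueError("the root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

A positive root corresponds to a growing population and a negative root to a declining one; a total expected number of children equal to one gives α = 0.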
MARGINAL DENSITIES OF THE WISHART DISTRIBUTION
Evelina Veleva
eveleva@abv.bg
2010 Mathematics Subject Classification: 62H10.
Key words: Wishart distribution, positive definite matrix, marginal density, covariance matrix, decomposable graph.
We consider marginal densities obtained by elimination of non-diagonal elements of a positive definite random matrix with an arbitrary distribution. For a p × p random matrix W such a marginal density is represented by a graph with p vertices. For every non-diagonal element of W included in the density we draw in the graph an undirected edge between the corresponding vertices. By giving an equivalent definition of decomposable graphs we show that the bounds of the integration with respect to every excluded element of W can be exactly obtained if and only if the corresponding graph is decomposable. We give in explicit form some of the marginal densities of an arbitrary Wishart distribution.
CHARACTERIZATIONS OF EXPONENTIAL DISTRIBUTION BASED ON SAMPLE OF SIZE THREE
George P. Yanev
yanevgp@utpa.edu,
Santanu Chakraborty
schakraborty@utpa.edu
2010 Mathematics Subject Classification: 62G30, 62E10.
Key words: characterization, exponential distribution, order statistics.
Two characterizations of the exponential distribution based on equalities among order statistics in a random sample of size three are proved. This proves two conjectures stated recently in Arnold and Villaseñor [4].