Pliska Studia Mathematica Bulgarica Volume 17, 2005

GUEST EDITOR: N.Yanev

Sofia, 2005

C O N T E N T S

Atanasov, D. Study on Robustness of Correlated Frailty Model. (pp. 5-12)
Benchettah, A. Characterization of Schrödinger Processes with Unbounded Potentials. (pp. 13-26)
Christozov, D., Mateev, P. Assessment of Information Asymmetry. (pp. 27-38)
Dias, G., Alves, E., Nunes, C. Topic Segmentation: How Much Can We Do by Counting Words and Sequences of Words. (pp. 39-70)
Furlan, R., Corradetti, R. Analysing Conjoint Analysis Data by a Random Coefficient Regression Model. (pp. 71-84)
Gonzalez, M., Martinez, R., Mota, M. A Note on the Extinction Problem for Controlled Multitype Branching Processes. (pp. 85-96)
Gregori, D., Rosato, R., Ciccone, G., Lusa, L. Parameterized Link Functions in Generalized Linear Random Effect Models: a Case Study on Breast Cancer Treatment. (pp. 97-107)
Jacob, C., Lalam, N., Yanev, N. Statistical Inference for Processes Depending on Environments and Application in Regenerative Processes. (pp. 109-136)
Kharin, Yu., Huryn, A. Sensitivity Analysis of the Risk of Forecasting for Autoregressive Time Series with Missing Values. (pp. 137-146)
Martinez, R., Slavtchova-Bojkova, M. Comparison between Numerical and Simulation Methods for Age-dependent Branching Models with Immigration. (pp. 147-154)
Molina, M., Mota, M., Ramos, A. Nonparametric Estimation in the Class of Bisexual Processes with Population-Size Dependent Mating. (pp. 155-169)
Mouhoubi, Z., Aïssani, D. Some Inequalities of the Uniform Ergodicity and Strong Stability of Homogeneous Markov Chains. (pp. 171-186)
Neykov, N., Dimova, R., Neytchev, P. Trimmed Likelihood Estimation of the Parameters of the Generalized Extreme Value Distributions: a Monte-Carlo Study. (pp. 187-200)
Popovich, B., Stojanovich, V. Split-ARCH. (pp. 201-220)
Prodanova, K., Stoinov, I., Terziiski, D. Logistic Regression in Modelling Data for CVC - Related Infection. (pp. 221-228)
Romisch, U. Application of Statistical Experimental Design in Food Sciences. (pp. 229-239)
Sanjari-Farsipour, N. Modelling Covariates in Multipath Change. (pp. 241-248)
Shishkov, B., Matsumoto, H., Shinohara, N. Probabilistic Approach to Design of Large Antenna Arrays. (pp. 249-269)
Stoev, S., Taqqu, M. Weak Convergence to the Tangent Process of the Linear Multifractional Stable Motion. (pp. 271-294)
Stoimenova, V., Yanev, N. Parametric Estimation in Branching Processes with an Increasing Random Number of Ancestors. (pp. 295-312)
Tsvetanova, Y. Comparison of Multivariate and Univariate Models for Genetic Evaluation of Milk Yield based on Test Day Data. (pp. 313-322)
Vandev, D. Stochastic Optimization in Robust Statistic. (pp. 323-335)
Yaneva, J., Daskalova, N., Yanev, N. Statistical Analysis of Data on Linker Histones/DNA Interactions. (pp. 337-348)
Mitov, K. Extremes of Bivariate Geometric Variables with Application to Bisexual Branching Processes. (pp. 349-362)

A B S T R A C T S

Study on Robustness of Correlated Frailty Model
Dimitar Atanasov datanasov@fmi.uni-sofia.bg

AMS 2000 Subject Classification: 62J12
Key words: frailty models, robust maximum likelihood estimation.
This study considers the robustness properties of correlated frailty models. The dependence between related individuals must be taken into account in order to study the difference between genetic information and the environment as causes of death. To do that, one can introduce the frailty parameter Z, which can be decomposed as Z = Z_g + Z_e, where Z_g represents the frailty due to genetic information and Z_e represents the influence of the environment. Using the WLTE(k), one can obtain a robust maximum likelihood estimate of the unknown parameters of the model.

Characterization of Schrödinger Processes with Unbounded Potentials
A. Benchettah abenchettah@hotmail.com

AMS 2000 Subject Classification: 49L20, 60J60, 93E20
Key words: Schrödinger process, minimum entropy distance, stochastic optimal control, Schrödinger's system, variational characterisation.
This work is concerned with a class of Schrödinger processes with unbounded potentials: a variant of Jamison's theorem is given without the assumption of continuity and of everywhere strict positivity of q. It associates with Jamison's data (q, P_a, P_b) the Csiszár projection Q^* of a reference measure R^* on a set E(P_a, P_b) of probability measures with marginals P_a, P_b. Existence of a solution to the corresponding Schrödinger system, construction of the Schrödinger bridge and a variational characterisation of the Schrödinger process are established.

Assessment of Information Asymmetry
D. Christozov dgc@aubg.bg
P. Mateev pmat@math.bas.bg

AMS 2000 Subject Classification: 62P20, 91B42.
Key words: information asymmetry, warranty.
In the process of trading, the seller and the buyer participate with different initial knowledge about the technical capabilities and about the expected use of the good. This two-sided asymmetry affects the success of the negotiation in e-trading. This paper discusses one practical approach to assessing information asymmetry and the role of warranty in the seller-buyer communication relationship. The presented approach is illustrated with a survey experiment.

Topic Segmentation: How Much Can We Do by Counting Words and Sequences of Words
G. Dias ddg@di.ubi.pt
E. Alves elsalves@zmail.pt
C. Nunes celia@mat.ubi.pt

In this paper, we present an innovative topic segmentation system based on a new informative similarity measure that takes into account word co-occurrence in order to avoid reliance on existing linguistic resources such as electronic dictionaries or lexico-semantic databases such as thesauri or ontologies. Topic segmentation is the task of breaking documents into topically coherent multi-paragraph subparts. It has been used extensively in information retrieval and text summarization. In particular, our architecture proposes a language-independent topic segmentation system that solves three main problems evidenced by previous research: systems based uniquely on lexical repetition, which show reliability problems; systems based on lexical cohesion using existing linguistic resources, which are usually available only for dominant languages and as a consequence do not apply to less favored languages; and finally systems that need previously harvested training data. For that purpose, we only use statistics on words and sequences of words based on a set of texts. This provides a flexible solution that may narrow the gap between dominant languages and less favored languages, thus allowing equivalent access to information.

Analysing Conjoint Analysis Data by a Random Coefficient Regression Model
Roberto Furlan roberto.furlan@libero.it
Roberto Corradetti roberto.corradetti@unito.it

AMS 2000 Subject Classification: 62J12, 62K15, 91B42, 62H99.
Key words: conjoint analysis, random coefficient regression model, full factorial design, fractional factorial design, design matrix
Since the late 1960s, conjoint analysis has been applied to estimate consumer preferences in marketing research.

A Note on the Extinction Problem for Controlled Multitype Branching Processes
Miguel González mvelasco@unex.es
Rodrigo Martínez rmartinez@unex.es
Manuel Mota mota@unex.es

AMS 2000 Subject Classification: 60J80, 60J10.
Key words: controlled multitype branching processes, random control, homogeneous multitype Markov chains.
In this paper we consider a discrete-time controlled multitype branching process with random control. We provide sufficient conditions for the almost sure extinction of the process as well as for its indefinite growth with positive probability. Moreover, an illustrative example is shown and some simulations are given.

Parameterized Link Functions in Generalized Linear Random Effect Models: a Case Study on Breast Cancer Treatment
Dario Gregori
Rosalba Rosato
Giovannino Ciccone
Lara Lusa

In non-linear random effects models, some attention has very recently been devoted to the analysis of suitable transformations of the response variables, separately (Taylor 1996) or not (Oberg and Davidian 2000) from the transformations of the covariates; as far as we know, no investigation has been carried out on the choice of link function in such models. In our study we consider the use of a random effect model when a parameterized family of links (Aranda-Ordaz 1981, Prentice 1996, Pregibon 1980, Stukel 1988 and Czado 1997) is introduced. We point out the advantages and the drawbacks associated with the choice of this data-driven kind of modeling. Difficulties in the interpretation of regression parameters, and therefore in understanding the influence of covariates, as well as problems related to loss of efficiency of estimates and overfitting, are discussed. A case study on radiotherapy usage in breast cancer treatment is presented.

Statistical Inference for Processes Depending on Environments and Application in Regenerative Processes
Christine Jacob cj@banian.jouy.inra.fr
Nadia Lalam
Nickolay Yanev yanev@math.bas.bg

We consider a process $\{Z_n\}_{n \in \mathbb{N}}$, recursively defined by $Z_n = f(F_{n-1}, E_n) + \eta_n$, where $F_{n-1} = \{Z_k\}_{k \le n-1}$, $E_n = \{C_k\}_{k \le n}$, $\{C_n\}_n$ is an observed exogenous process and $\{\eta_n\}_n$ is a martingale difference sequence for the filtration generated by $(F_{n-1}, E_n)$ such that $\mathrm{Var}(\eta_n \mid F_{n-1}, E_n)\, g(F_{n-1}, E_n) < \infty$, a.s. for some known function $\{g(F_{n-1}, E_n)\}_n$. This class of models covers a very broad range of models such as regression models, ANOVA models, autoregressive processes, branching processes, regenerative processes, ... We assume that $f(F_{n-1}, E_n)$ depends on an unknown parameter $\mu_0$ and that by notation $f(.) = f_{\mu_0}(.)$ may be decomposed according to $f_{\mu_0}(.) = f^{(1)}_{\theta_0}(.) + f^{(2)}_{\mu_0}(.)$, where $\theta_0 \in \mathbb{R}^d$, $d < \infty$, is asymptotically identifiable in $f^{(1)}_{\theta_0}(.)$ as $n \to \infty$ at some rate $v(.)$ whereas $f^{(2)}_{\mu_0}(.)\, v(.)$ is asymptotically negligible. We build the Conditional Least Squares Estimator of $\theta_0$ based on the observation of a single trajectory of $\{Z_k, C_k\}_k$, and give conditions ensuring its strong consistency. The particular case of general linear models according to $\mu_0 = (\theta_0, \nu_0)$ and among them, regenerative processes, are studied more particularly. In this frame, we may also prove the consistency of the estimator of $\nu_0$ although it belongs to an asymptotic negligible part of the model, and the asymptotic law of the estimator may also be calculated.

Sensitivity Analysis of the Risk of Forecasting for Autoregressive Time Series with Missing Values
Yu. S. Kharin kharin@bsu.by
A. S. Huryn hurynaliaksandr@yahoo.com

AMS 2000 Subject Classification: 62M20, 62M10, 62-07.
Key words: forecasting, autoregression, missing values, risk, sensitivity
The problems of statistical forecasting of vector autoregressive time series with missing values are considered for different levels of prior information on the parameters of the underlying model. The mean square risk of forecasting and the risk sensitivity coefficient are evaluated and analyzed. Results of numerical experiments are presented.

Comparison between Numerical and Simulation Methods for Age-dependent Branching Models with Immigration
R. Martínez rmartinez@unex.es
M. Slavtchova-Bojkova bojkova@math.bas.bg

AMS 2000 Subject Classification: 60J80, 60J85
Key words: age-dependent branching processes with immigration at zero state, numerical computations, Monte-Carlo method
This work aims to provide and compare numerical computation and simulation methods for estimating the distribution of some relevant variables related to an age-dependent model allowing immigration at state zero. Specifically, we analyze the behaviour of the following variables: the extinction time and the waiting time until the population begins to survive forever. They are strongly related to population and re-population experiments in biology, as well as to wastewater treatment. Throughout the paper, we illustrate the proposed methods with suitable examples.

Nonparametric Estimation in the Class of Bisexual Processes with Population-Size Dependent Mating
Manuel Molina mmolina@unex.es
Manuel Mota mota@unex.es
Alfonso Ramos aramos@unex.es

AMS 2000 Subject Classification: 60J80, 62M05
Key words: bisexual branching processes, population-size dependent processes, nonparametric inference, asymptotic properties.
In this paper the class of bisexual branching processes with population-size dependent mating is considered. Nonparametric estimators and confidence intervals for the main parameters involved in such a class of stochastic models are provided. For the proposed estimators, the main conditional to non-extinction and conditional moments are established and some asymptotic properties are investigated. As an illustration, a simulated example is given.

Some Inequalities of the Uniform Ergodicity and Strong Stability of Homogeneous Markov Chains
Zahir Mouhoubi z_mouhoubi@yahoo.fr
Djamil Aïssani

AMS 2000 Subject Classification: 60J45, 60K25
Key words: quantitative estimates, uniform ergodicity, stability, strong stability, perturbation.
In this paper we establish some uniform and strong stability estimates for homogeneous Markov chains under mixing conditions. As a general rule, the initial parameter values of most complex systems are known only approximately (they are determined by statistical methods), which introduces errors into the computation of the characteristics of interest for each system studied. The stability inequalities obtained in this paper make it possible to estimate numerically the error in these characteristics under small perturbations of the system's parameters. As an example of application, we consider the well-known waiting process and study the perturbation of the characteristics of the system when a small perturbation is applied to the control sequence.

Trimmed Likelihood Estimation of the Parameters of the Generalized Extreme Value Distributions: a Monte-Carlo Study
Neyko Neykov neyko.neykov@meteo.bg
Rositsa Dimova rdimova@fmi.uni-sofia.bg
Plamen Neytchev plamen.neytchev@meteo.bg

AMS 2000 Subject Classification: 62F35, 62P99
Key words: generalized extreme value distribution, maximum likelihood estimation, trimmed likelihood estimation, Monte-Carlo simulation.
The applicability of the Trimmed Likelihood Estimator (TLE) proposed by Neykov and Neytchev to the extreme value distributions is considered. The effectiveness of the TLE in comparison with the classical MLE in the presence of outliers in various scenarios is illustrated by an extended simulation study. The FAST-TLE algorithm developed by Neykov and Müller is used to get the parameter estimates. The computations are carried out in the R environment using the package ismev, originally developed by Coles and ported to R by Stephenson.

Split-ARCH
Biljana Popović biljanap@junis.ni.ac.yu
Vladica Stojanović vlada70@verat.net

AMS 2000 Subject Classification: 62M10
Key words: conditional heteroscedasticity, conditional least squares
We add a new model to the GARCH Zoo and introduce it in this paper. We named it Split-ARCH. It was empirically motivated by a real data set on the soybean meal price on the Product exchange. Split-ARCH is a superstructure of the previously known models of GARCH type. We defined the volatility change to follow sudden and great changes of the price, and of the volatility as well. With the log returns of the price defined as $X_n = \sigma_n \varepsilon_n$, we set the volatility to be
$\sigma_n^2 = \alpha_0 + \sum_{j=1}^{p} \alpha_j X_{n-j}^2 + \sum_{k=1}^{q} f_k(\sigma_{n-k}^2)\, I(\varepsilon_{n-k}^2 > c), \quad n \ge 0,$
with the threshold c > 0. Under the stationarity conditions and a specified f, we also discuss the possibilities of estimating the parameters.
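
The volatility recursion in this abstract can be illustrated with a short simulation. The following is only a minimal sketch, not the authors' implementation: it assumes standard normal innovations and a linear form f_k(s) = b_k * s, neither of which is stated in the abstract, and the function name, parameter names and parameter values are hypothetical.

```python
import numpy as np

def simulate_split_arch(n, alpha0, alphas, f_coefs, c, seed=0):
    """Simulate X_n = sigma_n * eps_n with
    sigma_n^2 = alpha0 + sum_j alpha_j * X_{n-j}^2
                + sum_k f_k(sigma_{n-k}^2) * I(eps_{n-k}^2 > c).

    Assumptions (not from the abstract): eps_n are i.i.d. N(0, 1) and
    f_k(s) = f_coefs[k-1] * s is linear.
    """
    rng = np.random.default_rng(seed)
    p, q = len(alphas), len(f_coefs)
    burn = max(p, q)                      # start-up values needed by the lags
    eps = rng.standard_normal(n + burn)
    sigma2 = np.full(n + burn, alpha0, dtype=float)
    x = np.sqrt(sigma2) * eps             # start-up returns with sigma^2 = alpha0
    for t in range(burn, n + burn):
        # ARCH part: weighted squared past returns
        arch = sum(alphas[j] * x[t - 1 - j] ** 2 for j in range(p))
        # Split part: past volatility enters only after a large shock
        split = sum(
            f_coefs[k] * sigma2[t - 1 - k] * (eps[t - 1 - k] ** 2 > c)
            for k in range(q)
        )
        sigma2[t] = alpha0 + arch + split
        x[t] = np.sqrt(sigma2[t]) * eps[t]
    return x[burn:], sigma2[burn:]

# Hypothetical parameter values chosen only for illustration
returns, vol2 = simulate_split_arch(
    n=1000, alpha0=0.1, alphas=[0.2], f_coefs=[0.3], c=1.0
)
```

With the threshold c set very high the indicator is almost never active and the sketch behaves like a plain ARCH(p) recursion, which is one way to see why the abstract calls the model a superstructure of earlier GARCH-type models.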

Logistic Regression in Modelling Data for CVC - Related Infection
Krasimira Prodanova kprod@tu-sofia.bg
Ionko Stoinov
Dimitar Terziiski

AMS 2000 Subject Classification: 62J12, 62P10
Key words: Central Venous Catheter (CVC) infection, multinomial logistic regression
A prospective study of all new central venous catheters (CVC) inserted in intensive care unit patients was undertaken in order to identify risk factors for CVC infection and to determine the rate of CVC-related infection. Catheter-related infection and sepsis were suspected in 62 of 118 CVCs inserted in intensive care patients. Multiple logistic regression was performed to obtain adjusted estimates of odds ratios and to identify which factors were independently associated with CVC-related infection. The variables entered in the model were those found to be statistically significant (α ≤ 0.5) on univariate analysis and those established as risk factors in previous research reports. The dependent variable was CVC-related infection. There were ten independent variables: age, sex, insertion site, number of lumens, duration of catheterization, etc. The software package STATISTICA 6.0 was used to analyze the real data.

Application of Statistical Experimental Design in Food Sciences
Ute Römisch ute.roemisch@tu-berlin.de

The development of new, health-supporting food of high quality and the optimization of food technological processes today require the application of statistical methods of experimental design. The principles and steps of statistical planning and evaluation of experiments will be explained. The application of a simplex-centroid mixture design will be shown using the example of the development of a gluten-free rusk (zwieback) enriched with roughage compounds. The results will be illustrated by different graphics.

Modelling Covariates in Multipath Change
N. Sanjari Farsipour nsf@susc.ac.ir

AMS 2000 Subject Classification: 62N02
Key words: covariate, maximum likelihood, modelling, multipath change-point problems.
In multipath change-point problems, it is often of interest to assess the impact of covariates on the change point itself as well as on the parameters before and after the change point. In this paper, we consider a simple model for the change-point distribution, and then, through the hazard, we include covariates in the change-point distribution. Maximum likelihood estimation is discussed.

Probabilistic Approach to Design of Large Antenna Arrays
Blagovest Shishkov bshishkov@math.bas.bg
Hiroshi Matsumoto matsumot@rish.kyoto-u.ac.jp
Naoki Shinohara shino@rish.kyoto-u.ac.jp

AMS 2000 Subject Classification: 78A50
Key words: microwave power transmission, large antenna array, uniform spacing, random spacing, spatial and amplitude tapering, sidelobe level, grating lobes, workspace, transmitting efficiency.
Recent advances in space exploration have shown a great need for antennas with high resolution, high gain and low sidelobe (SL) level. The last characteristic is of paramount importance, especially for Microwave Power Transmission (MPT), in order to achieve higher transmitting efficiency. In this regard, statistical methods play an important role. Various probabilistic properties of a large antenna array with random, uniform and combined spacing of elements are studied, especially the relationship between the required number of elements and their appropriate spacing on the one hand and the desired SL level, the aperture dimension, the beamwidth and the transmitting efficiency on the other. We propose a new unified approach to reducing the SL level by exploiting the interaction of deterministic and stochastic workspaces of the proposed algorithms, with emphasis on the distribution of the maxima of the SL level. These models indicate some advantages with respect to sidelobes in the large area around the main beam. A new concept for designing a large antenna array system is proposed. Our theoretical study and simulation results clarify how to deal with the problems of sidelobes in designing a large antenna array, which seems to be an important step toward the realization of future SPS/MPT systems.

Weak Convergence to the Tangent Process of the Linear Multifractional Stable Motion
Stilian Stoev sstoev@bu.edu
Murad S. Taqqu murad@bu.edu

AMS 2000 Subject Classification: 60G18, 60E07
Key words: path continuity, Hölder regularity, linear fractional stable motion, self-similarity, multifractional Brownian motion, local self-similarity, heavy tails.
The linear multifractional stable motion (LMSM), $Y = \{Y(t)\}_{t \in \mathbb{R}}$, is an $\alpha$-stable ($0 < \alpha < 2$) stochastic process which exhibits local self-similarity. It is constructed by using a stochastic integral representation of the linear fractional stable motion (LFSM) process $X_{H,\alpha}(t)$, where the self-similarity exponent H is replaced by a function $H(t) \in (0,1)$ of time t. Here, we focus on LMSM processes with continuous paths and study the convergence $\{\frac{1}{d(\lambda)}(Y(\lambda t + t_0) - Y(t_0))\}_{t \in [-1,1]} \Rightarrow \{Z(t)\}_{t \in [-1,1]}$, as $\lambda \downarrow 0$, where $\Rightarrow$ denotes the weak convergence of probability distributions on the space of continuous functions C[-1,1] equipped with the uniform norm and where $d(\lambda) \downarrow 0$. We show that if the function H(t) is sufficiently regular and if $1/\alpha < H(t_0) < 1$, then the above weak convergence holds with normalization $d(\lambda) = \lambda^{H(t_0)}$ and the limit (tangent) process Z(t) is the LFSM $X_{H(t_0),\alpha}(t)$. We also show that one can have degenerate tangent processes Z(t), when the function H(t) is not sufficiently regular. The LMSM process is closely related to the Gaussian multifractional Brownian motion (MBM) process. We establish similar weak convergence results for the MBM.

Parametric Estimation in Branching Processes with an Increasing Random Number of Ancestors
Vessela Stoimenova stoimenova@fmi.uni-sofia.bg
Nickolay Yanev yanev@math.bas.bg

AMS 2000 Subject Classification: 60J80, 62M05
Key words: branching processes, random number of ancestors, power series distribution, parametric estimation, consistency, asymptotic normality, efficiency.
The paper deals with parametric estimation in branching processes {Z_t(n)} having a random number of ancestors Z₀(n) as both n and t tend to infinity (and thus Z₀(n) in some sense). The offspring distribution is considered to belong to a discrete analogue of the exponential family - the class of the power series offspring distributions. Consistency and asymptotic normality of the estimators are obtained for all values of the offspring mean m, 0 < m < ∞, in the subcritical, critical and supercritical case.

Comparison of Multivariate and Univariate Models for Genetic Evaluation of Milk Yield based on Test Day Data
Yanka Tsvetanova yanka@uni-sz.bg

AMS 2000 Subject Classification: 62H12, 62P99
Key words: mixed linear models, repeated measures, fixed regression, genetic parameters
Multivariate and univariate lactation models were applied to test day data to predict the genetic value of daily milk yield of a sample of Black and White cows. The models for genetic evaluation include a set of fixed main effects, fixed regression on functions of days in milk, random effects of permanent environment within lactation, a random additive genetic effect and a residual effect. Under the multivariate model, daily milk yield test day records within a lactation are considered as repeated measurements, and different lactations are treated as separate traits. The univariate model is applied to each lactation using test day yield as a repeated measure. The variance components, genetic parameters and ranking of the animals obtained by the multivariate and univariate methods were compared.

Stochastic Optimization in Robust Statistic
D. Vandev

AMS 2000 Subject Classification: 62J05, 62G35
Key words: robust estimators of location, least median of squares, stochastic approximation algorithm, Monte-Carlo study.
The paper studies a stochastic optimization algorithm for computing robust estimators of location proposed by Vandev (1992). A random approximation of the exact solution is proposed which is much cheaper in time and easy to program. Two examples are presented. Besides standard estimators of location such as the trimmed mean, the robust regressions (LMS and LTS) introduced by Rousseeuw and Leroy are also considered. MATLAB programs are included.
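
To give a feel for what a random approximation of a robust location estimator can look like, here is a minimal sketch in Python (the paper itself includes MATLAB programs, which are not reproduced here). It is not Vandev's algorithm: it simply assumes the least median of squares (LMS) criterion for location, med_i (x_i - theta)^2, and searches over midpoints of randomly drawn pairs of observations. The function name, the candidate scheme and the contaminated sample are all hypothetical.

```python
import numpy as np

def lms_location_random(x, n_trials=500, seed=0):
    """Approximate the LMS location estimate by random search.

    Assumption (not from the paper): candidates are midpoints of random
    pairs of observations; the candidate with the smallest median of
    squared residuals is returned.
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    best_theta, best_crit = None, np.inf
    for _ in range(n_trials):
        i, j = rng.integers(0, len(x), size=2)
        theta = 0.5 * (x[i] + x[j])            # random candidate
        crit = np.median((x - theta) ** 2)     # LMS criterion at the candidate
        if crit < best_crit:
            best_theta, best_crit = theta, crit
    return best_theta, best_crit

# Hypothetical contaminated sample: 80 clean points near 0, 20 outliers at 50
rng = np.random.default_rng(1)
sample = np.concatenate([rng.normal(0.0, 1.0, 80), np.full(20, 50.0)])
theta_hat, crit = lms_location_random(sample)
```

The cost per trial is one median computation, so the number of trials directly trades accuracy for time, which is the kind of trade-off the abstract describes.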

Statistical Analysis of Data on Linker Histones/DNA Interactions
J. Yaneva
N. Daskalova nina_rdm@yahoo.com
N. Yanev yanev@math.bas.bg

AMS 2000 Subject Classification: 62P10, 92C40
Key words: linker histones/DNA interactions, anticancer antibiotics, chi-square test.
Linker histones (H1, H1o, H5, subtypes and variants) play a pivotal role in the formation of higher order chromatin structure and thus act as main regulators of the expression of genetic information kept in DNA. That is why knowledge of the nature of linker histones/DNA interactions is of the greatest interest in understanding such important issues as transcription regulation, cell division, and cancerogenesis. As DNA is a main "target" of most anticancer antibiotics, the analysis of competitive reactions between these drugs (in our case actinomycin D and netropsin) and linker histones for binding to certain sites in DNA gives promising information concerning the mode of such interactions. In this work we present a statistical analysis of some experimental data concerning the influence of some anticancer antibiotics on linker histones/DNA interactions. First, the hypothesis that H1/DNA interaction depends on actinomycin D concentration was investigated. Such a relation was expected given the different modes of binding of the two drugs to the DNA double helix. The applied statistical analysis, using the chi-square test for independence, showed that the concentration of actinomycin D in the reaction mixture had no essential effect on linker histone/DNA binding. On the contrary, the same analysis with the second antibiotic, netropsin, showed that we could not reject the hypothesis of dependence. Some other statistical models are also proposed, applying the chi-square test for homogeneity, the Wilcoxon test, Smirnov's test and others.
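
The chi-square test for independence mentioned in this abstract can be sketched as follows. The counts below are invented purely for illustration and are not the experimental data of the paper; the layout (rows as drug concentration levels, columns as bound/unbound outcomes) is likewise an assumption. The critical value 5.991 is the standard 5% point of the chi-square distribution with 2 degrees of freedom.

```python
import numpy as np

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    obs = np.asarray(table, dtype=float)
    row = obs.sum(axis=1, keepdims=True)       # row totals, shape (r, 1)
    col = obs.sum(axis=0, keepdims=True)       # column totals, shape (1, c)
    expected = row @ col / obs.sum()           # expected counts under independence
    stat = ((obs - expected) ** 2 / expected).sum()
    dof = (obs.shape[0] - 1) * (obs.shape[1] - 1)
    return stat, dof

# Hypothetical counts: 3 concentration levels x (bound, unbound)
counts = [[30, 20], [28, 22], [31, 19]]
stat, dof = chi_square_independence(counts)

CRIT_5PCT_DF2 = 5.991                          # 5% critical value, 2 degrees of freedom
reject_independence = stat > CRIT_5PCT_DF2
```

For these made-up counts the statistic is far below the critical value, so independence is not rejected, which mirrors the kind of conclusion the abstract reports for actinomycin D.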

Extremes of Bivariate Geometric Variables with Application to Bisexual Branching Processes
Kosto V. Mitov kmitov@af-acad.bg

AMS 2000 Subject Classification: 60J80, 60G70
Key words: bivariate geometric distributions, bisexual branching processes, varying environments, maximum family sizes.
We obtain a limit theorem for the row maximum of a triangular array of bivariate geometric random vectors. An application of this limit theorem is provided for maximum family size within a generation of a bisexual branching process with varying geometric offspring laws.