Optimal Allocation of Stratified Sampling Design Using Gradient Projection Method

This article deals with the problem of finding an optimal allocation of sample sizes in a stratified sampling design so as to minimize the cost function. The iterative procedure of Rosen's gradient projection method is used to solve the resulting nonlinear programming problem (NLPP); when solving the NLPP yields a non-integer solution, the Branch and Bound method is used to obtain an integer solution.


INTRODUCTION
Stratified sampling is one of the areas of statistics most commonly used in all fields of scientific investigation. To achieve greater precision of an estimate, it is desirable to decrease the heterogeneity of the population units. This is achieved by a technique known as stratification, in which the entire population is divided into a number of subpopulations called strata. Generally, stratification is done according to administrative groupings, geographic regions, or auxiliary characters. The strata are non-overlapping and together comprise the whole population, and they are formed so that they are homogeneous within and heterogeneous between. Once the strata have been determined, a sample is drawn from each stratum, the drawing being made independently in different strata.
In stratified sampling the most important consideration is the allocation of sample sizes to the strata, either to minimize the variance subject to a cost restriction or to minimize the cost subject to a variance restriction. The problem of optimally choosing the sample sizes is known as the optimal allocation problem. Optimal allocation for a univariate stratified population was first considered by Neyman (1934). For univariate stratified random sampling, the sample size and its allocation are discussed in Cochran (1977), Sukhatme et al. (1984) and Thompson (1997). The allocation problem becomes more complicated in multivariate surveys, because an allocation that is optimal for one characteristic may be far from optimal for the others. Authors who have addressed the problem of obtaining a compromise allocation that suits all the characteristics include Ghosh (1958), Yates (1960), Aoyama (1963), Hartley (1965), Folks and Antle (1965), Gren (1966), Chatterjee (1972), Chromy (1987), Wywial (1988), Bethel (1989), Kreienbrock (1993), Khan et al. (1997, 2003), Ahsan et al. (2005), Kozak (2006) and Ansari et al. (2009).
Optimum allocation has been stated as a nonlinear mathematical programming problem in which the objective function is the variance subject to a cost restriction, or vice versa. This problem has been solved using Lagrange's multiplier method (see Sukhatme et al. (1984)) or the Cauchy-Schwarz inequality (see Cochran (1977) for the univariate case and Arthanari and Dodge (1981) for the multivariate one), both from a deterministic point of view. The objective of this paper is to use Rosen's gradient projection method to determine the optimal allocation of sample sizes in a stratified sampling design.
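As a point of comparison for the univariate case, the Lagrange multiplier approach leads to a well-known closed form: when the cost is minimized for a fixed variance V of the stratified mean, each n_i is proportional to W_i S_i / sqrt(c_i). The sketch below illustrates this under stated assumptions; the function names and the numerical values of W, S, c, N and V are hypothetical and are not taken from the paper.

```python
import math

def min_cost_allocation(W, S, c, N, V):
    """Closed-form allocation minimising sum(c_i * n_i) for a target variance V.

    W: stratum weights N_i / N, S: stratum standard deviations,
    c: per-unit costs, N: population size (all illustrative inputs).
    """
    # The Lagrange conditions give n_i proportional to W_i * S_i / sqrt(c_i);
    # the scale factor follows from the variance constraint holding with equality.
    num = sum(w * s * math.sqrt(ci) for w, s, ci in zip(W, S, c))
    den = V + sum(w * s * s for w, s in zip(W, S)) / N
    scale = num / den
    return [scale * w * s / math.sqrt(ci) for w, s, ci in zip(W, S, c)]

def variance_of_mean(W, S, N, n):
    """Variance of the stratified mean with the finite population correction."""
    sampling = sum(w * w * s * s / ni for w, s, ni in zip(W, S, n))
    correction = sum(w * s * s for w, s in zip(W, S)) / N
    return sampling - correction

# Hypothetical two-stratum example (values chosen only for illustration).
W, S, c, N, V = [0.4, 0.6], [2.0, 3.0], [3.0, 4.0], 450, 0.30
n = min_cost_allocation(W, S, c, N, V)
print(n, variance_of_mean(W, S, N, n))
```

The allocation returned is generally non-integer, which is the same difficulty the multivariate problem runs into later.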

Formulation of the problem
Let us assume that there are p characteristics under study, and let $Y_j$ be the jth characteristic to be considered. Suppose the population of N units is divided into L strata, with $N_i$ units in the ith stratum, such that $\sum_{i=1}^{L} N_i = N$. We also assume that $n_i$ units are drawn independently from each stratum and that $\bar{y}_{ij}$ is an unbiased estimate of the stratum mean $\bar{Y}_{ij}$, where
$$\bar{y}_{ij} = \frac{1}{n_i} \sum_{h=1}^{n_i} y_{ijh}, \quad i = 1, 2, \ldots, L, \quad j = 1, 2, \ldots, p,$$
and $y_{ijh}$ is the observed value of $Y_j$ for the hth sample unit in the ith stratum.
Then $\bar{y}_{j,st} = \sum_{i=1}^{L} W_i \bar{y}_{ij}$, with stratum weights $W_i = N_i / N$, is an unbiased estimate of the population mean $\bar{Y}_j$. The precision of this estimate is measured by the variance of the sample estimate of the population characteristic.
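The estimator and its precision can be illustrated with a short sketch. It computes the stratified mean, the sum over strata of (N_i / N) times the stratum sample mean, and the usual estimate of its variance, the sum of W_i^2 s_i^2 (1/n_i - 1/N_i). The function names and the sample values are hypothetical and serve only as an illustration for a single characteristic.

```python
from statistics import mean, variance

def stratified_mean(samples, stratum_sizes):
    """Weighted mean of the stratum sample means, with weights N_i / N."""
    N = sum(stratum_sizes)
    return sum((Ni / N) * mean(y) for y, Ni in zip(samples, stratum_sizes))

def stratified_variance(samples, stratum_sizes):
    """Estimated variance of the stratified mean (sample variances s_i^2)."""
    N = sum(stratum_sizes)
    total = 0.0
    for y, Ni in zip(samples, stratum_sizes):
        W = Ni / N
        n = len(y)
        # statistics.variance uses the (n - 1) denominator
        total += W * W * variance(y) * (1.0 / n - 1.0 / Ni)
    return total

# Hypothetical data: two strata, one characteristic.
samples = [[2.0, 4.0, 6.0], [10.0, 12.0]]
stratum_sizes = [30, 70]
print(stratified_mean(samples, stratum_sizes))
print(stratified_variance(samples, stratum_sizes))
```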
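Also, let $c_i$ be the cost of sampling all the p characteristics on a single unit in the ith stratum. The total variable cost of the survey, assuming linearity, is $C = \sum_{i=1}^{L} c_i n_i$. Here we consider the problem of deriving statistical information on population characteristics from sample data. It can be formulated as an optimization problem in which we determine the optimum allocation of sample sizes $n_i$ (i = 1, 2, ..., L) such that the cost of the survey is minimized. Multivariate sample design and optimization has been treated as a mathematical programming problem (Arthanari and Dodge (1981)). Thus the problem of allocation can be stated as (see Sukhatme et al. (1984) and Arthanari and Dodge (1981)):
$$\text{(P1)} \quad \text{Minimize } C = \sum_{i=1}^{L} c_i n_i \quad \text{subject to} \quad \sum_{i=1}^{L} W_i^2 S_{ij}^2 \left( \frac{1}{n_i} - \frac{1}{N_i} \right) \le V_j, \; j = 1, \ldots, p, \qquad 1 \le n_i \le N_i, \; i = 1, \ldots, L,$$
where $W_i = N_i / N$ is the stratum weight, $S_{ij}^2$ is the variance of the jth characteristic in the ith stratum, and $V_j$ is the allowable error in the estimate of the jth characteristic. Here the objective function and the bounds on $n_i$ are linear, but the variance restriction is nonlinear. Writing $a_{ij} = W_i^2 S_{ij}^2$ and $v_j = V_j + \sum_{i=1}^{L} W_i^2 S_{ij}^2 / N_i$, the problem (P1) can also be written as
$$\text{(P2)} \quad \text{Minimize } \sum_{i=1}^{L} c_i n_i \quad \text{subject to} \quad \sum_{i=1}^{L} \frac{a_{ij}}{n_i} \le v_j, \; j = 1, \ldots, p, \qquad 1 \le n_i \le N_i.$$
With the substitution $x_i = 1 / n_i$, the problem (P1) can be equivalently written as
$$\text{(P3)} \quad \text{Minimize } Z = \sum_{i=1}^{L} \frac{c_i}{x_i} \quad \text{subject to} \quad \sum_{i=1}^{L} a_{ij} x_i \le v_j, \; j = 1, \ldots, p, \qquad \frac{1}{N_i} \le x_i \le 1.$$
Each term $c_i / x_i$ is strictly convex for $c_i > 0$, so the objective function is strictly convex; the constraints define a bounded convex feasible region, and hence an optimal solution exists.
Dalenius (1957) proposed a graphical solution to the problem for two characteristics. Kokan and Khan (1967) proved the existence and uniqueness of the solution and gave the optimal solution through an iterative procedure. Chatterjee (1968) gave an algorithm to solve the problem. In 1960, Rosen developed the gradient projection method for linear constraints, and in 1961 generalized it to nonlinear constraints.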
The method uses the projection of the negative gradient in such a way that the objective function improves while feasibility is maintained. Although Rosen described the method for a general nonlinear programming problem, its effectiveness is confined primarily to problems in which all the constraints are linear. The procedure for applying the gradient projection method is described in the algorithm below.
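For reference, the quantities used by the method in the linear-constraint case can be written in the standard textbook form below. The notation is an assumption made for this summary: $M$ denotes the matrix whose columns are the gradients of the constraints that are active at the current point $X$ (it is written $M$ here to avoid a clash with the population size $N$), and $f$ is the objective function.

```latex
% Projection matrix onto the subspace tangent to the active constraints
P = I - M \left( M^{T} M \right)^{-1} M^{T}

% Search direction: projected negative gradient
S = -P \, \nabla f(X)

% Multipliers associated with the active constraints
\lambda = -\left( M^{T} M \right)^{-1} M^{T} \, \nabla f(X)

% X is optimal when S = 0 and every component of \lambda is non-negative;
% if S = 0 but some component is negative, the constraint with the most
% negative multiplier is dropped from the active set and P is recomputed.
```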

Algorithm of Gradient Projection method
Step 1: Read in $X^{(0)}$, an initial feasible point.
Step 2: Set the iteration counter k = 0.
Step 3: Compute …
Step 7: Compute the maximum step size λ* that minimizes …
Step 8: …
Step 9: else
Step 10: …
Step 11: if all the components of λ are non-negative then
Step 12: optimal solution
Step 13: else
Step 14: delete the most negative value of λ from $P_k$.
Step 15: end if
Step 16: end if
Step 17: until all components of λ are non-negative.
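The following is a minimal sketch of a gradient projection iteration for linear inequality constraints, applied to an objective of the form sum(c_i / x_i) with x_i = 1 / n_i. It is an illustration under stated assumptions rather than the authors' implementation: the function `gradient_projection`, the coefficient matrix `a`, and the stratum sizes 180 and 270 are hypothetical, so the result is not expected to reproduce the figures of the numerical illustration. Only the unit costs 3 and 4 and the limits 0.30, 0.60 and 0.50 are taken from the text. The line search is done by bisection on the directional derivative, which relies on the objective being convex.

```python
import numpy as np

def gradient_projection(grad, A, b, x0, tol=1e-8, max_iter=100):
    """Minimise a convex function over {x : A x <= b} from a feasible x0.

    grad returns the gradient; active constraint rows are assumed independent.
    Returns the final point and the multipliers of its active constraints.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        # constraints that hold with equality at the current point
        active = [i for i in range(len(b)) if b[i] - A[i] @ x <= 1e-9]
        while True:
            if active:
                M = A[active].T                       # columns: active normals
                lam = -np.linalg.solve(M.T @ M, M.T @ g)
                d = -(g + M @ lam)                    # projected negative gradient
            else:
                lam, d = np.zeros(0), -g
            if np.linalg.norm(d) > tol * (1.0 + np.linalg.norm(g)):
                break                                 # usable search direction
            if lam.size == 0 or lam.min() >= -tol:
                return x, dict(zip(active, lam))      # optimality conditions hold
            active.pop(int(np.argmin(lam)))           # drop most negative multiplier
        d = d / np.linalg.norm(d)                     # normalised search direction
        # largest step that keeps every inactive constraint satisfied
        steps = [max(b[i] - A[i] @ x, 0.0) / (A[i] @ d)
                 for i in range(len(b)) if i not in active and A[i] @ d > 1e-12]
        if not steps:
            raise ValueError("feasible region is unbounded along the direction")
        t_max = min(steps)
        if grad(x + t_max * d) @ d <= 0.0:
            t = t_max                                 # still descending at the boundary
        else:
            lo, hi = 0.0, t_max                       # bisection on the slope along d
            for _ in range(100):
                t = 0.5 * (lo + hi)
                if grad(x + t * d) @ d < 0.0:
                    lo = t
                else:
                    hi = t
        x = x + t * d
    raise RuntimeError("gradient projection did not converge")

c = np.array([3.0, 4.0])                  # unit costs from the illustration
N_i = np.array([180.0, 270.0])            # hypothetical stratum sizes (sum 450)
a = np.array([[0.64, 3.24],               # hypothetical coefficients a_ij
              [2.56, 9.00],
              [1.44, 12.96]])
v = np.array([0.30, 0.60, 0.50])          # variance limits from the illustration
# Rows: variance limits, then x_i <= 1, then x_i >= 1 / N_i
A = np.vstack([a, np.eye(2), -np.eye(2)])
b = np.concatenate([v, np.ones(2), -1.0 / N_i])

x_opt, multipliers = gradient_projection(lambda x: -c / x**2, A, b, [0.02, 0.01])
n_opt = 1.0 / x_opt
cost = float(c @ n_opt)
print(n_opt, cost, multipliers)
```

Working in the variables x_i keeps all constraints linear, which is the setting in which the projection step is exact; the lower bounds on x_i also keep the iterates away from zero, where the objective is undefined.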

Numerical illustration
The data are taken from Mohd. Vaseem Ismail et al. (2010). The population contains 450 units and has been divided into two strata, with three characteristics under study; the stratum weights and stratum variances are given in Table 1.
Assume that the available budget is C = 400 units, including an overhead cost of $C_0$ = 300 units, so that the total amount available for the survey is 100 units. We also assume that the costs of measurement in the two strata are $C_1 = 3$ and $C_2 = 4$ in the linear cost function. These parameter values are substituted into problem (P3) to state the allocation problem. It is also assumed that the variance of the estimate for each characteristic cannot be greater than the specified limit, i.e. 0.30, 0.60 and 0.50.

Solution procedure
Step 1: Let the initial feasible point be $X^{(0)}$ = …
Step 2: Set the iteration number as k = 0.
Step 3: Since … for j = 2, we have … The search direction $S^{(0)}$ is given by … The normalized search direction can be obtained as …
Step 4: $S^{(0)} \neq 0$.
Step 5: We obtain a new point as $X^{(1)} = X^{(0)} + \lambda S^{(0)}$, where λ is the step size.
Step 6: Compute the vector λ at $X^{(1)}$ as … The non-negative value of λ indicates that we have reached the optimum point, and the optimum value is Min Z = 94.85.
Step 11: The optimum allocation is … The sample sizes $n_1$ and $n_2$ are required to be integers. For practical purposes, if the solution is non-integer, the NLPP is solved using the Branch and Bound method instead of rounding the non-integer sample sizes to the nearest integers, because for small samples the rounded allocation may become infeasible or non-optimal. To obtain integer values we therefore apply the Branch and Bound method to problem (P1). Using LINGO 13.0, we obtain the integer optimum allocation $n_1 = 4$ and $n_2 = 21$, and the optimal value is 96.
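The point about rounding can be checked on a small example. The sketch below does not use Branch and Bound or LINGO; it simply enumerates every integer allocation, which is feasible only because there are two strata. It assumes variance restrictions of the form sum(a_ij / n_i) <= v_j. The coefficients `a` and the stratum sizes are hypothetical, so the output differs from the allocation reported above; only the unit costs and the variance limits come from the text.

```python
import itertools

c = (3, 4)                                  # unit costs C_1, C_2 from the illustration
N_i = (180, 270)                            # hypothetical stratum sizes
a = ((0.64, 3.24),                          # hypothetical coefficients a_ij
     (2.56, 9.00),
     (1.44, 12.96))
v = (0.30, 0.60, 0.50)                      # variance limits from the illustration

def feasible(n):
    """True if every variance restriction holds for the allocation n."""
    return all(sum(aj[i] / n[i] for i in range(len(n))) <= vj
               for aj, vj in zip(a, v))

def cost(n):
    """Total variable cost of the allocation n."""
    return sum(ci * ni for ci, ni in zip(c, n))

# Exhaustive search over all integer allocations with 1 <= n_i <= N_i.
grid = itertools.product(range(1, N_i[0] + 1), range(1, N_i[1] + 1))
best = min((n for n in grid if feasible(n)), key=cost)

# For these made-up coefficients the continuous optimum is roughly
# (12.86, 33.40); rounding it to the nearest integers gives (13, 33).
rounded = (13, 33)
print(best, cost(best), feasible(rounded))
```

With these made-up coefficients the rounded allocation violates the third variance limit, while the enumerated integer optimum stays feasible at a slightly higher cost than the continuous solution.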

CONCLUSION
When solving the NLPP yields a non-integer solution, the Branch and Bound method is used to obtain an integer solution of the NLPP.