Ehsan's Statistical pages

These web pages are maintained by Mohammad Ehsanul Karim, Institute of Statistical Research and Training, University of Dhaka, Dhaka-1000, Bangladesh.

## Selection of the Best Regression Equation by Sorting Out Variables

In many practical applications of regression analysis, the set of variables to be included in the regression model is not predetermined, and selecting these variables is often the first part of the analysis. Topics covered: the model misspecification problem, purposes of regression equations, steps in selecting the best regression model, criteria for selecting a model (R-sq, Cp, residual MS), and strategies for selecting variables (all possible regressions, backward elimination, forward selection, and stepwise regression procedures) using statistical packages. Care to know more? Click here, and for the PDF version (UPDATED) click here!
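For reference, the three criteria named above are commonly written as follows. The notation is an assumption made here, not taken from the linked page: a candidate model with $p$ parameters (including the intercept) is fitted to $n$ observations and has residual sum of squares $RSS_p$; $SS_{total}$ is the corrected total sum of squares; and $s^2$ is the residual mean square of the full model containing all candidate variables.

$$
R_p^2 = 1 - \frac{RSS_p}{SS_{total}}, \qquad
s_p^2 = \frac{RSS_p}{n - p}, \qquad
C_p = \frac{RSS_p}{s^2} - (n - 2p)
$$

Under this convention, a candidate model with little bias tends to have $C_p$ close to $p$.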

## Naogaon Zila: District - 64

Measuring inequality in Naogaon Zila. General information on Naogaon Zila (District - 64 of the former Rajshahi district of Bangladesh): its economic situation, profile, natural calamities, the consequences of inequality, and the measurement of inequality in income, expenditure, and land (using the Gini index, Lorenz ratio, and Atkinson measure of inequality, along with R packages and easy-to-understand S language commands). Care to know more? Click here!
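As a rough illustration of the Gini index mentioned above, here is a minimal Python sketch. It is not the R or S code from the linked page; the function name `gini` and the income figures are hypothetical, and the rank-weighted formula used is one common way to compute the index for ungrouped data.

```python
def gini(values):
    """Gini index of non-negative values (0 means perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # where x_1 <= ... <= x_n and i is the rank of x_i.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical household incomes, for illustration only.
incomes = [10, 20, 30, 40]
print(gini(incomes))
```

With this formula, identical incomes give an index of 0, and a single household holding everything gives (n - 1) / n, which approaches 1 as n grows.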

## Non-Linear Models

Intrinsically linear regression models, intrinsically non-linear regression models, least squares in the non-linear case, notation, and procedure. Approaches to estimation in non-linear regression models: direct search (trial-and-error or derivative-free technique), linearization (the iterative or Gauss-Newton method, in full detail), steepest descent (direct optimization), and Marquardt’s compromise. In addition to these approaches, several other currently employed methods for obtaining the parameter estimates are available in statistical computer packages. Also covered: choice of initial starting values (grid search), stopping rules, and non-linear regression using statistical software (SAS, R, NLREG). A simple illustrative example is also given. Care to know more? Click here (UPDATED) for a PDF (complete) version, or here for the descriptive part only, or here for an example only!
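To make the linearization (Gauss-Newton) approach concrete, the following Python sketch fits a two-parameter model by repeatedly solving the linearized least squares problem. The model y = a * exp(b * x), the function name `gauss_newton_exp`, the data, and the starting values are all assumptions chosen for illustration; this is not the example from the linked page.

```python
import math


def gauss_newton_exp(xs, ys, a, b, max_iter=50, tol=1e-10):
    """Fit y = a * exp(b * x) by Gauss-Newton, starting from (a, b)."""
    for _ in range(max_iter):
        # Accumulate J'J (2x2) and J'r (2x1), where r = y - f(x; a, b)
        # and J holds the partial derivatives of f with respect to a and b.
        s_aa = s_ab = s_bb = g_a = g_b = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = y - a * e
            da = e           # df/da
            db = a * x * e   # df/db
            s_aa += da * da
            s_ab += da * db
            s_bb += db * db
            g_a += da * r
            g_b += db * r
        det = s_aa * s_bb - s_ab * s_ab
        if det == 0.0:
            raise ZeroDivisionError("singular normal equations")
        # Solve (J'J) * step = J'r for the 2x2 case by Cramer's rule.
        step_a = (s_bb * g_a - s_ab * g_b) / det
        step_b = (s_aa * g_b - s_ab * g_a) / det
        a += step_a
        b += step_b
        # Stopping rule: stop when the parameter update is negligible.
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b


# Hypothetical noise-free data generated from a = 2, b = 0.5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
print(gauss_newton_exp(xs, ys, a=1.5, b=0.4))
```

Plain Gauss-Newton can diverge from poor starting values, which is why the page also lists grid search for initial values and Marquardt’s compromise.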

## Analysis of Regression using S-plus / R with Diagnostics

How do you run a regression and draw its plots in R or S-plus? What should you do in the presence of heteroscedastic errors when using least squares methods? Care to know more? Click here! Also see Simple Linear Regression, Weighted Least Squares, Identifying Outliers, and Transformation, which use the same data.

## Getting Started with R and S-plus

If you are a total “newbie” in S-plus (or R, a different, non-commercial software that originated from the same S language; learning either one is enough to work in the other), get at least S-plus 4.0 or higher, or R 1.3.1 or higher, and just type the commands given here after the “ > ” sign in the “commands window” in S-plus or in the “R console” in R. In most cases I present the respective results as well, in case you would like to match your results with mine. Also, whenever I used # [….] after a command, write the command without the # [….]. The main topics discussed here are: getting help from S-plus help, assigning values to variables, arithmetic operations and built-in functions in S-plus, removing an object or variable, working with vectors, working in character mode, working in logical mode, working in complex mode, creating sequential objects, creating repetitive objects, matrix algebra, using frames, using lists, working with missing values, matrix operations, using graphs, making use of external resources, programming with S-plus, and so on. Care to know more? Click here!

## Fitting Orthogonal Polynomial Model

A simple illustrative example of fitting an orthogonal polynomial model is given here. Care to know more? Click here for a PDF version!

## Simple Linear Regression

Introduction to bivariate relationships (generating scatter plots by computer), regression models (population and sample regression models, significance of the stochastic disturbance term), exploring regression analysis (historical background, modern interpretation of “regression”), finding the least squares equation (inference by testing hypotheses, and some warnings), correlation analysis (which one is preferred most?), assumptions that we make in linear regression analysis (15 assumptions!), a case study following a general strategy for regression, and using statistical software packages in regression analysis (SPSS, SAS, MATLAB, Minitab, S-plus, R, STATA), including biographies of some personalities who contributed to statistics (Gosset, Pearson, Galton, Gauss) and Student’s t table. The full text is also downloadable in PDF format! Care to know more? Click here! Also, for a PDF (partial) version, click here!

## Epidemiology: The Primer

Contents:

- Chapter I: Basic Terminology
- Chapter II: Types of Epidemiological Studies
  - I. Observational Study
    - a) Descriptive Epidemiology
      - A. Measurement
        - Measures of Disease Frequency
          - a) Incidence
            - 1) Risk
            - 2) Rate
          - b) Prevalence
          - c) Removal
      - B. Distribution
        - Person
        - Place
        - Time
      - Process of formulating Hypotheses
      - Sources of Epidemiological Data
    - b) Analytic Epidemiology
      - [1] Cohort Study
      - [2] Case-Control Study
      - [3] Cross-sectional Study
  - II. Experimental (or Interventional) Study
    - a) Random Experiments or Randomized Trials
    - b) Quasi-Experiments
- Summary: Epidemiological Flow Chart

Last updated: 31st Dec, 04. To learn more about this, download this PDF format file!

## Bias - Direction and Classification in Epidemiologic Studies

Conceptual framework: concepts and terminology (external population, target population, actual population, study population, accuracy, reliability, validity), direction of bias (positive bias, negative bias, towards the null, away from the null), and classifying sources of bias (selection bias, information bias, and confounding bias, with their assessment and control at the design stage or in the analysis). Care to know more? Click here!

## Drawing Correlogram Using Stata version 7.0

An example using the Stata commands and options "generate", "format", "egen", "tsset", "corrgram", and "lags()". Care to know more? Click here!
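A correlogram plots the sample autocorrelation of a series against the lag. The following Python sketch illustrates that quantity only; it is not the Stata code from the linked page, and the function name `autocorrelations`, the series, and the text-based bars are hypothetical. It assumes the usual estimator that divides the lag-k sum of cross-products by the lag-0 sum of squares.

```python
def autocorrelations(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a sequence of numbers."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)  # lag-0 sum of squares
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# Hypothetical short series, for illustration only.
series = [2.0, 3.5, 3.0, 4.5, 5.0, 4.0, 6.0, 6.5]
for lag, r in enumerate(autocorrelations(series, 3)):
    bar = "*" * round(abs(r) * 20)  # crude text stand-in for the plot
    print(f"lag {lag}: {r:+.3f} {bar}")
```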

## Methods of collecting the information

Every survey expert has his own ideas, but however well grounded in past experience these are, they have neither the certainty nor the objectivity that goes with his choice of sample design. Perhaps matters like interviewing and questionnaire design can never achieve a theoretical basis in the sense that sampling has one, but research on methods of data collection must be given priority if the development of this aspect of surveys is to catch up with that of sampling and analytical techniques. Methods of obtaining data about a group of people can be classified in many ways: documentary sources or transcription from records, physical observation or measurement, mail questionnaire or mail enquiry, face-to-face interviewing, schedule, indirect oral interviews, information from correspondents, telephone interview, method of registration, mall intercept method, e-mail enquiry, online survey, and many others. The full text is also downloadable in PDF format here! Care to know more? Click here!

## Nominal Associations

Association refers to coefficients which gauge the strength of a relationship. Coefficients in this section are designed for use with nominal data. Though nominal-level coefficients may be computed for ordinal data or higher, measures designed for higher levels have greater power and are preferred. Measures for dichotomous nominal data are discussed separately. Cohen's Kappa, a nominal measure of agreement, is not discussed here. Of the measures discussed here, phi, the contingency coefficient, Tschuprow's T, and Cramer's V are based on adjusting the chi-square statistic to factor out sample size. These measures do not lend themselves to an easily expressible interpretation. The adjusted contingency coefficient C* and Cramer's V vary between 0 and 1 regardless of table size, whereas phi, C, and T do not. V is by far the most used measure of association for this subset. Lambda is a popular measure because of its easily understood interpretation in terms of proportionate reduction in error (PRE), varying from 0 to 1, but it defines perfect association as predictive monotonicity and null relationship as accord, unlike most measures, which use the criteria of strict monotonicity and statistical independence. The Uncertainty Coefficient also has a PRE meaning, but its formula takes into account the entire distribution rather than just the mode (which lambda uses) and therefore may be preferred to lambda. Care to know more? Click here!
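The chi-square-based coefficients named above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not code from the linked page: the function names `chi_square` and `nominal_measures` and the 2x2 table are hypothetical, the table is assumed to have no empty rows or columns, and the coefficients use their standard textbook definitions in terms of the Pearson chi-square statistic and the sample size n.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and sample size of a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence (assumed to be non-zero).
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def nominal_measures(table):
    """phi, contingency coefficient C, Tschuprow's T, and Cramer's V."""
    chi2, n = chi_square(table)
    r, c = len(table), len(table[0])
    return {
        "phi": math.sqrt(chi2 / n),
        "C": math.sqrt(chi2 / (chi2 + n)),
        "T": math.sqrt(chi2 / (n * math.sqrt((r - 1) * (c - 1)))),
        "V": math.sqrt(chi2 / (n * min(r - 1, c - 1))),
    }


# Hypothetical 2x2 table of counts, for illustration only.
table = [[10, 20],
         [20, 10]]
for name, value in nominal_measures(table).items():
    print(f"{name}: {value:.4f}")
```

For a 2x2 table, phi, T, and V coincide, while C is smaller because its denominator adds the chi-square statistic to n.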

## A Few Programs Written in Fortran

Computer programming is an important skill for experimental, observational, and theoretical scientific work, and Fortran is (still!) one of the most important computer languages used for such work. The only real way to learn to program is by writing programs, so this manual is structured around a set of simple examples. All the programs provided here were written and tested on ProFor [ Prospero Fortran (1989) version iio and iid 2.156, compatible with the Prospero DOS Extender Kit ]. To understand the examples, you need to know the basics of Fortran 77 or 90 first. I have tried to order them by type, which may help beginners. The examples cover advanced programming, statistical programming in Fortran, programming related to matrices, programming using functions and subroutines, and more. Care to know more? Click here!

## Intelligence: Theories and Tests

Definitions of intelligence, theories, tests, and controversies and issues in intelligence. Care to know more? Click here for PDF version!

## Basics of Demography in the Context of Bangladesh

What is demography, and what are its different components? Need a discussion of population in the context of Bangladesh? What are the basic sources of demographic data in Bangladesh? Malthus's theory of population, the demographic transition theory with a discussion of its stages, and much more. Care to know more? Click here!

## Analyzing Survey Data with Available Statistical Software Packages

Before processing survey data, it is important to know the type of data and the analysis we have to handle according to our pre-specified aim or objective. Selecting an appropriate statistical package depends on what sort of analysis we are doing on what sort of data. Therefore we first need to fix detailed rules about what we are trying to do and what specific techniques we might want to apply. However, the statistics package used most often sometimes comes down to personal preference. Here we discuss some packages with specific features for analyzing survey data: SAS, STATA, SPSS, S-PLUS, R, MATLAB, SUDAAN, WesVar, Epi Info, CENVAR, etc. Care to know more? Click here for PDF version!

## Using Statistical Software Packages in Statistical Data Analysis

Many statisticians use several statistical packages at the same time. The core statistical capabilities exist in each of the packages, but each has its own strengths and ease-of-use features for different types of analysis. We used some of the most popular statistical packages that were available to us: SPSS, SAS, MATLAB, Minitab, S-plus, R, STATA. Care to know more? Click here for PDF version!
