Showing posts with label logistic. Show all posts

Friday, April 06, 2007

Confidence intervals for the predicted values - logistic regression

Using predict after logistic to get predicted probabilities and confidence
intervals is somewhat tricky. The following two commands will give you
predicted probabilities:


        . logistic ...
        . predict phat


The following does not give you the standard error of the predicted
probabilities:


        . logistic ...
        . predict se_phat, stdp


Despite the name we chose, se_phat does not contain the
standard error of phat. What does it contain? The standard error
of the predicted index. The index is the linear combination of the estimated
coefficients and the values of the independent variable for each observation
in the dataset. Suppose we fit the following logistic
regression model:


        . logistic y x 


This command estimates b0 and b1 of the following model:


P(y = 1) = exp(b0 + b1*x)/(1 + exp(b0 + b1*x))

Here the index is b0 + b1*x. We could get
predicted values of the index and its standard error as follows:

        . logistic y x
        . predict lr_index, xb
        . predict se_index, stdp


We could transform our predicted value of the index into a predicted
probability as follows:


. gen p_hat = exp(lr_index)/(1+exp(lr_index))


This is just what predict does by default after a logistic regression
if no options are specified. Using a similar procedure, we can get a 95%
confidence interval for our predicted probabilities by first generating the
lower and upper bounds of a 95% confidence interval for the index and then
converting these to probabilities:



. gen lb = lr_index - invnorm(0.975)*se_index
. gen ub = lr_index + invnorm(0.975)*se_index
. gen plb = exp(lb)/(1+exp(lb))
. gen pub = exp(ub)/(1+exp(ub))
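
For readers who want to check the arithmetic outside Stata, the same steps can be sketched in Python. This is only an illustration: the function names and the example index and standard error are our own assumptions, not values from any fitted model.

```python
import math
from statistics import NormalDist

def inv_logit(xb):
    """Map an index value (log-odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-xb))

def prob_ci_from_index(xb, se_xb, level=0.95):
    """Build the interval on the index scale, then convert to probabilities."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # same role as invnorm(0.975)
    lb = xb - z * se_xb   # lower bound of the index
    ub = xb + z * se_xb   # upper bound of the index
    return inv_logit(lb), inv_logit(xb), inv_logit(ub)

# Hypothetical observation: index 0.4 with standard error 0.25
plb, p_hat, pub = prob_ci_from_index(0.4, 0.25)
```

Because the inverse logit is increasing, the transformed bounds keep their order, so plb < p_hat < pub.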


Generating the confidence intervals for the index and then converting them to
probabilities is better than estimating the standard error of the predicted
probabilities and building the confidence intervals directly from that
standard error. The distribution of the predicted index is closer to normality
than that of the predicted probability, and the converted bounds always lie
between 0 and 1.
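
A small numerical comparison may make the point concrete. The sketch below, in Python, contrasts an interval built directly on the probability scale with one built on the index scale. The direct standard error uses the delta-method approximation p(1-p) times the standard error of the index, which is our assumption about how such a standard error would be obtained, and the index 3.5 with standard error 1.0 is a hypothetical observation.

```python
import math
from statistics import NormalDist

def inv_logit(xb):
    """Map an index value (log-odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-xb))

z = NormalDist().inv_cdf(0.975)

# Hypothetical observation with a large index, so the probability is near 1
xb, se_xb = 3.5, 1.0
p = inv_logit(xb)

# Interval built directly on the probability scale (delta-method standard error)
se_p = p * (1.0 - p) * se_xb
direct_lb, direct_ub = p - z * se_p, p + z * se_p

# Interval built on the index scale and then converted
index_lb, index_ub = inv_logit(xb - z * se_xb), inv_logit(xb + z * se_xb)
```

With these values the direct upper bound exceeds 1, which is not a valid probability, while the converted bounds stay inside the unit interval.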

    Wednesday, March 28, 2007

    Hosmer and Lemeshow Test

    The Hosmer-Lemeshow goodness-of-fit test can be requested with the lackfit option on the model statement. The test divides subjects into roughly ten groups (deciles) based on predicted probabilities, then computes a chi-square statistic from the observed and expected frequencies.
    It tests the null hypothesis that there is no difference between the observed and predicted values of the response variable. Therefore, when the test is not significant, as in this example, we cannot reject the null hypothesis, which is consistent with the model fitting the data well. We can also request the generalized R-square measure for the model with the rsquare option (abbreviated rsq) on the model statement. SAS gives the likelihood-based pseudo R-square measure and its rescaled version.

    Categorical Data Analysis Using The SAS System, by M. Stokes, C. Davis and G. Koch, offers more details on how the generalized R-square measures that you can request are constructed and how to interpret them.
    proc logistic data = hsb2;
    class prog(ref='1') / param = ref;
    /* rsq: generalized R-square measures; lackfit: Hosmer-Lemeshow test */
    model hiwrite(event='1') = female prog read math / rsq lackfit;
    run;
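
To see what the test computes, here is a simplified sketch of the statistic in Python. It assumes equal-sized groups formed by sorting on the predicted probability, which is a simplification and will not necessarily reproduce the grouping SAS uses; the function name is our own.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Return (chi-square statistic, degrees of freedom) for outcomes y and fitted p."""
    # Order observations by predicted probability (stable sort keeps ties in input order)
    pairs = sorted(zip(p, y), key=lambda t: t[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(pi for pi, _ in chunk)   # expected number of events
        observed = sum(yi for _, yi in chunk)   # observed number of events
        if 0 < expected < size:
            # Contribution from events, then from non-events
            stat += (observed - expected) ** 2 / expected
            stat += ((size - observed) - (size - expected)) ** 2 / (size - expected)
    return stat, groups - 2
```

The statistic is compared with a chi-square distribution on groups - 2 degrees of freedom. A small value means observed and expected counts agree closely within the groups.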

    Friday, February 02, 2007

    proc logistic

    ods trace on;

    This writes the name of every output table the procedure produces to the log,
    for example Type3, so that we can capture those tables with ods output.

    proc logistic data = k outest=jj;
    class ses race schtyp/param =glm;
    model female = age ses race schtyp science write/outroc =roc lackfit;
    units age=35 45 55;
    ods output ParameterEstimates = model_female
    Type3=chisq;
    run;

    I want to keep only the significant effects (p-value of 0.05 or less):

    data chisq1;
    set chisq;
    /*format ProbChiSq;*/
    if ProbChiSq>0.05 then delete;
    run;


    PROC SQL ;
    SELECT effect INTO :mm separated BY " "
    from chisq1;
    quit;
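
The filter-and-join logic of the data step and PROC SQL above can be sketched in Python as follows. The effect names and p-values are hypothetical, chosen only to illustrate the selection rule, and the function name is our own.

```python
# Hypothetical Type 3 rows: (effect name, p-value); illustrative values only
type3 = [("age", 0.01), ("ses", 0.20), ("race", 0.03), ("write", 0.049)]

def significant_effects(rows, alpha=0.05):
    """Join the names of effects whose p-value does not exceed alpha."""
    return " ".join(name for name, pval in rows if pval <= alpha)

# Plays the role of the macro variable mm
mm = significant_effects(type3)
```

As with the macro variable, the result is a single space-separated string that can be pasted into a model specification.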




    %put &mm;
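    proc logistic data = k outest=jj;
    /* Caution: &mm can include continuous effects such as age; only categorical effects belong on class */
    class &mm/param =glm;
    model female = &mm/outroc =roc lackfit;
    units age=35 45 55;
    ods output ParameterEstimates = model_female
    Type3=chisq;
    run;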