pspp-users

Re: NaN on linear regression with many categorical variables


From: Dr. Walter Statistics
Subject: Re: NaN on linear regression with many categorical variables
Date: Thu, 15 Mar 2018 16:36:13 +0100

Dear Ms Pieri,

Without seeing your data set, it is hard to say definitively why you got these results in PSPP. My first guess is that the number of variables in the analysis leads to multicollinearity (the set of variables is linearly dependent, or almost linearly dependent) and/or a low ratio of cases to variables. At least, when I ran an analysis with a multicollinear set of variables, PSPP printed NaN for the standard errors, t values and significance levels of some variables. The problem disappeared when I removed from the analysis the subset of variables that was linearly dependent on the other variables.
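To illustrate what "linearly dependent" means for 0/1 variables, here is a minimal sketch in Python, run outside PSPP. It is only an illustration under stated assumptions: the column names A to D and the tiny data set are hypothetical, not taken from your file, and the helper functions are my own, not part of PSPP. The idea is to add predictors one at a time next to the intercept column and flag any predictor that does not increase the rank of the design matrix.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix via Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a non-zero pivot in this column at or below the current rank row.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the column entries below the pivot.
        for r in range(rank + 1, len(m)):
            if m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def dependent_columns(data, names):
    """Names of predictors that are linear combinations of the intercept
    and the predictors listed before them. Assumes `data` is non-empty."""
    redundant = []
    kept = [[1] for _ in data]  # design matrix so far: intercept column only
    rank = 1
    for j, name in enumerate(names):
        candidate = [row + [obs[j]] for row, obs in zip(kept, data)]
        new_rank = matrix_rank(candidate)
        if new_rank == rank:
            redundant.append(name)  # adds no new information
        else:
            kept, rank = candidate, new_rank
    return redundant


# Hypothetical example: C is always 1 - A, and D is constant (always 0).
names = ["A", "B", "C", "D"]
data = [
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
print(dependent_columns(data, names))  # prints ['C', 'D']
```

In this toy example C is flagged because it is the exact complement of A, and D because a constant column duplicates the intercept. Exact fractions keep the sketch simple but slow; for a data set with tens of thousands of cases, a numerical library would be the more practical choice.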

Kind regards from Germany,

Dr. Oliver Walter


On 15.03.2018 at 16:13, Elisa Pieri wrote:
Hello,

I'm using PSPP (psppire 0.8.5) on Linux Mint 18.3.

Premise: I'm a big newbie in statistical analysis, so please be patient :)

I have a data set with 23 categorical variables (binary values 0/1) and one continuous variable. I would like to run a linear regression with the continuous variable as the dependent variable, to understand which of the categorical variables have the strongest impact.

The syntax that I'm using is:

REGRESSION
        /VARIABLES= GLU4 HIS8 HIS21 GLU36 ASP57 LYS60 GLU62 HIS69 ASP75 LYS96 LYS97 ASP98 ASP120 GLU123 ASP125 LYS153 GLU160 ASP166 LYS167 ASP198 ASP217 HIS219 ASP226
        /DEPENDENT=      Energy
        /STATISTICS=COEFF R ANOVA.

When I use fewer than 10 variables, the analysis works, but when I use all of them I get a lot of NaN values:

Model Summary (Energy)
#====#========#=================#==========================#
#  R #R Square|Adjusted R Square|Std. Error of the Estimate#
##===#========#=================#==========================#
#|NaN#     NaN|              NaN|                       NaN#
##===#========#=================#==========================#

ANOVA (Energy)
#===========#==============#=====#===========#===#====#
#           #Sum of Squares|  df |Mean Square| F |Sig.#
##==========#==============#=====#===========#===#====#
#|Regression#           NaN|   23|        NaN|NaN| NaN#
#|Residual  #           NaN|39976|        NaN|   |    #
#|Total     #        499,89|39999|           |   |    #
##==========#==============#=====#===========#===#====#

Coefficients (Energy)
#===========#============================#=========================#===#====#
#           # Unstandardized Coefficients|Standardized Coefficients|   |    #
#|          #-----------+----------------+-------------------------+   |    #
#|          #     B     |   Std. Error   |           Beta          | t |Sig.#
##==========#===========#================#=========================#===#====#
#|(Constant)#        NaN|             NaN|                      ,00|NaN| NaN#
#|GLU4      #        NaN|             NaN|                      NaN|NaN| NaN#
#|HIS8      #        NaN|             NaN|                      NaN|NaN| NaN#
#|HIS21     #        NaN|             NaN|                      NaN|NaN| NaN#
#|GLU36     #        NaN|             NaN|                      NaN|NaN| NaN#
#|ASP57     #        NaN|             NaN|                      NaN|NaN| NaN#
#|LYS60     #        NaN|             NaN|                      NaN|NaN| NaN#
#|GLU62     #        NaN|             NaN|                      NaN|NaN| NaN#
#|HIS69     #        NaN|             NaN|                      NaN|NaN| NaN#
#|ASP75     #        NaN|             NaN|                      NaN|NaN| NaN#
#|LYS96     #       -,01|             NaN|                     -,01|NaN| NaN#
#|LYS97     #        ,50|             NaN|                      ,40|NaN| NaN#
#|ASP98     #        ,00|             NaN|                     -,01|NaN| NaN#
#|ASP120    #        ,12|             NaN|                      ,01|NaN| NaN#
#|GLU123    #        ,02|             NaN|                      ,04|NaN| NaN#
#|ASP125    #        ,00|             NaN|                     -,01|NaN| NaN#
#|LYS153    #        ,00|             NaN|                      ,01|NaN| NaN#
#|GLU160    #       -,02|             NaN|                     -,01|NaN| NaN#
#|ASP166    #       -,02|             NaN|                      ,00|NaN| NaN#
#|LYS167    #        ,00|             NaN|                      ,00|NaN| NaN#
#|ASP198    #        ,00|             NaN|                      ,00|NaN| NaN#
#|ASP217    #       -,04|             NaN|                     -,11|NaN| NaN#
#|HIS219    #        ,02|             NaN|                      ,08|NaN| NaN#
#|ASP226    #        ,00|             NaN|                      ,00|NaN| NaN#
##==========#===========#================#=========================#===#====#


Is there a kind soul amongst you that would explain to me what is going on?
Thank you very much in advance.

Elisa


_______________________________________________
Pspp-users mailing list
address@hidden
https://lists.gnu.org/mailman/listinfo/pspp-users

--
Dr. Walter Statistics
Gabelsberger Straße 27
24148 Kiel
Tel.: 0431/7802809
E-Mail: address@hidden
https://www.walter-statistics.com
