Relational databases and (multivariate) statistics

A

======= Date Modified 18 05 2009 11:05:11 =======

Hey,

I was hoping one of the statistically wise ones could spare a few thoughts on this, and apologies if the question is relatively daft; I'm asking about the possible consequences of something I'm not yet familiar with.

I'm in the process of transforming historical data that has a hierarchical structure into computerized form. Using a normal Excel-type spreadsheet for entering the data is rather cumbersome because of its nested structure. I therefore thought it might be useful to familiarize myself with relational databases and construct one (e.g. in MS Access).

My question relates to the future analysis of this data. Can I straightforwardly analyse such a relational database descriptively in standard stats software such as SPSS? I'm also thinking of possibly performing multilevel regression on the data (the only software I'm familiar with for this is HLM). Does the format of a relational database cause any special issues in this case?

I'd be very grateful for any hints..!



K

Hi Apple

If you are thinking of doing multilevel regression you should definitely use a decent statistical package. MLwiN is the specialist software for this and can deal with all the possible regression types. However, if it is just linear multilevel regression you can use Stata or SAS. I'm sure it is possible in SPSS too, but I haven't done it in that. It's really important to enter the data in a format that will be easy to analyse, so that you don't have to faff around rearranging it in software that is new to you. Try getting the manual or looking at the online help for the package you will use, and see how the data is organised in its multilevel regression examples. Also, you may prefer putting the data directly into the analysis package rather than going via Excel, as you can then add value labels, variable descriptions, etc.
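As a rough illustration of what an "easy to analyse" layout can look like when the data starts out in a relational database: many packages work with one row per lowest-level unit, with the higher-level identifier and its variables repeated on each row, though you should check what your own package expects. The sketch below uses Python's built-in sqlite3 module rather than MS Access, and the table and column names (parish, household, region, size) are made up for the example, not taken from any real data set.

```python
import sqlite3

# Hypothetical two-level structure: households (level 1) nested in parishes (level 2).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parish (
    parish_id INTEGER PRIMARY KEY,
    region    TEXT
);
CREATE TABLE household (
    household_id INTEGER PRIMARY KEY,
    parish_id    INTEGER REFERENCES parish(parish_id),
    size         INTEGER
);
INSERT INTO parish VALUES (1, 'north'), (2, 'south');
INSERT INTO household VALUES (10, 1, 4), (11, 1, 6), (12, 2, 3);
""")

# Join the two tables into one flat table: one row per household,
# with the parish identifier and parish-level variable repeated.
rows = conn.execute("""
SELECT h.household_id, h.parish_id, p.region, h.size
FROM household AS h
JOIN parish AS p ON p.parish_id = h.parish_id
ORDER BY h.household_id
""").fetchall()

for row in rows:
    print(row)
```

A flat table like this can be exported (for instance as CSV) and read into a statistics package, with parish_id serving as the grouping variable.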

Good luck
