Help, how to analyse questionnaire data? - eek

T

======= Date Modified 03 Nov 2012 12:16:02 =======
Just met my supervisor and, according to them, the statisticians said that my questionnaire is ok and I can proceed with it.

I have run a pilot study with the questionnaire, collecting data from a pilot sample. When I looked at the collected data, I got a massive headache over how to analyse it :$

How do I go about doing it? Is there a technique for transferring the data and analysing it? Is there a matrix of some sort in which I could sort the data into categories, and start from there?

Thank you so much :-(

H

======= Date Modified 03 Nov 2012 10:15:57 =======
It's hard to know without knowing the kind of questions you have asked. If the answers are, say, numbers on a scale, you can treat them as continuous variables. If the answers are tick box options, you may treat them as categorical variables. Free text entries may need handling with qualitative techniques.
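To make the distinction concrete, here is a minimal Python sketch, not a recommended workflow. The answers and the question names (`likert_q1`, `tick_q2`) are made up for illustration. It summarises a scale item numerically and a tick-box item with frequency counts; free text is left out because it needs qualitative handling.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical answers from six pilot participants (made-up values).
likert_q1 = [4, 5, 3, 4, 2, 4]                          # 1-5 agreement scale
tick_q2 = ["bus", "car", "bus", "bike", "car", "bus"]   # single tick-box choice

def summarise_scale(values):
    """Summaries that treat a scale item as numeric."""
    return {"n": len(values), "mean": mean(values), "median": median(values)}

def summarise_categories(values):
    """Frequency count per option for a categorical item."""
    return dict(Counter(values))

scale_summary = summarise_scale(likert_q1)
category_summary = summarise_categories(tick_q2)
print(scale_summary)
print(category_summary)
```

Whether the mean is a sensible summary for a given scale item is exactly the kind of question to put to the statisticians; the median and the counts are safe either way.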

I STRONGLY recommend that, before you move on to your main study, you talk to the statisticians some more and attend some basic stats classes. If your uni doesn't offer any, or there is no budget for them, there are some freely available online. There is no point in ploughing ahead with loads of data collection and then finding there's some flaw because of the way the questionnaire was designed (though the fact that a statistician has said it's ok is reassuring). But it would help if you yourself understood why it's ok.

Basic statistical analyses can be carried out in Excel, but there are a number of specialist packages too (SPSS, SAS, StatsDirect, Stata, R...). Different disciplines tend to favour different packages, so see what your colleagues use. Which field are you in? SPSS might be the best place to start - check if your uni/department has a license.

B

Hazyjane's answer is excellent - just wanted to support it. If you are at all shaky on stats, data analysis etc., get the necessary training at this stage. If not, you can end up wasting an awful lot of time later if errors in your analysis emerge and there's no easy fix, so you have to repeat work. And although support from statisticians is great, make sure you understand the maths behind their advice, as it's you who will have to defend it in your viva.

T

Hello Hazyjane,

Thank you for the reply.

Yes, 90% of them are numbers on a scale (a Likert scale). I think the other 10% would be free text.

I'm aware of SPSS and Statistics; it's just that I'm not sure how to transfer all the collected data and put it into a manageable form for analysis.

Do you have any links that you could share with me? Especially for the "continuous variable" part.

H

A lot of stats packages allow you to either enter data directly into a data viewer, or import data from other file types. For example, I work with existing databases that are saved as '.csv' files and I then import them into a stats package.

Check the requirements of whatever package you will use, but one thing you could do is to enter your data into Excel, save it in a suitable format and then import it into another program. Use a new row for each study participant, and a new column for each question response.
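As a rough illustration of that layout, here is a small Python sketch using only the standard library. The column names (`participant_id`, `q1`, `q2`, `q3`) and the values are invented; the CSV text stands in for a file saved from a spreadsheet.

```python
import csv
import io

# Hypothetical spreadsheet export: one row per participant, one column per question.
raw_csv = """participant_id,q1,q2,q3
P01,4,5,bus
P02,3,,car
P03,5,4,bus
"""

def load_responses(text, numeric_columns):
    """Read CSV text into a list of dicts, converting scale answers to int.

    Blank cells become None so missing answers are not mistaken for zero.
    """
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        for col in numeric_columns:
            row[col] = int(row[col]) if row[col].strip() else None
        rows.append(row)
    return rows

responses = load_responses(raw_csv, numeric_columns=["q1", "q2"])
for r in responses:
    print(r)
```

With a real file you would pass an opened file object to `csv.DictReader` instead of wrapping a string in `io.StringIO`. A stats package's own import dialog does the same job; the point is only the row-per-participant, column-per-question shape.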

I should point out that I don't typically work with questionnaire data and Likert scales, so someone else who does might have a better suggestion for specific ways to go about this. Likert scales are a funny kind of continuous variable, in that the values are somewhat arbitrarily defined (strictly speaking they are ordinal), unlike true continuous variables such as age, height or weight. Depending on your sample size, you might need to use non-parametric methods such as the Wilcoxon signed-rank test (for paired data) or the Mann-Whitney U test (for two independent groups), but this isn't really my area of knowledge, so check with the statistician who approved your questionnaire!
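For a feel of what a rank-based test does, here is a hand-rolled Python sketch of the Mann-Whitney U statistic. The two groups and their answers are made up, and the function names are just for illustration. It only computes U, not a p-value, so treat it as a learning aid rather than a substitute for a stats package.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b

# Hypothetical Likert answers (1-5) to one question from two groups of users.
group_a = [4, 5, 3, 4, 5]
group_b = [2, 3, 3, 1, 4]
u_a, u_b = mann_whitney_u(group_a, group_b)
print(u_a, u_b)
```

The test works on ranks rather than raw values, which is why it copes with scales whose numbers are only ordered labels. Likert data produce many ties, and packages apply a tie correction when turning U into a p-value, so use the package output for anything you report.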

Batfink27

I recommend you find out some more about basic stats too before you collect your data. My university runs basic stats courses and they're great for giving an overview of what the stats can show you, so it's worth finding out if your uni runs something similar. If you've never done any stats before it can be a bit of a steep learning curve, but don't panic, you'll be fine. There are two things to look at - the statistical techniques themselves, and the software you can use to apply those techniques.

In terms of the techniques, and what different kinds of stats tests can show, for a good introduction you could look at something like Frances Clegg's book Simple Statistics: A course book for the social sciences. There are various other titles aimed at people with different levels of confidence/knowledge, with titles like Statistics Without Tears, or Statistics Without Maths. Your university library's probably a good place to go and just have a browse and see which ones seem to be written in a way that makes the most sense to you.

It sounds like your data would be really suitable for something like SPSS software, and there are plenty of sources of advice for that. There are YouTube tutorials for different techniques, just google them. Or a couple of good books I've used - when I was first starting out I used Julie Pallant's SPSS Survival Manual, which was a great step-by-step guide to the kinds of analyses that could be done and how to do them with that software. More detailed is Andy Field's Discovering Statistics Using SPSS - that may be a little intimidating for you to begin with as it's a very large book, but once you have some basic ideas about what you're looking at it gives all the explanations you're ever likely to need plus detailed instructions on how to apply the software.

Good luck!

T

Ok, guys. I *think* I know how it works now.
I've just reviewed a few sources on the Likert scale and Stevens' scales of measurement. I've transferred all the necessary parts of the questionnaire into SPSS as variables.
However, something concerns me; is it normal to have almost 30 variables to be analyzed? :$

H

If you asked 30 questions (or sub questions) then yes! You sound concerned though - is there something you're unsure of?


T

Yes. I have 30 questions (including stubs). Actually, ATM, I've just included 25 variables. I've no idea whether I should include the other 5 or not.

I don't know why I'm "unsure"; it seems 30 variables are "quite a lot" - or I'm just inexperienced with this sort of thing.... :$

H

======= Date Modified 05 Nov 2012 16:29:59 =======

At the end of the day, it's not necessarily the *number* of variables that is important, it's how *useful* or *relevant* they are to the hypothesis being tested.

Most of my work is done on existing databases of dozens of variables and thousands of records. But for a particular analysis, I don't use all that information - just the variables and records that are relevant to the specific question I'm trying to answer.

May I ask what general field your project is based in (e.g. social sciences? medicine?)? I don't need all the details, but it might help people direct you to appropriate forms of advice.

H

Actually, there is a caveat. One CAN have too many variables for accurately carrying out some types of analysis, particularly if the sample size is small.

But really it's impossible to give you further advice on this without knowing the nature of your project. Sorry.

T

Hello Hazyjane,

Thank you for the reply.

Yes, I was thinking about it as well; perhaps for the presentation I would just select the strongest ones, or I could generalize the variables or group them further for easier analysis - does that make sense?
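To show what I mean by grouping, here is a rough Python sketch, assuming the questions fall into a few themes. The theme names, question names and answers are all made up. Each participant gets one score per theme, taken as the mean of the items they answered.

```python
from statistics import mean

# Hypothetical grouping of questionnaire items into themes (names are made up).
subscales = {
    "usability": ["q1", "q2", "q3"],
    "awareness": ["q4", "q5"],
}

# One dict per participant; None marks an unanswered item.
responses = [
    {"id": "P01", "q1": 4, "q2": 5, "q3": 3, "q4": 2, "q5": 2},
    {"id": "P02", "q1": 3, "q2": None, "q3": 4, "q4": 5, "q5": 4},
]

def subscale_scores(row, groups):
    """Mean of the answered items in each group; None if none were answered."""
    scores = {}
    for name, items in groups.items():
        answered = [row[q] for q in items if row[q] is not None]
        scores[name] = mean(answered) if answered else None
    return scores

scores = {row["id"]: subscale_scores(row, subscales) for row in responses}
print(scores)
```

This would cut 30 variables down to a handful, but only if the grouped items really measure the same thing and are scored in the same direction, which I would need to check with the statisticians first.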

This particular piece of research is a small part of the main thesis; it's conducted to support the main thesis. I'm developing a model for this X problem my field is facing. But I need some "social" survey research so that I have up-to-date evidence that the "problem" still exists and is being faced by the users. Thus, the users need to be surveyed. Hope that helps in some way.
