Paired statistical analysis

A

While writing papers from my PhD, an old issue has resurfaced between one of my supervisors and me. He keeps telling me to do a paired comparison of the sites I have sampled. I've told him before that I can't, as I don't have enough data for it, and I thought we had it sorted, but we obviously don't. So, hopefully some lovely forumites can help out...
I have 30 sites, roughly paired up, so that there are 15 in each group. I sampled all sites twice, so I have 2 data points for each site. I have done a Mann-Whitney test comparing all of group A with all of group B for each sampling set separately, i.e. month 1 and month 2, as I was expecting (and found) differences between sites depending on the month sampled.
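As a rough illustration of the group-versus-group comparison described above, here is a minimal standard-library Python sketch. The site values are made up, and the p-value comes from a Monte Carlo permutation of the U statistic rather than the usual Mann-Whitney tables or normal approximation. It shows the idea, not the analysis actually run.

```python
import random

def mann_whitney_u(a, b):
    """U statistic for sample a: pairs where a > b count 1, ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)

def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation p-value for the U statistic."""
    rng = random.Random(seed)
    expected = len(a) * len(b) / 2  # value of U expected under the null
    observed = abs(mann_whitney_u(a, b) - expected)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign values to the two groups
        u = mann_whitney_u(pooled[:len(a)], pooled[len(a):])
        if abs(u - expected) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical data: one value per site, 15 sites per group, two months.
months = {
    "month 1": ([0, 3, 5, 0, 2, 7, 1, 0, 4, 6, 2, 0, 3, 5, 1],
                [1, 6, 4, 2, 5, 9, 1, 3, 8, 7, 2, 1, 6, 9, 4]),
    "month 2": ([2, 0, 1, 4, 0, 3, 2, 5, 0, 1, 3, 2, 0, 4, 1],
                [3, 1, 0, 5, 2, 2, 4, 6, 1, 0, 5, 3, 1, 4, 2]),
}

results = {}
for month, (group_a, group_b) in months.items():
    results[month] = (mann_whitney_u(group_a, group_b),
                      permutation_p_value(group_a, group_b))
    print(month, results[month])
```

Each month is tested on its own, matching the design of one value per site per sampling month. The pairing of sites by location is ignored here, which is exactly the point the supervisor is raising.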
As far as I can make out, my supervisor wants me to do a paired comparison, like a t-test, to see if there are any differences between site 1 in group A and site 1 in group B. Looking at each sampling month separately, I have only one data point for each site, so I'd be comparing one against one, which obviously I can't do. If I combine sampling months, then I'll have two data points for each site, so I'd be comparing two against two, which I think is pushing it. Is there some analysis that I'm overlooking that would allow me to do this?!

Thanks!

sneaks

I can't quite get my head around it, but it sounds like you need to be doing some kind of repeated measures or mixed ANOVA. However, how many tests you need to run depends on how you treat your data.

I suspect you have a mixed ANOVA design, so you have...

DV - something or other you are measuring

IV1 = repeated measures - MONTH (has 12? levels i.e. for 12 months)
IV2 = independent variable = group A or group B

You can then request post hocs etc and maybe do some simple effects analysis to examine specific effects for each month.

At any rate, with multiple t-tests you should be doing a Bonferroni correction.

K

Hi

How are they paired? I would say your supervisor is partly right if you deliberately set out to sample your sites in these pairs by finding some kind of match for each site. If these sites are 'matched', I would take the value of one minus the other, which will give you 15 values, and then test whether the mean (or a non-parametric equivalent, e.g. the median) of those 15 values differs from zero. This would be a one-sample t-test, but you should get the same answer if you start with the paired data and do a paired t-test, or a non-parametric test given the small sample size. The result would tell you whether, overall, there was evidence for a difference between the sites within each pair.

Hope that helps
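The differencing idea above can be sketched in a few lines of standard-library Python. This is only an illustration: the site values are invented, and the test used is an exact sign-flip permutation test on the 15 differences, chosen here as one possible non-parametric stand-in for the one-sample t-test. It is not something prescribed in the thread.

```python
from itertools import product
from statistics import mean

# Hypothetical values for one sampling month: one number per site,
# ordered so that control[i] and experimental[i] share a location.
control      = [0, 3, 5, 0, 2, 7, 1, 0, 4, 6, 2, 0, 3, 5, 1]
experimental = [1, 6, 4, 2, 5, 9, 1, 3, 8, 7, 2, 1, 6, 9, 4]

def paired_sign_flip_test(a, b):
    """Exact two-sided sign-flip permutation test on the paired differences."""
    diffs = [y - x for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    # Under the null of no within-pair difference, each difference is equally
    # likely to be positive or negative, so enumerate all 2**n sign patterns.
    count = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
            count += 1
    return mean(diffs), count / total

mean_diff, p_value = paired_sign_flip_test(control, experimental)
print(f"mean difference = {mean_diff:.2f}, exact p = {p_value:.4f}")
```

With 15 pairs there are only 2**15 = 32768 sign patterns, so full enumeration is cheap. It needs just one value per site per month, and zero differences do not have to be dropped.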

A

Thanks for the replies sneaks and kbara!
I think I've not made it really clear exactly what the data is... It's literally two unrelated groups, i.e. control sites and (different) experimental sites. I've got them in similar geographical locations, i.e. one control site is near to a corresponding experimental site, and there are 15 of each, dotted around the country. The aim is to see if there are similarities in these sites, caused by being in the same location. So they are paired in that sense: Gp1_site 1 is paired with/in the same location as Group 2_site 2, but it's not a true pairing, so I'm not doing a paired t-test in that sense. I sampled on two months, so there are two data sets. However, for each site there was only 1 value, so no repeated measures stuff going on. So it's literally comparing one value with another.
So far I have grouped all the Group 1 sites and compared them with all the Group 2 sites, which was fine for my viva (!), but the sup has gone back to this individual comparison, and I'm not sure how he thinks I can do it!
Plus, to add to the fun, it's all non-parametric, and transforming doesn't fix it, plus there are lots of zeros for some values which messes up some transformations!

sneaks

======= Date Modified 24 Oct 2011 13:11:21 =======
Why is it non-parametric? I'd bootstrap and run parametric tests if it's just non-normal or a small sample size. So you could do a load (15) of independent t-tests, but then you'd have to apply the Bonferroni correction, so your p value would have to be < .003, rather than < .05, to be significant.
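The .003 figure is just the usual .05 threshold divided by the number of tests. A minimal Python sketch of that arithmetic follows; the helper name and the list of p-values are hypothetical, for illustration only.

```python
# Numbers taken from the thread: 15 site-by-site tests at a
# family-wise significance level of 0.05.
n_tests = 15
family_alpha = 0.05

# Bonferroni: each individual test is judged against alpha / m.
per_test_alpha = family_alpha / n_tests
print(f"per-test threshold = {per_test_alpha:.4f}")

def bonferroni_reject(p_values, alpha=0.05):
    """Return True for each p-value that survives the Bonferroni correction."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]

# Hypothetical p-values, for illustration only.
example_p = [0.001, 0.004, 0.03, 0.2]
print(bonferroni_reject(example_p))
```

Note that the threshold depends on how many tests are in the family, so the same p-value can pass in a family of 4 tests and fail in a family of 15.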

Have you thought about doing something like cluster analysis or multi dimensional scaling so you can visually see where your data is clustering e.g. group 1 or group 2 or locations?

A

======= Date Modified 24 Oct 2011 20:49:43 =======
lol the data is a bit crazy sneaks, ecology stuff tends not to behave how you'd like it to sometimes... There are so many variables that I'd have to twiddle with the data in different ways to get it all parametric and, to me, it just sounds fishy when you play with it a lot. Plus it's quite a sensitive result, so I'd rather the stats weren't open to question; I'd prefer them to be solid and clear.
I've emailed my sup to see if he has any special test on how to do it, or even if he's actually thinking what he's thinking, as it's always possible that he's not....I did some cluster analysis on some of the data, stuff that I could get parametric, but he wants something more specific, and the cluster stuff is too general. ah hoo.
Thanks!
