Chris Wild

Lead educator for “Data to Insight”, Chris Wild’s interests are data from complex sampling, statistical thinking and reasoning processes, and visualisation
See: https://www.stat.auckland.ac.nz/~wild
Location Auckland, New Zealand
Activity
-
Chris Wild replied to Muriel Russell
Hi Muriel, These particular articles get the essential ideas across well at the right level -- Chris
-
Hi Teresa. It doesn't really matter for essential learning if you used education rather than education.record (which you had to create .... "Exercise 2.5 showed the use of this technique to create a new variable called Education.reord. You will need to do that again.") -- Chris
-
Chris Wild replied to Chi Ni
Hi Chi Ni, they are different systems with different strengths and weaknesses -- Chris
-
Chris Wild replied to Elena A.
Hi Elena, The counts are a summary of a categorical variable. If you wanted to make a new dataset containing the counts, those counts would form a numerical variable in the new data set. So you are on the right track -- Chris
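A minimal sketch of that idea in Python, assuming a made-up categorical variable (the name eye_colour and its values are hypothetical, not from the course data). Counting summarises the categorical variable, and the counts then sit in a numeric column of a new, smaller dataset.

```python
from collections import Counter

# Hypothetical categorical variable: one value per entity (illustrative only)
eye_colour = ["brown", "blue", "brown", "green", "blue", "brown"]

# Counting is a summary of the categorical variable
counts = Counter(eye_colour)

# New dataset: one row per category; the "count" column is a numeric variable
count_rows = [{"category": c, "count": n} for c, n in sorted(counts.items())]

for row in count_rows:
    print(row["category"], row["count"])
```

In the new dataset the entities are the categories themselves, which is why the counts can be treated as numbers.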
-
Chris Wild replied to Chi Ni
Hi Chi Ni, With a categorical variable each entity falls into one, and only one, category. With what you are calling "overlapped categories" you have (or code) a set of variables, one for each category, and each variable records whether or not an entity falls into that category. This situation arises with so-called multiple-response questionnaire items where,...
-
Chris Wild replied to Maria Martynova
Working fine for me on Windows Maria so guessing you are using a Mac. Email me at inzight_support@stat.auckland.ac.nz -- Chris
-
Not in the sampling variation module Olena, but there is in the module to follow -- Chris
-
Chris Wild replied to elvina Sequeira
Hi Elvina, Fill out the form at https://www.stat.auckland.ac.nz/~wild/iNZight/support/contact/ and it should tell us what we need to know to help you -- Chris
-
Chris Wild replied to elvina Sequeira
Hi Elvina, iNZight Lite doesn't get installed. You just connect to it online -- Chris
-
Yes it does Osama -- Chris
-
Chris Wild replied to THERESA DASHIE
The current one (3.5.3 at the moment) is always the one to use Theresa -- Chris
-
Hi Diana, https://www.stat.auckland.ac.nz/~wild/d2i/exercises/1.15%20exercise-import-data-into-inzight-lite_17.pdf . Make sure you use the getting started links near the top first - Chris
-
Chris Wild replied to Daniela Diaz
It's there if you get your copy of the gapminder dataset from the place the instructions in the next Step (2.15) tell you to -- Chris
-
Chris Wild replied to areej fatima
Hi Areej. Re going "back to previous steps" I guess you are talking about the Play button. Instead of using the Play button, use the slider and then you have control of what graphs you are looking at. Occasionally some combination of things you do stops iNZight but unless it happens in a way we can replicate there is no way to find it and fix it. You just...
-
Chris Wild replied to Mmuso Lerutla
Hi Mmuso. Your question is a bit advanced for this course. If you Google you'll find all sorts of things about sample size calculations/determination -- Chris
-
Chris Wild replied to Eva Groeneveld
Hi Eva, iNZight is free for anyone to use anywhere so no problems from our end -- Chris
-
Chris Wild replied to Eva Groeneveld
Hi Eva. Once iNZight is installed nothing you are using (except perhaps if you ask for an interactive graph) uses your internet connection so I can't see how it can be the cause -- Chris
-
Strange. I still can't replicate! -- Chris
-
Chris Wild replied to Emily Ayres
Got them Emily. Thanks, Chris
-
The graphics were made for a first encounter with testing ideas and we decided that "2-tailed" added another obscuring layer of complexity. Take your tail area and double it for a reasonable approximation -- Chris
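As a rough sketch of the doubling rule in Python: the function name approx_two_tailed is hypothetical, and capping the result at 1 is an added assumption so the answer stays a valid probability. Calculating the other tail area directly is the more exact route.

```python
def approx_two_tailed(one_tail_area: float) -> float:
    """Double a one-tailed area to approximate a two-tailed p-value.

    Capping at 1.0 is an assumption added here so the result stays a
    valid probability when the tail area is larger than 0.5.
    """
    if not 0.0 <= one_tail_area <= 1.0:
        raise ValueError("tail area must be between 0 and 1")
    return min(1.0, 2.0 * one_tail_area)


# A tail area of 0.03 read off the graphic becomes roughly 0.06 two-tailed
print(approx_two_tailed(0.03))
```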
-
@PatrickKearns It is hard to understand but often happens when results are a little unusual but not extremely so. Remember, not having a small tail area does not demonstrate that no real (non random) effect exists -- Chris
-
Chris Wild replied to areej fatima
@areejfatima Found it Areej, " Is this a problem for intended analysis??" is just a trigger to make you think about "Is the data problem I am seeing going to cause problems for the type of analysis I want to do?" You may have to find out more or learn more to be able to answer such a question -- Chris
-
Chris Wild replied to Suubi Kawooya
Hi Suubi, I would say the *estimate* becomes more accurate because the sample size is bigger. -- Chris
-
Chris Wild replied to Maggie Ding
Great to have you "back" Maggie -- Chris
-
Chris Wild replied to areej fatima
Hi Areej, I don't get what you are asking. Can you give me more detail please? -- Chris
-
Chris Wild replied to Kemi Kayode
Hi Kemi, Can you email me a screenshot at inzight_support@stat.auckland.ac.nz so I can see what you are seeing? -- Chris
-
Hi Hakeem, You can just use the p-value if it was obtained using an appropriate method -- Chris
-
Chris Wild replied to HAKEEM ALIMI
Hi Hakeem. This was just a taster. You will need a more full-on statistics course to get more into those aspects -- Chris
-
Chris Wild replied to Asya Avetyan
Thanks Patrick. That's an old link to the material on Step 6.9. I've removed it. Step 6.9 and the pdf of it linked from Step 6.9 are fine -- Chris
-
Hi Areej, Please see this entry ... -- Chris
https://www.stat.auckland.ac.nz/~wild/iNZight/support/faq/?section=export_data
-
Chris Wild replied to Marcio Valerio Silva
Fine Marcio but please see answer I've just posted to Areej immediately above -- Chris
-
Chris Wild replied to areej fatima
Hi Areej, Looks fine but I can't comment on everyone's answers to all these questions. I hope participants will look over one another's -- Chris
-
Hi Ali, Start from the top and email me at inzight_support@stat.auckland.ac.nz telling me about the first thing you strike that you can't understand -- Chris
-
I need more detail to understand what the problem is Areej -- Chris
-
Hi Anna, the bottom line is, if you are looking at the graph and want to spot evidence of where there are true differences, or get a visual indication of how small or big a true difference could be, use the black lines in graph 5 and not the red lines. Even though we are illustrating with 2 groups here the technique is really for graphs with multiple groups. To...
-
Chris Wild replied to Asya Avetyan
Hi Patrick, (CI lower, CI upper) overweight 1.18, 1.29, 1.40; normal weight 1.38, 1.5, 1.62 is consistent with the story unless you are looking somewhere I haven't seen -- Chris
-
Chris Wild replied to Olena Ripa
Hi Olena, We talked about the overlap between data from 2 different groups. IQR is talking about where the centre 50% of the data for one dataset/group is -- Chris
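To make the "centre 50%" idea concrete, here is a small Python sketch using a made-up group of values (the heights list is hypothetical). Note that statistics.quantiles uses one particular quartile convention by default, so other software may give slightly different quartiles for the same data.

```python
import statistics

# Hypothetical measurements for ONE group (illustrative values only)
heights = [150, 155, 160, 162, 165, 168, 170, 175, 180, 190]

# Quartiles split the ordered data into four parts
q1, median, q3 = statistics.quantiles(heights, n=4)

# The IQR is the width of the interval holding roughly the centre 50% of the data
iqr = q3 - q1
inside = [h for h in heights if q1 <= h <= q3]

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
print(f"{len(inside)} of {len(heights)} values lie between Q1 and Q3")
```

This describes a single group. Judging the overlap between 2 groups means comparing where the data for each group sit relative to one another.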
-
Hi Ася, VIT and VITonline can only cope with csv and tab-delimited text files -- Chris
-
No RIMAMSIKWE, Lots in the R libraries of Rob Hyndman (Google him) -- Chris
-
Chris Wild replied to Rosebud Lambert
Can't currently in iNZight or with the iNZightPlot function in R Rosebud -- Chris (can in ggplot)
-
Are you using VIT or VITonline Victoria? -- Chris
-
Chris Wild replied to Asya Avetyan
Hi Ася, All the numbers above are produced by iNZight. Not sure what you mean about the "proportion rate"? Can you give more detail? Thanks, Chris
-
Chris Wild replied to Asya Avetyan
Hi Ася, I'm not quite following your question. Can you be more specific? -- Chris
-
Hi Ася, Step 7.11 discusses issues like this -- Chris
-
Chris Wild replied to HAKUZAYESU Eliezer
Hi HAKUZAYESU, desktop (installed) iNZight works off line. iNZight Lite is driven by a remote server so online only -- Chris
-
Sorry Sadequllah, Button name has changed to "Record my choices" to be the same as desktop VIT. Fixed -- Chris
-
Hi Ася, It's doing the whole 49 but only 40 numbers showing in the panel onscreen -- Chris
-
Chris Wild replied to John Teko
Hi John, The exercises are for you to play with software and data. The quizzes examine your level of understanding. With upgrading the assessments do it better -- Chris
-
Chris Wild replied to John Teko
That's no problem on Windows John but it is a problem on Macs -- Chris
-
Chris Wild replied to Asya Avetyan
Hi Ася, we set up a scenario where we know the answers to see what sort of misconceptions we could get from biased sampling -- Chris
-
Chris Wild replied to Mark Bretherton
Hi Mark, If you run enough resamples you stop getting the differences. We've only been doing 1000 mainly so the visualizations are not too slow since VIT is mainly a conceptual development tool -- Chris
-
Chris Wild replied to Chris Wild
A clean download is more reliable. Try in the early morning or some other time you think the internet in your area is not likely to be overloaded. If it remains a problem use iNZight Lite -- Chris
-
Chris Wild replied to Muriel Russell
Hi Muriel, You'll soon find whether or not trying to use both is going to cost you more time than you are prepared to spend - in which case cut back to one -- Chris
-
Hi Pat, The point of this is demonstrating the variation you get in estimates from sample-to-sample. That variation gets smaller if the samples are bigger. So there is less sampling error in an estimate from a big sample than in an estimate from a smaller sample -- Chris
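A minimal simulation sketch of this in Python, assuming a made-up population (normal with mean 170 and standard deviation 10, chosen only for illustration). It takes many samples at two sample sizes and compares how much the sample means vary from sample to sample.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population (values are illustrative, not course data)
population = [random.gauss(170, 10) for _ in range(100_000)]


def spread_of_sample_means(sample_size, n_samples=500):
    """Take many random samples and return the standard deviation of their means."""
    means = [
        statistics.mean(random.sample(population, sample_size))
        for _ in range(n_samples)
    ]
    return statistics.stdev(means)


small = spread_of_sample_means(20)
large = spread_of_sample_means(500)

print(f"Spread of sample means, n = 20:  {small:.2f}")
print(f"Spread of sample means, n = 500: {large:.2f}")
```

The spread for the bigger samples should come out clearly smaller, which is the reduction in sampling error described above.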
-
Chris Wild replied to Patrick Kearns
Hi Pat, Replication of results by many centres protects us against biases, experimental and other data-collection mistakes and special circumstances, so in that sense it sets a much higher bar for concluding a research hypothesis (not a null hypothesis) is true. Assuming that several trials have been done well, pooling the results (meta analysis) is like having a very,...
-
Chris Wild replied to RIMAMSIKWE ANDE MAMMAN
Hi RIMAMSIKWE, In practice the true value of the population mean is unknown. But theory, and simulation experience in scenarios where we know the truth, both show that if we take samples and construct 95% confidence intervals, then for 95% of the samples taken the true mean is in the calculated confidence interval -- Chris
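A sketch of such a simulation in Python. The population (normal, mean 50, standard deviation 8), the sample size of 100 and the normal-approximation interval (mean plus or minus 1.96 standard errors) are all assumptions made for illustration, and this is not the bootstrap method used in the course software.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

TRUE_MEAN = 50.0  # known only because we are simulating (assumed value)
TRUE_SD = 8.0     # assumed population standard deviation


def ci_covers_truth(sample_size=100):
    """Draw one sample, build an approximate 95% CI, report whether it contains the truth."""
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(sample_size)]
    centre = statistics.mean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(sample_size)
    margin = 1.96 * std_error  # normal approximation for a 95% interval
    return centre - margin <= TRUE_MEAN <= centre + margin


n_trials = 2000
coverage = sum(ci_covers_truth() for _ in range(n_trials)) / n_trials

print(f"Proportion of intervals containing the true mean: {coverage:.3f}")
```

The proportion should land close to 0.95, though it will wobble a little from run to run because it is itself based on a finite number of simulated samples.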
-
No RIMAMSIKWE. Outliers can cause uncertainty (about whether this is a real observation or an error), but not the sort we can compensate for by using a confidence interval -- Chris
-
Hi RIMAMSIKWE, the confidence interval itself is a way of conveying the level of uncertainty in an estimate -- Chris
-
@RIMAMSIKWEANDEMAMMAN Not easily and definitely not with the bootstrap RIMAMSIKWE -- Chris
-
Email the data to me at inzight_support@stat.auckland.ac.nz Philip and I'll take a look -- Chris
-
Chris Wild replied to Barbara de la Hunty
No but you could try this one Barbara ... -- Chris
https://www.futurelearn.com/courses/data-mining-with-weka
-
Chris Wild replied to Nadia B
I don't know of any Patrick but I do know that in medical trials the analyst is sometimes blinded - in the sense that they only get treatment labels in their data and don't know what actual treatment each patient got - Chris
-
Chris Wild replied to Dave Hall
If you get hooked on that stuff Dave, start poking around Rob Hyndman's website. Rob and his group are amazing -- Chris
-
Chris Wild replied to Asya Avetyan
Hi Ася, The diagram above and the game itself should help. Look at the answers it produces and after a while you should get a better feel for it -- Chris
-
Chris Wild replied to N E
Email me directly then N E -- Chris
-
Hi Patrick. A negative relationship is one in which as one variable increases the other tends to decrease (opposite directions), so the wrong descriptor for "no association". Better ones would be unrelated or independent -- Chris
-
All working fine for me on v 3.5.3. Pretty sure time series hasn't been touched recently. As I said to Dave below for something similar, if restarting and trying again doesn't fix it email me at inzight_support@stat.auckland.ac.nz with an account of what you have done in what order, contents of the R Console window and a screenshot -- Chris
-
Chris Wild replied to Mark Eaglesfield
Hi Mark. This would be plenty to prepare for Stats 101 here. In some places we have gone further -- Chris
-
Chris Wild replied to Adam Ennis
Hi Adam, p-values are a different idea - see Week 7 -- Chris
-
Thanks Dorota -- Chris
-
Chris Wild replied to Nadia B
Hi Nadia, Have lived in Auckland most of my life -- Chris
-
Working fine for me Dave. If restarting and trying again doesn't fix it email me at inzight_support@stat.auckland.ac.nz with an account of what you have done in what order, contents of the R Console window and a screenshot -- Chris
-
Hi Dorota, The behaviour in the China series isn't mistakes, it is real behaviour that is unlike behaviour before or since. Recently lots of series have gone crazy because of covid-19 induced behaviours that are unprecedented. In such circumstances there are no obvious ways of making good forecasts -- Chris
-
Hi Nadia, If you can't tell whether a seasonal series looks additive or multiplicative you'll generally get very similar results either way. Can always look at it both ways -- Chris
-
Chris Wild replied to S J
Hi S J, iNZight is a gui-driven system written in R -- Chris
-
Chris Wild replied to Mark Eaglesfield
Hi Mark, You'd have to calculate the other tail area as well -- Chris
-
Chris Wild replied to Nadia B
@NadiaB Hi Nadia and Laura. Jan-Mar is about 15,000 above the trend so the trend ... -- Chris (probably unnecessarily tricky)
-
Chris Wild replied to Gemma Welsh
Hi Gemma, That is the behaviour for a subsetting variable (slots 3 and 4) that is numeric. Put your variables in slots 1 and 2 -- Chris
-
Chris Wild replied to Hannah Keogh
Hi Hannah, Bar charts won't display for a categorical variable with more than about 200 categories (wouldn't be able to see anything anyway) -- Chris
-
Chris Wild replied to Hannah Keogh
Hi Hannah, View Data Set is not just a display. It has editing capabilities as well. When the data set gets too large it slows down the program. We disable it at about 20,000 cells I think it is. You can still view the data set using Dataset > View full dataset - Chris
-
Here's a nice treatment Roger ... -- Chris
https://robjhyndman.com/hyndsight/cyclicts/
-
Chris Wild replied to Roger Gee
Future Steps Roger -- Chris
-
Yes Adam, even in the nonlinear case curve fitting is a form of regression -- Chris
-
Chris Wild replied to Meshack Nwofor
Hi Meshack. That is not a question for a statistician. That is a question for an expert in the area of the problem (be it medical, business, political, ...) and the answer will change with the problem -- Chris
-
Chris Wild replied to Meshack Nwofor
One word wrong in there Meshack, "Treatment differences are **practically** significant if they are big enough to have a real world impact" -- Chris
-
Chris Wild replied to areej fatima
For data visualisation and analysis Areej, certainly -- Chris
-
Use iNZight Lite Jhonattan -- Chris
-
Chris Wild replied to Myra Dolon
Hi Myra, iNZight Lite slowdowns are usually caused by usage spikes -- Chris
-
Chris Wild replied to Petra Wolf
Thanks Petra. All the best for applying these ideas in your real world -- Chris
-
Fixed Petra --- Chris
-
Chris Wild replied to Geoff Jagoe
Hi Geoff, Maths in here ... Not in because few would understand -- Chris
https://www.stat.auckland.ac.nz/~wild/visdiffs/
-
Chris Wild replied to Geoff Jagoe
Sure Geoff - Chris
-
Hi Anna, In desktop iNZight you are not limited by the drop down choices, you can type in colours .. -- Chris
https://www.stat.auckland.ac.nz/~wild/iNZight/user_guides/advanced/#colours
-
Hi RIMAMSIKWE, It is in principle but it tends to be an expensive strategy cost wise. In practice more complex methods of random sampling are used for large populations including elements of stratified sampling and cluster sampling (https://en.wikipedia.org/wiki/Stratified_sampling, https://en.wikipedia.org/wiki/Cluster_sampling,...
-
More like during Stats 310 I guess Mark -- Chris
-
Chris Wild replied to HASHAN TEEKSHANA
Congratulations Hashan. Glad you enjoyed it -- Chris
-
Chris Wild replied to Mark Eaglesfield
Hi Mark, I informed them. Thanks, Chris
-
Hi Emily, Depends what you are doing. Even severe lack of balance is fine for comparing men and women (provided you use things like means and proportions), but not fine, unless you make adjustments, if you want to use them as a combined group representing the general population -- Chris
-
Hi Afshin. Restart the program and it will be fine. If you can reproduce a sequence of steps that leads to the error, only then do we have a good chance of diagnosing and fixing it. Best, Chris
-
Chris Wild replied to Sarah Thurley
Hi Sarah, You can often only see what is wrong with a line of code by seeing what has come before. Email to inzight_support@stat.auckland.ac.nz. It could be as simple as misspelling a variable name -- Chris