Marcio Valerio Silva

I am a specialist in safety management, with experience in implementing and maintaining safety, health and environment management systems, in addition to risk management.

Location Rio de Janeiro, Brazil



  • When we are plotting several related series so that we can compare the patterns in them, what are the strengths and the weaknesses of a plot that puts all of the series on the same graph?
    The big advantage is that everything is on the same scale. The big disadvantage is that information from all countries with a much smaller number of arrivals are all...

  • I agree that predictions can only work well if the historical relationships between variables continue to hold.

  • Which better summarizes what you see in the data?
    The multiplicative option.

    For the additive plot, the residuals tend to be small in the middle and large toward the edges. Why do you think this is?
    Because the swings are bigger when the trend is higher.

  • When we have a multiplicative decomposition, how do we adjust the trend value to incorporate the seasonal effect at each time point?
    We adjust by multiplying the trend by the seasonal swing.

    What is the no-change value when we are making additive adjustments?
    It is zero, the point where the additive seasonal effect crosses zero. Adding zero makes no change.

    What is the no-change...
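
The two kinds of adjustment discussed above can be sketched in a few lines. This is only an illustration: the trend values and seasonal swings are made-up numbers, not course data. It assumes the usual convention that an additive effect of 0 and a multiplicative effect of 1 leave the trend unchanged.

```python
# Illustrative numbers only; none of these values come from the course data.
trend = [100.0, 110.0, 120.0, 130.0]            # hypothetical trend values
additive_effect = [5.0, 0.0, -5.0, 0.0]         # seasonal swings in data units
multiplicative_effect = [1.05, 1.0, 0.95, 1.0]  # seasonal swings as ratios

# Additive adjustment: trend + seasonal effect (0 leaves the trend unchanged).
additive_fit = [t + s for t, s in zip(trend, additive_effect)]

# Multiplicative adjustment: trend * seasonal effect (1 leaves it unchanged).
multiplicative_fit = [t * s for t, s in zip(trend, multiplicative_effect)]

print(additive_fit)
print(multiplicative_fit)
```

At the second and fourth time points both fits equal the trend, because the effects there are the no-change values.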

  • What is the basic idea behind an additive model (or additive seasonal decomposition)?
    To find out seasonal swings, assuming that they are all the same apart from purely random differences.

    What is the basic idea behind a multiplicative model (or multiplicative seasonal decomposition)?
    To find out seasonal swings that are a lot smaller towards the...

  • What is the basic idea behind an additive model (or additive seasonal decomposition)?
    To find out seasonal swings, assuming that they are all the same apart from purely random differences.

    Why do we want to find stable structures in our time series?
    Because we need them for projecting into the future to form forecasts.
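
As a rough sketch of how the stable structure in an additive model might be pulled out of quarterly data, the example below estimates the trend with a centred moving average and then averages what is left over for each quarter. The series is hypothetical and has no random noise, and this is one common textbook approach, not necessarily the one used by the course software.

```python
from statistics import mean

PERIOD = 4  # quarterly data

# Hypothetical series: straight-line trend plus a repeating seasonal swing.
true_seasonal = [3.0, -1.0, -4.0, 2.0]
series = [10.0 + t + true_seasonal[t % PERIOD] for t in range(16)]

def centred_moving_average(y, t):
    """Trend estimate at time t: average over one full year centred on t."""
    return (0.5 * y[t - 2] + y[t - 1] + y[t] + y[t + 1] + 0.5 * y[t + 2]) / PERIOD

# The trend can only be estimated away from the two ends of the series.
trend = {t: centred_moving_average(series, t) for t in range(2, len(series) - 2)}

# Seasonal effect for each quarter: mean of (observed - trend) over the years.
seasonal = [
    mean(series[t] - trend[t] for t in trend if t % PERIOD == q)
    for q in range(PERIOD)
]

print([round(s, 2) for s in seasonal])
```

Because the made-up series has no noise, the estimated seasonal effects match the swings that were built into it.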

  • Holes in a series can be filled with guesses; we then change these guesses and see how that affects the results (“sensitivity analysis”).
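
A minimal sketch of that kind of sensitivity analysis, assuming a short hypothetical series with two holes and using the mean as the result of interest. The helper name `fill` and the three guesses are illustrative choices.

```python
from statistics import mean

# Hypothetical series with two holes, marked as None.
series = [12.0, 15.0, None, 14.0, None, 18.0]
observed = [v for v in series if v is not None]

def fill(values, guess):
    """Return a copy of the series with every hole replaced by `guess`."""
    return [guess if v is None else v for v in values]

# Try a low, a middle and a high guess, and compare the resulting means.
guesses = {"low": min(observed), "middle": mean(observed), "high": max(observed)}
results = {name: mean(fill(series, g)) for name, g in guesses.items()}

for name, value in results.items():
    print(f"{name:>6} guess -> mean {value:.2f}")

spread = max(results.values()) - min(results.values())
print(f"the mean moves by {spread:.2f} across the guesses")
```

If the result barely moves across reasonable guesses, the holes matter little; if it moves a lot, conclusions should be drawn with more caution.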

  • What is time-series data?
    It is data collected over time.

    Why are people interested in time-series data?
    Because it can help us understand the past and, even more, predict the future.

    What is quarterly data?
    It is data reported four times a year, covering periods of three months.

    Why do people plot time-series data with points joined up by...

  • It was amazing to test whether simply using randomly chosen groups can produce the same deviations as the factor being studied.

  • VIT is very good at showing that an effect in experimental data can be produced by the randomisation alone.

  • When we look at a plot of experimental data that compares two treatment groups, why is it not always just obvious which treatment is better? What question goes through our minds?
    Because it could just be "the luck of the draw". The question is: "Do the effects seen in a randomised experiment really demonstrate treatment differences?"

    What is the basic idea...
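
One way to look at the "luck of the draw" question is to re-deal the observed outcomes to the two groups at random many times and see how often chance alone gives a difference as big as the one observed. The sketch below assumes two small hypothetical groups and a difference in means as the measure; it is a simple illustration and does not show how VIT works internally.

```python
import random
from statistics import mean

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical outcomes for two treatment groups.
group_a = [23.0, 27.0, 31.0, 29.0, 35.0, 30.0]
group_b = [21.0, 25.0, 22.0, 28.0, 24.0, 26.0]
observed_diff = mean(group_a) - mean(group_b)

pooled = group_a + group_b
n_a = len(group_a)
repeats = 2000
at_least_as_big = 0
for _ in range(repeats):
    rng.shuffle(pooled)  # re-deal the outcomes to the groups at random
    diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
    if abs(diff) >= abs(observed_diff):
        at_least_as_big += 1

tail_proportion = at_least_as_big / repeats
print(f"observed difference: {observed_diff:.2f}")
print(f"proportion of re-randomisations at least as big: {tail_proportion:.3f}")
```

A small proportion suggests the observed difference would be unusual if only the random grouping were at work.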

  • What is randomisation variation and why is it a problem?
    It is the variation caused simply by using randomly chosen groups. The problem is that we can't be sure of the true cause.

    How can we reduce the levels of randomisation variation?
    By having more observation points in the samples.
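
The effect of group size on randomisation variation can be checked with a small simulation. In this sketch everyone is drawn from the same hypothetical distribution, so any difference between the two groups comes only from the random grouping; the numbers and the function name are assumptions for illustration.

```python
import random
from statistics import mean, stdev

rng = random.Random(1)  # fixed seed so the sketch is repeatable

def randomisation_differences(group_size, repeats=500):
    """Differences in group means when the 'treatment' does nothing at all."""
    diffs = []
    for _ in range(repeats):
        # Everyone comes from the same distribution: no real treatment effect.
        values = [rng.gauss(50.0, 10.0) for _ in range(2 * group_size)]
        rng.shuffle(values)  # random allocation to two groups
        group_a, group_b = values[:group_size], values[group_size:]
        diffs.append(mean(group_a) - mean(group_b))
    return diffs

for size in (10, 50, 200):
    spread = stdev(randomisation_differences(size))
    print(f"group size {size:>3}: spread of differences = {spread:.2f}")
```

The spread of the chance differences shrinks as the groups get larger.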

  • The short readings are very good references.

  • Why do we want to do randomised experiments? What is the point of them?
    Because we need to be sure which factors are really causing the effect, free of confounding. The true cause could be something different.

    What are the two elements we need to have in place to be able to have a fair test?
    We have to intervene in what cause is confounding and we have to use a balanced...

  • This statement is very clear:
    "With 95% confidence, the population quantity for variable for the subgroup is somewhere between ci.lower and ci.upper."

  • How is a “nest of trend curves” obtained for putting onto a scatterplot?
    Bootstrap resamples are taken from the data, and for each resample, the same type of curve is computed and added to the graph.

    How do we interpret such a “nest of trend curves”?
    We shouldn't have any confidence in the fitted trend in regions where the curves are far apart.
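
A minimal sketch of such a nest, assuming a straight line as the trend curve and a small hypothetical data set. It measures, rather than draws, how far apart the curves are at two positions; `fit_line` and `band_width` are illustrative helper names.

```python
import random
from statistics import mean

rng = random.Random(3)  # fixed seed so the sketch is repeatable

# Hypothetical (x, y) data with a roughly linear trend.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.3, 8.9, 10.2, 10.8]
points = list(zip(xs, ys))

def fit_line(pairs):
    """Least-squares intercept and slope for a list of (x, y) pairs."""
    mx = mean(x for x, _ in pairs)
    my = mean(y for _, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    slope = sxy / sxx
    return my - slope * mx, slope

nest = []
while len(nest) < 200:
    resample = rng.choices(points, k=len(points))  # sample with replacement
    if len({x for x, _ in resample}) > 1:          # need two distinct x values
        nest.append(fit_line(resample))

def band_width(x):
    """Vertical gap between the highest and lowest curve at position x."""
    heights = [a + b * x for a, b in nest]
    return max(heights) - min(heights)

print(f"gap between curves at x = 5.5: {band_width(5.5):.2f}")
print(f"gap between curves at x = 20:  {band_width(20):.2f}")
```

The curves stay close together in the middle of the data and fan out well beyond it, which is where the fitted trend deserves the least confidence.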

  • What uncertainties are being conveyed by the confidence intervals drawn around means on a graph?
    Each average lies within a probable range, so comparing the intervals to judge a difference adds further uncertainty.

    What does the overlap between confidence intervals drawn about means suggest visually?
    The comparison might be difficult because the values may be...

  • What are the two main ways of approaching the problem of obtaining confidence intervals?
    Mathematical theory (e.g. normal distribution) and computer-intensive methods (e.g. bootstrap).

    Why are methods based on mathematical theory the default methods in most packages for long-standing problems?
    Because, historically, they came first due to a...

  • What proportion of participants in the NHANES-1000 population do we expect to be classified as “Obese”?
    The simulation indicated a value between 8% and 24.5%.

    I am impressed by VIT. I could really see the confidence interval.

  • What is the basic idea of how a bootstrap confidence interval is constructed to capture the true population value of some quantity (e.g. a mean, a median, a percentage, ..)?
    The basic idea is to find the lower and upper limits of a confidence interval by calculating the same quantity for a large number of bootstrap resamples.

    What do we do to find out...
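
The construction described above can be sketched as follows. The sample is simulated, the quantity is the mean, and the interval is taken as the middle 95% of the resample means (a percentile interval); the names `ci_lower` and `ci_upper` are illustrative, and other ways of forming the interval exist.

```python
import random
from statistics import mean

rng = random.Random(11)  # fixed seed so the sketch is repeatable

# Hypothetical sample of 30 measurements.
sample = [rng.gauss(170.0, 8.0) for _ in range(30)]

# Compute the same quantity (here the mean) for many bootstrap resamples.
boot_means = sorted(
    mean(rng.choices(sample, k=len(sample))) for _ in range(1000)
)

# Percentile interval: cut off the lowest and highest 2.5% of resample means.
ci_lower = boot_means[25]
ci_upper = boot_means[974]

print(f"sample mean: {mean(sample):.1f}")
print(f"approximate 95% interval: {ci_lower:.1f} to {ci_upper:.1f}")
```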

  • Looking at the new terms, we can expect good knowledge ahead.

  • What is bootstrap re-sampling? How do we generate a bootstrap-resample?
    It is a method for estimating the margin of error of an estimate from a sample. We generate a bootstrap resample by sampling from the original sample with replacement.

    What would happen if we took our re-samples using the ordinary way of sampling (without replacement)?
    The samples will be identical so there will be no...
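
The difference between the two ways of re-sampling is easy to see on a tiny hypothetical sample. The sketch below uses `random.choices` for sampling with replacement and `random.sample` for sampling without replacement, taking as many values as the sample contains in both cases.

```python
import random
from statistics import mean

rng = random.Random(5)  # fixed seed so the sketch is repeatable
sample = [4, 8, 15, 16, 23, 42]  # hypothetical sample

# With replacement: values can repeat or be left out, so resamples differ.
with_replacement = [rng.choices(sample, k=len(sample)) for _ in range(5)]

# Without replacement: each "resample" is just the sample in another order.
without_replacement = [rng.sample(sample, k=len(sample)) for _ in range(5)]

print("means with replacement:   ", [round(mean(r), 2) for r in with_replacement])
print("means without replacement:", [round(mean(r), 2) for r in without_replacement])
```

Without replacement every resample has the same mean as the original sample, so it tells us nothing about how much the estimate could vary.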

  • What is the most reliable way we know of obtaining data about populations without misleading biases? Why is this method not perfect?
    Using random sampling. It's not perfect because there will always be an error due to sampling.

    What happens whenever we use data from a sample to estimate a population quantity?
    We might find errors, which get smaller as the...

  • As the sample size and the number of repetitions increase, the error tends toward its minimum.

  • The bigger the sample size, the smaller the sampling error.

  • What effect does sample size have on sampling error?
    The bigger the sample, the smaller the sampling error.

    For what two reasons are non-random selection mechanisms worse than random selection mechanisms?
    Random selection minimizes the influence of bias and lets us get a good idea of how reliable the estimates are.

    What were the 5 “take home messages” from this...
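
The link between sample size and sampling error can be checked with a small simulation. The population here is simulated and the "typical error" is taken to be the average distance between a sample mean and the population mean; both are assumptions made for this sketch.

```python
import random
from statistics import mean

rng = random.Random(2)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 values.
population = [rng.gauss(100.0, 15.0) for _ in range(10_000)]
true_mean = mean(population)

def typical_error(sample_size, repeats=300):
    """Average distance between a sample mean and the population mean."""
    errors = [
        abs(mean(rng.sample(population, sample_size)) - true_mean)
        for _ in range(repeats)
    ]
    return mean(errors)

for size in (10, 100, 1000):
    print(f"sample size {size:>4}: typical error {typical_error(size):.2f}")
```

The error shrinks as the random samples get bigger, but it never reaches exactly zero unless the whole population is sampled.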

  • Do the problems caused by bad measurement systems and biased selection mechanisms go away when we get huge amounts of data?
    No, these problems don't go away as we get more data.

    Do the problems caused by confounding go away when we get huge amounts of data?
    No, the influence of confounders doesn't change with the amount of data.

    Do the problems caused...

  • What is a lurking variable?
    It is an alternative name for a confounder: something that causes changes in both the outcome and the predictor variables.

    We have methods for adjusting for confounders, so why can we still not reliably draw causal conclusions from observational data alone?
    Because there is always the chance that effects we think we are seeing is...

  • What is a confounder?
    It is something that causes changes in both the outcome and the predictor of interest.

    What is a lurking variable?
    See confounder

    How can we adjust for a lack of balance on a known confounder?
    We must make comparisons within groups that have similar values of the confounder.
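
A small made-up example of comparing within groups that have similar values of the confounder. The records, the variable names and the choice of age as the confounder are all hypothetical.

```python
from statistics import mean

# Hypothetical records: (age_group, exercises, blood_pressure).
# Age is the confounder: older people exercise less and have higher pressure.
records = [
    ("young", True, 118), ("young", True, 120), ("young", True, 119),
    ("young", False, 121),
    ("old", True, 138),
    ("old", False, 140), ("old", False, 141), ("old", False, 139),
]

def mean_pressure(rows, exercises):
    """Mean blood pressure of the rows with the given exercise status."""
    return mean(bp for _, ex, bp in rows if ex == exercises)

# Crude comparison ignores age, so it mixes the age effect into the answer.
crude_diff = mean_pressure(records, False) - mean_pressure(records, True)

# Adjusted comparison: compare only within groups of similar age.
within_diffs = {}
for group in ("young", "old"):
    rows = [r for r in records if r[0] == group]
    within_diffs[group] = mean_pressure(rows, False) - mean_pressure(rows, True)

print(f"crude difference: {crude_diff:.2f}")
print(f"within-group differences: {within_diffs}")
```

In this made-up data the crude comparison greatly overstates the difference, because most of it comes from the age imbalance between the two groups.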

  • When is a variable a cause of changes in the outcome?
    When purposefully changing its value leads to a change in the pattern of outcomes.

    What is an observational study?
    A study in which the data result from observing conditions as they are in the world.

    When do we have positive association between variables? negative association?
    Positive, when things tend to occur...

  • We have learned some ways of dealing with bad data:
    - checking back against original sources.
    - setting suspicious data as missing.
    - checking if values of each numeric variable lie within believable limits.
    - checking if values of each categorical variable correspond to what is expected.
    - looking for suspicious points in dot plots or scatter plots.
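
Some of the checks in this list can be automated. The sketch below is only an illustration: the rows, the believable age limits and the expected category labels are all assumptions, not values from the course.

```python
# Hypothetical survey rows; the limits and categories below are assumptions.
rows = [
    {"id": 1, "age": 34, "sex": "F"},
    {"id": 2, "age": 340, "sex": "M"},      # age outside believable limits
    {"id": 3, "age": 27, "sex": "female"},  # unexpected category label
    {"id": 4, "age": None, "sex": "M"},     # already missing
]

AGE_LIMITS = (0, 120)
EXPECTED_SEX = {"F", "M"}

suspicious = []
for row in rows:
    age = row["age"]
    # Numeric check: does the value lie within believable limits?
    if age is not None and not (AGE_LIMITS[0] <= age <= AGE_LIMITS[1]):
        suspicious.append((row["id"], "age"))
        row["age"] = None  # set the suspicious value as missing
    # Categorical check: is the label one of the expected ones?
    if row["sex"] not in EXPECTED_SEX:
        suspicious.append((row["id"], "sex"))

print(suspicious)
```

Flagged values would still need to be checked back against the original sources before deciding what to do with them.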

  • In terms of selection biases, there are important items to consider:
    - Biases or errors are generally biggest in the “non-scientific” polls or surveys that do not use sampling.
    - There are a host of things that can have a significant influence on the way people answer a question, such as information in the survey about why it is being done, differences in...

  • We have discussed validity (measuring the right thing) and reliability (when you measure the same thing over and over again, you get pretty much the same answer).

  • It's always good to learn new concepts.

  • What are artefacts?
    Artificial patterns caused by deficiencies in the data-collection process.

    What are the two main ways that systematic biases get into data?
    Bad measurement processes and biased selection processes.

    Why can missing values cause biases?
    Because the data that we have values for could show trends different from those whose values have...

  • What is the first law of data analysis?
    "Garbage in, garbage out"

    Can sophisticated data analysis turn bad data into reliable conclusions?
    No. If data is really bad, we should just walk away from it.

    In terms of the patterns we see in data, what is the difference between facts and artefacts?
    Facts are patterns that reflect the way things really are in...

  • I have had some problems because English is not my native language. Sometimes I was confused by the meaning of variables used in the iNZight software. After all, I can say that I understand the features of the software for showing data trends and relationships. I would like to highlight the "Overcoming perceptual problems" session. I became surprised with the...

  • It was very nice to explore the many possibilities for visualizing trends and other relationships.

  • iNZight is a great tool for analyzing big data sets.

  • What are we looking for when we colour by a (third) numeric variable?
    How the pattern in the scatterplot behaves across separate ranges or groups of values of the third numeric variable.

    What are we looking for when we colour by a (third) categorical variable?
    Where the groups of the third variable fall on the spectrum from completely separated to totally mixed up.

    What are...

  • I chose the smoother because it fits the dataset better. Up to 20, there is a positive slope. Then the weight stays stable until 60, when it decreases slowly. In the smoother graph, the dotted lines show the spread of points between the 25% and 75% quantiles of the weight data.

  • In large data sets, what is emphasized visually by a low transparency setting? by a high transparency setting?
    With low transparency, the concentration of values is shown. With high transparency, the bulk of the data is clearly shown.

    What are running quantiles and why are they useful for large data sets?
    Running quantiles are curves drawn and labelled as...