Introduction To
The Statistical
Concepts
Objectives
• Define statistics.
• Enumerate the importance and limitations of statistics.
• Explain the process of statistics.
• Know the difference between descriptive and inferential
statistics.
• Distinguish between qualitative and quantitative variables.
• Distinguish between discrete and continuous variables.
• Determine the level of measurement of variables.
STATISTICS?
Definition of Statistics
STATISTICS is the science of
collecting, organizing,
summarizing and analyzing
information to draw conclusions
or answer questions.
Definition of Statistics
1. Collection of information.
2. Organization and summarization of information.
3. Information is analyzed to draw conclusions or
answer specific questions.
4. Results should be reported using some measure
that represents how convinced we are that our
conclusions reflect reality.
Importance
of Statistics
It enables people to
make decisions based
on empirical evidence.
Importance of Statistics
Provides us with tools
needed to convert
massive data into
pertinent information
that can be used in
decision making.
Provides us with
information that we
can use to make
sensible decisions.
DATA
DATA are factual
information used as a basis
for reasoning, discussion, or
calculation.
Field of
Statistics
Field of Statistics
Mathematical
Statistics
- The study and development of statistical theory
and methods in the abstract.
Applied Statistics
- The application of statistical methods to solve real
problems involving randomly generated data and
the development of new statistical methodology
motivated by real problems.
Limitation of
Statistics
Limitation of Statistics
1. Statistics is not suitable
for the study of qualitative
phenomena.
2. Statistics does not
study individuals.
3. Statistical laws are not
exact.
4. Statistical tables may be
misused.
5. Statistics is only one
of the methods of
studying a problem.
Process of
Statistics
Process of Statistics
1. Identify the research objective
- A researcher must determine the
question(s) he or she wants
answered. The question(s) must
clearly identify the population that is
to be studied.
Process of Statistics
2. Collect the information needed to
answer the questions.
- Conducting research on an entire
population is often difficult and
expensive, so we typically look at a
sample.
EXAMPLE
The Philippine Mental Health Association contacted
1,028 teenagers who are 13 to 17 years of age and live in
Laoag City and asked whether or not they had been
prescribed medications for any mental disorders, such as
depression or anxiety.
Population:
Teenagers 13 to 17
years of age who live in
Laoag City.
Sample:
The 1,028 teenagers 13
to 17 years of age who live
in Laoag City and were contacted.
EXAMPLE
A farmer wanted to learn about the weight of
his corn crop. He randomly sampled 100 plants and
weighed the corn on each plant.
Population:
Entire corn crop
Sample:
The 100 selected
corn plants
Process of Statistics
3. Organize and summarize the
information
- Descriptive statistics allow the
researcher to obtain an overview of the
data and can help determine the type of
statistical methods the researcher should
use.
Process of Statistics
4. Draw conclusions from the
information
- Information collected from the sample is
generalized to the population.
- Inferential statistics uses methods that
generalize results from a sample to the
population and measure their reliability.
Take Note!
If the entire population is
studied, then inferential statistics
is not necessary, because
descriptive statistics will provide
all the information that we need
regarding the population.
EXAMPLE
1. A badminton player wants to
know his average score for the
past 10 games.
EXAMPLE
2. A car manufacturer wishes to
estimate the average lifetime of
batteries by testing a sample of 50
batteries.
EXAMPLE
3. Janine wants to determine the
variability of her six exam scores in
Algebra.
EXAMPLE
4. A politician wants to determine
the total number of votes his rival
obtained in the past election
based on his copies of the tally
sheet of electoral returns.
EXAMPLE
5. A shipping company wishes to
estimate the number of
passengers traveling via their
ships next year using their data on
the number of passengers in the
past three years.
Distinction Between
Qualitative and
Quantitative
Variables
Qualitative and Quantitative Variables
Variables
- Characteristics of the
individuals within the
population.
Qualitative and Quantitative Variables
Qualitative Variable
- A variable that yields
categorical responses. It is a
word or a code that represents
a class or category.
Qualitative and Quantitative Variables
Quantitative Variable
- A variable that takes on numerical values
representing an amount or
quantity.
EXAMPLE
1. Hair Color
2. Temperature
3. Stages of Breast Cancer
4. Number of Hamburgers Sold
EXAMPLE
5. Number of Children
6. Zip Code
7. Place of Birth
8. Degree of Pain
Distinction Between
Discrete and
Continuous
Discrete and Continuous
Discrete Variable
- A quantitative variable
that has either a finite number of
possible values or a countable
number of possible values.
Discrete and Continuous
Continuous Variable
- A quantitative variable
that has an infinite number of
possible values that are not
countable.
EXAMPLE
1. The number of heads obtained after flipping a
coin five times.
2. The number of cars that arrive at a McDonald’s
drive-through between 12:00 P.M. and 1:00 P.M.
3. The distance a 2005 Toyota car can travel in
city conditions with a full tank of gas.
EXAMPLE
4. Number of words correctly spelled.
5. Time of a runner to finish one
lap.
Levels of
Measurement
Levels of Measurement
Qualitative: Nominal, Ordinal
Quantitative: Interval, Ratio
Levels of Measurement
Nominal
- They are sometimes called
categorical scales or categorical data.
Such a scale classifies persons or
objects into two or more categories.
Example
Nominal
Method of Payment
Type of School
Eye Color
Levels of Measurement
Ordinal
- This involves data that may be
arranged in some order, but
differences between data values
either cannot be determined or
are meaningless.
Example
Ordinal
Food Preferences
Stage of Diseases
Socioeconomic Class
Severity of Pain
Levels of Measurement
Interval
- This is a measurement level that not only
classifies and orders the measurements, but
also specifies that the distances between
each interval on the scale are equivalent
along the scale from low interval to high
interval.
Example
Interval
• Temperature on Fahrenheit/Celsius
Thermometer
• Trait Anxiety
• IQ
Levels of Measurement
Ratio
- A ratio scale represents the highest,
most precise level of measurement. It has the
properties of the interval level of
measurement, and the ratios of the values of
the variable have meaning.
Example
Ratio
• Height and Weight
• Time
• Time until death
Levels of Measurement
Scale      Counting   Ranking   Addition/Subtraction   Multiplication/Division
Nominal    √
Ordinal    √          √
Interval   √          √         √
Ratio      √          √         √                      √
Example
1. Ranking of college athletic teams.
2. Employee number.
3. Number of vehicles registered.
4. Brands of soft drinks.
5. Number of cars passing along C5 on a
given day.
Assessments/
Activities
Identify each of the following data sets as
either a population or a sample.
1. The grade point averages (GPAs) of all students at a
college.
2. The GPAs of a randomly selected group of students at
a college campus.
3. The ages of the nine Supreme Court Justices of the
United States on January 1, 1842.
4. The gender of every second customer who enters a
movie theater.
5. The lengths of Atlantic croakers caught on a fishing trip
to the beach.
Identify the following measures as either
Quantitative or Qualitative.
1. The gender of the first 40 newborns in a hospital
in one year.
2. The natural hair color of 20 randomly selected
fashion models.
3. The ages of 20 randomly selected fashion models.
4. The fuel economy in miles per gallon of 20 new
cars purchased last month.
5. The political affiliation of 500 randomly selected
voters.
Data Collection and
Basic Concepts in
Sampling Design
Objectives
• Determine the sources of data (primary and
secondary data).
• Distinguish the different methods of data
collection under primary and secondary data.
• Determine the appropriate sample size.
• Differentiate various sampling techniques.
• Know the sources of errors in sampling.
Data Collection
Data collection is the process of
gathering and measuring information
on variables of interest, in an
established systematic fashion that
enables one to answer stated research
questions, test hypotheses, and
evaluate outcomes.
Consequences
from Improperly
Collected Data
Data Collection
• Inability to answer research questions accurately.
• Inability to repeat and validate the study.
• Distorted findings resulting in wasted resources.
• Misleading other researchers to pursue fruitless
avenues of investigation.
• Compromising decisions for public policy.
• Causing harm to human participants and animal
subjects.
Steps in Data
Gathering
Steps in Data Gathering
1. Set the objectives for collecting data.
2. Determine the data needed based on the set
objectives.
3. Determine the method to be used in data
gathering and define the comprehensive data
collection points.
4. Design data gathering forms to be used.
5. Collect data.
Choosing a
Method of Data
Collection
Data Collection
Decision-makers need
information that is relevant, timely,
accurate and usable. The cost of
obtaining, processing and
analyzing these data is high.
Sources of
Data
Primary Sources
Provide a first-hand
account of an event or time
period and are considered to
be authoritative.
Primary Data
Data documented by the
primary source. The data
collectors documented the
data themselves.
Secondary Sources
Offer an analysis,
interpretation or a restatement
of primary sources and are
considered to be persuasive.
Secondary Data
Data documented by a
secondary source. The data
collectors had the data
documented by other
sources.
Primary Data
Can Be Collected Using
5 Methods
Methods
1. Direct Personal Interviews
- the researcher has direct
contact with the interviewee. The
researcher gathers information by
asking questions to the interviewee.
Methods
2. Indirect/Questionnaire Method
- in this method of data collection,
written responses are obtained by
distributing a prepared set of
questions to the respondents.
Questions to be Considered
What exactly do we want to know
according to the objectives and
variables we identified earlier?
Questions to be Considered
Of whom will we ask
questions and what
techniques will we use?
Questions to be Considered
Are our informants
mainly literate or
illiterate?
Questions to be Considered
How large is the
sample that will be
interviewed?
Key Design
Principles of a
Good
Questionnaire
Key Design Principles of a Good Questionnaire
1. Keep the questionnaire as short as
possible.
2. Decide on the type of questionnaire
(open ended or closed ended).
3. Write the questions properly.
4. Order the questions appropriately.
Key Design Principles of a Good Questionnaire
5. Avoid questions that prompt or motivate
the respondent to say what you would like
to hear.
6. Write an introductory letter or an
introduction.
7. Write special instructions for
interviewers or respondents.
Key Design Principles of a Good Questionnaire
8. Translate the questions if
necessary.
9. Always test your questions
before taking the survey.
Open-ended
Question & Closed-ended
Question
Open-ended Question
- type of question that does
not include response categories.
The respondent is not given any
possible answers to choose
from.
Closed-ended Question
- is a type of question that
includes a list of response
categories from which
respondent will select his/her
answer.
Advantages
Open-ended VS Closed-ended
Open-ended:
• More detailed answer.
• Could reveal additional insights.
Closed-ended:
• Easy to encode, tabulate, and analyze.
• Easy to understand.
• Enables inter-study comparison.
• Saves time and money.
• High response rate.
Disadvantages
Open-ended VS Closed-ended
Open-ended:
• Difficult to encode, tabulate, and analyze.
• Low response rate.
• Respondent has to be articulate.
• Respondent could feel threatened.
• Responses could have different levels of detail.
Closed-ended:
• Could frustrate respondents.
• Potentially biased response sets.
• Difficult or impossible to detect if the respondent truly
understood the questions.
Methods
3. Focus Group
- is a group interview of
approximately six to twelve people
who share similar characteristics or
common interest.
Methods
4. Experiment
- is a method of collecting data
where there is direct human
intervention on the conditions that
may affect the values of the variable
of interest.
Experiment
• Ethical, moral, and legal
concerns.
• Unrealistic controlled
environments.
• Inability to control for all
variables.
Methods
5. Observation
- is a technique that involves
systematically selecting, watching and
recording behaviors of people or other
phenomena and aspects of the setting
in which they occur, for the purpose of
getting specified information.
Observation
• Radiographic
• Biochemical
• X-ray machines
• Microscope
• Clinical examinations
• Microbiological examinations
Secondary
Data Can Be
Collected Using 5
Methods
Methods
1. Published reports in newspapers and
periodicals.
2. Financial data reported in annual
reports.
3. Records maintained by the institution.
4. Internal reports of the government
departments.
5. Information from official publications.
Take Note!
• Always investigate the validity and reliability of
the data by examining the collection method
employed by your source.
• Do not use inappropriate data for your
research.
• The choice of methods of data collection is
largely based on the accuracy of the
information they yield.
Sample Size
Sample Size
“How many
participants should
be chosen for a
survey?”
Sample Size
- is typically denoted by n and
it is always a positive integer.
- no exact sample size can be
mentioned here and it can vary in
different research settings.
Take Note!
• Representativeness, not size, is the
more important consideration.
• Use no less than 30 subjects if possible.
• If you use complex statistics, you may
need a minimum of 100 or more in your
sample (varies with method)
Non-Statistical
and Statistical
Considerations
Non-Statistical Considerations
- It may include availability
of resources, manpower,
budget, ethics, and sampling
frame.
Statistical Considerations
- It will include the
desired precision of the
estimate.
Criteria in
Determining
the Appropriate
Sample Size
1. Level of Precision
- Also called sampling
error, the level of precision is
the range in which the true
value of the population is
estimated to be.
2. Confidence Level
- It is a statistical measure of
the number of times out of
100 that results can be
expected to be within a
specified range.
2. Confidence Level
Desired Confidence Level   Z-Score
80%                        1.28
85%                        1.44
90%                        1.65
95%                        1.96
99%                        2.58
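The z-scores in the table can be reproduced from the standard normal distribution. The sketch below is one way to do it in Python with the standard library; the function name z_score is an illustrative choice, not something from the slides. Note that the exact value for 90% is about 1.645, which the table shows rounded up to 1.65.

```python
from statistics import NormalDist


def z_score(confidence_level: float) -> float:
    """Two-sided critical value z for a confidence level such as 0.95."""
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    # The probability left outside the interval is split equally
    # between the two tails of the standard normal curve.
    upper_tail = (1.0 - confidence_level) / 2.0
    return NormalDist().inv_cdf(1.0 - upper_tail)


if __name__ == "__main__":
    for level in (0.80, 0.85, 0.90, 0.95, 0.99):
        print(f"{level:.0%}  z = {z_score(level):.3f}")
```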
3. Degree of Variability
- depending upon the
target population and
attributes under
consideration, the degree of
variability varies considerably.
Methods in
Determining
the Sample Size
1. Estimate the Mean or
Average
- The sample size required
to estimate the population
mean µ with a given level of
confidence and a specified
margin of error e.
Take Note!
If σ is unknown, it is common practice
to conduct a preliminary survey to
determine s and use it as an estimate of σ, or
use results from previous studies to obtain
an estimate of σ. When using this approach,
the size of the sample should be at least
30.
Example
A soft drink machine is regulated so that the
amount of drink dispensed is approximately
normally distributed with a standard
deviation equal to 0.5 ounce. Determine the
sample size needed if we wish to be 95%
confident that our sample mean will be
within 0.03 ounce from the true mean.
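The formula itself does not appear in the text of the slide. The usual formula for this case is n = (z σ / e)², rounded up to the next whole number. The sketch below applies it to the soft drink example, assuming z = 1.96 for 95% confidence as in the z-score table; the function name is illustrative.

```python
import math


def sample_size_for_mean(sigma: float, margin_of_error: float, z: float) -> int:
    """Smallest n for which z * sigma / sqrt(n) is at most margin_of_error."""
    if sigma <= 0 or margin_of_error <= 0 or z <= 0:
        raise ValueError("sigma, margin_of_error and z must all be positive")
    raw = (z * sigma / margin_of_error) ** 2
    # Remove floating-point noise first, then always round up,
    # because rounding down would miss the required precision.
    return math.ceil(round(raw, 6))


# Soft drink machine: sigma = 0.5 ounce, e = 0.03 ounce, 95% confidence.
n = sample_size_for_mean(sigma=0.5, margin_of_error=0.03, z=1.96)
print(n)  # 1068
```

The raw value is about 1067.11, so 1,068 observations are needed.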
2. Estimating Proportion (Infinite
Population)
- The sample size required
to obtain a confidence interval
for p with specified margin of
error e.
Example
Suppose we are doing a study on the inhabitants
of a large town, and want to find out how many
households serve breakfast in the mornings. We
don’t have much information on the subject to
begin with, so we’re going to assume that half of
the families serve breakfast: this gives us
maximum variability. So p = 0.5. We want 99%
confidence and at least 1% precision.
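The formula is not shown in the text of the slide. The usual formula for a large (effectively infinite) population is n = z² p (1 − p) / e². The sketch below applies it to the breakfast example, assuming z = 2.58 for 99% confidence as in the z-score table and e = 0.01 for 1% precision; the function name is illustrative.

```python
import math


def sample_size_for_proportion(p: float, margin_of_error: float, z: float) -> int:
    """Sample size for estimating a proportion in a very large population."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if margin_of_error <= 0 or z <= 0:
        raise ValueError("margin_of_error and z must be positive")
    raw = (z ** 2) * p * (1.0 - p) / (margin_of_error ** 2)
    # Remove floating-point noise before rounding up to a whole respondent.
    return math.ceil(round(raw, 6))


# Breakfast survey: p = 0.5 (maximum variability), e = 0.01, 99% confidence.
n = sample_size_for_proportion(p=0.5, margin_of_error=0.01, z=2.58)
print(n)  # 16641
```

Using p = 0.5 gives the largest possible value of p (1 − p), so the result is a conservative sample size.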
3. Slovin’s Formula
- Slovin’s formula is used
to calculate the sample size n
given the population size and
error.
Example
A researcher plans to conduct a
survey about food preference of
BS Stat students. If the population
of students is 1000, find the
sample size if the error is 5%
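Slovin's formula is commonly written as n = N / (1 + N e²), where N is the population size and e is the error. The sketch below applies it to the example above; the function name is illustrative.

```python
import math


def slovin_sample_size(population_size: int, error: float) -> int:
    """Sample size from Slovin's formula n = N / (1 + N * e^2), rounded up."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if not 0.0 < error < 1.0:
        raise ValueError("error must be a proportion between 0 and 1")
    raw = population_size / (1.0 + population_size * error ** 2)
    # Remove floating-point noise, then round up to a whole respondent.
    return math.ceil(round(raw, 6))


# BS Stat students: N = 1000, e = 5%.
n = slovin_sample_size(population_size=1000, error=0.05)
print(n)  # 286
```

Here 1000 / (1 + 1000 × 0.0025) = 1000 / 3.5 ≈ 285.71, which rounds up to 286 students.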
4. Finite Population
Correction
- If the population is
small then the sample size
can be reduced slightly.
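The slide does not show a formula for the correction. One commonly used form is n = n0 / (1 + (n0 − 1) / N), where n0 is the sample size computed for an infinite population and N is the population size. The sketch below assumes that form; the town size of 50,000 is a made-up figure for illustration only.

```python
import math


def finite_population_correction(n0: int, population_size: int) -> int:
    """Adjust an infinite-population sample size n0 for a population of size N."""
    if n0 <= 0 or population_size <= 0:
        raise ValueError("n0 and population_size must be positive")
    raw = n0 / (1.0 + (n0 - 1) / population_size)
    # Remove floating-point noise, then round up to a whole respondent.
    return math.ceil(round(raw, 6))


# Hypothetical: n0 = 16641 from the proportion formula, town of 50,000 households.
n = finite_population_correction(n0=16641, population_size=50_000)
print(n)  # 12486
```

As N grows, the denominator approaches 1 and the corrected size approaches n0, so the correction only matters when the sample is a sizeable fraction of the population.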
Online Calculators of Sample
Size
https://select-statistics.co.uk/calculators/sample-size-calculator-population-proportion/
https://www.calculator.net/sample-size-calculator.html
Basic
Sampling
Design
Reason for Sampling
• It is important that the individuals included
in the sample represent a cross section of
individuals in the population.
• If the sample is not representative, it is
biased. You cannot generalize to the
population from your statistical data.
Observation Unit
• An object on which a
measurement is taken. This
is the basic unit of
observation, sometimes
called an element.
Target Population
• The complete
collection of
observations we want to
study.
Sampled Population
• The collection of all possible
observation units that might
have been chosen in a
sample; the population from
which the sample was taken.
Sample
• A subset of a
population.
Sampling Unit
• A unit that can be selected for a
sample. We may want to study
individuals, but do not have a
list of all individuals in the target
population.
Sampling Frame
• A list, map, or other
specification of sampling
units in the population from
which a sample may be
selected.
Sampling Bias
• This involves problems in
your sampling, which reveals
that your sample is not
representative of your
population.
Advantages of
Sampling Over
Complete
Enumeration
Advantage of Sampling
• Less Labor
• Reduced Cost
• Greater Speed
• Greater Scope
• Greater Efficiency and Accuracy
• Convenience
• Ethical Considerations
Two Types of
Sample
1. Probability Sample
• Samples are obtained using some
objective chance mechanism, thus
involving randomization.
• They require the use of a complete
listing of the elements of the
universe called sampling frame.
2. Non-Probability
Sample
• Samples are obtained haphazardly,
selected purposively or are taken as
volunteers.
• The probabilities of selection are
unknown.
• They should not be used for statistical
inference.
Sampling
Procedure
Sampling Procedure
• Identify the population
• Determine if population is accessible
• Select a sampling method.
• Choose a sample that is representative of
the population.
• Ask the question, can I generalize to the
general population from the accessible
population?
Basic Sampling
Technique of
Probability Sampling
1. Simple Random Sampling
• Most basic method of drawing a
probability sample.
• Assigns equal probabilities of
selection to each possible sample.
• Results in a simple random
sample.
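A simple random sample can be drawn in Python with the standard library. This is a minimal sketch, assuming a complete list of the population is available; the function name and the seed parameter (used only to make a draw repeatable) are illustrative.

```python
import random


def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size distinct elements so every possible sample is equally likely."""
    units = list(population)
    if not 0 <= sample_size <= len(units):
        raise ValueError("sample_size must be between 0 and the population size")
    rng = random.Random(seed)
    # random.sample draws without replacement, so no unit is picked twice.
    return rng.sample(units, sample_size)


if __name__ == "__main__":
    students = range(1, 501)  # serial numbers 1..500
    print(simple_random_sample(students, 10, seed=42))
```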
Simple Random Sampling
Advantages and Disadvantages
Advantage:
• It is very simple and easy to use.
Disadvantage:
• The sample chosen may be distributed over a wide
geographic area.
When to Use
Simple Random Sampling
• This is preferable to use if
the population is not
widely spread
geographically.
2. Systematic Random Sampling
• It is obtained by selecting every
kth individual from the
population.
• The first individual selected
corresponds to a random
number between 1 and k.
Obtaining a
Systematic
Random Sample
Obtaining a Systematic
Random Sample
• Decide on a method of
assigning a unique serial
number, from 1 to N, to each
one of the elements in the
populations.
Obtaining a Systematic
Random Sample
• Compute the sampling
interval: k = N/n.
Obtaining a Systematic
Random Sample
• Select a number r, from 1 to k, using a
randomization mechanism. The
element in the population assigned to
this number is the first element of the
sample. The other elements of the
sample are those assigned to the
numbers r + k, r + 2k, r + 3k, and so on
until you get a sample of size n.
Example
We want to select a sample
of 50 students from 500
students. Under this method, every kth
item (here k = 500/50 = 10) is picked from the
sampling frame.
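The steps for obtaining a systematic random sample can be sketched in Python for the 50-out-of-500 example. This is a minimal sketch, assuming the sampling interval is k = N/n and that N is a whole multiple of n; the function name and the seed parameter are illustrative.

```python
import random


def systematic_sample(population_size, sample_size, seed=None):
    """Return the serial numbers (1..N) chosen by systematic random sampling."""
    if sample_size <= 0 or population_size < sample_size:
        raise ValueError("need 0 < sample_size <= population_size")
    k = population_size // sample_size  # sampling interval
    rng = random.Random(seed)
    start = rng.randint(1, k)           # random start between 1 and k, inclusive
    # Take the start, then every kth serial number after it.
    return [start + i * k for i in range(sample_size)]


if __name__ == "__main__":
    chosen = systematic_sample(population_size=500, sample_size=50, seed=1)
    print(chosen[:5], "...", chosen[-1])
```

With N = 500 and n = 50 the interval is 10, so a random start of, say, 7 selects students 7, 17, 27, and so on up to 497.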
Systematic Random Sampling
Advantages and Disadvantages
Advantage:
• Drawing of the sample is easy. It is easy to administer
in the field, and the sample is spread evenly over the
population.
Disadvantage:
• May give poor precision when unsuspected periodicity is
present in the population.
When to Use
Systematic Random Sampling
• This is advisable to use if the
ordering of the population is
essentially random and when
stratification with numerous
strata is used.
3. Stratified Random
Sampling
• It is obtained by separating the
population into non-overlapping
groups called strata and then
obtaining a simple random
sample from each stratum.
Example
A sample of 50 students is to be
drawn from a population consisting of 500
students belonging to two institutions A
and B. The number of students in the
institution A is 200 and the institution B is
300. How will you draw the sample using
proportional allocation?
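Under proportional allocation, each stratum contributes to the sample in proportion to its share of the population, so stratum h gets n × N_h / N units. The sketch below works the example out and then draws a simple random sample within each stratum; the function names and the seed parameter are illustrative.

```python
import random


def proportional_allocation(strata_sizes, total_sample):
    """Split total_sample across strata in proportion to their sizes."""
    population = sum(strata_sizes.values())
    # With sizes that do not divide evenly, rounding can make the parts
    # add up to slightly more or less than total_sample.
    return {name: round(total_sample * size / population)
            for name, size in strata_sizes.items()}


def stratified_sample(strata, total_sample, seed=None):
    """Draw a simple random sample of the allocated size from every stratum."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in strata.items()}
    allocation = proportional_allocation(sizes, total_sample)
    return {name: rng.sample(list(units), allocation[name])
            for name, units in strata.items()}


allocation = proportional_allocation({"A": 200, "B": 300}, 50)
print(allocation)  # {'A': 20, 'B': 30}
```

So 20 students are drawn from institution A and 30 from institution B.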
Stratified Random Sampling
Advantages and Disadvantages
Advantage:
• Stratification of respondents is advantageous in
terms of precision of the estimates of the characteristics
of the population.
Disadvantage:
• Values of the stratification variable may not be easily
available for all units in the population, especially if the
characteristic of interest is homogeneous.
When to Use
Stratified Random Sampling
• If the population is such that the
distribution of the characteristics of
the respondents under
consideration is concentrated in
small and widely spread segments of the
population.
4. Cluster Sampling
• You take a sample from naturally
occurring groups in your population.
• The clusters are constructed such that
the sampling units are heterogeneous
within the cluster and homogeneous
among the clusters.
Obtaining a
Cluster Sample
Obtaining a Cluster Sample
• Divide the population into non-
overlapping clusters.
• Number the clusters in the
population from 1 to N.
Obtaining a Cluster Sample
• Select n distinct numbers from 1 to
N using a randomization
mechanism. The selected clusters
are the clusters associated with the
selected numbers.
• The sample will consist of all
elements in the selected clusters.
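The steps for obtaining a cluster sample can be sketched as follows. This is a minimal sketch, assuming the clusters are given as a mapping from a cluster label to the list of its elements; the function name, the seed parameter, and the school labels in the usage lines are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Select n_clusters whole clusters at random and keep every element in them."""
    if not 1 <= n_clusters <= len(clusters):
        raise ValueError("n_clusters must be between 1 and the number of clusters")
    rng = random.Random(seed)
    labels = sorted(clusters)                # fixed numbering of the N clusters
    chosen = rng.sample(labels, n_clusters)  # n distinct clusters
    # The sample consists of ALL elements in the selected clusters.
    sample = [element for label in chosen for element in clusters[label]]
    return chosen, sample


if __name__ == "__main__":
    schools = {
        "School A": ["a1", "a2", "a3"],
        "School B": ["b1", "b2"],
        "School C": ["c1", "c2", "c3", "c4"],
    }
    print(cluster_sample(schools, 2, seed=0))
```

Only a list of clusters is needed to make the draw; the elements are listed only for the clusters that end up selected.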
Example
A researcher wants to
survey academic performance
of high school students in
MIMAROPA.
Cluster Sampling
Advantages and Disadvantages
Advantage:
• There is no need to come up with a list of units in the
population; all that is needed is simply a list of the clusters.
Disadvantage:
• In actual field applications, adjacent households tend to
have more similar characteristics than households that are
far apart.
When to Use
Cluster Sampling
• If the population can be grouped
into clusters where individual
population elements are known to
be different with respect to the
characteristics under study, this
is preferable to use.
5. Multi-Stage Sampling
• Selection of the sample is done in two
or more steps or stages, with sampling
units varying in each stage.
• The population is first divided into a
number of first-stage sampling units
from which a sample is drawn.
Obtaining a
Multi-Stage Sample
Obtaining a Multi-Stage Sample
• Organize the sampling process into
stages where the unit of analysis is
systematically grouped.
• Select a sampling technique for each.
• Systematically apply the sampling
technique to each stage until the unit of
analysis has been selected.
Example
Suppose we wish to study the
expenditure patterns of households in
NCR. We can select a sample of
households for this study using
simple three-stage sampling.
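The slide does not name the three stages. The sketch below assumes one plausible arrangement: cities first, then barangays within the selected cities, then households within the selected barangays, with simple random sampling at every stage. The function name, the nested-dictionary frame, and all labels are hypothetical.

```python
import random


def three_stage_sample(frame, n_first, n_second, n_third, seed=None):
    """Three-stage sampling with simple random sampling at every stage.

    frame maps first-stage units to second-stage units to lists of
    third-stage units, e.g. {city: {barangay: [households]}}.
    """
    rng = random.Random(seed)
    selected = []
    # Stage 1: sample first-stage units (assumed here to be cities).
    for first in rng.sample(sorted(frame), n_first):
        second_units = frame[first]
        take_second = min(n_second, len(second_units))
        # Stage 2: sample second-stage units within each selected city.
        for second in rng.sample(sorted(second_units), take_second):
            third_units = second_units[second]
            take_third = min(n_third, len(third_units))
            # Stage 3: sample households within each selected barangay.
            for unit in rng.sample(third_units, take_third):
                selected.append((first, second, unit))
    return selected


if __name__ == "__main__":
    toy_frame = {
        f"City {c}": {
            f"Brgy {c}-{b}": [f"HH {c}-{b}-{h}" for h in range(1, 21)]
            for b in range(1, 6)
        }
        for c in range(1, 5)
    }
    print(len(three_stage_sample(toy_frame, 2, 3, 5, seed=1)))
```

A household list is needed only for the barangays that reach the last stage, which is why sampling frames are easier to build under this design.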
Multi-Stage Sampling
Advantages and Disadvantages
Advantages:
• It is easier to generate adequate sampling frames.
• Transportation costs are greatly reduced since there is
some form of clustering among ultimate or final samples.
Disadvantages:
• Its complexity in theory may make it difficult to apply
in the field.
• Estimation procedures may be difficult for
non-statisticians to follow.
When to Use
Multi-Stage Sampling
• If no population list is
available and if the
population covers a wide
area.
Take Note!
• Use probability sampling if the
main objective of the sample
survey is making inferences
about the characteristics of
the population under study.
Basic Sampling
Technique of Non-
Probability Sampling
Accidental Sampling
• There is no system of
selection but only those
whom the researcher or
interviewer meets by chance.
Quota Sampling
• A specified number
of persons of certain
types is included in the
sample.
Convenience Sampling
• It is the process of picking out
people in the most
convenient and fastest way
to get reactions immediately.
Purposive Sampling
• It is based on certain
criteria laid down by the
researcher.
Judgement Sampling
• Selects sample in
accordance with an
expert’s judgement.
Cases wherein Non-
Probability Sampling
is Useful
Cases wherein Non-Probability
Sampling is Useful
• Only few are willing to be interviewed.
• Extreme difficulties in locating or
identifying subjects.
• Probability sampling is more expensive
to implement.
• Cannot enumerate the population
elements.
Sources of Errors in
Sampling
1. Non-sampling Error
• Errors that result from the
survey process.
• Any errors that cannot be
attributed to the sample-to-
sample variability.
Sources of Non-sampling Error
• Non-response
• Interview Error
• Misrepresented Answers
• Data entry errors
• Questionnaire Design
• Wording of Questions
• Selection Bias
2. Sampling Error
• Error that results from taking one
sample instead of examining the
whole population.
• Error that results from using
sampling to estimate information
regarding a population.

Statistic quantitative qualitative sample

  • 1.
  • 2.
    Objectives • Define statistics. •Enumerate the importance and limitations of statistics. • Explain the process of statistics. • Know the difference between descriptive and inferential statistics. • Distinguish between qualitative and quantitative variables. • Distinguish between discrete and continuous variables. • Determine the level of measurement of variables.
  • 3.
  • 4.
    Definition of Statistics STATISTICSis the science of collecting, organizing, summarizing and analyzing information to draw conclusions or answer questions.
  • 5.
    Definition of Statistics 1.Collection of information. 2. Organization and summarization of information. 3. Information is analyzed to draw conclusions or answer specific questions. 4. Results should be reported using some measure that represents how convinced we are that our conclusions reflect reality.
  • 6.
  • 7.
    It enables peopleto make decisions based on empirical evidence. Importance of Statistics Provides us with tools needed to convert massive data into pertinent information that can be used in decision making. Provides us information that we can used to make sensible decision.
  • 8.
    DATA DATA are factual informationused as a basis for reasoning, discussion, or calculation.
  • 9.
  • 10.
    Field of Statistics Mathematical Statistics -The study and development of statistical theory and methods in the abstract. Applied Statistics - The application of statistical methods to solve real problems involving randomly generated data and the development of new statistical methodology motivated by real problems.
  • 11.
  • 12.
    Limitation of Statistics 1.Statistics is not suitable to the study of qualitative phenomenon. 2. Statistics does not study individuals. 3. Statistical laws are not exact. 4. Statistics table may be misused. 5. Statistics is only, one of the methods of studying a problem.
  • 13.
  • 14.
    Process of Statistics 1.Identify the research objective - A researcher must determine the question(s) he or she wants to answered. The question(s) must clearly identify the population that is to be studied.
  • 15.
    Process of Statistics 2.Collect the information needed to answer the questions. - Conducting research on an entire population is often difficult and expensive, so we typically look at a sample.
  • 16.
    EXAMPLE The Philippine MentalHealth Associations contacts 1, 028 teenagers who are 13 to 17 years of age and live in Laoag City and asked whether or not they had been prescribed medications for any mental disorders, such as depression or anxiety. Population: Teenagers 13 to 17 years of age who live in Laoag City. Sample: 1, 028 teenagers 13 to 17 years of age who live in Laoag City.
  • 17.
    EXAMPLE A farmer wantedto learn about the weight of his corn crop. He randomly sampled 100 plants and weighted the corn on each plant. Population: Entire corn crop Sample: 100 selected corn crop
  • 18.
    Process of Statistics 3.Organize and summarize the information - Descriptive statistics allow the researcher to obtain an overview of the data and can help determine the type of statistical methods the research should use.
  • 19.
    Process of Statistics 4.Draw conclusion from the information - Information collected from the sample is generalized to the population. - Inferential statistics uses methods.
  • 20.
    Take Note! If theentire population is studied, then inferential statistics is not necessary, because descriptive statistics will provide all the information that we need regarding the population.
  • 21.
    EXAMPLE 1. A badmintonplayer wants to know his average score for the past 10 games.
  • 22.
    EXAMPLE 2. A carmanufacturer wishes to estimate the average lifetime of batteries by testing a sample of 50 batteries.
  • 23.
    EXAMPLE 3. Janine wantsto determine the variability of her six exam scores in Algebra.
  • 24.
    EXAMPLE 4. A politicianwants to determine the total number of votes his rival obtained in the past election based on his copies of the tally sheet of electoral returns.
  • 25.
    EXAMPLE 5. A shippingcompany wishes to estimate the number of passengers traveling via their ships next year using their data on the number of passengers in the past three years.
  • 26.
  • 27.
    Qualitative and QuantitativeVariables - Characteristics of the individuals within the population. Variables
  • 28.
    Qualitative and QuantitativeVariables - is variable that yields categorical responses. It is a word or a code that represents a class or category. Qualitative Variable
  • 29.
    Qualitative and QuantitativeVariables - takes on numerical values representing an amount or quantity. Quantitative Variable
  • 30.
    EXAMPLE 1. Hair Color 2.Temperature 3. Stages of Breast Cancer 4. Number of Hamburger Sold
  • 31.
    EXAMPLE 5. Number ofChildren 6. Zip Code 7. Place of Birth 8. Degree of Pain
  • 32.
  • 33.
    Discrete and Continuous -is a quantitative variable that either a finite number of possible values or a countable number of possible values. Discrete Variable
  • 34.
    Discrete and Continuous -is a quantitative variable that has an infinite number of possible values that are not countable. Continuous Variable
  • 35.
    EXAMPLE 1. The numberof heads obtained after flipping a coin five times. 2. The number of cars that arrive at a McDonald’s drive-through between 12:00 P.M. and 1:00 P.M. 3. The distance of a 2005 Toyota Car can travel in city conditions with a full tank of gas.
  • 36.
    EXAMPLE 4. Number ofwords correctly spelled. 5. Time of a runner to finish one lap.
  • 37.
  • 38.
  • 39.
    Levels of Measurement Nominal -They are sometimes called categorical scales or categorical data. Such a scale classifies persons or objects into two or more categories.
  • 40.
  • 41.
    Levels of Measurement: Ordinal - This involves data that may be arranged in some order, but differences between data values either cannot be determined or are meaningless.
  • 42.
    Example (Ordinal): Food Preferences, Stage of Disease, Socioeconomic Class, Severity of Pain
  • 43.
    Levels of Measurement: Interval - This measurement level not only classifies and orders the measurements, but also specifies that the distances between each interval on the scale are equivalent along the scale from low interval to high interval.
  • 44.
    Example (Interval): • Temperature on a Fahrenheit/Celsius Thermometer • Trait Anxiety • IQ
  • 45.
    Levels of Measurement: Ratio - A ratio scale represents the highest, most precise level of measurement. It has the properties of the interval level of measurement, and the ratios of the values of the variable have meaning.
  • 46.
    Example (Ratio): • Height and Weight • Time • Time until death
  • 47.
    Levels of Measurement
    Scale      Counting   Ranking   Addition/Subtraction   Multiplication/Division
    Nominal       √
    Ordinal       √          √
    Interval      √          √               √
    Ratio         √          √               √                        √
  • 48.
    Example 1. Ranking of college athletic teams. 2. Employee number. 3. Number of vehicles registered. 4. Brands of soft drinks. 5. Number of cars passing along C5 on a given day.
  • 49.
  • 50.
    Identify each of the following data sets as either a Population or a Sample. 1. The grade point averages (GPAs) of all students at a college. 2. The GPAs of a randomly selected group of students on a college campus. 3. The ages of the nine Supreme Court Justices of the United States on January 1, 1842. 4. The gender of every second customer who enters a movie theater. 5. The lengths of Atlantic croakers caught on a fishing trip to the beach.
  • 51.
    Identify the following measures as either Quantitative or Qualitative. 1. The gender of the first 40 newborns in a hospital one year. 2. The natural hair color of 20 randomly selected fashion models. 3. The ages of 20 randomly selected fashion models. 4. The fuel economy in miles per gallon of 20 new cars purchased last month. 5. The political affiliation of 500 randomly selected voters.
  • 52.
    Data Collection and Basic Concepts in Sampling Design
  • 53.
    Objectives • Determine the sources of data (primary and secondary data). • Distinguish the different methods of data collection under primary and secondary data. • Determine the appropriate sample size. • Differentiate various sampling techniques. • Know the sources of errors in sampling.
  • 54.
    Data Collection - Data collection is the process of gathering and measuring information on variables of interest, in an established systematic fashion that enables one to answer stated research questions, test hypotheses, and evaluate outcomes.
  • 55.
  • 56.
    Data Collection • Inability to answer research questions accurately. • Inability to repeat and validate the study. • Distorted findings resulting in wasted resources. • Misleading other researchers to pursue fruitless avenues of investigation. • Compromising decisions for public policy. • Causing harm to human participants and animal subjects.
  • 57.
  • 58.
    Steps in Data Gathering 1. Set the objectives for collecting data. 2. Determine the data needed based on the set objectives. 3. Determine the method to be used in data gathering and define the comprehensive data collection points. 4. Design data gathering forms to be used. 5. Collect data.
  • 59.
    Choosing a Method of Data Collection
  • 60.
    Data Collection - Decision-makers need information that is relevant, timely, accurate and usable. The cost of obtaining, processing and analyzing these data is high.
  • 61.
  • 62.
    Primary Sources - Provide a first-hand account of an event or time period and are considered to be authoritative.
  • 63.
    Primary Data - Data documented by the primary source. The data collectors documented the data themselves.
  • 64.
    Secondary Sources - Offer an analysis, interpretation or restatement of primary sources and are considered to be persuasive.
  • 65.
    Secondary Data - Data documented by a secondary source. The data collectors had the data documented by other sources.
  • 66.
    The Primary Data Can Be Collected by 5 Methods
  • 67.
    Methods 1. Direct Personal Interviews - the researcher has direct contact with the interviewee. The researcher gathers information by asking the interviewee questions.
  • 68.
    Methods 2. Indirect/Questionnaire Method - this method of data collection involves sourcing and accessing existing data that were originally collected for the purpose of the study.
  • 69.
    Questions to be Considered - What exactly do we want to know, according to the objectives and variables we identified earlier?
  • 70.
    Questions to be Considered - Of whom will we ask questions and what techniques will we use?
  • 71.
    Questions to be Considered - Are our informants mainly literate or illiterate?
  • 72.
    Questions to be Considered - How large is the sample that will be interviewed?
  • 73.
    Key Design Principles of a Good Questionnaire
  • 74.
    Key Design Principles of a Good Questionnaire 1. Keep the questionnaire as short as possible. 2. Decide on the type of questionnaire (open-ended or closed-ended). 3. Write the questions properly. 4. Order the questions appropriately.
  • 75.
    Key Design Principles of a Good Questionnaire 5. Avoid questions that prompt or motivate the respondent to say what you would like to hear. 6. Write an introductory letter or an introduction. 7. Write special instructions for interviewers or respondents.
  • 76.
    Key Design Principles of a Good Questionnaire 8. Translate the questions if necessary. 9. Always test your questions before taking the survey.
  • 77.
  • 78.
    Open-ended Question - a type of question that does not include response categories. The respondent is not given any possible answers to choose from.
  • 79.
    Closed-ended Question - a type of question that includes a list of response categories from which the respondent will select his/her answer.
  • 80.
  • 81.
    Open-ended VS Closed-ended (Advantages) Open-ended: • More detailed answer. • Could reveal additional insights. Closed-ended: • Easy to encode, tabulate, and analyze. • Easy to understand. • Enables inter-study comparison. • Saves time and money. • High response rate.
  • 82.
  • 83.
    Open-ended VS Closed-ended (Disadvantages) Open-ended: • Difficult to encode, tabulate, and analyze. • Low response rate. • Respondent has to be articulate. • Respondent could feel threatened. • Responses could have different levels of detail. Closed-ended: • Could frustrate respondents. • Potentially biased response sets. • Difficult or impossible to detect if respondent truly understood the questions.
  • 84.
    Methods 3. Focus Group - a group interview of approximately six to twelve people who share similar characteristics or common interests.
  • 85.
    Methods 4. Experiment - a method of collecting data where there is direct human intervention on the conditions that may affect the values of the variable of interest.
  • 86.
    Experiment • Ethical, moral, and legal concerns. • Unrealistic controlled environments. • Inability to control for all variables.
  • 87.
    Methods 5. Observation - a technique that involves systematically selecting, watching and recording behaviors of people or other phenomena and aspects of the setting in which they occur, for the purpose of getting specified information.
  • 88.
    Observation • Radiographic • Biochemical • X-ray machines • Microscope • Clinical examinations • Microbiological examinations
  • 89.
    The Secondary Data Can Be Collected by 5 Methods
  • 90.
    Methods 1. Published reports in newspapers and periodicals. 2. Financial data reported in annual reports. 3. Records maintained by the institution. 4. Internal reports of the government departments. 5. Information from official publications.
  • 91.
    Take Note! • Always investigate the validity and reliability of the data by examining the collection method employed by your source. • Do not use inappropriate data for your research. • The choice of methods of data collection is largely based on the accuracy of the information they yield.
  • 92.
  • 93.
    Sample Size “How many participants should be chosen for a survey?”
  • 94.
    Sample Size - is typically denoted by n and is always a positive integer. - No single sample size fits every study; it varies across research settings.
  • 95.
    Take Note! • Representativeness, not size, is the more important consideration. • Use no fewer than 30 subjects if possible. • If you use complex statistics, you may need a minimum of 100 or more in your sample (varies with method).
  • 96.
  • 97.
    Non-Statistical Considerations - These may include availability of resources, manpower, budget, ethics and sampling frame.
  • 98.
    Statistical Considerations - These include the desired precision of the estimate.
  • 99.
  • 100.
    1. Level of Precision - Also called sampling error, the level of precision is the range in which the true value of the population is estimated to be.
  • 101.
    2. Confidence Level - A statistical measure of the number of times out of 100 that results can be expected to be within a specified range.
  • 102.
    2. Confidence Level
    Desired Confidence Level   Z-Score
    80%                        1.28
    85%                        1.44
    90%                        1.65
    95%                        1.96
    99%                        2.58
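The z-scores in this table can be reproduced from the standard normal distribution. The sketch below is only an illustration: the function name `z_score` is an assumption, and it uses Python's standard-library `statistics.NormalDist`. The computed values are unrounded, so 90% comes out near 1.645 and 99% near 2.576, which the table rounds to 1.65 and 2.58.

```python
from statistics import NormalDist


def z_score(confidence_level: float) -> float:
    """Two-sided critical value for a confidence level given as a fraction (e.g. 0.95)."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    # The central area equals the confidence level, so each tail holds (1 - level) / 2.
    upper_point = 1 - (1 - confidence_level) / 2
    return NormalDist().inv_cdf(upper_point)


for level in (0.80, 0.85, 0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_score(level):.3f}")
```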
  • 103.
    3. Degree of Variability - The degree of variability varies considerably depending upon the target population and the attributes under consideration.
  • 104.
  • 105.
    1. Estimating the Mean or Average - The sample size required to estimate the population mean µ at a given level of confidence with a specified margin of error e.
  • 106.
    Take Note! If σ is unknown, it is common practice to conduct a preliminary survey to determine s and use it as an estimate of σ, or to use results from previous studies to obtain an estimate of σ. When using this approach, the size of the sample should be at least 30.
  • 107.
    Example A soft drink machine is regulated so that the amount of drink dispensed is approximately normally distributed with a standard deviation equal to 0.5 ounce. Determine the sample size needed if we wish to be 95% confident that our sample mean will be within 0.03 ounce from the true mean.
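This example can be checked numerically. The sketch below assumes the usual formula for estimating a mean, n = (Z·σ/e)², with Z = 1.96 for 95% confidence; the function name `sample_size_for_mean` is hypothetical. The raw value is about 1067.1, and the sketch rounds up to the next whole unit, which is the common convention.

```python
import math


def sample_size_for_mean(z: float, sigma: float, e: float) -> int:
    """Sample size n = (z * sigma / e)^2, rounded up to a whole number."""
    if sigma <= 0 or e <= 0:
        raise ValueError("sigma and e must be positive")
    raw = (z * sigma / e) ** 2
    # round() removes floating-point noise before rounding up.
    return math.ceil(round(raw, 6))


# Values from the soft drink machine example: sigma = 0.5 oz, e = 0.03 oz, 95% confidence.
n = sample_size_for_mean(z=1.96, sigma=0.5, e=0.03)
print(n)
```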
  • 108.
    2. Estimating a Proportion (Infinite Population) - The sample size required to obtain a confidence interval for p with a specified margin of error e.
  • 109.
    Example Suppose we are doing a study on the inhabitants of a large town, and want to find out how many households serve breakfast in the mornings. We don’t have much information on the subject to begin with, so we’re going to assume that half of the families serve breakfast: this gives us maximum variability. So p = 0.5. We want 99% confidence and at least 1% precision.
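A quick way to verify this example is sketched below. It assumes the standard large-population formula n = Z²·p·(1 − p)/e², with Z = 2.58 taken from the z-table for 99% confidence; the function name `sample_size_for_proportion` is hypothetical.

```python
import math


def sample_size_for_proportion(z: float, p: float, e: float) -> int:
    """Sample size n = z^2 * p * (1 - p) / e^2, rounded up to a whole number."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if e <= 0:
        raise ValueError("e must be positive")
    raw = z ** 2 * p * (1 - p) / e ** 2
    # round() removes floating-point noise before rounding up.
    return math.ceil(round(raw, 6))


# Values from the breakfast example: p = 0.5, 99% confidence, 1% precision.
n = sample_size_for_proportion(z=2.58, p=0.5, e=0.01)
print(n)
```

Because p(1 − p) is largest at p = 0.5, that choice gives the most conservative (largest) sample size when nothing is known about the proportion.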
  • 110.
    3. Slovin’s Formula - Slovin’s formula is used to calculate the sample size n given the population size N and the margin of error e.
  • 111.
    Example A researcher plans to conduct a survey about the food preferences of BS Stat students. If the population of students is 1000, find the sample size if the error is 5%.
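The computation for this example can be sketched as follows, assuming the usual statement of Slovin's formula, n = N / (1 + N·e²). The function name `slovin_sample_size` is hypothetical.

```python
import math


def slovin_sample_size(population_size: int, e: float) -> int:
    """Slovin's formula n = N / (1 + N * e^2), rounded up to a whole number."""
    if population_size <= 0 or e <= 0:
        raise ValueError("population_size and e must be positive")
    raw = population_size / (1 + population_size * e ** 2)
    # round() removes floating-point noise before rounding up.
    return math.ceil(round(raw, 6))


# Values from the BS Stat example: N = 1000 students, e = 5%.
n = slovin_sample_size(1000, 0.05)
print(n)
```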
  • 112.
    4. Finite Population Correction - If the population is small, then the sample size can be reduced slightly.
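As a rough illustration of the correction, the sketch below assumes the commonly used form n = n0 / (1 + (n0 − 1)/N), where n0 is the sample size computed for an infinite population and N is the population size. The function name and the input values (n0 = 385, N = 1000) are hypothetical and are not taken from the slides.

```python
import math


def finite_population_correction(n0: int, population_size: int) -> int:
    """Adjusted sample size n = n0 / (1 + (n0 - 1) / N), rounded up."""
    if n0 <= 0 or population_size <= 0:
        raise ValueError("n0 and population_size must be positive")
    raw = n0 / (1 + (n0 - 1) / population_size)
    # round() removes floating-point noise before rounding up.
    return math.ceil(round(raw, 6))


# Hypothetical inputs: n0 = 385 (95% confidence, p = 0.5, e = 5%), N = 1000.
n = finite_population_correction(385, 1000)
print(n)
```

For a very large N the correction has almost no effect, so the adjusted size stays at n0.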
  • 113.
    Online Calculators of Sample Size https://select-statistics.co.uk/calculators/sample-size-calculator-population-proportion/ https://www.calculator.net/sample-size-calculator.html
  • 114.
  • 115.
    Reason for Sampling • It is important that the individuals included in a sample represent a cross section of the individuals in the population. • If a sample is not representative, it is biased, and you cannot generalize to the population from your statistical data.
  • 116.
    Observation Unit • An object on which a measurement is taken. This is the basic unit of observation, sometimes called an element.
  • 117.
    Target Population • The complete collection of observations we want to study.
  • 118.
    Sampled Population • The collection of all possible observation units that might have been chosen in a sample; the population from which the sample was taken.
  • 119.
    Sample • A subset of a population.
  • 120.
    Sampling Unit • A unit that can be selected for a sample. We may want to study individuals, but do not have a list of all individuals in the target population.
  • 121.
    Sampling Frame • A list, map, or other specification of sampling units in the population from which a sample may be selected.
  • 122.
    Sampling Bias • This involves problems in your sampling that make your sample unrepresentative of your population.
  • 123.
  • 124.
    Advantages of Sampling • Less Labor • Reduced Cost • Greater Speed • Greater Scope • Greater Efficiency and Accuracy • Convenience • Ethical Considerations
  • 125.
  • 126.
    1. Probability Sample • Samples are obtained using some objective chance mechanism, thus involving randomization. • They require the use of a complete listing of the elements of the universe, called the sampling frame.
  • 127.
    2. Non-Probability Sample • Samples are obtained haphazardly, selected purposively or are taken as volunteers. • The probabilities of selection are unknown. • They should not be used for statistical inference.
  • 128.
  • 129.
    Sampling Procedure • Identify the population. • Determine if the population is accessible. • Select a sampling method. • Choose a sample that is representative of the population. • Ask the question: can I generalize to the general population from the accessible population?
  • 130.
  • 131.
    1. Simple Random Sampling • Most basic method of drawing a probability sample. • Assigns equal probabilities of selection to each possible sample. • Results in a simple random sample.
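A minimal sketch of drawing a simple random sample without replacement is shown below. The sampling frame (serial numbers 1 to 500), the sample size and the seed are hypothetical; `random.sample` from the standard library gives every possible subset of the requested size the same chance of selection.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the sampling frame, each subset equally likely."""
    units = list(frame)
    if not 0 < n <= len(units):
        raise ValueError("n must be between 1 and the size of the frame")
    rng = random.Random(seed)  # a seed only makes the draw reproducible
    return rng.sample(units, n)


# Hypothetical frame: serial numbers 1..500 assigned to the population elements.
frame = range(1, 501)
sample = simple_random_sample(frame, 50, seed=42)
print(sorted(sample)[:10])
```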
  • 132.
    Simple Random Sampling Advantages and Disadvantages • It is very simple and easy to use. • The sample chosen may be distributed over a wide geographic area.
  • 133.
    When to Use Simple Random Sampling • This is preferable if the population is not widely spread geographically.
  • 134.
    2. Systematic Random Sampling • It is obtained by selecting every kth individual from the population. • The first individual selected corresponds to a random number between 1 and k.
  • 135.
  • 136.
    Obtaining a Systematic Random Sample • Decide on a method of assigning a unique serial number, from 1 to N, to each one of the elements in the population.
  • 137.
    Obtaining a Systematic Random Sample • Compute the sampling interval: k = N/n, where N is the population size and n is the sample size.
  • 138.
    Obtaining a Systematic Random Sample • Select a number r, from 1 to k, using a randomization mechanism. The element in the population assigned to this number is the first element of the sample. The other elements of the sample are those assigned to the numbers r + k, r + 2k, r + 3k, and so on until you get a sample of size n.
  • 139.
    Example We want to select a sample of 50 students from 500 students. Under this method, every kth item is picked from the sampling frame, with k = 500/50 = 10.
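The procedure can be sketched in a few lines. The function name `systematic_sample` and the fixed random start of 6 are assumptions made for illustration; in practice the start would be drawn at random from 1 to k.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the serial numbers selected by systematic sampling with interval k = N // n."""
    k = population_size // sample_size  # sampling interval
    if k < 1:
        raise ValueError("sample_size cannot exceed population_size")
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start between 1 and k
    if not 1 <= start <= k:
        raise ValueError("start must be between 1 and k")
    return [start + i * k for i in range(sample_size)]


# Example from the slide: N = 500 students, n = 50, so k = 10.
# A start of 6 is assumed here to make the pattern easy to follow.
sample = systematic_sample(500, 50, start=6)
print(sample[:5], "...", sample[-1])
```

With a start of 6 and k = 10, the selected serial numbers run 6, 16, 26, and so on, up to 496.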
  • 140.
    Systematic Random Sampling Advantages and Disadvantages • Drawing of the sample is easy. It is easy to administer in the field, and the sample is spread evenly over the population. • May give poor precision when unsuspected periodicity is present in the population.
  • 141.
    When to Use Systematic Random Sampling • This is advisable if the ordering of the population is essentially random and when stratification with numerous strata is used.
  • 142.
    3. Stratified Random Sampling • It is obtained by separating the population into non-overlapping groups called strata and then obtaining a simple random sample from each stratum.
  • 143.
    Example A sample of 50 students is to be drawn from a population consisting of 500 students belonging to two institutions A and B. The number of students in institution A is 200 and in institution B is 300. How will you draw the sample using proportional allocation?
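Proportional allocation for this example can be sketched as below. Each stratum receives n × N_h / N units, and a simple random sample is then drawn inside each stratum. The function name, the student ID labels and the seed are hypothetical; with less convenient numbers the rounded allocations may not add up exactly to n and would need a manual adjustment.

```python
import random


def proportional_allocation(strata_sizes, n):
    """Allocate a total sample of n to strata in proportion to their sizes."""
    total = sum(strata_sizes.values())
    # round() can make the allocations sum to slightly more or less than n.
    return {name: round(n * size / total) for name, size in strata_sizes.items()}


# Values from the example: institution A has 200 students, B has 300, n = 50.
strata_sizes = {"A": 200, "B": 300}
allocation = proportional_allocation(strata_sizes, 50)

# Hypothetical frames: student IDs within each institution.
frames = {name: [f"{name}{i}" for i in range(1, size + 1)]
          for name, size in strata_sizes.items()}
rng = random.Random(0)
sample = {name: rng.sample(frames[name], k) for name, k in allocation.items()}
print(allocation)
```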
  • 144.
    Stratified Random Sampling Advantages and Disadvantages • Stratification of respondents is advantageous in terms of precision of the estimates of the characteristics of the population. • Values of the stratification variable may not be easily available for all units in the population, especially if the characteristic of interest is homogeneous.
  • 145.
    When to Use Stratified Random Sampling • If the population is such that the distribution of the characteristics of the respondents under consideration is concentrated in small and spread-out segments of the population.
  • 146.
    4. Cluster Sampling • You take the sample from naturally occurring groups in your population. • The clusters are constructed such that the sampling units are heterogeneous within the cluster and homogeneous among the clusters.
  • 147.
  • 148.
    Obtaining a Cluster Sample • Divide the population into non-overlapping clusters. • Number the clusters in the population from 1 to N.
  • 149.
    Obtaining a Cluster Sample • Select n distinct numbers from 1 to N using a randomization mechanism. The selected clusters are the clusters associated with the selected numbers. • The sample will consist of all elements in the selected clusters.
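The steps above can be sketched as follows. The clusters (six schools with a handful of students each), the function name `cluster_sample` and the seed are all hypothetical; the point is only that clusters are selected at random and every element of a selected cluster enters the sample.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick n_clusters clusters and return them with all of their elements."""
    names = sorted(clusters)  # plays the role of numbering the clusters 1..N
    if not 0 < n_clusters <= len(names):
        raise ValueError("n_clusters must be between 1 and the number of clusters")
    rng = random.Random(seed)
    chosen = rng.sample(names, n_clusters)
    # One-stage cluster sampling: take every element of each chosen cluster.
    elements = [unit for name in chosen for unit in clusters[name]]
    return chosen, elements


# Hypothetical population: 6 schools (clusters) with 6 to 11 students each.
clusters = {f"School {i}": [f"S{i}-{j}" for j in range(1, 6 + i)] for i in range(1, 7)}
chosen, elements = cluster_sample(clusters, 2, seed=7)
print(chosen, len(elements))
```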
  • 150.
    Example A researcher wants to survey the academic performance of high school students in MIMAROPA.
  • 151.
    Cluster Sampling Advantages and Disadvantages • There is no need to come up with a list of units in the population; all that is needed is a list of the clusters. • In actual field applications, adjacent households tend to have more similar characteristics than households far apart.
  • 152.
    When to Use Cluster Sampling • If the population can be grouped into clusters where individual population elements are known to be different with respect to the characteristics under study, this is preferable.
  • 153.
    5. Multi-Stage Sampling • Selection of the sample is done in two or more steps or stages, with sampling units varying in each stage. • The population is first divided into a number of first-stage sampling units from which a sample is drawn.
  • 154.
  • 155.
    Obtaining a Multi-Stage Sample • Organize the sampling process into stages where the unit of analysis is systematically grouped. • Select a sampling technique for each stage. • Systematically apply the sampling technique to each stage until the unit of analysis has been selected.
  • 156.
    Example Suppose we wish to study the expenditure patterns of households in NCR. We can select a sample of households for this study using simple three-stage sampling.
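One way such a three-stage design might look is sketched below. The slide does not name the stages, so the hierarchy used here (cities, then barangays, then households), the frame, the stage sample sizes and the seed are all assumptions for illustration. Simple random sampling is applied at every stage.

```python
import random


def three_stage_sample(frame, n_first, n_second, n_third, seed=None):
    """Sample first-stage units, then second-stage units within them, then final units."""
    rng = random.Random(seed)
    selected = []
    for city in rng.sample(sorted(frame), n_first):                  # stage 1
        barangays = frame[city]
        for barangay in rng.sample(sorted(barangays), n_second):     # stage 2
            for household in rng.sample(barangays[barangay], n_third):  # stage 3
                selected.append((city, barangay, household))
    return selected


# Hypothetical frame: 4 cities, 5 barangays per city, 20 households per barangay.
frame = {
    f"City {c}": {
        f"Barangay {c}-{b}": [f"HH {c}-{b}-{h}" for h in range(1, 21)]
        for b in range(1, 6)
    }
    for c in range(1, 5)
}
sample = three_stage_sample(frame, n_first=2, n_second=2, n_third=5, seed=1)
print(len(sample))
```

Only the selected cities need a list of barangays, and only the selected barangays need a list of households, which is why this design is practical when no complete population list exists.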
  • 157.
    Multi-Stage Sampling Advantages and Disadvantages • It is easier to generate adequate sampling frames. Transportation costs are greatly reduced since there is some form of clustering among ultimate or final samples. • Its complexity in theory may make it difficult to apply in the field. Estimation procedures may be difficult for non-statisticians to follow.
  • 158.
    When to Use Multi-Stage Sampling • If no population list is available and if the population covers a wide area.
  • 159.
    Take Note! • Use probability sampling if the main objective of the sample survey is making inferences about the characteristics of the population under study.
  • 160.
    Basic Sampling Techniques of Non-Probability Sampling
  • 161.
    Accidental Sampling • There is no system of selection; the sample consists only of those whom the researcher or interviewer meets by chance.
  • 162.
    Quota Sampling • A specified number of persons of certain types is included in the sample.
  • 163.
    Convenience Sampling • It is the process of picking out people in the most convenient and fastest way to get reactions immediately.
  • 164.
    Purposive Sampling • It is based on certain criteria laid down by the researcher.
  • 165.
    Judgement Sampling • Selects the sample in accordance with an expert’s judgement.
  • 166.
  • 167.
    Cases wherein Non-Probability Sampling is Useful • Only a few are willing to be interviewed. • Extreme difficulties in locating or identifying subjects. • Probability sampling is more expensive to implement. • Cannot enumerate the population elements.
  • 168.
    Sources of Errors in Sampling
  • 169.
    1. Non-sampling Error • Errors that result from the survey process. • Any errors that cannot be attributed to sample-to-sample variability.
  • 170.
    Sources of Non-sampling Error • Non-response • Interview Error • Misrepresented Answers • Data entry errors • Questionnaire Design • Wording of Questions • Selection Bias
  • 171.
    2. Sampling Error • Error that results from taking one sample instead of examining the whole population. • Error that results from using sampling to estimate information regarding a population.

Editor's Notes

  • #2 After successful completion of this module, you should be able to:
  • #3 When you hear the word statistics what comes into your mind? Statistics plays a major role in many aspects of our lives. It is used in sports, for example, to help a general manager decide which player might be the best fit for team. Many people say that statistics is numbers. After all, we are bombarded by numbers that supposedly represent how we feel and who we are. Certainly statistics has a lot to do with numbers, but this definition is only partially correct.
  • #4 In addition, statistics is about providing a measure of confidence in any conclusions. Let’s break this definition into four parts.
  • #5 Statistics is the science of conducting studies to collect, organize, summarize, analyze and draw conclusion from data
  • #6 3 importance
  • #7 All this importance referred to the definition DATA
  • #8 According to the Merriam-Webster dictionary ____. Data can be numerical, as in height, or non-numerical, as in gender. In either case, data describe characteristics of an individual.
  • #9 Now let’s proceed to the fields of statistics. It divides into two groups.
  • #10 Some example branches of applied statistics are psychometrics, econometrics and biostatistics.
  • #11 We also have 5 limitations of statistics.
  • #13 4 process
  • #14 And with that you will now identify your research objectives.
  • #15 This step is vital to the statistical process, because if the data are not collected correctly, the conclusions drawn are meaningless. Do not overlook the importance of appropriate data collection.
  • #16 So, based on the research objective presented, what are the population and the sample in the study? A population consists of all subjects (human or otherwise) that are being studied, while a sample is a group of subjects selected from a population. Let’s have another example.
  • #17 So, based on the research objective presented, what are the population and the sample in the study? Let’s have another example.
  • #18 Descriptive statistics consists of organizing and summarizing data. It describes data through numerical summaries, tables and graphs.
  • #19 In this step the (read ppt). And here (read ppt), inferential statistics takes results obtained from a sample, extends them to the population, and measures the reliability of the results, unlike the 3rd step, where we used descriptive statistics. It consists of generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among variables, and making predictions.
  • #20 And now lets have an example. For the following statements, decide whether it belongs to the field of Descriptive Statistics or Inferential Statistics.
  • #21 DS This statement involves DS because it focuses on summarizing and presenting data in a meaningful way without making predictions or generalizations.
  • #22 IS This statement is considered IS because it involves using data from a sample (50 batteries) to make a generalization or estimate about population.
  • #23 DS Because Janine is analyzing her existing data (six exam scores)
  • #24 DS Because the politician is focusing on summarizing and presenting data from the existing records to determine the total number of votes his rival received.
  • #25 IS Because the shipping company is using data from the past three years to estimate the number of passengers from the entire next year.
  • #27 For example, recently my mother and I planted a tomato plant in our backyard. We collected information about the tomatoes harvested from the plant. The individuals we studied were the tomatoes. The variable that interested us was the weight of a tomato. My mom noted that they came from the same plant. She discovered that variables such as weight may vary.
  • #28 Qualitative Variable or also known as your Categorical
  • #29 Quantitative Variable or also known as your Numeric.
  • #30 Quali Quanti Quali Quanti Qualitative - is variable that yields categorical responses. It is a word or a code that represents a class or category. Quantitative - takes on numerical values representing an amount or quantity.
  • #31 5. Quanti 6. Quali 7. Quali 8. Quali
  • #33 If you count to get the value of a quantitative variable, it is a discrete variable.
  • #34 If you measure to get the value of a quantitative variable, it is a continuous variable.
  • #35 1. D 2. D 3. C, because distance cannot be counted, only measured. If you count to get the value of a quantitative variable, it is a discrete variable. If you measure to get the value of a quantitative variable, it is a continuous variable.
  • #36 4. D 5. C
  • #37 Quantitative variables may be further classified into (next page)
  • #38 It is important to know which type of scale is represented by your data, since different statistics are appropriate for different scales of measurement. A characteristic may be measured using nominal, ordinal, interval, and ratio scales.
  • #39 Whatever the basis for classification, a person can only be in one category, and members of a given category have a common set of characteristics. Let’s have an example.
  • #40 Cash, Check, Debit Card, Credit Card; Public vs. Private; Blue, Green, Brown
  • #41 An ordinal scale not only classifies subjects but also ranks them in terms of the degree to which they possess a characteristic of interest. In other words, an ordinal scale puts the subjects in order from highest to lowest, from most to least.
  • #42 First, Middle, Lower
  • #43 A value of zero does not mean the absence of the quantity. Arithmetic operations such as addition and subtraction can be performed on values of the variable.
  • #44 High Anxious vs. Low Anxious High IQ vs Average IQ vs Low IQ
  • #45 A value of zero means the absence of the quantity. Arithmetic operations such as multiplication and division can be performed on values of the variable.
  • #48 1. Ordinal 2. Nominal 3. Ratio 4. Nominal 5. Ratio. Zip code – Nominal. Degree of pain – Ordinal.
  • #53 After successful completion of this module, students should be able to……….
  • #54 Without proper planning for data collection, a number of problems can occur. If the data collection steps and processes are not properly planned, the research project can ultimately end up with a data set that does not serve the purpose for which it was intended. For example, if more than one person is involved in the data collection, but data collectors do not follow consistent data collection practices, they can end up with data with different units, collection processes, and variable names.
  • #60 The challenge is to find ways, which lead to information that is cost-effective, relevant, timely and important for immediate use. Some methods pay attention to timeliness and reduction in cost.
  • #62 They represent original thinking, reports on discoveries or events, or they can share new information. Often these sources are created at the time the events occurred but they can also include sources that are created later. They are usually the first formal appearance of original research.
  • #63 The first hand information obtained by the investigator is more reliable and accurate since the investigator can extract the correct information by removing doubts, if any, in the minds of the respondents regarding certain questions. High response rates might be obtained since the answers to various questions are obtained on the spot. It permits explanation of questions concerning difficult subject matter.
  • #64 They often involve generalization, synthesis, interpretation, commentary or evaluation in an attempt to convince the reader of the creator’s argument. They often attempt to describe or explain primary sources.
  • #65 In secondary data, data are primary data for the agency that collected them, and become secondary for someone else who uses these data for his own purposes. Secondary data are less expensive to collect both in money and time. These data can also be better utilized and sometimes the quality of such data may be better because these might have been collected by persons who were specially trained for that purpose.
  • #66 The primary data can be collected by the following 5 methods:
  • #68 Designing good questioning tools forms an important and time-consuming phase in the development of most research proposals. Once the decision has been made to use these techniques, the following questions should be considered before designing your tools. (next slide)
  • #69 Is questioning the right technique to obtain all answers, or do we need additional techniques, such as observations or analysis of records?
  • #70 Do we understand the topic sufficiently to design a questionnaire, or do we need some loosely structured interviews with key informants or focus group discussion first to orient ourselves?
  • #71 If illiterate, the use of self administered questionnaires is not an option.
  • #72 Studies with many respondents often use shorter, highly structured questionnaires, whereas smaller studies allow more flexibility and may use questionnaires with a number of open-ended questions.
  • #73 9 key design
  • #76 (Pre-test)
  • #78 This type of question is usually appropriate for collecting subjective data. It permits free responses that should be recorded in the respondent’s own words.
  • #79 It is useful if the range of possible responses is known. This type of question is usually appropriate for collecting objective data. It is answered with Yes or No, or True or False.
  • #84 Here, a facilitator guides the group based on a predetermined set of topics.
  • #85 Bear in mind that the experimental method has several limitations that you should be aware of (next ppt)
  • #86 Bear in mind that the experimental method has several limitations that you should be aware of (next ppt)
  • #87 It includes all methods from simple visual observations to the use of high level machines and measurements, sophisticated equipment or facilities such as: (next slide)
  • #88 Those are our 5 methods for primary data: direct personal interviews, indirect/questionnaire method, focus group, experiment, and observation.
  • #89 The secondary data can be collected by the following 5 methods:
  • #93 One of the most frequent problems in statistical analysis is the determination of the appropriate sample size. One may ask why sample size is so important. The answer is that an appropriate sample size is required for validity. If the sample size is too small, it will not yield valid results, and its results will be questionable. An appropriate sample size can produce accurate results. A sample size that is too large will waste money and time, because an adequate sample will normally give an accurate result.
  • #94 However, all else being equal, large sized sample leads to increased precision in estimates of various properties of the population.
  • #96 Choosing of sample size depends on non-statistical and statistical considerations
  • #99 Three criteria need to be specified to determine the appropriate sample size.
  • #101 For example, a confidence level of 90% means the results of an action will probably meet expectations 90% of the time.
  • #102 To find the right z-score to use, refer to this table
  • #103 The more heterogeneous a population is, the larger the sample size is required to get an optimum level of precision.
  • #105  Where: Z – is the z-score corresponding to level of confidence. e – is the level of precision.
  • #106 The formula for the sample standard deviation is
  • #107 Solution: The z-score for a 95% confidence level in the z-table is 1.96. Therefore, we need a sample of 1,067 for our study.
  • #108  Where: Z is the z-score corresponding to the level of confidence; e is the level of precision; p is the population proportion. There is a dilemma in this formula: it depends on p, which we know only after we have taken the sample. There are two ways to solve this dilemma.
  • #109 Solution: The z-score for a 99% confidence level in the z-table is 2.58. Therefore, we need a sample of 16,641 for our study.
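The formula itself did not survive in these notes, so the following is only a sketch. It assumes the formula is Cochran's n0 = Z² · p(1 − p) / e², which uses the three symbols defined above. The values p = 0.5, e = 0.03, and e = 0.01 are assumptions, not values stated in the notes; they were picked because they reproduce the stated answers of 1,067 and 16,641. The function name is hypothetical.

```python
def cochran_sample_size(z: float, e: float, p: float = 0.5) -> float:
    """Unrounded sample size n0 = z^2 * p * (1 - p) / e^2 for a proportion."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if e <= 0:
        raise ValueError("e must be positive")
    return z ** 2 * p * (1 - p) / e ** 2


# ASSUMED inputs (not given in the notes): p = 0.5, e = 3% and e = 1%.
n_95 = cochran_sample_size(z=1.96, e=0.03)
n_99 = cochran_sample_size(z=2.58, e=0.01)
print(round(n_95), round(n_99))
```

Using p = 0.5 is a common way around the dilemma of not knowing p in advance, because p(1 − p) is largest at 0.5 and therefore gives the most conservative (largest) sample size. The figures here are rounded to the nearest whole number to match the notes; rounding up is the more cautious convention.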
  • #110  Where: N is the total population; e is the level of precision.
  • #111 Solution: Therefore, the researcher needs to survey 286 BS students.
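A formula that uses only N and e is the one commonly called Slovin's (or Yamane's) formula, n = N / (1 + N·e²). The sketch below assumes that this is the formula on the slide. The inputs N = 1000 and e = 0.05 are also assumptions, not values stated in the notes; they were chosen because they give the stated answer of 286.

```python
import math


def slovin_sample_size(population_size: int, e: float) -> int:
    """Sample size n = N / (1 + N * e^2), rounded up to a whole respondent."""
    if population_size <= 0 or e <= 0:
        raise ValueError("population_size and e must be positive")
    n = population_size / (1 + population_size * e ** 2)
    return math.ceil(n)


# ASSUMED inputs (not given in the notes): 1000 students, 5% margin of error.
print(slovin_sample_size(population_size=1000, e=0.05))
```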
  • #112  Where: N is the population size; n₀ is Cochran’s sample size recommendation.
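The formula for this slide is also missing from the notes. A formula that combines the population size N with Cochran's recommendation is the usual finite-population adjustment, n = n0 / (1 + (n0 − 1) / N), and the sketch below assumes that is what the slide shows. The example inputs (n0 = 1067, N = 5000) and the function name are hypothetical.

```python
def adjust_for_finite_population(n0: float, population_size: int) -> float:
    """Shrink Cochran's n0 for a finite population: n0 / (1 + (n0 - 1) / N)."""
    if n0 <= 0 or population_size <= 0:
        raise ValueError("n0 and population_size must be positive")
    return n0 / (1 + (n0 - 1) / population_size)


# HYPOTHETICAL inputs: n0 = 1067 applied to a population of 5000.
adjusted = adjust_for_finite_population(n0=1067, population_size=5000)
print(f"{adjusted:.1f}")
```

The adjustment matters most when n0 is a large fraction of N; for very large populations the result is almost the same as n0.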
  • #114 The goal in sampling is to obtain individuals for a study in such a way that accurate information about the population can be obtained.
  • #115 Some definitions are needed to make the notion of a good sample more precise.
  • #116 In studying human populations, observation units are often individuals.
  • #121 For a survey using in-person interviews, the sampling frame might be a list of all street addresses.
  • #124 Advantages of Sampling Over Complete Enumeration
  • #126 The probabilities of selection are known. Such samples are generally referred to as RANDOM SAMPLES. They allow valid generalizations to be drawn about the universe/population.
  • #129 Sampling techniques can be grouped by how items are selected: probability sampling and non-probability sampling.
  • #133 Also, this is more appropriate to use if the population is more or less homogeneous with respect to the characteristics of interest.
  • #135 3 steps
  • #139  We start from the random start i and then take every kth unit after it. Suppose the random start i is 5 and the interval k is 10; then we select units 5, 15, 25, 35, 45, and so on.
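The selection rule described above can be sketched as follows. This is a minimal illustration, assuming the interval is k = N // n and the random start is drawn from 1 to k; the function name and the numbered list of units are hypothetical.

```python
import random


def systematic_sample(units, sample_size, rng=None):
    """Pick a random start i in 1..k, then every k-th unit (1-based positions)."""
    rng = rng or random.Random()
    k = len(units) // sample_size  # sampling interval
    if sample_size < 1 or k < 1:
        raise ValueError("sample_size must be between 1 and len(units)")
    start = rng.randint(1, k)  # random start i, inclusive on both ends
    positions = [start + j * k for j in range(sample_size)]
    return [units[pos - 1] for pos in positions]


# HYPOTHETICAL frame: units numbered 1..100, sample of 10, so k = 10.
frame = list(range(1, 101))
print(systematic_sample(frame, sample_size=10, rng=random.Random(0)))
```

Because only the start is random, the whole sample is fixed once i is drawn, which is why the ordering of the list matters for this design.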
  • #142 The individuals within each stratum should be homogeneous (or similar) in some way.
  • #143 There are two strata in this case. Given: N1 = 200, N2 = 300, N = 500, n = 50. Proportional allocation gives n1 = (N1/N)(n) = (200/500)(50) = 20, and the same computation with N2 gives n2 = (300/500)(50) = 30. Therefore the sample sizes are 20 from A and 30 from B. Then the units from each institution are to be selected by simple random sampling.
  • #144 Sampling designs may vary by stratum to adjust for differences in conditions across strata. It is easy to use as a random sampling design. However, one or two strata may not be well represented. Also, transportation costs can be high if the population covers a wide geographic area.
  • #145 Thus, this design is preferred if precise estimates are desired for stratified parts of the population and if sampling problems differ in the various strata of the population.
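The allocation for the two strata can be checked with a short sketch. It assumes proportional allocation, n_h = (N_h / N) · n, which matches the 20 and 30 given above; the function name is hypothetical.

```python
def proportional_allocation(stratum_sizes: dict, total_sample: int) -> dict:
    """Allocate total_sample across strata in proportion to stratum size."""
    population = sum(stratum_sizes.values())
    return {
        name: round(size / population * total_sample)
        for name, size in stratum_sizes.items()
    }


# Values from the example: N1 = 200 (A), N2 = 300 (B), n = 50.
print(proportional_allocation({"A": 200, "B": 300}, total_sample=50))
```

With less convenient numbers the rounded shares may not add up exactly to n, so the total should be checked and adjusted by hand if needed.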
  • #147 4 steps
  • #150 He/she can divide the entire population into different clusters. The researcher then selects a number of clusters, depending on his/her research, through simple or systematic random sampling. From the selected clusters, the researcher can either include all the high school students as subjects or select a number of subjects from each cluster through simple or systematic random sampling.
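The two options described above (take everyone in the chosen clusters, or subsample within them) can be sketched as follows. This is a minimal illustration using simple random sampling at both steps; the function name, the school names, and the student labels are all hypothetical.

```python
import random


def cluster_sample(clusters: dict, n_clusters: int, per_cluster=None, rng=None):
    """Randomly pick clusters, then take all members or a random subsample."""
    rng = rng or random.Random()
    chosen = rng.sample(sorted(clusters), n_clusters)  # step 1: pick clusters
    subjects = []
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            subjects.extend(members)  # option A: include every member
        else:
            take = min(per_cluster, len(members))
            subjects.extend(rng.sample(members, take))  # option B: subsample
    return chosen, subjects


# HYPOTHETICAL population: 6 schools (clusters) with 20 students each.
schools = {f"school_{i}": [f"s{i}_{j}" for j in range(20)] for i in range(6)}
chosen, subjects = cluster_sample(schools, n_clusters=2, per_cluster=5,
                                  rng=random.Random(1))
print(chosen, len(subjects))
```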
  • #151 It is also less costly since the elements are physically closer together.
  • #154 3 steps
  • #156 First, the population is divided into smaller areas such as cities/municipalities, and a random sample of these cities/municipalities is selected. Second, a random sample of smaller areas such as barangays is taken from within each of the cities/municipalities chosen in the first stage. Third, a random sample of even smaller areas such as households is taken from within each of the areas chosen in the second stage.
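The three stages can be sketched as nested random draws. This is a minimal illustration that assumes simple random sampling at every stage and a sampling frame stored as a nested dictionary (city, then barangay, then a list of households); the function name and all place and household labels are hypothetical.

```python
import random


def multistage_sample(cities, n_cities, n_barangays, n_households, rng=None):
    """Three-stage sample: cities, then barangays, then households."""
    rng = rng or random.Random()
    selected = []
    for city in rng.sample(sorted(cities), n_cities):  # stage 1
        barangays = cities[city]
        for barangay in rng.sample(sorted(barangays), n_barangays):  # stage 2
            households = barangays[barangay]
            for household in rng.sample(households, n_households):  # stage 3
                selected.append((city, barangay, household))
    return selected


# HYPOTHETICAL frame: 4 cities x 5 barangays x 10 households.
frame = {
    f"city_{c}": {
        f"brgy_{c}_{b}": [f"hh_{c}_{b}_{h}" for h in range(10)]
        for b in range(5)
    }
    for c in range(4)
}
sample = multistage_sample(frame, 2, 2, 3, rng=random.Random(42))
print(len(sample))
```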
  • #160 3 steps
  • #162 The researcher is aware of categories within the population and draws samples from each category. The size of each categorical sample is proportional to the share of the population that belongs to that category.
  • #163 This method can be done by telephone interview to get the immediate reactions of a certain group on a certain issue.
  • #164 People who satisfy the criteria are interviewed. It is used to determine the target population of those who will be taken for the study.
  • #166 3 steps
  • #168 3 steps