Basic Concepts of Quantitative Research

Dr. R. Ouyang




Selection of a sample

            Population and sample: So far we have covered the introduction, the literature review, and the methods of a research study.  Next we discuss the subjects.  Subjects are the population or sample that the researcher uses for the study.  The population is the group of interest to the researcher, the group to which she or he would like the results of the study to be generalizable.  The population that the researcher would ideally like to generalize to is referred to as the target population; the population that the researcher can realistically select from is referred to as the accessible or available population.

            If the group of interest is unmanageably large or geographically scattered, studying the whole group could require a considerable expenditure of time, money, and effort.  Selecting a sample is therefore often necessary, and it is a very important step in conducting a research study.  A “good” sample is one that is representative of the population from which it was selected.

            Regardless of the specific technique used, the steps in sampling are essentially the same: identification of the population, determination of the required sample size, and selection of the sample.


The four basic sampling techniques are random sampling, stratified sampling, cluster sampling, and systematic sampling.

Random sampling is the process of selecting a sample in such a way that all individuals in the defined population have an equal and independent chance of being selected for the sample.  It is considered the best single way to obtain a representative sample, which inferential statistics require.


Steps of random sampling:

1. Identify and define the population;

2. Determine the desired sample size;

3. List all members of the population;

4. Assign each individual on the list a consecutive number, starting from zero and using the same number of digits for everyone, for example, 000 to 249 or 00 to 99;

5. Select an arbitrary number in the table of random numbers (close eyes and point);

6. For the selected number, look at only the appropriate number of digits (2 or 3 digits);

7. Compare the selected number with the numbers assigned to individuals in the population; if it corresponds to the number assigned to an individual, that individual is in the sample;

8. Go to the next number in the column and repeat step 7;

9. Repeat step 8 until the desired number of individuals has been selected for the sample.


Example: Label 5000 teachers with the numbers 0000 to 4999.  Point at random to a number in the table of random numbers and read its last four digits.  If those digits match a teacher’s label, that teacher is selected for the sample.  Repeat the process until 500 teachers, the required sample size, have been selected.
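
The procedure above can be sketched in Python.  This is only an illustration: a pseudorandom generator stands in for the printed table of random numbers, and the function name and the seed are assumptions, not part of the original notes.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Pick labels 0..population_size-1 the way a random number table is read."""
    if sample_size > population_size:
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)                 # stands in for the printed table
    digits = len(str(population_size - 1))    # e.g. 4 digits for labels 0000-4999
    selected = []
    seen = set()
    while len(selected) < sample_size:
        number = rng.randrange(10 ** digits)  # read the next entry, using only `digits` digits
        if number >= population_size or number in seen:
            continue                          # no such label, or already in the sample
        seen.add(number)
        selected.append(number)
    return selected

# 500 teachers out of 5000, labelled 0000 to 4999 as in the example.
teachers = simple_random_sample(5000, 500, seed=1)
print(len(teachers), [f"{t:04d}" for t in teachers[:5]])
```

In practice `random.sample(range(5000), 500)` gives an equivalent sample in one call; the loop is written out only to mirror steps 5 to 9.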


Stratified sampling is the process of selecting a sample in such a way that identified subgroups in the population are represented in the sample in the same proportion in which they exist in the population.  In this process, random sampling is done more than once; it is done from each subgroup.


Steps of stratified sampling:

1. Identify and define the population;

2. Determine desired sample size;

3. Identify the variable and the subgroups for which the researcher wants to guarantee appropriate representation;

4. Classify all members of the population as members of one of the identified subgroups;

5. Randomly select an “appropriate” number of individuals from each of the subgroups, “appropriate” meaning either a proportional number of individuals or an equal number of individuals.


Example: Suppose one is interested in comparing the performance of students of different IQ levels following two different methods of mathematics instruction.  One can classify 300 eighth graders into high (> 115), average (85-115), and low (< 85) IQ groups, randomly select 30 students from each group, and then randomly assign 15 of each 30 students to the experimental group and 15 to the control group.
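
A minimal sketch of this example in Python, using equal allocation (30 per subgroup).  The student records and IQ scores are made up for illustration, and all function and group names are hypothetical.

```python
import random

def iq_stratum(iq):
    """Classify an IQ score using the cut-offs from the example."""
    if iq > 115:
        return "high"
    if iq < 85:
        return "low"
    return "average"

def stratified_sample(students, n_per_stratum, rng):
    """students: list of (student_id, iq). Returns {stratum: sampled students}."""
    strata = {}
    for student in students:
        strata.setdefault(iq_stratum(student[1]), []).append(student)
    # Random sampling is done once per subgroup.
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def assign_to_groups(sampled, rng):
    """Randomly split each stratum's sample in half: experimental vs. control."""
    assignment = {}
    for name, members in sampled.items():
        shuffled = members[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        assignment[name] = {"experimental": shuffled[:half],
                            "control": shuffled[half:]}
    return assignment

rng = random.Random(42)
# Hypothetical data: 300 eighth graders with invented IQ scores from 70 to 130.
students = [(i, 70 + i % 61) for i in range(300)]
sampled = stratified_sample(students, 30, rng)
groups = assign_to_groups(sampled, rng)
for name in sorted(groups):
    print(name, len(groups[name]["experimental"]), len(groups[name]["control"]))
```

For proportional rather than equal allocation, `n_per_stratum` would be replaced by a per-stratum count computed from each subgroup's share of the population.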


Cluster sampling is sampling in which groups, not individuals, are randomly selected.  All the members of selected groups have similar characteristics.


Steps of cluster sampling:

1. Identify and define the population.

2. Determine the desired sample size.

3. Identify and define a logical cluster.

4. List all clusters that comprise the population.

5. Estimate the average number of population members per cluster.

6. Determine the number of clusters needed by dividing the sample size by the estimated size of a cluster.

7. Randomly select the needed number of clusters (using a table of random numbers).

8. Include in the study all population members in each selected cluster.


Example: One needs 500 teachers from a population of 5000 for a study.  There are 100 schools with 50 teachers in each.  In cluster sampling, one randomly selects 10 schools and includes all of their teachers, which gives 500 teachers in the sample.  The problem with this kind of sampling is that commonly used inferential statistics are not appropriate for analyzing the resulting data, because inferential statistics generally require random sampling of individuals.
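
The cluster sampling steps can be sketched as follows.  The school and teacher names are invented placeholders, and the function is an illustration under the assumption that every member of a selected cluster is included.

```python
import math
import random

def cluster_sample(clusters, sample_size, rng):
    """clusters: dict mapping cluster name -> list of members."""
    # Step 5: estimate the average number of members per cluster.
    average_size = sum(len(m) for m in clusters.values()) / len(clusters)
    # Step 6: number of clusters = sample size / estimated cluster size.
    n_clusters = math.ceil(sample_size / average_size)
    # Step 7: randomly select whole clusters, not individuals.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Step 8: include every member of each selected cluster.
    sample = [member for name in chosen for member in clusters[name]]
    return chosen, sample

# 100 hypothetical schools with 50 teachers each (5000 teachers in total).
schools = {f"school_{s:03d}": [f"teacher_{s:03d}_{t:02d}" for t in range(50)]
           for s in range(100)}
chosen, sample = cluster_sample(schools, 500, random.Random(7))
print(len(chosen), len(sample))
```

When clusters differ in size, the resulting sample will only approximate the desired size, because whole clusters are taken.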


Systematic sampling is sampling in which individuals are selected from a list by taking every Kth name.  If the list of the population is randomly ordered, this sample can be considered a random sample.


Steps of systematic sampling:

1. Identify and define the population.

2. Determine the desired sample size.

3. Obtain a list of the population.

4. Determine what K is equal to by dividing the size of the population by the desired sample size.

5. Start at some random place at the top of the population list.

6. Starting at that point, take every Kth name on the list until the desired sample size is reached.

7. If the end of the list is reached before the desired sample is reached, go back to the top of the list.
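
A short Python sketch of these steps.  The teacher list is a made-up placeholder, and choosing the start among the first K names is one reading of "some random place at the top of the list".

```python
import random

def systematic_sample(population, sample_size, rng):
    """Take every Kth name from an ordered list, starting at a random place."""
    k = len(population) // sample_size          # step 4: K = population size / sample size
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    start = rng.randrange(k)                    # step 5: random start among the first K names
    # Steps 6-7: take every Kth name; the modulo wraps back to the top of the
    # list if the end is reached (it never triggers for a start in the first K).
    return [population[(start + i * k) % len(population)]
            for i in range(sample_size)]

# Hypothetical list of 5000 teachers; a sample of 500 gives K = 10.
teachers = [f"teacher_{i:04d}" for i in range(5000)]
sample = systematic_sample(teachers, 500, random.Random(11))
print(len(sample), sample[:3])
```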


Sample size:  In general, the sample should be as large as possible.  The minimum acceptable sample size depends on the type of research.  The generally recommended minimums are 30 in each group for causal-comparative and correlational research, and 15 for experimental research.


Nonprobability sampling includes:

1) convenience sampling: the use of volunteers and existing groups

2) judgment sampling: the use of groups believed to be representative of the population

3) quota sampling: the use of quotas when it is not possible to list all members of the population for sampling.


Selection of measuring instruments

All research studies involve data collection.  There are three major ways to collect data: 1) administer a standardized instrument; 2) administer a self-developed instrument; 3) record naturally available data.

Validity and Reliability

Validity is the degree to which a test measures what it is supposed to measure and, consequently, permits appropriate interpretation of scores.  A test is not simply valid or invalid; the question is always valid for what and for whom.  Since tests are designed for a variety of purposes, and since validity can be evaluated only in terms of purpose, validity is categorized into several types: content, construct, concurrent, and predictive validity.

            Content validity:  Content validity is the degree to which a test measures an intended content area.  It requires both item validity and sampling validity.  Item validity is concerned with whether the test items represent measurement in the intended content area, and sampling validity is concerned with how well the test samples the total content area.  The content validity is determined by expert judgment.

            Construct validity:  Construct validity is the degree to which a test measures an intended hypothetical construct.  Establishing construct validity is not an easy task.  Generally, a number of independent studies are required to establish the credibility of a test of a construct.

            Concurrent validity:  Concurrent validity is the degree to which the scores on a test are related to the scores on another, already established, test administered at the same time, or to some other valid criterion available at the same time.  It is expressed by the resulting number, the validity coefficient.  If the coefficient is high, the test has good concurrent validity.

            Predictive validity:  Predictive validity is the degree to which a test can predict how well an individual will do in a future situation.  The predictive validity of a test is determined by establishing the relationship between scores on the test and some measure of success in the situation of interest.  It is expressed by the resulting number, the validity coefficient.  If the coefficient is high, the predictive validity is good.  A coefficient of 0.5 might be acceptable predictive validity in some situations but not in others.
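
A validity coefficient is a correlation between test scores and criterion scores.  The notes do not name a particular statistic; the sketch below assumes the Pearson correlation, which is the usual choice for two sets of scores.  The scores are invented for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: scores on a new test and a later measure of success
# (the criterion) for the same eight individuals.
test_scores = [52, 61, 70, 48, 66, 75, 58, 80]
criterion = [2.4, 2.9, 3.1, 2.2, 3.0, 3.6, 2.5, 3.8]
validity_coefficient = pearson_r(test_scores, criterion)
print(round(validity_coefficient, 2))
```

For concurrent validity the second list would hold scores on an established test given at the same time; the computation is the same.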

Reliability means dependability or trustworthiness.  It is the degree to which a test consistently measures whatever it measures. 

            Reliability is expressed numerically, usually as a coefficient; a high coefficient indicates high reliability.  High reliability indicates minimum error variance; if a test has high reliability, then the effect of errors of measurement has been reduced.

            A valid test is always reliable, but a reliable test is not necessarily valid.  In other words, if a test is measuring what it is supposed to be measuring, it will be reliable and do so every time, but a reliable test can consistently measure the wrong thing and be invalid.

            Reliability is easier to assess than validity.  Test-retest, equivalent-forms, and split-half reliability are all determined through correlation.

            Test-retest reliability is the degree to which scores are consistent over time.

            Reliability can also be expressed in terms of the standard error of measurement.  Standard error of measurement (degree of variance) is an estimate of how often you can expect errors of a given size.  A small standard error of measurement indicates high reliability and a large standard error of measurement indicates low reliability.
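
The link between the reliability coefficient and the standard error of measurement can be made concrete.  The notes do not give a formula; the sketch below assumes the standard one, SEM = SD × sqrt(1 − r), where SD is the standard deviation of the test scores and r is the reliability coefficient.  The SD of 15 is an arbitrary illustrative value.

```python
import math

def standard_error_of_measurement(score_sd, reliability):
    """SEM = SD * sqrt(1 - r), for a reliability coefficient r between 0 and 1."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return score_sd * math.sqrt(1.0 - reliability)

# The higher the reliability, the smaller the standard error of measurement.
for r in (0.95, 0.80, 0.50):
    print(r, round(standard_error_of_measurement(15, r), 2))
```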


Design and Procedure

Basic causal-comparative research design

The basic causal-comparative design involves selecting two groups differing on some independent variable and comparing them on some dependent variable.


Case A:   (E) -- (X) -- (O)

          (C) --        (O)        One group has had the experience (X); the other has not

Case B:   (E) -- (X1) -- (O)

          (C) -- (X2) -- (O)       The two groups have had different experiences

***Symbols:   (E) – Experimental group, (X) – Independent variable

              (C) – Control group, (O) – Dependent variable


Three types of control procedures that can be used in a causal-comparative study

Since randomization is not possible in causal-comparative studies (the groups already exist), three types of control procedures are usually adopted to help ensure the equality of the groups.


1) Matching:  If a researcher matches on IQ, then a subject in one group with an IQ at or near 140 would have a match in the other group.

2) Comparing homogeneous groups or subgroups: If IQ were an identified extraneous variable, the researcher might limit groups to contain only subjects with IQ between 85 and 115 (average IQ).  Or, each group might be divided into high (116 and above), average (85-115) and low (84 and below) IQ subgroups.

3) Analysis of covariance (ANCOVA):  A statistical method used to equate groups on one or more variables.
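
Matching, the first procedure above, can be sketched as pairing each subject in one group with the closest available subject in the other group.  The greedy pairing rule, the tolerance of 3 IQ points, and the subject data are all assumptions made for illustration; the notes do not prescribe a matching algorithm.

```python
def match_on_variable(group_a, group_b, tolerance):
    """group_a, group_b: lists of (subject_id, score). Returns matched pairs."""
    available = list(group_b)
    pairs = []
    for subject in sorted(group_a, key=lambda s: s[1]):
        # Candidates are unmatched subjects whose score is "at or near" this one.
        candidates = [c for c in available if abs(c[1] - subject[1]) <= tolerance]
        if not candidates:
            continue                     # subjects without a match are dropped
        best = min(candidates, key=lambda c: abs(c[1] - subject[1]))
        available.remove(best)           # each subject can be matched only once
        pairs.append((subject, best))
    return pairs

# Hypothetical subjects with IQ scores.
group_a = [("a1", 140), ("a2", 102), ("a3", 88), ("a4", 120)]
group_b = [("b1", 138), ("b2", 100), ("b3", 111), ("b4", 90), ("b5", 75)]
pairs = match_on_variable(group_a, group_b, tolerance=3)
for a, b in pairs:
    print(a, "<->", b)
```

Here subject a4 (IQ 120) has no counterpart within 3 points and is left out, which illustrates a known cost of matching: unmatched subjects are lost from the study.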


Common experimental research design


1) Pre-experimental Designs

X -- O                    One-shot case study

O -- X -- O               One-group pretest-posttest design

X1 -- O / X2 -- O         Static-group comparison


2) True experimental designs

R -- O -- X1 -- O

R -- O -- X2 -- O         Pretest-posttest control group design


R -- X1 -- O

R -- X2 -- O                Posttest only control group design


3) Quasi-experimental group designs

O -- X1 -- O

O -- X2 -- O               Nonequivalent control group design


O -- O -- O -- O -- X -- O -- O -- O -- O    Time series design


X1 -- O -- X2 -- O -- X3 -- O

X3 -- O -- X1 -- O -- X2 -- O

X2 -- O -- X3 -- O -- X1 -- O   Counterbalanced design


***Symbols:   (R) – Random assignment of subjects to groups

              (X) – Independent variable, (O) – Dependent variable


Five ways to control extraneous variables

1) Randomization is effective in creating equivalent and representative groups that are essentially the same on all relevant variables thought of by the researcher.

2) Matching is a technique for equating groups on one or more variables the researcher has identified as being highly related to performance on the dependent variable.

3) Comparing homogeneous groups or subgroups controls an extraneous variable by limiting the groups to subjects who are alike on that variable, or by dividing each group into subgroups on it.

4) Using subjects as their own controls involves exposing the same group to the different treatments, one treatment at a time.

5) Analysis of covariance is a statistical method for equating randomly formed groups on one or more variables.
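
Randomization, the first of the five procedures, corresponds to the (R) in the true experimental designs.  A minimal sketch of random assignment follows; the subject labels, group names, and seed are hypothetical.

```python
import random

def randomly_assign(subjects, group_names, rng):
    """Shuffle the subjects, then deal them out to the groups in turn."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {name: [] for name in group_names}
    for index, subject in enumerate(shuffled):
        groups[group_names[index % len(group_names)]].append(subject)
    return groups

# 30 hypothetical subjects assigned to two treatments, X1 and X2.
subjects = [f"S{i:02d}" for i in range(1, 31)]
groups = randomly_assign(subjects, ["X1", "X2"], random.Random(3))
print({name: len(members) for name, members in groups.items()})
```

Dealing the shuffled subjects out in turn keeps the group sizes equal (or within one of each other), while chance alone decides who ends up in which group.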


Three types of replication involved in single-subject research

1) Direct replication refers to replication by the same investigator, with the same subject or with different subjects, in a specific setting.

2) Systematic replication refers to replication which follows direct replication, and which involves different investigators, behaviors, and settings.

3) Clinical replication involves the development of a treatment package, composed of two or more interventions that have been found to be effective individually, designed for persons with complex behavior disorders.



Gay, L. R. (1996). Educational research: Competencies for analysis and application.  Upper Saddle River, NJ: Merrill.

