Docsity

Organizing Data, Graphing - Introduction to Statistics - Lecture notes, Study notes of Statistics

Organizing Data, Graphing, Raw data, Classes, Frequency distribution, Classes and Frequencies, Percentage, Creating Classes, Rules for determining classes, Constructing a frequency Distribution are learning points available in this lecture notes.

Typology: Study notes

2011/2012

Uploaded on 11/14/2012

dharm 🇮🇳

4.3

(24)

59 documents

Organizing Data & Graphing
Introduction:
When data are collected, they are called raw data.
Raw data are frequently large, very messy, and hard to interpret:
 2  18   9   3   8   5  16   5   5  11
 1   1   3   5  21   1   8   5  11  29
 2   9   5   1   7  19  11   5   9   9
12   9   8   3   5   3  19   4   2   5
 6   8  20   6   6  12   7   1   9   1
 5   4  18  19  23  22   0   5   4   1
 6   6   4   7   6   1  13  14   7   2
 6  12   1  11   9   1   4   5   9   1
 9  14   2   7   0   6   1   9   5   6
To help us organize the mess we can:
1) Separate it into hopefully meaningful classes
Above, one choice is: 1-4, 5-8, 9-12, 13-16, 17-20, 21-24, 25-28, 29-32
2) And count how many representatives occur from each class (frequency)
A frequency distribution is the organization of raw data in table form using
classes and frequencies.
There are different types depending on the data.
Organizing and graphing the data using frequency distributions can help us
interpret and display our findings.
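The two steps above (pick classes, then count) can be sketched in a few lines of Python. This is only an illustration: it assumes classes of width 4 starting at 1, uses just the first ten raw values as a sample, and the helper names `class_label` and `frequency_distribution` are made up for this example.

```python
from collections import Counter

# First ten raw values from the notes, used here only as a small sample.
sample = [2, 18, 9, 3, 8, 5, 16, 5, 5, 11]

def class_label(value, start=1, width=4):
    """Return the 'lower-upper' label of the class that holds `value`."""
    index = (value - start) // width      # which class, counting from 0
    lower = start + index * width
    return f"{lower}-{lower + width - 1}"

def frequency_distribution(data, start=1, width=4):
    """Count how many values fall in each class."""
    return Counter(class_label(x, start, width) for x in data)

freq = frequency_distribution(sample)
for label in ["1-4", "5-8", "9-12", "13-16", "17-20"]:
    print(label, freq[label])
```

A value below the chosen starting point (such as 0) would land in a class before the first one, so in practice the starting point should be at or below the lowest value.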
The Frequency Distribution
Classes and Frequencies
When nominal or ordinal level data is used, the data is by nature separated
into categories. The frequency distribution using these categories is called a
categorical frequency distribution.



For interval and ratio level data we must create our own categories, called classes. Using these classes as we use categories, we can create a grouped frequency distribution. The only time we do not group data into classes is in extremely small samples, where each data value can be its own class.

Two studies show the difference:

Study 1: 25 students do a blood drive
Variable: blood type    Data: ________
Raw data (values run together in this copy): A O OB BB ABAB OB AB B A OBA OOO OAB ABOA

Study 2: US record high temperature for 50 states
Variable: Record High    Data: ________
Raw data (45 of the 50 values are legible in this copy):
112  100  127  120  134  118  105  110  109
110  118  117  116  118  122  114  114  105
107  112  114  115  118  117  118  122  106
116  108  110  121  113  120  119  111  104
120  113  120  117  105  110  118  112  114

Since Study 1 has nominal level data, we already have qualitative categories (we call them classes) to count frequency. We simply need to tally to find the frequency.
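For categorical data like Study 1, the tally is just a count per category. A minimal sketch, assuming a made-up list of blood types (the `blood_types` sample is hypothetical and is not the Study 1 data):

```python
from collections import Counter

# Hypothetical sample of blood types (NOT the Study 1 data).
blood_types = ["A", "O", "B", "AB", "O", "O", "A", "B", "O", "AB"]

tally = Counter(blood_types)      # category -> frequency
n = sum(tally.values())           # sample size

# Print each category with its frequency and its percentage of the sample.
for category, frequency in tally.most_common():
    percent = 100 * frequency / n
    print(f"{category:<3}{frequency:>3}{percent:>6.0f}%")
```

Each row is one line of a categorical frequency distribution: the class, its frequency, and the percentage of the sample it represents.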


Creating Classes
Terms, then rules, and then the method.

Terms:
• For interval and ratio level data, a class is a quantitative category.
• A class of values for the above example (ex. 2-3 in your text) might be 100˚-104˚.
  - 100 is the lower class limit and 104 is the upper class limit.
• The class boundaries, however, are 99.5 and 104.5, since any temperature between 99.5 and 104.5 is in our range. The temperatures 104.4 or 99.8 would be recorded as 104 and 100 and would be included in this class.
• The class width is the difference of the bounds: 104.5 – 99.5 = 5
• The class midpoint or Xm is the middlemost value between the limits: (99.5 + 104.5)/2 or (100 + 104)/2 both yield 102, the midpoint.
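The terms above reduce to a little arithmetic. The sketch below assumes data recorded to the nearest whole unit (so boundaries sit half a unit beyond the limits); the function name `class_terms` and its `unit` parameter are illustrative, not from the notes.

```python
def class_terms(lower_limit, upper_limit, unit=1):
    """Boundaries, width and midpoint of one class, for data recorded to `unit`."""
    lower_boundary = lower_limit - unit / 2
    upper_boundary = upper_limit + unit / 2
    width = upper_boundary - lower_boundary       # difference of the boundaries
    midpoint = (lower_limit + upper_limit) / 2    # same result using boundaries
    return lower_boundary, upper_boundary, width, midpoint

terms = class_terms(100, 104)
print(terms)
```

For the 100-104 class this reproduces the boundaries 99.5 and 104.5, the width 5, and the midpoint 102.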

Rules for determining classes:

1) There should be between 5 and 20 classes.
2) It is preferable that the class width be odd (optional)
    This ensures the midpoint is the same place value as our measurements.
3) The classes must be mutually exclusive:
    This ensures distinct boundaries:
        class 1      class 2     .....
        100 - 104    105 - 109   ...
        99.5 ≤ class 1 < 104.5 ≤ class 2 < 109.5 ≤ ....
4) The classes must be continuous:
    Just because a class has no values does not mean you can omit it (unless it's on either extreme).
5) The classes must be exhaustive:
    All data needs a home!
6) The classes must be of equal width.
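Most of these rules can be checked mechanically. The following is a rough sketch for integer-valued data, where classes are given as (lower limit, upper limit) pairs in increasing order; the function `check_classes` and its messages are invented for this illustration.

```python
def check_classes(classes, data):
    """Return a list of rule violations (empty if none) for integer data.

    `classes` is a list of (lower_limit, upper_limit) pairs in increasing order.
    """
    problems = []
    # Rule 1: between 5 and 20 classes.
    if not 5 <= len(classes) <= 20:
        problems.append("number of classes is not between 5 and 20")
    # Rule 6: equal width (for integer data, width = upper - lower + 1).
    widths = {upper - lower + 1 for lower, upper in classes}
    if len(widths) != 1:
        problems.append("classes are not of equal width")
    # Rules 3 and 4: neighbours must neither overlap nor leave a gap.
    for (_, upper), (next_lower, _) in zip(classes, classes[1:]):
        if next_lower <= upper:
            problems.append(f"classes overlap at {next_lower}")
        elif next_lower > upper + 1:
            problems.append(f"gap between {upper} and {next_lower}")
    # Rule 5: every data value needs a class.
    if any(not any(lo <= x <= hi for lo, hi in classes) for x in data):
        problems.append("some data values have no class")
    return problems

good = [(100 + 5 * i, 104 + 5 * i) for i in range(7)]   # 100-104 ... 130-134
print(check_classes(good, [100, 118, 134]))
```

Rule 2 (odd width) is left out because the notes mark it as optional.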

Method for determining classes so the rules are satisfied:

1) Find the highest and lowest values.
2) Find the range.
3) Select the number of desired classes (#classes).
4) Find class width: Class width = range / #classes, then round up!
5) Select the lowest value or any lower value as your starting point.
6) Repeatedly add the width to get all lower limits.
7) Determine upper limits.

Apply this to Study 2:

Study 2 raw data (45 of the 50 values are legible in this copy):
112  100  127  120  134  118  105  110  109
110  118  117  116  118  122  114  114  105
107  112  114  115  118  117  118  122  106
116  108  110  121  113  120  119  111  104
120  113  120  117  105  110  118  112  114
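The method can be sketched as a short function. The sketch assumes integer data, takes the lowest value as the starting point, and uses 7 classes for Study 2. It also assumes that "round up" means going to the next whole number even when the division is exact, so the highest value always fits. The name `build_classes` is made up for this example.

```python
def build_classes(data, num_classes):
    """Class limits for integer data, following the steps listed above."""
    low, high = min(data), max(data)          # step 1: lowest and highest values
    data_range = high - low                   # step 2: range
    # Steps 3-4: divide by the number of classes and round up. Going to the
    # next whole number even on exact division is an assumption of this sketch.
    width = data_range // num_classes + 1
    # Steps 5-6: start at the lowest value and repeatedly add the width.
    lower_limits = [low + i * width for i in range(num_classes)]
    # Step 7: each upper limit is one unit below the next lower limit.
    return [(lower, lower + width - 1) for lower in lower_limits]

# A few Study 2 temperatures, including the lowest (100) and highest (134).
temps = [112, 100, 127, 120, 134, 118, 105, 110, 109]
classes = build_classes(temps, 7)
print(classes)
```

With a range of 34 and 7 classes the width comes out as 5, giving the classes 100-104 through 130-134.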


Class       Tally                 Frequency   Cumulative
100 - 104   //                         2           2
105 - 109   ////////                   8          10
110 - 114   //////////////////        18          28
115 - 119   /////////////             13          41
120 - 124   ///////                    7          48
125 - 129   /                          1          49
130 - 134   /                          1          50

Cumulative is useful: 41 of 50 states have high temperatures under 120˚.

From such distributions we will next learn to analyze by:
• Graphing
• Analyzing distribution shape
• Making comparisons with other data

Cumulative frequencies are the predecessors to “percentiles”. If you’ve taken a standardized test like the SAT they give you a percentile score. If you are in the 80th percentile, it means that out of a hundred students you rank 80th. We can calculate cumulative percent in the above example very easily.

Class       Frequency   Cumulative Frequency   Cumulative Percent
100 - 104        2               2             2/50 · 100 = 4%
105 - 109        8              10             10/50 · 100 = 20%
110 - 114       18              28             28/50 · 100 = 56%
115 - 119       13              41             41/50 · 100 = 82%
120 - 124        7              48             48/50 · 100 = 96%
125 - 129        1              49             49/50 · 100 = 98%
130 - 134        1              50             50/50 · 100 = 100%

Cumulative percent = (cumulative frequency / N) · 100%

So 48 of 50 states (or 96% of states) have record highs of less than or equal to 124.5˚.
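The cumulative columns are running totals, which makes them easy to compute. A minimal sketch using the Study 2 class frequencies from the table (the variable names are illustrative):

```python
from itertools import accumulate

classes = ["100-104", "105-109", "110-114", "115-119",
           "120-124", "125-129", "130-134"]
frequencies = [2, 8, 18, 13, 7, 1, 1]

cumulative = list(accumulate(frequencies))        # running total of frequencies
n = cumulative[-1]                                # last running total is N
cumulative_percent = [100 * c / n for c in cumulative]

for row in zip(classes, frequencies, cumulative, cumulative_percent):
    print("{:<9}{:>4}{:>5}{:>7.0f}%".format(*row))
```

The last cumulative frequency always equals N, and the last cumulative percent is always 100%, which is a quick check on the arithmetic.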


Graph               Axes
Histogram           Frequency vs. Class Boundaries
Frequency Polygon   Frequency vs. Class Midpoints
Ogive               Cumulative Frequency vs. Class Boundaries


Graph / Steps

Histogram:
  1) Draw and label axes with frequency on the y-axis and plot each class boundary on the x-axis.
  2) Draw a vertical bar for each class with the class frequency as the height.

Frequency Polygon:
  1) Draw and label axes with frequency on the y-axis and plot the class midpoints on the x-axis.
  2) Plot points for each class: (midpoint, frequency).
  3) Connect the dots.

Ogive:
  1) Draw and label axes with cumulative frequency on the y-axis and plot the class boundaries on the x-axis.
  2) Plot points for each class: (upper class boundary, cumulative frequency).
  3) Connect the dots.

• When labeling the y-axes do not plot actual frequency or cumulative frequency values from the table. Plot an appropriate range and scale.
• Cumulative frequency graphs are used to visually represent how many values are below a certain upper class boundary.
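Before drawing any of the three graphs, it helps to list the coordinates each one needs. The sketch below only computes those coordinates for the Study 2 classes and does no drawing. It assumes whole-number data, so boundaries are half a unit beyond the limits.

```python
from itertools import accumulate

limits = [(100, 104), (105, 109), (110, 114), (115, 119),
          (120, 124), (125, 129), (130, 134)]
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Histogram: one bar per class as (lower boundary, upper boundary, height).
bars = [(lo - 0.5, hi + 0.5, f) for (lo, hi), f in zip(limits, frequencies)]

# Frequency polygon: points (class midpoint, frequency).
polygon = [((lo + hi) / 2, f) for (lo, hi), f in zip(limits, frequencies)]

# Ogive: points (upper class boundary, cumulative frequency).
ogive = [(hi + 0.5, c) for (_, hi), c in zip(limits, accumulate(frequencies))]

print(bars[0], polygon[0], ogive[0])
```

Any plotting tool can then draw the bars or connect the points; the choice of y-axis range and scale is still up to the person making the graph.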


Rarely will a distribution have an exact shape but it is useful to classify by general pattern. What about our graph for high temperatures?

Other Types of Graphs

Pie Graphs: Pie graphs are circular with area representing frequencies.

Construction through example:

Cookie Types      Number Sold
Chocolate Chip    20
Peanut Butter     15
Oatmeal           30
Sugar             10


Step 1: Drawing the pie pieces.
Calculate pie pieces and use a protractor to measure out the degree amount.

Calculating the degrees: There are 360˚ in a circle. We need a proportion or a percentage of the circle. We will use percentages.

Degrees in the pie piece = (f / n) x 360˚, where f is the frequency of the category and n is the total of all frequencies.

• Degrees of “Chocolate Chip” pie piece = 20/75 x 360˚ = 96˚
• Degrees of “Peanut Butter” pie piece = 15/75 x 360˚ = 72˚
• Degrees of “Oatmeal” pie piece = 30/75 x 360˚ = 144˚
• Degrees of “Sugar” pie piece = 10/75 x 360˚ = 48˚
Note: 96 + 72 + 144 + 48 = 360

Step 2: Labeling the pie pieces - use percentages.

• % Label of “Chocolate Chip” pie piece = 20/75 x 100% ≈ 27%
• % Label of “Peanut Butter” pie piece = 15/75 x 100% = 20%
• % Label of “Oatmeal” pie piece = 30/75 x 100% = 40%
• % Label of “Sugar” pie piece = 10/75 x 100% ≈ 13%
Note: ∑ = 100%

[Figure: finished pie graph]
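Both steps are the same proportion applied to 360˚ and to 100%. A short sketch with the cookie data (the dictionary and variable names are illustrative):

```python
cookies = {"Chocolate Chip": 20, "Peanut Butter": 15, "Oatmeal": 30, "Sugar": 10}
n = sum(cookies.values())          # total number sold

# Step 1: degrees for each pie piece, f/n of the full 360 degrees.
degrees = {name: f * 360 / n for name, f in cookies.items()}

# Step 2: percentage label for each piece, rounded to a whole percent.
percents = {name: round(f * 100 / n) for name, f in cookies.items()}

for name in cookies:
    print(f"{name:<15}{degrees[name]:>6.0f} degrees{percents[name]:>5}%")
```

Rounded percentages happen to sum to exactly 100 here; with other data, rounding can leave the labels a point above or below 100.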


Time Series Graph: This type of graph represents data that occur over a specific period of time.

Axes: Variable of choice vs. Time

Example time series graph data: A local fundraiser wants to graphically display the contributions they have received over the past five years.

Year    Contributions
1996    $
1997    $700
1998    $800
1999    $
2000    $


Stem and leaf plots: Stem and leaf plots are a method of organizing data that is a combination of sorting and graphing. A stem and leaf plot is a data plot that uses part of the data value as the stem and part of the data value as the leaf to form groups or classes.

Explanation through example:

Data: 12, 22, 22, 24, 34, 31, 26, 35, 27, 39, 49, 10, 45, 36, 23, 16, 37, 28, 18, 13, 10, 23, 30, 31

Step 1: Arrange the data in order.
10, 10, 12, 13, 16, 18, 22, 22, 23, 23, 24, 26, 27, 28, 30, 31, 31, 34, 35, 36, 37, 39, 45, 49

Step 2: Separate into groups by first digit.
10, 10, 12, 13, 16, 18
22, 22, 23, 23, 24, 26, 27, 28
30, 31, 31, 34, 35, 36, 37, 39
45, 49

Step 3: The leading digit is the “stem” and the trailing digit is the “leaf”:
1 | 0 0 2 3 6 8
2 | 2 2 3 3 4 6 7 8
3 | 0 1 1 4 5 6 7 9
4 | 5 9
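The three steps translate almost directly into code. A minimal sketch for two-digit data, where the tens digit is the stem and the units digit is the leaf (the function name `stem_and_leaf` is made up for this example):

```python
from collections import defaultdict

data = [12, 22, 22, 24, 34, 31, 26, 35, 27, 39, 49, 10,
        45, 36, 23, 16, 37, 28, 18, 13, 10, 23, 30, 31]

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted leaves (units digits)."""
    plot = defaultdict(list)
    for value in sorted(values):           # Step 1: arrange the data in order
        stem, leaf = divmod(value, 10)     # Steps 2-3: split stem from leaf
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(data)
for stem, leaves in plot.items():
    print(stem, "|", " ".join(map(str, leaves)))
```

Because the values are sorted first, the stems come out in increasing order and each row of leaves is already in order.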


Ending comment on graphing: Like anything else in statistics, graphs can be manipulated to give false impressions. Since graphing is a visual medium, fair choice of scale is essential to achieving reliable analysis.

2.4 Scatter Plots

A scatter plot is a graph of ordered pairs of data values that is used to determine if a relationship exists between two variables.

Example: Is there a relationship between bike accidents and bike fatalities? (Do you think so? More accidents = more fatalities?)

Just plot points... This is a scatter plot.


Analyzing Scatter Plots:

• If the points sort of fall on a line it is called a linear relationship
  o Positive linear relationship for increasing lines (positive slope)
  o Negative linear relationship for decreasing lines (negative slope)

This data has a weak positive linear relationship. Our example above is much stronger.

• There are other non-linear relationships
• No relationship exists when there is no discernible pattern