Taxonomy of Data

STAT 20: Introduction to Probability and Statistics

Agenda

  • Concept Questions: Taxonomy of Data
  • Reading Questions
  • Worksheet on Paper
  • Break
  • RStudio & Markdown Introduction

Concept Questions

Types of Variables

A variable could be referring to one of three things:

  1. a phenomenon
  2. how the phenomenon is recorded or measured as data
    • what values can it take? (this is often an intent- or value-laden exercise!)
    • for numerical variables, what unit should we express it in?
  3. how the recorded data is analyzed
    • binning/discretizing income data
    • switching from a bar chart to a histogram when the bar chart has too many bars
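
As a rough illustration of binning, the sketch below turns a numerical income variable into an ordinal one. The income values, bracket edges, and labels are all made up for this example; they are assumptions, not part of the course materials.

```python
# Hypothetical income values in dollars (made up for illustration).
incomes = [12_000, 38_500, 54_000, 71_250, 99_999, 150_000]

# Assumed bracket edges; each bin covers lower <= value < upper.
edges = [0, 25_000, 50_000, 100_000, float("inf")]
labels = ["under 25k", "25k to 50k", "50k to 100k", "100k and over"]


def bin_income(value):
    """Return the label of the bracket that contains value."""
    for lower, upper, label in zip(edges, edges[1:], labels):
        if lower <= value < upper:
            return label
    raise ValueError(f"income out of range: {value}")


# The same phenomenon, now recorded as an ordered category.
binned = [bin_income(x) for x in incomes]
print(binned)
```

Different bracket edges would give a different categorical variable from the same raw data, which is why binning is an analysis choice rather than a property of the phenomenon.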



What type of variable is age?


Answer at pollev.com/<name>

01:00

Images as data

  • Images are composed of pixels (this image is 1520 by 1012 pixels)

  • The color in each pixel is in RGB

  • Each band (red, green, blue) takes an integer value from 0 to 255

  • This image is data with 1520 x 1012 x 3 values.

A shoebill with a duck in its mouth.
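
The arithmetic behind the count of values can be checked with a short sketch. The dimensions come from the slide; the single example pixel is a hypothetical value chosen only to show what one RGB triple looks like.

```python
# Image dimensions from the slide: 1520 by 1012 pixels, 3 color bands (R, G, B).
width, height, bands = 1520, 1012, 3

n_pixels = width * height    # one entry per pixel
n_values = n_pixels * bands  # one number per band per pixel

# A hypothetical pixel: each band is an integer from 0 to 255.
pixel = (200, 120, 40)
assert all(0 <= v <= 255 for v in pixel)

print(n_pixels)  # 1538240
print(n_values)  # 4614720
```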

Grayscale

  • Grayscale images have only one band
  • 0 is black, 255 is white
  • This image is data with 1520 x 1012 x 1 values.

A shoebill with a duck in its mouth in grayscale.

Grayscale

  • To simplify, assume our photos are 8 x 8 grayscale images.

An 8 x 8 grayscale image
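
A minimal sketch of how an 8 x 8 grayscale image might be stored as numbers. The gradient pattern is invented for illustration and is not the image shown on the slide.

```python
size = 8

# Made-up gradient: intensity rises left to right from 0 (black) to 255 (white).
# The image is a list of rows, each row a list of pixel intensities.
image = [
    [round(col * 255 / (size - 1)) for col in range(size)]
    for _ in range(size)
]

# One band, so the image holds 8 x 8 x 1 = 64 values.
n_values = sum(len(row) for row in image)

print(image[0])  # first row of pixel intensities
print(n_values)  # 64
```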

Images in a Data Frame

Consider the following images, which are our data:

  • Let’s simplify them to 8 x 8 grayscale images

Images in a Data Frame

If you were to put the data from these (8 x 8 grayscale) images into a data frame, what would the dimensions of that data frame be in rows x columns?

01:00
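
One possible layout, assuming each image is the unit of observation: flatten every 8 x 8 image into a single row so that each pixel position becomes a column. The three images, their fill values, and the `pixel_` column names below are hypothetical, as is the `flatten` helper.

```python
def flatten(image):
    """Turn a list of rows into one flat list, reading row by row."""
    return [value for row in image for value in row]


# Three hypothetical 8 x 8 grayscale images: all black, all white, mid gray.
images = [[[fill] * 8 for _ in range(8)] for fill in (0, 255, 128)]

# One column per pixel position (names are an assumption for this sketch).
columns = [f"pixel_{i}" for i in range(8 * 8)]

# One row per image.
rows = [flatten(img) for img in images]

n_rows, n_cols = len(rows), len(rows[0])
print(n_rows, n_cols)  # 3 64
```

Under this layout the number of rows equals the number of images, and the number of columns is fixed by the image size.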

Worksheet on Paper

20:00

Break

03:00

Writing in RStudio

Goal: write a brief memo about your home county.

Step-by-step

  1. Create and save a new .qmd file.
  2. Add a title and author.
  3. Add a header with the name of the county and state (if you are from abroad, provide the name of a county and state that you would like to visit).
  4. Add 1-3 paragraphs of text about this county, as described by Wikipedia.
  5. Add an example of bold and italic text.
  6. Add a bulleted list of your top three favorite things about this county.
  7. Add a sentence at the bottom acknowledging that the text came from Wikipedia, along with a link to that source.

20:00