We have now learned something about extracting data from a problem situation and recording it on data collection forms. Recording "live" data that we have extracted from a problem situation, however, may not be the only way to gather the data we need to solve problems. Some or all of the data could have been collected by someone else and stored in computer data banks or archived in some other medium. By whatever means we have gathered our data, we will eventually need to enter it into a computer program so that we can use that program to analyze it. The most common kind of program used in business to analyze data is the spreadsheet, and the most commonly used spreadsheet is Microsoft Excel. This section will teach you how to code and organize your data so you can process it with whatever data analysis tool you are most familiar with.
Data should be organized in rows and columns. The intersection of a row and a column is called a cell. Each column contains the data associated with one variable, e.g., salary, age, gender, or opinion. An observation is a complete row of data and contains all the information about a particular individual or a particular case of what we are studying. You may also see observations referred to as records.
EmpID | AnnualSalary (thousands of dollars) | Gender | Height (inches) | Dept | YrsExp (years) |
90020 | 31.5 | Male | 68 | Sales | 5.4 |
90034 | 40.3 | Female | 64 | Research | 0.5 |
92300 | 65.1 | Male | 72 | Admin | 15.1 |
92305 | 40.1 | Male | 69 | Sales | 6.1 |
92307 | 32.6 | Female | 68 | Admin | 7.8 |
92455 | 51.9 | Male | 70 | Sales | 3.1 |
94500 | 28.9 | Male | 65 | Research | 3.2 |
94700 | 44 | Female | 62 | Sales | 9.1 |
94545 | 49.9 | Male | 71 | Admin | 8.3 |
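The same row-and-column layout carries over to tools other than a spreadsheet. The following is a minimal sketch in Python, assuming the employee table above has been saved as comma-separated text; the names `RAW`, `load_observations`, and `column` are hypothetical and chosen only for illustration. It shows how each row becomes one observation (record), how a column collects every value of one variable, and how a cell is reached by naming a row and a column.

```python
import csv
import io

# The employee table above as comma-separated text (an assumed storage format):
# one header row of variable names, then one row per observation.
RAW = """EmpID,AnnualSalary,Gender,Height,Dept,YrsExp
90020,31.5,Male,68,Sales,5.4
90034,40.3,Female,64,Research,0.5
92300,65.1,Male,72,Admin,15.1
92305,40.1,Male,69,Sales,6.1
92307,32.6,Female,68,Admin,7.8
92455,51.9,Male,70,Sales,3.1
94500,28.9,Male,65,Research,3.2
94700,44,Female,62,Sales,9.1
94545,49.9,Male,71,Admin,8.3
"""


def load_observations(text):
    """Return a list of dicts, one dict per observation (row)."""
    reader = csv.DictReader(io.StringIO(text))
    observations = []
    for rec in reader:
        observations.append({
            "EmpID": rec["EmpID"],                       # identifier, kept as text
            "AnnualSalary": float(rec["AnnualSalary"]),  # thousands of dollars
            "Gender": rec["Gender"],
            "Height": int(rec["Height"]),                # inches
            "Dept": rec["Dept"],
            "YrsExp": float(rec["YrsExp"]),              # years
        })
    return observations


def column(observations, variable):
    """Return every value of one variable, in row order."""
    return [obs[variable] for obs in observations]


observations = load_observations(RAW)

first_record = observations[0]                 # one complete observation
salaries = column(observations, "AnnualSalary")  # one variable, all observations
cell = observations[2]["Dept"]                 # third row, Dept column

print(first_record)
print(salaries)
print(cell)
```

Keeping one variable per column and one observation per row is what makes helpers such as `column` this simple; the units live in comments (or in the header row) rather than in the data cells themselves.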
There are a few rules that must be followed when entering data in a spreadsheet. Following these rules will help make the data usable, which is the primary requirement. The spreadsheet should also be organized for readability, but not at the expense of usability. Once the analysis is complete, one can worry about making the data or the output of the analysis look nice for presentation, but that should be the last concern. The main considerations about spreadsheet organization are these: