SAS/INSIGHT is an environment for interactive analysis of data. Its focus is on interactive graphics: graphics which the user can modify at the screen. An example of this is the ability to click on a data point (an unusual observation, for example) on a plot and have it identified with its corresponding observation number. Conversely, a subset of the data points on an existing plot (say, all males) can easily be highlighted. SAS/INSIGHT also has many data-handling and data-analytic capabilities to complement its graphical capabilities.
This very brief introduction covers only the barest essentials of SAS/INSIGHT. Its goal is to get beginners up and running in the SAS/INSIGHT environment, and to provide a guide to some basic tasks. Full documentation is found online at http://support.sas.com/onlinedoc/912/docMainpage.jsp. In addition, SAS/INSIGHT has a very good help system of its own, as will be explained below.
To access SAS/INSIGHT, select the "Solutions" entry from the menu bar of the main SAS window, then select the "Analysis" and "Interactive Data Analysis" entries in succession. Try this now. A small window entitled "SAS: SAS/INSIGHT: Open" will appear on your screen. We will call all the activities you perform in SAS/INSIGHT, from the time this window appears until you exit SAS/INSIGHT, a session. The box before you is the initial dialog box. By pressing the "Open" button at the bottom, you may read an existing SAS data set into SAS/INSIGHT. You will be asked to do this later in this tutorial. First, however, you will create your own SAS data set using SAS/INSIGHT. To begin this process, click on the "New" button.
A new data window, entitled something like "SAS: WORK.A" will appear. This means that the SAS data set you will be creating will be found in the SAS data library "WORK", which is a storage area for temporary SAS data sets (data sets that will be erased when you exit from the current SAS session). The data window is divided into a number of rows and columns of rectangles. Each rectangle, which we will call a cell, will hold one piece of data. The upper left cell should be highlighted, which indicates that it is selected and ready to accept data entry. You will begin entering data soon. First, however, a few details about getting around.
In SAS/INSIGHT, operations you can perform include creating graphs and analyses, transforming variables, fitting curves and saving results. These operations are chosen by pulling down a menu from a menu bar. The menu bar is located at the top of the SAS window (the one on the data window has the items File Edit Analyze Window Help). To pull down a menu, click on the item of interest from the menu bar. A pop-up menu will appear. Continue holding the mouse button down while you drag it down the pop-up menu until you reach the desired item. If another pop-up menu appears, continue holding the mouse button down and drag to the desired item. Release the mouse button when you have arrived at the desired operation.
For example, select the "Help" item on the menu bar. A pop-up menu will appear. Drag the mouse down the items to "Reference". Another pop-up menu will appear. Move the pointer to the first item, "Data", and release the mouse button. This activates a help window which explains the data windows in SAS/INSIGHT. The sequence of steps by which you brought up this help window can be written in shorthand and italicized as Help:Reference:Data. This shorthand and italicized notation will be used in the rest of this tutorial to describe how to move through the menus.
If you find you have made a mistake and don't want the pop-up menu you've opened, click on some neutral area of the window, such as blank space on the menu bar.
There is also context-sensitive help available. For example, if you are displaying a bar chart (a subject considered later in this tutorial) and you want some question answered about bar charts, you can put the pointer on the bar chart and press the F1 key on the keyboard.
This tutorial will not attempt to duplicate the information found in the help windows. Instead it will focus on some of the features in SAS/INSIGHT which are unique or particularly easy to use.
For this section of the primer we will assume that a project team consisting of three professors has just run the helicopter experiment introduced in Lab 1.2 of the book. If you aren't yet familiar with it, the helicopter experiment consists of timing how long it takes a paper helicopter to drop a specified distance. The experiment requires someone to release the helicopter (the RELEASER) and someone to time the helicopter's stay in the air (the TIMER). The resulting data need to be entered into SAS:
If your team has already run the helicopter experiment, you should follow along in this section but enter your team's data instead of the above data.
Now begin entering the data. Click on the upper-leftmost cell in the data window to select it for the first data value. Type "Moe <enter>". (Note: (1) <keyname> means press the key named keyname on the computer keyboard. On some computers the enter key is named return. (2) Type what is within the quotes, not the quotes themselves.) The name "Moe" should appear in the selected cell as you type, and <enter> should select the next cell down. Now in succession type "Moe <enter>", "Moe <enter>", and "Curley <enter>". You are on your way to entering the data!
You may have already noticed that the letters "Nom" appeared at the top of the first column, and below them the letter "A". A is the name SAS has given the first variable, and Nom indicates it is a nominal variable. A nominal variable is one which "names". Because the values you have input consist of letters, SAS has concluded (correctly) that the first variable is nominal. We want to name the first variable "RELEASER". To do this, click on the triangle in the upper left corner of the data window with the left mouse button (always select with the left mouse button unless told otherwise). A pop-up menu will appear. Click on the menu entry "Define Variables...". A "SAS: Define Variables" dialog box will appear. Click on the "A" to the right of "Name:", enter the name "RELEASER" (without the quotes), and click on the "OK" button. The name of the variable will now be "RELEASER".
Before we go on, two things. First, a word about notation. In what follows, we will denote the triangle you first clicked on with the symbol ▶. As we go through this tutorial, this triangle button will appear in a variety of windows and locations, but no matter where it appears, it will be referred to as ▶. Thus the two mouse selections you used in changing the name of the variable would be described as "choose ▶:Define Variables...".
Second, a few comments about the data window. The window should now have four names entered under the variable named RELEASER. Notice the number 1 is to the right of the triangle and the number 4 is below it. The first tells the number of variables (columns) in the data set (there is only RELEASER) and the second tells how many observations. The left column contains small squares. These are the symbols used in plotting. The column to the right of these contains the observation number of each observation.
Now enter the rest of the data. You may continue entering the rest of the releaser names as you have been doing, or you may click on any cell to enter the value of a single observation, or you may enter rows of data. Let's try the latter. Click on the cell at the upper left containing the first data value you entered. Now press <Tab>. The next cell to the right should be highlighted. Enter "Larry". Tab over once more, enter "2.15" and press <enter>. Now enter "1.34", and press <shift-tab> (i.e. hold down the "shift" and "tab" keys simultaneously). This will enter the "1.34" and move one column to the left. You may now enter "Larry", press <Enter> to move one row down, and continue. You may use this or the column entry you began with to complete entry of the data, or you may devise some other method of your own.
When you have finished data entry, name the second and third variables TIMER and TIME. Notice that TIMER is a nominal variable, but TIME is an interval variable, which is the default for numerical measurements.
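The rule described above (letters make a variable nominal, numbers make it interval by default) can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this tutorial, not SAS/INSIGHT's actual logic; the function name measurement_level is hypothetical, and the sample values are the few that appear in the text.

```python
def measurement_level(values):
    """Return 'nominal' if any entry is not a number, otherwise 'interval'."""
    for value in values:
        try:
            float(value)
        except ValueError:
            # A value such as "Moe" cannot be read as a number.
            return "nominal"
    return "interval"

# Values typed into the data window, as mentioned in the tutorial text.
releaser_entries = ["Moe", "Moe", "Moe", "Curley"]
time_entries = ["2.15", "1.34"]

print(measurement_level(releaser_entries))
print(measurement_level(time_entries))
```

The first call reports a nominal variable and the second an interval variable, matching what the data window shows for RELEASER and TIME.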
So far, the data you have entered are accessible only to SAS/INSIGHT and only during this session. If you exit INSIGHT the data will be lost. However, you can save these data in a SAS data set.
SAS data sets contain data and information about data such as variable names. They are created by SAS and are readable only by SAS. There are both temporary and permanent SAS data sets. Temporary data sets disappear after you finish your SAS session. They are stored in a library called WORK. Permanent data sets are stored in SAS data libraries in your directory, and may be accessed later. The default data library is SASUSER. Many SAS data sets have been created and stored for your use in the data library SASDATA.
To save your data to a SAS data set, from the data window choose File:Save:Data. A dialog box will appear offering you your choice of libraries to save to and allowing you to choose a name for the data set. If you want to create a temporary data set, select the library WORK. If you want to create a permanent data set, select the library SASUSER. In either case, call the data set COPTER.
It may be that you want to use SAS/INSIGHT to analyze data in an existing SAS data set. Data from an existing SAS data set are entered into SAS/INSIGHT through the initial dialog box, which is automatically brought up when entering SAS/INSIGHT. The initial dialog box may also be accessed if you are already in SAS/INSIGHT, by choosing File:Open. Whichever method you use, bring up the initial dialog box now.
To enter a SAS data set into SAS/INSIGHT, click on the name of the library where the data set resides and then on the data set name. One or both these actions may involve scrolling the names in a window. To scroll, place the pointer on the slider bar, hold down the left mouse button, and move the mouse. You can scroll more slowly by clicking with the left mouse button on the arrows at the top or bottom of the scroll bar.
For this tutorial, select the library SASDATA and then the data set BASEBALL. A data window containing this data set will appear. Use your mouse to enlarge this window and view its contents.
This data set consists of performance measures and salary levels for regular hitters and leading substitute hitters in major league baseball for the year 1986 (a year that will live in infamy for all Red Sox fans). The variables are:
You may access more than one SAS data set from SAS/INSIGHT at the same time. However, as you may have noticed, when the data window appeared, the initial dialog box window disappeared. To enter other data sets, choose File:Open. The initial dialog box will reappear to allow you to access another data set.
To close any SAS/INSIGHT window, click on the x in the upper right of the window, or choose File:End. When a data window is closed, all windows generated from that window are also closed. When you have closed all data windows, you exit SAS/INSIGHT.
In SAS/INSIGHT, all operations you may want to perform are listed in menus. So to perform any task, you point with the mouse and click the buttons to select objects and choose operations from menus.
You select an object to indicate that it is an object you want to work with. Objects you can select in a data set in SAS/INSIGHT include variables (such as NAME or NO_ATBAT in the baseball data set), observations (such as all data for Wade Boggs), and individual values (such as Bill Buckner's number of errors). You can also select the results of analyses you conduct in SAS/INSIGHT, such as graphs, curves and tables. Selected objects become highlighted on the display.
To select an object, move the pointer to it with the mouse and click (i.e. press and then release) the leftmost mouse button. To select multiple objects, click and drag by pressing and holding the left mouse button down while moving the pointer across the objects of interest, then releasing the mouse button. This selects all objects touched by the pointer while the mouse button was held down.
Try these techniques now on the baseball data. Select the variable NAME by clicking on it. Select observation 2 (Alan Ashby) by clicking on the number 2 next to Alan's name. Select Andre Dawson's number of hits by clicking on the 141 in the appropriate box. Select the observations for the first 6 players by clicking and dragging in the leftmost column.
When objects are far apart, it is convenient to use modifier keys with the mouse button. The shift key can be used to make an extended selection. For example, to select the observations for the first 100 players, click on the number 1 next to Andy Allanson's name, scroll down to player 100 (Eddie Milner), and click on the number 100 while holding down the shift key.
To make a non-contiguous selection, use the Ctrl key in a similar way. For example, select the variables NAME, NO_HITS and CR_HOME by clicking on any one of them first, then on a second while holding down the Ctrl key, and again on the third while holding down the Ctrl key. Try it yourself.
As you've noticed, selecting another object de-selects previously selected objects.
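The three kinds of selection described above can be thought of as operations on a set of selected objects. The following Python sketch is only a model of that behavior as this tutorial describes it; the function names are hypothetical, and Ctrl-click is modeled as adding an object only.

```python
def click(item):
    """A plain click selects one object and drops any earlier selection."""
    return {item}

def shift_click(anchor, target):
    """Shift-click extends the selection from the anchor row to the target row."""
    low, high = sorted((anchor, target))
    return set(range(low, high + 1))

def ctrl_click(selection, item):
    """Ctrl-click adds one more object to the current selection."""
    return selection | {item}

# Click observation 1, then shift-click observation 100.
first_hundred = shift_click(1, 100)

# Click NAME, then Ctrl-click NO_HITS and CR_HOME.
variables = ctrl_click(ctrl_click(click("NAME"), "NO_HITS"), "CR_HOME")

print(len(first_hundred))
print(sorted(variables))
```

A plain click always returns a one-element set, which is why selecting another object de-selects whatever was selected before.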
One of SAS/INSIGHT's strengths is its ability to create sophisticated graphical displays. To introduce you to SAS/INSIGHT's graphical capabilities, we'll consider the simplest graphical display, the frequency histogram. A frequency histogram is a graphical summary of a data set which creates a number of subgroups of the data based on the value of the variable being plotted. One bar is drawn over the range of values in each subgroup. The height of the bar drawn over a subgroup is equal to the number of data points in that subgroup.
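To make the definition concrete, here is a minimal Python sketch that counts the data points falling in each subgroup and draws one text bar per subgroup. The values and group boundaries are made up for illustration, and the choice to include the left boundary of each group but not the right one is an assumption, not a statement about how SAS/INSIGHT treats boundary values.

```python
from bisect import bisect_right

def frequency_histogram(values, edges):
    """Count the values in each group [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for value in values:
        if edges[0] <= value < edges[-1]:
            # bisect_right finds the first edge greater than the value.
            counts[bisect_right(edges, value) - 1] += 1
    return counts

# Made-up values, not taken from the baseball data set.
values = [1, 7, 20, 10, 4, 18, 6]
edges = [0, 10, 20, 30]

counts = frequency_histogram(values, edges)
for low, high, count in zip(edges, edges[1:], counts):
    # The length of each bar equals the number of points in the group.
    print(f"{low:>3} to {high:<3} {'#' * count}")
```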
Draw a frequency histogram for each of the variables SALARY and NO_HOME. To do this, choose Analyze:Histogram/Bar Chart (Y) from the menu bar. In the resulting dialog box, choose SALARY and NO_HOME as the Y variables, and click "OK". A window containing two frequency histograms will appear. Enlarge this window now. The graphs will remain small.
To enlarge the graphs, choose Edit:Windows:Renew from the menu bar of the graph window. A dialog box will appear: click on "OK".
You can move and change the size and/or shape of the graphs using the mouse. To move a graph, click with the left mouse button anywhere (except at a corner) on the side of the frame enclosing the graph. Then, still holding mouse button down, move the frame to a new location. Release the mouse button when the frame is where you want it. To enlarge (or shrink) the graph, click on a corner of the frame. As you move the mouse, the frame will change shape. Release the mouse button when the graph is the right size. With a little practice, you'll get quite good at this.
Incidentally, now would be a good time to try out the context-sensitive help facility in SAS/INSIGHT. Put the pointer on one of the frequency histograms and press the F1 key. This will bring up a help window about frequency histograms.
SAS/INSIGHT will automatically choose the number of groups and the group boundaries on the frequency histogram. You can customize the frequency histogram by altering the number of groups, the group boundaries, or both, as follows:
A good way to see how the appearance of the frequency histogram can be changed is to hold down the left mouse button while moving the move tool all around the frequency histogram. Try this now. Does it help you to get a better picture of the data?
You can more precisely specify the number and positions of the bars in the frequency histogram by choosing ▶:Ticks, where the triangle button ▶ is found in the lower left corner of the frequency histogram window. The resulting dialog box allows you to specify the minimum and maximum of the axis as well as the starting and ending location of the bars (first and last ticks) and bar width (tick increment).
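As a rough illustration of how the first tick, last tick and tick increment determine the bars, the Python sketch below turns those three numbers into a list of bar boundaries. The function name bar_edges is hypothetical, and the sketch assumes the increment divides the distance between the first and last ticks evenly.

```python
def bar_edges(first_tick, last_tick, increment):
    """Return the bar boundaries from the first tick to the last tick."""
    if increment <= 0:
        raise ValueError("tick increment must be positive")
    # Number of bars between the first and last ticks.
    n_bars = round((last_tick - first_tick) / increment)
    return [first_tick + i * increment for i in range(n_bars + 1)]

edges = bar_edges(0, 30, 10)
print(edges)
print(len(edges) - 1, "bars, each", 10, "units wide")
```

Halving the tick increment doubles the number of bars over the same range, which is one way to see more detail in the shape of the data.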
This feature demonstrates some of the power of SAS/INSIGHT. Suppose you want to look at the data in the leftmost bar of the frequency histogram for SALARY. To do this, click on that bar. You will notice that not only does that bar become highlighted, but parts of the frequency histogram for NO_HOME do as well. Now look at the data window. You'll notice that the observations of the players whose salaries are displayed in the leftmost bar of the frequency histogram are also highlighted. This illustrates two things. First, you can select observations by clicking on locations on graphs. Graphs with this feature are said to be dynamic. Second, when you select a subset of observations, the selection is displayed on all relevant windows in SAS/INSIGHT. Graphs with this feature are said to be linked. To de-select, just click on an empty region of the histogram window. Try this now.
You can do this in reverse as well. Go to the data window and select observations 1-10. These will become highlighted in the data window and on your graphs.
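One way to picture linked graphs is that every window shares a single set of selected observation numbers. The Python sketch below is a model of that idea only, not of SAS/INSIGHT internals: the five rows are made-up values, and the function names are hypothetical.

```python
# Made-up observations; these are not values from the baseball data set.
rows = [
    {"obs": 1, "salary": 90, "no_home": 3},
    {"obs": 2, "salary": 480, "no_home": 18},
    {"obs": 3, "salary": 75, "no_home": 12},
    {"obs": 4, "salary": 750, "no_home": 25},
    {"obs": 5, "salary": 100, "no_home": 5},
]

def select_bar(rows, variable, low, high):
    """Clicking a bar selects every observation whose value lies in that bar."""
    return {row["obs"] for row in rows if low <= row[variable] < high}

def highlighted_counts(rows, variable, edges, selected):
    """Count the selected observations in each bar of another histogram."""
    counts = [0] * (len(edges) - 1)
    for row in rows:
        if row["obs"] not in selected:
            continue
        for i in range(len(edges) - 1):
            if edges[i] <= row[variable] < edges[i + 1]:
                counts[i] += 1
    return counts

# Click the leftmost salary bar, here assumed to cover 0 up to 250.
selected = select_bar(rows, "salary", 0, 250)
print(sorted(selected))

# The same selection shows up as partial highlighting in the other histogram.
print(highlighted_counts(rows, "no_home", [0, 10, 20, 30], selected))
```

Going in reverse works the same way: selecting observation numbers in the data window changes the shared set, and each graph highlights whichever of its points belong to it.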
To delete a graph, first select it by putting the cursor outside the graph frame and clicking and dragging the cursor inside the frame. The graph will become highlighted. Then choose Edit:Delete. The graph will disappear.
Suppose you want to compare the frequency histograms of home runs for the American and National Leagues. This is easily done as follows. Choose Analyze:Histogram/Bar Chart (Y). From the resulting dialog box, select NO_HOME and click on the "Y" button. Next select the variable LEAGUE and click on the "Group" button. Click on "OK". Separate frequency histograms for each league should appear side by side in the resulting window.
Be careful in comparing them, though! The scales of their axes won't be the same. To get the same horizontal axes and the bars over the same intervals, adjust the ticks as described above. To get the same vertical axes, choose Edit:Windows:Align. Try this now. Do you detect any differences in home runs between the two leagues?
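The reason for matching the axes can be seen in a small Python sketch: counts for the two groups are only comparable when both use the same bar boundaries and the same vertical scale. The records below are made up for illustration, and the function name grouped_counts is hypothetical.

```python
from collections import defaultdict

def grouped_counts(records, edges):
    """Count values per group, using the same bar boundaries for every group."""
    groups = defaultdict(lambda: [0] * (len(edges) - 1))
    for group, value in records:
        for i in range(len(edges) - 1):
            if edges[i] <= value < edges[i + 1]:
                groups[group][i] += 1
    return dict(groups)

# Made-up (league, value) pairs; not taken from the baseball data set.
records = [
    ("American", 7), ("National", 20), ("American", 18),
    ("National", 4), ("American", 4), ("National", 6),
]
edges = [0, 10, 20, 30]  # shared horizontal axis

counts = grouped_counts(records, edges)
# Shared vertical axis: the tallest bar in any group sets the common scale.
shared_top = max(max(group_counts) for group_counts in counts.values())

for league, group_counts in counts.items():
    print(league, group_counts, "vertical axis from 0 to", shared_top)
```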
A scatter plot or X-Y plot is a graph of bivariate data which plots the X variable on the horizontal axis and the Y variable on the vertical axis. As an example, suppose you are interested in whether there was a relation between a player's salary and the number of home runs he hit. The best way to see any relationship is to plot SALARY (Y) versus NO_HOME (X). To do this, choose Analyze:Scatter Plot (Y X) from the menu bar. A dialog box will appear. Select NO_HOME as the X variable by clicking on NO_HOME in the variables box on the left and then clicking on the "X" button at the upper right. Select SALARY as the Y variable by clicking on SALARY in the variables box and then clicking on the "Y" button. Select NAME as the label variable by clicking on it in the variables box and then clicking on the "Label" button. Then click on "OK". The scatter plot will appear. Enlarge the window and renew the plot as desired.
Do you see a pattern to the data? Are there any unusual points? To find out who they are, click on any of those points on the plot. The player's name will appear because that is the label you gave the data. Who were the most underpaid players in terms of home runs? The most overpaid?
Perhaps you want to find which variables among NO_RBI, CR_RBI and SALARY are most strongly related. You can use SAS/INSIGHT to produce a scatter plot array. In the data window select the variables NO_RBI, CR_RBI and SALARY. Then from the menu bar choose Analyze:Scatter Plot (Y X). Enlarge the window as desired and renew the plot. Check out the results. Smooth, huh? What do you conclude about the relationships between pairs of these variables?
NOTE: Printing is not available in the Statistics Multimedia Classroom in KH 207. It may, however, be available at other locations on campus. If you are working in the stat lab, you will want to save your graphs to files in order to print them at a later time or include them directly into documents such as lab reports. See the next section for details.
As all SAS/INSIGHT output seen at the screen is written to the SAS/INSIGHT windows, it is important to be able to print the contents of these windows.
To get a good printed version of the window, follow this five-step procedure:
Sometimes, it is desirable to save SAS/INSIGHT output to a file for inclusion in a document or for printing at a later time. SAS/INSIGHT allows you to do so in a number of formats (e.g., gif, PostScript, and bitmap).
To save SAS/INSIGHT output to a file, from the window containing the output to be saved, choose File:Save:Graphics File. In the resulting dialog box, choose the type of file and give the file a name (including the path if it is to be written in another directory). Be sure to check "Titles and Footnotes" if you intend to submit it in a lab report or homework assignment.
As a default, only what is visible in the window will be saved to the file. If you want to include parts of the output that are not visible or only partly visible, you must select all objects you want to include in the file.
The SAS data sets you read into SAS/INSIGHT are not affected by any modifications you may have made during your SAS/INSIGHT session. You can, however, save the data modified in SAS/INSIGHT to a SAS data set. The resulting data set will contain:
To save the baseball data set as it currently exists in SAS/INSIGHT, choose File:Save:Data, and from the resulting dialog box select the library where you want the data set stored (usually WORK if you want it to be temporary and SASUSER if permanent). You should also choose a data set name.
SAS/INSIGHT accesses the same SAS data sets common to all SAS modules. Therefore any output written to a SAS data set by SAS/INSIGHT can be accessed by other SAS modules and vice-versa. Also, SAS/INSIGHT can be run simultaneously with other SAS modules such as SAS/EIS.
There is one caution, however. If a data set is open in SAS/INSIGHT, other SAS programs may be unable to access or write to it. This is particularly true of the macros in SAS/EIS. In this case a good strategy is to save a copy of the data set to a temporary data set as outlined in "Saving Data", and use one for analysis in SAS/INSIGHT and another for all other SAS analyses.
Introduction to SAS/EIS, which you'll use to run SAS macros (programs) for labs and specialized applications.
Introduction to SAS/INSIGHT II: Advanced Concepts. This tutorial will show you some of the more advanced features of SAS/INSIGHT.