The number of new features is rather breathtaking
There may be more than a grain of truth to the marketing blurb I borrowed in the headline. As with many other successful packages, this one has come a long way from its command-line origins. I had found the earlier versions of this software to be rather plain and stiff. What you needed was there, it just took a bit of work to get at it and, during an analysis, you got exactly what you asked for (assuming correct structure of the query), nothing more.
Now “Mr. User-friendly,” SYSTAT 12 has a number of useful and sometimes unique features to assist new users. The number of new features is rather breathtaking and is indicated by the length of the ‘What’s New and Different in SYSTAT 12’ chapter in the Getting Started Manual (63 pages!). The trend for most statistical and mathematical software is toward user friendliness, and SYSTAT is no exception. The amount of work done in upgrading and simplifying menus, customization routines and tools is refreshing, and new modules, such as the increasingly useful Quality Analysis and Monte Carlo collections, are especially valuable.
Figure 1: The data editor
As my interest is in the testing, diagnostics and output, I like to concentrate in those areas for most reviews. Let’s first get a few preliminaries out of the way to give the basic information, and then work through a few examples to get a feel for the depth and breadth of the analytics.
System requirements
• Windows 32-bit (Vista, XP, 2000)
• Pentium-level 32-bit processor
• 128 MB RAM (512 recommended)
• 220 MB free disk space (inclusive of manual PDFs, 85 MB)
• CD-ROM drive
• SVGA adapter and monitor
• Internet Explorer 6
Figure 2: Scatterplot dialog box
Documentation
For paper manual junkies like your editor, this is heaven! SYSTAT 12 comes with 10 manuals: Getting Started, Statistics I-IV, Graphics, Data, Language Reference, Monte Carlo and Quality Analysis.
I highly recommend Getting Started to orient new users in the tips, tricks, quirks and SOPs of the software. This manual has sections on first use, basics, how to use both the menu-driven and command-line elements, customizing the work environment, and a nifty ‘Applications’ section which contains problem examples from manufacturing, engineering and several scientific disciplines. Although the inputs are all given (unfortunately) in command-line format, the menu-driven steps are included in the help section with the software. The writing is clear and the indexing at least adequate.
The statistical manuals serve not only as guides to the analytic techniques, but are also wonderful introductions to the theory behind each. For example, many users may not know that regression and ANOVA are two sides of the same coin and are both examples of general linear models. Statistics I explains the similarities and why it is advantageous to have all three procedures (regression, ANOVA and the general linear model) default to the same algorithm. Many of the sections may serve as stand-alone introductory texts.
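The "two sides of the same coin" point can be checked by hand. The following is a minimal Python sketch, not SYSTAT code, using made-up numbers rather than anything from the Food data set. It computes the F statistic once from the ANOVA sums of squares and once from a least-squares fit on a 0/1 dummy variable; all variable names are my own.

```python
# Toy illustration (hypothetical data): one-way ANOVA on two groups and a
# regression on a 0/1 dummy variable yield the same F statistic.

def mean(xs):
    return sum(xs) / len(xs)

group_a = [10.0, 12.0, 11.0, 13.0]   # hypothetical responses, dummy = 0
group_b = [15.0, 14.0, 16.0, 17.0]   # hypothetical responses, dummy = 1

# ANOVA view: between-group versus within-group sums of squares.
y = group_a + group_b
grand = mean(y)
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in (group_a, group_b))
ss_within = sum((v - mean(g)) ** 2 for g in (group_a, group_b) for v in g)
f_anova = (ss_between / 1) / (ss_within / (len(y) - 2))

# Regression view: least-squares fit of y on the dummy variable x.
x = [0.0] * len(group_a) + [1.0] * len(group_b)
x_bar = mean(x)
slope = (sum((xi - x_bar) * (yi - grand) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = grand - slope * x_bar
fitted = [intercept + slope * xi for xi in x]
ss_model = sum((f - grand) ** 2 for f in fitted)
ss_resid = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
f_regression = (ss_model / 1) / (ss_resid / (len(y) - 2))

print(f"slope (difference in group means): {slope:.3f}")
print(f"F from ANOVA: {f_anova:.3f}   F from regression: {f_regression:.3f}")
```

The slope of the dummy-variable regression is simply the difference between the two group means, which is why the two approaches partition the variance identically.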
Figure 3: Basic scatterplot
The new features, as well as the standard testing routines, are so complete as to require six to eight pages just to list them. They may be found online at www.systat.com/products/Systat/productinfo/?sec=1006. It also should be mentioned that the index and search features found under the help menu are nearly exhaustive and are easily accessed. Unfortunately, what you get is sometimes dependent upon the module in which you are working, so the user needs to be aware of this.
Statistics and graphics
SYSTAT can import data in the common .xls, .csv, .dat and .txt formats, as well as the specialized formats of many other statistical programs including: SigmaPlot, SAS, STATISTICA, Minitab, SPSS, Stata, JMP and Statview.
Figure 4: Labeled and smoothed data plot
To provide the reader with a feel for the usage, we will step through an example of basic statistical analyses. In this example (fat and caloric content of foods), we do some simple data snooping with scatterplots, sorting lists, two-way frequency tables, descriptive statistics, scatterplot matrices, subsetting, t-tests and ANOVAs.
To access the precooked data set from the main menu bar, we select File/Open/Data, select ‘All Files’ from the drop-down box, select the file Food.dat and hit OK. The data is displayed in the data editor (Figure 1).
To take a look at the relation between two quantities, we can quickly produce a scatterplot by again going to the main menu and choosing Graph/Scatterplot. In the resulting dialog box, we select FAT as the X-variable and CALORIES as the Y (Figure 2). We can also click the ‘Fill’ tab and select a solid fill and color for the fill pattern. When we hit OK, the result is a simple scatterplot (Figure 3).
To further gain information from this data, we can return to the scatterplot dialog box (this time merely choosing the scatterplot icon from the main menu) and ask for LOWESS smoothing and labeling by brand (Figure 4).
Figure 5: Grouping and overlay plots
We now immediately see that fat content of these foods correlates heavily with calories. We also see that some brands are more fatty and high-calorie than others. It would have been just as easy to label by food type and see that dependency.
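To put a number on what the plot shows, one could compute a Pearson correlation coefficient. The sketch below is plain Python, not SYSTAT output, and the `fat` and `calories` values are hypothetical stand-ins for the Food data set.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values, for illustration only.
fat = [2.0, 5.0, 9.0, 14.0, 20.0]
calories = [150.0, 210.0, 260.0, 330.0, 400.0]

r = pearson_r(fat, calories)
print(f"r = {r:.4f}")
```

A value of r close to 1 corresponds to the tight, upward-sloping cloud seen in a scatterplot of this kind.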
To ask a more obvious question, we may quickly see if the diet meals are truly lower in calories than the standard fare. By using DIET as the grouping variable in the scatterplot dialog box and requesting an ‘overlay multiple graphs’ feature with a 90 percent confidence interval, plus deselecting the smoother and label features, we can see that there is a difference (Figure 5).
There appears to be good separation of the two classes as well as a possible outlier. The manual then stresses that graphics are enlightening, but that there are times when we really need to see numbers. By using the Data/Sort File and Data/List Cases menu selections, we generate a list logically sorted for the study (Figure 6).
Figure 6: Tabular, sorted data
We now see wide variation of fat content within food type. We could also take a quick look to verify that the data is not strictly balanced, and we can use Analyze/One-Way Frequency tables to get counts and percentages. However, more interesting is a two-way table from cross-classifying variables. This is quickly done by selecting Analyze/Tables/Two-Way from the main menu, and selecting Diet and Brands as the row and column variables, respectively. We also need to select List layout and deselect Counts in the same dialog box (Figure 7). Similarly, we can generate basic statistics for any subgrouping of the data, as well as a scatterplot matrix to visualize the correlations.
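The idea behind a two-way table in list layout can be sketched in a few lines of Python. This is not how SYSTAT computes it, the records are hypothetical, and I am assuming that the percentages are taken over all cases.

```python
from collections import Counter

# Hypothetical (diet, brand) pairs standing in for the real records.
records = [
    ("yes", "A"), ("yes", "A"), ("yes", "B"),
    ("no", "A"), ("no", "B"), ("no", "B"), ("no", "C"), ("no", "C"),
]

def two_way_percentages(pairs):
    """Return {(row, col): percent of all cases} for each observed cell."""
    counts = Counter(pairs)
    total = sum(counts.values())
    return {cell: 100.0 * n / total for cell, n in counts.items()}

table = two_way_percentages(records)

# List layout: one line per non-empty cell, percentages instead of counts.
for (diet, brand), pct in sorted(table.items()):
    print(f"diet={diet:<3} brand={brand}  {pct:5.1f}%")
```

Cells that never occur (here, a diet meal from brand C) simply do not appear in the list, which makes imbalance in the data easy to spot.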
Perhaps of greatest interest are the group comparisons, done via the independent t-test and ANOVA. For the former, we select Analyze/Hypothesis Testing/Mean/Two-Sample t-Test and fill the dialog box with variable selections and alternative hypothesis (Figure 8). Here, we wish to compare protein content in the diet and regular brands.
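For readers who want to see what sits behind the dialog box, here is a minimal Python sketch of the classical pooled-variance two-sample t statistic. It is not SYSTAT's implementation, the protein values are made up, and it stops short of the p-value, which requires the t distribution and is not available in the Python standard library.

```python
import math

def two_sample_t(a, b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((v - ma) ** 2 for v in a)   # sum of squares within sample a
    ssb = sum((v - mb) ** 2 for v in b)   # sum of squares within sample b
    df = na + nb - 2
    pooled_var = (ssa + ssb) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (ma - mb) / se, df

# Hypothetical protein contents in grams, for illustration only.
diet_protein = [18.0, 20.0, 19.0, 21.0]
regular_protein = [23.0, 25.0, 24.0, 26.0]

t_stat, df = two_sample_t(diet_protein, regular_protein)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

The pooled form assumes equal variances in the two groups; a separate-variance (Welch) version would change both the standard error and the degrees of freedom.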
Figure 7: Two-way tables
Note the very nice graphic that seems to imply a similarity. However, also notice the highly significant p-value that strongly suggests we should not be making decisions based upon graphics alone! Now, let’s try an ANOVA, remembering that SYSTAT has always been very strong in general linear models.
In this day and age, cost is an important consideration in any buying decision, so we could reasonably ask about price variability by brand. By again ordering the display and choosing extended results, we have formatted the data and can run an ANOVA with Analyze/Analysis of Variance/Estimate Model (Figure 9 displays selected results).
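The arithmetic of a one-way ANOVA is simple enough to sketch in Python. This is an illustration with invented prices for three unnamed brands, not SYSTAT's general linear model routine, and it returns only the F ratio and its degrees of freedom.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_b, df_w = k - 1, n - k
    return (ss_between / df_b) / (ss_within / df_w), df_b, df_w

# Hypothetical prices for three brands, for illustration only.
prices_by_brand = [
    [2.0, 2.5, 3.0],
    [3.5, 4.0, 4.5],
    [2.0, 3.0, 4.0],
]

f_stat, df_between, df_within = one_way_anova(prices_by_brand)
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```

A large F means the brand means differ by more than the scatter within brands would explain; judging how large is large still requires the F distribution, which the package supplies.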
These simple, scratch-the-surface examples give some idea of the ease-of-use that is present in most analyses. This assumes, of course, that the analyst knows what he wants, knows how to structure the problem, and can correctly interpret the results. Always a huge assumption!
Figure 8: Two-sample t-test
Summary
It is truly amazing how many sophisticated analyses can be done with modern, off-the-shelf commercial software. With today’s concentration on the ease-of-use issue, as well as full documentation of methods with the novice in mind, we have entered a new era in scientific computing. As with all software, not everyone will be entirely pleased, and I had a number of minor quibbles (which seemed major to me as a new user).
Figure 9: One-way ANOVA
For example, upon using Search in the Help menu to find out about the Undo function, although there was a good explanation of the workings, I would have appreciated a word on actually finding it! It appears under the Edit menu but should have its own menu button as default. Also, DOE should have its own major menu item, but it is buried under Utilities (and yes, it is a valuable ‘utility’).
Some actions are rather non-intuitive, such as how to undo a windows tile or cascade. Also, some explanations are slightly one-sided, as with resampling where we get a much more satisfying and lengthy treatment of the bootstrap than the jackknife, although that may be a function of usage.
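For readers unfamiliar with the two resampling schemes, the following Python sketch contrasts them on the simplest possible case, the standard error of a mean. It is my own illustration with made-up values and has no connection to SYSTAT's resampling options; the function names, the 2,000 resamples and the fixed seed are arbitrary choices.

```python
import math
import random

def jackknife_se_mean(data):
    """Jackknife standard error of the mean (leave one observation out)."""
    n = len(data)
    total = sum(data)
    loo_means = [(total - v) / (n - 1) for v in data]
    centre = sum(loo_means) / n
    return math.sqrt((n - 1) / n * sum((m - centre) ** 2 for m in loo_means))

def bootstrap_se_mean(data, n_resamples=2000, seed=0):
    """Bootstrap standard error of the mean (resample with replacement)."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    n = len(data)
    means = [sum(rng.choices(data, k=n)) / n for _ in range(n_resamples)]
    centre = sum(means) / n_resamples
    return math.sqrt(sum((m - centre) ** 2 for m in means) / (n_resamples - 1))

sample = [3.0, 5.0, 7.0, 8.0, 10.0, 12.0, 13.0, 14.0]  # made-up values

se_jack = jackknife_se_mean(sample)
se_boot = bootstrap_se_mean(sample)
print(f"jackknife SE: {se_jack:.4f}")
print(f"bootstrap SE: {se_boot:.4f}")
```

For the mean, the jackknife reproduces the textbook s/sqrt(n) exactly and is deterministic, while the bootstrap estimate varies with the random resamples but extends more readily to statistics that lack a simple formula.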
Despite the above, the first use of this software is a rather enjoyable experience. I find much to recommend in the completeness of the testing repertoire and simplicity of use. There is a lot of reading to do in assessing the breadth and depth of the offerings, and interested parties are highly encouraged to visit the Web site, download a copy, and actually take the time to master a few simple routines.
Availability
• $1,299 single user, commercial
Systat Software, 225 W Washington St., Suite 425
Chicago, IL 60606
1-312-220-0060; Fax: 1-312-220-0070
Toll-free: 877-797-8280
www.systat.com; [email protected]
John Wass is a statistician based in Chicago, IL. He may be reached at [email protected].