A statistical analysis package for handling numerical data, operated by
entering one-line commands and subcommands. Command ``batch'' files can
be created for automatic execution, along with explanatory screen
remarks.
STATMATE operates on information contained in a database, generated by
the program. A user ID is required before entering a database, and for
every new user ID, an empty database is created. This feature permits
multiple users to work with STATMATE while keeping the data files
separated.
Extract data from an ASCII text file and load it into the database for
operation. Data is stored in columns and rows, and you can extract
portions of the data according to your specifications. As you
manipulate the data, the results can be displayed on the screen,
printed, or saved on a disk file.
The main analytic features are elementary statistics, scatter plots,
cross tabulations, histograms, data comparison using the T-Test,
correlation, arithmetic operations, distribution functions, curvilinear
regression, multiple regression, nonlinear regression, data recoding,
and data transformation and manipulation. An on-line help facility is
included to give you a detailed description of all the STATMATE
commands.
Disk No 863
Program Title: STATMATE/PLUS version 1.3 (Disk 3 of 3)
PC-SIG version 1.1
This is the third disk of the STATMATE package, disks #861-63, and
contains the five-part documentation for the program. Please refer to
disk #861 for full information.
Usage: Statistics Analysis
System Requirements: 128K memory and two disk drives.
Suggested Registration: $50.00
File Descriptions:
SMPART1 DOC Documentation, part 1.
SMPART2 DOC Documentation, part 2.
SMPART3 DOC Documentation, part 3.
SMPART4 DOC Documentation, part 4.
SMPART5 DOC Documentation, part 5.
README How to get started.
PC-SIG
1030D E Duane Avenue
Sunnyvale Ca. 94086
(408) 730-9291
(c) Copyright 1987,88 PC-SIG Inc.
╔═════════════════════════════════════════════════════════════════════════╗
║ <<<< Disk #863 STATMATE/PLUS (Disk 3 of 3) >>>> ║
╠═════════════════════════════════════════════════════════════════════════╣
║ To copy the documentation to your printer, Type: ║
║ PRINTDOC (press enter) ║
╚═════════════════════════════════════════════════════════════════════════╝
STATMATE/PLUS
(A Statistical Package)
Version 1.3
Shareware User's Guide
August 1, 1988
The Software Hill
1857 Apple Tree Lane
Mountain View, Ca. 94040
Copyright (C), 1987
COPYRIGHT
The STATMATE/PLUS statistical application package is
copyrighted (C) 1987, by The Software Hill. All rights
reserved. Non-registered users are granted a limited license
to use this product on a trial basis, and to copy the program
for trial use by others subject to the following limitations:
1. STATMATE/PLUS is distributed in unmodified
form, complete with documentation.
2. No fee, charge or other consideration is
requested or accepted.
3. STATMATE/PLUS is not distributed in
conjunction with any other product.
If you intend to use STATMATE/PLUS on a regular basis, please
show your support by registering the program for a nominal
fee. Registration information is given below. Commercial,
business or governmental use by non-registered users is
prohibited.
If you are interested in multiple copies for use at work,
site and corporate licenses are available. Please write for
information.
TRADEMARKS
STATMATE/PLUS is a trademark of The Software Hill.
TABLE OF CONTENTS
REGISTRATION
USER-SUPPORTED SOFTWARE
PRODUCT SUPPORT
INTRODUCTION TO STATMATE/PLUS
FEATURES
OPERATION
STATMATE Example
Command Summary
INTERACTING WITH STATMATE
Commands and Subcommands
Commands
Subcommands
STATMATE DATABASE CONCEPTS
STATMATE Database and Directory
Organization and Manipulation of Data
Variable Names
Ways of Referencing Data--Variables and Cases
ENTERING DATA INTO THE SYSTEM
External Data Entry--Files
Creating ASCII Files
COMMAND DESCRIPTIONS
CROSSTABS
ERASE
EXECUTE
EXIT
GIVE
HELP
INPUT
LET
PLOT
PRINT
QUERY
REGRESSION
REMARK
SET
SHOW
STATISTICS
TTEST
WHEN-ELSE-END
WRITE
APPENDIX A: Computation Methods
APPENDIX B: Sample Data
APPENDIX C: Installation and Miscellanea
APPENDIX D: STATMATE Size Limitations
APPENDIX E: HELP
APPENDIX F: Suggested Diskette Organization
APPENDIX G: Invoice and Order Form
References
STATMATE/PLUS Shareware User's Guide--Copyright (C), 1987
REGISTRATION
------------
Feedback on STATMATE/PLUS is an important part of developing
a useful and successful software package. Please share your
impressions, suggestions and comments by writing to us.
STATMATE/PLUS is distributed as User-Supported Software. You
are encouraged to try the program and share it with your
friends and colleagues as long as:
1. STATMATE/PLUS is distributed in unmodified
form, complete with documentation.
2. No fee, charge or other consideration is
requested except by The Software Hill.
3. STATMATE/PLUS is not distributed in
conjunction with any other product.
If you use STATMATE/PLUS on a regular basis, please show your
support by registering the program. You may register by
sending a check or money order for $45 to:
The Software Hill
1857 Apple Tree Lane
Mountain View, Ca. 94040
Registered users will receive (1) notification of major
releases of STATMATE/PLUS, newsletters and other information
supporting the package and (2) two sort utilities and a high
resolution scatter plot utility (for use with CGA, EGA or
Hercules graphics cards). Program disks are not included in
the registration fee. Note that when you register you receive
a $10 coupon applicable to additional purchases.
USER-SUPPORTED SOFTWARE
User-supported software is a means for users to receive
quality software while directly supporting software authors.
It is based on the ideas that:
a. Immediate assessment of the package through
hands-on use lets the user determine whether the
package satisfies the user's personal application
needs and operational tastes.
b. Creation and support of independent personal
computer software is important to, and desired
by, interested application users.
c. Copying of programs should be encouraged,
rather than restricted, to promote the widest
possible development, interest and support
by the application's community.
Under the concept of user-supported software, anyone may
request a copy of STATMATE/PLUS by sending a blank, DOS
formatted, 5-1/4 inch diskette to The Software Hill along
with a self-addressed, postage-paid return mailer. You will
receive STATMATE and program documentation on the disk by
return mail.
The program carries a notice suggesting registration, but
registration is strictly voluntary.
You are encouraged to copy and distribute STATMATE,
regardless of whether or not you register, for private and
non-commercial use of others.
PRODUCT SUPPORT
As of this date, we are able to provide support to registered
owners by mail or phone. A bulletin board system is not
presently available. In order to answer questions or
comments, please provide as complete a description of your
problem as possible. Include a description of your
configuration (hardware and operating system), steps taken
before a problem occurrence and any printed material
identifying the problem.
The latest version of STATMATE/PLUS may always be found on
the PC SIG bulletin board system. As the popularity of the
program grows, it will be found on your local bulletin
boards, shareware distribution disks and a number of other
computer environments.
MACHINE REQUIREMENTS
STATMATE/PLUS requires 128K of memory (RAM). It is best
operated from a hard disk but may be operated from two
5-1/4 inch floppies. See the appendix for suggestions on
tailoring STATMATE to your system. It operates under DOS
version 2.0 or higher. STATMATE may be used on a PC, XT, AT
or compatible. There is no dependence on the type of
terminal used, whether monochrome, composite or color. Any
terminal type is satisfactory.
Introduction
This guide briefly describes some of the capabilities of
STATMATE/PLUS. Enough descriptive material is given in this
guide so that you should be able to understand the
capabilities of the package and put STATMATE to use in your
applications. For registered owners, a complete guide to
STATMATE/PLUS is available for $35.
Probably the best place to start is by reading over the
material in this guide. When you are ready to try the
program, read the section on operating STATMATE. Try the
example discussed there and read appendix F regarding the
suggested organization of STATMATE program files on disk.
Once you have completed the example, try the package with the
EXECUTE command on the DEMO file. After starting STATMATE and
giving the required three-character ID, enter the following in
response to the command prompt:
EXECUTE DEMO
This will cause STATMATE to run through a sequence of
commands. A description of what is happening is given as the
program proceeds through the commands. The demonstration
will pause and give you the chance to read what is happening
at your own pace.
The command summary given later indicates commands which will
operate in the demonstration program.
FEATURES
A brief summary of some of the many STATMATE operational
features is given below.
-Operational Features-
* Help facility
* Data extraction from external files
* Placement of output on results files
* User named variables
* Default names for variables
* Data transformation and manipulation
* Display of selected data
* Missing or not applicable data values
* Multiple user operation
* Database maintenance operations
* Data selection by specified conditions
* ASCII output files to other applications
The analytic features available with the STATMATE package are
given below.
-Statistical Features-
* Elementary statistics        * Scatter plots
* Cross tabulations            * Histograms
* T-Test                       * Correlation
* Curvilinear regression       * Random Number Generation
* Arithmetic operations        * Distribution functions
* Multiple regression          * Group statistics
* One-way ANOVA                * Two-way ANOVA
* Control chart calculations   * Nonparametric methods
* Nonlinear regression         * Polynomial regression
* Data recoding
OPERATION
Operation of STATMATE is begun by entering:
SMATE
You will be prompted with the message:
ENTER ID-
The program expects a three-character identification, made up
of upper or lower case letters*, as a response. The
characters are used as a database identification. Your
initials are usually the simplest ID to use. For example,
ENTER ID-XYZ
In this case XYZ is used as an ID and XYZ is the identified
database. Use of another ID would identify another database.
This mechanism allows you to create databases for different
purposes, for example, one might belong to your data, and
another to a colleague. (IMPORTANT: Each time you supply a
different ID, STATMATE creates an empty database file
corresponding to the ID you supplied. These files may be
large. It is best to use the same ID each time you use
STATMATE or you will quickly exceed your disk capacity. See
the use of databases in section STATMATE DATABASE CONCEPTS.)
After you provide the ID, the program will issue the prompt:
Command:
At this point, a STATMATE command must be entered in order to
continue the operation of STATMATE. A carriage return must
be entered at the completion of each line of input. After
each command is completed, another 'Command:' prompt will be
issued. Entering EXIT will terminate the program.
* By supplying a fourth character, q or Q, with the ID, the
shareware banner output after entering the ID is suppressed.
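The ID rule described above can be sketched in Python. This is only an illustration, not STATMATE's own code: the function name parse_id is hypothetical, and two points are assumptions not stated in this guide, namely that upper and lower case letters name the same database and that any other response is rejected.

```python
def parse_id(response):
    """Split an ENTER ID- response into (database_id, quiet).

    Hypothetical helper: three letters name the database, and an
    optional fourth character q or Q suppresses the shareware banner.
    """
    text = response.strip()
    quiet = False
    if len(text) == 4 and text[3] in "qQ":
        quiet = True
        text = text[:3]
    if len(text) != 3 or not (text.isascii() and text.isalpha()):
        raise ValueError("ID must be three letters, optionally followed by q or Q")
    # Assumption: case is folded, so xyz and XYZ select the same database.
    return text.upper(), quiet


print(parse_id("XYZ"))
print(parse_id("abcq"))
```

Reusing one ID, as the IMPORTANT note above recommends, corresponds to always passing the same three letters.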
STATMATE EXAMPLE
The following sample illustrates the operation of STATMATE:
DATA U.S. POPULATION:YEAR,URBAN,RURAL
1860 , 6.217 , 25.227
1870 , 9.902 , 28.656
1880 , 14.130 , 36.026
1890 , 22.106 , 40.841
1900 , 30.160 , 45.835
1910 , 41.999 , 49.973
1920 , 54.158 , 51.553
1930 , 68.955 , 53.830
1940 , 74.924 , 57.246
1950 , 88.927 , 61.770
The data used in the example problem is the U. S. Population
data for rural and urban areas from 1860 through 1950. The
data is on a file called USPOPDEM.DAT (a portion of the
USPOP.DAT file supplied with STATMATE) and consists of three
fields: year, urban population and rural population;
population data is in millions.
The example of STATMATE operation is shown after this
paragraph, which describes each step. In the example, the ENTER ID- prompt is
answered with ABC. This establishes the user's ABC database
as the database which is to be used in the example. Next,
the ERASE command clears all data from the database. Data is
then extracted from the data file USPOPDEM.DAT by the INPUT
command. In the INPUT command, the clause OMIT 2 causes the
first two fields of the data file to be ignored. The clause
KEEP 1 causes the third field, rural population, to be
extracted. Hence, only one variable, that is, one column or
field of data, containing rural population, is placed in the
database. STATMATE only operates on data placed in its
database. There are 10 data points or cases for the
extracted variable as is reported by STATMATE as 10 CASES at
completion of the INPUT command. Initially this variable has
the name #1 assigned to it. The GIVE NAME command is used to
give #1 an alternate name, RURALPOP. The STATISTICS command
is applied to RURALPOP to derive the simple statistics
produced by the command. The program is terminated by the
EXIT command.
SMATE
ENTER ID-ABC
Command: ERASE #1 THRU END
10 VARIABLES ERASED
Command: INPUT USPOPDEM.DAT OMIT 2,KEEP 1
1 VARIABLE INPUT AT #1
10 CASES
Command: GIVE NAME #1,RURALPOP
1 ATTRIBUTES MODIFIED
Command: STATISTICS RURALPOP
VARIABLE: RURALPOP
10 CASES
0 MISSING
CENTRAL TENDENCY SPREAD DISTRIBUTION
------------------- ------------------------ --------------------
MEAN 45.10 STD. DEV. 12.17 MINIMUM 25.23
VARIANCE 148.15 MAXIMUM 61.77
RANGE 36.54
COEFF. VAR. 0.27
SUMMATIONS HIGHER MOMENTS
----------------------- --------------------
TOTAL 450.96 SKEWNESS -0.37
SUM SQ 21669.59 KURTOSIS 1.93
SUM SQ(DEV) 1333.37
Command: EXIT
END STATMATE
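The session above can be checked independently with a short Python sketch. It is not STATMATE's implementation: select_fields and summary are hypothetical helpers, the OMIT/KEEP handling is a simplified reading of the INPUT command, and the n - 1 divisor for the variance is an assumption chosen because it agrees with the listed figures. Skewness and kurtosis are left out because this guide does not state the formulas used.

```python
import math

# USPOPDEM.DAT as listed earlier: year, urban and rural population (millions).
USPOPDEM = """\
DATA U.S. POPULATION:YEAR,URBAN,RURAL
1860 , 6.217 , 25.227
1870 , 9.902 , 28.656
1880 , 14.130 , 36.026
1890 , 22.106 , 40.841
1900 , 30.160 , 45.835
1910 , 41.999 , 49.973
1920 , 54.158 , 51.553
1930 , 68.955 , 53.830
1940 , 74.924 , 57.246
1950 , 88.927 , 61.770
"""


def select_fields(text, omit, keep):
    """Mimic OMIT/KEEP: skip `omit` fields of each case, keep the next `keep`."""
    cases = []
    for line in text.splitlines()[1:]:  # the first line is the DATA header
        fields = [float(field) for field in line.split(",")]
        cases.append(fields[omit:omit + keep])
    return cases


def summary(values):
    """Compute the figures reported by STATISTICS, except the higher moments."""
    n = len(values)
    total = sum(values)
    mean = total / n
    sum_sq_dev = sum((v - mean) ** 2 for v in values)
    variance = sum_sq_dev / (n - 1)  # assumed n - 1 divisor
    std_dev = math.sqrt(variance)
    return {
        "MEAN": mean,
        "STD. DEV.": std_dev,
        "VARIANCE": variance,
        "MINIMUM": min(values),
        "MAXIMUM": max(values),
        "RANGE": max(values) - min(values),
        "COEFF. VAR.": std_dev / mean,
        "TOTAL": total,
        "SUM SQ": sum(v * v for v in values),
        "SUM SQ(DEV)": sum_sq_dev,
    }


# INPUT USPOPDEM.DAT OMIT 2,KEEP 1 leaves only the rural population.
rural = [case[0] for case in select_fields(USPOPDEM, omit=2, keep=1)]
stats = summary(rural)
for name, value in stats.items():
    print(f"{name:12s}{value:10.2f}")
```

Rounded to two decimals, these values agree with the STATISTICS output shown above; for example, SUM SQ(DEV) divided by 9 gives the VARIANCE of 148.15.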
Command Summary
Commands are the basic form of communicating with STATMATE.
A summary of the commands available for both STATMATE/PLUS
and STATMATE is shown below.
Summary of STATMATE/PLUS Commands
Command Name   Description             Modifiers/Descriptors
------------   ---------------------   ------------------------
BREAKDOWN      Statistics by groups    none
CHART          X- and R-Charts         TYPE, CONTROL, KCENTER,
                                       HMARK, VRANGE, TITLE,
                                       HLABEL, KSIGMA, DISPLAY,
                                       HFILLER, VPOSITION,
                                       VLABEL
COMPUTE        Fit and forecast        TYPE
CORRELATE      Pair-wise correlation   none
CROSSTABS      Two-way cross tabs      none
CURVE          Ten curve fits          TABLE, BEST, EQUATION
CUSUM          Cusum chart             TARGET, DISPLAY, RESET,
                                       HMARK, HFILLER, VRANGE,
                                       VPOSITION, TITLE,
                                       VLABEL, HLABEL
EDIT           Database editing        none
ELSE           Reverse WHEN condition  none
END            Remove WHEN condition   none
EXIT           End STATMATE operation  none
ERASE          Remove database data    none
EXECUTE        Multiple command entry  none
GIVE           Give data attributes    none
HELP           Provide command help    none
HISTOGRAM      Histogram               TITLE, RANGE, BARS,
                                       VPOSITION
INPUT          Data input              KEEP, OMIT
KOLMOGOROV     Kolmogorov tests        DISTRIB, SPARAM, UPARAM
LET            Arithmetic operations   none
NONLINEAR      Nonlinear regression    MODEL, MAXITER, REPORT,
                                       TYPE, CONVERGE
ONEWAY         One-way ANOVA           METHOD, ALPHA
ONPARAM        Nonparametric methods   none
PLOT           Scatter plot            TITLE, HRANGE, VRANGE,
                                       HPOS, VPOS, HLAB, VLAB
POLYNOMIAL     Polynomial regression   TABLE,
PRINT          Data display            none
QUERY          Database status         none
RCORRELATION   Rank correlation        TEST
RECODE         Recode crosstab data    none
REGRESSION     Multiple regression     TABLE, INTERCEPT, DURBIN
REMARK         Allows documentation    none
SET            Set output type         COPY
SHOW           Show internal status    none
STATISTICS     Summary statistics      TABLE
STEPWISE       Stepwise regression     TABLE, MAXSTEP, FORCE,
                                       FENTER, FREMOVE, METHOD
TNPARAM        2-way nonparam ANOVA    TEST
TTEST          Student T-test          none
TWOWAY         Two-way ANOVA           DESIGN
WHEN           Select database view    none
WRITE          Put variables to file   none
INTERACTING WITH STATMATE
Commands and Subcommands
The primary means by which you communicate with STATMATE is
through commands. For example, the PRINT command tells
STATMATE to list data which is specified in the command. In
some instances, a command requires the use of a subcommand to
enter additional information about the operation requested by
the command. In STATMATE, subcommands are available, for
example, with the CHART, CUSUM, STEPWISE and PLOT commands.
Information related to specific commands and subcommands is
found in the command description portion of this manual. If
you need help while actually entering a command or
subcommand, an on-line HELP command is available.
Commands
A command contains a command name and a reference to the
variables that it is to operate on. For example,
STATISTICS URBANPOP
calculates statistics for the variable URBANPOP, representing
urban population data. A command name may be abbreviated by
using the first three characters of its name. That is,
STA URBANPOP is an acceptable command.
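The three-character abbreviation rule can be illustrated with a small Python lookup. This is a sketch, not STATMATE's parser: resolve_command is a hypothetical name, the command list is copied from the Command Summary, and accepting only the full name or exactly its first three characters is an assumption (the guide does not say whether longer partial names such as STAT are accepted).

```python
# Command names copied from the Command Summary.
COMMANDS = [
    "BREAKDOWN", "CHART", "COMPUTE", "CORRELATE", "CROSSTABS", "CURVE",
    "CUSUM", "EDIT", "ELSE", "END", "EXIT", "ERASE", "EXECUTE", "GIVE",
    "HELP", "HISTOGRAM", "INPUT", "KOLMOGOROV", "LET", "NONLINEAR",
    "ONEWAY", "ONPARAM", "PLOT", "POLYNOMIAL", "PRINT", "QUERY",
    "RCORRELATION", "RECODE", "REGRESSION", "REMARK", "SET", "SHOW",
    "STATISTICS", "STEPWISE", "TNPARAM", "TTEST", "TWOWAY", "WHEN", "WRITE",
]


def resolve_command(word):
    """Return the full command name for a full name or its 3-letter form."""
    key = word.upper()  # commands may be typed in upper or lower case
    for name in COMMANDS:
        if key == name or key == name[:3]:
            return name
    raise ValueError(f"unknown command: {word}")


print(resolve_command("STA"))
```

The abbreviation works because no two commands in the summary share the same first three characters.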
Some commands operate on several variables. For example, the
scatter plot command, PLOT, requires variables for plotting
on the vertical and horizontal plot axes. In order to help
distinguish between the uses of variables, some commands use a
keyword which essentially divides the list of variables into
easily identifiable pieces. For example, in
PLOT SALES,WAGES ON YEAR
ON is a keyword that separates the list of vertically plotted
variables, SALES and WAGES, from the horizontally plotted
variable, YEAR.
Another type of item used in a command is a modifier. A
modifier supplies some additional information about how the
command should operate. For example,
PLOT WAGES ON YEAR,TITLE='PLOT OF WAGES BY YEAR'
TITLE is a modifier indicating that the title shown should
appear on the scatter plot. Modifiers are followed by an
equal sign (=) and are separated from the list of variables
and from one another by commas. For example,
CURVE WAGES ON YEAR,TABLE=ANOVA,EQUATION=LIN,QUA
Some modifiers contain option names following the equal sign
to designate which options are to be selected for the
modifier. For example, TABLE=FIT,PARAMETERS indicates the
FIT and PARAMETERS options are selected for the TABLE
modifier. As with command names, only the first three
characters of a modifier or option name need be used.
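One way to picture the modifier syntax is a small Python parser that separates the variable list from the modifiers. This is a hedged sketch and not how STATMATE itself reads commands: split_commas and parse_modifiers are hypothetical names, and treating a comma-separated item without an equal sign as a further option of the preceding modifier is an assumption based on the TABLE=FIT,PARAMETERS example.

```python
def split_commas(text):
    """Split on commas that are not inside single quotes."""
    parts, current, quoted = [], [], False
    for ch in text:
        if ch == "'":
            quoted = not quoted
        if ch == "," and not quoted:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


def parse_modifiers(command):
    """Return (head, modifiers) for one command line.

    head is the text before the first NAME= item (command name, variables
    and keywords); modifiers maps each modifier name to its option values.
    """
    head, modifiers, last = [], {}, None
    for item in split_commas(command):
        if "=" in item and not item.startswith("'"):
            name, value = item.split("=", 1)
            last = name.strip().upper()
            modifiers[last] = [value.strip().strip("'")]
        elif last is None:
            head.append(item)  # still inside the variable list
        else:
            modifiers[last].append(item.strip("'"))  # assumed extra option
    return ",".join(head), modifiers


print(parse_modifiers("CURVE WAGES ON YEAR,TABLE=ANOVA,EQUATION=LIN,QUA"))
```

Under these assumptions the CURVE example yields the variable part CURVE WAGES ON YEAR, a TABLE modifier with the option ANOVA, and an EQUATION modifier with the options LIN and QUA.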
Although some commands contain modifiers, the modifiers do
not need to be entered. If a modifier is not entered it
assumes a default value. For example, if the TITLE modifier
is not given for PLOT, the title is assumed to be blank. In
most instances, you cannot specify your own default values;
however, with a few commands, STEPWISE and PLOT, for example,
you are allowed to change the default settings. This is a
very useful feature. For example, in the event that you use
the same scatter plot title for much of your work, the plot
title can be set and not changed until necessary.
Subcommands
Subcommands are used to specify additional operations for a
command. Subcommands are available only with a few of the
commands. Usually a subcommand gives you an alternate way
of entering information about modifiers. Other subcommands
permit the values of current modifier settings to be
displayed or saved.
Entering and Using Commands and Subcommands
STATMATE will prompt you with a message to enter a command or
subcommand. In the case of a command, the prompt is:
Command:
To cause STATMATE to execute a command, you only need enter
the command as in:
Command: PLOT URBANPOP ON YEAR
In some instances, the text of a command may be so long that
it will not fit on a single line. Commands may be continued
by breaking them after a comma. That is, anywhere that a
comma is allowed in a command is a point at which the command
may be broken. When a line is terminated by a comma,
STATMATE will ask you for an additional line of text with the
following prompt:
Continue:
For example,
Command: PLOT URBANPOP ON YEAR,
Continue: TITLE='URBAN POPULATION FROM 1790 TO 1950'
As many as 250 characters of text may be included in a
command.
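The continuation rule can be sketched as a loop that keeps reading lines while the text ends in a comma. The function name read_command is hypothetical, and two details are assumptions: the pieces are joined without any added separator, and the 250-character limit is applied to the joined text.

```python
MAX_COMMAND_LENGTH = 250  # limit quoted in the text above


def read_command(lines):
    """Join input lines into one command; a trailing comma asks for more."""
    parts = []
    for line in lines:
        parts.append(line.strip())
        if not parts[-1].endswith(","):
            break  # no trailing comma, so the command is complete
    command = "".join(parts)
    if len(command) > MAX_COMMAND_LENGTH:
        raise ValueError("command exceeds 250 characters")
    return command


typed = iter([
    "PLOT URBANPOP ON YEAR,",
    "TITLE='URBAN POPULATION FROM 1790 TO 1950'",
])
print(read_command(typed))
```

Passing an iterator lets the same source of lines be reused for the next command once the current one is complete.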
Command names, keywords and other elements of a command may
be entered in either upper or lower case letters.
In the case of the PLOT and STEPWISE commands, for example,
it is not necessary to enter the information following the
command name in response to the command prompt. An alternate
method that some users may find easier is
available. For example, if only the command name PLOT is
entered, STATMATE will then prompt you for the names of the
variables to be plotted. After the names have been entered,
you will be placed in the subcommand mode. With the
subcommands, you may enter any of the PLOT modifiers. When
you have entered the modifiers you need, entering the
CONTINUE subcommand causes the PLOT command to be executed
and a plot to be produced. The following is a sequence of
commands and subcommands used to enter the PLOT command
discussed above.
Command: PLOT
Enter Y-axis variables: URBANPOP
Enter X-axis variable: YEAR
Subcommand: TITLE='URBAN POPULATION FROM 1790 TO 1950'
Subcommand: CONTINUE
Note that TITLE and CONTINUE are subcommands. CONTINUE
causes STATMATE to leave the subcommand mode and execute the
PLOT command.
STATMATE DATABASE CONCEPTS
STATMATE Database and Directory
An important concept in the operation of STATMATE is the
STATMATE database. This is a file containing the data on
which the program operates, and is generated by STATMATE.
Data is placed on the file by the INPUT command and may be
displayed by the PRINT command.
A user identifies which of his databases he wants to operate
on when responding to the ID prompt issued by the program. The ID
is associated with a database. Usually one database is
sufficient for most applications. Since these database files
can occupy a lot of disk space, some care should be taken
in utilizing a number of different IDs. An empty database is
created each time a new ID is specified. A database is
reused when the ID given corresponds to an existing database.
When a database is created, it is large enough to accommodate
10 variables with as many as 250 cases per variable. The
STATMATE install program, SMINSTLL, may be used to change the
size of the databases created by STATMATE. With a database
of 10 variables, the user may store, modify, manipulate and
repeatedly use up to 10 variables for analysis. This ability
reduces the re-entry of data for each analysis.
In situations where several users work with STATMATE, they
may want to create their own databases by using a different
ID. This feature provides additional security when multiple
users operate the package.
Associated with the database is a program generated directory
which contains the names of the variables and other
attributes of the data contained in the database. The
directory may be examined and manipulated by such commands as
GIVE, QUERY and ERASE.
Organization and Manipulation of Data within the Database
Data is organized by variables within the database.
Variables represent a collection of data on which some
analytic or manipulative operation is to be performed, for
example, a variable could be the number of houses built each
year for 15 years. Each database has a maximum number of
variables that can be placed in it, and a maximum number of
data values that can be placed in any variable. Initially, a
database is empty, but contains space for the maximum number
of variables and data values.
Variables in a database are assigned to specific database
locations in a sequential fashion, starting with variable
1 and proceeding to the highest numbered variable. For
example, the first variable is assigned to database location
1, the second to location 2, and so on. Variables are given
predefined names according to their ordered location in the
database. Each predefined name is prefixed with a # and is
followed by a number. For example, #7 is the name of the
seventh variable. The user may assign his own alternate
names to the predefined variable names, as well.
Data is usually brought into a database with either the INPUT
or EDIT commands. The INPUT command allows you to enter data
contained on external files. EDIT allows you to enter data
from the keyboard. When bringing variables into a database
with the INPUT command, variables are assigned to successive
locations in a database. For example, assume that the first
five database locations already hold variables. If two
variables are input, then the new variables will reside in #6
and #7. New variables are generally placed after existing
variables. One form of the INPUT command allows you to place
variables at specific locations in a database. See the INPUT
command for further details on the entry of data into a
database. Use of the LET command permits data to be moved
from one variable to another. Variables may be removed from
a database by erasing them, using the ERASE command.
Variable Names
A particularly useful feature of STATMATE is that it allows
you to assign names to variables. Thus, it is possible to
assign more meaningful names to variables. For example,
SALES, AGE, URBANPOP, ACCIDENTS, QUARTER, INVTRY1982, etc.
are valid names. These names consist of from 1 to 10
characters. The first character must be alphabetic but the
remaining characters may be either numeric or alphabetic.
Alphabetic characters must be in upper case letters. All
variables have alternate predefined names of the form #n
where n is the location number of the variable. For example,
#4 and #9 are predefined variable names. The GIVE command
allows alternate names to be given to a variable. The
variable #4 might, for example, be given the alternate name
AGE. Either #4 or AGE could be used to reference the same
variable.
A special variable, #0, is available, which provides the data
values 1, 2, 3, .... This variable does not occupy any space
in the database. It may not be given an alternate name.
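The naming rules in this section can be expressed as two regular expressions in Python. This is an illustrative sketch only: classify_variable and sequence are hypothetical names, and the patterns encode just the rules stated above (1 to 10 characters, alphabetic first character, upper case letters, and the #n form).

```python
import re

USER_NAME = re.compile(r"[A-Z][A-Z0-9]{0,9}")  # 1 to 10 characters, upper case
PREDEFINED = re.compile(r"#[1-9][0-9]*")       # #1, #2, ... by location


def classify_variable(name):
    """Classify a variable reference by the naming rules in this section."""
    if USER_NAME.fullmatch(name):
        return "user"
    if name == "#0":
        return "sequence"  # special variable supplying 1, 2, 3, ...
    if PREDEFINED.fullmatch(name):
        return "predefined"
    return "invalid"


def sequence(cases):
    """Values supplied by #0 for the given number of cases."""
    return list(range(1, cases + 1))


for name in ["INVTRY1982", "#4", "#0", "1SALES"]:
    print(name, classify_variable(name))
```

The sketch does not check whether a #n reference exceeds the number of variables the database can hold; that limit depends on how the database was created.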
Ways of Referencing Data--Variables and Cases
Data is generally regarded as arranged in rows and columns by
STATMATE. Although this generalization is not entirely
adequate, it is at least a starting point for understanding
how STATMATE deals with sets of data. A simple example of
such an arrangement is the U.S. population data discussed
earlier and contained in data file USPOP.DAT (data in
millions):
                <-----Columns----->
          Year    Urban Pop    Rural Pop
          1790      0.202        3.728
          1800      0.322        4.986
    ^     1810      0.525        6.714
    |     1820      0.693        8.945
    |     1830      1.127       11.739
   Rows   1840      1.845       15.224
    |     1850      3.544       19.648
    |     1860      6.217       25.227
    V     1870      9.902       28.656
          1880     14.130       36.026
           ...       ...          ...
The data is arranged so that each column represents some item
(variable) that is to be examined in detail. For example, it
might be of interest to determine the average value of the
item. A row represents an element that all of the
columnar items share; in this case, the
corresponding population for a given year.
In STATMATE the data in a column to be examined or studied is
called a variable. The data in a row is referred to as a
case or observation. In the above example we have:
          <------- Variables------>
          Year    Urban Pop    Rural Pop
    ^     1790
    |     1800
  cases
    |
    |
    v
Year, urban population and rural population are variables.
The data 0.202 and 3.728 represent the case data for 1790.
Note that there is no reason why the dates themselves could
not be considered as a variable and the individual years as
belonging to a case.
ENTERING DATA INTO THE SYSTEM
External Data Entry--Files
Files are the basic way of entering data into STATMATE.
(Another method for entering data, with the STATMATE/PLUS
EDIT command, is available). Data files have names within
the Operating System, CP/M, MS DOS or PC DOS. Any of these
Operating Systems allows a variety of characters for names;
however, STATMATE only recognizes file names and file name
extensions which are composed of alphabetic characters or
numbers. File names and extension names must begin with an
alphabetic character. ABCDE, uspop.DAT, HIST1980.DT and
MYDATA are examples of valid file references within STATMATE.
STOCK/82.DAT (/ is a special character and is not allowed),
SAL$DATA ($ is a special character) and YEARS.DATA (the file
extension DATA is too long) are examples of invalid
references. Your Operating System may allow these names, but
STATMATE will not accept them if they are used in commands
where a file name is needed. Use of your Operating System's
renaming capabilities will solve any difficulties with file
names. A file name may be prefixed with a disk drive
identifier as in A:XYZDATA.DAT. Data files are entered into
the STATMATE database with the INPUT command.
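The file naming rules above can be checked with a regular expression. The Python sketch below is an illustration, not STATMATE's own test: is_valid_file_reference is a hypothetical name, and the limit of eight characters for the file name is an assumption taken from the usual DOS convention, since the text only states that the extension DATA is too long.

```python
import re

# Optional drive letter, a name, and an optional extension of up to three
# characters. Each part starts with a letter and contains only letters and
# digits. The eight-character name limit is an assumed DOS convention.
FILE_REF = re.compile(
    r"(?:[A-Za-z]:)?"
    r"[A-Za-z][A-Za-z0-9]{0,7}"
    r"(?:\.[A-Za-z][A-Za-z0-9]{0,2})?"
)


def is_valid_file_reference(ref):
    """Return True if `ref` follows the rules described in this section."""
    return FILE_REF.fullmatch(ref) is not None


for ref in ["uspop.DAT", "A:XYZDATA.DAT", "STOCK/82.DAT", "YEARS.DATA"]:
    print(ref, is_valid_file_reference(ref))
```

A file rejected by such a check can be renamed with the operating system before it is used in an INPUT command.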
There are two ways of creating files for input. Only one of
these will be described here, the use of ASCII files. ASCII
data files may be created using a text editor program, such
as WORDSTAR or EasyWriter. An ASCII file can be easily
printed or listed.
Creating ASCII Data Files
Often the simplest way of producing an ASCII file is to use a
text editor. All that is required for preparing a STATMATE
input file with a text editor is a basic understanding of how
to arrange the text representing the data. This is a simple
task and it is addressed below.
The following shows the contents of a file prepared using a
text editor.
DATA MY ENERGY USE - 1981,ELEC CONSUMPTION--JAN TO DEC
400,250,390,
280,250,305,
235,220,230,
330,450,525
If the file were called ENERGY.DAT, it could be read with the
STATMATE INPUT command by entering:
INPUT ENERGY.DAT KEEP 3
See the INPUT command description for additional information.
The following rules apply to preparing a data file:
1. The word DATA (uppercase or lowercase) must
appear as the first word of the first line.
Blanks or any other characters may not appear
before DATA. Any comments may appear after
DATA on the line. MY ENERGY ... is a comment
describing the data in the above example.
2. Data items follow on each subsequent line.
Each item is separated from another by a comma
or blank.
3. All lines must be followed by a carriage
return and line feed, including the last line.
4. Alphanumeric information must be enclosed in
single quotes (') or begin with an alphabetic
character. If alphanumeric data contains an
embedded blank, the data must be surrounded by
quotes. The MOTOR.DAT file in Appendix B is
an example of using alphanumeric data in a
DATA file.
5. Data must be arranged on a case by case basis.
It should be noted that although many word processors and
editors will produce files which STATMATE will read, there
are some word processors and editors which place extra
characters at the beginning of a file. Users of EasyWriter,
for example, must use the TRANSFER utility to produce a
proper ASCII file. EasyWriter users must also use the ENTER
key to generate carriage returns after each line. If extra
characters are placed before DATA, STATMATE will issue a
message that the file is invalid.
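The file rules above can be sketched as a short reader. This
is an illustration under stated assumptions, not STATMATE's
own logic: the function name is hypothetical, and grouping the
items into records of a fixed number of fields (3 for the
ENERGY.DAT example) is an assumption about how records are
delimited.

```python
import re

# A token is either a quoted string (which may contain blanks) or a run of
# characters containing no comma and no whitespace.
_TOKEN = re.compile(r"'[^']*'|[^,\s]+")


def read_data_file(text: str, fields_per_record: int):
    """Return (comment, records) for the text of an ASCII DATA file."""
    lines = text.splitlines()
    first = lines[0] if lines else ""
    # Rule 1: DATA must be the first word, with nothing in front of it.
    if first[:4].upper() != "DATA" or (len(first) > 4 and not first[4].isspace()):
        raise ValueError("file is invalid: DATA must begin the first line")
    comment = first[4:].strip()

    items = []
    for line in lines[1:]:
        # Rule 2: items are separated by commas or blanks.
        for token in _TOKEN.findall(line):
            if token.startswith("'"):
                items.append(token[1:-1])    # Rule 4: quoted alphanumeric item
            elif token[0].isalpha():
                items.append(token)          # Rule 4: starts with a letter
            else:
                items.append(float(token))   # numeric item

    # ASSUMPTION: every record holds exactly fields_per_record items.
    if len(items) % fields_per_record != 0:
        raise ValueError("item count is not a multiple of the record size")
    records = [items[i:i + fields_per_record]
               for i in range(0, len(items), fields_per_record)]
    return comment, records


if __name__ == "__main__":
    energy = (
        "DATA MY ENERGY USE - 1981,ELEC CONSUMPTION--JAN TO DEC\n"
        "400,250,390,\n280,250,305,\n235,220,230,\n330,450,525\n"
    )
    print(read_data_file(energy, 3))
```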
COMMAND DESCRIPTIONS
Each command is described in detail in this section. In
order to determine which commands are executable in each of
the three STATMATE packages, observe that at the top of each
page a subtitle lists the packages. When a package name
appears there, then the package includes the command
described.
An important part of the command description is the syntax or
format of the command. When describing the syntax of the
command the following general rules are followed:
1. Uppercase characters should be entered as
shown.
2. For brevity, only the first 3 characters
of commands, modifier names, etc. need
be entered. If additional characters are
entered, they should always match the
name in every position given from the
first to last character entered. For
example, STATISTICS is matched by STA or
STATI but STATS does not match it.
3. Lowercase is used to describe the type of
entry that you must provide. For
example, in a description var1 might
represent a variable name.
4. Punctuation is entered as shown.
5. An ellipsis (...) means repeat the
previous item as needed. For example,
num,..., where num represents a number,
indicates either that a single number or
a list of numbers with separating commas
is acceptable.
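Rule 2 can be sketched as a prefix test. This is only an
illustration: the function name is hypothetical, and rejecting
entries shorter than three characters (for keywords that have
at least three) is an assumption drawn from the wording of the
rule.

```python
def matches_keyword(entry: str, keyword: str, min_chars: int = 3) -> bool:
    """Return True if entry is an acceptable abbreviation of keyword."""
    # ASSUMPTION: at least min_chars characters are required, unless the
    # keyword itself is shorter than that (for example ON).
    required = min(min_chars, len(keyword))
    # Every character entered must match the keyword position by position.
    return len(entry) >= required and keyword.startswith(entry)


if __name__ == "__main__":
    for entry in ["STA", "STATI", "STATS"]:
        print(entry, matches_keyword(entry, "STATISTICS"))
```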
In addition to a command's syntax, each command is further
clarified with descriptive material and detailed examples
concerning its use. Input to STATMATE that is entered by the
user is shown in boldface in the detailed examples.
Remember to enter a carriage return after entering a command
line.
CROSSTAB
Command: CROSSTAB
Purpose: Used to cross tabulate data on two variables.
Syntax:
A. CROSSTAB classavar ON classbvar
where
classavar = classification variable A
classbvar = classification variable B
Defaults:
none
Syntax Examples:
CROSSTAB AGE ON WEIGHT
CROSSTAB SMOKING ON SEX
CROSSTAB COMPANY ON SALES
CROSS TAXES ON COUNTY
Description:
CROSSTAB performs a two-way cross tabulation on two variables.
Each variable contains data which divides the cases into
classes. For example, the data below were collected on the
season and observed color of a botanical specimen. SEASON
might be thought of as defining classes for each of the four
seasons and COLOR as defining five classes: BLUE, GREN
(green), RED, BLCK (black) and YELL (yellow).
SEASON COLOR
------ -----
FALL BLUE
FALL GREN
WINT GREN
SUMR RED
SPNG BLCK
FALL GREN
SUMR BLUE
SPNG YELL
In a two-way cross tabulation, the data might appear as:
                       COLOR
              BLUE  BLCK  GREN  RED   YELL
              ----  ----  ----  ----  ----
        SPNG         1                 1
SEASON  SUMR   1                 1
        FALL   1           2
        WINT               1
Classes may be defined for either numeric or alphanumeric
data. CROSSTAB tabulates data for the class A variable by
rows and the class B variable by columns. As many as 20
different classes may be contained in a variable. If a
variable contains more than 20 classes, the STATMATE/PLUS
RECODE command may be used to reduce the total number of
classes.
CROSSTAB produces a table of statistics. Each table entry,
or cell, contains a frequency, row percentage, column
percentage, and total percentage for the corresponding
classes. Percentages and totals for rows and columns are
reported. Fisher's exact probabilities for the special case
of 2x2 tables are calculated when there are 32 or fewer
entries in the table. These probabilities are used to produce
Tocher's correction. If a 2x2 table is produced with more
than 32 entries, Yates' correction to the Chi-square is
output. The Chi-square statistic is produced for other
tables. A Phi value, a measure independent of the number of
cross tabulation entries, is produced for all tables.
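For the 2x2 case, the two quantities named above can be
sketched with the standard textbook formulas. This is an
illustration, not a statement of how STATMATE computes them:
the function names are hypothetical, the hypergeometric
formula gives the probability of one table with fixed margins,
and the step from these probabilities to Tocher's correction
is not shown. The cells are a, b (first row) and c, d (second
row).

```python
from math import comb


def fisher_table_probability(a: int, b: int, c: int, d: int) -> float:
    """Hypergeometric probability of this 2x2 table given its margins."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def yates_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square for a 2x2 table with Yates' continuity correction."""
    n = a + b + c + d
    # The correction subtracts n/2 from |ad - bc|, never going below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    return n * corrected ** 2 / margins


if __name__ == "__main__":
    print(fisher_table_probability(3, 1, 1, 3))
    print(yates_chi_square(3, 1, 1, 3))
```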
If the number of columns and rows in the output table exceeds
a reasonable page width and length, the table is output in
conveniently divided sections. Basically, the entire table
is output in sections from left to right and from top to
bottom. Classes are sorted in ascending order before they
are output, so classes placed in the output table are easy to
find.
Detailed Example
The following example uses CROSSTAB to perform a two-way
cross tabulation on the class data contained in FACTORA and
FACTORB. Note that some of the class data in FACTORB is
defined by a -2 value, and note that any values may be used
to identify a class. Class identifiers, numeric values in
this instance, are printed before each row and above each
column. FACTORB has four classes: -2, 4, 6 and 8. FACTORA
has three classes: 2, 3 and 4.
Command: PRINT FACTORA,FACTORB
FACTORA FACTORB
----------- -----------
3.00 4.00
2.00 -2.00
3.00 6.00
2.00 8.00
2.00 6.00
4.00 8.00
2.00 4.00
2.00 8.00
2.00 4.00
2.00 8.00
4.00 6.00
3.00 -2.00
Command: CROSSTAB FACTORA ON FACTORB
CLASS A VARIABLE: FACTORA
CLASS B VARIABLE: FACTORB
12 CASES
0 MISSING
0 NOT TABULATED
FACTORB
COUNT |
ROW PCT | | | | |
COL PCT | | | | | ROW
TOT PCT | -2.00| 4.00| 6.00| 8.00| TOTAL
FACTORA | | | | |
|--------|--------|--------|--------|
2.00 | 1 | 2 | 1 | 3 | 7
| 14.3% | 28.6% | 14.3% | 42.9% | 58.3%
| 50.0% | 66.7% | 33.3% | 75.0% |
| 8.3% | 16.7% | 8.3% | 25.0% |
|--------|--------|--------|--------|
3.00 | 1 | 1 | 1 | 0 | 3
| 33.3% | 33.3% | 33.3% | 0.0% | 25.0%
| 50.0% | 33.3% | 33.3% | 0.0% |
| 8.3% | 8.3% | 8.3% | 0.0% |
|--------|--------|--------|--------|
4.00 | 0 | 0 | 1 | 1 | 2
| 0.0% | 0.0% | 50.0% | 50.0% | 16.7%
| 0.0% | 0.0% | 33.3% | 25.0% |
| 0.0% | 0.0% | 8.3% | 8.3% |
|--------|--------|--------|--------|
COLUMN 2 3 3 4 12
TOTAL 16.7% 25.0% 25.0% 33.3% 100.0%
CHI SQ: 3.7381 DF: 6
PHI = 0.558
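The counts and summary statistics in this example can be
checked with a short sketch. It assumes the Pearson Chi-square
and takes Phi as the square root of Chi-square divided by the
number of cases; those formulas are assumptions, although they
do reproduce the CHI SQ, DF and PHI values printed above. The
function names are hypothetical and missing values are not
handled.

```python
from collections import Counter
from math import sqrt


def crosstab(class_a, class_b):
    """Return (row classes, column classes, table of counts), classes sorted."""
    rows = sorted(set(class_a))
    cols = sorted(set(class_b))
    counts = Counter(zip(class_a, class_b))
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


def chi_square_and_phi(table):
    """Return (chi-square, degrees of freedom, phi) for a table of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, sqrt(chi2 / n)   # ASSUMPTION: phi = sqrt(chi2 / n)


FACTORA = [3, 2, 3, 2, 2, 4, 2, 2, 2, 2, 4, 3]
FACTORB = [4, -2, 6, 8, 6, 8, 4, 8, 4, 8, 6, -2]

if __name__ == "__main__":
    rows, cols, table = crosstab(FACTORA, FACTORB)
    chi2, df, phi = chi_square_and_phi(table)
    print(table)
    print(f"CHI SQ: {chi2:.4f}  DF: {df}  PHI = {phi:.3f}")
```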
ERASE
Command: ERASE
Purpose: Erases data variables from the STATMATE database.
Syntax:
A. ERASE vname THRU END
B. ERASE vname
where
vname = name of first variable where data is
to be erased
Syntax Examples:
ERASE #6 THRU END
ERASE WAGES THRU END
ERASE SALES
Description:
In order to erase data from the database, the ERASE command
must be used. For the variables specified, the command
resets the number of cases to zero, removes any assigned
name, and resets the missing value to 1.0E30. Variables
where data is erased become numeric variables. Data may be
erased variable by variable, or from a given variable through
the end of the database. Erasing data does not affect the
amount of data stored in the database (since erased data is
simply ignored), but does affect the use of the INPUT command
(See the INPUT command for details).
Detailed Example:
Below, the ERASE command is used to erase data from the
database. Erasing begins with the variable TOTALPOP, #4, and
proceeds to the last variable, #10.
Command: ERASE TOTALPOP THRU END
7 VARIABLES ERASED
EXECUTE
Command: EXECUTE
Purpose: Allows execution of commands from an EXECUTE file.
Syntax:
EXECUTE fname
where fname is the name of a file
Syntax Examples:
EXECUTE MYCOMFIL
EXECUTE A:CMDFILE.EXE
Description:
The EXECUTE command allows commands placed on an EXECUTE file
to be executed by STATMATE. This feature simplifies the
entry of often used sequences of commands. A discussion of
this feature may be found in the introductory information on
the use of EXECUTE files.
An EXECUTE command may not appear in an EXECUTE file.
When an EXECUTE command is entered, the Command: prompts
that STATMATE usually issues are replaced by !Command: until
the commands on the file are completely processed. The
Command: prompt is then issued again to indicate that all
commands on the file have been processed and that STATMATE
is ready for you to enter a command.
Detailed Example:
In the following example, EXECUTE refers to the EXECUTE file
REGSTUDY.EXC, found on the C disk (directory).
Command: EXECUTE C:REGSTUDY.EXC
EXIT
Command: EXIT
Purpose: Returns the user to the Operating System.
Syntax:
EXIT
Syntax Examples:
EXIT
Description:
The EXIT command returns the user to the Operating System.
Detailed Example:
EXIT is entered in response to a STATMATE command request to
return to the Operating System.
Command: EXIT
END STATMATE
GIVE
Command: GIVE
Purpose: Assigns user defined attributes (names, missing values,
and data types) to variables.
Syntax:
A. GIVE NAME aname,vname1,...
where
aname = user or predefined variable name at which
assignment begins
vname1 = user name for aname
vname2 = user name for next variable
after aname
B. GIVE MISSING aname,missv1,...
where
aname = user or predefined variable name at which
assignment begins
missv1 = missing value to be assigned to
variable aname
missv2 = missing value to be assigned to next
variable after aname
C. GIVE TYPE aname,atype1,...
where
aname = user or predefined variable name at which
assignment begins
atype1 = A (alphanumeric) or N (numeric) to be
assigned to variable aname
atype2 = type to be assigned to next variable
after aname
Syntax Examples:
GIVE NAME #1,SALES,STOCK,TIME
GIVE NAME #4,AGE,SCORE
GIVE NAME AGE,NEWAGE
GIVE MISSING #1,200.0,200.0,-1000.0,'NA'
GIVE MISSING QUARTER,'Q?'
GIVE MISSING AGE,0.0,0.0
GIVE TYPE #15,A,N,A,N,N
Description:
The GIVE command allows a specific name, missing value or
data type to be assigned to a variable. Initially, each
variable has a default name, missing value and data type
until you change these attributes with GIVE. Recall that
each database contains a fixed number of variables, even if
you have not used INPUT to bring any data into the database.
Hence, GIVE can be used to modify variable attributes before
data is placed in a variable. GIVE is often used for just
this purpose immediately before using INPUT.
A variable name (user assigned name) is used as an alternate
name to the predefined name given to every variable. A
missing value of 1.0E+30 is assigned to every numeric
variable, if it is not changed by GIVE MISSING. A missing
value of blanks is assigned to every alphanumeric variable,
if it is not changed by GIVE MISSING. Alphanumeric missing
values must be enclosed in single quotes ('). The MISSING
attribute may not be changed once data is placed in the
variable; the ERASE command must be used first before
assigning a new MISSING attribute to a variable. The data
type may be numeric (N) or alphanumeric (A). In order to
place alphanumeric data in the database with INPUT, you must
make the appropriate variable an alphanumeric type.
Detailed Example:
GIVE NAME is used to assign the names YEAR and AGE to #3 and
#4, respectively, then GIVE MISSING is used to assign missing
values of 0 and -1 to YEAR and AGE.
Command: GIVE NAME #3,YEAR,AGE
2 ATTRIBUTES MODIFIED
Command: GIVE MISSING YEAR,0,-1
2 ATTRIBUTES MODIFIED
HELP
Command: HELP
Purpose: Displays descriptive information regarding the use of
all commands.
Syntax:
HELP cname
Syntax Examples:
HELP
HELP CURVE
HELP REGRESSION
Description:
The HELP command displays descriptive information regarding
the use of commands currently implemented. This command
provides a reminder of available commands and formats when
working interactively at the terminal. If HELP is followed
by the name of a command, specific help for the command is
displayed. If HELP is given without a command name, a list
of commands is given along with some general information
about the package.
Help information is contained on a file with the name
IFHELP.TXT, which can be modified by the user with a text
editor.
Detailed Example:
The following illustrates the output of the HELP command.
Only a part of the output is shown.
Command: HELP
---COMMANDS---
COMPUTE CORRELATE ELSE END ERASE
EXECUTE GIVE HELP INPUT LET
PLOT PRINT QUERY REMARK SET
STATISTICS WHEN WRITE
... (more follows but is not shown)
INPUT
Let's look at the record.
Al Smith (1938)
Command: INPUT
Purpose: Allows entry of data into the STATMATE database
from external files.
Syntax:
A. INPUT fname descriptor1, descriptor2, ...
B. INPUT fname descriptor1, descriptor2, ... AT vname
where
fname = a file name
descriptor = one of the following field extraction
descriptors:
KEEP n
OMIT n
vname = variable name
Syntax Examples:
INPUT MYDATA KEEP 1
INPUT STOCKFILE KEEP 6, OMIT 4 AT MYSTOCK
INPUT POPFILE.DAT KEEP 2, OMIT 1,KEEP 1
INPUT B:XYZ KEEP 5 AT #20
INPUT DATAFILE KEEP 2, OMIT, KEEP
Description:
INPUT provides one way of entering data into the STATMATE
database. Data is read in (kept) or not read in (omitted)
from the specified file according to field descriptors given
in the command. The descriptors permit any field of a file
record to be selectively extracted and inserted into the
database as a variable. Data read from the file is placed
either beginning at the variable specified after the AT
keyword in the command, or at the start of the rightmost
block of erased or empty variables at the end of the
database.
STATMATE assumes that a file is record oriented. A record
consists of one or more successive fields. Each record must
contain exactly the same number of fields. A file contains
one or more records. See the introductory section for
specific details of the two file types (DATA and PROG) that
may be read by the INPUT command.
When INPUT is used to read such files, it must be told field
by field, for every field of a record, whether the field is
to be transferred to a variable in the database. The OMIT
and KEEP descriptors are used for this purpose. If several
successive fields are to be kept or omitted, a number may
follow OMIT and KEEP to indicate the number of such fields.
If a number does not follow a descriptor, a value of one is
assumed for the number of fields to which the descriptor
applies. The
total number of fields kept and omitted must be exactly the
number of fields in a record.
In terms of fields and records, a case corresponds roughly to
a record entry and a variable to all the entries in a
specific field.
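The effect of the KEEP and OMIT descriptors can be sketched as
follows. This is an illustration, not STATMATE's parser: the
function names are hypothetical, and the records in the usage
lines use placeholder values for every field except the first
and fourth.

```python
def parse_descriptors(spec: str):
    """Turn 'KEEP 1,OMIT 2, KEEP 1,OMIT 4' into one keep/omit flag per field."""
    flags = []
    for part in spec.split(","):
        words = part.split()
        if not words or words[0].upper() not in ("KEEP", "OMIT"):
            raise ValueError(f"bad descriptor: {part!r}")
        # A descriptor without a number applies to one field.
        count = int(words[1]) if len(words) > 1 else 1
        flags.extend([words[0].upper() == "KEEP"] * count)
    return flags


def extract_fields(records, spec: str):
    """Return one list of values (one variable) per kept field."""
    flags = parse_descriptors(spec)
    variables = [[] for keep in flags if keep]
    for record in records:
        # Fields kept plus fields omitted must equal the fields in a record.
        if len(record) != len(flags):
            raise ValueError("descriptors do not cover every field")
        kept = [value for value, keep in zip(record, flags) if keep]
        for variable, value in zip(variables, kept):
            variable.append(value)
    return variables


if __name__ == "__main__":
    # Placeholder records with 8 fields; only fields 1 and 4 are meaningful.
    records = [["AL", 0, 0, 64.0, 0, 0, 0, 0], ["AK", 0, 0, 0.4, 0, 0, 0, 0]]
    print(extract_fields(records, "KEEP 1,OMIT 2, KEEP 1,OMIT 4"))
```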
Fields extracted from a data file are placed in successive
variables in the database. The location of the first field
in which data is to be placed may be specified by using the
form containing the AT keyword. If this form is not used,
INPUT will place the first variable extracted from the file
at the start of the rightmost block of erased or empty
variables at the end of the database. For example, if the
last variable in the database is #25, and all variables from
#12 through #25 are erased, then #12 will be the first
variable to receive data from the INPUT command, when the AT
form is not used.
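The default placement rule can be sketched as a scan from the
last variable backwards. This is an illustration only: the
function name and the list-of-flags representation of the
database are assumptions made for the example.

```python
def default_input_position(erased_or_empty):
    """Return the 1-based number of the first variable in the rightmost
    block of erased or empty variables, or None if the last variable
    already holds data.

    erased_or_empty[k] is True when variable #(k+1) is erased or empty.
    """
    position = None
    for number in range(len(erased_or_empty), 0, -1):
        if not erased_or_empty[number - 1]:
            break             # reached a variable that holds data
        position = number     # the block extends at least this far left
    return position


if __name__ == "__main__":
    # 25 variables; #12 through #25 are erased, and so is #3 further left.
    flags = [False] * 25
    for number in [3] + list(range(12, 26)):
        flags[number - 1] = True
    print(default_input_position(flags))
```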
Detailed Example:
INPUT is used to extract data from the States and population
density fields of the MOTOR.DAT file given in Appendix B.
The file contains 8 fields, and the first and fourth fields
contain the data of interest. All variables from #4 through
the end of the database are assumed erased or empty. Since
the States field contains alphanumeric data, #4 is given the
alphanumeric attribute with the GIVE TYPE command prior to
using INPUT. When INPUT is executed, the two data fields are
placed in variables #4 and #5. Each field contains 50 values
or cases. WHEN is used to select the first five cases, and
PRINT is used to list the first five cases of the two
variables read into the database.
Command: GIVE TYPE #4,A
1 ATTRIBUTE MODIFIED
Command: INPUT MOTOR.DAT KEEP 1,OMIT 2, KEEP 1,OMIT 4
2 VARIABLES INPUT AT #4
50 CASES
Command: WHEN #0=1 THRU 5
5 OF 250 CASES
Command: PRINT #4,#5
#4 #5
----------- -----------
AL 64.00
AK 0.40
AZ 12.00
AR 34.00
CA 100.00
LET
Command: LET
Purpose: Allows simple arithmetic operations and other
manipulations on data.
Syntax:
A. LET vname = simpexp
B. LET vname = fcn(arg)
C. LET aname1 = aname2
where
vname = numeric variable name
simpexp = simple arithmetic expression
fcn = STATMATE function name
arg = list of arguments for the function
aname1 = alphanumeric variable name
aname2 = alphanumeric variable name
Syntax Examples:
LET NEWAGE = AGE-21
LET SALES = COST*AMTSOLD
LET WAGES = SALARY-TAXES
LET LAGSALES = LAG(SALES,1)
LET #4 = #0
LET LENGTH = SQRT(AREA)
LET NEWVAR = #2/3.0
LET RESPONSE = LOG10(EXPOSURE)
Description:
LET is a very useful command for performing arithmetic
manipulations on variables, and moving data from one variable
to another. In addition to the simple arithmetic operations,
several simple functions are available to deal with lagging
and other useful operations. Numeric and alphanumeric data
may be moved from one variable to another by using a simple
assignment of one variable to another.
The entry of LET NEWYEAR = YEAR + 1900 produces the
following:
NEWYEAR YEAR
------- ----
1979 79
1980 80
1981 81
1982 82
Other simple arithmetic operations that may be performed are:
A+B A-B A*B A/B
A+cnst A-cnst A*cnst A/cnst A**cnst
-A +A A -cnst +cnst cnst
where A and B represent any variable names, and cnst refers
to any numeric constant. The symbols +, -, * and / represent
their normal meanings, * indicating multiplication. The
symbol ** indicates the raising of a number to a power. Note
that A**-1 is the same as the reciprocal of A (1/A).
LET also accommodates a number of arithmetic functions.
The following functions are available:
Function Description
-------- -------------------------------------------
LAG Lag or shift time period data a specified
number of periods
LOG Logarithm (base e)
LOG10 Logarithm (base 10)
SQRT Square root of data
SAM Select a random sample
MOV Moving average
EXP Raise e to a power
NUM Produce a sequence of numbers
PRD Produce a repeated sequence of numbers
STP Produce a sequence of numbers at a given
increment
NOR Normally distributed random numbers
UNI Uniformly distributed random numbers
DNOR Normal distribution points
DEXP Exponential distribution points
DWEI Weibull distribution points
DCAU Cauchy distribution points
ABS Absolute value
CUM Cumulative sum
INT Integer or whole number
Functions have the form:
LET vname = fname(a1,a2,a3,...)
where
vname is a variable
a1, a2 and a3 are arguments
For example,
LET GROWTH = LOG(DOSAGE)
LET RESPONSE = NOR(HEIGHT,0,2.0)
The first argument of any function is always the name of the
variable to which the function is to be applied. The first
example causes the logarithm, base e, to be taken on all of
the data in the variable DOSAGE, and placed into the variable
GROWTH. In the second example, note that the NOR function
has three arguments. The second and third arguments
represent the mean and standard deviation of the normally
distributed random numbers to be generated.
Although it is always necessary to provide a variable as the
first argument of a function, the values of the variable are
not necessarily used by the function. For many functions, the
variable is used only to determine how many values are to be
generated by the function. For example, in the use of NOR
cited above, exactly as many normally distributed values are
produced as there are cases in HEIGHT. However, if a case in
HEIGHT is missing, a missing value will be assigned to the
corresponding case in the resulting variable. In fact, this
is true in general. That is, a missing value found in a
variable used on the right side of a LET expression produces
a missing value in the assignment variable.
Descriptions of the individual functions and their argument
lists are given below.
Function Description
----------- ----------------------------------------------
LAG(v,p) Lag v by p periods. See the NOTE below for
more on LAG.
Example:
LAG(YEAR,3) -- lag 3 periods
SAM(v,n) Select exactly n items randomly from v.
The values of the cases selected are placed
in the assignment variable. Items not
selected are marked with a missing value.
Example:
SAM(STATES,12) -- randomly select 12
items from STATES
MOV(v,n) Compute an n-term moving average. If v
contains the values a,b,c and d in the
first four cases, a two-term moving
average produces four cases in the
resulting variable: mv, (a+b)/2,
(b+c)/2, (c+d)/2, where mv is a missing
value.
Example:
MOV(SALES,4)
SQRT(v) Take the square root of v.
Example:
SQRT(X)
LOG(v) Compute the log of v using base e.
Example:
LOG(POPULATON)
LOG10(v) Compute the log of v using base 10.
Example:
LOG10(STEMSIZE)
EXP(v) Raise e to the power of the data in v.
Example:
EXP(WGT)
PRD(v,b,i,p) Produce a repeated sequence of numbers
beginning with b. Increment b by an
amount i exactly p times, and then repeat
the sequence beginning with b again.
Example:
PRD(SALES,1,1,12) -- 1,2,3,4,5,6,7,
8,9,10,11,12,1,2,3,...
STP(v,b,j,p) Produce the number b exactly p times, then
increment b by j and produce b+j exactly
p times. Continue adding j to the last
sequence of p numbers produced.
Example:
STP(RESP,4,1,2) -- 4,4,5,5,6,6,...
(4=start,1=jump,2=repeat)
NUM(v,b,s) Produce the sequence of numbers b, b+s,
b+2*s, b+3*s, ...
Example:
NUM(MONTH,1,2) -- 1,3,5,7,...
(1=start, 2=step)
NOR(v,m,s) Generate random numbers from a normal
distribution with a mean of m and standard
deviation of s.
Example:
NOR(DOSE,5.0,1.4) -- Normal random nos.
(mean=5, sd=1.4)
UNI(v,a,b) Generate random numbers from a uniform
distribution on the interval from a to b.
Example:
UNI(COST,20,44.5) -- Uniform random nos.
(left=20,rght=44.5)
DNOR(v,m,s) Compute the probability F(x), given that
F is a normal distribution with mean m
and standard deviation s. See the KOLMOGOROV
command.
Example:
DNOR(LOSS,-40.2,4)
DEXP(v,s) Compute the probability F(x), given that
F is an exponential distribution with mean s.
See the KOLMOGOROV command.
Example:
DEXP(DOSAGE,3.4)
DWEI(v,u,s) Compute the probability F(x), given that
F is a Weibull distribution with location
parameter u and scale parameter s. See the
KOLMOGOROV command.
Example:
DWEI(MACHINE,2.2,4.1)
DCAU(v,u,s) Compute the probability F(x), given that
F is a Cauchy distribution with median given
by the parameter u and the 1st quartile
given by u-s. See the KOLMOGOROV command.
Example:
DCAU(CONT,12.4,8.2)
ABS(v) Take the absolute value of v.
Example:
ABS(X)
CUM(v) Produce the cumulative sum of v. For example,
if 10, 20 and 15 are the first three cases of
v, the cumulative sum is 10, 30, 45.
Example:
CUM(COST)
INT(v) Take the integer portion of v. For example,
if the first two cases of v are 22.4 and
-15.8, then the result is 22 and -15.
Example:
INT(WGT)
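Several of the functions described above can be sketched in a
few lines each. These are illustrations that follow the
examples given in the descriptions, not STATMATE's own
routines: a variable is modeled as a Python list, None stands
in for a missing value, the first argument v supplies only the
number of cases for PRD, STP and NUM, and CUM and INT assume
that no case is missing.

```python
from itertools import accumulate
from math import trunc

MISSING = None  # ASSUMPTION: stand-in for STATMATE's missing value


def mov(v, n):
    """n-term moving average; the first n-1 cases are missing."""
    out = []
    for k in range(len(v)):
        window = v[k - n + 1:k + 1] if k >= n - 1 else [MISSING]
        out.append(MISSING if MISSING in window else sum(window) / n)
    return out


def cum(v):
    """Cumulative sum of v."""
    return list(accumulate(v))


def int_part(v):
    """Integer portion of each case, truncating toward zero."""
    return [trunc(x) for x in v]


def prd(v, b, i, p):
    """Repeated sequence b, b+i, ... with p values per repetition."""
    return [b + i * (k % p) for k in range(len(v))]


def stp(v, b, j, p):
    """Each value produced p times, then increased by j."""
    return [b + j * (k // p) for k in range(len(v))]


def num(v, b, s):
    """Sequence b, b+s, b+2*s, ..."""
    return [b + s * k for k in range(len(v))]


if __name__ == "__main__":
    print(mov([2, 4, 6, 8], 2))
    print(cum([10, 20, 15]))
    print(int_part([22.4, -15.8]))
    print(stp([0] * 6, 4, 1, 2))
```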
-------------------------------------------------------------
NOTE: The LAG function provides a very useful way of
examining relationships between one time period and another.
For example, assume that annual inventory and sales data are
available and that a comparison of a year's inventory to a
previous year's sales is to be made. A lag period of one
year is needed. A lag has the effect of advancing data from
earlier periods to recent periods by the specified number of
lag periods. Consider the following data, where INVLAG is a
variable created by using
LET INVLAG = LAG(INV,1)
Case (Year)   SALES   INV   INVLAG
-----------   -----   ---   ------
1975            40    115      ?
1976            55    135    115
1977            65    185    135
1978            90    140    185
1979            70    200    140
...            ...    ...    200
A comparison of SALES and INVLAG shows a simple relationship
for any year; SALES is about one-half the inventory figure
for the previous year. Ah, to be so lucky! Since data for
1974 did not exist, a missing value (shown as a ?) is
indicated. The concept of lagged variables is a particularly
useful one in curve fitting, forecasting and regression.
-------------------------------------------------------------
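The shift performed by LAG can be sketched as follows, using
the INV values for 1975 through 1979 from the note. This is an
illustration only: a variable is modeled as a Python list,
None stands in for the missing value shown as ?, and keeping
the result the same length as the input is an assumption of
the sketch.

```python
MISSING = None  # ASSUMPTION: stand-in for STATMATE's missing value


def lag(v, p):
    """Shift the cases of v forward by p periods, filling with missing."""
    shift = min(p, len(v))
    # ASSUMPTION: the result keeps the length of v, so the last `shift`
    # original values do not appear in it.
    return [MISSING] * shift + list(v[:len(v) - shift])


if __name__ == "__main__":
    inv = [115, 135, 185, 140, 200]
    print(lag(inv, 1))
```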
A LET operation sometimes involves missing values. For those
variables on the right side of the equal sign whose cases
contain a missing value, a missing value is created for the
corresponding cases in the left variable.
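The handling of missing values can be sketched for a simple
two-operand expression. This is an illustration only: a
variable is modeled as a Python list, None stands in for a
missing value, the function name is hypothetical, and the
sketch handles only variables of equal length or a variable
combined with a constant.

```python
import operator

MISSING = None  # ASSUMPTION: stand-in for STATMATE's missing value


def let(op, left, right):
    """Apply op case by case; right is a variable (list) or a constant."""
    if not isinstance(right, list):
        right = [right] * len(left)      # repeat a constant for every case
    if len(left) != len(right):
        raise ValueError("this sketch only handles variables of equal length")
    # A missing value in either operand gives a missing result for that case.
    return [MISSING if a is MISSING or b is MISSING else op(a, b)
            for a, b in zip(left, right)]


if __name__ == "__main__":
    year = [79, 80, MISSING, 82]
    print(let(operator.add, year, 1900))
```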
When a LET operation is performed, new cases or values are
formed for the variable appearing on the left of the equal
sign. The length of this variable, or the number of cases it
contains, can be affected by the operation. Generally the
effect is of little concern. In some instances, some insight
into how the length is determined may be useful.
In most instances, the length is found by determining the
longest variable involved in the operation. If a variable
does not appear on the right, as in the case of simply
assigning a constant, the length is the same as the length of
the left variable. If, in assigning a constant, the left
variable has a zero length or was erased or never used, its
length becomes the maximum length that can be used in the
database. If the variable to the left of the equal sign, the
assignment variable, has fewer cases than other variables,
its length is not increased. An already existing variable's
length can be increased only by erasing the variable and then
assigning data to it with LET.
Detailed Example:
LET is used to adjust the YEAR variable from the USPOP.DAT
file given in Appendix B. Instead of beginning with 1790,
the data is transformed to begin with the year 0 and
continues in intervals of 10. The new variable is ADJYEAR.
The name ADJYEAR was assigned earlier with the GIVE NAME
command to some variable.
Command: LET ADJYEAR = YEAR-1790
ADJYEAR MODIFIED
17 CASES
0 MISSING
PLOT
Command: PLOT
Purpose: Produces a plot at the terminal.
Syntax:
A. PLOT yvar1,yvar2,... ON xvar
B. PLOT yvar1,yvar2,... ON xvar,mod1,mod2,...
C. PLOT
D. PLOT ?
where
yvar1=name of variable to be plotted vertically
xvar=name of variable to be plotted horizontally
mod1=one of the following:
HRANGE=h1 THRU h2
VRANGE=v1 THRU v2
HRANGE=DATA
VRANGE=DATA
HPOSITIONS=number of spaces
VPOSITIONS=number of lines
TITLE='title'
VLABEL='vert-label'
HLABEL='horiz-label'
Defaults:
HRANGE = DATA (i.e., use minimum and maximum of xvar data)
VRANGE = DATA (i.e., use minimum and maximum of yvar data)
HPOS=50
VPOS=40
Syntax Examples:
PLOT SCORES ON AGE
PLOT SALES ON INVENTORY,HRANGE=0 THRU 100,
TITLE='SALES HISTORY'
Description:
PLOT produces a plot of one to five variables against another
variable in the form of a scatter plot. A number of
modifiers provide ways of titling, labeling, and sizing the
plot.
Multiple curves (variables) are plotted vertically. Vertical
and horizontal axes, scale, and legend information are added
to the plot.
If the ranges are not specified with HRANGE or VRANGE, the
range over which the data for either axis is to be displayed
is derived from the data.
The TITLE, VLABEL and HLABEL modifiers provide a way to add
title and axis descriptions to the plot. Note that the
descriptive information for these modifiers must be enclosed
by single quotes.
Physical sizing of a plot is aided by the HPOSITION and
VPOSITION modifiers. They permit the height and width of a
plot to be specified in terms of the number of lines and
characters per line. A maximum of 60 lines and 120
characters may be specified. These modifiers apply to the
portion of the plot exclusive of the axes and descriptive
information placed on it.
Plot points are shown as +, *, X, @ and # for the first
through the fifth curve plotted. If points are coincident,
only the point for the rightmost curve in the list is
plotted.
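The symbol assignment and the rule for coincident points can
be sketched with a small character grid. This is an
illustration, not STATMATE's plotting routine: the function
name is hypothetical, axes and labels are left out, None
stands in for a missing value, and scaling by rounding to the
nearest position is an assumption.

```python
SYMBOLS = "+*X@#"  # first through fifth curve


def scatter(curves, xs, hpos=50, vpos=40):
    """Return the plot body as a list of strings, top line first."""
    if len(curves) > len(SYMBOLS):
        raise ValueError("at most five curves can be plotted")
    points = [(x, y) for ys in curves for x, y in zip(xs, ys)
              if x is not None and y is not None]
    x_lo, x_hi = min(p[0] for p in points), max(p[0] for p in points)
    y_lo, y_hi = min(p[1] for p in points), max(p[1] for p in points)

    def scale(value, lo, hi, positions):
        # ASSUMPTION: map the data range linearly onto 0..positions-1.
        if hi == lo:
            return 0
        return round((value - lo) / (hi - lo) * (positions - 1))

    grid = [[" "] * hpos for _ in range(vpos)]
    # Later curves are drawn last, so a coincident point shows the symbol
    # of the rightmost curve in the list.
    for symbol, ys in zip(SYMBOLS, curves):
        for x, y in zip(xs, ys):
            if x is None or y is None:
                continue
            column = scale(x, x_lo, x_hi, hpos)
            line = vpos - 1 - scale(y, y_lo, y_hi, vpos)
            grid[line][column] = symbol
    return ["".join(row) for row in grid]


if __name__ == "__main__":
    for row in scatter([[1, 2, 3], [1, 5, 9]], [0, 1, 2], hpos=5, vpos=9):
        print("|" + row + "|")
```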
Before actually producing the plot, a pause occurs at the
terminal to allow positioning and placement of paper in the
printer. Once the paper is adjusted, depressing return will
cause the plot to appear. Don't forget to enter ctrl-P if
you want the output on the printer! See the SET command for
an alternate way of placing output on the printer.
If PLOT is used without specifying any variable names or
modifiers, you will be prompted for the variable names.
After entering the variable names, STATMATE will then prompt
you for subcommands. The subcommands allow you to change
modifier values. The following subcommands are available:
Subcommands Purpose
-------------------------------- ----------------------------
CONTINUE Execute PLOT or exit command
SAVE Save modifier default values
SHOW Display modifier values
HPOSITION = x Set HPOS modifier
VPOSITION = x Set VPOS modifier
HRANGE = x.x THRU x.x or DATA Set HRAN modifier
VRANGE = x.x THRU x.x or DATA Set VRAN modifier
TITLE = 'xxxx' Set TITLE modifier
HLABEL = 'xxxx' Set HLABEL modifier
VLABEL = 'xxxx' Set VLABEL modifier
where x.x is a decimal number, x is whole,
xxxx is a string of characters
Some examples of subcommands are:
HRANGE=30 THRU 200.5
TITLE='POPULATION DENSITY FOR 1980'
VPOSITION=30
SAVE
VRANGE=DATA
CONTINUE
VLAB='RESPONSE VALUES'
Entering CONTINUE in response to a subcommand prompt causes
STATMATE to produce the desired plot and return to command
prompting. If you used PLOT ? to enter the subcommand mode,
as explained below, CONTINUE will simply return STATMATE to
command prompting without executing the PLOT. SHOW displays
the current setting of the modifiers.
When a modifier is set to a particular value using a
subcommand, its value is only temporary. That is, when the
plot is produced, the value for the modifier will be the
temporary value that you assigned it. However, the next time
you use PLOT the modifier will revert to its original
permanent default value. You may change the default values
permanently by using the SAVE subcommand. The SHOW
subcommand displays the current values of the modifiers.
Setting the default values as permanent is a very useful
feature of STATMATE. It is particularly useful when you have
titles and labels that you continually use from PLOT to PLOT.
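The difference between temporary and saved modifier values can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not STATMATE code: the class name, its methods and the starting HPOS and VPOS values are all hypothetical.

```python
class PlotModifiers:
    """Hypothetical model of PLOT modifier handling (not STATMATE code)."""

    def __init__(self, defaults):
        self.defaults = dict(defaults)  # permanent values, kept between PLOTs
        self.current = dict(defaults)   # working values for the PLOT in progress

    def set(self, name, value):
        # A subcommand such as TITLE='...' changes only the working value.
        self.current[name] = value

    def save(self):
        # SAVE makes the working values the new permanent defaults.
        self.defaults = dict(self.current)

    def show(self):
        # SHOW reports the values the next plot would use.
        return dict(self.current)

    def finish_plot(self):
        # Once the plot is produced, unsaved changes are discarded.
        self.current = dict(self.defaults)


# Starting values are made up for the illustration.
mods = PlotModifiers({"TITLE": "", "HPOS": 50, "VPOS": 40})

mods.set("TITLE", "POPULATION DENSITY FOR 1980")
mods.finish_plot()
after_temporary = mods.show()["TITLE"]   # reverted to the old default

mods.set("TITLE", "POPULATION DENSITY FOR 1980")
mods.save()
mods.finish_plot()
after_save = mods.show()["TITLE"]        # kept, because SAVE was used

print(repr(after_temporary), repr(after_save))
```

Without SAVE the title falls back to its previous default after the plot; with SAVE it persists to the next PLOT.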
Entering PLOT ? allows you to examine and change default
settings of the modifiers without producing a plot. In this
case, you will not be prompted for variable names. The
default settings will be displayed and the subcommand prompt
will appear. Entering CONTINUE causes STATMATE to return to
command prompting without producing a PLOT.
Detailed Example:
The use of the PLOT command is illustrated by plotting
RURALPOP and YEAR data from the USPOP.DAT file found in the
appendix. Also plotted are the fitted, FITRURPOP, and
forecast, FORRURPOP, values obtained by fitting a linear
equation to RURALPOP with the STATMATE CURVE command and then
computing these two variables with the COMPUTE command. A
similar set of fitted and forecast data could be produced
using the STATMATE STEPWISE regression command. RURALPOP,
FITRURPOP and FORRURPOP are plotted against YEAR.
YEAR has been extended by modifying the USPOP.DAT file with
the years 1960 and 1970. The corresponding cases for
RURALPOP were assigned missing values. After the plot is
made, the PRINT command is used to display the values of
these variables.
The forecast values are printed as X on the plot. The actual
and fitted points at 1810 coincide so only the FITRURPOP
point (*) is printed. The command is long enough that the
first portion is terminated with a comma to permit
continuation of the remainder of the command after the
CONTINUE: prompt.
Command: PLOT RURALPOP,FITRURPOP,FORRURPOP ON YEAR,
Continue: VLABEL='POPULATION(MILLIONS)',
Continue: HLABEL='YEARS',
Continue: TITLE='EXAMPLE OF FORECASTING RURAL POPULATION'
19 CASES
19 MISSING
ADJUST PAPER, HIT RETURN
[Character plot titled EXAMPLE OF FORECASTING RURAL POPULATION:
RURALPOP, FITRURPOP and FORRURPOP plotted against YEAR.
Vertical axis: POPULATION(MILLIONS), scaled from -1.14 to 70.50.
Horizontal axis: YEARS, scaled from 1790.00 to 1970.00.]
LEGEND:
+ RURALPOP
* FITRURPOP
X FORRURPOP
PRINT
Command: PRINT
Purpose: Prints (lists) data for variables.
Syntax:
A. PRINT vname1,vname2,...
where
vname1 = name of variable to be printed
Defaults:
none
Syntax Examples:
PRINT DOW,MYSTOCK,#8,INVENTORY
PRINT STOCK,PRICES
PRINT #0,WEIGHT
Description:
The PRINT command is useful for printing (listing) data from
the database. Data is not actually sent to the printer; it is
displayed on the CRT. When variables with a dissimilar
number of cases are printed, an asterisk is printed in place
of cases which do not exist.
If the output of PRINT is to be placed on a printer, remember
to enter a ctrl-P before pressing the carriage return at the
end of the command. See the SET command for an alternate way
of placing output on the printer.
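The asterisk convention for variables with different numbers of cases can be illustrated with a short Python sketch. The function name and the shortened URBANPOP column are hypothetical, and the column width is only a guess at the layout; STATMATE's own formatting may differ.

```python
from itertools import zip_longest


def print_columns(columns):
    """Return listing lines; '*' marks cases that a shorter variable lacks."""
    names = list(columns)
    lines = [" ".join(f"{name:>11}" for name in names)]
    lines.append(" ".join("-" * 11 for _ in names))
    # zip_longest pads the shorter variables with None, case by case.
    for row in zip_longest(*columns.values(), fillvalue=None):
        cells = ["*" if value is None else f"{value:.2f}" for value in row]
        lines.append(" ".join(f"{cell:>11}" for cell in cells))
    return lines


listing = print_columns({
    "YEAR": [1790.0, 1800.0, 1810.0],
    "URBANPOP": [0.20, 0.32],  # hypothetical variable with one case fewer
})
print("\n".join(listing))
```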
Detailed Example:
The example below uses PRINT to list the first seven cases
of the three variables YEAR, URBANPOP and RURALPOP. The WHEN
command is first used to select a view of the first seven
cases.
WHEN #0=1 THRU 7
7 OF 250 CASES
Command: PRINT YEAR,URBANPOP,RURALPOP
YEAR URBANPOP RURALPOP
----------- ----------- -----------
1790.00 0.20 3.73
1800.00 0.32 4.99
1810.00 0.52 6.71
1820.00 0.69 8.95
1830.00 1.13 11.74
1840.00 1.84 15.22
1850.00 3.54 19.65
QUERY
Command: QUERY
Purpose: Provides summary information about database variables.
Syntax:
A. QUERY vname1,vname2,...
B. QUERY vname1 THRU vname2
where
vnamei = name of variable
Syntax Examples:
QUERY STOCKF,WAGES,AGE
QUERY LOANS,#5,#3,REGION
QUERY #1 THRU #36
Description:
The QUERY command displays information about a variable which
includes its name, predefined name, data type, number of
cases, number of missing cases and missing value code.
Detailed Example:
In this example, QUERY is used to display database
information about the first four variables in the database.
Command: QUERY #1 THRU #4
NAME   USER NAME    TYPE      # CASES   # MISSING   MISSING VALUE
----   ----------   -------   -------   ---------   -------------
# 1    YEAR         NUMERIC        17           0     1.0000E+030
# 2    URBANPOP     NUMERIC        17           0     1.0000E+030
# 3    RURALPOP     NUMERIC        17           0     1.0000E+030
# 4    STATES       ALPHA          50           0
REGRESSION
Command: REGRESSION
Purpose: Performs a multiple regression analysis.
Syntax:
A. REGRESSION yvar ON xvar1,xvar2,...
B. REGRESSION yvar ON xvar1,xvar2,... ,mod1,mod2,...
where
yvar=dependent variable name
xvari=independent variable name
modi=one of the following modifiers
TABLE=tabname,...
where tabname is PARAM, FIT, ANOVA, or
SEPARAM
TABLE=ALL
INTERCEPT=opt
where opt is YES or NO
DURBIN=opt
Defaults:
TABLE=PARAM,FIT
INTER=YES
DURBIN=NO
Syntax Examples:
REGRESSION AUTOSALES ON TREND,GNP,CPIDELTA,SEASON
REGRESSION SLPLOSS ON DOSAGE,AGE,STRESS,TABLE=ANOVA,PARAM
REGRESS RESPONSE ON #0,WEIGHT,INTERCEPT=NO
REGRESS DEMAND ON BINDEX,GASPRICE,DURBIN=YES
Description:
The REGRESSION command computes the linear regression
equation relating the dependent variable, yvar, to one or
more independent variables, xvar1, xvar2, ..., xvarn. It
also provides analysis of variance (ANOVA), standard error of
the parameters (SEPARAM) and fit (FIT) statistics. The
Durbin-Watson (DURBIN) statistic for testing serial
correlation among residuals may be calculated. An equation,
with or without an intercept, may be fit to the data. In
combination with the COMPUTE command, REGRESSION allows
residual computations to be derived from the fitted data.
The equation derived by the regression method is:
Y = a0 + a1 * X1 + a2 * X2 + a3 * X3 + ...
Y is the variable that we wish to determine as a combination
of the variables X1, X2, etc. The parameters a0, a1, etc.
are determined by the program using the method of least
squares.
The above equation is sometimes referred to as a model. The
form shown is the intercept model. If the a0 parameter is
removed from the model, it is referred to as the
no-intercept, or origin, model. The use of the word
intercept refers to the fact that the equation represents a
plane in space which intercepts the coordinate axes of the
space. The no-intercept form passes directly through the
origin of the coordinate system. REGRESSION computes either
form depending on the setting of the INTERCEPT modifier. The
above equation, with a0, is the one most often used in data
analysis and forecasting.
The TABLE modifier allows any or all of the four output
tables, PARAM, FIT, ANOVA, and SEPARAM, to be selected. The
ALL option selects all four tables. The INTERCEPT modifier
is used to select whether an intercept form of the regression
is to be used.
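As a rough illustration of the least squares computation, the following Python sketch fits the intercept and no-intercept models to the Hald cement data listed in Appendix B. It solves the normal equations directly, which is an assumption made for brevity; the guide does not say which numerical method STATMATE uses, and the function names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(xrows, y, intercept=True):
    """Fit Y = a0 + a1*X1 + ... (a0 omitted when intercept is False)."""
    rows = [([1.0] if intercept else []) + [float(v) for v in r] for r in xrows]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    sse = sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2
              for r, yi in zip(rows, y))
    return coef, sse


HALD = [  # ALUMINATE, SILICATE, FERRITE, DISILICATE, HEATPROD
    (7, 26, 6, 60, 78.5), (1, 29, 15, 52, 74.3), (11, 56, 8, 20, 104.3),
    (11, 31, 8, 47, 87.6), (7, 52, 6, 33, 95.9), (11, 55, 9, 22, 109.2),
    (3, 71, 17, 6, 102.7), (1, 31, 22, 44, 72.5), (2, 54, 18, 22, 93.1),
    (21, 47, 4, 26, 115.9), (1, 40, 23, 34, 83.8), (11, 66, 9, 12, 113.3),
    (10, 68, 8, 12, 109.4),
]
xrows = [r[:4] for r in HALD]
heat = [r[4] for r in HALD]

coef, sse = least_squares(xrows, heat)                     # INTERCEPT=YES
coef0, sse0 = least_squares(xrows, heat, intercept=False)  # INTERCEPT=NO

print("intercept model:", [round(c, 4) for c in coef], "SSE", round(sse, 4))
print("origin model:   ", [round(c, 4) for c in coef0], "SSE", round(sse0, 4))
```

In the intercept model the first coefficient is a0 and the rest follow the order of the independent variables. The error sum of squares of the origin model can never be smaller than that of the intercept model fitted to the same data.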
Detailed Example:
In the following example, the REGRESSION command is used to
find a regression equation in the intercept form for the
HALD.DAT data file shown in the appendix. The application
involves trying to relate the heat produced, HEATPROD, in
cement production to the amount of certain materials present
in the cement: ALUMINATE, SILICATE, FERRITE and DISILICATE.
Examination of the PARAMETER table output shows the relation
to be:
HEATPROD = 62.41 - 0.144*DISILICATE + 0.102*FERRITE
+ 0.510*SILICATE + 1.551*ALUMINATE
The TABLE=ALL option produces four output tables. The
command is long enough that the first portion is terminated
with a comma to permit continuation of the remainder of the
command after the CONTINUE: prompt.
Command: REGRESSION HEATPROD ON ALUMINATE,SILICATE,FERRITE,
Continue: DISILICATE,TABLE=ALL
13 CASES
0 MISSING
DEPENDENT VARIABLE
HEATPROD
INDEPENDENT VARIABLES
ALUMINATE SILICATE FERRITE DISILICATE
PARAMETERS TABLE
----------------
PARAMETER ESTIMATE
---------- ----------------
ALUMINATE 1.551102
SILICATE 0.510167
FERRITE 0.101909
DISILICATE -0.144061
INTERCEPT 62.405403
FIT TABLE
---------
R-SQUARE 0.9736 ADJ. R-SQUARE 0.9366 STD. ER. RES 2.4460
ANOVA TABLE
-----------
SOURCE DF SUM OF SQUARES MEAN SQUARE F-VALUE
---------- ---- -------------- -------------- ----------
REGRESSION 4 2667.89900 666.97460 111.4792
ERROR 8 47.86361 5.98295 0.0264
TOTAL 12 2715.76200 226.31350
P > |F| = 0.0000
STD. ER. OF PARAMETERS TABLE
----------------------------
PARAMETER ESTIMATE T-VALUE STD. ER. P > |T-VAL|
----------- ------------ ---------- ----------- -----------
ALUMINATE 1.551102 2.082660 0.74477 0.0758
SILICATE 0.510167 0.704858 0.72379 0.5037
FERRITE 0.101909 0.135031 0.75471 0.8964
DISILICATE -0.144061 -0.203175 0.70905 0.8448
REMARK
Command: REMARK
Purpose: Allows informational remarks to document a STATMATE
run.
Syntax:
REMARK rmrktxt
where
rmrktxt is any text representing a remark
Syntax Examples:
REMARK THE FOLLOWING JOB IS FOR JOHN DOE
REMARK IF I WERE YOU, WHO WOULD BE READING THIS SENTENCE?
Description:
The REMARK command does not cause anything to happen other
than for STATMATE to issue another Command: prompt. Its
purpose is to provide a way of documenting a STATMATE
terminal session. The text following the command may contain
any characters.
Detailed Example:
In this example REMARK is used to comment on the next use of
the STATISTICS command.
Command: REMARK COMPUTE STATISTICS FOR JULY 15 DATA COLLECT
Command: STATISTICS NEWDATA
SET
Command: SET
Purpose: Sets destination of STATMATE output and the random number seed.
Syntax:
A. SET COPY=opt
B. SET SEED=sno
where
opt = HARDCOPY, or OFF, or FILE fname
where fname = file name
sno = integer for random number seed
Defaults:
COPY=OFF
Syntax Examples:
SET COPY=HARDCOPY
SET COPY=FILE MYRESULTS.DAT
SET SEED=1832
Description:
When COPY=HARDCOPY, all output results from executed commands
are sent to the printer. The output does not appear on the
terminal. When COPY=OFF is used, output results appear on
the terminal. When COPY=FILE fname is used, the output of
commands is placed in the designated file, fname. This
file can be edited and displayed by other programs, including
text editors such as WORDSTAR. Unless changed by this
command, output results are sent to the terminal. Error
messages and command prompts are always directed to the
terminal. Whenever output is first directed to a results
file, it replaces already existing data on the file. Data
are not appended to the file.
With the SEED modifier, you may set the value of the seed
used for random number generation. The seed number
initializes the random number generators. Normally, the seed
is the same number each time you use STATMATE. Hence, it is
possible to generate the same random numbers each time you
use STATMATE. If this is undesirable, you may want to choose
different seeds with the SET command. Use any number from 1
to 30000.
Detailed Example:
In this example, SET is used to direct results to the printer
instead of the terminal.
Command: SET COPY=HARDCOPY
TURN ON THE PRINTER
SHOW
Command: SHOW
Purpose: Displays internal STATMATE settings.
Syntax:
A. SHOW
Defaults:
none
Syntax Examples:
SHOW
Description:
SHOW displays internal information that is set by the user.
For example, items displayed are the value of the random
number seed used in LET, the current setting of the SET
command, limits of the database being used and default limits
of any new database created.
The default information displayed is taken from information
supplied through the STATMATE install program, SMINSTLL (see
appendix C). This information includes default limits for
any database created by STATMATE and the location of internal
files. Because the install program can also assign program
groups to disk drives, the default group assignments are
displayed as well.
Detailed Example:
The following example illustrates the use of the SHOW
command.
Command: SHOW
ABC DATABASE SIZE
MAXIMUM VARIABLES: 50
MAXIMUM CASES: 250
HIGHEST USED VAR: 49
GENERAL
RANDOM NO. SEED: 8632
COPY ASSIGNMENT: TERMINAL
INSTALLED SETTINGS
DEFAULT DATABASE SIZE
MAXIMUM VARIABLES: 10
MAXIMUM CASES: 1000
DEFAULT GROUP ASSIGNMENTS
GROUP DISK DRIVE
-------------- ----------
NUCLEUS CURRENT
INTERNAL FILES CURRENT
STATISTICS CURRENT
REGRESSION CURRENT
MISCELLANEOUS CURRENT
STATISTICS
Command: STATISTICS
Purpose: Produces elementary statistics.
Syntax:
A. STATISTICS vname
where
vname=name of variable
Defaults:
none
Syntax Examples:
STATISTICS SALES
STAT WEIGHT
Description:
A number of useful statistical quantities such as totals,
averages, and variances may be computed for a single variable
from the STATISTICS command.
The statistics may be broken into the following five
categories:
Summation                      Spread
  Totals                         Standard Deviation
  Sum of Squares                 Variance
  Sum of Squares about Mean      Range
                                 Coeff. of Variation
Central Tendency               Distribution
  Average or Mean                Minimum and Maximum
Higher Moments
  Skewness
  Kurtosis (thickness of tail)
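The quantities in these categories can be sketched in Python as follows. The data are the six URBANPOP values for 1900 through 1950 from Appendix B. The n - 1 divisor for the variance and the moment-ratio definitions of skewness and kurtosis are assumptions: the guide does not state STATMATE's formulas, although these choices agree with the rounded figures printed for the same six cases in the WHEN--ELSE--END detailed example.

```python
import math


def elementary_statistics(values):
    """Return a dictionary of elementary statistics for one variable."""
    n = len(values)
    total = sum(values)
    mean = total / n
    sum_sq = sum(v * v for v in values)
    sum_sq_dev = sum((v - mean) ** 2 for v in values)
    variance = sum_sq_dev / (n - 1)          # assumed sample (n - 1) divisor
    std_dev = math.sqrt(variance)
    # Assumed definitions: ratios of central moments with divisor n.
    m2 = sum_sq_dev / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "total": total,
        "sum_sq": sum_sq,
        "sum_sq_dev": sum_sq_dev,
        "mean": mean,
        "std_dev": std_dev,
        "variance": variance,
        "range": max(values) - min(values),
        "coeff_var": std_dev / mean,
        "minimum": min(values),
        "maximum": max(values),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


URBANPOP_1900_1950 = [30.160, 41.999, 54.158, 68.955, 74.924, 88.927]
stats = elementary_statistics(URBANPOP_1900_1950)
for name, value in stats.items():
    print(f"{name:<12}{value:12.2f}")
```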
Detailed Example:
The STATISTICS command is used to compute a complete set of
statistics on 1964 motor death data from the MOTOR.DAT file
in the appendix. The variable MTRDEATHS represents the data.
Command: STATISTICS MTRDEATHS
VARIABLE : MTRDEATHS
50 CASES
0 MISSING
CENTRAL TENDENCY        SPREAD                      DISTRIBUTION
-------------------     ------------------------    --------------------
MEAN         926.76     STD. DEV.         889.32    MINIMUM        43.00
                        VARIANCE       790887.37    MAXIMUM      4743.00
                        RANGE            4700.00
                        COEFF. VAR.         0.96
                        MIDSPREAD         789.00
SUMMATIONS                  HIGHER MOMENTS
-----------------------     --------------------
TOTAL          46338.00     SKEWNESS        2.07
SUM SQ      81697687.00     KURTOSIS        8.48
SUM SQ(DEV) 38753481.12
TTEST
Command: TTEST
Purpose: Performs a comparison of two data sets using the T-Test.
Syntax:
TTEST vname ON classvar
where
vname=variable name
classvar=classification variable name
Syntax Examples:
TTEST DOSAGE ON SEX
TTEST VOTE ON POLPARTY
Description:
A comparison of the means of two data sets (samples) may be
performed using TTEST. The means of the two sets are computed
and compared using the Student T-test. Values are calculated
using assumptions of both equal and unequal variances. In
the case of unequal variances, the degrees of freedom are
calculated using Satterthwaite's approximation. A table of
means and standard deviations for each of the two sets is
produced.
The analysis is performed on the data contained in vname.
The variable classvar determines to which of the two sets or
classes the corresponding value in vname belongs. That is,
classvar contains codes that indicate the category or class
to which the data in vname belong. Although not strictly
necessary, class codes should be coded as integer (1, 2, ...)
values.
Detailed Example:
In the example, the PRINT command is used to first display
the data to be used in TTEST. CHEMRESULT contains the weight
of a substance after a chemical reaction as produced by 10
individual experiments. Two laboratories were involved and 4
experiments were performed at one lab and the remaining 6 at
the other lab. The laboratory performing the experiment is
coded in LABCODE. TTEST is used to determine if a
significant difference exists in the procedures used by the
laboratories in performing the experiment.
Command: PRINT CHEMRESULT,LABCODE
CHEMRESULT LABCODE
----------- -----------
40.50 1.00
42.30 1.00
50.80 2.00
46.00 1.00
47.80 1.00
42.20 1.00
49.00 2.00
36.90 1.00
44.40 2.00
39.00 2.00
Command: TTEST CHEMRESULT ON LABCODE
RESPONSE VARIABLE: CHEMRESULT
CLASS VARIABLE: LABCODE
10 CASES
0 MISSING
CLASS STATISTICS
LABCODE CASES MEAN STD. DEV. MINIMUM MAXIMUM
---------- ----- ----------- ------------ ---------- ----------
1 6 42.6167 3.8923 36.9000 47.8000
2 4 45.8000 5.2738 39.0000 50.8000
TWO SAMPLE T-TEST RESULTS
VARIANCE T-VALUE DF PROB > |T-VALUE|
-------- ----------- ------ ----------------
EQUAL -1.1055 8 0.3011
UNEQUAL -1.0340 6 0.3410
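The T-VALUE figures in this listing can be checked with a short Python sketch of the standard two-sample formulas: a pooled variance for the equal-variance case and Satterthwaite's approximation for the unequal-variance degrees of freedom. The function name is hypothetical, and the probabilities are left out because they require the t distribution. Satterthwaite's formula gives a fractional value of roughly 5.2 for these data, so the whole number in the DF column presumably reflects rounding inside STATMATE.

```python
import math


def two_sample_t(a, b):
    """Return (t_equal, df_equal, t_unequal, df_unequal) for samples a and b."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    var_a = sum((x - mean_a) ** 2 for x in a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (nb - 1)

    # Equal variances: pool the two sample variances.
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    t_equal = (mean_a - mean_b) / math.sqrt(pooled * (1 / na + 1 / nb))
    df_equal = na + nb - 2

    # Unequal variances: separate variance estimates, Satterthwaite's df.
    se_sq = var_a / na + var_b / nb
    t_unequal = (mean_a - mean_b) / math.sqrt(se_sq)
    df_unequal = se_sq ** 2 / ((var_a / na) ** 2 / (na - 1)
                               + (var_b / nb) ** 2 / (nb - 1))
    return t_equal, df_equal, t_unequal, df_unequal


CHEMRESULT = [40.5, 42.3, 50.8, 46.0, 47.8, 42.2, 49.0, 36.9, 44.4, 39.0]
LABCODE = [1, 1, 2, 1, 1, 1, 2, 1, 2, 2]

lab1 = [v for v, code in zip(CHEMRESULT, LABCODE) if code == 1]
lab2 = [v for v, code in zip(CHEMRESULT, LABCODE) if code == 2]

t_equal, df_equal, t_unequal, df_unequal = two_sample_t(lab1, lab2)
print(f"EQUAL   {t_equal:8.4f}  df = {df_equal}")
print(f"UNEQUAL {t_unequal:8.4f}  df = {df_unequal:.3f}")
```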
WHEN--ELSE--END
Command: WHEN--ELSE--END
Purpose: Permits data to be selected for analysis from the
database according to logical and relational conditions
Syntax:
A. WHEN relexp
B. WHEN relexp1 AND relexp2
C. WHEN relexp1 OR relexp2
where relexp is one of the following
vname relation c1
vname = c1, c2, ...
vname = c1 THRU c2
and
vname = variable name
relation = one of the relations: =, >, <, >=, or <=
c1 and c2 = numeric or alphanumeric constants
Defaults:
none
Syntax Examples:
WHEN AGE=13 THRU 30
WHEN COLOR='RED','BLUE','PINK'
WHEN WIDTH>25.55
WHEN STOCK<=4500.00
WHEN #0=40 THRU 80
WHEN AGE=21 THRU 29 AND WEIGHT=150 THRU 175
Description:
The WHEN command is a very useful command for analyzing data
according to specific selection criteria. For example,
consider a variable containing ages of individuals and
another containing their weights. It might be important to
find the average weight of this group for all members between
the ages of 50 and 65. The WHEN command can be used to
select data according to this criterion, and then the
STATISTICS command can be used to find the required average.
The ELSE command reverses, or negates, a condition
established by WHEN. The END command removes any conditions
established by WHEN or ELSE. Only one WHEN or ELSE may be in
effect at a time.
Simple forms of WHEN relational expressions include the
following relational operators:
Operator Meaning
-------- -------------
= Equality
>= Greater than or equal
<= Less than or equal
> Greater than
< Less than
For example, YEAR>1950 means select all cases for which the
year is greater than 1950.
Another useful conditional expression is illustrated by
AGE=15,20,25,26,27. In this example, all cases which have an
age corresponding to 15, 20, 25, 26 and 27 are selected.
A frequently useful conditional expression is illustrated by
VOLUME=405.44 THRU 650.25. This form of a conditional
expression permits a range of values to be selected. In this
example, all cases which have a volume from 405.44 to 650.25
are selected for use.
There are instances when it is desirable to compound
conditions with a logical "and" or "or". STATMATE allows two
simple expressions to be logically combined in such a manner
by using an AND or OR to separate simple expressions. For
example, AGE=15 THRU 40 AND WEIGHT>190.55 selects all cases
for which the age is from 15 to 40 and whose weight is
greater than 190.55.
The WHEN--ELSE commands operate on a case by case basis.
When data is selected with a WHEN, STATMATE effectively is
restricted to a window or view of your total data. The view
extends across all variables on a case by case basis. That
is, any variable in the database may be used, but only those
cases selected by the WHEN condition are used.
To better understand how the data view established by a WHEN
operates, consider the following variables after the WHEN
command WHEN AGE=30 THRU 70 has been used. The cases
designated by the small x are the part of the data view
established by the command.
      AGE   WEIGHT
      ---   ------
       22    140.5
  x    33    177.2
       15    105.4
       10     88.0
  x    54    188.3
  x    38    224.5
Although the selection was made on AGE, only the cases in
WEIGHT corresponding to those selected in AGE are accessible
when WEIGHT is used by an analytic command such as
STATISTICS. That is, STATISTICS WEIGHT would compute the
average of 177.2, 188.3 and 224.5. Incidentally, applying
ELSE now would put the cases corresponding to ages 22, 15,
and 10 in the view.
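One way to picture the data view is as a case-by-case mask shared by every variable. The Python sketch below uses the AGE and WEIGHT values from the table above; the function names are hypothetical, and the mask is only a model of the behavior described here, not STATMATE's internal representation.

```python
AGE = [22, 33, 15, 10, 54, 38]
WEIGHT = [140.5, 177.2, 105.4, 88.0, 188.3, 224.5]


def when_thru(values, low, high):
    """WHEN vname=low THRU high: True for each case inside the range."""
    return [low <= v <= high for v in values]


def else_view(mask):
    """ELSE: reverse the condition, case by case."""
    return [not selected for selected in mask]


def in_view(values, mask):
    """Cases of any variable that are visible through the current view."""
    return [v for v, selected in zip(values, mask) if selected]


view = when_thru(AGE, 30, 70)             # WHEN AGE=30 THRU 70
selected_weights = in_view(WEIGHT, view)  # what STATISTICS WEIGHT would see
mean_weight = sum(selected_weights) / len(selected_weights)

other_ages = in_view(AGE, else_view(view))  # cases in the view after ELSE

print(selected_weights, round(mean_weight, 2), other_ages)
```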
Some important aspects of the WHEN command should be
understood when an attempt is made to write into the database
with a COMPUTE, LET or RECODE command while WHEN is in
effect. Perhaps the best way to explain the effect is to
consider an example. Suppose we consider the four variables
AGE, WEIGHT, DOSAGE and NEWWEIGHT shown below. Assume
further that WHEN AGE=30 THRU 70 has been applied to the
database, as described earlier. The x symbols show the
selected cases.
      AGE   WEIGHT   DOSAGE   NEWWEIGHT
      ---   ------   ------   ---------
       22    140.5     15.2   empty
  x    33    177.2     11.4
       15    105.4     17.8
       10     88.0     10.4
  x    54    188.3     11.3
  x    38    224.5     21.5
Note that WEIGHT and DOSAGE contain data, but NEWWEIGHT does
not. Assume the following two LET commands are executed:
LET NEWWEIGHT=WEIGHT+100 and LET DOSAGE=DOSAGE+2.0. The
result is:
      AGE   WEIGHT   DOSAGE   NEWWEIGHT
      ---   ------   ------   ---------
       22    140.5     15.2   missing
  x    33    177.2     13.4     277.2
       15    105.4     17.8   missing
       10     88.0     10.4   missing
  x    54    188.3     13.3     288.3
  x    38    224.5     23.5     324.5
Note that new values have been calculated for cases in the
view, but not for cases outside of the view. Further, note
that because NEWWEIGHT did not have data, missing values are
placed at cases not within the view.
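The effect of writing into the database while a view is in force can be sketched in Python. The helper name and the use of None for a missing value are assumptions made for the illustration; STATMATE stores a numeric missing value code instead.

```python
MISSING = None  # stand-in for STATMATE's missing value code

AGE = [22, 33, 15, 10, 54, 38]
WEIGHT = [140.5, 177.2, 105.4, 88.0, 188.3, 224.5]
DOSAGE = [15.2, 11.4, 17.8, 10.4, 11.3, 21.5]

view = [30 <= age <= 70 for age in AGE]  # WHEN AGE=30 THRU 70


def let(existing, computed, mask):
    """Store computed values for cases in the view; leave the rest alone.

    A variable that has no data yet (existing is None) starts out with
    every case missing, so cases outside the view remain missing.
    """
    if existing is None:
        existing = [MISSING] * len(mask)
    return [new if selected else old
            for old, new, selected in zip(existing, computed, mask)]


# LET NEWWEIGHT=WEIGHT+100 (NEWWEIGHT had no data before the command)
new_weight = let(None, [round(w + 100, 1) for w in WEIGHT], view)

# LET DOSAGE=DOSAGE+2.0 (DOSAGE already held data)
dosage = let(DOSAGE, [round(d + 2.0, 1) for d in DOSAGE], view)

print(new_weight)
print(dosage)
```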
While the WHEN is in effect, some care must be exercised in
writing data into variables that have been used to select the
view. For example, WHEN AGE=10 THRU 25 followed by LET
AGE=AGE+5 changes AGE but does not affect the view. That is,
the WHEN selects cases whose age is from 10 to 25, but the
subsequent LET recalculates the ages so that the selected
cases have ages between 15 and 30. In this situation,
STATMATE does not reselect cases to conform to the previous
WHEN.
Detailed Example:
In the example shown below, WHEN is used to select all cases
of YEAR which are greater than 1890. The STATISTICS command
is then executed on URBANPOP to find statistics of urban
population. The statistics produced are those for the urban
population from 1900 to 1950. ELSE is then used to reverse
the condition, and statistics are computed on the urban
population before 1900. PRINT is used to display YEAR and
URBANPOP before and after the use of the WHEN and ELSE. END
is used to restore the data view to the full view.
Command: PRINT YEAR,URBANPOP
YEAR URBANPOP
----------- -----------
1790.00 0.20
1800.00 0.32
1810.00 0.52
1820.00 0.69
1830.00 1.13
1840.00 1.84
1850.00 3.54
1860.00 6.22
1870.00 9.90
1880.00 14.13
1890.00 22.11
1900.00 30.16
1910.00 42.00
1920.00 54.16
1930.00 68.95
1940.00 74.92
1950.00 88.93
Command: WHEN YEAR>1890
6 OF 17 CASES
Command: PRINT YEAR,URBANPOP
YEAR URBANPOP
----------- -----------
1900.00 30.16
1910.00 42.00
1920.00 54.16
1930.00 68.95
1940.00 74.92
1950.00 88.93
Command: STATISTICS URBANPOP
VARIABLE: URBANPOP
6 CASES
0 MISSING
CENTRAL TENDENCY        SPREAD                      DISTRIBUTION
----------------        ------------------------    --------------------
MEAN       59.85        STD. DEV.          21.85    MINIMUM        30.16
                        VARIANCE          477.63    MAXIMUM        88.93
                        RANGE              58.77
                        COEFF. VAR.         0.37
SUMMATIONS                  HIGHER MOMENTS
-----------------------     --------------------
TOTAL            359.12     SKEWNESS       -0.07
SUM SQ         23883.03     KURTOSIS        1.74
SUM SQ(DEV)     2388.15
Command: ELSE
244 OF 250 CASES
Command: PRINT YEAR,URBANPOP
YEAR URBANPOP
----------- -----------
1790.00 0.20
1800.00 0.32
1810.00 0.52
1820.00 0.69
1830.00 1.13
1840.00 1.84
1850.00 3.54
1860.00 6.22
1870.00 9.90
1880.00 14.13
1890.00 22.11
Command: STATISTICS URBANPOP
VARIABLE: URBANPOP
11 CASES
0 MISSING
CENTRAL TENDENCY        SPREAD                      DISTRIBUTION
----------------        ------------------------    --------------------
MEAN        5.51        STD. DEV.           7.14    MINIMUM         0.20
                        VARIANCE           50.92    MAXIMUM        22.11
                        RANGE              21.90
                        COEFF. VAR.         1.29
SUMMATIONS                  HIGHER MOMENTS
-----------------------     --------------------
TOTAL             60.61     SKEWNESS        1.34
SUM SQ           843.17     KURTOSIS        3.61
SUM SQ(DEV)      509.17
Command: END
FULL DATA VIEW RESTORED
WRITE
Command: WRITE
Purpose: Places data from the database onto an external file
Syntax:
A. WRITE vname1,... ON filenm
where
vname1 = variable name
filenm = a file name
Defaults:
none
Syntax Examples:
WRITE YEAR,URBANPOP ON POPFILE
WRITE STOCK ON STOCK.DAT
WRITE #0,DOSAGE,ANIMGROUP ON STUDY.DTA
Description:
The WRITE command places data from the database onto a
designated file. The list of variables specified in the
command are written case by case onto the file in ASCII form.
Files in the ASCII format may be printed on your system, or
modified by word processors, such as WORDSTAR. With minor
modifications, the file may be used as input to other
application programs, such as dBASE II (III), which accept
ASCII data.
When the output file is written, the very first line of data
contains information which enables the file to be read by
STATMATE, if it is desirable to re-enter the data into
STATMATE again. The format of the first line is the same as
discussed in the section ENTERING DATA INTO THE SYSTEM (See
the discussion of ASCII files). Also, the names of the
variables placed on the file are listed on the first line.
Each subsequent line of output contains one case for each
variable specified in the command.
Detailed Example:
In the example below, WRITE is used to place the data from
variables YEAR and URBANPOP on the file POPDATA.DAT.
Command: WRITE YEAR,URBANPOP ON POPDATA.DAT
17 CASES AND 2 VARIABLE(S) WRITTEN
APPENDIX A: COMPUTATIONAL METHODS (Abbreviated)
CORRELATE
The computations of CORRELATE may be found in many
statistical texts; see in particular Ostle in the references.
STATISTICS
Most of the statistics computed by STATISTICS may be found in
any standard text on statistics. Sums and sums of powers are
calculated using provisional methods.
LET
The basis for the uniform random number generator used in the
LET functions is the Wichmann article cited in the
references. Normally distributed numbers are generated by
summing twelve numbers generated from a uniform distribution
and applying appropriate transformations to scale the result
to the desired mean and standard deviation.
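The method described for LET can be sketched in Python. The uniform generator below is Algorithm AS 183 by Wichmann and Hill, which is assumed here to be the algorithm of the cited Wichmann article; the three starting seeds are arbitrary, and the way STATMATE derives its internal state from a single SEED value is not documented in this guide. The sum of twelve uniform numbers has mean 6 and variance 1, so subtracting 6 gives an approximately standard normal value that can then be scaled.

```python
import math


class WichmannHill:
    """Algorithm AS 183: three small congruential generators combined."""

    def __init__(self, s1, s2, s3):
        # Each seed should be a whole number from 1 to 30000.
        self.s1, self.s2, self.s3 = s1, s2, s3

    def uniform(self):
        """Return the next value, uniformly distributed between 0 and 1."""
        self.s1 = (171 * self.s1) % 30269
        self.s2 = (172 * self.s2) % 30307
        self.s3 = (170 * self.s3) % 30323
        return (self.s1 / 30269 + self.s2 / 30307 + self.s3 / 30323) % 1.0


def normal(gen, mean, std_dev):
    """Approximate normal deviate from the sum of twelve uniform numbers."""
    z = sum(gen.uniform() for _ in range(12)) - 6.0
    return mean + std_dev * z


gen = WichmannHill(1832, 1833, 1834)  # arbitrary illustrative seeds
uniforms = [gen.uniform() for _ in range(1000)]
sample = [normal(gen, 50.0, 10.0) for _ in range(5000)]

sample_mean = sum(sample) / len(sample)
sample_sd = math.sqrt(sum((v - sample_mean) ** 2 for v in sample)
                      / (len(sample) - 1))
print(round(sample_mean, 2), round(sample_sd, 2))
```

Because the generator is deterministic, starting again from the same seeds reproduces the same sequence, which matches the behavior described for the SEED modifier of the SET command.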
APPENDIX B: SAMPLE DATA
This appendix describes three data files which are included
with the STATMATE package:
1. U. S. Population Data
2. Motor Vehicle Death Data
3. Hald Cement Data
1. U. S. Population Data
The data listed below represents U. S. Population data from
1790 through 1950.
Column Description
------ ----------------------------
1 Year
2 Urban population in millions
3 Rural population in millions
DATA U.S. POPULATION:YEAR,URBAN,RURAL--POPULATION IN MILLIONS
1790 , 0.202 , 3.728
1800 , 0.322 , 4.986
1810 , 0.525 , 6.714
1820 , 0.693 , 8.945
1830 , 1.127 , 11.739
1840 , 1.845 , 15.224
1850 , 3.544 , 19.648
1860 , 6.217 , 25.227
1870 , 9.902 , 28.656
1880 , 14.130 , 36.026
1890 , 22.106 , 40.841
1900 , 30.160 , 45.835
1910 , 41.999 , 49.973
1920 , 54.158 , 51.553
1930 , 68.955 , 53.830
1940 , 74.924 , 57.246
1950 , 88.927 , 61.770
2. Motor Vehicle Death Data
The data below may be found in Draper and Smith, page 191,
given in the references. Note the presence of a missing
value in the rural road mileage for Washington, D. C.
Column Description
------ ---------------------------------------------
1 States
2 (Y) Motor vehicle deaths in 1964
3 Number of drivers (in units of 10,000)
4 Persons per sq. mi. in 1960
5 Rural road mileage (1,000)
6 Percentage of males greater than females
(0=no, 1=yes)
7 January maximum temperature
8 Highway fuel consumption in 1964 (10 million)
DATA MOTOR VEHICLE DATA
AL, 968, 158, 64.0, 66.0, 0, 62, 119.0
AK, 43, 11, 0.4, 5.9, 1, 30, 6.2
AZ, 588, 91, 12.0, 33.0, 1, 64, 65.0
AR, 640, 92, 34.0, 73.0, 0, 51, 74.0
CA, 4743, 952, 100.0, 118.0, 0, 65, 105.0
CO, 566, 109, 17.0, 73.0, 0, 42, 78.0
CT, 325, 167, 518.0, 5.1, 0, 37, 95.0
DE, 118, 30, 226.0, 3.4, 0, 41, 20.0
DC, 115, 35, 12524.0, 1E+30, 0, 44, 23.0
FL, 1545, 298, 91.0, 57.0, 0, 67, 216.0
GA, 1302, 203, 68.0, 83.0, 0, 54, 162.0
ID, 262, 41, 8.1, 40.0, 1, 36, 29.0
IL, 2207, 544, 180.0, 102.0, 0, 33, 350.0
IN, 1410, 254, 129.0, 89.0, 0, 37, 196.0
IA, 833, 150, 49.0, 100.0, 0, 30, 109.0
KS, 669, 136, 27.0, 124.0, 0, 42, 94.0
KY, 911, 147, 76.0, 65.0, 0, 44, 104.0
LA, 1037, 146, 72.0, 40.0, 0, 65, 109.0
ME, 196, 46, 31.0, 19.0, 0, 30, 37.0
MD, 616, 157, 314.0, 29.0, 0, 44, 113.0
MA, 766, 255, 655.0, 17.0, 0, 37, 166.0
MI, 2120, 403, 137.0, 95.0, 0, 33, 306.0
MN, 841, 189, 43.0, 110.0, 0, 22, 132.0
MS, 648, 85, 46.0, 59.0, 0, 57, 77.0
MO, 1289, 234, 63.0, 100.0, 0, 40, 180.0
MT, 259, 38, 4.6, 72.0, 1, 29, 31.0
NB, 450, 89, 18.4, 97.0, 0, 32, 61.0
NE, 215, 23, 2.6, 44.0, 1, 40, 24.0
NH, 158, 37, 67.0, 13.0, 0, 32, 23.0
NJ, 1071, 329, 807.0, 21.0, 0, 43, 231.0
NM, 387, 54, 7.8, 62.0, 1, 46, 48.0
NY, 2745, 744, 350.0, 84.0, 0, 31, 439.0
NC, 1580, 226, 93.0, 71.0, 0, 51, 177.0
ND, 185, 38, 9.1, 102.0, 1, 20, 24.0
OH, 2096, 530, 237.0, 84.0, 0, 41, 358.0
OK, 785, 137, 34.0, 94.0, 0, 46, 107.0
OR, 575, 108, 18.0, 73.0, 0, 45, 81.0
PA, 1889, 570, 252.0, 89.0, 0, 39, 353.0
RI, 100, 46, 812.0, 1.3, 0, 38, 27.0
SC, 870, 122, 79.0, 52.0, 0, 61, 86.0
SD, 270, 40, 9.0, 87.0, 1, 23, 28.0
TN, 1059, 177, 85.0, 67.0, 0, 49, 135.0
TX, 3006, 515, 37.0, 196.0, 0, 50, 448.0
UT, 295, 57, 10.8, 32.0, 0, 37, 38.0
VT, 131, 20, 42.0, 13.0, 0, 25, 15.0
VA, 1050, 208, 100.0, 50.0, 0, 50, 150.0
WA, 730, 160, 43.0, 59.0, 1, 46, 109.0
WV, 467, 88, 77.0, 32.0, 0, 43, 54.0
WI, 1059, 207, 72.0, 87.0, 0, 26, 141.0
WY, 148, 22, 3.4, 67.0, 1, 37, 20.0
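For readers who want to process these comma-separated cases outside STATMATE, the following Python sketch shows one way to read a few of the lines above while treating the code 1E+30 as a missing value. The function name and the use of None for a missing entry are assumptions for the illustration.

```python
MISSING_CODE = 1.0e30  # the missing value code used in the listing (DC line)


def parse_case(line):
    """Split one data line into a state code and a list of numbers.

    Values equal to the missing value code are returned as None.
    """
    fields = [field.strip() for field in line.split(",")]
    numbers = [float(field) for field in fields[1:]]
    return fields[0], [None if v == MISSING_CODE else v for v in numbers]


SAMPLE = """DE, 118, 30, 226.0, 3.4, 0, 41, 20.0
DC, 115, 35, 12524.0, 1E+30, 0, 44, 23.0
FL, 1545, 298, 91.0, 57.0, 0, 67, 216.0"""

cases = [parse_case(line) for line in SAMPLE.splitlines()]

# Column 5 of the file (index 3 of the numeric values) is rural road mileage.
mileage = [values[3] for _, values in cases]
present = [m for m in mileage if m is not None]
mean_mileage = sum(present) / len(present)

print(mileage, round(mean_mileage, 2))
```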
3. Hald Cement Data
The data shown below may be found in Draper and Smith, page
630, given in the references.
Column Description
------ --------------------------------------------
1 Amount of tricalcium aluminate (% clinker wgt)
2 Amount of tricalcium silicate
3 Amount of tetracalcium ferrite
4 Amount of dicalcium silicate
5 (Y) Heat produced in hardening cement
DATA HALD DATA FROM DRAPER AND SMITH
7,26, 6,60, 78.5
1,29,15,52, 74.3
11,56, 8,20,104.3
11,31, 8,47, 87.6
7,52, 6,33, 95.9
11,55, 9,22,109.2
3,71,17, 6,102.7
1,31,22,44, 72.5
2,54,18,22, 93.1
21,47, 4,26,115.9
1,40,23,34, 83.8
11,66, 9,12,113.3
10,68, 8,12,109.4
APPENDIX C: STATMATE INSTALLATION AND MISCELLANEA
STATMATE Components
The STATMATE package is supplied on several disks.
Generally, the disks contain a set of compiled programs with
file extensions of .OVL and .COM (.OVL and .EXE for the PC
DOS-MS DOS version). A set of sample data sets with file
extensions of .DAT are included. As well, the disks contain
an SMSA$ file, an SMHELP.TXT file and an SMINSTLL.COM (.EXE
for DOS) program file. It is advisable to make a copy of
these disks. Use the copies as your working disks.
Once you have made backup copies, you are ready to begin.
Make sure that the disk containing SMATE.COM is in your
A-drive when you execute STATMATE. See the section on
STATMATE Operation for instructions on how to operate the
package. If you have space problems fitting the package onto
your system, see the section in this appendix titled
Tailoring STATMATE to Your System.
STATMATE Internal Files
In order to operate, STATMATE creates several files for its
internal use. These include SMSA$, xxxSM$DI, xxxSM$DB,
xxxSM$S1, xxxSM$PL, xxxSM$WH, xxxSM$EQ, xxxSM$CH and
SM$CU, where xxx represents the ID provided by the
user. Some versions of STATMATE may produce additional
files, but they are always prefaced by xxxSM$. SMSA$ is a
control file containing installation information (problem
size, etc.).
Tailoring STATMATE to Your System
STATMATE contains an SMINSTLL program which allows you to
tailor STATMATE to meet various disk space needs and to
modify problem and data size parameters. Let us examine how
SMINSTLL can be used to help solve disk space needs.
In order to accommodate different disk space needs, the
STATMATE package provides a way of distributing the various
programs over several disks by dividing STATMATE program
files into five groups. Each group designates the disk
location of the programs for particular commands or of the
files for internal STATMATE use. The following table shows
the five groups and the commands and files controlled by
each group:
Group            Controls Disk for
--------------   ----------------------------------
Internal Files   STATMATE files
Nucleus          ERASE, EXECUTE, GIVE, HELP, WRITE,
                 INPUT, PRINT, QUERY, REMARK,
                 SET, EXIT, WHEN, ELSE, END
Statistics       STATISTICS, BREAKDOWN,
                 ONEWAY, TTEST, TWOWAY,
                 CROSSTAB, KOLMOGOROV,
                 TNPARAM, ONPARAM, RCORR
Miscellaneous    PLOT, HISTOGRAM, EDIT, LET, RECODE,
                 CHART, CUSUM
Regression       REGRESSION, CURVE, POLYNOMIAL,
                 NONLINEAR, COMPUTE, CORRELATE,
                 STEPWISE
For example, the Regression group controls the disk on which
STATMATE expects to find the programs for the CURVE and
REGRESSION commands. Through the use of the SMINSTLL
program, discussed in the next paragraph, the user may alter
the disk locations where STATMATE expects to find the
programs for its commands.
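The grouping can be pictured as a lookup from command to group, and from group to disk drive. The Python sketch below is only an illustration of that idea: the command lists are copied from the table above, while the drive letters in GROUP_DRIVES and the function names are hypothetical. Real assignments are made through SMINSTLL and recorded in SMSA$.

```python
# Command groups as listed in the table above.
GROUP_COMMANDS = {
    "Nucleus": ["ERASE", "EXECUTE", "GIVE", "HELP", "WRITE", "INPUT", "PRINT",
                "QUERY", "REMARK", "SET", "EXIT", "WHEN", "ELSE", "END"],
    "Statistics": ["STATISTICS", "BREAKDOWN", "ONEWAY", "TTEST", "TWOWAY",
                   "CROSSTAB", "KOLMOGOROV", "TNPARAM", "ONPARAM", "RCORR"],
    "Miscellaneous": ["PLOT", "HISTOGRAM", "EDIT", "LET", "RECODE",
                      "CHART", "CUSUM"],
    "Regression": ["REGRESSION", "CURVE", "POLYNOMIAL", "NONLINEAR",
                   "COMPUTE", "CORRELATE", "STEPWISE"],
}

# Hypothetical drive assignment, for illustration only.
GROUP_DRIVES = {
    "Nucleus": "A",
    "Statistics": "A",
    "Miscellaneous": "B",
    "Regression": "B",
}


def group_for(command):
    """Return the group that controls a command, or None if it is not listed."""
    name = command.upper()
    for group, commands in GROUP_COMMANDS.items():
        if name in commands:
            return group
    return None


def drive_for(command, drives=GROUP_DRIVES):
    """Return the drive on which the program for a command is expected."""
    group = group_for(command)
    return drives.get(group) if group else None


print(group_for("CURVE"), drive_for("CURVE"))
```

Changing one entry in GROUP_DRIVES moves every command in that group at once, which mirrors how SMINSTLL asks for one drive per group rather than one per command.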
In order to modify the delivered configuration, it is
necessary to use the SMINSTLL program provided with STATMATE.
SMINSTLL is an interactive program that asks for the desired
disk drive for each of the groups shown above. Once the
drives for these groups are specified, SMINSTLL will give
instructions on how STATMATE programs should be distributed
on your disk drives. Information about the new configuration
is placed in the SMSA$ file, which STATMATE reads to
determine which configuration is to be used during execution.
STATMATE knows only the current configuration as specified by
SMINSTLL, so it must be operated in that disk configuration.
A second need met by SMINSTLL is the ability to change
database and problem size parameters. As delivered, the
maximum number of variables allowed in the database is 10 and
the maximum number of cases that can be placed in the
database is 250. Use SMINSTLL to change these values.
SMINSTLL will prompt you for the values. For the best
results, change the maximum number of variables to some
multiple of 5, for example, 15.
Once you use SMINSTLL to change these values, all new
databases will be of the specified size. STATMATE maintains
these size parameters with each database. Hence, previously
created databases can be used without any difficulty. The
size of an existing database cannot be changed.
Some care should be exercised in specifying the size of the
database. A database with a maximum of 10 variables and 250
cases occupies about 5K bytes of file space. This space is
assigned immediately, and not as you add data to the
database. If you specify a database of 50 variables and 500
cases, then you will use about 50K bytes of file space. If
you work with at most 200 cases and 5 variables, only
part of the database will be used. The remainder will
nevertheless occupy file space on your disk.
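The figures above can be cross-checked with a little arithmetic. The guide does not state how a database file is laid out, so the Python sketch below simply assumes that file space is proportional to variables times cases, with a fixed number of bytes per value. The function names and the bytes_per_value parameter are illustrative, and 1K is taken as 1000 bytes for simplicity. Under that assumption the two figures quoted in this section work out to about 2 bytes per value, while the 80K reported in the SMINSTLL example later in this appendix (40 variables, 500 cases) works out to about 4, so any estimate of this kind is approximate and the size reported by SMINSTLL should be preferred.

```python
def estimate_db_bytes(max_variables, max_cases, bytes_per_value):
    """Rough file-space estimate: one fixed-size slot per variable per case."""
    return max_variables * max_cases * bytes_per_value


def implied_bytes_per_value(max_variables, max_cases, reported_bytes):
    """Bytes per value implied by a reported database size."""
    return reported_bytes / (max_variables * max_cases)


# (variables, cases, reported size in bytes) as quoted in the guide.
quoted = [(10, 250, 5_000), (50, 500, 50_000), (40, 500, 80_000)]

for variables, cases, reported in quoted:
    ratio = implied_bytes_per_value(variables, cases, reported)
    print(f"{variables} variables x {cases} cases: {ratio:.1f} bytes per value")
```

Because the space is assigned when the database is created, the estimate depends only on the maximum sizes given to SMINSTLL, not on how much data is actually loaded.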
Note that there is no way to change individual command
limitations on problem sizes. For example, there is no way
to increase the size of problems that can be accommodated by
ONEWAY.
Users with a hard disk should use SMINSTLL and specify that
all groups belong in the current directory. All STATMATE
files should then be placed in a single directory.
When using SMINSTLL, it is probably a good idea to print the
output so that you will have a record of how your
configuration should be installed. The names of program
files which are to be placed on specific disk drives may vary
with the version of STATMATE. The example below is
representative of the interaction with SMINSTLL. In any
case, the output instructions from SMINSTLL should be
followed when you actually perform the installation.
SMINSTLL Example
In the following example, SMINSTLL is used to create a
configuration that establishes databases with a maximum of 40
variables and 500 cases. The location of internal files is
specified as the B-disk. We begin by entering the word
SMINSTLL at the terminal.
SMINSTLL
STATMATE
INSTALL PROGRAM
Note: [ ] denotes default value
SPECIFY
MAXIMUM VARIABLES [10]: 40
MAXIMUM CASES [250]: 500
NOTE: Your STATMATE database file will occupy
about 80K bytes of disk storage.
SPECIFY DISK DRIVE (A,B,C,D,E,F OR RETURN=CURRENT) FOR:
STATMATE INTERNAL FILE GROUP [CURRENT]: B
--- File Distribution Settings ---
STATMATE Internal File Group Files Will Be Generated on B Drive
CP/M PC DOS/MS DOS
-------------- -------------
xxxSM$DB xxxSM$DB
xxxSM$DI xxxSM$DI
xxxSM$S1 xxxSM$S1
xxxSM$WH xxxSM$WH
xxxSM$EQ xxxSM$EQ
xxxSM$PL xxxSM$PL
xxxSM$ST xxxSM$ST
xxxSM$CU xxxSM$CU
xxxSM$CH xxxSM$CH
where xxx is the user ID
Press carriage return to continue
Make sure all of the following STATMATE files are on the same diskette:
CP/M PC DOS/MS DOS
--------------- -------------
1. STATMATE.COM STATMATE.EXE
2. All .OVL files .OVR files
3. SMSA$ file SMSA$ file
4. SMHELP.TXT SMHELP.TXT
End of installation
APPENDIX D: STATMATE SIZE LIMITATIONS
Command problem size limitations are shown in the table
below.
Command Limitations
Command        Cases    Var   Classes   Other Limitations
------------   -----    ---   -------   -------------------------
BREAKDOWN      32000    1     100
CHART          32000    1     100       100 subgroups & 20 items
                                        per subgroup
COMPUTE        32000*   1     na**
CORRELATE      32000    20    na
CROSSTABS      32000    2     20
CURVE          32000    2     na
CUSUM          32000    1     100       100 subgroups & 20 items
                                        per subgroup
EDIT           32000    20    na
ERASE          na       na    na
EXECUTE        na       na    na
EXIT           na       na    na
GIVE           na       na    na        20 attribute modifications
HELP           na       na    na
HISTOGRAM      32000    1     20        20 bars
INPUT          32000    50    na
KOLMOGOROV     500      2     na
LET            32000    2     na
NONLINEAR      250      5     na        5 parameters
ONPARAM        500      2     xx
ONEWAY         32000    2     20
PLOT           32000    6     na        5 on y-axis and 1 on x-axis
POLYNOMIAL     250      2     na        Degree less than 11
PRINT          32000    6     na
QUERY          na       na    na        20 variables per query
RCORRELATION   500      2     na
RECODE         32000    1     na        10 individual values when
                                        recoding a set of specific
                                        values
REGRESSION     32000    20    na
REMARK         na       na    na
SET            na       na    na
SHOW           na       na    na
STATISTICS     32000    1     na        500 cases when computing
                                        quartiles
STEPWISE       32000    20    na
TNPARAM        na       2     na        20 sets by 20 groups
TTEST          32000    2     na
TWOWAY         32000    3     20
WHEN           32000    2     na
WRITE          32000    20    na
* 32,767 for those who can remember or need it
** not applicable
A STATMATE database may contain as many as 32,000 cases, and
several hundred variables. When a database size is large,
the limiting factor becomes the disk capacity of your system.
Generally, analyses can process as many as 32,000 cases and
20 variables. Specific limitations are shown in the above
table.
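When planning an analysis it can help to check a problem against these limits before loading data. The Python sketch below does this for a handful of commands; the limits are copied from the table above, the "Var" column is treated as a simple upper bound on the number of variables, and the names LIMITS and fits are illustrative rather than anything STATMATE provides.

```python
# (maximum cases, maximum variables) for a few commands, from the table above.
LIMITS = {
    "CORRELATE": (32000, 20),
    "INPUT": (32000, 50),
    "KOLMOGOROV": (500, 2),
    "NONLINEAR": (250, 5),
    "POLYNOMIAL": (250, 2),
    "REGRESSION": (32000, 20),
}


def fits(command, cases, variables):
    """Return True if a problem of this size is within the listed limits."""
    max_cases, max_variables = LIMITS[command.upper()]
    return cases <= max_cases and variables <= max_variables


# The Hald sample data set has 13 cases and 5 variables.
print(fits("REGRESSION", 13, 5))
print(fits("NONLINEAR", 500, 3))
```

Only the case and variable limits are checked here; the "Classes" and "Other Limitations" columns would need additional checks of the same kind.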
APPENDIX E: HELP
1. Question: How do I enter alphanumeric data into the
database?
Answer: Read the INPUT and GIVE TYPE command
descriptions.
2. Question: I keep running out of disk space because
the disk contains many STATMATE internal files.
What can I do?
Answer: Try to keep your use of identifiers in
response to the ID prompt to a minimum. Use the
install program to create larger databases and use
as few databases as possible.
3. Question: Why won't INPUT read my data?
Answer: Check your input file to see that data is
in the correct fields and that data items are
correct.
4. Question: What needs to be done to print my output
on my printer?
Answer: Use the SET COPY=HARDCOPY command.
5. Question: Why can't the INPUT command find my data
file?
Answer: Use a disk identifier, such as B:, before
the file name to specify the disk the file is on.
6. Question: Why aren't my PLOT modifiers retained
for repeated use?
Answer: Use SAVE in the subcommand mode.
7. Question: When I created one of my databases, I
forgot how many variables and cases I specified as
the maximum. How can I find out what the maximum
is for a database?
Answer: Use the SHOW command.
8. Question: Is it possible to get data from another
program, such as dBASE, into STATMATE?
Answer: Create an ASCII file with the program and
turn it into a 'DATA' file with your word processor
or editor.
9. Question: Almost every time I use the package, I
use the same seven or eight commands for my task.
Is there any way to reduce the effort?
Answer: Use the EXECUTE command.
10. Question: PRINT produces too many cases to view
easily. Can I reduce the output some way?
Answer: Use the WHEN command to restrict output.
11. Question: How can I control my output to the
screen? It scrolls by so fast that it can't be read
easily.
Answer: Use your operating system's ability to
hold screen output with control keys. Use the SET
COPY=FILE command to output the results to a file
for examination later.
12. Question: Do the STATMATE database and internal
files and my data files need to be on the same
diskette?
Answer: No. Internal file usage is controlled by
the install program. Your data files may be
referenced with the INPUT command by preceding the
file name with a disk identifier, if necessary.
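Question 8 above mentions turning an ASCII file from another program into a 'DATA' file. As a rough illustration, the Python sketch below builds such a file from rows of values. It assumes only the layout visible in the sample data sets in this guide: a first line beginning with DATA followed by a title, then one comma-separated line per case. The exact requirements are given in the INPUT command description, and the function name to_data_text is hypothetical.

```python
def to_data_text(title, rows):
    """Return DATA-file text: a title line, then one comma-separated line per case."""
    lines = ["DATA " + title]
    for row in rows:
        lines.append(", ".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Two cases taken from the Hald sample listing in this guide.
text = to_data_text(
    "HALD DATA FROM DRAPER AND SMITH",
    [[7, 26, 6, 60, 78.5], [1, 29, 15, 52, 74.3]],
)
print(text)
```

When the text is saved for use under DOS, lines would normally end with a carriage return and line feed; in Python this can be done by opening the output file with newline="\r\n".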
APPENDIX F: SUGGESTED DISKETTE ORGANIZATION
STATMATE/PLUS is distributed on diskettes with the contents
listed below. If you are executing the program from
diskettes, this is the arrangement that you should begin
with. Disk 1 should be placed in your A-drive and disk 2 in
your B-drive. Disk 3 contains the shareware version of the
STATMATE/PLUS user's guide as listable files. It is not
needed to execute STATMATE.
The suggested diskette arrangement should leave you with
about 30K of free space on disk 1, enough to use STATMATE as
configured by the install program, SMINSTLL.EXE. Some of
this space will disappear when you use STATMATE for the first
time. STATMATE creates a database file and other internal
files as it is used. If you need more space, consider moving
SMINSTLL.EXE and the data files (.DAT) to another diskette.
If you continue to have space problems, see Appendix C for
information on installing STATMATE in other configurations.
Suggested Diskette Contents
Disk 1
SMATE .EXE --- main program
SMASKC.OVR SMATE .EXE SMBRKD.OVR --- overlay files
SMINLX.OVR SMINPT.OVR SMKOLM.OVR SMONET.OVR
SMONEW.OVR SMONPR.OVR SMRCOR.OVR SMSETI.OVR
SMSHOW.OVR SMSTAT.OVR SMSYAN.OVR SMTNPR.OVR
SMTTES.OVR SMTWOT.OVR SMTWOW.OVR
SMSA$ --- configuration file
SMHELP.TXT --- help file
SMINSTLL.EXE --- install program
DEMO --- example command file
USPOP.DAT USPOPDEM.DAT --- sample data files
HALD.DAT MOTOR.DAT QAMEAS.DAT SAMPLE.DAT
README
Disk 2
SMASUM.OVR SMCHLX.OVR SMCHPL.OVR SMCHRT.OVR --- overlay files
SMCOMP.OVR SMCORR.OVR SMCORT.OVR SMCSLX.OVR
SMCSUM.OVR SMCURT.OVR SMCURV.OVR SMEDIT.OVR
SMHIST.OVR SMLETA.OVR SMLETX.OVR SMNONL.OVR
SMNONT.OVR SMNONX.OVR SMPLLX.OVR SMPLOT.OVR
SMPOLY.OVR SMRECO.OVR SMREGR.OVR SMROUT.OVR
SMSTEP.OVR SMSTLX.OVR SMXTAB.OVR
Disk 3
SMPART1.DOC SMPART2.DOC SMPART3.DOC SMPART4.DOC --- user's guide
SMPART5.DOC
APPENDIX G: INVOICE AND ORDER FORM
A sample invoice and order form are enclosed to simplify ordering.
Use the order form to place an order. The invoice form is to
be used within your organization to generate payment for
STATMATE/PLUS registrations.
STATMATE/PLUS is available on disk for your evaluation and
convenience for $35. This fee only covers diskette costs,
handling and postage (within the U.S.). It does not cover
registration. Please show your support by registering the
program, if you are using it on a regular basis and find it
of value. Note that a 190+ page user's guide is available
for $35 and you can save $10 by purchasing the diskettes and
registering at the same time. Remember that registered
owners receive several utilities when they register. Further,
when you purchase either the diskettes or registration, you
receive a coupon worth $10 on your next purchase. If you
purchase both, then you receive $10 off now and a $10 coupon.
The coupon offer applies only to purchases made directly from
The Software Hill.
Business, commercial, governmental or educational institution
use of non-registered copies of STATMATE/PLUS is strictly
forbidden. Write for details concerning site or corporate
licensing.
---INVOICE FORM---
for
STATMATE/PLUS
Remit to: The Software Hill
1857 Apple Tree Lane
Mt. View, CA 94040
(415) 969-4233
Sold to: ____________________ Ship to: ______________________
____________________ ______________________
____________________ ______________________
-------------------------------------------------------------------
Date: PO #:
-------------------------------------------------------------------
Items: Qty Price
[ ] STATMATE Evaluation disks $ 35 ______ $______.____
[ ] STATMATE/PLUS Registration $ 45 ______ $______.____
[ ] STATMATE disks & registration $ 70 (save) ______ $______.____
[ ] STATMATE/PLUS User's Guide only $ 35 ______ $______.____
[ ] California Residents add 7% sales tax $_____.____
Total (U.S.) $________.____
Shipping:
[ ] Ship COD via UPS or not U.S. mail; add $15 $______.___
[ ] Ship outside of North America:
Add $15 [ ] STATMATE/PLUS disks only $______.___
Add $25 [ ] STATMATE/PLUS manual & disks $______.___
Add $15 [ ] STATMATE/XG disks and guide $______.___
---ORDER FORM---
for
STATMATE/PLUS
The Software Hill
1857 Apple Tree Lane
Mt. View, CA 94040
(415) 969-4233
Operating System and Computer:
[ ] PC DOS [ ] MS DOS Version _____ Computer ______________
Items: Qty Price
[ ] STATMATE Evaluation disks $ 35 ______ $_____.___
[ ] STATMATE/PLUS Registration $ 45 ______ $_____.___
[ ] STATMATE disks & registration $ 70 (save) ______ $______.__
[ ] STATMATE/PLUS User's Guide only -- $ 35 ______ $_____.___
[ ] California Residents add 7% sales tax $____.___
Total (U.S.) $_______.___
Shipping:
[ ] Ship COD via UPS or not U.S. mail; add $20 $______.___
[ ] Ship outside of North America:
Add $15 [ ] STATMATE/PLUS disks only $______.___
Add $25 [ ] STATMATE/PLUS manual & disks $______.___
Add $15 [ ] STATMATE/XG disks and guide $______.___
Payment:
[ ] Check enclosed. Amount enclosed $________._____ (U.S.)
Make check payable to:
The Software Hill
1857 Apple Tree Lane
Mt. View, CA 94040
1. Allow three weeks for delivery
2. Orders outside U.S. send check drawn on U.S. bank
or international money order. Amount in U.S. dollars.
------------------------------------------------------------------
Date: PO #:
------------------------------------------------------------------
---Customer Information---
Name ______________________________________________
Address ___________________________________________
___________________________________________
___________________________________________
City __________________ State ______ Country __________
Phone (_____) ______ - _________
REFERENCES (Partial)
Cooper, B. E., Statistics for Experimentalists, Pergamon
Press Ltd., London, England, First Edition, 1969
Draper, N. R. and Smith, H., Applied Regression Analysis,
John Wiley and Sons, New York, New York, Second Edition, 1981
Duncan, A. J., Quality Control and Industrial Statistics,
Richard D. Irwin Inc., Homewood, Illinois, Fourth Edition,
1974
Marquardt, Donald W., An Algorithm for Least-Squares
Estimation of Nonlinear Parameters, Journal of the SIAM, vol.
11, no. 2, pp. 431-441, June 1963
Ostle, Bernard, Statistics in Research, Iowa State
Univ. Press, Ames, Iowa, Second Edition (now in Seventh
Edition), 1963
Siegel, Sidney, Nonparametric Statistics for the Behavioral
Sciences, Chapters 6 and 8, McGraw-Hill, New York, 1956
Tukey, John W., Exploratory Data Analysis, Addison-Wesley,
Reading, Massachusetts, 1977
Wichmann, B. A. and Hill, I. D., A Pseudo-random Number
Generator, NPL Report, DITC, June, 1982
Winer, B. J., Statistical Principles in Experimental Design,
McGraw-Hill, New York, 1970
Volume in drive A has no label
Directory of A:\
FILES863 TXT 751 9-07-88 2:18p
GO BAT 38 11-05-87 3:26p
GO TXT 463 11-05-87 3:56p
PRINTDOC BAT 789 11-05-87 3:59p
SMPART1 DOC 44886 8-31-88 6:45a
SMPART2 DOC 42240 8-07-87 5:42p
SMPART3 DOC 25216 8-07-87 5:43p
SMPART4 DOC 23808 8-07-87 5:44p
SMPART5 DOC 35845 8-31-88 6:47a
9 file(s) 174036 bytes
143360 bytes free