Software Guides
Guidelines to Searchable Software Options
SPSS, Version 9.0
Introduction.
In order to use the database you will need to know how to access it.
Many of you will know how to use Excel, or any of the several other formats
in which the Guadalajara Censuses Project database comes on this CD-ROM.
We are providing a guide to using the Statistical Package for the Social
Sciences (SPSS), which is one of the oldest and most user-friendly of
all the options. We will give instructions on how to find individuals
in the database.
Many SPSS users will tell you that the older, so-called "syntax" version
is more flexible, provides more options and ultimately more sophisticated
scholarship. They are probably right. But I remember spending two summers
in Ann Arbor, Michigan, at the ICPSR's quantitative workshops trying
not to make a mistake. One period out of place in your syntax command
and all you got were error messages, sometimes after hours of work. The
Windows version reduced hours to minutes, even seconds. So for 95 percent
of all the information most people will want to know, the "point
and click" works just fine. I am using version 9.0 because I know
it. Version 10.0 is o.k. and not much different from 9.0. If you have
10.0 or higher do not worry. It remembers what its former self, 9.0,
can do and will obey you like a trooper.
Getting Started.
Assuming you have SPSS on your computer, getting started is easy. Just
double click on the SPSS icon and your computer will automatically load
SPSS. (Or alternatively, you can go to "File" on your word
processing window, click on "Open" and then on the SPSS File
drive.) You are now facing your SPSS Window data entry screen. At the
top of the screen are a series of menus beginning with "file" and
ending with "Help" on the far right side.
Go to the upper left of the screen, to the menu labeled "File." First,
click on "File." [A menu of options unfolds.] In the menu
box click on the second option: "Open." The dialogue box politely
asks you "What would you like to do?" You click on "Open an
existing file." Then you click on the appropriate drive which houses your
CD. Finally you click on either "Guadalajara Archive" or "Guadalajara
Consolidated," according to your choice. In the blink of an eye
(depending on your computer), you are staring at the first installation
of the census data for nearly sixty thousand individuals. Please remember
that you will need to save the data from the CD in a format which you can
use for searching and sorting, in this case the Statistical Package for
the Social Sciences.
SPSS Data Editor Window.
What You Are Looking At. On the screen are a series of numbers and words
running left to right and from top to bottom. Each is contained in
a separate "cell," as one "value" in the series
of "variable" columns. The variable columns begin with the
first variable, on the far left, "MASINDEX" (i.e., master
index), and end on the far right of the screen, at the last variable, "PATRIA" (Patria/place
of birth). Each line, or "row" as it is usually called, begins
with a separate number under Masindex standing for each individual
(or, in some cases, a vacant house) in the two censuses of 1821 and
1822. Each variable column contains some bit of information on that
individual. For example, take the first case: master index 1, index
1, cuartel 1, year 1821, household number 1. We have Don Manuel Machuca,
a male, age 36, married (to Feliciana Gómez), born in Guadalajara,
and a comerciante, a merchant, as befitting the first household in
the block, usually a well-off family. We do not know his calidad, which
was not provided for this cuartel, but as only a handful of individuals
of other ethnicities were given a "don," odds are he is an español. That
was line one of our database. There are 57054 separate lines in this
database. Just to get some perspective, if you were to print out this
database in a normal font of point 12, using "landscape" orientation
of 28 lines per page for your 20 variables, you would have just over
six thousand pages. Pretty long book.
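For the curious, the page count can be checked with a little arithmetic. The sketch below assumes, and this is my assumption rather than something stated above, that the 20 variables are too wide for a single landscape page and spill across three page widths. Under that assumption the total lands just over six thousand.

```python
import math

CASES = 57054          # separate lines in the database, as stated above
LINES_PER_PAGE = 28    # landscape orientation, 12-point font
PAGE_WIDTHS = 3        # ASSUMPTION: 20 variables need three page widths

# Pages needed to print every line once, rounded up to whole pages.
pages_per_pass = math.ceil(CASES / LINES_PER_PAGE)

# Each line is repeated on every page width.
total_pages = pages_per_pass * PAGE_WIDTHS

print(pages_per_pass, total_pages)
```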
How are the cases organized? When you received your CD-ROM, the Guadalajara
Censuses Project staff had predetermined the organization of your database.
Once you get the hang of it you can organize the data any way you want,
within limits of course! For obvious reasons we have organized the
data by district (i.e., cuartel) and within each district, by household,
and within each household, by individual in the order determined by
the census takers nearly two hundred years ago. (1) This is the order
in which we entered the data, but it does not have to stay that way.
Variables. There are twenty variables, all listed on page 27, above.
The reason why several of the variable names are rather strange is
that for some reason known only to SPSS technicians, a variable
name can be no longer than eight characters. The
data or "value" put in the variable cell, however, can be
almost as long as you want. For example, SURNAME can contain "Delgado
y Ledesma". It is only the name which is limited, not the actual
column width.
Values. Each cell contains either a word, name or term, or it contains
a number. Numbers are codes for the information, and are entered that
way for two reasons. One, it may be faster to enter say 1 for male
and 2 for female. "0" means that the coder was unable to
determine the sex from the census material. This data is called "missing." Another
type of missing data is the "dot" you can see in some columns
(it is called "system-missing"). This means that the data
entry person simply "tabbed through" the cell, usually because
no information was given in that particular cuartel (e.g., Patria or
Calidad).
Exchanging Value Labels for Numbered Codes.
For your first exercise, go to the Menu labeled "View," click
on it and then click on "Value labels." Now you have in each
cell what the codes stand for. This makes no difference in any statistical procedure
you might do. It just makes it easier to read the data. You can change
it back again by clicking on the same option.
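If you ever work with the coded data outside SPSS, the same swap of codes for labels can be sketched in a few lines. This is only an illustration: the codes for sex are the ones described above, while the names SEX_LABELS and show_value_label, and the use of None for the system-missing "dot," are my own inventions.

```python
# Codes for sex as described above: 1 = male, 2 = female,
# 0 = the coder could not determine the sex ("missing").
SEX_LABELS = {1: "male", 2: "female", 0: "missing"}  # hypothetical name


def show_value_label(code, labels):
    """Return the label for a code; None plays the role of the
    system-missing "dot" (an assumption made for this sketch)."""
    if code is None:
        return "."
    # Unknown codes are shown as the bare number rather than guessed at.
    return labels.get(code, str(code))


codes = [1, 2, 0, None]  # hypothetical cell values
labeled = [show_value_label(c, SEX_LABELS) for c in codes]
print(labeled)
```

As in SPSS, the underlying codes are untouched; only the display changes.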
Menus. You have already used the "File" menu. The others are
strung out across the top from left to right: Edit, View, Data,
Transform, Analyze, Graphs, Utilities, Window and Help. The last, of
course, is your refuge when these directions do not suffice for what
you want to do.
For our purposes, you will only need to use Edit, Data and Analyze, at
least for the time being.
Finding Individuals.
With a basic understanding of what you are looking at, you are ready
to look for individuals. The process is as easy in SPSS as it is in any
spreadsheet format. First, you need to decide on how you are going to
search for the person you want. To answer that question, first ask yourself
whether or not the name you are looking for is relatively common or relatively
rare. If you answer the latter then I would just get a list of all the
surnames in the Consolidated file, and print them out if you want. It
only amounts to 46 pages. (At one point it was well over one hundred
pages so we did some consolidating, that's for sure.) Here is how you
do it.
List of All Surnames in the City. Go to the menu across the top, find "Analyze" and
click on it. Then click on the first choice "Descriptive Statistics" and
then on "Frequencies." Up pops the Dialogue Box.
The left hand window contains the list of all variables in the file.
Just click on Surname and then click on the black arrow to the right
to place it in the window to the right. Now click on OK and in a second
or so, you will be in the "Output" window looking at a list
of all the surnames in the database and how many of them there are.
You will notice the first two "names" are missing letters.
Each dot stands for a missing letter. The numbers under the "Frequency" column
are the number of persons with each surname. For Abad, for example,
there are only 10. That may or may not be good news if you are looking
for an Abad but it does simplify the finding procedure. To get back to
the data entry window, either click on the former screen in the bottom
left, or click on "Window" in the menu and then on "Consolidated.
file" on the pull-down.
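For readers who prefer a script, the same kind of frequency list can be sketched with Python's standard library. The handful of surnames below is a made-up sample, not the real column, and the variable names are my own.

```python
from collections import Counter

# Hypothetical sample of the SURNAME column; the real file has
# tens of thousands of cases.
surnames = ["Abad", "Delgado", "Abad", "Machuca", "Delgado", "Abad"]

# Count how many persons carry each surname.
frequencies = Counter(surnames)

# Print an alphabetical list with the count beside each name,
# much like the "Frequency" column in the SPSS output window.
for name in sorted(frequencies):
    print(f"{name:<10}{frequencies[name]:>5}")
```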
Using the "Find" function. First, since you are looking for
a surname, use an arrow key to tab over to the variable "SURNAME." (Of
course you could click on the variable surname and it will do the trick.
However, if you do, avoid clicking on the top of the variable because
if you do, you will highlight the entire variable, all 57075 cases,
which is not what you want to do.) The point is that your cursor has
to be "in" the variable to use the "find" function.
Now, click on the menu "Edit." And click on "Find".
Up pops a "dialogue box."
Type in Abad in the open box and click on "Search Forward." In
a second or so the search will stop. You need to click on "Close" in
the upper right-hand corner of the dialogue box, and it has deposited
you in the household of the very first Abad in the city. Chances are
if your Abad lived in the city in 1821 you will find him or her in short
order. Since you have found a three year old little girl, you continue
your search.
Repeat the procedure: put cursor in Surname, click on Edit, click on
find, and then write in Abad and click on "Search Forward." Hit "close" to
take a look at this Abad, Francisco, a 47 year old Zapatero, married
although his wife is not living there, at least at the time. Say this
is not your person so on you go. Next a 13 year old criada, maid. Keep
looking. A thirty year old weaver pops up next, then another two children,
then three in the same family, likely siblings, and finally a young married
Abad, with no children. It has not taken that long and the advantage
of searching the database without sorting is that you get to look at
the entire household, which you likely cannot do if you are forced to
sort.
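The repeated "Search Forward" routine amounts to a simple loop: start just after the last hit and stop at the next match. Here is a minimal sketch of that idea. The function name find_next and the sample rows are hypothetical, and I match the whole surname exactly, which may or may not be how SPSS itself compares text.

```python
def find_next(rows, column, text, start=0):
    """Return the index of the first row at or after `start` whose
    `column` equals `text`, or -1 if there is none."""
    for i in range(start, len(rows)):
        if rows[i][column] == text:
            return i
    return -1


# Hypothetical rows in census order; only the names matter here.
rows = [
    {"surname": "Machuca", "firstname": "Manuel"},
    {"surname": "Abad", "firstname": "Francisco"},
    {"surname": "Gómez", "firstname": "Feliciana"},
    {"surname": "Abad", "firstname": "María"},
]

# Keep searching forward until the column is exhausted.
hits = []
hit = find_next(rows, "surname", "Abad")
while hit != -1:
    hits.append(hit)
    hit = find_next(rows, "surname", "Abad", hit + 1)
print(hits)
```

Because the rows are never rearranged, each hit still sits among its household, which is the advantage described above.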
But what if you are looking for Delgado (218) or, heaven forbid, any
of the several spellings of Gonzáles (832), Gonsáles (315),
or González (217), which even in the Archive file presents you
with a daunting process? (In the Consolidated file, we have some fifteen
hundred Gonzálezes.) So here you will need to use the "Sort" function.
Using the "Sort" function. Sorting your cases allows you to
rearrange the cases to facilitate your search. The sort actually physically
rearranges the cases in any order you want. (But not to worry; you can
get them back into the original order with no problem at all.) First,
go to the top menu, and click on Data (third from the left). On the pull
down menu, find "Sort Cases" about half way down and click
on it. In the dialogue box, you have a list of all the variables to choose
from. Now here you have a number of options. First, you can simply sort
by Surname. Just highlight surname in the list to the left and click
on the black arrow.
Hit OK and you are in business. In a matter of seconds all the cases
will be sorted by surname, beginning in ascending order (unless you
specified otherwise) from the very first "a" surname to the
last "z". Looking for Delgado, say, you would tab over to
the surname column, click on the Menu "Edit" and then on
Find, as before. Type in Delgado, click on "Search forward" and
it will take you to the first Delgado, and all the other Delgados in
a row, all 252. (2) Hopefully you will have known the first or given name(s)
as well. In that case you would have placed the variable "FirstName" in
the sort box, after the variable Surname.
The data is now sorted by surname, and within each surname, by the
given name(s) as well. That way if you knew that the Delgado you were
looking for was a Juan Delgado, you would place your cursor in the
Surname column, go to "Edit," click on "Find" and
type in Delgado. After a few seconds you would arrive at the
first Delgado listed. Then you would click on "Close,"
put your cursor in the "FirstName" column, go to "Edit" and "Find,"
and type in Juan.
This would have taken you to the first Juan Delgado on the list of
five Juan Delgados. Hopefully you will know more information which
will now help you identify the correct Juan Delgado. If you know that
he was known to be a weaver, for example, you will find that two are
listed as obrageros, the most common Spanish term for weaver in that
era. At this point, hopefully you will also have information such as
age or place of birth, which may help you decide which Juan Delgado
is the one for whom you are searching. If you do not, but you know
the name of his spouse, you now need to re-sort to be able to find
the entire household (for his wife would be listed by her name, and
be found elsewhere in this sort). Before you do, however, you will
need to make a note of the cuartel (ward) number where he was living,
the year the census was taken, and the household number, for each of
the two choices.
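The two-level sort described here (surname first, then given name) can be sketched as a sort on a pair of keys. The cases below are invented for illustration, as are the lowercase field names; only the idea of sorting by surname and then first name comes from the steps above.

```python
# Hypothetical cases; masindex is the position in the original census order.
cases = [
    {"masindex": 1, "surname": "Machuca", "firstname": "Manuel"},
    {"masindex": 2, "surname": "Delgado", "firstname": "Pedro"},
    {"masindex": 3, "surname": "Abad", "firstname": "Francisco"},
    {"masindex": 4, "surname": "Delgado", "firstname": "Juan"},
    {"masindex": 5, "surname": "Delgado", "firstname": "Juan"},
]

# Sort by surname, and within each surname by given name.
by_name = sorted(cases, key=lambda c: (c["surname"], c["firstname"]))

# All the Juan Delgados now sit in a row.
juans = [
    c for c in by_name
    if c["surname"] == "Delgado" and c["firstname"] == "Juan"
]
print([c["masindex"] for c in juans])
```

Python's sort is stable, so cases with identical names keep their original relative order, much as note 2 describes for the Master Index. Plain string comparison is not aware of Spanish alphabetical conventions, so accented names may not fall exactly where SPSS would put them.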
Finding the Family.
All you have to do is to go back to Data and then to Sort Cases. Before
you can sort, however, you will need to transfer the variables from the
previous sort (Surname and FirstName) back to the variable list on the
left. Just click on each variable (Surname, first name, oficio) and then
click on the little arrow aimed at the variable box for each. Now that
they are out of the way, just click on MASINDEX, put it in the sort box
and click OK. In a short time, the cases will all be returned to their
original order.
Using the "Find" Function the Easy Way. Instead of going to
the "Edit" menu, and then the "Find" dialogue box,
there is an easier way. Do you notice the large pair of binoculars in
the middle of the second row of icons, just below the top menu row? Click
on that and you go directly to the "Find" dialogue box. You
use the same procedure: place your cursor on the cuartel column and type
in 7, the cuartel of the first Juan Delgado. Click on "Search Forward." Once
you arrive on the first person in cuartel 7, click on "close" and
place your cursor on the "year" column and repeat the procedure,
the year being 1822. Finally place your cursor on the hhnumber column,
type in 102, click on "Search Forward" and then on "close" when
you arrive.
You are now at the first person in household number 102. You will notice
that this is not Juan Delgado but Javiera Acosta, age 59, no occupation
given. Perhaps she is the owner of the house; in any case she is listed
first and assumed to be the head of household, as that was the standard
practice for the census takers. Juan Delgado was listed second, with
his wife, María Plácida, age 30, followed by their young
son, Anselmo Delgado, age 1. As you look around you, you can see that
it is a neighborhood of weavers, most likely setting up shop in their
home, as was the custom for most obrageros. Although not officially
noted, Juan's wife likely spun the yarn which he used to weave the
cloth.
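The whole "Finding the Family" routine (restore the census order, then narrow down by cuartel, year and household number) can be sketched as one small function. The members of household 102 are taken from the example above; the masindex numbers, the extra neighbor, the lowercase field names and the function name household are all hypothetical, and details the text does not give are left as None.

```python
# Household 102 as described above, deliberately out of order.
# masindex values are made up for this sketch.
cases = [
    {"masindex": 503, "cuartel": 7, "year": 1822, "hhnumber": 102,
     "firstname": "María Plácida", "surname": None, "age": 30},
    {"masindex": 501, "cuartel": 7, "year": 1822, "hhnumber": 102,
     "firstname": "Javiera", "surname": "Acosta", "age": 59},
    # Made-up neighbor, included so the filter has something to exclude.
    {"masindex": 500, "cuartel": 7, "year": 1822, "hhnumber": 101,
     "firstname": "Pedro", "surname": "Abad", "age": None},
    {"masindex": 504, "cuartel": 7, "year": 1822, "hhnumber": 102,
     "firstname": "Anselmo", "surname": "Delgado", "age": 1},
    {"masindex": 502, "cuartel": 7, "year": 1822, "hhnumber": 102,
     "firstname": "Juan", "surname": "Delgado", "age": None},
]


def household(cases, cuartel, year, hhnumber):
    """Return everyone in one household, in original census order."""
    members = [
        c for c in cases
        if (c["cuartel"], c["year"], c["hhnumber"]) == (cuartel, year, hhnumber)
    ]
    # Sorting by master index restores the census takers' order.
    return sorted(members, key=lambda c: c["masindex"])


names = [m["firstname"] for m in household(cases, 7, 1822, 102)]
print(names)
```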
Finding the City.
This ends the SPSS guidelines for this version of the GCP's CD-ROM. When
we issue version 2, which will incorporate data from the censuses of
1791, 1813-14, 1824, 1838-42, 1850 and 1930, we will include far more
extensive guidelines, including specific instructions designed to encourage
genealogists and family historians to place their ancestors in the historical
context of the city of Guadalajara. For example, if you found a particular "Calidad" associated
with your ancestor(s), what did that term mean in Guadalajara in 1821?
It will also include instructions on the use of other SPSS functions,
for example, how to use the frequency function and the cross tabs (i.e.,
tables) to further investigate the life and times in which your ancestors
lived. If you want to use the data now, I would suggest that you use
the SPSS "Help" menu. Despite what you might think, it really
is not difficult to create your own tables and charts. There are also
published guides to SPSS which may be of use for you. (3)
Ending the session:
If you have done all that you want, you are now ready to exit. Go back
to the "File" in top Menu and click on "exit" located
at the very bottom of the pull-down menu. You will automatically get
a query "Save data screen?" All this really means is: if
you have changed the variables in any way (and of course you can do so,
most particularly through "Re-coding" under "Transform"),
do you want to save your changes? If you have created a new variable,
for example, and want to save it, then you click "yes". If
you have only used the old variables, you click "no." The disk
has those and you do not need to save them again. It will remember. Then
the machine may ask you "Save output screen?" This means any
frequencies you have run, cross tabs, tables, etc. At this time you will
have printed out anything you want and you will say "no." (If
you said "yes" you would have to give the output a name and
you could call it up next time.) It sometimes will reverse the questions,
just to see if you are paying attention, asking about saving the output
first, and then the data screen. Be alert.
NOTE. If you do something awful, like inadvertently highlighting and
deleting the Surname variable, sending it to the Misty Mountains
from which few return, REMEMBER: all you have to do, when you have
clicked on "Exit" to finish your session, is click on "No" when
it asks you if you want to save your data. Your data will all return
to the way it was before you began the session. But if you Save before
you Think, you have lost it. However, before you panic, remember the
original CD-ROM, which will have the original file, unchanged. Take
care of that disk! If you do change the data on your C drive, save
it to a "floppy" (i.e., Zip, CD, DVD) as well as to your C
drive. You do not know how much experience lies behind that gratuitous
advice. Good luck. If you have any comments, suggestions or corrections
please email me at the address below. In time I will get back to you.
Or you may call our Guadalajara Censuses Project offices, number below.
We welcome any and all queries, comments, criticisms and suggestions.
If you have additional, or different information from what we have
here on individuals, please let us know. We have spent considerable
time and effort to get the data right and appreciate your help in this
regard.
Rodney D. Anderson
Department of History
Florida State University
Tallahassee, Florida 32306-2200.
randerso@mailer.fsu.edu
850-645-4697
Notes:
1. Understand that the order makes no difference to the statistical operations
which SPSS can do. As long as each case has its variable data, you could
organize the cases by sex, by age, etc. and still be able to obtain your
analysis, as you will see. The organization is for your convenience only.
2. If you do not specify any other variable besides surname, the surnames
will be ordered by the first numeric variable, namely the Master Index.
That is ordered by the order of the cuartels, of course, so that you
will have surnames ordered by residence, in effect. This will place all
surnames of the same household together, but, of course, will miss others
of the same household like, say, the mother who, by tradition, will go
by her paternal name.
3. SPSS publishes its own guides to their software. See SPSS Base 9.0
Brief Guide (NJ: Prentice-Hall, Inc., 1999). However, I would recommend
any of several guides to SPSS written by individuals. For example, Thomas
W. Pavkov and Kent A. Pierce, Ready, Set, Go! A Student Guide to SPSS
9.0 for Windows (CA: Mayfield Publishing Company, 2000) or Brian C. Cronk,
How To Use SPSS. A Step-by-Step Guide to Analysis and Interpretation
(NP: Pyrczak Publishing, 1999). A more extensive guide to both how to
use SPSS and to the kinds of statistical analysis which can be done with
SPSS is Marija J. Norusis, SPSS 9.0 Guide to Data Analysis (NJ: Prentice-Hall,
1999). The latter is the traditional "user friendly" guide
to SPSS and is really quite good with both the "how to" and
the statistics. It is, however, not cheap.
Excel 97 Version
Although SPSS is a useful tool for historians and social scientists,
it is not necessary to purchase and use SPSS to access much of the data
from the Guadalajara Census Project. All of the information present in
the original census manuscripts has been captured in the Excel archive
files of the project. Because it is quick, easy to use, and easily accessible
(many of you probably already have this program on your computer), most
users will probably prefer to examine the data in the Excel format. The
GCP used Excel 97 during its archival phase; this or any version published
after it will be acceptable for examining this data.
Getting Started. Unlike SPSS, Excel was never meant to handle a database
as big as the Guadalajara Census Project’s, which contains over
57,000 cases (Excel is limited to 10,000). Therefore, those of you
using Excel will be at one disadvantage; the GCP data is necessarily
separated into many individual files. The actual manuscripts of the
censuses are divided by cuartel, or neighborhood; therefore, the GCP
adopted this as the logical division for our data. The city of Guadalajara
is divided into 24 cuartels; each of these neighborhoods may or may
not have data for both 1821 and 1822. The GCP filenames contain information
about both the cuartel and the year. A file named c1(21) is Cuartel
1 in 1821; c15(22) is Cuartel 15, 1822. So instead of being able to
search for an individual once in the complete database, Excel users
will have to search repeatedly in many different files to find an individual.
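Since every file name carries the cuartel and the year, a short script can read both back out of the name. This is only a sketch: the function name parse_gcp_filename is hypothetical, and I am assuming the two digits in parentheses always stand for a year in the 1800s, as they do in the two examples above.

```python
import re

# Matches names such as "c1(21)" or "c15(22)"; anything after the
# closing parenthesis (a file extension, for instance) is ignored.
FILENAME = re.compile(r"^c(\d+)\((\d{2})\)")


def parse_gcp_filename(name):
    """Split a name like 'c15(22)' into (cuartel, year)."""
    match = FILENAME.match(name)
    if match is None:
        raise ValueError(f"not a GCP cuartel file name: {name!r}")
    cuartel = int(match.group(1))
    year = 1800 + int(match.group(2))  # ASSUMPTION: always 18xx
    return cuartel, year


print(parse_gcp_filename("c1(21)"))
print(parse_gcp_filename("c15(22)"))
```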
There are several differences between SPSS and Excel besides the size
of the files. Because Excel is a simple spreadsheet, it lacks many
of the bells and whistles described in the SPSS introduction. Excel
cannot, for example, list every last name in the city for you, or run
crosstabs to locate individuals within the city. However, Excel is
quick and easy to use.
To begin, open one of the Excel files. For the sake of continuity, we
will examine Cuartel 1, 1821, the same cuartel that was used in the
SPSS tutorial. Again, this file would be labeled c1(21). The file may
be opened either from the Excel start up screen or through the CD in
Windows Explorer. The file opens to display the first individual of
Cuartel 1, 1821: again, Manuel Machuca.
What you are looking at. The Excel version of the GCP data is much the
same as the SPSS data. Again, information is divided into two different
formats: variables, which run vertically and record the same kind of
information about every individual (male or female? Name? Last name?)
and cases, or rows, which run horizontally and record the information
about one individual each. Here, the Excel information is much simpler
than the information contained in SPSS. Because all cuartels and both
years (1821 and 1822) were combined in the SPSS file, it was necessary
to create a master index to keep track of each individual. Also, when
browsing through the data, one has to keep an eye on when cuartels
and years begin and end in the database. Not so in the Excel file.
Since we already know that we’re in c1(21), anyone we find in
this file lives in Cuartel 1 in 1821.
How the cases are organized. For the most part, the variables present
in the Excel version of the GCP data are in the same order that they
appear in the actual census manuscript. Therefore, the order of the
variables varies from file to file. Also, the information or variables
present in the files may vary. Some of the census takers included information
about race (calidad); many did not. Some included information about
where individuals had migrated from, if they were not originally from
Guadalajara. Two of the census takers included information about residency,
or how long the individuals had lived in Guadalajara. This information
proved very valuable to the GCP. However, you may not be interested
in this information, and want to reorganize the screen so that you
can see all the information pertinent to your search at one time.
Just as in the SPSS version, it is very easy to reorganize the order
of the variables in any order you wish. Simply use the copy/paste function
to rearrange the variables. In order to do this, first pick the place
where you want to move your target variable. Highlight the column to
the right of this spot by clicking on the letter at the top of the
column. For example, say you wanted to relocate the social status variable
to a new position in between the first and last name of the individual.
In this case, we would first make room for the variable by highlighting
the letter N at the top of the column labeled surname. The entire column
is then highlighted and ready to be changed. Go to the top of the screen
to “insert,” and scroll down to “column.” A
new column will appear between “name” and “surname.” The
next step is to copy the social status variable. Go to the variable.
Again, highlight the entire column by clicking on the letter above
the variable (in this case, H). Go to the Edit function at the top
of the screen. Scroll down and highlight “cut” or “copy” (depending
on whether you’d like to move the variable altogether or just copy
it). Then, simply highlight the empty column you’ve created,
go to the Edit function again, and paste the information.
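If the data has been exported out of Excel, the same cut-and-paste move can be sketched in code. The column names and the single row below are illustrative only (the "Don" comes from the first case in the SPSS section), and move_column is a hypothetical helper, not part of any GCP tool.

```python
def move_column(header, rows, name, before):
    """Move column `name` so it sits immediately before column `before`."""
    src = header.index(name)
    # Column positions with the moving column taken out ("cut").
    order = [i for i in range(len(header)) if i != src]
    # Put it back just in front of the target column ("paste").
    dst = order.index(header.index(before))
    order.insert(dst, src)
    new_header = [header[i] for i in order]
    new_rows = [[row[i] for i in order] for row in rows]
    return new_header, new_rows


header = ["status", "name", "surname"]   # hypothetical column names
rows = [["Don", "Manuel", "Machuca"]]

new_header, new_rows = move_column(header, rows, "status", "surname")
print(new_header)
print(new_rows)
```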
Finding Individuals. Unlike SPSS, there is no way to run frequencies
in Excel. Therefore, it is impossible to do such things as list all
last names in the city, as discussed in the introduction to SPSS.
As we have discussed above, searching for specific individuals in the
Excel files will be a little more time consuming than searching in
SPSS. However, it is still quick and easy. To find a specific individual,
we’ll use the “Find” function, located under “Edit.” Say,
for example, that we want to find a man named Gregorio Muro. The easiest
thing to do would be to search by last name. Bring up the find screen
by going to “Edit” and scrolling down to select “find.” A
screen will appear. In this case, we’ll only use the simplest
type of search. Type in the last name, and don’t worry about
all the other options; they’re automatically preset to search
everything rather than specific areas. It will take a couple of seconds
longer to complete the search, but is much easier to use than the specific
searches. Type in “Muro.” When searching in Excel, be sure
to always capitalize where necessary; the find feature in this program
is extremely picky and will not find anyone named “muro,” even
though there is an entire family of them. Also, be sure to check every
alternate spelling of a last (or first) name. Someone with the last
name “Gonzales” may be listed as “Gonzalez,” “Gonsales,” or “Gonsalez.” Now,
hit the find next button. A man named Nepomuceno Muro should be highlighted.
Gregorio Muro is a member of his family and happens, at this point,
to be visible. However, one could continue searching the rest of the
cuartel simply by repeatedly hitting the “find next” button.
Eventually, one could scroll through every person named “Muro” in
the cuartel. Again, one of the disadvantages of using Excel is that
every cuartel must be searched separately.
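For those comfortable with a little scripting, the three chores described here (checking every spelling, minding capitalization, and repeating the search in each cuartel file) can be rolled into one loop. The sketch below assumes the files have already been read into memory as lists of rows keyed by file name; the function name find_surname and the Gonsales row are made up, while the two Muros are the ones mentioned above.

```python
def find_surname(files, variants):
    """Search every cuartel file for any spelling in `variants`,
    ignoring capitalization. Returns (file name, row) pairs."""
    wanted = {v.casefold() for v in variants}
    hits = []
    for filename, rows in files.items():
        for row in rows:
            if row["surname"].casefold() in wanted:
                hits.append((filename, row))
    return hits


# Hypothetical contents of two cuartel files, keyed by GCP file name.
files = {
    "c1(21)": [
        {"name": "Nepomuceno", "surname": "Muro"},
        {"name": "Gregorio", "surname": "Muro"},
    ],
    "c15(22)": [
        {"name": "José", "surname": "Gonsales"},  # made-up person
    ],
}

spellings = ["Gonzales", "Gonzalez", "Gonsales", "Gonsalez"]
muros = find_surname(files, ["muro"])
gonzalezes = find_surname(files, spellings)
print(len(muros), len(gonzalezes))
```

Accented spellings such as González would need to be added to the list of variants, since this sketch does not strip accents.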
An easier way to search for an individual if it is a common last name
or one with several possible spellings (such as “Gonzales”)
is to use the Sort function. Go to the “Data” function
at the top of the screen, and choose “Sort.” In order to
search by last name, click on the first pull down menu and highlight “Surname,” then
hit enter. All the individuals in the cuartel will automatically be
sorted alphabetically. Now, simply scroll down the file until you reach
the Muros. In order to sort back to the original order, simply sort
the cases by index number.
What if there is more than one Gregorio Muro in the city? Again, hopefully
you have additional information about him. In this case, we are not
sure whether Gregorio is married, but we can see that he is a comerciante.
Ending the session. Simply go to “file,” then save. If you
have made any changes to the file, you may want to save the file to your
hard drive. Choose “save as,” name your file, and note where
the file is saving to. All the same rules for saving in SPSS apply here
in Excel; if you’ve altered the file and don’t want to save
the changes, do not save, but choose “Exit” and say “no” when
the dialogue box asking if you want to save pops up.
Again, if you have any questions, comments, suggestions, or corrections,
please feel free to get in contact with us at the Guadalajara Census
Project. Also, if you have additional or different information about
any of the individuals in the census data, please share it with us.
We’ve spent a long time getting to know all of these individuals,
their families, and their neighborhoods, and would be delighted to
know more about them.