About the GCP CD-ROM
The Guadalajara
Census Project CD contains two major files: the Documentation File and
the Database File.
Documentation File
This file
contains essays on the nature and purpose of the Guadalajara Censuses Project,
and on the historical background of the city of Guadalajara and of the
two censuses of 1821 and 1822. The state of the project as of the date
of issuance is explained and each type of information in the database is
described in the Variable List and in greater detail in the Codebook. Finally,
instructions are provided on searching the database for individuals. The
instructions pertain to two of the six formats in which the database is
organized. One is the software most commonly used by historians, the Statistical
Package for the Social Sciences (SPSS), version 9.0. Note:
although version 11.0 is now available and is included as one of the two
SPSS formats provided on this CD-ROM, 9.0 will be the most widely recognized.
Most instructions applicable to 9.0 will also apply to 10.0. The
other is Excel, a well-known spreadsheet program. Further information on the methods
and procedures used in creating the Guadalajara Censuses Database can be
accessed at
Database File
This file
contains two versions of the database (Archive and Consolidated), each
one containing the data in six different formats (SPSS 11.0 and 9.0; Excel;
TAB-DEL.dat; F-ASCII.dat; DBASEIV.DBF).
Archive File
The Archive
File contains data entered verbatim as it appears on the pages of the manuscript
census, preserving the original spelling, accent marks (usually the lack
of same) and syntax. The purpose of the Archive File is to
provide an accurate copy of the original document consistent with a statistical
database. Genealogists and family historians may want to use this file
in searching for more common Hispanic surnames when the archaic spelling
of the family surname is known. Otherwise, anyone searching for a particular
individual is advised to start with
the Consolidated File. It should be understood, however, that the Archive
File has not undergone all the error detection processes and may contain
data entry errors which have since been corrected in the Consolidated File.
When in doubt, the
information may be cross-checked with the Consolidated File.
Each file is shadow indexed, meaning that, except where an error
has been detected in the presence or absence of a case (i.e., an individual)
or a household, the order of the cases in both files is the same.
Errors caught in the process of data entry have been corrected in the Archive
File. Only errors which were revealed after the data entry
had been completed remain in the Archive File. They have been corrected
in the Consolidated File. In brief, the Archive File contains nearly all
the literal information which one would find on the original padron or
census manuscript page. Such data include the names of the residents,
their ages, ethnicity (calidad), place of origin (patria),
marital status and the like. The Archive File does not contain the constructed variables
described below under the Consolidated File.
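Because the two files share the same case order, a cross-check can in principle be done by position. The following is a minimal sketch of that idea, assuming the rows of each file have already been loaded as dictionaries; the function name `cross_check`, the field names `name` and `age`, and the sample rows are hypothetical and are not taken from the GCP Codebook.

```python
from itertools import zip_longest


def cross_check(archive_rows, consolidated_rows, fields):
    """Yield (position, field, archive_value, consolidated_value) per mismatch.

    Rows are compared by position, relying on the shared case order.
    """
    pairs = zip_longest(archive_rows, consolidated_rows)
    for pos, (arch, cons) in enumerate(pairs):
        if arch is None or cons is None:
            # One file has a case the other lacks, so later positions
            # can no longer be assumed to line up. Report it and stop.
            yield (pos, None, arch, cons)
            break
        for field in fields:
            if arch.get(field) != cons.get(field):
                yield (pos, field, arch.get(field), cons.get(field))


# Hypothetical sample rows, for illustration only.
archive = [
    {"name": "Josef Ximenes", "age": "34"},
    {"name": "Maria Basques", "age": "82"},
]
consolidated = [
    {"name": "José Jiménez", "age": "34"},
    {"name": "María Vázquez", "age": "28"},
]

# Names are expected to differ (modernized spelling), so only age is compared.
differences = list(cross_check(archive, consolidated, ["age"]))
```

Here `differences` holds one entry, for the second case, whose age differs between the two files. Where a case or household was added or removed during error correction, positions after that point no longer align, which is why the sketch stops at the first missing case.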
Consolidated File
This file
contains 87 variables. The variables are divided into two major categories:
alphanumeric (string) and numeric (coded).
String variables record the data as written. The occupation carpintero, for
example, would be entered into the database as carpintero. An
example of a coded variable would be marital status, in which the many
types of marital status are entered as codes (1 = parvulo, 2 = soltero,
3 = doncella, etc.). Those two categories are then subdivided into two
types: literal and constructed.
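The difference between a string variable and a coded variable can be illustrated with a small lookup table. This is only a sketch: the three codes are the ones given in the text above, the remaining codes are left out because they are not listed here, and the names `MARITAL_STATUS`, `decode` and `encode` are hypothetical.

```python
# Only the three codes named in the text; the full list is in the Codebook.
MARITAL_STATUS = {1: "parvulo", 2: "soltero", 3: "doncella"}


def decode(code, codebook):
    """Return the label for a numeric code, or a marker if it is not listed."""
    return codebook.get(code, f"unlisted code {code}")


def encode(label, codebook):
    """Return the numeric code for a label, or None if it is not listed."""
    for code, known_label in codebook.items():
        if known_label == label:
            return code
    return None


label = decode(2, MARITAL_STATUS)          # "soltero"
code = encode("doncella", MARITAL_STATUS)  # 3
```

A string variable such as occupation needs no such table, since the value stored is the word itself (carpintero).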
Literal Variables
Literal
variables are those pieces of information taken verbatim from the census
manuscript page, as noted in the Archive File. However, in the Consolidated
File those data have been converted into the most common modern spellings
and accent marks for surnames, occupations, places of birth, etc. This includes
the substitution of v for the archaic b, j for x (except
where x is still used, such as in México), etc. To
facilitate statistical analysis, most string (alphabetic) variables have
been paired with a duplicate numeric variable. To further expedite analysis,
variables characterized by a large number of values (e.g., age, marital
status, ethnicity and birthplace) have also been paired with a consolidated
version of that variable. In both cases, the paired variable is usually
identified with a "2", as in Age2.
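When searching across both files, it can help to map an archaic spelling to its modern form. The sketch below uses an explicit lookup table rather than a blanket letter substitution, since the text notes that x is retained in some words, such as México. The table entries and the names `MODERN_SPELLINGS` and `modernize` are illustrative assumptions, not the conversion list used by the GCP staff.

```python
# Illustrative entries only; not the GCP conversion list.
MODERN_SPELLINGS = {
    "ximenez": "Jiménez",   # x -> j
    "xavier": "Javier",     # x -> j
    "basquez": "Vázquez",   # b -> v
}


def modernize(name):
    """Return the modern spelling if the name is in the table, else the name unchanged."""
    return MODERN_SPELLINGS.get(name.strip().lower(), name)


converted = modernize("Ximenez")   # found in the table
unchanged = modernize("México")    # not in the table, so x is kept
```

A name absent from the table is returned as written, so words that legitimately keep the older letter are never altered by accident.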
Constructed Variables
Constructed
variables are those created by GCP staff from the literal data
contained in the census. The purpose of constructed variables is to provide
as much information as possible in an easy-to-use format. Such
variables range from something as simple as the number of employed persons present in
a household to complex categories of household and family structure. Because
constructed variables involve various levels of interpretation, the GCP
staff have also created variables which quantitatively measure the nature
and degree of interpretation and assumption. See Database Documentation for
details.
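The simplest kind of constructed variable mentioned above, the number of employed persons in a household, can be sketched as follows. This is an assumption-laden illustration: the column names `household` and `occupation` are hypothetical, and treating any non-empty occupation as "employed" is a simplification made here, not the rule applied by the GCP staff.

```python
from collections import Counter


def employed_per_household(rows, household_key="household",
                           occupation_key="occupation"):
    """Count persons with a recorded occupation in each household.

    Assumption: a person is counted as employed when the occupation
    field is non-empty. Column names are hypothetical.
    """
    counts = Counter()
    for row in rows:
        household = row[household_key]
        counts[household] += 0  # make sure every household appears, even with 0
        if (row.get(occupation_key) or "").strip():
            counts[household] += 1
    return dict(counts)


# Hypothetical sample rows, for illustration only.
rows = [
    {"household": 1, "occupation": "carpintero"},
    {"household": 1, "occupation": ""},
    {"household": 2, "occupation": ""},
]

employed = employed_per_household(rows)
```

With these sample rows, household 1 has one employed person and household 2 has none. The more complex constructed variables, such as household and family structure, involve interpretation that a count like this does not capture.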