The Guadalajara Censuses Project CD contains two major files: the Documentation File and the Database File.

  Documentation File

This file contains essays on the nature and purpose of the Guadalajara Censuses Project, and on the historical background of the city of Guadalajara and of the two censuses of 1821 and 1822. It explains the state of the project as of the date of issuance and describes each type of information in the database, briefly in the Variable List and in greater detail in the Codebook. Finally, it provides instructions on searching the database for individuals. The instructions pertain to two of the six formats in which the database is organized. One is the Statistical Package for the Social Sciences (SPSS), version 9.0, the software most commonly used by historians. Note: Although version 11.0 is now available and is included as one of the two SPSS formats provided on this CD-ROM, 9.0 will be the most widely recognized. Most instructions applicable to 9.0 will also apply to 10.0. The other is Excel, a well-known spreadsheet program. Further information on the methods and procedures used in creating the Guadalajara Censuses Database can be accessed at

  Database File

This file contains two versions of the database (Archive and Consolidated), each one containing the data in six different formats (SPSS 11.0 and 9.0; Excel; TAB-DEL.dat; F-ASCII.dat; DBASEIV.DBF).

  Archive File

The Archive File contains data entered verbatim as it appears on the pages of the manuscript census, preserving the original spelling, accent marks (or, more usually, their absence) and syntax. The concept of the “archive file” is to provide an accurate copy of the original document consistent with a statistical database. In brief, it contains nearly all the “literal” information found on the original padrón, or census manuscript page: the names of the residents, their ages, ethnicity (calidad), place of origin (patria), marital status and the like. The Archive File does not contain the “constructed” variables described below under the Consolidated File.

Genealogists and family historians may want to use this file when searching for the more common Hispanic surnames, provided the archaic spelling of the surname is known. Otherwise, anyone searching for a particular individual is advised to start with the Consolidated File.

The Archive File has not undergone all of the error detection processes and may contain data entry errors. Errors caught during data entry were corrected in the Archive File; errors revealed only after data entry was completed remain there and have been corrected in the Consolidated File. When in doubt, the information may be cross-checked against the Consolidated File. The two files are “shadow” indexed, meaning that, except where an error was detected in the presence or absence of a case (i.e., an individual) or a household, the order of the cases is the same in both files.
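Because the two files are shadow indexed, a record in one file can usually be compared with the record at the same position in the other. The following is a minimal sketch of that cross-check in Python, not the project's own procedure. The field names (surname, age) and the sample records are invented for illustration; the real variable names are those given in the Variable List and Codebook.

```python
# Hypothetical miniature versions of the two files.
# Field names and values are invented for illustration only.
archive = [
    {"surname": "Ximenes", "age": 34},
    {"surname": "Baldes", "age": 12},
]
consolidated = [
    {"surname": "Jiménez", "age": 34},
    {"surname": "Valdés", "age": 12},
]


def cross_check(archive_rows, consolidated_rows, position):
    """Return the archive and consolidated records found at the same position."""
    if len(archive_rows) != len(consolidated_rows):
        # A case added or removed during error correction shifts the order,
        # so positions can no longer be assumed to line up.
        raise ValueError("files differ in length; positions may not line up")
    return archive_rows[position], consolidated_rows[position]


original, modern = cross_check(archive, consolidated, 0)
print(original["surname"], "->", modern["surname"])
```

The length check is only a coarse guard. Where a case or household was added or removed in the Consolidated File, positions after that point would need to be matched by some other means.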

  Consolidated File

This file contains 87 variables. The variables fall into two major categories: alphanumeric (“string”) and numeric (“coded”). String variables record the data as written. The occupation “carpintero,” for example, would be entered into the database as “carpintero.” An example of a coded variable is marital status, in which each type of marital status is entered as a code (1 = parvulo, 2 = soltero, 3 = doncella, etc.). Each of those two categories is then subdivided into two types: literal and constructed.
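The difference between a string variable and a coded variable can be sketched in a few lines of Python. The lookup table below holds only the three marital status codes quoted above; the complete list is in the Codebook. The record layout and the field names are assumptions made for this example.

```python
# Only the three codes quoted in the text; the Codebook has the full list.
MARITAL_STATUS = {1: "parvulo", 2: "soltero", 3: "doncella"}


def decode_marital_status(code):
    """Translate a marital status code into its label, if the code is listed."""
    return MARITAL_STATUS.get(code, f"unlisted code {code}")


# Hypothetical record: a string variable stored as written, a coded one as a number.
record = {"occupation": "carpintero", "marital_status": 2}

print(record["occupation"])                             # string variable, as written
print(decode_marital_status(record["marital_status"]))  # coded variable, decoded
```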

  Literal Variables

Literal variables are those pieces of information taken verbatim from the census manuscript page, as noted under the Archive File. In the Consolidated File, however, those data have been converted to the most common modern spellings and accent marks for surnames, occupations, places of birth, etc. This includes the substitution of “v” for the archaic “b,” “j” for “x” (except where “x” is still used, as in México), etc. To facilitate statistical analysis, most string (alphabetic) variables have been paired with a duplicate numeric variable. To further expedite analysis, variables characterized by a large number of values (e.g., age, marital status, ethnicity and birthplace) have also been paired with a consolidated version of that variable. In both cases, the paired variable is usually identified with a “2,” as in Age2.
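As an illustration of pairing a string variable with a duplicate numeric variable, the Python sketch below assigns an integer code to each distinct spelling. It is not the GCP's coding scheme: the alphabetical numbering is an assumption, the name occupation2 is hypothetical, and the sample occupations other than “carpintero” are invented. The codes actually used are those listed in the Codebook.

```python
def build_codes(values):
    """Assign an integer code, starting at 1, to each distinct value in sorted order."""
    distinct = sorted(set(values))
    return {value: code for code, value in enumerate(distinct, start=1)}


# Invented sample of a string variable.
occupations = ["carpintero", "sastre", "carpintero", "zapatero"]

codes = build_codes(occupations)
# Hypothetical paired numeric variable, one code per case.
occupation2 = [codes[value] for value in occupations]

print(codes)
print(occupation2)
```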

  Constructed Variables

Constructed variables are those created by GCP staff from the “literal” data contained in the census. Their purpose is to provide as much information as possible in an easy-to-use format. Such variables range from the simple, such as the number of employed persons present in a household, to complex categories of household and family structure. Because constructed variables involve various levels of interpretation, the GCP staff have also created variables that quantitatively measure the nature and degree of interpretation and assumption. See Database Documentation for details.
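A simple constructed variable of the kind mentioned above, the number of employed persons in a household, might be derived along the following lines in Python. This is a sketch under stated assumptions, not the GCP's method: the field names are hypothetical, the sample cases are invented, and treating any non-empty occupation as “employed” is an assumption made only for the example.

```python
from collections import Counter

# Invented sample cases; "household" and "occupation" are hypothetical field names.
people = [
    {"household": 1, "occupation": "carpintero"},
    {"household": 1, "occupation": ""},
    {"household": 2, "occupation": "sastre"},
    {"household": 2, "occupation": "carpintero"},
]


def employed_per_household(rows):
    """Count the cases with a recorded occupation in each household."""
    counts = Counter()
    for row in rows:
        # Assumption: a non-empty occupation means the person is employed.
        # Adding 0 still registers households with no employed members.
        counts[row["household"]] += 1 if row["occupation"].strip() else 0
    return dict(counts)


print(employed_per_household(people))
```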





