Data Submission

 

    Technical Instructions for Cross Site Data Transmission – Nov. 16 1998

This document provides basic guidelines for data exchange to Target Cities staff members at all Target Cities locations. It will be updated as the data collection process evolves.

First, your data file(s) must be sent as a fixed format ASCII file. The record length of the file must match the specified length in the most recent codebook. This means that each variable must reside in a column specified in the codebook. The fixed format is non-delimited; there are no commas and no spaces separating one variable from another. Both SAS and SPSS have the capability of generating fixed format ASCII files.
Another important concern is date format. We have experienced a number of problems with reading dates that were coded improperly. If the codebook says that the FIELD CONSTRAINT is MMDDYY then the interview date of, for example, February 14, 1996 should be coded as "021496", NOT "02141996" or, still worse, "021419". Where the FIELD CONSTRAINT is MMDDYYYY, the right coding is "02141996", NOT "021496  " or "0214  96". Please make sure you check what your date fields look like in the generated ASCII file before sending it.
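
As an illustration only, the two checks described above (date coding and record length) could be scripted before a file is sent. The following is a minimal Python sketch, not part of the official checking programs; the function names are hypothetical, and the expected record length must be taken from the most recent codebook.

```python
from datetime import date

# Map each codebook FIELD CONSTRAINT to a strftime pattern.
# Both patterns zero-pad the month and day.
DATE_FORMATS = {"MMDDYY": "%m%d%y", "MMDDYYYY": "%m%d%Y"}


def format_date(value: date, constraint: str) -> str:
    """Code a date according to the given FIELD CONSTRAINT."""
    return value.strftime(DATE_FORMATS[constraint])


def bad_record_lengths(lines, expected_length):
    """Return the 1-based numbers of records whose length is wrong.

    Only the line terminator is stripped; trailing blanks are part of
    a fixed format record and must be counted.
    """
    return [
        number
        for number, line in enumerate(lines, start=1)
        if len(line.rstrip("\r\n")) != expected_length
    ]


interview = date(1996, 2, 14)
print(format_date(interview, "MMDDYY"))    # 021496
print(format_date(interview, "MMDDYYYY"))  # 02141996
```

Running bad_record_lengths over the lines of the generated ASCII file, with the record length from the codebook, gives a quick list of records to inspect before submission.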

Currently, each time you submit data, you should send your entire data set. All data must be converted to the most recent version of the database. The cutoff date for the data should be the end of the previous month. For example, data due in Akron by August 31 should include all data through July 31.
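
The cutoff rule can be expressed as a small calculation. This is a sketch in Python; the function name is hypothetical, and the year 1998 in the example is assumed only for illustration.

```python
from datetime import date, timedelta


def cutoff_date(due: date) -> date:
    """Return the last day of the month before the month the data is due."""
    first_of_due_month = due.replace(day=1)
    return first_of_due_month - timedelta(days=1)


print(cutoff_date(date(1998, 8, 31)))  # 1998-07-31
```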

The data may be sent via e-mail to target@uakron.edu (the preferred method) or on disks. If it is sent using e-mail then the data should be sent as an "attachment" to an email message. The name of the data file must follow the format described below. The body of the e-mail message must contain the following information:

National Data Ver. x.x (the value of "DBVER" for this data)
Database (1, 1e, 1c, 2a, 2ae, 2ac, 2be, 2bc, 3e, 3c)
City name
Date range of the data
Date of download
A list of the SAS and SPSS programs that were used to check the data
A short narrative detailing the status of the data set after these programs were run
Are you submitting data in the most recent format of the codebook? - YES or NO
Is the data file named according to the convention described below? - YES or NO
Does the record length of the file match the specified length in the most recent codebook? - YES or NO

For example:

National Data Ver. 1.4
Database 1
Dallas
Date range: 1/24/95 – 10/31/97
Download date: 11/20/97
ASCII1.SPS, CONSIST1.SPS, MACRO1.SAS, and DUP1.SAS were used to check the data.
All problems reported by these programs have been corrected in the data.
YES - we are submitting data using the format in the latest codebook.
YES - the data file is named according to the convention.
YES - the record length of the file matches the specified length in the most recent codebook.
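
A site that prepares submissions regularly might assemble the message body with a short script so that no required item is left out. The Python sketch below is one possible approach, not a required tool; the function name, parameter names, and the exact wording of the three YES/NO lines are assumptions.

```python
def submission_body(dbver, database, city, date_range, download_date,
                    programs, status, latest_format, named_ok, length_ok):
    """Build the e-mail body with the required items in the listed order."""

    def yes_no(flag):
        return "YES" if flag else "NO"

    lines = [
        f"National Data Ver. {dbver}",
        f"Database {database}",
        city,
        f"Date range: {date_range}",
        f"Download date: {download_date}",
        ", ".join(programs) + " were used to check the data.",
        status,
        f"Most recent codebook format: {yes_no(latest_format)}",
        f"File named according to the convention: {yes_no(named_ok)}",
        f"Record length matches the codebook: {yes_no(length_ok)}",
    ]
    return "\n".join(lines)


print(submission_body(
    "1.4", "1", "Dallas", "1/24/95 - 10/31/97", "11/20/97",
    ["ASCII1.SPS", "CONSIST1.SPS", "MACRO1.SAS", "DUP1.SAS"],
    "All problems reported by these programs have been corrected in the data.",
    True, True, True,
))
```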

The e-mail address for the data and any other correspondence related to Target Cities is:

target@uakron.edu

If the data is sent by disk, the size of the data set may become an issue. Without compression, each disk can accommodate only 3,000 cases (database 1). If the data set is larger than one disk, we request that you use Microsoft’s BACKUP and the PKZIP program to compress and back up the data. BACKUP is included with Microsoft DOS and Windows; you can run it in DOS mode or from inside Windows. A copy of PKZIP has previously been distributed to each site and is available for download from our Downloads page. If you decide not to compress your data, you may divide your clients into units of 3,000 (database 1), placing the first 3,000 on one disk and the remainder on subsequent disks, again in units of 3,000. All disks must be enclosed in disk mailers.
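
For uncompressed data, the division into units of 3,000 can be sketched as follows in Python. The function name is hypothetical, and the default of 3,000 applies to database 1 only; this document gives no per-disk figure for the other databases.

```python
def split_into_disks(records, per_disk=3000):
    """Split the records into consecutive units of at most per_disk cases."""
    return [
        records[start:start + per_disk]
        for start in range(0, len(records), per_disk)
    ]


# 7,000 placeholder cases need three disks: 3,000 + 3,000 + 1,000.
disks = split_into_disks(list(range(7000)))
print([len(unit) for unit in disks])  # [3000, 3000, 1000]
```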

In order to maintain a high level of quality control, we ask that the data disk submitted be labeled in the following manner:

National Data Ver. x.x (the value of DBVER for this data)
Database (1, 1e, 1c, 2a, 2ae, 2ac, 2be, 2bc, 3e, 3c)
City name
Date range of the data
Date of download
Disk # of #

To easily distinguish one data set from another, please use the following naming conventions for the files:

If you are submitting nine or fewer disks (or sets of data):

The first two characters will be the Site ID Number.

The third character (and fourth) will be:

A "C" for database 1
A "CE" for database 1e
A "CC" for database 1c
A "D" for database 2a
A "DE" for database 2ae
A "DC" for database 2ac
A "OE" for database 2be
A "OC" for database 2bc
A "FE" for database 3e
A "FC" for database 3c

This is followed by the "Disk # of #" information found on the label formatted in the form "#of#".

For example, the file name on disk 2 of 3 of New Orleans’s Client Supplied data will be 09C2of3.dat.

PLEASE don't substitute the letter "O" for the digit zero ("0") in the file names, or vice versa!
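
The naming convention can be summarized in a short Python sketch. The function and dictionary names are hypothetical, the ".dat" extension is taken from the example above, and the sketch covers only the case of nine or fewer disks described here.

```python
# Database identifier -> letter code, as listed above.
# "OE" and "OC" use the letter O, not the digit zero.
DATABASE_CODES = {
    "1": "C", "1e": "CE", "1c": "CC",
    "2a": "D", "2ae": "DE", "2ac": "DC",
    "2be": "OE", "2bc": "OC",
    "3e": "FE", "3c": "FC",
}


def data_file_name(site_id: int, database: str, disk: int, total: int) -> str:
    """Build a file name: two-digit site ID, database code, then '#of#'."""
    if not 1 <= disk <= total <= 9:
        raise ValueError("this sketch covers nine or fewer disks only")
    return f"{site_id:02d}{DATABASE_CODES[database]}{disk}of{total}.dat"


print(data_file_name(9, "1", 2, 3))  # 09C2of3.dat
```

Formatting the site ID as a number guarantees that the leading character is the digit zero rather than the letter "O".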

 

If you are sending data on diskettes, the first disk should include a file named READ.ME that contains the following information:

A list of the SAS and SPSS programs that were used to check the data
A short narrative detailing the status of the data set after these programs were run

 

A checklist is provided at the bottom of this page and should be filled out before each data submission. If you are sending data via e-mail, the checklist is part of the e-mail header. If the data is sent on diskettes, a hard copy of the checklist should be enclosed.

All data disks should be sent to Shoba Nair in Akron at:

Target Cities Cross-Site Team
Department of Sociology
University of Akron
Olin Hall
Akron, OH 44825-1905
(330) 972-6712

If you have any questions about this process or about e-mailing files, please contact Douglas McGivern or Sevy Petras in Akron at (330) 972-5773, or Shoba Nair at (330) 972-6712.

 

Note that if these procedures are not followed the data will not be incorporated into the national dataset(s).

Data Submission Checklist

If the data is sent by e-mail:

Question

Yes/No

Has all of the following information been included in the e-mail header?
National Data Ver. x.x (the value of "DBVER" for this data)
Database (1, 1e, 1c, 2a, 2ae, 2ac, 2be, 2bc, 3e, 3c)
City name
Date range of the data
Date of download
A list of the SAS and SPSS programs that were used to check the data
A short narrative detailing the status of the data set after these programs were run
Are you submitting data in the most recent format of the codebook?
Is the data file named according to the convention described above?
Does the record length of the file match the specified length in the most recent codebook?

 

Data Submission Checklist

If the data is sent on disk:

Question

Yes/No

Are the data files named according to the convention described above?  
Are the disks labeled according to the convention described above?  
Are all disks enclosed in disk mailers?  
Does the record length of the file match the specified length in the most recent codebook?  
Are you submitting data in the most recent format of the codebook?  
Is the READ.ME file on the first disk (including the following information)?
A list of the SAS and SPSS programs that were used to check the data
A short narrative detailing the status of the data set after these programs were run
 

 

This page was last modified on December 05, 1998. 

 


Copyright Target Cities Research Project Cross-Site Team, 1998.
The views and opinions expressed in this page are the responsibility of the page author(s). The contents of this page have not been reviewed by The University of Akron.
The content of this site has not been reviewed, approved or endorsed by the US Department of Health and Human Services or any of its divisions.