Dataset basics: Using listing survey details in a household survey


Click here to download the dataset basics sample form. This .zip file contains two .xlsx files with form definitions plus a .xml file with a dataset definition.

You will upload this particular sample to your server in three parts: as two forms and one dataset that links them. First, go to the Your forms and datasets section of the Design tab and upload each of the two .xlsx files, one at a time, using + followed by Upload form definition. Then, click + again and Add server dataset to add a new dataset, choose the New dataset from definition tab, and upload the .xml file.

This sample demonstrates how datasets can be used to link multiple survey forms into a single workflow. In this case, the workflow is this:

Household listing -> household sample -> household surveys

Accordingly, the sample includes two survey forms and a dataset to link them: a household listing survey, a household sample dataset, and a household survey. The survey forms are short; they do not include the litany of questions you would normally include in a household listing or in a full household survey. These forms are meant only to demonstrate the logic of connecting surveys via datasets.

Look first at the household listing form included with this sample. It includes fields for the household address and the head of household's name. It also includes several calculated fields:

type	name	calculation
calculate	hhid	once(int(random()*900000)+100000)
calculate	random_draw1	once(random())
calculate	in_sample	if(${random_draw1} < 0.60, 1, 0)
calculate	random_draw2	once(random())
calculate	survey_team	if(${random_draw2} < 0.50, 'A', 'B')

The hhid field automatically generates a random six-digit household ID for each listed household. The in_sample field uses a random draw to decide whether or not to include this household in the sample: 60% of the time in_sample will be given a 1, and 40% of the time it will be given a 0. Finally, survey_team uses a second random draw to assign 50% of households to survey team A and 50% to team B.
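To see what these calculations do in aggregate, here is a minimal Python sketch that mimics them outside of SurveyCTO. It is only an illustration: Python's random module stands in for the form's random() function, and the function name listing_calculations is hypothetical.

```python
import random


def listing_calculations(rng: random.Random) -> dict:
    """Mimic the listing form's calculated fields for one household."""
    # hhid: int(random()*900000)+100000 always lands in 100000..999999.
    hhid = int(rng.random() * 900000) + 100000
    # in_sample: 1 when the first draw falls below 0.60, otherwise 0.
    in_sample = 1 if rng.random() < 0.60 else 0
    # survey_team: 'A' when the second draw falls below 0.50, otherwise 'B'.
    survey_team = "A" if rng.random() < 0.50 else "B"
    return {"hhid": hhid, "in_sample": in_sample, "survey_team": survey_team}


rng = random.Random(2024)  # fixed seed so the simulation is repeatable
households = [listing_calculations(rng) for _ in range(10_000)]
share_in_sample = sum(h["in_sample"] for h in households) / len(households)
share_team_a = sum(h["survey_team"] == "A" for h in households) / len(households)
print(f"in sample: {share_in_sample:.1%}, team A: {share_team_a:.1%}")
```

Across many simulated listings, the share in the sample should settle near 60% and the share on team A near 50%. In the form itself, once() keeps each draw fixed after it is first calculated, so a household's ID and assignments do not change if the form is reopened.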

In this case, the listing form itself assigns household IDs, chooses a sample, and makes random assignments. You might prefer to reserve some or all of those tasks for your back-office team; while the goal of this sample is to demonstrate a fully-automated workflow, in practice you might well build a semi-automated workflow instead.

This household listing form publishes data directly into the dataset included with this sample. Specifically, the hhid field publishes into the dataset's hhid_key field, the address and head-of-household fields publish to fields by the same names, and the survey_team field publishes to the team field. Importantly, publishing is set so that only submissions for which in_sample is equal to 1 are published into the dataset; that's because we want this dataset to hold just our randomly chosen household sample, not all listed households.
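The publishing rule can be pictured as a field mapping plus a filter. The Python sketch below is a conceptual illustration only, not how the server implements publishing: the publish function and the example submissions are made up, and the form field names address and hoh_name are assumed to match the dataset columns of the same names.

```python
# Listing-form field -> dataset column, following the mapping described above.
FIELD_MAP = {
    "hhid": "hhid_key",
    "address": "address",
    "hoh_name": "hoh_name",
    "survey_team": "team",
}


def publish(submission: dict, dataset: list) -> bool:
    """Append a mapped row to the dataset only when in_sample equals 1."""
    if submission.get("in_sample") != 1:
        return False  # households outside the sample are never published
    dataset.append({column: submission[field] for field, column in FIELD_MAP.items()})
    return True


# Two made-up submissions: one in the sample, one not.
hh_sample = []
publish({"hhid": 123456, "address": "12 Main St", "hoh_name": "A. Banda",
         "survey_team": "A", "in_sample": 1}, hh_sample)
publish({"hhid": 654321, "address": "9 Hill Rd", "hoh_name": "C. Phiri",
         "survey_team": "B", "in_sample": 0}, hh_sample)
print(hh_sample)
```

Only the first submission ends up in hh_sample, and its survey_team value is stored under the team column.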

This dataset is then attached – as pre-loaded data – to the second form included with this sample, the household survey form.

This second form begins with a simple select_one field, named team, that asks for which of the two teams to list households.

The next field asks the enumerator to select one of the sample households assigned to the selected survey team.

This field dynamically loads the list of options from pre-loaded data – in this case, data pre-loaded from the attached dataset (see Loading multiple-choice options from pre-loaded data). Following are the relevant rows from the survey and choices worksheets of the form definition.

type	name	label	appearance
select_one household	hhid	Which household are you surveying?	search('hh_sample', 'matches', 'team', ${team})

list_name	value	label
household	hhid_key	address, hoh_name

The search() appearance indicates that the choice options should include unique values from the hh_sample dataset, from all rows for which the team column matches the selected survey team ("A" or "B"). The value and label columns of the choices sheet indicate the dataset columns to use for the choice values and labels (hhid_key for the values and both address and hoh_name for the labels).
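Conceptually, the 'matches' filter behaves like the following Python sketch. This is an illustration of the logic only, under stated assumptions: the function search_matches and the sample rows are hypothetical, and joining the two label columns with a comma is an assumption about how the labels are displayed.

```python
def search_matches(rows, filter_column, filter_value,
                   value_column="hhid_key", label_columns=("address", "hoh_name")):
    """Return (value, label) choices for rows whose filter_column equals filter_value."""
    choices = []
    seen = set()
    for row in rows:
        if row[filter_column] != filter_value:
            continue  # keep only rows for the selected team
        value = row[value_column]
        if value in seen:
            continue  # each choice value is listed once
        seen.add(value)
        label = ", ".join(str(row[column]) for column in label_columns)
        choices.append((value, label))
    return choices


# Made-up rows standing in for the hh_sample dataset.
hh_sample = [
    {"hhid_key": 123456, "address": "12 Main St", "hoh_name": "A. Banda", "team": "A"},
    {"hhid_key": 234567, "address": "9 Hill Rd", "hoh_name": "C. Phiri", "team": "B"},
    {"hhid_key": 345678, "address": "3 Lake Ave", "hoh_name": "D. Mwale", "team": "A"},
]
team_a_choices = search_matches(hh_sample, "team", "A")
print(team_a_choices)
```

With team A selected, only the two team A households are offered as choices, each labeled with its address and head-of-household name.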

Next, once the user has selected a specific household, the pulldata() function is used to read the address and hoh_name fields from the pre-loaded data (i.e., from the dataset).

type	name	calculation
calculate	address	pulldata('hh_sample', 'address', 'hhid_key', ${hhid})
calculate	hoh_name	pulldata('hh_sample', 'hoh_name', 'hhid_key', ${hhid})

Because these fields are now part of the household survey form, later questions can reference them, for example to ask the enumerator to confirm that they are at the correct household.
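The pulldata() calls above amount to a keyed lookup: find the dataset row whose hhid_key equals the selected household, then return one column from that row. A rough Python equivalent is sketched below; the helper lookup_value and the sample rows are hypothetical, and returning an empty string when no row matches is an assumption rather than documented behavior.

```python
def lookup_value(rows, return_column, key_column, key_value):
    """Return return_column from the first row whose key_column equals key_value."""
    for row in rows:
        # Compare as text so that 123456 and "123456" are treated as the same key.
        if str(row[key_column]) == str(key_value):
            return row[return_column]
    return ""  # assumption: no matching row yields an empty value


# Made-up rows standing in for the hh_sample dataset.
hh_sample = [
    {"hhid_key": 123456, "address": "12 Main St", "hoh_name": "A. Banda", "team": "A"},
    {"hhid_key": 234567, "address": "9 Hill Rd", "hoh_name": "C. Phiri", "team": "B"},
]
selected_hhid = "234567"  # the value chosen in the hhid field
address = lookup_value(hh_sample, "address", "hhid_key", selected_hhid)
hoh_name = lookup_value(hh_sample, "hoh_name", "hhid_key", selected_hhid)
print(f"Confirm: {hoh_name}, {address}")
```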

As mentioned before, in this sample everything is fully automated: whenever new household listing forms are submitted to the server, the household sample dataset – and the household survey form to which it is attached – is automatically updated within 10 minutes. An alternative workflow might be:

The listing form publishes key household details into a household dataset – but the form does not include drawing a random household ID, drawing a random sample, or assigning surveys to survey teams.

A member of the field or back-office team downloads the household dataset and updates it to include household IDs, an indicator of which households to include in the sample, and survey team assignments, all in additional columns of the downloaded dataset.

The team member uploads the updated dataset to the server, allowing the server to merge the revised data with any additional data that might have been published in the meantime (i.e., if new household listing forms were submitted since the team member downloaded the dataset).

The household survey form filters the list of possible households to include only those in the sample (since now the dataset includes both households in and out of the sample).
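The back-office step in this alternative workflow could be done in a spreadsheet or with a short script. The Python sketch below shows one possible approach, under stated assumptions: the column names hhid_key, in_sample, and team, the 60% sampling share, and both function names are illustrative choices, not requirements of the platform.

```python
import random


def prepare_sample(rows, sample_share=0.60, seed=None):
    """Add household IDs, a sample indicator, and team assignments to downloaded rows."""
    rng = random.Random(seed)
    # Draw IDs without replacement so no two households share a six-digit ID.
    ids = rng.sample(range(100000, 1000000), k=len(rows))
    prepared = []
    for row, hhid in zip(rows, ids):
        updated = dict(row)  # leave the downloaded rows untouched
        updated["hhid_key"] = hhid
        updated["in_sample"] = 1 if rng.random() < sample_share else 0
        updated["team"] = "A" if rng.random() < 0.50 else "B"
        prepared.append(updated)
    return prepared


def households_to_survey(rows, team):
    """Mirror the survey form's filter: sampled households for one team only."""
    return [row for row in rows if row["in_sample"] == 1 and row["team"] == team]


# Made-up listing rows as they might look after download.
listed = [{"address": f"{n} Main St", "hoh_name": f"Household {n}"} for n in range(1, 201)]
prepared = prepare_sample(listed, seed=7)
team_a = households_to_survey(prepared, "A")
print(len(prepared), "listed;", len(team_a), "sampled for team A")
```

Because the dataset now holds every listed household, the final filter on in_sample does the work that conditional publishing did in the fully automated version.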

There are countless possible workflows. This sample and the accompanying discussion are just meant to give you an idea of what is possible. The next sample continues by extending this one to include an additional back-check survey.
