Pre-loading: Referencing pre-loaded .csv data
To open or use this sample form, go to the Design tab, scroll down to the Your forms and datasets section, and click + then Start new form; then enable Use a sample form as your starting point and choose Pre-loading: Referencing pre-loaded .csv data from the list. You can also click here to download the spreadsheet form definition (this is a .zip file that contains a .xlsx sample form as well as a .csv file containing sample data for pre-loading).
If you upload the spreadsheet form definition to your SurveyCTO server manually, be sure to attach the .csv file – or upload the .csv file as a dataset named hhplotdetails and then attach that dataset – otherwise the form won't be able to find the data when users are filling it out.
When you open the .csv file, you will see that the first column is titled hhid_key. Here, "hhid" stands for "household identification number"; the suffix "_key" indicates that the values in this column will be used to uniquely identify each row, so SurveyCTO should index the column's contents for fast look-up. (See Pre-loading data into a form for a full discussion of the techniques employed here.)
hhid_key | nplots | plot1description | ... |
1001 | 1 | HHID 1001 Plot 1 Description | ... |
1002 | 2 | HHID 1002 Plot 1 Description | ... |
... |
This .csv file represents data that the survey team already has about the number of plots that each household owns (in the nplots column), as well as short descriptions and sizes for each of those plots (in the plot#description and plot#size columns). (Further to the right, there are other columns as well, but you can ignore those for now as they are only used in the next sample.)
Turning to the spreadsheet form definition for this form, you will see that the user is first asked to double-enter the household identification number of the respondent. He or she will be provided the hint that "Valid IDs are between 1001 and 1060" because those are the households covered in the accompanying .csv file.
type | name | label | constraint |
integer | hhid | Please enter the household's ID. | .>=1001 and .<=1061 |
integer | confirmentry | Please re-enter the household's ID. | .=${hhid} |
Note that the constraint for the hhid field intentionally allows 1061 to be entered, even though that is not a valid ID in the .csv file. This is allowed so that you can see how the form behaves when an entered ID cannot be found in the .csv data.
Next in the workbook, you will see a block of twenty calculate fields. These are internal, hidden fields, so nothing you see in this block will appear to the user filling out the form. Each of these pulls a different piece of data from the .csv file and assigns it a name that can be referenced later in the survey. The expression in the calculation column of each calls the pulldata() function, passing the following as parameters: which .csv file to pull from, which column to pull, and which column and value to use in identifying the correct row to load.
Finally, you will see a block of actual questions that will be visible to the user. The labels for most of them make reference to the data values that were earlier pulled from the .csv file.
type | name | label | relevance | choice_filter |
select_one plot | bestplot | Please select your best-performing plot in the most recent agricultural season. | filter<=${nplots} | |
note | plotsintro | I will now ask you a series of questions about each of your ${nplots} agricultural plots. | ${nplots}>=1 | |
select_one yesno | plot1cultivated | Did you cultivate your "${plot1description}" (size ${plot1size}) plot in the most recent season? | ${nplots}>=1 | |
select_one yesno | plot2cultivated | Did you cultivate your "${plot2description}" (size ${plot2size}) plot in the most recent season? | ${nplots}>=2 |
All of the values inside the ${}'s (for example, ${nplots} and ${plot1size}) refer back to the names of calculate fields from earlier, in order to pull in those field values – which, in turn, were pulled in from the pre-loaded .csv data.
You will also notice that respondents will only be asked questions about plots they own. In the first question in this block, which asks respondents to identify their best plot, the expression filter<=${nplots} in the choice_filter column – along with the accompanying answer options on the choices tab – ensure that respondents will be presented with a list of only those plots that they own. In later fields, the expressions ${nplots}>=1 or ${nplots}>=2 in the relevance column indicate that these questions should only be asked if the number of plots the respondent owns is high enough.