
You can use statistical software like Stata, SPSS, or R to visualize and analyze your data, or other software like Microsoft Excel or Google Sheets (both of which have free statistical-analysis add-ons). But to quickly and easily monitor your incoming data, you can use SurveyCTO's built-in Data Explorer.

Using the Data Explorer, you can easily summarize data submitted for individual fields, summarize the empirical relationships between fields, and drill down to browse individual submissions. With it, you can start learning from your data right away.

There are two ways into the Data Explorer:

  1. Click the "Monitor form data" action for any form in the Form submissions and dataset data section of the Monitor tab.

  2. Click the "Explore" action for any form in the Your data section of the Export tab.

The Data Explorer is mostly the same whichever path you take, but you can save and maintain separate workbooks for monitoring vs. exploration/analysis. Also, when you enter via the Monitor tab, quality-check results are summarized with your data (see the Monitoring sub-section below).

However you enter the Data Explorer, you'll first need to choose which submissions – and maybe which fields – to load. Because all of the submissions you choose will be downloaded and loaded into your web browser's memory, you might need to choose only recent data or a random subset of data if your dataset is very large. Some computers with fast Internet connections, a lot of memory, and modern browsers will be able to load many thousands of submissions, but others could become too slow or even crash when loading too much data. If you're loading data for an encrypted form, you'll also need to choose whether to load all data (which requires the private key) or just fields that have been flagged as publishable.

Loading the Data Explorer might take a while, particularly the first time you open it; for large datasets, you might want to go grab a tea or coffee while it loads. But don't worry: downloaded data will be stored in your local browser's cache so that you won't need to download it again the next time you go in.

And once you're all loaded up, you can use the Data Explorer fully offline: you only need to connect again if you want to view or download an attached file (like a photo or audio recording), view or explore maps, or save your workbook. Saving your workbook is handy so that, when you come back later, all of your summaries will be there (ordered, organized, and configured just the way you left them – but also with any new data that's come in).

Exploring your data

When you first load the Data Explorer, you'll start with a mostly-blank canvas. At the top, you'll see some summary information about your form, including the submissions and fields you chose to load. Below that, you have the opportunity to add field summaries, relationship summaries, and groups:

  1. Field summaries. These are summaries of your data, one field at a time. To quickly add field summaries for every field in your form, just click to add field summaries, choose Select all, and save (but if you have hundreds or thousands of fields in your form, you might want to be more selective). Multiple-choice data will be summarized with horizontal bar or pie charts, numeric data with histograms (counting frequencies in an adjustable number of equal-sized "bins"), GPS data with maps, date/time data with vertical bar charts (with aggregation to days, weeks, or months), and text data with a simple list ordered by frequency. When multiple summary views are possible, you can click the "eye" icon to change a summary from one view to another (e.g., from categorical to numeric). All field summaries include a count ("N") of observations as well as a count of missing observations ("N missing"); min, max, median, mean, and standard deviation ("SD") are also included for all numeric data.
  2. Relationship summaries. These are summaries of the bivariate relationship between two fields. For convenience, you can add more than one relationship at a time: just select your primary field first, then all of the other fields to which you'd like to relate that primary field (e.g., choose enumerator ID as primary and then a series of other fields, in order to see the relationship between enumerator and those other fields). Relationships will be summarized as either a scatterplot (for two numeric fields), a trend view (date/time + numeric), a map view (geopoint + anything), or a table (anything + anything); when multiple views are possible, you can always click the "eye" icon to switch between views.

    For scatterplots, the correlation coefficient, R-squared, and OLS beta estimates are given, and you can show or hide an OLS line of best fit. For map views, each location pin is shaded or colored based on the value of the other field, and that other field's value is shown on hover. For trend views, a line chart is shown, and you can aggregate data at the day, week, or month level. For tables, you can choose between crosstab or summary-statistic views, depending on whether one or both fields are numeric; in a crosstab view, numeric data is grouped into an adjustable number of equal-sized "bins" just like histograms, but empty rows or columns are automatically hidden so that the crosstab doesn't get too big.

  3. Groups. If you wish, you can create a group and then add summaries to that group (or drag and drop existing summaries into it). You can then collapse and expand groups, to show or hide the summaries within them. If you have a workbook with a lot of summaries, you might find groups to be helpful for organization and navigation.
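If you'd like to cross-check the scatterplot statistics mentioned above in your own analysis, the following is a minimal Python sketch. It assumes the correlation coefficient is Pearson's r and that the OLS line is a simple regression of the second field on the first; the Data Explorer's exact computation isn't documented here, and the function name `scatter_stats` is purely illustrative.

```python
from math import sqrt
from statistics import mean


def scatter_stats(xs, ys):
    """Return Pearson r, R-squared, and OLS slope/intercept for y regressed on x.

    Assumption: simple (one-predictor) OLS and Pearson correlation.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")

    mx, my = mean(xs), mean(ys)
    # Sums of squares and cross-products around the means
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant field has no defined correlation")

    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx               # OLS beta for x
    intercept = my - slope * mx     # OLS constant term
    return {"r": r, "r_squared": r * r, "slope": slope, "intercept": intercept}
```

In a simple regression like this one, R-squared is just the square of the correlation coefficient, which is why the two always move together in a scatterplot summary.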

Importantly, you can "drill down" to view individual submissions all through the Data Explorer: just click on any histogram bar, crosstab cell, scatterplot point, map marker, "N" count, etc. to view submission details. Your data will be nicely formatted (and even printable!) based on the latest version of your form. When viewing a submission, you can click to open or download any submission attachments, such as photos or audio recordings, or click to show question administration timing from an attached text audit – though this will require an active Internet connection. You can also click to copy a direct hyperlink to the submission details, in order to re-visit it later or share with a colleague (valid login and sufficient access rights will be required, of course). Finally, if you have the unique key (or UUID) for a submission, you can also jump straight to the submission-detail view from the Monitor tab's Look up by key action.

If your form has been translated into multiple languages, there will be a language selector in the top-right of the Data Explorer. Use that to choose the language in which to show field and option labels throughout the summaries and submissions.

Feel free to add summaries and groups, drag and drop them into your preferred order, and then save your workbook so that it will be the same next time you come back – though every time you come back the data itself will update as appropriate. If you ever want to start completely over, just choose Reset in the upper-right.

Monitoring

A key to collecting high-quality data is keeping a close eye on data as it comes in. Use a combination of automated quality checks and manual inspection to catch potential problems quickly, so that they can be dealt with quickly (e.g., while teams are still available in the field). Configure your Data Explorer workbook and return to it frequently, as new data is collected.

If you enter the Data Explorer via the Monitor tab, then automated quality checks and quality-check results will be noted with your data. For example, say that you have a quality check configured to warn if there are submissions with duration less than 600 seconds (i.e., less than 10 minutes). If you show a field summary for the duration field, then that quality check will be listed, along with any associated warnings from the current quality-check report. You'll even be able to easily click to view individual submissions that triggered warnings, and the details for those submissions will also note the related warnings. Finally, a summary of all configured checks and warnings will be available in the upper-right of the Data Explorer, in the CHECKS section of the page header.
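To make the duration example concrete, here is a rough Python sketch of the same rule applied to exported data. It is only an illustration of the logic, not how SurveyCTO's quality checks are implemented; the function name and the list-of-dictionaries layout are assumptions, while `KEY` and `duration` are the fields described in this topic.

```python
def flag_short_durations(submissions, min_seconds=600):
    """Return the KEYs of submissions whose duration is below min_seconds.

    Assumption: each submission is a dict with "KEY" and "duration" entries,
    and a missing duration is stored as None (and is not flagged).
    """
    return [
        s["KEY"]
        for s in submissions
        if s.get("duration") is not None and s["duration"] < min_seconds
    ]
```

Called on a list of submissions, this returns the keys you would then look up individually, much like clicking through from a warning to the submissions that triggered it.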

Filtering and excluding data

Once you start using the Data Explorer, you're likely to want to filter or exclude some data. There are three options available to you:

  1. Exclude entire submissions. You might have some submissions that were test or invalid submissions, which you want to exclude from summaries globally, for your entire workbook. Just click to view a submission, then click View options and Exclude submission from summaries. Back in the main window with your summaries, that submission will then be excluded. A yellow bar at the top of the screen will remind you of any exclusions; you can click the count of excluded submissions to view and un-exclude submissions individually, or you can click the icon in the yellow bar to quickly un-exclude all submissions.
  2. Exclude certain values. Some outliers might make your field summaries look unpleasant, or they might throw off your summary statistics. For example, a -999 entered as "don't know" for income can really throw off the graph, the mean, and the standard deviation. Just click on a graph bar or data value, then select Exclude from this summary to drop the outlier(s) from the summary. Only the current summary view will be affected (not the workbook overall), and an orange bar at the top of the summary will remind you of the exclusion; click the icon next to the "Some values excluded" note in this bar to view or clear exclusions.
  3. Filter data. As you explore your data, you may want to filter your workbook to focus on particular subsets. For example, you might want to see how the data looks when you consider only young women. To do this, you could click the "Female" bar in your field summary for gender, choose Add to global data filter, then click one or more bars in the age field's histogram to also add those to the global filter. This would filter all other summaries in the workbook, to only include submissions with the gender and age(s) you specified. Like with submission exclusions, a yellow bar at the top of the screen will remind you of any global filters, and you can click the filter icon in that bar to view or clear them. Using global filters, you can zero in on particular subsets of data and get a quick view of how that data looks.
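To see why a coded value like the -999 in option 2 distorts a summary, consider this small Python sketch. The income figures are made up for illustration, the function name `summarize` is hypothetical, and the use of the sample standard deviation is an assumption (this topic doesn't say which SD formula the Data Explorer uses).

```python
from statistics import mean, stdev


def summarize(values, excluded=()):
    """Summarize numeric values, skipping missing (None) and excluded values."""
    kept = [v for v in values if v is not None and v not in excluded]
    return {
        "n": len(kept),
        "n_missing": sum(1 for v in values if v is None),
        "mean": mean(kept) if kept else None,
        # Assumption: sample standard deviation (n - 1 in the denominator)
        "sd": stdev(kept) if len(kept) > 1 else None,
    }


# Hypothetical incomes; -999 was entered to mean "don't know"
incomes = [1200, 1500, 900, -999, 1100]
with_outlier = summarize(incomes)
without_outlier = summarize(incomes, excluded={-999})
```

With the -999 included, the mean of these five values is 740.2; with it excluded, the mean of the remaining four is 1175, and the standard deviation shrinks considerably.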

Reviewing and correcting data

If the review and correction workflow is enabled for a form, then the Data Explorer is where you review, comment on, correct, approve, and reject submissions. Whenever you view the details for an individual submission, you will see the data as it exists after any corrections, plus any comments that have been made; options at the top of the screen will allow you to comment, correct, approve, or reject. Note that you will be able to review submissions and make changes while offline, but to apply those changes you will need to press the option to save them – which will require a working connection to the server.

Calculating and aggregating data

Sometimes you'll be interested in looking at the sum of multiple income measures, the average age of household members, or some other aggregation or calculation. Since the Data Explorer can only look at individual fields or individual relationships between two fields, you'll need to include important aggregations or calculations as fields in your form so that the results are readily available in your data. To do this, include one or more calculate fields to perform the appropriate aggregations or calculations. Once you do, you'll be able to add automated quality checks on those new fields, and they'll also be conveniently included in your exported data.

Advanced mode

In both the Form submissions and dataset data section of the Monitor tab and the Your data section of the Export tab, there is an Advanced mode button that allows you to enable a set of more powerful Data Explorer tools. Meant for more expert users, advanced mode allows you to:

  • Configure multiple workbooks. By default, each form has one workbook on the Monitor tab and one on the Export tab. In advanced mode, you can configure as many additional workbooks as you like, tailoring each to a particular view, team, or workflow.
  • Attach datasets to workbooks. Advanced mode also allows you to attach server datasets to workbooks, so that you can supplement incoming form data with earlier listing data, QC results from outside systems, and more.
  • Download and upload workbook definitions. Finally, advanced mode includes Download and Upload buttons that allow you to export and import workbook definitions. These definitions are Excel spreadsheets, similar to form definitions and edited in a similar way; instead of defining fields in a form, however, these workbook definitions define summaries in a Data Explorer workbook.

See the help topic on advanced-mode usage for more details.

Data security

For the highest possible data security, we always recommend that you encrypt data using your own encryption keys. If you do, only you and your team will be able to read the data, because only you and your team will have access to the private encryption key necessary to decrypt it.

The Data Explorer can decrypt your encrypted data, but only if you allow your web browser to temporarily use your private encryption key. Here's how it works: your data is downloaded and stored in your web browser's local cache so that it doesn't have to be re-downloaded every time, but it's stored safely in its encrypted form; every time the Data Explorer loads, it will need you to select your private encryption key so that it can decrypt your data; your encryption key will not be saved in the browser cache and it will not be transmitted anywhere (it will only be used in memory); your decrypted data will likewise be decrypted in memory but it will not be saved anywhere, nor will it be transmitted anywhere; once you close the Data Explorer tab, the memory is freed and the decrypted data is no longer accessible.

This approach to data security has a few costs. For one, you need to select your private encryption key every time you load encrypted data (since we won't store it in your browser's local storage). For another, loading takes longer because we decrypt data every time you load (since we also won't store decrypted data in your browser's local storage). But the benefits are worth the costs: your data stays truly "for your eyes only," and your respondents' confidentiality is protected. We don't think that you'll find a more secure data-collection platform anywhere.

Hyperlinking into the Data Explorer

Your SurveyCTO server name
Your server name is the x in the x.surveycto.com you see in the browser address bar after you log in to your server console. Use that anytime you see "servername" here in these instructions.

You may want to hyperlink directly from outside systems into SurveyCTO's Data Explorer. For example, if you have a dashboard to monitor top-level indicators, you may want to allow users to drill down into individual data points in the Data Explorer. To hyperlink directly into the submission-details view for a particular submission, just use a hyperlink like the following:

https://servername.surveycto.com/view/submission.html?uuid=[KEY]

The one thing to be aware of is that the unique ID (the KEY) for the submission has a colon in it, and a colon in a URL parameter should be percent-encoded as %3A. If you were using Microsoft Excel and you had your SurveyCTO server name in cell A1 and your submission's KEY in cell B1, this formula would generate the proper hyperlink:

="https://" & A1 & ".surveycto.com/view/submission.html?uuid=" & SUBSTITUTE(B1,":","%3A")
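If you are generating these links from a script rather than a spreadsheet, the same URL can be built along these lines. This is a minimal Python sketch; the function name `submission_url` and the example server name and key are hypothetical, and only the URL pattern itself comes from the instructions above.

```python
from urllib.parse import quote


def submission_url(server_name, key):
    """Build a submission-details hyperlink, percent-encoding the KEY."""
    # safe="" makes quote() encode the colon (as %3A) along with other reserved characters
    encoded_key = quote(key, safe="")
    return f"https://{server_name}.surveycto.com/view/submission.html?uuid={encoded_key}"
```

For example, `submission_url("myserver", "uuid:1234-abcd")` yields a link ending in `?uuid=uuid%3A1234-abcd`.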

But if you are using Excel on Windows, you should know about another complication: Microsoft Office products on Windows are terrible about hyperlinking to secure pages like this one (pages that require a login); they'll freeze for a little while, and then they'll fail to open the page at all. Luckily, we've constructed a work-around. Say that you have your server name in cell A1, your submission KEY in B1, and the formula above (which generates the URL) in C1. In that case, this formula will generate a hyperlink that does work in the Windows version of Excel:

="https://" & A1 & ".surveycto.com/officelink.html?url=" & SUBSTITUTE(SUBSTITUTE(SUBSTITUTE( SUBSTITUTE(SUBSTITUTE(SUBSTITUTE( SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(C1, "%", "%25"), " ", "%20"), "?", "%3F"), "&", "%26"), "=", "%3D"), "{", "%7B"), "}", "%7D"), "[", "%5B"), "]", "%5D")

That "officelink" URL sends users through an unsecured landing page that has a Continue button to log in and continue. It's a bit of a hassle, but often worth it to get working hyperlinks directly from back-office Excel sheets.
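For comparison, the nested SUBSTITUTE formula above can be mirrored in a script. The Python sketch below applies the same nine replacements in the same order (the percent sign first, so that the escapes added afterward are not themselves re-encoded); the function name `officelink_url` is hypothetical.

```python
# The same nine substitutions as the Excel formula, in the same order.
# "%" must come first so later escapes such as "%3F" are not double-encoded.
REPLACEMENTS = [
    ("%", "%25"), (" ", "%20"), ("?", "%3F"),
    ("&", "%26"), ("=", "%3D"), ("{", "%7B"),
    ("}", "%7D"), ("[", "%5B"), ("]", "%5D"),
]


def officelink_url(server_name, target_url):
    """Wrap an already-built submission URL in the officelink landing-page URL."""
    encoded = target_url
    for old, new in REPLACEMENTS:
        encoded = encoded.replace(old, new)
    return f"https://{server_name}.surveycto.com/officelink.html?url={encoded}"
```

Note that a %3A already present in the submission URL becomes %253A here, exactly as it does in the Excel version.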
