Looking at data in Microsoft Excel
The first thing many people do, as soon as they export their data, is open it in Microsoft Excel. You can certainly review, clean, or even analyze your data in Excel. However, please note the following:
- If you double-click on a .csv file, most systems will default to opening the file in Excel. In this case, Excel will import the .csv file based on a series of assumptions, most of which will not alter your data. But it will assume that the characters in the file are Windows or Mac characters as opposed to Unicode characters, so it will mess up accents or foreign scripts.
- The safer way to import a .csv file is to first open Excel and then choose either File...Import or Data...Get External Data...Import Text File. You will want to specify that it is a .csv file, that it is comma-delimited, and that its "file origin" or "encoding" is Unicode/UTF-8. You will then get the opportunity to preview the columns and override the cell formats used for each column. Finally, when it asks you where to put the data, there will be a Properties... button that you can click to see options for "refresh control": whether to re-import the data whenever you open the Excel workbook and whether to re-prompt for the source .csv file every time. Once you import this way, you can essentially re-import at any time by clicking Refresh on the Data tab.
- Sadly, when you import data using Excel's import wizard, it gets confused when it sees line-breaks inside cells – which will be a problem for you if anybody ever presses Enter when entering a text response into one of your survey forms. Excel will end the row at the line-break and essentially break one row of data into multiple rows. To avoid this, there is an option in Desktop's settings to replace line-breaks in exported .csv files with some other character (like a single space).
- Excel will sometimes convert things that aren't dates into dates. For example, say you have an ID number like "10-11-12": Excel will convert it into a date, encoding it internally (and saving it!) as the number of days between Jan. 1, 1980 and Oct. 11, 2012. Obviously, that is totally wrong. To prevent Excel from corrupting your data in this way, you need to take care when assigning the formatting of your columns on import: for ID numbers and other things that might be mistaken for dates, "General" is not a safe choice.
- If you want to edit your data or make notes in new columns, you are better off configuring Desktop to directly merge incoming data into an existing Excel workbook, rather than exporting to .csv format and then re-importing into Excel yourself. See this help topic for more on merging directly into Excel.