Beyond survey design: take survey data to the next level

by Carolyn Doi
Education and Music Library, University of Saskatchewan

You’ve designed a survey, found the right participants, and waited patiently while the responses came streaming in. The initial look at responses can be thrilling, but what happens next? I’ve used questionnaires as a data collection technique, and I’ve made the mistake of thinking the work is over once the survey closes. Kelley, Clark, Brown, and Sitzia warn us against treating survey research as a method requiring little planning or time:

“Above all, survey research should not be seen as an easy, ‘quick and dirty’ option; such work may adequately fulfil local needs… but will not stand up to academic scrutiny and will not be regarded as having much value as a contribution to knowledge.”1

Let’s consider some steps to explore once data collection has been completed.

1) Data cleaning and analysis
Raw survey data is usually anything but readable. It takes some work to transform results into meaningful, shareable research findings. First, familiarize yourself with some of the relevant terminology before moving on to actually working with the data. Before touching the dataset, create four worksheets: one for raw data, one for cleaning in progress, one for cleaned data, and one for data analysis. Each worksheet captures a stage in the process, which allows you to backtrack or find errors. If you haven’t taken a stats class recently, I like this introductory Evaluation Toolkit, which clearly describes the processes of cleaning, tabulation, and analysis for both quantitative and qualitative data.
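The staged workflow above can be sketched in a few lines of plain Python. This is a minimal illustration with hypothetical responses and an invented column name, not a prescription: the point is that the raw stage is never modified, cleaning produces a separate copy, and analysis works only on the cleaned copy.

```python
import copy
from collections import Counter

# Stage 1: raw data, never modified (hypothetical survey responses)
raw_responses = [
    {"id": 1, "satisfaction": " Agree "},
    {"id": 2, "satisfaction": "agree"},
    {"id": 3, "satisfaction": "DISAGREE"},
    {"id": 4, "satisfaction": ""},          # blank response
    {"id": 5, "satisfaction": "Neutral"},
]

# Stages 2-3: cleaning in progress, producing a separate cleaned copy
def clean(records):
    cleaned = []
    for row in copy.deepcopy(records):      # leave the raw stage untouched
        value = row["satisfaction"].strip().lower()
        if not value:                       # drop blanks rather than guess
            continue
        row["satisfaction"] = value
        cleaned.append(row)
    return cleaned

cleaned_responses = clean(raw_responses)

# Stage 4: a simple tabulation for analysis
counts = Counter(r["satisfaction"] for r in cleaned_responses)
print(counts)  # e.g. agree: 2, disagree: 1, neutral: 1
```

Because each stage is a separate object (or, in a spreadsheet, a separate worksheet), a suspicious number in the analysis can always be traced back through the cleaned copy to the original raw response.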

2) Visualization and reporting
Consider data visualization to bring your survey data to life, but remember to choose a visualization type that makes sense for the data you’re trying to represent. The Data Visualization Catalogue is a handy tool for learning about the purpose, function, anatomy, and limitations of a wide range of visualizations, and it includes links to software and examples of each one. There are lots of free or inexpensive programs to help create visualizations, including Microsoft Excel, Google Sheets, and Tableau Public. If you’re looking for some inspiration, browse through the stunning work of Information is Beautiful for ideas.
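Even before reaching for a dedicated tool, a rough chart can be a useful sanity check on your tabulated counts. As a minimal sketch (plain Python, with hypothetical response counts), a text bar chart takes only a few lines:

```python
# Hypothetical tabulated responses to a single survey question
counts = {"Agree": 14, "Neutral": 6, "Disagree": 4}

def text_bar_chart(counts, width=20):
    """Render counts as a simple text bar chart, scaled to `width`."""
    largest = max(counts.values())
    lines = []
    for label, n in counts.items():
        bar = "#" * round(n / largest * width)
        lines.append(f"{label:<10}{bar} {n}")
    return "\n".join(lines)

print(text_bar_chart(counts))
```

If the rough chart already tells a clear story, the same data is ready to be polished in Excel, Google Sheets, or Tableau Public; if it doesn’t, that is a hint to rethink the visualization type before investing more effort.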

Likely you will want to share the outcomes of your research, either at your institution or in a paper or presentation. Kelley, Clark, Brown, and Sitzia provide a great checklist of information to include when reporting on any survey results, including research purpose, context, how the research was done, methods, results, interpretation, and recommendations.2 Clarity and transparency in the research process will help your audience to better understand and evaluate the research and its applicability to their context.

3) Data preservation and access
Consider an open data repository such as the Dataverse Project to make your data discoverable and accessible. Sharing your data comes with benefits such as “web visibility, academic credit, and increased citation counts.” You may also be required to archive your data to satisfy a data management plan or grant funding requirements, such as those from the Tri-Council. When archiving in a repository, remember to share your data in an accessible file format, and include accompanying files such as a codebook, project description, survey instrument, and outputs such as the associated report or paper. As a rule of thumb, aim to provide enough documentation that another researcher would be able to replicate your study. A dataset is a publication that you can cite on your CV, in your ORCID profile, or in a paper or presentation. Doing so is a great way to encourage others to learn about your research or to build on your research project.
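A codebook can be as simple as a structured file deposited alongside the data. The sketch below (Python; the variable names, questions, and file name are all hypothetical) writes out one entry per survey variable as JSON, so another researcher can interpret each column without guessing:

```python
import json

# Hypothetical codebook: one entry per variable in the deposited dataset,
# documenting the question asked and the meaning of any coded values.
codebook = {
    "q1_satisfaction": {
        "question": "How satisfied are you with the library's hours?",
        "type": "ordinal",
        "values": {"1": "Very dissatisfied", "5": "Very satisfied"},
    },
    "q2_visits": {
        "question": "How many times did you visit the library last month?",
        "type": "integer",
        "values": None,  # free numeric response, no coded values
    },
}

with open("codebook.json", "w", encoding="utf-8") as f:
    json.dump(codebook, f, indent=2)
```

Whatever format you choose, the replication rule of thumb applies: if a stranger could not reconstruct your analysis from the data file plus the codebook, the documentation is not finished.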

Getting your hands dirty and working directly with survey data is where you’ll be able to explore and eventually tell a compelling story based on your research. Be curious, persistent, and enjoy the process of research discovery!

1 Kelley, K., Clark, B., Brown, V., & Sitzia, J. (2003). Good practice in the conduct and reporting of survey research. International Journal for Quality in Health Care, 15(3), 261–266.

2 Ibid., p. 265.

This article gives the views of the author(s) and not necessarily the views of the Centre for Evidence Based Library and Information Practice or the University Library, University of Saskatchewan.

The Elephant Tale of Data

Kristin Lee, Tufts University
Liz Settoducato, Tufts University

Early last year, when I asked my colleague Liz Settoducato, Engineering Librarian here at the Tisch Library, if she would be interested in looking into Jumbo the Elephant with me, we didn’t realize it would become a bit of a weird obsession. A simple idea to use data about Tufts University’s beloved mascot for instruction sparked research into the circus, P. T. Barnum and his shenanigans, taxidermy, and scientific specimens. The more we read, the more we wanted to know.

This project has become a place where our personal journeys to librarianship collide. My background is in science, and I have long been fascinated by the idea of physical objects as data. I also love maps, a perfect way to present the adventures of an elephant who had his own personal train car. Liz comes from the world of gender studies and archives. She understands how our fascination with different forms of entertainment impacts scholarship and research, and why it is essential to study this as part of the experience of being human. Together we can look at our subject, the sadly doomed star of the London Zoo who was so fiercely pursued by circus showman P. T. Barnum and who met his end in a train accident in St. Thomas, Ontario, in 1885, as both the pop culture icon and the flesh-and-bone creature that he was.

Chasing down information hasn’t been easy. There are circus handbills, correspondence, newspaper articles, songs, and images of Jumbo in collections all across the country. But we wanted more. Jumbo became the mascot of Tufts University posthumously, when his stuffed hide, after travelling with the circus, was donated to the Barnum Museum of Natural History in 1889 by Mr. Barnum himself (The Story of Jumbo). There are pieces of Jumbo, King of Elephants, in collections all over the country (I just found out about this piece of tusk at the Henry Ford Museum in Michigan while writing this). His heart was purchased by Cornell University, but all they have left is the jar, a fact which caused Liz and me to have a conversation about just how you could lose a 40 lb elephant heart. All of these specimens were once a whole, living elephant, a collection that requires each piece for context, and bringing them back together (at least virtually) has become a bit of a mission.

What we call “Our Eccentric Jumbo Research Project” isn’t really that outlandish in the context of librarian research at all. We are using tools from the digital humanities to explore texts, like the biography of Jumbo by his keeper Matthew Scott, to try to figure out how the people around him understood him. We are thinking about how P. T. Barnum, purveyor of “humbug” and serial hyperbolist, spread misinformation about his prized attraction to get the attention of crowds, and how that affected the public view of wildlife in places people of the late 19th century could only imagine. We are tracking down data from studies of Jumbo’s bones and his tail (the only piece that survived after the rest of his hide was destroyed when Barnum Hall at Tufts burned down in 1975) to better understand how he was treated during his short life. Librarianship is about not only providing our communities with what they need, but also giving them access to worlds they didn’t even know were out there and allowing a sense of wonder and whimsy to infiltrate the research process.

This article gives the views of the authors and not necessarily the views of the Centre for Evidence Based Library and Information Practice or the University Library, University of Saskatchewan.

More Data Please! Research Methods, Libraries, and Geospatial Data Catalogs: C-EBLIP Journal Club, August 25, 2016

by Kristin Bogdan
Engineering and GIS Librarian
University Library, University of Saskatchewan

Article: Kollen, C., Dietz, C., Suh, J., & Lee, A. (2013). Geospatial Data Catalogs: Approaches by Academic Libraries. Journal of Map & Geography Libraries, 9(3), 276-295.

I was excited to have the opportunity to kick-off the C-EBLIP Journal Club after a brief summer hiatus with a topic that is close to my heart – geospatial data! This article was great in the context of C-EBLIP Journal Club because it introduced the basics of geospatial data catalogs and the services around them, and provided an opportunity to look at the methods used by the authors as part of an ALA Map and Geospatial Information Round Table (MAGIRT) subcommittee research project.

Most of the group was unfamiliar with geospatial data catalogs, so the introductory material provided a good base for further discussion. There was good material about the breadth of the different metadata standards involved and how they are applied at the different levels of data detail. There was also good discussion about the importance of collaboration and the OpenGeoportal consortium in developing geospatial data catalogs.

One of the key themes of our discussion was that we would have liked to see more information about the research design and more data. We would have liked to see mention of the ethics process that the authors went through before carrying out their study. Our group had questions about the process that the subcommittee used to choose their sample, as it seemed like it was fairly limited. The authors acknowledge that this was not meant “to create a complete inventory” (p.281), but it seemed like it could have been broader to be more representative. We would also have liked to see the questions that were asked during the interviews and more of the qualitative data from the interviews themselves. It was unclear how structured the conversations with the catalog managers were and how the data presented in the tables and the conclusions were derived. The information presented in the tables was not consistently organized and seemed like it would have been more useful in the context of the interview. The pie chart they used on page 283 to show the “Approaches to Developing Geospatial Data Catalogs” was not as useful as a table of the same information would have been, as there are 5 pie sections to represent 11 data points.

In light of the questions around the data collection, the leap from the tables of responses to the recommendations seemed fairly large. In general, the lists of questions to consider when determining how to implement a geospatial data catalog were helpful, but they aren’t really recommendations. The cases that the authors present provide some ideas about the staffing and skills required to create a geospatial catalog, but they are vague. The first case seemed unnecessary, as it states: “The library has determined that there is a clear need to provide access to the library’s spatial data and other spatial data needed by the library’s customers. However, the library does not have the technology, staffing, or funding needed to develop a spatial data catalog.” It would have been nice to see some alternative solutions for those without the ability to create a full-blown data catalog, such as practices that could be put in place to start building its foundation: specific cataloging practices, for instance, or file-type considerations.

Our discussion concluded with reflection on how carefully and critically we read articles in our general research lives. One of the great things about Journal Club is that we have the opportunity to really interrogate and dissect what we are reading. The ensuing discussion is an opportunity to see the article from many different perspectives. This makes us better researchers in two ways: we are trained to more thoroughly evaluate the things we read and we take that into consideration in the research that we produce.

This article gives the views of the author(s) and not necessarily the views of the Centre for Evidence Based Library and Information Practice or the University Library, University of Saskatchewan.

Walking the (Research Data Management) Talk

by Marjorie Mitchell
Librarian, Learning and Research Services
UBC Okanagan Library

Librarians helping researchers to create data management plans, developing usable file management systems (including file naming conventions), preparing data for submission into repositories, and working through the mysteries of subject-specific metadata schemes are at the forefront of the data sharing movement. All this work leads to research that is more reproducible, more rigorous, less error-prone, and more frequently cited (Wicherts, 2011) than research that isn’t shared. In addition to those benefits, shared data leads to increased opportunities for collaboration and, potentially, economic benefits (Johnson, 2016). However, are we doing what we are asking our researchers to do, and ultimately making our research data available and open for reanalysis and reuse? Are we walking the talk? Or is this a case of the carpenter’s house (unfinished) and the mechanic’s car (needing repair)?

When I’m speaking of data I use Eisner and Vasgird’s description of data as “a collection of facts, measurements or observations used to make inferences about the world we live in” (n.d.) because the research done by librarians consists of wide varieties of data: numerical, textual, photographic images, hand drawn maps, or diagrams created by study participants. Almost all have the potential to be shared openly and to act as a springboard for further research, subject to appropriate ethical considerations.

I started searching to see what data I could find from Canadian librarian researchers in repositories. I have not finished my search, but my early results show some interesting things. To date, this has not been a rigorous study, but more of a curious, pre-research “let’s see what’s out there” browse, and therefore must not be misconstrued as the basis for conclusions. I briefly looked internationally for a few studies and found a wider variety of topics with available datasets than I had found in Canadian repositories, which was what I expected to find.

Two things jumped out at me right away. First, when data is available, it is either from large, national or multi-institutional studies, or it is from studies that have been repeated over time, such as LibQUAL+®. Far fewer institution-specific or single researcher/research team datasets are “available.” Some of those have “request access” restrictions, meaning it may be possible to access the data with permission from the creator, but that is not guaranteed. The second thing I noticed was how difficult it is to locate these datasets. Although there is a movement to assign unique and persistent identifiers to datasets, this has not, as yet, translated into a search engine that can comprehensively search for datasets.

I am happy to see a steady increase in the amount of librarian-generated research data being made available. Librarian-generated research is not alone in this trend. It is happening across the disciplines. While little library research is externally funded, it is worth noting some funders are requiring data management plans with the goal of data sharing. Some scholarly journals, particularly in the sciences, have strong policies about data sharing. Each change, minor or major, moves us more toward data that is shared as a matter of course, rather than data shared only reluctantly.

If this all sounds like “just another thing to do” or maybe “I don’t have the skills or interest to do this,” consider research data sharing as an opportunity to partner with another librarian who has those skills but perhaps lacks the research skills you have. Research partners and teams can allow people to contribute their best skills rather than struggling to compensate for their weaknesses throughout the process.

Finally, have a look at the data that is out there just waiting to be reused. Cite it, add to it (if allowed), and share your new results. I am confident this will add greater context to your research and highlight subtleties and nuances that might have remained invisible otherwise.


Eisner, R., & Vasgird, D. (n.d.) Foundation Text. In RCR Data Acquisition and Management. Retrieved from

Johnson, B. (2016). Open Data: Delivering the Benefits. Presentation, London, UK.

Wicherts, J. M., Bakker, M., & Molenaar, D. (2011). Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results. PLoS ONE, 6(11).

This article gives the views of the author(s) and not necessarily the views of the Centre for Evidence Based Library and Information Practice or the University Library, University of Saskatchewan.