joao-silas-72563-unsplash.jpg

Research Aid

For my master's thesis, I designed and developed a web-based software tool to aid non-expert researchers in the literature review process. I adopted a user-centred approach for the project. A full research paper on this work was published at DeLFI (Die 16. e-Learning Fachtagung Informatik der Gesellschaft für Informatik e.V. (GI)) in September 2018.

Final Design.png

Motivation

Non-expert researchers, usually bachelor's students, lack the expertise to perform a literature review independently. They usually do not know how to acquire the right information and analyse it to perform the review effectively. Since a literature review is the first step of any research project, non-experts should be able to perform this step efficiently. Therefore, the aim was to provide a solution that simplifies this process so non-experts can form an overview of a research field and its dynamics.

Team and Duration

I undertook this project independently as part of my six-month final research work. I was, however, very fortunate to receive valuable feedback throughout the project from my thesis supervisor.

Key Objectives

Non-experts must be supported in

  • Identification of information needs during information acquisition
  • Structuring the collected information to form an overview of it
  • Making their process time-efficient by avoiding the need for trial and error

Tools

  • Literature review
  • Online research
  • Online survey
  • Wireframes and Prototypes
  • HTML/CSS + ReactJS + d3.js + MongoDB
  • Usability testing + A/B testing
  • Formal user study design and analysis

Process Outline

Process Outline.png

Preliminary Research

Literature Review

I started by performing a literature review to gauge the existing research in this domain. I hoped to find out which of non-experts' problems the existing research targets and whether any of it could be useful to my work.

Findings

Research regarding support for non-experts focuses on four major areas:

  • Bibliography- and research paper-oriented tools that seek to automate the review process
  • Tools that visualise some relationships between researchers, like co-author networks
  • Tools that seek to establish critical thinking by making use of visualisations like concept maps
  • Work that focuses on outlining questions non-experts should be able to answer to fully gain an overview of a research field

While the existing literature covers many different aspects, no single tool focuses on helping non-expert researchers perform literature reviews independently. However, the use of visualisations to provide overviews and of mind tools like concept maps to aid critical thinking could be useful for non-experts. Additionally, the questions non-experts should answer while performing literature reviews could aid their process, provided the helpfulness of such questions was first verified through user research.

Survey

Since I discovered certain "guidance questions" for non-expert researchers through the literature review, I wanted to verify their validity. Secondly, I wanted to identify additional questions, not outlined in the literature, that might be helpful for non-experts. Therefore, I conducted a survey in the form of an online questionnaire using Google Forms. The questionnaire received 60 participants, who were further divided into 10 experts and 50 non-experts.

Findings

The survey verified the validity of the discovered "guidance questions" and only added one additional question to the mix. It also provided a first look into how non-expert researchers start their research. The questions themselves were divided into the following major categories:

  • Researchers
  • Institutions
  • Topics
  • Publishings

These categories are interesting because questions related to them are regarded as highly relevant by experts but are not given the same importance by non-experts. The graph shows the magnitude of difference in the opinions between the two groups.

Magnitude of difference in opinions between experts and non-experts. Ntotal = 60, Nexperts = 10, Nnon-experts = 50.

This is an important discovery because the purpose was to help non-expert researchers consider criteria that they might otherwise overlook. Therefore, it was critical to understand how experts perform these processes and replicate that approach for non-experts.

Proposed Solution

To provide a solution tailored to the identified user needs and aforementioned objectives, I designed and developed a web tool. The tool needed to provide a starting point for non-experts' research. However, it was important that it allowed the users to make their own decisions. The idea was to provide them with the identified guidance questions, let them choose which questions they would like to answer and then visualise their answers. These questions are designed to give non-experts a complete overview of the research field.

Wireframes

Before starting with the actual development of the tool, I first explored many ideas for its design and functionality via pencil-and-paper sketches and medium-fidelity wireframes created in Balsamiq. This made testing my ideas much easier, and design flaws usually became obvious to me almost immediately. When an idea absolutely would not work, I was able to discard it easily and start from scratch. Wireframes also made it easier to communicate my ideas to my supervisor because they acted like a blueprint of a possible design. Through wireframes, I conceptualised various map flows and map models for the visualisation before finally deciding on one. The following images show some examples of my wireframes at various stages.

Wireframe_1.png
Wireframe_2.png
Wireframe_3.png
Wireframe_4.png
Wireframe_5.png
Wireframe_6.png

Design and Development

The first iteration of the design was based on the decisions made in the wireframing phase. I used ReactJS, CSS3 and Bootstrap for the client, Node.js and MongoDB for the server, and HTML Canvas and D3.js for the visualisation. As mentioned before, the idea of the tool was to present non-experts with guidance questions and visualise their answers to these questions. These questions were divided into four categories, which were further divided into sub-categories containing very specific questions. The following screenshots show one example user flow: Researchers --> Names.

Flow_1.png
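To make the structure above more concrete, here is a minimal sketch of how the category --> sub-category --> question hierarchy and a user's answers could be modelled with Node.js and MongoDB. This is not the actual thesis code; all collection, field and example question names are illustrative assumptions.

```javascript
// Minimal sketch (not the thesis implementation): guidance questions grouped by
// category --> sub-category, plus storage of a user's answers in MongoDB.
// All names below are illustrative.
const { MongoClient } = require('mongodb');

const guidanceQuestions = {
  Researchers: {
    Names: ['Which researchers publish frequently in this field?'],
  },
  Institutions: {
    Names: ['Which institutions are active in this field?'],
  },
  // Topics and Publishings would follow the same pattern ...
};

async function saveAnswer(answer) {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  try {
    // One document per answered question; the client reads these back
    // to build the map visualisation.
    await client.db('researchAid').collection('answers').insertOne({
      category: answer.category,        // e.g. 'Researchers'
      subCategory: answer.subCategory,  // e.g. 'Names'
      question: answer.question,
      value: answer.value,              // the user's free-text answer
      createdAt: new Date(),
    });
  } finally {
    await client.close();
  }
}

// Example usage:
// saveAnswer({
//   category: 'Researchers',
//   subCategory: 'Names',
//   question: 'Which researchers publish frequently in this field?',
//   value: 'Jane Doe',
// });
```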

As soon as the user entered the answer to a question, it was immediately visualised in the visual template on the right. The following screenshot shows how the name of an interesting research institute added by the user immediately appears on the map.

Flow_2.png
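As a rough illustration of this immediate feedback, the following sketch shows how a newly entered answer could be appended to the map with D3.js. The real tool rendered to HTML Canvas; SVG is used here purely to keep the example short, and the element ID, colours and layout values are assumptions rather than the tool's actual design.

```javascript
// Illustrative sketch: add a node to the map as soon as the user submits an answer.
import * as d3 from 'd3';

const svg = d3.select('#map')            // the visual template on the right
  .attr('width', 600)
  .attr('height', 400);

let nodeCount = 0;

function addAnswerToMap(answer) {
  // Naive grid layout, just for the sketch.
  const x = 100 + (nodeCount % 5) * 100;
  const y = 80 + Math.floor(nodeCount / 5) * 80;
  nodeCount += 1;

  const node = svg.append('g').attr('transform', `translate(${x}, ${y})`);

  node.append('circle')
    .attr('r', 24)
    .attr('fill', categoryColour(answer.category));

  node.append('text')
    .attr('text-anchor', 'middle')
    .attr('dy', 40)
    .text(answer.value);                 // e.g. the institute's name
}

// Simple colour coding per category, echoing the colour coding that was
// later validated in the usability tests.
function categoryColour(category) {
  const colours = {
    Researchers: '#4e79a7',
    Institutions: '#f28e2b',
    Topics: '#59a14f',
    Publishings: '#e15759',
  };
  return colours[category] || '#bab0ab';
}

// Example: addAnswerToMap({ category: 'Institutions', value: 'TU Example' });
```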

Usability Testing

The tool underwent two phases of usability testing. The aim of the testing was to investigate:

  • Are users able to follow the questions, their structure and flow?
  • Are users able to perform all tasks without help and without errors?
  • Do users find the tool helpful?

Procedure

The test session involved exploration of the software tool. The participants were given a brief introduction to the aim of the research work. However, no specific details were provided to avoid bias. They were asked to choose a research field to perform a literature review on and start using the tool. No list of literature was expected as an output.

Some key points of this procedure were:

  • 7 participants --> non-expert/entry-level researchers (bachelor and master students)
  • 2 rounds of testing
  • Testing method --> Concurrent Think Aloud + Concurrent Probing
  • Documentation method --> Notes + Voice and Screen Recording
  • No guidance provided during the tool's exploration
  • Follow-up semi-structured interview at the end

Before the exploration, I made sure the participants were not anxious. I invited them to ask questions about the process before starting to make sure there were no misunderstandings. Once they started the exploration, I encouraged them to think aloud as they went. I sometimes also prompted them to perform a certain task from a prepared list to check whether it was straightforward for them to do so. If the users were mistaken about a feature, I tried not to interfere and let them figure it out themselves so that I could observe what was intuitive to them. In the semi-structured interview that followed, I asked for their views on the tool, its ease of use, problem areas (though most were identified during the exploration already) and what impact they thought the tool could have.

Takeaways

Encouraging users to think aloud during the process proved to be extremely helpful. I was instantly able to identify where the tool's functionality did not align with users' expectations. I made many improvements to the tool after both rounds of testing.

In particular, I made the following discoveries:

  • My ideas and the user's expectations don't always match!
  • Non-expert researchers do appreciate the guidance provided by the tool
  • Problem areas, such as tedious typing of researchers' names
  • Non-obvious design, such as non-intuitive naming of buttons (main vs. home)
  • Flaws in the structure and flow, such as a back button that essentially behaved as a home button
  • Features the tool was lacking, such as addition of custom categories
  • Validation of several design decisions, such as colour coding, usage of icons and the visualisation (as seen in the table below)
Average scores for each category, as provided by participants. 1 --> lowest & 5 --> highest.

In short, the usability testing helped me realise what could stay, what needed to be fixed and what had to go. Also, since I started testing when the tool was at a medium-fidelity prototype level, I was able to catch these issues quite early in the process.

User Evaluation

To gain an objective understanding of the tool, I decided to perform a formal user study once the tool reached the level of a working prototype. The aim was to test specific hypotheses in order to answer the research questions.

Research Questions

Based on the outlined objectives, I defined the following research questions:

  • Can the proposed software tool help non-expert researchers gain an insightful overview of the community?
  • Can using the software tool make the literature review process time-efficient for non-expert researchers?

User Study Procedure

The study was divided into two groups. Each group was given instructions to perform a literature review of a pre-specified Computer Science research field. One group was provided the tool as an aid; the other group was provided no aid. The expected output was a list of the literature they found and a general understanding of the research field. A list of the literature each participant found, and the time at which they found it, was maintained for quantitative analysis. I observed the entire process without making any comments and noted down anything I found interesting.

Some key points of this procedure were:

  • Between-subjects design
  • 18 participants --> non-expert researchers
  • Treatment group --> 9 participants
  • Control group --> 9 participants
  • Tool familiarisation --> 10 minutes
  • Lit. review time-limit --> 30 minutes
  • Documentation method --> Notes + Voice and Screen Recording
  • Follow-up (mostly) structured interview at the end

Hypotheses

The study was designed to test the following hypotheses:

  • Hypothesis I: The mean number of papers found by the treatment group is higher than the mean number of papers found by the control group.

  • Hypothesis II: The control group has a higher mean percentage of papers by unique authors than the treatment group.

  • Hypothesis III: The mean number of papers by unique authors found by the treatment group does not equal the mean number of papers by unique authors found by the control group.

  • Hypothesis IV: The mean number of add-on* papers found by the treatment group is higher than the mean number of add-on papers found by the control group.

  • Hypothesis V: The mean difference in the times at which matched participants found their n-th paper is greater than 0 (with n being the minimum number of papers found within the matched pair).

*Add-on papers are those papers that belong to an author, institution or publishing, i.e. a conference or journal, for which a paper was already found.

Hypothesis I was designed to test whether there was any significant difference in the number of papers found by the two groups. Hypotheses II, III and IV were designed to identify the cause of this difference, in case it did exist. To clarify, Hypothesis II compares the relative percentages between the two groups, whereas Hypotheses III and IV compare only the absolute numbers of a single category, i.e. either papers by unique authors or add-on papers, between the two groups. Hypothesis V was designed to test whether one group found research literature faster than the other.

Testing Methods

I chose the following hypothesis testing methods for the quantitative analysis based on the recommendations made in Quantifying the User Experience: Practical Statistics for User Research by Jeff Sauro and James R. Lewis. An illustrative computation sketch follows the list below.

  • Hypotheses I to IV
    • Unpaired two-sample t-test, 95% confidence level (α = 0.05)
  • Hypothesis V
    • One-sided paired-sample t-test, 95% confidence level (α = 0.05)
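To illustrate how these tests work, here is a small sketch of the unpaired two-sample t statistic (Welch's variant, which does not assume equal variances) as it could be computed for Hypothesis I. This is not the original analysis script, and the data arrays are placeholders rather than the study's actual results.

```javascript
// Illustrative sketch: Welch's unpaired two-sample t-test for Hypothesis I
// (mean number of papers found by treatment vs. control group).
// Hypothesis V would instead apply a one-sided t-test to the within-pair differences.

function mean(xs) {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

function sampleVariance(xs) {
  const m = mean(xs);
  // Sample variance with (n - 1) in the denominator.
  return xs.reduce((sum, x) => sum + (x - m) ** 2, 0) / (xs.length - 1);
}

function welchTTest(groupA, groupB) {
  const vA = sampleVariance(groupA) / groupA.length;
  const vB = sampleVariance(groupB) / groupB.length;

  const t = (mean(groupA) - mean(groupB)) / Math.sqrt(vA + vB);

  // Welch–Satterthwaite approximation of the degrees of freedom.
  const df = (vA + vB) ** 2 /
    ((vA ** 2) / (groupA.length - 1) + (vB ** 2) / (groupB.length - 1));

  return { t, df };
}

// Placeholder data: papers found per participant (N = 9 in each group).
const treatment = [12, 10, 14, 9, 11, 13, 10, 12, 11];
const control   = [8, 7, 9, 6, 8, 7, 9, 8, 7];

const { t, df } = welchTTest(treatment, control);
// Compare t against the one-sided critical value for df at α = 0.05
// (or look up the corresponding p-value in a t-distribution table).
console.log(`t = ${t.toFixed(2)}, df = ${df.toFixed(1)}`);
```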

For the qualitative side of the evaluation, I chose a mostly structured interview because I wanted to ask all the participants a standard set of questions, e.g. questions to judge their overview. This way, I was able to obtain a representative sample that allowed me to make inferences and statements.

Findings

On performing statistical analysis, I was able to make the following discoveries and inferences:

  • The treatment group did find significantly more papers than the control group
  • The difference existed due to the treatment group's foray into add-on papers
  • The treatment group found the same number of papers as the control group in significantly less time

Through the interview, I was able to judge:

  • The treatment group was better able to answer overview-related questions than the control group, as seen in the following graph
Comparison of percentages of participants from treatment and control groups who were able to answer a specific question. Ntreatment = 9, Ncontrol = 9.

  • The control group would have appreciated the aid of a visualisation during their process

Therefore, both research questions were answered affirmatively.

Takeaways

Designing and implementing a formal user study effectively took all the guesswork out of the equation. These are the reasons why this was awesome:

  • Appropriate statistical methods made analysis very straightforward
  • I was able to objectively answer research questions
  • I had actual proof that this tool works!

Following up the process with an interview had the following advantages:

  • I could better understand the reasoning behind the participants' decisions
  • I could evaluate their understanding and overview of the research field
  • I could collect their (subjective) views on various aspects

Overall Takeaways

  • Surveys are great for preliminary information collection because they are quick and reach a rather large audience
  • Usability testing using concurrent think aloud is key to understanding what works or doesn't work for the user
  • User interviews are the way to go to really be able to answer the whys
  • Objective user studies are the best way to have proof that the product really does what it's supposed to
  • Avoiding bias at every stage of research is a must!

What would I have done differently?

  • Although the online questionnaire reached a very large audience and was very informative, it also contained a lot of open-ended questions to really draw information out from the participants. This did irritate some of them. In retrospect, perhaps I should have kept the ratio of open-ended to closed-ended questions even lower.
  • While performing the semi-structured interview following the usability test, I did outline objectives and base the interview questions on those. However, since I asked users to rate various aspects of the tool, perhaps it would have been a better idea to follow a more standardised scale such as the System Usability Scale.
  • The user study consisted of only 18 participants. From a statistical point of view, I would have preferred to have a much higher number of participants to have results at an even higher level of confidence.
  • Since the literature review methodologies of experts were so important to this work, I should have focused further on conducting one-on-one expert interviews.