Components

All the things

Adrian Galvin
Thesis Modules

--

Case Study 0: Seeds of Visualization Design

Medical Concept Visualization

Context

Classical medical illustration, and much of modern medical visualization for that matter, has the intent of communicating the form of a complex human body structure. Throughout the lifetimes of illustration titans such as Henry Gray or Frank Netter, the human body and its myriad components were extremely difficult to image in their living state. Advanced imaging and preservation techniques did not yet exist to let doctors and scientists see the human internal form as they needed to: in its vital arrangement. In this era, it was the work of illustrators to carefully study cadavers, images, and samples, to imagine what the living form of a tissue might look like, and to express that form directly. As an illustrator, I have a fundamentally different mission. Stunning high-resolution CT, MRI, and other scanning technologies allow doctors to see even the smallest structures in the human body while they are yet alive. From my experience, I have found that what illustrators can offer is the ability to distill complex phenomena, especially physiologic phenomena, into readily comprehensible form. We must go beyond visual description, to access the world of dynamic phenomena which occur over time and space. This work was the beginning of my story as a designer, an experience which inevitably led me to Carnegie Mellon to perform the research that you read today.

Mission

My partner research team at the University of Maryland and Institute of Radiologic Pathology are specialists in the study of diffuse lung pathologies. Every year they hold a course for hundreds of radiology residents, which helps prepare these students for their final board examinations and the real world of medical practice beyond. I was originally commissioned as an illustrator to create images for their presentations on the disease progression of emphysema. However, this work continued into the creation of a physiologic concept animation and marked the beginning, though I didn’t know it at the time, of my life as a visualization designer and maker of cognitive tools for researchers.

Approach

My design approach has always centered around close personal relationships, embedding in teams for long periods of time to see their work, work alongside them, and observe the effect that my work has on my partners. I didn’t know these terms yet, but my process leaned heavily on contextual inquiry, observational studies, in-depth subject matter expert interviews, an alpha version of reflective sketching which you will hear more about later, and regular in-person user studies. You will see refined versions of these throughout my thesis, but I started with this core combination of techniques in response to the particularities of working with small numbers of world class experts on research problems and knowledge creation.

Research Through Design

The result of the first phase of this project was a sequence of five detailed illustrations which demonstrate how emphysema spreads from the smallest conducting airways of the lung through the lung parenchyma. The images each demonstrated a diagnostic view of a distinct phase in the disease progression. They were shown as a sequence of cross-fading images, conveying change over time. However, in seeing these illustrations, we realized that there was a key component missing. In order to truly see the detailed transformation, we needed to show students why it was happening, what drove this process. There is a particular spatial pattern of particle deposition and clearance during respiration, and this dynamic movement accounts for the visual pattern of disease progression. We only discovered the necessity of illustrating this concept when we saw the completed phase one illustrations. The distilled visual form indicated a necessary new direction for research.

To express the underlying physics which drives the visual pattern of this process, I had to show how the location of inhaled particles changed over time, in other words how they moved. The mechanisms of deposition and clearance generate a consistent particulate movement, which in turn exposes segments of the lung tissue to varying concentrations of particles. The areas which experience exposure to greater particulate concentrations break down first, and are therefore the places where the disease first becomes visible. Only a carefully crafted motion graphic can express the complexity of this movement, and thus make it comprehensible for students.

My Role

I will highlight four distinctive components of this visualization design process: reflective sketching, anomaly detection, harmonizing with researchers’ ways of understanding the world, and understanding through motion. I applied each of these principles as a designer intuitively and without self-awareness. Looking back, I see the seedlings of the work that you will see in this document.

Territory, not Problem Space

Where I am

My Not Problem Statement

For the work of this thesis, I firmly reject the framing of a problem space. A problem space a priori implies brokenness. If something is broken, it causes problems. Those problems need to be fixed. Once fixed, the problem space will no longer be broken, and success will have been achieved. This is how much of design is viewed: problem identification, problem framing, designing, solutioning, and in the end the problem will be deemed solved. In this context, it is vitally important that a designer be able to understand and express the problem space, because their endpoint of design depends on the clarity of their understanding. I do not denigrate this approach.

But, this is not what I am doing.

Further Differentiation

I am exploring a new territory at the intersection of visualization design and scientific knowledge creation. This is not a pristine, un-trodden territory for designers; we have been working with scientists for years. However, the character of this historical, status quo relationship is fundamentally different from what I am trying to do. Most designers who work in the field of science do so primarily as translators and external communicators of fully resolved scientific knowledge. Recall the announcement of the Higgs boson discovery. For decades, physicists pried into the inner workings of particle interaction. Finally, researchers at CERN’s ATLAS and CMS experiments felt confident that they had irrefutable evidence of the long-sought particle, and it was at this moment that the work of design was initiated. Communication designers made graphics, charts, maybe even a few sexy 3D renderings of the moment itself. Designers were brought on specifically to translate and visualize this novel discovery, package it into a highly readable, polished form for the public, and transmit it to them. This relationship, much like the relationship I describe in case study 0, my time as a medical illustrator, is linear: a direct transmission of fully formed scientific knowledge, secondarily polished and packaged by designers, consumed by the reading public. I aim to be at the lab helping scientists to make discoveries, and I aim to use the skills of design to accomplish this task.

My Practice

Recursive Visualization Design

Evolution

In case study 0, I talked about my past practice of making medical visualizations for doctors and researchers. I view the approach to designing that I have now to be a direct and logical outgrowth of that work, but there are some key evolutions that bear examination. There are four high level or meta skills that I regularly employ as a designer: translation, distillation, interpretation, and visualization. Comparing myself at the end of this period of study in design for interactions to myself during my medical illustration years, my practice still rests on the same four meta skills. What has changed is the proportion of each in relation to each other.

When I was purely an illustrator, my job was to grasp the concepts that the doctors wished to express and figure out how to materialize them in a comprehensible visual form. The relationship was uni-directional, linear, and involved two distinct phases: the doctor would take their understanding and give it to me, then I would translate that knowledge into a visual form. But a funny thing happened when I did this: the doctors would say to me, often with a slightly puzzled expression, ‘you know, I really understand this concept in a new way after working with you to express it’. I’ve learned in life to pay close attention to anomalies, to the unexpected. A new feedback loop, or potential for a feedback loop, was just beginning to reveal itself to me. I saw for the first time that my work as a designer could feed back into the process of scientific research and spark new understanding in the researchers’ minds.

I came to CMU to see if I could begin to understand that feedback loop. After three years of study, my design process has gone from linear and mostly translation plus visualization, to a recursive process characterized more heavily by distillation and interpretation which feeds periodically back into the minds of researchers. If I were to characterize these two processes as questions, they would be:

primordial process ‘What does this data look like visualized?’

evolved process ‘What visual forms can I feed back into the research process to catalyze new moments of insight?’

Recursive Visualization Design

When we think of a repeated action in design, we most commonly think of iteration: a designer taking an idea or direction, and repeatedly making new, improved, more sophisticated, more sensitive, and more fully realized versions of a design. This is not what I mean when I say that I recursively design visualization inputs into a research process. To be sure, I iterate on visualizations to drive them forward. But that repeated designing is a forward drive that is bent back into the stream of research to come in contact with the scientists’ minds and thoughts. Iteration is about repeating toward a design objective. Recursion is about sparking new ways of knowing in the minds of my colleagues through sensitive design choices.

Impostering, Scientifically

A New Way for Designers to Be

Literature

In my review, I discuss findings from four separate but overlapping fields: human-computer interaction, cognitive psychology, design studies, and visualization studies. All of these fields have much of value to say. But they fail to address how a designer should do the work of using visualization skills to spark new scientific insight. HCI demonstrates the fact that visualization can have an impact on researchers, and carefully catalogs what this impact is. Cognitive psychology characterizes how visualization is able to enter a person’s mind and create complex meaning within. Visualization studies catalogs novel and historical methods and techniques for representing information in visual form. Design studies, among many other things, elucidates the ways that scientific knowledge is constructed.

Examining the Doing

Unfortunately, none of these extremely valuable things tells me how I should proceed as a recursive visualization designer if I aim to spark novel insights. The inner mechanics of the design choices and actions themselves are not to be found in this literature. In order to discover the techniques and methods, I have chosen the practical approach of simply getting to work and doing it. This thesis is built around three supporting case studies, and one primary case study, all of which are different explorations into the territory of designing within the context of active scientific research. In each I will explain what I did, how I did it, what choices I made, why I made those choices, and what effect my work had. I believe that design is not purely an intellectual pursuit; it can never be fully understood through reading, thinking, and writing alone. I have certainly done all three of those things in this process and I believe in their value, but the core of this research involved designing as an action.

Living in the Prototype

It is fair to ask: ‘Why should a designer be on a research team?’ I will answer this question by showing in my case studies that bringing the discipline of design into the context of science produced novel and unexpected results. There are days when I have felt like an imposter, working alongside scientists who have knowledge and a breadth of understanding of the universe that I will never have. For this reason I have worked throughout this year to hone and examine every part of my process as a designer, so that I can bring something of real value to the table. We could consider each of the case studies that I will present as prototypes. But I consider myself and my practice to be the prototype. Each case study is a context of exploration. This document is a synthesis of what I have found along the way during this transformation of self.

Literature Review

Adjacent Territories

This thesis should be viewed as an organic outgrowth of previous research and thinking. There exists an extensive body of literature in human-computer interaction, design studies, cognitive psychology, and visualization studies which articulates the benefits, dynamics, interactions, and results of visualization for scientific research. However, little has yet been written to describe how a designer may participate in and contribute to the process of knowledge formation through visualization in a scientific research context. My work grounds itself in these four fields, but extends knowledge by growing into a new territory. I argue that there is a place for designers on scientific research teams, and I aim to catalog and demonstrate how I have done so in my own practice.

As a visualization designer, I will start by defining why I do what I do, what drives me at the deepest level. Fundamentally I am driven by curiosity, by a desire to know. Not a superficial interest in the world, but an abiding need to see deeply into phenomena, to know their true nature and dynamics. Those that do this most comprehensively, we call scientists, and I strive to enable these people to reach uncharted areas of knowledge. Most frequently, designers act as external communicators of knowledge creation. A research team executes and concludes their research, and only then brings in the designer to help them communicate their learning to the world. I wish to do something different: to create visual cognitive tools which enable these knowledge explorers to reach places that they might otherwise not have. Put succinctly, I aim to design tools which enable scientific discovery. I turn to Saraiya, North, and Duca to expand on how this is possible:

“A primary purpose of visualization is to generate insight. The main consideration for any […] researcher is discovery. Arriving at an insight often sparks the critical breakthrough that leads to discovery: suddenly seeing something that previously passed unnoticed or seeing something familiar in a new light. The primary function of any visualization and analysis tool is to make it easier for an investigator to glean insight … ” (443)

For these researchers, visualization allows insight into data sets which are impenetrable, or at least exceedingly impractical, to comprehend analytically or numerically. A canonical description of this enabling process comes from Zhicheng Liu and Jeffrey Heer, who study the effects of interactive latency on exploratory visual analysis tasks. The strongest conclusion of their study is based on this observation: significant interface latency decreases user activity and data set coverage by depressing rates of observation, generalization, and hypothesis formation during exploratory research tasks. This statement can seem obvious on its face; it is unsurprising to conclude that slower visualization systems reduce exploration. However, the authors argue for a deeper, more impactful, and potentially surprising conclusion.

Inverting the observation clarifies the significance of this finding. Visualization systems which respond at the speed of researchers’ thoughts enable greater data set coverage by increasing rates of observation, generalization, and hypothesis formation. In other words, improving the visualization system does not take researchers to the same conclusions faster; it enables them to reach more, and more novel, conclusions.

A summary of this constellation of authors’ key findings, plus an addition of my own, renders the following logic: designers, as specialists in the visual representation of data, deliver value to scientific research teams by creating visualizations which harmonize more closely with the way that researchers think and work, which enables said researchers to ask new and better questions of their data. This idea seems logically sound and makes intuitive sense; however, making sense is not enough for scientific endeavors: the hypothesis must be tested.

This presents several major classes of difficulty. First, the inherent ambiguity of exploratory research questions. Second, the problem of the ecological validity of any tests performed. Third, the variation and specificity of each researcher’s approach and work.

Evaluating the utility of a visualization in the context of exploratory research presents challenges for the standard array of usability tests which designers, human-computer interaction researchers, and human factors researchers traditionally use. In these tests, users are asked to accomplish known tasks, and their ability to succeed in the interface can be measured in a variety of ways. This gives the test proctor a measure of how competent the interface is for a given task. However, when the task is to discover new anomalies, or to form novel hypotheses, it is very difficult to empirically measure the competence of the interface because the number of variables is much greater. When so many variables are present, it is difficult to say with confidence whether success or failure is driven by the visualization or by another factor. The final problem is that measuring the amount of novel insight is inherently more difficult than measuring the rate of success in executing a known task. Because of the difficulty of defining and measuring ‘novel insight’, a domain-specific test protocol must be designed. Saraiya, North, and Duca present a protocol for genetic micro-array testing, and note which elements are generalizable to other branches of scientific inquiry. The protocol which I will present in this document is a direct outgrowth of their framework.

Barkley defines ecological validity as:

“… the degree to which the results of laboratory methods represent the actual behaviors of interest as they occur in naturalistic settings” (150)

In the context of visualization research, this concept directs us to ask whether the behaviors elicited under testing conditions, even if a domain-specific method of assessing insight is generated, actually reproduce the behaviors of researchers in their daily context of research. In fact, circumstances would suggest that this is not the case: scientifically valid research frequently occurs over years, and without a test proctor pushing the researcher. It is likely that the problem of ecological validity in relation to assessing the efficacy of exploratory research visualizations cannot be eliminated. However, steps can be taken to minimize this problem. A canonical example is presented by John Rieman, in which he deploys user-completed ‘eureka reports’ which are to be filled out in the user’s natural environment, precisely at the moment of insight. I have applied a similar approach to my research by supplementing modified classical usability studies with longitudinal in situ ‘insight reports’. The aim of this study is to characterize the researcher’s use of the visualization tool in their natural context while working in their habitual manner.

Finally, atmospheric and climate scientists pursue radically different work, and carry out their work in a surprising variety of ways. One researcher that I have worked with statistically assesses the impact of wildfire smoke plumes on climate science models. Another performs comparative studies of the properties of wildfire smoke plumes in similar biomes on different continents. Although there is subject commonality, the visualizations which are most useful to each of these researchers would be quite different. It is thus a significant challenge to build visualization tools of general utility, even within a single field of scientific research.

Reflective Sketching

A Design Approach to Helping Expert Users Distill Information

Landscape

Expert users such as scientists and medical doctors must know an immense breadth of information in order to perform their work. Years of training coupled with a high stakes and high complexity work environment shape the minds of experts with deep knowledge. As a designer this presents two challenges: first, an expert user, in spite of their expertise, is likely to know little or nothing about the design process; and second, the amount of knowledge necessary to collaborate usefully with experts is greater than with a non-expert audience. There is both a mass and a filtration challenge: a large volume of information must be taken in, and the key pieces distilled out by a designer who is necessarily working outside her own expertise.

Strategy

The approach that I have found most successful is a specialized type of sketching applied as a follow-up to any contextual inquiry, interview, or co-design activity. The goal is to sort through the information that I receive and distill it down to a few sketchbook pages which capture what I believe, as a designer, to be the key information that I learned and need to take forward in the design process. The sketchbook pages should then be shown to the expert in an informal conversation, giving the designer the opportunity to reflect back her current understanding of the information.

This approach has several advantages. First, the process of considering the information and the act of visually articulating it on a page helps to refine and cement a designer’s understanding of the concepts, easing the recall and application of the new knowledge at a later time in the design process. Second, the sketchbook as an external artifact allows the expert a context for giving feedback which does not have to be directly targeted toward the designer. I have found that users are sometimes much more willing to critique a sketch than to critique in direct conversation. The third advantage is that any conversation will naturally be bounded by the limits of the sketches themselves so that it does not overflow or go off track into something less useful. An easy transition is open to follow up interviews or activities after this reflective conversation. Fourth, users themselves can see clear evidence of how they are coming across to the designer, providing a feedback loop which can be useful, especially if the designer will have a close, long term relationship with the expert, as has been common in my work. Lastly, if the designer is working on a team, these sketches can act as a kind of consensus building, setting a common standard for what the team considers to be the most crucial insights and information going forward.

Differentiation from Other Forms of Sketching

In “What Designers Know” Bryan Lawson divides sketching for design into several primary classes; I will discuss the differences and similarities of the three which share the closest kinship with reflective sketching. First, presentation drawings, in which a designer “… communicate[s] their work to clients and others from whom they may need some agreement, consent or permission to continue” (34). This is quite similar to the reflective sketching that I propose in that it involves a conversation with or a presentation to a client or user. However the similarity breaks down there: reflective sketches are not meant as a persuasion, they have a more humble attitude of inquiry. Additionally, the reflective sketch is not of the artifact to be designed; it is prior to it, expressing a critical idea which the designer will take forward into the ideation phase. Second, consultation drawings “… are primarily intended to convey information from designer to client or user or other participant in the design process. However, these drawings are done not so much to convince as to elicit a response in order to assist in the designing process itself” (36). This is one step closer to reflective sketching because the intent to elicit a response and cultivate a useful conversation is the same. The difference here again is the topic: a reflective sketch is a representation of a key concept, not a design idea or iteration. Third, diagrams “…drawings that we might normally describe as charts or graphs … These are obviously thinking drawings” (39). There is a commonality here in the fact that both reflective sketches and diagrams are generally abstractions or have an aspect of abstraction. Additionally, both can be easily classified as thinking drawings. However, diagrams for Lawson are internally facing: they are a type of visual reasoning or calculation which the designer uses to further or deepen her own understanding by herself. Reflective sketches perform this function, but they go beyond it, as they are fundamentally intended to be used in conversation with the expert user.

Another category of drawing that I wish to address is what Lawson refers to as proposition drawings, which are identical to Donald Schon’s classical description of drawings as a space of conversation for the designer. I will refer to Schon’s classical formulation of designerly reflection-in-action. There is a syntactic similarity here: both my formulation of reflective sketching and Schon’s reflection-in-action hinge on the concept of reflection. In Schon’s formulation, in the process of designing, the designer makes moves by adding to their sketch, then considers the gestalt of the image, asking whether this new move is in harmony or conflict with the essence or intent of the design as a whole. If the move is harmonious, the designer continues to draw; if not, they remove the addition, constantly referencing the internal logic of the form. In this way, a conversation occurs between designer and form, a sort of experimental dance which over time builds a coherent image. Reflective sketching as I practice it maintains this idea of conversation through form giving, a back and forth in which designer and expert can co-create an image of the most critical knowledge needed for research. There are two main differences. First, the conversation occurs between the designer and the researcher. Second, the intent is to distill information into a coherent, agreed-upon aggregation of knowledge which will affect the design downstream.

Location, Color, and Form

Perception, Cognition, and Intuition

Oil field fires in Western Iran

Location

The most elemental component of a geospatial story is location. Without knowing where something is, no visualization can be built. The accompanying image shows a data story using only the location of fire events. Naturally occurring fires tend to have an element of randomness to their geographic distribution, whereas these fires display a regular north-south alignment, and persistent clustering around particular locations. It is possible to infer simply from this spatial arrangement that these fires are likely to be anthropogenic in origin. A looping animation of these fires throughout the year of 2018 reveals that their location is remarkably stable, further evidence of a human-created phenomenon. This particular distribution of fires comes from oil fields in Western Iran, and although satellite imagery could be brought to bear to explain this, it is a useful exercise as a designer to visually explain a concept with as little information as possible. In the case of a visualization designer working in the geospatial field, this minimum necessary information must be the location of a given phenomenon.
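The persistence argument can be made computational as well as visual. The sketch below, with invented coordinates and an arbitrary three-month threshold (both are illustrative assumptions, not values from the actual fire data set), asks of each grid cell how many distinct months contained a detection; cells that recur month after month are candidate anthropogenic sources such as gas flares:

```python
from collections import defaultdict

# Hypothetical detections as (month, lon, lat). A wildfire cell tends to
# burn once; a flare over an oil field reappears month after month.
detections = [
    (1, 47.2, 31.1), (2, 47.2, 31.1), (3, 47.2, 31.1), (4, 47.2, 31.1),  # flare
    (7, 12.5, -5.2),                                                      # wildfire
    (8, 30.0, 10.0),                                                      # wildfire
]

def cell(lon, lat, size=0.5):
    """Snap a coordinate to a grid cell of `size` degrees."""
    return (int(lon // size), int(lat // size))

# For each cell, collect the distinct months with at least one fire.
months_active = defaultdict(set)
for month, lon, lat in detections:
    months_active[cell(lon, lat)].add(month)

# Cells active in three or more months are flagged as persistent.
persistent = {c for c, m in months_active.items() if len(m) >= 3}
print(persistent)
```

This is the analytical twin of the looping animation: stability of location across the year, rather than any property of the fires themselves, carries the story.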

Different color scales highlight different parts of a data story, making ‘correctness’ difficult to assess

Color

Color is an illustrative microcosm of the balance between art and science which a visualization designer must strike. The realities of human anatomy and perception bear heavily on the selection of chromatic scales, and sufficient attention must be paid to the critical work of cartographic color researchers such as Cynthia Brewer. The science of determining what colors and differences are maximally perceptible to the human visual system is a complex branch of research which a visualization designer must study carefully and apply to any geospatial visualization they create in order to maximize legibility and clarity of storytelling. Not only must a chromatic scale be carefully chosen, but the function by which the scale is mapped to a given data set is of critical importance for communication. The accompanying image shows only one data set and one color scale, but two different transfer functions. In the case of quantized color application, the steps in color are mapped simply to the axis values of the data set. This leads to many color categories remaining empty and a visualization in which the single outlier hexbin in Iran is dramatically highlighted while the rest of the data is flattened. The quantile application of color applies each color category to a subset of the data itself, which guarantees that all color steps will have equal representation in the data. This produces an image in which there is greater visual difference between the subtle variations of the data set; however, the extreme outlier blends into the rest of the image, belying how truly different it is from the rest of the data. This example illustrates how critical the application of color is to geospatial data visualization, and how different a story can be told with the same data and color scale.
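The two transfer functions can be sketched in a few lines. This is a minimal illustration assuming NumPy, with invented values standing in for the fire data; the bin indices stand in for color categories:

```python
import numpy as np

# Hypothetical values: mostly subtle variation, plus one extreme outlier
# (standing in for the hexbin over the Iranian oil fields).
values = np.array([1.0, 1.2, 1.5, 2.0, 2.2, 2.8, 3.1, 3.5, 4.0, 100.0])
n_bins = 5

# Quantized (equal-interval): edges divide the axis range evenly.
# The outlier stretches the range, so nearly all data collapses into bin 0.
quantized_edges = np.linspace(values.min(), values.max(), n_bins + 1)
quantized_bins = np.clip(np.digitize(values, quantized_edges) - 1, 0, n_bins - 1)

# Quantile: edges divide the *data* evenly, so every color category
# holds the same number of observations and subtle differences show.
quantile_edges = np.quantile(values, np.linspace(0, 1, n_bins + 1))
quantile_bins = np.clip(np.digitize(values, quantile_edges) - 1, 0, n_bins - 1)

print(quantized_bins)  # the outlier alone reaches the top category
print(quantile_bins)   # values spread across all five categories
```

With equal-interval edges the single outlier claims the top of the scale and flattens everything else; with quantile edges every category is populated, at the cost of visually compressing how extreme the outlier really is.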

There is however, another side to this story. A scientifically based, appropriately mapped color scale can nonetheless be ineffectual in conveying the essence of a data set. A successful visualization designer must simultaneously choose colors which are dynamic, evocative, and maintain a close emotional connection to the meaning behind the data. In these visualizations, a dark basemap with colors that suggest fire helps to preserve the impact of the visualization and helps to communicate the power and danger of wildfire. The successful visualization designer must be able to apply both the art and the science of color to produce a truly effective image.

Height and color are mapped to different variables and maintain close cognitive connection to the underlying data

Form

Extending geospatial visualization into the third dimension adds significant complexity to the problems a designer must address. However it also offers opportunities to convey complex phenomena in powerful and effective ways. The accompanying image shows hexbinned wildfire data with hexpillar height representing the average plume height in the subregion, and hexpillar color representing the averaged radiative power in the subregion. Maintaining a close cognitive connection between visual variable expression and data property helps to clarify the image to the viewer by reducing cognitive load. Additionally, viewers can read multiple variables simultaneously in a manner which would be difficult to recapitulate in a two dimensional visualization.
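The aggregation behind such an image can be sketched simply. The illustration below uses a square grid as a stand-in for hexagonal binning (real hexbinning differs only in the cell geometry) and invented detection values, not the actual wildfire data:

```python
from collections import defaultdict

# Hypothetical detections: (lon, lat, plume_height_km, radiative_power_mw).
detections = [
    (47.2, 31.1, 2.0, 150.0),
    (47.3, 31.2, 3.0, 250.0),
    (12.5, -5.2, 1.0, 40.0),
    (12.6, -5.4, 1.4, 60.0),
]

def bin_key(lon, lat, cell=1.0):
    """Snap a point to a grid cell (a square stand-in for a hexbin)."""
    return (int(lon // cell), int(lat // cell))

# Group detections per cell, then average: mean plume height drives
# pillar height, mean radiative power drives pillar color.
cells = defaultdict(list)
for lon, lat, height, power in detections:
    cells[bin_key(lon, lat)].append((height, power))

pillars = {
    key: {
        "height": sum(h for h, _ in vals) / len(vals),
        "color_value": sum(p for _, p in vals) / len(vals),
    }
    for key, vals in cells.items()
}
print(pillars)
```

Each visual variable is a direct average of one physical variable over the cell, which is what keeps the cognitive connection between pillar and data so close.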

Meta Skills

Visualization, Translation, Distillation, and Interpretation

In other chapters I will refer to these meta-skills, which I consider to be the highest level skills which a visualization designer possesses and deploys in their work. I consider all four skills to lie along a ‘continuum of agency’ for the designer: visualization affords the least agency, with translation, distillation, and interpretation each falling progressively further toward the pole of full agency. What I mean by agency is how much freedom the designer has to build visualizations the way she believes best. Below are the specific definitions and overtones with which I will use these words in this document.

Visualization

The act of materializing something in visual form: the concrete decision making of what color, shape, location, and arrangement an image should be given, given a known objective. In my usage, visualization affords the designer the least agency because it is simply the decision of how best to represent an established and unchanging concept. The designer is not free to choose what should be visualized; she is simply making the objective visible. This is not to say that visualization is inferior or requires less skill than the other meta-skills; it does not. It is simply that the designer has the least opportunity to influence the objective.

Translation

The act of recapitulating something in another language or modality. A designer might translate a data structure into a color scale, thereby choosing how that data is to be represented. This affords the designer more agency because the choice of modality or visual channel significantly impacts how the original item to be translated will be understood. Although the designer does not have the power here to choose what to translate, nonetheless her ability to impact the final outcome is increased vis-a-vis visualization.

Distillation

The act of selecting a subsegment of critical content to represent, thus reducing the amount of information expressed but focusing it into a more comprehensible form. An example might be a designer choosing which data variables to represent from a large and complex data set, before translating and visualizing those variables. Here the designer is afforded significantly more agency because she is able to begin to choose what to represent, not just how it should be represented. This commonly occurs when an expert collaborator has an excess of unsorted information or knowledge, whose key aspects the designer must comprehend and recognize.

Interpretation

The act of casting or reflecting distilled information into a novel or unexpected form in such a way that a previously un-seen aspect of the information is revealed. This act affords the designer the most agency in that it allows her full control of what is to be represented as well as how it is to be represented. This degree of latitude is usually only afforded a designer in the context of an intimately trusting collaboration with experts.

Case Study 01: Making Visualizations Resonate with How Researchers Think

Planetary Micro X-Ray Fluorescence Spectrometry

Context

The Mars 2020 rover's PIXL instrument is designed to detect signs of ancient life in the rock formations of the red planet. Organic processes leave markings, patterns, and chemical evidence which cannot occur without biological mediation. Geologists on Earth use these same techniques in their search for the oldest direct evidence of life on our planet. Similar microXRF instruments are used in both contexts to investigate the spatial relationships between elements and minerals (solid, pure chemical compounds) which, to the expert eye, can tell the story of how a rock was formed, and thus determine whether biologic mediation must have occurred.

Carrying this type of investigation out through the instruments of a planetary rover confers additional challenges. On Earth, additional tests may be carried out to disconfirm hypotheses as needed. Working on a different planet, the researchers' ability to interact with the rock formation is restricted, putting additional emphasis on the visual analysis of spectroscopic data.

Mission

Scientists at the NASA Jet Propulsion Lab need to be able to visually assess microXRF data sets in order to determine if biologic mediation has occurred in a given soil or rock sample. Currently the team extracts the concentration values for each element present in the sample, and then renders each element map in a distinctive color. The expert then reads the geologic history of the rock across all of these maps.

Limitations and Challenges

Simple visual solutions such as this can be helpful in scientific decision making, but are of limited utility in addressing more complex cases. The prototypical simple case is illustrated below. One known form of geologic evidence of biologic mediation is called Oolite: a sedimentary rock formed of small, concentrically layered grains called Ooids. This distinctive radial structure can be formed through accretion of microbial biofilms, which can be observed with microXRF instruments.

The concentric spatial distribution of elements within the Oolite makes this a simple problem to address with visual analysis. Separate maps representing the concentration of each element are generated and given a distinct color. These can be used separately in a small-multiples comparison, or they can be overlaid by mapping concentration to alpha rather than to darkness. Because this particular sample has little color overlap, or color mixing due to complex mineralogy, the resulting image is clear and the story simple to comprehend. Unfortunately most samples are not like this, and more complex visual strategies must be applied to comprehend them.
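The overlay variant can be sketched as a simple "over" composite in which each element keeps a constant hue and its concentration drives opacity; the element-to-color assignments and the 0–1 concentration range here are assumptions for illustration, not the team's actual pipeline:

```python
import numpy as np

def overlay(element_maps, colors):
    """Composite element maps by mapping concentration to alpha.

    element_maps: dict of element symbol -> 2D concentration array (0-1).
    colors: dict of element symbol -> RGB triple in 0-1 (illustrative).
    Each element keeps a constant hue; concentration drives opacity,
    so layers stack without forcing darkness to encode value.
    """
    h, w = next(iter(element_maps.values())).shape
    out = np.zeros((h, w, 3))
    for element, conc in element_maps.items():
        alpha = np.clip(conc, 0, 1)[..., None]   # concentration -> alpha
        color = np.asarray(colors[element])      # constant hue per element
        out = out * (1 - alpha) + color * alpha  # "over" compositing
    return out
```

With few co-located elements, as in the Oolite sample, the composite stays legible; as overlap grows, the mixed hues become ambiguous, which is exactly the failure mode described below.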

A different and more challenging sample is depicted below. This is an Earth rock which was once thought to be evidence of ancient life; scientists on the PIXL team now think differently. Their visual analysis methodology is based entirely on small-multiples cross comparison because the amount of co-located elements and minerals makes overlay visualizations ineffective.

Problem

In this sample, all seven of the depicted elements overlap in the upper region; an attempt to understand this region by overlaying colored element maps would be ineffective. The illustration below demonstrates the difficulties inherent in attempting to disambiguate the mixing of several distinct colors. Although each color is legible on its own, the various mixtures are impossible to characterize with the level of confidence needed to support a scientific conclusion. However, working from an abstract analysis of the raw values is equally ineffective because high-level patterns are not highlighted in any way.

Approach

Part of a scientific visualization designer’s role is to provide novel solutions for circumstances such as these. An effective method of moving past this type of challenge is to understand the researchers' deeper need: what are they trying to know or understand? Although scientists have an established way of working, that way may not be the most effective, especially if it reflects previous software limitations and the force of habit.

The JPL team’s approach of small multiples and opacity overlays came about because it was the only approach afforded by the analysis software that they use. Continual use of these limited products generated an industry-wide habit of thinking, which the researchers expressed to us as their need. However, there was a deeper layer to discover below this.

In this case, we were able to run co-design sessions in which the researchers revealed that they wanted to understand where minerals were located spatially, to see each mineral's spatial distribution clearly. As such, color was an unnecessary element of the visualization. The visual solution which we offered allows researchers to input the combination of elements which make up a given mineral, and receive a black and white visualization which gives at each pixel a binary answer to the question: are all requested elements present at this location? What we discovered was that the researchers thought that they needed to see multiple element maps overlaid, but what they actually needed was to see where minerals were present.
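That binary question reduces to a pixelwise AND across thresholded element maps. A minimal sketch, assuming element maps are 2D concentration arrays; the threshold parameter is hypothetical (a real cutoff would come from the instrument's detection limits):

```python
import numpy as np

def mineral_presence(element_maps, required, threshold=0.0):
    """Binary map: True where all required elements are present.

    element_maps: dict of element symbol -> 2D concentration array.
    required: element symbols that make up the mineral of interest.
    threshold: minimum concentration to count as "present"
               (an assumed cutoff, not an instrument-derived value).
    """
    present = np.ones_like(next(iter(element_maps.values())), dtype=bool)
    for element in required:
        # AND this element's presence into the running intersection.
        present &= element_maps[element] > threshold
    return present
```

Rendered as black and white, the returned mask answers the researcher's question directly, with no color decoding required.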

Visual Solution

This view is simpler: it culls unnecessary detail from the representation in order to answer precisely what the researcher wants to understand. This has a downstream effect on the researcher’s workflow: the visualization more clearly tracks, or resonates with, how the researchers think about the sample and its formation history. In the previous visualization, researchers spent much of their cognitive effort decoding the color blocks they were examining. In this visualization, on each card the researcher can see precisely the location and distribution of each element or mineral.

Design Decisions

The interface that this visualization lives in is black, with light type and bright visualizations. I took this from radiology interfaces that I have worked on in the past. Radiologists are visual pattern analysts who see subtle and intricate differences in shades of gray. These shades represent the evidence of disease in the human body, and because a radiologist is significantly responsible for how a disease will be treated, they must carefully see all of the tiny, subtle differences in value in an x-ray or MRI; the patient’s health depends on this. In order to support this careful seeing, radiologic interfaces are dark, which removes excess light that would otherwise constrict the pupil. With their pupils open wide to catch all of the detail, the bright visualizations become the focus of their analytic attention. The UI for this visual tool takes a similar approach because the PIXL scientists are doing something very similar: grayscale investigation of subtle patterns which represent a complex story to be discovered.

Another critical decision was to remove the color from each element map. This color originally played the role of helping the scientists distinguish each element in opacity overlays. Additionally, the team developed a common chromatic language around each element: for example, all scientists on the team agreed to represent iron with red, which helps in team cohesion and consensus building. Since we removed the opacity overlay technique, each element map could be reduced to grayscale, which is a more perceptually consistent representation. Because of the contingencies of human eyes and visual cortex perception, a dark blue pixel will appear darker than a yellow pixel even if they are at the same lightness value. By removing color from the map, we eliminated this challenge from the scientists' daily workflow. In order to preserve the team's chromatic language, each map card has a single color dot next to the element letters, allowing the team to maintain their cognitive connection of element to color.

One further challenge is the difficulty of recognizing features across element maps. To overcome this, we only allow horizontal spatial comparison of maps and intersections. This way, all distinct features will be at the same Y height in space, supporting side-to-side scanning tasks. To further clarify features, we created a feature-trace method which allows the scientist to trace a line across the feature in the context image. Once created, this line is propagated automatically across all element maps and intersections, fixing the exact location of the same feature across all maps.

My Role

The primary transformation in this case study occurred when my team was able to understand the underlying scientific need or inquiry. The science team, partially because of their expert status, was unable to see that their thinking had been sculpted by the contingencies of their ecosystem of non-specialized tools. In identifying and helping the team to articulate precisely the line of inquiry that they wished to open, we were able to reveal a deeper need to fulfill through our visualization. In this case, the final form of the visualization is less important in and of itself than the novel form of inquiry it reveals, one which allows the scientists to focus on comprehending the sample as opposed to decoding imagery.

Case Study 02: Anomaly Detection

United States Fisheries Data Exploration

Context

Researchers at the Brown University Public Policy Institute and the Carnegie Mellon School of Design are interested in understanding the fish production system, from harvest to table and home service, in the United States. The system is complex; preliminary investigation indicates that understanding the black box of fish processing practices, the middle-men of the system, will reveal fraudulent and unsustainable practices that need to be improved if we are to make our fishing sustainable overall.

Mission

I was asked to make a simple interactive application to visualize the top ten wild and farm fished species in the United States. The concept was to create simple bar charts which show the total number of tons caught, and what proportion of that domestic catch was exported in the same year. However, in the process of manifesting this simple visual, I discovered several anomalies which helped clarify the research direction of the team. My assertion is that the necessities of visual form giving helped reveal anomalies and ask questions of this data set.

the four visual form concepts

Sketching

The concept began with four candidate forms, generated by two questions: Should the tonnages be ordered by total domestic catch or by export amount? Should the visualizations be baseline or midline organized? Sketching is an effective way to imagine the potential answers to questions like these. It was clear that for the researchers' use case, midline-organized, export-ordered visualizations most clearly told the story that they wanted to understand, and I didn’t have to start coding at all to answer this question. Visual reasoning in this manner is a practical way for a visualization designer to move research questions forward, particularly when they relate to visual form giving.

Coding

The work of anomaly detection began from this point forward, starting with the first computer drawings of the real data. I wrote a parsing and sorting algorithm in Python and then visualized the ordered data in Processing. Below is the first graph that I produced. Two things immediately stood out: first, the scale difference between the largest catch and the smallest made it nearly impossible to see both at the same time; second, we appear to export more than we catch of several fish species. On the face of it this seems impossible, or at least requires further research and explanation.

white bars are export tonnage, black bars are domestic catch
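The parsing-and-sorting step might look like the following sketch; the column names ("species", "catch_tons", "export_tons") are hypothetical stand-ins, since the real fisheries files are not reproduced here:

```python
import csv

def load_and_sort(path):
    """Parse a fisheries CSV and sort species by domestic catch.

    Column names are illustrative assumptions; the actual data files
    use their own headers. Returns rows ordered largest catch first,
    ready to be drawn as ordered bars.
    """
    rows = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            rows.append({
                "species": row["species"],
                "catch": float(row["catch_tons"]),
                "export": float(row["export_tons"]),
            })
    rows.sort(key=lambda r: r["catch"], reverse=True)
    return rows
```

It is exactly this ordering step that surfaces the first anomaly below: sorting by domestic catch presumes export is a fraction of it, and the salmon row violates that presumption.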

Anomaly 1

We export more than we catch per year of some fish species.

In this graphic, the massive white bar represents the amount of salmon that we export every year. However, we only catch a tiny fraction of that amount. This indicates a further problem: the bars are ordered by tons of domestic catch, of which export should be a fraction. In the case that export tons are greater than domestic catch, how should the bars be ordered? What is the data really saying? Does the United States export 100% of the salmon caught every year? Does the United States export a number of tons far greater than it catches, but nonetheless keep some of the catch for domestic consumption? The necessity of making ordering and rendering choices for visual form giving revealed that further research is needed to understand exactly what is occurring in our salmon harvesting practices. As a visualization designer, I can contribute to the research team's progress by highlighting new questions and directions to pursue which might previously have been hidden in the data.

Visual Solution 1

I placed the full bar for salmon on the export side, indicating that all caught tons are being exported, and applied a saturation change to indicate that the United States exports more than this, but that amount requires further investigation.

Decrease in opacity communicates anomalous data

In this graphic, the anomaly is clarified in such a way that readability is maintained. However, there is also the possibility that this visual arrangement is misleading. The form indicates that 100% of domestic catch is exported, but this could be false. Perhaps a smaller percentage of domestic catch is exported, while frozen or fraudulent tons of fish are being exported as salmon caught this year. This is a strong indicator that this is a question worth investigating for the researchers I am working with. This phenomenon occurs more frequently in aquacultured species, indicating a place for researchers to start their investigation of fraudulent production practices.

Anomaly 2

The second anomaly is the huge difference in scale between the maximum and minimum catch tonnage. This presents a problem for viewers because they cannot access the data visually. One typical solution to a data range problem such as this would be to use a logarithmic scale. However, these scales are less intuitive for public consumption. Since the target audience for this publication includes both researchers and non-expert members of the public, this is not a viable solution from a design standpoint.

Visual Solution 2

I built two interactions into the application which solve this problem dynamically. First, the user can filter out the two largest catches from each graph; then they can change the scale factor in order to see the data rendered at an appropriate scale. This allows both expert and non-expert users to see the data at exactly the scale required to answer their inquiry.

2008: full data set, zoomed out
2008: truncated data set, zoomed out
2008: truncated data set, zoomed appropriately for wild catch
2008: truncated data set, zoomed appropriately for aquaculture catch
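The truncate-and-rescale interactions can be reduced to a single view-preparation function; this is a sketch, and the parameter names and the pixel ceiling are assumptions rather than the application's actual code:

```python
def prepare_view(rows, drop_largest=0, scale_factor=1.0, max_height=400):
    """Apply the two interactions: truncate the largest catches,
    then rescale tonnage into bar heights in pixels.

    rows: list of (species, tons), sorted descending by tons.
    drop_largest: how many of the top catches to filter out.
    scale_factor: user-controlled zoom on the tonnage axis.
    max_height: pixel height of the tallest visible bar at zoom 1.0
                (an assumed layout constant).
    """
    visible = rows[drop_largest:]
    if not visible:
        return []
    top = visible[0][1]  # largest remaining catch sets the scale
    return [(species, tons / top * max_height * scale_factor)
            for species, tons in visible]
```

Because the same data passes through both steps, the user's mental model of the full set carries over into each truncated, rescaled view.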

The dynamic and exploratory nature of being able to zoom and truncate at will allows the user to answer their questions visually. Because the zoom interaction requires several steps, the user’s mind retains cognitive connection between each state, which helps to maintain orientation within the data set while examining it at different levels of scale.

My Role

In this mission, I had to move fluidly between sketching, data analysis, data visualization, and code. Although none of the tasks were particularly difficult, usually these roles would be filled by multiple people with different expertise. However, my expertise involves a fusion of analytic thought and creative expression. My ability to work across these disparate tasks allows me to question and investigate the data in an unbroken way, potentially leading to novel conclusions and research directions which would have remained unexplored.

Case Study 03: Time, Space, Motion

Wildfire Smoke Plume Visualizations

Context

This project centers around a data set from the NASA VIIRS instrument aboard the Suomi National Polar-orbiting Partnership satellite. This instrument returns data on the location and intensity of wildfire events on Earth. This data set is simpler to engage with than the MISR wildfire database for several reasons. First, it can be downloaded easily as CSV files from the NASA database. Second, it contains many fewer variables per fire event than the MISR product; most importantly, it does not measure smoke plume height above ground. I chose this data set because of its accessibility and simplicity, so that I could build a functioning motion prototype without having to overcome the additional challenges that working with MISR data would entail.

Problem

This data set contains countless fascinating stories about our planet, but viewed from high up, with every one of the 187,000 ellipses glittering across the surface, it is difficult for the mind to focus down to comprehensible phenomena. There is so much data that the visualization can become a wash of noise. After constructing the functioning prototype, I had to find a way to highlight understandable stories within the cacophony.

Motion

I took a research-through-design approach to this problem, building interactive animations which told data stories by manipulating time, space, and motion. I began with a looping visualization of the geolocations of fire plumes worldwide for the year 2018. Each fire was represented as an ellipse which appeared on the date that it happened and then faded out over a quarter second. This overlapping fade created a wonderful effect, as though the plumes were being painted across the globe. But this was not just an attractive visual effect: it drew attention to certain stories which were previously obscured in the data set.
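The quarter-second fade reduces to a simple alpha function per ellipse; the 255 ceiling follows Processing's alpha convention, and the linear falloff is an assumption about the original animation rather than its documented implementation:

```python
def fade_alpha(age_seconds, fade_duration=0.25, max_alpha=255):
    """Opacity of a fire ellipse as it ages past its appearance date.

    Each ellipse appears at full opacity and fades linearly to zero
    over fade_duration seconds (a quarter second, per the animation).
    Overlapping fades from nearby dates produce the painted-on effect.
    """
    if age_seconds >= fade_duration:
        return 0
    return int(max_alpha * (1 - age_seconds / fade_duration))
```

Drawing each frame with this alpha means recent fires dominate while fading trails preserve just enough history to make motion legible.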

Data Stories

Visualization with motion displays the entire 2018 VIIRS data set in a dynamic and engaging way, but the individual stories are hidden in the cacophony of 187,000 dots. In order to distill the visual form down to comprehensible moments, I chose specific combinations of latitude, longitude, and zoom level which leave only the fires that I want the viewer to focus on in view. This curated viewpoint is accompanied by a simple explanatory guide text which cues the viewer on what to look for, helping them to see the evolving phenomenon.

data story implementation: a location, zoom level, and explanation
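A data story in this sense is just a named viewpoint plus guide text. A minimal sketch of that structure, with approximate coordinates for the Camp Fire region used as an illustrative example (the field names and values are assumptions, not the prototype's actual code):

```python
from dataclasses import dataclass

@dataclass
class DataStory:
    """A curated viewpoint: where to look and what to look for."""
    name: str
    lat: float        # camera center latitude
    lon: float        # camera center longitude
    zoom: float       # zoom level scoping the view to one phenomenon
    guide_text: str   # cue telling the viewer what motion to watch for

# Illustrative example; coordinates are approximate.
CAMP_FIRE = DataStory(
    name="North American wildfires",
    lat=39.8, lon=-121.5, zoom=6.0,
    guide_text="Watch for sudden patches spreading from a single origin.",
)
```

Cycling the camera through a list of such stories turns the undifferentiated global view into a guided tour of separable phenomena.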

Five phenomena presented themselves in this storytelling experiment. First: African crop burning practices. As the growing season comes to a halt, African farmers burn their fields to help maintain their fertility. But the end of the growing season occurs at different times across the huge continent. The animation shows a sweeping wave washing from north to south through the year, following the change of seasons. This particular motion contains the essence of the story and is only made visible through the animation; it cannot be spotted in static imagery or abstract numerical analysis. Second: the enormous North American wildfires of 2018, including the deadly Camp Fire. Close analysis of the western coast of North America shows sudden, enormous patches of fire which spread from a single origin and then suddenly disappear. This abrupt motion is characteristic of a violent, fast-moving wildfire, a phenomenon which is made eminently clear by the animation. Third: the oil fields of western Iran and eastern Iraq. The regular north-south distribution of these fires, and the fact that there is no cyclic variation in their location throughout the year, marks them as a consistent, anthropogenic phenomenon. All of these phenomena were present in the original data set, but obscured by the fact that the interface could only render an aggregated slice of time. Fourth: Brazilian rainforest encroachment, visible as capillaries of fire invade northward over the year. These fires follow major roadways and supply lines that the loggers use as they progress into the forest. This story could be told more clearly with accompanying satellite imagery; however, that was outside the scope of this simple prototype. Lastly: Australian brush fires, similar in motion behavior to the North American wildfires, can be seen spreading quickly from points of origin.

Design Decisions

I chose a dark base map for this visualization because I hoped to evoke some of the emotional appeal of nighttime satellite imagery, in which the tiny flickering lights of human civilization give a poignant sense of perspective on our place in the universe. In later work, I created a close cognitive connection between the fire event data and my visualization by choosing a dramatic red, yellow, and orange palette. However, for this animation I wanted to experiment with a cognitive contrast. The base map and UI grays are shaded in the cool spectrum, and the fire event ellipses are an icy light blue. This has an odd effect on the viewer, since the visual form deflects their mind away from ideas of fire or burning. This leaves the mind able to focus simply on the patterns of movement and distribution, which is the primary focus of this experiment. In other words, the color choices de-emphasize the meaning behind the content in the visualization, which counterintuitively helps to foreground the primary purpose: an exploration of movement.

The transition animations between stories rely on a modified cosine curve which is used to generate acceleration and deceleration values for the camera position as it transitions between locations. Most map location transitions rely on simple arcs as opposed to cosine waves, which can make the beginning and end of transitions sudden or jarring. By using a cosine wave, all velocity changes are eased, which more closely evokes the movement of a natural object in real space. This is reassuring for the viewer and allows them to concentrate on the visualization without jarring distraction.
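The easing described above amounts to a half-cosine weight on the interpolation between camera positions; a minimal sketch, with the exact curve modification assumed rather than taken from the prototype:

```python
import math

def eased_position(start, end, t):
    """Camera position along a transition, for t in [0, 1].

    A half-cosine ramp maps t to a weight that begins and ends with
    zero velocity, so the camera accelerates out of the start point
    and decelerates into the end point instead of stopping abruptly
    as a linear interpolation would.
    """
    weight = (1 - math.cos(math.pi * t)) / 2
    return start + (end - start) * weight
```

Applying the same function independently to latitude, longitude, and zoom gives the whole transition the smooth, natural quality described above.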

My Role

It is worth noting that the fundamental elements of storytelling through dance are time, space, and motion. I consider these visualizations to be a form of data choreography, revealing their story as they evolve. In this case I was prototyping a process by which I could scrape large amounts of data from online NASA databases, transform that data into something usable through text parsing, and then render that data as a legible and attractive visualization. But deeper than that, my job was to be a storyteller: to take a massive hyper-object and reveal scoped views of it in ways that are understandable to the human mind. I both give the user the entire data set to explore, and provide entry points for understanding.

Case Study Prime: MERLIN

Data Exploration

Context

The MISR instrument team at the NASA Jet Propulsion Lab is a group of scientists, engineers, and software specialists who collect, format, host, and distribute a key data product from the TERRA satellite. This data product contains the location and properties of 56,000 wildfire smoke plumes occurring around the globe starting in 2008. The team believes that its data is critical for a variety of scientific endeavors such as: comprehending climate change, decoding atmospheric dynamics, and disaster mitigation.

Problem

This team faces a challenge: their data set is mostly unused by the community of climate and atmosphere scientists worldwide, who would benefit from it enormously, because the data is stored in a difficult-to-access archive as individual text files. The interface for exploring this data is old and does not provide sufficient visual feedback, filter control, abstract high-level visualization, or download functionality. I worked with a team of designers and computer scientists over the summer to build an appropriate exploratory research interface for this data product, which improves on the previous interface in all of the previously mentioned dimensions. The MISR Exploratory Research and Lookup Interface (MERLIN) joins a long and illustrious line of software named with nested acronyms, a NASA specialty. I had the privilege to continue working with one of the JPL science leads, Dr. Mika Tosca, for the entire thesis research process.

Research Approach

Since this case study was longer than the micro-studies, and is intended to extend the capacities of an existing system in the service of researchers, I applied a multi-modal course of discovery tactics to orient myself and hear from scientists about where they thought the opportunities were. This also gave me a chance to reveal information about some of the investigative superstructure of this thesis: coming to understand how scientists construct knowledge about the phenomena they study. My investigative research had three components: a series of insight-focused usability studies, an in-situ longitudinal workflow analysis, and a descriptive insight-process questionnaire.

Research Precedent and Conceptual Framing

The three studies were chosen specifically to address the aims and concerns which I discovered in my literature review. First, Saraiya, North, Lam, and Duca’s classic insight-based methodologies for assessing the capability of visualization interfaces for exploratory research tasks, although specifically focused on the field of bioinformatics, are nonetheless highly useful models or prototypes for the studies which I built and conducted. Indeed, in their conclusions and discussions they synthesize and describe abstracted concepts, definitions, and assessment criteria which are specifically made to be generalized to other fields of science. They also directly call for other researchers to take up, modify, and adapt their insight-focused methodologies for use in other fields, which I have done. {CITATION}

“This longitudinal study is just the beginning of this line of work, and there is much more research to be done. More studies need to be conducted with different subjects and tools in diverse domains, in order to extract broader abstractions and patterns of the visual analytics process.”{CITATION}

My first study, an observational investigation of what insights could be revealed through the MERLIN interface and how scientists were able to connect those insights into a coherent story or hypothesis, was deliberately designed to assess the construct validity of both the interface and my model of insight within climate science. The second study was a follow-on: a modified diary study which the scientists wrote in over a two-week period, aimed at establishing the ecological validity of my insight model. The final study asked the researchers to describe in writing how they understood their own process of insight and discovery in the context of their self-selected ‘most impactful paper’. I posit that the combination of historical, longitudinal, and observational studies provides a multi-faceted look at how climate and atmospheric scientists construct knowledge through the complex non-linear process of insight and discovery.

Participants

I worked with 7 researchers at 4 different institutions, all climate or atmospheric scientists who are actively publishing as part of their regular employment, and who have a PhD in climate or atmospheric science or are currently earning one. These scientists were recruited informally through the personal network of my primary research partner, Dr. Mika Tosca. One of the main challenges of my research is revealed here: although all of these researchers work in related fields, the actual topics and modes of study employed by each are significantly different, varying even from paper to paper. This variation adds to the already significant challenge of working with experts described earlier.

Study 01: Insight-Based Methodology for Assessing Wildfire Visualizations

There are several instruments on Earth-orbiting satellites which catalog and measure the properties of wildfires and their attendant smoke plumes. The Multi-angle Imaging SpectroRadiometer (MISR), aboard NASA’s EOS flagship TERRA, is unique because of the length of its mission, but more importantly because of its ability to geometrically capture the height and spatial arrangement of smoke plumes in addition to their radiative power. It is generally believed that there is a correlation between fire radiative power and smoke plume height, and the MISR data product provides a unique context for investigating this hypothesis. The instrument itself returns image strips, which are searched for smoke plumes, and the characteristics of each smoke plume are measured using the MISR INteractive eXplorer (MINX) software. Each plume and its attendant values are then stored in text format. Atmosphere and climate scientists use this large data set by statistically analyzing or modeling the whole set, or by using subsets to perform further comparison or analysis. Much like the bioinformatics studies on which this research is based, the magnitude and complexity of the MISR data set make it prohibitively difficult to extract insight without computational methods.

Most usability studies assess the ability of an interface to enable a user to accomplish a known task or outcome. Because the end state is known ahead of time, each user’s workflow can be cataloged, timed, and described, revealing a direct rating or measure of how usable the software is. However, we face an additional problem in the characterization of visualization interfaces for exploratory research because their purpose is specifically to reveal previously unknown or unstudied insights. It is therefore necessary to design and facilitate a study which addresses the open-ended and non-linear nature of exploratory research in wildfire smoke plumes. To this end, rather than measuring user performance and accuracy on known tasks, we aim to recognize, quantify, and describe moments of novel insight.

Previous studies of exploratory research interfaces focused on a cross-comparison of the efficacy of several competing interfaces designed for the same task. Because MERLIN is the first software designed for this task, it was not possible to compare its performance to a control or another piece of software. Therefore this study focuses on characterizing the thinking of the scientists themselves: what do they see, how do they identify relevant anomalies, and how do they make connections to previously seen images or to their own knowledge? In other words, this study aims to describe the way that scientists encounter data, extract insight, and connect those moments of insight into a hypothesis for further study. From my perspective, this is a window onto how scientists comprehend phenomena.

Participant describing what they see in the data distribution
Participant describing what they see in the geographic distribution

Scientists were asked to explore for 1–2 hours; they were given no specific task to accomplish, only to see what in the data set they wanted to study. We followed a standard think-aloud protocol, audio recording each scientist’s monologue and transcribing it for analysis. We use the Saraiya, North, Duca, and Lam definition of insight: “An individual observation about the data by the participant, a unit of discovery.” Transcriptions were broken down into individual grains of meaning and categorized according to a modification of the system developed by Liu and Heer.

Observation is a piece of information about the data which can be obtained from a single state of the visualization system. An observation can be made at the visual level, or at the data level.

Recall is prior knowledge or personal experience brought into working memory to help reason about the visualization.

Question is an indication of a desire to examine an aspect of the data. A question need not be phrased to end with a question mark; it is any expressed desire to know more.

Hypothesis is a well-structured novel direction for future research, or a desire to know something that is outside the bounds of this data set.

Paper Topic is a complete research direction which the scientist affirms could lead to a novel contribution to their field.
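The coded segments can be thought of as a simple record structure, one per grain of meaning. As a sketch of how such segments might be represented and tallied before being reconstructed into insight maps (the category names follow the scheme above; the timestamps and utterances are invented):

```python
from collections import Counter
from dataclasses import dataclass

# One coded grain of meaning from a think-aloud transcript.
# Categories follow the modified Liu & Heer scheme described above.
@dataclass
class Segment:
    minute: float   # time into the session
    category: str   # observation | recall | question | hypothesis | paper_topic
    text: str       # the transcribed utterance (invented here)

segments = [
    Segment(2.5, "observation", "Plume heights cluster below 2 km here."),
    Segment(3.1, "question", "Does height track radiative power in this region?"),
    Segment(9.8, "hypothesis", "Boreal plumes may inject higher than temperate ones."),
]

# Tallying categories is a first step toward building an insight map.
counts = Counter(s.category for s in segments)
print(counts["observation"])  # 1
```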

These coded segments were reconstructed into insight maps which visualize and elucidate the scientists’ thinking. Although there is significant variation in how the scientists proceeded, higher-level patterns were extracted which will be discussed in the implications section.

Study 02: Insight-Based Longitudinal Study

In the third study we present, we found that atmosphere and climate scientists can take up to 7 years to bring a paper from conception to publication, the average time to publication being 3.35 years. It is important to note that not all of this time is spent on exploration and insight, but that process can extend over multiple years in some cases, with a minimum time on the order of months. If we are to validly study how scientists understand phenomena which they successfully take to publication, we must address the issue of context and time frame. The previous study placed researchers in a room with the test proctor, exploring the data continuously and describing their actions at the same time. This is quite different from the way those scientists would work in their labs over months and years. To explore this more complex and sprawling process, we developed a log booklet with ‘insight reports’ which the scientists filled out by hand over a period of two weeks and returned. The advantage of this study over the previous controlled-environment study is that it allows the subjects to communicate to us from inside, or just after, the moment of insight. The corresponding disadvantage is that it is significantly lower resolution than the controlled-environment study, because the subjects are limited to the medium of writing and write only small amounts. This study aims to disturb the scientists’ normal method of analysis as little as possible, aside from the time taken to fill out the probe and return it.

Booklet
Example Insight Report for 11/26/2018

Study 03: Insight Process Questionnaire

The previous two studies discussed here focused on the process of exploratory research through visualization. However, the final goal of exploration is not simply to gain insight, but to connect a series of insights into a coherent story which constitutes new knowledge. The previous studies are necessarily not connected to published works, because the insights addressed were novel to the scientists at the time of testing. The questionnaire collected written accounts from scientists of how they gained and formulated insights into a publication-quality narrative for their self-selected ‘most impactful’ paper. Seven questions were sent digitally; scientists took two weeks to craft written responses and returned them for analysis.

For this study I employed a self-authored coding system, because the study is meant to address some of the differences between the scientific method and the design process in order to look for effective places of intervention. The coding is therefore more of a high-level categorization or classification.

Object is a single thing which a scientist engages with in some manner, most frequently by manipulating, refining, observing, manufacturing, or fitting. Common objects in this study were: data, numerical models, scientific laws, simulations, and maps.

Action is what the scientist does to the object in question. Most frequently this would be observing, correlating, refining, generating, weighting, or aggregating. Actions are noted as uni-directional or bi-directional.

Indicators are world-states or principles which the scientist looks for to know when a process is complete. For example, researchers frequently looked for repeatability, novelty, literature gaps, narrative completeness, or causality.

Conclusion is the purpose of the paper, the singular thing which the scientist views as their contribution to knowledge expressed in each paper.

Each researcher’s process was mapped out using these four elements. Elements were not prescribed beforehand, but were synthesized by aggregating similar types of objects and actions. The maps provide an exploratory look at how varied scientific research processes are, yet also show that they are composed of similar components.

Results

The insight map opposite comes from study 01, the controlled-environment study. It is a demonstration of the Liu and Heer principle that “visualization resonant with the pace of human thought … leads to greater data set coverage … and better questions.” {CITATION} The map is a visualization of each of the interface actions, visualizations, observations, questions, insights, and the final hypothesis which the researcher explores in this 51-minute session. Although it is difficult to characterize the complexity of the researcher’s visual thinking, certain critical high-level patterns begin to emerge. There is a cycle which starts with an interface adjustment, leading to a period of sequential observations about the new visualization. After a variable number of these stated observations, the researcher arrives at a new question that she wishes to examine. Because the interface allows her to explore the data quickly and fluidly, she is able to answer her question rapidly enough to keep a coherent exploration moving. After 1–3 of these cycles of adjustment, visualization, observation, and question, she is able to extract a higher-order hypothesis, indicated by an ellipse above the main action line on the map. After five of these higher-order cycles, she is able to assemble a paper topic built up from the five previous hypotheses. This study participant expressed that she was “entirely confident” that this hypothesis would be a publication-caliber research question. The next step would be to correlate this subset of MISR data with external data sets such as temperature, humidity, and cloud fraction.

For this diagram, the repeated observations are the potential moments of insight: seeing new things, or seeing old things in a new way. After a variable number of these insights, a question emerges which guides the next cycle of observation. Along the way, intermittently but generally toward the end of the cycle, a hypothesis may occur to the researcher.

The three following insight process maps were synthesized from study 02, the insight-based longitudinal booklet study. They represent shorter, cohesive explorations which the researchers performed at different times during the two-week booklet study. Where the previous study showed details of how insights were assembled into a single research hypothesis, these smaller maps show how a single moment of insight is gleaned from a sequence of observations and questions about several individual visualizations. Note that there is variation in the length and number of observations a researcher makes before making an interface adjustment. There is also variation in the number of sequential visualizations viewed before a moment of insight strikes. This captures the unpredictable, non-linear thinking pathway described in the literature. {CITATION} But the cyclical pattern of adjustment, visualization, observation, question, and insight is quite stable across all data returned from this study.

This set of figures examines the relationship between what visualizations the scientist views, the questions they ask, and the hypothesis which they eventually come to.

The following maps come from study 03, the insight process questionnaire. The five flow diagrams represent the objects, actions, indicators, and conclusions of five scientists for their self-identified “most impactful” publication. The indicators are further subdivided into initial drivers and completion indicators. At a high level, all scientists’ descriptions of their process followed a similar format: initial drivers launch a research process, which continues until the scientist sees strong enough completion indicators, at which point the deliverable is ready.

Indicators of both kinds display coherence, with pragmatic outliers. All initial indicators involve anomaly detection or knowledge gaps in previous work, except for one participant who reported that their research was ‘always meant to be part of my dissertation’. I hypothesize that a driver relating to anomaly detection or knowledge gaps earlier in this researcher’s career likely motivated the direction of inquiry that led them to begin their PhD. Completion indicators are less cohesive, but all researchers describe a clear novelty in their work, except for one pragmatist who put an end to their work due to funding. The novelty in a researcher’s work might be a method, a correlation, a causation, or a description, but the researcher always emphasizes the novelty and repeatability of their contribution.

The research process contains two subgroups: objects and actions. Objects of engagement include data sets, models, maps, and literature. Actions were less coherent, including: matching, correlating, refining, comparing, aggregating, and generating.

Implications and Discussion

In this set of studies, we extend previous insight methodologies to the realm of climate and atmosphere science. The data suggest that visualizations of appropriate speed and flexibility enable researchers to explore in an iterative and connected flow, assembling scientifically useful hypotheses on timescales greatly reduced from previous methods. This suggests a substantial opportunity for improved visualization systems to increase the pace and quality of scientific research in atmosphere and climate science. Our research, although limited in scope, also supports the notion that the principle of supporting insight and discovery through reactive visualization is generalizable, with appropriate adaptation, across the branches of science. With further research in this field and others, universal principles of insight generation and discovery in science as a whole may one day be described.

Our study reveals and describes a stable cyclical process in which researchers adjust a visualization state and make sequential observations, which build to investigative questions that inspire a new adjustment of the visualization state. As this process of insight sparking proceeds, higher-order hypotheses are assembled in the researcher’s mind. We suggest that further research be conducted to move beyond our pilot study and describe this cycle fully.

The insight process questionnaire data suggest that although individual scientists’ research processes are highly variable, there are higher-order similarities to be observed. The heavy reliance on numerical data and models is at once the strength of science and also a limitation under particular circumstances. Especially in the exploratory phase, which we focus on here, intuitive and careful visual representations may help scientists define and refine their pathways of inquiry with increased speed and success.

There is a potential homology between the scientific process as described by researchers in the insight questionnaire and the design process as described by Donald Schoen. It may be a canard, but it is interesting to note that many scientists described an iterative process of correlation, matching, and harmonization as being central to their research process. In Schoen’s description, the designer converses with an externalized form, harmonizing each additional line of a sketch to the essential quality of the emerging design. It is entirely possible that scientists also progressively harmonize the fit of their models to the structure of observed data in a way that is similar on a high level to the process of design. This may indicate a further potential area of collaboration and investigation.

Opportunity: Public Communication of Complex Ideas

In previous micro case studies, I suggest ways that a designer can function as a useful component of a research team, recursively feeding visual forms into the process which potentially spark novel or unexpected moments of insight. In this case study, I would like to explore the possibility that effective visualizations can also be of use in the context of education, media, and public understanding of science. In the context of research, visualizations of appropriate clarity and flexibility allow scientists to offload some of the cognitive work of understanding complex data sets, which helps them examine more complicated phenomena than pure abstract or numerical data analysis would allow. In other words, the clarity of the visualization allows researchers to access understanding which they might not otherwise have been able to. It is worth asking: does this clarity also have utility in furthering public or student understanding of science? Although this is not the main thrust of my research, it is nonetheless a potentially valuable second-order benefit.

North American wildfires in the year 2018
Seasonal crop burning on the continent of Africa 2018
South Asian tropical forest fires

These images, and their accompanying animations, convey their stories with dynamic and evocative color and a clear cognitive connection to fire. In the case of North American wildfires, compelling imagery can help to convey the magnitude and anomalous power of the phenomenon involved. Here, a dark basemap without country and state labels was chosen in order to clear the scene and foreground the fire data. The data is hexbinned into 25-kilometer cells, with radiative power and smoke plume height averaged for each hexbin. A dark-red-to-bright-yellow color scheme was chosen to maintain a close cognitive connection to fire, but it is reversed from the expected coloration: flames generally display a gradient from bright yellow at the base to dark red at the tip. This visualization inverts that scheme in order to provide maximum contrast between the hottest fires and the basemap, so the most devastating and powerful fires are visually highlighted. Motion adds an additional level of clarity to the digital version of these visualizations: hexpillar motion closely mimics the jumping and quavering of live flames, evoking the sense that an entire continent is burning.
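The binning and inverted-fire color choices described above can be sketched in a few lines. This is a minimal illustration, not the production pipeline: the coordinates and radiative power values are randomly generated, the hex grid size stands in for the 25-kilometer binning, and the four color stops are an approximation of the dark-red-to-bright-yellow ramp.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap

# Invented plume records: longitude, latitude, fire radiative power (MW).
rng = np.random.default_rng(0)
lon = rng.uniform(-125, -100, 500)
lat = rng.uniform(30, 50, 500)
frp = rng.gamma(2.0, 50.0, 500)

# Inverted fire ramp: dark red for weak fires, bright yellow for the
# most powerful, maximizing contrast against a dark basemap.
fire_cmap = LinearSegmentedColormap.from_list(
    "inverted_fire", ["#400000", "#b22200", "#ff8c00", "#ffff66"])

fig, ax = plt.subplots(facecolor="black")
ax.set_facecolor("black")
# Mean radiative power per hexagonal bin (gridsize approximates the
# 25 km cells described in the text).
hb = ax.hexbin(lon, lat, C=frp, gridsize=25,
               reduce_C_function=np.mean, cmap=fire_cmap)
fig.colorbar(hb, label="Mean fire radiative power (MW)")
fig.savefig("wildfire_hexbin.png", dpi=150)
```

The same aggregation would apply to plume height by swapping the `C` array; the animated hexpillar motion is a separate rendering step not shown here.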

Role of Design

In this case, careful research-quality visualization design choices help translate the complex data of research into a readily comprehensible form for education or public communication. The massive, slow-moving, and abstract nature of planet-level phenomena makes them difficult for many people to grasp. Most scientific visualization, although effective for research purposes, is not affective for the broader public. Visualizations which convey phenomena with more dynamism and visual sophistication might prove useful in furthering the understanding of global climate phenomena.

--
