Attack of the 50 Foot Zoom Presenter: Visual access and the new etiquette of video-mediated events

Professor Elizabeth Stokoe
10 min read · Feb 24, 2022

by Jessica P. B. Hansen and Elizabeth Stokoe

Allison Hayes in Attack of the 50 Foot Woman (1958) in a scene in which, wearing a bustier in the dark, she appears taller than an electricity pylon.
Allison Hayes in Attack of the 50 Foot Woman (1958)

“You’re on mute!” has become one of the most regularly uttered phrases in video-mediated communication, as much of our everyday and working life moved online during the coronavirus pandemic. And remember the point at which Zoom, Teams, and other platforms started informing us, via captions and audio description, that “recording” was “in progress”? What if the next generation of platforms told us “The participants can see your presenter notes” or “Your image is ten times larger than life-size in this meeting”?

In a recent UK television interview on the BBC’s The Graham Norton Show, actor Cate Blanchett described attending a movie premiere by Zoom and the moment she realized that her face was giant on a cinema screen behind a row of physically present cast and crew members:

BBC’s The Graham Norton Show: Cate Blanchett describes the moment she realizes that she will be on a giant cinema screen behind a row of physically present cast and crew members while attending a film premiere on Zoom.
Cate Blanchett interviewed on BBC The Graham Norton Show (14.1.22)

Along with discussions about participants turning their camera on or off, or the strange distraction of virtual backgrounds and disappearing limbs, one of the most common complaints about remote encounters is the apparent lack of ‘body language’ they afford, compared to being co-present and in person. In this article, however, we flip the issue around to consider how video-meetings sometimes give us far too much access to people’s faces and bodies.

The issue of ‘who can see or access what’ runs in multiple directions. For example, while online participants may be unaware of how they are represented on someone else’s screen, in-person participants may not know what aspects of their physical environment can be accessed by those attending remotely.

Much is already written about the pivot to remote interaction and what counts as best practice for hybrid meetings. Many articles describe the best lighting, the right technology, how to organize an agenda, and how to run inclusive sessions or enable audience participation. However, far less attention has been paid to how we are represented on each other’s computer or projector screens, and to how much control we have over these representations.

For instance, how do we know whether someone is using ‘speaker’ or ‘gallery’ view? Is everyone participating using the same view? Is our image the same size as, or much larger than, other participants’ images? Are we giant or tiny? Do multiple screens mean that we think we’re looking at the camera while, to recipients, we appear to be looking at nobody? Can people see that you are on Facebook because your screen is reflected in your glasses? Can people see your eyes glaze over because you are not at the back of a room but on a screen and the size of the 50 Foot Woman or the Wizard of Oz?

Two images — a stock image of a hybrid meeting and a screenshot of the wizard from the Wizard of Oz film appearing as a giant head
A stock image of a hybrid meeting and the Wizard of Oz

A related issue is whether participants have ‘self-view’ on or off. While many have commented on the strangeness and ‘fatigue’ of seeing one’s own face while interacting with others, something that does not happen in person, there are important reasons for seeing your face when online. ‘Self-view’ is about far more than our faces: it is about knowing and seeing what we can convey within the boundaries of our digital windows. This includes artefacts and aspects of our physical environment that we may show to camera or what we choose to hide by using a background setting or screen. Or we can turn the camera off altogether to reduce access to ourselves in ways we cannot do when in person.

Given that our faces and bodies may be “subjected to a different scrutiny while online” — and given that we may be relatively unaware or not in control of how we are represented on someone else’s screen — in this article we consider what might help to manage these new interactional matters. Before recommending five etiquette tips, we take a quick look at the science behind video-mediated meetings and other encounters.

Here comes the science

We tend to compare all forms of interaction and evaluate in-person as the ‘gold standard’ and remote communication as impoverished. However, it is important to keep in mind that humans have always used whatever resources are available to interact with subtlety and speed. On the telephone, for instance, we use voice and intonation but not gestures or aspects of our physical environment. This is why we say things like, “I’m just walking upstairs” (if we sound ‘out of breath’) or “my partner is just gesturing something at me” (if we are distracted by them, mid-conversation). Similarly, in written communication, a complex lexicon of emoticons, emojis, gifs, and other non-lexical resources has evolved to stand in for the way intonation modifies and conveys what our words are doing when we speak.

The challenge of creating effective interactional environments for remote collaboration is not new. The issue of how “to ensure that participants have compatible views of each other’s domains” and “common frames of reference” has been researched for at least twenty years preceding the pandemic across what some have termed the ‘fractured ecologies’ that remote communication creates.

Imagine you are presenting at an online event. You choose a specific portion of your screen that you want the audience to see. But, while presenting, you may have access only to your notes or the presentation slides, and not be able to see the audience’s video displays, chat or emojis. Does the audience know what you have access to, in terms of their participation? Someone might ask a question in the chat while you are still pursuing responses from the audience, so their question goes unanswered and that person may feel ignored. Meanwhile, you do not continue with your presentation until someone unmutes and responds verbally.

In scenarios like these, neither the presenter nor the audience knows what the other can see, hear, or access, and it is easy to make the wrong assumption. This can lead to further assumptions, such as that the audience is not engaged or that the presenter is being rude.

The ‘fracturing’ of video-mediated environments gets even more interesting when different languages and interpreting are involved. For example, in video-mediated interpreting, an interpreter enables communication between people who do not share a language. Video technologies are increasingly used to provide interpreting services across a wide range of settings, such as in asylum court hearings and in medical consultations, and can be done in various ways depending on institutional practices and the distribution of participants.

In our research, we collected and analysed recordings of doctor-patient consultations that were enabled by an interpreter. The medical professionals and their patients were situated together in a hospital, while interpreters participated remotely. One of the striking things about these interactions was that although technology enabled a ‘face-to-face’ experience, the participants did not always use the technology in ways that ensured they each knew what the other parties could access and ‘see’:

Two line drawings: On the left, the interpreter has visual access to parts of the doctor and her self-view monitor. On the right, at the hospital ward the doctor is gesturing to invite the interpreter to begin interpreting. The gesture is not visible to the interpreter. At the hospital ward, they are not using a self-view monitor.
Line drawings of video-mediated interpreting

In one case, the doctor was partially visible to the interpreter while the patient and their next-of-kin, also present at the consultation, were not visually available to the interpreter. When the doctor asked the patient, “how you doing”, the Norwegian second person singular pronoun “du” (you) caused an obvious problem — who was the “you” referring to? The interpreter could see that the doctor was gazing at someone, but not who. This lack of equal visual access for all participants and awareness of who can see what can create friction in even the most mundane moments of interaction — the ‘how are yous’ at the start. But the medical professionals and interpreters rarely discussed the practical matter of visual access, even after problems occurred.

As with all technology, however, human beings are pretty good at navigating complexity and developing and adapting practices to do things in the environments they create — from workplace and institutional settings to our personal and family lives. For instance, a teacher in an online Norwegian Sign Language class developed practices for addressing and referring to individual pupils by pointing in directions related to where on their screen the students were positioned. At first, the teacher’s pointing gestures were combined with other resources to identify pupils, such as using their names. However, across the course of the class — and even though the students could not see how their images were distributed across the teacher’s screen — the direction of points became mutually understood.

Other research has shown how people position their bodies and gestures with nuance and skill across many settings, from telemedicine and video-mediated physiotherapy, to the staying-connected conversations that migrant workers have with their young children. Even very young children can use the affordances of technology to interact with others:

Two photographs of video-mediated interaction: On the left, Jessica’s niece is pointing to something outside Jessica’s frame of reference. On the right, Jessica is showing a blueberry to her niece, who responds by saying “blueberry”.
Photographs of video-mediated interaction

Here’s looking at you, kid!

In a recent hybrid workplace meeting, with most people physically co-present in a meeting room and a smaller number online, one of the physically co-present participants sent a private message to a remote one:

“FYI when your camera is on you are giant on the screen.”

The remote participant replied,

“Thanks for the heads up re being massive.”

In another meeting, a remote participant sent a screenshot of the physical room, in which some in-person participants sat outside the field of vision of the camera, with the accompanying message:

“I wondered if someone could paint red lines on the tables in the places indicated on the image below so those viewing remotely can see everyone?”

Screenshot of a workplace meeting taken by a remote participant indicating the boundaries of their field of vision.
Screenshot of a hybrid workplace meeting taken by a remote participant

The need for communicating basic information about these kinds of environments and representations was suggested almost twenty years ago — a long time before video-mediated communication became mainstream. Paul Luff and colleagues listed several practical suggestions for enabling distributed participation, including a recommendation to “[p]rovide participants with the ability to determine the location, orientation, and frame of reference of others.” Twenty years of technological development later, the need is even greater to think about how we create effective online and hybrid communication spaces.

‘Go meta’ on communication

The following five tips are underpinned by one general principle for video-mediated participation — ‘go meta’. By ‘going meta’, we mean make communication about communication itself a new norm. The relevance of each tip will vary according to whether the remote interaction is a workplace meeting or something social; whether it is multiparty or one-to-one, and whether all participants are online or whether the event is hybrid.

1. Representation

  • Tell people how they are represented on your screen — do you have them pinned? Are you using ‘gallery’ or ‘speaker’ view? Are people giant or tiny, relative to other participants? Are people using a virtual background that obscures parts of their body or objects they are holding in ways they are probably unaware of?
  • Do this reciprocally so that everyone has basic information for interaction and participation.

2. Gaze

  • One of the things that we cannot do easily in multi-party online interaction is use our gaze to select speakers — or “look at” one person in a way they will know you are looking at them. So — tell people: “By the way, I’m looking at you, Jessica 😉”
  • Tell people where your camera is — are you trying to replicate a ‘face-to-face’ moment (e.g., by looking very deliberately straight into the camera?) or is your camera on a different screen to where you see others’ faces?

3. Resources

  • Let co-participants know about the use of and access to resources needed to interact. Digital resources include telling people that your camera is broken (to account for not being face-to-face — whether true or not!) or that someone is showing their email inbox or presenter notes while sharing a screen. Material resources include checking that anything you mobilize for interaction (e.g., showing your coffee mug or a book to camera) is visible enough for the purpose of showing it (is a fleeting glimpse enough, or do your recipients need to see it and read properly?)

4. Inclusion and access

  • Check everyone’s communication preferences, especially when you do not already know the people you’re interacting with.
  • Much is written about how to ensure online meetings are inclusive and accessible.

5. Organization

  • Decide who should manage going meta (e.g., in a meeting, the Chair or someone else?)

There is more to read about planning for various online activities. Make time for meta matters, especially for more complex multimodal interactions which may require extra technical preparations.

Raised hand emoji

Video-mediated communication technologies have been with us for many years, and well before the pandemic. Such technologies are here to stay, along with their new lexicon — “you’re muted” or “my camera isn’t working”. We are constantly evolving ways of communicating to replicate and augment different actions in different modalities. For example, just like meetings in person, we raise our hand to participate online. But we sometimes forget to turn off the ‘raise hand’ symbol once we have spoken — unlike when in-person — mostly…! And so, we say in English, “Is that an old hand?” and in Norwegian, “Er det en gammel hånd?”. Even local varieties have emerged (e.g., at Liz’s workplace, “is that a legacy hand?”).

The next generation of catchphrases for interaction and participation are incoming — and might include, “you’re giant!” Good communication is good for everything, so now is the time to communicate more about online communication.

Jessica P. B. Hansen is Associate Professor at Østfold University College, Norway; Elizabeth Stokoe is Professor of Social Interaction at Loughborough University, UK.

Allison Hayes in Attack of the 50 Foot Woman (1958) in a scene in which she appears taller than a house
Allison Hayes in Attack of the 50 Foot Woman (1958)

Professor Elizabeth Stokoe

The London School of Economics and Political Science, specializes in conversation analysis, communication training, & science communication.