Problems and Solutions in Researching Computer Game Assisted Dialogues for Persons with Aphasia

In this paper, we describe technological advances for supporting persons with aphasia in philosophical dialogues about personally relevant and contestable questions. A computer game-based application for iPads is developed and researched through Living Lab inspired workshops in order to promote the target group’s communicative participation during group argumentation. We outline some central parts of the background theory of the application and some of its main features, which are related to needs of the target group. Methodological issues connected to the design and use of Living Labs with persons with aphasia are discussed. We describe a few problems with researching development of communicative participation during group argumentation using an app assisted intervention for the target group and suggest some possible solutions.

BACKGROUND
Acquired brain injuries (ABIs) happen after birth, caused by external trauma (e.g., accidents), or by internal causes (e.g., strokes). A common consequence is aphasia, a communication disorder sometimes defined as "an acquired language impairment following brain damage that affects some or all language modalities: expression and understanding of speech, reading, and writing." (Brady et al., 2016, p. 1). Many ABI survivors experience drastic decreases of social and communicative exchange following the onset of aphasia. It is likely that opportunities for engaging in communication about more personally meaningful and advanced topics suffer the most. This paper presents a methodology for designing and researching a digital pedagogical application aimed at supporting persons with aphasia to participate in, and learn through, deep and complex dialogues about big issues, called philosophical dialogues.
We have used a methodology for philosophical dialogues, based on Philosophy for/with Children (P4wC), with persons with ABI. The P4C program was originally designed for use in schools (Lipman et al., 1980), but has been adapted for other contexts (UNESCO, 2007). The participants, together with a facilitator, build a "community of inquiry" (Lipman, 2003, p. 20), where active sharing, exploration, and examination of ideas occur. This collaborative element encourages participants to ask questions, formulate novel hypotheses, provide reasons, argue for and against different views, and bring forward examples and counterexamples to different ideas, thereby collectively taking the inquiry forward (Gardelli, 2016).
Although prior studies on philosophical dialogues involving persons with ABI are very few, strong results have been reported (Backman et al., 2020). For example, large qualitative and quantitative gains in group argumentation among persons with ABI and aphasia were detected during prior small-scale interventions (Backman et al., 2020). These results align with research showing positive effects of socially oriented and community-based conversation groups for persons with aphasia (Lee & Azios, 2020), and with a plethora of results from P4wC interventions with other target groups, where positive effects have been found on emotional and social (including communication) abilities (for an overview, see Gardelli, 2016), such as development of listening skills (Gorard et al., 2015; Trickey & Topping, 2004), interactive behaviour (Topping & Trickey, 2007a, 2007b), and confidence and willingness to speak (Gorard et al., 2015; Trickey & Topping, 2004).
While participating in philosophical dialogues would reasonably be of utility for many persons with aphasia, there is a need for specific support for persons with aphasia since participation demands several of the things that they often find particularly challenging.

DIALOGICA
One attempt at developing a tool to provide this support is an app prototype called Dialogica, a networked multiuser application, designed for large-screen mobile devices, that is "intended to provide opportunities for [persons with aphasia] to participate actively in conversations about contestable questions and assist [them] in expressing themselves […] through personal avatars, animations and chats" (Backman et al., 2021, pp. 195-196). It features several tools and functions specifically designed to help persons with communication disorders participate actively, and with high quality, in philosophical dialogues. These include text-to-speech technology and informative symbols to accompany text, but also advanced tools built upon theory and methodology from P4wC and argumentation analysis, such as a palette of dialogic moves and a "conversation tree" (Backman et al., 2021). Dialogica is intended to aid interactional symmetry, acknowledgement of participants' contributions in the dialogues, and multi-modal communication through both verbal and visual expressions (Backman et al., 2021). Such characteristics have been found to positively influence active participation of persons with aphasia in prior research (Lee & Azios, 2020).

LIVING LABS AND THEIR DESIGN IN DIALOGICAL RESEARCH WITH PERSONS WITH APHASIA
The development of and research about Dialogica depend on close collaboration with end users, using a methodology inspired by so-called "Living Lab" workshops (Bergvall-Kåreborn et al., 2009), where actual real-world users are involved in the research and design processes. These collaborative workshops generate feedback used for further prototype development, hence reducing the risk of digital innovations becoming inapplicable or not sustainably successful, a fate that most technological innovations meet (Feurstein et al., 2008).
Initially, the Living Lab methodology was developed to provide "… an 'ecologically valid' experimental platform for experimenting with emerging and future technologies" (Markopoulos & Rauterberg, 2000, p. 54), but was restricted to on-campus, home-like lab environments where people could live and use new technologies as in ordinary life (Markopoulos & Rauterberg, 2000). Later, the concept was extended to "… create a shared arena in which digital services, processes, and new ways of working can be developed and tested with user representatives and researchers" (Bergvall-Kåreborn et al., 2009, p. 1). Living Lab activities were gradually understood as more broadly situated in a real-world context, where ICT innovations are "cocreated, tested, and evaluated in open, collaborative, multi-contextual real-world settings" (p. 2) where "people's ideas, experiences, and knowledge, as well as their daily needs of support from products, services, or applications, should be the starting point in innovation" (p. 1).
The close collaboration with end users in our project is dependent on iterated workshops with prototypes and facilitated dialogues conducted together with the actual users of the application. Due to their communicative needs and abilities, these workshops need to be carefully designed and conducted to provide opportunities for the participants to be heard and communicate preferences and needs. We include care professionals working closely with our participants and ensure that the dialogues are led by facilitators with vast experience in facilitating philosophical dialogues in general and with persons with ABI.
We largely followed a standard P4wC nine-step routine (as detailed by Backman et al., 2020, p. 5). This routine provided a pedagogical setting that is suitable for testing the prototypes, which supports ecological validity since it resembles the contexts in which the application is designed and intended to be used. The setting also holds promise to be enjoyable and beneficial to the users (cf. Backman et al., 2020), and can therefore spark engagement with the application, leading to valuable feedback from the users. After the dialogues, we asked questions about usability, overall impression, and specific features.
The workshops were filmed. Between 2 and 8 end users participated, and a multidisciplinary team of 4-5 researchers with experience in computer science, special needs education research, and philosophical dialogues monitored the workshops in order to:
1. Make observations in two technical areas: the user interface usage and errors in the application. These data were used directly after the workshops for interface development and minor bug fixes.
2. Make observations in three pedagogical areas: proportion of usage between different functionalities, how different functionalities were used, and how usage of different functionalities affects quality and quantity of communicative participation (e.g., to what extent the palette increased key contributions characteristic of a high-quality philosophical dialogue or the conversation tree supported memory and meta-cognitive awareness). These data were processed by the research team within two weeks from each workshop to establish directions for further development of key functionalities.
3. Ask questions about the participants' experiences of different functionalities and their preferences regarding the app's functionalities, both those already incorporated and those that the participants felt were missing from the app. Some of these data were triangulated against the observational data described in 2, to understand how well the key functionalities worked and why. The data about the participants' preferences regarding possible new functionalities were used to establish directions for development of possible new features.
4. Answer questions arising from the participants regarding the digital solution, its functionalities, and possible future development directions.
The advantage of using the extended Living Lab method is that we get feedback from the users continuously during the development process. One alternative would be to use the more classical waterfall method, where user requirements are gathered initially, followed by several design and development steps without the users' involvement, until the very end, when a prototype or almost finished product is shown to the users for feedback. That process might lead to a product design that does not correspond to the users' requirements. The extended Living Lab method dictates not only that the end users should be involved in the design process but also that they get to try early prototypes and give feedback. Because the users are involved early and continuously, the user requirements themselves might also change during the development process. The extended Living Lab method, combined with the idea of a "release early, release often" development paradigm, leads to a development process where users are more involved and, hopefully, to a final product that is much closer to the end users' preferences and needs.

RESEARCH ETHICAL CONSIDERATIONS
Conducting research involving people with disabilities requires careful research ethical consideration. Persons with aphasia should have the opportunity to be acknowledged and listened to. While this is important for everyone, not everyone has equal opportunities.
Recurring difficulties with expressing themselves oftentimes result in exclusion from deeper conversations for persons with aphasia. It is thus important to elucidate factors that may have positive effects on their ability to participate more equally in dialogues, as well as to eliminate prejudice and raise people's awareness of how they can utilise their own resources more fully. People also have a legitimate claim to privacy. In this project, we have applied the principle of informed consent for all research participants. They have been informed that they are at all times free to terminate participation without giving any reason and that they are guaranteed confidentiality. We have also received permission from the staff who are involved during the Living Lab workshops and the philosophical dialogues.

RESEARCHING DEVELOPMENT WITH PERSONS WITH APHASIA
In later stages of the development and research process, relevant research endeavours would include studying the effects of digitally supported interventions on communicative participation during group argumentation for persons with aphasia. We will now sketch two main problems occurring in such later research phases and suggest some solutions.
A first main issue is that there is a lack of research tools developed and validated for this purpose for persons with aphasia specifically. However, there are tools available for studying group argumentation development in philosophical dialogues for other target groups, for instance the Argumentation Rating Tool (ART; Reznitskaya & Wilkinson, 2017), a validated observational scale "… designed to help practitioners and researchers to assess the quality of teacher facilitation and student argumentation during group discussions of texts in elementary language arts classrooms" (Reznitskaya et al., 2016, p. 2). P4wC was one of the "established pedagogical models that use classroom dialogue for promoting argumentation" on which ART was based (Reznitskaya et al., 2016, p. 11). The creators of the ART state that their "understanding of facilitation and argumentation was most informed by the scholarship on Philosophy for Children" (Reznitskaya & Wilkinson, 2021, p. 3). The ART was also developed based on a review of empirical studies examining indicators of productive talk, existing tools for making observational analyses of classroom interaction, and a professional development program involving school staff (Reznitskaya et al., 2016).
The tool uses four key standards: Shared, Clear, Acceptable, and Logical. For each of these there are a few dialogical practices (eleven in total), such as "Sharing responsibilities" and "Clarifying meaning". The group of participants is rated through filmed dialogues on each such practice using a 1-6 scale, where the top scores show high argumentative quality and communicative participation, including high degrees of influence in the dialogue.
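As an illustration of how such ratings could be recorded and summarised, the following is a minimal sketch in Python. It is not part of the ART itself: the class and function names are hypothetical, the pairing of the two named practices with particular standards is assumed for the example, and averaging scores per standard is merely one possible way of summarising the ratings.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean

# The four key standards named in the text.
STANDARDS = ("Shared", "Clear", "Acceptable", "Logical")
# The rating scale described in the text (1-6).
SCALE_MIN, SCALE_MAX = 1, 6


@dataclass(frozen=True)
class PracticeRating:
    """One rater's score for one dialogical practice in one filmed dialogue."""
    standard: str   # one of STANDARDS
    practice: str   # e.g. "Clarifying meaning"
    score: int      # 1 (lowest) to 6 (highest)

    def __post_init__(self):
        if self.standard not in STANDARDS:
            raise ValueError(f"unknown standard: {self.standard!r}")
        if not SCALE_MIN <= self.score <= SCALE_MAX:
            raise ValueError(f"score must be in {SCALE_MIN}-{SCALE_MAX}")


def mean_by_standard(ratings):
    """Average the practice scores within each standard (an assumed summary)."""
    grouped = defaultdict(list)
    for rating in ratings:
        grouped[rating.standard].append(rating.score)
    return {standard: mean(scores) for standard, scores in grouped.items()}


# Illustrative scores only; the standard/practice pairing is an assumption.
example = [
    PracticeRating("Shared", "Sharing responsibilities", 4),
    PracticeRating("Clear", "Clarifying meaning", 5),
]
summary = mean_by_standard(example)
```

Validating the scale at construction time means that a mistyped score is caught when the rating is entered rather than when the data are analysed.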
In previous research with persons with ABIs, ART has been used to study verbal group argumentation development in P4wC-based interventions through structured observations of filmed dialogues (Backman et al., 2020). High levels of pre-to-post differences were detected and large effect sizes were obtained, while the levels of inter-rater reliability were lower for this target group than for the originally intended one, indicating that further adaptations might be needed to provide more stable results. Nonetheless, we hold that the ART functions as a sufficiently good current alternative for studying communicative participation during group argumentation for persons with aphasia. It could be used to study intervention effects through pre- and post-measurement by "blind" raters or to compare dialogues with and without Dialogica, in order to measure the extent to which the application facilitates communicative participation during group argumentation for the participants.
A second main problem is that there are possible threats to internal validity when using final app versions in experimental studies to study development of communicative participation during group argumentation. Since internal validity concerns the correctness of inferring a causal relationship between two or more variables, e.g., x and y, we must ask ourselves whether we can be certain that it is in fact changes in x that cause changes in y, and not any other factor that gives rise to an illusory causal relationship (Bryman, 2016). The strict RCT requirements of manipulation of the independent variable, randomised sampling between control and experimental groups, pre- and post-measurement, and comparison of development between control and experimental groups are often considered to "create a strong confidence in … the credibility in causal conclusions" (Bryman, 2016, p. 77). In this particular case, it could be relevant to determine if a dialogic intervention using Dialogica causes improvement in communicative participation during group argumentation. To determine this, one would, among other things, need to determine whether detected improvements from pre to post in the digitally supported intervention are due to development of communicative participation skills and not merely to acquaintance with the digital solution. It could be that introducing a digital solution in a dialogic intervention entails a digital threshold and initially decreases ART scores because of lacking technological skills and unfamiliarity with the particular solution. After some time, the ART scores would then increase because of mere acquaintance with the technological support, and this increase could falsely be interpreted as development in communicative participation skills beyond technological skill development.
One way to exclude this alternative explanation would be to use another data source, such as interviews about the observed difference from early to late in the intervention. The limited verbal ability of the target group could constitute a problem here. However, with carefully designed interview protocols about the potential causes of the detected improvements, and with contextual knowledge about the research participants and their communication habits and preferences, semi-structured interviews could be a way to gather relevant information. If blind data processing of early and late dialogues with high inter-rater reliability through the ART showed large positive gains, and the participant interviews supported that this finding depended on development of communicative participation skills largely independent of technological skill development, it would be easier to draw more certain conclusions.
A second way to exclude this alternative explanation could be to analyse through the ART not only early and late filmed dialogues with app support in the researched dialogic settings but also early and late filmed traditional dialogues in the same setting, to note potential differences between communicative participation in group argumentation in technological and traditional dialogues. If the noted communicative participation development were more or less solely dependent on the participants becoming better at using the app, then they would probably score worse during the early digitally supported dialogues than during the early traditional ones. If there is no difference between the scores from the early traditional dialogues and the late digitally supported dialogues, this would indicate that the detected improvements between early and late digitally supported dialogues are only due to developed technological skill.
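The comparison logic described here can be sketched as a small Python function. This is an illustrative sketch under stated assumptions, not an analysis procedure used in the project: the names are hypothetical, each condition is reduced to a single mean ART score, and the `tolerance` below which two means are treated as "no difference" is an arbitrary placeholder that a real study would replace with an appropriate statistical test. Late traditional dialogues, which would also be filmed, are left out because the reasoning in the text relies on the other three conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeanArtScores:
    """Mean ART score (1-6 scale) per condition; all values are hypothetical."""
    early_traditional: float
    early_digital: float
    late_digital: float


def interpret(scores, tolerance=0.25):
    """Classify the pattern of means following the reasoning in the text.

    `tolerance` is an assumed cut-off for treating two means as equal.
    """
    # Initial drop when the app is introduced (the "digital threshold").
    threshold = scores.early_traditional - scores.early_digital
    # Improvement from early to late digitally supported dialogues.
    digital_gain = scores.late_digital - scores.early_digital
    # What remains once late digital is compared with the early traditional baseline.
    residual = scores.late_digital - scores.early_traditional

    if digital_gain <= tolerance:
        return "no_clear_gain"
    if threshold > tolerance and abs(residual) <= tolerance:
        # Early digital scored worse, and late digital only recovers the baseline.
        return "consistent_with_app_familiarity_only"
    if residual > tolerance:
        # Late digital exceeds the early traditional baseline.
        return "gain_beyond_early_traditional_baseline"
    return "inconclusive"


# Illustrative numbers only, not data from any study.
recovered = interpret(MeanArtScores(3.0, 2.0, 3.0))
exceeded = interpret(MeanArtScores(3.0, 2.0, 4.5))
```

A result of `gain_beyond_early_traditional_baseline` would not by itself establish development of communicative participation skills; it only indicates that acquaintance with the app is not a sufficient explanation under these assumptions.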