[ Article ]

Archives of Design Research - Vol. 39, No. 2, pp.145-158

ISSN: 1226-8046 (Print) 2288-2987 (Online)

Print publication date 31 May 2026

Received 29 Dec 2025 Revised 04 Apr 2026 Accepted 16 Apr 2026

DOI: https://doi.org/10.15187/adr.2026.05.39.2.145

Abstract Movement Design for Non-Humanoid Robots: Opening, Closing, and Turn-Taking Gestures

Honguk Lee, Eui-Chul Jung

Department of Design, Doctoral candidate, Seoul National University, Seoul, Korea
Department of Design, Professor, Seoul National University, Seoul, Korea

Correspondence to: Eui-Chul Jung jech@snu.ac.kr

Abstract

Background Non-humanoid robots in service contexts face fundamental limitations in expressing social intentions due to physical constraints, including limited anthropomorphic features and low degrees of freedom. Although previous research has explored how minimal abstract movements can support social communication, a significant gap remains in understanding how such movements can function as conversational grounding, particularly in signaling interactional phases such as opening, closing, and turn-taking.

Methods Using an abstract robotic prototype equipped with two degrees of freedom that performed simple expressive movements, we conducted a user study with 30 participants. One participant was excluded from the final analysis due to a technical issue with the prototype, resulting in data from 29 participants. Three experimental conditions were evaluated: pointing-only (A), pointing + opening and closing (B), and pointing + turn-taking (C). We measured participants’ perceptions of engagement, likeability, intelligence, and fluency, alongside task completion time and conversational overlaps.

Results All abstract movements were perceived as appropriate for the interaction context. Condition B significantly increased engagement and likeability compared to Condition A. Condition C significantly improved perceived intelligence and fluency compared to the other conditions. Additionally, Condition C reduced task completion time compared to Condition A and reduced conversational overlaps compared to both Conditions A and B.

Conclusions These findings demonstrate that minimal abstract movements can effectively support conversational grounding in non-humanoid robots. This approach can help extend social capabilities to a broader range of intelligent products with similar physical constraints.

Keywords:

Non-Humanoid Robot, Abstract Movement, Conversational Grounding, Turn-taking, Human-Robot Interaction

1. Introduction

As robots are increasingly deployed in service domains, their social ability to engage in natural conversation has become essential. Recent advances in large language models have significantly enhanced robots’ verbal communication capabilities, enabling more coherent and contextually appropriate dialogue (Kim et al., 2024; Mahadevan et al., 2024). These improvements in verbal fluency have consequently raised user expectations for complementary nonverbal expressions in robotic interfaces. However, non-humanoid robots face inherent limitations in nonverbal expression due to physical constraints, including a lack of anthropomorphic features and low degrees of freedom (DoF) (Rosenthal-von der Pütten et al., 2018). Importantly, these constraints extend beyond robots to future intelligent products, such as smart appliances and assistant devices.

Current research in robotic nonverbal expression has predominantly focused on mimicking human gestures, such as head gestures (Lee & Hahn, 2025), eye gazing (Kompatsiari et al., 2021), body posture (Tuyen et al., 2020), and facial expression (Hu et al., 2024). While these studies have demonstrated the positive impact of anthropomorphic gestures on social interaction, directly translating such expressions to non-humanoid platforms presents significant implementation challenges.

Previous studies have explored how minimal, robot-specific movements can convey social meaning in non-humanoid platforms. Anderson-Bashan et al. (2018) showed that even an abstract robotic object can express positive or negative opening encounters through minimal ball-rolling movement. Another study on low-anthropomorphic robots demonstrated that subtle gaze shifts can significantly influence social engagement during guidance tasks (Zaga et al., 2017). Despite this work, a significant gap remains in understanding how such non-humanoid movements can function as conversational grounding cues, which are fundamental mechanisms for establishing and maintaining mutual understanding in interaction.

This study addresses this gap by investigating how minimal abstract movements can effectively function as conversational grounding cues, including opening, closing, and turn-taking cues. By focusing on low-anthropomorphic expressiveness, this study aims to explore non-humanoid gestures that serve as appropriate social grounding cues in conversational contexts. This approach broadens the scope of nonverbal design for constrained robotic embodiments by focusing on simple motion-based expressiveness rather than relying exclusively on humanoid-centric approaches.

This study contributes not only by demonstrating that minimal abstract movement can support conversational grounding in HRI, but also by clarifying how different movement types can play different roles in interaction. In this regard, the contribution extends beyond reporting statistically significant condition effects to offering implications for designing movement-based interaction in constrained robotic embodiments.

2. Background

2. 1. Social Embodiment and Expressive Movement

Robots’ physical embodiment plays a central role in shaping how people interpret and engage with them (Clavel et al., 2016). Through movement, robots can communicate, engage users, and offer dynamic possibilities that extend beyond their mechanical appearance (Hoffman & Ju, 2014). However, overly rich human-like embodiment in robots can hinder effective interaction (Waytz et al., 2010; Mori et al., 2012). More importantly, robot behaviors must be predictable and understandable to human users (Onyeulo & Gandhi, 2020).

Abstract and robot-specific movements can provide an alternative design space for expressiveness without relying on human-likeness (Hoffman & Ju, 2014). Due to metaphorical reasoning, weak human-likeness can have as strong an effect on user perception of robots as strong human-likeness (Epley et al., 2007). Previous studies have shown that simple robotic movements in non-humanoid robots, such as moving the focal area for greeting expression (Anderson-Bashan et al., 2018) and tilting the entire body for social gestures (Zaga et al., 2017), can function as interpretable and recognizable social cues. Additional studies have shown that robot-specific movements can provide social expressions influencing emotional support (Erel et al., 2022) and quality of interaction, including engagement and perceived intelligence (Hu et al., 2025).

2. 2. Conversational Grounding Cues

Conversational grounding relies on shared signals, such as opening, closing, and turn-taking cues, that help coordinate dialogue. In human communication, head movement has semantic meanings as grounding cues (McClave, 2000). Moving the head can help speakers capture the listener’s attention (Duncan & Niederehe, 1974), and head turning movement associated with gaze occurs when speaking or listening in response to conversational partners (Kendon, 1967).

Humanoid robots have been investigated for conveying turn-taking cues through eye gaze, head movements, arm gestures, and body posture signaling (Skantze et al., 2014; Thomaz & Chao, 2011; Liu et al., 2013; van Schendel & Cuijpers, 2015). However, these modalities are difficult to implement in non-humanoid robots with limited physical capabilities. Turn-taking serves as a fundamental speech-exchange system (Sacks et al., 1974) that can mitigate social errors in HRI (Tian & Oviatt, 2021). In human conversational behavior, listeners request a conversational turn by gazing at the speaker or shifting their posture to an upright or forward-leaning position (Wiemann & Knapp, 2017; Kendrick et al., 2023). Individuals typically seek to establish attention before the main topic of conversation begins (Goffman, 1955; Schegloff, 2007). To maintain speaking turns and discourage interruptions, speakers use hand gestures as a turn-holding cue (Duncan, 1972; Kendrick et al., 2023). Turn-yielding cues typically involve speakers terminating or relaxing hand gestures, adopting a reclined posture, or averting their gaze (Duncan, 1972; Wiemann & Knapp, 2017).

3. Design and Hypotheses

To explore how minimal non-humanoid movements can function as conversational grounding cues, we designed abstract gestures by extracting key communicative patterns from human head and gaze behavior and translating them into simple vertical or lateral motions feasible for low-DoF robots. The robot’s spherical component served as the primary expressive element, with movements across three vertical reference zones: action, standby, and sleeping zones (Figure 1). Opening was expressed through a vertical upward motion followed by a rapid return to the standby zone. Closing was expressed as a downward shift toward the sleeping zone. For turn-requesting, the spherical component moved vertically upward from the standby zone to the action zone. For turn-holding, it maintained its position in the action zone. For turn-yielding, it moved downward to the standby zone. Pointing was expressed as lateral rotation toward the target object.

Figure 1

Abstract movements designed in this study
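The vertical cues described above can be read as short paths through the three reference zones. The following is a minimal sketch of that reading, not the prototype's control code. The names `Zone`, `VERTICAL_CUES`, and `direction_steps` are hypothetical. Two details are assumptions, because the text does not state them: the opening movement is taken to peak in the action zone, and opening and closing are both taken to start from the standby zone. Pointing is a lateral rotation and is therefore left out of the vertical map.

```python
from enum import IntEnum


class Zone(IntEnum):
    """Vertical reference zones, ordered from lowest to highest."""
    SLEEPING = 0
    STANDBY = 1
    ACTION = 2


# Zone path for each vertical cue (hypothetical encoding).
# Assumption: opening peaks in the action zone; opening and closing
# both start from the standby zone.
VERTICAL_CUES = {
    "opening": (Zone.STANDBY, Zone.ACTION, Zone.STANDBY),
    "closing": (Zone.STANDBY, Zone.SLEEPING),
    "turn_requesting": (Zone.STANDBY, Zone.ACTION),
    "turn_holding": (Zone.ACTION, Zone.ACTION),
    "turn_yielding": (Zone.ACTION, Zone.STANDBY),
}


def direction_steps(cue):
    """Return 'up', 'down', or 'hold' for each consecutive zone pair of a cue."""
    path = VERTICAL_CUES[cue]
    steps = []
    for start, end in zip(path, path[1:]):
        if end > start:
            steps.append("up")
        elif end < start:
            steps.append("down")
        else:
            steps.append("hold")
    return steps
```

Encoding each cue as a zone path keeps the design readable as data: turn-requesting and turn-yielding are mirror images, and opening is the only cue that reverses direction within a single gesture.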

We evaluated the appropriateness of these abstract movements through the experiment according to the following hypotheses:

H1: The abstract movements can appropriately represent expressive gestures: opening (H1a), closing (H1b), turn-requesting and turn-holding (H1c), turn-yielding (H1d), and pointing (H1e).

Pointing movements were included in all conditions as the control movement, as their effects on joint attention have been established in previous studies (Sauppé & Mutlu, 2014; Zaga et al., 2017). Three experimental movement conditions were designed: pointing-only (A), pointing + opening and closing (B), and pointing + turn-taking (C). We hypothesized that opening and closing movements would enhance social engagement and positive affect.

H2: Condition B will increase participants’ perceptions of engagement (H2a) and likeability (H2b) compared to Condition A.

Because turn-taking cues can reduce social errors in human-robot conversation (Tian & Oviatt, 2021) and enable the robot to respond more frequently to participants’ speech, we hypothesized that Condition C would significantly improve participants’ perceptions and reduce task time and social errors.

H3: Condition C will increase participants’ perceptions of engagement (H3a), likeability (H3b), intelligence (H3c), and fluency (H3d) compared to the other conditions.

H3-1: Condition C will reduce participants’ task completion time compared to the other conditions.

H3-2: Condition C will reduce conversational overlaps compared to the other conditions.

3. 1. Implementation

A robotic prototype was designed with a minimal geometric configuration consisting of cylindrical and spherical components. A microcontroller unit and two servo motors were integrated into the cylindrical body, implementing vertical movement and lateral rotation. The robot measured 350 mm in height, 100 mm in width, and 155 mm in depth, and the spherical component had a diameter of 50 mm. Figure 2 shows the mechanical configuration and components of the prototype. The robot was operated using the Wizard of Oz technique.

Figure 2

Prototype mechanism used in the experiment

4. Experiment

4. 1. Participants

Thirty participants (14 male, 16 female) with a mean age of 30.20 years (SD = 2.37, range = 26-34) were initially recruited for this experiment. However, due to a technical issue with the robot during one session, the data from one participant were excluded from the analysis. Consequently, the final sample included 29 participants (13 male, 16 female) with a mean age of 30.10 years (SD = 2.35, range = 26-34). This study was approved by the Institutional Review Board (IRB) of Seoul National University.

4. 2. Design and Procedure

Participants were randomly assigned to one of three movement conditions, with 10 participants per condition. Due to a technical issue with the prototype, one participant’s data were excluded, resulting in a final distribution of 9 participants in Condition A, 10 in Condition B, and 10 in Condition C. The distance between the participant and the robot was set at 0.5-0.7 meters. To simulate a service context, participants were asked to talk with the robot following a provided dialog script and were given a small task: picking up a specific tea bag from the table. Figure 3 illustrates the experimental setup.

Figure 3

Experimental setting

The interaction proceeded as follows: The robot says, “Hi, do you need any help?” The participant responds, “Which tea bag tastes good?” The robot responds, “I would recommend the tea flavor. How about chamomile tea?” (with a two-second pause). The participant says, “I like chamomile.” The robot performs the pointing movement and says, “Please pick up the yellow tea bag.” After the participant picks up the tea bag, the robot says, “That is chamomile tea. Enjoy your tea time.”

In Condition A, the robot performed pointing only. In Condition B, the robot performed opening, closing, and pointing movements. In Condition C, the robot performed turn-requesting, turn-holding, turn-yielding, and pointing movements.

To standardize the interaction, the verbal content was kept constant across conditions and participants. The robot was operated using a Wizard of Oz procedure, in which the experimenter used a remote controller to trigger a predefined set of movement and utterance outputs. Each condition was programmed in advance so that the corresponding movement and utterance were linked as a single response unit. Following each participant response, the experimenter triggered the robot’s next response according to the predefined sequence. This procedure maintained a consistent interaction structure across participants while allowing the robot’s responses to follow participant utterances at the appropriate moment. The robot’s utterances were standardized using the same TTS settings across all participants and conditions. To examine differences in conversational overlap across conditions, the two-second pause in the turn-taking condition was embedded in the corresponding audio file together with the associated movement.
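The idea of linking each movement and utterance as a single response unit, triggered in a fixed order, can be sketched as follows. This is an illustrative sketch only, not the software used in the experiment. `ResponseUnit`, `WizardSession`, and `build_sequence` are hypothetical names. The utterances come from the dialog script, and attaching pointing to the pick-up request follows the text. Every other pairing of movement and utterance is an assumption, since the text does not state which movement accompanied which line.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponseUnit:
    """One predefined robot response: movement labels plus the spoken line."""
    movements: tuple
    utterance: str


# Robot utterances from the dialog script, identical in every condition.
UTTERANCES = (
    "Hi, do you need any help?",
    "I would recommend the tea flavor. How about chamomile tea?",
    "Please pick up the yellow tea bag.",
    "That is chamomile tea. Enjoy your tea time.",
)

# Assumed pairing of movements with utterances. Only "pointing" on the
# third utterance is stated in the text; the rest is hypothetical.
TURN_CUES = ("turn_requesting", "turn_holding", "turn_yielding")
MOVEMENTS = {
    "A": ((), (), ("pointing",), ()),
    "B": (("opening",), (), ("pointing",), ("closing",)),
    "C": (TURN_CUES, TURN_CUES, ("pointing",) + TURN_CUES, TURN_CUES),
}


def build_sequence(condition):
    """Return the ordered response units for one condition."""
    return [ResponseUnit(m, u) for m, u in zip(MOVEMENTS[condition], UTTERANCES)]


class WizardSession:
    """Steps through the predefined sequence, one unit per experimenter trigger."""

    def __init__(self, condition):
        self._units = build_sequence(condition)
        self._position = 0

    @property
    def finished(self):
        return self._position >= len(self._units)

    def trigger(self):
        """Return the next response unit; called after each participant response."""
        if self.finished:
            raise RuntimeError("The predefined sequence is already complete.")
        unit = self._units[self._position]
        self._position += 1
        return unit
```

Because the sequence is fixed per condition, the experimenter controls only the timing of each trigger, never the content, which is what keeps the verbal material constant across participants.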

4. 3. Measures

4. 3. 1. Subjective Data

User Engagement: User engagement was measured using four items from previous HRI studies (Kim et al., 2013; Kompatsiari et al., 2021) on a 7-point Likert scale (7 = “strong” to 1 = “weak”):

- How much did you feel engaged with the robot?
- How familiar are you with the robot?
- I paid a great deal of attention while interacting with the robot.
- While listening to the robot’s utterances, I felt a strong interaction with the robot.

Likeability: Robot likeability was measured using five items from the Godspeed questionnaire (Bartneck et al., 2009) on a 7-point bipolar semantic differential scale:

- Degree of Like (or dislike), Friendly (or unfriendly), Kind (or unkind), Pleasant (or unpleasant), and Nice (or awful).

Perceived Intelligence: Participants’ perceptions of the robot’s intelligence were measured using five items from the Godspeed questionnaire (Bartneck et al., 2009) on a 7-point bipolar semantic differential scale:

- Degree of Competent (or incompetent), Knowledgeable (or ignorant), Responsible (or irresponsible), Intelligent (or unintelligent), Sensible (or foolish).

Fluency: Perceived fluency of the interaction was measured using three items from previous studies (Ajibo et al., 2020; Hoffman, 2019; Liu et al., 2013) on a 7-point Likert scale (7 = “strong” to 1 = “weak”):

- The robot contributed to the fluency of the interaction.
- Is the robot’s movement natural during conversation?
- What is the overall degree of naturalness?

Open-ended Interview: To gain deeper insights into participants’ perceptions (Veling & McGinn, 2021), qualitative data were gathered through open-ended questions:

- What did you think the robotic object was doing?
- What was your general impression of the robotic object?

Manipulation Check: Manipulation checks were administered at the end of the questionnaire to minimize any potential influence on the experimental procedure. Participants’ perception ratings of each movement (opening, closing, turn-requesting and holding, turn-yielding, and pointing) were measured using a 7-point Likert scale (7 = “strong” to 1 = “weak”).

4. 3. 2. Objective Data

Task Completion Time: To examine whether the robot’s movements affected participants’ task performance, we recorded and analyzed the time it took participants to complete the task following the robot’s request.

Conversational Overlaps: In HRI, conversational overlaps are considered social errors that can disrupt natural interaction flow (Tian & Oviatt, 2021). We recorded and counted participants’ overlaps during the two-second pauses in the robot’s utterances.

5. Results

5. 1. Manipulation Check

One-sample t-tests indicated that each movement was rated in the intended direction on a 7-point Likert scale. The results were as follows: opening (t(28) = 13.31, p < .001, M = 6.03, SD = 0.82), closing (t(28) = 24.01, p < .001, M = 6.55, SD = 0.57), turn-requesting and holding (t(28) = 15.56, p < .001, M = 5.97, SD = 0.68), turn-yielding (t(28) = 15.98, p < .001, M = 5.93, SD = 0.65), and pointing (t(28) = 24.52, p < .001, M = 6.59, SD = 0.57).
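The one-sample statistics above can be cross-checked from the reported means and standard deviations. The sketch below assumes that the test value was the scale midpoint of 4. The text does not state the test value, so this is an inference from the reported numbers rather than a documented detail. The names `REPORTED`, `MIDPOINT`, and `one_sample_t` are hypothetical, and because the inputs are rounded to two decimals the recomputed values can only approximate the reported ones.

```python
import math

N = 29          # final sample size
MIDPOINT = 4.0  # assumed test value (midpoint of the 7-point scale)

# movement: (reported mean, reported SD, reported t)
REPORTED = {
    "opening": (6.03, 0.82, 13.31),
    "closing": (6.55, 0.57, 24.01),
    "turn_requesting_and_holding": (5.97, 0.68, 15.56),
    "turn_yielding": (5.93, 0.65, 15.98),
    "pointing": (6.59, 0.57, 24.52),
}


def one_sample_t(mean, sd, n, test_value):
    """t statistic for a one-sample t-test computed from summary statistics."""
    return (mean - test_value) / (sd / math.sqrt(n))


recomputed = {
    name: one_sample_t(mean, sd, N, MIDPOINT)
    for name, (mean, sd, _) in REPORTED.items()
}

for name, t_value in recomputed.items():
    print(f"{name}: recomputed t({N - 1}) = {t_value:.2f}, reported {REPORTED[name][2]}")
```

Under the midpoint assumption, each recomputed value falls within about 0.1 of the reported statistic, which is the size of discrepancy expected from rounded inputs.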

5. 2. Assumption Checks

Prior to the main analyses, assumption checks were conducted for the parametric tests used in this study. Normality was examined using Shapiro–Wilk tests and Q–Q plots, and homogeneity of variance was assessed using Levene’s test. For the MANOVA, Box’s M test was also conducted. The group sizes were similar across conditions (A = 9, B = 10, C = 10), and the multivariate homogeneity assumption was supported, Box’s M = 21.615, F(20, 2367.773) = 0.832, p = .675. Levene’s tests indicated that homogeneity of variance was satisfied for engagement, F(2, 26) = 1.206, p = .316, likeability, F(2, 26) = 0.272, p = .764, perceived intelligence, F(2, 26) = 1.434, p = .257, and task completion time, F(2, 26) = 1.696, p = .203, but not for fluency, F(2, 26) = 5.897, p = .008. Supplementary robustness checks were therefore conducted where appropriate. These checks showed the same overall pattern of results, and the planned parametric analyses were retained.

5. 3. Reliability

Cronbach’s alpha was calculated to assess the internal consistency of participants’ responses to engagement, likeability, perceived intelligence, and fluency. The results indicated that all measures were acceptable. Cronbach’s alpha values for each measure are presented in Table 1.

Table 1

Internal consistency of measures in the experiment
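For readers unfamiliar with the statistic, Cronbach's alpha compares the sum of the individual item variances with the variance of the summed scale score. The sketch below shows the standard formula. The function name `cronbach_alpha` and the ratings in `toy_items` are hypothetical and are not data from this study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items.

    item_scores holds one list per item, each containing the ratings of the
    same respondents in the same order.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items.")
    # Total scale score per respondent.
    totals = [sum(ratings) for ratings in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 7-point ratings: three items, five respondents.
toy_items = [
    [5, 6, 4, 7, 5],
    [5, 7, 4, 6, 5],
    [4, 6, 5, 7, 6],
]
toy_alpha = cronbach_alpha(toy_items)
```

Alpha rises as the items covary more strongly: if every item received identical ratings from each respondent, the formula returns 1.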

5. 4. Participants’ Perceptions

A one-way MANOVA revealed a significant multivariate effect, F(8, 46) = 32.75, p < .001. Figure 4 illustrates the differences in the effects of robot movements on participants’ perceptions. Participants reported higher engagement in Condition B (M = 6.30, SD = 0.37) than in Condition A (M = 4.50, SD = 0.52) and Condition C (M = 5.60, SD = 0.61). For likeability, Condition B (M = 6.50, SD = 0.37) was rated higher than Condition A (M = 4.62, SD = 0.44) and Condition C (M = 5.46, SD = 0.31). However, participants reported higher perceived intelligence in Condition C (M = 6.02, SD = 0.52) than in Condition A (M = 4.40, SD = 0.33) and Condition B (M = 5.02, SD = 0.33). Participants also reported higher fluency in Condition C (M = 6.23, SD = 0.35) than in Condition A (M = 4.56, SD = 0.37) and Condition B (M = 5.37, SD = 0.87).

Figure 4

Comparison of participants’ perception mean scores for robot movements (A: pointing-only, B: pointing + opening and closing, C: pointing + turn-taking)

Tukey HSD post hoc tests revealed significant differences. For engagement, Condition B showed higher ratings than Condition A (p < .001) and Condition C (p = .013). Condition C was significantly higher than A (p < .001). For likeability, Condition B received greater ratings than Condition A (p < .001) and Condition C (p < .001). Condition C was significantly higher than Condition A (p < .001). For perceived intelligence, Condition C was rated above both Condition A (p < .001) and Condition B (p < .001). Condition B also exceeded Condition A (p = .007). For fluency, Condition C scored higher than Condition A (p < .001) and Condition B (p = .008). Condition B was significantly higher than Condition A (p = .016).

Because Levene’s test indicated unequal variances for fluency, a Welch ANOVA was additionally conducted as a robustness check. The result remained significant, Welch’s F(2, 16.272) = 48.539, p < .001. Games–Howell post hoc comparisons also showed significant differences among all three conditions (A vs. B, p = .047; A vs. C, p < .001; B vs. C, p = .032), supporting the same overall conclusion.

5. 5. Task Completion Time

A one-way ANOVA revealed a significant effect of condition on task completion time, F(2, 26) = 6.58, p = .005. Participants in Condition C (M = 1.30, SD = 0.41) had shorter task completion times compared to Condition A (M = 2.50, SD = 1.10) and Condition B (M = 1.84, SD = 0.51). Post hoc comparisons using the Tukey HSD test showed that Condition C was significantly different from Condition A (p = .003). However, there was no significant difference between Condition C and Condition B (p = .233), nor between Condition B and Condition A (p = .134). Figure 5 illustrates a comparison of mean task completion time across movement conditions. As a robustness check, a Welch ANOVA was also conducted and remained significant, Welch’s F(2, 15.313) = 6.555, p = .009.

Figure 5

Comparison of mean task completion time across robot movements (A: pointing-only, B: pointing + opening and closing, C: pointing + turn-taking)
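The omnibus tests on task completion time can be approximately reproduced from the reported group sizes, means, and standard deviations. The sketch below implements the classic one-way ANOVA and Welch's ANOVA from summary statistics. The names `GROUPS`, `classic_anova`, and `welch_anova` are hypothetical, and because the inputs are rounded to two decimals the output approximates the reported values rather than matching them exactly.

```python
# condition: (n, reported mean, reported SD) for task completion time
GROUPS = {
    "A": (9, 2.50, 1.10),
    "B": (10, 1.84, 0.51),
    "C": (10, 1.30, 0.41),
}


def classic_anova(groups):
    """One-way ANOVA F statistic and degrees of freedom from summary statistics."""
    stats = list(groups.values())
    k = len(stats)
    n_total = sum(n for n, _, _ in stats)
    grand_mean = sum(n * m for n, m, _ in stats) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in stats)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in stats)
    df1, df2 = k - 1, n_total - k
    return (ss_between / df1) / (ss_within / df2), df1, df2


def welch_anova(groups):
    """Welch's ANOVA F statistic and degrees of freedom (no equal-variance assumption)."""
    stats = list(groups.values())
    k = len(stats)
    # Each group is weighted by the inverse of its squared standard error.
    weights = [n / sd ** 2 for n, _, sd in stats]
    w_total = sum(weights)
    w_mean = sum(w * m for w, (_, m, _) in zip(weights, stats)) / w_total
    numerator = sum(
        w * (m - w_mean) ** 2 for w, (_, m, _) in zip(weights, stats)
    ) / (k - 1)
    lam = sum(
        (1 - w / w_total) ** 2 / (n - 1) for w, (n, _, _) in zip(weights, stats)
    )
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    df2 = (k ** 2 - 1) / (3 * lam)
    return numerator / denominator, k - 1, df2


f_classic, df1_classic, df2_classic = classic_anova(GROUPS)
f_welch, df1_welch, df2_welch = welch_anova(GROUPS)
print(f"ANOVA: F({df1_classic}, {df2_classic}) = {f_classic:.2f}")
print(f"Welch: F({df1_welch}, {df2_welch:.3f}) = {f_welch:.3f}")
```

Both recomputed statistics land close to the reported F(2, 26) = 6.58 and Welch's F(2, 15.313) = 6.555. The Welch variant is the more cautious reading here because the standard deviation in Condition A is more than twice that of the other two conditions.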

5. 6. Conversational Overlaps

A chi-square test revealed significant differences in overlap occurrence across conditions, χ²(2, N = 29) = 10.83, p = .004. Overlaps were observed at a lower rate in Condition C (30.00%) compared to Condition A (88.89%) and Condition B (90.00%). Although the chi-square test was significant, this result should be interpreted with caution because several expected frequencies were small (50.0% of the cells had expected counts below 5, with a minimum expected count of 2.79).
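The overlap analysis can be checked with a small contingency-table computation. The sketch below assumes that the reported percentages correspond to 8 of 9, 9 of 10, and 3 of 10 participants with at least one overlap. These counts are reconstructed from the percentages and group sizes and are not listed in the text. The names `OBSERVED` and `chi_square_independence` are hypothetical.

```python
import math

# condition: (participants with an overlap, participants without).
# Counts reconstructed from the reported percentages and group sizes.
OBSERVED = {
    "A": (8, 1),
    "B": (9, 1),
    "C": (3, 7),
}


def chi_square_independence(table):
    """Pearson chi-square statistic, degrees of freedom, and expected counts."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    total = sum(row_totals)
    expected = [[rt * ct / total for ct in col_totals] for rt in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for row_obs, row_exp in zip(rows, expected)
        for o, e in zip(row_obs, row_exp)
    )
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


chi2, dof, expected = chi_square_independence(OBSERVED)
# With exactly 2 degrees of freedom, the chi-square upper-tail
# probability has the closed form exp(-x / 2).
p_value = math.exp(-chi2 / 2)
small_cells = sum(e < 5 for row in expected for e in row)
min_expected = min(e for row in expected for e in row)

print(f"chi2({dof}) = {chi2:.2f}, p = {p_value:.3f}")
print(f"cells with expected count below 5: {small_cells} of 6, minimum {min_expected:.2f}")
```

Under the reconstructed counts, the statistic, the p value, the share of small cells, and the minimum expected count all agree with the reported figures. The small expected counts are why the result is flagged for cautious interpretation. An exact test would be a common alternative for a table this sparse, although the text does not report one.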

5. 7. Qualitative Data

In Condition A, participants described the robot’s movement as “simple guidance” or “recommending.” Some noted that “more natural interaction was desired.” In Condition B, participants perceived the robot’s movements as expressive and engaging, describing it as “an entity that welcomed me” and noting it “created a pleasant atmosphere.” In Condition C, participants emphasized that the robot’s movements appeared “interactive” and “conversational,” describing the interaction as similar to “playing a board game face-to-face.”

6. Discussion

This study explored abstract movements in non-humanoid robotic objects and their impact on conversational HRI. The findings revealed that: (i) all abstract movement types were perceived as appropriate expressions, (ii) opening and closing movements significantly enhanced engagement and likeability, and (iii) turn-taking movements significantly improved perceived intelligence and fluency while reducing conversational overlaps and, relative to the pointing-only condition, task completion time.

First, all gesture types were perceived as appropriate expressions (H1), confirming that abstract movements were interpreted as intended. This demonstrated that simple motion-based expressiveness can effectively communicate social intentions without requiring anthropomorphic features.

Second, Condition B significantly enhanced engagement (H2a) and likeability (H2b) compared to Condition A. Participants reported experiencing positive greetings, enjoyable interactions, and curiosity regarding the robot’s intentions, likely because opening and closing movements conveyed the robot’s communicative intent through clearly noticeable movements.

Third, hypotheses H3a and H3b were only partially supported. While Condition C positively affected engagement and likeability relative to Condition A, it was less effective than Condition B, which may be because the opening and closing movements had a more dynamic motion range than subtle turn-taking movements.

Conversely, hypotheses H3c and H3d were fully supported. Condition C received significantly higher ratings for perceived intelligence and fluency than other conditions. Participants reported that the robot’s feedback was clear and its movements were interactive and conversational, suggesting that Condition C provided easily understandable communicative cues. Furthermore, Condition C significantly reduced task completion time relative to Condition A, although this effect did not reach significance when compared with Condition B (H3-1). Condition C also significantly reduced conversational overlaps (H3-2). These results demonstrated that even minimal abstract movements in low-anthropomorphic robots could enhance conversational HRI. Turn-taking movements positively influenced task efficiency and reduced conversational overlaps, indicating that effective HRI does not necessarily require sophisticated humanoid features.

Beyond demonstrating statistically significant differences between conditions, the findings suggest several preliminary movement-design guidelines for low-DoF conversational robots. When the design goal is to make the beginning and end of an interaction more legible and to reinforce welcoming or closing impressions, relatively larger and clearer movements may be useful, as they can serve as cues that mark interaction boundaries and make the robot’s social availability more apparent. In contrast, when the goal is to clarify turn transition and support smoother speaking-listening exchange, smaller, shorter, and more clearly timed movements placed after a user utterance or before the robot’s next utterance may be better suited to the role of turn-taking cues. Overall, these interpretations extend the contribution of the study from reporting condition effects to offering design guidance on how minimal abstract movement may be used to organize interaction.

6. 1. Limitations and Future Research

This study has several limitations. First, the relatively small sample size (9-10 participants per condition) limits generalizability, and the findings require validation with larger samples. Second, the controlled laboratory setting did not fully reflect real-world environments. Third, participants’ interactions were short conversations that lasted less than a minute. Incorporating longer interaction periods and repeated encounters could reveal habituation and longitudinal effects. Fourth, this study examined only one-to-one interactions. Future research should investigate how non-humanoid abstract gestures can be applied to group conversation contexts.

7. Conclusion

This study investigated how abstract movements in non-humanoid robotic objects affect user perceptions in conversational contexts. All movement types were perceived as appropriate gestures. Opening and closing movements significantly enhanced engagement and likeability, while turn-taking movements improved perceived intelligence and fluency. Additionally, turn-taking movements reduced conversational overlaps compared to both other conditions and shortened task completion time compared to the pointing-only condition.

These findings demonstrated that minimal abstract movements could effectively support conversational grounding in non-humanoid robots. The exploratory movement patterns identified in this study implemented conversational cues through simple motion-based expressiveness, offering an alternative to human-mimicking approaches. They may also help develop social features in other physically constrained intelligent products, ultimately contributing to more engaging and fluent interactions in everyday contexts.

The contribution of this study lies not only in showing that minimal abstract movement can support conversational grounding, but also in clarifying how different movement types may function as distinct interaction cues in low-DoF service robots. This provides a basis for approaching robot movement design not simply in terms of increasing expressiveness, but in terms of shaping interaction in socially and interactionally meaningful ways.

Acknowledgments

This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2025S1A5C2A02022408).

Notes

Citation: Lee, H., & Jung, E-C. (2026). Abstract Movement Design for Non-Humanoid Robots: Opening, Closing, and Turn-Taking Gestures. Archives of Design Research, 39(2), 145-158.

Copyright: This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted educational and non-commercial use, provided the original work is properly cited.

References

Ajibo, C. A., Ishi, C. T., Mikata, R., Liu, C., & Ishiguro, H. (2020). Analysis of body gestures in anger expression and evaluation in android robot. Advanced Robotics, 34(24), 1581-1590. [https://doi.org/10.1080/01691864.2020.1855244]
Anderson-Bashan, L., Megidish, B., Erel, H., Wald, I., Hoffman, G., Zuckerman, O., & Grishko, A. (2018, August). The greeting machine: an abstract robotic object for opening encounters. In 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN) (pp. 595-602). IEEE. [https://doi.org/10.1109/ROMAN.2018.8525516]
Bartneck, C., Kulić, D., Croft, E., & Zoghbi, S. (2009). Measurement instruments for the anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety of robots. International Journal of Social Robotics, 1(1), 71-81. [https://doi.org/10.1007/s12369-008-0001-3]
Clavel, C., Cafaro, A., Campano, S., & Pelachaud, C. (2016). Fostering user engagement in face-to-face human-agent interactions: a survey. In Toward Robotic Socially Believable Behaving Systems-Volume II: Modeling Social Signals (pp. 93-120). Cham: Springer International Publishing. [https://doi.org/10.1007/978-3-319-31053-4_7]
Duncan, S. (1972). Some signals and rules for taking speaking turns in conversations. Journal of Personality and Social Psychology, 23(2), 283-292. [https://doi.org/10.1037/h0033031]
Duncan Jr, S., & Niederehe, G. (1974). On signalling that it's your turn to speak. Journal of Experimental Social Psychology, 10(3), 234-247. [https://doi.org/10.1016/0022-1031(74)90070-5]
Epley, N., Waytz, A., & Cacioppo, J. T. (2007). On seeing human: A three-factor theory of anthropomorphism. Psychological Review, 114(4), 864-886. [https://doi.org/10.1037/0033-295X.114.4.864]
Erel, H., Trayman, D., Levy, C., Manor, A., Mikulincer, M., & Zuckerman, O. (2022). Enhancing emotional support: The effect of a robotic object on human-human support quality. International Journal of Social Robotics, 14(1), 257-276. [https://doi.org/10.1007/s12369-021-00779-5]
Goffman, E. (1955). On face-work: An analysis of ritual elements in social interaction. Psychiatry, 18(3), 213-231. [https://doi.org/10.1080/00332747.1955.11023008]
Hoffman, G. (2019). Evaluating fluency in human-robot collaboration. IEEE Transactions on Human-Machine Systems, 49(3), 209-218. [https://doi.org/10.1109/THMS.2019.2904558]
Hoffman, G., & Ju, W. (2014). Designing robots with movement in mind. Journal of Human-Robot Interaction, 3(1), 91-122. [https://doi.org/10.5898/JHRI.3.1.Hoffman]
Hu, Y., Chen, B., Lin, J., Wang, Y., Wang, Y., Mehlman, C., & Lipson, H. (2024). Human-robot facial coexpression. Science Robotics, 9(88), eadi4724. [https://doi.org/10.1126/scirobotics.adi4724]
Hu, Y., Huang, P., Sivapurapu, M., & Zhang, J. (2025). ELEGNT: Expressive and functional movement design for non-anthropomorphic robot. arXiv preprint arXiv:2501.12493. [https://doi.org/10.48550/arXiv.2501.12493]
Kendon, A. (1967). Some functions of gaze-direction in social interaction. Acta Psychologica, 26, 22-63. [https://doi.org/10.1016/0001-6918(67)90005-4]
Kendrick, K. H., Holler, J., & Levinson, S. C. (2023). Turn-taking in human face-to-face interaction is multimodal: gaze direction and manual gestures aid the coordination of turn transitions. Philosophical Transactions of the Royal Society B, 378(1875), 20210473. [https://doi.org/10.1098/rstb.2021.0473]
Kim, A., Han, J., Jung, Y., & Lee, K. (2013, March). The effects of familiarity and robot gesture on user acceptance of information. In 2013 8th ACM/IEEE International Conference on Human-Robot Interaction (HRI) (pp. 159-160). IEEE. [https://doi.org/10.1109/HRI.2013.6483550]
Kim, C. Y., Lee, C. P., & Mutlu, B. (2024, March). Understanding large-language model (LLM)-powered human-robot interaction. In Proceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction (pp. 371-380). [https://doi.org/10.1145/3610977.3634966]
Kompatsiari, K., Ciardo, F., Tikhanoff, V., Metta, G., & Wykowska, A. (2021). It's in the eyes: The engaging role of eye contact in HRI. International Journal of Social Robotics, 13(3), 525-535. [https://doi.org/10.1007/s12369-019-00565-4]
Lee, H., & Hahn, S. (2025). Effect of Robot Head Movement and its Timing on Human-Robot Interaction. International Journal of Social Robotics, 17(1), 3-14. [https://doi.org/10.1007/s12369-024-01196-0]
Liu, C., Ishi, C. T., Ishiguro, H., & Hagita, N. (2013). Generation of nodding, head tilting and gazing for human-robot speech interaction. International Journal of Humanoid Robotics, 10(01), 1350009. [https://doi.org/10.1142/S0219843613500096]
Mahadevan, K., Chien, J., Brown, N., Xu, Z., Parada, C., Xia, F., Zeng, A., Takayama, L., & Sadigh, D. (2024, March). Generative expressive robot behaviors using large language models. In Proceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction (pp. 482-491). [https://doi.org/10.1145/3610977.3634999]
McClave, E. Z. (2000). Linguistic functions of head movements in the context of speech. Journal of Pragmatics, 32(7), 855-878. [https://doi.org/10.1016/S0378-2166(99)00079-X]
Mori, M., MacDorman, K. F., & Kageki, N. (2012). The uncanny valley [from the field]. IEEE Robotics & Automation Magazine, 19(2), 98-100. [https://doi.org/10.1109/MRA.2012.2192811]
Onyeulo, E. B., & Gandhi, V. (2020). What makes a social robot good at interacting with humans? Information, 11(1), 43. [https://doi.org/10.3390/info11010043]
Rosenthal-von der Pütten, A. M., Krämer, N. C., & Herrmann, J. (2018). The effects of humanlike and robot-specific affective nonverbal behavior on perception, emotion, and behavior. International Journal of Social Robotics, 10(5), 569-582. [https://doi.org/10.1007/s12369-018-0466-7]
Sacks, H., Schegloff, E. A., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking for conversation. Language, 50(4), 696-735. [https://doi.org/10.1353/lan.1974.0010]
Sauppé, A., & Mutlu, B. (2014, March). Robot deictics: How gesture and context shape referential communication. In Proceedings of the 2014 ACM/IEEE International Conference on Human-Robot Interaction (pp. 342-349). [https://doi.org/10.1145/2559636.2559657]
Schegloff, E. A. (2007). Sequence organization in interaction: A primer in conversation analysis I (Vol. 1). Cambridge University Press. [https://doi.org/10.1017/CBO9780511791208]
Skantze, G., Hjalmarsson, A., & Oertel, C. (2014). Turn-taking, feedback and joint attention in situated human-robot interaction. Speech Communication, 65, 50-66. [https://doi.org/10.1016/j.specom.2014.05.005]
Thomaz, A. L., & Chao, C. (2011). Turn-taking based on information flow for fluent human-robot interaction. AI Magazine, 32(4), 53-63. [https://doi.org/10.1609/aimag.v32i4.2379]
Tian, L., & Oviatt, S. (2021). A taxonomy of social errors in human-robot interaction. ACM Transactions on Human-Robot Interaction (THRI), 10(2), 1-32. [https://doi.org/10.1145/3439720]
Tuyen, N. T. V., Elibol, A., & Chong, N. Y. (2020). Learning bodily expression of emotion for social robots through human interaction. IEEE Transactions on Cognitive and Developmental Systems, 13(1), 16-30. [https://doi.org/10.1109/TCDS.2020.3005907]
van Schendel, J. A., & Cuijpers, R. H. (2015). Turn-yielding cues in robot-human conversation. In AISB Convention 2015, Society for the Study of Artificial Intelligence and Simulation of Behaviour, 20-22 April 2015, Canterbury.
Veling, L., & McGinn, C. (2021). Qualitative research in HRI: A review and taxonomy. International Journal of Social Robotics, 13(7), 1689-1709. [https://doi.org/10.1007/s12369-020-00723-z]
Waytz, A., Cacioppo, J., & Epley, N. (2010). Who sees human? The stability and importance of individual differences in anthropomorphism. Perspectives on Psychological Science, 5(3), 219-232. [https://doi.org/10.1177/1745691610369336]
Wiemann, J. M., & Knapp, M. L. (2017). Turn-taking in conversations. Communication theory, 226-245. [https://doi.org/10.4324/9781315080918-19]
Zaga, C., De Vries, R. A., Li, J., Truong, K. P., & Evers, V. (2017, May). A simple nod of the head: The effect of minimal robot movements on children's perception of a low-anthropomorphic robot. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (pp. 336-341). [https://doi.org/10.1145/3025453.3025995]

Measure	Number of items	Cronbach’s α
Engagement	4	.86
Likeability	5	.89
Perceived intelligence	5	.91
Fluency	3	.83
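
For readers unfamiliar with the reliability coefficient reported in the table, the following is a minimal sketch of the standard Cronbach's α formula, α = (k / (k - 1)) · (1 - Σ item variances / variance of summed scores), where k is the number of items. It is not the authors' analysis script; the function name `cronbach_alpha` and the ratings in the usage example are hypothetical and are not data from this study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Return Cronbach's alpha for a list of respondent rows.

    scores: list of rows, one per respondent, each holding k item ratings.
    Uses sample variances (n - 1 denominator) for items and summed scores.
    """
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must rate the same number of items")

    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of each respondent's summed score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("summed scores have zero variance; alpha is undefined")

    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Hypothetical ratings (illustrative only): 4 respondents, 3 items.
    example = [
        [5.0, 4.0, 5.0],
        [3.0, 3.0, 4.0],
        [4.0, 4.0, 4.0],
        [2.0, 3.0, 2.0],
    ]
    print(round(cronbach_alpha(example), 2))
```

Values closer to 1 indicate that the items of a scale vary together across respondents, which is how the coefficients in the table are conventionally read.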