Quantitative Distribution of English and Indonesian Motion Verbs and Its Typological Implications: A case study with the English and Indonesian versions of the Twilight novel

This paper investigates the quantitative distribution (type and token frequencies, and type-per-token ratio [TTR]) of motion verbs found in the English and Indonesian versions of the novel Twilight (Meyer, 2005; Sari, 2008). The study is contextualized within two divergent views on the typological characteristics of Indonesian lexicalization patterns of motion events. One study (Son, 2009) suggests that Indonesian behaves like English, representing a satellite-framed pattern (i.e., lexicalizing Manner of motion in the main verb), while another study (Wienold, 1995) argues for the verb-framed nature of Indonesian (i.e., lexicalizing Path of motion in the main verb). We seek to offer a quantitative perspective on these two proposals. Our study shows that, compared to English, Indonesian has a significantly higher number (i.e., types) and more occurrences (i.e., tokens) of Path verbs (reflecting the verb-framed pattern). Moreover, the higher TTR value of Path verbs for Indonesian shows greater lexical diversity in the inventory of Indonesian Path verbs compared to English. In contrast, the English Manner verbs are significantly higher in number and in token frequency than the Indonesian ones (suggesting the satellite-framed pattern), and show greater lexical diversity given the higher TTR value. While these findings lean toward supporting the verb-framed pattern of Indonesian (Wienold, 1995), we point out the limitations of our conclusion and offer suggestions for future study.


Introduction
Ever since Talmy's (1972) hallmark study of the semantic structure of motion events in English and Atsugewi, the expression of motion across languages has been one of the central topics in linguistics (Filipović, 2007). It has been a subject of inquiry in crosslinguistic studies of language acquisition, development, and change, linguistic typology, narrative discourse, bilingualism, and translation, among many others (for a recent overview, see Filipović & Ibarretxe-Antuñano, 2015). "Motion events" (or "translatory situations" in Talmy's (1972, p. 10) original study) refer to the change of location of an object from one place to another via a certain path (Filipović, 2007, p. 8; Talmy, 2000, p. 25) (see example (1)):
(1). (from Talmy, 2000, p. 227)
The bottle floated into the cave
Motion events consist of four internal semantic components: Figure, Ground, Path, and Motion (Talmy, 2000, p. 25). Figure is an object that moves with reference to another object, namely the Ground. Path is the trajectory along which the Figure moves with respect to the Ground. Motion captures "the presence per se of motion (…) in the event" (Talmy, 2000, p. 25). The motion event expression in (1) can be analyzed with respect to these components. The Figure is lexicalized by the bottle, while the Ground is lexicalized by the cave. The preposition into lexicalizes the Path of the Figure's movement, and the movement itself is expressed by the main verb floated. In addition to these four internal components, a motion event can often be associated with an external Co-event, namely the Manner in which the movement is carried out. English is a celebrated example of a language in which the Manner is conflated in the verb (Talmy, 2000, p. 152). Example (1) shows the conflation of Manner and Motion in its main verb float, indicating that the movement of the bottle into the cave happens by way of floating.
Talmy's central contribution is that languages can be classified according to how the core schema of a motion event, namely the Path, is lexicalized in the surface expressions, that is, whether the Path is lexicalized in the main verb or in other constituents, which are labelled satellites (Talmy, 2000, p. 101ff).
In English, verb particles are examples of satellites in Talmy's system, and they can overlap with other categories, such as English prepositions (Talmy, 2000, p. 102). Languages that characteristically lexicalize the Path in the satellites, and conflate the Manner in the main verbs, are called satellite-framed (or S-)languages, while those that lexicalize or conflate the Path in the main verb, and express the Manner in other constituents (e.g., adverbials), are called verb-framed (or V-)languages (Talmy, 2000, p. 222). Romance languages (e.g., French and Spanish), Semitic, Japanese, Polynesian, Tamil, and Bantu are characterized as V-languages. The S-languages are represented by English, German, and Chinese (Talmy, 2000, p. 222).
While studies on motion events abound in many languages, a search of the literature revealed few studies that discuss this topic for Indonesian (Wienold, 1995; Son & Svenonius, 2008; Son, 2009; Pamphila, 2011), though a closely related language, namely Malay spoken in Malaysia, has been analyzed (Huang & Tanangkingsing, 2005). Views still diverge, especially on the typological characteristics of Indonesian in lexicalizing the semantic components of motion events. Wienold (1995, pp. 311-312) argues that Indonesian behaves more like Spanish (i.e., a V-language), in which the Path is lexicalized in the main verbs, due to the richness of verbs encoding the path of movement (e.g., melintasi 'to move across'; naik 'to move up/ascend'; turun 'to move down/descend'). However, Son (2009; see also Son & Svenonius, 2008) proposes that Indonesian (and related languages such as Javanese and Balinese) behaves like English and other Germanic languages (i.e., S-languages), in which the main verb lexicalizes the Manner and can evoke "directed motion interpretations" when co-occurring with the goal-expressing preposition ke 'to' (Son, 2009, p. 217) (see (2)): with directional preposition ke 'to', a pattern that Son argues is similar to English. In contrast, Wienold only presented path verbs and explicitly acknowledged that the study has not "been able to assess the extent of manner lexicalization in Indonesian" (1995, p. 334, endnote 13). Huang and Tanangkingsing's (2005) quantitative study on Malay, based on elicited narrative, offers a more nuanced perspective as opposed to the superficially strict two-way typology of motion events proposed by Talmy (2000, p. 221 Wienold (1995). Moreover, Huang and Tanangkingsing (2005, p. 334) argue that languages cannot be classified into an "either-or" typology, since a given language can show (a combination of) features of satellite-framed and verb-framed languages to a varying degree.
This suggests that what is thought to be a typical pattern in a language "is usually a statistical usage preference rather than a hard syntactic constraint" (Goschler & Stefanowitsch, 2013, p. 4; see also Beavers, Levin, & Wei Tham, 2010).
We have seen diverging proposals from Son (2009) and Wienold (1995) concerning the typological characteristics of Indonesian lexicalization patterns of motion events: the former argues that Indonesian behaves like English (i.e., satellite-framed pattern) while the latter proposes the verb-framed pattern for Indonesian. Their proposals, moreover, are based on a qualitative approach without further quantitative investigation. While Wienold (1995, p. 313, Table 4) listed a number of Path verbs to propose the verb-framed nature of Indonesian, there was no analysis of the distribution of Indonesian Manner verbs (Wienold, 1995, p. 334, endnote 13). This paper takes these gaps, and the two competing views on the typological characteristics of Indonesian lexicalization of motion events, as its point of departure. Our study also builds on Pamphila's (2011) work on motion verbs in English and their Indonesian translations, and uses the same data source, namely the English and Indonesian versions of the Twilight novel (Meyer, 2005; Sari, 2008). Pamphila (2011) applies Talmy's typology of motion events in the context of translation strategies of motion events from English into Indonesian.
In this paper, we offer a quantitative basis for the characterization of Indonesian lexicalization patterns with respect to the two-system typology of motion events proposed by Talmy (2000). Specifically, we compare the quantitative distribution (i.e., the number of types and token frequencies) of motion verbs in the English and Indonesian versions of Twilight. This quantitative approach is motivated theoretically and empirically by Slobin's (1996) seminal work on English and Spanish, demonstrating that the typological split should be relativised to the quantitative distribution of the motion verbs in language use (cf. Goschler & Stefanowitsch, 2013).
One of the proposals put forward by Slobin (1996) is that typological differences in the lexicalization of motion events between S-languages and V-languages are connected to the number (i.e., type frequency) and frequency of occurrences (i.e., token frequency) of lexical items that encode the Manner of motion. The quantitative prediction is that V-languages would have fewer and less expressive Manner-of-motion verbs than S-languages (cf. Slobin, 1996, p. 208). This proposal was developed on the basis of prototypical examples of S-languages and V-languages, namely English and Spanish respectively.
We extend Slobin's proposal to Indonesian motion verbs and test it with reference to English as the prototypical S-language, in light of the two competing proposals from Son (2009) and Wienold (1995) on the typology of Indonesian lexicalization of motion events. To preview our results, we found statistically significant asymmetries in the distribution (both type and token frequency) of Path and Manner verbs between English and Indonesian. Our results support Wienold's (1995) proposal that Indonesian is richer in its Path verbs compared to English. We close the paper by discussing the limitations of our findings and the further issues that we are exploring in relation to this project. Although much work remains to be done, we hope that this paper generates fresh, quantitative, usage-based insights into the typological characteristics of Indonesian lexicalization of motion events.

Methodology
As mentioned in the previous section, the data were taken from the novel Twilight in English and its Indonesian translation (cf. Slobin, 1996, who also used novels as one of the data sources). Twilight has received wide popularity, having sold 17 million copies worldwide and been translated into 37 different languages (Pamphila, 2011, p. 37). Consequently, the English and Indonesian versions of Twilight are more accessible compared to other titles.
A novel was chosen because, as a long work of narrative fiction, it can cover a range of human experiences (e.g., moving between places and how that movement is captured as the story unfolds); Twilight in particular consists of action-filled narrative rather than a monologue in the style of a diary. Given these features of novels, and of Twilight in particular, we expect it to provide rich expressions of motion events. Future work could test for potential distributional differences of motion events across texts of different genres.
The rationale for using a translated novel is to have the same basis of comparison for the motion events, with the same story lines in the two versions of the novel. In addition, translated novels have been used in previous studies on semantic typology and translation strategies of motion events (Slobin, 1996, 2005; Ibarretxe-Antuñano, 2003). The use of texts with different story lines would presumably bias the number of motion events, though this is in itself an empirical question. In our case, the story lines are at least the same, and the differences that we might observe (e.g., in the inventory of motion verbs and in the description of motion-related scenes of the same story) could be due to the different semantic-typological patterns in lexicalising motion, which is what this paper attempts to investigate (i.e., comparing the richness of Manner and Path verbs between English and Indonesian).
The database of the motion verbs, and their classification into Manner and Path verbs, was built manually through close reading of the entire novels, initially for the purpose of the master's thesis of the second author. We referred to previous works (e.g., Slobin, 1996, 2000; Wienold, 1995, among others) to classify the Manner and Path verbs in English. For the Indonesian data, we relied on our intuition as native speakers and consulted the online Big Indonesian Dictionary (KBBI) (https://kbbi.kemdikbud.go.id) to check the meaning of the verbs. The types of motion events that are the focus of this paper include self-propelled/directed motion, whereby the Figure performs the motion itself (3), and caused motion (4) (cf. Slobin, 1996, p. 200 for a similar approach).
The first author then performed automatic lemmatization, dependency parsing, and part-of-speech (POS) tagging on the full texts of both versions of the novel, using R (R Core Team, 2020) and the udpipe R package (Wijffels, 2019), in addition to conducting the quantitative analyses and visualization. Lemmatization helps in extracting the lemmas of the motion verbs, which are the units for the quantitative analyses (see below). A lemma is the abstract, uninflected form of a word (e.g., walk as a lemma can be realized as the base form or infinitive walk, third person walks, past tense and participial walked, and the -ing form walking). The POS tagging ensures that the lemma to be filtered is a verb and not a noun (e.g., drive is an ambicategorical word that can be a noun [e.g., a one-hour drive] or a verb [e.g., he drove to downtown]).
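The filtering step described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the authors' R/udpipe pipeline: the annotated tokens, the `motion_lemmas` set, and the function name `count_motion_verbs` are all hypothetical, and the POS labels are assumed to follow the Universal Dependencies tag names (e.g., VERB, NOUN).

```python
from collections import Counter

# Hypothetical output of a lemmatizer/POS tagger: (word form, lemma, POS tag).
# The sentence and its annotation are invented for illustration.
tokens = [
    ("He", "he", "PRON"),
    ("drove", "drive", "VERB"),
    ("downtown", "downtown", "ADV"),
    ("after", "after", "ADP"),
    ("a", "a", "DET"),
    ("one-hour", "one-hour", "ADJ"),
    ("drive", "drive", "NOUN"),   # same lemma, but a noun: must be excluded
    ("and", "and", "CCONJ"),
    ("walked", "walk", "VERB"),
    ("home", "home", "ADV"),
]

# Hypothetical list of manually classified motion-verb lemmas.
motion_lemmas = {"drive", "walk"}


def count_motion_verbs(tokens, motion_lemmas):
    """Count tokens per lemma, keeping only verbal uses of the listed lemmas."""
    return Counter(
        lemma
        for _form, lemma, pos in tokens
        if pos == "VERB" and lemma in motion_lemmas
    )


counts = count_motion_verbs(tokens, motion_lemmas)
print(counts)
```

Here the nominal use of drive is skipped because its tag is NOUN, so each of the two lemmas is counted once.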
For the quantitative analyses, we compare the type and token frequencies of the Manner and Path motion verbs in the two novels. Type frequency measures the number of different verb types (i.e., lemmas) per category (i.e., how many types there are for Path and Manner verb-lemmas in the English and Indonesian versions of the novel). Token frequency measures the number of occurrences of a given lemma in the whole novel (e.g., how many times the verbal lemma walk occurs, in its various inflections, in the entire novel). The tally of the token frequencies per verb category (i.e., Path and Manner verbs) for each Statistical significance test for each comparison was computed using the Binomial Test (two-tailed) (see Gries, 2009, pp. 37-44) implemented in the R function binom.test(). In addition to these four comparisons, we also compare the type-per-token ratio (TTR) values for the Manner and Path verbs between the two novels. TTR can indicate the relative lexical diversity of a certain type of verb in each language (Slobin, 1996, p. 208). Data preprocessing and visualization were performed using the tidyverse suite of R packages (Wickham et al., 2019). We publish the data and R code for the statistical analyses at https://doi.org/10.6084/m9.figshare.14753445 (Rajeg & Pamphila, 2021).
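As a small illustration of the three measures just defined, the sketch below computes type frequency, token frequency, and TTR from a list of lemma occurrences. It is a Python sketch rather than the authors' published R code, and the input list and the function name `distribution_summary` are hypothetical.

```python
from collections import Counter


def distribution_summary(lemma_occurrences):
    """Return (type frequency, token frequency, TTR) for one verb category."""
    counts = Counter(lemma_occurrences)
    n_types = len(counts)             # number of distinct lemmas
    n_tokens = sum(counts.values())   # total number of occurrences
    ttr = n_types / n_tokens if n_tokens else 0.0
    return n_types, n_tokens, ttr


# Invented toy data: six occurrences of three Manner verb-lemmas.
manner_occurrences = ["walk", "walk", "run", "walk", "stumble", "run"]
summary = distribution_summary(manner_occurrences)
print(summary)
```

With this toy input there are three types and six tokens, so the TTR is 0.5; a category whose tokens are spread over more distinct lemmas would yield a higher value.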

Manner Verbs in English and Indonesian
Earlier studies on motion events have reported that English, as an S-language, shows a greater variety of motion verbs that conflate the Manner component compared to V-languages, such as Spanish (Egli, Pause, Schwarze, von Stechow, & Wienold, 1995, p. xiii; Wienold, 1995, p. 303; Slobin, 1996, p. 198, among others). Wienold's (1995) account of Indonesian, however, includes only Path verbs (excluding Manner verbs), and these are not directly compared quantitatively with data from S-languages so as to establish the relative position of Indonesian in the two-way typology proposed by Talmy (2000). Let us look at the number of Manner verb-lemmas found in the English and Indonesian versions of Twilight as visualized in Figure 1.
The English data contain a highly significantly greater number of Manner motion verbs (65 types) than the Indonesian data (26 types) (p < 0.001, two-tailed binomial test); the Indonesian inventory is less than half the size of the English one. In other words, 79.3% of the total 82 motion verbs in the English data are Manner verbs, while the proportion is lower for Indonesian (i.e., only 43.3% of the total 60 motion verbs in the Indonesian database are Manner verbs). The prominence of the Manner verbs in English as compared to Indonesian is also clear from the token frequencies (i.e., frequency of occurrences) (Figure 2). English Manner verbs are, overall, highly significantly more frequent (N_token = 729) than Indonesian ones (N_token = 449) (p < 0.001, two-tailed binomial test). Moreover, English also has a higher type-per-token ratio (TTR) for the Manner verbs than Indonesian (TTR_English = 0.0892 vs. TTR_Indonesian = 0.0579), meaning that English has a greater diversity of Manner verbs than Indonesian does relative to their tokens.
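The figures reported above can be re-derived with a short sketch. It is a Python stand-in rather than the authors' R code, and it rests on one labeled assumption: that each comparison tests the English count against an even (50/50) split of the combined English and Indonesian count, which is a common way of setting up a two-tailed exact binomial test but is not spelled out in the text. The function name `binom_test_equal_split` is hypothetical.

```python
from math import comb


def binom_test_equal_split(k, n):
    """Two-tailed exact binomial p-value for k successes in n trials.

    ASSUMPTION: the null hypothesis is an even split (probability 0.5).
    The p-value sums the probabilities of all outcomes that are no more
    likely than the observed one; integers keep the arithmetic exact.
    """
    observed = comb(n, k)
    extreme = sum(c for c in (comb(n, i) for i in range(n + 1)) if c <= observed)
    return extreme / 2**n


# Counts reported in the text for Manner verbs.
eng_types, ind_types = 65, 26
eng_tokens, ind_tokens = 729, 449

p_types = binom_test_equal_split(eng_types, eng_types + ind_types)
p_tokens = binom_test_equal_split(eng_tokens, eng_tokens + ind_tokens)

# Type-per-token ratios, rounded to four decimals as in the text.
ttr_english = round(eng_types / eng_tokens, 4)
ttr_indonesian = round(ind_types / ind_tokens, 4)

print(p_types < 0.001, p_tokens < 0.001, ttr_english, ttr_indonesian)
```

Under this assumption both p-values fall below 0.001, and the rounded ratios match the reported TTR values of 0.0892 and 0.0579.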
Another interesting feature is the Indonesian Manner verbs prefixed with the static passive prefix ter-. All ter-prefixed Manner verbs in Table 2 express movements of tripping over or sliding/slipping away. The prefix ter- further accentuates the accidental nature of such manners of movement, since one of the functions of ter- is to express involuntary/accidental action (Sneddon, Adelaar, Djenar, & Ewing, 2010, p. 117).
The remaining Manner verbs are formed via reduplication, which adds a further semantic aspect. For instance, the base melompat 'jump' evokes a different meaning from the reduplicated form melompat-lompat, which suggests an iterative action of hopping up and down. The crucial role of reduplication in creating Manner verbs can also be found in other languages, such as Japanese and Korean (Wienold, 1995).
Journal of Language and Literature ISSN: 1410-5691 (print); 2580-5878 (online) Gede Primahadi Wijaya Rajeg & Utei Charaleghy Pamphila

Path Verbs in English and Indonesian
Wienold (1995, pp. 311-313) proposes that Indonesian is characteristically a verb-framed or V-language (or Path language in Wienold's terminology), similar to French, Spanish, Italian, Thai, and Malay (see Huang & Tanangkingsing, 2005, for the study on Malay). One way of testing Wienold's (1995, p. 312) hypothesis is to compare the inventory of Path verbs in Indonesian with that of English, which is characteristically an S-language (cf. Slobin, 1996), as we have further confirmed in the previous sub-section.

Figure 3 visualizes the number of Path verbs in the English and Indonesian versions of Twilight. One can see that Indonesian indeed has a higher number of Path verbs than English (34 types for Indonesian vs. 17 types for English), and this distributional asymmetry is statistically significant (p < 0.05, two-tailed binomial test). English Path verbs make up only 20.7% of the total 82 motion verbs in the English database, while Indonesian Path verbs represent 56.7% of the total 60 Indonesian motion verbs. Wienold (1995, p. 323, Table 11) has shown that most Path verbs in English entered the language through borrowing from French and other Romance languages, such as enter, exit, pass, and return. Only two types are given as monomorphemic Path verbs of Germanic origin, namely rise and leave. Table 2 lists all the English and Indonesian Path verbs found in Twilight.

Figure 4 Summed token frequency of Path verb-lemmas in English and Indonesian versions of Twilight
The predominant type of Indonesian Path verb in Table 2 is the derived transitive verb, formed either from intransitive verbal bases or from other word classes (e.g., adjectives). For instance, naik 'go up' is an intransitive-base Path verb (see (9)) that has its derived transitive form in menaiki 'go up (onto sth.)' (10). Other examples of the same type include masuk 'go/come in' → memasuki 'enter' and turun 'go down' → menuruni 'go down sth.' The intransitive forms can be used in the satellite-framed construction, whereby the Path is also expressed via satellites (i.e., the directional preposition ke 'to') in addition to being lexicalized in the main verb (cf. Son, 2009) (see examples (9)-(12) below; original English source texts are also presented).

(10) - Transitive
Kakiku kram ketika menaiki tangga.
foot.1POSS cramps when ascend stair
'My foot got cramps when (I) went up the stair'
English source text: My feet dragged as I climbed the stairs. (Meyer, 2005, p. 78)
(11). (Sari, 2008, p (Meyer, 2005, p. 21)
The intransitive and transitive syntax of the Indonesian Path verbs in (9) and (10) respectively corresponds to the syntax of the English source texts, even though semantically the Indonesian example in (10) maintains only the Path and loses the Manner of the original English text (cf. Pamphila, 2011); that is, menaiki 'ascend' does not capture the Manner expressed by climb. Examples (11) and (12) show that the Path verbs are used to render prepositions from the English source texts. It remains to be seen what factors may systematically influence such usage variation of the Indonesian Path verbs in question.
Finally, it is important to point out that the Indonesian Path and Manner verbs can co-occur in a serial verb construction (SVC) (Aikhenvald, 2007). An SVC captures a conceptually single event and is expressed as "a sequence of verbs which act together as a single predicate" (Aikhenvald, 2007, p. 1). Examples (13) and (14) illustrate the SVCs from our translation database (the original English source texts are provided as well).
(13). (Sari, 2008, p. 137)
Aku berlari_manner masuk_path untuk
1SG run enter in.order.to
memanaskan minyak
'I run inside to heat up the oil'
English source text: I ran inside to get some oil heating on the stove (Meyer, 2005, p. 86)
(14). (Sari, 2008, p. 42)
Mr. Banner sedang berjalan_manner
NAME PROG walk
mengelilingi_path kelas
go/revolve.around class
English source text: Mr. Banner was walking around the room (Meyer, 2005, p. 26)
These two examples illustrate how SVCs in Indonesian can be used to translate satellite-framed patterns from English. The semantic components of Path in (13) and (14), expressed by the particles inside and around respectively, are kept in the Indonesian translation via the Path verbs in the SVCs. Our upcoming paper will present a quantitative study on the typological patterns of the motion events and their translation strategies from English into Indonesian.
Journal of Language and Literature Vol. 21 No. 2 - October 2021 ISSN: 1410-5691 (print); 2580-5878 (online)

Conclusion
This paper is couched within Talmy's (2000) two-way typological system of motion events and sets out to address the two diverging proposals for the typology of lexicalization patterns of motion events in Indonesian (Wienold, 1995; Son & Svenonius, 2008; Son, 2009). We extend one of Slobin's proposals for locating typological differences in lexicalizing motion events between satellite-framed or S-languages and verb-framed or V-languages, namely quantifying the number and frequency of occurrences of Manner and Path verbs, as well as their type-per-token ratios; our database is built on the English and Indonesian versions of Twilight.
Our findings show that Indonesian exhibits the characteristics of a V-language: the number and token frequency of its Path verbs are significantly higher than those of English (Figure 4), whereas the number and token frequency of its Manner verbs are significantly lower (cf. Figure 2). These results provide further support for Wienold's (1995) proposal for the verb-framed nature of Indonesian, and are in line with Huang and Tanangkingsing's (2005) findings on the closely related language Malay spoken in Malaysia (Indonesian is a variety of Malay spoken in the Indonesian archipelago).
The quantitative study in this paper covers only one of the analytical aspects suggested by Slobin (1996) for revealing typological differences in the lexicalization of motion events. We investigated only the inventory and usage frequency of the verbs in the two novels. Another proposal that we seek to investigate in future work is the salience of the Ground expression in describing motion events (Slobin, 1996). Our conclusion is also limited to the data source that we used and, for Indonesian in particular, to the linguistic knowledge of the Indonesian translator. Therefore, our findings should be seen as working hypotheses to be further tested with different data types and analytical aspects, and compared across other languages (e.g., regional languages in the Indonesian archipelago). As a follow-up study, and building on Pamphila's (2011) master's thesis, we are currently investigating (i) how the inventories of the Indonesian Manner and Path verbs join forces in the translation of motion events from English into Indonesian (see the discussion on examples (9)-(14) for some pointers); and (ii) whether such an investigation can offer a different perspective on the typological patterns of Indonesian lexicalization of motion events.