The Flow Of Japanese Part 2 - Dependency Diagrams

This article is currently a draft

Note: you need some basic understanding of Japanese to read this part

Note 2: I am not an expert; I just found this too interesting to not share

Basic Rules
Unique Feature #1: Particles
Basic Japanese Example
Unique Feature #2: Sentence Qualifying Nouns
Nested Sentence Japanese Example
Handling Ambiguity
Conclusion
Conjectures and Caveats

In the first part, we learned about dependency grammar and discussed how to break down and analyze sentences using it. If you haven’t already, I recommend reading it first. In particular, we learned these things:

Each part of a sentence is dependent on another part, except for the head of the sentence.
Sentence structure can be described by how these dependencies are linked and where the head is placed.
Each dependency link represents an answer to a question word.
English is considered a predominantly right-branching, head-initial language, while Japanese is considered an almost exclusively left-branching, head-final language.

In this part, we will take a deeper look into the dependency structure of Japanese. Unfortunately, you will need to know some basic Japanese for this part. I saw that the textbook Quartet briefly touches on some of what I am going to say, but that book was made for intermediate learners. I believe that knowing about this earlier on can be beneficial, so I will keep the examples rather easy.

Basic Rules

First, let’s establish some overarching rules of dependency structure in Japanese.

All dependencies are directed from left to right
Dependencies do not cross each other
Each bunsetsu segment, except for the last one, depends on only one bunsetsu segment

The third rule mentions a word called bunsetsu. Bunsetsu are the fundamental parts of a Japanese sentence, and each one consists of one content word and zero or more grammar points. They are the base of sentence diagrams in Japanese. We will work with bunsetsu at the beginning, but as sentences get increasingly complex, it makes more sense in our use case to group sections of a sentence into larger, more manageable chunks. Because of this, I will now refer to the diagrams as dependency diagrams instead of sentence diagrams so we don’t confuse our informal shortcut with the things that real computational linguists do. Also, I like the alliteration.
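The three rules above can be checked mechanically. Here is a minimal sketch, assuming a sentence is stored as a list `heads` where `heads[i]` is the index of the bunsetsu that bunsetsu `i` depends on and the final bunsetsu has `None`. The function name `check_rules` and this representation are my own choices for illustration, not something a real parser necessarily uses.

```python
def check_rules(heads):
    """Return a list of rule violations for a list of head indices.

    heads[i] is the index of the bunsetsu that bunsetsu i depends on.
    The last bunsetsu is the head of the sentence and must have None.
    Storing one head per bunsetsu already enforces "depends on only one".
    """
    n = len(heads)
    problems = []
    for i, h in enumerate(heads):
        if i == n - 1:
            if h is not None:
                problems.append("last bunsetsu must be the head")
            continue
        if h is None:
            problems.append(f"bunsetsu {i} has no parent")
        elif not (i < h < n):
            problems.append(f"link {i}->{h} does not point rightward")

    # Two links a->b and c->d cross when a < c < b < d.
    links = [(i, h) for i, h in enumerate(heads) if h is not None and i < h < n]
    for a, b in links:
        for c, d in links:
            if a < c < b < d:
                problems.append(f"links {a}->{b} and {c}->{d} cross")
    return problems


print(check_rules([2, 2, None]))     # a well-formed three-bunsetsu sentence
print(check_rules([2, 3, 3, None]))  # 0->2 and 1->3 cross
```

An empty list means the structure obeys all three rules. The check says nothing about whether the links are the right ones, only whether they are shaped legally.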

Anyways, we now know what dependency diagrams are, what their rules are in Japanese, and how to connect the separated chunks together. Now we need to figure out how to get the chunks themselves. Luckily, Japanese has some unique traits that make it easier for us to identify them.

Unique Feature #1: Particles

Japanese has a quirk of grammar that shows up in almost every complete sentence: particles. In English we have to deconstruct sentences using question words to help find the parts of speech. In Japanese, the answer to each of those questions can be explicitly marked with a particle. This makes it much easier to identify what role each word in the sentence plays.
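To make the idea concrete, here is a rough sketch of a particle lookup. The table `PARTICLE_ROLES` only covers the four particles used in this article, with the role names used here, and `describe` is a hypothetical helper that naively looks at the last character of a bunsetsu. A real tokenizer is needed for anything serious, since plenty of words simply end in one of these characters.

```python
# Role labels follow the wording used in this article, not a formal tag set.
PARTICLE_ROLES = {
    "は": "topic",
    "が": "subject",
    "を": "direct object",
    "に": "time",
}


def describe(bunsetsu):
    """Split a bunsetsu into (content, role) by checking its last character.

    Returns (bunsetsu, None) when the last character is not a known particle.
    This is a naive assumption: it does not handle multi-character particles
    or words that happen to end in one of these characters.
    """
    particle = bunsetsu[-1]
    if particle in PARTICLE_ROLES:
        return bunsetsu[:-1], PARTICLE_ROLES[particle]
    return bunsetsu, None


for chunk in ["彼は", "8時に", "起きます"]:
    print(chunk, describe(chunk))
```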

Basic Japanese Example

I’m going to assume that you know some basic Japanese for this section.

Take the sentence:

「彼は8時に起きます」

He wakes up at 8 o’clock.

Let’s split it up into its individual bunsetsu.

「彼は」- Topic particle. The “who” is「彼」. The sentence pertains to「彼」.
「8時に」- Time particle. The “when” is「8時」. Whatever is happening is happening at that time.
「起きます」- We reached the end of the sentence. We see that the head is the verb “to wake up”. You can now offload the information that you collected.「彼」wakes up. Waking up is at「8時」.

We can see that each bunsetsu answers a question word, all dependencies go from left to right, and none of the dependencies cross.
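Written out as data, the walkthrough above looks something like the sketch below. The list-of-head-indices layout and the helper name `links` are assumptions for illustration; the bunsetsu split and the two links are the ones described above.

```python
bunsetsu = ["彼は", "8時に", "起きます"]
# heads[i] is the index of the bunsetsu that bunsetsu i depends on.
# The final verb is the head of the sentence, so it has no parent.
heads = [2, 2, None]


def links(bunsetsu, heads):
    """Return (dependent, head) pairs for every bunsetsu that has a parent."""
    return [(bunsetsu[i], bunsetsu[h]) for i, h in enumerate(heads) if h is not None]


for dependent, head in links(bunsetsu, heads):
    print(f"{dependent} → {head}")
```

Both arrows point to the right and end at「起きます」, which matches the three rules.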

Unique Feature #2: Sentence Qualifying Nouns

Japanese allows sentences to qualify nouns. Not only that, but sentences in Japanese must end in a verb or a copula (whether it’s actually there or just implied is another issue). Since we know that Japanese is a head-final, left-branching language, we can deduce that every dependency diagram ends in a verb/copula, and also that dependency diagrams are made out of smaller dependency diagrams.

Nested Sentence Japanese Example

Let’s try a more complicated example.

「彼は私が作ったケーキを食べた」

He ate the cake I made.

When encountering sentences nested within sentences, you treat each sentence individually before appending their individual dependency diagrams onto the overarching sentence. The only caveat is that the は particle attaches to the topic of the overall sentence and not a sub-sentence. A lot of beginners ask what the difference is between the particles は and が, since it’s usually explained that は is the topic marker and が is the subject marker. Dependency diagrams let you visualize this: は is referenced globally, while が is local. (There are other differences as well, but this is the biggest one.)

「彼は」- Topic particle. The “who” is「彼」. The main idea of the sentence involves「彼」.
「私が」- Subject particle. The “what” is「私」. Something in the immediate vicinity is going to relate to「私」somehow.
「作った」- This is the verb “to make”. Who is doing the making? The closest valid dependency is「私が」so we now know that「私」is doing the making. We also know that as a verb, we can create the sub-sentence「私が作った」.
「ケーキを」- The word “cake” with a direct object particle. We can attach the sub-sentence「私が作った」to describe it. “The cake that I made”. We offload our sentence qualifier onto「ケーキ」and can now treat this as an atomic node「私が作ったケーキ」.
「食べた」- The end of the sentence. The head is the verb “to eat”. Who is doing the eating? 「彼」is. What is he eating? The cake that I made.

If we bundle the sentence qualifier into its own isolated package, we get a dependency diagram that goes from this:

To this:

You can see how this now has the same structure as the easier example from before. As you get better at reading, this sentence would present the same amount of difficulty as the previous one.
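The bundling step can be sketched in code. This assumes the same head-index layout as a plain Python list, with the links taken from the walkthrough above; `collapse` is a hypothetical helper, not the program that generated the diagrams. It relies on the rules that links point rightward and never cross, so everything depending on a node sits in one unbroken run directly to its left.

```python
def collapse(bunsetsu, heads, root):
    """Merge `root` and everything that depends on it into a single chunk.

    Assumes links point rightward and do not cross, so the merged
    bunsetsu form one contiguous span that ends at `root`.
    """
    members = {root}
    changed = True
    while changed:  # keep adding anything whose parent is already a member
        changed = False
        for i, h in enumerate(heads):
            if h in members and i not in members:
                members.add(i)
                changed = True

    start = min(members)
    shift = root - start  # how many positions disappear after merging
    chunk = "".join(bunsetsu[start:root + 1])
    new_bunsetsu = bunsetsu[:start] + [chunk] + bunsetsu[root + 1:]

    # Every surviving link points past `root`, so it moves left by `shift`.
    kept = list(range(start)) + [root] + list(range(root + 1, len(heads)))
    new_heads = [None if heads[i] is None else heads[i] - shift for i in kept]
    return new_bunsetsu, new_heads


bunsetsu = ["彼は", "私が", "作った", "ケーキを", "食べた"]
heads = [4, 2, 3, 4, None]
print(collapse(bunsetsu, heads, root=3))
```

After collapsing around「ケーキを」, the head list becomes `[2, 2, None]`, the same shape as the three-bunsetsu example.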

Handling Ambiguity

「僕はあの赤い鞄を持っている人が知っています」

I know the person who is holding that red bag.

「僕は」- Topic particle. The “who” is「僕」. The main idea of the sentence involves「僕」.
「あの」- A pre-noun. Can be translated as the word “that”. Think about question words. Which thing? That thing. Whatever the sentence is talking about is far away from the speaker.
「赤い」- An adjective. Something is red. Which thing? The red thing.
「鞄を」- A noun with the direct object particle. Something is taking an action on the bag. We can tie up some loose ends here. It is that bag. The bag is red.
「持っている」- The verb “to hold”. What is being held? That red bag is being held. We know that this isn’t the end of the overall sentence, so we store this sub-sentence for later.
「人が」- Subject particle. The “what” is a person. We are now outside of the sub-sentence, so we know that it attaches to the overarching sentence. Something is happening to a person. Which person? The person holding that red bag「あの赤い鞄を持っている人」.
「知っています」- The end of the sentence. The head is the verb “to know”. Who is doing the knowing?「僕」is. Who do I know? The person holding that red bag.

We can see how, as we read left to right, we collect information and offload it bit by bit later down the line. But wait, couldn’t the「あの」be attached to the「人が」instead of the「鞄を」? The answer is yes. This is a form of structural ambiguity that Japanese has. Fortunately, although we said in the last article that Japanese requires more long-range detail attribution, we are still lazy humans in the end and like to attribute details to the nearest logical parent. In this case, we attach「あの」to the nearest possible parent, which is「鞄を」.
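The “nearest logical parent” habit is easy to write down as a rule of thumb. In this sketch the candidate parents are listed by hand (deciding which bunsetsu are valid parents is the hard part and is not modeled here), and `attach_nearest` is a made-up helper name.

```python
def attach_nearest(position, candidates):
    """Pick the closest candidate parent to the right of `position`.

    `candidates` are indices of bunsetsu that could grammatically be the
    parent. Returns None when no candidate lies to the right.
    """
    to_the_right = [c for c in candidates if c > position]
    return min(to_the_right) if to_the_right else None


bunsetsu = ["僕は", "あの", "赤い", "鞄を", "持っている", "人が", "知っています"]

# 「あの」 (index 1) modifies a noun, so the nouns to its right are the
# candidates: 「鞄を」 (index 3) and 「人が」 (index 5).
choice = attach_nearest(1, [3, 5])
print(bunsetsu[choice])
```

This is only a heuristic. It picks「鞄を」here, but context can always override it.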

How about dealing with early attribution? Someone could read this sentence and group「僕はあの赤い鞄を持っている」together before continuing to read the sentence, and then get confused when they see「人が知っています」. I would conjecture that dependency links are more ambiguous in Japanese due to their forward-facing nature, and a mistake like this is a side effect of being used to the English way of immediately linking each dependency to its nearest neighbor. It feels more natural to attribute the next word read backwards instead of taking its dependency forwards. There’s no shortcut except practicing reading sentences like this and learning to be more open to re-linking dependencies on the fly.

Conclusion

These diagrams are great when starting out, but they should only be used as a helper when you’re unfamiliar with how a sentence is structured. This is a nice under-the-hood look at how sentences are parsed, but once you get better at reading, you shouldn’t be thinking about the dependency structure at all. Most people who learn Japanese to a high level pick up these concepts intuitively, but I believe that putting them into words can potentially help beginners who are struggling with reading long sentences.

For those curious, here’s an example of the dependency diagram of a longer sentence. The sentence is taken from the book 「コンビニ人間」. These diagrams were generated from a program I made. More on that will be discussed in another blog post.

「店内に散らばっている無数の音たちから情報を拾いながら、私の身体は納品されたばかりのおにぎりを並べている」

Conjectures and Caveats

These are some observations that I have made, but I am not sure they are 100% true.

All grammar does is change the nuances of the descriptor and/or change the dependency path
It doesn’t help in producing a natural-sounding sentence, just a structurally correct one
Since dependency only goes from left to right, every branching path should technically be a valid sentence
To do this successfully, you need to know all the grammar in the sentence. It doesn’t let you understand grammar points that you don’t know. If you don’t know the grammar point, then you don’t know how to draw its dependency
On the flipside, just because you know the dependency doesn’t necessarily mean you know how to understand the sentence
This only works on fully-formed, standard sentences. Things like「体言止め」, Manual Keigo, and just people taking poetic license with their phrasing might break the dependency structure in odd ways. However, the most common way that it is broken is usually just an inversion of dependency and can be easily identified and corrected
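The conjecture above about branching paths can be poked at with a few lines of code. This sketch follows the links from a chosen bunsetsu up to the head and joins what it passes through. It reuses the cake sentence with the links from the earlier walkthrough; `path_to_head` is a hypothetical helper, and the code only prints the paths. Whether each one really is a valid sentence is still for a human to judge.

```python
def path_to_head(bunsetsu, heads, start):
    """Follow dependency links from `start` until the head of the sentence."""
    path = []
    i = start
    while i is not None:
        path.append(bunsetsu[i])
        i = heads[i]
    return path


bunsetsu = ["彼は", "私が", "作った", "ケーキを", "食べた"]
heads = [4, 2, 3, 4, None]

for start in range(len(bunsetsu)):
    print("".join(path_to_head(bunsetsu, heads, start)))
```

Starting from「彼は」gives「彼は食べた」, and starting from「私が」gives「私が作ったケーキを食べた」, both of which read as complete sentences on their own.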

Published Dec 30, 2020
