4. Linguistic data in morphosyntax

4.6. Becoming a linguist: Discussing data

In this section, we are going to discuss how to write about linguistic data. In morphology, our data is often words or lists of words. In syntax, our data is often phrases and sentences.

Two technical terms will be useful in this section. The object language is the language under discussion, or the language being analyzed. The metalanguage is the language that we are using in order to discuss the object language. In this section, I am using examples from Gonzalez (2023), which is a paper written in English about questions in Finnish and Turkish. The object languages of this paper are Finnish and Turkish, while the metalanguage is English.

Numbering data

When we include linguistic data in our papers, we usually separate it out from the main text or prose of our argument. Examples that are separated out this way are each numbered sequentially throughout the text. An example is shown below from the opening paragraph of Gonzalez (2023).

The strategy used for forming polar questions varies across languages. For instance, in languages like English, polar questions are formed using a raising intonation (marked as ↑) and subject-auxiliary inversion, as illustrated in (1). In contrast, in other languages, polar questions are formed using an interrogative particle. Japanese is an example of such a language. As shown in example (2), polar questions can be formed using the interrogative particle ka.

(1) Are you leaving ↑ ?
(2) Japanese (Uegaki 2018)
Hanako-ga hashitta-ka?
Hanako-NOM ran-KA
‘Did Hanako run?’

(Gonzalez 2023: 2)

Related examples may share the same number, with sub-examples labelled with letters.

(4) a. Finnish
Lähti-kö Mari?
left-PolQP Mari
‘Did Mary leave?’
(4) b. Turkish
Oya Dilara’yı mı öp-tü?
Oya Dilara-ACC PolQP kiss-PST
‘Did Oya kiss Dilara?’

(Gonzalez 2023: 3)

Definitions, formulas, and syntax trees are also sometimes numbered alongside the examples.

If you return to an example several pages after you first discuss it, it is good to repeat the example so your reader does not need to flip back and forth. Typically, authors give the repeated example a new number but also state the original number, as shown below.

Consider again example (5), repeated below as (23).
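The numbering conventions in this section can be sketched as a small Python helper. This is only an illustration, not a tool used by Gonzalez (2023) or by any particular style guide: the class name ExampleNumberer and its methods are hypothetical. It hands out sequential numbers, gives related examples one shared number with lettered parts, and produces the cross-reference sentence for a repeated example.

```python
from string import ascii_lowercase


class ExampleNumberer:
    """Hypothetical helper that hands out sequential example numbers."""

    def __init__(self):
        self.count = 0      # last number handed out
        self.labels = {}    # label -> number of the first occurrence

    def new(self, label):
        """Number a new example and remember it under `label`."""
        self.count += 1
        self.labels[label] = self.count
        return f"({self.count})"

    def sub(self, n_parts):
        """Number related examples: one shared number, lettered parts."""
        self.count += 1
        return [f"({self.count}) {ascii_lowercase[i]}." for i in range(n_parts)]

    def repeat(self, label):
        """Give a repeated example a new number and cite the original one."""
        original = self.labels[label]
        self.count += 1
        return (
            f"Consider again example ({original}), "
            f"repeated below as ({self.count})."
        )


numberer = ExampleNumberer()
print(numberer.new("english"))     # first example
print(numberer.sub(2))             # two related sub-examples
print(numberer.repeat("english"))  # cross-reference sentence
```

In practice, most word processors and typesetting systems handle this numbering for you; the sketch only makes the logic explicit.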

Embedded examples

We also sometimes embed examples directly in a sentence, instead of setting them apart and numbering them. This is generally done only with very short examples, such as one-word examples, or when discussing portions of a longer example. Embed an example only when it does not need to be glossed, either because it is simple enough or because it is glossed elsewhere close by in the text.

When we embed examples in a sentence, the word(s) in the object language are typically in italics. In fact, any time you mention a word (rather than use the word), it should be in italics. If the object language is different from the metalanguage, you should also include a translation of the word(s) in single quotes.

In the following example, pu:ch ‘ask’ and ja:n ‘know’, both Hindi-Urdu words, are formatted in this way. On the other hand, kya: is not translated, because this sentence is part of a larger passage about the details of the meaning and use of kya:.

For instance, example (15) shows that kya: can be embedded under the rogative predicate pu:ch ‘ask’, but cannot be embedded under the responsive predicate ja:n ‘know’.

(15) a. Hindi-Urdu (Bhatt & Dayal 2020)
Ṭi:car=ne Anu=se pu:ch-a: [ki kya: vo ca:i piyegi:].
teacher=ERG Anu=from ask-PFV that KYA s/he tea drink.FUT.3FSG
‘The teacher asked Anu whether she would drink tea.’
(15) b.
*Anu ja:n-ti: hai [ki kya: tum ca:i piyoge].
Anu.F know.HAB.F be.PRS.SG that KYA you tea drink.FUT.2MPL
(Int.) ‘Anu knows whether you will drink tea.’

(Gonzalez 2023: 8)
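The formatting rule for embedded examples can be sketched in a few lines of Python. The function name mention is hypothetical, and marking italics with asterisks is an assumption made here for plain-text output; in a word processor you would apply real italics instead.

```python
def mention(form, translation=None):
    """Format an object-language form that is mentioned in running text.

    Italics are marked with asterisks (an assumption for plain text).
    The translation, if one is given, follows in single quotes.
    """
    text = f"*{form}*"
    if translation is not None:
        text += f" ‘{translation}’"
    return text


# A translated form, as with the Hindi-Urdu predicates above.
print(mention("pu:ch", "ask"))
# An untranslated form, as with the particle under discussion.
print(mention("kya:"))
```

Leaving the translation optional mirrors the choice in the passage above, where the form under discussion is italicized but not translated.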

How to discuss data

Every time you include a linguistic example, you should introduce your example, present your example, and then discuss your example.

Not all examples necessarily need discussion before and after. If it’s a very simple example, it may be enough simply to explain it all before you present it. Use your judgment and think about what your reader needs to know to understand your example and how it fits into your argument, as well as when they need to know which information.

Introduce your example

Before you give an example, you should at least tell your reader that you’re about to show them an example, referring to the example by number. It can be very disorienting if you’re reading along and all of a sudden there’s an example with no context. Even better, though, is if you tell your reader why you’re about to show them an example and tell them what to look for. This is illustrated above with Gonzalez’s introduction to her examples (1) and (2).

Present your example

After you’ve introduced the example, present the example. The example should be numbered, as discussed above, and glossed, as we will discuss in Section 4.7.

If the example is at all complex, guide your reader to which parts of the example they should be looking at. You can bold or underline the relevant parts. For example, Gonzalez bolded the question markers in her example (4), shown above. Ideally, you should explicitly state what you’ve bolded and underlined when you introduce the example. Instead of bolding or underlining, you can also describe in the text where your readers should be looking, before or after you present the example.

Always clearly indicate the language that the data is from. If the entire paper is about the same language, you can do this just once, in the introduction. But if you’re discussing multiple languages, every single example should be labelled with the language it comes from. Some authors do this in the prose before the example, but it’s even clearer if it’s incorporated into the offset example itself. Two common ways of doing so are shown below in examples (1)-(2). In (1), the name of the language is right-aligned next to the example. In (2), the name of the language appears above the example. Whichever format you choose, you should be consistent throughout your paper.

(1) This is an example. [English]
(2) English
This is another example.
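The two labelling formats can be sketched as a single Python function. The name label_example, the style names "inline" and "above", and the 60-character line width are all assumptions made for illustration; nothing here is a standard tool.

```python
def label_example(number, text, language, style="inline", width=60):
    """Attach a language label to a numbered example.

    style="inline": the label is right-aligned on the example's line.
    style="above":  the label sits on the line above the example.
    """
    if style == "inline":
        left = f"({number}) {text}"
        tag = f"[{language}]"
        # Pad so the label ends at column `width`; keep at least one space.
        gap = max(1, width - len(left) - len(tag))
        return left + " " * gap + tag
    if style == "above":
        return f"({number}) {language}\n{text}"
    raise ValueError(f"unknown style: {style!r}")


print(label_example(1, "This is an example.", "English"))
print(label_example(2, "This is another example.", "English", style="above"))
```

Passing the style as a single argument makes it easy to set it once and reuse it, which is one way to stay consistent throughout a paper.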

Discuss your example

After you have presented your example, you should discuss it. Do not leave it to your reader to analyze the data — that’s your job as the author. Describe the pattern you see in the data and explain why the pattern matters for your argument. And be specific! Mention specific words or morphemes from your example and why they matter.

Attributing your data

Whenever you use data, you must attribute your source.

  • If you collected the data via introspection, you do not need to attribute a source. This is the only case in which no attribution is needed.
  • If you collected the data via elicitation, you should thank your speakers in your acknowledgments section. It is also a good idea to discuss your methodology at some point within your paper.
  • If your data comes from a corpus, the data collection methods should be explained at some point in your paper. If you use multiple corpora, each example should be marked with which corpus it came from.
  • If your data comes from another published source, you should provide an in-text citation next to or under the example that indicates its source. Since an example is a kind of direct quote, you should include the page number of the original source. If the source from which you got the example is not itself the original source, you should cite both the original source and the source you got it from, in the format “[original source], as cited in [source you got it from]”.
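The citation format for data from a published source can be sketched as follows. The function name attribution is hypothetical, and the "Author Year: page" shape simply follows the citations used in this section; your own style guide may differ.

```python
def attribution(source, page=None, original=None):
    """Build the in-text citation placed next to or under an example.

    source:   the work you took the example from, e.g. "Gonzalez 2023"
    page:     the page number in that work, if there is one
    original: the original source, if `source` is citing someone else
    """
    ref = source if page is None else f"{source}: {page}"
    if original is not None:
        ref = f"{original}, as cited in {ref}"
    return f"({ref})"


# An example taken directly from its original source.
print(attribution("Gonzalez 2023", 8))
# An example that the source itself took from an earlier work.
print(attribution("Gonzalez 2023", 2, original="Uegaki 2018"))
```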

References and further resources

Sources for examples

Gonzalez, Aurore. 2023. Interrogative particles in polar questions: The view from Finnish and Turkish. Glossa 8(1): 1–47.
