Writing Style: Good or Bad?

So, how is a paper supposed to be written? Is there a standard for good writing style? If so, where does it come from? And if not, what is the writing style of papers based on? Many more questions like these could be asked, sending the author into a never-ending loop. However, the most obvious question, and probably the first to ask, is "Who is the audience?"

Even with the audience in mind, the author may still fall into the trap of bad writing style through poor word choice or truncated or overly long sentences, to mention a few pitfalls, all of which undermine the author's message. As a result, the reader's attention varies throughout the paper depending on how easily the sentences can be understood. Since the audience sets the path for the writing style, the author must make sure to follow that path in order to keep the reader's attention high throughout the paper. For example, the background and generality of a paper are more important for a less technical audience, while technical detail matters more for a specialist one.

With this introduction in mind, this blog post analyzes the writing style of a specific paper, namely "wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations", written by A. Baevski et al. from Facebook AI [1].

The Title describes very well the approach that the authors introduce and apply to the task of interest later in the paper. Prior knowledge of speech recognition, especially of earlier publications from the same group, is a prerequisite for understanding the method named in the title.

Reading further, the Abstract briefly describes the problem, "learning ... from speech audio alone", which is addressed with a self-supervised learning approach. It then introduces the wav2vec 2.0 method and presents the main results obtained with it. The importance of the method is summarized in the final sentence of the abstract, "speech recognition with limited amount of data".

The problem in question is presented concisely in the Introduction, which starts with background on the importance of the method in the field. The authors then smoothly touch on the concept of self-supervision before turning to their latest approach. As the reader proceeds, the level of technical detail increases, so an understanding of the previous approaches [2, 3, 4] and of basic neural network concepts becomes a prerequisite. These concepts include, but are not limited to, multi-layer convolutional neural networks, feature extraction, self-supervised learning, self-attention, Transformers, and contrastive loss.

The Model and Training sections give an in-depth description of the network in short paragraphs, each describing a different layer. Here, once again, the reader realizes how important general knowledge of neural networks is. In contrast, the subsequent sections, Experimental Setup and Results, are packed with long paragraphs, each reporting very technical details, which makes them very difficult to follow. Consequently, the reader is forced to go through the text several times in order to extract the useful information.

Furthermore, the authors tabulate the results very clearly and in sufficient detail; without the tables, the results would be hard to grasp. A disadvantage of the tables is their dependence on the text of the article: reproducing the results is nearly impossible without knowing the details given in the text.

Finally, the Conclusion summarizes the model and the main findings and compares them with previously achieved results. The authors are also optimistic about future performance gains "by switching to a seq2seq architecture and a word piece vocabulary".

As for the References, as a personal preference and a minor comment, numerically ordered in-text citations improve readability and make it quicker to look up entries in the reference list. The paper therefore has the drawback of an alphabetically ordered reference list, which leaves the in-text citation numbers out of numerical order.

For the interested reader, the paper contains an appendix that gives more detail on the setup of the approach. The results there are once again clearly tabulated.

So, does the paper present a good writing style? All in all, the paper is well written and informative, with well-tabulated results. Evidently, its audience needs sufficient knowledge of speech recognition to follow the latest progress in the series. This includes, but is not limited to, masked Transformers, tokenizers, beam search, CTC, and n-gram language modelling. Although the paper is very concise, it contains several long paragraphs, which make the reading flow less dynamic. Ultimately, the paper is recommended for getting familiar with the wav2vec 2.0 concept, which is now a state-of-the-art model for speech recognition.

References

[1] A. Baevski, H. Zhou, A. Mohamed, and M. Auli. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations. 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada, 2020.

[2] A. Baevski, M. Auli, and A. Mohamed. Effectiveness of self-supervised pre-training for speech recognition. arXiv, abs/1911.03912, 2019.

[3] S. Schneider, A. Baevski, R. Collobert, and M. Auli. wav2vec: Unsupervised pre-training for speech recognition. In Proc. of Interspeech, 2019.

[4] A. Baevski, S. Schneider, and M. Auli. vq-wav2vec: Self-supervised learning of discrete speech representations. In Proc. of ICLR, 2020.


Comments

  1. Dear Mohsen,
    Thank you very much for your detailed review of the paper. The way you started your blog post was really interesting and inspired me for future assignments.
    I really liked the smooth flow of information in your blog. It helped me read your text easily and get a good overview of how the paper is written.
    Lastly, your conclusion was great: it was not repetitive and was really informative. You really know how to make and keep the reader interested.
    Great job.
    All the best,
    Kiavash

  2. Dear Mohsen,
    You wrote a very exciting introduction to your blog post. These questions about what makes a good paper got me thinking straight away. I will try to answer these questions for the papers I read as well.
    The wav2vec paper sounds very interesting and I look forward to reading more about it in your next posts. I really like that you not only summarise the paper but also write and justify your own opinion. Keep it up!
    Best, Pascal
