Questions for Self-monitoring: Encoder-Decoder Architecture and Attention

  • Why can't machine translation be treated as a sequence or segment labeling task?
  • What sets a seq2seq model apart from the other architectures discussed so far?
  • How is a seq2seq model trained?
  • How do seq2seq models perform inference? (see the sketch after this list)

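The following minimal sketch may help with the last two questions above: it contrasts teacher-forced training (the gold previous token is fed to the decoder at every step) with autoregressive greedy decoding at inference time (the decoder's own previous prediction is fed back). The toy vocabulary, the random weight matrix W, and the decoder_step stub are illustrative assumptions only and omit the encoder entirely; they are not taken from the lecture materials.

  import numpy as np

  VOCAB = ["<s>", "</s>", "a", "b", "c"]
  rng = np.random.default_rng(0)
  W = rng.normal(size=(len(VOCAB), len(VOCAB)))   # toy "decoder" parameters

  def decoder_step(prev_token_id):
      """Return a probability distribution over the next token (toy stub)."""
      logits = W[prev_token_id]
      probs = np.exp(logits - logits.max())
      return probs / probs.sum()

  def training_loss(target_ids):
      """Training with teacher forcing: the gold previous token is fed at each step."""
      loss = 0.0
      for prev, gold in zip(target_ids[:-1], target_ids[1:]):
          probs = decoder_step(prev)
          loss += -np.log(probs[gold])            # cross-entropy per position
      return loss / (len(target_ids) - 1)

  def greedy_decode(max_len=10):
      """Inference: the model's own previous prediction is fed back at each step."""
      out = [VOCAB.index("<s>")]
      for _ in range(max_len):
          probs = decoder_step(out[-1])
          out.append(int(np.argmax(probs)))
          if out[-1] == VOCAB.index("</s>"):
              break
      return [VOCAB[i] for i in out]

  print(training_loss([0, 2, 3, 1]))   # loss on the gold sequence <s> a b </s>
  print(greedy_decode())
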
  • What is the most serious drawback of the encoder-decoder architecture?
  • How can this drawback be overcome?
  • Why can't all the information from the encoder be passed on to the decoder?
  • What's the underlying idea of attention?
  • How does dot-product attention work? (see the sketch after this list)

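A minimal sketch of (scaled) dot-product attention, assuming NumPy arrays: each query is compared with every key, the scores are normalised with a softmax, and the output is the corresponding weighted average of the value vectors. The shapes, variable names, and the scaling factor are illustrative assumptions, not part of the lecture materials.

  import numpy as np

  def dot_product_attention(queries, keys, values):
      """queries: (n_q, d), keys: (n_k, d), values: (n_k, d_v)."""
      d = queries.shape[-1]
      # Similarity of each query with every key (scaled to keep the softmax stable).
      scores = queries @ keys.T / np.sqrt(d)          # (n_q, n_k)
      # Normalise the scores into attention weights that sum to 1 per query.
      weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
      weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
      # Each output is a weighted average of the value vectors.
      return weights @ values                          # (n_q, d_v)

  # Example: one decoder query attending over four encoder states.
  rng = np.random.default_rng(0)
  q = rng.normal(size=(1, 8))
  k = rng.normal(size=(4, 8))
  v = rng.normal(size=(4, 8))
  print(dot_product_attention(q, k, v).shape)  # (1, 8)
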
-- WolfgangMenzel - 06 Mar 2023
 