Questions for Self-monitoring: Encoder-Decoder Architecture and Attention
Why can't machine translation be treated as a sequence or segment labeling task?
What sets a seq2seq model apart from the other architectures discussed so far?
How is a seq2seq model trained?
How do seq2seq models perform inference? (A decoding sketch follows this list.)
What is the most serious drawback of the encoder-decoder architecture?
How can this drawback be overcome?
Why can't all the information from the encoder be passed on to the decoder?
What's the underlying idea of attention?
How does dot-product attention work? (An attention sketch follows this list.)
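As an aid for the inference question above: one common strategy is greedy autoregressive decoding, where the decoder repeatedly feeds its own most recent output back in as the next input. The sketch below assumes a hypothetical single-step decoder function `decoder_step` and illustrative token ids `bos_id` and `eos_id`; a real system might use beam search instead.

import numpy as np

def greedy_decode(decoder_step, init_state, bos_id, eos_id, max_len=50):
    """Greedy autoregressive decoding (a sketch, not a specific library API).

    decoder_step(token_id, state) -> (logits, new_state) is assumed to run
    the decoder for one step; init_state is the encoder's summary of the
    source sentence.
    """
    token, state, output = bos_id, init_state, []
    for _ in range(max_len):
        logits, state = decoder_step(token, state)  # scores for the next token
        token = int(np.argmax(logits))              # greedily pick the best token
        if token == eos_id:                         # stop at end-of-sequence
            break
        output.append(token)
    return output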
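And for the last question, a minimal sketch of dot-product attention for one decoder step, assuming (as in basic seq2seq attention) that the encoder hidden states serve as both keys and values; all names are illustrative, and Transformer-style variants additionally scale the scores by 1/sqrt(d).

import numpy as np

def dot_product_attention(query, encoder_states):
    """Compute the context vector for one decoder step.

    query: current decoder hidden state, shape (d,)
    encoder_states: all encoder hidden states, shape (src_len, d)
    """
    scores = encoder_states @ query          # one dot-product score per source position
    weights = np.exp(scores - scores.max())  # numerically stable softmax turns
    weights /= weights.sum()                 # scores into a probability distribution
    return weights @ encoder_states          # weighted average of encoder states

# Toy example: a 3-word source sentence with 4-dimensional hidden states.
rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))   # encoder hidden states
s = rng.normal(size=(4,))     # current decoder state
print(dot_product_attention(s, H))  # context vector, shape (4,)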
-- WolfgangMenzel - 06 Mar 2023