Questions for self-monitoring

  • What are the similarities and differences between a binary logistic classifier and a neural network composed of a single unit with a sigmoid activation function?
  • What are the differences between the different kinds of activation functions? What are their advantages and drawbacks?
  • What's the difference between a multinomial regression classifier and a single-layer neural net?
  • What additional benefit does adding a hidden layer to a neural net provide?
  • What's the purpose of 'pooling' and how can it be performed?
  • How should the parameters of a neural network be initialized? Is there a difference compared to logistic regression?
  • How can the gradient of the loss function be computed for the last layer of the network? How can this be done for the other layers?
  • What's the mathematical foundation of backpropagation?
  • How does a computation graph work?
  • What is overfitting and how can it be avoided?
  • What are the options for handling pretrained embeddings in a neural language model?

-- WolfgangMenzel - 20 Feb 2023
 