Feature-wise transformations
A simple and surprisingly effective family of conditioning mechanisms.

Many real-world problems require integrating multiple sources of information.
Other times, these problems involve multiple sources of the same kind of input. Finding an effective way to condition on, or fuse, sources of information is an open research problem, and in this article we concentrate on a specific family of approaches we call feature-wise transformations. As a running example, suppose we could train a separate generative network for each class of images we want to produce. Now let's imagine that, in addition to the various classes, we also need to model attributes like size or color. In this case, we can't reasonably expect to train a separate network for each possible conditioning combination! Let's examine a few simple options. A quick fix would be to concatenate a representation of the conditioning information to the noise vector and treat the result as the model's input.
This solution is quite parameter-efficient: we only need to increase the size of the first layer's weight matrix. However, it makes the implicit assumption that the input is where the model should incorporate the conditioning information. Because concatenation is cheap, we might as well avoid making any such assumption and concatenate the conditioning representation to the input of all layers in the network. Let's call this approach concatenation-based conditioning. The same argument applies to convolutional networks, provided we ignore the border effects due to zero-padding.
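To make concatenation-based conditioning concrete, here is a minimal NumPy sketch (function and variable names are illustrative, not from the article): the conditioning representation is appended to the features entering a fully connected layer, so only that layer's weight matrix grows to absorb the extra inputs.

```python
import numpy as np

def concat_condition(h, c):
    # Append the conditioning representation c to the features h
    # along the feature axis, before the next layer's weights apply.
    return np.concatenate([h, c], axis=-1)

def dense(x, W, b):
    # A plain fully connected layer with a ReLU nonlinearity.
    return np.maximum(0.0, x @ W + b)

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 8))    # batch of 4 feature vectors
c = rng.normal(size=(4, 3))    # conditioning representation per example
W = rng.normal(size=(11, 5))   # 8 feature + 3 conditioning inputs -> 5 outputs
b = np.zeros(5)

out = dense(concat_condition(h, c), W, b)
print(out.shape)  # (4, 5)
```

Applying the same trick at every layer simply repeats this concatenation before each weight matrix, at the cost of slightly wider weight matrices throughout the network.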
Intuitively, this gating allows the conditioning information to select which features are passed forward and which are zeroed out. Given that both additive and multiplicative interactions seem natural and intuitive, which approach should we pick? One argument in favor of multiplicative interactions is that they capture how well two representations match; this property is why dot products are often used to determine how similar two vectors are. In the spirit of making as few assumptions about the problem as possible, we may as well combine both into a conditional affine transformation. Lastly, these transformations enforce only a limited inductive bias and remain domain-agnostic. This quality can be a downside, as some problems may be easier to solve with a stronger inductive bias.
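A conditional affine transformation combines both interactions: a multiplicative scaling and an additive shift applied feature-wise. The NumPy sketch below (names are illustrative) shows how a zero scaling coefficient recovers the gating behavior described above, while the shift contributes the additive term.

```python
import numpy as np

def affine_condition(h, gamma, beta):
    # Scale (multiplicative) and shift (additive) each feature.
    # gamma and beta would be predicted from the conditioning input.
    return gamma * h + beta

h = np.array([[1.0, -2.0, 3.0]])
gamma = np.array([0.0, 1.0, 2.0])  # gamma = 0 gates a feature off entirely
beta = np.array([0.5, 0.0, -1.0])

out = affine_condition(h, gamma, beta)
# gamma * h + beta = [[0.5, -2.0, 5.0]]
```

Setting all gammas to one recovers purely additive conditioning, and setting all betas to zero recovers purely multiplicative gating, so the affine form subsumes both.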
However, it is this very characteristic that enables these transformations to be so widely effective across problem domains, as we will review later. Without loss of generality, let's focus on feature-wise affine transformations, and let's adopt the nomenclature of Perez et al., who coined the term FiLM, for feature-wise linear modulation. We say that a neural network is modulated using FiLM, or FiLM-ed, after inserting FiLM layers into its architecture. These layers are parameterized by a separate FiLM generator: the FiLM generator predicts the parameters of the FiLM layers based on some auxiliary input. Below are a few notable examples of feature-wise transformations in the literature, grouped by application domain. In visual question answering, one influential model uses FiLM to condition a pre-trained network: its linguistic pipeline modulates the visual pipeline via conditional batch normalization, which can be viewed as a special case of FiLM.
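The FiLM generator / FiLM layer split can be sketched in a few lines of NumPy. This is a minimal illustration under assumed shapes, with a hypothetical linear generator: the auxiliary input is mapped to per-channel (gamma, beta) pairs, and each pair modulates every spatial position of its channel identically, which is the defining property of FiLM in convolutional networks.

```python
import numpy as np

def film_generator(z, Wg, Wb):
    # Hypothetical linear FiLM generator: maps the auxiliary input z
    # to per-channel scaling (gamma) and shifting (beta) parameters.
    return z @ Wg, z @ Wb

def film_layer(feature_map, gamma, beta):
    # Modulate every spatial position of each channel with the same
    # per-channel (gamma, beta) pair, broadcasting over height and width.
    return gamma[:, None, None, :] * feature_map + beta[:, None, None, :]

rng = np.random.default_rng(1)
z = rng.normal(size=(2, 6))                    # auxiliary input, e.g. an embedding
Wg = rng.normal(size=(6, 16))
Wb = rng.normal(size=(6, 16))
feature_map = rng.normal(size=(2, 8, 8, 16))   # NHWC conv activations

gamma, beta = film_generator(z, Wg, Wb)
out = film_layer(feature_map, gamma, beta)
print(out.shape)  # (2, 8, 8, 16)
```

Note that with gamma fixed to one and beta to zero the layer is the identity, so inserting FiLM layers does not by itself restrict what the FiLM-ed network can compute.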
So far, the models we have covered contain two sub-networks: a primary network in which feature-wise transformations are applied, and a secondary network which outputs the parameters of these transformations. However, this distinction between the FiLM-ed network and the FiLM generator is not strictly necessary. As discussed in previous subsections, there is nothing preventing us from considering a neural network's own activations as conditioning information; this idea gives rise to self-conditioned models. The conditional variant of DCGAN, a well-recognized generative adversarial network architecture, uses concatenation-based conditioning. The simplest form of conditioning in PixelCNN adds feature-wise biases to all convolutional layer outputs. Global conditioning applies the same conditional bias to the whole generated sequence and is used, for example, to condition on a property that holds for the entire sequence, such as speaker identity.
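Global conditioning via feature-wise biases can be sketched as follows (a minimal NumPy illustration with assumed names and shapes, not taken from any particular implementation): a single conditioning vector is projected to a per-channel bias that is added identically at every timestep of the sequence.

```python
import numpy as np

def global_bias_condition(seq_features, c, V):
    # Project the global conditioning vector c to a per-channel bias
    # and add the same bias to the features at every timestep.
    bias = c @ V                       # shape: (batch, channels)
    return seq_features + bias[:, None, :]

rng = np.random.default_rng(2)
seq = rng.normal(size=(1, 10, 4))      # (batch, time, channels) activations
c = rng.normal(size=(1, 3))            # one conditioning vector per sequence
V = rng.normal(size=(3, 4))            # learned projection (random here)

out = global_bias_condition(seq, c, V)
# Every timestep receives the identical feature-wise shift.
```

Local conditioning, by contrast, would supply a different bias per timestep, i.e. `bias` would carry a time axis instead of being broadcast across one.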