Weather models (or, as they are more commonly called in the field, atmospheric models) are computer programs that read in input data (initial conditions) and solve partial differential equations to produce a future state of the atmosphere. @JonEricson provides a good, if anecdotal, summary of what models do; here I describe the exact steps it takes for an atmospheric model to produce a forecast. Many people believe that weather forecasters sit in front of a map and brainstorm where the clouds will go. This answer aims to provide an easy-to-understand but thorough explanation of how atmosphere and ocean prediction models actually work, and it generally applies to ocean circulation and climate models as well.

The evolution of the atmosphere may be described by a system of partial differential equations (PDEs). Most commonly these are the primitive equations, which consist of the momentum equations (solving for velocity $\mathbf{v}$ or momentum $\rho \mathbf{v}$), the continuity (mass conservation) equation, and the thermodynamic energy and moisture equations (solving for temperature $T$ and specific humidity $q$). The continuity equation is necessary for closure with the momentum equations. These equations may be approximated in many ways, yielding a reduced and/or simplified set of equations; common approximations include the hydrostatic, Boussinesq, and anelastic approximations. In the most complete form of the primitive equations for the atmosphere, the prognostic state variables are $u$, $v$, $w$, $p$, $T$, and $q$. An idealized atmosphere could also be simulated with just the momentum and continuity equations (no thermodynamics), the shallow water equations, or the absolute vorticity equation alone. For an example of the latter, see the pioneering paper by Charney, Fjørtoft and von Neumann (1950), who numerically predicted the 500 mb vorticity by integrating the absolute vorticity equation in time. Because their model was barotropic, it could not produce cyclogenesis. Nevertheless, they achieved the first successful numerical weather prediction in history, and their model ran on the first general-purpose electronic computer, ENIAC.
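
As a concrete example of such a reduced set, the single-layer rotating shallow water equations (free-surface height $h$, velocities $u, v$, Coriolis parameter $f$, gravitational acceleration $g$) read:

$$\dfrac{\partial u}{\partial t} + u\dfrac{\partial u}{\partial x} + v\dfrac{\partial u}{\partial y} - fv = -g\dfrac{\partial h}{\partial x}$$
$$\dfrac{\partial v}{\partial t} + u\dfrac{\partial v}{\partial x} + v\dfrac{\partial v}{\partial y} + fu = -g\dfrac{\partial h}{\partial y}$$
$$\dfrac{\partial h}{\partial t} + \dfrac{\partial (hu)}{\partial x} + \dfrac{\partial (hv)}{\partial y} = 0$$

They retain rotation and non-linear advection but have no thermodynamics, which makes them a popular testbed for numerical schemes.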

Now take the momentum equation, in flux form, as an example:

$$\dfrac{\partial \rho \mathbf{v}}{\partial t} + \nabla \cdot (\rho \mathbf{v} \mathbf{v}) + 2\mathbf{\Omega} \times \rho \mathbf{v} = -\nabla p + \nu \nabla^{2}(\rho \mathbf{v}) + \Phi $$

From left to right, we have the time tendency of momentum, advection, the Coriolis force, the pressure gradient force, viscous dissipation, and finally any external forcing or subgrid tendency. Unfortunately, the advection term $\nabla \cdot (\rho\mathbf{v}\mathbf{v})$ is non-linear, and it is because of this term that no general analytical solution to this equation is known. This term is also the reason the atmosphere and other fluids are chaotic in nature: small errors in $\mathbf{v}$ grow quickly because the velocity is multiplied by itself in this term. If the equations are linearized so that the non-linear advection term vanishes, analytical solutions can be found; for example, Rossby, Kelvin, and Poincaré waves are all analytical solutions of particular reduced, linearized sets of momentum or vorticity equations. But accurate forecasts require keeping the non-linear advection term, so we solve the equations numerically.
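
A small numerical experiment illustrates the difference. The sketch below (purely illustrative: the grid size, amplitudes, and first-order upwind scheme are my own choices, not from any real model) advects a wave with a linear term and with a non-linear Burgers-type term, and tracks how a tiny initial perturbation evolves in each case:

```python
import numpy as np

N = 200
x = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
dx = x[1] - x[0]
u0 = 1.5 + 0.5 * np.sin(x)        # base state (kept positive for upwind)
eps = 1e-3 * np.sin(3.0 * x)      # small initial perturbation

def advect(u, nonlinear, dt, nsteps):
    """First-order upwind advection; the speed is u itself if nonlinear."""
    u = u.copy()
    for _ in range(nsteps):
        c = u if nonlinear else 1.5   # constant speed in the linear case
        u = u - c * dt / dx * (u - np.roll(u, 1))
    return u

dt = 0.4 * dx / 2.0               # CFL-limited time step
nsteps = int(1.5 / dt)            # stop before the wave steepens into a shock

for nonlinear in (False, True):
    a = advect(u0, nonlinear, dt, nsteps)
    b = advect(u0 + eps, nonlinear, dt, nsteps)
    print(f"nonlinear={nonlinear}: max perturbation = {np.max(np.abs(a - b)):.2e}")
```

In the linear case the perturbation simply advects along (and is slightly damped by the scheme), while in the non-linear case it is amplified as the wave steepens.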

How do we solve these PDEs? Processors cannot take derivatives - they know how to *add* and *multiply* numbers, and all other operations are derived from these two. So we need to approximate the partial derivatives using basic arithmetic. The domain of interest (say, the globe) is discretized on a grid, and each grid cell holds a value of each state variable. For example, take the pressure gradient term in the x-direction:

$$\nabla_{x}p = \dfrac{\partial p}{\partial x} \approx \dfrac{\Delta p}{\Delta x} = \dfrac{p_{i+1,j}-p_{i-1,j}}{2\Delta x_{i,j}}
$$

where $i,j$ are the grid indices in $x,y$. This example uses finite differences, centered in space. There are many other ways to discretize partial derivatives, and the ones used in modern models are typically much more sophisticated than this example. If the grid spacing is not uniform, finite volume methods are typically used so that the predicted quantity is conserved. Finite element methods are more common for computational fluid dynamics problems on unstructured meshes in engineering, but can be used for atmosphere and ocean solvers as well. Spectral methods are used in some global models, such as the GFS and ECMWF's IFS.
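
To see how well this centered difference works, here is a small sketch (the test field $p(x)=\sin x$ and grid sizes are illustrative choices) that compares it against the exact derivative:

```python
import numpy as np

def centered_err(N):
    """Max error of the centered difference dp/dx on an N-point periodic grid."""
    x = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
    dx = x[1] - x[0]
    p = np.sin(x)
    # (p[i+1] - p[i-1]) / (2*dx), with periodic wrap-around via np.roll
    dpdx = (np.roll(p, -1) - np.roll(p, 1)) / (2.0 * dx)
    return float(np.max(np.abs(dpdx - np.cos(x))))

# Halving the grid spacing cuts the error by about 4: the scheme is second order.
print(centered_err(100), centered_err(200))
```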

Processes unresolved at the grid scale (the term $\Phi$) are implemented in the form of parameterization schemes. Parameterized processes may include turbulence and boundary layer mixing, cumulus convection, cloud microphysics, radiation, soil physics, atmospheric chemistry, etc. Parameterization schemes are still a hot research topic and keep improving. There are many different schemes for each of the physical processes listed above, and some work better than others in different meteorological scenarios.
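
As a minimal sketch of how a parameterization supplies a grid-scale tendency, here is an invented, purely illustrative Newtonian-cooling scheme that relaxes temperature toward an equilibrium profile; real radiation schemes are vastly more elaborate, but they feed into the model in the same way, as a tendency added to the resolved equations:

```python
import numpy as np

def newtonian_cooling(T, T_eq, tau):
    """Toy parameterized tendency dT/dt: relax T toward T_eq on time scale tau."""
    return -(T - T_eq) / tau

T = np.array([250.0, 260.0, 270.0])     # temperatures of three grid cells, K
T_eq = np.array([255.0, 255.0, 255.0])  # equilibrium profile, K
dt = 600.0                              # 10-minute model time step, s
tau = 5.0 * 86400.0                     # 5-day relaxation time scale, s

# Add the parameterized tendency to the state, as a model would each step
T = T + dt * newtonian_cooling(T, T_eq, tau)
```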

Once all the terms in all the equations have been discretized on paper, the discrete equations are written in the form of computer code. Most atmosphere, ocean circulation, and ocean wave models are written in Fortran. This is mostly for historical reasons: with its long history, Fortran has very mature compilers and highly optimized linear algebra libraries. Nowadays, with efficient C, C++, and Fortran compilers all available, it is more a matter of preference; still, Fortran remains the most prevalent language in atmosphere and ocean modeling, even in recently started projects. Finally, an example code line for the above pressure gradient term would look like this:

```fortran
dpdx(i,j,k) = 0.5*(p(i+1,j,k)-p(i-1,j,k))/dx(i,j)
```

The whole code is compiled into machine code that is then run on the processor(s). The model program is typically not user-friendly with a fancy graphical interface; it is most commonly run from a terminal on high-performance multiprocessor clusters.

Once started, the program steps discretely through time into the future. The computed values of the state variables at every grid point are stored in an output file, typically every hour of simulation time. The output files can then be read by visualization and graphics software to produce the familiar images of the model forecast. These serve as guidance for forecasters, who turn them into a meaningful and reasonable forecast.
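
This step-and-save cycle can be sketched in miniature (all sizes, the advection speed, and the number of steps per "hour" are illustrative choices; a real model would write its state to NetCDF or GRIB files rather than a Python list):

```python
import numpy as np

N = 360
x = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
dx = x[1] - x[0]
u = np.exp(-((x - np.pi) ** 2) / 0.1)  # initial condition: a Gaussian "blob"
c = 1.0                                # constant advection speed
dt = 0.5 * dx / c                      # CFL-limited time step
steps_per_hour = 30                    # pretend 30 steps make one "hour"

outputs = []                           # stands in for hourly output files
for hour in range(6):
    for _ in range(steps_per_hour):
        u = u - c * dt / dx * (u - np.roll(u, 1))  # first-order upwind step
    outputs.append(u.copy())           # "write" this hour's state

print(len(outputs), "hourly states saved; blob peak now at index",
      int(np.argmax(outputs[-1])))
```

Each saved state is a snapshot of the forecast at that simulation hour; plotting the snapshots shows the blob advecting along the domain.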

You might also mention in your list of improvement reasons: better parameterization schemes for microphysics, radiation, land surface interaction, etc; tighter grids allowing explicit convection rather than parameterized convection; data assimilation. – casey – 2014-04-16T00:54:44.930

@casey: I should perhaps mention that my firsthand knowledge of forecast models is over 15 (and closer to 20) years old. If you'd be willing to suggest an edit, I'd be happy to approve it. :-) – Jon Ericson – 2014-04-16T01:09:22.657