Use the date slider to select viruses sampled within the indicated time interval. To change the size of the interval, grab the left end of the bar with the mouse; to move the interval, drag the right end of the slider.
Use the drop-down menu to color viruses by the number of epitope mutations, non-epitope mutations or receptor binding mutations relative to the root, or by local branching index or geographic region.
Use the input box to specify positions to color viruses by genotype. Amino acid positions must be separated by commas (e.g. 159,225). Positions refer to HA1 by default; to color by amino acid sequence in other regions, use a prefix such as HA2:18 or SigPep:6. To color by nucleotide sequence, use nuc:527.
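The input format described above can be illustrated with a small parser. This is only a sketch of how such a string might be interpreted, not the site's actual code: the function name parse_positions and the DEFAULT_REGION constant are hypothetical, and the only rules assumed are those stated in the text (comma-separated entries, an optional region prefix followed by a colon, HA1 when no prefix is given).

```python
from typing import List, Tuple

# Hypothetical constant: the help text says positions default to HA1.
DEFAULT_REGION = "HA1"


def parse_positions(spec: str) -> List[Tuple[str, int]]:
    """Turn a string such as '159,225' or 'HA2:18,nuc:527' into (region, position) pairs."""
    result = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray or trailing commas
        if ":" in item:
            # Entry with an explicit region prefix, e.g. 'HA2:18' or 'nuc:527'.
            region, _, position = item.partition(":")
            region = region.strip()
        else:
            # Bare number: fall back to the default region.
            region, position = DEFAULT_REGION, item
        result.append((region, int(position)))
    return result


print(parse_positions("159,225"))
print(parse_positions("HA2:18, SigPep:6, nuc:527"))
```

In this sketch a non-numeric position raises a ValueError from int(), which a real input box would presumably catch and report to the user.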
Mouse over a tip to show virus name, location and features.
Mouse over a branch to graph the frequency trajectory of the corresponding clade below, or click on a branch to zoom into its descendant clade. The tooltip will show amino acid mutations on this branch.
To restrict the displayed viruses to certain geographic regions, select the region in the drop-down menu labeled region.
Epitope sites are defined based on HA structure and exposed residues. Multiple recent mutations at epitope sites have been suggested to be predictive of strains dominating future seasons. Similarly, mutations outside of these epitopes (at what are termed non-epitope sites) tend to be damaging and are suggested to be predictive of clade contraction.
Antigenic evolution has been shown to depend primarily on substitutions surrounding the receptor binding site of HA1. These seven positions (145, 155, 156, 158, 159, 189, 193 in HA1 numbering) are referred to here as receptor binding positions and changes at these positions could correspond to large changes in antigenic properties.
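As an illustration of how a receptor binding mutation count relative to the root might be computed, here is a minimal sketch. It assumes that both sequences are aligned HA1 amino acid strings of equal length in which string index 0 corresponds to HA1 position 1; the function name and this data layout are assumptions, and only the seven positions come from the text.

```python
# The seven receptor binding positions listed above (HA1 numbering, 1-based).
RECEPTOR_BINDING_POSITIONS = (145, 155, 156, 158, 159, 189, 193)


def receptor_binding_mutations(root_ha1: str, virus_ha1: str) -> int:
    """Count receptor binding positions at which the virus differs from the root.

    Assumes aligned HA1 amino acid sequences where index 0 is HA1 position 1.
    """
    return sum(
        1
        for pos in RECEPTOR_BINDING_POSITIONS
        if root_ha1[pos - 1] != virus_ha1[pos - 1]
    )


# Toy example with an artificial 329-residue sequence.
root = "A" * 329
virus = list(root)
virus[145 - 1] = "K"  # change at receptor binding position 145
virus[159 - 1] = "Y"  # change at receptor binding position 159
virus[100 - 1] = "T"  # change elsewhere; not counted
print(receptor_binding_mutations(root, "".join(virus)))
```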
The local branching index (LBI) is the exponentially weighted tree length surrounding a node; high values are associated with rapid branching and expansion of clades. A more detailed explanation is available here. Retrospective analysis has shown that LBI correlates with clade growth.
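To make "exponentially weighted tree length" concrete, the sketch below integrates exp(-d / tau) over the tree, where d is the distance along the tree from the node of interest, using one pass from the tips to the root and one from the root to the tips. The Node class, the function names and the time scale parameter tau are assumptions made for illustration; this is not necessarily how the values shown on the site are computed.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    branch_length: float = 0.0  # length of the branch leading to this node
    children: List["Node"] = field(default_factory=list)
    up: float = 0.0    # weighted tree length below, seen from the parent
    down: float = 0.0  # weighted tree length in the rest of the tree, seen from this node
    lbi: float = 0.0


def _through_branch(incoming: float, branch_length: float, tau: float) -> float:
    """Carry a weighted length across a branch and add the branch's own contribution."""
    decay = math.exp(-branch_length / tau)
    return incoming * decay + tau * (1.0 - decay)


def compute_lbi(root: Node, tau: float) -> None:
    """Set node.lbi for every node (recursive, so intended for small trees)."""

    def upward(node: Node) -> None:
        # Children first: message each subtree sends to its parent.
        for child in node.children:
            upward(child)
        node.up = _through_branch(
            sum(c.up for c in node.children), node.branch_length, tau
        )

    def downward(node: Node) -> None:
        # Parents first: LBI is everything arriving at the node from all directions.
        total_up = sum(c.up for c in node.children)
        node.lbi = node.down + total_up
        for child in node.children:
            rest_of_tree = node.down + total_up - child.up
            child.down = _through_branch(rest_of_tree, child.branch_length, tau)
            downward(child)

    upward(root)
    root.down = 0.0  # nothing lies above the root
    downward(root)


# Toy tree: a root with two tips, each on a branch of length 1.
tip_a, tip_b = Node(1.0), Node(1.0)
tree = Node(0.0, [tip_a, tip_b])
compute_lbi(tree, tau=1.0)
print(tree.lbi, tip_a.lbi)
```

In this toy tree the root sees two branches of length 1, giving 2 * (1 - exp(-1)), while each tip sees a single path of length 2, giving 1 - exp(-2). Nodes in densely branching parts of a tree have more nearby branch length and therefore higher values.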
Frequencies are estimated as maximum likelihood trajectories that penalize rapid changes in frequency and slope. The frequencies of large clades or abundant genotypes have sufficiently many observations to be robust, while frequencies of rare mutations can't be reliably estimated.
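One way to read "maximum likelihood trajectories that penalize rapid changes in frequency and slope" is as an objective that combines a sampling likelihood with smoothness penalties. The sketch below assumes a binomial likelihood per time bin, a squared first-difference penalty (changes in frequency) and a squared second-difference penalty (changes in slope). The functional form, the function name and the weights frequency_penalty and slope_penalty are all assumptions for illustration, not the method used on the site.

```python
import math
from typing import Sequence


def penalized_log_likelihood(
    freqs: Sequence[float],
    counts: Sequence[int],
    totals: Sequence[int],
    frequency_penalty: float = 10.0,  # hypothetical weight on changes in frequency
    slope_penalty: float = 10.0,      # hypothetical weight on changes in slope
) -> float:
    """Score a candidate frequency trajectory; higher is better.

    freqs[i]  : candidate frequency in time bin i
    counts[i] : viruses in bin i that carry the clade or genotype
    totals[i] : all viruses sampled in bin i
    """
    eps = 1e-9  # keep the logarithms finite at frequencies of 0 or 1
    log_lik = 0.0
    for x, k, n in zip(freqs, counts, totals):
        x = min(max(x, eps), 1.0 - eps)
        log_lik += k * math.log(x) + (n - k) * math.log(1.0 - x)

    # Squared first differences penalize rapid changes in frequency.
    d1 = sum((b - a) ** 2 for a, b in zip(freqs, freqs[1:]))
    # Squared second differences penalize rapid changes in slope.
    d2 = sum(
        (c - 2.0 * b + a) ** 2 for a, b, c in zip(freqs, freqs[1:], freqs[2:])
    )
    return log_lik - frequency_penalty * d1 - slope_penalty * d2


# With no observations, only the penalties matter: a flat trajectory beats a jagged one.
no_data = [0, 0, 0, 0]
flat = penalized_log_likelihood([0.5, 0.5, 0.5, 0.5], no_data, no_data)
jagged = penalized_log_likelihood([0.1, 0.9, 0.1, 0.9], no_data, no_data)
print(flat, jagged)
```

A numerical optimizer would then search for the trajectory that maximizes this score. When a bin contains few observations the likelihood term is weak and the penalties dominate, which is consistent with the remark above that frequencies of rare mutations cannot be reliably estimated.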
Built with love by Trevor Bedford and Richard Neher. This work is made possible by the GISAID Initiative and the open sharing of genetic data by influenza research groups from all over the world. We gratefully acknowledge their contributions. Give us a shout at @trvrb or @richardneher with questions or comments. All source code is freely available under the terms of the GNU Affero General Public License. A detailed description of methods is also available. Data updated and processed with commit .