nextflu
Real-time tracking of seasonal influenza virus evolution in humans
HA phylogeny
Frequencies
Enter mutations as position plus amino acid (e.g. 159Y) and clades by clade name (e.g. 3c2.a). To restrict to a region, append /AS, /NA, /EU or /OC (e.g. 159Y/AS). By default, positions are interpreted as residues in HA1. To specify a different subunit, prefix it as in HA2:18V. Alternatively, simply click on variable positions in the graph below.
Feature explanation

Phylogenetic tree

Use the date slider to select viruses sampled within the time interval indicated. To change the size of the interval, grab the left end of the bar with the mouse. To move the interval, use the right end of the slider.

Use the drop down menu to color viruses by number of epitope mutations, non-epitope mutations or receptor binding mutations relative to root, or to color viruses by local branching index or geographic region.

Use the input box to specify positions to color viruses by genotype. Amino acid positions must be separated by commas (e.g. 159,225). Positions are interpreted as HA1 by default. To color by amino acid sequence in other regions, use HA2:18 or SigPep:6. To color by nucleotide sequence, use nuc:527.
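
The position syntax above can be illustrated with a short parser. This is a minimal sketch in Python, not the site's actual code: the function name `parse_positions` and the choice to return (region, position) pairs are assumptions made for illustration.

```python
from typing import List, Tuple


def parse_positions(spec: str, default_region: str = "HA1") -> List[Tuple[str, int]]:
    """Parse input such as '159,225' or 'HA2:18' into (region, position) pairs.

    Hypothetical helper: positions without a prefix fall back to HA1,
    as described in the text above.
    """
    result = []
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate stray commas
        if ":" in token:
            region, position = token.split(":", 1)
            region = region.strip()
        else:
            region, position = default_region, token
        result.append((region, int(position)))
    return result
```

For example, `parse_positions("159, HA2:18, nuc:527")` yields one pair each for HA1 position 159, HA2 position 18 and nucleotide 527.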

Mouse over a tip to show virus name, location and features.

Mouse over a branch to graph the frequency trajectory of the corresponding clade below, or click on a branch to zoom into its descendant clade. The tooltip will show amino acid mutations on this branch.

To restrict the displayed viruses to certain geographic regions, select the region in the drop down menu labeled region.

Frequencies

Enter a mutation or genotype above (e.g. 225D) and click plot frequencies to show the estimated frequency of this mutation through time. A geographic region can be specified by appending AS (Asia), NA (North America), EU (Europe), or OC (Oceania), as in 159S/225D/AS. Several genotypes can be entered at once, separated by commas. For example, 225D, 159S/225D/AS will graph the global frequency of 225D and the frequency of strains containing both 159S and 225D in Asia. Instead of a genotype, the common clades 3c3, 3c3.a, 3c2, 3c2.a can be used. Positions with very little variation are omitted. Beware that region-specific frequencies are noisy. Click here for help with the interface.
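
To make the query syntax concrete, here is a rough Python sketch of how such input could be split into clade, mutations and region. It is an illustration under assumptions, not the parser used by the site: the names `Query` and `parse_query` are hypothetical, and the clade list is limited to the four clades named above.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

REGIONS = {"AS", "NA", "EU", "OC"}
CLADES = {"3c3", "3c3.a", "3c2", "3c2.a"}  # only the clades named in the text
# Optional subunit prefix, then position, then one amino acid letter, e.g. HA2:18V
MUTATION = re.compile(r"^(?:(\w+):)?(\d+)([A-Za-z])$")


class Query(NamedTuple):
    clade: Optional[str]
    mutations: List[Tuple[str, int, str]]  # (subunit, position, amino acid)
    region: str  # "global" when no region suffix is given


def parse_query(text: str) -> List[Query]:
    """Split comma-separated genotypes into structured queries (hypothetical helper)."""
    queries = []
    for item in text.split(","):
        parts = [p.strip() for p in item.split("/") if p.strip()]
        if not parts:
            continue
        region = "global"
        if parts[-1].upper() in REGIONS:
            region = parts.pop().upper()
        clade, mutations = None, []
        for part in parts:
            if part in CLADES:
                clade = part
                continue
            match = MUTATION.match(part)
            if match is None:
                raise ValueError(f"cannot parse {part!r}")
            subunit, position, amino_acid = match.groups()
            mutations.append((subunit or "HA1", int(position), amino_acid.upper()))
        queries.append(Query(clade, mutations, region))
    return queries
```

With this sketch, `parse_query("225D, 159S/225D/AS")` returns two queries: a global one for 225D and an Asia-specific one requiring both 159S and 225D.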

Variability

The second plot shows the variation in the multiple sequence alignment used to construct the tree. High bars indicate variable positions. Clicking on a bar will color the tree by the amino acid at this position and plot the frequencies of the corresponding amino acids.
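
The text does not say which measure of variation the bars represent. As one plausible illustration, the sketch below computes Shannon entropy per alignment column in Python. Both the choice of entropy and the function name `column_entropy` are assumptions, and gap or unknown characters are simply skipped.

```python
import math
from collections import Counter
from typing import List


def column_entropy(alignment: List[str]) -> List[float]:
    """Shannon entropy (in nats) of each column of an aligned set of sequences.

    Assumed measure of variability: 0.0 for a fully conserved column,
    larger values for more variable columns.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    entropies = []
    for i in range(length):
        # Ignore gaps and unknown residues when counting states
        counts = Counter(seq[i] for seq in alignment if seq[i] not in "-X")
        total = sum(counts.values())
        if total == 0:
            entropies.append(0.0)
            continue
        entropies.append(
            sum(-(c / total) * math.log(c / total) for c in counts.values())
        )
    return entropies
```

A column that is identical in all sequences scores zero, while a column split evenly between two amino acids scores log 2.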
Rationale and details

Epitope mutations are based on HA structure and exposed residues. Multiple recent mutations at epitope sites have been suggested to be predictive of strains dominating future seasons. Similarly, mutations outside of these epitopes (at non-epitope sites) tend to be damaging and are suggested to be predictive of clade contraction.

Antigenic evolution has been shown to depend primarily on substitutions surrounding the receptor binding site of HA1. These seven positions (145, 155, 156, 158, 159, 189, 193 in HA1 numbering) are referred to here as receptor binding positions and changes at these positions could correspond to large changes in antigenic properties.
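
Counting receptor binding mutations relative to the root can be sketched as follows. This is an illustrative Python example, not the site's implementation: the function name is hypothetical, and it assumes both sequences are ungapped HA1 amino acid strings in the same numbering, so that position n is the character at index n - 1.

```python
# The seven positions listed above, in 1-based HA1 numbering
RECEPTOR_BINDING_POSITIONS = (145, 155, 156, 158, 159, 189, 193)


def count_receptor_binding_mutations(root_ha1: str, virus_ha1: str) -> int:
    """Number of receptor binding positions at which a virus differs from the root."""
    if len(root_ha1) != len(virus_ha1):
        raise ValueError("sequences must be aligned to the same length")
    if len(root_ha1) < max(RECEPTOR_BINDING_POSITIONS):
        raise ValueError("sequences are too short to cover all positions")
    return sum(
        1
        for pos in RECEPTOR_BINDING_POSITIONS
        if root_ha1[pos - 1] != virus_ha1[pos - 1]
    )
```

Differences at any other position do not contribute to the count.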

The local branching index (LBI) is the exponentially weighted tree length surrounding a node, which is associated with rapid branching and expansion of clades. A more detailed explanation is available here. Retrospective analysis has shown that LBI correlates with clade growth.
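
One way to compute an exponentially weighted tree length for every node is a two-pass traversal that passes partial sums up and then down the tree. The Python sketch below is illustrative only: the `Node` class, the function names and the length scale `tau` are assumptions, and the weighting used on the site may differ in detail.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    name: str
    branch_length: float = 0.0  # length of the branch leading to this node
    children: List["Node"] = field(default_factory=list)
    up: float = 0.0    # weighted length of this node's branch and subtree, seen from its parent
    down: float = 0.0  # weighted length of the rest of the tree, seen from this node
    lbi: float = 0.0


def _branch_terms(node: Node, tau: float) -> Tuple[float, float]:
    """Weighted length of the node's own branch, and the decay across it."""
    decay = math.exp(-node.branch_length / tau)
    return tau * (1.0 - decay), decay


def compute_lbi(root: Node, tau: float) -> None:
    """Fill in node.lbi for every node; tau sets how quickly distant branches are discounted."""

    def postorder(node: Node) -> float:
        below = sum(postorder(child) for child in node.children)
        own, decay = _branch_terms(node, tau)
        node.up = own + decay * below
        return node.up

    def preorder(node: Node) -> None:
        total_up = sum(child.up for child in node.children)
        node.lbi = node.down + total_up
        for child in node.children:
            own, decay = _branch_terms(child, tau)
            # Everything except the child's own subtree, discounted across its branch
            child.down = own + decay * (node.down + total_up - child.up)
            preorder(child)

    postorder(root)
    root.down = 0.0  # nothing lies above the root
    preorder(root)
```

Nodes in densely branching parts of the tree accumulate more nearby branch length and therefore receive a higher value. The recursion is kept simple for readability and would need an iterative traversal for very deep trees.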

Frequencies are estimated as maximum likelihood trajectories that penalize rapid changes in frequency and slope. The frequencies of large clades or abundant genotypes have sufficiently many observations to be robust, while frequencies of rare mutations can't be reliably estimated.
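
The idea of a penalized likelihood can be sketched as an objective function that an optimizer would minimize over candidate trajectories. The Python example below is a simplified illustration under assumptions: the discretization into time bins, the Bernoulli observation model, and the weights `stiffness` and `inertia` are all hypothetical and are not taken from the site's code.

```python
import math
from typing import Sequence, Tuple


def penalized_neg_log_likelihood(
    freqs: Sequence[float],
    observations: Sequence[Tuple[int, bool]],
    stiffness: float = 10.0,
    inertia: float = 10.0,
    eps: float = 1e-9,
) -> float:
    """Score a candidate frequency trajectory (lower is better).

    freqs[t] is the candidate frequency in time bin t. Each observation is
    (time bin, whether the sampled virus carries the mutation).
    """
    nll = 0.0
    for t, has_mutation in observations:
        p = min(max(freqs[t], eps), 1.0 - eps)  # keep the logarithm finite
        nll -= math.log(p if has_mutation else 1.0 - p)
    steps = [b - a for a, b in zip(freqs, freqs[1:])]
    # Penalize rapid changes in frequency
    change_penalty = stiffness * sum(d * d for d in steps)
    # Penalize rapid changes in slope
    slope_penalty = inertia * sum((d2 - d1) ** 2 for d1, d2 in zip(steps, steps[1:]))
    return nll + change_penalty + slope_penalty
```

With many observations the likelihood term dominates and the trajectory follows the data. With few observations the penalties dominate, which is one way to see why estimates for rare mutations are unreliable.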


Built with love by Trevor Bedford and Richard Neher. This work is made possible by the GISAID Initiative and the open sharing of genetic data by influenza research groups from all over the world. We gratefully acknowledge their contributions. Give us a shout at @trvrb or @richardneher with questions or comments. All source code is freely available under the terms of the GNU Affero General Public License. A detailed description of methods is also available. Data updated and processed with commit .

Please cite: Neher RA, Bedford T. 2015. nextflu: real-time tracking of seasonal influenza virus evolution in humans. Bioinformatics 10.1093/bioinformatics/btv381.


© 2015 Trevor Bedford and Richard Neher