Target Explorer (Prototype)


Try Target Explorer

Overview

Target Explorer is intended to be an interactive web site that lets researchers in the tuberculosis research community experiment with different target selection criteria and explore alternative ways of prioritizing gene targets for experimentation (crystallization, structure solution, high-throughput screening for inhibitor discovery, etc.).

The data originate from a PLoS Computational Biology paper by Hasan and colleagues (Hasan et al., 2006), who collected various attributes for the 3927 genes in the H37Rv genome of Mycobacterium tuberculosis, including data on druggability, enzyme function, essentiality, and DNA microarray results (gene expression profiles from various models simulating latency conditions, e.g. hypoxia, starvation, high pH).

The Hasan paper combines the values for each gene using a weighted-sum (linear combination) scoring function. While the Hasan paper proposed a particular set of weights, we recognize that other researchers may have different goals in mind and may want to try different weighting schemes.

Individual Categories Used to Characterize Drug Targets

  • Druggability: This category identifies protein domains that bind small molecules following Lipinski's Rule of 5. (Proteins that have previously been targeted by experimental drugs are chosen.) The domains chosen need not be specific to M.tb: if a protein domain has been successfully inhibited to treat any other disease, it may provide information about a new class of drugs. Additionally, domains in EC families with homologues that are targeted by commercial drugs are also chosen.
  • Essentiality: TRaSH (transposon site hybridization) experiments were conducted by two different groups under the same in vitro conditions to identify genes essential for the growth of M.tb under nutrient-rich conditions. 78% of these predicted essential genes share a close homolog in the M. leprae genome. Other experiments, based on clinical isolates, identified some genes that are frequently deleted, which makes them undesirable as drug targets.
  • Metabolic Chokepoints: Enzymes involved in unique essential chokepoint reactions make good metabolic drug targets, since their function cannot be compensated for by another enzyme. A chokepoint reaction either consumes a unique substrate or produces a unique product. Targets with unique chokepoint reactions and those with unique EC numbers are both identified.
  • Structural Clues: Targets with known crystal structures are very useful to consider, since the structure aids docking and lead-optimization studies. The SMID genome comparison tool is used to compare genomes and find small molecule-protein domain interactions that are common across multiple genomes. Homology with the host and host flora is also computed based on this structural data. Other physical properties, such as the length and molecular mass of the target, are also used for characterization.
  • Microarray Data: Microarray data are available for various models of the latent state of M.tb during in vivo infection. If a target is expressed in most of these models, confidence increases that it is expressed during latent in vivo infection, which could mean that it is required for survival during dormancy.

    How to Use the Interface

    Access and Security: There are two versions of this interface, public and private. The private version is secure (login-based) and allows access to private data. Each login name is associated with a list of data files that the user has access to. This access can easily be changed over time to allow greater interaction between researchers. The public version gives all users equal access to all publicly available data.

    Column Selection: Different researchers use different criteria to evaluate possible drug targets. To provide this flexibility, the first page of the tool lets the user view all the categories of information available for each target. The user can then select the criteria they want to explore further. To reselect criteria at any stage, the user can click the Reselect Columns button on the second page.

    Weights and Score: Each of the selected criteria can be assigned a user-defined weight. (Default weights based on the Hasan paper are shown and can be selected automatically.) Users can use this feature to examine how varying the influence of each criterion affects target prioritization. Each target is assigned a final score, computed as the weighted sum over all the selected criteria. The drug targets are sorted by this score, with the highest-scoring targets at the top of the list.
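The weighted-sum scoring and ranking described above can be sketched in a few lines of Python. This is only an illustration, not the site's actual implementation: the gene names, criterion names, values, and weights below are hypothetical placeholders, and the function names are made up for this example.

```python
# Hypothetical criterion values per gene (illustrative numbers only,
# not taken from the Hasan data set).
targets = {
    "geneA": {"druggability": 1.0, "essentiality": 1.0, "chokepoint": 0.0},
    "geneB": {"druggability": 0.0, "essentiality": 1.0, "chokepoint": 1.0},
    "geneC": {"druggability": 1.0, "essentiality": 0.0, "chokepoint": 0.0},
}

# User-defined weights for the selected criteria (assumed values).
weights = {"druggability": 2.0, "essentiality": 3.0, "chokepoint": 1.0}


def weighted_score(values, weights):
    """Weighted sum over the selected criteria (the keys of `weights`)."""
    return sum(w * values.get(name, 0.0) for name, w in weights.items())


def rank_targets(targets, weights):
    """Return (gene, score) pairs with the highest score first."""
    scored = [(gene, weighted_score(vals, weights)) for gene, vals in targets.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_targets(targets, weights)
for gene, score in ranking:
    print(gene, score)
```

Changing a single entry in `weights` and re-running `rank_targets` shows how the ordering responds to a different weighting scheme.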

    Sorting: As mentioned earlier, the targets are sorted in descending order of overall score. The targets can also be sorted on any one of the selection criteria, in either ascending or descending order, but only on one column at a time. To re-sort based on a selection, either click the Rescore button in the upper left-hand corner of the page or press Enter anywhere inside the table.

    Normalization: The user has two options for normalizing the data in all the columns: unit normalization and standard normalization. The unit normalize option rescales the data to the range [0, 1]. The standard normalize option rescales the data to a distribution with a mean of 0 and a variance of 1.
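As a rough sketch of the two normalization options, the Python below rescales a single column both ways. Two details are assumptions rather than documented behavior: the population standard deviation is used for the standard option, and a constant column is mapped to all zeros to avoid dividing by zero.

```python
from statistics import mean, pstdev


def unit_normalize(column):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant column: assumed to map to all zeros
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def standard_normalize(column):
    """Shift and scale values to mean 0 and variance 1 (z-scores)."""
    mu = mean(column)
    sigma = pstdev(column)  # population standard deviation (an assumption)
    if sigma == 0:  # constant column: assumed to map to all zeros
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


column = [2.0, 4.0, 6.0, 8.0]  # illustrative values
unit = unit_normalize(column)
standard = standard_normalize(column)
```

Normalizing before scoring puts columns measured on different scales on a comparable footing, so that a weight reflects the intended influence of a criterion rather than its units.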

    Selection Criterion: The user can choose to view only those targets that have values greater than a threshold. A selection criterion can be specified for each column, for example > 0. This displays only those targets with a value greater than zero for that column.
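A minimal sketch of this kind of filtering follows. It handles only the "greater than" form mentioned above and assumes that criteria on several columns are combined with a logical AND, which the text does not specify. The row data, column names, and function names are hypothetical.

```python
def parse_criterion(text):
    """Parse a criterion such as "> 0" into a numeric threshold.

    Only the "greater than" form is handled in this sketch.
    """
    text = text.strip()
    if not text.startswith(">") or text.startswith(">="):
        raise ValueError(f"unsupported criterion: {text!r}")
    return float(text[1:])


def select_targets(rows, criteria):
    """Keep rows whose value exceeds the threshold in every constrained column."""
    thresholds = {col: parse_criterion(expr) for col, expr in criteria.items()}
    return [
        row for row in rows
        if all(row[col] > limit for col, limit in thresholds.items())
    ]


# Hypothetical rows with made-up expression values.
rows = [
    {"gene": "geneA", "hypoxia": 1.5, "starvation": 0.0},
    {"gene": "geneB", "hypoxia": -0.2, "starvation": 2.0},
    {"gene": "geneC", "hypoxia": 0.7, "starvation": 0.4},
]

selected = select_targets(rows, {"hypoxia": "> 0"})
```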

    Correlate: The user can examine the relationships between criteria by correlating pairs of columns. At this point, discrete data cannot be correlated with continuous data. If the two columns being correlated contain discrete data, a table containing the counts for each set of discrete values is displayed. If the columns contain continuous data, a graph showing the distribution is displayed.
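The two cases can be illustrated as follows. For discrete columns, the sketch counts how often each pair of values occurs, which corresponds to the table of counts described above. For continuous columns the interface shows a graph; as a stand-in, the sketch computes a Pearson correlation coefficient, which is an assumption of this example and not a documented feature. All data are invented.

```python
from collections import Counter
from math import sqrt


def contingency_table(col_a, col_b):
    """Count how often each (value_a, value_b) pair occurs across targets."""
    return Counter(zip(col_a, col_b))


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical discrete columns (1 = yes, 0 = no), one entry per target.
essential = [1, 1, 0, 0, 1]
druggable = [1, 0, 0, 1, 1]
table = contingency_table(essential, druggable)

# Hypothetical continuous columns.
r = pearson([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```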

    Transform Threshold: The user can set a threshold for the values in each column. For each target, a value of 1 is added to a count if the data value in a column is greater than or equal to that column's threshold; otherwise zero is added. A final count is computed for each target. The assumption here is that if a target responds to a treatment, the level of expression matters less than the fact that it is expressed. The count is accompanied by color coding (green if data ≥ threshold), which makes the results easier to visualize. The targets are sorted by count, and the top 200 targets are shown.
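The counting step can be sketched as below. The gene names, column names, values, and thresholds are hypothetical, and the function names are invented for this illustration; only the rule itself (add 1 when a value is greater than or equal to the column threshold, then sort by count and keep the top entries) comes from the description above.

```python
def threshold_counts(rows, thresholds):
    """For each target, count the columns whose value is >= that column's threshold."""
    return {
        gene: sum(1 for col, limit in thresholds.items() if values[col] >= limit)
        for gene, values in rows.items()
    }


def top_targets(counts, limit=200):
    """Sort targets by count, highest first, and keep at most `limit` of them."""
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:limit]


# Hypothetical expression values under three latency models.
expression = {
    "geneA": {"hypoxia": 1.2, "starvation": 0.8, "high_pH": 0.1},
    "geneB": {"hypoxia": 0.2, "starvation": 0.1, "high_pH": 0.9},
    "geneC": {"hypoxia": 1.0, "starvation": 1.5, "high_pH": 2.0},
}
thresholds = {"hypoxia": 0.5, "starvation": 0.5, "high_pH": 0.5}

counts = threshold_counts(expression, thresholds)
best = top_targets(counts)
```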

    Statistics: For each column, the minimum and maximum values, the mean, the standard deviation, and the range of the data values are computed.
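For reference, these per-column summaries could be computed along the following lines. The text does not say whether the population or the sample standard deviation is reported, so the population form used here is an assumption, and the example values are invented.

```python
from statistics import mean, pstdev


def column_stats(column):
    """Summarize one column: min, max, range, mean, and standard deviation."""
    lo, hi = min(column), max(column)
    return {
        "min": lo,
        "max": hi,
        "range": hi - lo,
        "mean": mean(column),
        "stdev": pstdev(column),  # population standard deviation (an assumption)
    }


stats = column_stats([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # illustrative values
```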

    Implementors

    Target Explorer is implemented in the Sacchettini lab at Texas A&M University by Reetal Pai and Tom Ioerger. If you have any comments or suggestions regarding the data or implementation features, please contact Reetal Pai at reetalp[at]cs[dot]tamu[dot]edu.