The mitochondrion is an important organelle in eukaryotic cells. It is the center of cellular energy metabolism and also plays a key role in apoptosis. Mitochondrial dysfunction is often associated with diseases such as diabetes and cancer. Mitochondrial functions are carried out mainly through complex gene-gene interactions, so a system-level understanding of the mitochondrial interaction network is needed.

We have established InterMitoBase, a human mitochondrial gene interaction database that integrates interactions from literature mining and the KEGG database. Records in the database are manually curated to ensure accuracy. InterMitoBase is designed not only for data storage but also as a biological network analysis platform: it supports constructing the gene interaction network for a list of genes, finding the largest sub component, basic network analysis and Gene Ontology enrichment analysis. InterMitoBase is the first database focused on human mitochondrial gene interactions, and it can help in understanding mitochondrial functions.

Brief structure

The platform can be divided into two parts, data storage and data analysis, as illustrated in figure 1. Interactions can be queried from the database, whose data has been manually curated, and the resulting interaction networks can be analysed with the bioinformatics tools provided.

Figure 1. Brief structure of InterMitoBase


The detailed framework of InterMitoBase is illustrated in figure 2. Each module is explained as follows:

  • Grey block: Start point of database querying and network analysis.
  • Orange block: Three search systems, which are simple gene search, gene list search and advanced gene search. The search systems are flexible, with many gene identifier types integrated. Gene search results can be filtered again by specifying a certain ID field.
  • Green block: Interaction list, the main part of the database and the object of further network analysis.
  • Pink block: Plain text file format of the records, for downloading.
  • Light brown block: Finding the sub components of complex networks that consist of disconnected parts.
  • Red block: Graph view of the interaction network.
  • Blue block: Bioinformatics analysis of the biological network, comprising Gene Ontology enrichment analysis and node degree distribution analysis.

Figure 2. Framework of InterMitoBase

General usage

The web pages of the platform are organised as shown in figure 3a. The left region is the main working region; at its top is a frame describing what the page is about and giving information on the current operation. The right region contains links from the page and tools for analysing the current data. An example is shown in figure 3b.

Figure 3a. General layout of InterMitoBase

Figure 3b. An example of the layout of InterMitoBase

Simple search

A simple search is performed by querying a single term, such as a gene symbol, a gene ID of one of the supported types, a GO ID or a keyword from a gene name. The full list of supported fields is as follows.

  • Symbol, including official symbols and alias symbols
  • Gene name, including official names and alias names
  • Chromosome locus
  • Ensembl ID
  • Enzyme ID
  • Gene Ontology ID
  • Refseq
  • UniGene
  • UniProt
  • SwissProt

Figure 4. Simple search

List search

A list of genes can be searched by switching from "Simple Search" to "List Search". If the field is set to a specific ID type (such as Refseq), all genes in the list must be given as that ID type.

Figure 5. List search

Advanced search

If more precise query results are needed, the advanced search system can be used. For each term, users can select the ID field, whether the term should match or not match, and how the keyword is matched against the records. Up to eight terms are allowed.

Figure 6. Advanced search

All three search systems return a uniform gene list result page, as shown in figure 7. For each record, a table describing how the record was matched is displayed, with keywords highlighted in red. Results can be filtered again by choosing an ID type from the "Links and Tools" frame. The gene list can be saved as a text file. Detailed information on a gene can be fetched by clicking the "Detailed gene info" button, which opens a window displaying the full information of the specified gene (figure 8). Once users have found the gene of interest in the list, they can click the "Genes interacting with *" button to see how many other genes interact with it. Alternatively, users may be interested in a list of genes; interactions among those genes can be obtained by clicking "Find interactions among this gene list" in the "Links and Tools" region.

Figure 7. Gene list

Figure 8. Detailed gene info

Interaction list

Interaction results are displayed as a table in which each record represents a gene pair. A graphic representation is also displayed: a red arrow denotes positive regulation (up-regulation or activation), a green arrow denotes negative regulation (down-regulation or deactivation), a grey arrow denotes an action with no regulation information, and a grey line denotes an action with no regulation direction, such as a subunit or binding relation. Several categories of interaction query are supported, as listed below.

  • Interactions with one gene
  • Interactions between a pair of genes
  • Interactions among a list of genes
  • Interactions within a publication or KEGG pathway
  • Interactions back from a generated network

Figure 9. Interactions

Multiple interactions are often recorded for a single gene pair, so users can change the display mode with the options above the records table to see the distinct interactions for each gene pair (figure 10). The query result is illustrated in figure 11.

Figure 10. Interaction display tool bar

Figure 11. Distinct interactions
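
The idea behind the distinct view can be illustrated with a short Python sketch. This is not the code used by InterMitoBase; the record layout (source gene, target gene, interaction type) and the gene names are made up for illustration, and the sketch assumes a gene pair is identified by its recorded direction.

```python
from collections import defaultdict

# Hypothetical records: (source gene, target gene, interaction type).
records = [
    ("geneA", "geneB", "activation"),
    ("geneA", "geneB", "up-regulation"),
    ("geneC", "geneB", "binding"),
    ("geneA", "geneB", "activation"),  # repeated report of the same interaction
]

def distinct_interactions(records):
    """Group records by gene pair and keep each interaction type only once."""
    grouped = defaultdict(list)
    for source, target, kind in records:
        if kind not in grouped[(source, target)]:
            grouped[(source, target)].append(kind)
    return dict(grouped)

for pair, kinds in distinct_interactions(records).items():
    print(pair, kinds)
```

Four records collapse into two gene pairs here, one with two distinct interaction types and one with a single type.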

Interaction records are searched in both the literature mining results and the KEGG pathways. Users can change the data source by switching the "source" option (figure 12).

Figure 12. Interaction source tool bar

Detailed interaction

Interactions are generated by literature mining of PubMed and from the KEGG database. Users may click "detailed interaction" on the interaction list page to view the details of an interaction from either PubMed or KEGG.

On this page, the first part describes the source. For literature, it contains basic information such as title, journal, abstract and MeSH terms (figure 13); for a KEGG pathway, it contains the pathway name and pathway image (figure 14). The two genes in the current interaction record are highlighted in the abstract or pathway image. The second part gives detailed information on the two genes in the current interaction record.

Genes that interact with either member of the gene pair can be visited from the "Links and Tools" frame. All genes in the abstract or KEGG pathway for which interactions have been found can be highlighted. The interactions in the abstract or KEGG pathway can be viewed and visualized as a graph.

Figure 13. Detailed interaction from literature

Figure 14. Detailed interaction from KEGG pathway

Sub component

Networks sometimes consist of several disconnected parts, each of which is a connected graph called a sub component. Users can find the sub components from the interaction list, as illustrated in figure 15. The number of nodes and edges is displayed for each component, and its detailed interactions, graph and text-format file can be visited.

Figure 15. Sub components of network
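
For readers who want to reproduce this step on a downloaded interaction list, sub components correspond to the connected components of the undirected network. The following Python sketch finds them with a breadth-first search. It is an illustration under that assumption, not the InterMitoBase implementation, and the function name and edge format are hypothetical.

```python
from collections import defaultdict, deque

def sub_components(edges):
    """Split a list of (gene, gene) pairs into connected components.

    Edge direction is ignored. Components are returned largest first.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from start.
        seen.add(start)
        queue = deque([start])
        nodes = set()
        while queue:
            node = queue.popleft()
            nodes.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        # Both ends of an edge lie in the same component, so testing one end is enough.
        component_edges = [(a, b) for a, b in edges if a in nodes]
        components.append({"nodes": nodes, "edges": component_edges})

    components.sort(key=lambda c: len(c["nodes"]), reverse=True)
    return components

example = [("A", "B"), ("B", "C"), ("D", "E")]
for component in sub_components(example):
    print(len(component["nodes"]), "nodes,", len(component["edges"]), "edges")
```

The first entry of the returned list is the largest sub component.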

Graph view

Graph visualization is very important for network analysis. InterMitoBase generates graphs with Graphviz. Four graph layouts are provided, and users can change the layout by switching the options above the graph region. Nine file formats generated by Graphviz are available for download. Because drawing a large network takes a long time, only networks with fewer than 300 nodes are displayed; for larger networks only a gv file is returned, which is a Graphviz input file that users can download and visualize locally. Descriptions of the four layout methods can be found on the Graphviz main page or as follows:

  • dot: The layout algorithm aims edges in the same direction (top to bottom, or left to right) and then attempts to avoid edge crossings and reduce edge length.
  • fdp: Fdp implements the Fruchterman-Reingold heuristic including a multigrid solver that handles larger graphs and clustered undirected graphs.
  • twopi: The nodes are placed on concentric circles depending on their distance from a given root node.
  • circo: This is suitable for certain diagrams of multiple cyclic structures such as certain telecommunications networks.

Figure 16. Graph view
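
As an illustration of what a gv file looks like, the Python sketch below writes an interaction list in the Graphviz DOT language, using the arrow colours described for the interaction list. The category names ("positive", "negative", "unknown", "undirected"), the function name and the gene names are assumptions made for this example; the files produced by InterMitoBase may be structured differently.

```python
# Edge attributes following the colour convention of the interaction list.
# The category names are hypothetical labels chosen for this sketch.
EDGE_STYLES = {
    "positive": 'color="red"',                # up-regulation or activation
    "negative": 'color="green"',              # down-regulation or deactivation
    "unknown": 'color="grey"',                # directed, no regulation information
    "undirected": 'color="grey", dir="none"', # e.g. subunit or binding relations
}

def to_gv(interactions, name="network"):
    """Return DOT source for a list of (source, target, category) tuples."""
    lines = [f"digraph {name} {{"]
    for source, target, category in interactions:
        lines.append(f'  "{source}" -> "{target}" [{EDGE_STYLES[category]}];')
    lines.append("}")
    return "\n".join(lines)

gv_text = to_gv([
    ("geneA", "geneB", "positive"),
    ("geneC", "geneB", "undirected"),
])
print(gv_text)
```

With Graphviz installed locally, a gv file can be rendered with a command such as `dot -Tpng network.gv -o network.png`; replacing `dot` with `fdp`, `twopi` or `circo` selects one of the other layouts.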

Gene Ontology Enrichment

GO (Gene Ontology) enrichment analysis uses Fisher's exact test to find Gene Ontology terms that are over-represented among the genes of the network, indicating groups of genes that share a common biological meaning. Gene Ontology is organized into three spaces, biological process, molecular function and cellular component, which describe the attributes of a gene or protein from different aspects. The analysis is run by selecting a GO space (figure 17).

Figure 17. Gene Ontology enrichment tool

The enrichment result is displayed under the toolbar, with a detailed description of each Gene Ontology term, the number of genes in the network mapped to the term, and the p-value and false discovery rate (FDR) from the test. Clicking the "Highlight on graph" link colors the genes on the graph that belong to the selected GO term purple, making it easier to see the biological meaning of the interaction network (figure 18).

Figure 18. Gene Ontology enrichment
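
The statistics behind the enrichment table can be sketched in a few lines of Python. The sketch assumes a one-sided Fisher's exact test for over-representation (the upper tail of the hypergeometric distribution) and the Benjamini-Hochberg procedure for the FDR. The document does not state which FDR method or background gene set InterMitoBase uses, so both are assumptions, and the function names are hypothetical.

```python
from math import comb

def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test for over-representation.

    k: network genes annotated with the GO term
    n: genes in the network
    K: background genes annotated with the GO term
    N: genes in the background
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up counts: 2 of 2 network genes carry a term held by 3 of 10 background genes.
print(fisher_enrichment_p(2, 2, 3, 10))
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

One p-value is computed per GO term in the selected space, and the adjustment is then applied across all tested terms.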

Degree distribution analysis

Large biological networks often behave as scale-free networks, in which very few genes interact with a large number of other genes while most genes interact with only a few. The node degree distribution of a network can sometimes reveal attributes of the underlying biological circumstances. Two transformations of the node degree distribution are provided: the exponential transformation and the power-law transformation. Two tables listing the detailed degree of each gene are shown below the distribution graph region.

Figure 19. Degree distribution
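
A degree distribution and its two transformations can be computed from an interaction list as in the Python sketch below. The exact transformations used by InterMitoBase are not specified here, so the sketch assumes the usual reading: a log-log plot, on which a power-law distribution appears as a straight line, and a semi-log plot, on which an exponential distribution appears as a straight line. Function names are hypothetical.

```python
from collections import Counter
from math import log10

def degree_distribution(edges):
    """Return {degree: fraction of nodes with that degree} for (gene, gene) pairs."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    counts = Counter(degree.values())
    total = len(degree)
    return {k: counts[k] / total for k in sorted(counts)}

def power_law_points(distribution):
    """Log-log points: a straight line suggests P(k) proportional to k^(-gamma)."""
    return [(log10(k), log10(p)) for k, p in distribution.items()]

def exponential_points(distribution):
    """Semi-log points: a straight line suggests P(k) decays exponentially in k."""
    return [(k, log10(p)) for k, p in distribution.items()]

# A small star-shaped network: one hub connected to three other genes.
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
dist = degree_distribution(star)
print(dist)
print(power_law_points(dist))
print(exponential_points(dist))
```

In the star example, three of the four nodes have degree 1 and the hub has degree 3, a small-scale version of the pattern described above.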