AID : PubChem's bioassay (protocol) identifier, a non-zero integer.
CID : PubChem's compound identifier, a non-zero integer for a unique chemical structure.
Complexity : The complexity rating of a compound is a rough estimate of
how complicated its structure is, judged both by the elements it contains and by its displayed
structural features, including symmetry. Neither stereochemistry nor isotope labeling is used as
an auxiliary criterion. The value is computed using the Bertz/Hendrickson/Ihlenfeldt formula. A scaling
factor for aromaticity is applied so that the complexity of benzene is the same as that of cyclohexane.
It is a floating point value, ranging from 0 (simple ions) to several thousand (complex natural products).
Generally, larger compounds are more complex than smaller ones, but highly symmetrical compounds,
or compounds with few distinct atom types or elements, are downgraded. Complexity is only loosely
correlated with synthetic accessibility.
The most complex compound in PubChem is CID 6338588 (C124H185N9O207S36) with a complexity rating
of about 18425. The average complexity of the structures in the PubChem compound database is about 551.
Comments : Lists all of the depositor's comments and additional information
for this substance.
Component : For a mixture substance or compound, a component is one of its individual molecules.
Compound : The chemical structure or structures represented in a substance. The chemical structure presented in a
compound record is standardized through PubChem's data pipeline. A mixture substance may have several standardized
compounds. Each compound record is structurally unique in the PubChem compound database.
Computed Descriptors : Representations of the compound in different formats, including SMILES,
InChI, and IUPAC names.
Computed Properties : These data are calculated from the compound, including molecular weight, formula, XLogP, etc.
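As an illustration of how the computed properties above might be requested for a given CID, the sketch below builds a request URL. It assumes PubChem's PUG REST interface and its property names (`MolecularWeight`, `XLogP`, `TPSA`, `Complexity`), none of which are described in this glossary; the helper `property_url` is hypothetical, and the check that a CID is a positive integer is an assumption stricter than the "non-zero" wording used here. No network request is made.

```python
# Assumed base address of PubChem's PUG REST interface (not part of this glossary).
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def property_url(cid, properties):
    """Build a URL asking for computed properties of one compound, as JSON.

    `property_url` is a hypothetical helper for illustration only.
    """
    # Assumption: valid CIDs are positive integers.
    if not isinstance(cid, int) or isinstance(cid, bool) or cid <= 0:
        raise ValueError("CID must be a positive integer")
    if not properties:
        raise ValueError("at least one property name is required")
    return f"{PUG_REST_BASE}/compound/cid/{cid}/property/{','.join(properties)}/JSON"


url = property_url(2244, ["MolecularWeight", "XLogP", "TPSA", "Complexity"])
print(url)
```

The resulting URL can be opened in a browser or passed to any HTTP client.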
Depositors Category : The depositor's category tells users that additional
category-specific information is available, either on the depositor's substance summary page or on the depositor's web site.
Deprecated Compound : A Compound CID which
has no links to any substance. This may occur when PubChem modifies its processing. A
deprecated compound will not be available within Entrez.
HBA : Number of hydrogen-bond acceptors in the structure. Classification follows [J. Chem. Inf. Comput. Sci. 1997, 37, 615-621].
HBD : Number of hydrogen-bond donors in the structure. Classification of hydrogens
follows [J. Chem. Inf. Comput. Sci. 1997, 37, 615-621].
Heavy Atom : All atoms except hydrogen.
InChI : IUPAC International Chemical Identifier. InChI strings
can be searched through the Entrez PubChem databases.
Old Version Substance : Substance versions are considered to be "old" when a more recent update is
provided by the depositor.
Molecular Weight : The molecular weight is the sum of the atomic weights of all constituent atoms in a compound, measured in g/mol. In the absence of explicit isotope labeling, averaged natural abundance is assumed (which may differ from purchasable material, for example in the case of Li and U compounds). If an atom bears an explicit isotope label, 100% isotopic purity is assumed at that position, even for short-lived radioactive isotopes where this is often physically unrealistic. At present, it is not possible to deposit more detailed isotope composition information into the PubChem database. Pseudo-atoms, which are not elements, have an atomic weight of 0 g/mol.
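The summation rules above can be sketched in a few lines. This is a minimal illustration, not PubChem's implementation: the tables hold only a handful of standard atomic weights and isotope masses, and the names `AVERAGE_WEIGHTS`, `ISOTOPE_MASSES`, `PSEUDO_ATOMS`, and `molecular_weight` are hypothetical. Each atom is given as a `(symbol, mass_number)` pair, with `None` meaning "no explicit isotope label".

```python
# Standard (natural-abundance) atomic weights in g/mol for a few elements.
AVERAGE_WEIGHTS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}

# Isotope masses in g/mol, used when an atom carries an explicit label
# (100% isotopic purity is assumed at that position).
ISOTOPE_MASSES = {("H", 2): 2.014102, ("C", 13): 13.003355}

# Hypothetical placeholder symbols that are not elements; they weigh 0 g/mol.
PSEUDO_ATOMS = {"R", "*"}


def molecular_weight(atoms):
    """Sum atomic weights over a list of (symbol, mass_number or None) pairs."""
    total = 0.0
    for symbol, mass_number in atoms:
        if symbol in PSEUDO_ATOMS:
            continue  # pseudo-atoms contribute nothing
        if mass_number is None:
            total += AVERAGE_WEIGHTS[symbol]  # averaged natural abundance
        else:
            total += ISOTOPE_MASSES[(symbol, mass_number)]  # labeled atom
    return total


water = [("H", None), ("H", None), ("O", None)]
heavy_water = [("H", 2), ("H", 2), ("O", None)]
print(round(molecular_weight(water), 3))
print(round(molecular_weight(heavy_water), 3))
```

An unknown element or isotope raises `KeyError` rather than silently counting as zero, so that only declared pseudo-atoms are treated as weightless.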
Revoked BioAssay : When a depositor removes an
assay that the depositor previously deposited into PubChem, the assay is considered revoked.
A revoked assay will not be available within Entrez.
Revoked Substance : When a depositor removes a
substance from their substance collection, the substance is considered revoked.
A revoked substance will not be available within Entrez.
SID : PubChem's substance identifier, a non-zero integer for a deposited substance.
SMILES : Simplified Molecular Input Line Entry System,
a line notation (a typographical method using printable characters) for
entering and representing molecules.
Related information is also available from PubChem's documentation section, in
PDF or text form.
SMARTS : A language that allows you to specify substructures using rules
that are straightforward extensions of SMILES.
Substance : An individual record collected from a depositor, representing a sample used in a bioassay.
Substance Category : One or more substance categories are
assigned to each depositor, based on the nature of that depositor's institution and the type of data they supply.
Suppressed Compound : A Compound CID that
links only to an old version substance. A suppressed compound will not be
available within Entrez.
Synonyms : All names, trivial names, synonyms, frequently used IDs, and other names collected from depositors. On
the compound summary page, the synonyms shown are the
distinct synonyms from all corresponding substances.
TPSA : Topological Polar Surface Area. This is an estimate of the area (in Å²) of the molecule that is polar. The implementation follows
[J. Med. Chem. 2000, 43, 3714-3717]. It is a simple method: only N and O are considered, 3D coordinates are not used, and there are various precomputed factors for different hybridizations, charges, and participation in aromatic systems.
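The additive idea behind TPSA can be illustrated with a small lookup table. This is only a sketch: the table is a tiny, illustrative subset of per-fragment contributions as recalled from the cited paper (check them against the paper before relying on them), the fragment labels and the function `tpsa` are hypothetical names, and the fragment counts are assigned by hand rather than detected from a structure.

```python
# Illustrative subset of per-fragment contributions in Å^2 (N and O only).
# Values are an assumption recalled from the cited J. Med. Chem. 2000 paper.
CONTRIBUTIONS = {
    "hydroxyl O": 20.23,       # -OH
    "carbonyl O": 17.07,       # =O
    "ether O": 9.23,           # -O-
    "primary amine N": 26.02,  # -NH2
}


def tpsa(fragment_counts):
    """Sum precomputed contributions over hand-assigned polar fragments."""
    return sum(CONTRIBUTIONS[name] * count for name, count in fragment_counts.items())


# Acetic acid has one carbonyl oxygen and one hydroxyl oxygen.
acetic_acid = {"carbonyl O": 1, "hydroxyl O": 1}
print(round(tpsa(acetic_acid), 2))
```

Because no 3D coordinates are involved, the estimate depends only on which polar fragments are present and how many of each.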
Version : The PubChem substance version number is
incremented each time an update is provided by the depositor.
Xref : The external references/links to PubChem database records.
XLogP : A partition coefficient or distribution coefficient, that is,
a measure of the differential solubility of a compound in two solvents.
Since November 2006, PubChem has used version 2 of the algorithm described in
[Perspectives in Drug Discovery and Design. 2000, 19, 47-66] to generate the XLogP value.
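To make the quantity concrete, the sketch below computes a log partition coefficient from two measured concentrations. It assumes the usual octanol/water solvent pair, which this glossary does not name, and the function `log_p` is a hypothetical helper. It does not reproduce the XLogP algorithm, which estimates this value from the structure instead of measuring it.

```python
import math


def log_p(conc_octanol, conc_water):
    """Return log10 of the ratio of solute concentrations in the two phases.

    Both concentrations must use the same units and be positive.
    """
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


# A solute 100 times more concentrated in octanol than in water.
print(log_p(100.0, 1.0))
```

Positive values indicate a preference for the octanol phase, negative values a preference for water.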