Because the brains of individuals are not the same
with regard to structure, function or organization, no single, unique physical
representation of the brain that depicts the human species is possible.
For the last six years, we have worked to develop a probabilistic atlas
and reference system for the human brain that serves as both an informatics
and a neuroscience tool, because it captures, in digital form, the variance
of a large population of subjects and includes information about their
racial and ethnic backgrounds, education and handedness, personal traits
and habits, medical, neurological and psychiatric profiles, structural
and functional imaging and DNA for genotyping. The current data structure
includes 438 normal subjects between the ages of 20 and 40 and will soon
be expanded to 1,000 subjects and possibly beyond. The addition of functional
information from fMRI and PET as well as microscopic data on cyto- and
chemo-architecture provides new and unique tools and strategies. An important
neuroscientific outcome for this program is the ability to examine, for
the first time, the stability, relationships and distribution of the micro-
and macroscopic structure and function of the human brain. This issue,
although a major area of interest, has remained a vexing problem because
of the difficulty in obtaining data sets of sufficient magnitude, diversity,
number and organization to answer such questions. The resultant data set
is organized in four dimensions (three in space and one in time) with an
infinite number of potential attributes. Through the consortium structure
that we have developed, labor is distributed across parallel, complementary
tasks, executed in such a way as to create
a "real world" environment among participants. With this program, differences
in equipment, software and protocols actually reflect a microcosm of the
larger neuroscience and neuroinformatics communities. We believe that there
is value added in such an approach, as it accommodates both individual and summary
data and allows the entry of raw or interpretive data, so that the user of the resultant
database can choose their confidence level by setting a threshold for the
type of data to be obtained from a query (e.g., all data for a given location
versus only peer-reviewed and independently reproduced data). Such a database,
with such a large number of subjects, provides the opportunity for electronic
hypothesis generation and comparisons between individuals, experiments
and laboratories.
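The threshold-based query described above can be illustrated with a minimal sketch. It is not the atlas's actual query interface: the evidence tiers, the `Observation` record and the `query` function are assumptions introduced only to show how a user-chosen confidence level could filter results.

```python
from dataclasses import dataclass
from enum import IntEnum


class Evidence(IntEnum):
    """Hypothetical confidence tiers, ordered from least to most vetted."""
    RAW = 0
    INTERPRETED = 1
    PEER_REVIEWED = 2
    REPRODUCED = 3  # peer-reviewed and independently reproduced


@dataclass(frozen=True)
class Observation:
    location: tuple      # hypothetical atlas coordinates (x, y, z)
    value: float         # hypothetical measurement at that location
    evidence: Evidence


def query(records, location, min_evidence=Evidence.RAW):
    """Return observations at a location that meet the evidence threshold."""
    return [r for r in records
            if r.location == location and r.evidence >= min_evidence]


records = [
    Observation((10, 20, 30), 0.8, Evidence.RAW),
    Observation((10, 20, 30), 0.6, Evidence.PEER_REVIEWED),
    Observation((10, 20, 30), 0.7, Evidence.REPRODUCED),
    Observation((11, 20, 30), 0.1, Evidence.REPRODUCED),
]

# All data for a given location versus only the most stringently vetted data.
everything = query(records, (10, 20, 30))
vetted = query(records, (10, 20, 30), Evidence.REPRODUCED)
```

Ordering the tiers as integers keeps the threshold a single comparison, so the same query serves both the permissive and the strict case.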
Reconstruction of the cerebral cortex and subsequent flattening procedures routinely generate large collections of data in an increasing variety of formats. Viewing a specific dataset in one of the many possible configurations requires selection of an appropriate combination of compatible files out of the dozens that may exist for each experimental hemisphere. This complexity reflects the diversity of information needed to specify cortical shape, topology and experimental findings. To address the logistical problems that this imposes, we have developed SuMS, the Surface Management System.
SuMS plays four important roles in the surface reconstruction
and analysis process.
First, it provides a systematic framework for the
classification and storage of all surface, volume and experimental datasets.
Second, within this classification, it serves as a version control system
for the rapidly evolving surface and volume datasets. Third, with its built-in
Database Management System Support, SuMS provides rapid search and retrieval
capabilities across all the datasets. Finally, with both client and server
side Java implementations, SuMS is in a good position to act as a multi-platform,
multi-user "Surface Request Broker" for the community of neuroscientists
studying the structure and function of the cerebral cortex.
The National fMRI Data Center (http://www.fmridc.org)
was established in the autumn of 1999 with the objective of creating a
mechanism by which members
of the neuroscience community may more easily share
functional neuroimaging data. Examples in other sciences offer proof
of the utility and benefit
that data sharing provides through encouraging growth
and development in those fields. By building a publicly-accessible repository
of raw
neuroimaging data from peer-reviewed studies, the
Data Center expects to create a similarly successful environment for the
neurosciences.
In this article, we discuss the continuum of database
efforts and provide an overview of the scientific and practical difficulties
inherent in managing
various database models. Next, we detail the organization,
design, and foundation of the fMRI Data Center, ranging from its current
capabilities to
the issues involved in submitting and requesting
data. We discuss how a publicly-accessible database enables other
fields to develop relevant
tools that can aid in the growth of understanding
of cognitive processes. Information retrieval and meta-analytic techniques
can be utilized to
search, sort, and categorize study information with
a view towards subjecting study data to secondary “meta- and mega-analyses”.
In addition,
we discuss the technical and policy choices that needed
to be addressed in the formation of the Data Center. Among others, these
include: human subject
confidentiality issues; the protection of investigators'
rights; heterogeneous data description and organization; the development
of search tools; and data
transfer issues. We conclude with comments concerning
the future of the fMRI Data Center effort, its role in promoting the sharing
of neuroscientific
data, and how this may alter the manner in which
studies are published.
We have developed a graphical anatomical database
program, X-anat, that allows the results of numerous studies on neuroanatomical
connections to be stored, compared, and analyzed in a standardized format. Data
are entered into the database by drawing injection and label sites from
a particular tracer study directly onto canonical representations of the
neuroanatomical structures of interest, along with providing descriptive
text information. Searches may then be performed on the data by querying
the database graphically, for example by specifying a region of interest
within the brain for which connectivity information is desired, or via
text information such as keywords describing a particular brain region
or an author name or reference. Analyses may also be performed by accumulating
data across multiple studies and displaying a color-coded map that graphically
represents the total evidence for connectivity between regions. Thus, data
may be studied and compared free of areal boundaries (which often vary
from one lab to the next), and instead with respect to standard landmarks,
such as the position relative to well known neuroanatomical substrates,
or stereotaxic coordinates. If desired, areal boundaries may also be defined
by the user to facilitate the interpretation of results. We demonstrate
the application of the database to the analysis of pulvinar-cortical connections
in the macaque monkey, for which the results of over 120 neuroanatomical
experiments were entered into the database. We show how these techniques
can be used to elucidate connectivity trends and patterns that may otherwise
go unnoticed. The database software may be obtained from http://redwood.ucdavis.edu/bruno/xanat/xanat.html.
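The accumulation of evidence across studies, and the graphical query by region of interest, can be sketched as follows. This is not X-anat's code: the grid representation, the function names and the rule that each study contributes at most one count per cell are assumptions made for illustration.

```python
def accumulate_evidence(studies, shape):
    """Count, for each grid cell, how many studies report label there.

    `studies` is a list of collections of (row, col) cells drawn onto a
    shared canonical grid (a hypothetical stand-in for the canonical
    representations onto which sites are drawn).
    """
    rows, cols = shape
    total = [[0] * cols for _ in range(rows)]
    for labelled_cells in studies:
        for r, c in set(labelled_cells):  # one count per study per cell
            total[r][c] += 1
    return total


def evidence_in_region(total, region):
    """Return {cell: count} for cells of a region of interest with evidence."""
    return {(r, c): total[r][c] for r, c in region if total[r][c] > 0}


studies = [
    {(0, 0), (0, 1)},
    {(0, 1), (1, 1)},
    {(0, 1)},
]
evidence_map = accumulate_evidence(studies, shape=(2, 2))
roi_hits = evidence_in_region(evidence_map, region=[(0, 1), (1, 0)])
```

Because the counts live on the grid rather than on named areas, the same map can be queried with any user-drawn region, which mirrors the point about comparing data free of areal boundaries.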
Driven by the necessity of integrating the ever increasing
amount of data on the mammalian brain, several ambitious neuroscientific
database projects have been started during the last decade. Databases on
anatomical connectivity as delivered by tracing studies play a particularly
important role as these data characterize the structural constraints of
the complex and poorly understood functional interactions in real neural
systems. Available connectivity databases have already made possible important
analyses of anatomical brain circuitry in various species and opened exciting
new ways to interpret functional data, both from electrophysiological and
functional imaging studies. The eventual impact and success of connectivity
databases, however, will be determined by the resolution of methodological
problems that currently still limit their use. These problems comprise
four main points: (i) objective representation of coordinate-free, parcellation-based
data, (ii) assessment of the reliability and precision of individual data,
especially in case of contradictory reports, (iii) data-mining in large
sets of partially redundant and contradictory data, (iv) automated and
reproducible transformation of data between incongruent brain maps (the
"parcellation problem"). In this article, we analyze potential solutions
to these problems, and present the specific implementation of a database
on the cortical connectivity of the Macaque (CoCoMac; http://www.cocomac.org).
The design of this database focuses especially on the needs of both experimental
and computational neuroscientists to perform flexible data-mining of the
great amount of experimental data published by tracing studies. The efficiency
and flexibility of our approach are demonstrated by analyses of the cortico-cortical
and thalamo-cortical network in the Macaque monkey.
We have implemented a pair of database projects,
one serving cortical electrophysiology and the other invertebrate neurons
and recordings. The design for each combines aspects of two proven schemes
for information interchange. The journal article metaphor determined the
type, scope, organization, and quantity of data to comprise each submission.
Sequence databases encouraged intuitive tools for data viewing, capture,
and direct submission by authors. Transcending these models, neurophysiology
additionally requires new datatypes and benefits from dynamic data viewers
that function like a virtual oscilloscope. Datatypes, chiefly timeseries,
histogram, and bivariate, and illustration-like wrappers, were selected
by utility to the community of investigators. Functional and anatomical
characteristics specify neurons. Searches are via visual interfaces to
sets of controlled-vocabulary trees of values for neurophysiological metadata
attributes; in neuroscience, where interpretation of recordings is heavily
context-dependent, such metadata also supplement datasets. Permanence is
advanced by data model and data formats largely independent of contemporary
technology; the projects rely only on Java and the new XML standard, itself
implementation-independent. All user tools are Java-based, free, multiplatform,
and distributed by our application server to any contemporary networked
computer. Copyright is retained by submitters; viewer displays are dynamic
and do not violate copyright of related journal figures. Panels of neurophysiologists
view and test schemas and tools, enhancing community support.
It is generally assumed that the variability of neuronal morphology has an important effect on the connectivity and response within the nervous system, but this effect has not been thoroughly investigated. Neuroanatomical archives represent a crucial tool to explore structure-function relationships in the brain. We are developing computational tools to describe, generate, store, and render large sets of three-dimensional neuronal structures in a format that is at once compact, quantitative, accurate, and readily accessible to the neuroscientist.
Single-cell neuroanatomy can be characterized quantitatively at several levels. In computer-aided neuronal tracing files, a dendritic tree is described as a series of cylinders ("branches"), each represented by diameter, spatial coordinates (x, y, and z), and the connectivity to other branches in the tree. This "Cartesian" description constitutes a completely accurate mapping of dendritic morphology, but it bears little "intuitive" information for the neuroscientist (e.g. it is difficult to establish the morphological class of a neuron by simply looking at its Cartesian file). In a classical neuroanatomical analysis, in contrast, neuronal dendrites are characterized on the basis of the statistical distributions of morphological parameters, e.g. maximum branching order or bifurcation asymmetry. This description is intuitively more accessible, but it only yields information on the collective anatomy of a group of dendrites, i.e. it is not complete enough to provide a precise "blueprint" of the original data. We are adopting a third, intermediate level of description, which consists of the algorithmic "generation" of neuronal structures within a certain morphological class based on a set of measured parameters. Given the right algorithm, these "fundamental" parameters describe that morphological class as intuitively as in classical neuroanatomical analysis (because their statistical distributions have an intuitive geometrical meaning), and as completely as in the Cartesian format (because they are sufficient to generate and display complete neurons). Since fundamental parameters measured from experimental data result in statistical distributions, the algorithms that generate "virtual neurons" sample values from these distributions stochastically. As a result, as in nature, no two virtual neurons are identical, even if they belong to a recognizable anatomical class.
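The contrast between the Cartesian and the classical descriptions can be illustrated with a minimal sketch. The record below carries the fields named above (diameter, coordinates, connectivity); the `Branch` class, the `-1` root marker and the convention used for branching order are assumptions made for illustration, not the format of any particular tracing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    ident: int
    x: float
    y: float
    z: float
    diameter: float
    parent: int  # ident of the parent branch, or -1 for the root (assumed)


def max_branch_order(branches):
    """Largest number of bifurcations between the root and any branch.

    Assumed convention: the order increases by one only where a branch
    splits into two or more children.
    """
    children = {}
    for b in branches:
        children.setdefault(b.parent, []).append(b.ident)
    best = 0
    stack = [(root, 0) for root in children.get(-1, [])]
    while stack:
        ident, order = stack.pop()
        best = max(best, order)
        kids = children.get(ident, [])
        step = 1 if len(kids) > 1 else 0
        stack.extend((k, order + step) for k in kids)
    return best


# A toy tree: branch 0 bifurcates, branch 1 bifurcates again,
# branch 2 simply continues into branch 5.
tree = [
    Branch(0, 0.0, 0.0, 0.0, 2.0, -1),
    Branch(1, 5.0, 5.0, 0.0, 1.4, 0),
    Branch(2, -5.0, 5.0, 0.0, 1.3, 0),
    Branch(3, 8.0, 10.0, 0.0, 0.9, 1),
    Branch(4, 3.0, 10.0, 0.0, 0.8, 1),
    Branch(5, -6.0, 10.0, 0.0, 1.1, 2),
]
order = max_branch_order(tree)
```

The list of `Branch` records is the complete but unintuitive description; the single number derived from it is intuitive but cannot be inverted to recover the tree.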
This "computational" approach to neuroanatomy, originally proposed in the 1970s, has only recently become a viable strategy thanks to the exceptional improvement of computer hardware, software, and graphics. The advantages of the "algorithmic" description of neuronal structure are immense. If an algorithm can measure the values of a handful of parameters from an experimental database and generate virtual neurons that are anatomically indistinguishable from their real counterparts, a great deal of data compression and amplification can be achieved. Data compression results from the ability to describe quantitatively and completely thousands of neurons from a morphological class with just a few statistical distributions of fundamental parameters. Data amplification is possible because, from a set of experimental neurons, many more virtual analogs can be generated. In principle, this approach could make it possible to create and store a neuroanatomical database containing data for an entire human brain on a personal computer.
Two major types of algorithms have been proposed
for the generation and description of dendritic trees. Local algorithms
rely entirely on a set of local rules correlating morphological parameters
(such as branch diameter and length) to let each branch grow independently
of the other dendrites in the tree and independently of its absolute position
within the tree. In global algorithms, new dendritic branches are dealt
"from outside" to competing groups of growing tips, also depending on their
position in the tree (e.g. on their distance from the soma). Local and
global algorithms offer complementary advantages. Local algorithms are
simpler and more intuitive, and their fundamental parameters can be measured
directly from experimental data. Because of their small number of parameters,
they are well suited to studying structure-function relationships and
the origin of emergent properties (i.e. anatomical parameters that are
not explicitly imposed in the algorithm). Global algorithms are usually
more flexible and overall accurate, but many of their fundamental parameters
must be obtained through extensive and elaborate parameter searches. Global
algorithms can also be extended to generate populations of interconnected
neurons (networks), instead of single neurons. We are developing two programs,
L-Neuron and ArborVitae, which implement several global and local algorithms,
to investigate systematically the potential of the "computational neuroanatomy"
approach for neuroscience databases. We virtually generated anatomically
plausible neurons for several morphological classes, including cerebellar
Purkinje cells, hippocampal pyramidal and granule cells, and spinal cord
motoneurons.
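A local algorithm of the kind described above can be sketched in a few lines. This is not the L-Neuron or ArborVitae implementation: the function name, the rule (a tip bifurcates with fixed probability while its diameter exceeds a threshold) and all numerical ranges are hypothetical placeholders rather than measured distributions.

```python
import random


def grow_tree(rng, root_diameter=2.0, min_diameter=0.3, branch_prob=0.8):
    """Grow one virtual dendritic tree using purely local rules.

    Each tip decides whether to bifurcate from its own diameter alone,
    without reference to other branches or to its position in the tree.
    All parameter values are illustrative assumptions.
    """
    branches = [{"parent": -1,
                 "diameter": root_diameter,
                 "length": rng.uniform(10.0, 50.0)}]
    tips = [0]
    while tips:
        parent = tips.pop()
        diameter = branches[parent]["diameter"]
        # Local termination rule: too thin, or an unlucky draw.
        if diameter < min_diameter or rng.random() > branch_prob:
            continue
        for _ in range(2):
            # Daughters are always thinner than the parent, so growth ends.
            branches.append({"parent": parent,
                             "diameter": diameter * rng.uniform(0.6, 0.9),
                             "length": rng.uniform(10.0, 50.0)})
            tips.append(len(branches) - 1)
    return branches


neuron_a = grow_tree(random.Random(1))
neuron_b = grow_tree(random.Random(2))
neuron_a_again = grow_tree(random.Random(1))
```

Different seeds draw different values from the same distributions, so the trees generally differ while sharing the same generating parameters; reusing a seed reproduces a tree exactly, which is what allows a virtual neuron to be stored as a handful of parameters plus a seed.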
The extracellularly activated ligand gated ion channels
(LGIC) are polymeric ionotropic receptors to neurotransmitters. These LGIC
constitute superfamilies of receptors formed by homologous subunits. The
last two decades revealed an unexpected wealth of genes coding for these
subunits. Multiple comparisons of sequences proved to be an invaluable
tool in modern pharmacological investigations. From the study of regulation
of gene expression to the understanding of protein structure-function relationships,
almost every experimental design involves a sequence comparison step.
In addition, the careful analysis of known sequences may lead to the cloning
of new genes. Unfortunately, although of outstanding importance, the general
sequence databases suffer from several imperfections due to their size
and their widespread purpose. Each gene is often represented by multiple
entries, multiplicity generated from intrinsic causes, such as alternative
splicing or editing, from methodology, cDNA versus genomic cloning, but
also from competition between laboratories, each submitting its own clone.
In addition, unwanted errors are sometimes made during the submission process.
There is therefore room for expert-maintained databases, of restricted
focus but higher quality, where knowledge of the research field helps
to filter the huge amount of data generated. The Ligand Gated Ion Channel
database (LGICdb) has been developed to handle the growing wealth of cloned
LGIC subunits. The database aims to provide only one entry for each gene,
containing annotated nucleic acid and protein sequences. Release 3
of the LGICdb contained 266 subunit entries belonging to 28 different
species and covering three groups of receptors: the superfamily of pentameric
LGIC (nicotinic, 5-HT3, GABA A and C, glycine, and anionic glutamate receptors),
the cationic tetrameric glutamate receptors (AMPA, kainate and NMDA receptors)
and the trimeric ATP P2X receptors. In addition to the gene entries, the
database provides multiple sequence alignments, phylogenetic investigations,
and atomic coordinates when available. The LGICdb is accessible via the
World Wide Web (http://www.pasteur.fr/recherche/banques/LGIC/LGIC.html),
where it is continuously updated.
In any scientific discipline, the role of the published
literature is to provide an 'intellectual environment' for a domain of
knowledge, representing the totality of both experimental evidence and
theoretical explanation in articles, books and reviews. Neuroscientists
suffer greatly from information overload, due to the extent, the complexity,
as well as the complicated taxonomy of their subject. We describe a paradigm
for knowledge management systems called 'Knowledge Mechanics' that implements
a versatile, generally applicable framework for the management and manipulation
of information that is represented in a distributed literature. We have
built a system, called 'NeuroScholar', that implements this paradigm for the
systems-level neuroscientific literature. This system allows neuroscientists to
interact with the information in the literature in roughly the same way
that an application programming interface ('API') allows software engineers
to manipulate data constructs and subroutines. The system is designed to
provide the following functionality: (1) to represent the contents of the
system's target literature accurately; (2) to permit users to interpret
the literature according to their own judgement, producing a personalized
representation; (3) to provide mechanisms to allow users to merge, share
and compare their individualized representations; (4) to cross-reference
the data in order to identify contradictions and discrepancies between
different personalized representations and (5) to provide data-analysis
tools that help us to form more powerful interpretations of the literature.
Within this paper, we describe Knowledge Mechanics and NeuroScholar in
detail, both conceptually and practically. We describe the history of the
project and present worked examples illustrating how the system may be
used.
Many problems in analytical biology, like the classification
of species, the analysis of metabolic or neural networks, or the modelling
of macromolecules, involve complex relational data. Here we describe a
novel software system, CANTOR, which has been developed to deal effectively
with such data, and which can also be used as a general development tool
for intelligent database applications. Although the system grew out of
a specific project in the analysis of neuroanatomical connectivity, it
can be applied to a wide range of relational data. Principal elements of
the CANTOR system are a database of dynamic objects, as well as a set of
library functions which can perform various operations on these objects.
The objects possess attributes that define the objects' characteristics
as well as their relationships to other objects within the database. Most
of the object relationships are dynamically maintained and updated by the
objects themselves, thus providing a flexible, efficient and constantly
updated data representation. The CANTOR library routines allow modifications
of object attributes as well as the rearrangement of objects in the database.
This restructuring can be evaluated by a large variety of user-defined
cost functions and can be guided by optimisation algorithms, providing
a flexible and powerful tool for the structural analysis of the database
content. The application of optimisation approaches also makes it possible
for the CANTOR system to deal effectively with incomplete and inconsistent
data. A prototypical form of CANTOR has been coded and has subsequently
been used successfully in the analysis of anatomical and functional mammalian
brain connectivity, involving complex and inconsistent experimental data.
In addition, it has been used for solving multivariate engineering optimisation
problems. CANTOR has been programmed using the ANSI-C language and is thus
architecture-independent. The software is supported by systems libraries
which allow multi-threading (the concurrent processing of several database
operations), as well as the distribution of the dynamic data objects and
library operations over several computers at once. These attributes make
the system easily scalable and in principle allow the representation and
analysis of arbitrarily large sets of relational data.
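The idea of evaluating a rearrangement with a user-defined cost function and guiding it by optimisation can be illustrated with a toy example. The sketch below is written in Python rather than ANSI-C and does not reproduce CANTOR's library: the cost function (total distance between linked objects in a linear ordering) and the greedy pairwise-swap search are assumptions chosen for brevity.

```python
import itertools


def wiring_cost(order, links):
    """Hypothetical cost: summed distance between linked objects."""
    position = {name: i for i, name in enumerate(order)}
    return sum(abs(position[a] - position[b]) for a, b in links)


def improve_by_swaps(order, links, cost=wiring_cost):
    """Greedy search: keep any pairwise swap that strictly lowers the cost."""
    order = list(order)
    best = cost(order, links)
    improved = True
    while improved:
        improved = False
        for i, j in itertools.combinations(range(len(order)), 2):
            order[i], order[j] = order[j], order[i]
            trial = cost(order, links)
            if trial < best:
                best = trial
                improved = True
            else:
                order[i], order[j] = order[j], order[i]  # undo the swap
    return order, best


# Four objects related in a chain, stored in a scrambled arrangement.
links = [("A", "B"), ("B", "C"), ("C", "D")]
start = ["A", "C", "B", "D"]
start_cost = wiring_cost(start, links)
final_order, final_cost = improve_by_swaps(start, links)
```

Passing the cost function as an argument keeps the search independent of what is being optimised, which is the property the abstract attributes to user-defined cost functions. A greedy search of this kind can stop in a local minimum; more elaborate optimisers would be needed for realistic data.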
Neuromorphic hardware is the term used to describe
full-custom designed integrated circuits (ICs, or silicon "chips") that
are the product of neuromorphic engineering, a methodology for the
synthesis of biologically inspired systems such as retinae, cochleas,
oculomotor responses and central pattern generators, but also for the replication
of neurons and functional circuits of neurons to provide tools for the
analysis of the workings of the nervous system, including structure-function
relationships. Neuromorphic hardware can be constructed with either digital
or analogue circuitry or with a hybrid of the two. Currently, most examples
of this type of hardware are constructed using analogue circuits. The correspondence
between these circuits and neurons, or functional circuits of neurons,
can exist at a number of levels. At the smallest scale, the correspondence
is between populations of ion channels, either synaptic or non-synaptic,
and types of field-effect transistors, whilst the resistive and capacitive
properties of the neuronal membrane can be represented with extrinsic devices,
or with the intrinsic properties of the materials from which transistors
are composed: doped silicon and polysilicon. This allows silicon
"neurons" to be built, with dendritic, somatic and axonal structures
and endowed with ionic and synaptic properties. Examples of structure-function
relationships already explored using neuromorphic hardware include directional
selectivity, sublinear summation and temporal coding. Establishing databases
for this hardware is valuable for two reasons: firstly, independently of
neuroscientific motivations, the field of neuromorphic engineering would
benefit greatly from a resource in which circuit designs could be stored
in a form appropriate for reuse and refabrication. Analogue designers would
benefit particularly from such a database, as there are no equivalents
to the algorithmic design methods available to designers of digital circuits.
Secondly, and more importantly for the purpose of this theme issue, there is
the possibility of databases of silicon neuron designs replicating specific
neuronal types and morphologies, especially if an automated process for
translating morphometric data directly into a layout-compatible format were
to be developed. The question that needs to be addressed is: what could
a neuromorphic hardware database contribute to the wider neuroscientific
community that a conventional database could not? The answer is that neuromorphic
hardware is expected to provide analogue sensory-motor systems for
interfacing the computational power of symbolic, digital systems with the
external, analogue environment. It is also expected to contribute to ongoing
work in "living silicon" and neural prosthetics. This, combined with
the possibility of evolving the hardware in the form of analogue field
programmable gate arrays, creates the need for a database to be established
and it would be advantageous to set about this whilst the field is relatively
young. The paper will outline a framework for the construction of a neuromorphic
hardware database, for use at the stage when neuromorphic design can actively
contribute to, as well as being informed by, the biological exploration
of structure-function relationships.
Neuronal systems are complex and models of these systems are correspondingly complex. We describe methodologies which will improve the ability of neuroscientists to collaborate in the modeling process. It is crucial for modelers to have access to tools which support discussion, development and exchange of models and components of models. We report our findings on the requirements for these tools and our proposal for structuring their development. The desirability of declarative methods for describing models is discussed. We show the equivalence of this form to object-oriented class design and database schema definition (collectively called templates). We introduce a template hierarchy sufficient to describe models from membrane to network levels. The templates support both a database of models and simulation of models.
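The stated equivalence between a declarative description, a class design and a database schema can be illustrated as follows. The `Compartment` template, its fields and the type mapping are hypothetical and are not taken from the template hierarchy itself; the sketch uses only Python's standard `dataclasses` and `sqlite3` modules.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields

# Assumed mapping from declared field types to SQL column types.
SQL_TYPES = {float: "REAL", int: "INTEGER", str: "TEXT"}


@dataclass
class Compartment:
    """A hypothetical template: one declaration, used as class and schema."""
    name: str
    length_um: float
    diameter_um: float


def create_table_sql(template):
    """Derive a CREATE TABLE statement from a template's declared fields."""
    columns = ", ".join(f"{f.name} {SQL_TYPES[f.type]}" for f in fields(template))
    return f"CREATE TABLE {template.__name__} ({columns})"


schema = create_table_sql(Compartment)

# The same template serves the database side ...
conn = sqlite3.connect(":memory:")
conn.execute(schema)
soma = Compartment("soma", 20.0, 20.0)
conn.execute("INSERT INTO Compartment VALUES (?, ?, ?)", astuple(soma))

# ... and the object side, as would be needed to hand a stored model
# component to a simulator.
row = conn.execute("SELECT * FROM Compartment").fetchone()
restored = Compartment(*row)
conn.close()
```

Because the schema is generated from the declaration rather than written separately, the stored form and the in-memory form of a model component cannot drift apart.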
Faster than ever, neuroscience is generating vast
amounts of data that await cross-referencing, comparison, integration and
interpretation in the endeavour to unravel the mechanisms of the brain.
The complex, diverse and distributed nature of these data requires the
development of advanced neuroinformatics methodologies for databases and
associated tools that are now beginning to emerge.
Here I present an overview of current issues in
the representation, integration and analysis of neuroscience data from
molecular to brain systems levels, including issues of implementation,
standardisation, management, quality control, copyright, confidentiality
and acceptance. Particular emphasis is given to integrative neuroinformatics
approaches for exploring structure-function relationships in the brain.