Results for *

Displaying results 1 to 11 of 11.

  1. Improved accuracy of multiple ncRNA alignment by incorporating structural information into a MAFFT-based framework
    Published: 2008
    Publisher:  BMC

    Abstract Background Structural alignment of RNAs has become increasingly important since the discovery of functional non-coding RNAs (ncRNAs). Recent studies, mainly based on various approximations of the Sankoff algorithm, have brought considerable improvement in the accuracy of pairwise structural alignment. In contrast, for cases with more than two sequences, the practical merit of structural alignment remains unclear compared to traditional sequence-based methods, although the importance of multiple structural alignment is widely recognized. Results Rather than straightforwardly extending the Sankoff algorithm to multiple alignment, we took a different approach motivated by accuracy and time complexity. As a new option of the MAFFT alignment program, we developed a multiple RNA alignment framework, X-INS-i, which builds a multiple alignment iteratively, incorporating structural information through two components: (1) pairwise structural alignments computed by an external method such as SCARNA or LaRA and (2) a new objective function, Four-way Consistency, derived from the base-pairing probability of every sub-aligned group at every multiple alignment stage. Conclusion The BRAliBASE benchmark showed that X-INS-i outperforms the other methods currently available under the sum-of-pairs score (SPS) criterion. As a basis for predicting common secondary structure, the accuracy of the present method is comparable to, or even higher than, that of current leading methods such as RNA Sampler. The X-INS-i framework can be used to build a multiple RNA alignment from any combination of algorithms for pairwise RNA alignment and base-pairing probability. The source code is available at the webpage given in the Availability and requirements section.

     

    Source: BASE Selection for Comparative Literature
    Language: English
    Media type: Article (journal)
    Format: Online
    Parent title: BMC Bioinformatics, Vol 9, Iss 1, p 212 (2008)
    Subjects: Computer applications to medicine. Medical informatics; Biology (General)
  2. Quality control for terms and definitions in ontologies and taxonomies

    Abstract Background Ontologies and taxonomies are among the most important computational resources for molecular biology and bioinformatics. A series of recent papers has shown that the Gene Ontology (GO), the most prominent taxonomic resource in these fields, is marked by flaws of certain characteristic types, which flow from a failure to address basic ontological principles. As yet, no methods have been proposed that would allow ontology curators to pinpoint flawed terms or definitions in ontologies in a systematic way. Results We present computational methods that automatically identify terms and definitions which are defined in a circular or unintelligible way. We further demonstrate the potential of these methods by applying them to isolate a subset of 6001 problematic GO terms. By automatically aligning GO with other ontologies and taxonomies we were able to propose alternative synonyms and definitions for some of these problematic terms; this allows us to demonstrate that these other resources do not contain definitions superior to those supplied by GO. Conclusion Our methods provide reliable indications of the quality of terms and definitions in ontologies and taxonomies. Further, they are well suited to drawing ontology curators' attention to ill-defined terms. We have also shown the limitations of ontology mapping and alignment in assisting ontology curators in rectifying problems, thus pointing to the need for manual curation.

     

    Source: BASE Selection for Comparative Literature
    Language: English
    Media type: Article (journal)
    Format: Online
    Parent title: BMC Bioinformatics, Vol 7, Iss 1, p 212 (2006)
    Subjects: Computer applications to medicine. Medical informatics; Biology (General)
  3. Thawing Frozen Robust Multi-array Analysis (fRMA)
    Published: 2011
    Publisher:  BMC

    Abstract Background A novel method of microarray preprocessing - Frozen Robust Multi-array Analysis (fRMA) - has recently been developed. This algorithm allows the user to preprocess arrays individually while retaining the advantages of multi-array preprocessing methods. The frozen parameter estimates required by this algorithm are generated using a large database of publicly available arrays. Curation of such a database and creation of the frozen parameter estimates is time-consuming; therefore, fRMA has only been implemented on the most widely used Affymetrix platforms. Results We present an R package, frmaTools, that allows the user to quickly create his or her own frozen parameter vectors. We describe how this package fits into a preprocessing workflow and explore the size of the training dataset needed to generate reliable frozen parameter estimates. This is followed by a discussion of specific situations in which one might wish to create one's own fRMA implementation. For a few specific scenarios, we demonstrate that fRMA performs well even when a large database of arrays is unavailable. Conclusions By allowing the user to easily create his or her own fRMA implementation, the frmaTools package greatly increases the applicability of the fRMA algorithm. The frmaTools package is freely available as part of the Bioconductor project.

     

    Source: BASE Selection for Comparative Literature
    Language: English
    Media type: Article (journal)
    Format: Online
    Parent title: BMC Bioinformatics, Vol 12, Iss 1, p 369 (2011)
    Subjects: Computer applications to medicine. Medical informatics; Biology (General)
  4. Computing paths and cycles in biological interaction graphs
    Published: 2009
    Publisher:  BMC

    Abstract Background Interaction graphs (signed directed graphs) provide an important qualitative modeling approach for Systems Biology. They enable the analysis of causal relationships in cellular networks and can even be useful for predicting qualitative aspects of systems dynamics. Fundamental issues in the analysis of interaction graphs are the enumeration of paths and cycles (feedback loops) and the calculation of shortest positive/negative paths. These computational problems have been discussed only to a minor extent in the context of Systems Biology, and in particular the shortest signed paths problem requires algorithmic developments. Results We first review algorithms for the enumeration of paths and cycles and show that these algorithms are superior to a recently proposed enumeration approach based on elementary-modes computation. The main part of this work deals with the computation of shortest positive/negative paths, an NP-complete problem for which only very few algorithms are described in the literature. We propose extensions and several new algorithm variants for computing either exact results or approximations. Benchmarks with various concrete biological networks show that exact results can sometimes be obtained in networks with several hundred nodes. A class of even larger graphs can still be treated exactly by a new algorithm combining exhaustive and simple search strategies. For graphs where the computation of exact solutions is time-consuming or infeasible, we devised an approximate algorithm with polynomial complexity. Strikingly, in realistic networks (where a comparison with exact results was possible) this algorithm delivered results that are very close or equal to the exact values. This phenomenon can probably be attributed to the particular topology of cellular signaling and regulatory networks, which contain a relatively low number of negative feedback loops.
    Conclusion The calculation of shortest positive/negative paths and cycles in interaction graphs is an important method for ...
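    While the shortest signed-path problem is NP-complete for simple paths, its walk-based relaxation is solvable in polynomial time by breadth-first search on a sign-expanded ("parity") graph. The sketch below illustrates that standard relaxation only; it is not the authors' algorithm, and the edge list and node names are hypothetical.

    ```python
    from collections import deque

    def shortest_signed_path(edges, src, dst, want_sign):
        """Length of the shortest src->dst walk whose overall sign
        (product of edge signs, each +1 or -1) equals want_sign.

        BFS on the sign-expanded graph: each node u is split into states
        (u, +1) and (u, -1), tracking the parity of negative edges used.
        This solves the polynomial walk relaxation, not the NP-complete
        simple-path version.
        """
        adj = {}
        for u, v, s in edges:
            adj.setdefault(u, []).append((v, s))
        seen = {(src, +1)}
        queue = deque([(src, +1, 0)])
        while queue:
            node, sign, dist = queue.popleft()
            if node == dst and sign == want_sign and dist > 0:
                return dist
            for nxt, s in adj.get(node, []):
                state = (nxt, sign * s)
                if state not in seen:
                    seen.add(state)
                    queue.append((nxt, sign * s, dist + 1))
        return None  # no walk with the requested sign exists
    ```

    For example, an activation A→B followed by an inhibition B→C yields a shortest negative A→C walk of length 2, even when a direct positive edge A→C exists.
    
    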

     

    Source: BASE Selection for Comparative Literature
    Language: English
    Media type: Article (journal)
    Format: Online
    Parent title: BMC Bioinformatics, Vol 10, Iss 1, p 181 (2009)
    Subjects: Computer applications to medicine. Medical informatics; Biology (General)
  5. LinkedImm: a linked data graph database for integrating immunological data

    Abstract Background Many systems biology studies leverage the integration of multiple data types (across different data sources) to offer a more comprehensive view of the biological system being studied. While SQL (Structured Query Language) databases are popular in the biomedical domain, NoSQL database technologies have been used as a more relationship-based, flexible and scalable method of data integration. Results We have created a graph database integrating data from multiple sources. In addition to using a graph-based query language (Cypher) for data retrieval, we have developed a web-based dashboard that allows users to easily browse and plot data without the need to learn Cypher. We have also implemented a visual graph query interface for users to browse graph data. Finally, we have built a prototype to allow the user to query the graph database in natural language. Conclusion We have demonstrated the feasibility and flexibility of using a graph database for storing and querying immunological data with complex biological relationships. Querying a graph database through such relationships has the potential to discover novel relationships among heterogeneous biological data and metadata.

     

    Source: BASE Selection for Comparative Literature
    Language: English
    Media type: Article (journal)
    Format: Online
    Parent title: BMC Bioinformatics, Vol 22, Iss S9, Pp 1-14 (2021)
    Subjects: Ontology; Knowledgebase; Graph database; Immunology; Influenza vaccine; Computer applications to medicine. Medical informatics; Biology (General)
  6. Feature Importance for Human Epithelial (HEp-2) Cell Image Classification
    Published: 2018
    Publisher:  MDPI AG

    Indirect Immuno-Fluorescence (IIF) microscopy imaging of human epithelial (HEp-2) cells is a popular method for diagnosing autoimmune diseases. Given the large data volumes involved, computer-aided diagnosis (CAD) systems based on image classification can improve the time, effort, and reliability of diagnosis. Such approaches rely on extracting representative features from the images. This work explores the selection of the most distinctive features for HEp-2 cell images using various feature selection (FS) methods. Since no single feature selection technique is universally optimal, we also propose a hybridization of one class of FS methods (filter methods). Furthermore, the notion of variable importance for ranking features, provided by another class of approaches (embedded methods such as random forests and random uniform forests), is exploited to select a good subset of features from a large set, such that adding new features does not increase classification accuracy. We have also carefully designed class-specific features to capture morphological visual traits of the cell patterns. Various experiments and discussions demonstrate the effectiveness of the FS methods on both the proposed and a standard feature set. We achieve state-of-the-art performance even with a small number of features obtained after feature selection.
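    The filter-versus-embedded distinction above can be illustrated with scikit-learn. This is a generic sketch on synthetic data standing in for image features, not the paper's hybrid scheme; the dataset sizes and the choice of 5 selected features are arbitrary assumptions.

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_selection import SelectKBest, f_classif

    # Synthetic stand-in for extracted cell-image features:
    # 200 samples, 20 features, 5 of them informative.
    X, y = make_classification(n_samples=200, n_features=20,
                               n_informative=5, n_redundant=0,
                               random_state=0)

    # Filter method: rank each feature by a univariate ANOVA F-score,
    # independently of any classifier.
    filter_scores = SelectKBest(f_classif, k=5).fit(X, y).scores_
    filter_top5 = np.argsort(filter_scores)[-5:]

    # Embedded method: rank features by random-forest variable
    # importance, learned jointly with the classifier.
    rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
    embedded_top5 = np.argsort(rf.feature_importances_)[-5:]
    ```

    Comparing (or intersecting) the two rankings is one simple way to hybridize the methods, though the paper's actual hybridization may differ.
    
    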

     

    Source: BASE Selection for Comparative Literature
    Language: English
    Media type: Article (journal)
    Format: Online
    Parent title: Journal of Imaging, Vol 4, Iss 3, p 46 (2018)
    Subjects: feature selection; filter methods; hybridization; random forest; class-specific features; Photography; Computer applications to medicine. Medical informatics; Electronic computers. Computer science
  7. Shape Similarity Measurement for Known-Object Localization: A New Normalized Assessment
    Published: 2019
    Publisher:  MDPI AG

    This paper presents a new, normalized measure for assessing a contour-based object pose. For binary images, the algorithm enables supervised assessment of known-object recognition and localization: a performance measure quantifies the differences between a reference edge map and a candidate image, and normalization makes the result of the pose assessment easy to interpret. The new measure is further motivated by highlighting the sensitivity of existing metrics to the main shape variations (translation, rotation, and scaling) and by showing that the proposed measure is more robust to them. Indeed, the measure can determine to what extent an object shape differs from a desired position. Experiments on real images at different sizes/scales, compared against six other approaches, demonstrate the suitability of the new method for object-pose or shape-matching estimation.

     

    Source: BASE Selection for Comparative Literature
    Language: English
    Media type: Article (journal)
    Format: Online
    Parent title: Journal of Imaging, Vol 5, Iss 10, p 77 (2019)
    Subjects: distance measures; contours; shape; pose evaluation; Photography; Computer applications to medicine. Medical informatics; Electronic computers. Computer science
  8. Zig-Zag Based Single-Pass Connected Components Analysis
    Published: 2019
    Publisher:  MDPI AG

    Single-pass connected components analysis (CCA) algorithms suffer from a time overhead to resolve labels at the end of each image row. This work demonstrates how this overhead can be eliminated by replacing the conventional raster scan with a zig-zag scan, which enables chains of labels to be resolved correctly while the next image row is being processed. The result is faster worst-case processing with no end-of-row overhead. CCA hardware architectures using the novel algorithm proposed in this paper are therefore able to process images at higher throughput than other state-of-the-art methods while reducing the hardware requirements. The latency introduced by the conversion from raster scan to zig-zag scan is compensated for by a new method of detecting object completion, which enables the feature vector for each completed connected component to be output at the earliest possible opportunity.
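    For context on what label resolution means here, the classic software baseline is two-scan labeling with union-find: a raster scan assigns provisional labels and records equivalences, and a second pass resolves them. This is the resolution cost that single-pass zig-zag hardware designs avoid; the sketch below is that textbook baseline, not the paper's algorithm.

    ```python
    def label_components(img):
        """Two-scan connected-components labeling (4-connectivity).

        img is a list of rows of 0/1 values; returns a same-shape
        matrix of resolved component labels (0 for background).
        """
        h, w = len(img), len(img[0])
        parent = {}

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        labels = [[0] * w for _ in range(h)]
        next_label = 1
        # First scan: provisional labels plus recorded equivalences.
        for y in range(h):
            for x in range(w):
                if not img[y][x]:
                    continue
                up = labels[y - 1][x] if y > 0 else 0
                left = labels[y][x - 1] if x > 0 else 0
                if up and left:
                    labels[y][x] = left
                    ru, rl = find(up), find(left)
                    if ru != rl:
                        parent[ru] = rl  # merge the two label chains
                elif up or left:
                    labels[y][x] = up or left
                else:
                    parent[next_label] = next_label
                    labels[y][x] = next_label
                    next_label += 1
        # Second scan: resolve every provisional label to its root.
        for y in range(h):
            for x in range(w):
                if labels[y][x]:
                    labels[y][x] = find(labels[y][x])
        return labels
    ```

    A U-shaped blob, for instance, receives two provisional labels on its arms that are merged only when the bottom row joins them; avoiding this deferred merging in hardware is precisely the motivation for the zig-zag scan.
    
    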

     

    Source: BASE Selection for Comparative Literature
    Language: English
    Media type: Article (journal)
    Format: Online
    Parent title: Journal of Imaging, Vol 5, Iss 4, p 45 (2019)
    Subjects: connected components analysis; stream processing; feature extraction; zig-zag scan; hardware architecture; FPGA; pipeline; Photography; Computer applications to medicine. Medical informatics; Electronic computers. Computer science
  9. Identification of End-User Economical Relationship Graph Using Lightweight Blockchain-Based BERT Model

    Current methods for extracting information from user resumes perform poorly on the unstructured resumes found in economic announcements and on documents that mention the same users. This study converts unstructured user information into structured user-information templates and proposes a method for building person-relationship graphs in the economic domain, investigated here in the context of the Chinese financial system. First, a lightweight blockchain-based BERT model (B-BERT) is trained. The pretrained B-BERT model is then used to obtain event-instance vectors, classify them appropriately, and populate hierarchical user-information templates with accurate user characteristics. Relationships between users are then extracted from the filled-in templates and assembled into a user-relationship graph. Finally, the approach is evaluated on a manually annotated dataset. The experimental results show that the proposed approach can efficiently retrieve information from unstructured financial-personnel resume text and generate a character-relationship graph in the economic sphere.

     

    Source: BASE Selection for Comparative Literature
    Language: English
    Media type: Article (journal)
    Format: Online
    Parent title: Computational Intelligence and Neuroscience, Vol 2022 (2022)
    Subjects: Computer applications to medicine. Medical informatics; Neurosciences. Biological psychiatry. Neuropsychiatry
  10. Dataset of Bessel function Jn maxima and minima to 600 orders and 10000 extrema

    Bessel functions of the first kind are ubiquitous in the sciences and engineering in solutions to cylindrical problems including electrostatics, heat flow, and the Schrödinger equation. The roots of the Bessel functions are often quoted and calculated, but the maxima and minima of each Bessel function, used to match Neumann boundary conditions, have not had the same treatment. Here we compute 10000 extrema for each of the first 600 orders of the Bessel function Jn. To do this, we employ an adaptive root solver bounded by the roots of the Bessel function and solve to an accuracy of 10⁻¹⁹. We compare with the existing literature (to 30 orders and 5 maxima and minima) and the results match exactly. It is hoped that these data provide values needed for orthogonal function expansions and numerical expressions, including the calculation of geometric correction factors in the measurement of resistivity of materials, as is done in the original paper using these data.
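    Since the extrema of Jn are exactly the zeros of its derivative Jn′, a few entries of such a table can be cross-checked independently with SciPy (a different routine from the paper's adaptive solver, and at double rather than 10⁻¹⁹ precision):

    ```python
    import numpy as np
    from scipy import special

    n, nt = 2, 5
    # jnp_zeros: first nt positive zeros of the derivative Jn'(x),
    # i.e. the locations of the first nt extrema of Jn.
    x_ext = special.jnp_zeros(n, nt)
    j_ext = special.jv(n, x_ext)  # values of J_2 at those extrema
    ```

    Evaluating jvp (the derivative of Jn) at the returned points should give values numerically close to zero, confirming they are extrema.
    
    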

     

    Source: BASE Selection for Comparative Literature
    Language: English
    Media type: Article (journal)
    Format: Online
    Parent title: Data in Brief, Vol 39, Iss , Pp 107508- (2021)
    Subjects: Bessel functions; GCF; Extrema; Minimum; Maximum; Computer applications to medicine. Medical informatics; Science (General)
  11. A (fire)cloud-based DNA methylation data preprocessing and quality control platform

    Abstract Background Bisulfite sequencing allows base-pair resolution profiling of DNA methylation and has recently been adapted for use in single cells. Analyzing these data, including making comparisons with existing data, remains challenging due to the scale of the data and differences in preprocessing methods between published datasets. Results We present a set of preprocessing pipelines for bisulfite sequencing DNA methylation data that include a new R/Bioconductor package, scmeth, for a series of efficient QC analyses of large datasets. The pipelines go from raw data to CpG-level methylation estimates and can be run, with identical results, on a single computer, on an HPC cluster, or on Google Cloud Compute resources. These pipelines are designed to allow users to 1) ensure reproducibility of analyses, 2) achieve scalability to large whole-genome datasets with 100 GB+ of raw data per sample and to single-cell datasets with thousands of cells, 3) integrate and compare user-provided data with publicly available data, as all samples can be processed through the same pipeline, and 4) access best-practice analysis pipelines. Pipelines are provided for whole genome bisulfite sequencing (WGBS), reduced representation bisulfite sequencing (RRBS) and hybrid selection (capture) bisulfite sequencing (HSBS). Conclusions The workflows produce data quality metrics, visualization tracks, and aggregated output for further downstream analysis. Optional use of cloud computing resources facilitates analysis of large datasets and integration with existing methylome profiles. The workflow design principles are applicable to other genomic data types.

     

    Source: BASE Selection for Comparative Literature
    Language: English
    Media type: Article (journal)
    Format: Online
    Parent title: BMC Bioinformatics, Vol 20, Iss 1, Pp 1-5 (2019)
    Subjects: DNA methylation; Cloud computing; Bioinformatics workflows; Quality control analysis; Computer applications to medicine. Medical informatics; Biology (General)