[gmx-users] Data and Knowledge Management Tools for Computational Chemists

Thu Jan 6 14:40:18 CET 2005

Dear All,

I hope this message is not too off-topic for the Gromacs list, but I think it
relates directly to the production, treatment and dissemination of
scientific results, e.g. those obtained with Gromacs.

I am looking for software, tools or general approaches to get a grip on the
wealth of information that accumulates (mostly) electronically: in particular
emails, text/PDF/XML or similar documents, bookmarks to websites and
bibliographic references (but eventually also results from calculations,
location of trajectories, ...).

The main requirement is to be able to "store" information as is, without
having to enter it item by item into a curated database. Filtering, indexing
or cataloguing through a script would be OK, though. A powerful search should
be possible.
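
To make the "index through a script, then search" idea concrete, here is a
minimal sketch of a full-text index over files stored as is. It is only an
illustration under stated assumptions, not a replacement for a real search
engine: the function names (tokenize, build_index, search, load_text_files)
and the idea of keeping everything as *.txt files under one directory are
hypothetical, and only plain text is handled (no PDF, XML or mailbox parsing).

```python
import re
from collections import defaultdict
from pathlib import Path

# Words are runs of ASCII letters and digits; a deliberate simplification.
WORD = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Split text into lower-case words."""
    return WORD.findall(text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of documents containing every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


def load_text_files(root):
    """Read every *.txt file below root, unmodified (hypothetical layout)."""
    return {
        str(path): path.read_text(errors="replace")
        for path in Path(root).rglob("*.txt")
    }
```

With this, something like search(build_index(load_text_files("notes")), "lipid
trajectory") would list the files mentioning both words, where "notes" is a
placeholder directory name.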

Some specific points:
- concerning bibliographic references, there is a wide variety of formats,
  such as PubMed entries, email alerts, quotes on websites, ... sometimes with
  a comment by the person who sent the reference, sometimes with a URL, ...
  I would like to be able to gather all of this in a first pass without
  having to parse the format by hand (e.g. where the authors, title, etc. are).
- concerning bookmarks, it would be nice to also eliminate duplicates and
  dead links
- taking it one step further, indexing the sites listed in the bookmarks might
  be a useful additional step
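
As an illustration of the bookmark point above, duplicate elimination and a
rough dead-link check could be sketched as follows. This is a hedged example,
not an existing tool: the helper names (normalise, deduplicate, is_alive) are
made up, bookmarks are assumed to be already available as (title, URL) pairs
(reading them from XBEL or a browser file is left out), and "duplicate" is
taken to mean "same URL after ignoring case of scheme and host, default port
and fragment".

```python
from urllib.error import URLError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request, urlopen

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalise(url):
    """Reduce a URL to a canonical form so trivial variants compare equal."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Keep the port only if it is explicit and not the scheme's default.
    port = parts.port
    if port is not None and port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{port}"
    path = parts.path or "/"
    # Drop the fragment: it points inside a page, not to a different page.
    return urlunsplit((scheme, host, path, parts.query, ""))


def deduplicate(bookmarks):
    """Keep the first (title, url) pair seen for each normalised URL."""
    seen = set()
    unique = []
    for title, url in bookmarks:
        key = normalise(url)
        if key not in seen:
            seen.add(key)
            unique.append((title, url))
    return unique


def is_alive(url, timeout=10):
    """Best-effort check; some servers reject HEAD, so False is only a hint."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return response.status < 400
    except (URLError, ValueError, OSError):
        return False
```

A cleaned list would then be [b for b in deduplicate(bookmarks) if
is_alive(b[1])], bearing in mind that a site that is temporarily down would be
dropped too, so flagging rather than deleting is probably safer.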

After an extensive search of the web, I could not come up with a fully
satisfactory solution. My current best bet would be to index text and other
files and email with a search engine such as namazu. For bookmarks I'd ideally
like to store them in XBEL format, but there seem to be only a limited number
of tools, and few if any that eliminate duplicates and dead links.
A useful bookmark tool might be bookmarker.
FramerD (a database) also seems an interesting possibility, but would probably
require a substantial amount of coding.

In an ideal world, I'd also love to make use of some artificial intelligence
code (e.g. self-organizing maps, textual data mining, ...) or some
machine-learning tools, but my feeling is that those are not (yet) usable by
non-experts.

My question is: what do other people in the field use? Are there any miracle
packages that would do all that I want? Are there other/better approaches?

Thanks very much in advance.
  Marc Baaden

-- 
 Dr. Marc Baaden  - Institut de Biologie Physico-Chimique, Paris
 mailto:baaden at smplinux.de      -      http://www.marc-baaden.de
 FAX: +33 15841 5026  -  Tel: +33 15841 5176  or  +33 609 843217