[Otmi-discuss] What's New on the OTMI Wiki
Hammond, Tony
t.hammond at nature.com
Thu Jul 19 12:41:34 EDT 2007
The wiki at 'opentextmining.org' is a public forum for exchanging information
about OTMI and opening up discussion of it.
This post lists some of the changes made since we last blogged on OTMI
development, in a post to Nascent back in February. Do go and take a look at
the wiki and consider contributing.
tags: spam, unicode, relax-ng, xsd, gem
Cheers,
Tony
= Access control
We set up the wiki as a public utility with public read/write access but, as
is the way of the digital world, the wiki has been regularly targeted by
spammers. To address this we first introduced a very lightweight security
measure - the requirement to set up an account in order to post.
Unfortunately, this was not enough to stop spam attacks and we are currently
resorting to limiting write access to approved IP addresses. We are also
looking into more robust yet flexible measures. Read access remains public.
Feel free to post (publicly or privately) any suggestions for how best we
might manage access to the wiki while preventing spam.
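As a rough illustration of the IP-based restriction described above, here is a
minimal Ruby sketch using the standard 'ipaddr' library. It is not the wiki's
actual configuration: the constant APPROVED_RANGES, the method write_allowed?
and the address ranges (taken from the blocks reserved for documentation) are
all hypothetical.

```ruby
require 'ipaddr'

# Hypothetical allowlist. These are documentation-reserved ranges,
# not the wiki's real settings.
APPROVED_RANGES = ['192.0.2.0/24', '2001:db8::/32'].map { |cidr| IPAddr.new(cidr) }

# Returns true when remote_ip falls inside any approved range.
def write_allowed?(remote_ip)
  addr = IPAddr.new(remote_ip)
  APPROVED_RANGES.any? { |range| range.include?(addr) }
rescue IPAddr::Error
  false # malformed addresses never get write access
end
```

A check like this would only gate writes; reads would bypass it entirely,
which matches the policy of keeping read access public.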
= Changes to spec
We have continued to make changes to the OTMI spec.
1. Changed version numbering to standard 0.0.0 style
2. Added new attribute 'data/@version'
3. Added new attributes 'vectors/@number', 'snippets/@number'
4. Added new element 'table/title'
= Schemas
As part of the site overhaul we have upgraded the reference grammar (in
ABNF) and have added the following concrete schemas:
1. Relax NG (Compact)
2. Relax NG
3. W3C XML Schema
4. DTD
= Ruby gem
The demo generator Ruby script 'gen_otmi.rb' has been repackaged as a Ruby
gem: 'otmi-0.4.2.gem'. Reasons for this are twofold:
1. To simplify script distribution and installation
2. To ease code management
Code changes include the following:
1. Changed version numbering to standard 0.0.0 style
2. Added command-line options
3. Modularized the file
4. Repackaged as a Ruby gem: 'otmi-0.4.2.gem'
   - Added LICENSE, INSTALL and README files
5. Changed sort order on vectors - now case insensitive
6. Regex processing
   - Added support for Unicode characters
   - Fixed error which removed errant [<>&'"] chars
7. Added early support for unknown DTD
   - Added tag: for atom:id when unknown DTD
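
To make items 5 and 6 above more concrete, here is a minimal Ruby sketch of
what case-insensitive sorting and Unicode-aware regex processing can look like.
It is not taken from the gem itself: the names sort_terms, tokenize and
escape_xml are hypothetical, and escaping the characters [<>&'"] instead of
dropping them is only one assumed reading of the fix in item 6.

```ruby
# Sort terms without regard to case. Ties fall back to the raw string
# so the order is deterministic.
def sort_terms(terms)
  terms.sort_by { |t| [t.downcase, t] }
end

# Match runs of Unicode letters and digits (\p{L}, \p{N}) rather than
# relying on the ASCII-only \w class.
def tokenize(text)
  text.scan(/[\p{L}\p{N}]+/)
end

XML_ESCAPES = {
  '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
  '"' => '&quot;', "'" => '&apos;'
}.freeze

# Assumption: keep the five XML special characters by escaping them,
# instead of silently removing them from the text.
def escape_xml(text)
  text.gsub(/[<>&'"]/, XML_ESCAPES)
end
```

With these definitions, sort_terms places 'apple' before 'Banana' and 'Zebra',
whereas a plain sort would put the capitalized words first.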