[Xerte-dev] Re: FW: Xenith/XOT Glossary regexps

Fay Cross Fay.Cross at nottingham.ac.uk
Wed Oct 17 09:50:30 BST 2012


Thanks for looking into this, John.  I'm just looking at the code you sent and you're right, it doesn't take punctuation into account.  It works when the word is at the beginning or end of a sentence, with or without a space immediately before or after, but whenever there's punctuation next to it (cat. cat? etc.) the whole match gets replaced with cat and the punctuation is lost.  Also, longer words that start with a word from the glossary get replaced with the shorter glossary word (e.g. category becomes cat).
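
One way to address both problems is sketched below. It is an illustration under stated assumptions, not the code that went into Xenith. The helper names escapeRegExp, glossaryRegExp and linkGlossaryWord are hypothetical, as is the sample entry. The expression captures the character before the word so it can be put back, and uses a lookahead so that a following letter (as in category) blocks the match while following punctuation is left untouched. It assumes the definition contains no double quotes. Note that \w only covers ASCII letters, digits and underscore, so accented characters still count as word breaks.

```javascript
// Hypothetical helper: escape characters that are special inside a RegExp,
// so a glossary word such as "C++" is matched literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: match `word` only as a whole word.
// Group 1 is the character before the word (empty at the start of the text),
// group 2 is the word exactly as written in the text.
// The lookahead (?!\w) rejects "category" but does not consume punctuation.
function glossaryRegExp(word) {
    return new RegExp('(^|\\W)(' + escapeRegExp(word) + ')(?!\\w)', 'gi');
}

// Hypothetical helper: wrap every whole-word occurrence of entry.word in a link.
// Assumes entry.definition contains no double quotes.
function linkGlossaryWord(text, entry) {
    return text.replace(glossaryRegExp(entry.word), function (match, before, found) {
        return before + '<a class="glossary" href="#" title="' +
            entry.definition + '">' + found + '</a>';
    });
}

// Sample entry, made up for this example.
console.log(linkGlossaryWord('Cat? The category has a cat.',
    { word: 'cat', definition: 'feline' }));
// <a class="glossary" href="#" title="feline">Cat</a>? The category has a <a class="glossary" href="#" title="feline">cat</a>.
```

Building the replacement with a function rather than a string also keeps the capitalisation used in the text (Cat stays Cat) and means a $ in the definition is inserted literally.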


-----Original Message-----
From: xerte-dev-bounces at lists.nottingham.ac.uk [mailto:xerte-dev-bounces at lists.nottingham.ac.uk] On Behalf Of Smith, John
Sent: 15 October 2012 18:48
To: xerte-dev at lists.nottingham.ac.uk
Subject: [Xerte-dev] Re: FW: Xenith/XOT Glossary regexps

Thinking about it over dinner, though, I've not taken ending punctuation into consideration - beginner's mistake!!

Will look on my return unless it's solved beforehand...

Regards

John Smith
Learning Technologist
School of Health and Life Sciences




Pat Lockley <patrick.lockley at googlemail.com> wrote:


Don't think that regexp works if the word is the first thing in a sentence
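
It is not clear from the thread which expression is meant here, so the sketch below simply runs three candidates against a sample sentence that starts with the glossary word. The labels (spaces, proposed, boundary), the function names and the sample sentence are made up for this check. The first two expressions are copied from the code quoted further down, and the \b version is included only for comparison.

```javascript
// Candidate expressions for one glossary word.
function candidates(word) {
    return {
        // the commented-out original: word with a space on each side
        spaces: new RegExp(' ' + word + ' ', 'ig'),
        // the expression proposed in the quoted message below
        proposed: new RegExp('\\s' + word + '[^\\s]*|^' + word + '[^\\s]*', 'gi'),
        // plain word boundaries, for comparison only
        boundary: new RegExp('\\b' + word + '\\b', 'gi')
    };
}

// Return the substrings that each candidate matches in `text`.
function matchesFor(word, text) {
    var result = {};
    var patterns = candidates(word);
    for (var name in patterns) {
        result[name] = text.match(patterns[name]) || [];
    }
    return result;
}

console.log(matchesFor('cat', 'Cat food. A cat sat. The cat? A category.'));
// spaces:   [" cat "]
// proposed: ["Cat", " cat", " cat?", " category."]
// boundary: ["Cat", "cat", "cat"]
```

On this sample only the space-delimited version misses the word at the start of the text. The proposed expression finds it, but its matches also include trailing punctuation and the rest of longer words, so replacing the whole match discards them.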

On 15 Oct 2012, at 07:10, Julian Tenney <Julian.Tenney at nottingham.ac.uk> wrote:

> Just forwarding this to the list for everyone's info: Fay can use it in the Xenith code. I'm not sure if I can integrate it into the engine, as this is (I guess) JavaScript and not the ActionScript RegExp engine (although the expressions should work in both...). I'll try...
>
> -----Original Message-----
> From: Smith, John [mailto:J.J.Smith at gcu.ac.uk]
> Sent: 14 October 2012 17:26
> To: julian.tenney at nottingham.ac.uk; Fay.Cross at nottingham.ac.uk; 
> ronm at mitchellmedia.co.uk; reijnders at tor.nl
> Subject: Xenith/XOT Glossary regexps
> Importance: High
>
> Hi guys,
>
> Great to meet you all. I've been looking through the Xenith code to see where I can contribute. I have also been looking through the archives and came across the regexp problem for the glossary. Since I've only joined the list proper today, I'm not sure whether a reply will go through to the correct place, so I'm sending this to you all to see if it helps...
>
> Not sure whether this has been fixed yet, but it seems the problem is partly caused by \b requiring a word boundary and there being no word boundary on the very first word. Also, I seem to remember reading somewhere that \b can in some cases match international characters in the middle of words, which might not be the desired effect...
>
> I have changed the regexp to this "\sTERM[^\s]*|^TERM[^\s]*" in the xenith.js code, like so:
>
> // function makes every glossary word found into a link
> function insertGlossaryTag(node) {
>        var temp = node.nodeValue;
>        for (var k=0; k<glossary.length; k++) {
>                // ** see recent emails on list about regular expression stuff **
>                //var regExp = new RegExp(" " + glossary[k].word + " ", "ig");
>
>                var regExp = new RegExp('\\s' + glossary[k].word + '[^\\s]*|^' + glossary[k].word + '[^\\s]*', 'gi');
>
>                temp = temp.replace(regExp, ' <a class="glossary" href="#" title="' + glossary[k].definition + '">' + glossary[k].word + '</a> ');
>        }
>        node.nodeValue = temp;
> }
>
> and now it seems to match all the words, no matter where they are and irrespective of spaces. See attached screenshots - you can see there are no spaces before any words and only some have a space after. Probably needs further testing to go into xot though...
>
> Will start adding to the list soon...
>
> Regards,
>
> John Smith | Learning Technologist
> Room A251, Govan Mbeki Building | School of Health & Life Sciences | 
> Glasgow Caledonian University Cowcaddens Road | Glasgow | G4 0BA
>
> Glasgow Caledonian University is a registered Scottish charity, number SC021474
>
> Winner: Times Higher Education’s Widening Participation Initiative of the Year 2009 and Herald Society’s Education Initiative of the Year 2009.
> http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,6219,en.html
>
> Winner: Times Higher Education’s Outstanding Support for Early Career Researchers of the Year 2010, GCU as a lead with Universities Scotland partners.
> http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,15691,en.html
>
> This message and any attachment are intended solely for the addressee and may contain confidential information. If you have received this message in error, please send it back to me, and immediately delete it.   Please do not use, copy or disclose the information contained in this message or in any attachment.  Any views or opinions expressed by the author of this email do not necessarily reflect the views of the University of Nottingham.
>
> This message has been checked for viruses but the contents of an 
> attachment may still contain software viruses which could damage your computer system:
> you are advised to perform your own checks. Email communications with 
> the University of Nottingham may be monitored as permitted by UK legislation.
>
> <glossary.png>
> <glossary2.png>
> _______________________________________________
> Xerte-dev mailing list
> Xerte-dev at lists.nottingham.ac.uk
> http://lists.nottingham.ac.uk/mailman/listinfo/xerte-dev
>
