[Xerte-dev] Re: FW: Xenith/XOT Glossary regexps

Fay Cross Fay.Cross at nottingham.ac.uk
Fri Oct 19 09:59:04 BST 2012


Thanks John, will try your new code in the HTML5 version later today.  I agree with Tom that it shouldn't look for plurals as well. I might change my code so that only the first instance of a particular word on a page gets made into a link, so pages don't get covered with links to the same word's definition.  I'm not sure if this would be a problem or not.
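
One way the first-instance-only idea could look, as a minimal sketch rather than the actual Xenith change: `linkFirstInstances` and the `linked` lookup are hypothetical names, and the function works on an array of plain strings (one per text node) so that it runs on its own. It reuses the pattern from the insertGlossaryTag function quoted further down this thread, but drops the 'g' flag so that replace() only touches the first match.

```javascript
// Hypothetical helper: links only the first occurrence of each glossary word
// across all the strings passed in (assumed to be the text nodes of one page).
function linkFirstInstances(texts, glossary) {
  var linked = Object.create(null); // lower-cased words that already have a link
  return texts.map(function (text) {
    glossary.forEach(function (entry) {
      var key = entry.word.toLowerCase();
      if (linked[key]) {
        return; // this word was linked earlier on the page
      }
      // No 'g' flag, so replace() stops after the first match in this string.
      var regExp = new RegExp('(^|\\s)(' + entry.word + ')([\\s\\.,!?]|$)', 'i');
      if (regExp.test(text)) {
        text = text.replace(regExp, '$1<a class="glossary" href="#" title="$2">$2</a>$3');
        linked[key] = true;
      }
    });
    return text;
  });
}
```

With this, linkFirstInstances(['The cat sat.', 'A cat and a Cat.'], [{word: 'cat'}]) links only the cat in the first string and returns the second string unchanged.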

I haven't had a chance to look at what you did on the drag and drop yet but will try to have a look at this later too.
________________________________________
From: xerte-dev-bounces at lists.nottingham.ac.uk [xerte-dev-bounces at lists.nottingham.ac.uk] On Behalf Of Julian Tenney [Julian.Tenney at nottingham.ac.uk]
Sent: 19 October 2012 09:47
To: For Xerte technical developers
Subject: [Xerte-dev] Re: FW: Xenith/XOT Glossary regexps

> This possibly leads to another question: should glossary terms that appear in glossary descriptions be hyperlinked to their relevant glossary item?

Nice idea.

In my experience, not many people take the time to add a highly detailed glossary, and the glossary is only as good as its list of definitions. Another problem is having to recreate it for every LO.

It would be worth seeing if there is any open dictionary service, say, that would make it easy to look up *any* word (but I know there are problems with that because of context). However, that's a separate problem.



-----Original Message-----
From: xerte-dev-bounces at lists.nottingham.ac.uk [mailto:xerte-dev-bounces at lists.nottingham.ac.uk] On Behalf Of Smith, John
Sent: 19 October 2012 09:34
To: xerte-dev at lists.nottingham.ac.uk
Subject: [Xerte-dev] Re: FW: Xenith/XOT Glossary regexps

Hi Tom

Yes, that's what I thought, but I just thought I'd check. I think the best way is for content authors to define plurals in the glossary if required.

This possibly leads to another question: should glossary terms that appear in glossary descriptions be hyperlinked to their relevant glossary item?

For example if you had the glossary open and the first definition mentions xerte then should xerte be hyperlinked and if you click on it then the glossary scrolls down to xerte?

Would anyone find that useful functionality that I could/should work on?

Regards

John Smith
Learning Technologist
School of Health and Life Sciences

Sent from Samsung Galaxy SII



Tom Reijnders <reijnders at tor.nl> wrote:


 John,

Yes, I think this regexp is usable in xot as well. I'll implement it and will test it.

With regards to your second remark, my vote would be not to do this, because the way plurals are formed is different from language to language.

Regards,

Tom

Op 19-10-2012 9:37, Smith, John schreef:

Hi Fay,

Greetings from the Czech Republic!! Yes, sorry, I got a bit carried away and didn't read the brief correctly or do sufficient testing!!

Anyway, I think I have cracked it and now have a regular expression that 'seems' to work as desired (in my testing anyway!!). It will need to be tested with real-life data that I don't have access to, so I'm putting it out there to see if anyone can break it. It uses the glossary term only once, with 3 capture groups: the parts before and after the term (which may be nothing, using ^ and $) and the actual term (so that we maintain the case). It also handles some punctuation (this can easily be expanded upon). This means that it may even be suitable for xot as well.


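// function makes every glossary word found into a link
function insertGlossaryTag(node) {
        var temp = node.nodeValue;
        for (var k=0; k<glossary.length; k++) {
                // group 1: start of text or whitespace before the term
                // group 2: the term as written in the text (so its case is kept)
                // group 3: whitespace, basic punctuation or end of text after the term
                var regExp = new RegExp('(^|\\s)(' + glossary[k].word + ')([\\s\\.,!?]|$)', 'gi');
                temp = temp.replace(regExp, '$1<a class="glossary" href="#" title="$2">$2</a>$3');
        }
        node.nodeValue = temp;
}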
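
A possible hardening of the function above, as a sketch rather than tested Xenith code: `escapeRegExp` and `linkGlossaryTerms` are hypothetical names, and working on a plain string instead of a DOM node is an assumption made so that the example runs on its own. It escapes regular expression metacharacters in the glossary word (so a term such as C++ does not break the pattern) and uses a lookahead for the trailing boundary, so that the space between two adjacent terms is not consumed by the first match.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a RegExp.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical string-based variant of insertGlossaryTag.
function linkGlossaryTerms(text, glossary) {
  for (var k = 0; k < glossary.length; k++) {
    // (?=...) checks the character after the term without consuming it,
    // so there is no third group to put back in the replacement.
    var regExp = new RegExp(
      '(^|\\s)(' + escapeRegExp(glossary[k].word) + ')(?=[\\s.,!?]|$)', 'gi');
    text = text.replace(regExp, '$1<a class="glossary" href="#" title="$2">$2</a>');
  }
  return text;
}

// Both adjacent occurrences are linked, and the case of each is kept.
console.log(linkGlossaryTerms('Cat cat', [{ word: 'cat' }]));
// A term containing metacharacters no longer throws when the pattern is built.
console.log(linkGlossaryTerms('I like C++.', [{ word: 'C++' }]));
```

The lookahead matters for text like "cat cat": with a consuming third group the space belongs to the first match, so the second cat has nothing left to satisfy (^|\s).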


Again this has only been tested with Xenith code. It's also thrown up other questions such as:

1. Should it match the first or second part of a hyphenated word? Not a real-life example, but cat-fish for instance.
2. How should it handle plurals, if at all, such as cats? 's' and 'es' etc. could be added to the punctuation group, so it would hyperlink cat and leave the trailing s outside the link. Would that be fine?
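
To make the two questions concrete, here is a small sketch of what the pattern above does with these cases today, and what changes if extra characters are allowed after the term. `buildPattern`, `link` and the `extraAfter` parameter are hypothetical names used only for this experiment, and a bare <a> tag stands in for the full link markup.

```javascript
// Builds the same pattern as insertGlossaryTag for one term. `extraAfter` is a
// hypothetical parameter: extra characters to accept in the trailing group.
function buildPattern(word, extraAfter) {
  return new RegExp(
    '(^|\\s)(' + word + ')([\\s\\.,!?' + (extraAfter || '') + ']|$)', 'gi');
}

function link(text, regExp) {
  return text.replace(regExp, '$1<a>$2</a>$3');
}

var sample = 'a cat-fish and two cats';

// As posted: neither '-' nor 's' may follow the term, so nothing is linked.
var asPosted = link(sample, buildPattern('cat'));

// Allowing '-' and 's' after the term links cat in both places and leaves
// the '-' and the 's' outside the link.
var relaxed = link(sample, buildPattern('cat', '\\-s'));

console.log(asPosted);
console.log(relaxed);
```

One caveat with the relaxed form: a single trailing character cannot tell a plural from a longer word, so cat would also be linked inside catsup.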

Regards,

John Smith | Learning Technologist
Room A251, Govan Mbeki Building | School of Health & Life Sciences | Glasgow Caledonian University Cowcaddens Road | Glasgow | G4 0BA
________________________________________
From: xerte-dev-bounces at lists.nottingham.ac.uk [xerte-dev-bounces at lists.nottingham.ac.uk] On Behalf Of Fay Cross [Fay.Cross at nottingham.ac.uk]
Sent: 17 October 2012 09:50
To: For Xerte technical developers
Subject: [Xerte-dev] Re: FW: Xenith/XOT Glossary regexps

Thanks for looking into this, John.  I'm just looking at the code you sent and you're right, it doesn't take punctuation into account.  It works when the word is at the beginning or end of a sentence, with or without a space immediately before or after, but whenever there's punctuation next to it (cat. cat? etc.) it just gets replaced with cat and the punctuation is lost.  Also, longer words that start with a word from the glossary get replaced with the shorter glossary word (e.g. category becomes cat).
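
Both effects can be reproduced outside the browser. The sketch below applies the pattern from the 14 October message (quoted further down) to a plain string; `oldReplace` is a hypothetical wrapper, and square brackets stand in for the link markup to keep the output readable.

```javascript
// Same pattern as the 14 October version: the term after whitespace or at the
// start of the text, followed by any run of non-whitespace characters.
function oldReplace(text, word) {
  var regExp = new RegExp('\\s' + word + '[^\\s]*|^' + word + '[^\\s]*', 'gi');
  // The whole match is replaced by the bare glossary word, so whatever
  // [^\s]* consumed (punctuation or the rest of a longer word) is dropped.
  return text.replace(regExp, ' [' + word + '] ');
}

var result = oldReplace('Is it a cat? A category.', 'cat');
console.log(result);
```

In the result the '?' after cat is gone and category has been reduced to the glossary word, because [^\s]* matches everything up to the next space and the replacement string does not put it back.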


-----Original Message-----
From: xerte-dev-bounces at lists.nottingham.ac.uk [mailto:xerte-dev-bounces at lists.nottingham.ac.uk] On Behalf Of Smith, John
Sent: 15 October 2012 18:48
To: xerte-dev at lists.nottingham.ac.uk
Subject: [Xerte-dev] Re: FW: Xenith/XOT Glossary regexps

Thinking about it over dinner though, I've not taken ending punctuation into consideration - beginner's mistake!!

Will look on my return unless it's solved beforehand...

Regards

John Smith
Learning Technologist
School of Health and Life Sciences

Sent from Samsung Galaxy SII



Pat Lockley <patrick.lockley at googlemail.com> wrote:


Don't think that regexp works if the word is the first thing in a sentence

On 15 Oct 2012, at 07:10, Julian Tenney <Julian.Tenney at nottingham.ac.uk> wrote:



Just forwarding this to the list for everyone's info: Fay can use it in the Xenith code. I'm not sure if I can integrate it into the engine, as this is (I guess) JavaScript and not the ActionScript RegExp engine (although the expressions should work in both...). I'll try...

-----Original Message-----
From: Smith, John [mailto:J.J.Smith at gcu.ac.uk]
Sent: 14 October 2012 17:26
To: julian.tenney at nottingham.ac.uk; Fay.Cross at nottingham.ac.uk; ronm at mitchellmedia.co.uk; reijnders at tor.nl
Subject: Xenith/XOT Glossary regexps
Importance: High

Hi guys,

Great to meet you all. I've been looking through the xenith code to see where I can contribute, and while looking through the archives I came across the regexp problem for the glossary. Since I've only joined the list proper today, I'm not sure whether a reply would go through to the correct place, so I'm sending this to you all to see if it helps...

Not sure whether this has been fixed yet, but it seems the problem is partly caused by \b requiring a word boundary and there being no word boundary on the very first word. Also, I seem to remember reading somewhere that \b can in some cases match next to international characters in the middle of words, which might not be the desired effect...
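
For reference, a quick way to check how the \b word boundary behaves in JavaScript (a sketch with made-up sample strings, not the original Xenith code, which is not shown in this thread). In these checks \b does match before the first word of a string, so if first words were being missed, one possible cause (an assumption) is a pattern built from a string literal with a single backslash: in a JavaScript string, "\b" is a backspace character, not a word boundary. The last check shows the international character issue, since without Unicode-aware flags \b only treats A-Z, a-z, 0-9 and _ as word characters.

```javascript
// \b in a regex literal matches at the start of the text, before the first word.
var firstWord = /\bcat\b/i.test('Cat sat on the mat');

// Inside a string literal, "\b" is a backspace (U+0008), so this pattern looks
// for backspace + cat + backspace and does not find the word.
var singleBackslash = new RegExp('\bcat\b', 'i').test('Cat sat');

// Doubling the backslash passes a real \b through to the RegExp.
var doubleBackslash = new RegExp('\\bcat\\b', 'i').test('Cat sat');

// \b sees 'é' as a non-word character, so it finds a boundary inside 'café'.
var midWord = /\bcaf\b/.test('café');

console.log(firstWord, singleBackslash, doubleBackslash, midWord);
```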

I have changed the regexp to this "\sTERM[^\s]*|^TERM[^\s]*" in the xenith.js code as so:

// function makes every glossary word found into a link
function insertGlossaryTag(node) {
       var temp = node.nodeValue;
       for (var k=0; k<glossary.length; k++) {
               // ** see recent emails on list about regular expression stuff **
               //var regExp = new RegExp(" " + glossary[k].word + " ", "ig");

               var regExp = new RegExp('\\s' + glossary[k].word + '[^\\s]*|^' + glossary[k].word + '[^\\s]*', 'gi');

               temp = temp.replace(regExp, ' <a class="glossary" href="#" title="' + glossary[k].definition + '">' + glossary[k].word + '</a> ');
       }
       node.nodeValue = temp;
}

and now it seems to match all the words, no matter where they are and irrespective of spaces. See attached screenshots - you can see there are no spaces before any words and only some have a space after. Probably needs further testing to go into xot though...

Will start adding to the list soon...

Regards,

John Smith | Learning Technologist
Room A251, Govan Mbeki Building | School of Health & Life Sciences | Glasgow Caledonian University Cowcaddens Road | Glasgow | G4 0BA

Glasgow Caledonian University is a registered Scottish charity, number SC021474

Winner: Times Higher Education’s Widening Participation Initiative of the Year 2009 and Herald Society’s Education Initiative of the Year 2009.
http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,6219,en.html

Winner: Times Higher Education’s Outstanding Support for Early Career Researchers of the Year 2010, GCU as a lead with Universities Scotland partners.
http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,15691,en.html

This message and any attachment are intended solely for the addressee and may contain confidential information. If you have received this message in error, please send it back to me, and immediately delete it.   Please do not use, copy or disclose the information contained in this message or in any attachment.  Any views or opinions expressed by the author of this email do not necessarily reflect the views of the University of Nottingham.

This message has been checked for viruses but the contents of an attachment may still contain software viruses which could damage your computer system:
you are advised to perform your own checks. Email communications with the University of Nottingham may be monitored as permitted by UK legislation.

<glossary.png>
<glossary2.png>
_______________________________________________
Xerte-dev mailing list
Xerte-dev at lists.nottingham.ac.uk
http://lists.nottingham.ac.uk/mailman/listinfo/xerte-dev


--

Tom Reijnders
TOR Informatica
Chopinlaan 27
5242HM Rosmalen
Tel: 073 5226191
Fax: 073 5226196





