<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
John,<br>
<br>
Yes, I think this regexp is usable in xot as well. I'll implement it
and test it.<br>
<br>
With regard to your second remark, my vote would be not to do this,
because the way plurals are formed is different from language to
language. <br>
<br>
Regards,<br>
<br>
Tom<br>
<br>
<div class="moz-cite-prefix">On 19-10-2012 9:37, Smith, John
wrote:<br>
</div>
<blockquote
cite="mid:EE0B2AFFDB88B34AA864E00CE98914C22479F5E19F@ITSEMBXCLUS.enterprise.gcal.ac.uk"
type="cite">
<pre wrap="">Hi Fay,
Greetings from the Czech Republic!! Yes, sorry, I got a bit carried away and didn't read the brief correctly or do sufficient testing!!
Anyway, I think I have cracked it and now have a regular expression that 'seems' to work as desired (in my testing anyway!!). It will need to be tested with real-life data that I don't have access to, so I'm putting it out there to see if anyone can break it. It uses the glossary term only once, with 3 capture groups: the before and after parts (which may be nothing, using the ^ and $) and the actual term (so that we maintain the case). It also handles some punctuation (this can easily be expanded upon). This means that it may even be suitable for xot as well.
// function makes every glossary word found into a link
function insertGlossaryTag(node) {
    var temp = node.nodeValue;
    for (var k = 0; k < glossary.length; k++) {
        var regExp = new RegExp('(^|\\s)(' + glossary[k].word + ')([\\s\\.,!?]|$)', 'gi');
        temp = temp.replace(regExp, '$1<a class="glossary" href="#" title="$2">$2</a>$3');
    }
    node.nodeValue = temp;
}
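For anyone wanting to try the expression outside Xenith, here is a minimal sketch that works on a plain string. The names linkGlossaryTerms and escapeRegExp are hypothetical and not part of xenith.js. The escaping step is an added assumption, for glossary words that contain regexp metacharacters such as '.' or '+'.

```javascript
// Hypothetical helper (not in xenith.js): escape regexp metacharacters
// so a glossary word such as "C++" is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Sketch of the three-capture-group replace on a plain string.
// $1 = leading whitespace (or start of text)
// $2 = the term exactly as written in the text, so case is kept
// $3 = trailing whitespace or punctuation (or end of text)
function linkGlossaryTerms(text, glossary) {
  var result = text;
  for (var k = 0; k < glossary.length; k++) {
    var regExp = new RegExp(
      '(^|\\s)(' + escapeRegExp(glossary[k].word) + ')([\\s\\.,!?]|$)', 'gi');
    result = result.replace(
      regExp, '$1<a class="glossary" href="#" title="$2">$2</a>$3');
  }
  return result;
}

console.log(linkGlossaryTerms('Cat. A category.', [{ word: 'cat' }]));
```

In this sketch "Cat." keeps its capital letter and its full stop, and "category" is left alone because the character after "cat" is not whitespace, punctuation or the end of the text.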
Again, this has only been tested with Xenith code. It's also thrown up other questions such as:
1. Should it match the first or second part of a hyphenated word? Not a real-life example, but cat-fish for instance.
2. How should it handle plurals, if at all, such as cats? 's' and 'es' etc. could be added to the punctuation group, so it would hyperlink cat with the letter s afterwards left unlinked. Would that be fine?
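As a reference point for both questions, here is a small sketch of what the expression does as written. The helper names glossaryRegExp and link are hypothetical, and square brackets stand in for the link markup to keep the output readable.

```javascript
// Same expression as in insertGlossaryTag, built for a single term.
function glossaryRegExp(word) {
  return new RegExp('(^|\\s)(' + word + ')([\\s\\.,!?]|$)', 'gi');
}

// Marks each match with [ ] instead of an <a> tag (illustration only).
function link(text, word) {
  return text.replace(glossaryRegExp(word), '$1[$2]$3');
}

console.log(link('cat-fish', 'cat'));      // "cat-fish"  ('-' is not in the trailing group)
console.log(link('cat-fish', 'fish'));     // "cat-fish"  ('-' is not whitespace or start)
console.log(link('two cats', 'cat'));      // "two cats"  (plural is not matched)
console.log(link('a cat, a Cat!', 'cat')); // "a [cat], a [Cat]!"
console.log(link('cat cat', 'cat'));       // "[cat] cat"
```

So as it stands neither half of a hyphenated word nor a plural is linked. The last line also shows one way to break it: the space after the first match is consumed as the trailing group, so a second occurrence straight after it has no leading whitespace left to match.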
Regards,
John Smith | Learning Technologist
Room A251, Govan Mbeki Building | School of Health & Life Sciences | Glasgow Caledonian University
Cowcaddens Road | Glasgow | G4 0BA
________________________________________
From: <a class="moz-txt-link-abbreviated" href="mailto:xerte-dev-bounces@lists.nottingham.ac.uk">xerte-dev-bounces@lists.nottingham.ac.uk</a> [<a class="moz-txt-link-abbreviated" href="mailto:xerte-dev-bounces@lists.nottingham.ac.uk">xerte-dev-bounces@lists.nottingham.ac.uk</a>] On Behalf Of Fay Cross [<a class="moz-txt-link-abbreviated" href="mailto:Fay.Cross@nottingham.ac.uk">Fay.Cross@nottingham.ac.uk</a>]
Sent: 17 October 2012 09:50
To: For Xerte technical developers
Subject: [Xerte-dev] Re: FW: Xenith/XOT Glossary regexps
Thanks for looking into this John. I'm just looking at the code you sent and you're right it doesn't take punctuation into account. It works when the word is at the beginning or end of a sentence, with or without a space immediately before or after but whenever there's punctuation next to it (cat. cat? etc.) it just gets replaced with cat and the punctuation is lost. Also, longer words that start with a word from the glossary get replaced with the shorter glossary word (e.g. category becomes cat).
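The behaviour described above can be reproduced with a short sketch of the earlier expression (the one quoted further down this thread). The function name linkWithEarlierRegExp is hypothetical, and square brackets stand in for the link markup.

```javascript
// Earlier expression: the match swallows everything up to the next
// whitespace, and the replacement writes back only the glossary word.
function linkWithEarlierRegExp(text, word) {
  var regExp = new RegExp('\\s' + word + '[^\\s]*|^' + word + '[^\\s]*', 'gi');
  return text.replace(regExp, ' [' + word + '] ');
}

console.log(linkWithEarlierRegExp('Is it a cat?', 'cat')); // "Is it a [cat] "
console.log(linkWithEarlierRegExp('A category', 'cat'));   // "A [cat] "
```

The '[^\\s]*' part is what eats the punctuation and the rest of a longer word.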
-----Original Message-----
From: <a class="moz-txt-link-abbreviated" href="mailto:xerte-dev-bounces@lists.nottingham.ac.uk">xerte-dev-bounces@lists.nottingham.ac.uk</a> [<a class="moz-txt-link-freetext" href="mailto:xerte-dev-bounces@lists.nottingham.ac.uk">mailto:xerte-dev-bounces@lists.nottingham.ac.uk</a>] On Behalf Of Smith, John
Sent: 15 October 2012 18:48
To: <a class="moz-txt-link-abbreviated" href="mailto:xerte-dev@lists.nottingham.ac.uk">xerte-dev@lists.nottingham.ac.uk</a>
Subject: [Xerte-dev] Re: FW: Xenith/XOT Glossary regexps
Thinking about it over dinner, though, I've not taken ending punctuation into consideration - beginner's mistake!!
Will look on my return unless it's solved beforehand...
Regards
John Smith
Learning Technologist
School of Health and Life Sciences
Sent from Samsung Galaxy SII
Pat Lockley <a class="moz-txt-link-rfc2396E" href="mailto:patrick.lockley@googlemail.com"><patrick.lockley@googlemail.com></a> wrote:
Don't think that regexp works if the word is the first thing in a sentence
On 15 Oct 2012, at 07:10, Julian Tenney <a class="moz-txt-link-rfc2396E" href="mailto:Julian.Tenney@nottingham.ac.uk"><Julian.Tenney@nottingham.ac.uk></a> wrote:
</pre>
<blockquote type="cite">
<pre wrap="">Just forwarding this to the list for everyone's info: Fay can use it in the Xenith code. I'm not sure if I can integrate it into the engine, as this is (I guess) JavaScript and not the ActionScript RegExp engine (although the expressions should work in both...). I'll try...
-----Original Message-----
From: Smith, John [<a class="moz-txt-link-freetext" href="mailto:J.J.Smith@gcu.ac.uk">mailto:J.J.Smith@gcu.ac.uk</a>]
Sent: 14 October 2012 17:26
To: <a class="moz-txt-link-abbreviated" href="mailto:julian.tenney@nottingham.ac.uk">julian.tenney@nottingham.ac.uk</a>; <a class="moz-txt-link-abbreviated" href="mailto:Fay.Cross@nottingham.ac.uk">Fay.Cross@nottingham.ac.uk</a>;
<a class="moz-txt-link-abbreviated" href="mailto:ronm@mitchellmedia.co.uk">ronm@mitchellmedia.co.uk</a>; <a class="moz-txt-link-abbreviated" href="mailto:reijnders@tor.nl">reijnders@tor.nl</a>
Subject: Xenith/XOT Glossary regexps
Importance: High
Hi guys,
Great to meet you all. I've been looking through the xenith code to see where I can contribute. I have also been looking through the archives and came across the regexp problem for the glossary. Since I've only joined the list proper today, I'm not sure whether a reply will go through to the correct place, so I'm sending this to you all to see if it helps...
Not sure whether this has been fixed yet, but it seems the problem is partly caused by \b requiring a word boundary and there being no word boundary on the very first word. Also, I seem to remember reading somewhere that \b can in some cases match next to international characters in the middle of words, which might not be the desired effect...
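A quick check of how \b behaves in JavaScript, as a sketch in plain JavaScript rather than anything taken from xenith.js. The sample strings are made up for illustration.

```javascript
// \b matches at the very start of the text when the first character
// is a word character.
var atStart = /\bcat\b/i.test('Cat sat');        // true

// JavaScript's \w covers only A-Z, a-z, 0-9 and _, so an accented
// letter counts as a non-word character and \b matches right before it.
var insideWord = /\bcaf\b/i.test('caf\u00e9');   // true ("caf" inside "café")

// A longer ASCII word is not matched, because there is no boundary
// between "cat" and "egory".
var longerWord = /\bcat\b/i.test('category');    // false

console.log(atStart, insideWord, longerWord);
```

So in JavaScript the first word does get a boundary, while the concern about international characters holds: a glossary word could be matched inside a longer accented word.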
I have changed the regexp to this "\sTERM[^\s]*|^TERM[^\s]*" in the xenith.js code as so:
// function makes every glossary word found into a link
function insertGlossaryTag(node) {
    var temp = node.nodeValue;
    for (var k = 0; k < glossary.length; k++) {
        // ** see recent emails on list about regular expression stuff **
        //var regExp = new RegExp(" " + glossary[k].word + " ", "ig");
        var regExp = new RegExp('\\s' + glossary[k].word + '[^\\s]*|^' + glossary[k].word + '[^\\s]*', 'gi');
        temp = temp.replace(regExp, ' <a class="glossary" href="#" title="' + glossary[k].definition + '">' + glossary[k].word + '</a> ');
    }
    node.nodeValue = temp;
}
and now it seems to match all the words, no matter where they are and irrespective of spaces. See attached screenshots - you can see there are no spaces before any words and only some have a space after. Probably needs further testing to go into xot though...
Will start adding to the list soon...
Regards,
John Smith | Learning Technologist
Room A251, Govan Mbeki Building | School of Health & Life Sciences |
Glasgow Caledonian University Cowcaddens Road | Glasgow | G4 0BA
Glasgow Caledonian University is a registered Scottish charity, number
SC021474
Winner: Times Higher Education’s Widening Participation Initiative of the Year 2009 and Herald Society’s Education Initiative of the Year 2009.
<a class="moz-txt-link-freetext" href="http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,6219,en.html">http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,6219,en.html</a>
Winner: Times Higher Education’s Outstanding Support for Early Career Researchers of the Year 2010, GCU as a lead with Universities Scotland partners.
<a class="moz-txt-link-freetext" href="http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,15691,en.html">http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,15691,en.html</a>
This message and any attachment are intended solely for the addressee and may contain confidential information. If you have received this message in error, please send it back to me, and immediately delete it. Please do not use, copy or disclose the information contained in this message or in any attachment. Any views or opinions expressed by the author of this email do not necessarily reflect the views of the University of Nottingham.
This message has been checked for viruses but the contents of an
attachment may still contain software viruses which could damage your computer system:
you are advised to perform your own checks. Email communications with
the University of Nottingham may be monitored as permitted by UK legislation.
<glossary.png>
<glossary2.png>
_______________________________________________
Xerte-dev mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Xerte-dev@lists.nottingham.ac.uk">Xerte-dev@lists.nottingham.ac.uk</a>
<a class="moz-txt-link-freetext" href="http://lists.nottingham.ac.uk/mailman/listinfo/xerte-dev">http://lists.nottingham.ac.uk/mailman/listinfo/xerte-dev</a>
</pre>
</blockquote>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Tom Reijnders
TOR Informatica
Chopinlaan 27
5242HM Rosmalen
Tel: 073 5226191
Fax: 073 5226196
</pre>
</body>
</html>