I am trying to add a span tag around each Hebrew and English phrase in a paragraph. E.g. "so היי all whats up אתכם?" should become:
[span]so[/span][span]היי[/span][span]all whats up[/span][span]אתכם[/span]
I have been trying with a regexp, but it just removes the Hebrew words and joins the English words in one span.
var str = 'so היי all whats up אתכם?'
var match= str.match(/(\b[a-z]+\b)/ig);
var replace = match.join().replace(match.join(),'<span>'+match.join()+'</span>')
asked Jul 3, 2015 at 8:36 by roude
edited Jul 3, 2015 at 15:32 by Wiktor Stribiżew
- Your regex seems wrong: it doesn't contain any Hebrew matches, just [a-z]+, which of course only matches English. – Ishay Peled Commented Jul 3, 2015 at 8:39
- So how do I do it right? – roude Commented Jul 3, 2015 at 8:40
- You can try adding the Hebrew range [\u0590-\u05FF] to your regex; this is the Hebrew block in Unicode, which includes א-ת. – Ishay Peled Commented Jul 3, 2015 at 8:43
- (match.join(),</span>'<span>'+match.join()+'</span>''<span>) // maybe something like this, with a check that it isn't the last or first word in a sentence in order to apply the tags right? Just an idea. Not very pretty. – NachoDawg Commented Jul 3, 2015 at 8:43
- This question has been raised on Meta. – halfer Commented Jul 3, 2015 at 9:36
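To see why the attempt in the question ends up with a single span and no Hebrew, it may help to look at each intermediate value. This is only a minimal sketch using the question's own string; the variable names englishWords, joined, wrapped and hebrewWords are illustrative and not part of the original code.

```javascript
var str = 'so היי all whats up אתכם?';

// [a-z] only covers Latin letters, so the Hebrew words never reach the result.
var englishWords = str.match(/(\b[a-z]+\b)/ig);
// englishWords is ['so', 'all', 'whats', 'up']

// join() with no argument glues the items together with commas.
var joined = englishWords.join();
// joined is 'so,all,whats,up'

// Replacing the joined string inside itself wraps everything exactly once.
var wrapped = joined.replace(joined, '<span>' + joined + '</span>');
// wrapped is '<span>so,all,whats,up</span>'

// The Hebrew range suggested in the comments does pick up the remaining words.
var hebrewWords = str.match(/[\u0590-\u05FF]+/g);
// hebrewWords is ['היי', 'אתכם']
```

So the Hebrew is lost at the very first step, and the single span comes from wrapping the joined string rather than each match.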
3 Answers
Previous answers here did not account for the whole-word requirement. It is indeed difficult to achieve, because the \b word boundary in JavaScript does not recognize boundaries next to Hebrew characters, which we can only match with a character class using \u notation.
I suggest using a look-ahead and capturing groups to make sure we capture the whole Hebrew word: (^|[^\u0590-\u05FF])([\u0590-\u05FF]+)(?![\u0590-\u05FF]) makes sure there is a non-Hebrew symbol or the start of the string before a Hebrew word (add a \s if there are spaces between the Hebrew words!), and \b[a-z\s]+\b matches a sequence of whole English words separated by spaces.
If you plan to insert the <span> tags into a sentence around whole words, here is a snippet that may help:
var str = 'so היי all whats up אתכם?';
//var str = 'so, היי, all whats up אתכם?';
var result = str.replace(/\s*(\b[a-z\s]+\b)\s*/ig, '<span>$1</span>');
result = result.replace(/(^|[^\u0590-\u05FF])([\u0590-\u05FF]+)(?![\u0590-\u05FF])/g, '$1<span>$2</span>');
document.getElementById("r").innerHTML = result;
span {
background:#FFCCCC;
border:1px solid #0000FF;
}
<div width="645" id="r"/>
Result:
<span>so</span><span>היי</span><span>all whats up</span><span>אתכם</span>?
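The word-boundary limitation mentioned above can be checked directly. The following is a small sketch, not part of the original answer; sample, withWordBoundary and hebrewWord are illustrative names. In JavaScript, \b marks a transition between a \w character ([A-Za-z0-9_]) and anything else, so a Hebrew letter next to a space is never a boundary.

```javascript
var sample = 'so היי all';

// Both the space and the Hebrew letter count as non-word characters,
// so \b cannot match between them and the test fails.
var withWordBoundary = /\bהיי\b/.test(sample);
// withWordBoundary is false

// The pattern from the answer instead requires a non-Hebrew character
// (or the start of the string) before the word and no Hebrew character after it.
var hebrewWord = /(^|[^\u0590-\u05FF])([\u0590-\u05FF]+)(?![\u0590-\u05FF])/;
var m = hebrewWord.exec(sample);
// m[1] is ' ' (the character before the word), m[2] is 'היי'
```

Group 1 has to be put back with $1 in the replacement, which is why the snippet above uses '$1<span>$2</span>'.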
If you do not need any punctuation or alphanumeric entities in your output, just the concatenated whole English and Hebrew words, then use:
var str = 'היי, User234, so 222היי all whats up אתכם?';
// Groups 1-2: a whole Hebrew word; group 3: a run of whole English words.
var re = /(^|[^\u0590-\u05FF])([\u0590-\u05FF]+)(?![\u0590-\u05FF])|(\b[a-z\s]+\b)/ig;
var res = [];
var m; // declared explicitly so it does not become an implicit global
while ((m = re.exec(str)) !== null) {
  // Guard against an endless loop on zero-length matches.
  if (m.index === re.lastIndex) {
    re.lastIndex++;
  }
  if (m[1] !== undefined) {
    // Group 1 took part in the match, so this is a Hebrew word.
    res.push('<span>' + m[2].trim() + '</span>');
  } else {
    res.push('<span>' + m[3].trim() + '</span>');
  }
}
document.getElementById("r").innerHTML = res.join("");
span {
background:#FFCCCC;
border:1px solid #0000FF;
}
<div width="645" id="r"/>
Result:
<span>היי</span><span>so</span><span>היי</span><span>all whats up</span><span>אתכם</span>
I think the regex you want is something like [^a-z\u0591-\u05F4\s] (a caret only negates a character class when it comes first; repeating it before each range adds a literal ^ to the class). I'm not entirely sure how you want to handle spaces.
My solution:
1. Copy str to a new variable, finalStr, removing any characters that aren't a-z, Hebrew, or whitespace.
2. Loop over the runs of English (a-z) characters in str and wrap each one in a span, using finalStr.replace.
3. Do the same again for the Hebrew characters.
It's not quite 100%, but seems to work well enough IMO.
var str = 'so היי all whats up אתכם?';
// Strip everything that is not a Latin letter, a Hebrew character, or whitespace.
var finalStr = str.replace(/[^a-z\u0591-\u05F4\s]/gi, '');
// Wrap every run matched by rgx in a span, skipping whitespace-only runs.
function wrapRuns(rgx) {
  var mat = str.match(rgx) || [];
  for (var i = 0; i < mat.length; ++i) {
    var match = mat[i].trim();
    if (match) {
      finalStr = finalStr.replace(match, '<span>' + match + '</span>');
    }
  }
}
wrapRuns(/[a-z ]+/gi);          // English runs
wrapRuns(/[\u0591-\u05F4 ]+/g); // Hebrew runs
document.getElementById('res').innerHTML = finalStr;
http://jsfiddle/daveSalomon/0ns6nuxy/1/
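One caveat with replacing match by match: replace with a string argument changes the first occurrence in finalStr, and that occurrence may sit inside a <span> tag added by an earlier iteration (an English word such as "span" would hit it). A single pass over the original string sidesteps this. The sketch below is an alternative under that assumption, not the answerer's code; wrapScriptRuns is a hypothetical helper name, and unlike the steps above it keeps the spaces and punctuation between runs instead of stripping them.

```javascript
// Wrap each run of Hebrew words and each run of English words in a span.
// Words inside a run may be separated by whitespace; everything else is left as is.
function wrapScriptRuns(text) {
  var re = /[\u0590-\u05FF]+(?:\s+[\u0590-\u05FF]+)*|[a-z]+(?:\s+[a-z]+)*/gi;
  // $& stands for the whole matched run.
  return text.replace(re, '<span>$&</span>');
}

var wrappedOnce = wrapScriptRuns('so היי all whats up אתכם?');
// wrappedOnce is
// '<span>so</span> <span>היי</span> <span>all whats up</span> <span>אתכם</span>?'
```

Because each run is matched and replaced exactly once, text that was already wrapped is never searched again.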
Judging by this post, you can try something like this: ((?:\s*\w+)+|(?:\s*[\u0590-\u05FF]+)+?(?=\s?[A-Za-z0-9!?.]))
https://regex101./r/kA3yV5/4
You may need to edit it for your particular cases (for example, if some non-word characters start to appear), but it does the trick. It first tries to match words and build a phrase from the English character list; if that doesn't work, it tries to build words or phrases from the Hebrew character list, until an English character is spotted again.
It's not perfect yet: you may want to add other punctuation characters, and some matches start with a space you don't want. Because JavaScript did not support lookbehinds when this was written, I didn't figure out a good way to drop those spaces in the pattern itself, but they only appear at the first position of a match, so they can be removed from the string afterwards.
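A minimal sketch of how that pattern could be applied, assuming the goal is the span output from the question: match globally, trim the leading space from each match, and wrap it. The names parts and html are illustrative only.

```javascript
var str = 'so היי all whats up אתכם?';
var re = /((?:\s*\w+)+|(?:\s*[\u0590-\u05FF]+)+?(?=\s?[A-Za-z0-9!?.]))/g;

// With the g flag, match() returns every whole match (or null if there is none).
var parts = str.match(re) || [];
// parts is ['so', ' היי', ' all whats up', ' אתכם']

// trim() removes the unwanted leading spaces before wrapping.
var html = parts.map(function (part) {
  return '<span>' + part.trim() + '</span>';
}).join('');
// html is '<span>so</span><span>היי</span><span>all whats up</span><span>אתכם</span>'
```

Note that \w also accepts digits and underscores, so tokens such as User234 would be treated as English words by this pattern.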