I have an HTML textarea element. I want to prevent a user from entering any HTML tags in this area. How do I detect whether a user has entered any HTML in a textarea with JavaScript?

Thank you

asked Apr 13, 2010 at 17:34 by user208662; edited Apr 13, 2010 at 17:46 by BalusC
  • 1 You don't want to use Javascript for this. Just disabling Javascript in webbrowser already sets the XSS world wide open. – BalusC Commented Apr 13, 2010 at 17:59
  • 5 I urge you to return and read the other answers here, the solution you chose is terrible. – Andy E Commented Apr 13, 2010 at 18:01
  • 1 Warning to readers: Don't do this. Preventing users from entering or submitting HTML is never a sane solution to a security problem. You must either filter the data server side when receiving it or (much better) handle the problem when using the data (rendering, or prepare for rendering). – Denys Séguret Commented Feb 27, 2017 at 8:51

5 Answers

One way is to let the keypress event return false when the pressed key matches < or >. To distinguish real HTML tags from innocent "less than" and "greater than" signs, you may need to add some regex. And since you can't parse HTML reliably with regex... There is, however, a jQuery way:

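// Note: .html() parses the input as markup, so inline handlers (e.g. <img onerror=...>) can still run
var sanitized = $('<div>').html(textareavalue).text();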

However, the normal practice is to let the client enter whatever it wants and to escape the HTML at display time, using the server-side view technology in question. How to do it depends on the view technology you're using. In PHP, for example, you can use htmlspecialchars(), and in JSP/JSTL, fn:escapeXml(). This is more robust, since JavaScript can be disabled, hacked, or spoofed by the client.
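
As a rough illustration of escaping at display time, here is a minimal JavaScript sketch of the kind of replacement such functions perform. The name escapeHtml is hypothetical, not a built-in, and the view technology's own escaping function is normally the safer choice.

```javascript
// Hypothetical helper: replaces the five characters that are special in HTML
// with entities, so that user input renders as literal text.
function escapeHtml(text) {
    var entities = {
        '&': '&amp;',
        '<': '&lt;',
        '>': '&gt;',
        '"': '&quot;',
        "'": '&#39;'
    };
    return String(text).replace(/[&<>"']/g, function (ch) {
        return entities[ch];
    });
}

console.log(escapeHtml('<b>5 > 4 & "ok"</b>'));
// prints: &lt;b&gt;5 &gt; 4 &amp; &quot;ok&quot;&lt;/b&gt;
```

The user's text is stored unchanged and only its rendering is escaped, so a harmless "<" or ">" survives intact.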

Initial considerations:

  • XML != HTML, so I will assume that HTML tags are not allowed, but XML tags are.
  • All HTML tags should be deleted, not just escaped (escaping HTML is much easier).
  • We don't want the user to lose the caret position while typing (that is very annoying).

First, define a function that replaces HTML tags with '':

/**
* This function deletes HTML tags from a text, even if a tag is
* not well formed.
* It also updates the caret position so that it is preserved after the replacement.
* @param {string} text The text to modify
* @param {int} initPos The current position of the caret in the text
* @return {Object} The cleaned text and the new caret position: {text, pos}
*/
function removeHtmlTags( text, initPos )
{
    // Define the regex to delete html tags
    if (undefined===removeHtmlTags.htmlTagRegexp)
    {
        removeHtmlTags.htmlTagRegexp = new RegExp('</?(?:article|aside|bdi|command|'+
            'details|dialog|summary|figure|figcaption|footer|header|hgroup|mark|'+
            'meter|nav|progress|ruby|rt|rp|section|time|wbr|audio|'+
            'video|source|embed|track|canvas|datalist|keygen|output|'+
            '!--|!DOCTYPE|a|abbr|address|area|b|base|bdo|blockquote|body|'+
            'br|button|canvas|caption|cite|code|col|colgroup|dd|del|dfn|div|'+
            'dl|dt|em|embed|fieldset|figcaption|figure|footer|form|h1|h2|h3|h4|'+
            'h5|h6|head|hr|html|i|iframe|img|input|ins|kbd|keygen|label|legend|'+
            'li|link|map|menu|meta|noscript|object|ol|optgroup|option|p|param|'+
            'pre|q|s|samp|script|select|small|source|span|strong|style|sub|'+
            'sup|table|tbody|td|textarea|tfoot|th|thead|title|tr|u|ul|var|'+
            'acronym|applet|basefont|big|center|dir|font|frame|'+
            'frameset|noframes|strike|tt)(?:(?: [^<>]*)>|>?)', 'i');
    }

    // Delete html tags
    var thereIsMore=true;
    removeHtmlTags.htmlTagRegexp.lastIndex=0;
    // While I am not sure that all html tags are removed.
    while (thereIsMore)
    {
        var str = text.match(removeHtmlTags.htmlTagRegexp);
        if ( str!=null) // There is a match
        {
            text = text.replace(str[0], '');
            // Update the position
            if (str.index < initPos) 
                initPos= Math.max(initPos-str[0].length,str.index);
        }
        else thereIsMore = false;
    }

    // If caretPosition fails, initPos may be negative
    if (initPos<0) initPos=0;

    return {text: text, pos: initPos};
}

Notes: I decided on the following replacements, for example:

'<div>' -> ''
'<div selected' -> ' selected'
'<div selected>' -> ''
'<div    >' -> ''
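
The caret adjustment can be seen in isolation with a much smaller sketch. The function stripTagsKeepCaret and its short tag list are hypothetical simplifications: unlike the function above, it only removes complete tags.

```javascript
// Hypothetical, simplified variant: a handful of tag names, complete tags only.
var TAG_PATTERN = /<\/?(?:div|span|b|i|p)(?: [^<>]*)?>/i;

function stripTagsKeepCaret(text, caret) {
    var match;
    // The pattern is not global, so exec always returns the first remaining tag
    while ((match = TAG_PATTERN.exec(text)) !== null) {
        // Cut the matched tag out of the text
        text = text.slice(0, match.index) + text.slice(match.index + match[0].length);
        // A tag removed before the caret shifts the caret left,
        // but never further than the position where the tag started
        if (match.index < caret) {
            caret = Math.max(caret - match[0].length, match.index);
        }
    }
    return { text: text, pos: caret };
}

var result = stripTagsKeepCaret('ab<div>cd', 8);
console.log(result.text, result.pos); // prints: abcd 3
```

The caret sat between "c" and "d" before the call and still does afterwards. A caret placed inside the removed tag would land where the tag began.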

Second, we need a function to get/set the caret position, because the position is reset when the textarea content is updated. Furthermore, the position may change if a tag is deleted before the caret.

/**
 * This function gets/sets the position of the caret in a node.
 * If a value is given, this function tries to set the new position.
 * Either way, it returns the (new) position.
 * @param {Element} node The textarea element
 * @param {int} value The new caret position
 * @return {int} The (new) caret position
 */
function caretPosition(node, value) 
{
    // Set the default caret position; it is returned if this function fails.
    var caretPos = 0;

    // Ensure that value is valid
    value = parseInt(value, 10);

    // Set the new caret position if necessary
    if (!isNaN(value)) // We want to set the position
    {
        // Test the property type: a caret at position 0 is falsy
        if (typeof node.selectionStart === 'number')
        {
            node.selectionStart=value;
            node.selectionEnd= value;
        }
        else if(node.setSelectionRange)
        {
            node.focus();
            node.setSelectionRange(value, value);
        }
        else if (node.createTextRange)
        {
            var range = node.createTextRange();
            range.collapse(true);
            range.moveEnd('character', value);
            range.moveStart('character', value);
            range.select();
        }
    }

    // Get the position to return it (a caret at position 0 is falsy, so test the type).
    if (typeof node.selectionStart === 'number') return node.selectionStart;
    else if (document.selection)
    {
        node.focus();
        var sel = document.selection.createRange();
        sel.moveStart('character', -node.value.length);
        caretPos = sel.text.length;
    }

    return caretPos;
}

Third, create a main function that removes HTML tags from the textarea and sets the caret position.

/**
 * This event function removes HTML tags from the textarea with id=text
 */
function updateText()
{
    // Get the textarea
    var t = document.getElementById('text');

    // Get the caret position
    var pos = caretPosition(t);

    // Remove html from the text
    var result = removeHtmlTags(t.value, pos);
    t.value = result.text;

    // Set the new caret position
    caretPosition(t, result.pos);
}

Finally, add event listeners to update the textarea on modification:

  • Key press
  • Paste
  • Drop

We should be able to use "oninput" for all three events, but (of course) it fails in IE.

HTML:

<html>
    <head>
        <script type="text/javascript">
           // Copy all the JS code here.
        </script>
    </head>
    <body>
        <textarea cols="50" rows="10" oninput="updateText();" 
            ondrop="setTimeout(updateText, 0);" 
            onpaste="setTimeout(updateText, 0);" 
            onkeyup="updateText();" id='text'></textarea>
    </body>
</html>
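
The same wiring can also be done from script instead of inline attributes. This is only a sketch: watchTextarea is a hypothetical helper name, and it assumes a browser that supports addEventListener (old IE versions do not).

```javascript
// Hypothetical helper: registers `handler` for every event that can change
// the textarea content. `node` is any object that offers addEventListener.
function watchTextarea(node, handler) {
    // Paste and drop are deferred with a zero timeout, as in the markup above,
    // so the handler runs once the new content is in the textarea
    function deferred() {
        setTimeout(handler, 0);
    }
    node.addEventListener('input', handler);
    node.addEventListener('keyup', handler);
    node.addEventListener('paste', deferred);
    node.addEventListener('drop', deferred);
}

// In a page: watchTextarea(document.getElementById('text'), updateText);
```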

I hope it helps you :-) Escain

You can use a regular expression, like

if ( /<\/?[a-z][^>]*>/i.test(textArea.value) ) {
  // do something about it
}

where "textArea" is a reference to your textarea element (for example, obtained with document.getElementById()).

What do you consider an HTML tag? Is <b> a tag? What about the middle characters in "I <3 how 5 is > 4"?
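
To make the ambiguity concrete, here is a small sketch with a tag-shaped pattern. The function looksLikeTag is hypothetical, and the pattern is one possible definition of "tag", not a reliable HTML parser.

```javascript
// Hypothetical helper: true when the text contains something shaped like a tag,
// i.e. "<", an optional "/", a letter, then anything up to the next ">".
function looksLikeTag(text) {
    return /<\/?[a-z][^>]*>/i.test(text);
}

console.log(looksLikeTag('<b>bold</b>'));       // true
console.log(looksLikeTag('I <3 how 5 is > 4')); // false: "<" is followed by a digit
console.log(looksLikeTag('if x <y and z> w'));  // true, although no tag was intended
```

The last case shows that a reasonable pattern can still flag text that was never meant as HTML.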

I think you should not limit users with your strictness. Don't be a Steve Jobs.

firstly, bear in mind that you'll need to re-validate on the server side, since anyone can fake an HTTP POST, and if they have JavaScript disabled then of course you have no control :)

what i'd do is

<textarea oninput="disableHtml(this);" name="text"></textarea>

and for the javascript

function disableHtml(element) {
  element.value = element.value.replace(/[<>]/g, '');
}

another way to do this would be to replace < and > with &lt; and &gt; on the server side, which is the better way because it's secure and people can still frown >:)

[edit : you can make the regexp as clever as you like, if you want to only detect certain tags for instance]
