javascript - Insert HTML into user string and render it in React (and avoid XSS threat) - Stack Overflow


A user supplies a string to our React app, and it is displayed to other users. I want to search for certain characters and replace them with some HTML. For example, if I were to search for the word "special," I would turn it into:

My <span class="special-formatting">special</span> word in a user string

Previously I was performing this replacement and then inserting the result into the DOM with dangerouslySetInnerHTML. This of course is now giving me the issue of users being able to type and enter whatever HTML/Javascript they please right into the app and render it for everyone to see.

I tried escaping the HTML characters to their entities, but dangerouslySetInnerHTML appears to render the HTML entities proper and not as an actual string. (EDIT: see below, this was the actual solution)

Is there any way to convert their message to a pure string, still preserving the display of those special characters, but also insert my own HTML into the string? Trying to avoid running a script after each string is inserted to the DOM.

Here's some more info regarding the current flow. All examples are pretty optimized to only show the relevant code.

The user text is submitted to the database with this function:

handleSubmit(event) {
        event.preventDefault();

        var messageText = this.state.messageValue;

        //bold font is missing some common characters, fake way of making the normal font look bold
        if (this.state.bold == true) {
            messageText = messageText.replace(/\'/g, "<span class='bold-apostrophe'>'</span>");
            messageText = messageText.replace(/\"/g, "<span class='bold-quote'>&quot;</span>");
            messageText = messageText.replace(/\?/g, "<span class='bold-question'>?</span>");
            messageText = messageText.replace(/\*/g, "<span class='bold-asterisk'>*</span>");
            messageText = messageText.replace(/\+/g, "<span class='bold-plus'>+</span>");
            messageText = messageText.replace(/\./g, "<span class='bold-period'>.</span>");
            messageText = messageText.replace(/\,/g, "<span class='bold-comma'>,</span>");
        }

        Messages.insert({
            text: messageText,
            createdAt: new Date(),
            userId: user._id,
            bold: this.state.bold,
        });

    }

So the replacements work without issue; however, at this point the messageText string could still contain undesired, user-supplied HTML.

Then, our main app with the message list tries to render all the user messages:

render() {
    return (
        <div ref="messagesList">
            {this.renderMessages()}
        </div>
    );
}

renderMessages() {
    return [].concat(this.props.messages).reverse().map((message) => {
        return <Message
            key={message._id}
            message={message} />;
    });
}

In Message.jsx is where I'm doing the final touches to the message string (certain changes I don't want saved into the database of messages) and inserting it into an element to return:

export default class Message extends React.Component {
    render() {

        var processedMessageText = this.props.message.text;

        //another find and replace to insert images for :image_name: strings, similar to how Discord inputs its emoji
        processedMessageText = processedMessageText.replace(/:([\w]+):/g, function (text) {
            text = text.replace(/:/g, "");
            if (text.indexOf("_s") !== -1) {
                text = text.replace(/_s/g, "");
                text = "<img class='small-smiley' src='/smileys/small/" + text + ".png'>";
                return text;
            }
            else {
                text = "<img class='smiley' src='/smileys/" + text + ".png'>";
                return text;
            }
        });

        return (
            <div>
                <div className='username'>{this.props.message.username}: </div>
                <div className='text' dangerouslySetInnerHTML={{ __html: processedMessageText }}></div>
            </div>
        );
    }
}

So again, if the user includes malicious HTML in their input string, it will travel through all of this and get output to the message list, which is really bad. I'm hoping there's some way I can perform these desired HTML insertions on their string without rendering any HTML that they typed as actual HTML. I would also still like to show characters commonly used in HTML, like angle brackets (<>), so I want to avoid outright stripping common HTML characters from their input string.

Since the accepted answer doesn't have much detail, I'll post what I ended up doing here. I HTML encoded the characters suggested by OWASP before adding my own HTML and rendering it into an HTML element's content. I wanted to avoid using another library, so I just did this:

messageText = messageText.replace(/\&/g, "&amp;");
messageText = messageText.replace(/</g, "&lt;");
messageText = messageText.replace(/>/g, "&gt;");
messageText = messageText.replace(/\//g, "&#x2F;");
messageText = messageText.replace(/\'/g, "&#x27;");
messageText = messageText.replace(/\"/g, "&quot;");

After doing so I was no longer able to insert anything malicious, and tested using various test strings from OWASP without issue.
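
The six chained replacements above can also be collapsed into a single pass. The following is a minimal sketch rather than the exact code used in the app; the `escapeHtml` name and the `HTML_ESCAPES` table are made up for illustration. A single pass removes any dependence on replacement order (with chained calls, `&` has to be handled first, or the entities produced by the other steps would themselves be escaped).

```javascript
// Hypothetical lookup table: the same six characters the post escapes.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  "/": "&#x2F;",
  "'": "&#x27;",
  '"': "&quot;",
};

// Hypothetical helper: escapes all six characters in one pass.
function escapeHtml(text) {
  // Each matched character is looked up exactly once,
  // so an "&" that belongs to an entity is never escaped again.
  return String(text).replace(/[&<>\/'"]/g, (ch) => HTML_ESCAPES[ch]);
}

// The markup is displayed as text instead of being parsed as an element.
console.log(escapeHtml("<img src=x onerror=alert(1)>"));
// → &lt;img src=x onerror=alert(1)&gt;
```

The escaped string is only safe as element content, which is how it is used here; attribute values, URLs, and script contexts need their own encoding rules.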

asked Oct 21, 2019 at 3:21 by addMitt, edited Nov 1, 2019 at 19:06

Does the string that comes from the server have HTML markup in it? If not, why can't you just split the string into words and conditionally render the matched words? – Mon Villalon Commented Oct 21, 2019 at 5:22
I don't have any trouble conditionally rendering matched words. My problem is that the user could input "<span style='font-size: 999999px'>hello</span>" and since I am using dangerouslySetInnerHTML it will actually render that HTML. – addMitt Commented Oct 21, 2019 at 5:26
I meant that if you don't care to maintain any HTML tags that come from the server, you could just use plain old React: return response.split(" ").map((w) => w === 'special' ? <span className='special-formatting'>{w}</span> : w); This will be safe from XSS. – Mon Villalon Commented Oct 21, 2019 at 6:14
Hm, well, when we render the messages, it's a function that maps the messages and returns them as rather large components. The shortened form of those components would be: <div className='text' dangerouslySetInnerHTML={{ __html: processedMessageText }}></div> where processedMessageText is the user string that I pulled into its own variable and performed various replacements on. This again leaves me with a string that I can either render as HTML with potentially dangerous code in it, or as a pure string with no HTML. – addMitt Commented Oct 21, 2019 at 6:40
Added examples of our full workflow. I'm not sure where I'd implement something like that given the current structure. – addMitt Commented Oct 21, 2019 at 17:25


5 Answers


The problem began when you injected HTML into the user's input text before saving it to the database. That makes things harder, though not by much, because now you have to sanitize the stored string.

As a remedy, you can use dompurify or sanitize-html to remove any html but the html you've injected. Here's an example using dompurify:

import DOMPurify from "dompurify";

const dangerousString =
"<img onError='alert(\"h4ck3r\")' src='will throw error' /><span class='bold-apostrophe'>'</span>";

<div
  dangerouslySetInnerHTML={{
    __html: DOMPurify.sanitize(dangerousString, {
      ALLOWED_TAGS: ["span"],
      ALLOWED_ATTR: ["class"]
    })
  }}
/>

Keep in mind that sanitizer libraries need to be updated as frequently as possible, as attackers are constantly finding creative ways to bypass them.
This implies that you may still get XSS'ed. The only way to avoid it is to stop tampering with strings by adding HTML before you save them to the database, so that you can use a solution like the one presented by Ferrybig to add special formatting on the fly instead of using dangerouslySetInnerHTML.

Couldn't you just:

1. HTML-encode the tainted string from the user.
2. Do your search/replace and insert your HTML.
3. Then do the dangerouslySetInnerHTML().

That should safely escape whatever the user entered and leave your inserted HTML element alone, no?
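
The three steps can be sketched as follows. This is an illustration under assumptions, not code from the thread: `escapeHtml` and `formatMessage` are hypothetical names, and only two of the question's replacements are shown. One consequence of escaping first is that the later search/replace steps see entities rather than raw characters, so a rule that targets `'` has to look for `&#x27;` instead.

```javascript
// Step 1: HTML-encode the untrusted text (hypothetical helper).
function escapeHtml(text) {
  const map = {
    "&": "&amp;", "<": "&lt;", ">": "&gt;",
    "/": "&#x2F;", "'": "&#x27;", '"': "&quot;",
  };
  return String(text).replace(/[&<>\/'"]/g, (ch) => map[ch]);
}

// Step 2: insert trusted markup into the already-escaped string.
function formatMessage(rawText) {
  let html = escapeHtml(rawText);

  // The apostrophe is now the entity &#x27;, so match the entity.
  html = html.replace(/&#x27;/g, "<span class='bold-apostrophe'>&#x27;</span>");

  // :name: becomes a smiley image; \w+ limits names to letters, digits, "_".
  html = html.replace(/:(\w+):/g, (match, name) =>
    "<img class='smiley' src='/smileys/" + name + ".png'>");

  return html;
}

// Step 3 happens in the component and is not run here:
// <div className="text" dangerouslySetInnerHTML={{ __html: formatMessage(message.text) }} />
console.log(formatMessage("<b>it's</b> :smile:"));
```

Because every piece of markup in the result was added after escaping, the only tags the browser parses are the ones the application wrote itself.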

This would be my approach; I hope it's not coming too late.

import React from "react";
import ReactDOM from "react-dom";
import sanitizeHtml from "sanitize-html";

// This is the place where you need to do all the magic you want to do
let SpecialTextOutPut = ({ text }) => {
  const newText = text.replace("World", "<b>Transforming Elements</b>");
  return React.createElement("div", {
    dangerouslySetInnerHTML: { __html: `${newText}` }
  });
};

// You can sanitize and clean up the user input here
let UserTextInput = text =>
  React.createElement(SpecialTextOutPut, {
    text: sanitizeHtml(text)
  });

function App() {
  return <div>{UserTextInput("~Hello World <span>Poll</span>")}</div>;
}

const rootElement = document.getElementById("root");
ReactDOM.render(<App />, rootElement);

One other solution for this is to manually convert the search terms to JSX elements. Since a typical search doesn't use a regex, we can just use .indexOf to split the string (although supporting a regex isn't that hard, as a regex match also has an index).

function highlightText(input/*: string */, searchTerm/*: string*/)/*: ReactNode */ {
    // An empty search term would match at every index and never advance.
    if (!searchTerm) {
        return input;
    }
    let index = input.indexOf(searchTerm);
    let lastIndex = 0;
    let result/*: ReactNode[] */ = [];
    while (index >= 0) {
        // Plain text before the match; React escapes it when rendering.
        result.push(<span key={result.length}>{input.substring(lastIndex, index)}</span>);
        // The matched term itself, wrapped in <mark>.
        result.push(<mark key={result.length}>{input.substring(index, index + searchTerm.length)}</mark>);
        lastIndex = index + searchTerm.length;
        index = input.indexOf(searchTerm, lastIndex);
    }
    // Remaining text after the last match.
    result.push(<span key={result.length}>{input.substring(lastIndex, input.length)}</span>);
    return result;
}

You can then call this in the render part of your component like:

function MyComponent(props) {
    return <p>
        {highlightText(props.input, props.searchTerm)}
    </p>;
}
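
The regex variant mentioned in this answer can be sketched without building any HTML strings. The following hypothetical `tokenizeMessage` function (the name and the token shape are assumptions for illustration) splits a message into plain-text and smiley tokens using the match index. A component could then map text tokens to `{token.value}` and smiley tokens to `<img>` elements, leaving all escaping of user text to React.

```javascript
// Hypothetical tokenizer for the :name: smiley syntax from the question.
function tokenizeMessage(input) {
  const pattern = /:(\w+):/g;
  const tokens = [];
  let lastIndex = 0;
  let match;

  while ((match = pattern.exec(input)) !== null) {
    // Text between the previous match and this one stays plain text.
    if (match.index > lastIndex) {
      tokens.push({ type: "text", value: input.slice(lastIndex, match.index) });
    }
    // Only the captured name is kept; the colons are dropped.
    tokens.push({ type: "smiley", value: match[1] });
    lastIndex = match.index + match[0].length;
  }

  // Trailing text after the last match.
  if (lastIndex < input.length) {
    tokens.push({ type: "text", value: input.slice(lastIndex) });
  }
  return tokens;
}

// User-typed markup ends up inside a text token and is never parsed as HTML.
console.log(tokenizeMessage("<b>hi</b> :wave:"));
```

With this shape, the message stored in the database can remain the raw user string, and all formatting is applied at render time.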

That is tricky, to render HTML within a string, but not render the whole string as HTML...

I would take a different approach I think and do your replacement at the end if you can and that might make it simpler. Here's an example of how you could get the whole string in the DOM with textContent, and then only render the parts of it you want with innerHTML.

var ele = document.getElementById('message');

// User entered string will not be rendered as HTML
ele.textContent = '<div onclick="maliciousCode()">*</div>'; 

// Do replacement using innerHTML to render only some parts
ele.innerHTML = ele.innerHTML.replace(/\*/g, '<span class="bold">*</span>')

.bold { font-weight: 700 }

<div id="message"></div>
