I'm looking for a regex/replace function that takes a user-entered string, say "John Smith's Cool Page", and returns a filename/URL-safe string like "john_smith_s_cool_page.html", or something to that effect.

Asked Dec 13, 2011 at 6:05 by ndmweb; edited Feb 5, 2018 at 21:59 by A-Sharabiani
  • Define "filename/url safe string". Browsers will do URL encoding of strings in addresses, and modern computers have very few restrictions on file name characters. – RobG, Dec 13, 2011 at 6:09
  • I'd use something like " aAbc1290!@#$%^&*()-=_+;:[]{}'\"|,./<>? ".replace(/[\\\/:\*\?"<>\|]/g, "").trim() + ".html" – loxaxs, Apr 6, 2019 at 14:04

5 Answers

Well, here's one that replaces anything that's not a letter or a number, and makes it all lower case, like your example.

var s = "John Smith's Cool Page";
var filename = s.replace(/[^a-z0-9]/gi, '_').toLowerCase();

Explanation:

The regular expression is /[^a-z0-9]/gi. Well, actually the gi at the end is just a set of options that are used when the expression is used.

  • i means "ignore upper/lower case differences"
  • g means "global", which really means that every match should be replaced, not just the first one.
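To see what each flag contributes, here is a small sketch that runs the same pattern with and without them. The variable names are only for illustration.

```javascript
const s = "John Smith's Cool Page";

// No flags: only the first match is replaced, and uppercase letters
// count as "not a-z", so the leading "J" is the first match.
const noFlags = s.replace(/[^a-z0-9]/, '_');   // "_ohn Smith's Cool Page"

// i only: case is ignored, so the first match is the first space.
const iOnly = s.replace(/[^a-z0-9]/i, '_');    // "John_Smith's Cool Page"

// g only: every match is replaced, uppercase letters included.
const gOnly = s.replace(/[^a-z0-9]/g, '_');    // "_ohn__mith_s__ool__age"

// g and i together: every non-alphanumeric character is replaced.
const both = s.replace(/[^a-z0-9]/gi, '_');    // "John_Smith_s_Cool_Page"

console.log(noFlags, iOnly, gOnly, both);
```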

So what we're looking at is really just [^a-z0-9]. Let's read it step-by-step:

  • The [ and ] define a "character class", which is a list of single characters. If you'd write [one], then that would match either 'o' or 'n' or 'e'.
  • However, there's a ^ at the start of the list of characters. That means it should match only characters not in the list.
  • Finally, the list of characters is a-z0-9. Read this as "a through z and 0 through 9". It's a short way of writing abcdefghijklmnopqrstuvwxyz0123456789.

So basically, what the regular expression says is: "Find every character that is not between 'a' and 'z' or between '0' and '9'".
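As a quick check of the explanation above, here is the same expression wrapped in a function, plus one possible variant that collapses runs of replaced characters. Both function names are made up for this sketch, and the variant is an optional extra rather than part of the answer.

```javascript
// Hypothetical helper names; the regex is the one explained above.
function toSafeName(s) {
  // Each character outside a-z / 0-9 (case-insensitive) becomes "_".
  return s.replace(/[^a-z0-9]/gi, '_').toLowerCase();
}

function toCompactSafeName(s) {
  return s
    .replace(/[^a-z0-9]+/gi, '_')   // "+" turns a run of bad characters into one "_"
    .replace(/^_+|_+$/g, '')        // drop underscores left at either end
    .toLowerCase();
}

console.log(toSafeName("John Smith's Cool Page") + '.html'); // john_smith_s_cool_page.html
console.log(toSafeName('Hello,  World!'));                   // hello___world_
console.log(toCompactSafeName('Hello,  World!'));            // hello_world
```

The compact variant is worth considering when input may contain punctuation followed by spaces, since the plain version emits one underscore per replaced character.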

I know the original poster asked for a simple regular expression. However, there is more involved in sanitizing filenames, including filename length, reserved filenames, and, of course, reserved characters.

Take a look at the code in node-sanitize-filename for a more robust solution.
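To give a feel for the three concerns listed above (length, reserved names, reserved characters), here is a rough sketch. It is not the node-sanitize-filename code: the function name, the constants, and the 255-character cap are assumptions made for illustration, and the cap counts UTF-16 code units rather than bytes.

```javascript
// Hypothetical helper, written for illustration only.
const ILLEGAL_CHARS = /[\/\\?<>:*|"]/g;          // reserved on Windows and/or Unix
const CONTROL_CHARS = /[\x00-\x1f\x80-\x9f]/g;   // non-printable control codes
const RESERVED_DOTS = /^\.+$/;                   // ".", ".." and similar
const WINDOWS_RESERVED = /^(con|prn|aux|nul|com[1-9]|lpt[1-9])(\..*)?$/i;
const WINDOWS_TRAILING = /[. ]+$/;               // Windows rejects trailing dots/spaces

function sanitizeFilename(input, replacement = '_', maxLength = 255) {
  const cleaned = String(input)
    .replace(ILLEGAL_CHARS, replacement)
    .replace(CONTROL_CHARS, replacement)
    .replace(RESERVED_DOTS, replacement)
    .replace(WINDOWS_RESERVED, replacement)
    .replace(WINDOWS_TRAILING, replacement);
  // Assumption: a simple character cap; real limits are usually in bytes.
  return cleaned.slice(0, maxLength);
}

console.log(sanitizeFilename('a/b:c?.txt')); // a_b_c_.txt
console.log(sanitizeFilename('CON'));        // _
console.log(sanitizeFilename('report. '));   // report_
```

Unlike the single-regex approach, this keeps spaces, apostrophes, and non-ASCII letters, so it suits filenames better than URLs.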

For more flexible and robust handling of Unicode characters etc., you could use the slugify package in conjunction with a regex to remove unsafe URL characters:

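// The characters must sit inside a character class [...]; otherwise the regex only matches that exact sequence.
const urlSafeFilename = slugify(filename, { remove: /["<>#%{}|\\^~\[\]`;?:@=&]/g });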

This produces nice kebab-case filenames in your URL and allows for more characters outside the a-z0-9 range.
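If adding a dependency is not an option, a similar kebab-case result for accented Latin text can be approximated with the built-in String.prototype.normalize. This is a different technique from the slugify package, the function name is invented for this sketch, and it only handles characters that decompose into a base letter plus combining marks.

```javascript
// Hypothetical dependency-free helper; not the slugify package.
function slugifyAscii(input) {
  return input
    .normalize('NFD')                   // split "é" into "e" + combining accent
    .replace(/[\u0300-\u036f]/g, '')    // drop the combining diacritical marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')        // any run of other characters becomes one dash
    .replace(/^-+|-+$/g, '');           // trim dashes left at either end
}

console.log(slugifyAscii('Cr\u00e8me Br\u00fbl\u00e9e Recipe')); // creme-brulee-recipe
console.log(slugifyAscii("John Smith's Cool Page"));             // john-smith-s-cool-page
```

Scripts without such decompositions (for example Cyrillic or CJK) are simply removed by this sketch, which is where a transliterating library earns its keep.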

Here's what I did. It works to convert full sentences into a decently clean URL.

First it trims the string, then it converts spaces to dashes (-), lowercases the result, and finally gets rid of anything that's not a letter, number, or dash.

function slugify(title) {
  return title
    .trim()
    .replace(/ +/g, '-')
    .toLowerCase()
    .replace(/[^a-z0-9-]/g, '')
}

slug.value = slugify(text.value);
text.oninput = () => { slug.value = slugify(text.value); };
<input id="text" value="Foo: the old @Foobîdoo!!  " style="font-size:1.2em">

<input id="slug" readonly style="font-size:1.2em">
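To make the order of operations concrete, the sketch below reruns the same four steps on the demo input and keeps each intermediate value. The function name slugifySteps is made up for this trace, and the î is written as an escape so the result does not depend on file encoding.

```javascript
// Same steps as slugify above, but returning every intermediate string.
function slugifySteps(title) {
  const trimmed = title.trim();
  const dashed = trimmed.replace(/ +/g, '-');        // runs of spaces -> one dash
  const lowered = dashed.toLowerCase();
  const slug = lowered.replace(/[^a-z0-9-]/g, '');   // drop everything else
  return { trimmed, dashed, lowered, slug };
}

const steps = slugifySteps('Foo: the old @Foob\u00eedoo!!  ');
console.log(steps.trimmed); // Foo: the old @Foobîdoo!!
console.log(steps.dashed);  // Foo:-the-old-@Foobîdoo!!
console.log(steps.lowered); // foo:-the-old-@foobîdoo!!
console.log(steps.slug);    // foo-the-old-foobdoo
```

Note that the accented î is dropped rather than converted to i, and that tabs or newlines are removed instead of becoming dashes, because only literal spaces are matched in the second step.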

I think your requirement is to replace whitespace and apostrophes with _ and append .html at the end. Try writing a regex that does that.

See:

http://www.regular-expressions.info/javascriptexample.html
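One possible reading of that requirement, as a minimal sketch: the function name is hypothetical, and lowercasing is added only to match the example output in the question.

```javascript
// Hypothetical helper: whitespace and apostrophes become "_", then ".html" is appended.
function toHtmlFilename(title) {
  return title
    .trim()
    .toLowerCase()
    .replace(/[\s']+/g, '_')   // \s covers spaces, tabs, and newlines
    + '.html';
}

console.log(toHtmlFilename("John Smith's Cool Page")); // john_smith_s_cool_page.html
console.log(toHtmlFilename('Q&A'));                    // q&a.html
```

As the second call shows, this narrow version leaves every other character untouched, so it is not enough on its own for arbitrary user input.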
