I'm receiving a list of files in an object and I just need to display each file's name and type in a table. All files come back from the server in this format: timestamp_id_filename.

Example: 1568223848_12345678_some_document.pdf

I wrote a helper function which cuts the string.

At first I did it with String.prototype.split() and a regex, but there was a problem: files can have underscores in their names, so that didn't work and I needed something else. I couldn't come up with a better idea. I think it looks really dumb and it's been haunting me the whole day.

The function looks like this:

const shortenString = (attachmentName) => {
    const file = attachmentName
        .slice(attachmentName.indexOf('_') + 1)
        .slice(attachmentName.slice(attachmentName.indexOf('_') + 1).indexOf('_') + 1);

    const fileName = file.slice(0, file.lastIndexOf('.'));
    const fileType = file.slice(file.lastIndexOf('.'));

    return [fileName, fileType];
};

I wonder if there is a more elegant way to solve the problem without using loops.

Asked Sep 11, 2019 at 17:55 by Bart (edited Sep 11, 2019 at 17:57)

8 Answers

You can use replace and split. The pattern removes everything up to and including the second _ from the start of the string, and then we split on . to get the name and type:

let nameAndType = (str) => {
  let replaced =  str.replace(/^(?:[^_]*_){2}/g, '')
  let splited = replaced.split('.')
  let type = splited.pop()
  let name = splited.join('.')
  return {name,type}
}

console.log(nameAndType("1568223848_12345678_some_document.pdf"))
console.log(nameAndType("1568223848_12345678_some_document.xyz.pdf"))
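
One case the snippet above does not cover is a file with no extension: pop() would then return the whole file name as the type. A minimal sketch of one way to handle that, using the same prefix pattern plus lastIndexOf; the names PREFIX and splitAttachmentName are made up for illustration, and treating a missing extension as an empty type is an assumption, not something the question specifies:

```javascript
// Same prefix pattern as above: everything up to and including the second "_".
const PREFIX = /^(?:[^_]*_){2}/;

// Hypothetical helper (not from the original answer).
const splitAttachmentName = (attachmentName) => {
  const file = attachmentName.replace(PREFIX, '');
  const dot = file.lastIndexOf('.');
  // dot <= 0 means "no dot at all" or only a leading dot:
  // treat the whole string as the name and leave the type empty.
  if (dot <= 0) {
    return { name: file, type: '' };
  }
  return { name: file.slice(0, dot), type: file.slice(dot + 1) };
};

console.log(splitAttachmentName('1568223848_12345678_some_document.xyz.pdf'));
// { name: 'some_document.xyz', type: 'pdf' }
console.log(splitAttachmentName('1568223848_12345678_README'));
// { name: 'README', type: '' }
```

If the input has fewer than two underscores, the pattern does not match and the string is left as it is.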

function splitString(val){
  return val.split('_').slice(2).join('_');
}


const getShortString = (str) => str.replace(/^(?:[^_]*_){2}/g, '')

For input like 1568223848_12345678_some_document.pdf, this returns some_document.pdf

const re = /(.*?)_(.*?)_(.*)/;

const name = "1568223848_12345678_some_document.pdf";

const [, date, id, filename] = re.exec(name);

console.log(date);
console.log(id);
console.log(filename);

Some notes:

  • Create the regular expression only once. If you do this

    function getParts(str) {
      const re = /expression/;
      ...
    }
    

    Then you're making a new regular expression object every time you call getParts.

  • .*? can be faster than .* here, and it is also what makes the pattern correct

    .* is greedy: as soon as the regular expression engine reaches it, it puts the entire rest of the string into that slot and then checks whether the rest of the expression can match. If that fails it backs off one character, then another, and so on. .*? on the other hand is satisfied as soon as possible: it starts with as few characters as possible, checks whether the next part of the expression matches, and only takes one more character if it does not. With greedy groups, (.*)_(.*)_(.*) would split on the last two underscores instead of the first two, so a file name that contains underscores would be cut in the wrong place.

  • Splitting on '_' works, but it could potentially create many temporary strings,

    for example if the filename is 1234_1343_a________________________.pdf

    You'd have to test whether a regular expression is faster or slower than splitting, assuming speed matters.
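
The first two notes can be sketched together. This is only an illustration: the names LAZY, GREEDY and the optional re parameter are assumptions added here, and the greedy pattern is included purely for comparison.

```javascript
// Created once at module scope and reused by every call to getParts.
const LAZY = /(.*?)_(.*?)_(.*)/;
// Greedy variant, shown only to compare the results.
const GREEDY = /(.*)_(.*)_(.*)/;

const getParts = (str, re = LAZY) => {
  const match = re.exec(str);
  if (match === null) return null; // fewer than two underscores
  const [, date, id, filename] = match;
  return { date, id, filename };
};

const sample = '1568223848_12345678_some_document.pdf';
console.log(getParts(sample));
// { date: '1568223848', id: '12345678', filename: 'some_document.pdf' }
console.log(getParts(sample, GREEDY));
// { date: '1568223848_12345678', id: 'some', filename: 'document.pdf' }
```

The second result shows why the lazy quantifiers matter for this format, independent of any speed difference.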

You can chain .indexOf to get the offset of the second occurrence (and of further ones, although more than two starts to look ugly). This works because indexOf takes a start index as its second argument, so passing the position just after the first occurrence finds the second one:

var secondUnderscoreIndex = name.indexOf("_",name.indexOf("_")+1);

So my solution would be:

var index = name.indexOf("_", name.indexOf("_") + 1);
// prefix holds "timestamp_id"; filename holds everything after the second "_"
var [prefix, filename] = [name.substring(0, index), name.substring(index + 1)];
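
If more than two separators ever need to be skipped, the chaining can be wrapped in a small helper instead of nesting calls. A sketch only: nthIndexOf is a hypothetical name, and it does use a loop, which the question wanted to avoid, so it is offered as a readability trade-off rather than as the answer to the question.

```javascript
// Hypothetical helper: index of the n-th occurrence of `needle` in `str`
// (n starts at 1), or -1 if there are fewer than n occurrences.
const nthIndexOf = (str, needle, n) => {
  let index = -1;
  for (let i = 0; i < n; i++) {
    // Start each search just after the previous hit.
    index = str.indexOf(needle, index + 1);
    if (index === -1) return -1;
  }
  return index;
};

const attachment = '1568223848_12345678_some_document.pdf';
const cut = nthIndexOf(attachment, '_', 2);
console.log(cut);                       // 19
console.log(attachment.slice(cut + 1)); // some_document.pdf
```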

Alternatively, using regular expression:

var [, number1, number2, filename, extension] = /^([0-9]+)_([0-9]+)_(.*?)\.([0-9a-z]+)$/i.exec(name);
// Prints: "1568223848 12345678 some_document pdf"
console.log(number1, number2, filename, extension);

I like simplicity...

If you ever need the timestamp and the id, they're in [1] and [2]

    var getFilename = function(str) {
      return str.match(/(\d+)_(\d+)_(.*)/)[3];
    }

    var f = getFilename("1568223848_12345678_some_document.pdf");
    console.log(f)

If file names always come in the format timestamp_id_filename, you can use a regular expression that skips everything up to the second '_' and captures the rest.

test:

var filename = '1568223848_12345678_some_document.pdf';
console.log(filename.match(/[^_]+_[^_]+_(.*)/)[1]); // result: 'some_document.pdf'

Explanation: /[^_]+_[^_]+_(.*)/

  • [^_]+ : match one or more characters other than '_'
  • _ : match the '_' character
  • these two are repeated, so the first two '_' are skipped
  • (.*) : capture the remaining characters in a group

The match method returns an array: its first element is the whole text that matched the expression, and the following elements are the captured groups.
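
One thing to keep in mind with match is that it returns null when the string does not fit the pattern, so indexing the result directly throws a TypeError on unexpected input. A hedged sketch of a guarded version; getFilenameOrNull and FILE_PATTERN are hypothetical names, and the ^ and $ anchors are an addition to the pattern above:

```javascript
// Anchored variant of the pattern above (the anchors are an addition).
const FILE_PATTERN = /^[^_]+_[^_]+_(.*)$/;

// Hypothetical helper: returns the file name part, or null when the input
// does not follow the timestamp_id_filename format.
const getFilenameOrNull = (str) => {
  const match = str.match(FILE_PATTERN);
  return match === null ? null : match[1];
};

console.log(getFilenameOrNull('1568223848_12345678_some_document.pdf')); // some_document.pdf
console.log(getFilenameOrNull('no-underscores.pdf'));                    // null
```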

Split the file name string into an array on underscores. Discard the first two elements of the array. Join the rest of the array with underscores. Now you have your file name.
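
The steps above could look roughly like this, extended with the name/type split the question asks for. parseAttachment is a made-up name, and returning an empty type when there is no dot is an assumption; the type keeps its leading dot, as in the question's own helper.

```javascript
// Hypothetical helper following the split / discard / join steps.
const parseAttachment = (attachmentName) => {
  // Split on "_", drop timestamp and id, and restore underscores in the name.
  const file = attachmentName.split('_').slice(2).join('_');
  const dot = file.lastIndexOf('.');
  return dot === -1
    ? [file, '']
    : [file.slice(0, dot), file.slice(dot)];
};

console.log(parseAttachment('1568223848_12345678_some_document.pdf'));
// [ 'some_document', '.pdf' ]
console.log(parseAttachment('1568223848_12345678_a_b_c.tar.gz'));
// [ 'a_b_c.tar', '.gz' ]
```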
