I'm building a JavaScript chat bot, and I ran into an issue.
I use string.split() to tokenize my input like this:
tokens = message.split(" ");
My problem is that I need 4 tokens to make up the command and 1 token to hold the message.
When I send this:
!finbot msg testuser 12345 Hello sir, this is a test message
these are the tokens I get:
["!finbot", "msg", "testuser", "12345", "Hello", "sir,", "this", "is", "a", "test", "message"]
How can I make it come out like this instead?
["!finbot", "msg", "testuser", "12345", "Hello sir, this is a test message"]
The reason I want it like this is that the first token (token[0]) is the call, the second (token[1]) is the command, the third (token[2]) is the user, the fourth (token[3]) is the password (it's a password-protected message thing, just for fun) and the fifth (token[4]) is the actual message.
Right now it would just send Hello, because I only use the 5th token.
I can't just write message = token[4] + token[5]; and so on, because messages are not always exactly 3 or exactly 4 words long.
I hope I gave enough information for you to help me. If you know the answer (or know a better way to do this), please tell me.
Thanks!
Asked Aug 27, 2016 at 19:16 by Finlay Roelofs
4 Answers
Use the limit parameter of String.split:
tokens = message.split(" ", 4);
This keeps only the first four tokens and discards everything after them, so you still need to get the message from the original string. Reusing the nthIndex() function from another answer, you can get the index of the 4th occurrence of the space character and take whatever comes after it (hence the + 1, which skips the space itself):
var message = message.substring(nthIndex(message, ' ', 4) + 1);
Or if you need it in your tokens array:
tokens[4] = message.substring(nthIndex(message, ' ', 4) + 1);
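The nthIndex() helper is not reproduced in this answer, so here is a minimal sketch of what it might look like, assuming it returns the index of the n-th occurrence of a substring, or -1 if there are fewer than n. Both nthIndex as written here and the splitHead wrapper are hypothetical names for illustration, not the original author's code.

```javascript
// Hypothetical helper: index of the n-th occurrence of `pattern` in `str`,
// or -1 if `str` contains fewer than n occurrences.
function nthIndex(str, pattern, n) {
  let index = -1;
  for (let i = 0; i < n; i++) {
    index = str.indexOf(pattern, index + 1);
    if (index === -1) {
      return -1;
    }
  }
  return index;
}

// Hypothetical wrapper: split off the first `count` space-separated tokens
// and keep whatever follows them as one final element.
function splitHead(input, count) {
  const tokens = input.split(" ", count);
  const cut = nthIndex(input, " ", count);
  if (cut !== -1) {
    tokens.push(input.substring(cut + 1)); // + 1 skips the separating space
  }
  return tokens;
}

const input = "!finbot msg testuser 12345 Hello sir, this is a test message";
console.log(splitHead(input, 4));
```

If the input has fewer than `count` spaces, this sketch simply returns the tokens it found without a trailing message, so the caller can check the array length.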
I would probably start by taking the string like you did, and tokenizing it:
const myInput = string.split(" ");
If you're using ES6, you should be able to do something like:
const [call, command, userName, password, ...messageTokens] = myInput;
const message = messageTokens.join(" ");
However, if you don't have access to rest syntax in destructuring, you can do the same like this (it's just much more verbose, and note that shift() removes the elements from myInput):
const call = myInput.shift();
const command = myInput.shift();
const userName = myInput.shift();
const password = myInput.shift();
const message = myInput.join(" ");
If you need them as an array again, you can now just collect those parts:
const output = [call, command, userName, password, message];
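To show the destructuring approach end to end, here is a minimal sketch that wraps it in a function and rejects input with fewer than four leading tokens. The parseCommand name and the shape of the returned object are assumptions made for illustration.

```javascript
// Hypothetical helper: parse "<call> <command> <user> <password> <message...>".
// Returns null when fewer than four leading tokens are present.
function parseCommand(input) {
  const [call, command, userName, password, ...messageTokens] = input.split(" ");
  if (password === undefined) {
    return null;
  }
  return {
    call,
    command,
    userName,
    password,
    // split(" ") followed by join(" ") restores the original spacing
    message: messageTokens.join(" "),
  };
}

const parsed = parseCommand("!finbot msg testuser 12345 Hello sir, this is a test message");
console.log(parsed.message);
console.log(parseCommand("!finbot msg"));
```

Because the string is split on a single space, consecutive spaces between the leading tokens would produce empty tokens. If that matters for your bot, a regexp-based split may be a better fit.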
If you can use ES6, you can do:
let [c1, c2, c3, c4, ...rest] = input.split(" ");
let msg = rest.join(" ");
You could resort to a regexp, given that you defined your format as "4 tokens of non-space characters separated by spaces, followed by a message":
function tokenize(msg) {
    return (/^(\S+) (\S+) (\S+) (\S+) (.*)$/.exec(msg) || []).slice(1, 6);
}
This has the perhaps unwanted behaviour of returning an empty array if your msg does not actually match the spec. If that's not acceptable, remove the ... || [] and handle the null that exec() returns on a failed match yourself. The number of tokens is also fixed at 4 plus the required message. For a more generic approach you could:
function tokenizer(msg, nTokens) {
    var token = /(\S+)\s*/g, tokens = [], match;
    while (nTokens && (match = token.exec(msg))) {
        tokens.push(match[1]);
        nTokens -= 1; // or nTokens--, whichever is your style
    }
    if (nTokens) {
        // exec() returned null, could not match enough tokens
        throw new Error('EOL when reading tokens');
    }
    tokens.push(msg.slice(token.lastIndex));
    return tokens;
}
This uses the global flag of regexp objects in JavaScript to match against the same string repeatedly, and uses the lastIndex property to slice off everything after the last matched token as the rest.
Given
var msg = '!finbot msg testuser 12345 Hello sir, this is a test message';
then
> tokenizer(msg, 4)
[ '!finbot',
'msg',
'testuser',
'12345',
'Hello sir, this is a test message' ]
> tokenizer(msg, 3)
[ '!finbot',
'msg',
'testuser',
'12345 Hello sir, this is a test message' ]
> tokenizer(msg, 2)
[ '!finbot',
'msg',
'testuser 12345 Hello sir, this is a test message' ]
Note that an empty string will always be appended to the returned array when the given message string contains only tokens and nothing after them:
> tokenizer('asdf', 1)
[ 'asdf', '' ] // An empty "message" at the end
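That trailing empty string can be used to detect a command that was sent without a message. The following is a minimal sketch under that assumption. The tokenizer is repeated (with the same logic as above) only so the block runs on its own, and readMessage and its result shape are hypothetical.

```javascript
// Same logic as the tokenizer above, repeated so this sketch is self-contained.
function tokenizer(msg, nTokens) {
  const token = /(\S+)\s*/g;
  const tokens = [];
  let match;
  while (nTokens && (match = token.exec(msg))) {
    tokens.push(match[1]);
    nTokens -= 1;
  }
  if (nTokens) {
    throw new Error('EOL when reading tokens');
  }
  tokens.push(msg.slice(token.lastIndex));
  return tokens;
}

// Hypothetical wrapper: distinguishes "too few tokens" from "empty message".
function readMessage(input) {
  let tokens;
  try {
    tokens = tokenizer(input, 4);
  } catch (err) {
    return { ok: false, reason: 'too few tokens' };
  }
  const text = tokens[4];
  if (text === '') {
    return { ok: false, reason: 'empty message' };
  }
  return { ok: true, user: tokens[2], text: text };
}

console.log(readMessage('!finbot msg testuser 12345 Hello sir'));
console.log(readMessage('!finbot msg testuser 12345'));
console.log(readMessage('!finbot msg'));
```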