admin管理员组文章数量:1332345
Hello and sorry for the long title name! I am working with some data that has a long text string (some observations have up to ~2000 characters). Within these strings could be a word (AB/CD) that could be anywhere within the string. I am trying to detect AB/CD within the text string and create a binary variable (ABCD_present) if the word appears in the text.
Below is some example data
data test;
length status $175;
infile datalines dsd dlm="|" truncover;
input ID Status$;
datalines;
1|This is example text I am using instead of real data. I am making the length of this text longer to mimic the long text strings of my data AB/CD
2|This is example AB/CD text I am using instead of real data. I am making the length of this text longer to mimic the long text strings of my data
3|This is example text I am using instead of real data. I AB/CD am making the length of this text longer to mimic the long text strings of my data
4|This is example text I am using instead of real data. I am making the length of this text longer to mimic the long text strings of my data
5|This is example text I am using instead of real data. I am making the length of this text longer to mimic the long text strings of my data
6|This is example text I am using instead of real data. I am making the length of this text longer to AB/CD mimic the long text strings of my data
;
run;
Any guidance on this would be lovely! I do not have a ton of experience using long text strings.
Thank you in advance
Hello and sorry for the long title name! I am working with some data that has a long text string (some observations have up to ~2000 characters). Within these strings could be a word (AB/CD) that could be anywhere within the string. I am trying to detect AB/CD within the text string and create a binary variable (ABCD_present) if the word appears in the text.
Below is some example data
data test;
length status $175;
infile datalines dsd dlm="|" truncover;
input ID Status$;
datalines;
1|This is example text I am using instead of real data. I am making the length of this text longer to mimic the long text strings of my data AB/CD
2|This is example AB/CD text I am using instead of real data. I am making the length of this text longer to mimic the long text strings of my data
3|This is example text I am using instead of real data. I AB/CD am making the length of this text longer to mimic the long text strings of my data
4|This is example text I am using instead of real data. I am making the length of this text longer to mimic the long text strings of my data
5|This is example text I am using instead of real data. I am making the length of this text longer to mimic the long text strings of my data
6|This is example text I am using instead of real data. I am making the length of this text longer to AB/CD mimic the long text strings of my data
;
run;
Any guidance on this would be lovely! I do not have a ton of experience using long text strings.
Thank you in advance
Share Improve this question asked Nov 20, 2024 at 21:17 RyanRyan 1018 bronze badges2 Answers
Reset to default 1You can use the find
function.
data want;
set test;
flag_abcd = (find(status, 'AB/CD') > 0);
run;
Status ID flag_abcd
... 1 1
... 2 1
... 3 1
... 4 0
... 5 0
... 6 1
Two other functions that detect the presence of a substring are INDEX
and PRXMATCH
flag = index (status, 'AB/CD') > 0 ;
flag = prxmatch ('m/AB\/CD/', status) > 0 ;
本文标签:
版权声明:本文标题:Detect word in character variable text string and create variable based upon presence of that word SAS - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1742327753a2454066.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
发表评论