I have the following string stored in a column of the TEXT datatype, and I want to extract the values for
Date:
Queue:
File Name:
and return them in their own columns.
STRING:
If you are able to, please correct the issue and resubmit the file.
Date: 10/8/2024
Queue: ENTRY
File Name: TEST_FILE.PDF
Columns:
Date        Queue   File Name
-------------------------------------------
10/8/2024   ENTRY   TEST_FILE.PDF
I have come up with the following code, but I have been unable to exclude the additional text that comes back.
I get the following data returned:
Date               Queue           File Name
--------------------------------------------------------------
10/8/2024 Queue:   ENTRY File Na   TEST_FILE.PDF
SELECT
    SUBSTRING(CAST(em.body AS NVARCHAR(300)),
        CHARINDEX('Date:', CAST(em.body AS NVARCHAR(300))) + 6,
        (CHARINDEX('Queue:', CAST(em.body AS NVARCHAR(300))) - CHARINDEX('Date:', CAST(em.body AS NVARCHAR(300))))) 'Date',
    SUBSTRING(CAST(em.body AS NVARCHAR(300)),
        CHARINDEX('Queue:', CAST(em.body AS NVARCHAR(300))) + 7,
        (CHARINDEX('File Name:', CAST(em.body AS NVARCHAR(300))) - CHARINDEX('Queue:', CAST(em.body AS NVARCHAR(300))))) 'Queue',
    RIGHT(CAST(em.body AS NVARCHAR(300)), (LEN(CAST(em.body AS NVARCHAR(300))) - 10) - CHARINDEX('File Name:', CAST(em.body AS NVARCHAR(300)))) 'File Name'
FROM
    email em WITH(NOLOCK)
I know I need to decrease the length value for the SUBSTRING calls, but no matter where I put in a value to decrease them, I get the following error:
Msg 537, Level 16, State 3, Line 2
Invalid length parameter passed to the LEFT or SUBSTRING function
Asked Jan 30 at 4:33 by OfficerSpock; edited Jan 30 at 4:45 by Dale K
3 Answers
Updated following @DaleK's observations.
This should parse single or multiple entries within a text block, even those that contain extra text. Note that char(10) (line feed) characters are replaced with a space so that every token has a proper delimiter.
Example (or dbFiddle):
Declare @YourTable Table (id int,[SomeCol] varchar(max));
Insert Into @YourTable Values
(1,'This is a long sentence.
Date: 10/8/2024
Queue: ENTRY
File Name: TEST_FILE.PDF
Date: 11/10/2024
Queue: ENTRY
File Name: SomeOFileName.PDF
'),
(2,'
Date: 11/9/2024
Queue: ENTRY
File Name: OtherFileName.PDF
');
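-- Split each row into space-delimited tokens via a JSON array, then pair every token with the one after it
with cte as (
    Select ID
          ,B.*
          -- LV  = the token that follows the current one (the value after a label)
          ,LV  = lead(value,1) over (partition by ID order by try_convert(int,[key]))
          -- Grp = running count of 'Date:' labels, i.e. the entry number within the row
          ,Grp = sum(case when value='Date:' then 1 else 0 end) over (partition by ID order by try_convert(int,[key]))
    From  @YourTable A
    Cross Apply OpenJSON ('["'+replace(string_escape(replace([SomeCol],char(10),' '),'json'),' ','","')+'"]') B
)
Select ID
      ,Date  = max( case when value='Date:'  then LV end )
      ,Queue = max( case when value='Queue:' then LV end )
      -- 'File Name:' is split into 'File' and 'Name:', so match on the second token
      ,FName = max( case when value='Name:'  then LV end )
From  cte
Where Grp>0          -- skip any text that precedes the first 'Date:'
Group By ID,Grp
Order By ID,Grp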
Results
ID  Date        Queue  FName
1   10/8/2024   ENTRY  TEST_FILE.PDF
1   11/10/2024  ENTRY  SomeOFileName.PDF
2   11/9/2024   ENTRY  OtherFileName.PDF
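To see what the tokenizing step produces on its own, here is a minimal sketch, assuming SQL Server 2016 or later (for OPENJSON and STRING_ESCAPE). The variable @s is a made-up sample, not part of the original answer.

```sql
-- Hypothetical sample: two labelled lines separated by a line feed
DECLARE @s varchar(max) = 'Date: 10/8/2024' + char(10) + 'Queue: ENTRY';

-- Replace the line feed with a space, escape the text for JSON,
-- then turn every space into a "," separator to form a JSON array of tokens.
SELECT [key], value
FROM OpenJSON('["' + replace(string_escape(replace(@s, char(10), ' '), 'json'), ' ', '","') + '"]')
ORDER BY try_convert(int, [key]);
-- key  value
-- 0    Date:
-- 1    10/8/2024
-- 2    Queue:
-- 3    ENTRY
```

Because the split is on single spaces, a value that itself contains a space (for example a file name with spaces) would be broken into several tokens, so this approach fits data shaped like the sample.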
If you find yourself with no choice but to do messy string extractions, the trick is to build up your logic methodically, testing one piece at a time.
Personally I like to use a DRY approach, even though it's not typical for SQL, because it reduces the chance of the mistakes that can occur when you repeat logic. This can be done with the CROSS APPLY operator.
This is far from the most concise solution, but it is easier (IMO) to build and maintain.
You can see that all I define are the 3 identification strings specified; everything else is derived from them.
CREATE TABLE Email (Body TEXT);
INSERT INTO Email (Body)
VALUES
('If you are able to, please correct the issue and resubmit the file.
Date: 10/8/2024
Queue: ENTRY
File Name: TEST_FILE.PDF');
SELECT
-- Extract the string segments we require
SUBSTRING(em.body, c3.DateEndIdx, c3.QueueStartIdx - c3.DateEndIdx) [Date]
, SUBSTRING(em.body, c3.QueueEndIdx, c3.FileNameStartIdx - c3.QueueEndIdx) Queue
, SUBSTRING(em.body, c3.FileNameEndIdx, c3.EndOfText - c3.FileNameEndIdx) FileName
FROM (
-- Convert to VARCHAR in order to use all string functions
SELECT CONVERT(VARCHAR(MAX), Body) Body
FROM Email
) em
-- Capture the strings we are trying to find
CROSS APPLY (
VALUES (
'Date:'
, 'Queue:'
, 'File Name:'
)
) c1 (DateLabel, QueueLabel, FileNameLabel)
-- Find the starts and ends of the strings we are trying to find
CROSS APPLY (
VALUES (
CHARINDEX(c1.DateLabel, em.body)
, CHARINDEX(c1.QueueLabel, em.body)
, CHARINDEX(c1.FileNameLabel, em.body)
)
) c2 (DateIdx, QueueIdx, FileNameIdx)
CROSS APPLY (
VALUES (
c2.DateIdx
, c2.DateIdx + LEN(c1.DateLabel)
, c2.QueueIdx
, c2.QueueIdx + LEN(c1.QueueLabel)
, c2.FileNameIdx
, c2.FileNameIdx + LEN(c1.FileNameLabel)
, len(em.body) + 1
)
) c3 (DateStartIdx, DateEndIdx, QueueStartIdx, QueueEndIdx, FileNameStartIdx, FileNameEndIdx, EndOfText);
Date | Queue | FileName |
---|---|---|
10/8/2024 | ENTRY | TEST_FILE.PDF |
db<>fiddle
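One plausible cause of the Msg 537 error in the question is a row where a label is missing: CHARINDEX then returns 0 and the computed length can become negative. The sketch below extends the CROSS APPLY idea under that assumption by turning a "not found" position into NULL with NULLIF, so SUBSTRING returns NULL instead of raising an error. The table variable @Email, its sample rows, and the aliases i, r and c are hypothetical.

```sql
-- Hypothetical sample data: one well-formed body and one with no labels at all
DECLARE @Email TABLE (Body VARCHAR(MAX));
INSERT INTO @Email (Body) VALUES
    ('Date: 10/8/2024' + CHAR(13) + CHAR(10) + 'Queue: ENTRY' + CHAR(13) + CHAR(10) + 'File Name: TEST_FILE.PDF'),
    ('An unrelated message.');

SELECT c.[Date], c.Queue, c.FileName
FROM @Email em
-- Label positions; NULLIF maps "not found" (0) to NULL
CROSS APPLY (VALUES (
      NULLIF(CHARINDEX('Date:', em.Body), 0)
    , NULLIF(CHARINDEX('Queue:', em.Body), 0)
    , NULLIF(CHARINDEX('File Name:', em.Body), 0)
)) i (DateIdx, QueueIdx, FileNameIdx)
-- Raw segments between labels; any NULL position yields a NULL segment
-- (5, 6 and 10 are the lengths of 'Date:', 'Queue:' and 'File Name:')
CROSS APPLY (VALUES (
      SUBSTRING(em.Body, i.DateIdx + 5, i.QueueIdx - i.DateIdx - 5)
    , SUBSTRING(em.Body, i.QueueIdx + 6, i.FileNameIdx - i.QueueIdx - 6)
    , SUBSTRING(em.Body, i.FileNameIdx + 10, LEN(em.Body))
)) r (RawDate, RawQueue, RawFileName)
-- Strip carriage returns and line feeds, then trim the surrounding spaces
CROSS APPLY (VALUES (
      LTRIM(RTRIM(REPLACE(REPLACE(r.RawDate, CHAR(13), ''), CHAR(10), '')))
    , LTRIM(RTRIM(REPLACE(REPLACE(r.RawQueue, CHAR(13), ''), CHAR(10), '')))
    , LTRIM(RTRIM(REPLACE(REPLACE(r.RawFileName, CHAR(13), ''), CHAR(10), '')))
)) c ([Date], Queue, FileName);
```

The first row should come back as three clean values and the second as three NULLs. This does not cover labels that appear out of order, which can still produce a negative length; a CASE guard would be needed for that.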
You can parse the content with SUBSTRING and CHARINDEX. Apply LTRIM / RTRIM as required.
DECLARE @data NVARCHAR(100) = N'Date: 10/8/2024 Queue: ENTRY File Name: TEST_FILE.PDF';
SELECT
-- Each length is the gap between the end of one label and the start of the next; RTRIM drops the trailing space
RTRIM(SUBSTRING(@data, CHARINDEX('Date: ', @data) + 6, CHARINDEX('Queue:', @data) - CHARINDEX('Date: ', @data) - 6)) AS [Date],
RTRIM(SUBSTRING(@data, CHARINDEX('Queue: ', @data) + 7, CHARINDEX('File Name:', @data) - CHARINDEX('Queue:', @data) - 7)) AS [Queue],
-- 'File Name: ' is 11 characters long, so skip 11 (not 10) to avoid a leading space
SUBSTRING(@data, CHARINDEX('File Name: ', @data) + 11, LEN(@data)) AS [File_Name];
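The extra text in the question's output comes down to how the length argument is measured. As a small worked check (the variables @d and @q are introduced here only for illustration), compare a length measured from the start of the label with one measured from the start of the value:

```sql
DECLARE @data nvarchar(100) = N'Date: 10/8/2024 Queue: ENTRY File Name: TEST_FILE.PDF';
DECLARE @d int = CHARINDEX(N'Date:', @data);   -- 1
DECLARE @q int = CHARINDEX(N'Queue:', @data);  -- 17

SELECT
    -- Starts 6 characters in but still spans the full label-to-label distance,
    -- so it overruns into the next label: '10/8/2024 Queue:'
    SUBSTRING(@data, @d + 6, @q - @d)     AS TooLong,
    -- Subtracting the same 6 characters from the length stops just before 'Queue:'
    -- and leaves '10/8/2024 ' (with one trailing space)
    SUBSTRING(@data, @d + 6, @q - @d - 6) AS JustValue;
```

In general, whatever offset is added to the start position has to be subtracted from the length as well.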
Tags: sql server. Source: Parse TEXT datatype column to grab values based on TAGS (Stack Overflow)
[...] text column (which has been deprecated for 20 years). – Thom A, commented Jan 30 at 8:53