I am trying to read fields in a CSV file to build an organized structure with all the content, but some fields have doubled quotation marks, which cause parsing errors.

I have tried using the .Replace method, but I have no idea if I'm doing it correctly, since this is my first time doing something like this.

    fields(i) = fields(i).Replace("""", """)
    If fields(i).StartsWith("""""") AndAlso fields(i).EndsWith("""""") Then
        fields(i) = fields(i).Substring(1, fields(i).Length - 2)
    End If

This code works when I try it on other fields that contain only regular letters and numbers, but not when I try to replace the quotation marks.

I have also tried writing a function that reads through the CSV file and strips the doubled quotation marks before parsing, but that doesn't work either. I have been stuck on this problem for a while now and can't seem to solve it.

In the screenshot you can see the error I get. It is caused by two fields that look something like this: ""datadatadata"", ""datadatadata"".

The fields don't literally contain "data", but that doesn't matter. I get the error because there are two quotation marks at the beginning and end of each of them. This is only one example out of about 100 (I have narrowed it down so it is easier to test).

I can remove the quotation marks manually, but that would take too long, and since I will receive more CSV files, it is too much work to do for each file.

The expected result is that the code reduces the doubled quotation marks to a single one for each field or variable, keeps the value inside, and prints it.

Any help is appreciated.

asked Feb 10 at 15:35 by Darin Twana
  • You may consider using a library/NuGet package such as CSVHelper to parse the data rather than attempting to do it yourself. Alternatively, look at the CSVHelper source code on GitHub for some guidance. – It all makes cents Commented Feb 10 at 15:35
  • A CSV handler may work, but it may not know how to handle those doubled double quotes out of the box. fields(1).Replace(New String({ChrW(34), ChrW(34)}), ChrW(34).ToString()) – djv Commented Feb 10 at 20:45
  • Please add some sample input rows and some sample output rows – aborruso Commented Feb 11 at 19:14
  • Parsing CSV is non-trivial. I also recommend using a library like CSVHelper – SSS Commented Feb 13 at 5:38

1 Answer

Well, if you use the built-in CSV reader, it does not care whether the values have quotes around them or not.

Note that this needs the following import:

    Imports Microsoft.VisualBasic.FileIO

With that in place, this code:

    Dim strFile As String = "c:\test\tblHotelsA.txt"

    Using MyParser As New TextFieldParser(strFile)

        ' Treat the file as comma-delimited text
        MyParser.TextFieldType = FieldType.Delimited
        MyParser.SetDelimiters(",")

        While Not MyParser.EndOfData

            ' ReadFields returns the fields of one row, with enclosing quotes already removed
            For Each OneField As String In MyParser.ReadFields()
                Debug.Write(OneField & vbTab)
            Next

            Debug.Print("")
            Debug.Print("----------------")
        End While

    End Using

The above parses rows correctly; the surrounding quotes are optional, and it works fine either way.

I suppose you could check whether the first character is a " and remove it, and then do the same for the last character. But with the above, it really does not matter.
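If you did want to do that check by hand, a minimal sketch could look like the following. The module and function names (QuoteHelpers, StripEnclosingQuotes) are made up for illustration, and the sketch assumes you only want to drop one quote from each end, and only when both ends have one.

```vbnet
Imports System

Module QuoteHelpers

    ' Hypothetical helper (not part of any library): removes one leading and
    ' one trailing double quote, but only when both are present.
    Function StripEnclosingQuotes(value As String) As String
        Const Quote As String = """"   ' a VB literal holding a single " character
        If value IsNot Nothing AndAlso value.Length >= 2 AndAlso
           value.StartsWith(Quote, StringComparison.Ordinal) AndAlso
           value.EndsWith(Quote, StringComparison.Ordinal) Then
            Return value.Substring(1, value.Length - 2)
        End If
        Return value
    End Function

    Sub Main()
        Console.WriteLine(StripEnclosingQuotes("""Hilton"""))   ' input is "Hilton" with quotes
        Console.WriteLine(StripEnclosingQuotes("Hilton"))       ' no quotes, left unchanged
    End Sub

End Module
```

Both calls in Main print Hilton. A field wrapped in two quotes on each side would need two passes, since each call removes only one pair.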

You don't show which CSV reader you are using, but I suggest using the above.

Alternatively, I suppose you could simply remove all " (quote) characters with this code:

    Dim q As String = """"
    ' whatever code here, and then eventually:

    myfields(i) = myfields(i).Replace(q, "")

So, we replace every " with an empty string.
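If the goal is to keep one quote on each side rather than remove them all, as the question describes, the same Replace call can collapse doubled quotes instead. This is only a sketch: the names QuoteCleanup and CollapseDoubledQuotes are made up, and it assumes every pair of adjacent quotes in the field should become a single quote. The part that is easy to get wrong is the literal: inside a VB string, each " has to be written as "", so a one-quote string takes four characters in source and a two-quote string takes six.

```vbnet
Imports System

Module QuoteCleanup

    ' Hypothetical helper: collapses every doubled quote ("") into a single one (").
    Function CollapseDoubledQuotes(value As String) As String
        Const OneQuote As String = """"      ' 4 characters in source = 1 quote in the string
        Const TwoQuotes As String = """"""   ' 6 characters in source = 2 quotes in the string
        Return value.Replace(TwoQuotes, OneQuote)
    End Function

    Sub Main()
        ' This literal is the 16-character string ""datadatadata""
        Dim raw As String = """""datadatadata"""""
        Console.WriteLine(CollapseDoubledQuotes(raw))   ' prints "datadatadata" with one quote on each side
    End Sub

End Module
```

Run this on the raw text before it is split into fields, or on each field afterwards, depending on where the doubled quotes still appear in your data.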
