[Excel] CSV Correct/Delete Rows/Split File
Hey everyone, I am new to the forums and kind of hope to grow and get some insight from everyone here. I am having a little bit of trouble I would really love some some kind of help with something small I have been writing in Visual Studios 2010.
At my job, we deal with a lot of excel files, mainly CSV but others as well. We import these CSV files into our databases at work and we can only import 1000 lines at a time. I have already created something that can split a header line and 999 lines to equal 1000 lines. So now I am having a problem by simply implementing a simple Replace string and delete row. So what I have so far is:
Code:
Sub SplitCSV(ByVal FilePath As String, ByVal LineCount As Integer, ByVal MaxOutputFile As Integer, ByVal Status As UpdateProgressSub, ByRef IsAbort As Boolean)
' Validate first
If LineCount < 100 Then Throw New Exception("Number of lines must be more than 100.")
' Open the csv file for reading
Dim Reader As New IO.StreamReader(FilePath)
' Create the output directory
Dim OutputFolder As String = FilePath & "_Splits"
If Directory.Exists(FilePath) = False Then
Directory.CreateDirectory(OutputFolder)
End If
' Read the csv column's header
Dim strHeader As String = Reader.ReadLine
' Start splitting
Dim FileIndex As Integer
Do
' Update progress
FileIndex += 1
If Not Status Is Nothing Then
Status.Invoke((FileIndex - 1) * LineCount)
End If
' Check if the number of splitted files doesn't exceed the limit
If (MaxOutputFile < FileIndex) And (MaxOutputFile > 0) Then Exit Do
' Create new file to store a piece of the csv file
Dim PiecePath As String = OutputFolder & "\" & Path.GetFileNameWithoutExtension(FilePath) & "_" & FileIndex & Path.GetExtension(FilePath)
Dim Writer As New StreamWriter(PiecePath, False)
Writer.AutoFlush = False
Writer.WriteLine(strHeader)
' Read and writes precise number of rows
For i As Integer = 1 To LineCount
Dim s As String = Reader.ReadLine()
If s <> Nothing And _IsAbort = False Then
Writer.WriteLine(s)
Else
Writer.Flush()
Writer.Close()
Exit Do
End If
Next
' Flush and close the splitted file
Writer.Flush()
Writer.Close()
Loop
Reader.Close()
End Sub
(Some of my code is missing, mostly my visual form data)
What I need is to somehow replace invalid characters within my CSV file for example = , ' " .
What I also need is to somehow delete a entire row if a cell is empty in Column A,B,C
(Then I need that row to shift up, for example: Row 16 has a empty cell in Column B so the row needs to be deleted and Row 17+ that is correct shifts up)
I can already tell this is most likely a simple process and somehow its just blowing through me. I am still a beginner and in school in the learning process. This is also a separate program, not a Macro in Excel itself, basically trying to make this a one click process.
Thank you.
Re: [Excel] CSV Correct/Delete Rows/Split File
Any suggestions? Much appreciated, trying to finish this tonight and really need assistance.
Re: [Excel] CSV Correct/Delete Rows/Split File
this is easy for me to do in vb6 /vba, but i do not know how much change is needed for .net
what do you want to replace the invalid characters with?
Re: [Excel] CSV Correct/Delete Rows/Split File
Thanks for the reply I really need some help lol, umm basically I need it to search the entire Excel file and remove the invalid characters with nothing or a blank space. Do you think the way I did it is inefficient? My biggest ordeal though os having a check for Column A,B,C if a cell is empty to remove the entire row and shift all other up. I am beginning to think its not possible. Help is appreciated, wondering if I need to show my entire source.
Like I mentioned in my OP the split works great and fast, I just can't seem to figure out how to implement these two features I really need.
Re: [Excel] CSV Correct/Delete Rows/Split File
i would try like
vb Code:
Dim s As String = Reader.ReadLine()
If s <> Nothing And _IsAbort = False Then
blanks = false ' remember this is vb6 /vba code modify to .net
lnarr = split(s, ",") ' split line into fields
for f = 0 to 2 ' test first 3 fields
if trim(lnarr(f)) = "" then blanks = true: exit for
next
if not blanks then
Writer.WriteLine(s)
else
i = i - 1 ' deduct this line from line count
end if
Else
Writer.Flush()
Writer.Close()
Exit Do
End If
Re: [Excel] CSV Correct/Delete Rows/Split File
Wow this looks almost exactly what I am looking for. I will take a look at it and tell you how it goes. Ty Ty Westconn
Re: [Excel] CSV Correct/Delete Rows/Split File
So what do I do to declare what "blanks" "lnarr""f" is?
or
am I doing something wrong? :\
Re: [Excel] CSV Correct/Delete Rows/Split File
Quote:
So what do I do to declare what "blanks" "lnarr""f" is?
i thought straight forward
blanks as boolean
lnarr() as string , not sure about arrays in .net
f as integer
1 Attachment(s)
Re: [Excel] CSV Correct/Delete Rows/Split File
I apologize westconn1 I am still highly inexperienced and I only know the basics. I thought I got it to work but it was splitting my files empty into the output folder. I am not sure if I am allowed to post my source but I attached it if you don't mind just taking a look at it. Any time is appreciated.
Once again sorry, I am probably in over my head but making this tool will make my life easier.
Thanks again for being patient and helpful.
Re: [Excel] CSV Correct/Delete Rows/Split File
i looked at your code and could not see any obvious problem, though note, i repeat, i am not familiar with .net coding, i do not see where you might have tried to incorporate the code i suggested, so i do not know what you have tried.
if you are lucky, robdog, koolsid or someone with .net experience may come on line and be able to help you
Re: [Excel] CSV Correct/Delete Rows/Split File
Thanks for taking a look at it westconn, you have been a lot of help so far. I guess all I can do is more trial and error and reading. Do you think it can be done another way by chance?
Re: [Excel] CSV Correct/Delete Rows/Split File
there are nearly always multiple solutions, for any task
another method might be to automate excel
i could see no code to indicate what you had tried to read the first 3 columns
i did make the assumption that your .csv file actually had comma separated values (csv), which like all assumptions may be incorrect, if your fields are separated by some other identifier then you would need to change the code to suit
google on .net string functions to find the equivalents of what i suggested
Re: [Excel] CSV Correct/Delete Rows/Split File
Yea, the file is using commas to separate, I dont need to replace them ugh I didn't realize that I had that to replace invalid characters. (In my OP)
My problem is everything I try, I cant get a replace or remove row function/code to work. I never get any errors either.
I never had anything in my original code to read the first 3 columns. The reason I don't want to use macros is because I don't have access to Excel all the time. So if I can automate this process removing invalid characters and rows that are missing data. What would be amazing is being able to generate a log file to tell me what rows were removed. I just don't have the apptitude or knowledge to do any of this. :\